gemkeeper 0.8.0 → 0.8.1
- checksums.yaml +4 -4
- data/CHANGELOG.md +10 -1
- data/lib/gemkeeper/version.rb +1 -1
- metadata +17 -16
- data/specs/20260518-154733-gemkeeper-contractor-support/implementation-summary.md +0 -75
- data/specs/20260518-154733-gemkeeper-contractor-support/spec.md +0 -287
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-consolidated-v-1.md +0 -168
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-v-1-claude.md +0 -124
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-v-1-codex.md +0 -125
- data/specs/20260529-091429-replace-geminabox-compact-proxy/critique-v-1-copilot.md +0 -261
- data/specs/20260529-091429-replace-geminabox-compact-proxy/spec.md +0 -360
- data/specs/20260529-131354-sync-serve-cache-contract/critique-consolidated-v-1.md +0 -95
- data/specs/20260529-131354-sync-serve-cache-contract/critique-v-1-claude.md +0 -47
- data/specs/20260529-131354-sync-serve-cache-contract/critique-v-1-codex.md +0 -112
- data/specs/20260529-131354-sync-serve-cache-contract/critique-v-1-copilot.md +0 -169
- data/specs/20260529-131354-sync-serve-cache-contract/implementation-summary.md +0 -59
- data/specs/20260529-131354-sync-serve-cache-contract/spec.md +0 -169
|
@@ -1,360 +0,0 @@
|
|
|
1
|
-
# Spec 20260529-091429: Replace Geminabox with Compact Index Proxy
|
|
2
|
-
|
|
3
|
-
## Overview
|
|
4
|
-
|
|
5
|
-
Replace the Geminabox dependency with a minimal custom Rack application (`Gemkeeper::CompactIndexServer`) that serves locally-built private gems via the Bundler compact index protocol and proxies public gem requests to RubyGems.org.
|
|
6
|
-
The server also falls back to the system gem cache for offline use when RubyGems.org is unreachable.
|
|
7
|
-
|
|
8
|
-
## Goals
|
|
9
|
-
|
|
10
|
-
- Remove the broken Geminabox proxy (uses the retired `bundler.rubygems.org/api/v1/dependencies` endpoint)
|
|
11
|
-
- Implement the compact index protocol so Bundler uses efficient, cacheable resolution
|
|
12
|
-
- Proxy public gems from RubyGems.org transparently through the same source URL
|
|
13
|
-
- Enable offline use by serving from the system gem cache and a local response cache
|
|
14
|
-
|
|
15
|
-
---
|
|
16
|
-
|
|
17
|
-
## Feature 1: Compact Index Rack Application
|
|
18
|
-
|
|
19
|
-
**Who & why:** Developers using gemkeeper configure `source "http://localhost:9292"` as their single Bundler source.
|
|
20
|
-
Today that source proxies through Geminabox, whose upstream API was retired in May 2023, producing four retries on every `bundle install`.
|
|
21
|
-
They need a server that speaks the compact index protocol Bundler has used since 2016, without that noise.
|
|
22
|
-
|
|
23
|
-
### Functional Requirements
|
|
24
|
-
|
|
25
|
-
#### FR-1.1: Core compact index endpoints
|
|
26
|
-
The server MUST implement the following endpoints:
|
|
27
|
-
|
|
28
|
-
- `GET /names` — sorted, newline-delimited list of all gem names (local + proxied upstream), generated per FR-3.5
|
|
29
|
-
- `GET /versions` — merged versions index combining private gems and the proxied RubyGems.org versions file, generated per FR-3.1
|
|
30
|
-
- `GET /info/:gemname` — per-gem dependency metadata; served from local data for private gems, proxied from RubyGems.org for public gems per FR-3.1
|
|
31
|
-
- `GET /gems/:filename.gem` — serve gem binary (local-first, then system cache, then proxy per FR-3.3)
|
|
32
|
-
|
|
33
|
-
All URL path parameters (`:gemname`, `:filename`) are validated against `/\A[a-zA-Z0-9._-]+\z/` before any filesystem or upstream URL use.
|
|
34
|
-
Return 400 for parameters that do not match.
|
|
35
|
-
For `GET /gems/:filename`, additionally assert the resolved path is under `gems_path/gems/` before serving.
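
The two checks above (character allow-list, then path containment) could be sketched as follows. This is a minimal illustration, not the project's code: `safe_param?` and `resolve_gem_path` are hypothetical names. Note that `..` passes the regex because dots are allowed, which is why the containment check is still needed.

```ruby
# Sketch of FR-1.1 parameter validation; method names are hypothetical.
SAFE_PARAM = /\A[a-zA-Z0-9._-]+\z/

# True only for non-empty strings made of the allowed characters.
def safe_param?(value)
  value.is_a?(String) && value.match?(SAFE_PARAM)
end

# Returns the absolute path of a gem file under gems_path/gems/,
# or nil when the name is invalid or resolves outside that directory.
def resolve_gem_path(gems_path, filename)
  return nil unless safe_param?(filename)

  root = File.expand_path(File.join(gems_path, "gems"))
  candidate = File.expand_path(filename, root)
  candidate.start_with?(root + File::SEPARATOR) ? candidate : nil
end
```

A handler would answer 400 whenever `resolve_gem_path` returns `nil`.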
|
|
36
|
-
|
|
37
|
-
**Verify:** `bundle install` against a Gemfile backed by this server completes without retries or `HTTPError` output.
|
|
38
|
-
|
|
39
|
-
#### FR-1.2: Gem upload endpoint
|
|
40
|
-
`POST /upload` accepts a multipart form upload with field name `file` (matching the current Geminabox API consumed by `GemUploader`).
|
|
41
|
-
|
|
42
|
-
Validation: open the uploaded data as a tar archive and confirm it contains `metadata.gz` and `data.tar.gz`.
|
|
43
|
-
Extract gemspec metadata from `metadata.gz` using `Gem::Package` inside a rescue block.
|
|
44
|
-
Return 422 if the archive is malformed or metadata extraction raises.
|
|
45
|
-
Do not `load` or `eval` gemspec content.
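
The structural part of this validation might look like the sketch below, which only lists tar entry names and never evaluates gemspec content. `gem_archive?` is a hypothetical helper; it relies on `Gem::Package::TarReader` from RubyGems, and a real implementation would go on to read the spec through `Gem::Package` as described above.

```ruby
require "rubygems/package"
require "stringio"

REQUIRED_ENTRIES = %w[metadata.gz data.tar.gz].freeze

# Returns true when the IO is a readable tar archive containing both
# entries a .gem file must have. Only entry names are inspected.
def gem_archive?(io)
  names = []
  Gem::Package::TarReader.new(io).each { |entry| names << entry.full_name }
  (REQUIRED_ENTRIES - names).empty?
rescue StandardError
  false # malformed archive; the endpoint would answer 422
end
```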
|
|
46
|
-
|
|
47
|
-
On success: create `gems_path/gems/` if absent; write to a temp file in the same directory; rename atomically to `gems_path/gems/<name>-<version>.gem`.
|
|
48
|
-
Delete the temp file if validation fails.
|
|
49
|
-
Response codes: 201 on success, 409 if the target path already exists, 422 on invalid gem.
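
One way to realize the temp-file-plus-rename write is sketched here. `store_gem` is a hypothetical name and the validation step is assumed to have already run. The existence check followed by a rename is not race-free against two simultaneous uploads of the same file, so treat this as an illustration of the write path only.

```ruby
require "fileutils"
require "tempfile"

# Stores gem bytes under gems_dir and returns an HTTP status:
# 201 when written, 409 when the target already exists.
def store_gem(gems_dir, filename, data)
  FileUtils.mkdir_p(gems_dir)
  target = File.join(gems_dir, filename)
  return 409 if File.exist?(target)

  # Same directory as the target, so the rename stays on one filesystem.
  temp = Tempfile.create(["upload-", ".tmp"], gems_dir)
  begin
    temp.binmode
    temp.write(data)
    temp.close
    File.rename(temp.path, target)
    201
  ensure
    # Clean up if anything failed before the rename completed.
    temp.close unless temp.closed?
    File.delete(temp.path) if File.exist?(temp.path)
  end
end
```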
|
|
50
|
-
|
|
51
|
-
After a successful write, rebuild the in-memory gem index per AR-1.1.
|
|
52
|
-
|
|
53
|
-
**Verify:** `gemkeeper sync` completes successfully; the gem appears in `gems_path/gems/` and in subsequent compact index responses.
|
|
54
|
-
|
|
55
|
-
#### FR-1.3: Conditional and range request support
|
|
56
|
-
All endpoints serving locally-generated or merged content (`/names`, `/versions`, `/info/:gemname`) MUST include:
|
|
57
|
-
|
|
58
|
-
- `ETag: "<sha256-hex>"` — SHA256 hex digest of the final response body
|
|
59
|
-
- `Repr-Digest: sha-256=:<base64-encoded-sha256>:` — RFC 9530 (the digest value is a colon-delimited byte sequence); computed from the same final body
|
|
60
|
-
- `Accept-Ranges: bytes`
|
|
61
|
-
|
|
62
|
-
Do not forward `ETag` or `Repr-Digest` headers from RubyGems.org unchanged for merged responses; recompute from the merged body.
|
|
63
|
-
|
|
64
|
-
The server MUST handle:
|
|
65
|
-
- `If-None-Match` — return 304 if the ETag matches
|
|
66
|
-
- `Range: bytes=N-` (open-ended only) — return 206 with the partial body from byte N onward
|
|
67
|
-
- `Range: bytes=N-M` or multi-range — return 416
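
The header and request rules in FR-1.3 can be sketched as one small function. The names `index_headers` and `respond` are hypothetical, the `Content-Range` header is an addition the spec does not mention, and the colon-delimited `Repr-Digest` value follows RFC 9530's byte-sequence syntax. Treating an offset at or past the end of the body as 416 is also an assumption.

```ruby
require "digest"

# Headers FR-1.3 asks for, computed from the final response body.
def index_headers(body)
  {
    "ETag" => %("#{Digest::SHA256.hexdigest(body)}"),
    "Repr-Digest" => "sha-256=:#{Digest::SHA256.base64digest(body)}:",
    "Accept-Ranges" => "bytes"
  }
end

# Returns a Rack-style [status, headers, body_parts] triple.
def respond(body, if_none_match: nil, range: nil)
  headers = index_headers(body)
  return [304, headers, []] if if_none_match == headers["ETag"]
  return [200, headers, [body]] if range.nil?

  # Only the open-ended form "bytes=N-" is accepted.
  match = /\Abytes=(\d+)-\z/.match(range)
  start = match && match[1].to_i
  return [416, headers, []] if start.nil? || start >= body.bytesize

  last = body.bytesize - 1
  partial = headers.merge("Content-Range" => "bytes #{start}-#{last}/#{body.bytesize}")
  [206, partial, [body.byteslice(start..)]]
end
```

Rack 3 expects lowercase response header names, so a real handler may need `etag`, `repr-digest`, and `accept-ranges` instead of the spellings used here.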
|
|
68
|
-
|
|
69
|
-
**Verify:** A second `bundle install` produces `304 Not Modified` responses for unchanged index files.
|
|
70
|
-
|
|
71
|
-
#### FR-1.4: Health endpoint
|
|
72
|
-
`GET /` returns `200 OK`.
|
|
73
|
-
Used by `ServerReadinessProbe` (`lib/gemkeeper/server_readiness_probe.rb`).
|
|
74
|
-
|
|
75
|
-
**Verify:** `gemkeeper server start` completes without timing out.
|
|
76
|
-
|
|
77
|
-
### Architectural Requirements
|
|
78
|
-
|
|
79
|
-
#### AR-1.1: Atomic in-memory gem index
|
|
80
|
-
The server maintains an in-memory gem index (private gem metadata read from `gems_path/gems/`).
|
|
81
|
-
After each successful upload, the index is rebuilt into a new object and swapped via a single instance variable assignment.
|
|
82
|
-
Index reads do not acquire a lock; the swap is atomic at the Ruby object reference level (copy-on-write).
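
The swap described in AR-1.1 might be sketched as below. `IndexHolder` is a hypothetical class; the builder block stands in for the directory scan. The sketch assumes that a single instance-variable assignment is observed atomically by reader threads, which holds under MRI's global VM lock.

```ruby
# Hypothetical holder for the private gem index described in AR-1.1.
class IndexHolder
  def initialize(&builder)
    @builder = builder
    @index = builder.call.freeze
  end

  # Readers take whatever snapshot is current; no lock is acquired.
  def current
    @index
  end

  # Build the replacement fully, then publish it with one assignment.
  def rebuild!
    fresh = @builder.call.freeze
    @index = fresh
  end
end
```

A request that grabbed `current` before an upload keeps working on its frozen snapshot while later requests see the new one.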
|
|
83
|
-
On startup, create `gems_path/gems/` if absent before scanning.
|
|
84
|
-
|
|
85
|
-
---
|
|
86
|
-
|
|
87
|
-
## Feature 2: Private Gem Serving
|
|
88
|
-
|
|
89
|
-
**Who & why:** The gems built by `gemkeeper sync` must appear in Bundler's dependency graph with correct version and dependency metadata.
|
|
90
|
-
Without accurate compact index data for private gems, Bundler will either fail to find them or resolve wrong versions.
|
|
91
|
-
|
|
92
|
-
### Functional Requirements
|
|
93
|
-
|
|
94
|
-
#### FR-2.1: Gem file discovery and metadata extraction
|
|
95
|
-
On startup and after each successful upload, the server scans `gems_path/gems/*.gem`.
|
|
96
|
-
For each file, gemspec metadata is extracted from the embedded `metadata.gz` using `Gem::Package` inside a rescue block.
|
|
97
|
-
Files that raise on extraction are skipped with a warning log entry; they do not abort startup.
|
|
98
|
-
Extracted metadata: gem name, version, platform, runtime dependencies (name + version constraint), SHA256 checksum of the `.gem` file.
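
A scan along these lines could be sketched as follows. `GemRecord`, `read_gem_record`, and `scan_gems` are hypothetical names; the sketch uses `Gem::Package#spec`, which reads the metadata embedded in the package, and rescues any failure so that one bad file cannot abort the scan.

```ruby
require "rubygems/package"
require "digest"

GemRecord = Struct.new(:name, :version, :platform, :dependencies, :sha256,
                       keyword_init: true)

# Reads gemspec metadata from a .gem file.
# Returns nil (after logging a warning) when the file cannot be parsed.
def read_gem_record(path, warn_io: $stderr)
  spec = Gem::Package.new(path).spec
  GemRecord.new(
    name: spec.name,
    version: spec.version.to_s,
    platform: spec.platform.to_s,
    dependencies: spec.runtime_dependencies.map { |d| [d.name, d.requirement.to_s] },
    sha256: Digest::SHA256.file(path).hexdigest
  )
rescue StandardError => e
  warn_io.puts("skipping #{path}: #{e.class}: #{e.message}")
  nil
end

# Scans gems_dir/*.gem and keeps only the files that parsed cleanly.
def scan_gems(gems_dir, warn_io: $stderr)
  Dir.glob(File.join(gems_dir, "*.gem")).sort
     .filter_map { |path| read_gem_record(path, warn_io: warn_io) }
end
```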
|
|
99
|
-
|
|
100
|
-
**Verify:** A gem uploaded after server start appears in `/names`, `/versions`, and `/info/:gemname` without restarting the server.
|
|
101
|
-
|
|
102
|
-
#### FR-2.2: Compact index data generation
|
|
103
|
-
Uses the `compact_index` gem to produce correct response bodies.
|
|
104
|
-
|
|
105
|
-
**`info_checksum` ordering** — info bodies for all private gems must be generated before the `/versions` index is built.
|
|
106
|
-
For each private gem, compute `Digest::MD5.hexdigest(CompactIndex.info(gem_versions_array))` and store it as `info_checksum` in the corresponding `CompactIndex::GemVersion`.
|
|
107
|
-
The versions index is then built referencing those pre-computed checksums.
|
|
108
|
-
Checksums are recomputed after each upload.
|
|
109
|
-
|
|
110
|
-
**Verified `compact_index` 0.15.0 API:**
|
|
111
|
-
- `CompactIndex::GemVersion` — `Struct.new(:number, :platform, :checksum, :info_checksum, :dependencies, :ruby_version, :rubygems_version)`. Field is `number`, not `version`. `checksum` is the SHA256 of the `.gem` file.
|
|
112
|
-
- `CompactIndex::Gem` — `Struct.new(:name, :versions)`.
|
|
113
|
-
- `CompactIndex::Dependency` — `Struct.new(:gem, :version, :platform, :checksum)`. The dependency gem name is field `:gem`; the constraint string is field `:version`.
|
|
114
|
-
- `info_checksum` uses MD5 (not SHA256) per the compact index protocol. Bundler verifies this checksum when it downloads `/info/:gemname`.
|
|
115
|
-
|
|
116
|
-
**Verify:** `bundle exec gem dependency <private-gem>` resolves correctly when the Gemfile sources from `http://localhost:9292`.
|
|
117
|
-
|
|
118
|
-
### Architectural Requirements
|
|
119
|
-
|
|
120
|
-
#### AR-2.1: `compact_index` and `rubygems-generate_index` dependency swap
|
|
121
|
-
`gemkeeper.gemspec` drops `geminabox ~> 3.0` and `rubygems-generate_index ~> 1.0`, and adds `compact_index ~> 0.15`.
|
|
122
|
-
No other runtime dependencies are added for this feature.
|
|
123
|
-
|
|
124
|
-
---
|
|
125
|
-
|
|
126
|
-
## Feature 3: Public Gem Proxy with Offline Cache
|
|
127
|
-
|
|
128
|
-
**Who & why:** The Gemfile sources all gems — public and private — from `http://localhost:9292`.
|
|
129
|
-
Public gem resolution must work when online (proxying to RubyGems.org) and degrade gracefully when offline rather than returning 500 errors.
|
|
130
|
-
When offline, gems already installed on the developer's system should be servable directly, avoiding re-download on reconnect.
|
|
131
|
-
|
|
132
|
-
### Functional Requirements
|
|
133
|
-
|
|
134
|
-
#### FR-3.1: Merge algorithm for `/versions` and `/info/:gemname`
|
|
135
|
-
|
|
136
|
-
**`/versions`:**
|
|
137
|
-
Fetch `https://rubygems.org/versions` and cache the raw response body to `cache_dir/rubygems_cache/versions` (refreshed when the upstream ETag changes or after 30 minutes).
|
|
138
|
-
Construct a `CompactIndex::VersionsFile` from that cached file.
|
|
139
|
-
Pass private gem objects as `extra_gems` to `CompactIndex.versions(versions_file, extra_gems)` so they are appended after the upstream public block.
|
|
140
|
-
When a private gem name collides with a public gem name, the private gem entry takes precedence: suppress the public entry for that name from the merged output so Bundler cannot resolve the public version.
|
|
141
|
-
The public block is never reordered; private entries are appended. This keeps byte offsets of the public block stable across private gem additions, preserving Bundler's incremental range fetching.
|
|
142
|
-
Write the merged response to `cache_dir/rubygems_cache/versions.merged` and keep only the merged body's SHA256 hex digest in memory as the current ETag.
|
|
143
|
-
Regenerate `versions.merged` on each upload or when the upstream ETag changes.
|
|
144
|
-
Serve `/versions` by streaming `versions.merged` from disk; the OS file cache handles hot reads without holding the full body in memory.
|
|
145
|
-
|
|
146
|
-
**`/info/:gemname` for private gems:** generate from local metadata (FR-2.2).
|
|
147
|
-
**`/info/:gemname` for public gems:** proxy `https://rubygems.org/info/:gemname` and cache per FR-3.2.
|
|
148
|
-
|
|
149
|
-
**Verify:** `bundle install` resolves a public gem (e.g., `rake`) and a private gem through the same local source with no errors.
|
|
150
|
-
|
|
151
|
-
#### FR-3.2: Cache proxy responses for offline use
|
|
152
|
-
Cache proxied compact index responses under `cache_dir/rubygems_cache/`:
|
|
153
|
-
|
|
154
|
-
- `/versions` raw upstream body: refreshed when upstream ETag changes or after 30 minutes. Use a conditional GET (`If-None-Match`) to upstream; a 304 resets the local TTL without rewriting the file.
|
|
155
|
-
- `/info/:gemname` per-gem: cached per gem name; refreshed after 60 minutes using the same conditional GET pattern.
|
|
156
|
-
- `.gem` binaries: cached permanently (content-addressed; gem files are immutable once published).
|
|
157
|
-
|
|
158
|
-
Cache files are written atomically (temp file + rename).
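
The TTL and sidecar bookkeeping could be sketched as below. All helper names are hypothetical, and storing the sidecar as JSON with `etag` and `fetched_at` keys is an assumption; the spec only says the `.meta` file holds the upstream ETag and a fetch timestamp.

```ruby
require "json"
require "fileutils"
require "time"

# Write content via a temp file in the same directory, then rename.
def atomic_write(path, content)
  FileUtils.mkdir_p(File.dirname(path))
  temp = "#{path}.#{Process.pid}.tmp"
  File.binwrite(temp, content)
  File.rename(temp, path)
ensure
  File.delete(temp) if temp && File.exist?(temp)
end

# Store the upstream body plus a sidecar holding its ETag and fetch time.
def write_cache(path, body, etag:, now: Time.now)
  atomic_write(path, body)
  atomic_write("#{path}.meta",
               JSON.generate("etag" => etag, "fetched_at" => now.utc.iso8601))
end

# A 304 from upstream only refreshes the timestamp; the body is untouched.
def touch_cache(path, now: Time.now)
  meta = JSON.parse(File.read("#{path}.meta"))
  atomic_write("#{path}.meta",
               JSON.generate(meta.merge("fetched_at" => now.utc.iso8601)))
end

# True while the cached entry is younger than ttl_seconds.
def fresh?(path, ttl_seconds, now: Time.now)
  meta = JSON.parse(File.read("#{path}.meta"))
  now - Time.iso8601(meta["fetched_at"]) < ttl_seconds
rescue StandardError
  false # missing or unreadable sidecar counts as stale
end
```

With these helpers, `/versions` would use `fresh?(path, 30 * 60)` and `/info/:gemname` would use `fresh?(path, 60 * 60)`.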
|
|
159
|
-
|
|
160
|
-
When RubyGems.org is unreachable (connection error or timeout) and a cached copy exists, serve from cache.
|
|
161
|
-
|
|
162
|
-
**Verify:** After a successful `bundle install` online, disconnecting from the network and running `bundle install` again completes using only cached data.
|
|
163
|
-
|
|
164
|
-
#### FR-3.3: System gem cache fallback for `.gem` files
|
|
165
|
-
Before proxying `GET /gems/:filename.gem` to RubyGems.org, check each path in `Gem.path.map { |p| File.join(p, "cache", filename) }` using `File.exist?` before attempting to read.
|
|
166
|
-
If a matching readable file is found, serve it directly without a network request.
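
The lookup reads naturally as a short search over `Gem.path`. In this sketch `find_in_system_cache` is a hypothetical name, and the `search_roots` parameter exists only so the function can be exercised without touching the real gem directories.

```ruby
# Returns the first readable copy of filename in the RubyGems cache
# directories, or nil. search_roots defaults to Gem.path as in FR-3.3.
def find_in_system_cache(filename, search_roots: Gem.path)
  search_roots
    .map { |root| File.join(root, "cache", filename) }
    .find { |path| File.file?(path) && File.readable?(path) }
end
```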
|
|
167
|
-
|
|
168
|
-
**Verify:** A `.gem` file present in the system gem cache is served without an outbound RubyGems.org request.
|
|
169
|
-
|
|
170
|
-
#### FR-3.4: Response semantics for missing or unreachable upstream
|
|
171
|
-
Distinguish three cases for upstream gem requests:
|
|
172
|
-
|
|
173
|
-
- **Upstream reachable, gem not found** (upstream returns 4xx): return 404 with no body.
|
|
174
|
-
- **Upstream unreachable** (connection error, timeout) **+ cache exists**: serve from cache.
|
|
175
|
-
- **Upstream unreachable + no cache**: return 503 with body `"Upstream unavailable and no local cache. Connect to the internet and run bundle install to warm the cache."`.
|
|
176
|
-
|
|
177
|
-
Do not return 500 in any of these cases.
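
The three cases reduce to a small decision function. In this sketch the symbols `:ok`, `:not_found`, and `:unreachable` and the method name `upstream_response` are hypothetical; the 503 body is the string given above.

```ruby
UNAVAILABLE_BODY = "Upstream unavailable and no local cache. " \
                   "Connect to the internet and run bundle install to warm the cache."

# upstream is :ok, :not_found (reachable, 4xx), or :unreachable.
# Returns [status, body]; never produces a 500.
def upstream_response(upstream, cached_body: nil, fresh_body: nil)
  case upstream
  when :ok          then [200, fresh_body]
  when :not_found   then [404, ""]
  when :unreachable then cached_body ? [200, cached_body] : [503, UNAVAILABLE_BODY]
  else raise ArgumentError, "unknown upstream state: #{upstream.inspect}"
  end
end
```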
|
|
178
|
-
|
|
179
|
-
**Verify:** With RubyGems.org blocked and no cache, a request to `/info/nonexistent-gem` returns 503, not 500.
|
|
180
|
-
|
|
181
|
-
#### FR-3.5: `/names` endpoint
|
|
182
|
-
Fetch `https://rubygems.org/names` and cache the raw response body under `cache_dir/rubygems_cache/names` with the same 60-minute TTL and conditional GET refresh as `/info/:gemname`.
|
|
183
|
-
Merge local private gem names with the cached upstream names; sort the combined list alphabetically.
|
|
184
|
-
Write the merged result to `cache_dir/rubygems_cache/names.merged`; keep only its SHA256 hex digest in memory as the current ETag.
|
|
185
|
-
Regenerate `names.merged` on each upload or when the upstream names ETag changes.
|
|
186
|
-
Serve `/names` by streaming `names.merged` from disk.
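
The merge step might be sketched as follows. `merge_names` is a hypothetical helper, and the handling of a leading `---` line is an assumption about the upstream body format rather than something this spec states.

```ruby
require "digest"

# Merge private names into an upstream /names body.
# Returns [merged_body, sha256_hex_etag].
def merge_names(upstream_body, private_names)
  lines = upstream_body.lines(chomp: true)
  # Assumption: keep a leading "---" header line at the top if present.
  header = lines.first == "---" ? [lines.shift] : []
  merged = (lines + private_names).reject(&:empty?).uniq.sort
  body = (header + merged).join("\n") + "\n"
  [body, Digest::SHA256.hexdigest(body)]
end
```

The returned body would be written to `names.merged` and only the digest kept in memory.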
|
|
187
|
-
|
|
188
|
-
**Verify:** `/names` includes both a known private gem name and a known public gem name.
|
|
189
|
-
|
|
190
|
-
### Architectural Requirements
|
|
191
|
-
|
|
192
|
-
#### AR-3.1: HTTP client for proxying
|
|
193
|
-
Use `Net::HTTP` (stdlib) for all outbound RubyGems.org requests.
|
|
194
|
-
Do not add `faraday`, `httpclient`, or other HTTP client gems for proxy use.
|
|
195
|
-
|
|
196
|
-
#### AR-3.2: Proxy timeout
|
|
197
|
-
Outbound requests use a 5-second open timeout and a 10-second read timeout.
|
|
198
|
-
Timeout errors are treated as unreachable (see FR-3.4).
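
A `Net::HTTP` client with these timeouts, combined with the conditional GET from FR-3.2, could be sketched as below. The helper names and result symbols are hypothetical, and mapping a 5xx reply to `:unreachable` is an assumption the spec does not cover.

```ruby
require "net/http"
require "uri"

UPSTREAM = URI("https://rubygems.org")

# Client configured per AR-3.2; nothing connects until a request is made.
def upstream_client
  http = Net::HTTP.new(UPSTREAM.host, UPSTREAM.port)
  http.use_ssl = true
  http.open_timeout = 5
  http.read_timeout = 10
  http
end

# GET request that carries If-None-Match when a cached ETag is known.
def conditional_get(path, etag: nil)
  request = Net::HTTP::Get.new(path)
  request["If-None-Match"] = etag if etag
  request
end

# Returns [:ok, response], [:not_modified, nil], [:not_found, nil],
# or [:unreachable, nil].
def fetch(path, etag: nil, client: upstream_client)
  response = client.request(conditional_get(path, etag: etag))
  case response
  when Net::HTTPNotModified then [:not_modified, nil]
  when Net::HTTPSuccess     then [:ok, response]
  when Net::HTTPClientError then [:not_found, nil]
  else [:unreachable, nil] # assumption: 5xx is handled like an outage
  end
rescue Net::OpenTimeout, Net::ReadTimeout, SocketError, SystemCallError,
       OpenSSL::SSL::SSLError
  [:unreachable, nil]
end
```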
|
|
199
|
-
|
|
200
|
-
#### AR-3.3: Gem binary streaming
|
|
201
|
-
Proxy and cache responses for `.gem` file downloads using bounded buffering or streaming rather than reading the full binary into memory before sending.
|
|
202
|
-
RubyGems.org gem files range from a few KB to tens of MB.
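
For files already on disk, bounded buffering can be as simple as a Rack body whose `each` yields fixed-size chunks. `ChunkedFileBody` is a hypothetical class and the 64 KB default is an arbitrary choice; proxying from upstream would need the equivalent on the `Net::HTTP` side.

```ruby
# Rack-compatible body that yields a file in fixed-size chunks,
# so at most chunk_size bytes are held in memory at a time.
class ChunkedFileBody
  def initialize(path, chunk_size: 64 * 1024)
    @path = path
    @chunk_size = chunk_size
  end

  def each
    File.open(@path, "rb") do |file|
      # IO#read with a length returns nil at end of file.
      while (chunk = file.read(@chunk_size))
        yield chunk
      end
    end
  end
end
```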
|
|
203
|
-
|
|
204
|
-
---
|
|
205
|
-
|
|
206
|
-
## Feature 4: Gemkeeper Integration
|
|
207
|
-
|
|
208
|
-
**Who & why:** The server is an implementation detail inside gemkeeper.
|
|
209
|
-
All existing CLI commands, upload flow, list command, server lifecycle, and mirror configuration must continue to work without changes to their respective classes.
|
|
210
|
-
|
|
211
|
-
### Functional Requirements
|
|
212
|
-
|
|
213
|
-
#### FR-4.1: config.ru generation
|
|
214
|
-
`RackupProcess#config_ru_content` (`lib/gemkeeper/rackup_process.rb`) is updated to generate a config.ru that requires `Gemkeeper::CompactIndexServer` and mounts it, passing `gems_path` and `cache_dir`.
|
|
215
|
-
All Geminabox configuration is removed.
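
The generated file might look like the output of the sketch below. This is a stand-in for `RackupProcess#config_ru_content`, not its actual implementation, and the `gems_path:` and `cache_dir:` constructor keywords are assumptions about how `Gemkeeper::CompactIndexServer` receives its settings.

```ruby
# Hypothetical stand-in for RackupProcess#config_ru_content.
# Returns the text of a config.ru that mounts the compact index server.
def config_ru_content(gems_path:, cache_dir:)
  <<~RUBY
    # frozen_string_literal: true

    require "gemkeeper/compact_index_server"

    run Gemkeeper::CompactIndexServer.new(
      gems_path: #{gems_path.inspect},
      cache_dir: #{cache_dir.inspect}
    )
  RUBY
end
```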
|
|
216
|
-
|
|
217
|
-
**Verify:** The generated `config.ru` contains no references to `Geminabox`; the server starts and responds normally.
|
|
218
|
-
|
|
219
|
-
#### FR-4.2: Upload API compatibility — no changes to `GemUploader`
|
|
220
|
-
`lib/gemkeeper/gem_uploader.rb` is unchanged.
|
|
221
|
-
The server's `POST /upload` endpoint accepts the same multipart payload and returns status codes compatible with `GemUploader#handle_response`: 200, 201, or 302 for success; 409 for conflict.
|
|
222
|
-
Return 201 for a new upload; 409 if the gem already exists.
|
|
223
|
-
|
|
224
|
-
**Verify:** `gemkeeper sync` uploads gems without error; a second sync of the same version produces a skip (409 → already-exists path).
|
|
225
|
-
|
|
226
|
-
#### FR-4.3: List command compatibility — no changes to list
|
|
227
|
-
`gemkeeper list` reads `Dir.glob(File.join(gems_path, "gems", "*.gem"))` directly from the filesystem.
|
|
228
|
-
The custom server stores uploaded gems at `gems_path/gems/` matching current structure.
|
|
229
|
-
|
|
230
|
-
**Verify:** `gemkeeper list` output is unchanged after migration.
|
|
231
|
-
|
|
232
|
-
#### FR-4.4: Server lifecycle — no changes to `ServerManager`, `ServerReadinessProbe`, `BundlerMirrorConfigurator`
|
|
233
|
-
These classes are Rack-server-agnostic and require no modifications.
|
|
234
|
-
|
|
235
|
-
**Verify:** `gemkeeper server start`, `gemkeeper server stop`, and `gemkeeper server status` all behave identically before and after migration.
|
|
236
|
-
|
|
237
|
-
### Architectural Requirements
|
|
238
|
-
|
|
239
|
-
#### AR-4.1: New server class location
|
|
240
|
-
`Gemkeeper::CompactIndexServer` is implemented in `lib/gemkeeper/compact_index_server.rb` as a Rack application (responds to `call(env)`).
|
|
241
|
-
It is instantiated and `run` in the generated `config.ru`.
|
|
242
|
-
It is not required anywhere else in the gemkeeper library.
|
|
243
|
-
|
|
244
|
-
---
|
|
245
|
-
|
|
246
|
-
## Data Requirements
|
|
247
|
-
|
|
248
|
-
The `rubygems_cache/` directory layout under `cache_dir`:
|
|
249
|
-
|
|
250
|
-
```
cache_dir/
  rubygems_cache/
    versions                 # raw upstream /versions body
    versions.merged          # merged upstream + private gems (served to Bundler)
    versions.meta            # sidecar: upstream ETag + fetched_at timestamp
    names                    # raw upstream /names body
    names.merged             # merged upstream + private gem names (served to Bundler)
    names.meta               # sidecar: upstream ETag + fetched_at timestamp
    info/
      <gemname>              # raw upstream /info/:gemname body
      <gemname>.meta         # sidecar: upstream ETag + fetched_at timestamp
    gems/
      <name>-<version>.gem   # cached gem binaries (permanent)
```
|
|
265
|
-
|
|
266
|
-
Sidecar `.meta` files are written atomically alongside the body file.
|
|
267
|
-
|
|
268
|
-
---
|
|
269
|
-
|
|
270
|
-
## Integration Points
|
|
271
|
-
|
|
272
|
-
| File | Change |
| ---- | ------ |
| `lib/gemkeeper/rackup_process.rb` | Replace `config_ru_content` |
| `lib/gemkeeper/compact_index_server.rb` | New file — the Rack app |
| `gemkeeper.gemspec` | Remove `geminabox ~> 3.0` and `rubygems-generate_index ~> 1.0`; add `compact_index ~> 0.15` |
| `test/integration/test_server_lifecycle_integration.rb` | Update config.ru content assertions (lines 80–81) |
| `CLAUDE.md` | Update Architecture section; remove Geminabox references |
| `AGENTS.md` | Same updates as CLAUDE.md |
|
|
280
|
-
|
|
281
|
-
### Known dead code after migration
|
|
282
|
-
`GemUploader#list_gems` calls `GET /api/v1/gems.json`, a Geminabox-specific endpoint the new server does not implement.
|
|
283
|
-
The method is unused in production (the list CLI reads the filesystem directly).
|
|
284
|
-
Remove or raise `NotImplementedError` — do not leave a silently broken public method.
|
|
285
|
-
|
|
286
|
-
## Related Specs
|
|
287
|
-
|
|
288
|
-
None — this is a standalone infrastructure replacement.
|
|
289
|
-
|
|
290
|
-
## Constraints
|
|
291
|
-
|
|
292
|
-
- No changes to `gem_uploader.rb`, `server_manager.rb`, `server_readiness_probe.rb`, `bundler_mirror_configurator.rb`, or `configuration.rb`
|
|
293
|
-
- No new runtime dependencies beyond `compact_index ~> 0.15`
|
|
294
|
-
- `POST /upload` multipart API must remain compatible with `GemUploader`
|
|
295
|
-
- Gem storage path (`gems_path/gems/*.gem`) must remain unchanged so `gemkeeper list` is unaffected
|
|
296
|
-
|
|
297
|
-
## Out of Scope
|
|
298
|
-
|
|
299
|
-
- Authentication for uploads or downloads
|
|
300
|
-
- HTTPS/TLS
|
|
301
|
-
- Yanking gems
|
|
302
|
-
- Proxying sources other than rubygems.org
|
|
303
|
-
- The `gemkeeper manifest`, `gemkeeper setup`, or `gemkeeper sync` internals
|
|
304
|
-
- Serving legacy index formats (`Marshal.4.8.gz`, `specs.4.8.gz`)
|
|
305
|
-
- `GET /api/v1/gems.json` (Geminabox-specific; unused in production after `list_gems` removal)
|
|
306
|
-
|
|
307
|
-
## Spec Completeness Checklist
|
|
308
|
-
|
|
309
|
-
- [x] **Scope & acceptance criteria** — each FR has a Verify line; Out of Scope list is explicit; blocking ambiguities from critique resolved
|
|
310
|
-
- [x] **Testing strategy** — FRs reference existing tests (FR-4.2, FR-4.4); integration verify conditions cover server start, upload round-trip, offline cache, and Bundler resolution; `test_compact_index_server.rb` implied by AR-4.1 convention (one test file per class)
|
|
311
|
-
- [x] **Existing patterns** — references `GemUploader`, `ServerReadinessProbe`, `ServerManager`, `Dir.glob` list pattern, `Gem::Package` extraction, and existing storage path conventions throughout
|
|
312
|
-
- [x] **Dependencies** — `compact_index ~> 0.15` justified in AR-2.1; `rubygems-generate_index` removal explicit; `Net::HTTP` (stdlib) chosen in AR-3.1; no other additions
|
|
313
|
-
- [x] **Architecture & interfaces** — Rack app interface in AR-4.1; storage layout in Data Requirements; proxy HTTP client in AR-3.1/AR-3.2; config.ru generation in FR-4.1; cache layout in Data Requirements; atomic upload in FR-1.2; atomic index swap in AR-1.1
|
|
314
|
-
- [x] **Error handling & failure modes** — FR-3.4 distinguishes upstream 404 vs 503; FR-1.2 covers malformed upload and 422; FR-2.1 covers corrupt gem at startup; AR-1.1 covers concurrent read/write; FR-1.3 covers invalid range (416)
|
|
315
|
-
- [x] **Security review** — FR-1.1 adds path parameter validation (`/\A[a-zA-Z0-9._-]+\z/`) and path-under-gems_path assertion; FR-1.2 prohibits gemspec eval; AR-3.1 scopes proxy to rubygems.org only; localhost-only binding inherited from `RackupProcess`
|
|
316
|
-
- [x] **Performance impact** — merged `/versions` (~23 MB) and `/names` (~2.7 MB) written to disk and streamed; only SHA256 ETag strings held in memory (FR-3.1, FR-3.5); gem binary proxy streamed per AR-3.3; private gem index is small and negligible
|
|
317
|
-
- [x] **Rollout & migration** — drop-in replacement; no data migration; existing `gems_path/gems/` reused; Homebrew formula rebuild required; `list_gems` dead method called out explicitly
|
|
318
|
-
- [x] **Assumptions & risks** — `compact_index` 0.15.x field names flagged for pre-implementation verification (FR-2.2); Bundler `Range`/`Repr-Digest` strictness addressed in FR-1.3; `/versions` byte-stability addressed in FR-3.1
|
|
319
|
-
|
|
320
|
-
---
|
|
321
|
-
|
|
322
|
-
## Change Log
|
|
323
|
-
|
|
324
|
-
### Update from `critique-consolidated-v-1.md`
|
|
325
|
-
|
|
326
|
-
**Applied:**
|
|
327
|
-
- B-1: Specified `/versions` merge algorithm — upstream verbatim block first via `VersionsFile`, private gems as `extra_gems`, collision = suppress public entry, byte-stable layout (FR-3.1)
|
|
328
|
-
- B-2: Specified ETag and `Repr-Digest` computed from merged body; SHA256 only; no forwarding of upstream headers for merged responses (FR-1.3)
|
|
329
|
-
- B-3: Added `info_checksum` generation ordering — info bodies first, checksums embedded before versions index is built (FR-2.2)
|
|
330
|
-
- B-4: Added `/names` as a full fetch/cache/merge endpoint matching `/versions` semantics (FR-3.5)
|
|
331
|
-
- G-1: Added URL parameter validation (`/\A[a-zA-Z0-9._-]+\z/`) and path-containment assertion to FR-1.1
|
|
332
|
-
- G-2: Defined "valid gem" as tar-parseable with extractable metadata; no eval (FR-1.2)
|
|
333
|
-
- G-3: Added `mkdir_p`, atomic temp-file write, and temp cleanup to FR-1.2
|
|
334
|
-
- G-4: Added AR-1.1 specifying atomic index swap (copy-on-write) for thread safety
|
|
335
|
-
- G-5: Replaced FR-3.4 with three-way distinction: upstream 404 → 404; unreachable + cache → serve cache; unreachable + no cache → 503
|
|
336
|
-
- G-6: Added `GemUploader#list_gems` dead-method callout to Integration Points; `/api/v1/gems.json` added to Out of Scope
|
|
337
|
-
- G-7: Added `rubygems-generate_index` to AR-2.1 as dependency to remove; added to Integration Points table
|
|
338
|
-
- Corrected FR-4.2 response codes to match actual `GemUploader#handle_response`: 200/201/302 success, 409 conflict
|
|
339
|
-
- Added AR-3.3 requiring gem binary streaming to avoid full-file memory allocation
|
|
340
|
-
- Added Data Requirements section with `rubygems_cache/` directory layout and sidecar metadata files
|
|
341
|
-
- Added AGENTS.md to Integration Points (both CLAUDE.md and AGENTS.md exist in repo)
|
|
342
|
-
- Clarified FR-3.2 cache write atomicity and conditional GET (If-None-Match) refresh behavior
|
|
343
|
-
|
|
344
|
-
### Pre-implementation compact_index API verification
|
|
345
|
-
|
|
346
|
-
**Applied:**
|
|
347
|
-
- Corrected `info_checksum` hash algorithm from SHA256 to MD5 — the protocol spec and `compact_index` gem both use `Digest::MD5` for this field; Bundler verifies it on download
|
|
348
|
-
- Confirmed `GemVersion` field is `number` (not `version`); documented full struct signature
|
|
349
|
-
- Confirmed `Dependency` fields: `:gem` for the dep name, `:version` for the constraint
|
|
350
|
-
- Confirmed collision suppression works via last-wins semantics — `VersionsFile#contents` appends `extra_gems` verbatim; no pre-filtering of upstream file needed
|
|
351
|
-
- Improved FR-3.4 503 body to include actionable guidance for cold-start offline case
|
|
352
|
-
|
|
353
|
-
**Rejected:**
|
|
354
|
-
- "Set Puma thread count to 1" — over-specifies implementation; AR-1.1's atomic swap is the correct architectural constraint
|
|
355
|
-
- "Add `test/gemkeeper/test_compact_index_server.rb` as an explicit FR" — the one-test-file-per-class convention is already established in the project; calling it out in the spec over-specifies test structure
|
|
356
|
-
|
|
357
|
-
**Reorganized:**
|
|
358
|
-
- Split old FR-3.1 into FR-3.1 (merge algorithm) and FR-3.5 (/names endpoint) for clarity
|
|
359
|
-
- Moved `gems_path/gems/` creation from an implicit assumption into FR-1.2 and AR-1.1 explicitly
|
|
360
|
-
- Added Data Requirements section to centralize the cache directory layout (previously scattered across FR-3.1 and FR-3.2)
|
|
@@ -1,95 +0,0 @@
|
|
|
1
|
-
# Spec 20260529-131354: Consolidated Critique (v1)
|
|
2
|
-
|
|
3
|
-
## Overview
|
|
4
|
-
|
|
5
|
-
**Critiques received from:** Claude, Codex, Copilot (claude-sonnet-4.6)
|
|
6
|
-
**Critiques missing:** Gemini (not installed; Copilot used as the third critic)
|
|
7
|
-
|
|
8
|
-
## Executive Summary
|
|
9
|
-
|
|
10
|
-
All three critics agree the spec targets the right bug with the right shape (server-authoritative skip + artifact re-upload, uploader-seam placement, both deployment modes preserved). But they converge on one finding that **undercuts the chosen mechanism**: `GET /info/<name>` is *not* a private-store-authoritative signal. `CompactIndexServer#serve_info` falls back to `serve_upstream_info` (rubygems.org) when the gem isn't in the private `GemIndex` (`compact_index_server.rb:74`). That breaks the presence check two ways:
|
|
11
|
-
|
|
12
|
-
1. **Public name-collision false positive** — a public gem sharing the private gem's name/version makes `/info` return 200, so `sync` wrongly skips while the private store is still missing the artifact (Codex risk 1, Claude point 2).
|
|
13
|
-
2. **Offline failure / slow recovery** — a missing private gem triggers an *upstream* probe with 5s/10s timeouts, returning **503** (not 404) when offline. The spec only treats 404 as "not present," and 503 is exactly the recovery scenario the spec exists to fix (Copilot MR-2, Codex risk 2).
|
|
14
|
-
|
|
15
|
-
This means **Q1 (the cache-check mechanism) needs to be reconsidered**: "reuse `/info`, no new server surface" is not actually authoritative. The fix is either a private-store-only signal (small dedicated endpoint, or a flag/header on the existing one) or a checksum comparison. The dedicated-endpoint option I originally dismissed is now the better-justified path.
|
|
16
|
-
|
|
17
|
-
Beyond that, the strongest convergent gaps are the `version: latest` contradiction, platform filenames, the `build_and_upload` decomposition, and HTTP-status handling.
|
|
18
|
-
|
|
19
|
-
## Consolidated Requirements Feedback
|
|
20
|
-
|
|
21
|
-
### A. `/info` is not private-authoritative (mechanism flaw) — HIGHEST PRIORITY
|
|
22
|
-
**Issue:** The presence check must reflect only the private uploaded store, never the upstream-proxied merge.
|
|
23
|
-
**Agreement:** All three. Codex and Copilot independently trace the `serve_upstream_info` fallback; Claude flagged the public-name-collision edge.
|
|
24
|
-
**Divergence:** Mechanism. Codex: add a private presence contract *or* a checksum-based rule. Copilot: at minimum treat 503 as not-present. Claude: pin an AR that presence is read only from the private index.
|
|
25
|
-
**Recommendation:** Reverse the Q1 decision toward a **private-store-only presence signal**. Cleanest: a tiny read-only endpoint that consults `GemIndex` only (e.g. `GET /gemkeeper/has/<name>/<version>` → 200/404, or `HEAD /gems/<file>` wired to the private store), returning unambiguous present/absent with no upstream probe. This also removes the private-name leak to rubygems.org and the offline-timeout problem in one move. If avoiding a new endpoint is still preferred, define a checksum rule and explicit 503-means-not-present handling — but the endpoint is simpler and more correct.
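
The recommended private-only signal could be as small as the sketch below. `presence_status` is a hypothetical name, and the plain hash stands in for whatever lookup `GemIndex` actually exposes. The point is that the answer is derived from local state alone, with no upstream probe.

```ruby
SAFE_SEGMENT = /\A[a-zA-Z0-9._-]+\z/

# Hypothetical private-store-only presence check; never contacts upstream.
# private_index maps gem name => array of version strings.
def presence_status(private_index, name, version)
  valid = [name, version].all? { |v| v.is_a?(String) && v.match?(SAFE_SEGMENT) }
  return 400 unless valid

  private_index.fetch(name, []).include?(version) ? 200 : 404
end
```

A public gem that shares the name never reaches this lookup, so a name collision cannot produce a false 200, and an offline server still answers immediately.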
|
|
26
|
-
|
|
27
|
-
### B. `version: latest` contradicts "never re-clone"
|
|
28
|
-
**Issue:** Goals promise recovery never re-clones, but `latest` resolves its version only post-checkout, so it must clone first.
|
|
29
|
-
**Agreement:** All three (Claude point 4, Codex risk 3, Copilot MR-1/PI-1).
|
|
30
|
-
**Recommendation:** Pick one and write it down: (a) scope the no-reclone guarantee to pinned + `from_lockfile` versions, and state `latest` keeps today's always-fetch behavior; or (b) for `latest` with a local artifact, read the version from the artifact via `Gem::Package.new(path).spec.version` to avoid cloning. (a) is the smaller, safer change and matches today's `!gem_def.latest?` cache bypass; recommend (a) unless offline `latest` recovery is a stated requirement.
|
|
31
|
-
|
|
32
|
-
### C. Platform filenames
|
|
33
|
-
**Issue:** Presence is framed as `(name, version)`, but artifacts/served files can be `<name>-<version>-<platform>.gem` (`SpecMapper.filename`).
|
|
34
|
-
**Agreement:** Claude point 1, Codex risk 4.
|
|
35
|
-
**Recommendation:** Either declare private gems pure-Ruby (filename `<name>-<version>.gem`) as an explicit assumption, or make presence/artifact lookup operate on the exact filename including platform. Given these are internal source-built gems, the pure-Ruby assumption is likely fine — but it must be stated.
|
|
36
|
-
|
|
37
|
-
### D. Artifact integrity before re-upload
|
|
38
|
-
**Issue:** FR-1.3 re-uploads a local `.gem` without validating it; a partial/corrupt artifact (interrupted build) would fail server-side in `GemIndex#add` (`Gem::Package.new(...).spec`).
|
|
39
|
-
**Agreement:** Codex (missing req), Copilot EH-3.
|
|
40
|
-
**Recommendation:** Require a pre-upload integrity/identity check — parse the artifact's spec and confirm name+version match the requested gem before declaring success — or explicitly scope corrupt-artifact handling out. Recommend the lightweight check; it also closes Codex's "upload the wrong thing" concern.
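The lightweight check could be sketched as follows, assuming a hypothetical helper named `artifact_matches?`. It relies on `Gem::Package.new(path).spec`, the same call the critique cites for `GemIndex#add`. The `read_spec` parameter is an illustration-only seam so the check can be exercised without a real `.gem` file.

```ruby
require "rubygems/package"

# Hypothetical helper: true only when the artifact parses as a gem and its
# embedded spec matches the requested name and version.
def artifact_matches?(path, name, version,
                      read_spec: ->(p) { Gem::Package.new(p).spec })
  spec = read_spec.call(path)
  spec.name == name && spec.version == Gem::Version.new(version)
rescue Gem::Package::Error, SystemCallError
  # A partial or corrupt artifact (e.g. from an interrupted build) counts as
  # a mismatch, so the caller can fall back to building from source.
  # Other error classes may also need handling in practice.
  false
end
```

Returning `false` rather than raising keeps the decision with the caller, which can then rebuild instead of uploading a bad file.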
|
|
41
|
-
|
|
42
|
-
### E. `build_and_upload` decomposition unspecified
|
|
43
|
-
**Issue:** FR-1.3 needs an upload-without-build path, but `GemSyncer#build_and_upload` does both unconditionally and the spec doesn't name the structural split.
|
|
44
|
-
**Agreement:** Copilot MR-3, Codex "refactoring GemSyncer ordering."
|
|
45
|
-
**Recommendation:** Add an AR naming the flow: defer repo/manifest resolution until after the server-missing check; separate "upload existing artifact" from "build then upload." This also matters because today `sync` resolves the repo *before* the cache check (`gem_syncer.rb:21`) — ordering must change.
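One way to picture the reordered flow is the sketch below. The method name `sync_one` and its keyword arguments are hypothetical, not `GemSyncer`'s actual interface. The `build` callable stands for everything that needs a repo (resolution, clone, build), so it runs only when neither the server nor a local artifact can satisfy the request.

```ruby
# Hypothetical per-gem flow. `upload` and `build` are callables supplied by
# the caller; `build` returns the path of the freshly built artifact.
def sync_one(server_has_version:, artifact_path:, upload:, build:)
  # 1. Server already has it: nothing to do, no repo access needed.
  return :skipped if server_has_version

  # 2. Reuse the existing artifact when there is one. Otherwise fall through
  #    to the build, which is the only step that touches the repo.
  path = artifact_path || build.call
  upload.call(path)

  # Re-upload and fresh build both count as :synced in this sketch.
  :synced
end
```

Counting both upload paths as `:synced` is one of the two options raised under F. A distinct symbol would work equally well with this structure.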
|
|
46
|
-
|
|
47
|
-
### F. HTTP status handling + counting/messaging
|
|
48
|
-
**Issue:** AR-1.4 covers unreachable + malformed but not the full status matrix (400/500/503), and the CLI summary only knows `:synced`/`:skipped` while the spec introduces a third outcome.
|
|
49
|
-
**Agreement:** All three.
|
|
50
|
-
**Recommendation:** Add a status table: 200→inspect, 404 (and 503-from-upstream-miss, if `/info` is retained)→not present, 400→programming error (raise, not "absent"), connection failure→`ServerNotReachableError`. Define whether artifact re-upload counts as `:synced` (recommended; differentiate via output text only) or a new symbol that `run_sync`/`report_results` must learn.
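The status table could be expressed roughly as below. `presence_outcome` is a hypothetical name, and `ServerNotReachableError` is redefined locally only to keep the sketch self-contained. The recommendation does not say how statuses outside the table (such as 500) should be handled, so raising `ServerNotReachableError` for them is an assumption.

```ruby
# Named in the spec; redefined here only so the sketch runs on its own.
class ServerNotReachableError < StandardError; end

# Hypothetical mapping from a presence-check HTTP status to an outcome.
def presence_outcome(status)
  case status
  when 200
    :inspect_body
  when 404, 503
    # 503 is only relevant if /info (with its upstream fallback) is retained.
    :not_present
  when 400
    # The server rejected the name: a bug in the caller, not an absent gem.
    raise ArgumentError, "server rejected the gem name (programming error)"
  else
    # Assumption: anything not listed is treated like an unreachable server.
    raise ServerNotReachableError, "unexpected status #{status}"
  end
end
```

Connection failures never produce a status at all, so they would be translated to `ServerNotReachableError` at the HTTP client layer rather than in this mapping.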
|
|
51
|
-
|
|
52
|
-
### G. Faraday connection reuse for GET
|
|
53
|
-
**Issue:** `GemUploader#connection` carries `:multipart`/`:url_encoded` middleware meant for `POST /upload`; reusing it for a GET is wasteful and theoretically fragile.
|
|
54
|
-
**Agreement:** Copilot CA-1.
|
|
55
|
-
**Recommendation:** Acceptable to reuse (note it), or use a plain connection for read requests. Low priority.
|
|
56
|
-
|
|
57
|
-
### H. `list_gems` vs `has_version?` ambiguity
|
|
58
|
-
**Issue:** AR-1.1's "e.g. `has_version?` / replacing the `list_gems` stub" conflates two different contracts.
|
|
59
|
-
**Agreement:** Copilot AM-1.
|
|
60
|
-
**Recommendation:** State explicitly: add `has_version?(name, version)` (or the chosen private-presence call); leave or remove `list_gems` deliberately, not as a side effect.
|
|
61
|
-
|
|
62
|
-
### I. Security is not strictly N/A
|
|
63
|
-
**Issue:** The `/info` upstream fallback can leak private gem names to rubygems.org; client doesn't validate the name before building the URL.
|
|
64
|
-
**Agreement:** Codex (FAIL), Copilot (defense-in-depth note).
|
|
65
|
-
**Recommendation:** If the private-endpoint fix (A) is adopted, the leak disappears. Regardless, add a note: client treats 400 as a programming error; gem names come from config/manifest and should match `VALID_NAME`. Downgrade from "N/A" to "low, with these mitigations."
|
|
66
|
-
|
|
67
|
-
## Additional Requirements Identified
|
|
68
|
-
|
|
69
|
-
- **AR:** Presence is determined solely from the private index, never an upstream-proxied response (resolves A, I).
|
|
70
|
-
- **AR:** Repo/manifest resolution is deferred until after the server-presence and local-artifact checks (resolves E).
|
|
71
|
-
- **FR:** Before re-uploading an existing artifact, verify its embedded spec name+version match the requested gem (resolves D).
|
|
72
|
-
- **FR/AR:** Define the presence-check HTTP status → outcome mapping (resolves F).
|
|
73
|
-
- **Testing:** `test_gem_syncer.rb` currently tests only `resolve_repo` — `sync()` orchestration tests must be **created**. Add integration coverage for: empty server + local artifact (the original bug), divergent server `gems_path`, offline upstream, upload 409 conflict, and public-name collision (all critics).
|
|
74
|
-
|
|
75
|
-
## Ambiguities Requiring Clarification
|
|
76
|
-
|
|
77
|
-
1. **Mechanism for private-authoritative presence** (new endpoint vs. checksum vs. retained `/info` + 503 handling) — the central open decision.
|
|
78
|
-
2. **`latest` scope** — accept always-fetch, or add artifact version-read path.
|
|
79
|
-
3. **Platform** — pure-Ruby assumption or full filename matching.
|
|
80
|
-
4. **Third-outcome counting** — `:synced` vs new symbol.
|
|
81
|
-
|
|
82
|
-
## Summary of Required Changes
|
|
83
|
-
|
|
84
|
-
1. **Reconsider Q1:** make the presence check private-store-authoritative (recommend a small read-only private endpoint; eliminates name-collision, offline-503, and name-leak issues at once).
|
|
85
|
-
2. Resolve the `latest` / "never re-clone" contradiction (recommend: scope guarantee to pinned + `from_lockfile`).
|
|
86
|
-
3. State the platform assumption (recommend: private gems pure-Ruby, filename `<name>-<version>.gem`).
|
|
87
|
-
4. Require pre-upload artifact identity check (name+version match).
|
|
88
|
-
5. Specify the `GemSyncer` flow change: defer repo/manifest resolution; split build vs. upload-existing.
|
|
89
|
-
6. Add the HTTP status→outcome table and the third-outcome counting decision.
|
|
90
|
-
7. Resolve `list_gems`/`has_version?`; note Faraday connection choice.
|
|
91
|
-
8. Update testing: create `sync()` tests + the listed integration scenarios; downgrade security from N/A with explicit mitigations.
|
|
92
|
-
|
|
93
|
-
## Verdict
|
|
94
|
-
|
|
95
|
-
Right problem, right overall shape, well-bounded scope — but **NEEDS WORK before implementation**, primarily because the recommended `/info` mechanism isn't private-authoritative. That single decision (change A) cascades into the security and offline-503 items. With A resolved plus the `latest`, platform, decomposition, and status-handling tightenings, this is ready to implement.
|
|
@@ -1,47 +0,0 @@
|
|
|
1
|
-
# Critique (v1) — Claude
|
|
2
|
-
|
|
3
|
-
Spec: Server-authoritative sync cache check (`20260529-131354`)
|
|
4
|
-
|
|
5
|
-
## Summary
|
|
6
|
-
|
|
7
|
-
The spec correctly diagnoses the root cause and picks the right core fix (server-authoritative skip via the existing `/info` endpoint, with artifact re-upload to avoid rebuilds). Scope is well-bounded and the deployment-mode constraint (AR-1.3) is the key insight that keeps the design honest. Below are gaps that would cause an implementer to stop and ask, plus a few correctness traps in the actual `/info` and upload mechanics.
|
|
8
|
-
|
|
9
|
-
## Blocking / should-fix
|
|
10
|
-
|
|
11
|
-
### 1. `/info` presence parsing is under-specified against the real format (FR-1.2)
|
|
12
|
-
The compact-index info document produced by `CompactIndex.info` is **not** a flat list of versions. Each line is roughly `VERSION DEP:REQ,...|checksum:...,ruby:...`, and the document begins with a `---` header line. FR-1.2 says "contains a version line for that exact version," but doesn't pin down:
|
|
13
|
-
- That the match must be on the **first whitespace-delimited token** of a line, anchored, not a substring (`1.0.5` must not match `1.0.50` or a checksum that happens to contain `1.0.5`).
|
|
14
|
-
- That the leading `---` and any blank lines are ignored.
|
|
15
|
-
- **Platform variants:** a gem can have multiple lines for the same version with different platforms (e.g. `1.0.5` and `1.0.5-x86_64-darwin`). The spec treats presence as `(name, version)` only. If a platformed gem is involved, "version present" may be true while the *specific artifact filename* `<name>-<version>-<platform>.gem` is still missing from the server. Either declare platforms out of scope explicitly, or check the artifact filename, not just the version token.
|
|
16
|
-
|
|
17
|
-
Recommend FR-1.2 specify anchored first-token matching and state the platform assumption (private gems are pure-Ruby → filename is `<name>-<version>.gem`); if that assumption holds it should be written down, because `SpecMapper.filename` already branches on platform.
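Anchored first-token matching might look like the sketch below. `version_listed?` is a hypothetical name, and the sample line shapes follow the rough description of the `/info` format given above rather than a verified fixture.

```ruby
# Hypothetical parser: true when some line's first whitespace-delimited
# token equals the requested version exactly.
def version_listed?(info_body, version)
  info_body.each_line.any? do |line|
    token = line.split.first # nil for blank lines
    # Skip the "---" header. Exact equality rules out substring hits such as
    # "1.0.5" inside "1.0.50" or inside a checksum.
    token != "---" && token == version
  end
end
```

Platform-suffixed tokens such as `1.0.5-x86_64-darwin` deliberately do not match `1.0.5` here, which is only correct under the pure-Ruby assumption.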
|
|
18
|
-
|
|
19
|
-
### 2. Presence-vs-served gap: `/info` is built from `GemIndex`, but is it the same store the binary is served from? (FR-1.1/FR-1.2)
|
|
20
|
-
The skip decision trusts `/info` to mean "the server can serve `/gems/<file>`." In the current server, `serve_info` reads `@index[gemname]` (from `GemIndex`, i.e. `gems_path/gems`) and `serve_gem_file` reads `@index.gem_path || @cache.gem_binary`. Both go through `@index`, so they should agree, but the spec should state explicitly the invariant it depends on: **a version appearing in `/info` implies `/gems/<file>` is servable from the private store.** If that ever stops holding (e.g. `/info` is proxied upstream), the skip becomes wrong. This is worth an AR pinning "presence is determined only from the private index, never an upstream-proxied `/info`." Note that `serve_info` falls back to `@cache.info(gemname)` (upstream) when the gem isn't private. For a private gem name that the server doesn't have, `/info` proxies to rubygems.org, gets a 404, and returns not-found, which is fine. But a name collision with a public gem could make `/info` return a *public* document and produce a false "present." This edge (a private gem sharing a name with a public gem) should be acknowledged.
|
|
21
|
-
|
|
22
|
-
### 3. "Existing local artifact" lookup location is ambiguous (FR-1.3, AR-1.2)
|
|
23
|
-
FR-1.3 says re-upload when "the corresponding `.gem` already exists in the local `gems_path`." But `cached?` today checks `gems_path/<name>-<bare>.gem` (flat) while the server store is `gems_path/gems/`. The spec should state unambiguously that the artifact lookup is the **flat build-output location** (`gems_path/<name>-<version>.gem`, where `GemBuilder` writes), to avoid an implementer re-introducing the same flat-vs-nested confusion the spec is trying to fix. Tie it to `SpecMapper.filename`/the bare-semver key so the filename is derived one way.
|
|
24
|
-
|
|
25
|
-
### 4. `version: latest` interaction needs the ordering spelled out (AR-1.2)
|
|
26
|
-
For `latest`, the version isn't known until after clone+checkout (`current_version` post-checkout). So the "ask server first, skip without building" optimization **cannot apply to `latest`** — you must fetch the repo to learn the version before you can query `/info`. The spec acknowledges ordering in AR-1.2 but doesn't state the consequence: `latest` gems always incur a fetch (today they do too — `cached?` is bypassed for `latest` via `!gem_def.latest?` in `sync`). Make explicit that `latest` keeps today's behavior (always fetch, then the server check applies to the resolved version for the *upload/skip* decision), so the skip optimization is for pinned versions only.
|
|
27
|
-
|
|
28
|
-
## Edge cases / smaller
|
|
29
|
-
|
|
30
|
-
- **FR-1.4 conflict handling:** `UploadHandler` maps `Errno::EEXIST` → `409 "Gem already exists"`. `GemUploader#handle_response` currently treats `409` as `{ success: true, skipped: true }`. Good — the spec's "treat conflict as skip" already matches code, but FR-1.4 should cite the 409 path so the implementer doesn't change it.
|
|
31
|
-
- **Counting/reporting (FR-1.5):** the `sync` command tallies `:synced`/`:skipped` from `GemSyncer#sync`'s return symbol. Adding a third outcome (artifact re-upload) — is it `:synced` or a new symbol? `report_results` only knows two. Decide whether re-upload counts as `:synced` (simplest, keeps the tally) with differentiated *output text*, or a new `:uploaded` symbol (touches `run_sync`/`report_results`). The spec implies differentiated messaging but not the symbol contract.
|
|
32
|
-
- **Idempotency race (FR-1.4):** "upload that races with an already-present gem" — concurrency isn't really present (sync is sequential), so this is just the 409 path. Reword to avoid implying real concurrency handling is required.
|
|
33
|
-
- **Malformed `/info` → not-present (AR-1.4):** treating malformed as not-present means a flaky/garbage response triggers an upload attempt, which then 409s or 201s harmlessly. That's a safe failure direction; worth stating that "not-present on parse failure" is deliberately biased toward re-uploading rather than skipping.
|
|
34
|
-
- **`reachable?` already exists** on `GemUploader` but `sync` doesn't currently call it; the new presence call effectively becomes the reachability probe. Consider whether the first presence call should produce the not-reachable error early (per-gem vs once up front).
|
|
35
|
-
|
|
36
|
-
## Testing
|
|
37
|
-
|
|
38
|
-
- The spec leans on stubbing `/info`. `test_gem_uploader.rb` likely already stubs Faraday — confirm the presence method is testable the same way (it is, if it's on `GemUploader`).
|
|
39
|
-
- Add an integration test that reproduces the original bug: build artifact present, fresh/empty server, `sync` → server serves the gem afterward. This is the regression guard and should be called out as required, not optional.
|
|
40
|
-
|
|
41
|
-
## Checklist assessment
|
|
42
|
-
|
|
43
|
-
Honest and well-evidenced. Security N/A is justified (loopback, `VALID_NAME`, read-only `/info`) — but the **private-name-collides-with-public-gem** false-positive (point 2) is a small correctness/security-adjacent edge the checklist's security note should acknowledge. Performance and rollout are appropriately sized.
|
|
44
|
-
|
|
45
|
-
## Verdict
|
|
46
|
-
|
|
47
|
-
Sound design, right scope. Resolve the `/info` parsing precision + platform assumption (1), the presence-implies-serveable invariant incl. public-name-collision (2), the artifact-location wording (3), and the `latest` ordering consequence (4) before implementing. The rest are wording tightenings.
|
|
@@ -1,112 +0,0 @@
|
|
|
1
|
-
# Critique: Server-authoritative sync cache check
|
|
2
|
-
|
|
3
|
-
## Overview
|
|
4
|
-
|
|
5
|
-
The spec targets the right bug: `sync` currently treats a flat local build artifact as proof that the running compact-index server can serve the gem.
|
|
6
|
-
Moving the skip decision to a server-side check and re-uploading existing artifacts is the right shape, but the proposed `/info/<name>` contract is not yet authoritative enough because that endpoint is merged with the RubyGems upstream cache.
|
|
7
|
-
|
|
8
|
-
## Approach Summary
|
|
9
|
-
|
|
10
|
-
- Put the presence check in `GemUploader`, keeping HTTP concerns out of `GemSyncer`.
|
|
11
|
-
- Replace the local `cached?` skip with a server query, then choose skip, upload existing artifact, or build and upload.
|
|
12
|
-
- Reuse existing version normalization for fixed tags, `from_lockfile`, and `latest`.
|
|
13
|
-
- Keep the current `POST /upload` bridge and avoid assuming `sync` and server `gems_path` are the same.
|
|
14
|
-
- The major under-justified choice is "no new server endpoint": current `/info/<name>` is not private-store-only, so using it as the authoritative signal has correctness, privacy, and performance consequences.
|
|
15
|
-
|
|
16
|
-
## Risks
|
|
17
|
-
|
|
18
|
-
1. `/info/<name>` can report an upstream public gem, not just a privately uploaded gem.
|
|
19
|
-
Likelihood: medium.
|
|
20
|
-
Severity: high.
|
|
21
|
-
`CompactIndexServer#serve_info` falls back to `serve_upstream_info` when `@index[gemname]` is absent (`lib/gemkeeper/compact_index_server.rb:74`), so a public gem with the same name/version could make `sync` skip even though the private server store is missing the intended artifact.
|
|
22
|
-
The spec does not address this.
|
|
23
|
-
|
|
24
|
-
2. Offline recovery can stall or fail on upstream RubyGems lookups.
|
|
25
|
-
Likelihood: high for the stated offline use case.
|
|
26
|
-
Severity: high.
|
|
27
|
-
A missing private gem causes `/info/<name>` to probe RubyGems through `GemCache#info` (`lib/gemkeeper/compact_index_server/gem_cache.rb:20`), with 5s open and 10s read timeouts in `RubygemsClient` (`lib/gemkeeper/compact_index_server/rubygems_client.rb:15`).
|
|
28
|
-
The spec says 404 means not-present, but offline misses may return 503 after waiting, which conflicts with the goal of fast local recovery.
|
|
29
|
-
|
|
30
|
-
3. `version: latest` conflicts with the no-git recovery goal.
|
|
31
|
-
Likelihood: high.
|
|
32
|
-
Severity: medium.
|
|
33
|
-
The spec says `latest` must resolve from the checked-out gemspec before presence checking (`spec.md:49`), which requires clone/pull via the existing `GemSyncer` flow (`lib/gemkeeper/gem_syncer.rb:30`).
|
|
34
|
-
That contradicts the goal that recovery never re-clones when an artifact exists, unless the spec explicitly scopes that guarantee to fixed or `from_lockfile` versions.
|
|
35
|
-
|
|
36
|
-
4. Existing artifact selection can upload the wrong thing or miss valid platform gems.
|
|
37
|
-
Likelihood: medium.
|
|
38
|
-
Severity: medium.
|
|
39
|
-
The spec repeats the current `gems_path/<name>-<version>.gem` shape, but the server stores platform filenames as `<name>-<version>-<platform>.gem` via `SpecMapper.filename` (`lib/gemkeeper/compact_index_server/spec_mapper.rb:13`).
|
|
40
|
-
It also does not require validating that a reused local artifact's embedded gemspec name/version matches the requested gem before declaring success.
|
|
41
|
-
|
|
42
|
-
5. Security is not actually N/A.
|
|
43
|
-
Likelihood: medium.
|
|
44
|
-
Severity: medium.
|
|
45
|
-
The client will construct a path from a config/manifest gem name, and `Configuration::GemDefinition` validates version but not name.
|
|
46
|
-
The server validates names after routing (`lib/gemkeeper/compact_index_server.rb:49`), but the client still needs path-segment escaping and the `/info` fallback can leak private gem names to RubyGems.org.
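A client-side guard for this could be sketched as below. The pattern `ASSUMED_VALID_NAME` is an assumption, since the server's actual name rule is not reproduced in this critique, and `info_path` is a hypothetical helper. `ERB::Util.url_encode` is used because it percent-encodes `/` and spaces, which suits a single path segment.

```ruby
require "erb"

# Assumed pattern for illustration; the server's real rule is not shown here.
ASSUMED_VALID_NAME = /\A[A-Za-z0-9][A-Za-z0-9._-]*\z/

# Hypothetical helper that builds the request path for a gem name.
def info_path(name)
  unless ASSUMED_VALID_NAME.match?(name)
    raise ArgumentError, "invalid gem name: #{name.inspect}"
  end

  # Encoding after validation is defense in depth: even if the pattern were
  # loosened later, the name could still only occupy one path segment.
  "/info/#{ERB::Util.url_encode(name)}"
end
```

Rejecting bad names before any request is made also means a malformed config entry never reaches the server, let alone an upstream proxy.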
|
|
47
|
-
|
|
48
|
-
## Complexity Hotspots
|
|
49
|
-
|
|
50
|
-
### Making `/info` Authoritative
|
|
51
|
-
|
|
52
|
-
This is the hardest part.
|
|
53
|
-
The endpoint is a Bundler-facing merged compact index endpoint, not a private-store API.
|
|
54
|
-
If the spec keeps "no new endpoint," it needs a precise rule for distinguishing private presence from upstream presence, probably by comparing the compact-index checksum to the local artifact when one exists or by changing server behavior for sync-specific checks.
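The checksum comparison could be sketched as follows, assuming the `checksum:` field on an `/info` line is the SHA-256 hex digest of the `.gem` file. Both method names are hypothetical, and the artifact is passed as bytes to keep the example free of file handling.

```ruby
require "digest"

# Returns the checksum recorded on the line for this exact version, or nil.
def listed_checksum(info_body, version)
  info_body.each_line do |line|
    token, rest = line.strip.split(" ", 2)
    next unless token == version

    return rest.to_s[/checksum:([0-9a-f]+)/, 1]
  end
  nil
end

# True only when the server lists this version AND the listed checksum is
# that of the local artifact, so a same-named public gem cannot match.
def private_artifact_present?(info_body, version, artifact_bytes)
  expected = listed_checksum(info_body, version)
  !expected.nil? && expected == Digest::SHA256.hexdigest(artifact_bytes)
end
```

This rule needs a local artifact to compare against, so it cannot decide presence when none exists.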
|
|
55
|
-
|
|
56
|
-
### Refactoring `GemSyncer` Ordering
|
|
57
|
-
|
|
58
|
-
Current `sync` resolves the repo before cache handling (`lib/gemkeeper/gem_syncer.rb:21`) and fetches the repo before `latest_version!`.
|
|
59
|
-
To satisfy "upload existing artifact without git/build," implementation likely must defer repo resolution and `GitRepository` creation until after the server-missing/local-artifact path.
|
|
60
|
-
The spec names the high-level flow but does not call out this ordering change.
|
|
61
|
-
|
|
62
|
-
### HTTP Status And Parse Semantics
|
|
63
|
-
|
|
64
|
-
The presence method needs exact status handling: 200 parse, 404 absent, 400 invalid config/name, 503 upstream unavailable, 5xx server failure, redirects, and Faraday connection failures.
|
|
65
|
-
AR-1.4 currently says both that an "erroring server" maps to `ServerNotReachableError` and that a "malformed `/info`" means not-present, which leaves cases such as 400, 503, other 5xx responses, and redirects open.
|
|
66
|
-
|
|
67
|
-
### Counting And Messaging
|
|
68
|
-
|
|
69
|
-
FR-1.4 says an upload conflict is "success/skip," while FR-1.1 says re-uploaded missing gems report as synced, not skipped.
|
|
70
|
-
The existing CLI summary only knows `:synced` and `:skipped` (`lib/gemkeeper/cli/commands/sync.rb:40`), so the spec should say how conflict, re-upload, and freshly built upload affect the summary counts.
|
|
71
|
-
|
|
72
|
-
## Missing Or Ambiguous Requirements
|
|
73
|
-
|
|
74
|
-
- Define whether "server reports exact version" means private uploaded gem only, merged private-or-public compact index entry, or matching checksum for the exact artifact.
|
|
75
|
-
- Specify how `GET /info/<name>` names are encoded and what happens when the server returns 400 for an invalid name.
|
|
76
|
-
- Clarify whether 503 from upstream miss while the local server is otherwise healthy should mean not-present or hard failure.
|
|
77
|
-
- State whether local artifact reuse requires reading the `.gem` spec and verifying expected name/version/platform before upload.
|
|
78
|
-
- Specify platform gem behavior: exact filename lookup, multiple artifacts for one name/version, and whether any platform version is considered present.
|
|
79
|
-
- Clarify the `latest` guarantee: either accept that it still needs git to discover the current version or define a local-artifact discovery rule for latest.
|
|
80
|
-
- Require deferring repo and manifest resolution when fixed-version or lockfile-version artifact upload can succeed without source checkout.
|
|
81
|
-
- Define how upload conflicts are counted in the sync summary.
|
|
82
|
-
- Add acceptance coverage for a server whose `/info` would proxy upstream, not only a stubbed 404.
|
|
83
|
-
|
|
84
|
-
## Completeness Checklist Audit
|
|
85
|
-
|
|
86
|
-
| Item | Status | Notes |
|------------------------------|--------|-------|
| Scope & acceptance criteria | WARN | Main behavior is clear, but `latest`, platform artifacts, and public upstream collisions are not bounded. |
| Testing strategy | WARN | Needs `test_gem_syncer.rb` flow tests, CLI summary updates, upstream 503/offline tests, public name collision tests, and platform/corrupt artifact cases. |
| Existing patterns | WARN | Correctly uses `GemUploader`, but misses current `GemSyncer` repo-resolution ordering and `CompactIndexServer` upstream fallback behavior. |
| Dependencies | PASS | No new library dependency is needed. |
| Architecture & interfaces | WARN | Uploader seam is right, but `/info` is not currently a private-store interface. |
| Error handling & failures | WARN | Unreachable server is covered, but HTTP status mapping and upstream 503 are underspecified. |
| Security review | FAIL | The N/A claim misses private-name leakage to RubyGems.org, client path encoding, and local artifact validation. |
| Performance impact | FAIL | The spec assumes one cheap loopback GET, but misses per-gem upstream lookups and offline timeout behavior. |
| Rollout & migration | PASS | No data migration is needed and existing stores can remain in place. |
| Assumptions & risks | WARN | Identifies `/info` parsing risk, but misses that `/info` is merged/proxied rather than private authoritative. |
|
|
98
|
-
|
|
99
|
-
## Verdict
|
|
100
|
-
|
|
101
|
-
NEEDS WORK.
|
|
102
|
-
|
|
103
|
-
The intended sync behavior is implementable, but the spec needs to resolve the `/info` authority problem, offline 503 behavior, artifact validation, and `latest` semantics before implementation can proceed without guessing.
|
|
104
|
-
|
|
105
|
-
## Suggested Next Steps
|
|
106
|
-
|
|
107
|
-
1. Decide whether the authoritative check must be private-store-only.
|
|
108
|
-
If yes, either add a private presence contract or specify a checksum-based rule that cannot be fooled by upstream public gems.
|
|
109
|
-
2. Clarify fixed, `from_lockfile`, and `latest` flows separately, including exactly when repo/manifest resolution is required.
|
|
110
|
-
3. Specify local artifact lookup and validation, including platform suffixes and corrupt/wrong gem files.
|
|
111
|
-
4. Define presence-check HTTP status handling and output/counting semantics.
|
|
112
|
-
5. Update the test plan to include syncer-level orchestration tests plus integration coverage for empty server, offline upstream, divergent server path, upload conflict, and public name collision.
|