scint 0.7.0 → 0.7.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 54c3f7baaea3992bcba60136b76f3b08caa580717d82906ebed3a0a69fb23314
-   data.tar.gz: 246966d0ba167f43b8c0aee1205e95377afec245dca33060fe3fcce202d0c3ab
+   metadata.gz: d9c9934c3d5cfb4786daa6d766e031b27919cc290ffd9d3f17607e70f5f58535
+   data.tar.gz: ffc27f93068721d1fdfee4cb3fc62355beba8322c295968ae2f4cce6ccd4470e
  SHA512:
-   metadata.gz: 84c58296b1c391e68503090ea3889a3a469c8f63ca6c289710dc6a5f986cd33059aec6a61038ab14c2f653aefaa1675815d2e13a5ba3fb49dad6fe825cf8d3a5
-   data.tar.gz: 1cde1ae7ff37856bf44a4f909e024cb094ba1436fe062536aa4c73860ce0aa7c2494c2f5dd09ca14af5be8b226d8ada36ec212278b08a55902da708d734bea3d
+   metadata.gz: e722c6d0c83bf1a6bbe2f4e4b65da8282f3b4f6ac83742900f91c158a93262092780b15778979bbbc9ccc59847c7d232f2c9bc689e4b01eb6b3920dd4a875347
+   data.tar.gz: 5c7a26fe05647b14e865b60b9f08adbb4f020edb0c18f393fd577a90af701a502471485bc6eaf90d947d1f82324f2d8fa322280f5916e243705300d1ca3defea
data/FEATURES.md CHANGED
@@ -10,4 +10,8 @@
  - the installation process involves compilation. We attempt to have compilation happen while it's not blocking other operations, but also only one compilation at a time
  - we have a book keeping object that governs the worker pools and that's present during each step (fetch, extract, compile, install) and receives the tasks for each phase from the workers.
  - i suspect that we need to fork off a worker for compilation which we then have to communicate with via some rpc format. simple "-> CALL method, <- RESULT:\n...." type line protocol through stdin/out might work well enough there.
+ - cache validity is defined by cached tree + spec marshal + versioned manifest, scoped by ABI (with `gem.build_complete` for native extensions)
+ - cached manifests are versioned JSON with deterministic ordering and a file list for fast materialization
+ - git cache slugs are deterministic hashes of the normalized source URI with collision detection + telemetry
+ - legacy cached entries without a manifest remain read-compatible but emit telemetry for deprecation

data/README.md CHANGED
@@ -212,6 +212,77 @@ Project-local runtime (`.bundle/`):
  4. `bin/` project-level wrappers
  5. `scint.lock.marshal` runtime lock for `scint exec`

+ ## Cache Validity + Manifest Specification (Draft)
+
+ ### Cache validity criteria
+
+ A cached artifact is considered valid only when *all* of the following are true:
+
+ 1. `cached/<abi>/<full_name>/` exists and is a fully materialized tree.
+ 2. `cached/<abi>/<full_name>.spec.marshal` exists and loads successfully.
+ 3. `cached/<abi>/<full_name>.manifest` exists, parses, and its `version` is supported.
+ 4. Manifest fields `full_name` and `abi` match the requested spec/ABI.
+ 5. If the gem has native extensions, `ext/<abi>/<full_name>/gem.build_complete` exists.
+
+ Any failure means the cache entry is *invalid* and must be rebuilt (fetch/assemble/compile).
+
+ ### Manifest schema
+
+ The manifest is a single UTF-8 JSON object written to
+ `cached/<abi>/<full_name>.manifest`. It is versioned and deterministically
+ serialized for repeatable cache reuse.
+
+ **Serialization rules**:
+
+ - Top-level keys sorted lexicographically.
+ - Array ordering is deterministic (see `files` below).
+ - JSON is emitted without extra whitespace (canonical/minified).
+ - No timestamps or host-specific values are stored.
+
+ **Schema (version 1)**:
+
+ - `version` (Integer, required): schema version (starting at `1`).
+ - `full_name` (String, required): `name-version(-platform)`.
+ - `abi` (String, required): Ruby ABI key (e.g. `ruby-3.4.5-arm64-darwin24`).
+ - `source` (Object, required):
+   - `type` (String): `rubygems`, `git`, or `path`.
+   - `uri` (String): canonical source URI.
+   - `revision` (String, git only): resolved commit SHA.
+   - `path` (String, path only): absolute source path.
+ - `files` (Array, required): sorted by `path` ascending. Each entry is:
+   - `path` (String): relative to the cached root.
+   - `type` (String): `file`, `dir`, or `symlink`.
+   - `mode` (Integer): permission bits (`mode & 0o777`), stored as a decimal integer (e.g. `493` for octal `755`).
+   - `size` (Integer): bytes (directories use `0`).
+   - `sha256` (String, optional): content hash for files/symlinks.
+ - `build` (Object, required):
+   - `extensions` (Boolean): whether the gem builds native extensions.
+   - `ext_complete` (String, optional): completion marker path when extensions exist.
+
+ ### Git slug normalization + collisions
+
+ Git cache directories use a deterministic slug derived from the source URI:
+
+ 1. Normalize the URI to a stable string form (`uri.to_s`; callers should pass a
+    parsed URI for consistent normalization).
+ 2. Compute `sha256(normalized_uri)`, use the first 16 hex characters.
+
+ **Collision handling**: when a slug directory already exists, validate that the
+ manifest `source.uri` matches the normalized URI. If it does not, treat it as a
+ collision, emit telemetry, and fall back to a longer hash (e.g. full 64 hex) or
+ an additional suffix.
+
+ ### Legacy read-compat + telemetry
+
+ Legacy cache entries that lack a manifest (or use an unsupported schema version)
+ remain *read-compatible* for now:
+
+ - Treat the entry as valid only if the cached tree + `.spec.marshal` exist and
+   the gemspec loads cleanly.
+ - Emit telemetry counters (per run) such as `cache.manifest.missing`,
+   `cache.manifest.unsupported`, and `cache.manifest.collision`.
+ - Log a single warning per run with counts and cache root to guide deprecation.
+
  ## Warm Path Guarantees

  Required behavior:
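The five validity criteria added to the README above can be sketched as a single predicate. This is a minimal illustration, not scint's implementation: `cache_entry_valid?`, `SUPPORTED_MANIFEST_VERSIONS`, and the `native_ext:` flag are hypothetical names, and "fully materialized tree" is approximated here by a plain directory check.

```ruby
require "json"
require "tmpdir"
require "fileutils"

# Assumption: only schema version 1 is supported.
SUPPORTED_MANIFEST_VERSIONS = [1].freeze

# Hypothetical helper: true only when all five README criteria hold.
def cache_entry_valid?(root, abi, full_name, native_ext: false)
  base = File.join(root, "cached", abi, full_name)
  return false unless File.directory?(base)                 # 1. cached tree exists

  spec_path = "#{base}.spec.marshal"
  return false unless File.file?(spec_path)
  begin
    Marshal.load(File.binread(spec_path))                   # 2. spec marshal loads
  rescue StandardError
    return false
  end

  manifest = JSON.parse(File.read("#{base}.manifest"))      # 3. manifest exists and parses
  return false unless manifest.is_a?(Hash)
  return false unless SUPPORTED_MANIFEST_VERSIONS.include?(manifest["version"])
  return false unless manifest["full_name"] == full_name    # 4. identity fields match
  return false unless manifest["abi"] == abi

  if native_ext                                             # 5. build marker for extensions
    marker = File.join(root, "ext", abi, full_name, "gem.build_complete")
    return false unless File.exist?(marker)
  end
  true
rescue Errno::ENOENT, JSON::ParserError
  false
end

Dir.mktmpdir do |root|
  abi = "ruby-3.4.5-arm64-darwin24"
  name = "demo-1.0.0"
  dir = File.join(root, "cached", abi)
  FileUtils.mkdir_p(File.join(dir, name))
  File.binwrite(File.join(dir, "#{name}.spec.marshal"), Marshal.dump({ "name" => "demo" }))
  File.write(File.join(dir, "#{name}.manifest"),
             JSON.generate({ "abi" => abi, "full_name" => name, "version" => 1 }))

  puts cache_entry_valid?(root, abi, name)                   # true
  puts cache_entry_valid?(root, abi, name, native_ext: true) # false, no gem.build_complete
end
```

Any single failing check returns `false`, which matches the README rule that an invalid entry must be rebuilt rather than partially reused.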
data/VERSION CHANGED
@@ -1 +1 @@
- 0.7.0
+ 0.7.1
@@ -23,10 +23,28 @@ module Scint
        File.join(@root, "inbound")
      end

+     def inbound_gems_dir
+       File.join(inbound_dir, "gems")
+     end
+
+     def inbound_gits_dir
+       File.join(inbound_dir, "gits")
+     end
+
+     def assembling_dir
+       File.join(@root, "assembling")
+     end
+
+     def cached_dir
+       File.join(@root, "cached")
+     end
+
+     # Legacy extracted cache (read-compat only).
      def extracted_dir
        File.join(@root, "extracted")
      end

+     # Legacy extension cache (read-compat only).
      def ext_dir
        File.join(@root, "ext")
      end
@@ -36,7 +54,7 @@ module Scint
      end

      def git_dir
-       File.join(inbound_dir, "git")
+       inbound_gits_dir
      end

      # Isolated gem home used while compiling native extensions during install.
@@ -52,17 +70,40 @@ module Scint
      # -- Per-spec paths ------------------------------------------------------

      def inbound_path(spec)
-       File.join(inbound_dir, "#{full_name(spec)}.gem")
+       File.join(inbound_gems_dir, "#{full_name(spec)}.gem")
+     end
+
+     def assembling_path(spec, abi_key = Platform.abi_key)
+       File.join(assembling_dir, abi_key, full_name(spec))
      end

+     def cached_abi_dir(abi_key = Platform.abi_key)
+       File.join(cached_dir, abi_key)
+     end
+
+     def cached_path(spec, abi_key = Platform.abi_key)
+       File.join(cached_dir, abi_key, full_name(spec))
+     end
+
+     def cached_spec_path(spec, abi_key = Platform.abi_key)
+       File.join(cached_dir, abi_key, "#{full_name(spec)}.spec.marshal")
+     end
+
+     def cached_manifest_path(spec, abi_key = Platform.abi_key)
+       File.join(cached_dir, abi_key, "#{full_name(spec)}.manifest")
+     end
+
+     # Legacy extracted cache (read-compat only).
      def extracted_path(spec)
        File.join(extracted_dir, full_name(spec))
      end

+     # Legacy extracted gemspec cache (read-compat only).
      def spec_cache_path(spec)
        File.join(extracted_dir, "#{full_name(spec)}.spec.marshal")
      end

+     # Legacy extension cache (read-compat only).
      def ext_path(spec, abi_key = Platform.abi_key)
        File.join(ext_dir, abi_key, full_name(spec))
      end
@@ -80,13 +121,13 @@ module Scint

      def git_path(uri)
        slug = git_slug(uri)
-       File.join(git_dir, "repos", slug)
+       File.join(git_dir, slug)
      end

      def git_checkout_path(uri, revision)
        slug = git_slug(uri)
        rev = revision.to_s.gsub(/[^0-9A-Za-z._-]/, "_")
-       File.join(git_dir, "checkouts", slug, rev)
+       File.join(git_dir, slug, "checkouts", rev)
      end

      # -- Helpers -------------------------------------------------------------
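The effect of the two path changes in this hunk may be easier to see side by side. The sketch below rebuilds the old and new locations outside the class; `ROOT`, `old_git_paths`, and `new_git_paths` are hypothetical names, and the slug is computed with the plain `uri.to_s` hash for brevity.

```ruby
require "digest"

ROOT = "/tmp/scint-cache" # hypothetical cache root

def git_slug(uri)
  Digest::SHA256.hexdigest(uri.to_s)[0, 16]
end

# Same revision sanitization as git_checkout_path in the hunk above.
def safe_rev(revision)
  revision.to_s.gsub(/[^0-9A-Za-z._-]/, "_")
end

# 0.7.0 layout: repos and checkouts live in sibling trees under inbound/git.
def old_git_paths(uri, revision)
  git_dir = File.join(ROOT, "inbound", "git")
  slug = git_slug(uri)
  [File.join(git_dir, "repos", slug),
   File.join(git_dir, "checkouts", slug, safe_rev(revision))]
end

# 0.7.1 layout: everything for one source sits under inbound/gits/<slug>.
def new_git_paths(uri, revision)
  git_dir = File.join(ROOT, "inbound", "gits")
  slug = git_slug(uri)
  [File.join(git_dir, slug),
   File.join(git_dir, slug, "checkouts", safe_rev(revision))]
end

uri = "https://github.com/example/repo.git"
puts old_git_paths(uri, "feature/x")
puts new_git_paths(uri, "feature/x")
```

In the new layout the checkout directory is nested inside the directory returned by `git_path`, so removing one slug directory removes both the repository and its checkouts.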
@@ -113,6 +154,10 @@ module Scint
        File.join(base, "scint")
      end

+     # Slug rules are defined in README.md (Cache Validity + Manifest Specification).
+     # - Index slugs prefer host/path when available, otherwise fall back to a hash.
+     # - Hash slugs are deterministic but must be paired with manifest checks for
+     #   collision detection.
      def slugify_uri(str)
        uri = URI.parse(str) rescue nil
        if uri && uri.host
@@ -125,8 +170,19 @@ module Scint
        end
      end

+     # Git slugs are SHA256 of the normalized URI string (uri.to_s), truncated
+     # to 16 hex chars. Callers must validate `source.uri` in the manifest to
+     # detect collisions and fall back to a longer hash if needed.
      def git_slug(uri)
-       Digest::SHA256.hexdigest(uri.to_s)[0, 16]
+       normalized = normalize_uri(uri)
+       Digest::SHA256.hexdigest(normalized)[0, 16]
+     end
+
+     def normalize_uri(uri)
+       return uri.to_s if uri.is_a?(URI)
+       URI.parse(uri.to_s).to_s
+     rescue URI::InvalidURIError
+       uri.to_s
      end
    end
  end
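The comment on `git_slug` leaves collision handling to callers. One way a caller might do that is sketched below. `normalize_uri` mirrors the method added in this hunk; the `length:` parameter, `resolve_slug`, and the `known` map (slug to the `source.uri` recorded in an existing manifest) are assumptions for illustration, and `warn` stands in for the telemetry counter.

```ruby
require "digest"
require "uri"

# Mirrors normalize_uri from the hunk above.
def normalize_uri(uri)
  return uri.to_s if uri.is_a?(URI)
  URI.parse(uri.to_s).to_s
rescue URI::InvalidURIError
  uri.to_s
end

# Assumption: `length:` is added here so the fallback can reuse the same hash.
def git_slug(uri, length: 16)
  Digest::SHA256.hexdigest(normalize_uri(uri))[0, length]
end

# Hypothetical caller-side check: `known` maps an existing slug to the
# source.uri stored in that slug's manifest.
def resolve_slug(uri, known)
  normalized = normalize_uri(uri)
  short = git_slug(uri)
  owner = known[short]
  return short if owner.nil? || owner == normalized

  warn "cache.manifest.collision slug=#{short}" # stand-in for telemetry
  git_slug(uri, length: 64)                     # fall back to the full hash
end

uri = "https://github.com/example/repo.git"
slug = git_slug(uri)
puts slug.length                                                       # 16
puts git_slug(URI.parse(uri)) == slug                                  # true
puts resolve_slug(uri, { slug => uri }).length                         # 16
puts resolve_slug(uri, { slug => "https://other.example/x.git" }).length # 64
```

Because the short slug is a prefix of the full hash, the fallback stays deterministic: the same URI always maps to the same 64-character directory name after a collision.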
@@ -0,0 +1,120 @@
+ # frozen_string_literal: true
+
+ require "digest"
+ require "find"
+ require "json"
+
+ require_relative "../fs"
+ require_relative "../spec_utils"
+
+ module Scint
+   module Cache
+     module Manifest
+       module_function
+
+       VERSION = 1
+
+       def build(spec:, gem_dir:, abi_key:, source:, extensions:)
+         {
+           "abi" => abi_key,
+           "build" => build_block(extensions: extensions),
+           "files" => collect_files(gem_dir),
+           "full_name" => SpecUtils.full_name(spec),
+           "source" => source,
+           "version" => VERSION,
+         }
+       end
+
+       def write(path, manifest)
+         ordered = order_keys(manifest)
+         json = JSON.generate(ordered)
+         FS.atomic_write(path, json)
+       end
+
+       # Write a flat file listing (one relative path per line) into the gem dir.
+       # This enables bulk install via a single `cp -al` or `cpio -pld` call.
+       DOTFILES_NAME = ".scint-files"
+
+       def write_dotfiles(gem_dir, manifest = nil)
+         files = if manifest
+           Array(manifest["files"]).filter_map { |e| e["path"] if e["type"] != "dir" }
+         else
+           collect_file_paths(gem_dir)
+         end
+
+         dotfiles_path = File.join(gem_dir, DOTFILES_NAME)
+         File.write(dotfiles_path, files.sort.join("\n") + "\n")
+       end
+
+       def collect_file_paths(root)
+         paths = []
+         Find.find(root) do |path|
+           next if path == root
+           rel = path.delete_prefix("#{root}/")
+           next if rel == DOTFILES_NAME
+           stat = File.lstat(path)
+           paths << rel unless stat.directory?
+         end
+         paths
+       end
+
+       def collect_files(root)
+         entries = []
+         Find.find(root) do |path|
+           next if path == root
+
+           rel = path.delete_prefix("#{root}/")
+           stat = File.lstat(path)
+
+           if stat.symlink?
+             target = File.readlink(path)
+             entries << file_entry(rel, "symlink", stat, Digest::SHA256.hexdigest(target))
+           elsif stat.directory?
+             entries << dir_entry(rel, stat)
+           else
+             entries << file_entry(rel, "file", stat, Digest::SHA256.file(path).hexdigest)
+           end
+         end
+
+         entries.sort_by { |entry| entry["path"] }
+       end
+
+       def build_block(extensions:)
+         { "extensions" => !!extensions }
+       end
+
+       def order_keys(object)
+         case object
+         when Hash
+           object.keys.sort.each_with_object({}) do |key, acc|
+             acc[key] = order_keys(object[key])
+           end
+         when Array
+           object.map { |entry| order_keys(entry) }
+         else
+           object
+         end
+       end
+
+       def dir_entry(rel, stat)
+         {
+           "mode" => stat.mode & 0o777,
+           "path" => rel,
+           "size" => 0,
+           "type" => "dir",
+         }
+       end
+
+       def file_entry(rel, type, stat, sha)
+         entry = {
+           "mode" => stat.mode & 0o777,
+           "path" => rel,
+           "size" => stat.size,
+           "type" => type,
+         }
+         entry["sha256"] = sha if sha
+         entry
+       end
+     end
+   end
+ end
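The deterministic-serialization claim can be checked with a small standalone sketch. `order_keys` below repeats the recursive key sort from `Manifest.order_keys` above; the two sample hashes are made-up data that differ only in insertion order.

```ruby
require "json"

# Same recursive key sort as Manifest.order_keys in the file above.
def order_keys(object)
  case object
  when Hash
    object.keys.sort.each_with_object({}) do |key, acc|
      acc[key] = order_keys(object[key])
    end
  when Array
    object.map { |entry| order_keys(entry) }
  else
    object
  end
end

# Hypothetical manifests with identical content but different key order.
a = { "version" => 1, "abi" => "ruby-3.4.5-arm64-darwin24",
      "files" => [{ "type" => "file", "path" => "lib/demo.rb", "size" => 12, "mode" => 0o644 }] }
b = { "files" => [{ "mode" => 0o644, "size" => 12, "path" => "lib/demo.rb", "type" => "file" }],
      "abi" => "ruby-3.4.5-arm64-darwin24", "version" => 1 }

json_a = JSON.generate(order_keys(a))
json_b = JSON.generate(order_keys(b))

puts json_a == json_b # true
puts json_a
# {"abi":"ruby-3.4.5-arm64-darwin24","files":[{"mode":420,"path":"lib/demo.rb","size":12,"type":"file"}],"version":1}
```

Two details are visible in the output: nested hashes are sorted as well as the top level, and `mode` is written as a decimal integer (`420` for octal `644`), because `stat.mode & 0o777` is an Integer and JSON has no octal notation.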