scout-essentials 1.7.1 → 1.8.0

Files changed (54)
  1. checksums.yaml +4 -4
  2. data/.vimproject +200 -47
  3. data/README.md +136 -0
  4. data/Rakefile +1 -0
  5. data/VERSION +1 -1
  6. data/doc/Annotation.md +352 -0
  7. data/doc/CMD.md +203 -0
  8. data/doc/ConcurrentStream.md +163 -0
  9. data/doc/IndiferentHash.md +240 -0
  10. data/doc/Log.md +235 -0
  11. data/doc/NamedArray.md +174 -0
  12. data/doc/Open.md +331 -0
  13. data/doc/Path.md +217 -0
  14. data/doc/Persist.md +214 -0
  15. data/doc/Resource.md +229 -0
  16. data/doc/SimpleOPT.md +236 -0
  17. data/doc/TmpFile.md +154 -0
  18. data/lib/scout/annotation/annotated_object.rb +8 -0
  19. data/lib/scout/annotation/annotation_module.rb +1 -0
  20. data/lib/scout/cmd.rb +19 -12
  21. data/lib/scout/concurrent_stream.rb +3 -1
  22. data/lib/scout/config.rb +2 -2
  23. data/lib/scout/indiferent_hash/options.rb +2 -2
  24. data/lib/scout/indiferent_hash.rb +16 -0
  25. data/lib/scout/log/color.rb +5 -3
  26. data/lib/scout/log/fingerprint.rb +8 -8
  27. data/lib/scout/log/progress/report.rb +6 -6
  28. data/lib/scout/log.rb +7 -7
  29. data/lib/scout/misc/digest.rb +11 -13
  30. data/lib/scout/misc/format.rb +2 -2
  31. data/lib/scout/misc/system.rb +5 -0
  32. data/lib/scout/open/final.rb +16 -1
  33. data/lib/scout/open/remote.rb +0 -1
  34. data/lib/scout/open/stream.rb +30 -5
  35. data/lib/scout/open/util.rb +32 -0
  36. data/lib/scout/path/digest.rb +12 -2
  37. data/lib/scout/path/find.rb +19 -6
  38. data/lib/scout/path/util.rb +37 -1
  39. data/lib/scout/persist/open.rb +2 -0
  40. data/lib/scout/persist.rb +7 -1
  41. data/lib/scout/resource/path.rb +2 -2
  42. data/lib/scout/resource/util.rb +18 -4
  43. data/lib/scout/resource.rb +15 -1
  44. data/lib/scout/simple_opt/parse.rb +2 -0
  45. data/lib/scout/tmpfile.rb +1 -1
  46. data/scout-essentials.gemspec +19 -6
  47. data/test/scout/misc/test_hook.rb +2 -2
  48. data/test/scout/open/test_stream.rb +43 -15
  49. data/test/scout/path/test_find.rb +1 -1
  50. data/test/scout/path/test_util.rb +11 -0
  51. data/test/scout/test_path.rb +4 -4
  52. data/test/scout/test_persist.rb +10 -1
  53. metadata +31 -5
  54. data/README.rdoc +0 -18
data/doc/Path.md ADDED
@@ -0,0 +1,217 @@
1
+ # Path
2
+
3
+ Path is a lightweight path utility layered on top of the framework's Annotation system. It makes it easy to build, transform and locate resources by logical name across a variety of search maps (current, user, global, lib, etc.). Path objects are plain string values extended with Path behavior (via Path.setup or by extending instances). Many Open/Path helpers accept Path objects and will call `.find` / `.find_all` / `.produce_and_find` as needed.
4
+
5
+ Key features:
6
+ - Build and compose path strings fluently (join, /, method_missing).
7
+ - Map logical names to physical locations using configurable path maps.
8
+ - Find the first existing file across map order or list all matches.
9
+ - Helpers for file extension manipulation, globbing, dirname/basename, sanitizing filenames.
10
+ - Integration with Open and TmpFile utilities; supports annotation (pkgdir, libdir, map configuration).
11
+ - Helpers for digest/MD5 summary for files and directories.
12
+
13
+ ---
14
+
15
+ ## Creating / wrapping Path values
16
+
17
+ - Path.setup(str, pkgdir = nil)
18
+ - Convert a string into a Path (i.e., extend it with Path methods). Many test examples call `Path.setup("...")`.
19
+ - A Path is just a String extended with Path behavior and optional annotations (`pkgdir`, `libdir`, `path_maps`, `map_order`).
20
+
21
+ Shortcuts on Path instances:
22
+ - join(subpath, prevpath = nil) — join subpath to the Path (returns annotated Path)
23
+ - Aliases: [] and /
24
+ - Example:
25
+ ```ruby
26
+ p = Path.setup('/tmp')
27
+ p.join(:foo) # => "/tmp/foo"
28
+ p[:bar, :foo] # => "/tmp/bar/foo"
29
+ p / :foo # => "/tmp/foo"
30
+ ```
31
+
32
+ - method_missing is used to make `path.component` behave like join:
33
+ - `path.foo` → join("foo")
34
+ - Be careful: method names starting with `to_`, and calls that pass a block, are not treated as path components.
35
+
36
+ ---
37
+
38
+ ## Package / library defaults
39
+
40
+ - Path.default_pkgdir and Path.default_pkgdir= — global default package dir (default `'scout'`).
41
+ - Instance attributes:
42
+ - pkgdir — per-path override of package directory (defaults to Path.default_pkgdir).
43
+ - libdir — library directory (attempts to infer caller lib dir).
44
+ - path_maps — instance copy of global path maps (modifiable per-path).
45
+ - map_order — instance map order (derived from global map_order unless overridden).
46
+
47
+ ---
48
+
49
+ ## Path maps and find behavior
50
+
51
+ Path maps let you define templates where logical paths can be found on disk. The module ships a sensible set of maps (current, user, global, usr, local, fast, cache, bulk, lib, tmp, …) and a default map order.
52
+
53
+ - Path.path_maps — global IndiferentHash of map templates (strings containing placeholders like {TOPLEVEL}, {PKGDIR}, {SUBPATH}, {PATH}, {LIBDIR}, etc.)
54
+ - Path.map_order / Path.basic_map_order — global order to search maps.
55
+ - You can add or change maps:
56
+ - Path.add_path(name, map)
57
+ - Path.prepend_path(name, map)
58
+ - Path.append_path(name, map)
59
+ - For a Path instance you can also call add_path / prepend_path / append_path (instance-level override).
60
+
61
+ Templates can reference:
62
+ - {PKGDIR}, {HOME}, {RESOURCE}, {PWD}, {TOPLEVEL}, {SUBPATH}, {BASENAME}, {PATH}, {LIBDIR}, {MAPNAME}, and custom substitutions.
63
+
64
+ Finding:
65
+ - Path#follow(map_name = :default) — substitute template tokens and return the resulting path string (annotated). Does not check existence. Annotates result with `.where` and `.original` when requested (see annotate_found_where).
66
+ - Path#find(where = nil) — search for the first existing file for the Path:
67
+ - If the path is already located (`located?`), returns itself if it exists; otherwise checks for compressed alternatives (.gz, .bgz, .zip).
68
+ - If where is given, uses that map name only.
69
+ - If where == :all, returns an array of matching paths (see find_all).
70
+ - Otherwise iterates configured map_order and returns the first found path (annotated with `.where` and `.original`).
71
+ - Path#find_all — returns all existing matches across map_order (useful to locate duplicates).
72
+ - Path#find_with_extension(extension, *args) — try original then the given extension.
73
+ - Path.exists_file_or_alternatives(file) — helper that checks for file or file.gz/.bgz/.zip alternatives.
74
+
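The token substitution and map-order search described above can be sketched in plain Ruby. This is a minimal illustration under stated assumptions, not the scout-essentials implementation: `sketch_maps`, `sketch_follow` and `sketch_find` are hypothetical names, the two templates are assumed examples, and only a handful of tokens are handled.

```ruby
# Hypothetical map templates keyed by map name (assumed for illustration).
def sketch_maps
  {
    current: '{PWD}/{TOPLEVEL}/{SUBPATH}',
    usr:     '/usr/{TOPLEVEL}/{PKGDIR}/{SUBPATH}'
  }
end

# Expand one template for a logical path such as "share/data/some_file".
# The first path component is treated as TOPLEVEL, the rest as SUBPATH.
def sketch_follow(logical, map_name, pkgdir: 'scout', pwd: Dir.pwd)
  toplevel, subpath = logical.split('/', 2)
  tokens = {
    'PWD' => pwd, 'PKGDIR' => pkgdir, 'PATH' => logical,
    'TOPLEVEL' => toplevel, 'SUBPATH' => subpath.to_s
  }
  sketch_maps.fetch(map_name).gsub(/\{(\w+)\}/) { tokens.fetch($1) }
end

# Return [map_name, path] for the first map whose expansion exists on disk,
# or nil when no map matches. No existence check happens in sketch_follow.
def sketch_find(logical, order: sketch_maps.keys, **opts)
  order.each do |name|
    candidate = sketch_follow(logical, name, **opts)
    return [name, candidate] if File.exist?(candidate)
  end
  nil
end

sketch_follow('share/data/some_file', :usr)
# => "/usr/share/scout/data/some_file"
```

Keeping expansion (`sketch_follow`) separate from the existence check (`sketch_find`) mirrors the split between `follow` and `find` described in this section.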
75
+ Helpers for locating:
76
+ - Path.located?(path) / instance located?: returns true if the path is explicitly located, i.e. starts with `/`, `~/` or `./`.
77
+ - Path.caller_file / Path.caller_lib_dir — helper to find the caller's script dir / lib directory (used to set libdir defaults and map values).
78
+ - Path.follow supports advanced `{PATH/old/new}` style substitutions (see tests showing `Path.follow(path, "/some_dir/{PATH/scout/scout_commands}")`).
79
+
80
+ When a find succeeds, the returned Path has these attributes set:
81
+ - found.where — the map name used to find the file
82
+ - found.original — original logical path string before substitution
83
+
84
+ ---
85
+
86
+ ## Globbing and directory utilities
87
+
88
+ - Path#glob(pattern = "*") — list children matching pattern if this Path points to a directory; returns annotated Path instances.
89
+ - Path#glob_all(pattern = nil) — search across path maps and return all matching annotated paths.
90
+ - Path#directory? — true if the found path is a directory.
91
+ - Path.dirname / Path.basename — string helpers returning annotated strings.
92
+
93
+ ---
94
+
95
+ ## Filename / extension utilities
96
+
97
+ - Path.is_filename?(string, need_to_exists = true) — static predicate to test if a value looks like a filename.
98
+ - Path.sanitize_filename(filename, length = 254) — shorten a filename safely preserving an extension and adding a digest postfix when needed.
99
+ - Extension helpers:
100
+ - get_extension(multiple = false) — last extension or multiple.
101
+ - set_extension(extension) — return path with extension appended.
102
+ - unset_extension — remove last extension.
103
+ - remove_extension(extension = nil) — remove a specific extension or unset last.
104
+ - replace_extension(new_extension, multiple = false) — replace extension(s).
105
+ - Path.relative_to(dir) — returns relative path from dir to this path (uses Misc.path_relative_to).
106
+
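To make the extension helpers concrete, here is a plain-Ruby sketch of the behavior described above. The `sketch_*` functions are hypothetical stand-ins operating on ordinary strings; the real helpers are Path instance methods, and returning `nil` for a missing extension is an assumption.

```ruby
# Last extension without the leading dot, or nil when there is none (assumed).
def sketch_get_extension(path)
  ext = File.extname(path)
  ext.empty? ? nil : ext.delete_prefix('.')
end

# Remove the last extension, if any.
def sketch_unset_extension(path)
  path.sub(/\.[^.\/]+\z/, '')
end

# Append an extension.
def sketch_set_extension(path, extension)
  "#{path}.#{extension}"
end

# Swap the last extension for a new one.
def sketch_replace_extension(path, new_extension)
  sketch_set_extension(sketch_unset_extension(path), new_extension)
end

sketch_unset_extension('/home/.scout/dir/file.txt')          # => "/home/.scout/dir/file"
sketch_replace_extension('/home/.scout/dir/file.txt', 'tsv') # => "/home/.scout/dir/file.tsv"
```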
107
+ ---
108
+
109
+ ## Misc utilities
110
+
111
+ - Path.digest_str — extension in path/digest.rb:
112
+ - If path is a file and exists → "File MD5: <md5>".
113
+ - If path is a directory → "Directory MD5: <digest of glob>".
114
+ - Otherwise returns quoted path string.
115
+
116
+ - Path.no_method_missing — remove the module-level method_missing implementation (rarely used).
117
+
118
+ - TmpFile.with_path — helper that yields a path string and ensures it is a Path (via Path.setup).
119
+
120
+ - Path.newer?(path, file, by_link = false) — compare mtimes; returns truthy if `path` is newer than `file`. Handles non-existing files and optionally compares lstat (links).
121
+
122
+ ---
123
+
124
+ ## Integration & annotations
125
+
126
+ Path extends the Annotation module; each Path can carry annotations:
127
+ - `pkgdir`, `libdir`, `path_maps`, `map_order`. These let different packages or code contexts override search behavior for a path instance.
128
+
129
+ `Path.setup` creates a Path with default annotations:
130
+ - `pkgdir` defaults to Path.default_pkgdir (usually 'scout').
131
+ - `libdir` defaults to caller library directory (Path.caller_lib_dir).
132
+
133
+ Examples from tests and common use:
134
+
135
+ - Compose paths:
136
+ ```ruby
137
+ p = Path.setup('/tmp')
138
+ p.join(:foo) # => "/tmp/foo"
139
+ p[:bar, :foo] # => "/tmp/bar/foo"
140
+ p.foo[:bar] # => "/tmp/foo/bar"
141
+ ```
142
+
143
+ - Find a file across path maps:
144
+ ```ruby
145
+ p = Path.setup("share/data/some_file", 'scout')
146
+ p.find(:usr) # resolves using :usr map -> "/usr/share/scout/data/some_file"
147
+ p.find # searches map_order and returns first existing match
148
+ p.find.where # map name where it was found (e.g. :current)
149
+ p.find.original # original logical path prior to substitution
150
+ ```
151
+
152
+ - Search all matches:
153
+ ```ruby
154
+ Path.setup("share/data/some_file", 'scout').find_all
155
+ ```
156
+
157
+ - Add custom search paths:
158
+ ```ruby
159
+ file = Path.setup("somefile")
160
+ file.append_path('dir1', '/tmp/dir1')
161
+ file.prepend_path('dir2', '/opt/dir2')
162
+ ```
163
+
164
+ - Work with extensions:
165
+ ```ruby
166
+ p = Path.setup("/home/.scout/dir/file.txt")
167
+ p.unset_extension # => "/home/.scout/dir/file"
168
+ p.replace_extension('tsv')
169
+ ```
170
+
171
+ - Glob directories:
172
+ ```ruby
173
+ dir = Path.setup(tmpdir)
174
+ dir.glob # returns annotated Path children
175
+ ```
176
+
177
+ ---
178
+
179
+ ## Notes & edge cases
180
+
181
+ - Path uses `Path.located?` to decide if a path is already absolute / explicitly located (leading `/`, `~/` or `./`).
182
+ - If `find` is called on a non-located path, it searches maps in `map_order`. Map templates are string patterns; if a map is missing, `Path.find` will fall back to the default map.
183
+ - `find` will also check for compressed alternatives (filename.gz, .bgz, .zip) via helper `exists_file_or_alternatives`.
184
+ - The system allows per-path overrides of `pkgdir` and `path_maps` — useful for testing and package-specific layouts.
185
+ - `Path.follow` performs token substitution and supports nested `{PATH/.../...}` style replacements.
186
+ - `caller_lib_dir` and `caller_file` try to infer the caller's library directory; used to set sensible defaults for `libdir`.
187
+ - Many helpers in the framework accept Path objects and will call `.find` (or `.produce_and_find` when supported). Use Path.setup to annotate and pass Path objects to other APIs.
188
+
189
+ ---
190
+
191
+ ## API quick reference
192
+
193
+ - Creation / basics:
194
+ - Path.setup(string, pkgdir=nil)
195
+ - path.join(subpath, prevpath=nil) — aliases: path[:x], path / :x
196
+ - path.method_missing to allow `path.foo` as join
197
+
198
+ - Mapping & locating:
199
+ - Path.path_maps, Path.map_order, Path.add_path / prepend_path / append_path
200
+ - path.follow(map_name=:default) — expand template without checking existence
201
+ - path.find(where = nil) — locate first existing match (annotates `.where` and `.original`)
202
+ - path.find_all — return all existing matches across maps
203
+ - path.find_with_extension(extension, *args)
204
+
205
+ - Filesystem / filename helpers:
206
+ - path.glob(pattern="*"), path.glob_all(pattern=nil)
207
+ - path.dirname, path.basename
208
+ - path.get_extension, set_extension, unset_extension, remove_extension, replace_extension
209
+ - Path.sanitize_filename, Path.is_filename?
210
+
211
+ - Utilities:
212
+ - Path.located?(s), path.located?
213
+ - Path.caller_file, Path.caller_lib_dir
214
+ - Path.digest_str (file/directory MD5 summary)
215
+ - Path.newer?(path, file, by_link = false)
216
+
217
+ This document summarizes the Path module's purpose and primary surface area. Use Path.setup to get annotated strings that interoperate with Open and other framework utilities to find and manipulate package-oriented filesystem resources.
data/doc/Persist.md ADDED
@@ -0,0 +1,214 @@
1
+ # Persist
2
+
3
+ Persist provides a unified persistence layer for serializing, saving and loading Ruby objects to/from files, with helpers for common formats (JSON, YAML, Marshal), typed serialization, memory-backed persistence, atomic writes, caching, and safe concurrent access (locking). It integrates with Open, Path and the Lockfile-based locking in Open.lock.
4
+
5
+ Key capabilities:
6
+ - Typed serialization/deserialization for many common types (string, integer, float, boolean, array, json, yaml, marshal, path, file, binary, and *_array variants).
7
+ - Save/load drivers can be customized (Persist.save_drivers / Persist.load_drivers).
8
+ - Safe, atomic writes using Open.sensible_write.
9
+ - `Persist.persist` — higher-level pattern: compute value if not present or stale, save it, and return result; supports streaming writes and keeping locks while streaming.
10
+ - In-process memory persistence (`:memory`).
11
+ - Helpers to load JSON/YAML/Marshal with Open (Open.json/Open.yaml/Open.marshal).
12
+ - Configurable cache_dir and lock_dir for persisted artifacts and lockfiles.
13
+
14
+ ---
15
+
16
+ ## Configuration & defaults
17
+
18
+ - Persist.cache_dir — default cache directory (Path) used when creating persistence paths. Defaults to Path.setup("var/cache/persistence").
19
+ - Persist.lock_dir — directory used to create lockfiles for persist operations (default tmp/persist_locks).
20
+ - Persist.SERIALIZER — default serializer symbol used when type is `:serializer` (default :json).
21
+ - Persist.TRUE_STRINGS — strings that are treated as true when deserializing booleans.
22
+
23
+ You can set:
24
+ ```ruby
25
+ Persist.cache_dir = Path.setup("var/cache/persistence") # or string
26
+ Persist.lock_dir = Path.setup("tmp/persist_locks").find
27
+ ```
28
+
29
+ ---
30
+
31
+ ## Serialization helpers
32
+
33
+ - Persist.serialize(content, type)
34
+ - Convert an object into a string/bytes suitable to save for the given `type`.
35
+ - Supported types include:
36
+ - nil, :string, :text, :integer, :float, :boolean, :file, :path, :select, :folder, :binary — converted to string representation (IO/StringIO contents read).
37
+ - :array — joined by newline
38
+ - :yaml — YAML.dump
39
+ - :json — JSON.generate
40
+ - :marshal — Marshal.dump
41
+ - :annotation / :annotations — annotation TSV (framework-specific)
42
+ - `<type>_array` variants — will serialize each entry with given `type` and join with newlines
43
+ - Raises if unknown type.
44
+
45
+ - Persist.deserialize(serialized, type)
46
+ - Converts serialized string back into Ruby object for `type`.
47
+ - Supported types include :string, :integer, :float, :boolean, :array, :yaml, :json, :marshal, :path (returns Path.setup(...)), :file, :file_array, :annotation, and `<type>_array` variants.
48
+
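The typed round trip, including the `<type>_array` variants, can be sketched as follows. This is a simplified stand-in built on the Ruby standard library only: `sketch_serialize` and `sketch_deserialize` are hypothetical names, only a subset of types is covered, and the list of strings treated as true is an assumption rather than the actual value of Persist.TRUE_STRINGS.

```ruby
require 'json'
require 'yaml'

def sketch_serialize(content, type)
  type = type.to_s
  if type.end_with?('_array')
    # Serialize each entry with the item type and join with newlines.
    item_type = type.delete_suffix('_array')
    return content.map { |item| sketch_serialize(item, item_type) }.join("\n")
  end
  case type
  when 'string', 'integer', 'float', 'boolean' then content.to_s
  when 'array'   then content.join("\n")
  when 'json'    then JSON.generate(content)
  when 'yaml'    then YAML.dump(content)
  when 'marshal' then Marshal.dump(content)
  else raise ArgumentError, "Unknown type: #{type}"
  end
end

def sketch_deserialize(serialized, type)
  type = type.to_s
  if type.end_with?('_array')
    item_type = type.delete_suffix('_array')
    return serialized.split("\n").map { |line| sketch_deserialize(line, item_type) }
  end
  case type
  when 'string'  then serialized
  when 'integer' then serialized.to_i
  when 'float'   then serialized.to_f
  when 'boolean' then %w[true True TRUE T 1].include?(serialized) # assumed list
  when 'array'   then serialized.split("\n")
  when 'json'    then JSON.parse(serialized)
  when 'yaml'    then YAML.safe_load(serialized)
  when 'marshal' then Marshal.load(serialized)
  else raise ArgumentError, "Unknown type: #{type}"
  end
end

sketch_serialize([1, 2], :integer_array)     # => "1\n2"
sketch_deserialize("1\n2", :integer_array)   # => [1, 2]
```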
49
+ ---
50
+
51
+ ## Low-level save/load
52
+
53
+ - Persist.save(content, file, type = :serializer)
54
+ - Save `content` to `file` according to `type`.
55
+ - Uses `Persist.save_drivers[type]` when registered (callable).
56
+ - If the driver arity is 1, it should return serialized string (Persist will sensible_write it).
57
+ - If arity > 1, the driver is called with (file, content) and should perform the save itself.
58
+ - For binary, writes in 'wb' mode.
59
+ - Otherwise calls `serialize` and writes via `Open.sensible_write(file, serialized, :force => true)`.
60
+ - Special `type == :memory` stores content in in-memory hash.
61
+ - Returns nil (or driver-specific return).
62
+
63
+ - Persist.load(file, type = :serializer)
64
+ - Load and return object saved at `file` according to `type`.
65
+ - Accepts `type` values described in deserialize; special handlers:
66
+ - :binary -> Open.read(file, mode: 'rb')
67
+ - :yaml / :json / :marshal -> uses Open.yaml/Open.json/Open.marshal
68
+ - :stream -> returns Open.open(file) (an IO-like stream)
69
+ - :path -> Path.setup(file)
70
+ - :file -> reads content, handles leading `./` as relative path replacement; returns a filename or file path depending on contents
71
+ - :file_array -> reads lines and expands relative paths
72
+ - :memory -> if type is a Hash, returns type[file]
73
+ - Can use custom `Persist.load_drivers[type]` if registered.
74
+
75
+ - Custom drivers:
76
+ - `Persist.save_drivers[type] = ->(file, content) { ... }` or ->(content) { ... }
77
+ - `Persist.load_drivers[type] = ->(file) { ... }`
78
+
79
+ ---
80
+
81
+ ## The high-level pattern: Persist.persist
82
+
83
+ The most important API: `Persist.persist(name, type = :serializer, options = {}) { ... }`
84
+
85
+ Purpose: compute or produce a value and cache it persistently; subsequent calls return cached value unless stale or update requested.
86
+
87
+ Behavior summary:
88
+ - `name` — logical name (used to build persistent path if path not provided).
89
+ - `type` — serializer type (see above). `:serializer` resolves to Persist.SERIALIZER.
90
+ - `options` — various options (see list below).
91
+ - The block computes and returns the value to be persisted (or may write to an IO stream and return an IO to be consumed).
92
+ - If a persisted file exists and is not stale, `Persist.persist` returns the loaded value (or the file path if `:no_load` is true).
93
+ - If the file is absent or stale, it runs the block, saves the result via `Persist.save`, and returns the result (or file or stream depending on options/return).
94
+
95
+ Common options (via IndiferentHash/persist-specific keys):
96
+ - :dir or :path or :persist[:path] — explicit file path to use (otherwise generated by Persist.persistence_path).
97
+ - :data — optional object passed to block when block accepts arity 1.
98
+ - :no_load — if true, do not load existing file, return file path instead (or file if empty result).
99
+ - :update — boolean or Path; if provided triggers recompute. If a Path is passed it is compared by mtime; if that mtime is newer than the cached file, block executes and caches new result.
100
+ - :lockfile — specify lockfile path (or use defaults created under Persist.lock_dir)
101
+ - :tee_copies — when block returns an IO stream, tee the stream into multiple copies (useful to both write to file and also return streams)
102
+ - :prefix — when persistence_path is generated, can influence tmp filename prefix (used in tests)
103
+ - :canfail — if true, swallow save errors (used by some callers)
104
+ - :update can be used as a path or Time to force recalculation based on modification time
105
+
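The compute-and-cache behavior and the `:update` and `:no_load` options can be illustrated with a small stand-in. This is a sketch under stated assumptions, not how Persist.persist is implemented: `sketch_stale?` and `sketch_persist` are hypothetical names, JSON is the only format, and a temp-file rename stands in for Open.sensible_write.

```ruby
require 'json'
require 'fileutils'

# The cache is stale when it is missing, when update is true, or when update
# names an existing file whose mtime is newer than the cached file.
def sketch_stale?(file, update)
  return true unless File.exist?(file)
  return true if update == true
  return File.mtime(update) > File.mtime(file) if update.is_a?(String) && File.exist?(update)
  false
end

def sketch_persist(file, update: false, no_load: false)
  if sketch_stale?(file, update)
    value = yield
    FileUtils.mkdir_p(File.dirname(file))
    tmp = "#{file}.tmp.#{Process.pid}"
    File.write(tmp, JSON.generate(value))
    File.rename(tmp, file) # rename is atomic within one filesystem
    return no_load ? file : value
  end
  no_load ? file : JSON.parse(File.read(file))
end
```

A call such as `sketch_persist(path) { expensive_compute }` runs the block only when the cache is stale; `expensive_compute` is a placeholder for the caller's own work.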
106
+ Concurrency & locking:
107
+ - Persist.persist uses `Open.lock` with a lockfile derived from persistence path (unless disabled) to avoid concurrent processes duplicating work.
108
+ - When the block returns an IO / StringIO (stream), Persist persists via streaming: it creates tee copies, writes one copy to file (using Open.sensible_write) in a background thread and returns another readable stream to the caller.
109
+ - In that streaming case Persist raises a `KeepLocked` (internal sentinel) so the lock is kept while the background writer thread runs; the calling code obtains an IO to read the stream while the persisting thread writes to file under the same lock.
110
+ - Test usage demonstrates two processes racing to produce the same persisted stream; locking ensures only one writes and others read the saved result or receive a streamed IO.
111
+
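The idea of holding a lock around the check-and-compute step can be sketched with an advisory file lock. Note that this swaps in `File#flock` from the Ruby standard library purely for illustration; the text above describes Lockfile-based locking through `Open.lock`, and `sketch_with_lock` and `sketch_locked_compute` are hypothetical helpers.

```ruby
# Run the block while holding an exclusive advisory lock on `lockfile`.
def sketch_with_lock(lockfile)
  File.open(lockfile, File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX) # blocks until the lock is available
    begin
      yield
    ensure
      f.flock(File::LOCK_UN)
    end
  end
end

# Only the first caller to get the lock computes; later callers find the
# file already present and read it instead of repeating the work.
def sketch_locked_compute(target, lockfile)
  sketch_with_lock(lockfile) do
    File.write(target, yield) unless File.exist?(target)
    File.read(target)
  end
end
```

Because the existence check happens inside the lock, two racing callers cannot both decide that the file is missing.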
112
+ Return semantics:
113
+ - If `no_load` true, returns path (string/Path) instead of loaded object.
114
+ - If the block returns `nil`, behavior:
115
+ - If `no_load` true -> return file path (or file)
116
+ - If type nil -> returns nil (no persist)
117
+ - Else attempt to load from file and return result
118
+
119
+ Memory variant:
120
+ - `Persist.memory(name, options = {}, &block)` — convenience calling `Persist.persist` with `:memory` type (keeps persist in RAM).
121
+ - `Persist.MEMORY_CACHE` holds memory entries.
122
+
123
+ Persistence path helper:
124
+ - `Persist.persistence_path(name, options = {})` — returns a path in cache_dir for the named persistence file (uses TmpFile.tmp_for_file under the hood). Caller can pass `:dir` override.
125
+
126
+ ---
127
+
128
+ ## Helpers & Open/Path integration
129
+
130
+ - Open.json(file) / Open.yaml(file) / Open.marshal(file) — convenience wrappers reading & parsing using Open.open.
131
+ - `Path#yaml` / `Path#json` / `Path#marshal` are added to Path via persist/path.rb as convenience to call Open.* on a Path.
132
+
133
+ ---
134
+
135
+ ## Examples (from tests)
136
+
137
+ Save/load primitive types:
138
+ ```ruby
139
+ Persist.save("ABC", "/tmp/x", :string)
140
+ Persist.load("/tmp/x", :string) # => "ABC"
141
+
142
+ Persist.save([1,2], "/tmp/x", :integer_array)
143
+ Persist.load("/tmp/x", :integer_array) # => [1,2]
144
+ ```
145
+
146
+ Using `persist` to cache a computed value:
147
+ ```ruby
148
+ res = Persist.persist("myname", :json, dir: some_dir) do
149
+ expensive_compute()
150
+ end
151
+ # On subsequent calls returns cached value unless update requested.
152
+ ```
153
+
154
+ Streaming content into persistence and returning an IO stream:
155
+ ```ruby
156
+ io = Persist.persist("stream_name", :string, path: file) do
157
+ Open.open_pipe do |sin|
158
+ 1000.times { |i| sin.puts "line #{i}" }
159
+ end
160
+ end
161
+ # io is a readable IO (stream), background thread writes file atomically while caller can read one copy.
162
+ ```
163
+
164
+ Memory persistence:
165
+ ```ruby
166
+ value = Persist.memory("key"){ expensive_compute }
167
+ ```
168
+
169
+ Force update when a source file changed:
170
+ ```ruby
171
+ Persist.persist("cache_key", :string, update: source_file) { produce_new_value() }
172
+ ```
173
+
174
+ ---
175
+
176
+ ## Custom drivers
177
+
178
+ Register custom save/load behavior for a type:
179
+ ```ruby
180
+ Persist.save_drivers[:mytype] = ->(file, content) { ... } # or ->(content) { ... }
181
+ Persist.load_drivers[:mytype] = ->(file) { ... }
182
+ ```
183
+ If a driver takes one argument, it should return the serialized string, and Persist.save will write it with sensible_write. If a driver expects the file as its first argument, it should perform the write itself.
184
+
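The arity rule can be sketched as a small dispatcher. This is an illustration only: `sketch_save` and the `drivers` hash are hypothetical, and `File.write` stands in for Open.sensible_write.

```ruby
# Dispatch on driver arity: a one-argument driver returns the serialized
# string, a two-argument driver receives (file, content) and writes itself.
def sketch_save(drivers, content, file, type)
  driver = drivers.fetch(type)
  if driver.arity == 1
    File.write(file, driver.call(content))
  else
    driver.call(file, content)
  end
  nil
end

drivers = {
  upcase: ->(content) { content.upcase },
  lines:  ->(file, content) { File.write(file, content.join("\n")) }
}
```

With these example drivers, `sketch_save(drivers, "abc", path, :upcase)` writes `ABC`, while `:lines` writes its array argument one entry per line.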
185
+ ---
186
+
187
+ ## Error handling & notes
188
+
189
+ - Persist.persist wraps work in an Open.lock to avoid duplicate computations; it will rethrow exceptions by default (unless `:canfail` true).
190
+ - If a block raises, persist will normally propagate the exception; tests demonstrate that when the persisted file already exists the caller will get cached value even if block raises (depending on options).
191
+ - When streaming, Persist keeps locks while the background writer thread runs (ensures atomicity and consistent readers).
192
+ - Persist.save uses `Open.sensible_write` → atomic temp->mv behavior and uses locks when requested.
193
+ - The `:path` and `:dir` options allow customizing where persisted files go (useful in tests).
194
+ - Use `Persist.save_drivers` and `Persist.load_drivers` to support external formats or custom storage backends.
195
+
196
+ ---
197
+
198
+ ## API quick reference
199
+
200
+ - Persist.serialize(content, type)
201
+ - Persist.deserialize(serialized, type)
202
+ - Persist.save(content, file, type = :serializer)
203
+ - Persist.load(file, type = :serializer)
204
+ - Persist.persist(name, type = :serializer, options = {}) { ... } — high-level caching pattern (with locking)
205
+ - Persist.memory(name, options = {}, &block) — in-RAM persist wrapper
206
+ - Persist.persistence_path(name, options = {}) — generate a path under Persist.cache_dir
207
+ - Persist.cache_dir / Persist.lock_dir — configurable directories
208
+ - Persist.save_drivers / Persist.load_drivers — custom driver registries
209
+ - Open.json(file), Open.yaml(file), Open.marshal(file) — helpers to read parsed content
210
+ - Path#json / Path#yaml / Path#marshal — Path helpers to read persisted data
211
+
212
+ ---
213
+
214
+ Persist is intended for reproducible caching of computed results in scripts and workflows, where atomic writes and cross-process locking are required. Use `Persist.persist` for the common compute-and-cache pattern; register custom drivers for special file formats or storage backends.
data/doc/Resource.md ADDED
@@ -0,0 +1,229 @@
1
+ # Resource
2
+
3
+ The Resource module provides a filesystem “resource” abstraction and a production system for on-demand creation of files and directories. It integrates tightly with Path and Open: Path objects can be tied to a Resource, and when you attempt to open/read a Path the Resource system may produce that file (download, generate, run a rake task, install software, etc.). Resource also supports simple synchronization, installation helpers and per-package mapping of search paths.
4
+
5
+ Main responsibilities:
6
+ - Declare (claim) resources and how to produce them (string content, proc, URL, rake tasks, installers, etc.).
7
+ - Produce (create) resources on demand, with atomic writes and locking to avoid races.
8
+ - Map filesystem paths back to logical resource identifiers (identify / relocate).
9
+ - Install software helpers and set up environment variables from installed packages.
10
+ - Synchronize resources (rsync) into target locations.
11
+
12
+ Resource is intended to be extended by modules representing package/resource collections. Example in the framework: `Scout` extends Resource and becomes the default resource provider.
13
+
14
+ ---
15
+
16
+ ## Key concepts
17
+
18
+ - Resource module is extended into a module representing a package or resource collection.
19
+ - Example:
20
+ ```ruby
21
+ module MyPkg
22
+ extend Resource
23
+ self.pkgdir = 'mypkg'
24
+ self.subdir = Path.setup('share/mypkg') # optional
25
+ end
26
+ ```
27
+ - A Resource holds claims: mappings between logical paths and how to produce them.
28
+ - Path objects can carry a `pkgdir` referring to a Resource; Path#produce will invoke the corresponding Resource produce logic.
29
+
30
+ ---
31
+
32
+ ## Claiming resources
33
+
34
+ Use `claim` on a Resource to register how to create a particular logical path:
35
+
36
+ - Resource.claim(path, type, content = nil, &block)
37
+ - `path` — a Path (or string that will be converted to Path).
38
+ - `type` — a symbol describing how to produce (see below).
39
+ - `content` or block — content, URL, proc, rakefile, installer script, etc.
40
+
41
+ Common `type` values (used by the built-in produce logic):
42
+ - `:string` — static string written to file.
43
+ - `:proc`: a Proc that returns content (String, IO, Array, TSV/dumper, etc.). If the proc has arity 1, it may receive the final filename.
44
+ - `:url` — remote URL; the resource is produced by downloading the URL (Open.wget/Open.open).
45
+ - `:rake` — a Rakefile or proc to be executed to generate targets via Rake rules (see rake integration).
46
+ - `:install` — calls Resource.install to run system-level install helpers for software.
47
+ - custom types may be supported by extending Resource.produce or providing other logic.
48
+
49
+ Examples:
50
+ ```ruby
51
+ module TestResource
52
+ extend Resource
53
+ claim self.tmp.test.string, :string, "TEST"
54
+ claim self.tmp.test.proc, :proc do
55
+ "PROC TEST"
56
+ end
57
+ claim self.tmp.test.google, :url, "http://google.com"
58
+ claim self.tmp.test.rakefiles.Rakefile , :string, "file('foo'){ |t| Open.write(t.name,'FOO') }"
59
+ claim self.tmp.test.work.footest, :rake, TestResource.tmp.test.rakefiles.Rakefile
60
+ end
61
+ ```
62
+
63
+ ---
64
+
65
+ ## Producing resources
66
+
67
+ - Resource.produce(path, force = false)
68
+ - Produces (creates) the file for `path` according to the claim registered on the Resource.
69
+ - If `force` is true it will overwrite existing file.
70
+ - Uses a lock per target (TmpFile tmp_for_file) and performs atomic write using Open.sensible_write.
71
+ - Supports producing compressed alternatives (tries .gz and .bgz when appropriate).
72
+ - Handles `:string`, `:proc`, `:url`, `:rake`, `:install` and other types as implemented.
73
+
74
+ Behavior notes:
75
+ - For `:proc` type: the proc may return String, IO, Array (written as newlines), TSV dumper, etc. Open.sensible_write is used.
76
+ - For streaming scenarios the block may produce an IO stream; produce writes the stream atomically using tees and a background thread under lock.
77
+ - On failure the partial output is removed and exception re-raised.
78
+
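The claim-then-produce flow can be sketched with a plain Hash as the registry. This is a simplified stand-in, not the Resource implementation: `sketch_claim` and `sketch_produce` are hypothetical, only `:string` and `:proc` are handled, there is no locking, and a temp-file rename stands in for Open.sensible_write.

```ruby
require 'fileutils'

# Register how to produce `path`: a type plus literal content or a block.
def sketch_claim(claims, path, type, content = nil, &block)
  claims[path] = [type, content || block]
end

# Produce the file for `path` unless it already exists (or force is true).
def sketch_produce(claims, path, force = false)
  return path if File.exist?(path) && !force
  type, content = claims.fetch(path)
  tmp = "#{path}.tmp.#{Process.pid}"
  begin
    data =
      case type
      when :string then content
      when :proc
        # A one-argument proc receives the final filename.
        result = content.arity == 1 ? content.call(path) : content.call
        result.is_a?(Array) ? result.join("\n") : result.to_s
      else raise ArgumentError, "Unsupported claim type: #{type}"
      end
    FileUtils.mkdir_p(File.dirname(path))
    File.write(tmp, data)
    File.rename(tmp, path)
  rescue StandardError
    FileUtils.rm_f(tmp) # remove partial output, then re-raise
    raise
  end
  path
end
```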
79
+ Convenience wrappers on Path trigger production:
80
+ - `Path#produce(force = false)` — call resource-produce for the path.
81
+ - `Path#produce_with_extension(ext, *args)` — try producing path or path.ext.
82
+ - `Path#produce_and_find(extension = nil, *args)` — produce then return found path.
83
+ - `Path#open`/`Path#read` will call `produce` before delegating to Open.
84
+
85
+ ---
86
+
87
+ ## Rake integration
88
+
89
+ Resource supports producing files by invoking Rake tasks:
90
+
91
+ - `Resource.claim(..., :rake, rakefile)`: the claim references a Rakefile (Path, proc or string). The Rakefile should define file rules/tasks for the targets.
92
+ - Under the hood:
93
+ - `ScoutRake.run(rakefile, dir, task)` forks and runs Rake in a child, invoking the requested task (task is derived from relative path).
94
+ - `Rake::FileTask.define_task` is hooked to track file tasks.
95
+ - `Resource.run_rake(path, rakefile, rake_dir)` handles invoking the appropriate rake task for a path and supports retries moving up directories to find tasks.
96
+
97
+ This makes it possible to register a Rakefile that generates many resources via Rake rules.
98
+
99
+ ---
100
+
101
+ ## Identification & relocation
102
+
103
+ - Resource.identify(path)
104
+ - Given an absolute `path`, try to identify its logical package/path by matching the configured `path_maps` for the resource.
105
+ - Returns an unlocated Path (logical path within the package) or a Path annotated with the resource context.
106
+
107
+ - Resource.relocate(path)
108
+ - If `path` does not currently exist locally, attempt to identify it and find it in configured maps (e.g., different locations, caches) returning an available path.
109
+
110
+ - Path#identify (delegates to Resource.identify for Path objects with pkgdir set).
111
+ - Use `Resource.identify` to map filesystem locations back to `TOPLEVEL/{PKGDIR}/{SUBPATH}` style logical identifiers.
112
+
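One way to picture identification is to turn each map template into a regular expression and match the absolute path against it. The sketch below assumes that approach for illustration; it is not taken from the Resource source, `sketch_template_regexp` and `sketch_identify` are hypothetical names, and only the {PKGDIR}, {TOPLEVEL} and {SUBPATH} tokens are handled.

```ruby
# Build a regexp with named captures from a template such as
# "/usr/{TOPLEVEL}/{PKGDIR}/{SUBPATH}".
def sketch_template_regexp(template, pkgdir)
  pattern = Regexp.escape(template) # braces become \{ and \}
  pattern = pattern.gsub('\{PKGDIR\}')   { Regexp.escape(pkgdir) }
  pattern = pattern.gsub('\{TOPLEVEL\}') { '(?<toplevel>[^/]+)' }
  pattern = pattern.gsub('\{SUBPATH\}')  { '(?<subpath>.+)' }
  Regexp.new("\\A#{pattern}\\z")
end

# Return the logical path for the first template that matches, else nil.
def sketch_identify(path, templates, pkgdir: 'scout')
  templates.each_value do |template|
    m = sketch_template_regexp(template, pkgdir).match(path)
    return File.join(m[:toplevel], m[:subpath]) if m
  end
  nil
end

templates = { usr: '/usr/{TOPLEVEL}/{PKGDIR}/{SUBPATH}' }
sketch_identify('/usr/share/scout/data/some_file', templates)
# => "share/data/some_file"
```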
113
+ ---
114
+
115
+ ## Sync / rsync support
116
+
117
+ - `Resource.sync(path, map = nil, options = {})`
118
+   - Copy a resource (file or directory) into a target location determined by the resource maps and the `map` name (defaults to `:user`).
119
+   - Uses `Open.sync` (an rsync wrapper) for the actual transfer.
120
+   - Accepts a `:resource` option to choose a resource module other than the path's pkgdir.
121
+
122
+ ---
123
+
124
+ ## Software install helpers
125
+
126
+ Resource includes helpers to install software in a per-resource `software` directory and to set environment variables:
127
+
128
+ - `Resource.install(content, name, software_dir = Path.setup('software'))`
129
+   - `content` may be a script, Path, String, Hash describing git/src/jar/commands, or a block that returns script text.
130
+   - Builds a wrapper script with a common install helper preamble and executes installation commands (using CMD).
131
+   - After install, calls `Resource.set_software_env(software_dir)`.
132
+
133
+ - `Resource.set_software_env(software_dir)`
134
+   - Scans `software/opt` configuration files (`.ld-paths`, `.c-paths`, `.pkgconfig-paths`, `.aclocal-paths`, `.java-classpaths`) and adds entries to environment variables (PATH, CLASSPATH, PKG_CONFIG_PATH, etc.).
135
+   - Adds `opt/bin` to PATH and loads `.post_install` exports.
136
+
137
+ This supports installing package-local tools and updating the runtime environment.
138
+
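As a rough illustration of the environment scan, the sketch below reads path-list files from an `opt` directory and prepends their entries to the matching variables. The file-to-variable pairs in `PATH_FILES`, the one-entry-per-line file format, and the `software_env` helper are assumptions for the example. It works on a plain Hash rather than `ENV`, and it does not reproduce `Resource.set_software_env`.

```ruby
# Assumed mapping from list file to environment variable (hypothetical).
PATH_FILES = {
  ".ld-paths"        => "LD_LIBRARY_PATH",
  ".pkgconfig-paths" => "PKG_CONFIG_PATH",
  ".java-classpaths" => "CLASSPATH",
}

# Return a copy of `env` with the entries listed under `opt_dir` prepended.
def software_env(opt_dir, env = {})
  env = env.dup
  sep = File::PATH_SEPARATOR

  PATH_FILES.each do |file, variable|
    list = File.join(opt_dir, file)
    next unless File.exist?(list)
    # Assumes one directory per line; blank lines are skipped.
    entries = File.readlines(list, chomp: true).reject(&:empty?)
    current = env[variable].to_s.split(sep)
    env[variable] = (entries + current).uniq.join(sep)
  end

  # Put opt/bin first so package-local tools take precedence.
  bin = File.join(opt_dir, "bin")
  env["PATH"] = ([bin] + env["PATH"].to_s.split(sep)).uniq.join(sep)
  env
end
```

Returning a new Hash keeps the sketch free of side effects; a caller could merge the result into `ENV` once it looks right.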
139
+ ---
140
+
141
+ ## Helpers / utilities
142
+
143
+ - `rake_for(path)`, `has_rake?(path)` — find registered rake dirs that can produce a given path.
144
+ - `run_rake(path, rakefile, rake_dir)` — run rake task to produce target.
145
+ - `relocate(path)` — attempt to relocate a missing path via resource identification.
146
+ - Method delegation: Resource defines `root` and `method_missing` so resource instances behave like a Path root; e.g. `MyResource.foo.bar` delegates to `MyResource.root.foo.bar`.
147
+
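The delegation pattern in the last bullet can be sketched without the library. `SketchPath` and `SketchResource` are hypothetical stand-ins for Path and Resource; the sketch shows only how `method_missing` forwards unknown calls to `root`, and how each call extends the path by one segment.

```ruby
# Stand-in for Path: every unknown, argument-less method appends a segment.
class SketchPath
  attr_reader :path

  def initialize(path)
    @path = path
  end

  def method_missing(name, *args)
    return super unless args.empty? && segment?(name)
    SketchPath.new(File.join(path, name.to_s))
  end

  def respond_to_missing?(name, include_private = false)
    segment?(name) || super
  end

  def to_s
    path
  end

  private

  # Plain identifiers only; conversion hooks such as to_ary are left alone.
  def segment?(name)
    name.match?(/\A[A-Za-z_]\w*\z/) && !name.to_s.start_with?("to_")
  end
end

# Stand-in for Resource: unknown methods are forwarded to the root path.
module SketchResource
  attr_writer :subdir

  def root
    SketchPath.new(@subdir || "")
  end

  def method_missing(name, *args)
    root.respond_to?(name) ? root.public_send(name, *args) : super
  end

  def respond_to_missing?(name, include_private = false)
    root.respond_to?(name) || super
  end
end

module MyResource
  extend SketchResource
  self.subdir = "share/my_resource"
end

puts MyResource.foo.bar # same path as MyResource.root.foo.bar
```

Defining `respond_to_missing?` next to `method_missing` keeps `respond_to?` consistent with what the object accepts.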
148
+ ---
149
+
150
+ ## Concurrency & atomicity
151
+
152
+ - `Resource.produce` uses `Open.lock` to acquire a lockfile for the target before producing, so concurrent producers do not collide.
153
+ - Writes are performed via `Open.sensible_write` (temp file then atomic mv) to avoid partial files being visible.
154
+ - Rake-run and other production steps execute in subprocesses or controlled threads so they don't corrupt the workspace.
155
+
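The lock-then-atomic-write pattern described above can be sketched with the Ruby standard library alone. This is a minimal sketch and does not use `Open.lock` or `Open.sensible_write`; the `produce_atomically` helper and the `.lock`/`.tmp` file names are assumptions for the example.

```ruby
require "fileutils"
require "tmpdir"

# Produce `target` at most once: hold an exclusive lock on a sidecar lockfile,
# write the content to a temporary file, then rename it into place.
def produce_atomically(target)
  return target if File.exist?(target)
  FileUtils.mkdir_p(File.dirname(target))

  File.open("#{target}.lock", File::RDWR | File::CREAT) do |lock|
    lock.flock(File::LOCK_EX)            # blocks while another producer works
    return target if File.exist?(target) # someone else produced it meanwhile

    tmp = "#{target}.tmp.#{Process.pid}"
    begin
      File.write(tmp, yield)   # the block returns the content to store
      File.rename(tmp, target) # atomic when tmp and target share a filesystem
    ensure
      FileUtils.rm_f(tmp)      # no-op after the rename; cleans up on failure
    end
  end
  target
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "test", "string")
  produce_atomically(target) { "TEST" }
  puts File.read(target)
end
```

Readers either see no file or the complete file, because the target name only appears once the rename succeeds; any exception raised by the block propagates to the caller after cleanup.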
156
+ ---
157
+
158
+ ## Integration details
159
+
160
+ - `Path.open` is overridden to call `produce` on Path objects before delegating to `Open.open`, so many consumers transparently trigger production when they open a resource.
161
+ - `Resource.default_resource` can be set to a default package (the framework sets `Resource.default_resource = Scout`).
162
+ - Resources use Path.path_maps to find logical locations; packages typically set `pkgdir` and `subdir` and may provide per-resource `path_maps`.
163
+
164
+ ---
165
+
166
+ ## Examples (from tests)
167
+
168
+ Registering and producing resources:
169
+ ```ruby
170
+ module TestResource
171
+   extend Resource
172
+   self.subdir = Path.setup('tmp/test-resource')
173
+
174
+   claim self.tmp.test.string, :string, "TEST"
175
+   claim self.tmp.test.proc, :proc do
176
+     "PROC TEST"
177
+   end
178
+   claim self.tmp.test.google, :url, "http://google.com"
179
+ end
180
+
181
+ TestResource.produce TestResource.tmp.test.string
182
+ puts TestResource.tmp.test.string.read # "TEST"
183
+ TestResource.tmp.test.google.produce # downloads google HTML
184
+ ```
185
+
186
+ Rake-based production:
187
+ ```ruby
188
+ # Inside the resource module body: claim the Rakefile itself as a string resource
189
+ claim self.tmp.test.rakefiles.Rakefile, :string, <<-EOF
190
+ file('foo') { |t| Open.write(t.name, "FOO") }
191
+ rule(/.*/) do |t|
192
+   Open.write(t.name, "OTHER")
193
+ end
194
+ EOF
195
+
196
+ claim self.tmp.test.work.footest, :rake, TestResource.tmp.test.rakefiles.Rakefile
197
+ # produce targets:
198
+ TestResource.produce TestResource.tmp.test.work.footest.foo # runs rake task to create foo
199
+ ```
200
+
201
+ Software installation:
202
+ ```ruby
203
+ Resource.install nil, "scout_install_example", tmpdir.software do
204
+   <<-EOF
205
+ echo "#!/bin/bash\necho WORKING" > $OPT_BIN_DIR/scout_install_example
206
+ chmod +x $OPT_BIN_DIR/scout_install_example
207
+   EOF
208
+ end
209
+ # then the installed helper is available in PATH via set_software_env
210
+ ```
211
+
212
+ Syncing:
213
+ ```ruby
214
+ Resource.sync(source_path, :current) # rsync source into resource's current map target
215
+ ```
216
+
217
+ ---
218
+
219
+ ## Notes & caveats
220
+
221
+ - Resource assumes the availability of external commands when needed (wget, git, bash, rake, etc.).
222
+ - Rake tasks are executed in forked subprocesses; Rake definitions must be loadable in that context.
223
+ - Production steps must be idempotent or guarded by the lock/atomic write semantics.
224
+ - When claiming URL resources, network failures or remote changes can affect production; consider caching strategies or update checks in callers.
225
+ - `Resource.identify` relies on path templates and caller libdir heuristics — mapping may require customizing `path_maps` per resource.
226
+
227
+ ---
228
+
229
+ Resource centralizes how a package's resources are produced and located, so that clients can simply ask for a Path and have the resource created on demand with safe concurrency and atomic writes. Use `claim` to declare resources and `produce` or `Path.open` to trigger creation.