scout-essentials 1.7.1 → 1.8.0

Files changed (54)
  1. checksums.yaml +4 -4
  2. data/.vimproject +200 -47
  3. data/README.md +136 -0
  4. data/Rakefile +1 -0
  5. data/VERSION +1 -1
  6. data/doc/Annotation.md +352 -0
  7. data/doc/CMD.md +203 -0
  8. data/doc/ConcurrentStream.md +163 -0
  9. data/doc/IndiferentHash.md +240 -0
  10. data/doc/Log.md +235 -0
  11. data/doc/NamedArray.md +174 -0
  12. data/doc/Open.md +331 -0
  13. data/doc/Path.md +217 -0
  14. data/doc/Persist.md +214 -0
  15. data/doc/Resource.md +229 -0
  16. data/doc/SimpleOPT.md +236 -0
  17. data/doc/TmpFile.md +154 -0
  18. data/lib/scout/annotation/annotated_object.rb +8 -0
  19. data/lib/scout/annotation/annotation_module.rb +1 -0
  20. data/lib/scout/cmd.rb +19 -12
  21. data/lib/scout/concurrent_stream.rb +3 -1
  22. data/lib/scout/config.rb +2 -2
  23. data/lib/scout/indiferent_hash/options.rb +2 -2
  24. data/lib/scout/indiferent_hash.rb +16 -0
  25. data/lib/scout/log/color.rb +5 -3
  26. data/lib/scout/log/fingerprint.rb +8 -8
  27. data/lib/scout/log/progress/report.rb +6 -6
  28. data/lib/scout/log.rb +7 -7
  29. data/lib/scout/misc/digest.rb +11 -13
  30. data/lib/scout/misc/format.rb +2 -2
  31. data/lib/scout/misc/system.rb +5 -0
  32. data/lib/scout/open/final.rb +16 -1
  33. data/lib/scout/open/remote.rb +0 -1
  34. data/lib/scout/open/stream.rb +30 -5
  35. data/lib/scout/open/util.rb +32 -0
  36. data/lib/scout/path/digest.rb +12 -2
  37. data/lib/scout/path/find.rb +19 -6
  38. data/lib/scout/path/util.rb +37 -1
  39. data/lib/scout/persist/open.rb +2 -0
  40. data/lib/scout/persist.rb +7 -1
  41. data/lib/scout/resource/path.rb +2 -2
  42. data/lib/scout/resource/util.rb +18 -4
  43. data/lib/scout/resource.rb +15 -1
  44. data/lib/scout/simple_opt/parse.rb +2 -0
  45. data/lib/scout/tmpfile.rb +1 -1
  46. data/scout-essentials.gemspec +19 -6
  47. data/test/scout/misc/test_hook.rb +2 -2
  48. data/test/scout/open/test_stream.rb +43 -15
  49. data/test/scout/path/test_find.rb +1 -1
  50. data/test/scout/path/test_util.rb +11 -0
  51. data/test/scout/test_path.rb +4 -4
  52. data/test/scout/test_persist.rb +10 -1
  53. metadata +31 -5
  54. data/README.rdoc +0 -18
data/doc/NamedArray.md ADDED
@@ -0,0 +1,174 @@
1
+ # NamedArray
2
+
3
+ NamedArray is a small utility mixin built on top of the Annotation system that gives arrays named fields and name-based accessors. It lets you treat an Array like a record/tuple where elements can be accessed by name (symbol or string), supports fuzzy name matching, conversion to a hash (indifferent to string/symbol keys), and provides helpers for zipping/combining lists of named values.
4
+
5
+ NamedArray extends Annotation and declares two annotation attributes:
6
+ - fields — an ordered list of names for each position in the array
7
+ - key — an optional primary key field name
8
+
9
+ Since NamedArray extends Annotation, you can apply it to an array via NamedArray.setup(array, fields, key: ...), or by extending an instance.
10
+
11
+ Examples:
12
+ ```ruby
13
+ a = NamedArray.setup([1,2], [:a, :b])
14
+ a[:a] # => 1
15
+ a["b"] # => 2
16
+ a.a # => 1 # method_missing lookup
17
+ a.to_hash # => IndiferentHash { a: 1, b: 2 }
18
+ ```
19
+
20
+ ## Core instance API
21
+
22
+ - fields, key
23
+ - Provided by Annotation (accessors). fields is an Array of field names associated with array positions.
24
+
25
+ - all_fields
26
+ - Returns [key, fields].compact.flatten — useful if you want the key included with other fields.
27
+
28
+ - [](name_or_index)
29
+ - Accepts a field name (symbol or string) or numeric index. Name is resolved to a numeric position via identify_name; if unresolved returns nil.
30
+ - Example: a[:a], a["a"], a[0]
31
+
32
+ - []=(name_or_index, value)
33
+ - Sets element by name (resolved to a position) or index; returns nil if name not found.
34
+
35
+ - positions(fields)
36
+ - Resolve one or many fields to their positions (delegates to NamedArray.identify_name).
37
+
38
+ - values_at(*positions)
39
+ - Accepts named fields or positions; it will translate names to indices before calling Array#values_at.
40
+
41
+ - concat(other)
42
+ - If other is a Hash: appends values of the hash in iteration order and adds the hash keys to this array's fields list.
43
+ Example:
44
+ a.concat({c: 3, d: 4}) # adds 3 and 4 to array and [:c, :d] to fields
45
+ - If other is another NamedArray: standard concat and fields from other are appended.
46
+ - Otherwise behaves like Array#concat.
47
+
48
+ - to_hash
49
+ - Returns a hash mapping fields => value for each field position. The returned hash is extended with IndiferentHash (so both string and symbol lookups work).
50
+ - Example: a.to_hash[:a] => 1
51
+
52
+ - prety_print
53
+ - Convenience pretty-print wrapper: uses Misc.format_definition_list(self.to_hash, sep: "\n").
54
+
55
+ - method_missing(name, *args)
56
+ - If name resolves to a field (via identify_name) returns self[name]; otherwise calls super. This gives quick accessors like a.foo
57
+
58
+ ## Name resolution and matching
59
+
60
+ NamedArray provides flexible name resolution via:
61
+
62
+ - NamedArray.identify_name(names, selected, strict: false)
63
+ - names: array of field names (usually the NamedArray#fields)
64
+ - selected: value to resolve — may be nil, Range, Integer, Symbol, or String
65
+ - Returns:
66
+ - Integer index (position) for a single field selection
67
+ - Range (unchanged) if a Range is passed
68
+ - 0 for nil (treat nil as first field)
69
+ - :key for Symbol :key (special sentinel)
70
+ - nil if unresolved
71
+
72
+ Resolution rules:
73
+ - nil => 0
74
+ - Range => returned as-is
75
+ - Integer => returned as-is
76
+ - Symbol:
77
+ - if :key => returns :key
78
+ - otherwise finds first field whose to_s equals the symbol name
79
+ - String:
80
+ - exact string match first
81
+ - if string is numeric (^\d+$) it is treated as an index
82
+ - unless strict: fuzzy match using NamedArray.field_match
83
+ - field_match returns true if:
84
+ - exact equality
85
+ - one contains the other inside parentheses
86
+ - one starts with the other followed by a space
87
+ - returns the index found or nil if none
88
+
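The resolution rules listed above can be sketched in plain Ruby. This is a minimal illustration written only from the rules as documented here, not the gem's actual source; the method names `sketch_field_match` and `sketch_identify_name` are hypothetical, and array-valued selections are left out for brevity.

```ruby
# Hypothetical re-implementation of the documented fuzzy-match rules.
# Not the library's code: names and structure are assumptions.
def sketch_field_match(field, name)
  field = field.to_s
  name = name.to_s
  return true if field == name                                  # exact equality
  return true if field.include?("(#{name})") || name.include?("(#{field})") # inside parentheses
  field.start_with?("#{name} ") || name.start_with?("#{field} ") # prefix followed by a space
end

# Hypothetical resolver following the documented order of checks.
def sketch_identify_name(names, selected, strict: false)
  case selected
  when nil then 0                      # nil is treated as the first field
  when Range, Integer then selected    # returned unchanged
  when Symbol
    return :key if selected == :key    # special sentinel
    names.index { |n| n.to_s == selected.to_s }
  when String
    exact = names.index { |n| n.to_s == selected }
    return exact if exact                                # exact match first
    return selected.to_i if selected.match?(/\A\d+\z/)   # numeric string as index
    return nil if strict                                 # no fuzzy matching when strict
    names.index { |n| sketch_field_match(n, selected) }
  end
end

names = ["ValueA", "ValueB (Entity type)", "15"]
p sketch_identify_name(names, "ValueB")               # fuzzy match on the prefix
p sketch_identify_name(names, "ValueB", strict: true) # unresolved under strict
```

Note that an exact match wins over the numeric-index rule in this sketch, so "15" resolves to the field literally named "15" rather than to position 15.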
89
+ Instance helper identify_name(selected) delegates to the class method using this NamedArray's fields.
90
+
91
+ Note: identify_name accepts arrays for selected (returns an array of resolved positions), so values_at and other helpers can pass multiple names.
92
+
93
+ ## Class-level helpers for lists
94
+
95
+ - NamedArray.field_match(field, name)
96
+ - Helper used by identify_name for fuzzy matching of two strings (parentheses and prefix matching).
97
+
98
+ - NamedArray._zip_fields(array, max = nil)
99
+ - Internal helper to zip together an array of lists, expanding single-element lists to match `max`.
100
+
101
+ - NamedArray.zip_fields(array)
102
+ - Zips a list-of-lists into per-position combined lists. Optimized to slice large inputs when array length is huge.
103
+
104
+ Example:
105
+ ```ruby
106
+ NamedArray.zip_fields([ %w(a b), %w(1 1) ]) # => [["a","1"], ["b","1"]]
107
+ ```
108
+
109
+ - NamedArray.add_zipped(source, new)
110
+ - Given two zipped-lists (source and new), concatenates each corresponding sub-array from `new` into `source` (skips nil entries).
111
+ - Useful to merge results incrementally.
112
+
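To make the zipping helpers concrete, here is a small plain-Ruby sketch of the behavior described above. It is an assumption-based illustration, not the gem's implementation (which is described as also slicing very large inputs); `sketch_zip_fields` and `sketch_add_zipped` are hypothetical names.

```ruby
# Hypothetical sketch: transpose a list of lists into per-position lists,
# repeating single-element lists so they match the longest list.
def sketch_zip_fields(lists)
  return [] if lists.empty?
  max = lists.map(&:length).max
  expanded = lists.map { |l| l.length == 1 ? l * max : l }
  expanded.first.zip(*expanded.drop(1))
end

# Hypothetical sketch: append each sub-array of `extra` to the matching
# sub-array of `source`, skipping nil entries. Mutates and returns `source`.
def sketch_add_zipped(source, extra)
  source.each_with_index do |list, i|
    addition = extra[i]
    list.concat(addition) unless addition.nil?
  end
  source
end

p sketch_zip_fields([%w(a b), %w(1 1)])

rows = [%w(a b), %w(1 1)]
sketch_add_zipped(rows, [%w(c), %w(1)])
p rows
```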
113
+ ## Concatenation with Hash
114
+
115
+ Calling concat with a Hash behaves like:
116
+ ```ruby
117
+ a = NamedArray.setup([1,2], [:a, :b])
118
+ a.concat({c: 3, d: 4})
119
+ # resulting array becomes [1,2,3,4] and fields => [:a, :b, :c, :d]
120
+ ```
121
+
122
+ This is handy when building named rows incrementally from keyed data.
123
+
124
+ ## Integration with Annotation
125
+
126
+ Because NamedArray extends Annotation:
127
+ - You can call NamedArray.setup(array, fields) to set the `@fields` annotation on the array and extend it with NamedArray behavior.
128
+ - NamedArray.setup delegates to the Annotation::AnnotationModule.setup implementation for assigning @fields/@key values to the array.
129
+
130
+ Example:
131
+ ```ruby
132
+ a = NamedArray.setup([1,2], [:a, :b])
133
+ a.fields # => [:a, :b]
134
+ ```
135
+
136
+ ## Examples (from tests)
137
+
138
+ Identify names:
139
+ ```ruby
140
+ names = ["ValueA", "ValueB (Entity type)", "15"]
141
+ NamedArray.identify_name(names, "ValueA") # => 0
142
+ NamedArray.identify_name(names, :key) # => :key
143
+ NamedArray.identify_name(names, nil) # => 0
144
+ NamedArray.identify_name(names, "ValueB") # => 1 (fuzzy match)
145
+ NamedArray.identify_name(names, 1) # => 1
146
+ ```
147
+
148
+ Basic named array usage:
149
+ ```ruby
150
+ a = NamedArray.setup([1,2], [:a, :b])
151
+ a[:a] # => 1
152
+ a[:c] # => nil
153
+ a.a # => 1 (method_missing provides a getter)
154
+ a.to_hash # => IndiferentHash { a: 1, b: 2 }
155
+ ```
156
+
157
+ Zipping and adding zipped:
158
+ ```ruby
159
+ NamedArray.zip_fields([ %w(a b), %w(1 1) ]) # => [["a","1"], ["b","1"]]
160
+
161
+ a = [%w(a b), %w(1 1)]
162
+ NamedArray.add_zipped(a, [%w(c), %w(1)])
163
+ NamedArray.add_zipped(a, [%w(d), %w(1)])
164
+ # a => [%w(a b c d), %w(1 1 1 1)]
165
+ ```
166
+
167
+ ## Notes & caveats
168
+
169
+ - Name matching is intentionally forgiving (parentheses and space-prefix checks). Use `identify_name(..., strict: true)` to force exact matches only.
170
+ - The `fields` annotation must correspond to element positions in the array. If fields and array lengths differ, name resolution may return nil or indices outside current array bounds.
171
+ - to_hash returns an IndiferentHash (so consumers can use either string or symbol keys).
172
+ - method_missing exposes field getters only; it does not create setters (use []= to assign by name).
173
+
174
+ NamedArray is small but convenient when treating Arrays as records/rows with named columns and needing flexible lookup and composition tools.
data/doc/Open.md ADDED
@@ -0,0 +1,331 @@
1
+ # Open
2
+
3
+ The Open module provides unified, high-level file/stream/remote I/O and filesystem utilities. It wraps plain File I/O, streaming helpers, remote fetching (wget/ssh), atomic/sensible writes, pipe/fifo helpers, gzip/bgzip/zip helpers, file-system operations (mkdir, mv, ln, cp, rm, etc.), and a lock wrapper (Lockfile). Use Open when you need robust file access, streaming, temporary/atomic writes, remote access and process-safe locking.
4
+
5
+ Sections:
6
+ - Opening / reading / writing files
7
+ - Streams, pipes and tees
8
+ - Sensible / atomic writes
9
+ - Remote fetching, caching and downloads
10
+ - File / filesystem helpers
11
+ - Locking
12
+ - Sync (rsync)
13
+ - Utilities (gzip/bgzip/grep/sort/collapse)
14
+ - Examples
15
+ - Notes and edge cases
16
+
17
+ ---
18
+
19
+ ## Opening / reading / writing files
20
+
21
+ Open unifies access and transparently handles compressed and remote files.
22
+
23
+ - Open.open(file, options = {}) { |io| ... } or returns an IO-like stream
24
+ - Accepts File paths, Path objects, IO, StringIO.
25
+ - Options (via IndiferentHash):
26
+ - :mode (default 'r') — file open mode (e.g., 'r', 'w', 'rb', etc.)
27
+ - :grep, :invert_grep, :fixed_grep — pipe the stream through grep
28
+ - :noz — if true, do not auto-decompress zip/gzip/bgz
29
+ - :gzip / :bgzip / :zip — force decompression
30
+ - For compressed files: Open detects .gz, .bgz, .zip and pipes through gzip/bgzip/unzip unless mode includes "w" (noz true).
31
+ - The returned stream is extended with NamedStream (has .filename and digest_str helper).
32
+
33
+ - Open.file_open(file, grep = false, mode = 'r', invert_grep = false, fixed_grep = true, options = {})
34
+ - Returns the basic stream (no auto-decompress). Uses get_stream to handle remote/ssh/wget or open file.
35
+
36
+ - Open.get_stream(file, mode = 'r', options = {})
37
+ - Low-level stream getter:
38
+ - If file is already a stream, returns it.
39
+ - If file responds to .stream, returns file.stream.
40
+ - If Path, resolves .find.
41
+ - If remote URL, delegates to Open.ssh or Open.wget.
42
+ - Finally falls back to File.open(File.expand_path(file), mode).
43
+
44
+ - Open.read(file, options = {}) { |line| ... } or returns String
45
+ - If block given: yields lines from file (fixes UTF-8 by default; suppressed via :nofix).
46
+ - If no block: returns full file contents (UTF-8 fixed by default).
47
+ - Supports :grep and :invert_grep (see tests).
48
+
49
+ - Open.write(file, content = nil, options = {})
50
+ - Atomic/robust writing wrapper:
51
+ - options default includes :mode => 'w'
52
+ - If mode includes 'w' ensures parent directory exists.
53
+ - If a block is given, yields a File object opened in mode and ensures close; on exception removes target file.
54
+ - If content is nil => writes empty file.
55
+ - If content is String => writes content.
56
+ - If content is an IO/StringIO => streams content into file with locking.
57
+ - On success calls Open.notify_write(file) and returns nil.
58
+ - Use for simple writes; for safer concurrency-sensitive writes prefer Open.sensible_write (below).
59
+
60
+ ---
61
+
62
+ ## Streams, pipes and tees
63
+
64
+ Open contains multiple helpers to produce and manage streams:
65
+
66
+ - Open.consume_stream(io, in_thread = false, into = nil, into_close = true)
67
+ - Consumes `io` reading blocks of size BLOCK_SIZE and writes into `into` (file or IO) or discards.
68
+ - If in_thread true, spawns a consumer thread and returns it (thread named and stored).
69
+ - Handles exceptions: closes and deletes partial files on failure; forwards abort to stream if supported.
70
+
71
+ - Open.pipe
72
+ - Creates an IO.pipe pair [reader, writer] and records writer for management. Returns [sout, sin] (sout reader, sin writer).
73
+ - Caller typically uses sout as stream to read and sin to write.
74
+
75
+ - Open.open_pipe { |sin| ... } -> returns sout
76
+ - Creates a pipe and runs the block with `sin` (writer) in a new thread (or fork if requested). The returned `sout` is a ConcurrentStream configured to join/handle threads/pids.
77
+ - Example: use for producing a stream on-the-fly for other consumers.
78
+
79
+ - Full signature: Open.open_pipe(do_fork = false, close = true). The block receives sin in a new thread (or in a forked child when do_fork is true), and sout is returned to the caller.
80
+
81
+ - Open.tee_stream_thread(stream) / Open.tee_stream_thread_multiple(stream, num)
82
+ - Duplicate input stream into multiple output pipes; returns out pipes.
83
+ - Uses a splitter thread that reads from the source and writes to each pipe; sets up abort callbacks and cleanup.
84
+ - Useful to fan-out a single stream to multiple consumers without re-reading source.
85
+
86
+ - Open.tee_stream(stream) — convenience returns two outputs.
87
+
88
+ - Open.read_stream(stream, size)
89
+ - Blocking read helper that reads exactly `size` bytes; raises ClosedStream if the stream reaches EOF first.
90
+
91
+ - Open.with_fifo(path = nil, clean = true) { |path| ... }
92
+ - Create FIFO in temp path and yield it; removes it after block.
93
+
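The fan-out idea behind the tee helpers (a splitter thread that reads the source once and writes each chunk to several pipes) can be sketched with the standard library alone. This is a simplified illustration under stated assumptions: `sketch_tee` is a hypothetical name, and it omits the abort callbacks and ConcurrentStream handling that the real helpers are described as providing.

```ruby
require "stringio"

# Hypothetical sketch: duplicate `source` into `copies` readable pipes.
# Returns [readers, splitter_thread].
def sketch_tee(source, copies = 2, block_size: 4096)
  pipes = Array.new(copies) { IO.pipe }
  readers = pipes.map(&:first)
  writers = pipes.map(&:last)
  splitter = Thread.new do
    begin
      # read(n) returns nil at EOF, which ends the loop
      while (chunk = source.read(block_size))
        writers.each { |w| w.write(chunk) }
      end
    ensure
      writers.each(&:close) # signal EOF to every consumer
    end
  end
  [readers, splitter]
end

input = (0...2000).map { |i| "line #{i}\n" }.join
readers, splitter = sketch_tee(StringIO.new(input))
# Each reader gets its own consumer thread; a slow or absent consumer
# would stall the splitter once its pipe buffer fills.
outputs = readers.map { |r| Thread.new { r.read } }.map(&:value)
splitter.join
puts outputs.all? { |o| o == input }
```

Because every pipe has a bounded buffer, all outputs need to be consumed concurrently, which is why the example reads each one in its own thread.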
94
+ ---
95
+
96
+ ## Sensible / atomic writes
97
+
98
+ Use `Open.sensible_write(path, content, options = {})` for safe writes that avoid overwriting existing targets and use temporary files + atomic rename.
99
+
100
+ - Behavior:
101
+ - If path exists and :force is not true, it consumes the source and skips the update.
102
+ - Writes to a temporary file in `Open.sensible_write_dir` then moves (Open.mv) into place.
103
+ - Supports lock options via `:lock` key (uses Open.lock). Accepts hash of lock settings or Lockfile instance.
104
+ - Ensures cleanup of temp files on exception; preserves existing target if write fails.
105
+ - On successful move, calls Open.notify_write(path).
106
+
107
+ - Open.sensible_write_lock_dir / Open.sensible_write_dir are configurable directories (Paths) used for temporary files and lock state.
108
+
109
+ - Open.sensible_write uses Open.lock to protect move operations.
110
+
111
+ - Open.write takes an exclusive file lock while writing (f.flock File::LOCK_EX); sensible_write additionally uses tmp->mv semantics and, optionally, locking across concurrent processes.
112
+
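The temporary-file-plus-move pattern described in this section can be sketched using only the standard library. This is a minimal illustration under stated assumptions, not the gem's implementation: `sketch_sensible_write` and its `force:` and `tmp_dir:` parameters are hypothetical, and locking and write notification are left out.

```ruby
require "fileutils"
require "tmpdir"
require "securerandom"
require "stringio"

# Hypothetical sketch: write to a temp file, then move it into place.
# Returns true if the target was written, false if it was skipped.
def sketch_sensible_write(path, content, force: false, tmp_dir: Dir.tmpdir)
  if File.exist?(path) && !force
    content.read if content.respond_to?(:read) # drain the source, skip the update
    return false
  end
  FileUtils.mkdir_p(File.dirname(path))
  tmp_path = File.join(tmp_dir, "sketch_write.#{SecureRandom.hex(8)}")
  begin
    File.open(tmp_path, "wb") do |f|
      if content.respond_to?(:read)
        IO.copy_stream(content, f) # stream IO-like content
      else
        f.write(content.to_s)
      end
    end
    FileUtils.mv(tmp_path, path)   # existing target is untouched until this point
    true
  ensure
    FileUtils.rm_f(tmp_path)       # no-op after a successful move
  end
end

Dir.mktmpdir do |dir|
  target = File.join(dir, "out.txt")
  p sketch_sensible_write(target, StringIO.new("hello"), tmp_dir: dir)
  p sketch_sensible_write(target, "ignored", tmp_dir: dir) # target exists, skipped
  puts File.read(target)
end
```

The move is only an atomic rename when the temporary directory and the target live on the same filesystem; otherwise FileUtils.mv falls back to copy and delete.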
113
+ ---
114
+
115
+ ## Remote fetching, caching and downloads
116
+
117
+ Open supports remote URLs and SSH-style access:
118
+
119
+ - Open.remote?(file) -> true if the file looks like a URL (http|https|ftp|ssh)
120
+ - Open.ssh?(file) -> ssh:// scheme detection
121
+ - Open.ssh(file, options = {})
122
+ - Parses ssh://server:path and streams via `ssh server cat 'path'` (if server != 'localhost').
123
+ - For localhost returns Open.open(file) (local path handling).
124
+
125
+ - Open.wget(url, options = {})
126
+ - Download via `wget` (through CMD.cmd), returns an IO-like stream (ConcurrentStream).
127
+ - Options:
128
+ - :pipe => true (default), :autojoin => true
129
+ - supports `--post-data=`, cookies, quiet mode, :force, :nocache
130
+ - caching: unless :nocache true, saves to remote_cache_dir under a digest filename (Open.add_cache) and returns Open.open on cache file.
131
+ - :nice / :nice_key for throttling repeated requests with a wait
132
+ - Errors raise OpenURLError on failure
133
+ - Example: Open.wget('http://example.com', quiet: true, nocache: true).read
134
+
135
+ - Open.cache_file(url, options), Open.in_cache(url, options), Open.add_cache(url, data, options), Open.open_cache(url)
136
+ - Support caching remote requests to `Open.remote_cache_dir`.
137
+
138
+ - Open.download(url, file) — wrapper to run wget into local file with logging.
139
+
140
+ - Open.digest_url(url, options) — compute cache key based on url and post data/file.
141
+
142
+ - Open.scp(source_file, target_file, target:, source:) — convenience wrapper for scp and remote mkdir.
143
+
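The caching idea above (store each response under a digest-derived filename and reuse it on later requests) can be sketched without any network access. The names `sketch_cache_key` and `sketch_cached_fetch`, the choice of MD5, and the way URL and post data are combined are all assumptions made for illustration; the gem's actual key derivation may differ.

```ruby
require "digest"
require "fileutils"
require "tmpdir"

# Hypothetical cache key: digest of the URL plus optional post data.
def sketch_cache_key(url, post_data: nil)
  Digest::MD5.hexdigest([url, post_data].compact.join("\n"))
end

# Hypothetical cache wrapper: the block does the real fetch and is
# only called when no cached copy exists.
def sketch_cached_fetch(url, cache_dir, post_data: nil)
  path = File.join(cache_dir, sketch_cache_key(url, post_data: post_data))
  return File.read(path) if File.exist?(path)
  data = yield
  FileUtils.mkdir_p(cache_dir)
  File.write(path, data)
  data
end

Dir.mktmpdir do |dir|
  fetches = 0
  2.times do
    sketch_cached_fetch("http://example.com", dir) { fetches += 1; "body" }
  end
  puts fetches # the block runs once; the second call is served from the cache
end
```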
144
+ ---
145
+
146
+ ## File / filesystem helpers
147
+
148
+ Common filesystem operations with Path support:
149
+
150
+ - Open.mkdir(path) — ensure directory exists (mkdir_p). Accepts Path.
151
+ - Open.mkfiledir(target) — ensure parent dir exists for file target.
152
+ - Open.mv(source, target) — move with tmp intermediate to reduce risk (move to .tmp_mv.* then rename).
153
+ - Open.rm(file) — remove file if exists or broken symlink.
154
+ - Open.rm_rf(file) — recursive remove
155
+ - Open.touch(file) — create or update mtime (ensures parent dir).
156
+ - Open.cp(source, target) — copy (uses cp_r, removes existing target).
157
+ - Open.directory?(file)
158
+ - Open.exists?(file) / Open.exist? alias — existence check (Path supported).
159
+ - Open.ctime(file), Open.mtime(file) — time helpers; mtime has logic to follow symlinks and handle special Step info file cases.
160
+ - Open.size(file)
161
+ - Open.ln_s(source, target) — create symbolic link (ensures parent dir and remove existing).
162
+ - Open.ln(source, target) — create hard link (removing target if present).
163
+ - Open.ln_h(source, target) — attempt hard link via `ln -L`, fallback to copy on failure.
164
+ - Open.link(source, target) — tries ln then ln_s as fallback.
165
+ - Open.link_dir(source, target) — cp with hard-links (cp_lr).
166
+ - Open.same_file(file1, file2) — File.identical?
167
+ - Open.writable?(path) — checks writability handling symlinks and non-existing files.
168
+ - Open.realpath(file) — returns canonical realpath (resolves symlinks).
169
+ - Open.list(file) — returns file contents split on newline (convenience).
170
+
171
+ ---
172
+
173
+ ## Locking
174
+
175
+ Open wraps Lockfile to provide safe locking primitives and a simpler interface.
176
+
177
+ - Open.lock(file, unlock = true, options = {}) { |lockfile| ... }
178
+ - Acquire a lock (Lockfile) for a given path.
179
+ - `file` may be:
180
+ - a Lockfile instance (used directly),
181
+ - Path/String (lockfile path defaulting to `file + '.lock'`),
182
+ - nil with options[:lock] being a Lockfile instance or false.
183
+ - `unlock` default true; set false to keep lock after block (or raise KeepLocked inside block to keep lock and return payload).
184
+ - Options passed to Lockfile constructor (min_sleep, max_sleep, sleep_inc, max_age, refresh, timeout, etc.).
185
+ - Handles exceptions and unlocks safely in ensure.
186
+ - Example (from tests):
187
+ ```ruby
188
+ Open.lock lockfile_path, min_sleep: 0.01, max_sleep: 0.05 do
189
+ # critical section
190
+ end
191
+ ```
192
+
193
+ - Lockfile class is included in `open/lock/lockfile.rb` — classic NFS-safe lockfile implementation (supports refreshing, stealing detection, sweeps, retries, timeouts, etc.). Use its options via Open.lock(..., options).
194
+
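For readers unfamiliar with lock-file-guarded critical sections, here is a rough standard-library sketch of the usage pattern. It deliberately swaps in a different technique: it uses `File#flock` on a `.lock` file rather than the NFS-safe Lockfile implementation described above, so it does not reproduce refreshing, stale-lock sweeping or timeouts. `sketch_lock` is a hypothetical name.

```ruby
require "tmpdir"

# Hypothetical sketch: hold an exclusive flock on "<path>.lock" while the
# block runs. This is a stand-in for illustration, not Open.lock itself.
def sketch_lock(path)
  File.open(path + ".lock", File::RDWR | File::CREAT, 0o644) do |f|
    f.flock(File::LOCK_EX)     # blocks until the lock is available
    begin
      yield f
    ensure
      f.flock(File::LOCK_UN)   # always release, even on exceptions
    end
  end
end

Dir.mktmpdir do |dir|
  counter = File.join(dir, "counter")
  File.write(counter, "0")
  workers = 4.times.map do
    Thread.new do
      25.times do
        # read-modify-write is only safe inside the critical section
        sketch_lock(counter) { File.write(counter, (File.read(counter).to_i + 1).to_s) }
      end
    end
  end
  workers.each(&:join)
  puts File.read(counter)
end
```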
195
+ ---
196
+
197
+ ## Sync (rsync)
198
+
199
+ - Open.rsync(source, target, options = {})
200
+ - Wrapper to build and execute an `rsync` command with common options.
201
+ - Options processed via IndiferentHash:
202
+ - :excludes, :files (list of files to transfer), :hard_link (use --link-dest), :test (dry-run), :print (return command), :delete, :source, :target (server strings), :other (extra args)
203
+ - Handles directory trailing slashes, remote server prefixes, ensures target dirs exist (remote mkdir via ssh when needed).
204
+ - Uses TMP files for --files-from when passing a list.
205
+ - Example:
206
+ ```ruby
207
+ Open.rsync(source_dir, target_dir, excludes: 'tmp_dir', delete: true)
208
+ ```
209
+
210
+ - Open.sync is alias for rsync.
211
+
212
+ ---
213
+
214
+ ## Utilities
215
+
216
+ - Compression helpers:
217
+ - Open.gzip?(file) / Open.bgzip?(file) / Open.zip?(file) — simple extension checks.
218
+ - Open.gunzip(stream), Open.gzip(stream), Open.bgzip(stream) — spawn subprocesses (zcat/gzip/bgzip) returning a piped IO.
219
+ - Open.gzip_pipe(file) — returns shell-friendly expression for gzip handling.
220
+
221
+ - Open.grep(stream, grep, invert = false, fixed = nil, options = {})
222
+ - Uses system grep (GREP_CMD) to filter stream. Accepts Array of patterns (written to temporary file and used with -f) or single pattern.
223
+
224
+ - Open.sort_stream(stream, header_hash: "#", cmd_args: nil, memory: false)
225
+ - Sort stream while preserving header lines (lines starting with header_hash).
226
+ - For memory=false runs external sort (env LC_ALL=C sort).
227
+ - Splits into substreams to avoid loading entire stream into memory for large inputs.
228
+
229
+ - Open.collapse_stream(s, line: nil, sep: "\t", header: nil, compact: false, &block)
230
+ - Collapses consecutive lines with same key (first field) merging rest columns with `|` separators or processed by provided block.
231
+ - Useful for aggregating grouped data in streaming fashion.
232
+
233
+ - Open.consume_stream described above.
234
+
235
+ - Open.notify_write(file)
236
+ - If `<file>.notify` exists, reads its contents and sends notification (email or system notify) and removes .notify file.
237
+
238
+ - Open.broken_link?(path) — true if symlink target missing
239
+ - Open.exist_or_link?(file) — exists or symlink
240
+ - Open.list(file) — read as lines
241
+
242
+ - Lockfile utility: Lockfile.create(path) creates lock and opens file (used internally).
243
+
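The header-preserving sort and the collapsing of consecutive rows can be illustrated on in-memory arrays of lines. This is a simplified sketch under stated assumptions: the real helpers are described as working on streams (and using an external sort), whereas `sketch_sort_with_header` and `sketch_collapse` are hypothetical names that operate on plain arrays.

```ruby
# Hypothetical sketch: keep header lines (those starting with header_hash)
# first, in their original order, and sort the remaining lines.
def sketch_sort_with_header(lines, header_hash: "#")
  header, body = lines.partition { |l| l.start_with?(header_hash) }
  header + body.sort
end

# Hypothetical sketch: merge consecutive lines sharing the same first field,
# joining the values of each remaining column with "|".
def sketch_collapse(lines, sep: "\t")
  collapsed = []
  current_key = nil
  columns = nil
  flush = lambda do
    if current_key
      collapsed << ([current_key] + columns.map { |vals| vals.join("|") }).join(sep)
    end
  end
  lines.each do |line|
    key, *rest = line.chomp.split(sep, -1)
    if key == current_key
      rest.each_with_index { |v, i| (columns[i] ||= []) << v }
    else
      flush.call                       # emit the previous group
      current_key = key
      columns = rest.map { |v| [v] }
    end
  end
  flush.call                           # emit the last group
  collapsed
end

p sketch_sort_with_header(["#key\tvalue", "b\t2", "a\t1"])
p sketch_collapse(["a\t1\tx", "a\t2\ty", "b\t3\tz"])
```

Collapsing only merges adjacent rows, so input is normally sorted by key first.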
244
+ ---
245
+
246
+ ## Examples (from tests)
247
+
248
+ Reading and line-wise processing:
249
+ ```ruby
250
+ sum = 0
251
+ Open.read(file) { |line| sum += line.to_i }
252
+ ```
253
+
254
+ Open compressed file:
255
+ ```ruby
256
+ Open.read("file.txt.gz") # decompresses and returns content
257
+ ```
258
+
259
+ Sensible write:
260
+ ```ruby
261
+ Open.sensible_write(target_path, File.open(source)) # safe atomic write from stream
262
+ ```
263
+
264
+ Pipe and open_pipe:
265
+ ```ruby
266
+ sout = Open.open_pipe do |sin|
267
+ 10.times { |i| sin.puts "line #{i}" }
268
+ end
269
+ # sout is a readable stream; consume:
270
+ Open.consume_stream(sout, false, target_file)
271
+ ```
272
+
273
+ Tee stream to two consumers:
274
+ ```ruby
275
+ sout = Open.open_pipe do |sin|
276
+ 2000.times { |i| sin.puts "line #{i}" }
277
+ end
278
+ s1, s2 = Open.tee_stream_thread(sout)
279
+ t1 = Open.consume_stream(s1, true, tmp.file1)
280
+ t2 = Open.consume_stream(s2, true, tmp.file2)
281
+ t1.join; t2.join
282
+ ```
283
+
284
+ Locking (concurrency safe):
285
+ ```ruby
286
+ Open.lock(lockfile_path, min_sleep: 0.01, max_sleep: 0.05) do
287
+ # critical section
288
+ end
289
+ ```
290
+
291
+ Rsync:
292
+ ```ruby
293
+ Open.rsync(source, target)
294
+ Open.sync(source, target) # alias
295
+ ```
296
+
297
+ Sorting a stream while preserving headers:
298
+ ```ruby
299
+ sorted = Open.sort_stream(string_io)
300
+ puts sorted.read
301
+ ```
302
+
303
+ Collapse grouped rows:
304
+ ```ruby
305
+ stream = Open.collapse_stream(s, sep: " ") do |parts|
306
+ parts.map(&:upcase) # or aggregate
307
+ end
308
+ ```
309
+
310
+ Remote fetch:
311
+ ```ruby
312
+ io = Open.wget('http://example.com', quiet: true)
313
+ puts io.read
314
+ ```
315
+
316
+ ---
317
+
318
+ ## Notes & edge cases
319
+
320
+ - Many functions accept Path objects and will call `.find` or `.produce_and_find` where appropriate.
321
+ - Remote functions rely on external commands (wget, ssh). Errors from those commands are wrapped/propagated (OpenURLError, ConcurrentStreamProcessFailed, etc.).
322
+ - Open.sensible_write and Open.write try to avoid inconsistent partial files; sensible_write uses tmp-file + mv and optional Lockfile to avoid races.
323
+ - Stream utilities use a ConcurrentStream abstraction (not documented here) to manage thread/pid/join semantics.
324
+ - Tee/splitter threads forward aborts and exceptions to downstream consumers; callers must handle cleanup and join threads.
325
+ - Open.lock relies on the included Lockfile implementation which supports NFS-safe locking, lock refreshing, stealing detection and sweeping stale locks.
326
+ - gzip/bgzip/unzip operations spawn external processes and return piped IOs — ensure you consume/join and close these streams to avoid zombies.
327
+ - Open.grep handles Array of patterns by writing them to a tmp file and using `-f` grep; fixed matching uses -F and -w by default.
328
+
329
+ ---
330
+
331
+ This document covers the main public behaviors of the Open module: unified file/stream opening, robust writing, streaming utilities, remote fetching and caching, filesystem helpers, locking and synchronization, and convenience utilities for sorting, collapsing and grepping streams. Use Open for safe, composable I/O operations in scripts and concurrent code.