RubyGems - scout-gear - Versions diffs - 10.11.10 → 10.12.0 - Mend

scout-gear 10.11.10 → 10.12.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (20)

checksums.yaml +4 -4
data/.vimproject +6 -0
data/VERSION +1 -1
data/doc/Workflow.md +159 -1
data/lib/scout/association/index.rb +4 -1
data/lib/scout/association.rb +1 -1
data/lib/scout/knowledge_base/entity.rb +2 -2
data/lib/scout/knowledge_base/query.rb +3 -1
data/lib/scout/work_queue/socket.rb +7 -3
data/lib/scout/workflow/deployment/local.rb +25 -13
data/lib/scout/workflow/documentation.rb +3 -1
data/lib/scout/workflow/step/info.rb +7 -1
data/lib/scout/workflow/step/inputs.rb +1 -3
data/lib/scout/workflow/step/status.rb +1 -1
data/lib/scout/workflow/step.rb +10 -11
data/lib/scout/workflow/task.rb +1 -1
data/lib/scout/workflow.rb +1 -0
data/scout-gear.gemspec +4 -3
data/scout_commands/purge +170 -0
metadata +3 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: 50a2a5c7fad9fdeaf0df507d0543f60fe49e80b3c8273bf68d0b5985ca3f1f9d
-  data.tar.gz: e22c535bd1a2eb0ab72bef331b4c8d52acca8302e86887af87aae00744117501
+  metadata.gz: cae9e2eae34c1d25717ba78283cca252e5b677c289fbe16e733ba03f6a438e02
+  data.tar.gz: 91c3f4c813c2fb493ebaf7d795e3085c0959e8a45719a6caf7961f27fa92f814
 SHA512:
-  metadata.gz: ffb552f4563504e7d17e8269869e706c6d25c8c4d99b2941a1fea1b732014a5214d72290ae521ec7389c7d9c69fc5c2bd3fd16377966b9102cd6e369cf9c147c
-  data.tar.gz: 0d56894df6f5bdd78919484404af43abf6737afdae2f635bf11211ab7ac83b0e3a31f43e410ea3754dd608fed68a8ae928714d3128a4735b5c954fd9e79e01f1
+  metadata.gz: b3133d93ece825983115375d519daea394914f4c70b083fc2f6f6d1c28605422129e31170becd5da4d338e8a1623ac9685855eed24b15752b31ba261417de3f7
+  data.tar.gz: 8f8bc352a274b6564360fce1a79b86f268fa88a8f22854393bb1aa773a04247ac35ed78cd22d81842a4d5a1ac04f3a1ce3ab3c292db2109ba7c05196c658533c

data/.vimproject CHANGED

@@ -2,6 +2,11 @@ scout-gear=/$PWD filter="*.rb *.yaml" {
  Rakefile
  README.md
  chats=chats filter="*"{
+  purge
+  update_workflow_doc
   job_chains
@@ -179,6 +184,7 @@ scout-gear=/$PWD filter="*.rb *.yaml" {
   alias
   entity
   find
+  purge
   cat
   glob
   log

data/VERSION CHANGED

@@ -1 +1 @@
-10.11.10
+10.12.0

data/doc/Workflow.md CHANGED

@@ -89,6 +89,75 @@ input :count, :integer, "Times", 1, required: false
   - required: true — missing or nil values raise ParameterException.
   - shortcut — preferred CLI short option letter (SOPT).
+Important (common pitfall): inputs and other annotations are **queued for the next task definition**.
+In implementation terms, `input`, `dep`, `desc`, `returns` and `extension` call `annotate_next_task(...)` and
+their annotations are consumed by the next call to `task(...)` **or** `task_alias(...)`/`dep_task(...)`.
+After that, the annotation queue is cleared.
+This is the most frequent source of confusion when you introduce intermediate `task_alias` helpers.
+Bad (inputs attach to the alias, not to `analysis`):
+```ruby
+input :top_k, :integer, "How many states", 5
+task_alias :backend, self, :tool_run, mode: :fast
+dep :backend
+task :analysis => :json do |top_k|
+  # top_k will be nil here (it was attached to :backend)
+end
+```
+Good (define alias first, then analysis-specific inputs, then the task):
+```ruby
+task_alias :backend, self, :tool_run, mode: :fast
+dep :backend
+input :top_k, :integer, "How many states", 5
+task :analysis => :json do |top_k|
+  # ok
+end
+```
+### Common gotchas
+These are the failure modes that most often bite first-time workflow authors:
+- **Annotations attach to the next task definition**: `input`, `dep`, `desc`, `returns`, `extension` are queued and consumed by the next `task(...)` *or* `task_alias(...)` call.
+  - If you use `task_alias` as a convenience backend, define the alias *first*, then define analysis-only inputs, then define the analysis task.
+- **`return` cannot be used inside tasks**: because of how task blocks are implemented, you need to use `next` to abort execution
+  and return a value. Just replace `return` with `next` inside a task block.
+- **`dep_task` is just an alias for `task_alias`**: it defines a task alias; it does not mean “declare a dependency”. You still need `dep :that_alias` if you want it to be a dependency.
+- **`step(:name)` only finds declared dependencies**: inside a task, `step(:x)` returns a dependency Step whose `task_name` is `:x`.
+  - If you forgot `dep :x`, then `step(:x)` will be `nil`.
+- **On-disk layout is `<job>`, `<job>.info`, `<job>.files/`**:
+  - `.info` is a file (JSON metadata), and `.files/` is a directory.
+  - There is no `.info.files/` path.
+- **Caching and recomputation**:
+  - Jobs are cached by *non-default inputs* plus a digest of their dependency tree.
+  - Changing task code does not automatically invalidate old results; use `--update` to recompute when dependencies are newer,
+    or `--clean`/`--recursive_clean` to remove cached outputs.
+  - When debugging, it can be useful to change the job name (`--jobname`) or set an explicit input so you get a fresh job directory.
+- **Array/list CLI inputs**: prefer a single comma-separated flag (e.g. `--nodes A,B,C`) rather than repeating the same flag many times.
+- **Introspection vs execution helpers**:
+  - `Task#dependencies(...)` is an internal constructor used during job creation and requires arguments.
+  - For introspection, use `task.deps`, `workflow.usage(task)`, or `workflow.dep_tree(task)`.
+- **Tool invocation**:
+  - `CMD.cmd('MyTool', ...)` runs a binary from PATH.
+  - `CMD.cmd(:mytool, ...)` only works if that tool symbol is registered in CMD’s tool registry.
+  - For details, see the CMD documentation (in scout-essentials: `doc/CMD.md`).
 Task definitions:
 ```ruby
@@ -162,6 +231,17 @@ Step basics:
 - `step.files_dir`: companion directory `<path>.files` holding auxiliary files.
 - `step.file("name")`: file helper within files_dir.
 - `step.info`: IndiferentHash with status, pid, start/end times, messages, inputs, dependencies, etc. Stored at `<path>.info` (JSON by default).
+On disk you will typically see:
+```text
+var/jobs/<Workflow>/<task>/<jobname>.<ext>        # main result
+var/jobs/<Workflow>/<task>/<jobname>.<ext>.info   # JSON info (status, inputs, deps, messages, exceptions)
+var/jobs/<Workflow>/<task>/<jobname>.<ext>.files/ # auxiliary files created by the task
+```
+There is **no** `...<job>.info.files/` directory; `.info` is a file alongside the `.files/` directory.
 - `step.log(status, [message_or_block])`: set info status and message (block timed).
 - Status helpers: `done?`, `error?`, `aborted?`, `running?`, `waiting?`, `updated?`, `dirty?`, `started?`, `recoverable_error?`.
 - Cleanup: `clean`, `recursive_clean`, `produce(with_fork: false)`.
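The on-disk layout described in this hunk can be sketched as a small path helper. This is only an illustration of the naming rule (`<path>`, `<path>.info`, `<path>.files`); the helper name `job_layout` and the example path are hypothetical and not part of scout-gear.

```ruby
# Illustration only: derive the companion paths for a job result path.
# `job_layout` is a hypothetical helper, not a scout-gear method.
def job_layout(path)
  {
    result: path,                 # main result file
    info: "#{path}.info",         # info file stored next to the result
    files_dir: "#{path}.files"    # directory for auxiliary files
  }
end

# Hypothetical job path following the var/jobs/<Workflow>/<task>/<jobname>.<ext> shape
layout = job_layout("var/jobs/MyWorkflow/analysis/Default.json")
puts layout[:info]
puts layout[:files_dir]
```

Both companions hang off the result path itself, which is why no `.info.files` path ever appears.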
@@ -234,6 +314,14 @@ task_alias :say_hello, self, :say, name: "Miguel"
 # alias name => inferred type, returns and extension from :say
 ```
+Notes:
+- `dep_task` is an alias for `task_alias` (same method).
+- The `workflow` argument should be a Workflow module (often `self` inside the workflow module, or an explicit module name).
+  There is no special `Self` constant.
+- `task_alias` is itself a task definition, so any queued `input` / `dep` / `desc` / `returns` / `extension` immediately preceding it
+  are consumed by the alias (not by the following task).
 Behavior:
 - The alias depends on the original task; upon completion:
   - With config forget/remove enabled (see below), the alias job archives dependency info and either hard-links, copies, or removes dep artifacts.
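The annotation-queue behaviour described in these documentation hunks can be modelled with a toy class. This is a sketch under stated assumptions, not scout-gear's implementation: `ToyWorkflow` and `consume_queue` are hypothetical names, and only `input` and `dep` are modelled.

```ruby
# Toy model (not scout-gear code): annotations queue up, and the next
# definition, whether `task` or `task_alias`, consumes and clears them.
class ToyWorkflow
  attr_reader :tasks

  def initialize
    @tasks = {}
    @queue = Hash.new { |hash, key| hash[key] = [] }
  end

  # Queue an input annotation for the next definition
  def input(name, _type = nil, _desc = nil, _default = nil)
    @queue[:inputs] << name
  end

  # Queue a dependency annotation for the next definition
  def dep(name)
    @queue[:deps] << name
  end

  def task(name)
    @tasks[name] = consume_queue
  end

  def task_alias(name, _workflow, original, **_fixed_inputs)
    @tasks[name] = consume_queue.merge(alias_of: original)
  end

  private

  # Hand over everything queued so far and start with an empty queue
  def consume_queue
    annotations = { inputs: @queue[:inputs], deps: @queue[:deps] }
    @queue = Hash.new { |hash, key| hash[key] = [] }
    annotations
  end
end

# "Bad" ordering: the input is declared before the alias
bad = ToyWorkflow.new
bad.input :top_k, :integer, "How many states", 5
bad.task_alias :backend, bad, :tool_run, mode: :fast
bad.dep :backend
bad.task :analysis

# "Good" ordering: alias first, then the analysis-only input
good = ToyWorkflow.new
good.task_alias :backend, good, :tool_run, mode: :fast
good.dep :backend
good.input :top_k, :integer, "How many states", 5
good.task :analysis

p bad.tasks[:analysis][:inputs]
p good.tasks[:analysis][:inputs]
```

In the toy model the "bad" ordering leaves `:analysis` with no inputs because the alias already consumed `:top_k`, which mirrors the pitfall the documentation warns about.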
@@ -247,6 +335,73 @@ Behavior:
 Overriding dependencies at job time:
 - Pass `"Workflow#task" => Step_or_Path` in job inputs; the system marks dep as overridden, adjusts naming, and uses provided artifact.
+### Pattern: backend + analysis tasks (wrapping external tools)
+When wrapping external command-line tools (any CLI program), prefer a two-layer design:
+1) **Backend task**: runs the tool, writes full outputs to `step.files_dir`, and returns a small JSON document
+   describing what was produced (paths, key parameters, summary stats).
+2) **Analysis task(s)**: `dep` on the backend task and parse its outputs into compact, LLM-friendly summaries.
+This pattern keeps caching/reproducibility correct (because the backend inputs are part of the dependency graph)
+and avoids blowing up the CLI / LLM context window with large outputs.
+Example skeleton:
+```ruby
+# backend
+input :network, :text, required: true
+input :seed, :integer, 0
+task :tool_run => :json do |network, seed|
+  Open.write(file('input.txt'), network)
+  io = CMD.cmd('SomeTool', "--seed #{seed} '#{file('input.txt')}'", log: true, save_stderr: true)
+  raise ScoutException, io.read + "\n" + io.std_err if io.exit_status != 0
+  {
+    "files" => Dir.glob(file('out').to_s + '*'),
+    "params" => {"seed" => seed}
+  }.to_json
+end
+# analysis
+dep :tool_run
+input :top_k, :integer, 5
+task :tool_summary => :json do |top_k|
+  info = JSON.parse(step(:tool_run).load)
+  # parse info["files"] ...
+end
+```
+Notes:
+- Use `step.file('name')`/`file('name')` to ensure artifacts land in the step `.files` directory.
+- For binaries that are not registered in CMD's tool registry, use `CMD.cmd('BinaryName', ...)` (string),
+  not `CMD.cmd(:BinaryName, ...)` (symbol).
+### Designing tasks for interactive/agent use
+Many users (and autonomous agents) cannot afford to load large tool outputs into memory or into a chat context window.
+A robust pattern is:
+- **Persist the full output to disk** (in `step.files_dir`) and return only a *small* summary object.
+  - Prefer returning `:json`/`:text` with a compact JSON document.
+- **Echo analysis parameters** in the returned JSON.
+  - This makes it obvious what was actually used when debugging caching, CLI parsing, or defaults.
+- **Separate “run” from “summarize”**.
+  - Backend task: run tool, write outputs, return metadata + file list.
+  - Analysis task(s): parse, aggregate, downsample, and return small summaries.
+- **Use task_alias for common presets**.
+  - e.g. a `*_final_run` alias that fixes `final: true`, or a `*_trajectory_run` alias that fixes `format: 'csv'`.
+- **Keep results stable and machine-readable**.
+  - Prefer JSON hashes/arrays over ad-hoc human-readable text; add derived fields (like expression strings) for convenience.
 ---
 ## Usage and documentation
@@ -272,6 +427,7 @@ SOPT integration:
   - `task.get_SOPT` returns parsed `--input` options from ARGV.
   - Boolean inputs render as `--flag`; string-like inputs accept `--key=value` or `--key value`.
   - Array inputs accept comma-separated values; file/path arrays resolve files.
+  - Tip: prefer a single flag with comma-separated values (e.g. `--nodes A,B,C`) over repeating the same flag multiple times.
 ---
@@ -366,6 +522,8 @@ Task:
 - assign_inputs(provided_inputs, id=nil) => [input_array, non_default_inputs, jobname_input?]
 - process_inputs(provided_inputs, id=nil) => [input_array, non_default_inputs, digest_str]
 - dependencies(id, provided_inputs, non_default_inputs, compute) => [Step...]
+  - Note: `Task#dependencies` is an internal constructor used during job creation and requires arguments.
+    For introspection, use `task.deps` (declared dependency annotations) or `workflow.usage(task)` / `workflow.dep_tree(...)`.
 - recursive_inputs(overridden=[]) => inputs array
 - save_inputs(dir, provided_inputs) and load_inputs(dir)
@@ -583,4 +741,4 @@ puts Step.prov_report(job)
 ---
-This document covers the Workflow engine: defining tasks and dependencies, creating and running jobs, streaming, info management, orchestration, documentation, and CLI integration. Use it to build reproducible pipelines with safe persistence and rich provenance.
+This document covers the Workflow engine: defining tasks and dependencies, creating and running jobs, streaming, info management, orchestration, documentation, and CLI integration. Use it to build reproducible pipelines with safe persistence and rich provenance.
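The `return` versus `next` gotcha from the documentation hunks above can be reproduced in plain Ruby. The sketch assumes only that task bodies are blocks that are stored at definition time and called later; `stored_blocks` is a hypothetical stand-in for that, not how scout-gear stores tasks.

```ruby
# Stand-in for "define now, run later": the blocks outlive the method
# that created them, much like a stored task body would.
def stored_blocks
  blocks = {}
  blocks[:with_next] = proc do |count|
    next :nothing_to_do if count.zero?    # leaves the block with a value
    :processed
  end
  blocks[:with_return] = proc do |count|
    return :nothing_to_do if count.zero?  # tries to return from stored_blocks
    :processed
  end
  blocks
end

blocks = stored_blocks

with_next = blocks[:with_next].call(0)

with_return = begin
  blocks[:with_return].call(0)
rescue LocalJumpError => e
  e.class
end

p with_next
p with_return
```

Here `next` hands back a value from the block, while `return` raises `LocalJumpError` because the method that created the block has already finished.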

data/lib/scout/association/index.rb CHANGED

@@ -29,8 +29,9 @@ module Association
       if database.type == :double
         transformer.traverse do |source,value_list|
           res = []
-          NamedArray.zip_fields(value_list).collect do |values|
+          NamedArray.zip_fields(value_list).each do |values|
             target, *info = values
+            next if source.nil? or target.nil?
             key = [source, target] * "~"
             res << [key, info]
             if undirected
@@ -45,6 +46,7 @@ module Association
           res = []
           res.extend MultipleResult
           targets.each do |target|
+            next if source.nil? or target.nil?
             key = [source, target] * "~"
             res << [key, []]
             if undirected
@@ -59,6 +61,7 @@ module Association
           res = []
           res.extend MultipleResult
           target, *info = values
+          next if source.nil? or target.nil?
           key = [source, target] * "~"
           res << [key, info]
           if undirected
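A minimal sketch of why the added `next if source.nil? or target.nil?` guards matter, using plain arrays. `pair_keys` is a hypothetical helper; the `undirected` branch is cut off in the hunks and is left out here.

```ruby
# Build "source~target" keys, skipping pairs with a missing side.
# Array#* with a String argument behaves like Array#join.
def pair_keys(source, targets)
  res = []
  targets.each do |target|
    next if source.nil? || target.nil?
    res << [source, target] * "~"
  end
  res
end

keys = pair_keys("A", ["B", nil, "C"])
unguarded = [nil, "B"] * "~"   # what an unguarded pair would produce

p keys
p unguarded
```

Without the guard, a `nil` side joins as an empty string, so the key has an empty half.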

data/lib/scout/association.rb CHANGED

@@ -131,7 +131,7 @@ module Association
     persist_options = IndiferentHash.pull_keys kwargs, :persist
     database_persist_options = IndiferentHash.add_defaults persist_options.dup, persist: true,
-      prefix: "Association::Index", serializer: :double, update: true,
+      prefix: "Association::Index", serializer: :double,
       other_options: kwargs
     Persist.tsv(file, kwargs, engine: "BDB", persist_options: database_persist_options) do |data|

data/lib/scout/knowledge_base/entity.rb CHANGED

@@ -91,7 +91,7 @@ class KnowledgeBase
       identifier_files.collect!{|f| f.annotate(f.gsub(/\bNAMESPACE\b/, namespace))} if namespace
       identifier_files.collect!{|f| f.annotate(f.gsub(/\bNAMESPACE\b/, db_namespace(name)))} if not namespace and db_namespace(name)
       identifier_files.reject!{|f| f.match(/\bNAMESPACE\b/)}
-      TSV.translation_index identifier_files, nil, source(name), :persist => true
+      TSV.translation_index identifier_files.uniq, nil, source(name), :persist => true
     end
   end
@@ -114,7 +114,7 @@ class KnowledgeBase
       identifier_files.collect!{|f| f.annotate(f.gsub(/\bNAMESPACE\b/, namespace))} if self.namespace
       identifier_files.collect!{|f| f.annotate(f.gsub(/\bNAMESPACE\b/, db_namespace(name)))} if namespace.nil? and db_namespace(name)
       identifier_files.reject!{|f| f.match(/\bNAMESPACE\b/)}
-      TSV.translation_index identifier_files, nil, target(name), :persist => true
+      TSV.translation_index identifier_files.uniq, nil, target(name), :persist => true
     end
   end

data/lib/scout/knowledge_base/query.rb CHANGED

@@ -75,7 +75,9 @@ class KnowledgeBase
     entity = identify_target(name, entity)
     matches = _parents(name, entity)
     #matches.each{|m| m.replace(m.partition("~").reverse*"") } unless undirected(name)
-    setup(name, matches, true)
+    items = setup(name, matches, true)
+    items = items.invert unless undirected(name)
+    items
   end
   def _neighbours(name, entity)

data/lib/scout/work_queue/socket.rb CHANGED

@@ -44,16 +44,19 @@ class WorkQueue
         str = size_head
       when Annotation::AnnotatedObject
         payload = @serializer.dump(obj)
+        payload.force_encoding("BINARY")
         size_head = [payload.bytesize,"S"].pack 'La'
-        str = size_head << payload
+        str = size_head + payload
       when String
         payload = obj
         size_head = [payload.bytesize,"C"].pack 'La'
-        str = size_head << payload
+        payload.force_encoding("BINARY")
+        str = size_head + payload
       else
         payload = @serializer.dump(obj)
+        payload.force_encoding("BINARY")
         size_head = [payload.bytesize,"S"].pack 'La'
-        str = size_head << payload
+        str = size_head + payload
       end
       write_length = str.length
@@ -82,6 +85,7 @@ class WorkQueue
             raise $!
           end
         when "C"
+          payload.force_encoding('UTF-8')
           payload
         end
       rescue TryAgain
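The framing touched by this change (a header packed with `'La'` followed by the payload) can be sketched in isolation. This is a simplified stand-in, not the `WorkQueue` code: `frame` and `unframe` are hypothetical helpers. The change presumably ensures that byte counts and string lengths agree for non-ASCII payloads.

```ruby
# Frame layout: 4-byte native unsigned length ('L'), 1-byte type code ('a'), raw bytes.
def frame(payload, code)
  payload = payload.dup.force_encoding(Encoding::BINARY)
  header = [payload.bytesize, code].pack('La')
  header + payload   # both parts are binary, so the result is binary too
end

def unframe(str)
  size, code = str.unpack('La')
  payload = str.byteslice(5, size)
  # "C" marks a plain string; tag it as UTF-8 again, as the receiving side of the diff does
  payload.force_encoding(Encoding::UTF_8) if code == 'C'
  [code, payload]
end

framed = frame("señal", "C")
code, text = unframe(framed)

p framed.encoding
p framed.length == framed.bytesize
p text
```

With a binary-tagged frame, `String#length` counts bytes, so a length taken from the framed string matches what is actually written.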

data/lib/scout/workflow/deployment/local.rb CHANGED

@@ -6,13 +6,10 @@ class Workflow::LocalExecutor
     self.new.process(*args)
   end
-  def self.produce(jobs, rules = {}, produce_cpus: Etc.nprocessors, produce_timer: 1)
+  def self.produce(jobs, rules = {}, produce_cpus: Etc.nprocessors, produce_timer: 1, bar: nil)
     jobs = [jobs] unless Array === jobs
     orchestrator = self.new produce_timer.to_f, cpus: produce_cpus.to_i
-    begin
-      orchestrator.process(rules, jobs)
-    rescue self::NoWork
-    end
+    orchestrator.process(rules, jobs, bar: bar)
   end
   def self.produce_dependencies(jobs, tasks, rules = {}, produce_cpus: Etc.nprocessors, produce_timer: 1)
@@ -59,6 +56,7 @@ class Workflow::LocalExecutor
         bar.pos batches.select{|b| Workflow::Orchestrator.done_batch?(b) }.length if bar
         candidates = Workflow::LocalExecutor.candidates(batches)
+        candidates = candidates.reject{|batch| failed_jobs.include? batch[:top_level] }
         top_level_jobs = candidates.collect{|batch| batch[:top_level] }
         raise NoWork, "No candidates and no running jobs #{Log.fingerprint batches}" if resources_used.empty? && top_level_jobs.empty?
@@ -148,7 +146,7 @@ class Workflow::LocalExecutor
     }
   end
-  def process(rules, jobs = nil)
+  def process(rules, jobs = nil, bar: nil)
     jobs, rules = rules, {} if jobs.nil?
     if Step === jobs
@@ -157,11 +155,19 @@ class Workflow::LocalExecutor
     batches = Workflow::Orchestrator.job_batches(rules, jobs)
-    if jobs.length == 1
-      bar = jobs.first.progress_bar("Processing batches for #{jobs.first.short_path}", max: batches.length)
-    else
-      bar = true
-    end
+    bar = case bar
+          when true
+            true
+          when Log::ProgressBar
+            bar.max = batches.length
+            bar
+          when nil
+            if jobs.length == 1
+              jobs.first.progress_bar("Processing batches for #{jobs.first.short_path}", max: batches.length)
+            else
+              true
+            end
+          end
     batches.each do |batch|
       rules = IndiferentHash.setup batch[:rules]
@@ -172,7 +178,14 @@ class Workflow::LocalExecutor
       batch[:rules] = rules
     end
-    process_batches(batches, bar: bar)
+    begin
+      process_batches(batches, bar: bar)
+    rescue NoWork
+      batches.each do |batch|
+        job = batch[:top_level]
+        raise job.exception if job.error? && ! job.recoverable_error?
+      end
+    end
   end
   def release_resources(job)
@@ -309,7 +322,6 @@ class Workflow::LocalExecutor
   end
   def self.candidates(batches)
     leaf_nodes = batches.select{|b| b[:deps].empty? }
     leaf_nodes.reject!{|b| Workflow::Orchestrator.done_batch?(b) }

data/lib/scout/workflow/documentation.rb CHANGED

@@ -29,7 +29,9 @@ module Workflow
   def self.parse_workflow_doc(doc)
     title = doc_parse_first_line doc
-    description, task_info = doc_parse_up_to doc, /^# Tasks/i
+    description, task_info_and_extra = doc_parse_up_to doc, /^# Tasks/i
+    task_info, extra = doc_parse_up_to task_info_and_extra, /^#[^#]/i, true
     task_description, tasks = doc_parse_up_to task_info, /^##/, true
     tasks = doc_parse_chunks tasks, /^## (.*)/
     {:title => title.strip, :description => description.strip, :task_description => task_description.strip, :tasks => tasks}

data/lib/scout/workflow/step/info.rb CHANGED

@@ -1,5 +1,7 @@
 require 'time'
 require 'scout/config'
+require "json/add/exception"
 class Step
   SERIALIZER = Scout::Config.get(:serializer, :step_info, :info, :step, env: "SCOUT_SERIALIZER", default: :json)
   def info_file
@@ -186,10 +188,14 @@ class Step
     ! (done? && status == :done) && (info[:pid] && Misc.pid_alive?(info[:pid]))
   end
+  def self.encode_exception(e)
+    return e.to_json
+  end
   def exception
     return nil unless info[:exception]
     begin
-      Marshal.load(Base64.decode64(info[:exception]))
+      JSON.parse(info[:exception], create_additions: true)
     rescue
       Log.exception $!
       return Exception.new messages.last

data/lib/scout/workflow/step/inputs.rb CHANGED

@@ -1,8 +1,6 @@
 class Step
   def save_inputs(inputs_dir)
-    if clean_name != name
-      #hash = name[clean_name.length..-1]
-      #inputs_dir += hash
+    if provided_inputs.any?
       Log.medium "Saving job inputs to: #{Log.fingerprint inputs_dir} #{Log.fingerprint provided_inputs}"
       self.task.save_inputs(inputs_dir, provided_inputs)
     else

data/lib/scout/workflow/step/status.rb CHANGED

@@ -41,7 +41,7 @@ class Step
   def updated?
     return false if self.error? && self.recoverable_error?
-    return true if (self.done? || (self.error? && ! self.recoverable_error?)) && ! ENV["SCOUT_UPDATE"].to_s.downcase == 'true'
+    return true if (self.done? || (self.error? && ! self.recoverable_error?)) && ENV["SCOUT_UPDATE"].to_s.downcase != 'true'
     newer = newer_dependencies
     cleaned = cleaned_dependencies

data/lib/scout/workflow/step.rb CHANGED

@@ -192,7 +192,13 @@ class Step
     return @result || self.load if done?
-    prepare_dependencies
+    begin
+      prepare_dependencies
+    rescue => e
+      exception_encoded = Step.encode_exception e
+      merge_info :status => :error, :exception => exception_encoded, :end => Time.now, :backtrace => e.backtrace, :message => "#{e.class}: #{e.message}"
+      raise $!
+    end
     begin
@@ -242,17 +248,10 @@ class Step
     rescue Exception => e
       begin
         begin
-          if ConcurrentStreamProcessFailed === e
-            s = e.concurrent_stream
-            e.concurrent_stream = nil
-            exception_encoded = Base64.encode64(Marshal.dump(e))
-            e.concurrent_stream = s
-          else
-            exception_encoded = Base64.encode64(Marshal.dump(e))
-          end
+          exception_encoded = Step.encode_exception e
           merge_info :status => :error, :exception => exception_encoded, :end => Time.now, :backtrace => e.backtrace, :message => "#{e.class}: #{e.message}"
-        rescue Exception
-          exception_encoded = Base64.encode64(Marshal.dump(Exception.new(e.message)))
+        rescue Exception => e
+          exception_encoded = Step.encode_exception e
           merge_info :status => :error, :exception => exception_encoded, :end => Time.now, :backtrace => e.backtrace, :message => "#{e.class}: #{e.message}"
         end

data/lib/scout/workflow/task.rb CHANGED

@@ -59,7 +59,7 @@ module Task
                              when Array
                                inputs.collect{|name,*| name }[0..provided_inputs.length]
                              when Hash
-                               provided_inputs.keys
+                               provided_inputs.keys.collect{|k| k.to_sym }
                              end
       jobname_input = nil

data/lib/scout/workflow.rb CHANGED

@@ -122,6 +122,7 @@ module Workflow
     file = file.find if Path === file
     $LOAD_PATH.unshift(File.join(File.dirname(file), 'lib'))
     load file
+    Workflow.main || Workflow.workflows.last
   end
   def self.require_workflow(workflow_name_orig)

data/scout-gear.gemspec CHANGED

@@ -2,11 +2,11 @@
 # DO NOT EDIT THIS FILE DIRECTLY
 # Instead, edit Juwelier::Tasks in Rakefile, and run 'rake gemspec'
 # -*- encoding: utf-8 -*-
-# stub: scout-gear 10.11.10 ruby lib
+# stub: scout-gear 10.12.0 ruby lib
 Gem::Specification.new do |s|
   s.name = "scout-gear".freeze
-  s.version = "10.11.10".freeze
+  s.version = "10.12.0".freeze
   s.required_rubygems_version = Gem::Requirement.new(">= 0".freeze) if s.respond_to? :required_rubygems_version=
   s.require_paths = ["lib".freeze]
@@ -161,6 +161,7 @@ Gem::Specification.new do |s|
     "scout_commands/kb/show",
     "scout_commands/kb/traverse",
     "scout_commands/log",
+    "scout_commands/purge",
     "scout_commands/rbbt",
     "scout_commands/resource/produce",
     "scout_commands/resource/sync",
@@ -284,7 +285,7 @@ Gem::Specification.new do |s|
   ]
   s.homepage = "http://github.com/mikisvaz/scout-gear".freeze
   s.licenses = ["MIT".freeze]
-  s.rubygems_version = "3.7.2".freeze
+  s.rubygems_version = "3.7.0.dev".freeze
   s.summary = "basic gear for scouts".freeze
   s.specification_version = 4

data/scout_commands/purge ADDED

@@ -0,0 +1,170 @@
+#!/usr/bin/env ruby
+#
+# scout purge – delete files by their last access time.
+#
+# Supports three operations:
+#   * --before <specifier> – delete all files accessed before the given
+#     timestamp.  The specifier can be a relative description such as
+#     `last_week`, `3_day` or an absolute ISO date.
+#   * --older <file> – delete all files older than the last access time of
+#     the reference file.
+#   * --save <N> – keep the N most‑recently accessed files; delete the
+#     rest.  Can be combined with ``--before`` or ``--older``.
+#
+# The command uses the Scout option parser and prints messages using the
+# standard Log facility.  It works on the working directory when called
+# directly.
+require 'scout'
+require 'fileutils'
+require 'time'
+require 'pathname'
+# $0 handling used by other Scout commands.
+$0 = "scout #{$previous_commands.any? ? $previous_commands*" " + " " : "" }#{File.basename(__FILE__)}" if $previous_commands
+# ---- Options ------------------------------------------------------------
+options = SOPT.setup <<EOF
+Delete files by last access time.
+$ #{$0} <directory> [<options>]
+You need to specify the 'directory', at least one of the two parameters
+'before' or 'older'. I will find all the files under 'directory' that match the
+criteria and delete them, but only if the flag 'delete' is set, otherwise it
+will only list them. It only works with files; directories are ignored so that
+the directory structure is preserved.
+-b--before* Delete files accessed for the last time before the given time
+-o--older*  Delete files whose atime is older than the reference file
+-s--save*   Keep the N most recently accessed files, delete the rest
+-d--delete  Delete the files
+-h--help    Print this help
+EOF
+if options[:help]
+  if defined? scout_usage
+    scout_usage
+  else
+    puts SOPT.doc
+  end
+  exit 0
+end
+# ---- Validate directory ---------------------------------------------------
+dir = ARGV[0]
+raise MissingParameterException, "'directory' required" if dir.nil?
+dir = Path.setup dir.dup
+raise ParameterException, "'directory' does not exist" unless dir.exists?
+# ---- Helper -------------------------------------------------------------
+def time_from_specifier(spec)
+  case spec
+  when /^last_(day|week|month|year)$/
+    period = $1
+    secs = case period
+           when 'day'   then 24 * 60 * 60
+           when 'week'  then 7 * 24 * 60 * 60
+           when 'month' then 30 * 24 * 60 * 60
+           when 'year' then 335 * 24 * 60 * 60
+           else 24 * 60 * 60
+           end
+    Time.now - secs
+  when /^(\d+)_(day|week|month|year)$/
+    amount = $1.to_i
+    period = $2
+    secs = case period
+           when 'day'   then 24 * 60 * 60
+           when 'week'  then 7 * 24 * 60 * 60
+           when 'month' then 30 * 24 * 60 * 60
+           when 'year'  then 365 * 24 * 60 * 60
+           end
+    Time.now - amount * secs
+  when /^\d+\w+$/
+    Time.now - Misc.timespan(spec)
+  else
+    raise ParameterException, "unable to parse time spec '#{spec}'"
+  end
+end
+# ---- Gather files --------------------------------------------------------
+dir = Path.setup(dir.dup)
+dir = dir.find
+Log.info "Purging #{dir}"
+all_files = {}
+# Include dot files, ignore directories
+dir.glob('**/*').each do |p|
+  next if File.directory?(p)
+  all_files[p] = begin
+                   File.atime(p)
+                 rescue
+                   Log.warn $!.message
+                   next
+                 end
+end
+if all_files.empty?
+  puts 'no files found'
+  exit 0
+end
+# ---- Resolve options -----------------------------------------------------
+if options[:before] && options[:older]
+  Log.warn 'error: --before and --older are mutually exclusive'
+  exit 1
+end
+before_threshold = options[:before] ? time_from_specifier(options[:before]) : nil
+older_path      = options[:older]
+keep_count      = options[:save] ? options[:save].to_i : nil
+delete          = options[:delete]
+reference_at    = nil
+if older_path
+  older_path = File.expand_path(older_path)
+  unless File.exist?(older_path)
+    Log.warn "reference file #{older_path.inspect} does not exist"
+    exit 1
+  end
+  reference_at = File.atime(older_path)
+end
+# ---- Decide which files to delete ---------------------------------------
+target_files = if before_threshold
+                 all_files.select { |_, at| at < before_threshold }.keys
+               elsif reference_at
+                 all_files.select { |_, at| at < reference_at }.keys
+               else
+                 raise ParameterException, 'no criteria to select files'
+               end
+# Apply --save if provided
+if keep_count && keep_count.positive?
+  sorted = all_files.sort_by { |_, at| -at.to_f }
+  keep = sorted.take(keep_count).map(&:first)
+  target_files -= keep
+end
+if target_files.empty?
+  Log.warn 'no files matched deletion criteria'
+  exit 0
+end
+if delete
+  target_files.each do |file|
+    begin
+      Log.info "Delete #{file} atime: #{File.atime(file)}"
+      FileUtils.rm(file)
+    rescue => e
+      Log.warn "failed to delete #{file}: #{e.message}"
+    end
+  end
+else
+  target_files.each do |file|
+    Log.debug "atime: #{File.atime(file)} #{file}"
+    puts file
+  end
+end
+# ---- Perform or preview -------------------------------------------------
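The selection logic of the new `purge` command can be sketched without Scout. This is a simplified model under stated assumptions: `period_seconds`, `cutoff_time` and `purge_targets` are hypothetical names, the injectable `now:` argument is an addition for testability, and the `Misc.timespan` branch is omitted. The released script uses 335 days for `last_year` but 365 days in its `N_year` branch, which looks like a typo; the sketch uses 365 in both.

```ruby
# Seconds per period; month and year are 30- and 365-day approximations.
def period_seconds(period)
  day = 24 * 60 * 60
  { 'day' => day, 'week' => 7 * day, 'month' => 30 * day, 'year' => 365 * day }.fetch(period)
end

# Turn a specifier such as "last_week" or "3_day" into a cutoff Time.
def cutoff_time(spec, now: Time.now)
  case spec
  when /\Alast_(day|week|month|year)\z/
    now - period_seconds($1)
  when /\A(\d+)_(day|week|month|year)\z/
    now - $1.to_i * period_seconds($2)
  else
    raise ArgumentError, "unable to parse time spec '#{spec}'"
  end
end

# Files accessed before the cutoff, minus the `keep` most recently accessed ones.
def purge_targets(atimes, cutoff, keep: 0)
  targets = atimes.select { |_, atime| atime < cutoff }.keys
  kept = atimes.sort_by { |_, atime| -atime.to_f }.first(keep).map(&:first)
  targets - kept
end

day = 24 * 60 * 60
now = Time.at(1_000_000_000)
atimes = {
  "a.txt" => now - 10 * day,
  "b.txt" => now - 3 * day,
  "c.txt" => now - 3600
}

week_targets = purge_targets(atimes, cutoff_time("last_week", now: now))
kept_targets = purge_targets(atimes, cutoff_time("2_day", now: now), keep: 2)

p week_targets
p kept_targets
```

As in the script, the files to keep are ranked over all files, not only over those matching the time criterion.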

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: scout-gear
 version: !ruby/object:Gem::Version
-  version: 10.11.10
+  version: 10.12.0
 platform: ruby
 authors:
 - Miguel Vazquez
@@ -244,6 +244,7 @@ files:
 - scout_commands/kb/show
 - scout_commands/kb/traverse
 - scout_commands/log
+- scout_commands/purge
 - scout_commands/rbbt
 - scout_commands/resource/produce
 - scout_commands/resource/sync
@@ -382,7 +383,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     - !ruby/object:Gem::Version
       version: '0'
 requirements: []
-rubygems_version: 3.7.2
+rubygems_version: 3.7.0.dev
 specification_version: 4
 summary: basic gear for scouts
 test_files: []