scout-rig 0.1.0 → 0.2.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: bcb31a6421de24abf3a11f37bcf00a295ffcf9d5163c55f02ccdb806ff54c4fb
- data.tar.gz: b45379e2fb720a9e397faf467f7d3fcd9b14f0212096d40fcb62d2208a4936d1
+ metadata.gz: 7bb0f8b3f40dffa53114de0592de65c204a75980fd501f1a7d792063b0e90b3c
+ data.tar.gz: f077d7493516b97a16fe5229f276045ac7c2fe2a770233500a785457efefec22
  SHA512:
- metadata.gz: cf8a3e4b67bdda6d867279d0b3e6166f5a942f0ee3a51d90105355121e8b2d8205b5f6468ec43c7cc487964d671bbeed73625815779fca633e64e4647e62a877
- data.tar.gz: 8576a2ef1cd982db94fdab16ff207a1d9e7f189c125c5f92d4d0940091b452a64e69d3fa5af95f477131e8969686f47732ea79308c2f4f9734584e3ac3fbfffd
+ metadata.gz: a046eed20f43b8c4f41927d684744796127b36147b54ece94b4d3bce280fcf3ec63dfaa9caa0fcf2b7f278a8e9c80643d7485baca4ce69eef7cd00cbfe557810
+ data.tar.gz: 610ef0e18dab437d62f89dd7c8abb7532c911126654e5cf3784fabd738554d45d778beb2b6f04d9dbd0cde5f76044b7d00cb3c053d3f1460a72aeab996ff9c29
data/.vimproject CHANGED
@@ -2,6 +2,19 @@ scout-rig=/$PWD filter="*" {
  LICENSE.txt
  README.rdoc
  Rakefile
+ chats=chats{
+ documenter.rb
+
+ python_workflow
+
+ python_init
+
+ python_pycall
+
+ python_multi
+
+ docstring
+ }
  lib=lib {
  scout-rig.rb
  scout=scout{
@@ -12,17 +25,29 @@ scout-rig=/$PWD filter="*" {
  script.rb
  util.rb
  }
+ workflow=workflow{
+ python.rb
+ python=python{
+ inputs.rb
+ task.rb
+ }
+ }
  }
  }
  test=test{
  test_helper.rb
  }
  python=python{
+ task=task{
+ hello.py
+ }
  test.py
  scout=scout{
  __init__.py
+ runner.py
  workflow.py
  workflow=workflow{
+ definition.py
  remote.py
  }
  }
data/README.md ADDED
@@ -0,0 +1,337 @@
+ # scout-rig
+
+ scout-rig provides the language interop “rigging” for the Scout ecosystem. It currently focuses on Python: executing Python from Ruby, round‑tripping data (TSV ↔ pandas), and running Scout Workflows from Python code. It builds on the low-level/core packages:
+
+ - scout-essentials — low level utilities (Annotation, CMD, ConcurrentStream, IndiferentHash, Log, Open, Path, Persist, TmpFile)
+ - scout-gear — data and workflow primitives (TSV, Workflow, KnowledgeBase, Association, Entity, WorkQueue, Semaphore)
+ - scout-rig — interop with other languages (currently Python)
+ - scout-camp — remote servers, cloud deployments, web interfaces, cross-site operations
+ - scout-ai — model training and agentic tools
+
+ All packages are available on GitHub under https://github.com/mikisvaz (for example, https://github.com/mikisvaz/scout-gear).
+
+ For broader background and many real workflow examples, see Rbbt (the bioinformatics framework from which Scout was refactored) and the Rbbt-Workflows organization:
+ - https://github.com/mikisvaz/rbbt
+ - https://github.com/Rbbt-Workflows
+
+ This README focuses on the Python bridge in scout-rig (ScoutPython). See the docs in doc/ for reference material.
+
+ - doc/Python.md — ScoutPython user guide
+
+ ---
+
+ ## What you get
+
+ ScoutPython (Ruby) and a companion Python package (python/scout) provide:
+
+ - Safe, ergonomic execution of Python code from Ruby (PyCall-based), with:
+   - Simple import helpers and localized bindings
+   - Synchronous, direct, or background-thread execution
+   - Logging wrappers that capture Python stdout/stderr
+ - Scripting to run ad‑hoc Python text with Ruby variables (including TSV) injected, and results returned
+ - Data conversion helpers:
+   - numpy arrays → Ruby Arrays
+   - pandas DataFrame ↔ TSV (key_field, fields, type respected)
+ - Python path management (expose package python/ dirs to sys.path)
+ - Python‑side helpers to:
+   - Read/write TSVs with headers (pandas)
+   - Run Ruby Workflows from Python
+   - Call remote Workflow services over HTTP
+
+ ---
+
+ ## Installation and requirements
+
+ Ruby
+ - Ruby 2.6+ (or compatible with PyCall)
+ - Gems:
+   - pycall (PyCall)
+   - json (standard)
+   - Optional for script result loading:
+     - python/pickle (gem) for loading pickle from Python scripts
+
+ Python
+ - Python 3
+ - Packages:
+   - pandas
+   - numpy
+   - requests (only for remote workflow client)
+ - Ensure python3 is in PATH
+
+ Add scout-rig to your Ruby project (Gemfile or local checkout), then ensure Python dependencies are installed in your Python environment.
+
+ ---
+
+ ## Quick start
+
+ Execute Python directly from Ruby:
+
+ ```ruby
+ require 'scout_python'
+
+ # Sum with numpy
+ arr_sum = ScoutPython.run 'numpy', as: :np do
+   np.array([1,2,3]).sum
+ end
+ # => PyObject (to_i if needed)
+
+ # Background thread execution
+ ScoutPython.run_threaded :sys do
+   sys.path.append('/opt/my_py_pkg')
+ end
+ ScoutPython.stop_thread
+ ```
+
+ Run an ad‑hoc Python script, returning a result value:
+
+ ```ruby
+ tsv = TSV.setup({}, "Key~ValueA,ValueB#:type=:list")
+ tsv["k1"] = %w[a1 b1]; tsv["k2"] = %w[a2 b2]
+
+ TmpFile.with_file do |target|
+   result = ScoutPython.script <<~PY, df: tsv, target: target
+     import scout
+     # df is a pandas DataFrame (tsv injected)
+     result = df.loc["k2", "ValueB"]
+     scout.save_tsv(target, df) # save as TSV with header
+   PY
+
+   # result is "b2"; target holds a TSV round-tripped from pandas
+ end
+ ```
+
+ Convert between TSV and pandas:
+
+ ```ruby
+ df = ScoutPython.tsv2df(tsv) # TSV -> pandas DataFrame
+ tsv2 = ScoutPython.df2tsv(df) # pandas DataFrame -> TSV
+ ```
+
+ Run a Workflow from Python:
+
+ ```python
+ import sys
+ sys.path.append('python') # add this repo's python/ on dev checkouts
+
+ import scout.workflow as sw
+
+ wf = sw.Workflow('Baking')
+ print(wf.tasks())
+ step = wf.fork('bake_muffin_tray', add_blueberries=True, clean='recursive')
+ step.join()
+ print(step.load()) # load Ruby job result
+ ```
+
+ ---
+
+ ## Core concepts
+
+ ### Path management for Python imports
+
+ ScoutPython tracks Python directories to add to sys.path:
+
+ - ScoutPython.add_path(path) / add_paths(paths)
+ - ScoutPython.process_paths # idempotent; run before/inside sessions
+
+ These are applied in Python contexts by run/run_simple/run_direct.
+
+ ### Running Python from Ruby
+
+ Pick the execution model that fits:
+
+ - run(mod = nil, imports = nil) { ... }
+   - Initialize PyCall if needed, set up paths, run block; GC after run
+ - run_simple(mod = nil, imports = nil) { ... }
+   - Lightweight; process_paths, then run block
+ - run_direct(mod = nil, imports = nil) { ... }
+   - Minimal overhead: optional single pyimport/pyfrom, then evaluate
+ - run_threaded(mod = nil, imports = nil) { ... }
+   - Queue work into a dedicated Python thread; stop with stop_thread
+
+ Logging wrappers capture Python’s stdout/stderr via the Scout Log:
+
+ - run_log(mod=nil, imports=nil, severity=Log::LOW, severity_err=nil) { ... }
+ - run_log_stderr(mod=nil, imports=nil, severity=Log::LOW) { ... }
+
+ Imports
+ - Pass 'numpy', as: :np or "module.submodule", import: [:Class, :func]
+
+ ### Binding scopes and imports
+
+ Keep imports local to a binding:
+
+ ```ruby
+ ScoutPython.binding_run do
+   pyimport :torch
+   pyfrom :torch, import: ['nn']
+   # torch and nn available here only
+ end
+ ```
+
+ Helpers
+ - new_binding, binding_run
+ - import_method, call_method
+ - get_module, get_class, class_new_obj
+ - exec(script) → PyCall.exec
+
+ ### Scripting
+
+ Run arbitrary Python text with Ruby variables injected:
+
+ - ScoutPython.script(text, variables = {}) → result
+   - Ruby primitives → Python literals
+   - Arrays/Hashes → recursively converted
+   - TSV variables → materialized to temp file and loaded into pandas via the python/scout helper
+   - result is read back via pickle (default) or JSON (configurable)
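
The inject-run-read-back cycle in the list above can be illustrated with a small self-contained sketch. It is written in Python for convenience and is not how ScoutPython is implemented: `run_script` is a hypothetical helper, variables are injected with `repr`, and the TSV-to-pandas step is left out.

```python
import pickle
import subprocess
import sys
import tempfile
from pathlib import Path


def run_script(text, **variables):
    """Run Python source in a child interpreter and return its `result` variable."""
    with tempfile.TemporaryDirectory() as tmp:
        result_file = Path(tmp) / "result.pkl"
        script_file = Path(tmp) / "script.py"

        # Inject each variable as a Python literal assignment ahead of the script.
        header = "".join(f"{name} = {value!r}\n" for name, value in variables.items())
        # Append code that pickles `result` so the caller can read it back.
        footer = (
            "\nimport pickle\n"
            f"with open({str(result_file)!r}, 'wb') as fh:\n"
            "    pickle.dump(result, fh)\n"
        )
        script_file.write_text(header + text + footer)

        # check=True raises CalledProcessError when the child exits non-zero.
        subprocess.run([sys.executable, str(script_file)], check=True)
        return pickle.loads(result_file.read_bytes())


rows = {"k1": ["a1", "b1"], "k2": ["a2", "b2"]}
value = run_script("result = rows[key][1]", rows=rows, key="k2")
```

Swapping pickle for JSON only changes the footer and the final load, which is presumably why the README exposes the serializer as a pair of aliasable methods.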
+
+ Swap result serializer if desired:
+
+ ```ruby
+ class << ScoutPython
+   alias save_script_result save_script_result_json
+   alias load_result load_json
+ end
+ ```
+
+ ### Iteration utilities
+
+ Traverse Python iterables with optional progress bars:
+
+ - iterate(iterator, bar: nil|true|String) { |elem| ... }
+ - iterate_index(sequence, bar: ...) { |elem| ... }
+ - collect(iterator, bar: ...) { |elem| ... } → Array
+
+ ### Data conversion and pandas helpers
+
+ - numpy2ruby(numpy_array)
+ - to_a/py2ruby_a(py_list)
+ - obj2hash(py_mapping)
+ - tsv2df(tsv) / df2tsv(df, options={type: :list, key_field: ...})
+
+ ---
+
+ ## Python-side package (python/scout)
+
+ The included Python package is importable as scout and provides:
+
+ General utilities
+ - scout.libdir(), scout.add_libdir()
+ - scout.path(), scout.read()
+ - scout.inspect(obj), scout.rich(obj)
+
+ TSV IO (pandas-aware)
+ - scout.tsv(tsv_path_or_stream, ...) → pandas.DataFrame (Scout headers respected)
+ - scout.save_tsv(filename, df, key=None)
+
+ Workflow wrappers
+ - scout.run_job(workflow, task, name='Default', fork=False, clean=False, **inputs)
+   - Shells out to the Ruby CLI to execute/fork jobs
+ - scout.workflow.Workflow(name).run/fork/tasks/task_info
+ - scout.workflow.Step(path).info/status/join/load
+
+ Remote workflows (HTTP)
+ - scout.workflow.remote.RemoteWorkflow(url).job/task_info
+ - scout.workflow.remote.RemoteStep(url).status/wait/raw/json
+
+ ---
+
+ ## Error handling and threading
+
+ - Python process errors from script are surfaced as ConcurrentStreamProcessFailed (non‑zero exit), with stderr logged via Log if a logging wrapper is used
+ - Background thread execution must be stopped explicitly:
+   - ScoutPython.stop_thread — sends a sentinel, tries to join/kill, GCs, and finalizes PyCall if available
+
+ ---
+
+ ## Command line usage and discovery
+
+ Scout commands are discovered under scout_commands across installed packages using the Path subsystem. The dispatcher resolves nested commands by adding terms until a file is found to execute; if you stop on a directory, it lists available subcommands.
+
+ - General pattern:
+   - scout <top-level> [<subcommand> ...] [options] [args...]
+ - Examples relevant to Python integration (executed from Ruby CLI but callable from Python via scout.run_job):
+   - scout workflow task <Workflow> <task> [task-input-options...]
+   - scout workflow prov <step_path>
+   - scout workflow info <step_path>
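
The resolution rule described above (keep adding terms until a file is reached, list subcommands when the terms end on a directory) can be sketched as follows. This is a simplified illustration over a single directory tree; the real dispatcher searches scout_commands across packages through the Path subsystem, and `resolve_command` plus the temporary layout are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_command(root, terms):
    """Walk `terms` under `root` until a file is found.

    Returns ("run", script_path, remaining_args) when a file is reached,
    or ("list", directory, subcommand_names) when the terms stop on a directory.
    """
    current = Path(root)
    for index, term in enumerate(terms):
        candidate = current / term
        if candidate.is_file():
            return "run", candidate, list(terms[index + 1:])
        if not candidate.is_dir():
            raise LookupError(f"unknown command: {' '.join(terms[:index + 1])}")
        current = candidate
    return "list", current, sorted(p.name for p in current.iterdir())


# Hypothetical layout mirroring `scout workflow task` and `scout workflow info`.
with tempfile.TemporaryDirectory() as tmp:
    commands = Path(tmp) / "scout_commands"
    (commands / "workflow").mkdir(parents=True)
    (commands / "workflow" / "task").write_text("# task command\n")
    (commands / "workflow" / "info").write_text("# info command\n")

    action, script, rest = resolve_command(
        commands, ["workflow", "task", "Baking", "bake_muffin_tray"]
    )
    listing = resolve_command(commands, ["workflow"])
```

Everything after the matched file is handed to the command as its own arguments, which is how `<Workflow> <task>` reach the task command in this sketch.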
+
+ Notes
+ - The bin/scout launcher walks scout_commands/… across packages; Workflows and other packages can add their own commands and they will be discovered
+ - See the Workflow, TSV, and KnowledgeBase docs for their CLI suites:
+   - TSV: scout tsv …
+   - Workflow: scout workflow …
+   - KnowledgeBase: scout kb …
+
+ scout-rig itself does not register standalone CLI commands; instead, its Python wrapper invokes the existing Ruby CLI to run jobs from Python.
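
A wrapper that shells out in this way mostly has to assemble an argument vector from the pattern `scout workflow task <Workflow> <task> [task-input-options...]`. The sketch below only builds that vector and does not run anything. `build_task_command` is a hypothetical helper, and the `--<name> <value>` form for task inputs is an assumption that the README does not confirm.

```python
import shlex


def build_task_command(workflow, task, **inputs):
    """Build an argv list for `scout workflow task <Workflow> <task> ...`."""
    argv = ["scout", "workflow", "task", workflow, task]
    for name, value in inputs.items():
        # ASSUMPTION: each input is passed as `--<name> <value>`; the real
        # option syntax is not spelled out in the README.
        argv += [f"--{name}", str(value)]
    return argv


argv = build_task_command("Baking", "bake_muffin_tray", add_blueberries=True)
command_line = shlex.join(argv)  # shell-quoted form, handy for logging
```

Passing the list form to `subprocess.run(argv, check=True)` avoids shell quoting issues and raises on a non-zero exit.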
+
+ ---
+
+ ## Reference
+
+ Read the full module guide in doc/Python.md. For core building blocks referenced above, see these docs in scout-essentials and scout-gear:
+
+ - Annotation.md, CMD.md, ConcurrentStream.md, IndiferentHash.md, Log.md, Open.md, Path.md, Persist.md, TmpFile.md
+ - TSV.md, Workflow.md, KnowledgeBase.md, Association.md, Entity.md, WorkQueue.md, Semaphore.md
+
+ ---
+
+ ## Examples
+
+ Direct PyCall with imports:
+
+ ```ruby
+ ScoutPython.run 'numpy', as: :np do
+   a = np.array([1,2,3])
+   a.sum # PyObject; convert with to_i if needed
+ end
+ ```
+
+ Script with a returned value and TSV round‑trip:
+
+ ```ruby
+ tsv = TSV.setup({}, "Key~ValueA,ValueB#:type=:list")
+ tsv["k1"] = ["a1", "b1"]; tsv["k2"] = ["a2", "b2"]
+
+ TmpFile.with_file do |target|
+   result = ScoutPython.script <<~PY, df: tsv, target: target
+     import scout
+     result = df.loc["k2", "ValueB"]
+     scout.save_tsv(target, df)
+   PY
+   # result == "b2"; target contains the saved TSV
+ end
+ ```
+
+ numpy conversion:
+
+ ```ruby
+ ra = ScoutPython.run :numpy, as: :np do
+   na = np.array([[[1,2,3], [4,5,6]]])
+   ScoutPython.numpy2ruby(na)
+ end
+ ra[0][1][2] # => 6
+ ```
+
+ Run workflows from Python:
+
+ ```python
+ import scout.workflow as sw
+
+ wf = sw.Workflow('Baking')
+ step = wf.fork('bake_muffin_tray', add_blueberries=True, clean='recursive')
+ step.join()
+ print(step.load())
+ ```
+
+ ---
+
+ ## Project links
+
+ - scout-essentials — https://github.com/mikisvaz/scout-essentials
+ - scout-gear — https://github.com/mikisvaz/scout-gear
+ - scout-rig — https://github.com/mikisvaz/scout-rig
+ - scout-camp — https://github.com/mikisvaz/scout-camp
+ - scout-ai — https://github.com/mikisvaz/scout-ai
+ - Rbbt — https://github.com/mikisvaz/rbbt
+ - Rbbt-Workflows — https://github.com/Rbbt-Workflows
+
+ Contributions and issues are welcome in their respective GitHub repositories.
data/VERSION CHANGED
@@ -1 +1 @@
- 0.1.0
+ 0.2.0