scout-rig 0.1.1 → 0.2.0
- checksums.yaml +4 -4
- data/.vimproject +25 -0
- data/README.md +337 -0
- data/VERSION +1 -1
- data/doc/Python.md +482 -0
- data/lib/scout/python/run.rb +16 -7
- data/lib/scout/python/script.rb +5 -0
- data/lib/scout/python.rb +17 -1
- data/lib/scout/workflow/python/inputs.rb +59 -0
- data/lib/scout/workflow/python/task.rb +110 -0
- data/lib/scout/workflow/python.rb +14 -0
- data/python/scout/__init__.py +10 -3
- data/python/scout/runner.py +385 -0
- data/scout-rig.gemspec +13 -7
- data/test/scout/python/test_run.rb +15 -0
- data/test/scout/workflow/python/test_task.rb +37 -0
- data/test/scout/workflow/test_python.rb +47 -0
- data/test/test_helper.rb +51 -1
- metadata +12 -6
- data/README.rdoc +0 -18
- data/python/scout/__pycache__/__init__.cpython-310.pyc +0 -0
- data/python/scout/__pycache__/workflow.cpython-310.pyc +0 -0
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 7bb0f8b3f40dffa53114de0592de65c204a75980fd501f1a7d792063b0e90b3c
+  data.tar.gz: f077d7493516b97a16fe5229f276045ac7c2fe2a770233500a785457efefec22
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: a046eed20f43b8c4f41927d684744796127b36147b54ece94b4d3bce280fcf3ec63dfaa9caa0fcf2b7f278a8e9c80643d7485baca4ce69eef7cd00cbfe557810
+  data.tar.gz: 610ef0e18dab437d62f89dd7c8abb7532c911126654e5cf3784fabd738554d45d778beb2b6f04d9dbd0cde5f76044b7d00cb3c053d3f1460a72aeab996ff9c29
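The digests in `checksums.yaml` cover the two members of the packaged gem, `metadata.gz` and `data.tar.gz`. A minimal sketch of recomputing such digests for comparison follows, assuming the usual layout of a `.gem` file as an uncompressed tar archive. The helper names `member_digests` and `build_demo_archive`, and the in-memory demo data, are illustrative and not part of RubyGems or scout-rig.

```python
import hashlib
import io
import tarfile


def member_digests(gem_bytes: bytes, algorithm: str = "sha256") -> dict:
    """Map each regular file inside a tar archive to its hex digest."""
    digests = {}
    with tarfile.open(fileobj=io.BytesIO(gem_bytes), mode="r:") as archive:
        for member in archive.getmembers():
            if member.isfile():
                payload = archive.extractfile(member).read()
                digests[member.name] = hashlib.new(algorithm, payload).hexdigest()
    return digests


def build_demo_archive(files: dict) -> bytes:
    """Build a small in-memory tar standing in for a downloaded .gem (hypothetical data)."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, payload in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    return buffer.getvalue()


demo = build_demo_archive({"metadata.gz": b"demo metadata", "data.tar.gz": b"demo data"})
sha256 = member_digests(demo, "sha256")
sha512 = member_digests(demo, "sha512")
print(sorted(sha256))
```

With a real download, the bytes of the `.gem` file would take the place of `demo`, and the resulting values could be compared with the entries listed in `checksums.yaml`.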
data/.vimproject
CHANGED
@@ -2,6 +2,19 @@ scout-rig=/$PWD filter="*" {
 LICENSE.txt
 README.rdoc
 Rakefile
+chats=chats{
+documenter.rb
+
+python_workflow
+
+python_init
+
+python_pycall
+
+python_multi
+
+docstring
+}
 lib=lib {
 scout-rig.rb
 scout=scout{
@@ -12,17 +25,29 @@ scout-rig=/$PWD filter="*" {
 script.rb
 util.rb
 }
+workflow=workflow{
+python.rb
+python=python{
+inputs.rb
+task.rb
+}
+}
 }
 }
 test=test{
 test_helper.rb
 }
 python=python{
+task=task{
+hello.py
+}
 test.py
 scout=scout{
 __init__.py
+runner.py
 workflow.py
 workflow=workflow{
+definition.py
 remote.py
 }
 }
data/README.md
ADDED
@@ -0,0 +1,337 @@
+# scout-rig
+
+scout-rig provides the language interop “rigging” for the Scout ecosystem. It currently focuses on Python: executing Python from Ruby, round‑tripping data (TSV ↔ pandas), and running Scout Workflows from Python code. It builds on the low-level/core packages:
+
+- scout-essentials — low level utilities (Annotation, CMD, ConcurrentStream, IndiferentHash, Log, Open, Path, Persist, TmpFile)
+- scout-gear — data and workflow primitives (TSV, Workflow, KnowledgeBase, Association, Entity, WorkQueue, Semaphore)
+- scout-rig — interop with other languages (currently Python)
+- scout-camp — remote servers, cloud deployments, web interfaces, cross-site operations
+- scout-ai — model training and agentic tools
+
+All packages are available on GitHub under https://github.com/mikisvaz (for example, https://github.com/mikisvaz/scout-gear).
+
+For broader background and many real workflow examples, see Rbbt (the bioinformatics framework from which Scout was refactored) and the Rbbt-Workflows organization:
+- https://github.com/mikisvaz/rbbt
+- https://github.com/Rbbt-Workflows
+
+This README focuses on the Python bridge in scout-rig (ScoutPython). See the docs in doc/ for reference material.
+
+- doc/Python.md — ScoutPython user guide
+
+---
+
+## What you get
+
+ScoutPython (Ruby) and a companion Python package (python/scout) provide:
+
+- Safe, ergonomic execution of Python code from Ruby (PyCall-based), with:
+  - Simple import helpers and localized bindings
+  - Synchronous, direct, or background-thread execution
+  - Logging wrappers that capture Python stdout/stderr
+- Scripting to run ad‑hoc Python text with Ruby variables (including TSV) injected, and results returned
+- Data conversion helpers:
+  - numpy arrays → Ruby Arrays
+  - pandas DataFrame ↔ TSV (key_field, fields, type respected)
+- Python path management (expose package python/ dirs to sys.path)
+- Python‑side helpers to:
+  - Read/write TSVs with headers (pandas)
+  - Run Ruby Workflows from Python
+  - Call remote Workflow services over HTTP
+
+---
+
+## Installation and requirements
+
+Ruby
+- Ruby 2.6+ (or compatible with PyCall)
+- Gems:
+  - pycall (PyCall)
+  - json (standard)
+- Optional for script result loading:
+  - python/pickle (gem) for loading pickle from Python scripts
+
+Python
+- Python 3
+- Packages:
+  - pandas
+  - numpy
+  - requests (only for remote workflow client)
+- Ensure python3 is in PATH
+
+Add scout-rig to your Ruby project (Gemfile or local checkout), then ensure Python dependencies are installed in your Python environment.
+
+---
+
+## Quick start
+
+Execute Python directly from Ruby:
+
+```ruby
+require 'scout_python'
+
+# Sum with numpy
+arr_sum = ScoutPython.run 'numpy', as: :np do
+  np.array([1,2,3]).sum
+end
+# => PyObject (to_i if needed)
+
+# Background thread execution
+ScoutPython.run_threaded :sys do
+  sys.path.append('/opt/my_py_pkg')
+end
+ScoutPython.stop_thread
+```
+
+Run an ad‑hoc Python script, returning a result value:
+
+```ruby
+tsv = TSV.setup({}, "Key~ValueA,ValueB#:type=:list")
+tsv["k1"] = %w[a1 b1]; tsv["k2"] = %w[a2 b2]
+
+TmpFile.with_file do |target|
+  result = ScoutPython.script <<~PY, df: tsv, target: target
+    import scout
+    # df is a pandas DataFrame (tsv injected)
+    result = df.loc["k2", "ValueB"]
+    scout.save_tsv(target, df) # save as TSV with header
+  PY
+
+  # result is "b2"; target holds a TSV round-tripped from pandas
+end
+```
+
+Convert between TSV and pandas:
+
+```ruby
+df = ScoutPython.tsv2df(tsv) # TSV -> pandas DataFrame
+tsv2 = ScoutPython.df2tsv(df) # pandas DataFrame -> TSV
+```
+
+Run a Workflow from Python:
+
+```python
+import sys
+sys.path.append('python') # add this repo's python/ on dev checkouts
+
+import scout.workflow as sw
+
+wf = sw.Workflow('Baking')
+print(wf.tasks())
+step = wf.fork('bake_muffin_tray', add_blueberries=True, clean='recursive')
+step.join()
+print(step.load()) # load Ruby job result
+```
+
+---
+
+## Core concepts
+
+### Path management for Python imports
+
+ScoutPython tracks Python directories to add to sys.path:
+
+- ScoutPython.add_path(path) / add_paths(paths)
+- ScoutPython.process_paths # idempotent; run before/inside sessions
+
+These are applied in Python contexts by run/run_simple/run_direct.
+
+### Running Python from Ruby
+
+Pick the execution model that fits:
+
+- run(mod = nil, imports = nil) { ... }
+  - Initialize PyCall if needed, set up paths, run block; GC after run
+- run_simple(mod = nil, imports = nil) { ... }
+  - Lightweight; process_paths, then run block
+- run_direct(mod = nil, imports = nil) { ... }
+  - Minimal overhead: optional single pyimport/pyfrom, then evaluate
+- run_threaded(mod = nil, imports = nil) { ... }
+  - Queue work into a dedicated Python thread; stop with stop_thread
+
+Logging wrappers capture Python’s stdout/stderr via the Scout Log:
+
+- run_log(mod=nil, imports=nil, severity=Log::LOW, severity_err=nil) { ... }
+- run_log_stderr(mod=nil, imports=nil, severity=Log::LOW) { ... }
+
+Imports
+- Pass 'numpy', as: :np or "module.submodule", import: [:Class, :func]
+
+### Binding scopes and imports
+
+Keep imports local to a binding:
+
+```ruby
+ScoutPython.binding_run do
+  pyimport :torch
+  pyfrom :torch, import: ['nn']
+  # torch and nn available here only
+end
+```
+
+Helpers
+- new_binding, binding_run
+- import_method, call_method
+- get_module, get_class, class_new_obj
+- exec(script) → PyCall.exec
+
+### Scripting
+
+Run arbitrary Python text with Ruby variables injected:
+
+- ScoutPython.script(text, variables = {}) → result
+  - Ruby primitives → Python literals
+  - Arrays/Hashes → recursively converted
+  - TSV variables → materialized to temp file and loaded into pandas via the python/scout helper
+  - result is read back via pickle (default) or JSON (configurable)
+
+Swap result serializer if desired:
+
+```ruby
+class << ScoutPython
+  alias save_script_result save_script_result_json
+  alias load_result load_json
+end
+```
+
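The scripting mechanism summarized above (variables written as Python literals, script text executed, `result` read back through a serializer) can be sketched in plain Python. This is an illustration under stated assumptions, not the gem's code: `run_script` is a hypothetical helper, the JSON serializer is chosen for simplicity, and TSV injection is left out.

```python
import json
import os
import subprocess
import sys
import tempfile


def run_script(text: str, variables: dict):
    """Run Python text in a child process and return its `result` variable.

    Variables are written as Python literals ahead of the script, and the
    result comes back through a JSON file (an assumption of this sketch).
    """
    with tempfile.TemporaryDirectory() as workdir:
        result_file = os.path.join(workdir, "result.json")
        header = [f"{name} = {value!r}" for name, value in variables.items()]
        footer = [
            "import json",
            f"with open({result_file!r}, 'w') as handle:",
            "    json.dump(result, handle)",
        ]
        script_file = os.path.join(workdir, "script.py")
        with open(script_file, "w") as handle:
            handle.write("\n".join(header + [text] + footer) + "\n")
        # check=True raises CalledProcessError on a non-zero exit status
        subprocess.run([sys.executable, script_file], check=True)
        with open(result_file) as handle:
            return json.load(handle)


value = run_script(
    "result = [x * factor for x in numbers]",
    {"numbers": [1, 2, 3], "factor": 10},
)
print(value)
```

JSON keeps the exchanged file readable but limits results to lists, dicts, strings, numbers, booleans and null, which is one reason a pickle-based default can be attractive for richer objects.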
+### Iteration utilities
+
+Traverse Python iterables with optional progress bars:
+
+- iterate(iterator, bar: nil|true|String) { |elem| ... }
+- iterate_index(sequence, bar: ...) { |elem| ... }
+- collect(iterator, bar: ...) { |elem| ... } → Array
+
+### Data conversion and pandas helpers
+
+- numpy2ruby(numpy_array)
+- to_a/py2ruby_a(py_list)
+- obj2hash(py_mapping)
+- tsv2df(tsv) / df2tsv(df, options={type: :list, key_field: ...})
+
+---
+
+## Python-side package (python/scout)
+
+The included Python package is importable as scout and provides:
+
+General utilities
+- scout.libdir(), scout.add_libdir()
+- scout.path(), scout.read()
+- scout.inspect(obj), scout.rich(obj)
+
+TSV IO (pandas-aware)
+- scout.tsv(tsv_path_or_stream, ...) → pandas.DataFrame (Scout headers respected)
+- scout.save_tsv(filename, df, key=None)
+
+Workflow wrappers
+- scout.run_job(workflow, task, name='Default', fork=False, clean=False, **inputs)
+  - Shells out to the Ruby CLI to execute/fork jobs
+- scout.workflow.Workflow(name).run/fork/tasks/task_info
+- scout.workflow.Step(path).info/status/join/load
+
+Remote workflows (HTTP)
+- scout.workflow.remote.RemoteWorkflow(url).job/task_info
+- scout.workflow.remote.RemoteStep(url).status/wait/raw/json
+
+---
+
+## Error handling and threading
+
+- Python process errors from script are surfaced as ConcurrentStreamProcessFailed (non‑zero exit), with stderr logged via Log if a logging wrapper is used
+- Background thread execution must be stopped explicitly:
+  - ScoutPython.stop_thread — sends a sentinel, tries to join/kill, GCs, and finalizes PyCall if available
+
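The queue-and-sentinel pattern mentioned for background execution can be sketched with the Python standard library. The names `run_threaded` and `stop_thread` are borrowed from the README for readability only; this is a generic worker-thread illustration, not a port of the Ruby code, and it omits the garbage collection and PyCall finalization steps.

```python
import queue
import threading

_STOP = object()          # sentinel telling the worker to exit
_jobs = queue.Queue()
_worker = None


def _loop():
    """Worker loop: take jobs until the sentinel arrives."""
    while True:
        job = _jobs.get()
        if job is _STOP:
            break
        func, done = job
        try:
            done["value"] = func()
        except Exception as error:      # hand the error back to the caller
            done["error"] = error
        finally:
            done["event"].set()


def run_threaded(func):
    """Run func on the dedicated worker thread and wait for its result."""
    global _worker
    if _worker is None or not _worker.is_alive():
        _worker = threading.Thread(target=_loop, daemon=True)
        _worker.start()
    done = {"event": threading.Event()}
    _jobs.put((func, done))
    done["event"].wait()
    if "error" in done:
        raise done["error"]
    return done["value"]


def stop_thread(timeout: float = 5.0):
    """Send the sentinel and wait for the worker to finish."""
    global _worker
    if _worker is not None:
        _jobs.put(_STOP)
        _worker.join(timeout)
        _worker = None


total = run_threaded(lambda: sum([1, 2, 3]))
worker_name = run_threaded(lambda: threading.current_thread().name)
stop_thread()
print(total, worker_name != threading.main_thread().name)
```

Keeping all calls on one long-lived thread is a common choice when the embedded interpreter or its libraries expect to be driven from a single thread; the explicit stop call is the price of that design.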
+---
+
+## Command line usage and discovery
+
+Scout commands are discovered under scout_commands across installed packages using the Path subsystem. The dispatcher resolves nested commands by adding terms until a file is found to execute; if you stop on a directory, it lists available subcommands.
+
+- General pattern:
+  - scout <top-level> [<subcommand> ...] [options] [args...]
+- Examples relevant to Python integration (executed from Ruby CLI but callable from Python via scout.run_job):
+  - scout workflow task <Workflow> <task> [task-input-options...]
+  - scout workflow prov <step_path>
+  - scout workflow info <step_path>
+
+Notes
+- The bin/scout launcher walks scout_commands/… across packages; Workflows and other packages can add their own commands and they will be discovered
+- See the Workflow, TSV, and KnowledgeBase docs for their CLI suites:
+  - TSV: scout tsv …
+  - Workflow: scout workflow …
+  - KnowledgeBase: scout kb …
+
+scout-rig itself does not register standalone CLI commands; instead, its Python wrapper invokes the existing Ruby CLI to run jobs from Python.
+
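The resolution rule described for the dispatcher (add terms until a file is found, list subcommands when the terms stop on a directory) can be sketched as follows. This is a simplified illustration: `resolve` is a hypothetical function, the temporary directory stands in for a `scout_commands` tree, and only a single root is searched, whereas the README describes discovery across several installed packages.

```python
import os
import tempfile


def resolve(root: str, terms: list):
    """Walk terms under root until a file is found.

    Returns ("run", file_path, remaining_args) when a file is reached, or
    ("list", directory, sorted_entries) when the terms stop on a directory.
    """
    current = root
    for index, term in enumerate(terms):
        candidate = os.path.join(current, term)
        if os.path.isfile(candidate):
            return ("run", candidate, terms[index + 1:])
        if not os.path.isdir(candidate):
            raise LookupError(f"unknown command: {' '.join(terms[:index + 1])}")
        current = candidate
    return ("list", current, sorted(os.listdir(current)))


# Hypothetical command tree standing in for a scout_commands directory
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "workflow"))
for name in ("task", "info", "prov"):
    open(os.path.join(root, "workflow", name), "w").close()

action, path, args = resolve(root, ["workflow", "task", "Baking", "bake_muffin_tray"])
print(action, os.path.basename(path), args)

action, path, entries = resolve(root, ["workflow"])
print(action, entries)
```

Everything after the matched file is passed along untouched, which is how a command such as `scout workflow task <Workflow> <task>` can receive its own arguments.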
+---
+
+## Reference
+
+Read the full module guide in doc/Python.md. For core building blocks referenced above, see these docs in scout-essentials and scout-gear:
+
+- Annotation.md, CMD.md, ConcurrentStream.md, IndiferentHash.md, Log.md, Open.md, Path.md, Persist.md, TmpFile.md
+- TSV.md, Workflow.md, KnowledgeBase.md, Association.md, Entity.md, WorkQueue.md, Semaphore.md
+
+---
+
+## Examples
+
+Direct PyCall with imports:
+
+```ruby
+ScoutPython.run 'numpy', as: :np do
+  a = np.array([1,2,3])
+  a.sum # PyObject; convert with to_i if needed
+end
+```
+
+Script with a returned value and TSV round‑trip:
+
+```ruby
+tsv = TSV.setup({}, "Key~ValueA,ValueB#:type=:list")
+tsv["k1"] = ["a1", "b1"]; tsv["k2"] = ["a2", "b2"]
+
+TmpFile.with_file do |target|
+  result = ScoutPython.script <<~PY, df: tsv, target: target
+    import scout
+    result = df.loc["k2", "ValueB"]
+    scout.save_tsv(target, df)
+  PY
+  # result == "b2"; target contains the saved TSV
+end
+```
+
+numpy conversion:
+
+```ruby
+ra = ScoutPython.run :numpy, as: :np do
+  na = np.array([[[1,2,3], [4,5,6]]])
+  ScoutPython.numpy2ruby(na)
+end
+ra[0][1][2] # => 6
+```
+
+Run workflows from Python:
+
+```python
+import scout.workflow as sw
+
+wf = sw.Workflow('Baking')
+step = wf.fork('bake_muffin_tray', add_blueberries=True, clean='recursive')
+step.join()
+print(step.load())
+```
+
+---
+
+## Project links
+
+- scout-essentials — https://github.com/mikisvaz/scout-essentials
+- scout-gear — https://github.com/mikisvaz/scout-gear
+- scout-rig — https://github.com/mikisvaz/scout-rig
+- scout-camp — https://github.com/mikisvaz/scout-camp
+- scout-ai — https://github.com/mikisvaz/scout-ai
+- Rbbt — https://github.com/mikisvaz/rbbt
+- Rbbt-Workflows — https://github.com/Rbbt-Workflows
+
+Contributions and issues are welcome in their respective GitHub repositories.
data/VERSION
CHANGED
@@ -1 +1 @@
-0.
+0.2.0