robot_lab 0.0.8 → 0.0.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (78)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +71 -0
  3. data/README.md +106 -4
  4. data/Rakefile +2 -1
  5. data/docs/api/core/robot.md +336 -1
  6. data/docs/api/mcp/client.md +1 -0
  7. data/docs/api/mcp/server.md +27 -8
  8. data/docs/api/mcp/transports.md +21 -6
  9. data/docs/architecture/core-concepts.md +1 -1
  10. data/docs/architecture/robot-execution.md +20 -2
  11. data/docs/concepts.md +4 -0
  12. data/docs/guides/building-robots.md +18 -0
  13. data/docs/guides/creating-networks.md +39 -0
  14. data/docs/guides/index.md +10 -0
  15. data/docs/guides/knowledge.md +182 -0
  16. data/docs/guides/mcp-integration.md +180 -2
  17. data/docs/guides/memory.md +2 -0
  18. data/docs/guides/observability.md +486 -0
  19. data/docs/guides/ractor-parallelism.md +364 -0
  20. data/docs/superpowers/plans/2026-04-14-ractor-integration.md +1538 -0
  21. data/docs/superpowers/specs/2026-04-14-ractor-integration-design.md +258 -0
  22. data/examples/14_rusty_circuit/.gitignore +1 -0
  23. data/examples/14_rusty_circuit/open_mic.rb +1 -1
  24. data/examples/19_token_tracking.rb +128 -0
  25. data/examples/20_circuit_breaker.rb +153 -0
  26. data/examples/21_learning_loop.rb +164 -0
  27. data/examples/22_context_compression.rb +179 -0
  28. data/examples/23_convergence.rb +137 -0
  29. data/examples/24_structured_delegation.rb +150 -0
  30. data/examples/25_history_search/conversation.jsonl +30 -0
  31. data/examples/25_history_search.rb +136 -0
  32. data/examples/26_document_store/api_versioning_adr.md +52 -0
  33. data/examples/26_document_store/incident_postmortem.md +46 -0
  34. data/examples/26_document_store/postgres_runbook.md +49 -0
  35. data/examples/26_document_store/redis_caching_guide.md +48 -0
  36. data/examples/26_document_store/sidekiq_guide.md +51 -0
  37. data/examples/26_document_store.rb +147 -0
  38. data/examples/27_incident_response/incident_response.rb +244 -0
  39. data/examples/28_mcp_discovery.rb +112 -0
  40. data/examples/29_ractor_tools.rb +243 -0
  41. data/examples/30_ractor_network.rb +256 -0
  42. data/examples/README.md +136 -0
  43. data/examples/prompts/skill_with_mcp_test.md +9 -0
  44. data/examples/prompts/skill_with_robot_name_test.md +5 -0
  45. data/examples/prompts/skill_with_tools_test.md +6 -0
  46. data/lib/robot_lab/bus_poller.rb +149 -0
  47. data/lib/robot_lab/convergence.rb +69 -0
  48. data/lib/robot_lab/delegation_future.rb +93 -0
  49. data/lib/robot_lab/document_store.rb +155 -0
  50. data/lib/robot_lab/error.rb +25 -0
  51. data/lib/robot_lab/history_compressor.rb +205 -0
  52. data/lib/robot_lab/mcp/client.rb +23 -9
  53. data/lib/robot_lab/mcp/connection_poller.rb +187 -0
  54. data/lib/robot_lab/mcp/server.rb +26 -3
  55. data/lib/robot_lab/mcp/server_discovery.rb +110 -0
  56. data/lib/robot_lab/mcp/transports/base.rb +10 -2
  57. data/lib/robot_lab/mcp/transports/stdio.rb +58 -26
  58. data/lib/robot_lab/memory.rb +103 -6
  59. data/lib/robot_lab/network.rb +44 -9
  60. data/lib/robot_lab/ractor_boundary.rb +42 -0
  61. data/lib/robot_lab/ractor_job.rb +37 -0
  62. data/lib/robot_lab/ractor_memory_proxy.rb +85 -0
  63. data/lib/robot_lab/ractor_network_scheduler.rb +154 -0
  64. data/lib/robot_lab/ractor_worker_pool.rb +117 -0
  65. data/lib/robot_lab/robot/bus_messaging.rb +43 -65
  66. data/lib/robot_lab/robot/history_search.rb +69 -0
  67. data/lib/robot_lab/robot/mcp_management.rb +61 -4
  68. data/lib/robot_lab/robot.rb +351 -11
  69. data/lib/robot_lab/robot_result.rb +26 -5
  70. data/lib/robot_lab/run_config.rb +1 -1
  71. data/lib/robot_lab/text_analysis.rb +103 -0
  72. data/lib/robot_lab/tool.rb +42 -3
  73. data/lib/robot_lab/tool_config.rb +1 -1
  74. data/lib/robot_lab/version.rb +1 -1
  75. data/lib/robot_lab/waiter.rb +49 -29
  76. data/lib/robot_lab.rb +25 -0
  77. data/mkdocs.yml +1 -0
  78. metadata +71 -2
@@ -0,0 +1,364 @@
1
+ # Ractor Parallelism
2
+
3
+ RobotLab supports true CPU parallelism via Ruby Ractors — isolated execution contexts that bypass the Global VM Lock (GVL). This guide explains how to put both CPU-bound tools and multi-robot pipelines on parallel hardware threads.
4
+
5
+ ## Why Ractors?
6
+
7
+ Ruby's standard thread model is I/O-concurrent but CPU-serialized: the GVL means only one thread runs Ruby code at a time. For LLM workflows this is usually fine — robots spend most of their time waiting on the network. But some workloads benefit from real parallel execution:
8
+
9
+ - **CPU-intensive tools** — text processing, image analysis, embeddings, cryptography
10
+ - **Independent robot pipelines** — multiple robots working on unrelated subtasks simultaneously
11
+
12
+ Ractors bypass the GVL entirely. Each Ractor runs on its own OS thread with no shared mutable state, so multiple Ractors genuinely execute in parallel on multi-core hardware.
13
+
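To make the isolation model above concrete, here is a minimal plain-Ruby sketch that does not use RobotLab at all. Each Ractor receives its input as an argument, because a Ractor block cannot capture outer local variables. The `respond_to?(:value)` fallback is an assumption added for portability across Ruby versions, not something the guide prescribes.

```ruby
# Sum 1..limit inside separate Ractors. Inputs are passed as arguments
# because Ractor blocks cannot reference outer local variables.
ractors = [1_000, 2_000, 3_000].map do |n|
  Ractor.new(n) { |limit| (1..limit).sum }
end

# Newer Rubies expose Ractor#value; older ones use Ractor#take.
results = ractors.map { |r| r.respond_to?(:value) ? r.value : r.take }

p results # => [500500, 2001000, 4501500]
```

Ruby may print an "experimental" warning to stderr when the first Ractor is created.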
14
+ ## Architecture Overview
15
+
16
+ RobotLab provides two parallel tracks:
17
+
18
+ ```
19
+ ┌─────────────────────────────────────────────────────┐
20
+ │ Your Application │
21
+ ├─────────────────────────────┬───────────────────────┤
22
+ │ Track 1: CPU-bound Tools │ Track 2: Robots │
23
+ │ │ │
24
+ │ Tool#ractor_safe │ Network │
25
+ │ ↓ │ parallel_mode: :ractor│
26
+ │ RactorWorkerPool │ ↓ │
27
+ │ (N Ractor workers) │ RactorNetworkScheduler│
28
+ │ │ (N Ractor workers) │
29
+ ├─────────────────────────────┴───────────────────────┤
30
+ │ Shared Infrastructure │
31
+ │ RactorBoundary · RactorJob · RactorMemoryProxy │
32
+ └─────────────────────────────────────────────────────┘
33
+ ```
34
+
35
+ **Track 1** routes Ractor-safe tools through a global worker pool instead of calling them inline. The robot never notices — it still gets back a result string.
36
+
37
+ **Track 2** replaces the `SimpleFlow::Pipeline` executor for a network with a `RactorNetworkScheduler` that dispatches frozen robot specs to Ractor workers, respecting `depends_on` ordering.
38
+
39
+ Both tracks share the same frozen-data convention: all values crossing a Ractor boundary must be Ractor-shareable.
40
+
41
+ ---
42
+
43
+ ## Track 1: CPU-Bound Tools
44
+
45
+ ### Declaring a Tool as Ractor-Safe
46
+
47
+ Add `ractor_safe true` to any `RubyLLM::Tool` or `RobotLab::Tool` subclass:
48
+
49
+ ```ruby
50
+ class TranscribeAudio < RubyLLM::Tool
51
+ ractor_safe true
52
+
53
+ description "Transcribe an audio file to text"
54
+
55
+ param :path, type: :string, desc: "Absolute path to the audio file"
56
+ param :format, type: :string, desc: "Audio format (wav, mp3, ogg)", required: false
57
+
58
+ def execute(path:, format: "wav")
59
+ # Pure computation — no shared mutable state, no IO closures
60
+ AudioTranscriber.run(path, format: format)
61
+ end
62
+ end
63
+ ```
64
+
65
+ When a robot calls this tool, RobotLab automatically routes the call through the global `RactorWorkerPool` rather than executing it inline. The robot is unaffected — it receives the result string as normal.
66
+
67
+ `ractor_safe` is inherited. If you declare it on a base class, all subclasses are also treated as Ractor-safe:
68
+
69
+ ```ruby
70
+ class BaseAudioTool < RubyLLM::Tool
71
+ ractor_safe true
72
+ end
73
+
74
+ class TranscribeAudio < BaseAudioTool # also ractor_safe
75
+ # ...
76
+ end
77
+
78
+ class DetectLanguage < BaseAudioTool # also ractor_safe
79
+ # ...
80
+ end
81
+ ```
82
+
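One way an inheritable class-level flag like this could be implemented is sketched below. This is an illustration only, not RobotLab's actual code: the `RactorSafeFlag` module, the `ractor_safe?` query method, and the `BaseTool` / `PlainTool` classes are hypothetical names.

```ruby
# Hypothetical mixin: stores the flag on the class that declares it and
# walks up the superclass chain when a subclass has not set its own value.
module RactorSafeFlag
  def ractor_safe(value = true)
    @ractor_safe = value
  end

  def ractor_safe?
    return @ractor_safe if instance_variable_defined?(:@ractor_safe)

    superclass.respond_to?(:ractor_safe?) ? superclass.ractor_safe? : false
  end
end

class BaseTool
  extend RactorSafeFlag
end

class BaseAudioTool < BaseTool
  ractor_safe true
end

class TranscribeAudio < BaseAudioTool; end # inherits the flag
class PlainTool < BaseTool; end            # never declared it

p TranscribeAudio.ractor_safe? # => true
p PlainTool.ractor_safe?       # => false
```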
83
+ ### What Makes a Tool Ractor-Safe?
84
+
85
+ A tool is safe to run inside a Ractor when its `execute` method:
86
+
87
+ - Uses only **frozen or locally-created** objects
88
+ - Does **not** read or write class-level mutable state (class variables, module-level globals)
89
+ - Does **not** hold references to closures, Procs, or lambdas defined outside the Ractor
90
+ - Does **not** use non-Ractor-safe C extensions (most pure-Ruby code is fine)
91
+
92
+ ```ruby
93
+ # Safe: all inputs arrive as frozen args; result is fresh
94
+ class HashContent < RubyLLM::Tool
95
+ ractor_safe true
96
+ description "SHA-256 hash of a string"
97
+ param :text, type: :string, desc: "Text to hash"
98
+
99
+ def execute(text:)
100
+ require "digest"
101
+ Digest::SHA256.hexdigest(text)
102
+ end
103
+ end
104
+
105
+ # Not safe: reads and writes @@cache (shared mutable state)
106
+ class CachedLookup < RubyLLM::Tool
107
+ @@cache = {} # mutable class variable — NOT Ractor-safe
108
+
109
+ def execute(key:)
110
+ @@cache[key] ||= expensive_lookup(key)
111
+ end
112
+ end
113
+ ```
114
+
115
+ ### Configuring the Worker Pool
116
+
117
+ The global pool is created lazily on first use. You can set its size globally with `RobotLab.configure`:
118
+
119
+ ```ruby
120
+ RobotLab.configure do |config|
121
+ config.ractor_pool_size = 8 # default: Etc.nprocessors
122
+ end
123
+ ```
124
+
125
+ Or per-robot / per-network via `RunConfig`:
126
+
127
+ ```ruby
128
+ config = RobotLab::RunConfig.new(ractor_pool_size: 4)
129
+ robot = RobotLab.build(name: "cruncher", config: config, ...)
130
+ ```
131
+
132
+ Access the shared pool directly:
133
+
134
+ ```ruby
135
+ pool = RobotLab.ractor_pool # RactorWorkerPool instance
136
+ RobotLab.shutdown_ractor_pool # graceful shutdown (poison-pill pattern)
137
+ ```
138
+
139
+ ---
140
+
141
+ ## Track 2: Parallel Robot Networks
142
+
143
+ ### Enabling Ractor Mode
144
+
145
+ Pass `parallel_mode: :ractor` when creating a network:
146
+
147
+ ```ruby
148
+ network = RobotLab.create_network(name: "analysis", parallel_mode: :ractor) do
149
+ task :fetch, fetcher_robot, depends_on: :none
150
+ task :sentiment, sentiment_robot, depends_on: [:fetch]
151
+ task :entities, entity_robot, depends_on: [:fetch]
152
+ task :summarize, summary_robot, depends_on: [:sentiment, :entities]
153
+ end
154
+
155
+ result = network.run(message: "Analyze customer feedback")
156
+ ```
157
+
158
+ When `parallel_mode: :ractor` is set, `Network#run` delegates to `RactorNetworkScheduler` instead of the default `SimpleFlow::Pipeline` executor. The default is `:async` (unchanged behavior).
159
+
160
+ ### How It Works
161
+
162
+ The scheduler builds a `RobotSpec` — a frozen, Ractor-shareable description — for each robot in the network, then dispatches them in dependency order:
163
+
164
+ 1. **Partition** tasks into waves: tasks whose dependencies are all resolved are dispatched together.
165
+ 2. **Each wave** spawns one thread per task; each thread submits a `RactorJob` to the shared work queue and blocks on the per-job reply queue.
166
+ 3. **Worker Ractors** pop jobs, construct a fresh `Robot` from the spec, call `robot.run(message)`, and push the frozen result string back.
167
+ 4. **LLM calls** (ruby_llm) always happen in threads; the thread does the blocking, so network I/O is handed off naturally.
168
+
169
+ ```
170
+ Wave 1: [ fetch ]
171
+ ↓ result passed to next wave
172
+ Wave 2: [ sentiment | entities ] ← run in parallel
173
+ ↓ both results available
174
+ Wave 3: [ summarize ]
175
+ ```
176
+
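The wave partitioning in step 1 can be sketched in plain Ruby as follows. This is a simplified illustration under stated assumptions, not the scheduler's real implementation: `partition_into_waves` is a hypothetical helper, and it treats `:none` and `:optional` as "no dependencies".

```ruby
# Hypothetical helper: group task names into waves. A task joins a wave
# once every one of its dependencies appears in an earlier wave.
def partition_into_waves(deps)
  remaining = deps
    .transform_keys(&:to_s)
    .transform_values { |d| Array(d).map(&:to_s) - %w[none optional] }
  completed = []
  waves = []

  until remaining.empty?
    ready = remaining.select { |_, d| (d - completed).empty? }.keys
    if ready.empty?
      raise ArgumentError, "unresolvable dependencies: #{remaining.keys.join(', ')}"
    end

    waves << ready
    completed.concat(ready)
    ready.each { |name| remaining.delete(name) }
  end

  waves
end

waves = partition_into_waves(
  fetch: :none,
  sentiment: [:fetch],
  entities: [:fetch],
  summarize: [:sentiment, :entities]
)

p waves # => [["fetch"], ["sentiment", "entities"], ["summarize"]]
```

Raising when no task is ready guards against dependency cycles and references to unknown tasks, which would otherwise loop forever.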
177
+ The return value of `run` is a `Hash` mapping robot name strings to their result strings:
178
+
179
+ ```ruby
180
+ results = network.run(message: "Analyze this")
181
+ # => { "fetch" => "...", "sentiment" => "positive", "entities" => "...", "summarize" => "..." }
182
+ ```
183
+
184
+ ### Dependency Ordering
185
+
186
+ Dependency semantics mirror those of `SimpleFlow::Pipeline`:
187
+
188
+ | `depends_on` value | Meaning |
189
+ |---|---|
190
+ | `:none` | Entry-point task; dispatched in the first wave |
191
+ | `:optional` | Runs in the first wave (not blocked by anything) |
192
+ | `["task_a", "task_b"]` | Waits until both `task_a` and `task_b` complete |
193
+
194
+ ```ruby
195
+ RobotLab.create_network(name: "pipeline", parallel_mode: :ractor) do
196
+ task :ingest, ingester, depends_on: :none
197
+ task :classify, classifier, depends_on: ["ingest"]
198
+ task :summarize, summarizer, depends_on: ["ingest"]
199
+ task :report, reporter, depends_on: ["classify", "summarize"]
200
+ end
201
+ ```
202
+
203
+ ---
204
+
205
+ ## Shared Memory Across Ractors
206
+
207
+ Robots running in Ractor workers cannot share a standard `Memory` instance directly — it contains mutable Ruby objects. RobotLab solves this with `RactorMemoryProxy`, which wraps a `Memory` via `Ractor::Wrapper`.
208
+
209
+ You typically interact with the proxy from the thread side (before and after Ractor dispatch), not from inside workers. Workers return the frozen result string; the scheduler stores it in `completed` for subsequent waves.
210
+
211
+ For cases where you need Ractor workers to write into shared memory at runtime, use the proxy's Ractor-shareable stub:
212
+
213
+ ```ruby
214
+ memory = RobotLab::Memory.new
215
+ proxy = RobotLab::RactorMemoryProxy.new(memory)
216
+
217
+ # Pass the stub (not the proxy) into Ractor.new
218
+ Ractor.new(proxy.stub) do |mem|
219
+ mem.set(:status, "done")
220
+ mem.get(:status) # => "done"
221
+ end.value
222
+
223
+ memory.get(:status) # => "done"
224
+
225
+ proxy.shutdown
226
+ ```
227
+
228
+ Values written via `set` are automatically deep-frozen before crossing the boundary.
229
+
230
+ ---
231
+
232
+ ## The Frozen-Data Contract
233
+
234
+ Everything that crosses a Ractor boundary must be Ractor-shareable: frozen strings, deeply frozen hashes and arrays, `Data.define` instances whose members are themselves shareable, and integers/symbols/nil.
235
+
236
+ `RactorBoundary.freeze_deep` recursively freezes a nested Hash/Array structure and raises `RactorBoundaryError` if it encounters something that cannot be made shareable (like a `StringIO` or a `Proc`):
237
+
238
+ ```ruby
239
+ safe = RobotLab::RactorBoundary.freeze_deep({ key: "value", tags: ["a", "b"] })
240
+ # => { key: "value", tags: ["a", "b"] } (all frozen)
241
+
242
+ RobotLab::RactorBoundary.freeze_deep(StringIO.new)
243
+ # => raises RobotLab::RactorBoundaryError
244
+ ```
245
+
246
+ You generally do not need to call this directly — `RactorWorkerPool#submit` and `RactorMemoryProxy#set` call it for you. But it is public if you build tooling on top.
247
+
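For intuition, a recursive freeze along these lines might look like the sketch below. It is not the gem's source: `freeze_deep_sketch` and `SketchBoundaryError` are hypothetical stand-ins for `RactorBoundary.freeze_deep` and `RactorBoundaryError`, and the set of accepted scalar types is an assumption.

```ruby
# Hypothetical error class standing in for RobotLab::RactorBoundaryError.
class SketchBoundaryError < StandardError; end

# Recursively freeze Hash, Array, and String values; pass immutable
# scalars through; reject everything else.
def freeze_deep_sketch(value)
  case value
  when Hash
    frozen = value.each_with_object({}) do |(k, v), out|
      out[freeze_deep_sketch(k)] = freeze_deep_sketch(v)
    end
    frozen.freeze
  when Array
    value.map { |v| freeze_deep_sketch(v) }.freeze
  when String
    value.frozen? ? value : value.dup.freeze
  when Symbol, Integer, Float, NilClass, TrueClass, FalseClass
    value
  else
    raise SketchBoundaryError, "Cannot make value Ractor-shareable: #{value.class}"
  end
end

shallow = { key: +"value", tags: [+"a", +"b"] }.freeze
deep    = freeze_deep_sketch(shallow)

p Ractor.shareable?(shallow) # => false (Hash#freeze is shallow)
p Ractor.shareable?(deep)    # => true
```

The sketch copies containers rather than freezing the caller's objects in place, so the original data stays usable on the calling side.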
248
+ ---
249
+
250
+ ## Error Handling
251
+
252
+ ### Tool Errors
253
+
254
+ If a Ractor-safe tool raises inside a worker, the worker catches the error, wraps it in a `RactorJobError`, and sends it back through the reply queue. The pool unwraps it and re-raises as `RobotLab::ToolError`:
255
+
256
+ ```ruby
257
+ begin
258
+ pool.submit("MyTool", { input: "bad data" })
259
+ rescue RobotLab::ToolError => e
260
+ puts e.message # "Tool 'MyTool' failed in Ractor: ..."
261
+ end
262
+ ```
263
+
264
+ ### Robot Pipeline Errors
265
+
266
+ The scheduler raises `RobotLab::Error` if a robot fails inside a Ractor worker:
267
+
268
+ ```ruby
269
+ begin
270
+ network.run(message: "go")
271
+ rescue RobotLab::Error => e
272
+ puts e.message # "Robot 'summarize' failed in Ractor: ..."
273
+ end
274
+ ```
275
+
276
+ ### Boundary Errors
277
+
278
+ Passing unshareable data raises `RobotLab::RactorBoundaryError` before any Ractor is involved:
279
+
280
+ ```ruby
281
+ begin
282
+ pool.submit("MyTool", { io: StringIO.new })
283
+ rescue RobotLab::RactorBoundaryError => e
284
+ puts e.message # "Cannot make value Ractor-shareable: ..."
285
+ end
286
+ ```
287
+
288
+ ---
289
+
290
+ ## Configuration Reference
291
+
292
+ | Parameter | Where | Default | Description |
293
+ |---|---|---|---|
294
+ | `ractor_pool_size` | `RunConfig` / global config | `Etc.nprocessors` | Worker count for `RactorWorkerPool` |
295
+ | `parallel_mode` | `Network.new` | `:async` | `:async` (SimpleFlow) or `:ractor` (RactorNetworkScheduler) |
296
+
297
+ ---
298
+
299
+ ## Best Practices
300
+
301
+ ### 1. Profile Before Reaching for Ractors
302
+
303
+ Ractors add overhead: freezing data, queue coordination, thread synchronization. For fast tools or networks with few tasks, standard threads are often faster. Measure first.
304
+
305
+ ### 2. Keep Tools Stateless
306
+
307
+ The safest Ractor-safe tool is a pure function:
308
+
309
+ ```ruby
310
+ class NormalizeText < RubyLLM::Tool
311
+ ractor_safe true
312
+ description "Unicode-normalize and strip a string"
313
+ param :text, type: :string, desc: "Input text"
314
+
315
+ def execute(text:)
316
+ text.unicode_normalize(:nfkc).strip
317
+ end
318
+ end
319
+ ```
320
+
321
+ ### 3. Freeze Tool Return Values
322
+
323
+ Tool results travel back through the reply queue — freeze them proactively to avoid the overhead of `Ractor.make_shareable`:
324
+
325
+ ```ruby
326
+ def execute(id:)
327
+ { id: id, name: "result".freeze }.freeze # Hash#freeze is shallow, so freeze nested strings too
328
+ end
329
+ ```
330
+
331
+ ### 4. Parallel Mode Doesn't Share Robot Instances
332
+
333
+ Each Ractor worker constructs a **fresh Robot** from the frozen spec. Side-effects on the original robot objects (callbacks, in-memory state) are not visible inside workers. Use `Memory` (via `RactorMemoryProxy`) for shared state.
334
+
335
+ ### 5. LLM Calls Stay in Threads
336
+
337
+ `ruby_llm` is not Ractor-safe. Workers spawn a Thread internally for each LLM call and block the Ractor fiber on the thread result. This is transparent — you don't need to do anything — but it means robot-mode Ractors are I/O-concurrent, not purely CPU-parallel.
338
+
339
+ ### 6. Shut Down the Pool Cleanly
340
+
341
+ Always shut down the global pool before exiting, especially in scripts:
342
+
343
+ ```ruby
344
+ at_exit { RobotLab.shutdown_ractor_pool }
345
+ ```
346
+
347
+ ---
348
+
349
+ ## Constraints and Limitations
350
+
351
+ - **No closures across boundaries.** Procs and lambdas cannot cross Ractor boundaries. Callbacks (`on_tool_call`, `on_tool_result`) registered on the outer robot are not available inside workers.
352
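+ - **No mutable class-level state.** Ruby raises `Ractor::IsolationError` when a non-main Ractor touches class variables or most global variables, and constants referenced from `execute` must hold shareable (deeply frozen) objects.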
353
+ - **`parallel_mode: :ractor` returns a plain Hash**, not a `SimpleFlow::Result`. If downstream code depends on `result.context` or `result.value`, use `:async` mode.
354
+ - **Memory subscriptions don't transfer.** Subscriptions registered on the outer `Memory` before a Ractor dispatch are not triggered by writes made via `RactorMemoryProxy#set` inside workers during the run.
355
+ - **Ruby version.** Ractors require Ruby 3.0+. `Ractor#value` / `Ractor#join` are the supported APIs from Ruby 4.0 onwards (`Ractor#take` was removed).
356
+
357
+ ---
358
+
359
+ ## Next Steps
360
+
361
+ - [Using Tools](using-tools.md) — Tool definitions and configuration
362
+ - [Creating Networks](creating-networks.md) — Network orchestration patterns
363
+ - [Memory System](memory.md) — Shared data between robots
364
+ - [API Reference: Network](../api/core/network.md) — Complete Network API