robot_lab 0.0.8 → 0.0.11
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +71 -0
- data/README.md +106 -4
- data/Rakefile +2 -1
- data/docs/api/core/robot.md +336 -1
- data/docs/api/mcp/client.md +1 -0
- data/docs/api/mcp/server.md +27 -8
- data/docs/api/mcp/transports.md +21 -6
- data/docs/architecture/core-concepts.md +1 -1
- data/docs/architecture/robot-execution.md +20 -2
- data/docs/concepts.md +4 -0
- data/docs/guides/building-robots.md +18 -0
- data/docs/guides/creating-networks.md +39 -0
- data/docs/guides/index.md +10 -0
- data/docs/guides/knowledge.md +182 -0
- data/docs/guides/mcp-integration.md +180 -2
- data/docs/guides/memory.md +2 -0
- data/docs/guides/observability.md +486 -0
- data/docs/guides/ractor-parallelism.md +364 -0
- data/docs/superpowers/plans/2026-04-14-ractor-integration.md +1538 -0
- data/docs/superpowers/specs/2026-04-14-ractor-integration-design.md +258 -0
- data/examples/14_rusty_circuit/.gitignore +1 -0
- data/examples/14_rusty_circuit/open_mic.rb +1 -1
- data/examples/19_token_tracking.rb +128 -0
- data/examples/20_circuit_breaker.rb +153 -0
- data/examples/21_learning_loop.rb +164 -0
- data/examples/22_context_compression.rb +179 -0
- data/examples/23_convergence.rb +137 -0
- data/examples/24_structured_delegation.rb +150 -0
- data/examples/25_history_search/conversation.jsonl +30 -0
- data/examples/25_history_search.rb +136 -0
- data/examples/26_document_store/api_versioning_adr.md +52 -0
- data/examples/26_document_store/incident_postmortem.md +46 -0
- data/examples/26_document_store/postgres_runbook.md +49 -0
- data/examples/26_document_store/redis_caching_guide.md +48 -0
- data/examples/26_document_store/sidekiq_guide.md +51 -0
- data/examples/26_document_store.rb +147 -0
- data/examples/27_incident_response/incident_response.rb +244 -0
- data/examples/28_mcp_discovery.rb +112 -0
- data/examples/29_ractor_tools.rb +243 -0
- data/examples/30_ractor_network.rb +256 -0
- data/examples/README.md +136 -0
- data/examples/prompts/skill_with_mcp_test.md +9 -0
- data/examples/prompts/skill_with_robot_name_test.md +5 -0
- data/examples/prompts/skill_with_tools_test.md +6 -0
- data/lib/robot_lab/bus_poller.rb +149 -0
- data/lib/robot_lab/convergence.rb +69 -0
- data/lib/robot_lab/delegation_future.rb +93 -0
- data/lib/robot_lab/document_store.rb +155 -0
- data/lib/robot_lab/error.rb +25 -0
- data/lib/robot_lab/history_compressor.rb +205 -0
- data/lib/robot_lab/mcp/client.rb +23 -9
- data/lib/robot_lab/mcp/connection_poller.rb +187 -0
- data/lib/robot_lab/mcp/server.rb +26 -3
- data/lib/robot_lab/mcp/server_discovery.rb +110 -0
- data/lib/robot_lab/mcp/transports/base.rb +10 -2
- data/lib/robot_lab/mcp/transports/stdio.rb +58 -26
- data/lib/robot_lab/memory.rb +103 -6
- data/lib/robot_lab/network.rb +44 -9
- data/lib/robot_lab/ractor_boundary.rb +42 -0
- data/lib/robot_lab/ractor_job.rb +37 -0
- data/lib/robot_lab/ractor_memory_proxy.rb +85 -0
- data/lib/robot_lab/ractor_network_scheduler.rb +154 -0
- data/lib/robot_lab/ractor_worker_pool.rb +117 -0
- data/lib/robot_lab/robot/bus_messaging.rb +43 -65
- data/lib/robot_lab/robot/history_search.rb +69 -0
- data/lib/robot_lab/robot/mcp_management.rb +61 -4
- data/lib/robot_lab/robot.rb +351 -11
- data/lib/robot_lab/robot_result.rb +26 -5
- data/lib/robot_lab/run_config.rb +1 -1
- data/lib/robot_lab/text_analysis.rb +103 -0
- data/lib/robot_lab/tool.rb +42 -3
- data/lib/robot_lab/tool_config.rb +1 -1
- data/lib/robot_lab/version.rb +1 -1
- data/lib/robot_lab/waiter.rb +49 -29
- data/lib/robot_lab.rb +25 -0
- data/mkdocs.yml +1 -0
- metadata +71 -2
@@ -101,6 +101,26 @@ Global (RobotLab.config.mcp)

```text
-> Runtime (robot.run("msg", mcp: [...]))
```

## Timeout Configuration

All transports support a configurable request timeout. The default is 15 seconds. Set a custom timeout at the server level:

```ruby
robot = RobotLab.build(
  name: "patient_bot",
  system_prompt: "You help with slow operations.",
  mcp: [
    {
      name: "heavy_server",
      transport: { type: "stdio", command: "heavy-mcp-server" },
      timeout: 60 # seconds
    }
  ]
)
```

Values >= 1000 are auto-converted from milliseconds to seconds. The minimum timeout is 1 second.

## Transport Types

### Stdio Transport
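The conversion rule can be sketched in a few lines. This is an illustrative sketch of the behaviour described above; `normalize_timeout` is a hypothetical name, not the gem's internal method:

```ruby
# Hypothetical sketch of the timeout normalization rule: values >= 1000 are
# treated as milliseconds and converted, and the floor is 1 second.
def normalize_timeout(value)
  seconds = value >= 1000 ? value / 1000.0 : value.to_f
  [seconds, 1.0].max
end

normalize_timeout(60)     # => 60.0  (plain seconds pass through)
normalize_timeout(15_000) # => 15.0  (milliseconds converted)
normalize_timeout(0.2)    # => 1.0   (clamped to the minimum)
```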
@@ -290,6 +310,156 @@ client.list_resources # => Array of resource definitions

```ruby
client.disconnect
```

## Connection Multiplexing

When a robot connects to several local (stdio) MCP servers, each client normally blocks independently while waiting for a response. `MCP::ConnectionPoller` replaces this with a single `IO.select` call across all registered stdout file descriptors, dispatching each response to the pending request for that client.

This is primarily useful in networks where many robots each have multiple stdio MCP servers. Async-based transports (SSE, WebSocket, StreamableHTTP) are unaffected — they already use the Async fiber scheduler.

```ruby
# Create a shared poller
poller = RobotLab::MCP::ConnectionPoller.new.start

# Pass the poller when building clients
client1 = RobotLab::MCP::Client.new(
  { name: "filesystem", transport: { type: "stdio", command: "mcp-server-fs" } },
  poller: poller
)
client2 = RobotLab::MCP::Client.new(
  { name: "github", transport: { type: "stdio", command: "mcp-server-github" } },
  poller: poller
)

client1.connect # registers with poller
client2.connect # registers with poller

# Both clients share the IO.select loop
client1.list_tools
client2.list_tools

poller.stop
```

Without a shared poller, each client uses its own blocking `Timeout.timeout` call. With a poller, responses from any registered server wake the poller's select loop, which dispatches to the right waiting thread via a `Thread::Queue`.

!!! note
    Only stdio clients are registered with the poller. SSE, WebSocket, and StreamableHTTP clients passed a `poller:` argument ignore it silently.
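The select-and-dispatch idea can be illustrated with plain pipes. This is a self-contained sketch, not the real `ConnectionPoller` (which also handles MCP message framing, registration, and shutdown); the pipes stand in for server stdout streams:

```ruby
require "json"

# One loop watches many IOs and dispatches each response line to a
# per-connection queue, the way the shared poller does.
readers = {} # IO => connection name
queues  = {} # connection name => queue the waiting caller pops

2.times do |i|
  r, w = IO.pipe
  name = "server_#{i}"
  readers[r] = name
  queues[name] = Thread::Queue.new
  w.puts({ "id" => i, "result" => "ok" }.to_json) # simulate a server reply
  w.close
end

until readers.empty?
  ready, = IO.select(readers.keys, nil, nil, 1)
  break unless ready

  ready.each do |io|
    if (line = io.gets)
      queues[readers[io]] << JSON.parse(line) # wake the waiting caller
    else
      io.close
      readers.delete(io) # EOF — connection done
    end
  end
end

msg = queues["server_0"].pop
msg["result"] # => "ok"
```

A single `IO.select` over all descriptors means one blocked thread services every connection, instead of one blocked thread per connection.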
## Server Discovery

When a robot has many MCP servers configured, connecting to all of them upfront is wasteful — most servers will be irrelevant for any given user message. **Server Discovery** uses term-frequency (TF) cosine similarity to select only the semantically relevant servers before the first `ensure_mcp_clients` call.

### Enabling Discovery

Add `description:` to each server config and set `mcp_discovery: true` on the robot:

```ruby
robot = RobotLab.build(
  name: "assistant",
  system_prompt: "You are a helpful assistant.",
  mcp_discovery: true,
  mcp: [
    {
      name: "filesystem",
      description: "Read, write, and search local files and directories",
      transport: { type: "stdio", command: "mcp-server-filesystem" }
    },
    {
      name: "github",
      description: "GitHub repos, issues, pull requests, code search",
      transport: { type: "stdio", command: "mcp-server-github" }
    },
    {
      name: "brew",
      description: "Install, update, and manage macOS packages via Homebrew",
      transport: { type: "stdio", command: "mcp-server-brew" }
    }
  ]
)

# Discovery connects only :brew for this message — filesystem and github are skipped
robot.run("install imagemagick")
```

### How It Works

`MCP::ServerDiscovery.select(query, from:, threshold:)` computes TF cosine similarity between the user's query and each server's topic text (`name + description`). Servers scoring at or above `DEFAULT_THRESHOLD` (0.05) are returned; the rest are excluded.

The threshold is intentionally low — server descriptions are short, so raw cosine scores are naturally small even for on-topic queries.

Discovery only applies on the **first** `run()` call (before `@mcp_initialized`). Once a set of servers is connected, they remain connected for the robot's lifetime, preserving tool continuity across a conversation.

### Fallback Behaviour

All servers are returned unchanged when any of the following apply:

| Condition | Reason |
|-----------|--------|
| No server has a `description` field | Nothing to score against |
| `classifier` gem unavailable | Raises `DependencyError`, caught internally |
| Query is blank or nil | Nothing to compare |
| No server scores ≥ threshold | Better to fall back than to leave the robot with no tools |

### Using the API Directly

```ruby
servers = [
  { name: "filesystem", description: "Read and write files", transport: { ... } },
  { name: "github", description: "GitHub repos and PRs", transport: { ... } }
]

relevant = RobotLab::MCP::ServerDiscovery.select(
  "list open pull requests",
  from: servers,
  threshold: 0.05 # optional, default
)
# => only the :github entry
```
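The fallback rules in the table above amount to a guard around the scoring step. A minimal sketch of that shape, using a toy word-overlap scorer in place of the gem's TF cosine scoring (`select_servers` and `overlap` are hypothetical names, not the gem's internals):

```ruby
# Sketch of the fallback rules: return every server whenever scoring is
# impossible, and never return an empty selection.
def select_servers(query, servers, scorer, threshold: 0.05)
  return servers if query.to_s.strip.empty?               # blank or nil query
  return servers if servers.none? { |s| s[:description] } # nothing to score

  scored = servers.select { |s| scorer.call(query, s) >= threshold }
  scored.empty? ? servers : scored # never leave the robot with no tools
end

# Toy scorer: fraction of query words found in the server description.
overlap = lambda do |query, server|
  words = query.downcase.split
  desc  = server[:description].to_s.downcase
  words.count { |w| desc.include?(w) } / words.size.to_f
end

servers = [
  { name: "fs", description: "read and write files" },
  { name: "gh", description: "github pull requests" }
]

select_servers("list pull requests", servers, overlap) # keeps only the "gh" entry
select_servers("", servers, overlap)                   # blank query: all servers
```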
## Connection Resilience

### Eager Connection

By default, MCP connections are lazy — established on the first `run()` call. Use `connect_mcp!` to connect early:

```ruby
robot = RobotLab.build(
  name: "assistant",
  system_prompt: "You help with tasks.",
  mcp: [
    { name: "github", transport: { type: "stdio", command: "mcp-server-github" } },
    { name: "filesystem", transport: { type: "stdio", command: "mcp-server-fs" } }
  ]
)

robot.connect_mcp!

# Check which servers failed
if robot.failed_mcp_server_names.any?
  puts "Failed to connect: #{robot.failed_mcp_server_names.join(', ')}"
end
```

### Automatic Retry

Failed MCP servers are automatically retried on subsequent `run()` calls. If a server was down when the robot first connected, it will be retried transparently:

```ruby
robot.run("First message")  # github connects, filesystem fails
# ... filesystem comes back up ...
robot.run("Second message") # filesystem retried and connects
```

### Injecting External MCP Clients

Host applications that manage MCP connections externally can inject pre-connected clients into a robot:

```ruby
robot.inject_mcp!(clients: my_clients, tools: my_tools)
```

This skips the normal connection process and marks the robot as MCP-initialized.

## Error Handling

### Connection Errors

@@ -302,8 +472,16 @@ rescue RobotLab::MCPError => e

MCP connection failures are logged as warnings but do not raise errors by default. The robot will continue without MCP tools if a server is unreachable. One failing server does not prevent other servers from connecting.

### Timeout Errors

Stdio transports wrap all blocking I/O with a configurable timeout. If a server does not respond within the timeout period, an `MCPError` is raised with a descriptive message:

```ruby
# Server that takes too long will raise:
# RobotLab::MCPError: MCP server 'heavy-server' did not respond within 15s
```

## Disconnecting
data/docs/guides/memory.md
CHANGED

@@ -190,6 +190,8 @@ results = memory.get(:sentiment, :entities, :keywords, wait: 60)

```ruby
# => { sentiment: {...}, entities: [...], keywords: [...] }
```

Each blocking wait is backed by an `IO.pipe` pair (the `Waiter` class). Calling `signal` writes one byte per waiting caller, so all threads blocked on `IO.select` wake immediately. This design works cleanly with Ruby's Async fiber scheduler — no mutex contention or spurious wakeups.
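The pipe-based wakeup can be sketched in a few lines. This is a simplified illustration of the idea, not the gem's actual `Waiter` implementation:

```ruby
# Each blocked caller selects on the pipe's read end; signal writes one byte
# per registered waiter, so every IO.select call wakes at once.
class PipeWaiter
  def initialize
    @read, @write = IO.pipe
    @waiters = 0
    @lock = Mutex.new
  end

  # Returns true if signaled, false if the timeout expired.
  def wait(timeout = nil)
    @lock.synchronize { @waiters += 1 }
    ready, = IO.select([@read], nil, nil, timeout)
    @read.getbyte if ready # consume this waiter's wakeup byte
    !ready.nil?
  end

  def signal
    count = @lock.synchronize { n = @waiters; @waiters = 0; n }
    @write.write("\0" * count) if count.positive?
  end
end

w = PipeWaiter.new
t = Thread.new { w.wait(2) }
Thread.pass until t.status == "sleep" # let the waiter block in IO.select
w.signal
t.value # => true
```

Because the wakeup is an IO event rather than a condition variable, a fiber scheduler can suspend the waiting fiber instead of blocking a native thread.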
### Subscriptions

Subscribe to key changes with asynchronous callbacks:
data/docs/guides/observability.md
ADDED

@@ -0,0 +1,486 @@

# Observability & Safety

Facilities that help you monitor, control, improve, and scale robot behaviour:

- **Token & Cost Tracking** — measure LLM usage per run and cumulatively
- **Tool Loop Circuit Breaker** — guard against runaway tool call loops
- **Learning Accumulation** — build up cross-run observations that guide future runs
- **Context Window Compression** — prune irrelevant history to stay within token budgets
- **Convergence Detection** — detect when independent agents reach the same conclusion
- **Structured Delegation** — synchronous inter-robot calls with duration and token metadata

---

## Token & Cost Tracking

### Per-Run Counts

Every `robot.run()` returns a `RobotResult` that carries the token usage for that call:

```ruby
robot = RobotLab.build(
  name: "analyst",
  system_prompt: "You are a concise technical analyst.",
  model: "claude-haiku-4-5-20251001"
)

result = robot.run("What is the difference between a stack and a queue?")

puts result.input_tokens  # tokens sent to the model this run
puts result.output_tokens # tokens generated this run
puts result.input_tokens + result.output_tokens # total for this call
```

Token counts are `0` for providers that do not report usage data.

### Cumulative Totals

The robot accumulates totals across all `run()` calls:

```ruby
3.times { |i| robot.run("Question #{i + 1}") }

puts robot.total_input_tokens # sum across all three runs
puts robot.total_output_tokens
```

### Cost Estimation

Use per-provider pricing constants to estimate cost:

```ruby
HAIKU_INPUT_CPM  = 0.80 # $ per 1M input tokens
HAIKU_OUTPUT_CPM = 4.00 # $ per 1M output tokens

def run_cost(input, output)
  (input * HAIKU_INPUT_CPM + output * HAIKU_OUTPUT_CPM) / 1_000_000.0
end

result = robot.run("Explain memoization.")
puts "$#{"%.5f" % run_cost(result.input_tokens, result.output_tokens)}"
```

### Batch Accounting with reset_token_totals

`reset_token_totals` clears the accounting counters without touching the chat history. Use it to isolate the cost of a specific task batch:

```ruby
# Batch 1
prompts_batch_1.each { |p| robot.run(p) }
puts "Batch 1 cost: $#{"%.4f" % run_cost(robot.total_input_tokens, robot.total_output_tokens)}"

robot.reset_token_totals # start fresh accounting

# Batch 2 — totals start at zero, but chat history is still intact
prompts_batch_2.each { |p| robot.run(p) }
puts "Batch 2 cost: $#{"%.4f" % run_cost(robot.total_input_tokens, robot.total_output_tokens)}"
```

> **Important:** Because the chat history keeps growing after a reset, the next run's `input_tokens` will be larger than the first batch's runs. This is expected — it is the real cost of sending the full accumulated context to the API. The counter reset tracks *accounting*, not context size.

For a truly fresh context and fresh counters, build a new robot:

```ruby
fresh = RobotLab.build(
  name: "analyst",
  system_prompt: "You are a concise technical analyst."
)
result = fresh.run("Explain memoization.")
puts result.input_tokens # smallest possible — no prior history
```

---
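As a sanity check on the pricing helper, here is the arithmetic for a hypothetical run of 812 input and 94 output tokens (the helper is repeated so the example is self-contained; the token counts are made up):

```ruby
HAIKU_INPUT_CPM  = 0.80 # $ per 1M input tokens
HAIKU_OUTPUT_CPM = 4.00 # $ per 1M output tokens

def run_cost(input, output)
  (input * HAIKU_INPUT_CPM + output * HAIKU_OUTPUT_CPM) / 1_000_000.0
end

# 812 * 0.80 + 94 * 4.00 = 649.6 + 376.0 = 1025.6 micro-dollars
format("%.7f", run_cost(812, 94)) # => "0.0010256"
```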
## Tool Loop Circuit Breaker

### The Problem

When a tool always instructs the LLM to call it again (e.g., a step-processor returning "more steps remain"), the robot loops indefinitely. Without a guard this consumes tokens, API quota, and time without bound.

### max_tool_rounds

Set `max_tool_rounds:` on the robot to cap how many tool calls can happen in a single `run()`. When the limit is exceeded, `RobotLab::ToolLoopError` is raised.

```ruby
robot = RobotLab.build(
  name: "runner",
  system_prompt: "Execute every step sequentially.",
  local_tools: [StepTool],
  max_tool_rounds: 10
)

begin
  robot.run("Run all steps.")
rescue RobotLab::ToolLoopError => e
  puts "Circuit breaker fired: #{e.message}"
  # => "Circuit breaker fired: Tool call limit of 10 exceeded"
end
```

`max_tool_rounds` can also be supplied via `RunConfig`:

```ruby
config = RobotLab::RunConfig.new(max_tool_rounds: 10)
robot = RobotLab.build(name: "runner", system_prompt: "...", config: config)
```

### Recovering After ToolLoopError

After a `ToolLoopError` the chat contains a **dangling `tool_use` block** with no matching `tool_result`. Anthropic and most other providers will reject any subsequent request with that broken history:

```
Error: tool_use ids were found without tool_result blocks immediately after
```

Call `clear_messages` to flush the corrupted history before reusing the robot. The system prompt and all configuration (tools, `max_tool_rounds`, etc.) are preserved:

```ruby
begin
  robot.run("Run all steps.")
rescue RobotLab::ToolLoopError => e
  puts "Breaker fired: #{e.message}"
end

robot.clear_messages
# Robot is healthy — config unchanged
puts robot.config.max_tool_rounds # still 10

result = robot.run("Start fresh with a simple question.")
```

### Normal Tool Use Is Unaffected

`max_tool_rounds` is a safety net, not a tax. A robot that calls a tool once and terminates works identically with or without the guard:

```ruby
unguarded = RobotLab.build(
  name: "calculator",
  system_prompt: "Use the provided tool to answer questions.",
  local_tools: [DoubleTool]
)
result = unguarded.run("Double the number 21 using the tool.")
puts result.reply # "The result is 42."
```

---
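Conceptually, the breaker is just a counter wrapped around the tool-call loop. A minimal sketch of that shape (the names and loop structure here are illustrative, not the gem's internals):

```ruby
class ToolLoopError < StandardError; end

# Count tool rounds within one run; raise once the cap is exceeded.
# `next_tool_call` stands in for "does the LLM's reply request another tool?"
def run_tool_loop(max_tool_rounds:, &next_tool_call)
  rounds = 0
  while next_tool_call.call(rounds)
    rounds += 1
    raise ToolLoopError, "Tool call limit of #{max_tool_rounds} exceeded" if rounds > max_tool_rounds
    # ... execute the tool and feed the result back to the LLM ...
  end
  rounds
end

# A tool that terminates after 3 calls stays well under the cap:
run_tool_loop(max_tool_rounds: 10) { |n| n < 3 ? :step : nil } # => 3

# A tool that never terminates trips the breaker:
begin
  run_tool_loop(max_tool_rounds: 10) { :step }
rescue ToolLoopError => e
  e.message # => "Tool call limit of 10 exceeded"
end
```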
## Learning Accumulation

### The Problem

A robot's memory persists key-value data, but there is no built-in way to tell the LLM "here is what I've learned from previous interactions." Learning accumulation fills that gap.

### robot.learn

```ruby
robot.learn(text)
```

Records `text` as an observation. On every subsequent `run()`, active learnings are automatically prepended to the user message:

```
LEARNINGS FROM PREVIOUS RUNS:
- This codebase prefers map/collect over manual array accumulation
- Explicit nil comparisons appear frequently here

<original user message>
```

This gives the LLM access to prior context without requiring a persistent conversation history.

### Bidirectional Deduplication

Learnings deduplicate bidirectionally:

- If the new text is already contained in an existing learning, it is dropped.
- If an existing learning is contained in the new text (the new one is broader), the narrower one is replaced.

```ruby
robot.learn("avoid using puts")
robot.learn("avoid using puts and p in production code")

robot.learnings.size  # => 1 — broader learning replaced the narrower one
robot.learnings.first # => "avoid using puts and p in production code"
```

### Accumulated Learnings

```ruby
robot.learnings # => Array<String>
```

Returns the current list of active learnings in insertion order.

### Full Example

```ruby
reviewer = RobotLab.build(
  name: "reviewer",
  system_prompt: <<~PROMPT
    You are a concise Ruby code reviewer.
    Identify the main issue in one sentence and show the fix.
  PROMPT
)

snippets = [snippet_a, snippet_b, snippet_c]
insights = [
  "This codebase prefers map/collect over manual accumulation",
  "Explicit nil comparisons appear frequently",
  "Cart logic tends to have missing edge cases around nil discounts"
]

snippets.each_with_index do |code, i|
  result = reviewer.run("Review this snippet:\n\n#{code}")
  puts result.reply

  reviewer.learn(insights[i])
  puts "Added learning ##{reviewer.learnings.size}"
end
```

After all three runs, `reviewer.learnings` contains up to three insights (fewer if any are subsets of others).

### Memory Persistence

Learnings are stored in `memory[:learnings]`. They survive a robot rebuild when the same `Memory` object is passed to the new robot:

```ruby
shared_memory = original_robot.memory

rebuilt = RobotLab.build(
  name: "reviewer",
  system_prompt: "You review code."
)
rebuilt.instance_variable_set(:@memory, shared_memory)
persisted = shared_memory.get(:learnings)
rebuilt.instance_variable_set(:@learnings, Array(persisted))

puts rebuilt.learnings.size # same as original_robot.learnings.size
```

---
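The bidirectional rule above can be modeled as a substring containment check. A simplified sketch (the real implementation may normalize text before comparing; `add_learning` is a hypothetical name):

```ruby
# Simplified model of bidirectional deduplication via substring containment.
def add_learning(learnings, text)
  # New text already covered by an existing, broader learning: drop it.
  return learnings if learnings.any? { |l| l.include?(text) }

  # New text is broader: replace any narrower learnings it contains.
  learnings.reject { |l| text.include?(l) } + [text]
end

ls = []
ls = add_learning(ls, "avoid using puts")
ls = add_learning(ls, "avoid using puts and p in production code")
ls # => ["avoid using puts and p in production code"]
```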
## Context Window Compression

### The Problem

Long conversations accumulate turns that are no longer relevant to the current topic. Sending all of them to the LLM on every `run()` wastes tokens and money, and risks exceeding the model's context window.

### robot.compress_history

```ruby
robot.compress_history(
  recent_turns: 3,     # last N user+assistant pairs — always protected
  keep_threshold: 0.6, # score >= this → keep verbatim
  drop_threshold: 0.2, # score < this → drop
  summarizer: nil      # optional lambda(text) -> String for medium tier
)
```

Internally, each old turn is scored against the mean of the recent turns using stemmed term-frequency cosine similarity (via the `classifier` gem). Turns that score high are kept; turns that score low are dropped; turns in the middle band are either summarized or dropped, depending on whether a `summarizer` is provided.

**Always preserved regardless of score:**

- System messages
- Tool call/result message pairs
- All messages within the `recent_turns` window

### Thresholds

```
score >= keep_threshold → keep verbatim
score <  drop_threshold → drop
otherwise               → summarize (if summarizer given) or drop
```

A good starting point: `keep_threshold: 0.6, drop_threshold: 0.2`. Widen the drop band (raise `drop_threshold`) to compress more aggressively; raise `keep_threshold` to summarize more.

### Without a Summarizer (Drop Mode)

```ruby
robot.compress_history(recent_turns: 3, keep_threshold: 0.6, drop_threshold: 0.2)
```

Medium-relevance turns are dropped along with low-relevance ones. This is the simplest form — no extra LLM calls, no added latency.

### With an LLM Summarizer

```ruby
summarizer_bot = RobotLab.build(
  name: "summarizer",
  system_prompt: "Summarize the following text in one sentence."
)

robot.compress_history(
  recent_turns: 3,
  keep_threshold: 0.6,
  drop_threshold: 0.2,
  summarizer: ->(text) { summarizer_bot.run("Summarize: #{text}").reply }
)
```

The summarizer replaces each medium-relevance turn with a one-sentence digest, preserving some context while reducing token count. The summary inherits the **original message's role** so the user/assistant alternation required by LLM APIs is maintained.

### Optional Dependency

`compress_history` requires the `classifier` gem. Add it to your Gemfile:

```ruby
gem "classifier", "~> 2.3"
```

Without it, calling `compress_history` raises `RobotLab::DependencyError` with an install hint.

---
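The three-way tiering from the Thresholds section reduces to a small decision function. A sketch under the stated rules (`tier_for` is a hypothetical name):

```ruby
# Three-way tiering: keep high-relevance turns, drop low-relevance ones,
# and summarize the middle band only when a summarizer is available.
def tier_for(score, keep_threshold:, drop_threshold:, summarizer: nil)
  return :keep if score >= keep_threshold
  return :drop if score < drop_threshold
  summarizer ? :summarize : :drop
end

digest = ->(text) { text[0, 40] } # stand-in summarizer

tier_for(0.70, keep_threshold: 0.6, drop_threshold: 0.2)                     # => :keep
tier_for(0.40, keep_threshold: 0.6, drop_threshold: 0.2)                     # => :drop
tier_for(0.40, keep_threshold: 0.6, drop_threshold: 0.2, summarizer: digest) # => :summarize
tier_for(0.05, keep_threshold: 0.6, drop_threshold: 0.2, summarizer: digest) # => :drop
```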
## Convergence Detection

### The Problem

Multi-robot verification patterns (two independent reviewers, a debate network, a fact-checker) typically ask a reconciler robot to resolve any differences. But when both verifiers already agree, paying for that reconciler call is pure waste.

### RobotLab::Convergence

```ruby
score  = RobotLab::Convergence.similarity(text_a, text_b) # Float 0.0..1.0
agreed = RobotLab::Convergence.detected?(text_a, text_b)  # Boolean (threshold: 0.85)
agreed = RobotLab::Convergence.detected?(text_a, text_b, threshold: 0.6)
```

Similarity is computed via L2-normalized stemmed term-frequency cosine similarity. Term frequencies (not TF-IDF) are used because fitting TF-IDF on a 2-document corpus suppresses shared terms to near-zero IDF, giving counter-intuitively low scores for texts that agree on the same topic.

Texts shorter than 30 characters always return `0.0`.

### Typical Scores

| Relationship | Typical Score |
|---|---|
| Identical | 1.000 |
| Same conclusion, different phrasing | 0.60 – 0.75 |
| Same topic, different emphasis | 0.45 – 0.60 |
| Unrelated | < 0.15 |

### Router Fast-Path Pattern

Skip the reconciler when verifiers agree:

```ruby
router = ->(args) do
  a = args.context[:verifier_a]&.reply.to_s
  b = args.context[:verifier_b]&.reply.to_s

  if RobotLab::Convergence.detected?(a, b)
    nil            # both agree — network halts, no reconciler call
  else
    ["reconciler"] # diverged — send to reconciler
  end
end

network = RobotLab.create_network(
  name: "fact_check",
  robots: [verifier_a, verifier_b, reconciler],
  router: router
)
```

Tune `threshold:` to control how strictly "agreement" is defined. A lower threshold (e.g., `0.6`) accepts more variation between verifiers; a higher threshold (e.g., `0.9`) only fast-paths near-identical responses.

### Optional Dependency

`RobotLab::Convergence` requires the `classifier` gem (same as `compress_history`):

```ruby
gem "classifier", "~> 2.3"
```

---
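The similarity measure can be illustrated without the `classifier` gem: build term-frequency vectors, L2-normalize, and take the dot product. A simplified sketch that skips stemming (so its scores will differ somewhat from the gem's):

```ruby
# Simplified TF cosine similarity with the short-text guard described above.
def tf_cosine(a, b)
  return 0.0 if a.length < 30 || b.length < 30 # too short to compare reliably

  counts = [a, b].map { |t| t.downcase.scan(/[a-z']+/).tally }
  terms  = counts.flat_map(&:keys).uniq

  va, vb = counts.map { |c| terms.map { |t| c.fetch(t, 0).to_f } }
  norm = ->(v) { Math.sqrt(v.sum { |x| x * x }) }

  dot   = va.zip(vb).sum { |x, y| x * y }
  denom = norm.call(va) * norm.call(vb)
  denom.zero? ? 0.0 : dot / denom
end

same = "The outage was caused by an expired TLS certificate on the load balancer."

tf_cosine(same, same) # identical texts score ~1.0
tf_cosine(same, "Cats enjoy sleeping in warm sunny spots most of the afternoon today.") # unrelated: low
```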
## Structured Delegation

### The Problem

RobotLab has two existing patterns for one robot to involve another:

- **Pipelines** — predefined sequences where robots share memory and run in order
- **Bus messaging** — fire-and-forget pub/sub with no return value

Neither gives you a synchronous call that returns a result with provenance and cost metadata. `delegate` fills that gap.

### Synchronous Delegation

Blocks until the delegatee finishes and returns a `RobotResult` annotated with provenance and timing:

```ruby
result = manager.delegate(to: specialist, task: "Analyze this data: ...")

puts result.reply         # specialist's answer
puts result.robot_name    # => "specialist" (who did the work)
puts result.delegated_by  # => "manager" (who asked)
puts result.duration      # => 1.43 (wall-clock seconds)
puts result.input_tokens  # => 812
puts result.output_tokens # => 94
```

All keyword arguments are forwarded to the delegatee's `run()`:

```ruby
result = manager.delegate(to: worker, task: "hello", company_name: "Acme")
```

### Asynchronous Delegation — Parallel Fan-Out

Pass `async: true` to get a `DelegationFuture` back immediately. The delegatee runs in a background thread. Call `future.value` to block for the result, or `future.resolved?` to poll without blocking.

```ruby
# Fire both delegations simultaneously
f1 = manager.delegate(to: summarizer, task: "Summarize: #{doc}", async: true)
f2 = manager.delegate(to: analyst, task: "Key metric: #{doc}", async: true)

# Both are running in parallel here
puts f1.resolved? # false (probably)

# Collect when ready (optional timeout in seconds)
summary  = f1.value(timeout: 30)
analysis = f2.value(timeout: 30)
```

If the delegatee raises an error, `future.value` re-raises it. If `timeout:` expires before the result arrives, `DelegationFuture::DelegationTimeout` is raised.

### When to Use Each Pattern

| Pattern | Return value | Concurrent | Use when |
|---|---|---|---|
| `pipeline` | shared memory | yes (parallel groups) | fixed workflow graph |
| `bus` messaging | none (fire-and-forget) | yes | notify without waiting for a reply |
| `delegate` | `RobotResult` with metadata | no | need the result back, one at a time |
| `delegate(async: true)` | `DelegationFuture` | yes | parallel fan-out, collect results later |

### Full Example

```ruby
manager    = RobotLab.build(name: "manager", system_prompt: "You are a project manager.")
summarizer = RobotLab.build(name: "summarizer", system_prompt: "Summarize in 1-2 sentences.")
analyst    = RobotLab.build(name: "analyst", system_prompt: "Identify the key metric.")

# Parallel fan-out
f1 = manager.delegate(to: summarizer, task: "Summarize: #{document}", async: true)
f2 = manager.delegate(to: analyst, task: "Key metric: #{document}", async: true)

summary  = f1.value(timeout: 60)
analysis = f2.value(timeout: 60)

puts "#{summary.robot_name} (#{summary.duration.round(2)}s): #{summary.reply}"
puts "#{analysis.robot_name} (#{analysis.duration.round(2)}s): #{analysis.reply}"
```

---
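A future of this shape can be modeled with a background thread. This is an illustrative sketch of the semantics (blocking `value`, timeout, error re-raise), not the gem's `DelegationFuture` implementation:

```ruby
# Minimal future: run work in a background thread; `value` blocks with an
# optional timeout and re-raises anything the work raised.
class MiniFuture
  class Timeout < StandardError; end

  def initialize(&work)
    @thread = Thread.new do
      Thread.current.report_on_exception = false
      work.call
    end
  end

  def resolved?
    !@thread.alive?
  end

  def value(timeout: nil)
    # Thread#join returns nil when the timeout expires before completion.
    raise Timeout, "no result within #{timeout}s" unless @thread.join(timeout)
    @thread.value # returns the block's result, or re-raises its exception
  end
end

f = MiniFuture.new { 21 * 2 }
f.value # => 42
```

`Thread#value` gives the error re-raise behaviour for free, which is why a thread-per-delegation design keeps the future itself so small.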
## See Also

- [Robot API](../api/core/robot.md#token--cost-tracking)
- [Example 19 — Token & Cost Tracking](../../examples/19_token_tracking.rb)
- [Example 20 — Tool Loop Circuit Breaker](../../examples/20_circuit_breaker.rb)
- [Example 21 — Learning Accumulation Loop](../../examples/21_learning_loop.rb)
- [Example 22 — Context Window Compression](../../examples/22_context_compression.rb)
- [Example 23 — Convergence Detection](../../examples/23_convergence.rb)
- [Example 24 — Structured Delegation](../../examples/24_structured_delegation.rb)