robot_lab 0.0.9 → 0.0.12

Files changed (70)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +53 -0
  3. data/README.md +210 -1
  4. data/Rakefile +2 -1
  5. data/docs/api/core/result.md +123 -0
  6. data/docs/api/core/robot.md +182 -0
  7. data/docs/api/errors.md +185 -0
  8. data/docs/guides/building-robots.md +125 -0
  9. data/docs/guides/creating-networks.md +21 -0
  10. data/docs/guides/index.md +10 -0
  11. data/docs/guides/knowledge.md +182 -0
  12. data/docs/guides/mcp-integration.md +106 -0
  13. data/docs/guides/memory.md +2 -0
  14. data/docs/guides/observability.md +486 -0
  15. data/docs/guides/ractor-parallelism.md +364 -0
  16. data/docs/superpowers/plans/2026-04-14-ractor-integration.md +1538 -0
  17. data/docs/superpowers/specs/2026-04-14-ractor-integration-design.md +258 -0
  18. data/examples/19_token_tracking.rb +128 -0
  19. data/examples/20_circuit_breaker.rb +153 -0
  20. data/examples/21_learning_loop.rb +164 -0
  21. data/examples/22_context_compression.rb +179 -0
  22. data/examples/23_convergence.rb +137 -0
  23. data/examples/24_structured_delegation.rb +150 -0
  24. data/examples/25_history_search/conversation.jsonl +30 -0
  25. data/examples/25_history_search.rb +136 -0
  26. data/examples/26_document_store/api_versioning_adr.md +52 -0
  27. data/examples/26_document_store/incident_postmortem.md +46 -0
  28. data/examples/26_document_store/postgres_runbook.md +49 -0
  29. data/examples/26_document_store/redis_caching_guide.md +48 -0
  30. data/examples/26_document_store/sidekiq_guide.md +51 -0
  31. data/examples/26_document_store.rb +147 -0
  32. data/examples/27_incident_response/incident_response.rb +244 -0
  33. data/examples/28_mcp_discovery.rb +112 -0
  34. data/examples/29_ractor_tools.rb +243 -0
  35. data/examples/30_ractor_network.rb +256 -0
  36. data/examples/README.md +136 -0
  37. data/examples/prompts/skill_with_mcp_test.md +9 -0
  38. data/examples/prompts/skill_with_robot_name_test.md +5 -0
  39. data/examples/prompts/skill_with_tools_test.md +6 -0
  40. data/lib/robot_lab/bus_poller.rb +149 -0
  41. data/lib/robot_lab/convergence.rb +69 -0
  42. data/lib/robot_lab/delegation_future.rb +93 -0
  43. data/lib/robot_lab/document_store.rb +155 -0
  44. data/lib/robot_lab/error.rb +25 -0
  45. data/lib/robot_lab/history_compressor.rb +205 -0
  46. data/lib/robot_lab/mcp/client.rb +17 -5
  47. data/lib/robot_lab/mcp/connection_poller.rb +187 -0
  48. data/lib/robot_lab/mcp/server.rb +7 -2
  49. data/lib/robot_lab/mcp/server_discovery.rb +110 -0
  50. data/lib/robot_lab/mcp/transports/stdio.rb +6 -0
  51. data/lib/robot_lab/memory.rb +103 -6
  52. data/lib/robot_lab/network.rb +44 -9
  53. data/lib/robot_lab/ractor_boundary.rb +42 -0
  54. data/lib/robot_lab/ractor_job.rb +37 -0
  55. data/lib/robot_lab/ractor_memory_proxy.rb +85 -0
  56. data/lib/robot_lab/ractor_network_scheduler.rb +154 -0
  57. data/lib/robot_lab/ractor_worker_pool.rb +117 -0
  58. data/lib/robot_lab/robot/bus_messaging.rb +43 -65
  59. data/lib/robot_lab/robot/history_search.rb +69 -0
  60. data/lib/robot_lab/robot.rb +228 -11
  61. data/lib/robot_lab/robot_result.rb +24 -5
  62. data/lib/robot_lab/run_config.rb +1 -1
  63. data/lib/robot_lab/text_analysis.rb +103 -0
  64. data/lib/robot_lab/tool.rb +42 -3
  65. data/lib/robot_lab/tool_config.rb +1 -1
  66. data/lib/robot_lab/version.rb +1 -1
  67. data/lib/robot_lab/waiter.rb +49 -29
  68. data/lib/robot_lab.rb +25 -0
  69. data/mkdocs.yml +1 -0
  70. metadata +72 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: c852fcf7f4aed4ce95fabdc5b0296723ca8aa10e780dabaa7759e618a22bc640
4
- data.tar.gz: 1bcb205c958ede9967886dae78a1d1a6d47da42e4cd9bd29d7bdd3e094b0a088
3
+ metadata.gz: 3a8ae2e2cf690116950548d732987e16756870f8444c91504ea14fe039f25996
4
+ data.tar.gz: 115694d1449233b3a17a28e87deda8bd3d0ac204f51301aee7781156a3b2003e
5
5
  SHA512:
6
- metadata.gz: 5620e7798ac04441cb23c6a7cc5f0cdad7447103825db35ef6f3a3987785b8ff5fb355ec03a309ef9c8a5ce5b0b7a29d9f5adef0e6a5d9de5cd66d3c94fb0469
7
- data.tar.gz: 9300b1f5ed98e70226c7c670bcf2e3dee033310db6b2182b2705085f02474a1ea6157a011c93906da1d45ba38b4c9f8b9e62545cdb5fd304ca1550734f7dc043
6
+ metadata.gz: d512eea2ce533c92b4f791c0d3527fe61805fdca2638e926c0869e0e8f5b0c9a9dc5bac0791db2e5bae8a326b46eaea31c5ffae9ba761b17d1b93b3113735087
7
+ data.tar.gz: 7e6025d5bbe7252e61e4d7922eea639cda523d7c197535e46400caeeec7f30e7554a015cada107eef796a47c10564d286cdb1d2d4b539a7ac51cc71975b65352
data/CHANGELOG.md CHANGED
@@ -8,6 +8,59 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
8
8
 
9
9
  ## [Unreleased]
10
10
 
11
+ ## [0.0.12] - 2026-04-18
12
+
13
+ ### Added
14
+
15
+ - **README: Context Window Compression section** — documents `robot.compress_history` with threshold tuning (`recent_turns`, `keep_threshold`, `drop_threshold`) and summarizer lambda pattern
16
+ - **README: Convergence Detection section** — documents `RobotLab::Convergence.detected?` / `.similarity` with network router fast-path example
17
+ - **README: Structured Delegation section** — documents `robot.delegate(to:, task:)` sync and async modes, `DelegationFuture` fan-out pattern, and timeout handling
18
+ - **README: Ractor Parallelism section** — documents `ractor_safe true` tool macro and `parallel_mode: :ractor` network mode with link to full guide
19
+ - **`docs/guides/building-robots.md`** — added matching sections for all four features above with expanded API detail, `DelegationFuture` method table, and convergence router example
20
+ - **`docs/api/core/result.md`** — new API reference for `RobotResult`: attributes, token tracking, delegation metadata, persistence (`export`, `from_hash`, `checksum`), and debug fields
21
+ - **`docs/api/errors.md`** — new error hierarchy reference covering all `RobotLab::Error` subclasses (`ConfigurationError`, `DependencyError`, `InferenceError`, `ToolLoopError`, `ToolNotFoundError`, `MCPError`, `BusError`, `RactorBoundaryError`, `ToolError`, `DelegationFuture::DelegationTimeout`) with rescue examples
22
+
23
+ ### Changed
24
+
25
+ - Bumped version to 0.0.12
26
+ - Updated `bigdecimal` to 4.1.2
27
+ - Updated `protocol-http` to 0.62.2
28
+ - Updated `protocol-websocket` to 0.21.0
29
+ - Updated `rake` to 13.4.2
30
+ - Updated `sqlite3` to 2.9.3
31
+
32
+ ## [0.0.11] - 2026-04-14
33
+
34
+ ### Added
35
+
36
+ - **Ractor parallelism — Track 1: CPU-bound tools** (`RactorWorkerPool`)
37
+ - `ractor_safe true` class macro on `Tool` — opts a tool class into Ractor execution; subclasses inherit automatically
38
+ - `RobotLab.ractor_pool` — global `RactorWorkerPool` singleton, one Ractor worker per CPU core by default
39
+ - `ractor_pool_size` field on `RunConfig` for configuring pool capacity
40
+ - `RactorWorkerPool#submit(tool_name, args)` — submits a job and blocks for the frozen result; raises `ToolError` on failure
41
+ - Tool dispatch routes `ractor_safe` tools through the pool automatically, bypassing the GVL for CPU-intensive work
42
+ - `RactorBoundary.freeze_deep(obj)` — deep-freezes nested hashes/arrays/strings to make them Ractor-shareable; raises `RactorBoundaryError` for non-shareable objects (Procs, IOs, etc.)
43
+ - **Ractor parallelism — Track 2: parallel robot pipelines** (`RactorNetworkScheduler`)
44
+ - `parallel_mode: :ractor` on `Network.new` — routes `network.run` through `RactorNetworkScheduler` instead of `SimpleFlow::Pipeline`
45
+ - `RactorNetworkScheduler` dispatches dependency waves: independent tasks run concurrently (one Thread per task); dependent tasks wait for their wave to complete
46
+ - `RobotSpec` — frozen `Data.define` descriptor carrying robot name, template, system prompt, and config; safely crosses Ractor boundaries
47
+ - `RactorNetworkScheduler#run_pipeline` returns `Hash { robot_name => result_string }` for the full pipeline
48
+ - `RactorNetworkScheduler#run_spec` for single-spec dispatch
49
+ - `RactorNetworkScheduler#shutdown` for graceful poison-pill cleanup
50
+ - `network.parallel_mode` reader exposes the configured mode (default `:async`)
51
+ - **Ractor memory proxy** — `RactorMemoryProxy` wraps `Memory` via `ractor-wrapper` for safe cross-Ractor memory access
52
+ - **Infrastructure data classes** — `RactorJob`, `RactorJobError` (`Data.define` structs) for job submission and error propagation across Ractor boundaries
53
+ - **`RactorBoundaryError`** — raised by `freeze_deep` when a non-shareable value (Proc, IO, etc.) would cross a Ractor boundary
54
+ - **`ToolError`** — raised by `RactorWorkerPool#submit` when a tool raises inside a Ractor; propagates message and frozen backtrace
55
+ - **Dependencies** — `ractor_queue` (~> 0.1) and `ractor-wrapper` (~> 0.4) added to gemspec
56
+ - **Ractor Parallelism guide** (`docs/guides/ractor-parallelism.md`) — covers architecture, two-track design, configuration, error handling, constraints, and best practices
57
+ - **Example 29: Ractor-Safe CPU Tools** (`examples/29_ractor_tools.rb`) — demonstrates `ractor_safe` flag, inheritance, `freeze_deep`, pool submissions, `ToolError` propagation, and parallel batch timing; no API key required
58
+ - **Example 30: Ractor Network Scheduler** (`examples/30_ractor_network.rb`) — demonstrates `RactorNetworkScheduler` wave ordering with simulated latencies, `Network.new(parallel_mode: :ractor)` API, and dependency graph inspection; no API key required for Parts 1 & 2
59
+
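The `freeze_deep` entry above describes recursive freezing with a hard failure for values that can never be shared. The following is a minimal sketch of that idea, not the gem's code: `BoundaryError` is a hypothetical stand-in for `RactorBoundaryError`, and the set of rejected classes (`Proc`, `IO`) is taken only from the examples named in the changelog.

```ruby
# Hypothetical stand-in for the gem's RactorBoundaryError.
class BoundaryError < StandardError; end

# Recursively freeze hashes, arrays, and strings. Values that can never be
# made shareable (Procs, IO objects) raise instead of being frozen.
def freeze_deep(obj)
  case obj
  when Proc, IO
    raise BoundaryError, "cannot pass #{obj.class} across a Ractor boundary"
  when Hash
    obj.each do |key, value|
      freeze_deep(key)
      freeze_deep(value)
    end
  when Array
    obj.each { |item| freeze_deep(item) }
  end
  obj.freeze
end

payload = freeze_deep({ "path" => "audio.wav", "tags" => ["a", "b"] })
puts Ractor.shareable?(payload)
```

Raising early, rather than silently skipping a Proc, keeps the failure close to the call site that built the payload.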
60
+ ### Fixed
61
+
62
+ - `ToolConfig::NONE_VALUES` constant was not Ractor-shareable because its inner empty array `[]` was mutable; fixed by replacing `[]` with `[].freeze` so the entire constant is deeply frozen and safe to read from any Ractor
63
+
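The fix above hinges on how Ruby decides shareability: freezing the outer container is not enough while any nested object stays mutable. A small sketch with hypothetical constants (the real contents of `ToolConfig::NONE_VALUES` are not shown in this diff):

```ruby
# Hypothetical constants for illustration; not the gem's actual values.
SHALLOW = { none: [] }.freeze         # outer hash frozen, inner array mutable
DEEP    = { none: [].freeze }.freeze  # every nested object frozen

puts Ractor.shareable?(SHALLOW)
puts Ractor.shareable?(DEEP)
```

`Ractor.shareable?` walks the whole object graph, so it only reports `true` for the second constant.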
11
64
  ## [0.0.9] - 2026-03-02
12
65
 
13
66
  ### Added
data/README.md CHANGED
@@ -26,7 +26,13 @@
26
26
  - <strong>Message Bus</strong> - Bidirectional robot communication via TypedBus<br>
27
27
  - <strong>Dynamic Spawning</strong> - Robots create new robots at runtime<br>
28
28
  - <strong>Layered Configuration</strong> - Cascading YAML, env vars, and RunConfig<br>
29
- - <strong>Rails Integration</strong> - Generators, background jobs, Turbo Stream broadcasting
29
+ - <strong>Rails Integration</strong> - Generators, background jobs, Turbo Stream broadcasting<br>
30
+ - <strong>Token &amp; Cost Tracking</strong> - Per-run and cumulative token counts on every robot<br>
31
+ - <strong>Tool Loop Circuit Breaker</strong> - <code>max_tool_rounds:</code> guards against runaway tool call loops<br>
32
+ - <strong>Learning Accumulation</strong> - <code>robot.learn()</code> builds up cross-run observations with deduplication<br>
33
+ - <strong>Context Window Compression</strong> - <code>robot.compress_history()</code> prunes irrelevant old turns via TF cosine scoring<br>
34
+ - <strong>Convergence Detection</strong> - <code>RobotLab::Convergence</code> detects when independent agents agree, enabling reconciler fast-path<br>
35
+ - <strong>Structured Delegation</strong> - <code>robot.delegate(to:, task:)</code> sync or async inter-robot calls with duration and token metadata; async fan-out via <code>DelegationFuture</code>
30
36
  </td>
31
37
  </tr>
32
38
  </table>
@@ -621,6 +627,209 @@ robot.run("Tell me a story") { |chunk| stream_to_client(chunk.content) }
621
627
 
622
628
  The `on_content:` callback participates in the RunConfig cascade, so it can be set at the network or config level and inherited by robots.
623
629
 
630
+ ## Token & Cost Tracking
631
+
632
+ Every `robot.run()` returns a `RobotResult` that carries token usage for that call. The robot itself accumulates running totals across all runs.
633
+
634
+ ```ruby
635
+ robot = RobotLab.build(name: "analyst", system_prompt: "You are helpful.")
636
+
637
+ result = robot.run("What is a stack?")
638
+ puts result.input_tokens # tokens sent to the LLM this run
639
+ puts result.output_tokens # tokens generated this run
640
+
641
+ puts robot.total_input_tokens # cumulative across all runs
642
+ puts robot.total_output_tokens
643
+ ```
644
+
645
+ To start a fresh cost batch without rebuilding the robot, call `reset_token_totals`. This resets the **accounting counter only** — the chat history keeps accumulating, so subsequent `input_tokens` will reflect the full context window sent to the API:
646
+
647
+ ```ruby
648
+ robot.reset_token_totals
649
+ puts robot.total_input_tokens # => 0
650
+ ```
651
+
652
+ Token counts are zero for providers that do not return usage data.
653
+
654
+ ## Tool Loop Circuit Breaker
655
+
656
+ Set `max_tool_rounds:` to prevent a robot from looping indefinitely through tool calls. When the limit is exceeded, `RobotLab::ToolLoopError` is raised.
657
+
658
+ ```ruby
659
+ robot = RobotLab.build(
660
+ name: "runner",
661
+ system_prompt: "Execute every step.",
662
+ local_tools: [StepTool],
663
+ max_tool_rounds: 10
664
+ )
665
+
666
+ begin
667
+ robot.run("Run all steps.")
668
+ rescue RobotLab::ToolLoopError => e
669
+ puts e.message # "Tool call limit of 10 exceeded"
670
+ end
671
+ ```
672
+
673
+ After a `ToolLoopError` the chat contains a dangling `tool_use` block with no matching `tool_result`. Most providers (including Anthropic) will reject any subsequent request with that history. Call `clear_messages` before reusing the robot:
674
+
675
+ ```ruby
676
+ robot.clear_messages # flushes broken history; system prompt is kept
677
+ result = robot.run("Something new.") # robot is healthy again
678
+ ```
679
+
680
+ ## Learning Accumulation
681
+
682
+ `robot.learn(text)` records a cross-run observation. On each subsequent `run()`, active learnings are automatically prepended to the user message as a `LEARNINGS FROM PREVIOUS RUNS:` block so the LLM can incorporate prior context without needing a persistent chat:
683
+
684
+ ```ruby
685
+ reviewer = RobotLab.build(
686
+ name: "reviewer",
687
+ system_prompt: "You are a Ruby code reviewer."
688
+ )
689
+
690
+ reviewer.run("Review snippet A")
691
+ reviewer.learn("This codebase prefers map/collect over manual array accumulation")
692
+
693
+ reviewer.run("Review snippet B") # learning is injected automatically
694
+ ```
695
+
696
+ Learnings deduplicate bidirectionally: a new learning that is already contained in an existing broader one is not stored, and a new broader learning that contains an existing narrower one replaces it. Learnings are persisted to the robot's `Memory` and survive a robot rebuild when the same `Memory` object is reused.
697
+
698
+ ```ruby
699
+ reviewer.learnings # => ["This codebase prefers map/collect..."]
700
+ reviewer.learn("new fact") # deduplicates before storing
701
+ ```
702
+
703
+ ## Context Window Compression
704
+
705
+ `robot.compress_history` prunes old conversation turns using TF-IDF cosine similarity, keeping only turns that are relevant to the most recent context. System messages and tool call/result pairs are always preserved.
706
+
707
+ ```ruby
708
+ # Basic compression: protect the 3 most recent turns, drop unrelated old turns
709
+ robot.compress_history
710
+
711
+ # Tune the thresholds
712
+ robot.compress_history(
713
+ recent_turns: 5, # protect this many recent user+assistant pairs
714
+ keep_threshold: 0.6, # turns scoring >= this are kept verbatim
715
+ drop_threshold: 0.2 # turns scoring < this are dropped
716
+ )
717
+
718
+ # Summarize medium-relevance turns instead of dropping them
719
+ summarizer_bot = RobotLab.build(name: "summarizer", system_prompt: "Summarize concisely.")
720
+ robot.compress_history(
721
+ summarizer: ->(text) { summarizer_bot.run("One sentence: #{text}").reply }
722
+ )
723
+ ```
724
+
725
+ Requires the optional `classifier` gem (`~> 2.3`). Add it to your Gemfile:
726
+
727
+ ```ruby
728
+ gem "classifier", "~> 2.3"
729
+ ```
730
+
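The two thresholds split old turns into three bands. The sketch below shows only that triage step, under stated assumptions: scores are hard-coded here, whereas the gem derives them from the conversation, and the `0.6` / `0.2` defaults are simply the example values from the snippet above, not confirmed library defaults. What happens to the middle band when no summarizer is given is not specified in this section.

```ruby
# Illustrative three-way triage by relevance score.
# `scored_turns` is a hypothetical array of { text:, score: } hashes.
def triage(scored_turns, keep_threshold: 0.6, drop_threshold: 0.2)
  scored_turns.group_by do |turn|
    if turn[:score] >= keep_threshold
      :keep        # relevant enough to keep verbatim
    elsif turn[:score] < drop_threshold
      :drop        # unrelated to the recent context
    else
      :summarize   # middle band: candidate for the summarizer lambda
    end
  end
end

buckets = triage([
  { text: "schema design notes", score: 0.9 },
  { text: "tangent about logging", score: 0.4 },
  { text: "small talk", score: 0.1 }
])
buckets.each { |band, turns| puts "#{band}: #{turns.map { |t| t[:text] }.join(', ')}" }
```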
731
+ ## Convergence Detection
732
+
733
+ `RobotLab::Convergence` detects when two independent agents have reached the same conclusion using TF-IDF cosine similarity. Use it as a router fast-path to skip an expensive reconciler LLM call when verifiers already agree.
734
+
735
+ ```ruby
736
+ # Check similarity directly
737
+ score = RobotLab::Convergence.similarity(result_a.reply, result_b.reply)
738
+ # => 0.92
739
+
740
+ # Boolean check against a threshold (default: 0.85)
741
+ RobotLab::Convergence.detected?(result_a.reply, result_b.reply)
742
+ # => true
743
+
744
+ # Use a custom threshold
745
+ RobotLab::Convergence.detected?(text_a, text_b, threshold: 0.75)
746
+ ```
747
+
748
+ A common pattern is wiring convergence into a network router to skip reconciliation:
749
+
750
+ ```ruby
751
+ router = ->(args) do
752
+ a = args.context[:verifier_a]&.reply.to_s
753
+ b = args.context[:verifier_b]&.reply.to_s
754
+ RobotLab::Convergence.detected?(a, b) ? nil : ["reconciler"]
755
+ end
756
+
757
+ network = RobotLab.create_network(name: "verify", router: router) do
758
+ # ...
759
+ end
760
+ ```
761
+
762
+ Requires the `classifier` gem (`~> 2.3`).
763
+
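To make the similarity score concrete, here is a plain term-frequency cosine similarity in standard-library Ruby. It is an illustrative stand-in only: the gem delegates scoring to the `classifier` gem, and its tokenization and weighting may well differ. The helper names are hypothetical; the `0.85` default mirrors the threshold quoted above.

```ruby
# Count how often each lowercase word appears in the text.
def term_counts(text)
  text.downcase.scan(/[a-z0-9]+/).tally
end

# Cosine of the angle between the two term-count vectors (0.0 to 1.0).
def cosine_similarity(text_a, text_b)
  a = term_counts(text_a)
  b = term_counts(text_b)
  dot = a.sum { |term, count| count * b.fetch(term, 0) }
  norm = Math.sqrt(a.values.sum { |c| c * c }) * Math.sqrt(b.values.sum { |c| c * c })
  norm.zero? ? 0.0 : dot / norm
end

# Boolean check against a threshold, mirroring the shape of `detected?`.
def converged?(text_a, text_b, threshold: 0.85)
  cosine_similarity(text_a, text_b) >= threshold
end

puts converged?("The cache key is stale", "the cache key is stale")
puts converged?("The cache key is stale", "Rotate the database password")
```

Because the score depends only on word overlap, two answers that agree in substance but use different vocabulary can fall below the threshold, so the fast-path should fall back to the reconciler rather than treat a low score as disagreement.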
764
+ ## Structured Delegation
765
+
766
+ `robot.delegate(to:, task:)` dispatches work to another robot and returns the result, with duration and token metadata attached. Pass `async: true` for non-blocking fan-out.
767
+
768
+ ```ruby
769
+ analyst = RobotLab.build(name: "analyst", system_prompt: "Analyze data.")
770
+ writer = RobotLab.build(name: "writer", system_prompt: "Write reports.")
771
+ manager = RobotLab.build(name: "manager", system_prompt: "Coordinate work.")
772
+
773
+ # Synchronous delegation — blocks until done
774
+ result = manager.delegate(to: analyst, task: "Analyze Q3 sales data")
775
+ puts result.reply
776
+ puts "%.2fs, %d tokens" % [result.duration, result.output_tokens]
777
+
778
+ # Asynchronous fan-out — returns immediately
779
+ f1 = manager.delegate(to: analyst, task: "Analyze Q3 sales", async: true)
780
+ f2 = manager.delegate(to: writer, task: "Draft Q3 summary", async: true)
781
+
782
+ # Do other work here while both run in parallel...
783
+
784
+ analysis = f1.value # blocks until resolved
785
+ summary = f2.value # blocks until resolved
786
+
787
+ # With a timeout
788
+ result = f1.value(timeout: 30) # raises DelegationFuture::DelegationTimeout if too slow
789
+ ```
790
+
791
+ `DelegationFuture` attributes:
792
+
793
+ ```ruby
794
+ future.resolved? # => true/false (non-blocking poll)
795
+ future.robot_name # => "analyst"
796
+ future.delegated_by # => "manager"
797
+ ```
798
+
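The blocking `value`, the non-blocking `resolved?`, and the timeout behavior can be sketched with a plain thread. This is not the gem's `DelegationFuture`; `SketchFuture` and its nested `TimeoutError` are hypothetical names, and the real class may be built on a different concurrency primitive.

```ruby
# A minimal thread-backed future, for illustration only.
class SketchFuture
  class TimeoutError < StandardError; end

  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Non-blocking poll: true once the background work has finished.
  def resolved?
    !@thread.alive?
  end

  # Block until the work finishes, or raise if `timeout` seconds pass first.
  def value(timeout: nil)
    # Thread#join returns nil when the limit elapses before the thread ends.
    raise TimeoutError, "not resolved within #{timeout}s" unless @thread.join(timeout)

    @thread.value
  end
end

fast = SketchFuture.new { sleep 0.05; "analysis done" }
puts fast.value            # waits for the thread, then prints its result
puts fast.resolved?
```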
799
+ ## Ractor Parallelism
800
+
801
+ RobotLab supports true CPU parallelism via Ruby Ractors — isolated execution contexts that bypass the GVL. Two modes are available:
802
+
803
+ **CPU-bound tools** — mark a tool `ractor_safe true` and RobotLab automatically routes its calls through a global `RactorWorkerPool` instead of running inline:
804
+
805
+ ```ruby
806
+ class TranscribeAudio < RobotLab::Tool
807
+ ractor_safe true
808
+ description "Transcribe an audio file"
809
+ param :path, type: :string, desc: "Path to audio file"
810
+
811
+ def execute(path:)
812
+ AudioTranscriber.run(path) # pure computation, no shared mutable state
813
+ end
814
+ end
815
+ ```
816
+
817
+ **Parallel robot networks** — pass `parallel_mode: :ractor` when creating a network to dispatch independent robots across hardware threads simultaneously:
818
+
819
+ ```ruby
820
+ network = RobotLab.create_network(name: "analysis", parallel_mode: :ractor) do
821
+ task :fetch, fetcher_robot, depends_on: :none
822
+ task :sentiment, sentiment_robot, depends_on: [:fetch]
823
+ task :entities, entity_robot, depends_on: [:fetch] # runs in parallel with sentiment
824
+ task :summarize, summary_robot, depends_on: [:sentiment, :entities]
825
+ end
826
+
827
+ results = network.run(message: "Analyze customer feedback")
828
+ # => { "fetch" => "...", "sentiment" => "positive", "entities" => "...", "summarize" => "..." }
829
+ ```
830
+
831
+ See the [Ractor Parallelism guide](https://madbomber.github.io/robot_lab/guides/ractor-parallelism) for constraints, the frozen-data contract, and `RactorMemoryProxy` for shared state.
832
+
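The ordering in the network example above follows from grouping tasks into dependency waves. A minimal sketch of that grouping, assuming the graph is given as a hash of task name to prerequisite names (a hypothetical representation, with `[]` standing in for `depends_on: :none`); it is not the scheduler's actual code:

```ruby
# Group tasks into waves: a task joins the first wave in which all of its
# dependencies have already completed. Tasks in the same wave are independent.
def dependency_waves(graph)
  remaining = graph.dup
  done = []
  waves = []
  until remaining.empty?
    ready = remaining.select { |_task, deps| (deps - done).empty? }.keys
    # Nothing runnable means a cycle or a dependency that was never declared.
    raise ArgumentError, "unresolvable dependencies among #{remaining.keys}" if ready.empty?

    waves << ready
    done.concat(ready)
    ready.each { |task| remaining.delete(task) }
  end
  waves
end

graph = {
  fetch: [],
  sentiment: [:fetch],
  entities: [:fetch],
  summarize: [:sentiment, :entities]
}
p dependency_waves(graph)
# => [[:fetch], [:sentiment, :entities], [:summarize]]
```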
624
833
  ## Rails Integration
625
834
 
626
835
  ```bash
data/Rakefile CHANGED
@@ -49,7 +49,8 @@ namespace :examples do
49
49
  SUBDIR_ENTRY_POINTS = {
50
50
  "14_rusty_circuit" => "open_mic.rb",
51
51
  "15_memory_network_and_bus" => "editorial_pipeline.rb",
52
- "16_writers_room" => "writers_room.rb"
52
+ "16_writers_room" => "writers_room.rb",
53
+ "27_incident_response" => "incident_response.rb"
53
54
  }.freeze
54
55
 
55
56
  # Subdirectory demos that are standalone apps (not run via `ruby`)
data/docs/api/core/result.md ADDED
@@ -0,0 +1,123 @@
1
+ # RobotResult
2
+
3
+ `RobotResult` is returned by every `robot.run()` call. It carries the LLM output, tool call results, token usage, timing, and delegation metadata for that execution.
4
+
5
+ ## Accessing the Response
6
+
7
+ ```ruby
8
+ result = robot.run("What is the capital of France?")
9
+
10
+ result.reply # => "The capital of France is Paris."
11
+ result.last_text_content # => alias for reply
12
+ result.output # => Array of Message objects (full turn)
13
+ result.tool_calls # => Array of ToolResultMessage objects
14
+ ```
15
+
16
+ `reply` / `last_text_content` returns the content of the last text message in `output`. This is the string you want for the vast majority of use cases.
17
+
18
+ ## Token & Cost Tracking
19
+
20
+ ```ruby
21
+ result.input_tokens # => Integer — tokens sent to the LLM this run
22
+ result.output_tokens # => Integer — tokens generated this run
23
+ ```
24
+
25
+ Token counts are zero for providers that do not return usage data.
26
+
27
+ ## Timing
28
+
29
+ `duration` is set when the result travels through a network pipeline or a `delegate` call. It is `nil` when `robot.run()` is called directly.
30
+
31
+ ```ruby
32
+ result.duration # => Float (elapsed seconds) or nil
33
+ ```
34
+
35
+ ## Delegation Metadata
36
+
37
+ When a result comes back through `robot.delegate(to:, task:)`, two additional fields are populated:
38
+
39
+ ```ruby
40
+ result.delegated_by # => "manager" (the robot that issued the delegation)
41
+ result.duration # => 2.34 (always set by delegate)
42
+ ```
43
+
44
+ ## Identity & Status
45
+
46
+ ```ruby
47
+ result.robot_name # => "analyst"
48
+ result.id # => "550e8400-e29b-..." (UUID, unique per run)
49
+ result.created_at # => Time instance
50
+ result.stop_reason # => "end_turn", "tool_use", or nil
51
+ ```
52
+
53
+ ## Inspecting the Full Output
54
+
55
+ ```ruby
56
+ result.output.each do |message|
57
+ puts message.role # :assistant, :tool, etc.
58
+ puts message.content # String or Array
59
+ end
60
+
61
+ result.has_tool_calls? # => true if the LLM called any tools
62
+ result.stopped? # => true if execution ended naturally (not mid-tool-call)
63
+ ```
64
+
65
+ ## Persistence
66
+
67
+ Export for serialization (excludes debug fields):
68
+
69
+ ```ruby
70
+ hash = result.export
71
+ # {
72
+ # robot_name: "analyst",
73
+ # output: [...],
74
+ # tool_calls: [...],
75
+ # created_at: "2026-04-18T12:00:00Z",
76
+ # id: "550e8400-...",
77
+ # checksum: "a1b2c3...",
78
+ # stop_reason: "end_turn",
79
+ # duration: 2.34,
80
+ # input_tokens: 512,
81
+ # output_tokens: 128
82
+ # }
83
+
84
+ json = result.to_json
85
+
86
+ # Reconstruct from hash
87
+ restored = RobotLab::RobotResult.from_hash(hash)
88
+ ```
89
+
90
+ `checksum` is a SHA-256 digest of `output + tool_calls + created_at`. Use it for deduplication when persisting results.
91
+
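A content digest of this kind can be sketched as follows. The serialization is an assumption: this diff does not show how the gem turns `output`, `tool_calls`, and `created_at` into bytes, so the JSON encoding and the `result_checksum` helper below are hypothetical.

```ruby
require "digest"
require "json"

# Assumed serialization: a JSON array of the three fields, hashed with SHA-256.
def result_checksum(output:, tool_calls:, created_at:)
  Digest::SHA256.hexdigest(JSON.generate([output, tool_calls, created_at]))
end

first = result_checksum(output: ["Paris"], tool_calls: [], created_at: "2026-04-18T12:00:00Z")
again = result_checksum(output: ["Paris"], tool_calls: [], created_at: "2026-04-18T12:00:00Z")
later = result_checksum(output: ["Paris"], tool_calls: [], created_at: "2026-04-18T12:00:01Z")

puts first == again  # same content, same digest
puts first == later  # timestamp differs, digest differs
```

Since `created_at` is part of the input, two runs with identical replies still get distinct checksums; the digest identifies a stored result rather than a reply text.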
92
+ ## Debug Fields
93
+
94
+ These are `nil` by default and only populated when explicitly set for debugging:
95
+
96
+ ```ruby
97
+ result.prompt # Array<Message> — prompt sent to the LLM
98
+ result.history # Array<Message> — history used
99
+ result.raw # raw LLM response object from ruby_llm
100
+ ```
101
+
102
+ ## Attribute Reference
103
+
104
+ | Attribute | Type | Description |
105
+ |-----------|------|-------------|
106
+ | `robot_name` | String | Name of the robot that produced this result |
107
+ | `reply` | String, nil | Last text content (alias: `last_text_content`) |
108
+ | `output` | Array\<Message\> | All output messages from this run |
109
+ | `tool_calls` | Array\<ToolResultMessage\> | Tool call results |
110
+ | `input_tokens` | Integer | Tokens sent to LLM |
111
+ | `output_tokens` | Integer | Tokens generated |
112
+ | `duration` | Float, nil | Elapsed seconds (set by delegate/pipeline) |
113
+ | `delegated_by` | String, nil | Delegating robot's name |
114
+ | `id` | String | UUID |
115
+ | `created_at` | Time | Creation timestamp |
116
+ | `stop_reason` | String, nil | LLM stop reason |
117
+ | `checksum` | String | SHA-256 digest of `output`, `tool_calls`, and `created_at` |
118
+
119
+ ## Related
120
+
121
+ - [Robot API](robot.md) — `run`, `delegate`, `compress_history`
122
+ - [Building Robots](../../guides/building-robots.md) — Robot construction patterns
123
+ - [Structured Delegation](../../guides/building-robots.md#structured-delegation) — `DelegationFuture` and async fan-out
data/docs/api/core/robot.md CHANGED
@@ -33,6 +33,8 @@ Robot.new(
33
33
  enable_cache: true,
34
34
  bus: nil,
35
35
  skills: nil,
36
+ max_tool_rounds: nil,
37
+ token_budget: nil,
36
38
  temperature: nil,
37
39
  top_p: nil,
38
40
  top_k: nil,
@@ -65,6 +67,8 @@ Robot.new(
65
67
  | `enable_cache` | `Boolean` | `true` | Whether to enable semantic caching |
66
68
  | `bus` | `TypedBus::MessageBus`, `nil` | `nil` | Optional message bus for inter-robot communication |
67
69
  | `skills` | `Symbol`, `Array<Symbol>`, `nil` | `nil` | Skill templates to prepend (see [Skills](#skills)) |
70
+ | `max_tool_rounds` | `Integer`, `nil` | `nil` | Circuit breaker: raise `ToolLoopError` after this many tool calls in one `run()` (see [Tool Loop Circuit Breaker](#tool-loop-circuit-breaker)) |
71
+ | `token_budget` | `Integer`, `nil` | `nil` | Raise `InferenceError` if cumulative input tokens exceed this limit |
68
72
  | `config` | `RunConfig`, `nil` | `nil` | Shared config merged with explicit kwargs (see [RunConfig](#runconfig)) |
69
73
  | `temperature` | `Float`, `nil` | `nil` | Controls randomness (0.0-1.0) |
70
74
  | `top_p` | `Float`, `nil` | `nil` | Nucleus sampling threshold |
@@ -113,6 +117,9 @@ If `name` is omitted, it defaults to `"robot"`.
113
117
  | `config` | `RunConfig` | Effective RunConfig (merged from constructor kwargs and passed-in config) |
114
118
  | `mcp_config` | `Symbol`, `Array` | Build-time MCP configuration (raw, unresolved) |
115
119
  | `tools_config` | `Symbol`, `Array` | Build-time tools configuration (raw, unresolved) |
120
+ | `total_input_tokens` | `Integer` | Cumulative input tokens sent across all `run()` calls |
121
+ | `total_output_tokens` | `Integer` | Cumulative output tokens received across all `run()` calls |
122
+ | `learnings` | `Array<String>` | Accumulated cross-run observations (see [Learning Accumulation](#learning-accumulation)) |
116
123
 
117
124
  ## Attributes (Read-Write)
118
125
 
@@ -902,6 +909,181 @@ bot.with_bus(bus)
902
909
  bot.send_message(to: :someone, content: "Hello!")
903
910
  ```
904
911
 
912
+ ## Token & Cost Tracking
913
+
914
+ Every `robot.run()` returns a `RobotResult` with token counts for that call. The robot accumulates running totals across all runs.
915
+
916
+ ### RobotResult Token Fields
917
+
918
+ | Field | Type | Description |
919
+ |-------|------|-------------|
920
+ | `input_tokens` | `Integer` | Input tokens sent to the LLM in this run (0 if provider doesn't report usage) |
921
+ | `output_tokens` | `Integer` | Output tokens received from the LLM in this run (0 if not reported) |
922
+
923
+ ### Robot Cumulative Totals
924
+
925
+ | Attribute | Type | Description |
926
+ |-----------|------|-------------|
927
+ | `total_input_tokens` | `Integer` | Cumulative input tokens across all `run()` calls |
928
+ | `total_output_tokens` | `Integer` | Cumulative output tokens across all `run()` calls |
929
+
930
+ ### reset_token_totals
931
+
932
+ ```ruby
933
+ robot.reset_token_totals
934
+ # => nil
935
+ ```
936
+
937
+ Reset the cumulative accounting counters to zero. Useful when you want to measure cost for a specific task batch while keeping the robot alive for the next batch.
938
+
939
+ > **Note:** This resets the *accounting counter only* — the underlying chat history keeps growing. The next run's `input_tokens` will reflect the full accumulated chat context sent to the API.
940
+
941
+ **Example:**
942
+
943
+ ```ruby
944
+ robot = RobotLab.build(name: "analyst", system_prompt: "You are helpful.")
945
+
946
+ result = robot.run("What is a stack?")
947
+ puts result.input_tokens # e.g. 120
948
+ puts result.output_tokens # e.g. 45
949
+
950
+ result2 = robot.run("And a queue?")
951
+ puts result2.input_tokens # larger — full chat history sent
952
+
953
+ puts robot.total_input_tokens # 120 + result2.input_tokens
954
+ puts robot.total_output_tokens
955
+
956
+ # Start a fresh accounting batch
957
+ robot.reset_token_totals
958
+ puts robot.total_input_tokens # => 0
959
+ ```
960
+
961
+ ## Tool Loop Circuit Breaker
962
+
963
+ Set `max_tool_rounds:` to guard against a robot looping indefinitely through tool calls. When the limit is exceeded, `RobotLab::ToolLoopError` is raised.
964
+
965
+ ### max_tool_rounds Parameter
966
+
967
+ ```ruby
968
+ robot = RobotLab.build(
969
+ name: "runner",
970
+ system_prompt: "Execute every step.",
971
+ local_tools: [StepTool],
972
+ max_tool_rounds: 10
973
+ )
974
+ ```
975
+
976
+ `max_tool_rounds` can also be set via `RunConfig`:
977
+
978
+ ```ruby
979
+ config = RobotLab::RunConfig.new(max_tool_rounds: 10)
980
+ robot = RobotLab.build(name: "runner", system_prompt: "...", config: config)
981
+ ```
982
+
983
+ ### ToolLoopError
984
+
985
+ `RobotLab::ToolLoopError < RobotLab::InferenceError`
986
+
987
+ Raised when the number of tool calls in a single `run()` exceeds `max_tool_rounds`. The error message includes the limit that was exceeded.
988
+
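The counting behavior can be sketched as a loop that tallies tool calls and raises once the tally exceeds the limit. This is a hedged illustration, not the gem's dispatch code: `LoopLimitError` and `run_with_breaker` are hypothetical names, and the block stands in for one model turn that either requests a tool call or returns `nil` when it has a final answer.

```ruby
# Hypothetical stand-in for RobotLab::ToolLoopError.
class LoopLimitError < StandardError; end

# Drive a tool-calling loop. The block receives the number of tool calls made
# so far and returns a tool request, or nil when the model is finished.
def run_with_breaker(max_rounds:)
  rounds = 0
  loop do
    tool_call = yield(rounds)
    return rounds if tool_call.nil?

    rounds += 1
    raise LoopLimitError, "Tool call limit of #{max_rounds} exceeded" if rounds > max_rounds
  end
end

# A well-behaved run: two tool calls, then a final answer.
puts run_with_breaker(max_rounds: 10) { |made| made < 2 ? :step_tool : nil }

# A runaway run: the model never stops asking for tools.
begin
  run_with_breaker(max_rounds: 10) { :step_tool }
rescue LoopLimitError => e
  puts e.message
end
```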
989
+ ### Recovery after ToolLoopError
990
+
991
+ After a `ToolLoopError`, the chat contains a dangling `tool_use` block with no matching `tool_result`. Anthropic and most providers will reject any subsequent request with that broken history.
992
+
993
+ **You must call `clear_messages` before reusing the robot:**
994
+
995
+ ```ruby
996
+ begin
997
+ robot.run("Execute all steps.")
998
+ rescue RobotLab::ToolLoopError => e
999
+ puts "Circuit breaker fired: #{e.message}"
1000
+ end
1001
+
1002
+ # Flush the corrupted chat (system prompt is kept)
1003
+ robot.clear_messages
1004
+ puts robot.config.max_tool_rounds # still set — config unchanged
1005
+
1006
+ # Robot is healthy again
1007
+ result = robot.run("Something new.")
1008
+ ```
1009
+
1010
+ ## Learning Accumulation
1011
+
1012
+ `robot.learn(text)` records a cross-run observation. On each subsequent `run()`, active learnings are automatically prepended to the user message as a `LEARNINGS FROM PREVIOUS RUNS:` block.
1013
+
1014
+ ### learn
1015
+
1016
+ ```ruby
1017
+ robot.learn(text)
1018
+ # => self
1019
+ ```
1020
+
1021
+ Add a learning to the robot's accumulated observations. Learnings are automatically deduplicated:
1022
+
1023
+ - If the new text is a substring of an existing learning, it is dropped (the existing broader learning already covers it).
1024
+ - If an existing learning is a substring of the new text, the narrower one is replaced.
1025
+
1026
+ Learnings are persisted to `memory[:learnings]` and survive a robot rebuild when the same `Memory` object is reused.
1027
+
1028
+ **Parameters:**
1029
+
1030
+ | Name | Type | Description |
1031
+ |------|------|-------------|
1032
+ | `text` | `String` | The observation or insight to record |
1033
+
1034
+ **Returns:** `self`
1035
+
1036
+ ### learnings
1037
+
1038
+ ```ruby
1039
+ robot.learnings
1040
+ # => Array<String>
1041
+ ```
1042
+
1043
+ Returns the list of accumulated learning strings in insertion order.
1044
+
1045
+ ### How Learnings Are Injected
1046
+
1047
+ When learnings are present, each `run(message)` prepends them to the message before sending to the LLM:
1048
+
1049
+ ```
1050
+ LEARNINGS FROM PREVIOUS RUNS:
1051
+ - This codebase prefers map/collect over manual array accumulation
1052
+ - Explicit nil comparisons appear frequently here
1053
+
1054
+ <original user message>
1055
+ ```
1056
+
1057
+ **Example:**
1058
+
1059
+ ```ruby
1060
+ reviewer = RobotLab.build(
1061
+ name: "reviewer",
1062
+ system_prompt: "You are a Ruby code reviewer."
1063
+ )
1064
+
1065
+ # Run 1 — no learnings yet
1066
+ reviewer.run("Review snippet A")
1067
+ reviewer.learn("Prefer map/collect over manual accumulation")
1068
+
1069
+ # Run 2 — learning injected automatically
1070
+ reviewer.run("Review snippet B")
1071
+ reviewer.learn("Avoid explicit nil comparisons")
1072
+
1073
+ # Run 3 — both learnings injected
1074
+ reviewer.run("Review snippet C")
1075
+
1076
+ puts reviewer.learnings.size # => 2
1077
+ ```
1078
+
1079
+ ### Deduplication Example
1080
+
1081
+ ```ruby
1082
+ robot.learn("avoid using puts")
1083
+ robot.learn("avoid using puts and p in production code")
1084
+ # => broader learning replaces narrower; robot.learnings.size == 1
1085
+ ```
1086
+
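The two substring rules described under `learn` can be sketched in a few lines. `LearningLog` is a hypothetical class for illustration; the gem stores learnings in `Memory` and may normalize text before comparing.

```ruby
# Substring-based deduplication, illustrative only.
class LearningLog
  attr_reader :entries

  def initialize
    @entries = []
  end

  def learn(text)
    # Rule 1: an existing entry already contains the new text, so skip it.
    return self if @entries.any? { |entry| entry.include?(text) }

    # Rule 2: the new text contains narrower entries, so it replaces them.
    @entries.reject! { |entry| text.include?(entry) }
    @entries << text
    self
  end
end

log = LearningLog.new
log.learn("avoid using puts")
log.learn("avoid using puts and p in production code")
log.learn("avoid using puts") # already covered by the broader entry
p log.entries
# => ["avoid using puts and p in production code"]
```

Returning `self` from `learn` matches the documented return value and allows chained calls.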
905
1087
  ## See Also
906
1088
 
907
1089
  - [Building Robots Guide](../../guides/building-robots.md) (includes [Composable Skills](../../guides/building-robots.md#composable-skills))