RubyGems - robot_lab - Versions diffs - 0.0.9 → 0.0.11 - Mend

robot_lab 0.0.9 → 0.0.11

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (67)

checksums.yaml +4 -4
data/CHANGELOG.md +32 -0
data/README.md +80 -1
data/Rakefile +2 -1
data/docs/api/core/robot.md +182 -0
data/docs/guides/creating-networks.md +21 -0
data/docs/guides/index.md +10 -0
data/docs/guides/knowledge.md +182 -0
data/docs/guides/mcp-integration.md +106 -0
data/docs/guides/memory.md +2 -0
data/docs/guides/observability.md +486 -0
data/docs/guides/ractor-parallelism.md +364 -0
data/docs/superpowers/plans/2026-04-14-ractor-integration.md +1538 -0
data/docs/superpowers/specs/2026-04-14-ractor-integration-design.md +258 -0
data/examples/19_token_tracking.rb +128 -0
data/examples/20_circuit_breaker.rb +153 -0
data/examples/21_learning_loop.rb +164 -0
data/examples/22_context_compression.rb +179 -0
data/examples/23_convergence.rb +137 -0
data/examples/24_structured_delegation.rb +150 -0
data/examples/25_history_search/conversation.jsonl +30 -0
data/examples/25_history_search.rb +136 -0
data/examples/26_document_store/api_versioning_adr.md +52 -0
data/examples/26_document_store/incident_postmortem.md +46 -0
data/examples/26_document_store/postgres_runbook.md +49 -0
data/examples/26_document_store/redis_caching_guide.md +48 -0
data/examples/26_document_store/sidekiq_guide.md +51 -0
data/examples/26_document_store.rb +147 -0
data/examples/27_incident_response/incident_response.rb +244 -0
data/examples/28_mcp_discovery.rb +112 -0
data/examples/29_ractor_tools.rb +243 -0
data/examples/30_ractor_network.rb +256 -0
data/examples/README.md +136 -0
data/examples/prompts/skill_with_mcp_test.md +9 -0
data/examples/prompts/skill_with_robot_name_test.md +5 -0
data/examples/prompts/skill_with_tools_test.md +6 -0
data/lib/robot_lab/bus_poller.rb +149 -0
data/lib/robot_lab/convergence.rb +69 -0
data/lib/robot_lab/delegation_future.rb +93 -0
data/lib/robot_lab/document_store.rb +155 -0
data/lib/robot_lab/error.rb +25 -0
data/lib/robot_lab/history_compressor.rb +205 -0
data/lib/robot_lab/mcp/client.rb +17 -5
data/lib/robot_lab/mcp/connection_poller.rb +187 -0
data/lib/robot_lab/mcp/server.rb +7 -2
data/lib/robot_lab/mcp/server_discovery.rb +110 -0
data/lib/robot_lab/mcp/transports/stdio.rb +6 -0
data/lib/robot_lab/memory.rb +103 -6
data/lib/robot_lab/network.rb +44 -9
data/lib/robot_lab/ractor_boundary.rb +42 -0
data/lib/robot_lab/ractor_job.rb +37 -0
data/lib/robot_lab/ractor_memory_proxy.rb +85 -0
data/lib/robot_lab/ractor_network_scheduler.rb +154 -0
data/lib/robot_lab/ractor_worker_pool.rb +117 -0
data/lib/robot_lab/robot/bus_messaging.rb +43 -65
data/lib/robot_lab/robot/history_search.rb +69 -0
data/lib/robot_lab/robot.rb +228 -11
data/lib/robot_lab/robot_result.rb +24 -5
data/lib/robot_lab/run_config.rb +1 -1
data/lib/robot_lab/text_analysis.rb +103 -0
data/lib/robot_lab/tool.rb +42 -3
data/lib/robot_lab/tool_config.rb +1 -1
data/lib/robot_lab/version.rb +1 -1
data/lib/robot_lab/waiter.rb +49 -29
data/lib/robot_lab.rb +25 -0
data/mkdocs.yml +1 -0
metadata +70 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: c852fcf7f4aed4ce95fabdc5b0296723ca8aa10e780dabaa7759e618a22bc640
-  data.tar.gz: 1bcb205c958ede9967886dae78a1d1a6d47da42e4cd9bd29d7bdd3e094b0a088
+  metadata.gz: f4b2a3fafbdf3a54de3044b57597b42d86c68bd2afdad6ce866ac82483e61091
+  data.tar.gz: 5137cff56485a26fabe5ab6606b144c4c3c21c1673ecec1d2254a392e015c25c
 SHA512:
-  metadata.gz: 5620e7798ac04441cb23c6a7cc5f0cdad7447103825db35ef6f3a3987785b8ff5fb355ec03a309ef9c8a5ce5b0b7a29d9f5adef0e6a5d9de5cd66d3c94fb0469
-  data.tar.gz: 9300b1f5ed98e70226c7c670bcf2e3dee033310db6b2182b2705085f02474a1ea6157a011c93906da1d45ba38b4c9f8b9e62545cdb5fd304ca1550734f7dc043
+  metadata.gz: 33045f27ec803094a020caee4133c1d6c65887446330c294d9a9babd56a0fe7e71979fe0d032c2421488d1dcae886ad784a78dae579ba01b2786b5f9f91c0172
+  data.tar.gz: 5554296590bfb3dea031c95090ef8a47e946ac1c7a92b3efc6924439f55df7267e0b04d385e2332ea827099c37688c4988aa908caf5bb9d2f21eabc2c50c3167
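
Note on the checksums hunk above: the SHA256 entries can be compared against the members of a downloaded gem archive. The following is a minimal Ruby sketch, assuming the archive members have already been read into memory. The helper name `checksums_match?` and the demo data are hypothetical and stand in for the real `metadata.gz` and `data.tar.gz` bytes.

```ruby
require "digest"

# Hypothetical helper (not part of RubyGems or robot_lab): compare the SHA256
# of each archive member against the hex digests recorded in checksums.yaml.
# `blobs` maps a member name to its raw bytes; `expected` maps the same names
# to hex digest strings.
def checksums_match?(blobs, expected)
  blobs.all? do |name, bytes|
    Digest::SHA256.hexdigest(bytes) == expected[name]
  end
end

# Stand-in data for illustration only; these are not the real gem contents.
DEMO_BLOBS    = { "data.tar.gz" => "example bytes" }.freeze
DEMO_EXPECTED = { "data.tar.gz" => Digest::SHA256.hexdigest("example bytes") }.freeze

puts checksums_match?(DEMO_BLOBS, DEMO_EXPECTED)
```

A mismatch on either member would suggest the local archive is not the version this diff describes.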

data/CHANGELOG.md CHANGED

@@ -8,6 +8,38 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.0.11] - 2026-04-14
+### Added
+- **Ractor parallelism — Track 1: CPU-bound tools** (`RactorWorkerPool`)
+  - `ractor_safe true` class macro on `Tool` — opts a tool class into Ractor execution; subclasses inherit automatically
+  - `RobotLab.ractor_pool` — global `RactorWorkerPool` singleton, one Ractor worker per CPU core by default
+  - `ractor_pool_size` field on `RunConfig` for configuring pool capacity
+  - `RactorWorkerPool#submit(tool_name, args)` — submits a job and blocks for the frozen result; raises `ToolError` on failure
+  - Tool dispatch routes `ractor_safe` tools through the pool automatically, bypassing the GVL for CPU-intensive work
+  - `RactorBoundary.freeze_deep(obj)` — deep-freezes nested hashes/arrays/strings to make them Ractor-shareable; raises `RactorBoundaryError` for non-shareable objects (Procs, IOs, etc.)
+- **Ractor parallelism — Track 2: parallel robot pipelines** (`RactorNetworkScheduler`)
+  - `parallel_mode: :ractor` on `Network.new` — routes `network.run` through `RactorNetworkScheduler` instead of `SimpleFlow::Pipeline`
+  - `RactorNetworkScheduler` dispatches dependency waves: independent tasks run concurrently (one Thread per task); dependent tasks wait for their wave to complete
+  - `RobotSpec` — frozen `Data.define` descriptor carrying robot name, template, system prompt, and config; safely crosses Ractor boundaries
+  - `RactorNetworkScheduler#run_pipeline` returns `Hash { robot_name => result_string }` for the full pipeline
+  - `RactorNetworkScheduler#run_spec` for single-spec dispatch
+  - `RactorNetworkScheduler#shutdown` for graceful poison-pill cleanup
+  - `network.parallel_mode` reader exposes the configured mode (default `:async`)
+- **Ractor memory proxy** — `RactorMemoryProxy` wraps `Memory` via `ractor-wrapper` for safe cross-Ractor memory access
+- **Infrastructure data classes** — `RactorJob`, `RactorJobError` (`Data.define` structs) for job submission and error propagation across Ractor boundaries
+- **`RactorBoundaryError`** — raised by `freeze_deep` when a non-shareable value (Proc, IO, etc.) would cross a Ractor boundary
+- **`ToolError`** — raised by `RactorWorkerPool#submit` when a tool raises inside a Ractor; propagates message and frozen backtrace
+- **Dependencies** — `ractor_queue` (~> 0.1) and `ractor-wrapper` (~> 0.4) added to gemspec
+- **Ractor Parallelism guide** (`docs/guides/ractor-parallelism.md`) — covers architecture, two-track design, configuration, error handling, constraints, and best practices
+- **Example 29: Ractor-Safe CPU Tools** (`examples/29_ractor_tools.rb`) — demonstrates `ractor_safe` flag, inheritance, `freeze_deep`, pool submissions, `ToolError` propagation, and parallel batch timing; no API key required
+- **Example 30: Ractor Network Scheduler** (`examples/30_ractor_network.rb`) — demonstrates `RactorNetworkScheduler` wave ordering with simulated latencies, `Network.new(parallel_mode: :ractor)` API, and dependency graph inspection; no API key required for Parts 1 & 2
+### Fixed
+- `ToolConfig::NONE_VALUES` constant was not Ractor-shareable because its inner empty array `[]` was mutable; fixed by replacing `[]` with `[].freeze` so the entire constant is deeply frozen and safe to read from any Ractor
 ## [0.0.9] - 2026-03-02
 ### Added
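
Note on the changelog hunk above: the entry for `RactorBoundary.freeze_deep` describes recursively freezing nested hashes, arrays, and strings and rejecting Procs and IO objects. The sketch below illustrates that idea in plain Ruby. It is not the gem's implementation; the names `freeze_deep_sketch` and `BoundaryError` are hypothetical stand-ins.

```ruby
# Hypothetical stand-in for the gem's RactorBoundaryError.
class BoundaryError < StandardError; end

# Recursively freeze hashes, arrays, and leaf values. Procs and IO handles are
# rejected because freezing them does not make them shareable between Ractors.
def freeze_deep_sketch(obj)
  case obj
  when Proc, IO
    raise BoundaryError, "#{obj.class} cannot cross a Ractor boundary"
  when Hash
    obj.each do |key, value|
      freeze_deep_sketch(key)
      freeze_deep_sketch(value)
    end
    obj.freeze
  when Array
    obj.each { |value| freeze_deep_sketch(value) }
    obj.freeze
  else
    obj.freeze
  end
end

payload = freeze_deep_sketch({ "rows" => [+"alpha", { label: +"beta" }] })
puts payload["rows"][1][:label].frozen?
```

The same reasoning explains the `ToolConfig::NONE_VALUES` fix listed under Fixed: a frozen outer container that still holds a mutable `[]` is not deeply frozen. Ruby's built-in `Ractor.make_shareable` performs a comparable deep freeze and may be the simpler choice where it applies.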

data/README.md CHANGED

@@ -26,7 +26,13 @@
 - <strong>Message Bus</strong> - Bidirectional robot communication via TypedBus<br>
 - <strong>Dynamic Spawning</strong> - Robots create new robots at runtime<br>
 - <strong>Layered Configuration</strong> - Cascading YAML, env vars, and RunConfig<br>
-- <strong>Rails Integration</strong> - Generators, background jobs, Turbo Stream broadcasting
+- <strong>Rails Integration</strong> - Generators, background jobs, Turbo Stream broadcasting<br>
+- <strong>Token &amp; Cost Tracking</strong> - Per-run and cumulative token counts on every robot<br>
+- <strong>Tool Loop Circuit Breaker</strong> - <code>max_tool_rounds:</code> guards against runaway tool call loops<br>
+- <strong>Learning Accumulation</strong> - <code>robot.learn()</code> builds up cross-run observations with deduplication<br>
+- <strong>Context Window Compression</strong> - <code>robot.compress_history()</code> prunes irrelevant old turns via TF cosine scoring<br>
+- <strong>Convergence Detection</strong> - <code>RobotLab::Convergence</code> detects when independent agents agree, enabling reconciler fast-path<br>
+- <strong>Structured Delegation</strong> - <code>robot.delegate(to:, task:)</code> sync or async inter-robot calls with duration and token metadata; async fan-out via <code>DelegationFuture</code>
 </td>
 </tr>
 </table>
@@ -621,6 +627,79 @@ robot.run("Tell me a story") { |chunk| stream_to_client(chunk.content) }
 The `on_content:` callback participates in the RunConfig cascade, so it can be set at the network or config level and inherited by robots.
+## Token & Cost Tracking
+Every `robot.run()` returns a `RobotResult` that carries token usage for that call. The robot itself accumulates running totals across all runs.
+```ruby
+robot = RobotLab.build(name: "analyst", system_prompt: "You are helpful.")
+result = robot.run("What is a stack?")
+puts result.input_tokens   # tokens sent to the LLM this run
+puts result.output_tokens  # tokens generated this run
+puts robot.total_input_tokens   # cumulative across all runs
+puts robot.total_output_tokens
+```
+To start a fresh cost batch without rebuilding the robot, call `reset_token_totals`. This resets the **accounting counter only** — the chat history keeps accumulating, so subsequent `input_tokens` will reflect the full context window sent to the API:
+```ruby
+robot.reset_token_totals
+puts robot.total_input_tokens  # => 0
+```
+Token counts are zero for providers that do not return usage data.
+## Tool Loop Circuit Breaker
+Set `max_tool_rounds:` to prevent a robot from looping indefinitely through tool calls. When the limit is exceeded, `RobotLab::ToolLoopError` is raised.
+```ruby
+robot = RobotLab.build(
+  name: "runner",
+  system_prompt: "Execute every step.",
+  local_tools: [StepTool],
+  max_tool_rounds: 10
+)
+begin
+  robot.run("Run all steps.")
+rescue RobotLab::ToolLoopError => e
+  puts e.message  # "Tool call limit of 10 exceeded"
+end
+```
+After a `ToolLoopError` the chat contains a dangling `tool_use` block with no matching `tool_result`. Most providers (including Anthropic) will reject any subsequent request with that history. Call `clear_messages` before reusing the robot:
+```ruby
+robot.clear_messages   # flushes broken history; system prompt is kept
+result = robot.run("Something new.")  # robot is healthy again
+```
+## Learning Accumulation
+`robot.learn(text)` records a cross-run observation. On each subsequent `run()`, active learnings are automatically prepended to the user message as a `LEARNINGS FROM PREVIOUS RUNS:` block so the LLM can incorporate prior context without needing a persistent chat:
+```ruby
+reviewer = RobotLab.build(
+  name: "reviewer",
+  system_prompt: "You are a Ruby code reviewer."
+)
+reviewer.run("Review snippet A")
+reviewer.learn("This codebase prefers map/collect over manual array accumulation")
+reviewer.run("Review snippet B")  # learning is injected automatically
+```
+Learnings deduplicate bidirectionally: if a broader learning is added that contains an existing narrower one, the narrower one is dropped. Learnings are persisted to the robot's `Memory` and survive a robot rebuild when the same `Memory` object is reused.
+```ruby
+reviewer.learnings          # => ["This codebase prefers map/collect..."]
+reviewer.learn("new fact")  # deduplicates before storing
+```
 ## Rails Integration
 ```bash

data/Rakefile CHANGED

@@ -49,7 +49,8 @@ namespace :examples do
   SUBDIR_ENTRY_POINTS = {
     "14_rusty_circuit" => "open_mic.rb",
     "15_memory_network_and_bus" => "editorial_pipeline.rb",
-    "16_writers_room" => "writers_room.rb"
+    "16_writers_room" => "writers_room.rb",
+    "27_incident_response" => "incident_response.rb"
   }.freeze
   # Subdirectory demos that are standalone apps (not run via `ruby`)

data/docs/api/core/robot.md CHANGED

@@ -33,6 +33,8 @@ Robot.new(
   enable_cache: true,
   bus: nil,
   skills: nil,
+  max_tool_rounds: nil,
+  token_budget: nil,
   temperature: nil,
   top_p: nil,
   top_k: nil,
@@ -65,6 +67,8 @@ Robot.new(
 | `enable_cache` | `Boolean` | `true` | Whether to enable semantic caching |
 | `bus` | `TypedBus::MessageBus`, `nil` | `nil` | Optional message bus for inter-robot communication |
 | `skills` | `Symbol`, `Array<Symbol>`, `nil` | `nil` | Skill templates to prepend (see [Skills](#skills)) |
+| `max_tool_rounds` | `Integer`, `nil` | `nil` | Circuit breaker: raise `ToolLoopError` after this many tool calls in one `run()` (see [Tool Loop Circuit Breaker](#tool-loop-circuit-breaker)) |
+| `token_budget` | `Integer`, `nil` | `nil` | Raise `InferenceError` if cumulative input tokens exceed this limit |
 | `config` | `RunConfig`, `nil` | `nil` | Shared config merged with explicit kwargs (see [RunConfig](#runconfig)) |
 | `temperature` | `Float`, `nil` | `nil` | Controls randomness (0.0-1.0) |
 | `top_p` | `Float`, `nil` | `nil` | Nucleus sampling threshold |
@@ -113,6 +117,9 @@ If `name` is omitted, it defaults to `"robot"`.
 | `config` | `RunConfig` | Effective RunConfig (merged from constructor kwargs and passed-in config) |
 | `mcp_config` | `Symbol`, `Array` | Build-time MCP configuration (raw, unresolved) |
 | `tools_config` | `Symbol`, `Array` | Build-time tools configuration (raw, unresolved) |
+| `total_input_tokens` | `Integer` | Cumulative input tokens sent across all `run()` calls |
+| `total_output_tokens` | `Integer` | Cumulative output tokens received across all `run()` calls |
+| `learnings` | `Array<String>` | Accumulated cross-run observations (see [Learning Accumulation](#learning-accumulation)) |
 ## Attributes (Read-Write)
@@ -902,6 +909,181 @@ bot.with_bus(bus)
 bot.send_message(to: :someone, content: "Hello!")
 ```
+## Token & Cost Tracking
+Every `robot.run()` returns a `RobotResult` with token counts for that call. The robot accumulates running totals across all runs.
+### RobotResult Token Fields
+| Field | Type | Description |
+|-------|------|-------------|
+| `input_tokens` | `Integer` | Input tokens sent to the LLM in this run (0 if provider doesn't report usage) |
+| `output_tokens` | `Integer` | Output tokens received from the LLM in this run (0 if not reported) |
+### Robot Cumulative Totals
+| Attribute | Type | Description |
+|-----------|------|-------------|
+| `total_input_tokens` | `Integer` | Cumulative input tokens across all `run()` calls |
+| `total_output_tokens` | `Integer` | Cumulative output tokens across all `run()` calls |
+### reset_token_totals
+```ruby
+robot.reset_token_totals
+# => nil
+```
+Reset the cumulative accounting counters to zero. Useful when you want to measure cost for a specific task batch while keeping the robot alive for the next batch.
+> **Note:** This resets the *accounting counter only* — the underlying chat history keeps growing. The next run's `input_tokens` will reflect the full accumulated chat context sent to the API.
+**Example:**
+```ruby
+robot = RobotLab.build(name: "analyst", system_prompt: "You are helpful.")
+result = robot.run("What is a stack?")
+puts result.input_tokens    # e.g. 120
+puts result.output_tokens   # e.g. 45
+result2 = robot.run("And a queue?")
+puts result2.input_tokens   # larger — full chat history sent
+puts robot.total_input_tokens   # 120 + result2.input_tokens
+puts robot.total_output_tokens
+# Start a fresh accounting batch
+robot.reset_token_totals
+puts robot.total_input_tokens   # => 0
+```
+## Tool Loop Circuit Breaker
+Set `max_tool_rounds:` to guard against a robot looping indefinitely through tool calls. After the limit is reached, `RobotLab::ToolLoopError` is raised.
+### max_tool_rounds Parameter
+```ruby
+robot = RobotLab.build(
+  name: "runner",
+  system_prompt: "Execute every step.",
+  local_tools: [StepTool],
+  max_tool_rounds: 10
+)
+```
+`max_tool_rounds` can also be set via `RunConfig`:
+```ruby
+config = RobotLab::RunConfig.new(max_tool_rounds: 10)
+robot = RobotLab.build(name: "runner", system_prompt: "...", config: config)
+```
+### ToolLoopError
+`RobotLab::ToolLoopError < RobotLab::InferenceError`
+Raised when the number of tool calls in a single `run()` exceeds `max_tool_rounds`. The error message includes the limit that was exceeded.
+### Recovery after ToolLoopError
+After a `ToolLoopError`, the chat contains a dangling `tool_use` block with no matching `tool_result`. Anthropic and most providers will reject any subsequent request with that broken history.
+**You must call `clear_messages` before reusing the robot:**
+```ruby
+begin
+  robot.run("Execute all steps.")
+rescue RobotLab::ToolLoopError => e
+  puts "Circuit breaker fired: #{e.message}"
+end
+# Flush the corrupted chat (system prompt is kept)
+robot.clear_messages
+puts robot.config.max_tool_rounds  # still set — config unchanged
+# Robot is healthy again
+result = robot.run("Something new.")
+```
+## Learning Accumulation
+`robot.learn(text)` records a cross-run observation. On each subsequent `run()`, active learnings are automatically prepended to the user message as a `LEARNINGS FROM PREVIOUS RUNS:` block.
+### learn
+```ruby
+robot.learn(text)
+# => self
+```
+Add a learning to the robot's accumulated observations. Learnings are automatically deduplicated:
+- If the new text is a substring of an existing learning, it is dropped (the existing broader learning already covers it).
+- If an existing learning is a substring of the new text, the narrower one is replaced.
+Learnings are persisted to `memory[:learnings]` and survive a robot rebuild when the same `Memory` object is reused.
+**Parameters:**
+| Name | Type | Description |
+|------|------|-------------|
+| `text` | `String` | The observation or insight to record |
+**Returns:** `self`
+### learnings
+```ruby
+robot.learnings
+# => Array<String>
+```
+Returns the list of accumulated learning strings in insertion order.
+### How Learnings Are Injected
+When learnings are present, each `run(message)` prepends them to the message before sending to the LLM:
+```
+LEARNINGS FROM PREVIOUS RUNS:
+- This codebase prefers map/collect over manual array accumulation
+- Explicit nil comparisons appear frequently here
+<original user message>
+```
+**Example:**
+```ruby
+reviewer = RobotLab.build(
+  name: "reviewer",
+  system_prompt: "You are a Ruby code reviewer."
+)
+# Run 1 — no learnings yet
+reviewer.run("Review snippet A")
+reviewer.learn("Prefer map/collect over manual accumulation")
+# Run 2 — learning injected automatically
+reviewer.run("Review snippet B")
+reviewer.learn("Avoid explicit nil comparisons")
+# Run 3 — both learnings injected
+reviewer.run("Review snippet C")
+puts reviewer.learnings.size  # => 2
+```
+### Deduplication Example
+```ruby
+robot.learn("avoid using puts")
+robot.learn("avoid using puts and p in production code")
+# => broader learning replaces narrower; robot.learnings.size == 1
+```
 ## See Also
 - [Building Robots Guide](../../guides/building-robots.md) (includes [Composable Skills](../../guides/building-robots.md#composable-skills))

data/docs/guides/creating-networks.md CHANGED

@@ -124,6 +124,7 @@ end
 | `memory` | Task-specific memory |
 | `config` | Per-task `RunConfig` (merged on top of network's config) |
 | `depends_on` | `:none`, `[:task1]`, or `:optional` |
+| `poller_group` | Bus delivery group label (`:default`, `:slow`, etc.) |
 ## Conditional Routing
@@ -164,6 +165,26 @@ network = RobotLab.create_network(name: "support") do
 end
 ```
+## Poller Groups
+Each network maintains a shared `BusPoller` that serializes TypedBus deliveries on a per-robot basis: if a robot is already processing a message, new deliveries are queued and drained after the current one completes. This prevents re-entrancy without blocking other robots.
+Named **poller groups** let you label tasks so slow robots are identifiable in logs and monitoring without needing separate infrastructure:
+```ruby
+network = RobotLab.create_network(name: "mixed_speed") do
+  # Fast robots on the default group
+  task :fetcher,   fetcher_robot,   depends_on: :none
+  task :summarize, summarizer,      depends_on: [:fetcher]
+  # Slow robots with expensive LLM calls — label them :slow
+  task :analyst,   analyst_robot,   depends_on: [:fetcher],  poller_group: :slow
+  task :writer,    writer_robot,    depends_on: [:analyst],  poller_group: :slow
+end
+```
+Group labels are informational — there is no separate queue per group. In Async execution, robots naturally yield during LLM HTTP calls, so fast and slow robots interleave without explicit isolation.
 ## Running Networks
 ### Basic Run

data/docs/guides/index.md CHANGED

@@ -38,6 +38,14 @@ If you're new to RobotLab, start here:
     Share data between robots with the memory system
+-   [:octicons-pulse-24: **Observability & Safety**](observability.md)
+    Token tracking, circuit breakers, and learning accumulation
+-   [:material-cpu-64-bit: **Ractor Parallelism**](ractor-parallelism.md)
+    True CPU parallelism for tools and robot pipelines via Ruby Ractors
 </div>
 ## Framework Integration
@@ -61,3 +69,5 @@ If you're new to RobotLab, start here:
 | [Streaming](streaming.md) | Real-time responses | 5 min |
 | [Memory](memory.md) | Shared data store | 5 min |
 | [Rails Integration](rails-integration.md) | Rails application setup | 15 min |
+| [Observability & Safety](observability.md) | Token tracking, circuit breaker, learning loop | 10 min |
+| [Ractor Parallelism](ractor-parallelism.md) | CPU-parallel tools and robot pipelines | 15 min |

data/docs/guides/knowledge.md ADDED

@@ -0,0 +1,182 @@
+# Knowledge & Retrieval
+Facilities for searching and retrieving knowledge from a robot's history and from external documents:
+- **Chat History Search** — semantic search over accumulated conversation turns
+- **Embedding-Based Document Store** — lightweight RAG: store arbitrary text, search by meaning
+---
+## Chat History Search
+### The Problem
+Long-running robots accumulate many conversation turns. When you need to recall what was discussed earlier on a specific topic, re-sending the full history wastes tokens. `search_history` gives you a focused slice of the most relevant past messages without touching the LLM.
+### robot.search_history
+```ruby
+results = robot.search_history(query, limit: 5)
+```
+Scores every message in the robot's conversation history against `query` using stemmed term-frequency cosine similarity (via the `classifier` gem). Returns up to `limit` `HistoryResult` objects sorted by score descending.
+```ruby
+results = robot.search_history("quarterly revenue", limit: 3)
+results.each do |r|
+  puts "[#{r.role}] score=#{r.score.round(3)} idx=#{r.index}"
+  puts "  #{r.text}"
+end
+```
+### HistoryResult Fields
+| Field | Type | Description |
+|-------|------|-------------|
+| `text` | String | The message text |
+| `role` | Symbol | `:user`, `:assistant`, or `:system` |
+| `score` | Float (0.0–1.0) | Cosine similarity with the query |
+| `index` | Integer | Position in `@chat.messages` |
+### Typical Scores
+| Relationship | Typical Score |
+|---|---|
+| Direct answer to the query | 0.50 – 0.80 |
+| Same topic, different phrasing | 0.20 – 0.50 |
+| Unrelated | < 0.10 |
+### Short Messages
+Messages shorter than 20 characters are skipped — they produce no meaningful term vector.
+### Full Example
+```ruby
+robot = RobotLab.build(name: "analyst", system_prompt: "You are a financial analyst.")
+# … after several robot.run() calls …
+hits = robot.search_history("customer acquisition cost")
+hits.each { |r| puts "#{r.role} (#{r.score.round(2)}): #{r.text}" }
+```
+### RAG Pattern — Retrieve Then Generate
+Use `search_history` to inject only the relevant past context into the next call:
+```ruby
+hits    = robot.search_history(user_query, limit: 3)
+context = hits.map(&:text).join("\n")
+robot.run("Recall context:\n#{context}\n\nNew question: #{user_query}")
+```
+### Optional Dependency
+`search_history` requires the `classifier` gem:
+```ruby
+gem "classifier", "~> 2.3"
+```
+Without it, calling `search_history` raises `RobotLab::DependencyError` with an install hint.
+---
+## Embedding-Based Document Store
+### The Problem
+Sometimes the knowledge you need isn't in the conversation history — it's in a README, a product spec, a changelog. `store_document` / `search_documents` embed arbitrary text with `fastembed` and retrieve the most relevant chunk at query time.
+### memory.store_document / memory.search_documents
+```ruby
+memory.store_document(:readme,    File.read("README.md"))
+memory.store_document(:changelog, File.read("CHANGELOG.md"))
+hits = memory.search_documents("how to configure redis", limit: 3)
+hits.each { |h| puts "#{h[:key]} (#{h[:score].round(3)}): #{h[:text][0..80]}" }
+```
+Each result hash contains:
+| Key | Type | Description |
+|-----|------|-------------|
+| `:key` | Symbol | The key the document was stored under |
+| `:text` | String | The full stored text |
+| `:score` | Float (0.0–1.0) | Cosine similarity with the query |
+### Standalone DocumentStore
+The `Memory` methods delegate to `RobotLab::DocumentStore`, which can also be used directly:
+```ruby
+store = RobotLab::DocumentStore.new
+store.store(:doc_a, "Ruby on Rails is a full-stack web framework.")
+store.store(:doc_b, "Postgres is an advanced relational database.")
+results = store.search("relational database SQL", limit: 2)
+puts results.first[:key]  # => :doc_b
+```
+Management methods:
+```ruby
+store.size          # => 2
+store.keys          # => [:doc_a, :doc_b]
+store.empty?        # => false
+store.delete(:doc_a)
+store.clear
+```
+### Embedding Model
+Default: `BAAI/bge-small-en-v1.5` (~23 MB, downloaded on first use, cached in `~/.cache/fastembed/`).
+Documents are embedded with a `"passage: "` prefix and queries with `"query: "` prefix — the standard retrieval convention for BGE models.
+Custom model:
+```ruby
+store = RobotLab::DocumentStore.new(model_name: "BAAI/bge-base-en-v1.5")
+```
+### RAG Pattern
+```ruby
+# 1. Index your knowledge base at startup
+memory.store_document(:readme,    File.read("README.md"))
+memory.store_document(:changelog, File.read("CHANGELOG.md"))
+memory.store_document(:api_docs,  File.read("docs/api.md"))
+# 2. At query time, retrieve the most relevant chunks
+hits    = memory.search_documents(user_query, limit: 3)
+context = hits.map { |h| h[:text] }.join("\n\n")
+# 3. Pass context to your robot
+result = robot.run("Use the following context:\n#{context}\n\nQuestion: #{user_query}")
+```
+### Memory API Summary
+| Method | Description |
+|--------|-------------|
+| `memory.store_document(key, text)` | Embed and store a document |
+| `memory.search_documents(query, limit: 5)` | Search by semantic similarity |
+| `memory.document_keys` | List stored keys |
+| `memory.delete_document(key)` | Remove a document |
+### Dependency
+`fastembed` is a core RobotLab dependency — no optional gem required. The ONNX model is downloaded on first use.
+---
+## See Also
+- [Observability Guide](observability.md)
+- [Example 25 — Chat History Search](../../examples/25_history_search.rb)
+- [Example 26 — Embedding-Based Document Store](../../examples/26_document_store.rb)
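
Note on the Chat History Search section in the file above: the guide describes scoring messages by term-frequency cosine similarity and skipping messages shorter than 20 characters. The sketch below shows that scoring in plain Ruby. It is a simplified illustration that omits the stemming the guide attributes to the `classifier` gem, and the function names are hypothetical.

```ruby
# Term-frequency vector: lowercase word => occurrence count.
def term_frequencies(text)
  text.downcase.scan(/[a-z0-9]+/).tally
end

# Cosine similarity between two sparse term-frequency vectors (0.0 to 1.0).
def cosine_similarity(a, b)
  dot  = a.sum { |term, count| count * b.fetch(term, 0) }
  norm = Math.sqrt(a.values.sum { |c| c * c }) * Math.sqrt(b.values.sum { |c| c * c })
  norm.zero? ? 0.0 : dot / norm
end

# Rank messages against a query, skipping messages under 20 characters,
# and return the top `limit` hits sorted by score descending.
def rank_messages(messages, query, limit: 5)
  query_vector = term_frequencies(query)
  messages.each_with_index
          .reject { |text, _index| text.length < 20 }
          .map do |text, index|
            { text: text, index: index,
              score: cosine_similarity(term_frequencies(text), query_vector) }
          end
          .sort_by { |hit| -hit[:score] }
          .first(limit)
end
```

Without stemming, "quarter" and "quarterly" count as different terms, so scores from this sketch will generally differ from those in the Typical Scores table.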