rlm-rb 0.1.0 → 0.2.0 (RubyGems version diff)


Files changed (20)

checksums.yaml +4 -4
data/CHANGELOG.md +45 -2
data/README.md +157 -55
data/examples/plain_ruby_invoice_extraction.rb +85 -0
data/lib/rlm/code_extractor.rb +125 -0
data/lib/rlm/file.rb +1 -1
data/lib/rlm/lm/mock.rb +45 -0
data/lib/rlm/lm/ruby_llm.rb +99 -0
data/lib/rlm/predict.rb +18 -9
data/lib/rlm/prompt_builder.rb +199 -0
data/lib/rlm/runtime/bridge.rb +146 -0
data/lib/rlm/runtime/signature_registry.rb +75 -0
data/lib/rlm/runtime.rb +352 -0
data/lib/rlm/sandbox/unsafe_in_process.rb +116 -0
data/lib/rlm/signature/dspy.rb +155 -0
data/lib/rlm/signature.rb +76 -0
data/lib/rlm/trace.rb +2 -0
data/lib/rlm/version.rb +1 -1
data/lib/rlm.rb +9 -0
metadata +66 -10

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: e52a1acbaea2c8a9955a4a75058e6e4d4780c16b327a9b28b68bb620c1ddb406
-  data.tar.gz: 75c6d410914cd74a4d949658acf9d2c747cf0a87b9ad4c43c80d6a7cbf9a417f
+  metadata.gz: 287e62dff4524b7ff81b1df336ee22d086dec01a9b16d82047ccb6d3cb6adce2
+  data.tar.gz: e3f19ce47df41aa2ac26d284f1cc13741aaff2703a9b0004a662107798c555ca
 SHA512:
-  metadata.gz: 9e5a488efd6e4ddbd0bb7e9b6df4df2624dd0c127244069120cc78b4e260432fd2e66453c2526734a5114f2c2cdb13220bfaffdc6cca3876282698b77a1c1c92
-  data.tar.gz: ee9901f38ffdc87df153319c7a723b801e441da54696118512dee03d5dcef22ccd1adb13f17db6bb785c99aeaa95b315bcd6bd017c2b3051928ab7c33c34d64f
+  metadata.gz: b34d4334b16262fa6e4a8bfb32168998fc89a3e77c47cc413356ff1f316b9186d88d25298eca0014637fd82098d2c713c41af69c0fa26e7983e436f633c708b1
+  data.tar.gz: a3136dfd0a13da006b8b719a1565af16cb1692b1eda272a174944608c01660d990260af50cf0d36284d6c846c7606317f343a3d3fdc02b76199a4bad0e326cf8
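
A `.gem` archive is a tar file whose members include `metadata.gz` and `data.tar.gz`, and `checksums.yaml` records one digest per member. The following is a minimal sketch of checking such a digest locally, assuming the members have already been unpacked to disk. The helper name `verify_member` and the stand-in file are illustrative and are not part of RubyGems or rlm-rb.

```ruby
require "digest/sha2"
require "tmpdir"
require "yaml"

# Map the algorithm names used in checksums.yaml to Ruby digest classes.
DIGEST_CLASSES = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }.freeze

# Hypothetical helper: compare a file's digest with the value recorded
# under the given algorithm in a parsed checksums.yaml hash.
def verify_member(checksums, algorithm, path)
  expected = checksums.fetch(algorithm).fetch(File.basename(path))
  actual = DIGEST_CLASSES.fetch(algorithm).file(path).hexdigest
  actual == expected
end

# Demo with a stand-in file so the sketch runs without a real gem.
Dir.mktmpdir do |dir|
  path = File.join(dir, "data.tar.gz")
  File.binwrite(path, "stand-in bytes")
  # The digest is quoted so YAML always reads it as a string.
  checksums = YAML.safe_load(<<~YAML)
    SHA256:
      data.tar.gz: "#{Digest::SHA256.hexdigest("stand-in bytes")}"
  YAML
  puts verify_member(checksums, "SHA256", path)
end
```

With a real gem, the expected values would come from the archive's own checksums file rather than being computed in the script.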

data/CHANGELOG.md CHANGED

@@ -7,6 +7,50 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.2.0] - 2026-05-15
+### Added
+- Shipped `examples/plain_ruby_invoice_extraction.rb` as an opt-in live plain Ruby smoke example for real RubyLLM
+  and dspy adapters.
+- `RLM::Lm::RubyLLM` provider adapter for root and sub-LM calls through RubyLLM.
+- `RLM::Signature::Dspy` adapter for wrapping dspy.rb signatures behind the existing RLM signature protocol.
+- `RLM::Signature.coerce_output` hook for normalizing parsed final output before validation.
+- Optional `usage` payloads on `:root_lm_called` and `:sub_lm_called` trace events for adapters that expose token
+  and cost metadata.
+- `RLM::CodeExtractor` for strict `<rlm-code>` / `<rlm-final>` response parsing.
+- `RLM::Lm::Mock` for deterministic runtime-spine tests.
+- `RLM::PromptBuilder` for deterministic strict prompt construction from signatures, inputs, context
+  manifests, and limits.
+- `RLM::Runtime::Bridge` for sandbox-exposed `predict`, `tool`, `submit`, `read_file`,
+  `list_files`, and `log` runtime services.
+- `RLM::Signature` protocol helpers for runtime-independent signature validation.
+- `RLM::Sandbox::UnsafeInProcess` for dev/test-only runtime-spine integration tests.
+- `RLM::Runtime` mock execution loop with prompt building, LM calls, code/final extraction,
+  sandbox execution, recursive subcalls, validation, budget policies, and `RLM::Result` output.
+- `RLM::Predict#call` now delegates to the runtime spine.
+- Budget enforcement expanded to `max_sub_lm_calls`, `max_tool_calls`, `max_cost_cents`, and `max_runtime_seconds`.
+- Budget policies are honored: `:fail`, `:needs_review`, and conservative `:return_partial` when a valid submitted
+  output already exists.
+- `trace_store` is forwarded into runtime as a best-effort callable hook receiving the terminal `RLM::Result`.
+- `RLM::ToolError` is preserved through sandbox execution and reported as `status: :tool_error`.
+- Trace event completeness: `:budget_checked` recorded at all budget checks, `:run_failed` recorded on all failure paths.
+- PromptBuilder v0.2 contract: signature description, input/output fields, available helpers, safety instructions.
+- Parse failures are deterministic and fail-closed (deferred repair attempts to future milestone).
+- Sandbox cleanup proven across all failure modes (success, validation, parse, provider, budget, sandbox errors).
+- `RLM::Sandbox::UnsafeInProcess` serializes process-global stream capture with a mutex while remaining dev/test-only
+  and unsuitable for production isolation.
+### Changed
+- Ruby compatibility now requires Ruby `>= 3.3.0` because dspy.rb support is part of the plain Ruby milestone.
+- Runtime final-output validation now runs after signature-level output coercion.
+### Fixed
+- Unknown RubyLLM provider costs are recorded as `cost_known: false`, contribute `0` cents for that call, and do not
+  crash cost accounting.
 ## [0.1.0] - 2026-05-12
 Skeleton release. Establishes the public types, configuration surface, sandbox
@@ -28,9 +72,8 @@ v0.2.
 - `RLM::Tool` base class with category DSL.
 - `RLM::Predict` skeleton (`#call` raises `NotImplementedError` until the runtime loop lands).
-### Not yet implemented (tracked for v0.2+)
+### Not yet implemented (tracked for future milestones)
-- Runtime execution loop, code extractor, runtime bridge, recursive `predict(...)`.
 - RubyLLM root/sub-LM adapters.
 - dspy.rb signature adapter and output validation.
 - `RLM::Sandbox::Subprocess` backend.
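
The "Fixed" entry above says that an unknown provider cost is recorded as `cost_known: false` and contributes `0` cents. A rough sketch of that accounting rule follows. The helper `usage_with_cost` is hypothetical and is not the gem's code; the payload keys follow the usage example in the README diff, and storing `0` in `cost_cents` for an unknown cost is an assumption made here for illustration.

```ruby
# Hypothetical helper, not the gem's implementation: build a usage payload
# from a provider-reported cost that may be nil when pricing is unknown.
def usage_with_cost(model_id:, input_tokens:, output_tokens:, cost_cents:)
  known = !cost_cents.nil?
  {
    model_id: model_id,
    input_tokens: input_tokens,
    output_tokens: output_tokens,
    cost_cents: known ? cost_cents : 0, # assumption: unknown cost stored as 0
    cost_known: known
  }
end

calls = [
  usage_with_cost(model_id: "priced-model", input_tokens: 120, output_tokens: 30, cost_cents: 2),
  usage_with_cost(model_id: "unpriced-model", input_tokens: 80, output_tokens: 10, cost_cents: nil)
]

# Summing never raises on nil because unknown costs were normalized to 0.
total_cents = calls.sum { |usage| usage[:cost_cents] }
all_known = calls.all? { |usage| usage[:cost_known] }
puts "total_cents=#{total_cents} all_costs_known=#{all_known}"
```

One consequence of this rule, also noted in the README diff, is that a `max_cost_cents` limit cannot account for calls whose cost is unknown.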

data/README.md CHANGED

@@ -3,16 +3,19 @@
 [![Gem Version](https://badge.fury.io/rb/rlm-rb.svg)](https://badge.fury.io/rb/rlm-rb)
 [![CI](https://github.com/dpaluy/rlm/actions/workflows/ci.yml/badge.svg)](https://github.com/dpaluy/rlm/actions/workflows/ci.yml)
-Recursive Language Models for Ruby and Rails.
+Recursive Language Models for Ruby.
-RLM.rb is a Ruby/Rails-native runtime for typed, sandboxed, auditable AI jobs over large application context.
-It depends on [RubyLLM](https://github.com/crmne/ruby_llm) for provider access and [dspy.rb](https://github.com/vicentereig/dspy.rb)
-for typed signatures, and adds the missing recursive execution runtime: sandbox, REPL loop, file and context mounting,
-recursive sub-LM calls, typed final output, budget controls, and durable trajectories.
+RLM.rb is a Ruby runtime for typed, sandbox-oriented, auditable AI jobs over large application context.
+It integrates with [RubyLLM](https://github.com/crmne/ruby_llm) for provider access and
+[dspy.rb](https://github.com/vicentereig/dspy.rb) for typed signatures. The current plain Ruby milestone includes the
+recursive execution spine: prompt loop, file and context mounting, recursive sub-LM calls, typed final output, budget
+controls, trace events, a RubyLLM LM adapter, a dspy signature adapter, and a minimal trace persistence hook.
-> **Status: v0.1.0 skeleton.** Core types are in place. The runtime loop, provider adapters, signature adapter,
-> subprocess sandbox, and Rails integration are not yet implemented and are tracked in the v0.2 milestone in
-> `docs/prd.md`. `RLM::Predict#call` raises `NotImplementedError` in this release.
+> **Status: Plain Ruby adapter milestone.** The released gem is v0.2.0. It includes `RLM::Lm::RubyLLM`,
+> `RLM::Signature::Dspy`, `RLM::Lm::Mock`, `RLM::Sandbox::UnsafeInProcess`, budget enforcement and budget policies,
+> trace events, recursive `predict`, prompt building, and a best-effort `trace_store` callable hook.
+> Rails integration, subprocess/container sandboxing, tools, skills, cache, telemetry, and evals remain future
+> milestones. `UnsafeInProcess` is dev/test-only and executes generated code in the host Ruby process.
 ## Why
@@ -25,6 +28,9 @@ typed LLM functions only when needed, and returns validated Ruby objects with a
 ## Install
+RLM.rb requires Ruby 3.3 or newer. Ruby 3.2 and older are not supported because dspy.rb is mandatory for the plain
+Ruby adapter milestone.
 Add the gem to your Gemfile:
 ```ruby
@@ -41,9 +47,8 @@ gem install rlm-rb
 ```ruby
 RLM.configure do |config|
-  # Provider adapters land in the next milestone.
-  # config.root_lm = RubyLLM.chat(model: "anthropic/claude-sonnet-4")
-  # config.sub_lm  = RubyLLM.chat(model: "openai/gpt-5-mini")
+  config.root_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
+  config.sub_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
   config.sandbox = RLM::Sandbox::Mock.new
@@ -58,14 +63,20 @@ RLM.configure do |config|
 end
 ```
-## Intended API (not yet executable)
+`RLM::Lm::RubyLLM` creates a fresh `RubyLLM.chat` for each runtime LM call. That keeps RLM prompts standalone and
+prevents conversation history from leaking between root and sub-model calls.
+## Plain Ruby API
 ```ruby
+require "dspy"
+require "rlm"
 class InvoiceExtraction < DSPy::Signature
   description "Extract normalized invoice fields from a vendor invoice."
   input do
-    const :invoice_pdf, RLM::File
+    const :invoice_text, String
     const :vendor_id, Integer
   end
@@ -73,29 +84,110 @@ class InvoiceExtraction < DSPy::Signature
     const :vendor_name, String
     const :invoice_number, String
     const :total_cents, Integer
-    const :confidence, Float
-    const :needs_review, T::Boolean
   end
 end
+RLM.configure do |config|
+  config.root_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
+  config.sub_lm = RLM::Lm::RubyLLM.new(model: "gpt-5-mini")
+  config.sandbox = RLM::Sandbox::UnsafeInProcess.new # dev/test only
+end
+signature = RLM::Signature::Dspy.new(InvoiceExtraction)
 result = RLM.predict(
-  InvoiceExtraction,
+  signature,
   input: {
-    invoice_pdf: RLM::File.from_path("invoice.pdf"),
+    invoice_text: "Vendor: Acme\nInvoice: INV-001\nTotal: $100.00",
     vendor_id: 123
   },
-  max_iterations: 10,
-  max_llm_calls: 30,
-  max_cost_cents: 150
+  limits: RLM::Limits.new(max_iterations: 8, max_llm_calls: 25)
 )
-result.output           # typed object
-result.trace            # readable steps, llm calls, tool calls
+result.output
+# => { vendor_name: "Acme", invoice_number: "INV-001", total_cents: 10000 }
+result.trace.events.find { |event| event[:type] == :root_lm_called }[:payload][:usage]
+# => { model_id: "...", input_tokens: ..., output_tokens: ..., cost_cents: ..., cost_known: true }
+```
+Usage metadata is recorded on `:root_lm_called` and `:sub_lm_called` trace events when an adapter exposes it. It is not
+duplicated onto `RLM::Result` in this milestone. RubyLLM cost helpers can return `nil` when model pricing is unknown;
+RLM records `cost_known: false`, contributes `0` cents for that call, and cannot enforce unknown provider cost.
+## Run a Live Plain Ruby Example
+The gem ships one opt-in live example at `examples/plain_ruby_invoice_extraction.rb`. By default it exits before
+provider credential checks, LM configuration, or `RLM.predict`, even if provider credentials are already present:
+```bash
+bundle exec ruby examples/plain_ruby_invoice_extraction.rb
+```
+To run the live path, configure provider credentials and opt in explicitly:
+```bash
+RLM_RUN_LIVE_EXAMPLE=1 OPENAI_API_KEY="$OPENAI_API_KEY" \
+  bundle exec ruby examples/plain_ruby_invoice_extraction.rb
+```
+The example uses `RLM::Lm::RubyLLM` for root and sub-LM calls, wraps a real `DSPy::Signature` with
+`RLM::Signature::Dspy`, calls the public `RLM.predict(...)` API, and prints result status, typed output, trace id, cost,
+and usage payloads when RubyLLM exposes them. Set `RLM_EXAMPLE_MODEL` and `RLM_EXAMPLE_SUB_MODEL` to override the
+default model.
+The live example uses `RLM::Sandbox::UnsafeInProcess`, which is dev/test-only and runs generated Ruby code in the host
+process. Rails integration, subprocess/container sandboxing, tools, skills, evals, telemetry, and production execution
+examples remain future milestones.
+## Mock Runtime API
+```ruby
+class InvoiceExtraction
+  def self.name = "InvoiceExtraction"
+  def self.description = "Extract normalized invoice fields from a vendor invoice."
+  def self.input_fields = { invoice_pdf: :file, vendor_id: :integer }
+  def self.output_fields = { vendor_name: :string, invoice_number: :string, total_cents: :integer }
+  def self.validate_input(input) = input.key?(:vendor_id) ? [] : ["vendor_id is required"]
+  def self.validate_output(output) = output.key?(:vendor_name) ? [] : ["vendor_name is required"]
+end
+# Mock LM for testing (no provider needed)
+lm = RLM::Lm::Mock.new(responses: ['<rlm-final>{"vendor_name":"Acme","invoice_number":"INV-001","total_cents":10000}</rlm-final>'])
+result = RLM.predict(
+  InvoiceExtraction,
+  input: { vendor_id: 123 },
+  lm: lm,
+  sandbox: RLM::Sandbox::UnsafeInProcess.new,  # dev/test only: executes in host process
+  limits: RLM::Limits.new(max_iterations: 8, max_llm_calls: 25)
+)
+result.output           # { "vendor_name" => "Acme", ... }
+result.trace            # full event stream
 result.cost_cents       # accumulated cost
-result.status           # :completed, :needs_review, :budget_exceeded, ...
+result.status           # :completed, :budget_exceeded, :failed_validation, ...
 ```
-## What's in this skeleton today
+## dspy Signature Adapter
+`RLM::Signature::Dspy` wraps a `DSPy::Signature` class behind RLM's internal signature protocol:
+- `description`
+- `input_fields`
+- `output_fields`
+- `validate_input`
+- `validate_output`
+- `coerce_output`
+The adapter derives fields and simple validation from dspy JSON schema metadata. Output coercion normalizes parsed
+JSON/hash output to schema keys before validation.
+## Rails
+Rails integration is not yet implemented. Rails remains a v2 milestone tracked in `docs/postponed-issues.md`.
+## What's Implemented
 | Component | Status |
 |-----------|--------|
@@ -106,26 +198,34 @@ result.status           # :completed, :needs_review, :budget_exceeded, ...
 | `RLM::Trace` with NDJSON / JSON export | Ready |
 | `RLM::Result` with full status enum | Ready |
 | `RLM::Sandbox::Base` interface + `Mock` backend | Ready |
+| `RLM::Sandbox::UnsafeInProcess` | Ready for dev/test only; executes in host process and mutates global streams during serialized capture |
 | `RLM::Tool` base class with category DSL | Ready |
 | Error hierarchy | Ready |
-| `RLM::Predict` skeleton | Stub, raises on `#call` |
-| RubyLLM provider adapter | Not yet |
-| dspy.rb signature adapter | Not yet |
-| Runtime execution loop + recursive `predict` | Not yet |
-| `RLM::Sandbox::Subprocess` | Not yet |
-| Rails Railtie, generator, migrations, ActiveStorage adapter | Not yet |
-See `docs/prd.md` for the full product spec and v0.2 milestone list.
-## Rails setup (intended, lands in v0.3)
+| `RLM::Predict#call` | Delegates to `RLM::Runtime` |
+| `RLM::Runtime` mock loop | Ready (with `RLM::Lm::Mock`) |
+| `RLM::PromptBuilder` | Ready (v0.2 contract) |
+| `RLM::CodeExtractor` | Ready |
+| `RLM::Runtime::Bridge` | Ready for runtime-owned subcalls, tools, submission, file reads, and logging |
+| Budget enforcement and policies (`max_llm_calls`, `max_sub_lm_calls`, `max_tool_calls`, `max_iterations`, `max_cost_cents`, `max_runtime_seconds`, `on_budget_exceeded`) | Ready |
+| `trace_store` callable hook | Ready (best-effort; receives terminal `RLM::Result`) |
+| Recursive `predict` + depth limit | Ready |
+| `RLM::Lm::RubyLLM` provider adapter | Ready |
+| `RLM::Signature::Dspy` signature adapter | Ready |
+| Trace usage metadata for RubyLLM calls | Ready |
+| `RLM::Sandbox::Subprocess` | Future milestone |
+| Rails Railtie, generator, migrations, ActiveStorage adapter | Future milestone |
+The table above reflects the current unreleased plain Ruby adapter implementation status.
+## Rails setup (intended v2 milestone)
 The Rails integration is not yet implemented, but the intended setup is:
 ```ruby
 # config/initializers/rlm.rb
 RLM.configure do |config|
-  config.root_lm = RubyLLM.chat(model: Rails.application.credentials.dig(:rlm, :root_model))
-  config.sub_lm  = RubyLLM.chat(model: Rails.application.credentials.dig(:rlm, :sub_model))
+  config.root_lm = RLM::Lm::RubyLLM.new(model: Rails.application.credentials.dig(:rlm, :root_model))
+  config.sub_lm = RLM::Lm::RubyLLM.new(model: Rails.application.credentials.dig(:rlm, :sub_model))
   config.sandbox = RLM::Sandbox::Subprocess.new   # development
   # config.sandbox = RLM::Sandbox::Docker.new     # production (v0.4)
@@ -173,9 +273,6 @@ rescue RLM::ToolError => e
 rescue RLM::ParseError => e
   # Root LM response could not be parsed into <rlm-code>/<rlm-final>.
   raise
-rescue RLM::NoProgressError => e
-  # The model emitted no new progress across iterations.
-  raise
 rescue RLM::ConfigurationError => e
   # Missing signature, missing root LM, invalid sandbox, etc.
   raise
@@ -186,33 +283,38 @@ end
 ```
 Soft failures land on `result.status` instead of raising. Inspect `result.success?`, `result.needs_review?`,
-`result.failed?`, and `result.validation_errors` to branch.
+`result.failed?`, and `result.validation_errors` to branch. Budget handling honors `limits.on_budget_exceeded`:
+`:fail` returns `:budget_exceeded`, `:needs_review` returns `:needs_review`, and `:return_partial` returns
+`:needs_review` only when a valid submitted output already exists; otherwise it fails as `:budget_exceeded`.
 | Status | Predicate | Meaning |
 |--------|-----------|---------|
 | `:completed` | `success?` | Output valid, ready to use. |
-| `:needs_review` | `needs_review?` | Output present but validation flagged it or budget policy is `:needs_review`. |
-| `:failed_validation` | `failed?` | Output invalid after repair attempts. |
-| `:budget_exceeded` | `failed?` | Hit a hard limit and policy is `:fail`. |
+| `:needs_review` | `needs_review?` | Budget policy requested review, optionally with a valid submitted partial output. |
+| `:failed_validation` | `failed?` | Output invalid after validation. |
+| `:budget_exceeded` | `failed?` | Hit a hard limit with `:fail`, or `:return_partial` had no valid submitted output. |
 | `:sandbox_error` | `failed?` | Sandbox violation or crash. |
 | `:tool_error` | `failed?` | Tool raised or returned invalid output. |
 | `:provider_error` | `failed?` | RubyLLM provider failure. |
 | `:aborted` | `failed?` | Run cancelled by caller. |
-## Production safety (when the runtime loop ships)
+## Production safety
-- The subprocess sandbox planned for v0.2 is intended for local development and low-risk internal use.
-- Production deployments should use the Docker sandbox (v0.4) or a remote isolated runner.
-- Generated code must not execute inside the host Ruby process. The codebase will hold this invariant.
-- Mounted files are data, not instructions. Prompt injection mitigations are documented in the PRD.
+- `RLM::Sandbox::UnsafeInProcess` executes generated code in the host Ruby process. It is dev/test-only and unsafe.
+- `UnsafeInProcess` captures `$stdout`/`$stderr` by mutating process-global streams; capture is serialized with a mutex,
+  but the sandbox remains unsuitable for production and should not be treated as concurrency-safe isolation.
+- The subprocess sandbox is a future milestone for local development.
+- Production deployments should use a container sandbox or remote isolated runner (future milestone).
+- Generated code must not execute inside the host Ruby process in production. The codebase will hold this invariant.
+- Mounted files are data, not instructions; generated code should treat file contents as untrusted input.
 ## Development
 ```bash
-bundle install
-bundle exec rake test       # 58 runs / 139 assertions / 0 failures
-bundle exec rubocop         # lint
-bundle exec rake            # test + rubocop
+zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle install'
+zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle exec rake test'
+zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle exec rubocop'
+zsh -lc 'source ~/.zshrc && eval "$(mise activate zsh)" && bundle exec rake'
 ```
 ## Contributing
@@ -221,10 +323,10 @@ Issues and pull requests welcome at https://github.com/dpaluy/rlm.
 ## API reference
-RLM.rb sits on top of two upstream libraries. When you need provider or signature details, go to source:
+RLM.rb integrates with these upstream libraries. For provider or signature details, go to source:
-- [RubyLLM](https://github.com/crmne/ruby_llm), [Rails integration guide](https://rubyllm.com/rails/) for provider/chat/file API.
-- [dspy.rb](https://github.com/vicentereig/dspy.rb), [Signatures guide](https://vicentereig.github.io/dspy.rb/core-concepts/signatures/) for typed input/output contracts.
+- [RubyLLM](https://github.com/crmne/ruby_llm), [chat guide](https://rubyllm.com/chat/) for provider, chat, token, and cost APIs.
+- [dspy.rb](https://github.com/vicentereig/dspy.rb), [Signatures guide](https://oss.vicente.services/dspy.rb/core-concepts/signatures/) for typed input/output contracts.
 - The [Recursive Language Models](https://github.com/alexzhang13/rlm) reference implementation and the
   [DSPy RLM module](https://dspy.ai/api/modules/RLM/) for the underlying idea.
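
The README diff above describes how `limits.on_budget_exceeded` maps to a result status. That mapping can be sketched as a small decision function. The name `status_for_budget_exceeded` and its keyword argument are hypothetical; only the policy and status symbols come from the README text.

```ruby
# Hypothetical helper mirroring the README's description of budget policies.
# `valid_submitted_output` says whether a valid output was already submitted.
def status_for_budget_exceeded(policy, valid_submitted_output:)
  case policy
  when :fail then :budget_exceeded
  when :needs_review then :needs_review
  when :return_partial
    # Conservative: a partial result is only surfaced when it already validated.
    valid_submitted_output ? :needs_review : :budget_exceeded
  else
    raise ArgumentError, "unknown budget policy: #{policy.inspect}"
  end
end

%i[fail needs_review return_partial].each do |policy|
  [true, false].each do |valid|
    status = status_for_budget_exceeded(policy, valid_submitted_output: valid)
    puts "#{policy} (valid output: #{valid}) -> #{status}"
  end
end
```

Note that `:return_partial` never produces `:completed` in this description: a run that hit a limit is either flagged for review or fails.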

data/examples/plain_ruby_invoice_extraction.rb ADDED

@@ -0,0 +1,85 @@
+# frozen_string_literal: true
+require "bundler/setup"
+require "dspy"
+require "json"
+require "rlm"
+class InvoiceExtraction < DSPy::Signature
+  description "Extract normalized invoice fields from a vendor invoice."
+  input do
+    const :invoice_text, String
+    const :vendor_id, Integer
+  end
+  output do
+    const :vendor_name, String
+    const :invoice_number, String
+    const :total_cents, Integer
+  end
+end
+def live_example_enabled?
+  ENV["RLM_RUN_LIVE_EXAMPLE"] == "1"
+end
+def provider_configured?
+  !ENV["OPENAI_API_KEY"].to_s.empty?
+end
+def print_skipped_message
+  puts "Skipped live RLM example."
+  puts "Set RLM_RUN_LIVE_EXAMPLE=1 and OPENAI_API_KEY to run a real RubyLLM provider call."
+  puts "Optional: set RLM_EXAMPLE_MODEL and RLM_EXAMPLE_SUB_MODEL to override the default model."
+end
+def usage_events(result)
+  result.trace.events.select { |event| %i[root_lm_called sub_lm_called].include?(event[:type]) }
+end
+unless live_example_enabled?
+  print_skipped_message
+  exit 0
+end
+unless provider_configured?
+  warn "RLM_RUN_LIVE_EXAMPLE=1 is set, but OPENAI_API_KEY is missing."
+  warn "Configure provider credentials before running the live example."
+  exit 1
+end
+root_model = ENV.fetch("RLM_EXAMPLE_MODEL", "gpt-5-mini")
+sub_model = ENV.fetch("RLM_EXAMPLE_SUB_MODEL", root_model)
+RLM.configure do |config|
+  config.root_lm = RLM::Lm::RubyLLM.new(model: root_model)
+  config.sub_lm = RLM::Lm::RubyLLM.new(model: sub_model)
+  # Dev/test only: UnsafeInProcess runs generated Ruby code in this host process.
+  config.sandbox = RLM::Sandbox::UnsafeInProcess.new
+end
+signature = RLM::Signature::Dspy.new(InvoiceExtraction)
+result = RLM.predict(
+  signature,
+  input: {
+    invoice_text: "Vendor: Acme Supplies\nInvoice: INV-001\nTotal: $100.00",
+    vendor_id: 123
+  },
+  limits: RLM::Limits.new(max_iterations: 8, max_llm_calls: 25, max_recursion_depth: 1)
+)
+puts "status: #{result.status}"
+puts "trace_id: #{result.trace.id}"
+puts "cost_cents: #{result.cost_cents}"
+puts "output:"
+puts JSON.pretty_generate(result.output)
+usage_events(result).each do |event|
+  next unless event[:payload][:usage]
+  puts "#{event[:type]} usage:"
+  puts JSON.pretty_generate(event[:payload][:usage])
+end

data/lib/rlm/code_extractor.rb ADDED

@@ -0,0 +1,125 @@
+# frozen_string_literal: true
+require "json"
+module RLM
+  class CodeExtractor
+    CODE_OPEN = "<rlm-code>"
+    CODE_CLOSE = "</rlm-code>"
+    FINAL_OPEN = "<rlm-final>"
+    FINAL_CLOSE = "</rlm-final>"
+    KNOWN_TAG_PATTERN = %r{</?rlm-(?:code|final)>}
+    TYPES = %i[code final].freeze
+    class Result
+      attr_reader :type, :content
+      def initialize(type:, content:)
+        raise ArgumentError, "Unknown code extraction result type: #{type.inspect}" unless TYPES.include?(type)
+        @type = type
+        @content = content
+      end
+      def code?
+        type == :code
+      end
+      def final?
+        type == :final
+      end
+      def to_h
+        {
+          type: type,
+          content: content
+        }
+      end
+    end
+    def self.extract(response)
+      new.extract(response)
+    end
+    def extract(response)
+      raise ParseError, "response must be a String" unless response.is_a?(String)
+      tags = scan_tags(response)
+      raise ParseError, "response must contain one rlm-code or rlm-final block" if tags.empty?
+      type = block_type_for(tags)
+      block = extract_block(response, tags, type)
+      Result.new(type: type, content: parse_content(type, block))
+    end
+    private
+    def extract_block(response, tags, type)
+      open_tag, close_tag = tags_for(type)
+      opening, closing = matching_tags(tags, open_tag, close_tag)
+      raise ParseError, "#{close_tag} must appear after #{open_tag}" if closing[:begin] < opening[:end]
+      reject_non_whitespace_outside_block!(response, opening, closing)
+      content = response[opening[:end]...closing[:begin]]
+      reject_nested_tags!(content)
+      content
+    end
+    def matching_tags(tags, open_tag, close_tag)
+      open_tags = tags.select { |tag| tag[:text] == open_tag }
+      close_tags = tags.select { |tag| tag[:text] == close_tag }
+      raise ParseError, "response must contain exactly one #{open_tag} tag" unless open_tags.one?
+      raise ParseError, "response must contain exactly one #{close_tag} tag" unless close_tags.one?
+      [open_tags.first, close_tags.first]
+    end
+    def scan_tags(response)
+      response.to_enum(:scan, KNOWN_TAG_PATTERN).map do
+        match = Regexp.last_match
+        { text: match[0], begin: match.begin(0), end: match.end(0) }
+      end
+    end
+    def block_type_for(tags)
+      has_code = tags.any? { |tag| [CODE_OPEN, CODE_CLOSE].include?(tag[:text]) }
+      has_final = tags.any? { |tag| [FINAL_OPEN, FINAL_CLOSE].include?(tag[:text]) }
+      raise ParseError, "response must not mix rlm-code and rlm-final blocks" if has_code && has_final
+      has_code ? :code : :final
+    end
+    def tags_for(type)
+      case type
+      when :code then [CODE_OPEN, CODE_CLOSE]
+      when :final then [FINAL_OPEN, FINAL_CLOSE]
+      else raise ParseError, "unknown block type: #{type.inspect}"
+      end
+    end
+    def reject_non_whitespace_outside_block!(response, opening, closing)
+      before = response[0...opening[:begin]]
+      after = response[closing[:end]..]
+      return if before.match?(/\A\s*\z/) && after.match?(/\A\s*\z/)
+      raise ParseError, "response must contain only one rlm block and surrounding whitespace"
+    end
+    def reject_nested_tags!(content)
+      return unless content.match?(KNOWN_TAG_PATTERN)
+      raise ParseError, "rlm blocks must not contain nested rlm tags"
+    end
+    def parse_content(type, content)
+      return content if type == :code
+      JSON.parse(content)
+    rescue JSON::ParserError => e
+      raise ParseError, "invalid JSON in rlm-final block: #{e.message}"
+    end
+  end
+end
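
The extractor added above is fail-closed: it accepts exactly one `<rlm-code>` or `<rlm-final>` block surrounded only by whitespace and rejects everything else. The standalone sketch below approximates those rules with a single anchored regular expression so the accepted and rejected shapes can be tried without installing the gem. `strict_extract` and `StrictParseError` are hypothetical names, and this is a condensed approximation rather than the gem's implementation (its error messages and error class differ).

```ruby
require "json"

StrictParseError = Class.new(StandardError) # stand-in for the gem's ParseError

# One block, matching open/close tags, only whitespace outside the block.
STRICT_BLOCK = %r{\A\s*<rlm-(code|final)>(.*?)</rlm-\1>\s*\z}m
ANY_RLM_TAG = %r{</?rlm-(?:code|final)>}

def strict_extract(response)
  match = STRICT_BLOCK.match(response)
  raise StrictParseError, "expected exactly one rlm block" unless match

  type = match[1].to_sym
  body = match[2]
  # A second block or a stray tag ends up inside the captured body.
  raise StrictParseError, "nested or repeated rlm tags" if body.match?(ANY_RLM_TAG)

  content = type == :final ? JSON.parse(body) : body
  { type: type, content: content }
rescue JSON::ParserError => e
  raise StrictParseError, "invalid JSON in rlm-final block: #{e.message}"
end

accepted = strict_extract('<rlm-final>{"total_cents":10000}</rlm-final>')
puts "accepted #{accepted[:type]} with total_cents=#{accepted[:content]["total_cents"]}"

[
  "Sure! <rlm-code>puts 1</rlm-code>",                 # prose outside the block
  "<rlm-code>a</rlm-code><rlm-final>{}</rlm-final>",   # mixed block types
  "<rlm-final>not json</rlm-final>"                    # final payload is not JSON
].each do |bad|
  begin
    strict_extract(bad)
  rescue StrictParseError => e
    puts "rejected: #{e.message}"
  end
end
```

Rejecting any text outside the block means a chatty model response fails parsing instead of being partially executed, which matches the changelog's statement that parse failures are deterministic and fail-closed.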

data/lib/rlm/file.rb CHANGED

@@ -85,7 +85,7 @@ module RLM
       when :path then ::File.read(source[:path])
       when :text, :io then source[:text]
       when :active_storage then source[:blob].download
-      else raise SandboxError, "Unknown file source kind: #{source[:kind].inspect}"
+      else raise ConfigurationError, "Unknown file source kind: #{source[:kind].inspect}"
       end
     end

data/lib/rlm/lm/mock.rb ADDED

@@ -0,0 +1,45 @@
+# frozen_string_literal: true
+module RLM
+  module Lm
+    class Mock
+      attr_reader :prompts, :cost_cents
+      def initialize(responses:, cost_cents: 0)
+        @responses = Array(responses).dup.freeze
+        raise ArgumentError, "responses must not be empty" if @responses.empty?
+        @cost_cents_per_call = cost_cents
+        @cost_cents = 0
+        @prompts = []
+        @index = 0
+      end
+      def call(prompt:, **)
+        raise ProviderError, "prompt must be a String" unless prompt.is_a?(String)
+        raise ProviderError, "mock LM responses exhausted" if exhausted?
+        prompts << prompt
+        @cost_cents += @cost_cents_per_call
+        response = @responses.fetch(@index)
+        @index += 1
+        response
+      end
+      def call_count
+        prompts.length
+      end
+      def last_prompt
+        prompts.last
+      end
+      private
+      def exhausted?
+        @index >= @responses.length
+      end
+    end
+  end
+end
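
The mock above returns scripted responses in order and records every prompt, which makes a multi-step run deterministic in tests. The sketch below shows the pattern in isolation. `ScriptedLm` is a hypothetical stand-in with the same `call(prompt:, **)` shape, and `run_until_final` is a toy driver written for this illustration; it is not `RLM::Runtime` and does not execute any code blocks.

```ruby
# Hypothetical stand-in with the same call shape as the mock above:
# returns scripted responses in order and records each prompt.
class ScriptedLm
  attr_reader :prompts

  def initialize(responses)
    @responses = responses.dup.freeze
    @prompts = []
  end

  def call(prompt:, **)
    raise IndexError, "scripted responses exhausted" if @prompts.length >= @responses.length

    @prompts << prompt
    @responses.fetch(@prompts.length - 1)
  end
end

# Toy driver (not RLM::Runtime): keep asking until a final block appears
# or the iteration limit is reached. Returns nil when no final was seen.
def run_until_final(lm, max_iterations:)
  max_iterations.times do |i|
    response = lm.call(prompt: "iteration #{i + 1}")
    return response if response.start_with?("<rlm-final>")
  end
  nil
end

lm = ScriptedLm.new([
  "<rlm-code>puts 1 + 1</rlm-code>",
  '<rlm-final>{"answer":2}</rlm-final>'
])
final = run_until_final(lm, max_iterations: 8)
puts final
puts "calls: #{lm.prompts.length}"
```

Because the responses are fixed, a test can assert both the final payload and the exact number of LM calls, and an unexpected extra call surfaces as an exhaustion error instead of a silent retry.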