RubyGems - gemlings - Versions diffs - 0.3.0 - Mend

gemlings 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (36)

checksums.yaml +7 -0
data/.ruby-version +1 -0
data/CHANGELOG.md +39 -0
data/LICENSE +21 -0
data/README.md +239 -0
data/ROADMAP.md +56 -0
data/Rakefile +6 -0
data/demo.gif +0 -0
data/examples/custom_tool.rb +24 -0
data/examples/fibonacci.rb +7 -0
data/examples/with_tools.rb +12 -0
data/exe/gemlings +6 -0
data/gemlings.gemspec +44 -0
data/lib/gemlings/agent.rb +316 -0
data/lib/gemlings/callback.rb +12 -0
data/lib/gemlings/cli.rb +146 -0
data/lib/gemlings/code_agent.rb +83 -0
data/lib/gemlings/errors.rb +21 -0
data/lib/gemlings/mcp.rb +128 -0
data/lib/gemlings/memory.rb +241 -0
data/lib/gemlings/model.rb +99 -0
data/lib/gemlings/models/ruby_llm_adapter.rb +158 -0
data/lib/gemlings/prompt.rb +123 -0
data/lib/gemlings/sandbox.rb +142 -0
data/lib/gemlings/tool.rb +124 -0
data/lib/gemlings/tool_calling_agent.rb +86 -0
data/lib/gemlings/tools/file_read.rb +20 -0
data/lib/gemlings/tools/file_write.rb +22 -0
data/lib/gemlings/tools/list_gems.rb +15 -0
data/lib/gemlings/tools/user_input.rb +16 -0
data/lib/gemlings/tools/visit_webpage.rb +44 -0
data/lib/gemlings/tools/web_search.rb +43 -0
data/lib/gemlings/ui.rb +245 -0
data/lib/gemlings/version.rb +5 -0
data/lib/gemlings.rb +21 -0
metadata +219 -0

checksums.yaml ADDED

@@ -0,0 +1,7 @@
+---
+SHA256:
+  metadata.gz: 6740ac8c9242e1493c8196521b37a151adf1a753fcea54307e9c4f7c2b59b6d9
+  data.tar.gz: 923dc72b8683652654b4b0f5514689406585000016b38cbd47af7fce2930a80c
+SHA512:
+  metadata.gz: f1b904c2b1406d410974bbf4c36a317a6b5c5d2282a9f2d507d4d32d67454219641e423580db9f48c368735aedf352ccb813bd8e55e18edda59dcf5e54c7b1fd
+  data.tar.gz: bbf5e3731dca63565e24a7fb4200343956441486cb1474859a34643be22be9716e30cde408eb9bc34fe1b3ea10ae96472bca86c0c6ff829b3e0da5b41d81bc2f

data/.ruby-version ADDED

@@ -0,0 +1 @@
+3.4.8

data/CHANGELOG.md ADDED

@@ -0,0 +1,39 @@
+# Changelog
+## 0.3.0
+JRuby 10 support, interactive UI, and CI.
+- **JRuby 10 support** -- Thread-based executor for JRuby where fork is unavailable; lipgloss gracefully skipped via `rescue LoadError`
+- **Interactive status lines** -- Spinner shows "Executing..." / "Running tool_name..." during execution, resolves in-place to green dot (success) or red dot (error)
+- **GitHub Actions CI** -- Build matrix with Ruby 3.2, 3.3, 3.4, and JRuby 10.0
+## 0.2.0
+Tools, MCP filtering, observability, and code agent improvements.
+- **FileRead tool** -- Read file contents with path expansion and 50k char truncation
+- **FileWrite tool** -- Write files with automatic parent directory creation
+- **ListGems tool** -- Lists available Ruby gems; auto-included in CodeAgent
+- **`tool_from_mcp`** -- Load a single tool from an MCP server by name
+- **Run export** -- `to_h` / `to_json` on Memory, RunResult, and all step types for serialization
+- **Richer callbacks** -- `Callback` base class with `on_run_start`, `on_step_start`, `on_step_end`, `on_tool_call`, `on_error`, `on_run_end`; backward-compatible with existing `step_callbacks`
+- **`memory.replay`** -- Pretty-print a completed run with syntax-highlighted code and metrics
+- **Code agent prompt** -- Tells the model about available Ruby stdlib and gems so it uses `net/http`, `json`, etc. without being asked
+## 0.1.0
+Initial release.
+- **CodeAgent** -- LLM writes and executes Ruby code in a sandboxed fork
+- **ToolCallingAgent** -- LLM calls tools via structured tool_calls (OpenAI-style)
+- **Model adapters** -- OpenAI, Anthropic, and Ollama out of the box
+- **Tool DSL** -- Define tools as classes or inline blocks
+- **MCP client** -- Load tools from any MCP server via stdio transport
+- **Structured output** -- Validate final answers against JSON Schema or custom procs
+- **Prompt customization** -- Override system prompts, planning prompts, or inject instructions
+- **Final answer checks** -- Validation procs that can reject and retry answers
+- **Step-by-step execution** -- `agent.step()` for debugging and custom UIs
+- **Planning** -- Optional periodic re-planning during long runs
+- **Managed agents** -- Nest agents as tools for multi-agent workflows
+- **CLI** -- `gemlings` command with interactive mode, tool loading, and MCP support
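
The 0.3.0 entry above mentions a thread-based executor for JRuby, where fork is unavailable. A minimal sketch of that kind of fallback follows. `run_isolated` and `run_in_thread` are hypothetical names chosen for illustration; this is not the code in `lib/gemlings/sandbox.rb`, which may work differently.

```ruby
require "timeout"

# Hypothetical helper, not part of gemlings: runs a block in a forked child
# when the platform supports fork, otherwise falls back to a thread.
def run_isolated(timeout: 5, &block)
  return run_in_thread(timeout, &block) unless Process.respond_to?(:fork)

  reader, writer = IO.pipe
  pid = fork do
    reader.close
    # Capture either the result or the error message so the parent can report it.
    payload = begin
      [:ok, block.call]
    rescue StandardError => e
      [:error, e.message]
    end
    writer.write(Marshal.dump(payload))
    writer.close
    exit!(0) # skip at_exit handlers inherited from the parent
  end
  writer.close

  begin
    status, value = Timeout.timeout(timeout) { Marshal.load(reader.read) }
  rescue Timeout::Error
    Process.kill("KILL", pid) # the child is still running; stop it
    raise
  ensure
    reader.close
    Process.wait(pid) # always reap the child
  end
  raise value if status == :error

  value
end

# Thread fallback for runtimes without fork (for example JRuby).
def run_in_thread(timeout, &block)
  thread = Thread.new(&block)
  return thread.value if thread.join(timeout)

  thread.kill
  raise Timeout::Error, "execution expired"
end

answer = run_isolated { 6 * 7 }
```

The fork path gives the block its own process, so the result has to be something `Marshal` can serialize. The thread path shares memory with the caller, so it offers a timeout but no isolation.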

data/LICENSE ADDED

@@ -0,0 +1,21 @@
+MIT License
+Copyright (c) 2025 Chris Hasiński
+Permission is hereby granted, free of charge, to any person obtaining a copy
+of this software and associated documentation files (the "Software"), to deal
+in the Software without restriction, including without limitation the rights
+to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+copies of the Software, and to permit persons to whom the Software is
+furnished to do so, subject to the following conditions:
+The above copyright notice and this permission notice shall be included in all
+copies or substantial portions of the Software.
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+SOFTWARE.

data/README.md ADDED

@@ -0,0 +1,239 @@
+# Gemlings
+[![Gem Version](https://badge.fury.io/rb/gemlings.svg)](https://rubygems.org/gems/gemlings)
+[![CI](https://github.com/khasinski/gemlings/actions/workflows/ci.yml/badge.svg)](https://github.com/khasinski/gemlings/actions/workflows/ci.yml)
+*Small, autonomous agents running Ruby snippets.*
+Your LLM writes and executes Ruby code -- not JSON blobs. Tool calls are method calls, variables persist between steps, and the full power of Ruby is available to the agent at every turn. Inspired by [smolagents](https://github.com/huggingface/smolagents).
+![gemlings demo](demo.gif)
+## Quick start
+```bash
+gem install gemlings
+```
+```ruby
+require "gemlings"
+agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+agent.run("What is the 118th Fibonacci number?")
+```
+The agent thinks, writes Ruby, executes it in a sandbox, and returns the answer.
+## Agent types
+**CodeAgent** writes Ruby code that runs in a sandboxed fork (MRI) or thread (JRuby). Tools are methods the model can call directly. Variables persist across steps.
+```ruby
+agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+```
+**ToolCallingAgent** uses structured tool calls (OpenAI function calling style). Better for models with strong native tool support.
+```ruby
+agent = Gemlings::ToolCallingAgent.new(model: "openai/gpt-4o")
+```
+## Models
+Pass `provider/model_name`. Supports Anthropic, OpenAI, Google Gemini, DeepSeek, OpenRouter, and Ollama.
+```ruby
+Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+Gemlings::CodeAgent.new(model: "openai/gpt-4o")
+Gemlings::CodeAgent.new(model: "ollama/qwen2.5:3b")
+```
+Set API keys via environment variables: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, etc.
+## Tools
+Six built-in tools: `web_search`, `visit_webpage`, `file_read`, `file_write`, `user_input`, and `list_gems` (auto-included in CodeAgent).
+Define your own as a class:
+```ruby
+class StockPrice < Gemlings::Tool
+  tool_name "stock_price"
+  description "Gets the current stock price for a ticker symbol"
+  input :ticker, type: :string, description: "Stock ticker symbol (e.g. AAPL)"
+  output_type :number
+  def call(ticker:)
+    182.52
+  end
+end
+agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514", tools: [StockPrice])
+```
+Or inline:
+```ruby
+weather = Gemlings.tool(:weather, "Gets weather for a city", city: "City name") do |city:|
+  "72F and sunny in #{city}"
+end
+```
+### MCP tools
+Load tools from any [MCP](https://modelcontextprotocol.io/) server:
+```ruby
+tools = Gemlings.tools_from_mcp(command: ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
+# Or load a single tool by name
+tool = Gemlings.tool_from_mcp(command: ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"], tool_name: "read_file")
+```
+## Multi-agent workflows
+Nest agents as tools. The manager agent calls sub-agents by name:
+```ruby
+researcher = Gemlings::ToolCallingAgent.new(
+  model: "openai/gpt-4o",
+  name: "researcher",
+  description: "Researches topics on the web",
+  tools: [Gemlings::WebSearch]
+)
+manager = Gemlings::CodeAgent.new(
+  model: "anthropic/claude-sonnet-4-20250514",
+  agents: [researcher]
+)
+manager.run("Find out when Ruby 3.4 was released and summarize the key features")
+```
+## Output validation
+Validate final answers against a JSON Schema:
+```ruby
+agent = Gemlings::CodeAgent.new(
+  model: "anthropic/claude-sonnet-4-20250514",
+  output_type: {
+    "type" => "object",
+    "required" => ["name", "age"],
+    "properties" => {
+      "name" => { "type" => "string" },
+      "age" => { "type" => "integer" }
+    }
+  }
+)
+```
+Or add custom checks that reject answers and force retries:
+```ruby
+agent = Gemlings::CodeAgent.new(
+  model: "anthropic/claude-sonnet-4-20250514",
+  final_answer_checks: [
+    ->(answer, memory) { answer.length > 10 },
+    ->(answer, memory) { !answer.include?("I don't know") }
+  ]
+)
+```
+## Observability
+### Callbacks
+```ruby
+class Logger < Gemlings::Callback
+  def on_run_start(task:, agent:) = puts("Starting: #{task}")
+  def on_step_end(step:, agent:) = puts("Step done: #{step.duration}s")
+  def on_tool_call(tool_name:, arguments:, agent:) = puts("Calling #{tool_name}")
+  def on_run_end(result:, agent:) = puts("Done: #{result}")
+end
+agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514", callbacks: [Logger.new])
+```
+### Export and replay
+```ruby
+result = agent.run("What is 2+2?", return_full_result: true)
+result.to_json  # serialize the full run
+agent.memory.replay  # pretty-print with syntax-highlighted code
+```
+## Step-by-step execution
+Run one step at a time for debugging or custom UIs:
+```ruby
+agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+agent.step("What is 2+2?")
+agent.step until agent.done?
+puts agent.final_answer_value
+```
+## Prompt customization
+```ruby
+# Append instructions
+agent = Gemlings::CodeAgent.new(
+  model: "anthropic/claude-sonnet-4-20250514",
+  instructions: "Always respond in French. Use metric units."
+)
+# Or fully replace the system prompt
+templates = Gemlings::PromptTemplates.new(
+  system_prompt: "You are a data analyst. Tools: {{tool_descriptions}}"
+)
+agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514", prompt_templates: templates)
+```
+## Planning
+Enable periodic re-planning during long runs:
+```ruby
+agent = Gemlings::CodeAgent.new(
+  model: "anthropic/claude-sonnet-4-20250514",
+  planning_interval: 3,
+  max_steps: 15
+)
+```
+## CLI
+```bash
+gemlings "What is the 10th prime number?"
+gemlings -m openai/gpt-4o -t web_search "Who won the latest Super Bowl?"
+gemlings -a tool_calling -m openai/gpt-4o "What is 6 * 7?"
+gemlings --mcp "npx -y @modelcontextprotocol/server-filesystem /tmp" "List files in /tmp"
+gemlings -i  # interactive mode
+```
+## Configuration
+| Option | Default | Description |
+|---|---|---|
+| `model:` | -- | `"provider/model_name"` |
+| `tools:` | `[]` | Array of Tool classes or instances |
+| `agents:` | `[]` | Sub-agents (become callable tools) |
+| `max_steps:` | `10` | Maximum steps before stopping |
+| `planning_interval:` | `nil` | Re-plan every N steps |
+| `instructions:` | `nil` | Extra instructions appended to system prompt |
+| `prompt_templates:` | `nil` | Custom `PromptTemplates` instance |
+| `output_type:` | `nil` | JSON Schema hash or validation Proc |
+| `final_answer_checks:` | `[]` | Procs `(answer, memory) -> bool` |
+| `callbacks:` | `[]` | Array of `Callback` instances |
+| `step_callbacks:` | `[]` | Procs `(step, agent:) -> void` |
+Requires Ruby 3.2+. JRuby 10+ is also supported.
+## License
+MIT
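
The README describes `final_answer_checks` as procs with the shape `(answer, memory) -> bool`, where a failing check rejects the answer. A rough sketch of how such a gate might be evaluated is below. `accept_answer?` is a hypothetical helper, not the agent's real internals, and the empty array stands in for the memory object.

```ruby
# Hypothetical helper, not the gemlings implementation: an answer is accepted
# only when every check returns a truthy value.
def accept_answer?(answer, memory, checks)
  checks.all? { |check| check.call(answer, memory) }
end

# Same two checks as in the README example.
checks = [
  ->(answer, _memory) { answer.length > 10 },
  ->(answer, _memory) { !answer.include?("I don't know") }
]

accepted = accept_answer?("The answer is forty-two.", [], checks)
rejected_phrase = accept_answer?("I don't know", [], checks)
rejected_short = accept_answer?("Too short", [], checks)
```

Because `all?` stops at the first failing check, cheap checks are best listed first.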

data/ROADMAP.md ADDED

@@ -0,0 +1,56 @@
+# Roadmap
+Gaps identified by comparing gemlings with [smolagents](https://github.com/huggingface/smolagents), prioritized for Ruby developers building agents.
+## Phase 5 -- Core DX (high impact, low effort)
+These make the framework usable for real work.
+- [x] **MCP client for tools** -- Load tools from any MCP server (stdio + HTTP). This is the single biggest ecosystem unlock since MCP servers already exist for databases, APIs, file systems, browsers, etc. Ruby devs shouldn't have to rewrite tools that already exist.
+- [x] **Structured output** -- Let agents return typed results (not just strings). Accept a schema or Data class, validate the final answer against it. Enables agents as reliable building blocks in larger apps.
+- [x] **Prompt customization** -- Expose `PromptTemplates` object (system prompt, planning, managed agent) so users can override prompts without subclassing. Add `instructions:` parameter for injecting custom rules.
+- [x] **`agent.step()` method** -- Single-step execution for debugging and building custom UIs. Returns the step, lets the caller inspect/modify memory before continuing.
+- [x] **`final_answer_checks`** -- List of validation procs run before accepting a final answer. If any returns false, the agent keeps going. Cheap way to add guardrails.
+## Phase 6 -- Model & tool ecosystem (high impact, medium effort)
+Broader model support and tool discovery. The RubyLLM migration (replacing 3 hand-rolled adapters with a single wrapper) completed most of the model items here.
+- [x] **RubyLLM universal adapter** -- Replaced OpenAI, Anthropic, and Ollama adapters with a single RubyLLM wrapper. Supports 800+ models across OpenAI, Anthropic, Gemini, DeepSeek, OpenRouter, Ollama, and any OpenAI-compatible endpoint. Auto-configures from env vars.
+- [x] **Rate limiting (basic)** -- RubyLLM provides built-in `max_retries` and `retry_interval` for 429s. Per-minute quotas are not yet exposed.
+- [x] **More built-in tools** -- File read/write tools. Google search and Wikipedia search remain TODO.
+- [x] **Tool.from_mcp** -- Load a single tool from an MCP server by name (vs loading all tools from a server).
+## Phase 7 -- Observability & debugging (medium impact, medium effort)
+Understanding what agents actually do.
+- [ ] **Structured logging** -- JSON-structured logs per step with run_id, step_number, thought, action, observation, timing, tokens. Emit to any Ruby logger.
+- [x] **`memory.replay`** -- Pretty-print a completed run to the terminal (like smolagents' `agent.replay()`).
+- [x] **Run export** -- Serialize a run (memory + steps + metadata) to JSON for later analysis or replay.
+- [x] **Callbacks for observability** -- Richer callback interface: `on_step_start`, `on_step_end`, `on_tool_call`, `on_error`. Current `step_callbacks` only fires after completion.
+## Phase 8 -- Sandboxing & security (medium impact, high effort)
+For production use where agent code can't be trusted.
+- [ ] **Docker executor** -- Run agent code in a Docker container instead of a fork. Filesystem isolation, network control, resource limits.
+- [ ] **Import/require allowlist** -- Restrict which Ruby gems/stdlib modules agent code can load in the sandbox (like smolagents' `additional_authorized_imports`).
+- [ ] **Operation count limit** -- Cap iterations/operations in the sandbox to prevent infinite loops eating CPU (smolagents caps at 1M operations).
+## Phase 9 -- Advanced features (lower priority, nice to have)
+- [ ] **Agent serialization** -- `agent.save(dir)` / `Agent.load(dir)` for persisting agent configuration (tools, prompts, model, settings).
+- [ ] **Media types in tools** -- Support image/audio inputs and outputs for multimodal agents.
+- [ ] **Async/parallel tool calls** -- ToolCallingAgent processes multiple tool calls concurrently (like smolagents' `max_tool_threads`).
+- [ ] **Web UI** -- Lightweight web interface for interactive agent sessions (alternative to CLI). Could be a simple Rack app or use Hotwire.
+- [ ] **Persistent memory** -- Long-term memory across runs (conversation history, learned facts). Could be file-based or backed by SQLite.
+## Not planned
+These exist in smolagents but don't fit gemlings' design goals:
+- **Hub sharing** -- No equivalent to HuggingFace Hub in Ruby. Gems are the distribution mechanism.
+- **LangChain/Gradio interop** -- Python-specific ecosystems.
+- **WASM executor** -- Ruby WASM support is too immature.
+- **MLX/vLLM adapters** -- Python-only inference runtimes. RubyLLM covers local models via Ollama.
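
The unchecked "Structured logging" item in Phase 7 lists the fields a per-step log record would carry. As an illustration only, one such record could be built as below. The feature is not implemented in this release; `step_log_line` and the key names `duration_s` and `tokens` are assumptions based on the roadmap wording.

```ruby
require "json"
require "securerandom"
require "stringio"

# Hypothetical sketch of one JSON log line per agent step. Field names follow
# the roadmap text; nothing here is an existing gemlings API.
def step_log_line(run_id:, step_number:, thought:, action:, observation:, duration:, tokens:)
  JSON.generate(
    run_id: run_id,
    step_number: step_number,
    thought: thought,
    action: action,
    observation: observation,
    duration_s: duration,
    tokens: tokens
  )
end

# Any IO-like object works as a sink; StringIO keeps the example self-contained.
sink = StringIO.new
sink.puts step_log_line(
  run_id: SecureRandom.uuid,
  step_number: 1,
  thought: "Compute the sum directly",
  action: "puts 2 + 2",
  observation: "4",
  duration: 0.12,
  tokens: { input: 310, output: 24 }
)
record = JSON.parse(sink.string)
```

One JSON object per line keeps the output easy to filter by `run_id` with standard log tooling.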

data/Rakefile ADDED

@@ -0,0 +1,6 @@
+# frozen_string_literal: true
+require "rspec/core/rake_task"
+RSpec::Core::RakeTask.new(:spec)
+task default: :spec

data/demo.gif ADDED

Binary file

data/examples/custom_tool.rb ADDED

@@ -0,0 +1,24 @@
+#!/usr/bin/env ruby
+# frozen_string_literal: true
+require_relative "../lib/gemlings"
+class StockPrice < Gemlings::Tool
+  tool_name "stock_price"
+  description "Gets the current stock price for a ticker symbol"
+  input :ticker, type: :string, description: "Stock ticker symbol (e.g. AAPL)"
+  output_type :number
+  def call(ticker:)
+    # Simulated stock prices for demo
+    prices = { "AAPL" => 182.52, "GOOGL" => 141.80, "TSLA" => 248.42, "RIVN" => 14.73 }
+    prices.fetch(ticker.upcase, "Unknown ticker: #{ticker}")
+  end
+end
+agent = Gemlings::CodeAgent.new(
+  model: "anthropic/claude-sonnet-4-20250514",
+  tools: [StockPrice]
+)
+agent.run("What's the difference in stock price between AAPL and TSLA?")
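
The example above relies on class-level macros (`tool_name`, `description`, `input`, `output_type`). A minimal sketch of how a base class could provide such macros is below. `SketchTool` and `StockPriceSketch` are hypothetical stand-ins; the real `Gemlings::Tool` in `lib/gemlings/tool.rb` may be implemented differently.

```ruby
# Hypothetical stand-in for a tool base class. Each macro stores its value in a
# class-level instance variable, so every subclass keeps its own metadata.
class SketchTool
  class << self
    def tool_name(value = nil)
      @tool_name = value if value
      @tool_name
    end

    def description(value = nil)
      @description = value if value
      @description
    end

    def output_type(value = nil)
      @output_type = value if value
      @output_type
    end

    # Declared inputs, keyed by parameter name.
    def inputs
      @inputs ||= {}
    end

    def input(name, type:, description:)
      inputs[name] = { type: type, description: description }
    end
  end
end

class StockPriceSketch < SketchTool
  tool_name "stock_price"
  description "Gets the current stock price for a ticker symbol"
  input :ticker, type: :string, description: "Stock ticker symbol (e.g. AAPL)"
  output_type :number

  def call(ticker:)
    { "AAPL" => 182.52, "TSLA" => 248.42 }.fetch(ticker.upcase)
  end
end

price = StockPriceSketch.new.call(ticker: "aapl")
```

Calling a macro without an argument reads the stored value, which is how a framework could later assemble the tool description it sends to the model.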

data/examples/fibonacci.rb ADDED

@@ -0,0 +1,7 @@
+#!/usr/bin/env ruby
+# frozen_string_literal: true
+require_relative "../lib/gemlings"
+agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+agent.run("What is the 118th Fibonacci number?")

data/examples/with_tools.rb ADDED

@@ -0,0 +1,12 @@
+#!/usr/bin/env ruby
+# frozen_string_literal: true
+require_relative "../lib/gemlings"
+require_relative "../lib/gemlings/tools/web_search"
+agent = Gemlings::CodeAgent.new(
+  model: "anthropic/claude-sonnet-4-20250514",
+  tools: [Gemlings::WebSearch]
+)
+agent.run("What year was Ruby created and who created it?")

data/exe/gemlings ADDED

@@ -0,0 +1,6 @@
+#!/usr/bin/env ruby
+# frozen_string_literal: true
+require_relative "../lib/gemlings"
+Gemlings::CLI.run

data/gemlings.gemspec ADDED

@@ -0,0 +1,44 @@
+# frozen_string_literal: true
+require_relative "lib/gemlings/version"
+Gem::Specification.new do |spec|
+  spec.name = "gemlings"
+  spec.version = Gemlings::VERSION
+  spec.authors = ["Chris Hasiński"]
+  spec.email = ["krzysztof.hasinski@gmail.com"]
+  spec.summary = "A radically simple, code-first AI agent framework for Ruby"
+  spec.description = "Agents that write and execute Ruby code. Inspired by smolagents. " \
+                     "LLMs write executable code, not JSON blobs."
+  spec.homepage = "https://github.com/khasinski/gemlings"
+  spec.license = "MIT"
+  spec.required_ruby_version = ">= 3.2.0"
+  spec.metadata["homepage_uri"] = spec.homepage
+  spec.metadata["source_code_uri"] = spec.homepage
+  spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
+  spec.files = Dir.chdir(__dir__) do
+    `git ls-files -z`.split("\x0").reject do |f|
+      (File.expand_path(f) == __FILE__) ||
+        f.start_with?("spec/", "test/", ".git", ".github", "Gemfile")
+    end
+  end
+  spec.bindir = "exe"
+  spec.executables = ["gemlings"]
+  spec.require_paths = ["lib"]
+  spec.add_dependency "lipgloss", "~> 0.2" unless RUBY_ENGINE == "jruby"
+  spec.add_dependency "reverse_markdown", "~> 3.0"
+  spec.add_dependency "rouge", "~> 4.0"
+  spec.add_dependency "json-schema", "~> 4.0"
+  spec.add_dependency "bigdecimal"
+  spec.add_dependency "mcp", "~> 0.7"
+  spec.add_dependency "ruby_llm", "~> 1.1"
+  spec.add_development_dependency "rake", "~> 13.0"
+  spec.add_development_dependency "rspec", "~> 3.0"
+  spec.add_development_dependency "rubocop", "~> 1.0"
+end