gemlings 0.3.0

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: 6740ac8c9242e1493c8196521b37a151adf1a753fcea54307e9c4f7c2b59b6d9
+   data.tar.gz: 923dc72b8683652654b4b0f5514689406585000016b38cbd47af7fce2930a80c
+ SHA512:
+   metadata.gz: f1b904c2b1406d410974bbf4c36a317a6b5c5d2282a9f2d507d4d32d67454219641e423580db9f48c368735aedf352ccb813bd8e55e18edda59dcf5e54c7b1fd
+   data.tar.gz: bbf5e3731dca63565e24a7fb4200343956441486cb1474859a34643be22be9716e30cde408eb9bc34fe1b3ea10ae96472bca86c0c6ff829b3e0da5b41d81bc2f
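
The digests in checksums.yaml cover the two archive members of the `.gem` file, `metadata.gz` and `data.tar.gz`. A minimal sketch of checking them follows. It assumes the checksums file has already been read into a string and the members into a hash of raw bytes. `verify_checksums` is a hypothetical helper written for illustration; it is not part of RubyGems or gemlings.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: compares each archive member against the SHA256 and
# SHA512 digests recorded in checksums.yaml. Returns a list of mismatches,
# so an empty array means every recorded digest matched.
def verify_checksums(checksums_yaml, members)
  expected = YAML.safe_load(checksums_yaml)
  algorithms = { "SHA256" => Digest::SHA256, "SHA512" => Digest::SHA512 }

  algorithms.flat_map do |label, digest_class|
    expected.fetch(label, {}).filter_map do |file_name, want|
      bytes = members[file_name]
      got = bytes && digest_class.hexdigest(bytes)
      # A missing member (got == nil) is reported as a mismatch too.
      "#{label} #{file_name}" unless got == want.to_s
    end
  end
end
```

Reporting every mismatch rather than stopping at the first makes it easier to see whether one member or the whole archive differs.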
data/.ruby-version ADDED
@@ -0,0 +1 @@
+ 3.4.8
data/CHANGELOG.md ADDED
@@ -0,0 +1,39 @@
+ # Changelog
+
+ ## 0.3.0
+
+ JRuby 10 support, interactive UI, and CI.
+
+ - **JRuby 10 support** -- Thread-based executor for JRuby where fork is unavailable; lipgloss gracefully skipped via `rescue LoadError`
+ - **Interactive status lines** -- Spinner shows "Executing..." / "Running tool_name..." during execution, resolves in-place to green dot (success) or red dot (error)
+ - **GitHub Actions CI** -- Build matrix with Ruby 3.2, 3.3, 3.4, and JRuby 10.0
+
+ ## 0.2.0
+
+ Tools, MCP filtering, observability, and code agent improvements.
+
+ - **FileRead tool** -- Read file contents with path expansion and 50k char truncation
+ - **FileWrite tool** -- Write files with automatic parent directory creation
+ - **ListGems tool** -- Lists available Ruby gems; auto-included in CodeAgent
+ - **`tool_from_mcp`** -- Load a single tool from an MCP server by name
+ - **Run export** -- `to_h` / `to_json` on Memory, RunResult, and all step types for serialization
+ - **Richer callbacks** -- `Callback` base class with `on_run_start`, `on_step_start`, `on_step_end`, `on_tool_call`, `on_error`, `on_run_end`; backward-compatible with existing `step_callbacks`
+ - **`memory.replay`** -- Pretty-print a completed run with syntax-highlighted code and metrics
+ - **Code agent prompt** -- Tells the model about available Ruby stdlib and gems so it uses `net/http`, `json`, etc. without being asked
+
+ ## 0.1.0
+
+ Initial release.
+
+ - **CodeAgent** -- LLM writes and executes Ruby code in a sandboxed fork
+ - **ToolCallingAgent** -- LLM calls tools via structured tool_calls (OpenAI-style)
+ - **Model adapters** -- OpenAI, Anthropic, and Ollama out of the box
+ - **Tool DSL** -- Define tools as classes or inline blocks
+ - **MCP client** -- Load tools from any MCP server via stdio transport
+ - **Structured output** -- Validate final answers against JSON Schema or custom procs
+ - **Prompt customization** -- Override system prompts, planning prompts, or inject instructions
+ - **Final answer checks** -- Validation procs that can reject and retry answers
+ - **Step-by-step execution** -- `agent.step()` for debugging and custom UIs
+ - **Planning** -- Optional periodic re-planning during long runs
+ - **Managed agents** -- Nest agents as tools for multi-agent workflows
+ - **CLI** -- `gemlings` command with interactive mode, tool loading, and MCP support
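
The 0.3.0 entry names two portability patterns: falling back to a thread where fork is unavailable, and skipping an optional dependency via `rescue LoadError`. The sketch below shows one way both patterns could look. `optional_require` and `run_isolated` are hypothetical names chosen for illustration and are not taken from the gemlings source. The sketch also assumes `Process.respond_to?(:fork)` reports false on platforms without fork, which is the check the Ruby documentation suggests.

```ruby
# Hypothetical helper: load an optional dependency and report whether it is
# available, instead of raising when the gem is not installed.
def optional_require(name)
  require name
  true
rescue LoadError
  false
end

# Hypothetical helper: run a block in a forked child where fork exists,
# otherwise in a thread. The block's return value must be Marshal-able.
def run_isolated(&block)
  if Process.respond_to?(:fork)
    reader, writer = IO.pipe
    pid = fork do
      reader.close
      writer.write(Marshal.dump(block.call))
      writer.close
      exit!(0) # skip at_exit handlers inherited from the parent
    end
    writer.close
    payload = reader.read
    reader.close
    Process.wait(pid)
    Marshal.load(payload)
  else
    # Thread#value joins the thread and returns the block's result.
    Thread.new(&block).value
  end
end
```

A thread shares memory with its parent, so the fallback gives up the process isolation that fork provides. The sketch does not handle a child that raises before writing its result.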
data/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2025 Chris Hasiński
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,239 @@
+ # Gemlings
+
+ [![Gem Version](https://badge.fury.io/rb/gemlings.svg)](https://rubygems.org/gems/gemlings)
+ [![CI](https://github.com/khasinski/gemlings/actions/workflows/ci.yml/badge.svg)](https://github.com/khasinski/gemlings/actions/workflows/ci.yml)
+
+ *Small, autonomous agents running Ruby snippets.*
+
+ Your LLM writes and executes Ruby code -- not JSON blobs. Tool calls are method calls, variables persist between steps, and the full power of Ruby is available to the agent at every turn. Inspired by [smolagents](https://github.com/huggingface/smolagents).
+
+ ![gemlings demo](demo.gif)
+
+ ## Quick start
+
+ ```bash
+ gem install gemlings
+ ```
+
+ ```ruby
+ require "gemlings"
+
+ agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+ agent.run("What is the 118th Fibonacci number?")
+ ```
+
+ The agent thinks, writes Ruby, executes it in a sandbox, and returns the answer.
+
+ ## Agent types
+
+ **CodeAgent** writes Ruby code that runs in a sandboxed fork (MRI) or thread (JRuby). Tools are methods the model can call directly. Variables persist across steps.
+
+ ```ruby
+ agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+ ```
+
+ **ToolCallingAgent** uses structured tool calls (OpenAI function calling style). Better for models with strong native tool support.
+
+ ```ruby
+ agent = Gemlings::ToolCallingAgent.new(model: "openai/gpt-4o")
+ ```
+
+ ## Models
+
+ Pass `provider/model_name`. Supports Anthropic, OpenAI, Google Gemini, DeepSeek, OpenRouter, and Ollama.
+
+ ```ruby
+ Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+ Gemlings::CodeAgent.new(model: "openai/gpt-4o")
+ Gemlings::CodeAgent.new(model: "ollama/qwen2.5:3b")
+ ```
+
+ Set API keys via environment variables: `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, etc.
+
+ ## Tools
+
+ Six built-in tools: `web_search`, `visit_webpage`, `file_read`, `file_write`, `user_input`, and `list_gems` (auto-included in CodeAgent).
+
+ Define your own as a class:
+
+ ```ruby
+ class StockPrice < Gemlings::Tool
+   tool_name "stock_price"
+   description "Gets the current stock price for a ticker symbol"
+   input :ticker, type: :string, description: "Stock ticker symbol (e.g. AAPL)"
+   output_type :number
+
+   def call(ticker:)
+     182.52
+   end
+ end
+
+ agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514", tools: [StockPrice])
+ ```
+
+ Or inline:
+
+ ```ruby
+ weather = Gemlings.tool(:weather, "Gets weather for a city", city: "City name") do |city:|
+   "72F and sunny in #{city}"
+ end
+ ```
+
+ ### MCP tools
+
+ Load tools from any [MCP](https://modelcontextprotocol.io/) server:
+
+ ```ruby
+ tools = Gemlings.tools_from_mcp(command: ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"])
+
+ # Or load a single tool by name
+ tool = Gemlings.tool_from_mcp(command: ["npx", "-y", "@modelcontextprotocol/server-filesystem", "/tmp"], tool_name: "read_file")
+ ```
+
+ ## Multi-agent workflows
+
+ Nest agents as tools. The manager agent calls sub-agents by name:
+
+ ```ruby
+ researcher = Gemlings::ToolCallingAgent.new(
+   model: "openai/gpt-4o",
+   name: "researcher",
+   description: "Researches topics on the web",
+   tools: [Gemlings::WebSearch]
+ )
+
+ manager = Gemlings::CodeAgent.new(
+   model: "anthropic/claude-sonnet-4-20250514",
+   agents: [researcher]
+ )
+
+ manager.run("Find out when Ruby 3.4 was released and summarize the key features")
+ ```
+
+ ## Output validation
+
+ Validate final answers against a JSON Schema:
+
+ ```ruby
+ agent = Gemlings::CodeAgent.new(
+   model: "anthropic/claude-sonnet-4-20250514",
+   output_type: {
+     "type" => "object",
+     "required" => ["name", "age"],
+     "properties" => {
+       "name" => { "type" => "string" },
+       "age" => { "type" => "integer" }
+     }
+   }
+ )
+ ```
+
+ Or add custom checks that reject answers and force retries:
+
+ ```ruby
+ agent = Gemlings::CodeAgent.new(
+   model: "anthropic/claude-sonnet-4-20250514",
+   final_answer_checks: [
+     ->(answer, memory) { answer.length > 10 },
+     ->(answer, memory) { !answer.include?("I don't know") }
+   ]
+ )
+ ```
+
+ ## Observability
+
+ ### Callbacks
+
+ ```ruby
+ class Logger < Gemlings::Callback
+   def on_run_start(task:, agent:) = puts("Starting: #{task}")
+   def on_step_end(step:, agent:) = puts("Step done: #{step.duration}s")
+   def on_tool_call(tool_name:, arguments:, agent:) = puts("Calling #{tool_name}")
+   def on_run_end(result:, agent:) = puts("Done: #{result}")
+ end
+
+ agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514", callbacks: [Logger.new])
+ ```
+
+ ### Export and replay
+
+ ```ruby
+ result = agent.run("What is 2+2?", return_full_result: true)
+ result.to_json # serialize the full run
+
+ agent.memory.replay # pretty-print with syntax-highlighted code
+ ```
+
+ ## Step-by-step execution
+
+ Run one step at a time for debugging or custom UIs:
+
+ ```ruby
+ agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+
+ agent.step("What is 2+2?")
+ agent.step until agent.done?
+
+ puts agent.final_answer_value
+ ```
+
+ ## Prompt customization
+
+ ```ruby
+ # Append instructions
+ agent = Gemlings::CodeAgent.new(
+   model: "anthropic/claude-sonnet-4-20250514",
+   instructions: "Always respond in French. Use metric units."
+ )
+
+ # Or fully replace the system prompt
+ templates = Gemlings::PromptTemplates.new(
+   system_prompt: "You are a data analyst. Tools: {{tool_descriptions}}"
+ )
+
+ agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514", prompt_templates: templates)
+ ```
+
+ ## Planning
+
+ Enable periodic re-planning during long runs:
+
+ ```ruby
+ agent = Gemlings::CodeAgent.new(
+   model: "anthropic/claude-sonnet-4-20250514",
+   planning_interval: 3,
+   max_steps: 15
+ )
+ ```
+
+ ## CLI
+
+ ```bash
+ gemlings "What is the 10th prime number?"
+ gemlings -m openai/gpt-4o -t web_search "Who won the latest Super Bowl?"
+ gemlings -a tool_calling -m openai/gpt-4o "What is 6 * 7?"
+ gemlings --mcp "npx -y @modelcontextprotocol/server-filesystem /tmp" "List files in /tmp"
+ gemlings -i # interactive mode
+ ```
+
+ ## Configuration
+
+ | Option | Default | Description |
+ |---|---|---|
+ | `model:` | -- | `"provider/model_name"` |
+ | `tools:` | `[]` | Array of Tool classes or instances |
+ | `agents:` | `[]` | Sub-agents (become callable tools) |
+ | `max_steps:` | `10` | Maximum steps before stopping |
+ | `planning_interval:` | `nil` | Re-plan every N steps |
+ | `instructions:` | `nil` | Extra instructions appended to system prompt |
+ | `prompt_templates:` | `nil` | Custom `PromptTemplates` instance |
+ | `output_type:` | `nil` | JSON Schema hash or validation Proc |
+ | `final_answer_checks:` | `[]` | Procs `(answer, memory) -> bool` |
+ | `callbacks:` | `[]` | Array of `Callback` instances |
+ | `step_callbacks:` | `[]` | Procs `(step, agent:) -> void` |
+
+ Requires Ruby 3.2+. JRuby 10+ is also supported.
+
+ ## License
+
+ MIT
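
The README documents `final_answer_checks` as procs with the shape `(answer, memory) -> bool`, where a rejected answer forces a retry. The sketch below shows how a list of such procs could be evaluated together. `accept_answer?` is a hypothetical helper written for illustration, not the gem's internal method, and the `to_s` calls are an added assumption so the checks also tolerate non-string answers.

```ruby
# Hypothetical helper: an answer is accepted only when every check passes.
# `all?` short-circuits, so later checks are skipped after the first failure.
def accept_answer?(answer, memory, checks)
  checks.all? { |check| check.call(answer, memory) }
end

# The two checks from the README, made tolerant of non-string answers.
checks = [
  ->(answer, _memory) { answer.to_s.length > 10 },
  ->(answer, _memory) { !answer.to_s.include?("I don't know") }
]

puts accept_answer?("The answer is forty-two", nil, checks)
puts accept_answer?("I don't know, sorry", nil, checks)
```

The first call passes both checks. The second is long enough but contains the rejected phrase, so it fails.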
data/ROADMAP.md ADDED
@@ -0,0 +1,56 @@
+ # Roadmap
+
+ Gaps identified by comparing gemlings with [smolagents](https://github.com/huggingface/smolagents), prioritized for Ruby developers building agents.
+
+ ## Phase 5 -- Core DX (high impact, low effort)
+
+ These make the framework usable for real work.
+
+ - [x] **MCP client for tools** -- Load tools from any MCP server (stdio + HTTP). This is the single biggest ecosystem unlock since MCP servers already exist for databases, APIs, file systems, browsers, etc. Ruby devs shouldn't have to rewrite tools that already exist.
+ - [x] **Structured output** -- Let agents return typed results (not just strings). Accept a schema or Data class, validate the final answer against it. Enables agents as reliable building blocks in larger apps.
+ - [x] **Prompt customization** -- Expose `PromptTemplates` object (system prompt, planning, managed agent) so users can override prompts without subclassing. Add `instructions:` parameter for injecting custom rules.
+ - [x] **`agent.step()` method** -- Single-step execution for debugging and building custom UIs. Returns the step, lets the caller inspect/modify memory before continuing.
+ - [x] **`final_answer_checks`** -- List of validation procs run before accepting a final answer. If any returns false, the agent keeps going. Cheap way to add guardrails.
+
+ ## Phase 6 -- Model & tool ecosystem (high impact, medium effort)
+
+ Broader model support and tool discovery. The RubyLLM migration (replacing 3 hand-rolled adapters with a single wrapper) completed most of the model items here.
+
+ - [x] **RubyLLM universal adapter** -- Replaced OpenAI, Anthropic, and Ollama adapters with a single RubyLLM wrapper. Supports 800+ models across OpenAI, Anthropic, Gemini, DeepSeek, OpenRouter, Ollama, and any OpenAI-compatible endpoint. Auto-configures from env vars.
+ - [x] **Rate limiting (basic)** -- RubyLLM provides built-in `max_retries` and `retry_interval` for 429s. Per-minute quotas are not yet exposed.
+ - [x] **More built-in tools** -- File read/write tools. Google search and Wikipedia search remain TODO.
+ - [x] **Tool.from_mcp** -- Load a single tool from an MCP server by name (vs loading all tools from a server).
+
+ ## Phase 7 -- Observability & debugging (medium impact, medium effort)
+
+ Understanding what agents actually do.
+
+ - [ ] **Structured logging** -- JSON-structured logs per step with run_id, step_number, thought, action, observation, timing, tokens. Emit to any Ruby logger.
+ - [x] **`memory.replay`** -- Pretty-print a completed run to the terminal (like smolagents' `agent.replay()`).
+ - [x] **Run export** -- Serialize a run (memory + steps + metadata) to JSON for later analysis or replay.
+ - [x] **Callbacks for observability** -- Richer callback interface: `on_step_start`, `on_step_end`, `on_tool_call`, `on_error`. Current `step_callbacks` only fires after completion.
+
+ ## Phase 8 -- Sandboxing & security (medium impact, high effort)
+
+ For production use where agent code can't be trusted.
+
+ - [ ] **Docker executor** -- Run agent code in a Docker container instead of a fork. Filesystem isolation, network control, resource limits.
+ - [ ] **Import/require allowlist** -- Restrict which Ruby gems/stdlib modules agent code can load in the sandbox (like smolagents' `additional_authorized_imports`).
+ - [ ] **Operation count limit** -- Cap iterations/operations in the sandbox to prevent infinite loops eating CPU (smolagents caps at 1M operations).
+
+ ## Phase 9 -- Advanced features (lower priority, nice to have)
+
+ - [ ] **Agent serialization** -- `agent.save(dir)` / `Agent.load(dir)` for persisting agent configuration (tools, prompts, model, settings).
+ - [ ] **Media types in tools** -- Support image/audio inputs and outputs for multimodal agents.
+ - [ ] **Async/parallel tool calls** -- ToolCallingAgent processes multiple tool calls concurrently (like smolagents' `max_tool_threads`).
+ - [ ] **Web UI** -- Lightweight web interface for interactive agent sessions (alternative to CLI). Could be a simple Rack app or use Hotwire.
+ - [ ] **Persistent memory** -- Long-term memory across runs (conversation history, learned facts). Could be file-based or backed by SQLite.
+
+ ## Not planned
+
+ These exist in smolagents but don't fit gemlings' design goals:
+
+ - **Hub sharing** -- No equivalent to HuggingFace Hub in Ruby. Gems are the distribution mechanism.
+ - **LangChain/Gradio interop** -- Python-specific ecosystems.
+ - **WASM executor** -- Ruby WASM support is too immature.
+ - **MLX/vLLM adapters** -- Python-only inference runtimes. RubyLLM covers local models via Ollama.
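
The unchecked "Structured logging" item in Phase 7 describes one JSON record per step, carrying run_id, step_number, thought, action, observation, timing, and tokens, emitted to any Ruby logger. Because the item is not implemented in this release, the sketch below is only one possible shape. `StepLogger` and its keyword arguments are hypothetical names; only the standard library `Logger` and `JSON` APIs are real.

```ruby
require "json"
require "logger"
require "securerandom"

# Hypothetical sketch: writes one JSON object per agent step to any object
# that responds to Logger's #info.
class StepLogger
  def initialize(logger, run_id: SecureRandom.uuid)
    @logger = logger
    @run_id = run_id
  end

  def log_step(step_number:, thought:, action:, observation:, duration:, tokens:)
    entry = {
      run_id: @run_id,
      step_number: step_number,
      thought: thought,
      action: action,
      observation: observation,
      duration_s: duration,
      tokens: tokens
    }
    @logger.info(JSON.generate(entry))
  end
end

logger = Logger.new($stdout)
# Drop Logger's default severity/timestamp prefix so each line is pure JSON.
logger.formatter = proc { |_severity, _time, _progname, message| "#{message}\n" }

StepLogger.new(logger).log_step(
  step_number: 1, thought: "Compute the sum", action: "2 + 2",
  observation: "4", duration: 0.01, tokens: 42
)
```

Keeping the formatter separate from `StepLogger` lets the same class feed a plain-text development log or a JSON-lines production log without changes.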
data/Rakefile ADDED
@@ -0,0 +1,6 @@
+ # frozen_string_literal: true
+
+ require "rspec/core/rake_task"
+ RSpec::Core::RakeTask.new(:spec)
+
+ task default: :spec
data/demo.gif ADDED
Binary file
@@ -0,0 +1,24 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ require_relative "../lib/gemlings"
+
+ class StockPrice < Gemlings::Tool
+   tool_name "stock_price"
+   description "Gets the current stock price for a ticker symbol"
+   input :ticker, type: :string, description: "Stock ticker symbol (e.g. AAPL)"
+   output_type :number
+
+   def call(ticker:)
+     # Simulated stock prices for demo
+     prices = { "AAPL" => 182.52, "GOOGL" => 141.80, "TSLA" => 248.42, "RIVN" => 14.73 }
+     prices.fetch(ticker.upcase, "Unknown ticker: #{ticker}")
+   end
+ end
+
+ agent = Gemlings::CodeAgent.new(
+   model: "anthropic/claude-sonnet-4-20250514",
+   tools: [StockPrice]
+ )
+
+ agent.run("What's the difference in stock price between AAPL and TSLA?")
@@ -0,0 +1,7 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ require_relative "../lib/gemlings"
+
+ agent = Gemlings::CodeAgent.new(model: "anthropic/claude-sonnet-4-20250514")
+ agent.run("What is the 118th Fibonacci number?")
@@ -0,0 +1,12 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ require_relative "../lib/gemlings"
+ require_relative "../lib/gemlings/tools/web_search"
+
+ agent = Gemlings::CodeAgent.new(
+   model: "anthropic/claude-sonnet-4-20250514",
+   tools: [Gemlings::WebSearch]
+ )
+
+ agent.run("What year was Ruby created and who created it?")
data/exe/gemlings ADDED
@@ -0,0 +1,6 @@
+ #!/usr/bin/env ruby
+ # frozen_string_literal: true
+
+ require_relative "../lib/gemlings"
+
+ Gemlings::CLI.run
data/gemlings.gemspec ADDED
@@ -0,0 +1,44 @@
+ # frozen_string_literal: true
+
+ require_relative "lib/gemlings/version"
+
+ Gem::Specification.new do |spec|
+   spec.name = "gemlings"
+   spec.version = Gemlings::VERSION
+   spec.authors = ["Chris Hasiński"]
+   spec.email = ["krzysztof.hasinski@gmail.com"]
+
+   spec.summary = "A radically simple, code-first AI agent framework for Ruby"
+   spec.description = "Agents that write and execute Ruby code. Inspired by smolagents. " \
+     "LLMs write executable code, not JSON blobs."
+   spec.homepage = "https://github.com/khasinski/gemlings"
+   spec.license = "MIT"
+   spec.required_ruby_version = ">= 3.2.0"
+
+   spec.metadata["homepage_uri"] = spec.homepage
+   spec.metadata["source_code_uri"] = spec.homepage
+   spec.metadata["changelog_uri"] = "#{spec.homepage}/blob/main/CHANGELOG.md"
+
+   spec.files = Dir.chdir(__dir__) do
+     `git ls-files -z`.split("\x0").reject do |f|
+       (File.expand_path(f) == __FILE__) ||
+         f.start_with?("spec/", "test/", ".git", ".github", "Gemfile")
+     end
+   end
+
+   spec.bindir = "exe"
+   spec.executables = ["gemlings"]
+   spec.require_paths = ["lib"]
+
+   spec.add_dependency "lipgloss", "~> 0.2" unless RUBY_ENGINE == "jruby"
+   spec.add_dependency "reverse_markdown", "~> 3.0"
+   spec.add_dependency "rouge", "~> 4.0"
+   spec.add_dependency "json-schema", "~> 4.0"
+   spec.add_dependency "bigdecimal"
+   spec.add_dependency "mcp", "~> 0.7"
+   spec.add_dependency "ruby_llm", "~> 1.1"
+
+   spec.add_development_dependency "rake", "~> 13.0"
+   spec.add_development_dependency "rspec", "~> 3.0"
+   spec.add_development_dependency "rubocop", "~> 1.0"
+ end