rcrewai 0.2.1 → 0.3.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.rubocop.yml +1 -0
- data/.rubocop_todo.yml +99 -0
- data/CHANGELOG.md +24 -0
- data/README.md +2 -2
- data/Rakefile +53 -53
- data/bin/rcrewai +3 -3
- data/docs/mcp.md +109 -0
- data/docs/superpowers/plans/2026-05-11-llm-modernization.md +2753 -0
- data/docs/superpowers/specs/2026-05-11-llm-modernization-design.md +479 -0
- data/docs/upgrading-to-0.3.md +163 -0
- data/examples/async_execution_example.rb +82 -81
- data/examples/hierarchical_crew_example.rb +68 -72
- data/examples/human_in_the_loop_example.rb +73 -74
- data/examples/mcp_example.rb +48 -0
- data/examples/native_tools_example.rb +64 -0
- data/examples/streaming_example.rb +56 -0
- data/lib/rcrewai/agent.rb +148 -287
- data/lib/rcrewai/async_executor.rb +43 -43
- data/lib/rcrewai/cli.rb +11 -11
- data/lib/rcrewai/configuration.rb +14 -9
- data/lib/rcrewai/crew.rb +56 -39
- data/lib/rcrewai/events.rb +30 -0
- data/lib/rcrewai/human_input.rb +104 -114
- data/lib/rcrewai/legacy_react_runner.rb +172 -0
- data/lib/rcrewai/llm_client.rb +1 -1
- data/lib/rcrewai/llm_clients/anthropic.rb +174 -54
- data/lib/rcrewai/llm_clients/azure.rb +23 -128
- data/lib/rcrewai/llm_clients/base.rb +11 -7
- data/lib/rcrewai/llm_clients/google.rb +159 -95
- data/lib/rcrewai/llm_clients/ollama.rb +150 -106
- data/lib/rcrewai/llm_clients/openai.rb +140 -63
- data/lib/rcrewai/mcp/client.rb +101 -0
- data/lib/rcrewai/mcp/tool_adapter.rb +59 -0
- data/lib/rcrewai/mcp/transport/http.rb +53 -0
- data/lib/rcrewai/mcp/transport/stdio.rb +55 -0
- data/lib/rcrewai/mcp.rb +8 -0
- data/lib/rcrewai/memory.rb +45 -37
- data/lib/rcrewai/pricing.rb +34 -0
- data/lib/rcrewai/process.rb +86 -95
- data/lib/rcrewai/provider_schema.rb +38 -0
- data/lib/rcrewai/sse_parser.rb +55 -0
- data/lib/rcrewai/task.rb +56 -64
- data/lib/rcrewai/tool_runner.rb +132 -0
- data/lib/rcrewai/tool_schema.rb +97 -0
- data/lib/rcrewai/tools/base.rb +98 -37
- data/lib/rcrewai/tools/code_executor.rb +71 -74
- data/lib/rcrewai/tools/email_sender.rb +70 -78
- data/lib/rcrewai/tools/file_reader.rb +38 -30
- data/lib/rcrewai/tools/file_writer.rb +40 -38
- data/lib/rcrewai/tools/pdf_processor.rb +115 -130
- data/lib/rcrewai/tools/sql_database.rb +58 -55
- data/lib/rcrewai/tools/web_search.rb +26 -25
- data/lib/rcrewai/version.rb +2 -2
- data/lib/rcrewai.rb +18 -10
- data/rcrewai.gemspec +39 -39
- metadata +65 -47
@@ -0,0 +1,479 @@
# LLM Modernization: Native Tool Calling, Streaming, and MCP Client

**Date:** 2026-05-11
**Status:** Approved design (pending implementation plan)
**Target version:** `rcrewai` 0.3.0 (current is 0.2.1)
**Scope:** Replace prompt-engineered ReAct tool calls with native function calling across all five LLM providers; add a typed streaming event model; add an MCP (Model Context Protocol) client so RCrewAI agents can consume external MCP servers as ordinary tools.

---

## Motivation

The current tool-call mechanism is ReAct-style: agents are prompted to emit `USE_TOOL[name](k=v)`, which a regex in `Agent#execute_task` then parses. This works but is fragile — argument escaping, multi-word string values, nested data, and "the model forgot to call the tool" are all recurring failure modes that providers' native function-calling APIs handle correctly.

Streaming is absent entirely, which blocks UI work, live cost tracking, and observability/tracing.

MCP has emerged as the de-facto standard protocol for exposing tools (filesystem, GitHub, Slack, browsers, internal services) to agent frameworks. Supporting it as a client unlocks a large existing tool ecosystem with zero per-tool code.

## Decisions captured during brainstorming

| # | Decision | Choice |
|---|---|---|
| 1 | Backward compatibility posture | **Dual-mode, native preferred.** Existing tools and the `USE_TOOL[]` path keep working; native tool calling is the default when both the provider and the tool support it. |
| 2 | Tool schema definition | **Explicit DSL** on `Tools::Base` (`tool_name`, `description`, `param`). Tools without DSL declarations get a permissive fallback schema. |
| 3 | MCP scope | **Client only** in v1 (stdio + HTTP/SSE transports). Server-mode deferred. |
| 4 | Streaming surface | **Full typed event stream** (text deltas, tool-call lifecycle, usage, errors). Text-only and listener-object idioms are thin wrappers. |
| 5 | Provider matrix | All five providers (OpenAI, Anthropic, Google, Azure, Ollama) gain native tools + streaming, with auto-fallback to ReAct for Ollama models that don't support tools. |

## Non-goals (v1)

- MCP **server** mode (exposing RCrewAI as a callable MCP service).
- MCP resources and prompts (only tools are wired).
- MCP OAuth flows for HTTP servers; header-based auth only.
- MCP sampling (server-initiated LLM calls).
- Live cost reconciliation against provider invoices (we provide a calculator + price table only).
- New providers beyond the existing five (Bedrock, Groq, OpenRouter, Mistral) — tracked as a follow-up.

---

## 1. Architecture overview

### New module layout (additive)

```
lib/rcrewai/
├── tool_schema.rb          # NEW – Schema DSL & JSON-schema emitter
├── events.rb               # NEW – Typed event classes for streaming
├── tool_runner.rb          # NEW – Native tool-call loop
├── legacy_react_runner.rb  # NEW – Extracted ReAct loop from agent.rb
├── pricing.rb              # NEW – Per-model price table for cost tracking
├── mcp/
│   ├── client.rb           # NEW – MCP protocol client (JSON-RPC 2.0)
│   ├── transport/stdio.rb  # NEW
│   ├── transport/http.rb   # NEW – Streamable HTTP (SSE)
│   └── tool_adapter.rb     # NEW – Wraps MCP tool as RCrewAI::Tools::Base
└── llm_clients/
    ├── base.rb             # MODIFIED – new chat() contract
    ├── openai.rb           # MODIFIED – tools + streaming
    ├── anthropic.rb        # MODIFIED – tools + streaming + prompt-caching hook
    ├── google.rb           # MODIFIED – tools + streaming
    ├── azure.rb            # MODIFIED – inherits OpenAI shape
    └── ollama.rb           # MODIFIED – native tools (allowlist) + streaming
```

### New `chat` contract on `LLMClients::Base`

```ruby
def chat(messages:, tools: nil, tool_choice: :auto, stream: nil, **options)
  # If `stream` is a Proc (or array of Procs), yields RCrewAI::Events::*
  # and still returns the final aggregated result.
  # If `tools:` is given, uses native function calling on the provider.
  # Returns:
  # {
  #   content: String | nil,
  #   tool_calls: [{ id:, name:, arguments: Hash }],
  #   usage: { prompt_tokens:, completion_tokens:, total_tokens: },
  #   finish_reason: Symbol, # :stop | :tool_calls | :length | :max_iterations
  #   model: String,
  #   provider: Symbol
  # }
end

def supports_native_tools?(model: config.model)
  # OpenAI / Anthropic / Google / Azure: always true
  # Ollama: allowlist check against known tool-capable models
end
```

### Provider matrix for v1

| Provider | Native tools | Streaming | Notes |
|---|---|---|---|
| OpenAI | ✅ | ✅ | Reference impl |
| Anthropic | ✅ | ✅ | Prompt-caching hook on `system` block |
| Google (Gemini) | ✅ | ✅ | `functionDeclarations` shape |
| Azure | ✅ | ✅ | Reuses OpenAI client logic |
| Ollama | ✅ (llama3.1+, qwen2.5, mistral-nemo, etc.) | ✅ | Auto-detects; falls back to ReAct on legacy models |

Capability detection: `LLMClient#supports_native_tools?(model:)` is checked once per agent run; the result is logged at INFO level as `[rcrewai] agent=<name> mode=native_tools|react_legacy provider=<provider>`. Suppressible via `config.log_level = :warn`.
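The Ollama-side allowlist check could look roughly like the following sketch. The model patterns here are illustrative assumptions, not the gem's actual table:

```ruby
# Illustrative shape of an allowlist-based capability check; the patterns
# below are examples, not rcrewai's real model list.
TOOL_CAPABLE_MODELS = [/\Allama3\.[12]/, /\Aqwen2\.5/, /\Amistral-nemo/].freeze

def supports_native_tools?(model)
  TOOL_CAPABLE_MODELS.any? { |pattern| pattern.match?(model) }
end

supports_native_tools?("llama3.1:8b") # => true
supports_native_tools?("llama2:13b")  # => false
```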

---

## 2. Tool Schema DSL

A small class-level DSL on `Tools::Base` produces a canonical JSON schema. Per-provider adapters reshape the canonical schema for OpenAI / Anthropic / Google / Ollama (small differences in nesting and keys).

### DSL surface

```ruby
class WebSearch < RCrewAI::Tools::Base
  tool_name "web_search"
  description "Search the web via DuckDuckGo and return top results"

  param :query, type: :string, required: true,
        description: "Natural-language search query"
  param :max_results, type: :integer, default: 10,
        description: "Number of results to return (1-25)"

  def execute(query:, max_results: 10)
    # existing implementation
  end
end
```

### Supported types

`:string`, `:integer`, `:number`, `:boolean`, `:array` (with `items:`), `:object` (with `properties:`), `:enum` (with `values: [...]`).
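The mapping from a `param` declaration to its slot in the canonical schema can be sketched like this. The helper name `param_schema` is hypothetical, not the gem's internal API:

```ruby
# Illustrative mapping from a `param` declaration's options to the
# corresponding JSON-schema fragment; `param_schema` is a made-up helper.
def param_schema(type:, description: nil, default: nil, values: nil, items: nil)
  schema =
    case type
    when :enum  then { type: "string", enum: values }  # enum rides on string
    when :array then { type: "array", items: items }
    else             { type: type.to_s }
    end
  schema[:description] = description if description
  schema[:default] = default unless default.nil?
  schema
end

param_schema(type: :integer, description: "Max results", default: 10)
# => { type: "integer", description: "Max results", default: 10 }
```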

### Generated canonical schema

```ruby
WebSearch.json_schema
# => {
#   name: "web_search",
#   description: "Search the web via DuckDuckGo and return top results",
#   parameters: {
#     type: "object",
#     properties: {
#       query: { type: "string", description: "Natural-language search query" },
#       max_results: { type: "integer", description: "Number of results to return (1-25)", default: 10 }
#     },
#     required: ["query"]
#   }
# }
```

### Validation

`Tools::Base#execute_with_validation(args_hash)` validates `args_hash` against the schema (types coerced where unambiguous, e.g. `"10"` → `10` for an `:integer` param) before delegating to `execute(**)`. Bad types raise `ToolError` with a clean message that flows back to the LLM as a tool-result, allowing the agent to recover on the next iteration.
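A minimal sketch of the kind of unambiguous coercion described above (the helper name is assumed, not the gem's internals; in the gem, failures would surface as `ToolError` rather than raw exceptions):

```ruby
# Sketch of schema-driven coercion: convert only when the conversion is
# unambiguous, and let failures raise so the caller can wrap them.
def coerce_value(value, type)
  case type
  when :integer then Integer(value)  # "10" -> 10; "ten" raises ArgumentError
  when :number  then Float(value)
  when :boolean
    return value if [true, false].include?(value)
    { "true" => true, "false" => false }.fetch(value.to_s.downcase)
  else value
  end
end

coerce_value("10", :integer)   # => 10
coerce_value("true", :boolean) # => true
```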

### Fallback for undeclared tools

Tools without DSL declarations receive a permissive schema (`{ type: "object", additionalProperties: true }`) and a deprecation warning printed once per process. They still work in native-tools mode (the model gets less guidance) and still work in ReAct mode (unchanged).

### Built-in tool migration

All 7 built-in tools (`web_search`, `file_reader`, `file_writer`, `sql_database`, `email_sender`, `code_executor`, `pdf_processor`) get DSL declarations as part of this work. Each is ~5 added lines.

---

## 3. Native tool-call loop (`ToolRunner`)

The current ~300 lines of prompt-template and regex-parsing code in `Agent#execute_task` are split in two:

- The new **`ToolRunner`** drives a native function-calling conversation.
- The extracted **`LegacyReactRunner`** preserves the current `USE_TOOL[]` semantics for fallback.

`Agent#execute_task` becomes a thin orchestrator: build initial messages → choose runner → return runner result.

### `ToolRunner` interface

```ruby
class ToolRunner
  def initialize(agent:, llm:, tools:, max_iterations: 10, event_sink: nil); end

  def run(messages:)
    # Returns:
    # {
    #   content: String,
    #   tool_calls_history: [{ tool:, args:, result:, duration_ms: }, ...],
    #   usage: { prompt_tokens:, completion_tokens:, total_tokens: },
    #   iterations: Integer,
    #   finish_reason: Symbol
    # }
  end
end
```

### Loop algorithm (per iteration)

1. Call `llm.chat(messages:, tools: tools.map(&:json_schema), stream: event_sink)`.
2. Emit `Events::IterationStart` / `Events::TextDelta` / `Events::TextDone` / `Events::Usage` as deltas arrive.
3. If the response contains `tool_calls`:
   - For each call: emit `Events::ToolCallStart`, run `tool.execute_with_validation(args)`, emit `Events::ToolCallResult` (or `Events::ToolCallError`).
   - Honor `agent.require_approval_for_tools` (existing human-in-loop hook fires here).
   - Record into `agent.memory` via existing `memory.add_tool_usage`.
   - Append the assistant message (with `tool_calls`) and each tool-result message to `messages`.
   - Emit `Events::IterationEnd(finish_reason: :tool_calls)`; continue.
4. If no `tool_calls` (i.e. `finish_reason: :stop`): emit `Events::IterationEnd(finish_reason: :stop)`; return final content.
5. If `iterations >= max_iterations`: emit `Events::IterationEnd(finish_reason: :max_iterations)`; return best-effort content with a logged warning.
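Stripped of events, approval, and memory, the control flow above reduces to a small loop. This is a self-contained sketch with a stubbed LLM and tool table, not the gem's implementation:

```ruby
# Self-contained sketch of the iteration loop: call the LLM, execute any
# requested tools, feed results back, stop on a plain answer or on the cap.
def run_loop(llm:, tools:, messages:, max_iterations: 10)
  iterations = 0
  loop do
    iterations += 1
    resp = llm.call(messages)
    calls = resp[:tool_calls] || []
    if calls.empty? || iterations >= max_iterations
      return { content: resp[:content],
               iterations: iterations,
               finish_reason: calls.empty? ? :stop : :max_iterations }
    end
    messages << { role: "assistant", tool_calls: calls }
    calls.each do |call|
      result = tools.fetch(call[:name]).call(call[:arguments])
      messages << { role: "tool", tool_call_id: call[:id], content: result }
    end
  end
end

# Stub LLM: first reply requests one tool call, second reply answers.
replies = [
  { tool_calls: [{ id: "1", name: "add", arguments: { a: 2, b: 3 } }] },
  { content: "The sum is 5", tool_calls: [] }
]
llm   = ->(_messages) { replies.shift }
tools = { "add" => ->(args) { (args[:a] + args[:b]).to_s } }

result = run_loop(llm: llm, tools: tools, messages: [])
result[:content]       # => "The sum is 5"
result[:finish_reason] # => :stop
```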

### Runner selection

```ruby
runner = if llm.supports_native_tools?(model: config.model) && tools.all? { |t| t.respond_to?(:json_schema) }
  ToolRunner.new(...)
else
  LegacyReactRunner.new(...)
end
```

A single INFO-level log line records the choice. Both runners emit the same event types, so streaming consumers don't care which is in use.

### Agent invariants preserved

- `Agent#use_tool(name, **args)` (direct invocation API) is unchanged.
- Both runners route tool execution through `Agent#use_tool` so the human-approval hook and memory recording are shared.
- `Agent#execute_task` return value gains a `tool_calls_history:` key. All existing keys are unchanged.

---

## 4. Streaming event model

A single typed event stream consumed by UIs, observability, cost tracking, and the tool runner itself.

### Event types — `RCrewAI::Events`

```ruby
module RCrewAI::Events
  Event = Struct.new(:type, :timestamp, :agent, :iteration, keyword_init: true)

  TextDelta      = Class.new(Event) # adds :text (partial)
  TextDone       = Class.new(Event) # adds :text (full)
  ToolCallStart  = Class.new(Event) # adds :tool, :args, :call_id
  ToolCallResult = Class.new(Event) # adds :tool, :call_id, :result, :duration_ms
  ToolCallError  = Class.new(Event) # adds :tool, :call_id, :error
  Thinking       = Class.new(Event) # adds :text (Anthropic extended thinking, etc.)
  Usage          = Class.new(Event) # adds :prompt_tokens, :completion_tokens, :total_tokens, :cost_usd
  IterationStart = Class.new(Event) # adds :iteration_index
  IterationEnd   = Class.new(Event) # adds :finish_reason
  Error          = Class.new(Event) # adds :error
end
```

Every event carries `agent` (name) and `iteration` so consumers can correlate without external bookkeeping.

### Public API — three idioms

```ruby
# 1. Block form
crew.execute(stream: true) do |event|
  case event
  when RCrewAI::Events::TextDelta then print event.text
  when RCrewAI::Events::ToolCallStart then puts "→ #{event.tool}(#{event.args})"
  when RCrewAI::Events::Usage then meter.record(event)
  end
end

# 2. Multiple listeners (good for UI + logging + metering simultaneously)
crew.execute(stream: [logger_sink, cost_sink, ui_sink]) # each responds to #call(event)

# 3. Convenience text-only
agent.execute_task(task).each_text_delta { |chunk| print chunk }
```

### Plumbing path

```
LLMClient (SSE parse) → ToolRunner (orchestrates) → Agent (tags :agent) → Crew (fans out)
```

Each layer is a Proc that calls the next. No coupling between producers and consumers.
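That plumbing can be sketched with plain Procs. The sink names and hash-shaped events below are illustrative stand-ins for the real `Events::*` objects:

```ruby
# Sketch of the layering: a fan-out Proc forwards each event to every sink,
# and a tagging layer enriches the event before passing it down the chain.
received = []
ui_sink   = ->(e) { received << [:ui,   e[:agent], e[:type]] }
cost_sink = ->(e) { received << [:cost, e[:agent], e[:type]] }

fan_out   = ->(sinks) { ->(event) { sinks.each { |sink| sink.call(event) } } }
tag_agent = ->(sink, agent) { ->(event) { sink.call(event.merge(agent: agent)) } }

# Compose: Agent layer tags, Crew layer fans out to all listeners.
sink = tag_agent.call(fan_out.call([ui_sink, cost_sink]), "researcher")
sink.call(type: :text_delta, text: "Hel")

received
# => [[:ui, "researcher", :text_delta], [:cost, "researcher", :text_delta]]
```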

### Per-provider SSE parsing

- **OpenAI:** `chat.completions` stream with `delta.content` + `delta.tool_calls[].function.arguments` (incremental JSON; assembled before emission).
- **Anthropic:** `message_delta` / `content_block_delta` / `input_json_delta` events.
- **Google:** `streamGenerateContent` with `candidates[].content.parts[].functionCall`.
- **Azure:** identical to OpenAI.
- **Ollama:** line-delimited JSON; `message.tool_calls` arrives as a single chunk (less granular but still streamed).
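The SSE framing the HTTP providers share can be sketched generically: events are separated by a blank line and payloads arrive on `data:` lines. A real parser (the diff adds `lib/rcrewai/sse_parser.rb` for this) also has to handle `event:`/`id:` fields and partial network reads:

```ruby
# Minimal sketch of SSE framing: split on blank lines, collect "data:" lines.
# Real-world parsers must also buffer partial chunks across reads.
def sse_events(raw)
  raw.split("\n\n").filter_map do |block|
    data = block.lines
                .grep(/\Adata: /)
                .map { |line| line.delete_prefix("data: ").chomp }
                .join("\n")
    data unless data.empty?
  end
end

sse_events("data: {\"a\":1}\n\ndata: [DONE]\n\n")
# => ["{\"a\":1}", "[DONE]"]
```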

Each adapter normalizes to the canonical `Events::*` shape.

### Cost tracking

`Events::Usage` includes `cost_usd` computed from `lib/rcrewai/pricing.rb` (per-model price table for OpenAI/Anthropic/Google list prices, shipped at release time). Users can override:

```ruby
RCrewAI.configure do |c|
  c.pricing = { "gpt-4o" => { input: 2.50, output: 10.00 } } # USD per 1M tokens
end
```

If a model isn't in the table, `cost_usd` is `nil` — not an error.
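The calculator itself is just the per-1M-token formula. A sketch, with the table entry mirroring the example above (the real prices live in `pricing.rb`):

```ruby
# Sketch of the cost formula: prices are USD per 1M tokens; unknown models
# map to nil rather than raising.
PRICES = { "gpt-4o" => { input: 2.50, output: 10.00 } }.freeze

def cost_usd(model, prompt_tokens, completion_tokens, table: PRICES)
  price = table[model]
  return nil unless price
  (prompt_tokens * price[:input] + completion_tokens * price[:output]) / 1_000_000.0
end

cost_usd("gpt-4o", 1_000, 500)          # => 0.0075
cost_usd("some-local-model", 1_000, 500) # => nil
```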

### Non-streaming callers

If `stream:` is not provided, the event stream is collected internally and discarded. Public return value is unchanged.

---

## 5. MCP client integration

MCP is JSON-RPC 2.0. We add a client that connects to MCP servers (stdio subprocess or HTTP) and surfaces remote tools as `RCrewAI::Tools::Base` instances — so they run through the same `ToolRunner`, schema validation, human-approval hook, memory recording, and event emission as native tools. No special-casing.

### Module — `RCrewAI::MCP`

```ruby
module RCrewAI::MCP
  class Client            # JSON-RPC 2.0 over a Transport
  class Transport::Stdio  # spawn subprocess, talk over stdin/stdout
  class Transport::Http   # Streamable HTTP per MCP spec (POST + SSE)
  class ToolAdapter       # wraps one MCP tool as Tools::Base
  class Error
end
```

### User-facing API

```ruby
# Stdio (subprocess) — most common deployment
github = RCrewAI::MCP::Client.connect(
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-github"],
  env: { "GITHUB_TOKEN" => ENV.fetch("GITHUB_TOKEN") }
)

# HTTP (remote MCP server)
linear = RCrewAI::MCP::Client.connect(
  url: "https://mcp.linear.app/sse",
  headers: { "Authorization" => "Bearer #{ENV['LINEAR_TOKEN']}" }
)

agent = RCrewAI::Agent.new(
  name: "engineer",
  role: "Backend Engineer",
  tools: github.tools + linear.tools + [RCrewAI::Tools::FileWriter.new]
)
```

### What `client.tools` returns

On connect, the client performs the `initialize` handshake and then `tools/list`. Each remote tool becomes a `ToolAdapter` whose:

- `tool_name` ← MCP tool name, prefixed with the server name (e.g. `github__create_issue`) to avoid collisions.
- `description` ← MCP tool description.
- `json_schema` ← MCP tool's `inputSchema` (already JSON Schema — no translation).
- `execute(**args)` → sends `tools/call`, returns the result content (text / image / resource).
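The wire format behind those calls is plain JSON-RPC 2.0 (newline-delimited JSON over stdio). A sketch of the request framing and the collision-avoiding name prefix; both helper names are illustrative:

```ruby
require "json"

# Sketch of JSON-RPC 2.0 framing for an MCP request: the id correlates the
# eventual response with this call.
def jsonrpc_request(id, method, params = {})
  JSON.generate(jsonrpc: "2.0", id: id, method: method, params: params)
end

# Prefix remote tool names with the server name, as described above.
def prefixed_name(server, tool)
  "#{server}__#{tool}"
end

jsonrpc_request(1, "tools/list")
# => {"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}
prefixed_name("github", "create_issue")
# => "github__create_issue"
```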

### Lifecycle

```ruby
client = RCrewAI::MCP::Client.connect(...) # opens transport, handshakes, lists tools
# ...
client.close # graceful shutdown

# Or block form (auto-closes):
RCrewAI::MCP::Client.with_connection(command: ...) do |client|
  crew.add_agent(RCrewAI::Agent.new(..., tools: client.tools))
  crew.execute
end
```

A finalizer plus an `at_exit` hook ensure subprocesses are killed even on hard exit.
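The double guarantee can be sketched with a hypothetical stand-in class: the block form closes in an `ensure`, and an `at_exit` hook backstops hard exits:

```ruby
# Sketch with a made-up FakeClient: `ensure` covers the block form (even if
# the block raises), and `at_exit` covers process shutdown.
class FakeClient
  attr_reader :closed

  def initialize
    @closed = false
    at_exit { close } # backstop: runs at interpreter shutdown
  end

  def close
    @closed = true
  end

  def self.with_connection
    client = new
    yield client
  ensure
    client&.close # guaranteed on normal and exceptional block exit
  end
end

captured = nil
FakeClient.with_connection { |client| captured = client }
captured.closed # => true
```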

### Dependencies

No new gem dependency. We hand-roll a minimal JSON-RPC 2.0 client (~100 LOC) since we only need request/response + notifications. SSE is handled by the same parser used by the LLM clients.

---

## 6. Backward compatibility & migration

### Unchanged public surface (zero migration)

- `RCrewAI::Agent.new(name:, role:, goal:, tools: [...])`
- `RCrewAI::Crew.new(...).execute` (return shape unchanged when `stream:` is not passed)
- `RCrewAI::Task.new(...)`
- All `human_input`, `require_approval_for_*`, `manager`, `allow_delegation` flags
- Custom tools that subclass `Tools::Base` and implement `execute(**params)`

### Behavior changes a user might observe

1. **Tool calls are more reliable.** Agents using OpenAI / Anthropic / Google / Azure (or modern Ollama models) now use native function calling. Fewer "agent forgot to call the tool" and "argument got mangled" failures. *This is the headline win.*
2. **One new INFO log line per agent run** identifies the mode and provider. Suppressible via `config.log_level = :warn`.
3. **Tools without DSL declarations** get a permissive fallback schema and a one-time deprecation warning.

### Hard breaking changes (called out in CHANGELOG)

- `LLMClients::Base#chat` adds `tools:` and `stream:` keyword args. Subclasses outside the gem that override `chat` with an explicit kwarg list need to add these (or accept `**options`). Realistic blast radius: near-zero.
- `Agent#execute_task` return value gains a `tool_calls_history:` key. Pure addition; existing keys unchanged.

### Versioning

**0.3.0** — minor bump under 1.0 conventions. CHANGELOG explicitly lists the two hard breaks above.

### Migration guide (`docs/upgrading-to-0.3.md`)

1. *(Optional, recommended)* Add `tool_name`, `description`, `param` declarations to custom tools. Before/after example provided.
2. *(Optional)* Adopt streaming via `crew.execute(stream: ...) { |event| ... }`.
3. *(Optional)* Wire in MCP servers via `RCrewAI::MCP::Client.connect(...)`.
4. *(Only if affected)* If you subclassed `LLMClients::Base#chat` with explicit kwargs, add `tools: nil, stream: nil` to the signature.

### Companion gem (`rcrewai-rails`)

Needs a coordinated release. The Rails engine's `Crew#execute` ActiveJob wrapper will gain a `stream:` parameter that pipes events into ActionCable for live UI. Tracked as a follow-up issue, not blocking 0.3.0 of the core gem.

---

## 7. Testing strategy

### New specs

```
spec/
├── tool_schema_spec.rb          # DSL → JSON schema (all types, edge cases, validation)
├── tool_runner_spec.rb          # Multi-iteration loop, max_iterations, error recovery,
│                                #   human-approval hook, memory recording, event emission
├── legacy_react_runner_spec.rb  # Pins existing USE_TOOL[] behavior
├── events_spec.rb               # Event tagging, fan-out to multiple sinks
├── mcp/
│   ├── client_spec.rb           # JSON-RPC framing, handshake, tools/list, tools/call
│   ├── transport/stdio_spec.rb  # Subprocess lifecycle, stderr passthrough, kill on close
│   ├── transport/http_spec.rb   # POST + SSE, header auth, reconnection
│   └── tool_adapter_spec.rb     # MCP tool → Tools::Base behavior parity
├── llm_clients/
│   ├── openai_spec.rb           # Tools payload shape, SSE delta parsing, tool_call assembly
│   ├── anthropic_spec.rb        # Tools shape, content_block_delta parsing, caching hook
│   ├── google_spec.rb           # functionDeclarations shape, streamGenerateContent
│   ├── azure_spec.rb            # Inherits OpenAI; one auth-shape smoke test
│   └── ollama_spec.rb           # Native-tools allowlist, ReAct fallback
├── pricing_spec.rb              # Cost calculation; missing model returns nil
└── integration/
    ├── native_tool_calling_spec.rb   # End-to-end agent + real tool + recorded LLM responses
    ├── streaming_spec.rb             # End-to-end event stream with multiple consumers
    ├── mcp_end_to_end_spec.rb        # Real subprocess (fixture MCP server in spec/fixtures/)
    └── legacy_react_fallback_spec.rb # Old USE_TOOL[] path still works on tools w/o DSL
```

### Recording strategy

- **Unit tests** for LLM clients use **webmock** with fixture JSON/SSE bodies in `spec/fixtures/llm_responses/` — deterministic, fast, no network.
- **Integration tests** use **VCR cassettes** (already a dev dep) recorded once against real providers and committed. Re-record via `RECORD=true rspec`.
- **MCP integration** ships a tiny stdio MCP server (~50 LOC Ruby script) in `spec/fixtures/mcp_servers/` so CI doesn't need `npx`.

### Regression coverage

- `legacy_react_runner_spec.rb` pins current `USE_TOOL[]` semantics.
- Existing `agent_spec.rb` and `crew_spec.rb` get added cases for the new `stream:` kwarg (no-op when not provided) and the new `tool_calls_history:` return key.

### CI

- `.github/workflows/ci.yml` runs Ruby 3.0, 3.1, 3.2, 3.3 on Linux.
- MCP integration test gated by Ruby ≥ 3.1.

### Coverage target

`simplecov` (already a dev dep) configured to fail CI at <85% line coverage for `lib/rcrewai/{tool_schema,tool_runner,events,mcp}/**`. No coverage bar on legacy files in this PR.

### Out of scope for v1 tests

- Live MCP servers from the wild (Slack, GitHub) — too flaky for CI; the fixture server suffices.
- Real cost figures against provider invoices — only the calculator formula.

---

## Appendix A — Order of work (rough)

1. `ToolSchema` DSL + JSON schema emitter + per-provider schema adapters.
2. Migrate the 7 built-in tools to declare schemas.
3. `Events` module + `LLMClients::Base` new `chat` contract.
4. OpenAI native tools + streaming (reference implementation).
5. `ToolRunner` + extract `LegacyReactRunner` from `Agent#execute_task`.
6. Anthropic, Google, Azure native tools + streaming.
7. Ollama native tools (allowlist) + streaming + ReAct fallback.
8. `Pricing` + cost on `Events::Usage`.
9. MCP transports (stdio, HTTP) + `Client` + `ToolAdapter`.
10. CHANGELOG, `docs/upgrading-to-0.3.md`, `docs/mcp.md`, examples.

The implementation plan (writing-plans skill) will refine this into discrete, testable tasks with explicit dependencies.

@@ -0,0 +1,163 @@
# Upgrading to RCrewAI 0.3

RCrewAI 0.3 modernizes the LLM layer with native function calling, typed
streaming, and an MCP client. Existing code continues to run, but you can
opt into the new capabilities incrementally.

This guide is split into three parts:

1. **What you must do** — required changes to keep your code working.
2. **What you should do** — recommended changes to take advantage of the
   new behavior.
3. **What you can do** — new capabilities you can adopt at your own pace.

---

## 1. What you must do

### 1a. Custom `LLMClients::Base` subclasses

If you have a custom LLM client that overrides `chat` with a fixed kwarg
list, add `tools:` and `stream:` (or accept `**options`).

**Before:**

```ruby
class MyClient < RCrewAI::LLMClients::Base
  def chat(messages:, temperature: 0.1)
    # ...
  end
end
```

**After:**

```ruby
class MyClient < RCrewAI::LLMClients::Base
  def chat(messages:, tools: nil, tool_choice: :auto, stream: nil, **options)
    # ...
  end

  def supports_native_tools?(model: config.model)
    false # or true if your provider supports OpenAI-style function calls
  end
end
```

### 1b. Tools without DSL declarations now print a deprecation warning

Tools that don't declare a schema still work (with a permissive fallback
schema) but emit a one-time `[rcrewai] Tool ... has no DSL declarations`
warning on stderr. To clear the warning, declare a schema:

**Before:**

```ruby
class WeatherTool < RCrewAI::Tools::Base
  def initialize
    @name = "weather"
    @description = "Get the weather for a city"
  end

  def execute(city:)
    # ...
  end
end
```

**After:**

```ruby
class WeatherTool < RCrewAI::Tools::Base
  tool_name "weather"
  description "Get the weather for a city"
  param :city, type: :string, required: true, description: "City name"

  def execute(city:)
    # ...
  end
end
```

---

## 2. What you should do

### 2a. Consume `Agent#execute_task` return as a hash

In 0.3, `execute_task` returns a hash with `:content`, `:tool_calls_history`,
`:usage`, `:iterations`, and `:finish_reason`. The agent still assigns
the plain string content to `task.result`, so old code that reads
`task.result` continues to work.

**Before:**

```ruby
result_string = agent.execute_task(task)
puts result_string
```

**After:**

```ruby
result = agent.execute_task(task)
puts result[:content]
puts "tokens: #{result.dig(:usage, :total_tokens)}"
puts "tool calls: #{result[:tool_calls_history].length}"
```

---

## 3. What you can do (new capabilities)

### 3a. Native function calling

When a tool has a DSL-declared schema and your LLM supports native function
calls, the agent automatically uses the new `ToolRunner` path — no more
`USE_TOOL[name](k=v)` prompt parsing. There's nothing to opt into beyond
declaring the schema (see 1b).

### 3b. Streaming events

Pass a `stream:` lambda to `agent.execute_task` or `crew.execute`. You'll
receive typed events as the LLM produces output:

```ruby
events = []
agent.execute_task(task, stream: ->(e) { events << e })

text = events
  .select { |e| e.is_a?(RCrewAI::Events::TextDelta) }
  .map(&:text).join
```

Event types: `TextDelta`, `ToolCallStart`, `ToolCallResult`, `ToolCallError`,
`Usage` (with `cost_usd`), `IterationStart`, `IterationEnd`, `Error`.

### 3c. Cost tracking

When `Pricing` knows your model, `Events::Usage` carries `cost_usd`:

```ruby
total_cost = 0.0
crew.execute(stream: ->(e) {
  total_cost += e.cost_usd if e.is_a?(RCrewAI::Events::Usage) && e.cost_usd
})
```

Override the table via `RCrewAI.configuration.pricing = { "my-model" => { input: 1.0, output: 5.0 } }`.

### 3d. MCP servers as agent tools

Connect to an MCP server (stdio or HTTP) and pass its tools to your agent:

```ruby
RCrewAI::MCP::Client.with_connection(command: "npx", args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]) do |client|
  agent = RCrewAI::Agent.new(name: "fs", role: "...", goal: "...", tools: client.tools)
  task = RCrewAI::Task.new(name: "ls", description: "list /tmp", agent: agent)
  result = agent.execute_task(task)
  puts result[:content]
end
```

See `docs/mcp.md` for the full guide.