rcrewai 0.2.0 → 0.3.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (57)
  1. checksums.yaml +4 -4
  2. data/.rubocop.yml +1 -0
  3. data/.rubocop_todo.yml +99 -0
  4. data/CHANGELOG.md +24 -0
  5. data/README.md +33 -1
  6. data/Rakefile +53 -53
  7. data/bin/rcrewai +3 -3
  8. data/docs/mcp.md +109 -0
  9. data/docs/superpowers/plans/2026-05-11-llm-modernization.md +2753 -0
  10. data/docs/superpowers/specs/2026-05-11-llm-modernization-design.md +479 -0
  11. data/docs/upgrading-to-0.3.md +163 -0
  12. data/examples/async_execution_example.rb +82 -81
  13. data/examples/hierarchical_crew_example.rb +68 -72
  14. data/examples/human_in_the_loop_example.rb +73 -74
  15. data/examples/mcp_example.rb +48 -0
  16. data/examples/native_tools_example.rb +64 -0
  17. data/examples/streaming_example.rb +56 -0
  18. data/lib/rcrewai/agent.rb +148 -287
  19. data/lib/rcrewai/async_executor.rb +43 -43
  20. data/lib/rcrewai/cli.rb +11 -11
  21. data/lib/rcrewai/configuration.rb +14 -9
  22. data/lib/rcrewai/crew.rb +56 -39
  23. data/lib/rcrewai/events.rb +30 -0
  24. data/lib/rcrewai/human_input.rb +104 -114
  25. data/lib/rcrewai/legacy_react_runner.rb +172 -0
  26. data/lib/rcrewai/llm_client.rb +1 -1
  27. data/lib/rcrewai/llm_clients/anthropic.rb +174 -54
  28. data/lib/rcrewai/llm_clients/azure.rb +23 -128
  29. data/lib/rcrewai/llm_clients/base.rb +11 -7
  30. data/lib/rcrewai/llm_clients/google.rb +159 -95
  31. data/lib/rcrewai/llm_clients/ollama.rb +150 -106
  32. data/lib/rcrewai/llm_clients/openai.rb +140 -63
  33. data/lib/rcrewai/mcp/client.rb +101 -0
  34. data/lib/rcrewai/mcp/tool_adapter.rb +59 -0
  35. data/lib/rcrewai/mcp/transport/http.rb +53 -0
  36. data/lib/rcrewai/mcp/transport/stdio.rb +55 -0
  37. data/lib/rcrewai/mcp.rb +8 -0
  38. data/lib/rcrewai/memory.rb +45 -37
  39. data/lib/rcrewai/pricing.rb +34 -0
  40. data/lib/rcrewai/process.rb +86 -95
  41. data/lib/rcrewai/provider_schema.rb +38 -0
  42. data/lib/rcrewai/sse_parser.rb +55 -0
  43. data/lib/rcrewai/task.rb +56 -64
  44. data/lib/rcrewai/tool_runner.rb +132 -0
  45. data/lib/rcrewai/tool_schema.rb +97 -0
  46. data/lib/rcrewai/tools/base.rb +98 -37
  47. data/lib/rcrewai/tools/code_executor.rb +71 -74
  48. data/lib/rcrewai/tools/email_sender.rb +70 -78
  49. data/lib/rcrewai/tools/file_reader.rb +38 -30
  50. data/lib/rcrewai/tools/file_writer.rb +40 -38
  51. data/lib/rcrewai/tools/pdf_processor.rb +115 -130
  52. data/lib/rcrewai/tools/sql_database.rb +58 -55
  53. data/lib/rcrewai/tools/web_search.rb +26 -25
  54. data/lib/rcrewai/version.rb +2 -2
  55. data/lib/rcrewai.rb +18 -10
  56. data/rcrewai.gemspec +55 -36
  57. metadata +86 -50
**`data/docs/superpowers/specs/2026-05-11-llm-modernization-design.md`** (new file, +479 lines)

# LLM Modernization: Native Tool Calling, Streaming, and MCP Client

**Date:** 2026-05-11
**Status:** Approved design (pending implementation plan)
**Target version:** `rcrewai` 0.3.0 (current is 0.2.1)
**Scope:** Replace prompt-engineered ReAct tool calls with native function calling across all five LLM providers; add a typed streaming event model; add an MCP (Model Context Protocol) client so RCrewAI agents can consume external MCP servers as ordinary tools.

---

## Motivation

The current tool-call mechanism is ReAct-style: agents are prompted to emit `USE_TOOL[name](k=v)`, which a regex in `Agent#execute_task` then parses. This works but is fragile — argument escaping, multi-word string values, nested data, and "the model forgot to call the tool" are all recurring failure modes that providers' native function-calling APIs handle correctly.

Streaming is absent entirely, which blocks UI work, live cost tracking, and observability/tracing.

MCP has emerged as the de facto standard protocol for exposing tools (filesystem, GitHub, Slack, browsers, internal services) to agent frameworks. Supporting it as a client unlocks a large existing tool ecosystem with zero per-tool code.

## Decisions captured during brainstorming

| # | Decision | Choice |
|---|---|---|
| 1 | Backward compatibility posture | **Dual-mode, native preferred.** Existing tools and the `USE_TOOL[]` path keep working; native tool calling is the default when both the provider and the tool support it. |
| 2 | Tool schema definition | **Explicit DSL** on `Tools::Base` (`tool_name`, `description`, `param`). Tools without DSL declarations get a permissive fallback schema. |
| 3 | MCP scope | **Client only** in v1 (stdio + HTTP/SSE transports). Server-mode deferred. |
| 4 | Streaming surface | **Full typed event stream** (text deltas, tool-call lifecycle, usage, errors). Text-only and listener-object idioms are thin wrappers. |
| 5 | Provider matrix | All five providers (OpenAI, Anthropic, Google, Azure, Ollama) gain native tools + streaming, with auto-fallback to ReAct for Ollama models that don't support tools. |

## Non-goals (v1)

- MCP **server** mode (exposing RCrewAI as a callable MCP service).
- MCP resources and prompts (only tools are wired).
- MCP OAuth flows for HTTP servers; header-based auth only.
- MCP sampling (server-initiated LLM calls).
- Live cost reconciliation against provider invoices (we provide a calculator + price table only).
- New providers beyond the existing five (Bedrock, Groq, OpenRouter, Mistral) — tracked as a follow-up.

---

## 1. Architecture overview

### New module layout (additive)

```
lib/rcrewai/
├── tool_schema.rb            # NEW – Schema DSL & JSON-schema emitter
├── events.rb                 # NEW – Typed event classes for streaming
├── tool_runner.rb            # NEW – Native tool-call loop
├── legacy_react_runner.rb    # NEW – Extracted ReAct loop from agent.rb
├── pricing.rb                # NEW – Per-model price table for cost tracking
├── mcp/
│   ├── client.rb             # NEW – MCP protocol client (JSON-RPC 2.0)
│   ├── transport/stdio.rb    # NEW
│   ├── transport/http.rb     # NEW – Streamable HTTP (SSE)
│   └── tool_adapter.rb       # NEW – Wraps MCP tool as RCrewAI::Tools::Base
└── llm_clients/
    ├── base.rb               # MODIFIED – new chat() contract
    ├── openai.rb             # MODIFIED – tools + streaming
    ├── anthropic.rb          # MODIFIED – tools + streaming + prompt-caching hook
    ├── google.rb             # MODIFIED – tools + streaming
    ├── azure.rb              # MODIFIED – inherits OpenAI shape
    └── ollama.rb             # MODIFIED – native tools (allowlist) + streaming
```

### New `chat` contract on `LLMClients::Base`

```ruby
def chat(messages:, tools: nil, tool_choice: :auto, stream: nil, **options)
  # If `stream` is a Proc (or array of Procs), yields RCrewAI::Events::*
  # and still returns the final aggregated result.
  # If `tools:` is given, uses native function calling on the provider.
  # Returns:
  #   {
  #     content: String | nil,
  #     tool_calls: [{ id:, name:, arguments: Hash }],
  #     usage: { prompt_tokens:, completion_tokens:, total_tokens: },
  #     finish_reason: Symbol, # :stop | :tool_calls | :length | :max_iterations
  #     model: String,
  #     provider: Symbol
  #   }
end

def supports_native_tools?(model: config.model)
  # OpenAI / Anthropic / Google / Azure: always true
  # Ollama: allowlist check against known tool-capable models
end
```

### Provider matrix for v1

| Provider | Native tools | Streaming | Notes |
|---|---|---|---|
| OpenAI | ✅ | ✅ | Reference impl |
| Anthropic | ✅ | ✅ | Prompt-caching hook on `system` block |
| Google (Gemini) | ✅ | ✅ | `functionDeclarations` shape |
| Azure | ✅ | ✅ | Reuses OpenAI client logic |
| Ollama | ✅ (llama3.1+, qwen2.5, mistral-nemo, etc.) | ✅ | Auto-detects; falls back to ReAct on legacy models |

Capability detection: `LLMClient#supports_native_tools?(model:)` is checked once per agent run; the result is logged at INFO level as `[rcrewai] agent=<name> mode=native_tools|react_legacy provider=<provider>`. Suppressible via `config.log_level = :warn`.

---

## 2. Tool Schema DSL

A small class-level DSL on `Tools::Base` produces a canonical JSON schema. Per-provider adapters reshape the canonical schema for OpenAI / Anthropic / Google / Ollama (small differences in nesting and keys).

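
As a concrete illustration of those per-provider differences (the diff adds a `provider_schema.rb`, but the method names below are assumptions for illustration): OpenAI wraps the schema in a `function` envelope, Anthropic renames `parameters` to `input_schema`, and Gemini groups declarations under `functionDeclarations`.

```ruby
# Illustrative per-provider schema adapters; not the gem's actual file.
module ProviderSchema
  module_function

  # canonical: { name:, description:, parameters: { ...JSON schema... } }
  def for_openai(canonical)
    { type: "function", function: canonical }
  end

  def for_anthropic(canonical)
    {
      name: canonical[:name],
      description: canonical[:description],
      input_schema: canonical[:parameters] # Anthropic's key for the JSON schema
    }
  end

  def for_google(canonical)
    # Gemini expects tools grouped as { functionDeclarations: [...] }
    { functionDeclarations: [canonical] }
  end
end
```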
### DSL surface

```ruby
class WebSearch < RCrewAI::Tools::Base
  tool_name "web_search"
  description "Search the web via DuckDuckGo and return top results"

  param :query, type: :string, required: true,
        description: "Natural-language search query"
  param :max_results, type: :integer, default: 10,
        description: "Number of results to return (1-25)"

  def execute(query:, max_results: 10)
    # existing implementation
  end
end
```

### Supported types

`:string`, `:integer`, `:number`, `:boolean`, `:array` (with `items:`), `:object` (with `properties:`), `:enum` (with `values: [...]`).

### Generated canonical schema

```ruby
WebSearch.json_schema
# => {
#   name: "web_search",
#   description: "Search the web via DuckDuckGo and return top results",
#   parameters: {
#     type: "object",
#     properties: {
#       query: { type: "string", description: "Natural-language search query" },
#       max_results: { type: "integer", description: "Number of results to return (1-25)", default: 10 }
#     },
#     required: ["query"]
#   }
# }
```

### Validation

`Tools::Base#execute_with_validation(args_hash)` validates `args_hash` against the schema (types coerced where unambiguous, e.g. `"10"` → `10` for an `:integer` param) before delegating to `execute(**)`. Bad types raise `ToolError` with a clean message that flows back to the LLM as a tool-result, allowing the agent to recover on the next iteration.

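
A minimal sketch of the unambiguous-coercion rule for a single `:integer` param (illustrative only; the gem's real `execute_with_validation` handles all declared types):

```ruby
# Illustrative coercion helper: digit strings coerce, anything else is a ToolError.
class ToolError < StandardError; end

def coerce_integer(name, value)
  case value
  when Integer then value
  when String
    # Unambiguous case: a string of digits becomes an Integer.
    raise ToolError, "param #{name}: expected integer, got #{value.inspect}" unless value =~ /\A-?\d+\z/
    Integer(value)
  else
    raise ToolError, "param #{name}: expected integer, got #{value.class}"
  end
end
```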
### Fallback for undeclared tools

Tools without DSL declarations receive a permissive schema (`{ type: "object", additionalProperties: true }`) and a deprecation warning printed once per process. They still work in native-tools mode (the model gets less guidance) and still work in ReAct mode (unchanged).

### Built-in tool migration

All 7 built-in tools (`web_search`, `file_reader`, `file_writer`, `sql_database`, `email_sender`, `code_executor`, `pdf_processor`) get DSL declarations as part of this work. Each is ~5 added lines.

---

## 3. Native tool-call loop (`ToolRunner`)

The current ~300 lines of prompt-template + regex-parsing code in `Agent#execute_task` is split:

- The new **`ToolRunner`** drives a native function-calling conversation.
- The extracted **`LegacyReactRunner`** preserves the current `USE_TOOL[]` semantics for fallback.

`Agent#execute_task` becomes a thin orchestrator: build initial messages → choose runner → return runner result.

### `ToolRunner` interface

```ruby
class ToolRunner
  def initialize(agent:, llm:, tools:, max_iterations: 10, event_sink: nil); end

  def run(messages:)
    # Returns:
    #   {
    #     content: String,
    #     tool_calls_history: [{ tool:, args:, result:, duration_ms: }, ...],
    #     usage: { prompt_tokens:, completion_tokens:, total_tokens: },
    #     iterations: Integer,
    #     finish_reason: Symbol
    #   }
  end
end
```

### Loop algorithm (per iteration)

1. Call `llm.chat(messages:, tools: tools.map(&:json_schema), stream: event_sink)`.
2. Emit `Events::IterationStart` / `Events::TextDelta` / `Events::TextDone` / `Events::Usage` as deltas arrive.
3. If the response contains `tool_calls`:
   - For each call: emit `Events::ToolCallStart`, run `tool.execute_with_validation(args)`, emit `Events::ToolCallResult` (or `Events::ToolCallError`).
   - Honor `agent.require_approval_for_tools` (existing human-in-loop hook fires here).
   - Record into `agent.memory` via existing `memory.add_tool_usage`.
   - Append the assistant message (with `tool_calls`) and each tool-result message to `messages`.
   - Emit `Events::IterationEnd(finish_reason: :tool_calls)`; continue.
4. If no `tool_calls` (i.e. `finish_reason: :stop`): emit `Events::IterationEnd(finish_reason: :stop)`; return final content.
5. If `iterations >= max_iterations`: emit `Events::IterationEnd(finish_reason: :max_iterations)`; return best-effort content with a logged warning.
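
The loop above, reduced to a runnable sketch with a stubbed LLM and hash-based tools (event emission, approval hooks, and memory recording elided; message and response shapes follow the `chat` contract earlier in this document):

```ruby
# Illustrative reduction of the ToolRunner loop; not the gem's actual code.
def run_tool_loop(llm, tools, messages, max_iterations: 10)
  max_iterations.times do |i|
    response = llm.chat(messages: messages, tools: tools.values.map { |t| t[:schema] })
    # No tool calls means the model produced a final answer.
    return { content: response[:content], iterations: i + 1, finish_reason: :stop } if response[:tool_calls].empty?

    # Append the assistant turn, then one tool-result message per call.
    messages << { role: "assistant", tool_calls: response[:tool_calls] }
    response[:tool_calls].each do |call|
      result = tools.fetch(call[:name])[:fn].call(**call[:arguments])
      messages << { role: "tool", tool_call_id: call[:id], content: result.to_s }
    end
  end
  { content: messages.last[:content], iterations: max_iterations, finish_reason: :max_iterations }
end
```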

### Runner selection

```ruby
runner =
  if llm.supports_native_tools?(model: config.model) && tools.all? { |t| t.respond_to?(:json_schema) }
    ToolRunner.new(...)
  else
    LegacyReactRunner.new(...)
  end
```

A single INFO-level log line records the choice. Both runners emit the same event types, so streaming consumers don't care which is in use.

### Agent invariants preserved

- `Agent#use_tool(name, **args)` (direct invocation API) is unchanged.
- Both runners route tool execution through `Agent#use_tool` so the human-approval hook and memory recording are shared.
- `Agent#execute_task` return value gains a `tool_calls_history:` key. All existing keys are unchanged.

---

## 4. Streaming event model

A single typed event stream consumed by UIs, observability, cost tracking, and the tool runner itself.

### Event types — `RCrewAI::Events`

```ruby
module RCrewAI::Events
  Event = Struct.new(:type, :timestamp, :agent, :iteration, keyword_init: true)

  TextDelta      = Class.new(Event) # adds :text (partial)
  TextDone       = Class.new(Event) # adds :text (full)
  ToolCallStart  = Class.new(Event) # adds :tool, :args, :call_id
  ToolCallResult = Class.new(Event) # adds :tool, :call_id, :result, :duration_ms
  ToolCallError  = Class.new(Event) # adds :tool, :call_id, :error
  Thinking       = Class.new(Event) # adds :text (Anthropic extended thinking, etc.)
  Usage          = Class.new(Event) # adds :prompt_tokens, :completion_tokens, :total_tokens, :cost_usd
  IterationStart = Class.new(Event) # adds :iteration_index
  IterationEnd   = Class.new(Event) # adds :finish_reason
  Error          = Class.new(Event) # adds :error
end
```

Every event carries `agent` (name) and `iteration` so consumers can correlate without external bookkeeping.

### Public API — three idioms

```ruby
# 1. Block form
crew.execute(stream: true) do |event|
  case event
  when RCrewAI::Events::TextDelta     then print event.text
  when RCrewAI::Events::ToolCallStart then puts "→ #{event.tool}(#{event.args})"
  when RCrewAI::Events::Usage         then meter.record(event)
  end
end

# 2. Multiple listeners (good for UI + logging + metering simultaneously)
crew.execute(stream: [logger_sink, cost_sink, ui_sink]) # each responds to #call(event)

# 3. Convenience text-only
agent.execute_task(task).each_text_delta { |chunk| print chunk }
```

### Plumbing path

```
LLMClient (SSE parse) → ToolRunner (orchestrates) → Agent (tags :agent) → Crew (fans out)
```

Each layer is a Proc that calls the next. No coupling between producers and consumers.

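
A minimal sketch of that Proc chain, using plain hashes in place of the typed event structs for brevity (the layer names are illustrative):

```ruby
# Illustrative layers: the Agent layer tags events before forwarding,
# and the Crew layer fans out to multiple sinks.
def tagging_layer(agent_name, downstream)
  lambda do |event|
    event[:agent] ||= agent_name # tag, then forward unchanged
    downstream.call(event)
  end
end

def fan_out(sinks)
  ->(event) { sinks.each { |sink| sink.call(event) } }
end
```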
### Per-provider SSE parsing

- **OpenAI:** `chat.completions` stream with `delta.content` + `delta.tool_calls[].function.arguments` (incremental JSON; assembled before emission).
- **Anthropic:** `message_delta` / `content_block_delta` / `input_json_delta` events.
- **Google:** `streamGenerateContent` with `candidates[].content.parts[].functionCall`.
- **Azure:** identical to OpenAI.
- **Ollama:** line-delimited JSON; `message.tool_calls` arrives as a single chunk (less granular but still streamed).

Each adapter normalizes to the canonical `Events::*` shape.
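
The shared SSE framing the diff's new `sse_parser.rb` implies can be sketched like this (an illustration, not that file): SSE frames are `data:` lines separated by blank lines, with `data: [DONE]` terminating OpenAI-style streams.

```ruby
# Illustrative SSE framing: yield each data payload, skipping the [DONE] sentinel.
def each_sse_data(raw)
  raw.split("\n\n").each do |frame|
    frame.each_line do |line|
      next unless line.start_with?("data:")
      payload = line.sub("data:", "").strip
      next if payload.empty? || payload == "[DONE]"
      yield payload
    end
  end
end
```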

### Cost tracking

`Events::Usage` includes `cost_usd` computed from `lib/rcrewai/pricing.rb` (per-model price table for OpenAI/Anthropic/Google list prices, shipped at release time). Users can override:

```ruby
RCrewAI.configure do |c|
  c.pricing = { "gpt-4o" => { input: 2.50, output: 10.00 } } # USD per 1M tokens
end
```

If a model isn't in the table, `cost_usd` is `nil` — not an error.

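
The arithmetic is straightforward: prices are USD per 1M tokens, so cost = (prompt_tokens / 1M) × input price + (completion_tokens / 1M) × output price. A sketch (table contents are examples, not the gem's shipped figures):

```ruby
# Illustrative cost calculator matching the nil-for-unknown-model rule above.
PRICES = { "gpt-4o" => { input: 2.50, output: 10.00 } }.freeze # USD per 1M tokens

def cost_usd(model, prompt_tokens, completion_tokens)
  price = PRICES[model]
  return nil unless price # unknown model: nil, not an error
  (prompt_tokens / 1_000_000.0) * price[:input] +
    (completion_tokens / 1_000_000.0) * price[:output]
end
```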
### Non-streaming callers

If `stream:` is not provided, the event stream is collected internally and discarded. Public return value is unchanged.

---

## 5. MCP client integration

MCP is JSON-RPC 2.0. We add a client that connects to MCP servers (stdio subprocess or HTTP) and surfaces remote tools as `RCrewAI::Tools::Base` instances — so they run through the same `ToolRunner`, schema validation, human-approval hook, memory recording, and event emission as native tools. No special-casing.

### Module — `RCrewAI::MCP`

```ruby
module RCrewAI::MCP
  class Client           # JSON-RPC 2.0 over a Transport
  class Transport::Stdio # spawn subprocess, talk over stdin/stdout
  class Transport::Http  # Streamable HTTP per MCP spec (POST + SSE)
  class ToolAdapter      # wraps one MCP tool as Tools::Base
  class Error
end
```

### User-facing API

```ruby
# Stdio (subprocess) — most common deployment
github = RCrewAI::MCP::Client.connect(
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-github"],
  env: { "GITHUB_TOKEN" => ENV.fetch("GITHUB_TOKEN") }
)

# HTTP (remote MCP server)
linear = RCrewAI::MCP::Client.connect(
  url: "https://mcp.linear.app/sse",
  headers: { "Authorization" => "Bearer #{ENV['LINEAR_TOKEN']}" }
)

agent = RCrewAI::Agent.new(
  name: "engineer",
  role: "Backend Engineer",
  tools: github.tools + linear.tools + [RCrewAI::Tools::FileWriter.new]
)
```

### What `client.tools` returns

On connect, the client performs the `initialize` handshake and then `tools/list`. Each remote tool becomes a `ToolAdapter` whose:

- `tool_name` ← MCP tool name, prefixed with the server name (e.g. `github__create_issue`) to avoid collisions.
- `description` ← MCP tool description.
- `json_schema` ← MCP tool's `inputSchema` (already JSON Schema — no translation).
- `execute(**args)` → sends `tools/call`, returns the result content (text / image / resource).

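
An illustrative `ToolAdapter` shape implementing that mapping (the real `mcp/tool_adapter.rb` may differ):

```ruby
# Illustrative adapter: MCP tool metadata in, Tools::Base-like surface out.
class ToolAdapter
  attr_reader :tool_name, :description, :json_schema

  def initialize(client:, server_name:, mcp_tool:)
    @client      = client
    @remote_name = mcp_tool[:name]
    @tool_name   = "#{server_name}__#{@remote_name}" # prefix avoids collisions
    @description = mcp_tool[:description]
    @json_schema = mcp_tool[:inputSchema] # already JSON Schema; no translation
  end

  # Sends tools/call with the unprefixed remote name.
  def execute(**args)
    @client.call("tools/call", name: @remote_name, arguments: args)
  end
end
```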
### Lifecycle

```ruby
client = RCrewAI::MCP::Client.connect(...) # opens transport, handshakes, lists tools
# ...
client.close # graceful shutdown

# Or block form (auto-closes):
RCrewAI::MCP::Client.with_connection(command: ...) do |client|
  crew.add_agent(RCrewAI::Agent.new(..., tools: client.tools))
  crew.execute
end
```

A finalizer plus an `at_exit` hook ensures subprocesses are killed even on hard exit.

### Dependencies

No new gem dependency. We hand-roll a minimal JSON-RPC 2.0 client (~100 LOC) since we only need request/response + notifications. SSE is handled by the same parser used by the LLM clients.
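
The JSON-RPC 2.0 wire shapes involved are small enough to sketch: requests carry `jsonrpc`/`id`/`method`/`params`, notifications omit the `id`, and responses carry either `result` or `error`. (A sketch of the framing only, not the transport or the gem's actual client.)

```ruby
require "json"

# Build a JSON-RPC 2.0 request line.
def jsonrpc_request(id, method, params)
  JSON.generate({ jsonrpc: "2.0", id: id, method: method, params: params })
end

# Notifications are requests without an id (no response expected).
def jsonrpc_notification(method, params = {})
  JSON.generate({ jsonrpc: "2.0", method: method, params: params })
end

# Parse one response line into [:result, id, result] or [:error, error].
def jsonrpc_parse(line)
  msg = JSON.parse(line)
  raise "bad jsonrpc version" unless msg["jsonrpc"] == "2.0"
  msg.key?("error") ? [:error, msg["error"]] : [:result, msg["id"], msg["result"]]
end
```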

---

## 6. Backward compatibility & migration

### Unchanged public surface (zero migration)

- `RCrewAI::Agent.new(name:, role:, goal:, tools: [...])`
- `RCrewAI::Crew.new(...).execute` (return shape unchanged when `stream:` is not passed)
- `RCrewAI::Task.new(...)`
- All `human_input`, `require_approval_for_*`, `manager`, `allow_delegation` flags
- Custom tools that subclass `Tools::Base` and implement `execute(**params)`

### Behavior changes a user might observe

1. **Tool calls are more reliable.** Agents using OpenAI / Anthropic / Google / Azure (or modern Ollama models) now use native function calling. Fewer "agent forgot to call the tool" and "argument got mangled" failures. *This is the headline win.*
2. **One new INFO log line per agent run** identifies the mode and provider. Suppressible via `config.log_level = :warn`.
3. **Tools without DSL declarations** get a permissive fallback schema and a one-time deprecation warning.

### Hard breaking changes (called out in CHANGELOG)

- `LLMClients::Base#chat` adds `tools:` and `stream:` keyword args. Subclasses outside the gem that override `chat` with an explicit kwarg list need to add these (or accept `**options`). Realistic blast radius: near-zero.
- `Agent#execute_task` return value gains a `tool_calls_history:` key. Pure addition; existing keys unchanged.

### Versioning

**0.3.0** — minor bump under 1.0 conventions. CHANGELOG explicitly lists the two hard breaks above.

### Migration guide (`docs/upgrading-to-0.3.md`)

1. *(Optional, recommended)* Add `tool_name`, `description`, `param` declarations to custom tools. Before/after example provided.
2. *(Optional)* Adopt streaming via `crew.execute(stream: ...) { |event| ... }`.
3. *(Optional)* Wire in MCP servers via `RCrewAI::MCP::Client.connect(...)`.
4. *(Only if affected)* If you subclassed `LLMClients::Base#chat` with explicit kwargs, add `tools: nil, stream: nil` to the signature.

### Companion gem (`rcrewai-rails`)

Needs a coordinated release. The Rails engine's `Crew#execute` ActiveJob wrapper will gain a `stream:` parameter that pipes events into ActionCable for live UI. Tracked as a follow-up issue, not blocking 0.3.0 of the core gem.

---

## 7. Testing strategy

### New specs

```
spec/
├── tool_schema_spec.rb             # DSL → JSON schema (all types, edge cases, validation)
├── tool_runner_spec.rb             # Multi-iteration loop, max_iterations, error recovery,
│                                   #   human-approval hook, memory recording, event emission
├── legacy_react_runner_spec.rb     # Pins existing USE_TOOL[] behavior
├── events_spec.rb                  # Event tagging, fan-out to multiple sinks
├── mcp/
│   ├── client_spec.rb              # JSON-RPC framing, handshake, tools/list, tools/call
│   ├── transport/stdio_spec.rb     # Subprocess lifecycle, stderr passthrough, kill on close
│   ├── transport/http_spec.rb      # POST + SSE, header auth, reconnection
│   └── tool_adapter_spec.rb        # MCP tool → Tools::Base behavior parity
├── llm_clients/
│   ├── openai_spec.rb              # Tools payload shape, SSE delta parsing, tool_call assembly
│   ├── anthropic_spec.rb           # Tools shape, content_block_delta parsing, caching hook
│   ├── google_spec.rb              # functionDeclarations shape, streamGenerateContent
│   ├── azure_spec.rb               # Inherits OpenAI; one auth-shape smoke test
│   └── ollama_spec.rb              # Native-tools allowlist, ReAct fallback
├── pricing_spec.rb                 # Cost calculation; missing model returns nil
└── integration/
    ├── native_tool_calling_spec.rb  # End-to-end agent + real tool + recorded LLM responses
    ├── streaming_spec.rb            # End-to-end event stream with multiple consumers
    ├── mcp_end_to_end_spec.rb       # Real subprocess (fixture MCP server in spec/fixtures/)
    └── legacy_react_fallback_spec.rb # Old USE_TOOL[] path still works on tools w/o DSL
```

### Recording strategy

- **Unit tests** for LLM clients use **webmock** with fixture JSON/SSE bodies in `spec/fixtures/llm_responses/` — deterministic, fast, no network.
- **Integration tests** use **VCR cassettes** (already a dev dep) recorded once against real providers and committed. Re-record via `RECORD=true rspec`.
- **MCP integration** ships a tiny stdio MCP server (~50 LOC Ruby script) in `spec/fixtures/mcp_servers/` so CI doesn't need `npx`.

### Regression coverage

- `legacy_react_runner_spec.rb` pins current `USE_TOOL[]` semantics.
- Existing `agent_spec.rb` and `crew_spec.rb` get added cases for the new `stream:` kwarg (no-op when not provided) and the new `tool_calls_history:` return key.

### CI

- `.github/workflows/ci.yml` runs Ruby 3.0, 3.1, 3.2, 3.3 on Linux.
- MCP integration test gated by Ruby ≥ 3.1.

### Coverage target

`simplecov` (already a dev dep) configured to fail CI at <85% line coverage for `lib/rcrewai/{tool_schema,tool_runner,events,mcp}/**`. No coverage bar on legacy files in this PR.

### Out of scope for v1 tests

- Live MCP servers from the wild (Slack, GitHub) — too flaky for CI; the fixture server suffices.
- Real cost figures against provider invoices — only the calculator formula.

---

## Appendix A — Order of work (rough)

1. `ToolSchema` DSL + JSON schema emitter + per-provider schema adapters.
2. Migrate the 7 built-in tools to declare schemas.
3. `Events` module + `LLMClients::Base` new `chat` contract.
4. OpenAI native tools + streaming (reference implementation).
5. `ToolRunner` + extract `LegacyReactRunner` from `Agent#execute_task`.
6. Anthropic, Google, Azure native tools + streaming.
7. Ollama native tools (allowlist) + streaming + ReAct fallback.
8. `Pricing` + cost on `Events::Usage`.
9. MCP transports (stdio, HTTP) + `Client` + `ToolAdapter`.
10. CHANGELOG, `docs/upgrading-to-0.3.md`, `docs/mcp.md`, examples.

The implementation plan (writing-plans skill) will refine this into discrete, testable tasks with explicit dependencies.
**`data/docs/upgrading-to-0.3.md`** (new file, +163 lines)

# Upgrading to RCrewAI 0.3

RCrewAI 0.3 modernizes the LLM layer with native function calling, typed
streaming, and an MCP client. Existing code continues to run, but you can
opt into the new capabilities incrementally.

This guide is split into three parts:

1. **What you must do** — required changes to keep your code working.
2. **What you should do** — recommended changes to take advantage of the
   new behavior.
3. **What you can do** — new capabilities you can adopt at your own pace.

---

## 1. What you must do

### 1a. Custom `LLMClients::Base` subclasses

If you have a custom LLM client that overrides `chat` with a fixed kwarg
list, add `tools:` and `stream:` (or accept `**options`).

**Before:**

```ruby
class MyClient < RCrewAI::LLMClients::Base
  def chat(messages:, temperature: 0.1)
    # ...
  end
end
```

**After:**

```ruby
class MyClient < RCrewAI::LLMClients::Base
  def chat(messages:, tools: nil, tool_choice: :auto, stream: nil, **options)
    # ...
  end

  def supports_native_tools?(model: config.model)
    false # or true if your provider supports OpenAI-style function calls
  end
end
```

### 1b. Tools without DSL declarations now print a deprecation warning

Tools that don't declare a schema still work (with a permissive fallback
schema) but emit a one-time `[rcrewai] Tool ... has no DSL declarations`
warning on stderr. To clear the warning, declare a schema:

**Before:**

```ruby
class WeatherTool < RCrewAI::Tools::Base
  def initialize
    @name = "weather"
    @description = "Get the weather for a city"
  end

  def execute(city:)
    # ...
  end
end
```

**After:**

```ruby
class WeatherTool < RCrewAI::Tools::Base
  tool_name "weather"
  description "Get the weather for a city"
  param :city, type: :string, required: true, description: "City name"

  def execute(city:)
    # ...
  end
end
```

---

## 2. What you should do

### 2a. Consume `Agent#execute_task` return as a hash

In 0.3, `execute_task` returns a hash with `:content`, `:tool_calls_history`,
`:usage`, `:iterations`, and `:finish_reason`. The agent still assigns
the plain string content to `task.result`, so old code that reads
`task.result` continues to work.

**Before:**

```ruby
result_string = agent.execute_task(task)
puts result_string
```

**After:**

```ruby
result = agent.execute_task(task)
puts result[:content]
puts "tokens: #{result.dig(:usage, :total_tokens)}"
puts "tool calls: #{result[:tool_calls_history].length}"
```

---

## 3. What you can do (new capabilities)

### 3a. Native function calling

When a tool has a DSL-declared schema and your LLM supports native function
calls, the agent automatically uses the new `ToolRunner` path — no more
`USE_TOOL[name](k=v)` prompt parsing. There's nothing to opt into beyond
declaring the schema (see 1b).

### 3b. Streaming events

Pass a `stream:` lambda to `agent.execute_task` or `crew.execute`. You'll
receive typed events as the LLM produces output:

```ruby
events = []
agent.execute_task(task, stream: ->(e) { events << e })

text = events
  .select { |e| e.is_a?(RCrewAI::Events::TextDelta) }
  .map(&:text).join
```

Event types: `TextDelta`, `TextDone`, `ToolCallStart`, `ToolCallResult`,
`ToolCallError`, `Thinking`, `Usage` (with `cost_usd`), `IterationStart`,
`IterationEnd`, `Error`.

### 3c. Cost tracking

When `Pricing` knows your model, `Events::Usage` carries `cost_usd`:

```ruby
total_cost = 0.0
crew.execute(stream: ->(e) {
  total_cost += e.cost_usd if e.is_a?(RCrewAI::Events::Usage) && e.cost_usd
})
```

Override the table via `RCrewAI.configuration.pricing = { "my-model" => { input: 1.0, output: 5.0 } }`.

### 3d. MCP servers as agent tools

Connect to an MCP server (stdio or HTTP) and pass its tools to your agent:

```ruby
RCrewAI::MCP::Client.with_connection(
  command: "npx",
  args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
) do |client|
  agent = RCrewAI::Agent.new(name: "fs", role: "...", goal: "...", tools: client.tools)
  task = RCrewAI::Task.new(name: "ls", description: "list /tmp", agent: agent)
  result = agent.execute_task(task)
  puts result[:content]
end
```

See `docs/mcp.md` for the full guide.