agent_c 2.9
- checksums.yaml +7 -0
- data/.rubocop.yml +10 -0
- data/.ruby-version +1 -0
- data/CLAUDE.md +21 -0
- data/README.md +360 -0
- data/Rakefile +16 -0
- data/TODO.md +105 -0
- data/agent_c.gemspec +38 -0
- data/docs/batch.md +503 -0
- data/docs/chat-methods.md +156 -0
- data/docs/cost-reporting.md +86 -0
- data/docs/pipeline-tips-and-tricks.md +453 -0
- data/docs/session-configuration.md +274 -0
- data/docs/testing.md +747 -0
- data/docs/tools.md +103 -0
- data/docs/versioned-store.md +840 -0
- data/lib/agent_c/agent/chat.rb +211 -0
- data/lib/agent_c/agent/chat_response.rb +38 -0
- data/lib/agent_c/agent/chats/anthropic_bedrock.rb +48 -0
- data/lib/agent_c/batch.rb +102 -0
- data/lib/agent_c/configs/repo.rb +90 -0
- data/lib/agent_c/context.rb +56 -0
- data/lib/agent_c/costs/data.rb +39 -0
- data/lib/agent_c/costs/report.rb +219 -0
- data/lib/agent_c/db/store.rb +162 -0
- data/lib/agent_c/errors.rb +19 -0
- data/lib/agent_c/pipeline.rb +152 -0
- data/lib/agent_c/pipelines/agent.rb +219 -0
- data/lib/agent_c/processor.rb +98 -0
- data/lib/agent_c/prompts.yml +53 -0
- data/lib/agent_c/schema.rb +71 -0
- data/lib/agent_c/session.rb +206 -0
- data/lib/agent_c/store.rb +72 -0
- data/lib/agent_c/test_helpers.rb +173 -0
- data/lib/agent_c/tools/dir_glob.rb +46 -0
- data/lib/agent_c/tools/edit_file.rb +114 -0
- data/lib/agent_c/tools/file_metadata.rb +43 -0
- data/lib/agent_c/tools/git_status.rb +30 -0
- data/lib/agent_c/tools/grep.rb +119 -0
- data/lib/agent_c/tools/paths.rb +36 -0
- data/lib/agent_c/tools/read_file.rb +94 -0
- data/lib/agent_c/tools/run_rails_test.rb +87 -0
- data/lib/agent_c/tools.rb +61 -0
- data/lib/agent_c/utils/git.rb +87 -0
- data/lib/agent_c/utils/shell.rb +58 -0
- data/lib/agent_c/version.rb +5 -0
- data/lib/agent_c.rb +32 -0
- data/lib/versioned_store/base.rb +314 -0
- data/lib/versioned_store/config.rb +26 -0
- data/lib/versioned_store/stores/schema.rb +127 -0
- data/lib/versioned_store/version.rb +5 -0
- data/lib/versioned_store.rb +5 -0
- data/template/Gemfile +9 -0
- data/template/Gemfile.lock +152 -0
- data/template/README.md +61 -0
- data/template/Rakefile +50 -0
- data/template/bin/rake +27 -0
- data/template/lib/autoload.rb +10 -0
- data/template/lib/config.rb +59 -0
- data/template/lib/pipeline.rb +19 -0
- data/template/lib/prompts.yml +57 -0
- data/template/lib/store.rb +17 -0
- data/template/test/pipeline_test.rb +221 -0
- data/template/test/test_helper.rb +18 -0
- metadata +194 -0
data/docs/testing.md
ADDED
|
@@ -0,0 +1,747 @@
|
|
|
1
|
+
# Testing
|
|
2
|
+
|
|
3
|
+
AgentC provides testing utilities for writing tests without external dependencies:
|
|
4
|
+
- `TestHelpers::DummyChat` - Mock LLM responses without real API calls
|
|
5
|
+
- `TestHelpers::DummyGit` - Mock git operations without actual git commands
|
|
6
|
+
- `test_session` helper - Create test sessions with minimal configuration
|
|
7
|
+
|
|
8
|
+
The key benefit is that you use the actual `Session#prompt`, `Session#chat`, and `Pipeline` implementations, so your tests exercise real code paths without external dependencies.
|
|
9
|
+
|
|
10
|
+
|
|
11
|
+
## Testing Pipelines
|
|
12
|
+
|
|
13
|
+
Pipelines are the primary way to orchestrate multi-step agent workflows with persistent state. Testing pipelines involves setting up a store with your domain records, creating tasks, and using `DummyChat` to simulate LLM responses for `agent_step` calls.
|
|
14
|
+
|
|
15
|
+
### Basic Pipeline Test Setup
|
|
16
|
+
|
|
17
|
+
```ruby
|
|
18
|
+
require "test_helper"
|
|
19
|
+
|
|
20
|
+
class MyPipelineTest < Minitest::Test
|
|
21
|
+
include AgentC::TestHelpers
|
|
22
|
+
|
|
23
|
+
def setup
|
|
24
|
+
# Create a store with your domain schema
|
|
25
|
+
@store_class = Class.new(VersionedStore::Base) do
|
|
26
|
+
include AgentC::Store
|
|
27
|
+
|
|
28
|
+
record(:document) do
|
|
29
|
+
schema do |t|
|
|
30
|
+
t.string(:title)
|
|
31
|
+
t.string(:summary)
|
|
32
|
+
t.string(:category)
|
|
33
|
+
end
|
|
34
|
+
end
|
|
35
|
+
end
|
|
36
|
+
|
|
37
|
+
@store = @store_class.new(dir: Dir.mktmpdir)
|
|
38
|
+
@workspace = @store.workspace.create!(
|
|
39
|
+
dir: Dir.mktmpdir,
|
|
40
|
+
env: {}
|
|
41
|
+
)
|
|
42
|
+
end
|
|
43
|
+
|
|
44
|
+
def test_simple_pipeline
|
|
45
|
+
# Define your pipeline
|
|
46
|
+
pipeline_class = Class.new(Pipeline) do
|
|
47
|
+
step(:set_title) do
|
|
48
|
+
record.update!(title: "Document Title")
|
|
49
|
+
end
|
|
50
|
+
|
|
51
|
+
step(:set_category) do
|
|
52
|
+
record.update!(category: "Important")
|
|
53
|
+
end
|
|
54
|
+
end
|
|
55
|
+
|
|
56
|
+
# Create record and task
|
|
57
|
+
document = @store.document.create!
|
|
58
|
+
task = @store.task.create!(record: document, workspace: @workspace)
|
|
59
|
+
session = test_session
|
|
60
|
+
|
|
61
|
+
# Run the pipeline
|
|
62
|
+
pipeline_class.call(task: task, session: session)
|
|
63
|
+
|
|
64
|
+
# Verify results
|
|
65
|
+
assert task.reload.done?
|
|
66
|
+
assert_equal "Document Title", document.reload.title
|
|
67
|
+
assert_equal "Important", document.category
|
|
68
|
+
end
|
|
69
|
+
end
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
### Testing Agent Steps with Inline Definitions
|
|
73
|
+
|
|
74
|
+
The simplest way to test `agent_step` calls is to define them inline with the `prompt:` and `schema:` parameters. This avoids setting up I18n translations for tests.
|
|
75
|
+
|
|
76
|
+
**CRITICAL**: When using inline prompts with interpolation placeholders like `%{field_name}`, DummyChat receives the **literal prompt string** with placeholders intact, NOT the interpolated version. Match the exact string including `%{placeholders}` in your DummyChat responses.
|
|
77
|
+
|
|
78
|
+
```ruby
|
|
79
|
+
def test_agent_step_inline
|
|
80
|
+
# Define pipeline with inline agent_step definitions
|
|
81
|
+
pipeline_class = Class.new(AgentC::Pipeline) do
|
|
82
|
+
agent_step(
|
|
83
|
+
:summarize,
|
|
84
|
+
prompt: "Summarize the document titled %{title}",
|
|
85
|
+
schema: -> { string(:summary) }
|
|
86
|
+
)
|
|
87
|
+
|
|
88
|
+
agent_step(
|
|
89
|
+
:categorize,
|
|
90
|
+
prompt: "Categorize this document: %{summary}",
|
|
91
|
+
schema: -> { string(:category) }
|
|
92
|
+
)
|
|
93
|
+
end
|
|
94
|
+
|
|
95
|
+
document = @store.document.create!(title: "My Report")
|
|
96
|
+
task = @store.task.create!(record: document, workspace: @workspace)
|
|
97
|
+
|
|
98
|
+
# Match the LITERAL prompt strings with %{placeholders}, not interpolated values
|
|
99
|
+
dummy_chat = DummyChat.new(responses: {
|
|
100
|
+
"Summarize the document titled %{title}" =>
|
|
101
|
+
'{"summary": "A comprehensive report"}',
|
|
102
|
+
"Categorize this document: %{summary}" =>
|
|
103
|
+
'{"category": "Research"}'
|
|
104
|
+
})
|
|
105
|
+
|
|
106
|
+
session = test_session(
|
|
107
|
+
workspace_dir: @workspace.dir,
|
|
108
|
+
chat_provider: ->(**params) { dummy_chat }
|
|
109
|
+
)
|
|
110
|
+
|
|
111
|
+
pipeline_class.call(task:, session:)
|
|
112
|
+
|
|
113
|
+
document.reload
|
|
114
|
+
assert_equal "A comprehensive report", document.summary
|
|
115
|
+
assert_equal "Research", document.category
|
|
116
|
+
assert task.reload.done?
|
|
117
|
+
end
|
|
118
|
+
```
|
|
119
|
+
|
|
120
|
+
### Testing Agent Steps with I18n Prompts
|
|
121
|
+
|
|
122
|
+
**CRITICAL DIFFERENCE**: When using I18n-based agent steps (like `agent_step(:my_step)` without an inline prompt), the prompts ARE interpolated BEFORE being sent to DummyChat. Your response keys must match the interpolated text, not the literal `%{placeholders}`.
|
|
123
|
+
|
|
124
|
+
```ruby
|
|
125
|
+
def test_i18n_agent_step
|
|
126
|
+
# In prompts.yml:
|
|
127
|
+
# my_step:
|
|
128
|
+
# prompt: "Process file %{file_name}"
|
|
129
|
+
|
|
130
|
+
record = @store.document.create!(file_name: "report.pdf") # assumes the record schema defines file_name
|
|
131
|
+
task = @store.task.create!(record:, workspace: @workspace)
|
|
132
|
+
|
|
133
|
+
# I18n interpolates BEFORE sending to DummyChat
|
|
134
|
+
# DummyChat receives: "Process file report.pdf" (interpolated!)
|
|
135
|
+
dummy_chat = DummyChat.new(responses: {
|
|
136
|
+
"Process file report.pdf" => '{}', # ✓ Correct
|
|
137
|
+
"Process file %{file_name}" => '{}' # ✗ Wrong - won't match
|
|
138
|
+
})
|
|
139
|
+
|
|
140
|
+
session = test_session(
|
|
141
|
+
workspace_dir: @workspace.dir,
|
|
142
|
+
chat_provider: ->(**params) { dummy_chat }
|
|
143
|
+
)
|
|
144
|
+
|
|
145
|
+
MyPipeline.call(task:, session:)
|
|
146
|
+
end
|
|
147
|
+
```
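To see why the two cases need different keys, here is a minimal sketch in plain Ruby. It assumes nothing about AgentC internals: `template`, `values`, and `responses` are hypothetical names, and `Kernel#format` stands in for whatever interpolation the I18n path performs.

```ruby
# Illustration only: plain Ruby, no AgentC classes involved.
template = "Process file %{file_name}"
values   = { file_name: "report.pdf" }

# Kernel#format fills %{name} placeholders from a hash.
interpolated = format(template, values)

# Hypothetical response map keyed by the interpolated text.
responses = { "Process file report.pdf" => "{}" }

responses.key?(interpolated) # the interpolated prompt finds a response
responses.key?(template)     # the literal template does not
```

A string key is an exact match, so whichever form of the prompt reaches the chat object is the form the key has to take.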
|
|
148
|
+
|
|
149
|
+
When testing pipelines that use I18n-based `agent_step`, you need to:
|
|
150
|
+
1. Set up I18n translations with your prompts and schemas
|
|
151
|
+
2. Configure DummyChat responses that match the prompt text
|
|
152
|
+
3. Verify the agent step updates the record correctly
|
|
153
|
+
|
|
154
|
+
```ruby
|
|
155
|
+
def test_agent_step_with_i18n
|
|
156
|
+
# Define pipeline with agent_step
|
|
157
|
+
pipeline_class = Class.new(Pipeline) do
|
|
158
|
+
agent_step(:summarize_document)
|
|
159
|
+
end
|
|
160
|
+
|
|
161
|
+
# Create record with initial data
|
|
162
|
+
document = @store.document.create!(
|
|
163
|
+
title: "My Document",
|
|
164
|
+
category: "Technical"
|
|
165
|
+
)
|
|
166
|
+
task = @store.task.create!(record: document, workspace: @workspace)
|
|
167
|
+
|
|
168
|
+
# Set up I18n translations for the agent step
|
|
169
|
+
I18n.backend.store_translations(:en, {
|
|
170
|
+
summarize_document: {
|
|
171
|
+
tools: ["read_file"],
|
|
172
|
+
cached_prompts: [
|
|
173
|
+
"You are a document summarization assistant."
|
|
174
|
+
],
|
|
175
|
+
prompt: "Summarize the document titled '%{title}' in category '%{category}'",
|
|
176
|
+
response_schema: {
|
|
177
|
+
summary: {
|
|
178
|
+
type: "string",
|
|
179
|
+
description: "The summary of the document"
|
|
180
|
+
}
|
|
181
|
+
}
|
|
182
|
+
}
|
|
183
|
+
})
|
|
184
|
+
|
|
185
|
+
# Configure DummyChat with matching response
|
|
186
|
+
dummy_chat = DummyChat.new(responses: {
|
|
187
|
+
"Summarize the document titled 'My Document' in category 'Technical'" =>
|
|
188
|
+
'{"summary": "This is a technical document about programming."}'
|
|
189
|
+
})
|
|
190
|
+
|
|
191
|
+
# Create session with DummyChat
|
|
192
|
+
session = test_session(
|
|
193
|
+
workspace_dir: @workspace.dir,
|
|
194
|
+
chat_provider: ->(**params) { dummy_chat }
|
|
195
|
+
)
|
|
196
|
+
|
|
197
|
+
# Run the pipeline
|
|
198
|
+
pipeline_class.call(task: task, session: session)
|
|
199
|
+
|
|
200
|
+
# Verify results
|
|
201
|
+
assert task.reload.done?
|
|
202
|
+
assert_equal "This is a technical document about programming.",
|
|
203
|
+
document.reload.summary
|
|
204
|
+
assert_equal ["summarize_document"], task.completed_steps
|
|
205
|
+
end
|
|
206
|
+
```
|
|
207
|
+
|
|
208
|
+
### Testing with Flexible Prompt Matching
|
|
209
|
+
|
|
210
|
+
For complex prompts or when using I18n with variable interpolation, use regex or proc matching:
|
|
211
|
+
|
|
212
|
+
```ruby
|
|
213
|
+
def test_agent_step_with_regex_matching
|
|
214
|
+
pipeline_class = Class.new(Pipeline) do
|
|
215
|
+
agent_step(:process_document)
|
|
216
|
+
end
|
|
217
|
+
|
|
218
|
+
document = @store.document.create!(title: "Test Doc")
|
|
219
|
+
task = @store.task.create!(record: document, workspace: @workspace)
|
|
220
|
+
|
|
221
|
+
I18n.backend.store_translations(:en, {
|
|
222
|
+
process_document: {
|
|
223
|
+
tools: ["read_file", "edit_file"],
|
|
224
|
+
cached_prompts: ["Instructions..."],
|
|
225
|
+
prompt: "Process document: %{title}",
|
|
226
|
+
response_schema: {
|
|
227
|
+
category: { type: "string", description: "Assigned category" }
|
|
228
|
+
}
|
|
229
|
+
}
|
|
230
|
+
})
|
|
231
|
+
|
|
232
|
+
# Use regex to match prompts flexibly
|
|
233
|
+
dummy_chat = DummyChat.new(responses: {
|
|
234
|
+
/Process document:/ => '{"category": "Processed"}'
|
|
235
|
+
})
|
|
236
|
+
|
|
237
|
+
session = test_session(
|
|
238
|
+
workspace_dir: @workspace.dir,
|
|
239
|
+
chat_provider: ->(**params) { dummy_chat }
|
|
240
|
+
)
|
|
241
|
+
|
|
242
|
+
pipeline_class.call(task: task, session: session)
|
|
243
|
+
|
|
244
|
+
assert_equal "Processed", document.reload.category
|
|
245
|
+
end
|
|
246
|
+
```
|
|
247
|
+
|
|
248
|
+
### Testing Error Handling in Pipelines
|
|
249
|
+
|
|
250
|
+
```ruby
|
|
251
|
+
def test_agent_step_failure
|
|
252
|
+
pipeline_class = Class.new(Pipeline) do
|
|
253
|
+
agent_step(:failing_step)
|
|
254
|
+
|
|
255
|
+
step(:should_not_run) do
|
|
256
|
+
record.update!(summary: "Should not execute")
|
|
257
|
+
end
|
|
258
|
+
end
|
|
259
|
+
|
|
260
|
+
document = @store.document.create!
|
|
261
|
+
task = @store.task.create!(record: document, workspace: @workspace)
|
|
262
|
+
|
|
263
|
+
I18n.backend.store_translations(:en, {
|
|
264
|
+
failing_step: {
|
|
265
|
+
tools: [],
|
|
266
|
+
cached_prompts: [],
|
|
267
|
+
prompt: "This will fail",
|
|
268
|
+
response_schema: { result: { type: "string", description: "Result" } }
|
|
269
|
+
}
|
|
270
|
+
})
|
|
271
|
+
|
|
272
|
+
dummy_chat = DummyChat.new(responses: {
|
|
273
|
+
"This will fail" => '{"unable_to_fulfill_request_error": "Processing failed"}'
|
|
274
|
+
})
|
|
275
|
+
|
|
276
|
+
session = test_session(
|
|
277
|
+
workspace_dir: @workspace.dir,
|
|
278
|
+
chat_provider: ->(**params) { dummy_chat }
|
|
279
|
+
)
|
|
280
|
+
|
|
281
|
+
pipeline_class.call(task: task, session: session)
|
|
282
|
+
|
|
283
|
+
# Verify failure handling
|
|
284
|
+
assert task.reload.failed?
|
|
285
|
+
assert_match(/Processing failed/, task.error_message)
|
|
286
|
+
assert_nil document.reload.summary
|
|
287
|
+
assert_equal [], task.completed_steps
|
|
288
|
+
end
|
|
289
|
+
```
|
|
290
|
+
|
|
291
|
+
### Testing Pipeline with Multiple Agent Steps
|
|
292
|
+
|
|
293
|
+
```ruby
|
|
294
|
+
def test_multi_step_pipeline
|
|
295
|
+
pipeline_class = Class.new(Pipeline) do
|
|
296
|
+
agent_step(:extract_title)
|
|
297
|
+
agent_step(:generate_summary)
|
|
298
|
+
agent_step(:assign_category)
|
|
299
|
+
end
|
|
300
|
+
|
|
301
|
+
document = @store.document.create!
|
|
302
|
+
task = @store.task.create!(record: document, workspace: @workspace)
|
|
303
|
+
|
|
304
|
+
# Set up I18n for all steps
|
|
305
|
+
I18n.backend.store_translations(:en, {
|
|
306
|
+
extract_title: {
|
|
307
|
+
tools: ["read_file"],
|
|
308
|
+
cached_prompts: ["You extract titles from documents."],
|
|
309
|
+
prompt: "Extract title",
|
|
310
|
+
response_schema: {
|
|
311
|
+
title: { type: "string", description: "Document title" }
|
|
312
|
+
}
|
|
313
|
+
},
|
|
314
|
+
generate_summary: {
|
|
315
|
+
tools: ["read_file"],
|
|
316
|
+
cached_prompts: ["You summarize documents."],
|
|
317
|
+
prompt: "Summarize document: %{title}",
|
|
318
|
+
response_schema: {
|
|
319
|
+
summary: { type: "string", description: "Summary" }
|
|
320
|
+
}
|
|
321
|
+
},
|
|
322
|
+
assign_category: {
|
|
323
|
+
tools: [],
|
|
324
|
+
cached_prompts: ["You categorize documents."],
|
|
325
|
+
prompt: "Categorize: %{title} - %{summary}",
|
|
326
|
+
response_schema: {
|
|
327
|
+
category: { type: "string", description: "Category" }
|
|
328
|
+
}
|
|
329
|
+
}
|
|
330
|
+
})
|
|
331
|
+
|
|
332
|
+
# Configure responses for each step
|
|
333
|
+
dummy_chat = DummyChat.new(responses: {
|
|
334
|
+
"Extract title" =>
|
|
335
|
+
'{"title": "Research Paper"}',
|
|
336
|
+
"Summarize document: Research Paper" =>
|
|
337
|
+
'{"summary": "A study on testing"}',
|
|
338
|
+
/Categorize: Research Paper - A study on testing/ =>
|
|
339
|
+
'{"category": "Research"}'
|
|
340
|
+
})
|
|
341
|
+
|
|
342
|
+
session = test_session(
|
|
343
|
+
workspace_dir: @workspace.dir,
|
|
344
|
+
chat_provider: ->(**params) { dummy_chat }
|
|
345
|
+
)
|
|
346
|
+
|
|
347
|
+
pipeline_class.call(task: task, session: session)
|
|
348
|
+
|
|
349
|
+
# Verify all steps completed
|
|
350
|
+
document.reload
|
|
351
|
+
assert_equal "Research Paper", document.title
|
|
352
|
+
assert_equal "A study on testing", document.summary
|
|
353
|
+
assert_equal "Research", document.category
|
|
354
|
+
assert task.reload.done?
|
|
355
|
+
assert_equal ["extract_title", "generate_summary", "assign_category"],
|
|
356
|
+
task.completed_steps
|
|
357
|
+
end
|
|
358
|
+
```
|
|
359
|
+
|
|
360
|
+
### Testing Pipeline Resumption
|
|
361
|
+
|
|
362
|
+
Pipelines track completed steps and can resume from where they left off:
|
|
363
|
+
|
|
364
|
+
```ruby
|
|
365
|
+
def test_pipeline_resumes_from_completed_steps
|
|
366
|
+
pipeline_class = Class.new(Pipeline) do
|
|
367
|
+
step(:step_1) do
|
|
368
|
+
record.update!(title: "Step 1 Done")
|
|
369
|
+
end
|
|
370
|
+
|
|
371
|
+
step(:step_2) do
|
|
372
|
+
record.update!(summary: "Step 2 Done")
|
|
373
|
+
end
|
|
374
|
+
|
|
375
|
+
step(:step_3) do
|
|
376
|
+
record.update!(category: "Step 3 Done")
|
|
377
|
+
end
|
|
378
|
+
end
|
|
379
|
+
|
|
380
|
+
document = @store.document.create!
|
|
381
|
+
task = @store.task.create!(record: document, workspace: @workspace)
|
|
382
|
+
|
|
383
|
+
# Mark step_1 as already completed
|
|
384
|
+
task.completed_steps << "step_1"
|
|
385
|
+
session = test_session
|
|
386
|
+
|
|
387
|
+
pipeline_class.call(task: task, session: session)
|
|
388
|
+
|
|
389
|
+
# step_1 was skipped, only step_2 and step_3 ran
|
|
390
|
+
assert_nil document.reload.title
|
|
391
|
+
assert_equal "Step 2 Done", document.summary
|
|
392
|
+
assert_equal "Step 3 Done", document.category
|
|
393
|
+
assert_equal ["step_1", "step_2", "step_3"], task.reload.completed_steps
|
|
394
|
+
end
|
|
395
|
+
```
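The skip-on-resume behaviour can be pictured with a small stand-alone sketch. `TinyPipeline` is a hypothetical class written for this illustration; it is not how `AgentC::Pipeline` is implemented, and it only assumes that completed step names are kept in an ordered list.

```ruby
# Hypothetical stand-in for a pipeline runner; not AgentC::Pipeline itself.
class TinyPipeline
  def initialize(steps)
    @steps = steps # ordered hash: step name => callable
  end

  # Runs each step unless its name is already in completed_steps.
  def call(completed_steps, record)
    @steps.each do |name, body|
      next if completed_steps.include?(name)

      body.call(record)
      completed_steps << name
    end
    completed_steps
  end
end

record = {}
pipeline = TinyPipeline.new({
  "step_1" => ->(r) { r[:title] = "Step 1 Done" },
  "step_2" => ->(r) { r[:summary] = "Step 2 Done" }
})

# step_1 is marked as done up front, so only step_2 touches the record.
done = pipeline.call(["step_1"], record)
```

Because `step_1` never runs, `record` ends up with a summary but no title, which mirrors the `assert_nil document.reload.title` check in the test above.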
|
|
396
|
+
|
|
397
|
+
|
|
398
|
+
## Testing Git Operations with DummyGit
|
|
399
|
+
|
|
400
|
+
When testing pipelines that perform git operations, use `DummyGit` to avoid actual git commands:
|
|
401
|
+
|
|
402
|
+
```ruby
|
|
403
|
+
require 'agent_c'
|
|
404
|
+
include AgentC::TestHelpers
|
|
405
|
+
|
|
406
|
+
def test_pipeline_with_git
|
|
407
|
+
# Create a dummy git instance
|
|
408
|
+
dummy_git = DummyGit.new(@workspace.dir)
|
|
409
|
+
|
|
410
|
+
# Simulate that a file was created (has uncommitted changes)
|
|
411
|
+
dummy_git.simulate_file_created!
|
|
412
|
+
|
|
413
|
+
# Run pipeline with dummy git (`task` and `session` are built as in the earlier examples)
|
|
414
|
+
Pipeline.call(
|
|
415
|
+
task: task,
|
|
416
|
+
session: session,
|
|
417
|
+
git: ->(_dir) { dummy_git }
|
|
418
|
+
)
|
|
419
|
+
|
|
420
|
+
# Verify git operations were called
|
|
421
|
+
assert_equal 1, dummy_git.invocations.count
|
|
422
|
+
commit = dummy_git.invocations.first
|
|
423
|
+
assert_equal :commit_all, commit[:method]
|
|
424
|
+
assert_match(/added file/, commit.dig(:args, 0))
|
|
425
|
+
end
|
|
426
|
+
```
|
|
427
|
+
|
|
428
|
+
### DummyGit API
|
|
429
|
+
|
|
430
|
+
- `initialize(workspace_dir)` - Create instance with working directory
|
|
431
|
+
- `uncommitted_changes?` - Returns false by default, true after `simulate_file_created!`
|
|
432
|
+
- `simulate_file_created!` - Makes `uncommitted_changes?` return true
|
|
433
|
+
- `invocations` - Array of all method calls with `{method:, args:, params:}` hashes
|
|
434
|
+
- Responds to any method via `method_missing`, recording invocations
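The recording behaviour in the last two bullets can be approximated with a few lines of plain Ruby. `RecordingStub` below is a hypothetical class for illustration; DummyGit's real implementation may differ in detail.

```ruby
# Hypothetical recording stub; DummyGit's real implementation may differ.
class RecordingStub
  attr_reader :invocations

  def initialize
    @invocations = []
  end

  # Record every unknown method call instead of raising NoMethodError.
  def method_missing(name, *args, **params)
    @invocations << { method: name, args: args, params: params }
    nil
  end

  # Keep respond_to? consistent with method_missing.
  def respond_to_missing?(_name, _include_private = false)
    true
  end
end

stub = RecordingStub.new
stub.commit_all("added file")
stub.invocations.first # a hash with :method, :args and :params keys
```

Recording calls rather than executing them lets a test assert on what the pipeline asked git to do without touching a real repository.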
|
|
435
|
+
|
|
436
|
+
## Basic Usage with session.prompt
|
|
437
|
+
|
|
438
|
+
```ruby
|
|
439
|
+
require 'agent_c'
|
|
440
|
+
include AgentC::TestHelpers
|
|
441
|
+
|
|
442
|
+
# Create a real session with DummyChat as the chat provider
|
|
443
|
+
session = Session.new(
|
|
444
|
+
chat_provider: ->(**params) {
|
|
445
|
+
DummyChat.new(
|
|
446
|
+
responses: {
|
|
447
|
+
"What is 2+2?" => '{"answer": "4"}'
|
|
448
|
+
},
|
|
449
|
+
**params
|
|
450
|
+
)
|
|
451
|
+
}
|
|
452
|
+
)
|
|
453
|
+
|
|
454
|
+
# Use it just like a real session
|
|
455
|
+
result = session.prompt(
|
|
456
|
+
prompt: "What is 2+2?",
|
|
457
|
+
schema: -> { string(:answer) }
|
|
458
|
+
)
|
|
459
|
+
|
|
460
|
+
result.success? # => true
|
|
461
|
+
result.data["answer"] # => "4"
|
|
462
|
+
```
|
|
463
|
+
|
|
464
|
+
## Using session.chat with DummyChat
|
|
465
|
+
|
|
466
|
+
For testing multi-turn conversations, pass a `DummyChat` instance directly as the `record:` argument:
|
|
467
|
+
|
|
468
|
+
```ruby
|
|
469
|
+
session = Session.new()
|
|
470
|
+
dummy_chat = DummyChat.new(responses: {
|
|
471
|
+
"Hello" => "Hi there!",
|
|
472
|
+
"How are you?" => "I'm doing well!"
|
|
473
|
+
})
|
|
474
|
+
|
|
475
|
+
# Inject DummyChat as the record
|
|
476
|
+
chat = session.chat(tools: [], record: dummy_chat)
|
|
477
|
+
|
|
478
|
+
response1 = chat.ask("Hello")
|
|
479
|
+
response1.content # => "Hi there!"
|
|
480
|
+
|
|
481
|
+
response2 = chat.ask("How are you?")
|
|
482
|
+
response2.content # => "I'm doing well!"
|
|
483
|
+
```
|
|
484
|
+
|
|
485
|
+
## Response Mapping
|
|
486
|
+
|
|
487
|
+
`TestHelpers::DummyChat` accepts a hash mapping prompts to responses. Keys can be exact strings, regexes, or procs; values can be strings or callables (lambdas/procs) for simulating side effects. The supported forms are:
|
|
488
|
+
|
|
489
|
+
### Exact String Matching
|
|
490
|
+
|
|
491
|
+
```ruby
|
|
492
|
+
session = Session.new(
|
|
493
|
+
chat_provider: ->(**params) {
|
|
494
|
+
DummyChat.new(
|
|
495
|
+
responses: {
|
|
496
|
+
"What is Ruby?" => '{"answer": "A programming language"}'
|
|
497
|
+
},
|
|
498
|
+
**params
|
|
499
|
+
)
|
|
500
|
+
}
|
|
501
|
+
)
|
|
502
|
+
```
|
|
503
|
+
|
|
504
|
+
### Regex Matching
|
|
505
|
+
|
|
506
|
+
```ruby
|
|
507
|
+
session = Session.new(
|
|
508
|
+
chat_provider: ->(**params) {
|
|
509
|
+
DummyChat.new(
|
|
510
|
+
responses: {
|
|
511
|
+
/extract.*email/ => '{"email": "user@example.com"}'
|
|
512
|
+
},
|
|
513
|
+
**params
|
|
514
|
+
)
|
|
515
|
+
}
|
|
516
|
+
)
|
|
517
|
+
|
|
518
|
+
# Matches any prompt containing "extract" followed by "email"
|
|
519
|
+
result = session.prompt(
|
|
520
|
+
prompt: "Please extract the email from this text",
|
|
521
|
+
schema: -> { string(:email) }
|
|
522
|
+
)
|
|
523
|
+
```
|
|
524
|
+
|
|
525
|
+
### Proc Matching
|
|
526
|
+
|
|
527
|
+
```ruby
|
|
528
|
+
session = Session.new(
|
|
529
|
+
chat_provider: ->(**params) {
|
|
530
|
+
DummyChat.new(
|
|
531
|
+
responses: {
|
|
532
|
+
->(text) { text.include?("hello") } => '{"greeting": "Hi!"}'
|
|
533
|
+
},
|
|
534
|
+
**params
|
|
535
|
+
)
|
|
536
|
+
}
|
|
537
|
+
)
|
|
538
|
+
|
|
539
|
+
# Matches any prompt containing "hello"
|
|
540
|
+
result = session.prompt(
|
|
541
|
+
prompt: "Say hello to me",
|
|
542
|
+
schema: -> { string(:greeting) }
|
|
543
|
+
)
|
|
544
|
+
```
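One way to think about the three key types is as a single lookup driven by Ruby's case-equality operator, since `String#===`, `Regexp#===`, and `Proc#===` all accept the prompt text. The helper below is a hypothetical sketch of that idea; it is not taken from DummyChat's source, and the first-match-wins order is an assumption.

```ruby
# Hypothetical lookup helper; not DummyChat's actual code.
# String#===, Regexp#=== and Proc#=== cover all three key types.
def find_response(responses, prompt)
  _key, value = responses.find { |key, _| key === prompt }
  value
end

responses = {
  "What is Ruby?" => "exact",
  /extract.*email/ => "regex",
  ->(text) { text.include?("hello") } => "proc"
}

find_response(responses, "What is Ruby?")            # matched by the String key
find_response(responses, "Please extract the email") # matched by the Regexp key
find_response(responses, "Say hello to me")          # matched by the Proc key
find_response(responses, "Something else")           # no key matches
```

Note that a plain regex such as `/extract.*email/` is case-sensitive, so a prompt starting with "Extract" would need the `i` flag to match.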
|
|
545
|
+
|
|
546
|
+
### Callable Response Values
|
|
547
|
+
|
|
548
|
+
Response values can be callables (lambdas/procs) that are invoked when the prompt matches. This is useful for:
|
|
549
|
+
- Simulating side effects like file writes or API calls
|
|
550
|
+
- Returning dynamic values based on state
|
|
551
|
+
- Testing scenarios where the LLM interaction triggers other operations
|
|
552
|
+
|
|
553
|
+
```ruby
|
|
554
|
+
require 'tmpdir'
|
|
555
|
+
|
|
556
|
+
session = Session.new(
|
|
557
|
+
chat_provider: ->(**params) {
|
|
558
|
+
DummyChat.new(
|
|
559
|
+
responses: {
|
|
560
|
+
"Write file" => -> {
|
|
561
|
+
File.write("/tmp/test.txt", "content")
|
|
562
|
+
'{"path": "/tmp/test.txt"}'
|
|
563
|
+
}
|
|
564
|
+
},
|
|
565
|
+
**params
|
|
566
|
+
)
|
|
567
|
+
}
|
|
568
|
+
)
|
|
569
|
+
|
|
570
|
+
result = session.prompt(
|
|
571
|
+
prompt: "Write file",
|
|
572
|
+
schema: -> { string(:path) }
|
|
573
|
+
)
|
|
574
|
+
|
|
575
|
+
# The callable was invoked, file is written
|
|
576
|
+
assert File.exist?("/tmp/test.txt")
|
|
577
|
+
assert_equal "/tmp/test.txt", result.data["path"]
|
|
578
|
+
```
|
|
579
|
+
|
|
580
|
+
You can also use callables with stateful closures:
|
|
581
|
+
|
|
582
|
+
```ruby
|
|
583
|
+
call_count = 0
|
|
584
|
+
|
|
585
|
+
dummy_chat = DummyChat.new(responses: {
|
|
586
|
+
"Count" => -> {
|
|
587
|
+
call_count += 1
|
|
588
|
+
"Called #{call_count} times"
|
|
589
|
+
}
|
|
590
|
+
})
|
|
591
|
+
|
|
592
|
+
session = Session.new()
|
|
593
|
+
chat = session.chat(tools: [], record: dummy_chat)
|
|
594
|
+
|
|
595
|
+
chat.ask("Count").content # => "Called 1 times"
|
|
596
|
+
chat.ask("Count").content # => "Called 2 times"
|
|
597
|
+
```
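The difference between a static and a callable response value comes down to one check at lookup time. The sketch below uses a hypothetical `resolve_response` helper to show the idea; how DummyChat actually resolves values is an assumption here.

```ruby
# Hypothetical helper: call the value if it is callable, otherwise return it.
def resolve_response(value)
  value.respond_to?(:call) ? value.call : value
end

counter = 0
static_value  = "Hi there!"
dynamic_value = -> { counter += 1; "Called #{counter} times" }

first  = resolve_response(dynamic_value) # runs the lambda, bumping counter
second = resolve_response(dynamic_value) # runs it again with the new state
plain  = resolve_response(static_value)  # strings are returned unchanged
```

Because the lambda closes over `counter`, each match sees the state left by the previous one, which is what makes stateful test scenarios possible.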
|
|
598
|
+
|
|
599
|
+
## Success and Error Responses
|
|
600
|
+
|
|
601
|
+
TestHelpers::DummyChat supports both success and error responses:
|
|
602
|
+
|
|
603
|
+
### Success Response
|
|
604
|
+
|
|
605
|
+
```ruby
|
|
606
|
+
session = Session.new(
|
|
607
|
+
chat_provider: ->(**params) {
|
|
608
|
+
DummyChat.new(
|
|
609
|
+
responses: {
|
|
610
|
+
"Process data" => '{"result": "processed"}'
|
|
611
|
+
},
|
|
612
|
+
**params
|
|
613
|
+
)
|
|
614
|
+
}
|
|
615
|
+
)
|
|
616
|
+
|
|
617
|
+
result = session.prompt(
|
|
618
|
+
prompt: "Process data",
|
|
619
|
+
schema: -> { string(:result) }
|
|
620
|
+
)
|
|
621
|
+
|
|
622
|
+
result.success? # => true
|
|
623
|
+
result.data["result"] # => "processed"
|
|
624
|
+
```
|
|
625
|
+
|
|
626
|
+
### Error Response
|
|
627
|
+
|
|
628
|
+
```ruby
|
|
629
|
+
session = Session.new(
|
|
630
|
+
chat_provider: ->(**params) {
|
|
631
|
+
DummyChat.new(
|
|
632
|
+
responses: {
|
|
633
|
+
"Impossible task" => '{"unable_to_fulfill_request_error": "Cannot complete"}'
|
|
634
|
+
},
|
|
635
|
+
**params
|
|
636
|
+
)
|
|
637
|
+
}
|
|
638
|
+
)
|
|
639
|
+
|
|
640
|
+
result = session.prompt(
|
|
641
|
+
prompt: "Impossible task",
|
|
642
|
+
schema: -> { string(:result) }
|
|
643
|
+
)
|
|
644
|
+
|
|
645
|
+
result.success? # => false
|
|
646
|
+
result.error_message # => "Cannot complete"
|
|
647
|
+
```
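The success and error examples differ only in whether the JSON payload carries the `unable_to_fulfill_request_error` key. A minimal sketch of that distinction follows; `SketchResult` and `parse_response` are hypothetical names, and the real AgentC response object may be built differently.

```ruby
require "json"

# Hypothetical result object; AgentC's real response class may differ.
SketchResult = Struct.new(:data, :error_message) do
  def success?
    error_message.nil?
  end
end

# Treat a payload containing the error key as a failure, anything else as data.
def parse_response(json)
  error_key = "unable_to_fulfill_request_error"
  payload = JSON.parse(json)
  if payload.key?(error_key)
    SketchResult.new(nil, payload[error_key])
  else
    SketchResult.new(payload, nil)
  end
end

ok  = parse_response('{"result": "processed"}')
bad = parse_response('{"unable_to_fulfill_request_error": "Cannot complete"}')
```

Here `ok.success?` is true and `bad.error_message` carries the text from the error key, matching the shape of the two examples above.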
|
|
648
|
+
|
|
649
|
+
## Complete Test Example
|
|
650
|
+
|
|
651
|
+
```ruby
|
|
652
|
+
require "test_helper"
|
|
653
|
+
|
|
654
|
+
class MyFeatureTest < Minitest::Test
|
|
655
|
+
include AgentC::TestHelpers
|
|
656
|
+
|
|
657
|
+
def test_extract_email_from_text
|
|
658
|
+
session = Session.new(
|
|
659
|
+
chat_provider: ->(**params) {
|
|
660
|
+
DummyChat.new(
|
|
661
|
+
responses: {
|
|
662
|
+
/extract.*email/i => '{"email": "john@example.com"}'
|
|
663
|
+
},
|
|
664
|
+
**params
|
|
665
|
+
)
|
|
666
|
+
}
|
|
667
|
+
)
|
|
668
|
+
|
|
669
|
+
result = session.prompt(
|
|
670
|
+
prompt: "Extract the email from: Contact John at john@example.com",
|
|
671
|
+
schema: -> { string(:email) }
|
|
672
|
+
)
|
|
673
|
+
|
|
674
|
+
assert result.success?
|
|
675
|
+
assert_equal "john@example.com", result.data["email"]
|
|
676
|
+
end
|
|
677
|
+
|
|
678
|
+
def test_handles_error_gracefully
|
|
679
|
+
session = Session.new(
|
|
680
|
+
chat_provider: ->(**params) {
|
|
681
|
+
DummyChat.new(
|
|
682
|
+
responses: {
|
|
683
|
+
"Invalid input" => '{"unable_to_fulfill_request_error": "Input validation failed"}'
|
|
684
|
+
},
|
|
685
|
+
**params
|
|
686
|
+
)
|
|
687
|
+
}
|
|
688
|
+
)
|
|
689
|
+
|
|
690
|
+
result = session.prompt(
|
|
691
|
+
prompt: "Invalid input",
|
|
692
|
+
schema: -> { string(:result) }
|
|
693
|
+
)
|
|
694
|
+
|
|
695
|
+
refute result.success?
|
|
696
|
+
assert_equal "Input validation failed", result.error_message
|
|
697
|
+
end
|
|
698
|
+
|
|
699
|
+
def test_multi_turn_conversation
|
|
700
|
+
session = Session.new()
|
|
701
|
+
dummy_chat = DummyChat.new(responses: {
|
|
702
|
+
"Hello" => "Hi there!",
|
|
703
|
+
/how.*you/i => "I'm doing well!"
|
|
704
|
+
})
|
|
705
|
+
|
|
706
|
+
chat = session.chat(tools: [], record: dummy_chat)
|
|
707
|
+
|
|
708
|
+
assert_equal "Hi there!", chat.ask("Hello").content
|
|
709
|
+
assert_equal "I'm doing well!", chat.ask("How are you?").content
|
|
710
|
+
end
|
|
711
|
+
end
|
|
712
|
+
```
|
|
713
|
+
|
|
714
|
+
## Benefits
|
|
715
|
+
|
|
716
|
+
- **Fast**: No network calls or LLM processing
|
|
717
|
+
- **Predictable**: Same input always produces same output
|
|
718
|
+
- **Isolated**: Tests don't depend on external services
|
|
719
|
+
- **Cost-free**: No API charges during testing
|
|
720
|
+
- **Flexible**: Supports exact, regex, and proc matching
|
|
721
|
+
- **Real code paths**: Uses actual Session implementation
|
|
722
|
+
|
|
723
|
+
## Tips
|
|
724
|
+
|
|
725
|
+
1. **Understand the interpolation difference**:
|
|
726
|
+
- **Inline agent_step**: DummyChat receives literal `%{placeholders}` - match `"Process %{name}"`
|
|
727
|
+
- **I18n agent_step**: DummyChat receives interpolated values - match `"Process John"`
|
|
728
|
+
|
|
729
|
+
2. **Prefer inline agent_step definitions for tests**: Define agent_step with inline `prompt:` and `schema:` parameters instead of using I18n - it's simpler and keeps test logic self-contained
|
|
730
|
+
|
|
731
|
+
3. **Use regex for flexible matching**: When the exact prompt text may vary slightly, use regex patterns like `/Process file/` instead of exact strings
|
|
732
|
+
|
|
733
|
+
4. **Test both success and error cases**: Ensure your code handles both scenarios
|
|
734
|
+
|
|
735
|
+
5. **Keep responses realistic**: Use actual JSON structures your code expects
|
|
736
|
+
|
|
737
|
+
6. **One response map per test**: Makes tests easier to understand and maintain
|
|
738
|
+
|
|
739
|
+
7. **Inject via chat_provider for session.prompt**: Ensures DummyChat is used for all chat instances
|
|
740
|
+
|
|
741
|
+
8. **Inject via record for session.chat**: When you need fine-grained control over a specific chat instance
|
|
742
|
+
|
|
743
|
+
9. **Use callable responses for side effects**: When testing code that expects the LLM interaction to trigger file writes, API calls, or other state changes
|
|
744
|
+
|
|
745
|
+
10. **Use I18n for production agent_steps**: Store prompts in YAML files for production code, and load them in tests via `I18n.load_path`
|
|
746
|
+
|
|
747
|
+
11. **Test pipeline resumption**: Verify that pipelines correctly skip already-completed steps
|