turnkit 0.2.8 → 0.2.10
- checksums.yaml +4 -4
- data/CHANGELOG.md +12 -5
- data/README.md +165 -26
- data/UPGRADE.md +35 -68
- data/lib/turnkit/adapters/codex.rb +160 -0
- data/lib/turnkit/agent.rb +70 -4
- data/lib/turnkit/budget.rb +23 -8
- data/lib/turnkit/compaction.rb +15 -4
- data/lib/turnkit/conversation.rb +4 -3
- data/lib/turnkit/error.rb +1 -0
- data/lib/turnkit/output_audit.rb +92 -0
- data/lib/turnkit/output_policy.rb +121 -0
- data/lib/turnkit/run.rb +2 -0
- data/lib/turnkit/tool_runner.rb +11 -4
- data/lib/turnkit/turn.rb +96 -12
- data/lib/turnkit/version.rb +1 -1
- data/lib/turnkit/{fleet.rb → workflow.rb} +40 -13
- data/lib/turnkit.rb +14 -5
- metadata +10 -6
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 268561a36c656098e1d23ea6de4c17616358ff931e05e1389e707a9e28fe458b
+  data.tar.gz: 8f6731d78fed5b3e3cc94d781c4f4e26accc4f8d05842b5c56eb58a6e7448907
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: ae0a246b5937e586c808a25d28f051bafc54c2a922a52d89160eb3f5ef3bf7360b1d637cbb0c170d41eb74cd536638b6f9a1880275bd0ccd2fc8dcb4ac44db5c
+  data.tar.gz: 7ffebcfeadf51f193c7f2277a0842c2f56e00d9ff95d502915924f2a6d7e10744a0a710d1d2f5b1865182a9de21b2cce30edc3e94c16f49626912b93b1fc7063
data/CHANGELOG.md
CHANGED
@@ -1,12 +1,19 @@
 # Changelog
 
-## 0.2.
+## 0.2.10 - 2026-06-10
 
-- Add
-- Add
-- Improve
+- Add output audits and file-backed output policies for validating final run output.
+- Add per-tool execution limits and explicit budget errors.
+- Improve workflow event callbacks, model telemetry events, and compaction usage accounting.
+- Add an Amazon memo writer example and batched page reading in the workflow researcher example.
+
+## 0.2.9 - 2026-06-08
+
+- Add `TurnKit::Workflow` for reusable single-orchestrator task runtimes with workflow skills, tools, guardrails, compaction, and run monitoring.
+- Add `Agent#run` and `TurnKit::Run` for non-interactive application tasks, with task prompt behavior by default.
+- Improve task-runtime DX with `TurnKit.configure`, `TurnKit.model`, `TurnKit.max_spend`, `TurnKit::Workflow`, positional `run("task")`, `run.output`, `run.tool_calls`, and `Tool.terminal!`.
 - Support tool instances with constructor-injected dependencies.
-- Add a
+- Add a workflow researcher example and upgrade guide.
 
 ## 0.2.6 - 2026-06-07
 
data/README.md
CHANGED
@@ -4,7 +4,8 @@
 [](https://www.ruby-lang.org)
 [](LICENSE.md)
 
-Build durable Ruby and Rails agents with
+Build durable Ruby and Rails agents with conversations, runs, workflows, tools,
+skills, output audits, sub-agents, and persistence.
 
 ## Installation
 
@@ -57,6 +58,18 @@ puts run.output
 
 ## Usage
 
+For runnable, API-key-free examples of the three core entry points, see
+[`examples/core_api`](examples/core_api):
+
+- conversation: durable thread over time;
+- agent run: one bounded application task;
+- workflow: reusable task runner with skills, tools, and limits.
+
+For fuller workflow examples, see:
+
+- [`examples/workflow_researcher`](examples/workflow_researcher): source-grounded research with web tools, batch reads, per-tool limits, and deep monitoring;
+- [`examples/amazon_memo_writer`](examples/amazon_memo_writer): strict memo generation with research tools, a structured terminal submit tool, deterministic format checks, and an LLM output policy.
+
 ### Models
 
 Set a model:
@@ -92,6 +105,23 @@ Use these common providers:
 
 Expect `TurnKit::ModelAccessError` for obvious key mistakes.
 
+To run eligible coding tasks against a ChatGPT Plus/Pro Codex subscription instead of provider API-key billing, use the Codex adapter. It shells out to the official `codex exec` CLI, so authenticate Codex first:
+
+```sh
+codex login --device-auth
+```
+
+Then configure TurnKit:
+
+```ruby
+TurnKit.configure do |config|
+  config.client = TurnKit::Adapters::Codex.new(sandbox: "read-only")
+  config.model = "gpt-5.4"
+end
+```
+
+The Codex adapter does not store ChatGPT tokens or read `~/.codex/auth.json` directly. It reuses Codex CLI auth and records token usage with no TurnKit provider cost, because usage is charged against the user's ChatGPT/Codex plan limits.
+
 ### Conversations
 
 Create a conversation:
@@ -118,10 +148,13 @@ turn = conversation.run!
 puts turn.output_text
 ```
 
-###
+### Runs
+
+Use `Agent#run` when your application needs one non-interactive result. A run is
+the AI equivalent of a service object call: one input, one job, one output.
 
-
-
+Reach for a run when the task is bounded, such as classification, extraction,
+summarization, routing, scoring, or structured JSON generation.
 
 ```ruby
 agent = TurnKit::Agent.new(
@@ -135,7 +168,6 @@ agent = TurnKit::Agent.new(
     },
     required: ["priority", "reason"]
   },
-  prompt_mode: :task
 )
 
 run = agent.run(
@@ -146,8 +178,10 @@ run = agent.run(
 puts run.output_data
 ```
 
-`Agent#run`
-
+`Agent#run` uses task prompt behavior by default: it treats the input as the
+contract, avoids follow-up questions, and returns the best result it can. It is a
+small wrapper over TurnKit's existing conversation and turn engine. Existing
+`conversation.ask` usage is still supported for multi-turn threads.
 
 Prepare a pending run without calling the model:
 
@@ -157,31 +191,39 @@ request = run.preview
 run.run!
 ```
 
-###
+### Workflows
 
-Use a
-task
-
-
+Use a workflow when a run graduates into a reusable production capability: a
+named task runner with workflow skills, tools, defaults, guardrails, compaction,
+and output policy.
+
+Workflows earn their keep when the task has a repeatable operating
+procedure: inspect app data, gather context, use sources, draft, verify, save,
+and stop under budget. They are overkill for simple classification or extraction
+runs.
 
 ```ruby
 source_grounded_brief = TurnKit::Skill.from_file("app/ai/skills/source_grounded_brief.md")
 
-
-  "brief_writer",
+workflow = TurnKit::Workflow.new(
+  name: "brief_writer",
   instructions: "Create source-grounded briefs and verify claims before final output.",
   skills: [source_grounded_brief],
   tools: [WebSearch.new, ReadWebPage.new, SaveBrief],
   max_spend: 0.25,
   max_iterations: 12,
   max_tool_executions: 25,
+  max_tool_executions_by_name: {
+    web_search: 2,
+    read_web_page: 8
+  },
   compaction: {
     context_limit: 64_000,
     threshold: 0.75
   }
 )
 
-run =
+run = workflow.run(
   "Create a source-grounded brief.",
   input: { topic: "Rails 8 Solid Queue" }
 )
@@ -198,11 +240,40 @@ model-tool loop:
 model → tool → result → model → tool → result → final
 ```
 
-
-
+For repeated workflows, keep instructions, skills, and tools stable and pass the
+per-run data through `input:`. This gives provider prompt caching the best chance
+to reuse the stable workflow prompt while each run supplies dynamic data.
+
+### Choosing runs, conversations, and workflows
+
+Use the smallest entry point that matches the shape of work:
+
+| Entry point | Use when | Tradeoffs |
+| --- | --- | --- |
+| `Conversation` | A user or app will keep adding messages over time. | Best for durable threads and follow-up steering; history grows, so long threads need compaction. |
+| `Agent#run` | Your app needs one bounded result now. | Best for simple production tasks; repeated complex policies can sprawl across callers. |
+| `TurnKit::Workflow` | A task becomes a named reusable workflow with tools, skills, limits, and observability. | Best cache and packaging story for repeated autonomous work; overkill for one-off/simple tasks. |
+
+Prompt caching and compaction solve different problems:
+
+- prompt caching reduces the cost of repeated stable instructions, tools, and
+skills;
+- compaction reduces the cost of long dynamic histories;
+- budgets (`max_spend`, `max_iterations`, `max_tool_executions`) keep autonomous
+loops bounded.
+
+Use `max_tool_executions_by_name` when a workflow needs different budgets for
+different tools. For example, allow many cheap reads but only one final submit
+tool, or cap web searches while allowing a batch page reader.
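
The per-tool budget idea described in the README text above can be sketched in plain Ruby without TurnKit. The `ToolBudget` class and `BudgetExceeded` error are hypothetical names chosen for illustration and are not TurnKit's implementation; only the option names `max_tool_executions` and `max_tool_executions_by_name` come from the README.

```ruby
# Hypothetical illustration only; not TurnKit's internal budget code.
class BudgetExceeded < StandardError; end

class ToolBudget
  def initialize(max_tool_executions:, max_tool_executions_by_name: {})
    @total_limit = max_tool_executions
    # Normalize keys so :web_search and "web_search" share one counter.
    @limits_by_name = max_tool_executions_by_name.transform_keys(&:to_s)
    @counts = Hash.new(0)
  end

  # Record one call to the named tool, raising if a limit is already used up.
  def record!(tool_name)
    name = tool_name.to_s
    limit = @limits_by_name[name]

    if limit && @counts[name] >= limit
      raise BudgetExceeded, "#{name} reached its limit of #{limit} executions"
    end
    if @counts.values.sum >= @total_limit
      raise BudgetExceeded, "reached the overall limit of #{@total_limit} tool executions"
    end

    @counts[name] += 1
  end

  def count(tool_name)
    @counts[tool_name.to_s]
  end
end

budget = ToolBudget.new(
  max_tool_executions: 25,
  max_tool_executions_by_name: { web_search: 2, read_web_page: 8 }
)

2.times { budget.record!(:web_search) }
budget.record!(:read_web_page)

begin
  budget.record!(:web_search) # third search exceeds the per-tool cap
rescue BudgetExceeded => e
  puts e.message
end
```

Checking the per-tool cap before the overall cap means the error names the specific tool that ran out, which tends to be the more useful message when a loop is stuck repeating one call.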
+
+Reach for separate agents and `sub_agents` only when the isolation is worth the
+extra model calls, such as different models, different tool permissions,
+parallel specialist review, or separate durable child conversations.
+
+Run a workflow with `run`:
 
 ```ruby
-run =
+run = workflow.run(
   "Create compliant outreach for this account.",
   input: lead.attributes,
   max_spend: 0.25,
@@ -215,10 +286,6 @@ run = fleet.auto_run(
 )
 ```
 
-Reach for separate agents and `sub_agents` only when the isolation is worth the
-extra model calls, such as different models, different tool permissions,
-parallel specialist review, or separate durable child conversations.
-
 Use `terminal!` for save or action tools that complete the run:
 
 ```ruby
@@ -235,6 +302,71 @@ class SaveBrief < TurnKit::Tool
 end
 ```
 
+### Output audits and policies
+
+Use output audits for deterministic checks that should not depend on another
+model call: required headings, source counts, forbidden characters, JSON shape,
+or project-specific formatting rules.
+
+```ruby
+no_em_dash = ->(output) do
+  next unless output.include?("—")
+
+  { rule: "no_em_dash", message: "contains an em dash" }
+end
+
+numbered_lists_only = ->(output) do
+  lines = output.lines.each_with_index.filter_map do |line, index|
+    index + 1 if line.match?(/^\s*[-*]\s+/)
+  end
+
+  next if lines.empty?
+
+  {
+    rule: "numbered_lists_only",
+    message: "contains unordered list markers",
+    metadata: { lines: lines }
+  }
+end
+
+workflow = TurnKit::Workflow.new(
+  name: "memo_writer",
+  output_audit: [no_em_dash, numbered_lists_only],
+  output_audit_mode: :fail
+)
+```
+
+Run checks directly when you want to test a renderer or policy without calling a
+model:
+
+```ruby
+audit = TurnKit.audit_output(
+  "1. Recommendation\n- unordered item — fix this\n",
+  constraints: [no_em_dash, numbered_lists_only]
+)
+
+puts audit.clean?
+puts audit.messages
+```
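
The constraint contract used in the README snippet above (a callable that returns `nil` when the output passes, or a hash describing the violation) can be exercised without the gem. In the sketch below, `AuditReport` and `run_checks` are hypothetical stand-ins written for illustration; `TurnKit.audit_output` may collect and report violations differently.

```ruby
# Hypothetical stand-in for illustration; TurnKit.audit_output may behave differently.
AuditReport = Struct.new(:violations) do
  def clean?
    violations.empty?
  end

  def messages
    violations.map { |violation| "#{violation[:rule]}: #{violation[:message]}" }
  end
end

# Each constraint returns nil when the output passes, or a violation hash.
# filter_map drops the nils and keeps only the violations.
def run_checks(output, constraints:)
  AuditReport.new(constraints.filter_map { |constraint| constraint.call(output) })
end

em_dash = "\u2014"

no_em_dash = ->(output) do
  next unless output.include?(em_dash)

  { rule: "no_em_dash", message: "contains an em dash" }
end

numbered_lists_only = ->(output) do
  # Collect 1-based line numbers of lines that start with "-" or "*".
  lines = output.lines.each_with_index.filter_map do |line, index|
    index + 1 if line.match?(/^\s*[-*]\s+/)
  end
  next if lines.empty?

  { rule: "numbered_lists_only", message: "contains unordered list markers", metadata: { lines: lines } }
end

report = run_checks(
  "1. Recommendation\n- unordered item #{em_dash} fix this\n",
  constraints: [no_em_dash, numbered_lists_only]
)

puts report.clean?
puts report.messages
```

Because each constraint is an ordinary lambda, it can be unit-tested on fixed strings before it is attached to a workflow.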
+
+Use `output_policy` when a semantic judge is worth the extra model call. The
+policy can be a `.md`, `.markdown`, or `.txt` file path, a `TurnKit::OutputPolicy`,
+or any object that responds to `#call` or `#check`.
+
+```ruby
+workflow = TurnKit::Workflow.new(
+  name: "memo_writer",
+  output_policy: "app/ai/policies/amazon_memo.md",
+  output_policy_model: "gpt-4.1-mini",
+  output_policy_thinking: { effort: :low },
+  output_policy_mode: :report
+)
+```
+
+`output_policy_mode: :report` records violations while allowing the run to
+complete. `:fail` marks the run failed after recording the output and audit.
+Policy model usage and cost are counted on the parent run.
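
The difference between the two modes described above can be illustrated with a small standalone sketch. `RunOutcome`, `finalize_run`, and the `:completed` / `:failed` status symbols are hypothetical names assumed for this example; they are not taken from TurnKit's run lifecycle.

```ruby
# Hypothetical illustration of the two modes; not TurnKit's run lifecycle code.
RunOutcome = Struct.new(:status, :output, :violations, keyword_init: true)

def finalize_run(output:, violations:, mode:)
  unless %i[report fail].include?(mode)
    raise ArgumentError, "unknown mode: #{mode.inspect}"
  end

  # Output and violations are recorded in both modes; only the status differs.
  failed = mode == :fail && !violations.empty?
  RunOutcome.new(status: failed ? :failed : :completed, output: output, violations: violations)
end

violations = [{ rule: "tone", message: "memo reads like marketing copy" }]

reported = finalize_run(output: "draft memo", violations: violations, mode: :report)
rejected = finalize_run(output: "draft memo", violations: violations, mode: :fail)

puts reported.status
puts rejected.status
```

In both cases the output and the violations stay on the outcome, so a caller can still inspect a failed run.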
+
 ### Prompt Preview
 
 Preview a pending turn:
@@ -272,9 +404,7 @@ class SaveReport < TurnKit::Tool
   parameter :title, :string, required: true
   parameter :body, :string, required: true
 
-
-
-  def self.completion_message(result)
+  terminal! do |result|
     "Saved #{result.fetch("report_id")}."
   end
 
@@ -490,19 +620,28 @@ TurnKit.reconcile_stale!
 | `TurnKit.max_iterations` | Limit model loop iterations. |
 | `TurnKit.max_depth` | Limit sub-agent depth. |
 | `TurnKit.max_tool_executions` | Limit tool calls per turn. |
+| `TurnKit.max_tool_executions_by_name` | Limit specific tools independently. |
 | `TurnKit.timeout` | Limit turn runtime. |
-| `TurnKit.
+| `TurnKit.max_spend` | Limit estimated turn cost. |
 | `TurnKit.compaction` | Configure context compaction. |
+| `TurnKit.output_policy_model` | Default model for file-backed output policies. |
+| `TurnKit.output_policy_thinking` | Default thinking config for file-backed output policies. |
 | `TurnKit.on_event` | Subscribe to lifecycle events. |
 
 Set options globally:
 
 ```ruby
 TurnKit.default_model = "gpt-4.1-mini"
+TurnKit.max_spend = 0.25
 TurnKit.max_iterations = 25
+TurnKit.max_tool_executions_by_name = { web_search: 2 }
+TurnKit.output_policy_model = "gpt-4.1-mini"
 TurnKit.timeout = 300
 ```
 
+`TurnKit.cost_limit` remains supported as the internal/legacy name for
+`max_spend`.
+
 Set options per agent:
 
 ```ruby
data/UPGRADE.md
CHANGED
@@ -1,13 +1,27 @@
 # Upgrade Guide
 
-This guide covers migrating to the
-
-
-
+This guide covers migrating to the workflow-based task-runtime API. The
+recommended migration is about making the three work shapes easier to read:
+
+- conversations for durable multi-turn threads;
+- runs for one non-interactive application task;
+- workflows for reusable task runners with tools, skills, limits, and policy.
 
 ## Quick summary
 
-
+Before changing call sites, bump TurnKit to the latest version and run your
+test suite against the new release.
+
+```ruby
+# Gemfile
+gem "turnkit", "~> 0.2.9"
+```
+
+```sh
+bundle update turnkit
+```
+
+Use workflows for reusable autonomous task runners.
 
 Recommended new forms:
 
@@ -17,23 +31,12 @@ TurnKit.configure do |config|
   config.max_spend = 0.25
 end
 
-
-run =
+workflow = TurnKit::Workflow.new(name: "brief_writer", tools: [WebSearch, SaveBrief])
+run = workflow.run("Create a source-grounded brief.", input: { topic: "Rails 8" })
 
 puts run.output
 ```
 
-Old forms still work:
-
-```ruby
-TurnKit.default_model = "gpt-5.2"
-
-fleet = TurnKit::Fleet.new(name: "brief_writer", tools: [WebSearch, SaveBrief])
-run = fleet.run(task: "Create a source-grounded brief.", input: { topic: "Rails 8" })
-
-puts run.output_text
-```
-
 ## Configuration
 
 ### Model name
@@ -112,7 +115,8 @@ puts run.output
 ```
 
 The keyword form still works. The positional string is the recommended form for
-the common case.
+the common case. `Agent#run` uses task prompt behavior by default; pass
+`prompt_mode: :full` if you need conversation-style prompt behavior for a run.
 
 ### Pending runs
 
@@ -130,10 +134,10 @@ The existing keyword form remains valid:
 run = agent.run(task: "Classify later.", async: true)
 ```
 
-##
+## Workflows
 
-The
-
+The preferred name for reusable autonomous task runtimes is now workflow. A
+workflow packages:
 
 - one task-mode orchestrator
 - workflow skills
@@ -144,10 +148,8 @@ task runtime.” A fleet packages:
 
 ### Construction
 
-Before:
-
 ```ruby
-
+workflow = TurnKit::Workflow.new(
   name: "sales_enrichment",
   tools: [AccountLookup, WebSearch, SaveEnrichment],
   skills: [sales_research_skill],
@@ -155,34 +157,10 @@ fleet = TurnKit::Fleet.new(
 )
 ```
 
-After:
-
-```ruby
-fleet = TurnKit.fleet(
-  "sales_enrichment",
-  tools: [AccountLookup, WebSearch, SaveEnrichment],
-  skills: [sales_research_skill],
-  max_spend: 0.25
-)
-```
-
-`TurnKit::Fleet.new` remains supported.
-
 ### Running
 
-Before:
-
 ```ruby
-run =
-  task: "Enrich this account for responsible outreach.",
-  input: account.attributes
-)
-```
-
-After:
-
-```ruby
-run = fleet.run(
+run = workflow.run(
   "Enrich this account for responsible outreach.",
   input: account.attributes
 )
@@ -190,17 +168,6 @@ run = fleet.run(
 
 `task:` remains supported.
 
-### Auto-run alias
-
-No behavior change.
-
-```ruby
-run = fleet.auto_run("Enrich this account.", input: account.attributes)
-```
-
-Use `auto_run` when the name helps communicate that the fleet should navigate
-from input to output on its own. It is an alias for `run`.
-
 ## Run inspection
 
 New convenience methods were added to `TurnKit::Run`.
@@ -285,10 +252,10 @@ agent = TurnKit::Agent.new(tools: [WebSearch.new(client: client)])
 This is the recommended pattern for API clients, test doubles, and per-tenant
 dependencies.
 
-## Multi-agent
+## Multi-agent workflows
 
 If you previously modeled every role as a separate agent, consider migrating the
-default path to one
+default path to one workflow with a workflow skill.
 
 Before:
 
@@ -315,8 +282,8 @@ workflow = TurnKit::Skill.new(
   TEXT
 )
 
-
-  "source_brief",
+source_brief = TurnKit::Workflow.new(
+  name: "source_brief",
   skills: [workflow],
   tools: [WebSearch, ReadWebPage, SaveBrief],
   max_spend: 0.25,
@@ -336,11 +303,11 @@ Keep separate agents when the isolation is worth the extra model calls:
 
 1. Replace `TurnKit.default_model =` with `TurnKit.model =` in app-level config.
 2. Wrap global settings in `TurnKit.configure` if you have more than one.
-3.
+3. Use `TurnKit::Workflow.new(name: "...")` for reusable autonomous task runners.
 4. Replace `run(task: "...")` with `run("...")` where it improves readability.
 5. Replace `run.output_text` with `run.output` in application code.
 6. Replace save/action tool overrides with `terminal!` when convenient.
-7. Consider collapsing role-agent
+7. Consider collapsing role-agent workflows into one workflow plus workflow skills if
 cost or complexity is a concern.
 
-
+Run your test suite after migrating call sites.
data/lib/turnkit/adapters/codex.rb
ADDED
@@ -0,0 +1,160 @@
+# frozen_string_literal: true
+
+require "json"
+require "open3"
+require "tempfile"
+
+module TurnKit
+  module Adapters
+    class Codex < Client
+      Status = Struct.new(:successful, keyword_init: true) do
+        def success? = successful
+      end
+
+      attr_reader :command, :sandbox, :working_directory
+
+      def initialize(command: ENV.fetch("CODEX_COMMAND", "codex"), sandbox: "read-only", working_directory: Dir.pwd, runner: nil)
+        @command = command.to_s
+        @sandbox = sandbox
+        @working_directory = working_directory
+        @runner = runner || method(:run_command)
+      end
+
+      def validate!(model:)
+        raise ModelAccessError, "codex command is required" if command.empty?
+        raise ModelAccessError, "#{command.inspect} was not found. Install OpenAI Codex CLI and run `codex login --device-auth`." unless executable?(command)
+
+        stdout, stderr, status = @runner.call([ command, "login", "status" ], stdin_data: nil, chdir: working_directory)
+        return true if status.success?
+
+        message = [ stderr, stdout ].join("\n").strip
+        hint = "Run `#{command} login --device-auth` to connect your ChatGPT/Codex subscription."
+        raise ModelAccessError, [ "Codex is not authenticated.", message, hint ].reject(&:empty?).join(" ")
+      end
+
+      def chat(model:, messages:, tools:, instructions:, temperature: nil, thinking: nil, output_schema: nil, metadata: nil, on_event: nil)
+        raise ToolError, "TurnKit tools are not supported by the Codex adapter; Codex uses its own local tools" if Array(tools).any?
+
+        with_tempfiles(output_schema: output_schema) do |schema_file, output_file|
+          command = exec_command(model: model, schema_file: schema_file&.path, output_file: output_file.path)
+          stdout, stderr, status = @runner.call(command, stdin_data: prompt_for(messages: messages, instructions: instructions), chdir: working_directory)
+          emit_codex_events(stdout, on_event: on_event)
+          raise ModelAccessError, stderr.strip.empty? ? "codex exec failed" : stderr.strip unless status.success?
+
+          text = read_output(output_file, stdout)
+          Result.new(
+            text: text,
+            output_data: parse_output_data(text, output_schema: output_schema),
+            usage: usage_from_jsonl(stdout),
+            model: model
+          )
+        end
+      end
+
+      private
+      def exec_command(model:, schema_file:, output_file:)
+        args = [ command, "exec", "--json" ]
+        args += [ "--sandbox", sandbox.to_s ] if sandbox
+        args += [ "--model", model.to_s ] unless model.to_s.empty? || model.to_s == "codex"
+        args += [ "--output-schema", schema_file ] if schema_file
+        args += [ "-o", output_file, "-" ]
+        args
+      end
+
+      def prompt_for(messages:, instructions:)
+        parts = []
+        parts << "System instructions:\n#{instructions}" unless instructions.to_s.empty?
+        Array(messages).each do |message|
+          attrs = message.respond_to?(:to_h) ? message.to_h : message
+          attrs = attrs.transform_keys(&:to_s)
+          role = attrs["role"] || "user"
+          content = attrs["content"] || attrs["text"] || ""
+          parts << "#{role}:\n#{content}"
+        end
+        parts.join("\n\n")
+      end
+
+      def with_tempfiles(output_schema:)
+        output_file = Tempfile.new([ "turnkit-codex-output", ".txt" ])
+        schema_file = nil
+        if output_schema
+          schema_file = Tempfile.new([ "turnkit-codex-schema", ".json" ])
+          schema_file.write(JSON.generate(output_schema))
+          schema_file.flush
+        end
+
+        yield schema_file, output_file
+      ensure
+        schema_file&.close!
+        output_file&.close!
+      end
+
+      def read_output(output_file, stdout)
+        output_file.rewind
+        text = output_file.read.to_s
+        return text unless text.empty?
+
+        final_message_from_jsonl(stdout) || stdout.to_s
+      end
+
+      def final_message_from_jsonl(stdout)
+        events = parse_jsonl(stdout)
+        messages = events.filter_map do |event|
+          item = event["item"]
+          next unless item.is_a?(Hash) && item["type"] == "agent_message"
+
+          item["text"]
+        end
+        messages.last
+      end
+
+      def parse_output_data(text, output_schema:)
+        return nil unless output_schema
+
+        JSON.parse(text)
+      rescue JSON::ParserError
+        nil
+      end
+
+      def usage_from_jsonl(stdout)
+        usage = parse_jsonl(stdout).filter_map { |event| event["usage"] if event.is_a?(Hash) }.last || {}
+        input = usage["input_tokens"].to_i
+        cached = usage["cached_input_tokens"].to_i
+        Usage.new(
+          input_tokens: [ input - cached, 0 ].max,
+          output_tokens: usage["output_tokens"].to_i,
+          cached_tokens: cached,
+          thinking_tokens: usage["reasoning_output_tokens"].to_i
+        )
+      end
+
+      def emit_codex_events(stdout, on_event:)
+        return unless on_event
+
+        parse_jsonl(stdout).each do |event|
+          on_event.call(type: "codex.#{event.fetch("type", "event")}", payload: event)
+        end
+      end
+
+      def parse_jsonl(stdout)
+        stdout.to_s.each_line.filter_map do |line|
+          JSON.parse(line)
+        rescue JSON::ParserError
+          nil
+        end
+      end
+
+      def executable?(name)
+        return true if @runner != method(:run_command)
+        return File.executable?(name) if name.include?(File::SEPARATOR)
+
+        ENV.fetch("PATH", "").split(File::PATH_SEPARATOR).any? { |path| File.executable?(File.join(path, name)) }
+      end
+
+      def run_command(command, stdin_data:, chdir:)
+        stdout, stderr, status = Open3.capture3(*command, stdin_data: stdin_data, chdir: chdir)
+        [ stdout, stderr, status ]
+      end
+    end
+  end
+end
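
The adapter's token accounting can be tried in isolation. The sketch below re-implements the JSONL parsing and the cached/uncached split from `usage_from_jsonl` as standalone functions. `usage_summary` is a hypothetical helper name, and the sample event lines (including the `turn.completed` type name) are assumptions made for illustration rather than documented Codex CLI output.

```ruby
require "json"

# Parse newline-delimited JSON, skipping lines that are not valid JSON
# (for example, stray log output mixed into stdout).
def parse_jsonl(stdout)
  stdout.to_s.each_line.filter_map do |line|
    JSON.parse(line)
  rescue JSON::ParserError
    nil
  end
end

# Take the last event that carries a "usage" hash and split input tokens
# into uncached and cached counts, mirroring the adapter diff above.
def usage_summary(stdout)
  usage = parse_jsonl(stdout).filter_map { |event| event["usage"] if event.is_a?(Hash) }.last || {}
  input = usage["input_tokens"].to_i
  cached = usage["cached_input_tokens"].to_i

  {
    input_tokens: [input - cached, 0].max, # never report a negative count
    cached_tokens: cached,
    output_tokens: usage["output_tokens"].to_i
  }
end

# Sample stdout; the event shapes here are assumed for the example.
sample = <<~JSONL
  {"type":"item.completed","item":{"type":"agent_message","text":"done"}}
  not json at all
  {"type":"turn.completed","usage":{"input_tokens":1200,"cached_input_tokens":900,"output_tokens":150}}
JSONL

p usage_summary(sample)
```

Subtracting cached tokens from the input count keeps the two figures disjoint, so a cost estimator that prices cached and uncached input differently does not count the same tokens twice.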