turnkit 0.2.9 → 0.2.10

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: a7069b120432ec902d846961157f5635c946602a8298ed4471f09dde3e3e3e0d
4
- data.tar.gz: '09a5d64ff294f89ebde99a6cf1d36dc8731c6cabbf06216d4e9b9551cbe88a1e'
3
+ metadata.gz: 268561a36c656098e1d23ea6de4c17616358ff931e05e1389e707a9e28fe458b
4
+ data.tar.gz: 8f6731d78fed5b3e3cc94d781c4f4e26accc4f8d05842b5c56eb58a6e7448907
5
5
  SHA512:
6
- metadata.gz: de794838f5979194aa2469890848eb7cd60932d6f223e95d17be4d8912a6f2777afb55143f9776d7093be2072451c4a7ba0aa83ca8783c82a29375da56a11c90
7
- data.tar.gz: c037fb4946a252ebf9bb2e0f99b76cca23d60f29275ce4e07a15f71232d4fdc0dce23337ad1b4b47bacd7df50ca7eedd3cf050c82167bcccb30debaa70cdfe22
6
+ metadata.gz: ae0a246b5937e586c808a25d28f051bafc54c2a922a52d89160eb3f5ef3bf7360b1d637cbb0c170d41eb74cd536638b6f9a1880275bd0ccd2fc8dcb4ac44db5c
7
+ data.tar.gz: 7ffebcfeadf51f193c7f2277a0842c2f56e00d9ff95d502915924f2a6d7e10744a0a710d1d2f5b1865182a9de21b2cce30edc3e94c16f49626912b93b1fc7063
data/CHANGELOG.md CHANGED
@@ -1,5 +1,12 @@
1
1
  # Changelog
2
2
 
3
+ ## 0.2.10 - 2026-06-10
4
+
5
+ - Add output audits and file-backed output policies for validating final run output.
6
+ - Add per-tool execution limits and explicit budget errors.
7
+ - Improve workflow event callbacks, model telemetry events, and compaction usage accounting.
8
+ - Add an Amazon memo writer example and batched page reading in the workflow researcher example.
9
+
3
10
  ## 0.2.9 - 2026-06-08
4
11
 
5
12
  - Add `TurnKit::Workflow` for reusable single-orchestrator task runtimes with workflow skills, tools, guardrails, compaction, and run monitoring.
data/README.md CHANGED
@@ -5,7 +5,7 @@
5
5
  [![License](https://img.shields.io/badge/license-MIT-green.svg)](LICENSE.md)
6
6
 
7
7
  Build durable Ruby and Rails agents with conversations, runs, workflows, tools,
8
- skills, sub-agents, and persistence.
8
+ skills, output audits, sub-agents, and persistence.
9
9
 
10
10
  ## Installation
11
11
 
@@ -65,6 +65,11 @@ For runnable, API-key-free examples of the three core entry points, see
65
65
  - agent run: one bounded application task;
66
66
  - workflow: reusable task runner with skills, tools, and limits.
67
67
 
68
+ For fuller workflow examples, see:
69
+
70
+ - [`examples/workflow_researcher`](examples/workflow_researcher): source-grounded research with web tools, batch reads, per-tool limits, and deep monitoring;
71
+ - [`examples/amazon_memo_writer`](examples/amazon_memo_writer): strict memo generation with research tools, a structured terminal submit tool, deterministic format checks, and an LLM output policy.
72
+
68
73
  ### Models
69
74
 
70
75
  Set a model:
@@ -208,6 +213,10 @@ workflow = TurnKit::Workflow.new(
208
213
  max_spend: 0.25,
209
214
  max_iterations: 12,
210
215
  max_tool_executions: 25,
216
+ max_tool_executions_by_name: {
217
+ web_search: 2,
218
+ read_web_page: 8
219
+ },
211
220
  compaction: {
212
221
  context_limit: 64_000,
213
222
  threshold: 0.75
@@ -253,6 +262,10 @@ Prompt caching and compaction solve different problems:
253
262
  - budgets (`max_spend`, `max_iterations`, `max_tool_executions`) keep autonomous
254
263
  loops bounded.
255
264
 
265
+ Use `max_tool_executions_by_name` when a workflow needs different budgets for
266
+ different tools. For example, allow many cheap reads but only one final submit
267
+ tool, or cap web searches while allowing a batch page reader.
268
+
256
269
  Reach for separate agents and `sub_agents` only when the isolation is worth the
257
270
  extra model calls, such as different models, different tool permissions,
258
271
  parallel specialist review, or separate durable child conversations.
@@ -289,6 +302,71 @@ class SaveBrief < TurnKit::Tool
289
302
  end
290
303
  ```
291
304
 
305
+ ### Output audits and policies
306
+
307
+ Use output audits for deterministic checks that should not depend on another
308
+ model call: required headings, source counts, forbidden characters, JSON shape,
309
+ or project-specific formatting rules.
310
+
311
+ ```ruby
312
+ no_em_dash = ->(output) do
313
+ next unless output.include?("—")
314
+
315
+ { rule: "no_em_dash", message: "contains an em dash" }
316
+ end
317
+
318
+ numbered_lists_only = ->(output) do
319
+ lines = output.lines.each_with_index.filter_map do |line, index|
320
+ index + 1 if line.match?(/^\s*[-*]\s+/)
321
+ end
322
+
323
+ next if lines.empty?
324
+
325
+ {
326
+ rule: "numbered_lists_only",
327
+ message: "contains unordered list markers",
328
+ metadata: { lines: lines }
329
+ }
330
+ end
331
+
332
+ workflow = TurnKit::Workflow.new(
333
+ name: "memo_writer",
334
+ output_audit: [no_em_dash, numbered_lists_only],
335
+ output_audit_mode: :fail
336
+ )
337
+ ```
338
+
339
+ Run checks directly when you want to test a renderer or policy without calling a
340
+ model:
341
+
342
+ ```ruby
343
+ audit = TurnKit.audit_output(
344
+ "1. Recommendation\n- unordered item — fix this\n",
345
+ constraints: [no_em_dash, numbered_lists_only]
346
+ )
347
+
348
+ puts audit.clean?
349
+ puts audit.messages
350
+ ```
351
+
352
+ Use `output_policy` when a semantic judge is worth the extra model call. The
353
+ policy can be a `.md`, `.markdown`, or `.txt` file path, a `TurnKit::OutputPolicy`,
354
+ or any object that responds to `#call` or `#check`.
355
+
356
+ ```ruby
357
+ workflow = TurnKit::Workflow.new(
358
+ name: "memo_writer",
359
+ output_policy: "app/ai/policies/amazon_memo.md",
360
+ output_policy_model: "gpt-4.1-mini",
361
+ output_policy_thinking: { effort: :low },
362
+ output_policy_mode: :report
363
+ )
364
+ ```
365
+
366
+ `output_policy_mode: :report` records violations while allowing the run to
367
+ complete. `:fail` marks the run failed after recording the output and audit.
368
+ Policy model usage and cost are counted on the parent run.
369
+
292
370
  ### Prompt Preview
293
371
 
294
372
  Preview a pending turn:
@@ -326,9 +404,7 @@ class SaveReport < TurnKit::Tool
326
404
  parameter :title, :string, required: true
327
405
  parameter :body, :string, required: true
328
406
 
329
- def self.ends_turn? = true
330
-
331
- def self.completion_message(result)
407
+ terminal! do |result|
332
408
  "Saved #{result.fetch("report_id")}."
333
409
  end
334
410
 
@@ -544,9 +620,12 @@ TurnKit.reconcile_stale!
544
620
  | `TurnKit.max_iterations` | Limit model loop iterations. |
545
621
  | `TurnKit.max_depth` | Limit sub-agent depth. |
546
622
  | `TurnKit.max_tool_executions` | Limit tool calls per turn. |
623
+ | `TurnKit.max_tool_executions_by_name` | Limit specific tools independently. |
547
624
  | `TurnKit.timeout` | Limit turn runtime. |
548
625
  | `TurnKit.max_spend` | Limit estimated turn cost. |
549
626
  | `TurnKit.compaction` | Configure context compaction. |
627
+ | `TurnKit.output_policy_model` | Default model for file-backed output policies. |
628
+ | `TurnKit.output_policy_thinking` | Default thinking config for file-backed output policies. |
550
629
  | `TurnKit.on_event` | Subscribe to lifecycle events. |
551
630
 
552
631
  Set options globally:
@@ -555,6 +634,8 @@ Set options globally:
555
634
  TurnKit.default_model = "gpt-4.1-mini"
556
635
  TurnKit.max_spend = 0.25
557
636
  TurnKit.max_iterations = 25
637
+ TurnKit.max_tool_executions_by_name = { web_search: 2 }
638
+ TurnKit.output_policy_model = "gpt-4.1-mini"
558
639
  TurnKit.timeout = 300
559
640
  ```
560
641
 
data/lib/turnkit/agent.rb CHANGED
@@ -3,13 +3,14 @@
3
3
  module TurnKit
4
4
  class Agent
5
5
  attr_reader :name, :description, :model, :instructions, :tools, :skills, :available_skills, :sub_agents
6
- attr_reader :client, :store, :max_iterations, :timeout, :cost_limit, :max_depth, :max_tool_executions
6
+ attr_reader :client, :store, :max_iterations, :timeout, :cost_limit, :max_depth, :max_tool_executions, :max_tool_executions_by_name
7
7
  attr_reader :prompt_sections, :system_prompt, :prompt_mode, :thinking, :compaction, :output_schema, :on_event
8
+ attr_reader :output_audit, :output_audit_mode, :output_policy_model
8
9
 
9
10
  def initialize(name:, description: "", model: nil, instructions: "", tools: [], skills: [], available_skills: [], sub_agents: [],
10
11
  system_prompt: nil, prompt_sections: nil, prompt_mode: nil, client: nil, store: nil,
11
- max_iterations: nil, timeout: nil, cost_limit: nil, max_depth: nil, max_tool_executions: nil, thinking: nil, compaction: nil,
12
- output_schema: nil, on_event: nil)
12
+ max_iterations: nil, timeout: nil, cost_limit: nil, max_depth: nil, max_tool_executions: nil, max_tool_executions_by_name: nil, thinking: nil, compaction: nil,
13
+ output_schema: nil, output_audit: nil, output_audit_mode: nil, output_policy: nil, output_policy_mode: nil, output_policy_model: nil, output_policy_thinking: nil, on_event: nil)
13
14
  @name = name.to_s
14
15
  @description = description.to_s
15
16
  @model = model
@@ -28,9 +29,13 @@ module TurnKit
28
29
  @cost_limit = cost_limit
29
30
  @max_depth = max_depth
30
31
  @max_tool_executions = max_tool_executions
32
+ @max_tool_executions_by_name = max_tool_executions_by_name
31
33
  @thinking = self.class.normalize_thinking(thinking)
32
34
  @compaction = compaction
33
35
  @output_schema = output_schema
36
+ @output_policy_model = output_policy_model
37
+ @output_audit = normalize_output_policy_options(output_audit: output_audit, output_policy: output_policy, output_policy_model: output_policy_model, output_policy_thinking: output_policy_thinking)
38
+ @output_audit_mode = normalize_output_policy_mode(output_audit_mode: output_audit_mode, output_policy_mode: output_policy_mode)
34
39
  @on_event = on_event
35
40
  raise ArgumentError, "name is required" if @name.empty?
36
41
  validate_tools!
@@ -94,6 +99,18 @@ module TurnKit
94
99
  thinking
95
100
  end
96
101
 
102
+ def effective_output_audit
103
+ Array(output_audit).compact
104
+ end
105
+
106
+ def output_policy
107
+ output_audit
108
+ end
109
+
110
+ def output_policy_mode
111
+ output_audit_mode
112
+ end
113
+
97
114
  def effective_client
98
115
  client || TurnKit.client
99
116
  end
@@ -143,6 +160,7 @@ module TurnKit
143
160
  timeout: timeout || TurnKit.timeout,
144
161
  max_depth: max_depth || TurnKit.max_depth,
145
162
  max_tool_executions: max_tool_executions || TurnKit.max_tool_executions,
163
+ max_tool_executions_by_name: max_tool_executions_by_name || TurnKit.max_tool_executions_by_name,
146
164
  cost_limit: cost_limit || TurnKit.cost_limit,
147
165
  root_started_at: root_started_at
148
166
  )
@@ -170,6 +188,53 @@ module TurnKit
170
188
  effective_tools.each(&:validate_definition!)
171
189
  end
172
190
 
191
+ def normalize_output_policy_options(output_audit:, output_policy:, output_policy_model:, output_policy_thinking:)
192
+ raise ArgumentError, "use output_policy: or output_audit:, not both" if output_audit && output_policy
193
+
194
+ output_policy.nil? ? output_audit : normalize_output_policy(output_policy, model: output_policy_model, thinking: output_policy_thinking)
195
+ end
196
+
197
+ def normalize_output_policy(value, model: nil, thinking: nil)
198
+ case value
199
+ when nil
200
+ nil
201
+ when Array
202
+ value.map { |item| normalize_output_policy(item, model: model, thinking: thinking) }.compact
203
+ when String
204
+ output_policy_from_path(value, model: model, thinking: thinking)
205
+ when Pathname
206
+ output_policy_from_path(value.to_s, model: model, thinking: thinking)
207
+ else
208
+ return value if value.respond_to?(:call) || value.respond_to?(:check)
209
+
210
+ raise ArgumentError, "output_policy must be a policy file path, a #call/#check object, or an array of those"
211
+ end
212
+ end
213
+
214
+ def output_policy_from_path(path, model: nil, thinking: nil)
215
+ unless path.match?(/\.(md|markdown|txt)\z/i)
216
+ raise ArgumentError, "output_policy string must be a .md, .markdown, or .txt file path"
217
+ end
218
+
219
+ TurnKit::OutputPolicy.from_file(
220
+ path,
221
+ model: model || TurnKit.output_policy_model,
222
+ thinking: thinking || TurnKit.output_policy_thinking
223
+ )
224
+ end
225
+
226
+ def normalize_output_policy_mode(output_audit_mode:, output_policy_mode:)
227
+ if output_audit_mode && output_policy_mode && output_audit_mode.to_sym != output_policy_mode.to_sym
228
+ raise ArgumentError, "use output_policy_mode: or output_audit_mode:, not both"
229
+ end
230
+
231
+ value = output_policy_mode || output_audit_mode || :report
232
+ mode = value.to_sym
233
+ raise ArgumentError, "unknown output_policy_mode: #{value}" unless %i[report fail].include?(mode)
234
+
235
+ mode
236
+ end
237
+
173
238
  def task_message(task, input)
174
239
  text = task.to_s
175
240
  return text if input.nil?
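
The mode handling in `normalize_output_policy_mode` above can be tried in isolation. What follows is a minimal standalone sketch, not the gem's code: `resolve_mode` is a hypothetical helper name that mirrors the diffed logic, where either keyword is accepted, conflicting values are rejected, and `:report` is the default.

```ruby
# Hypothetical standalone helper mirroring the mode resolution in the hunk above.
# It is not part of TurnKit; the name resolve_mode is an assumption for illustration.
def resolve_mode(output_audit_mode: nil, output_policy_mode: nil)
  # Both keywords may be given only when they agree after symbolizing.
  if output_audit_mode && output_policy_mode && output_audit_mode.to_sym != output_policy_mode.to_sym
    raise ArgumentError, "use output_policy_mode: or output_audit_mode:, not both"
  end

  value = output_policy_mode || output_audit_mode || :report
  mode = value.to_sym
  raise ArgumentError, "unknown output_policy_mode: #{value}" unless %i[report fail].include?(mode)

  mode
end

p resolve_mode                                                        # falls back to the default
p resolve_mode(output_audit_mode: "fail")                             # strings are symbolized
p resolve_mode(output_audit_mode: :fail, output_policy_mode: "fail")  # same value under both keywords
```

Passing two different modes, or a value other than `report` or `fail`, raises `ArgumentError` in this sketch, matching the two `raise` lines in the hunk.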
@@ -2,32 +2,40 @@
2
2
 
3
3
  module TurnKit
4
4
  class Budget
5
- attr_reader :root_started_at, :max_iterations, :timeout, :max_depth, :max_tool_executions, :cost_limit
5
+ attr_reader :root_started_at, :max_iterations, :timeout, :max_depth, :max_tool_executions, :max_tool_executions_by_name, :cost_limit
6
6
 
7
- def initialize(max_iterations:, timeout:, max_depth:, max_tool_executions:, cost_limit: nil, root_started_at: Clock.now)
7
+ def initialize(max_iterations:, timeout:, max_depth:, max_tool_executions:, max_tool_executions_by_name: {}, cost_limit: nil, root_started_at: Clock.now)
8
8
  @root_started_at = root_started_at
9
9
  @max_iterations = max_iterations
10
10
  @timeout = timeout
11
11
  @max_depth = max_depth
12
12
  @max_tool_executions = max_tool_executions
13
+ @max_tool_executions_by_name = normalize_tool_limits(max_tool_executions_by_name)
13
14
  @cost_limit = cost_limit
14
15
  @iterations = 0
15
16
  @tool_executions = 0
17
+ @tool_executions_by_name = Hash.new(0)
16
18
  @cost = 0
17
19
  @mutex = Mutex.new
18
20
  end
19
21
 
20
22
  def count_iteration!
21
23
  @mutex.synchronize do
24
+ raise BudgetError, "maximum iterations reached" if max_iterations && @iterations >= max_iterations
25
+
22
26
  @iterations += 1
23
- raise Error, "maximum iterations reached" if max_iterations && @iterations > max_iterations
24
27
  end
25
28
  end
26
29
 
27
- def count_tool_execution!
30
+ def count_tool_execution!(name = nil)
28
31
  @mutex.synchronize do
32
+ key = name.to_s if name
33
+ limit = max_tool_executions_by_name[key] if key
34
+ raise BudgetError, "maximum tool executions reached" if max_tool_executions && @tool_executions >= max_tool_executions
35
+ raise BudgetError, "maximum executions reached for tool #{key}" if limit && @tool_executions_by_name[key] >= limit
36
+
29
37
  @tool_executions += 1
30
- raise Error, "maximum tool executions reached" if max_tool_executions && @tool_executions > max_tool_executions
38
+ @tool_executions_by_name[key] += 1 if key
31
39
  end
32
40
  end
33
41
 
@@ -40,13 +48,20 @@ module TurnKit
40
48
 
41
49
  @mutex.synchronize do
42
50
  @cost += cost.to_f
43
- raise Error, "cost limit reached" if @cost > cost_limit
51
+ raise BudgetError, "cost limit reached" if @cost > cost_limit
44
52
  end
45
53
  end
46
54
 
47
55
  def check!(depth:)
48
- raise Error, "maximum sub-agent depth reached" if max_depth && depth > max_depth
49
- raise Error, "turn timed out" if timeout && Clock.now >= root_started_at + timeout
56
+ raise BudgetError, "maximum sub-agent depth reached" if max_depth && depth > max_depth
57
+ raise BudgetError, "turn timed out" if timeout && Clock.now >= root_started_at + timeout
50
58
  end
59
+
60
+ private
61
+ def normalize_tool_limits(value)
62
+ value.to_h.transform_keys(&:to_s).transform_values do |limit|
63
+ limit.nil? ? nil : Integer(limit)
64
+ end
65
+ end
51
66
  end
52
67
  end
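
The budget hunk above moves each limit check ahead of the counter increment and adds a per-name counter. The sketch below reproduces that check-before-increment shape in a self-contained class. `ToolBudget`, `LimitReached`, and `used` are hypothetical names chosen for illustration, not TurnKit classes or methods.

```ruby
# Standalone sketch of check-before-increment tool budgets.
# ToolBudget and LimitReached are hypothetical stand-ins, not TurnKit constants.
class LimitReached < StandardError; end

class ToolBudget
  def initialize(total: nil, by_name: {})
    @total = total
    @by_name = by_name.to_h.transform_keys(&:to_s) # accept symbol or string tool names
    @count = 0
    @counts = Hash.new(0)
    @mutex = Mutex.new
  end

  # Raises before counting, so a denied call never consumes budget.
  def count!(name)
    @mutex.synchronize do
      key = name.to_s
      limit = @by_name[key]
      raise LimitReached, "maximum tool executions reached" if @total && @count >= @total
      raise LimitReached, "maximum executions reached for tool #{key}" if limit && @counts[key] >= limit

      @count += 1
      @counts[key] += 1
    end
  end

  def used(name)
    @counts[name.to_s]
  end
end

budget = ToolBudget.new(total: 10, by_name: { web_search: 2 })
2.times { budget.count!(:web_search) }

begin
  budget.count!("web_search") # third call exceeds the per-name limit
rescue LimitReached => error
  puts error.message
end

budget.count!(:read_web_page) # tools without a per-name limit only count against the total
```

One consequence of checking first is that the per-name counter stays at the limit after a denial instead of overshooting it, which keeps the recorded counts equal to the number of executions that were actually allowed.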
@@ -117,6 +117,8 @@ module TurnKit
117
117
  return unless force || over_threshold?(messages, policy)
118
118
 
119
119
  compact!(turn.conversation, agent: turn.agent, turn: turn, focus: focus, auto: true, overrides: policy, force: true)
120
+ rescue BudgetError
121
+ raise
120
122
  rescue StandardError => error
121
123
  TurnKit.logger&.warn("TurnKit compaction failed: #{error.class}: #{error.message}")
122
124
  nil
@@ -144,12 +146,15 @@ module TurnKit
144
146
  target_tokens: summary_budget(selected_tokens, policy),
145
147
  fallback_model: turn&.model || conversation.model || agent.effective_model,
146
148
  conversation_id: conversation.id,
147
- turn_id: turn&.id
149
+ turn_id: turn&.id,
150
+ turn: turn
148
151
  )
149
152
 
150
153
  append_summary(conversation, turn: turn, summary: summary, selected: selected, policy: policy, focus: focus, auto: auto, input_tokens: selected_tokens)
151
154
  rescue CompactionError
152
155
  raise
156
+ rescue BudgetError
157
+ raise
153
158
  rescue StandardError => error
154
159
  raise CompactionError, "#{error.class}: #{error.message}"
155
160
  end
@@ -350,18 +355,24 @@ module TurnKit
350
355
  index
351
356
  end
352
357
 
353
- def generate_summary(agent:, policy:, messages:, previous_summary:, focus:, target_tokens:, fallback_model:, conversation_id:, turn_id:)
358
+ def generate_summary(agent:, policy:, messages:, previous_summary:, focus:, target_tokens:, fallback_model:, conversation_id:, turn_id:, turn: nil)
354
359
  client = policy["client"] || agent.effective_client
355
360
  model = policy["model"] || fallback_model
356
361
  safe_messages = messages.map { |message| sanitize_message(message, policy) }
357
362
  prompt = build_prompt(previous_summary: previous_summary, focus: focus, target_tokens: target_tokens)
358
- result = client.chat(
363
+ attrs = {
359
364
  model: model,
360
365
  messages: MessageProjection.for(safe_messages) + [ { role: :user, content: prompt } ],
361
366
  tools: [],
362
367
  instructions: COMPACTION_SYSTEM_PROMPT,
363
368
  metadata: { compaction: true, conversation_id: conversation_id, turn_id: turn_id }
364
- )
369
+ }
370
+ result = if turn
371
+ turn.internal_model_call(**attrs, purpose: "compaction", client: policy["client"])
372
+ else
373
+ client.validate!(model: model)
374
+ client.chat(**attrs)
375
+ end
365
376
  text = result.text.to_s.strip
366
377
  raise CompactionError, "compaction model returned an empty summary" if text.empty?
367
378
 
data/lib/turnkit/error.rb CHANGED
@@ -2,6 +2,7 @@
2
2
 
3
3
  module TurnKit
4
4
  class Error < StandardError; end
5
+ class BudgetError < Error; end
5
6
  class ConfigError < Error; end
6
7
  class CompactionError < Error; end
7
8
  class ModelAccessError < ConfigError; end
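
Because `BudgetError` inherits from `Error`, and therefore from `StandardError`, a broad `rescue StandardError` would swallow it. The compaction hunks above place `rescue BudgetError` with a bare `raise` ahead of the broad clause for that reason. A minimal sketch of that ordering follows, using local stand-in classes under a hypothetical `Sketch` module so nothing collides with the real constants.

```ruby
# Local stand-ins for the hierarchy in the hunk above; Sketch and best_effort
# are hypothetical names used only for this illustration.
module Sketch
  class Error < StandardError; end
  class BudgetError < Error; end

  # Returns nil on ordinary failures but lets budget errors propagate,
  # because Ruby uses the first rescue clause that matches.
  def self.best_effort
    yield
  rescue BudgetError
    raise
  rescue StandardError => error
    warn "ignored: #{error.class}: #{error.message}"
    nil
  end
end

Sketch.best_effort { raise "transient failure" } # logged to stderr and swallowed

begin
  Sketch.best_effort { raise Sketch::BudgetError, "cost limit reached" }
rescue Sketch::Error => error
  puts "propagated: #{error.message}"
end
```

Callers that previously rescued the base error class still catch the new subclass, as the final `rescue Sketch::Error` shows.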
@@ -0,0 +1,92 @@
1
+ # frozen_string_literal: true
2
+
3
+ module TurnKit
4
+ class OutputAudit
5
+ Violation = Struct.new(:rule, :message, :metadata, keyword_init: true) do
6
+ def to_h
7
+ { "rule" => rule.to_s, "message" => message.to_s, "metadata" => metadata || {} }
8
+ end
9
+ end
10
+
11
+ Result = Struct.new(:violations, keyword_init: true) do
12
+ def clean?
13
+ violations.empty?
14
+ end
15
+
16
+ def messages
17
+ violations.map(&:message)
18
+ end
19
+
20
+ def to_h
21
+ { "clean" => clean?, "violations" => violations.map(&:to_h) }
22
+ end
23
+ end
24
+
25
+ def self.check(output, constraints: [], context: {})
26
+ new(output, constraints: constraints, context: context).check
27
+ end
28
+
29
+ def initialize(output, constraints: [], context: {})
30
+ @output = output
31
+ @constraints = Array(constraints)
32
+ @context = context || {}
33
+ end
34
+
35
+ def check
36
+ Result.new(violations: constraints.flat_map { |constraint| normalize(check_constraint(constraint)) })
37
+ end
38
+
39
+ private
40
+ attr_reader :output, :constraints, :context
41
+
42
+ def check_constraint(constraint)
43
+ if constraint.respond_to?(:check)
44
+ call_with_optional_context(constraint.method(:check))
45
+ elsif constraint.respond_to?(:call)
46
+ callable = constraint.is_a?(Proc) ? constraint : constraint.method(:call)
47
+ call_with_optional_context(callable)
48
+ else
49
+ raise ArgumentError, "output constraints must respond to #call or #check"
50
+ end
51
+ end
52
+
53
+ def call_with_optional_context(method)
54
+ parameters = method.parameters
55
+ return method.call(output) unless parameters.any? { |kind, _| %i[key keyreq keyrest].include?(kind) }
56
+ return method.call(output, **context) if parameters.any? { |kind, _| kind == :keyrest }
57
+
58
+ accepted = parameters.filter_map { |kind, name| name if %i[key keyreq].include?(kind) }
59
+ method.call(output, **context.slice(*accepted))
60
+ end
61
+
62
+ def normalize(value)
63
+ case value
64
+ when nil, false, true
65
+ []
66
+ when Violation
67
+ [ value ]
68
+ when Result
69
+ value.violations
70
+ when String
71
+ [ Violation.new(rule: "output_constraint", message: value, metadata: {}) ]
72
+ when Hash
73
+ [ violation_from_hash(value) ]
74
+ else
75
+ if value.respond_to?(:to_ary)
76
+ value.to_ary.flat_map { |item| normalize(item) }
77
+ else
78
+ raise ArgumentError, "output constraint returned unsupported value: #{value.class}"
79
+ end
80
+ end
81
+ end
82
+
83
+ def violation_from_hash(value)
84
+ attrs = value.transform_keys(&:to_s)
85
+ Violation.new(
86
+ rule: attrs["rule"] || "output_constraint",
87
+ message: attrs["message"] || attrs["error"] || "output constraint failed",
88
+ metadata: attrs["metadata"] || attrs.reject { |key, _| %w[rule message error].include?(key) }
89
+ )
90
+ end
91
+ end
92
+ end
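
`call_with_optional_context` in the file above inspects a callable's declared parameters and forwards only the context keywords it accepts. The sketch below isolates that idea as a free function. `call_with_context` is a hypothetical name, and the context values are placeholders rather than real TurnKit objects.

```ruby
# Standalone sketch of passing only the context keywords a callable declares.
# call_with_context is a hypothetical helper, not a TurnKit API.
def call_with_context(callable, output, context)
  params = callable.parameters
  takes_keywords = params.any? { |kind, _| %i[key keyreq keyrest].include?(kind) }
  return callable.call(output) unless takes_keywords
  return callable.call(output, **context) if params.any? { |kind, _| kind == :keyrest }

  accepted = params.filter_map { |kind, name| name if %i[key keyreq].include?(kind) }
  callable.call(output, **context.slice(*accepted))
end

context = { turn: :turn_stub, output_text: "hello" } # placeholder values

plain  = ->(output) { output.length }                            # no keywords: gets the output only
picky  = ->(output, output_text:) { "#{output}/#{output_text}" } # gets just the keyword it names
greedy = ->(output, **rest) { rest.keys }                        # **rest receives the whole context

p call_with_context(plain, "hello", context)
p call_with_context(picky, "hello", context)
p call_with_context(greedy, "hello", context)
```

This lets simple one-argument lambdas, like the README's `no_em_dash` check, coexist with constraints that want the turn or the raw output data, without either form raising an argument error.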
@@ -0,0 +1,121 @@
1
+ # frozen_string_literal: true
2
+
3
+ module TurnKit
4
+ class OutputPolicy
5
+ DEFAULT_SCHEMA = {
6
+ type: "object",
7
+ properties: {
8
+ approved: { type: "boolean" },
9
+ violations: {
10
+ type: "array",
11
+ items: {
12
+ type: "object",
13
+ properties: {
14
+ rule: { type: "string" },
15
+ message: { type: "string" }
16
+ },
17
+ required: [ "rule", "message" ]
18
+ }
19
+ }
20
+ },
21
+ required: [ "approved", "violations" ]
22
+ }.freeze
23
+
24
+ attr_reader :name, :content, :model, :thinking, :client
25
+
26
+ def self.from_file(path, name: nil, **options)
27
+ new(name: name || File.basename(path, File.extname(path)), content: File.read(path), **options)
28
+ end
29
+
30
+ def initialize(content:, name: "output_policy", model: nil, thinking: nil, client: nil)
31
+ @name = name.to_s
32
+ @content = content.to_s
33
+ @model = model
34
+ @thinking = Agent.normalize_thinking(thinking)
35
+ @client = client
36
+ raise ArgumentError, "content is required" if @content.empty?
37
+ end
38
+
39
+ def call(output, run: nil, turn: nil)
40
+ model_name = model || turn&.model || run&.turn&.model || TurnKit.default_model
41
+ result = if turn
42
+ turn.internal_model_call(
43
+ model: model_name,
44
+ messages: audit_messages(output),
45
+ tools: [],
46
+ instructions: audit_instructions,
47
+ thinking: thinking,
48
+ output_schema: DEFAULT_SCHEMA,
49
+ metadata: { output_policy: name },
50
+ purpose: "output_policy",
51
+ client: client
52
+ )
53
+ else
54
+ audit_client = client || TurnKit.client
55
+ audit_client.validate!(model: model_name)
56
+ chat(audit_client, model: model_name, messages: audit_messages(output), tools: [], instructions: audit_instructions, thinking: thinking, output_schema: DEFAULT_SCHEMA, metadata: { output_policy: name })
57
+ end
58
+ data = result.output_data || parse_json(result.text)
59
+ return if data.fetch("approved", false)
60
+
61
+ Array(data["violations"]).map do |violation|
62
+ attrs = violation.transform_keys(&:to_s)
63
+ OutputAudit::Violation.new(
64
+ rule: attrs["rule"] || name,
65
+ message: attrs["message"] || "output policy failed",
66
+ metadata: attrs.reject { |key, _| %w[rule message].include?(key) }
67
+ )
68
+ end
69
+ end
70
+
71
+ private
72
+ def audit_instructions
73
+ <<~TEXT
74
+ You audit model outputs against the policy below.
75
+
76
+ Return only a JSON object matching this shape:
77
+ {"approved":true,"violations":[]}
78
+
79
+ Set approved to true only when the output satisfies the policy. For each violation, include a concise rule and message. Do not repair the output. Do not wrap the JSON in Markdown. Do not include commentary before or after the JSON.
80
+
81
+ Policy:
82
+ #{content}
83
+ TEXT
84
+ end
85
+
86
+ def audit_messages(output)
87
+ [ { role: :user, content: JSON.generate(output: output) } ]
88
+ end
89
+
90
+ def chat(client, **kwargs)
91
+ accepted = chat_keyword_names(client)
92
+ kwargs = kwargs.slice(*accepted) unless accepted.include?(:keyrest)
93
+ client.chat(**kwargs)
94
+ end
95
+
96
+ def chat_keyword_names(client)
97
+ client.method(:chat).parameters.filter_map do |kind, name|
98
+ return [ :keyrest ] if kind == :keyrest
99
+
100
+ name if %i[key keyreq].include?(kind)
101
+ end
102
+ end
103
+
104
+ def parse_json(value)
105
+ JSON.parse(extract_json(value.to_s))
106
+ rescue JSON::ParserError
107
+ { "approved" => false, "violations" => [ { "rule" => name, "message" => "output policy returned invalid JSON" } ] }
108
+ end
109
+
110
+ def extract_json(value)
111
+ text = value.strip
112
+ return text if text.start_with?("{") && text.end_with?("}")
113
+
114
+ fenced = text[/```(?:json)?\s*(\{.*?\})\s*```/m, 1]
115
+ return fenced if fenced
116
+
117
+ object = text[/\{.*\}/m]
118
+ object || text
119
+ end
120
+ end
121
+ end
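
The private `parse_json` and `extract_json` methods above recover a JSON verdict from a model reply in three steps: a bare object, a fenced object, then the widest brace-delimited span. A standalone sketch of the same steps is below. `extract_json_object` and `parse_verdict` are hypothetical free functions, and the fallback rule name `"sketch"` is a placeholder for the policy name the gem uses.

```ruby
require "json"

# Standalone sketch of the JSON recovery steps in the file above.
# extract_json_object and parse_verdict are hypothetical names.
def extract_json_object(value)
  text = value.to_s.strip
  return text if text.start_with?("{") && text.end_with?("}")

  # A fenced object, with or without a "json" info string.
  fenced = text[/```(?:json)?\s*(\{.*?\})\s*```/m, 1]
  return fenced if fenced

  # Otherwise take everything from the first "{" to the last "}".
  text[/\{.*\}/m] || text
end

def parse_verdict(value)
  JSON.parse(extract_json_object(value))
rescue JSON::ParserError
  # Unparseable replies are treated as a failed audit, never as approval.
  { "approved" => false, "violations" => [{ "rule" => "sketch", "message" => "invalid JSON" }] }
end

fenced_reply = "Here is the verdict:\n```json\n{\"approved\": true, \"violations\": []}\n```"
p parse_verdict(fenced_reply)
p parse_verdict("no json at all")
```

Falling back to a synthetic violation means a judge model that ignores the format instructions causes a visible audit failure rather than silently passing the output.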
data/lib/turnkit/run.rb CHANGED
@@ -14,6 +14,8 @@ module TurnKit
14
14
  def output = output_text
15
15
  def output_text = turn.output_text
16
16
  def output_data = turn.output_data
17
+ def output_audit = turn.output_audit
18
+ def output_audit_clean? = output_audit.nil? || output_audit.fetch("clean", false)
17
19
  def usage = Usage.from_records(turn_records)
18
20
  def cost = Cost.from_records(turn_records)
19
21
  def steps = turn_records.length
@@ -23,10 +23,17 @@ module TurnKit
23
23
  attr_reader :turn
24
24
 
25
25
  def run(tool_call)
26
- turn.budget.count_tool_execution!
27
- tool = tool_for(tool_call.name)
28
26
  execution = ToolExecution.new(create_execution(tool_call))
29
27
 
28
+ begin
29
+ turn.budget.count_tool_execution!(tool_call.name)
30
+ rescue BudgetError => error
31
+ finish_error(execution, tool_call, error.message, details: { "class" => error.class.name, "budget_denied" => true })
32
+ raise
33
+ end
34
+
35
+ tool = tool_for(tool_call.name)
36
+
30
37
  unless tool
31
38
  return finish_error(execution, tool_call, "unknown tool: #{tool_call.name}")
32
39
  end
@@ -58,7 +65,7 @@ module TurnKit
58
65
  def finish_success(execution, tool_call, payload)
59
66
  attrs = turn.store.update_tool_execution(execution.id, "status" => "completed", "result" => payload, "completed_at" => Clock.now)
60
67
  append_result(execution, tool_call, payload)
61
- turn.emit("tool_call.completed", id: tool_call.id, name: tool_call.name)
68
+ turn.emit("tool_call.completed", id: tool_call.id, name: tool_call.name, result_chars: payload.to_json.length)
62
69
  ToolExecution.new(attrs)
63
70
  end
64
71
 
@@ -66,7 +73,7 @@ module TurnKit
66
73
  error = { "message" => message.to_s, "details" => details }.compact
67
74
  attrs = turn.store.update_tool_execution(execution.id, "status" => "failed", "error" => error, "completed_at" => Clock.now)
68
75
  append_result(execution, tool_call, error)
69
- turn.emit("tool_call.failed", id: tool_call.id, name: tool_call.name, error: error)
76
+ turn.emit("tool_call.failed", id: tool_call.id, name: tool_call.name, error: error, result_chars: error.to_json.length)
70
77
  ToolExecution.new(attrs)
71
78
  end
72
79
 
data/lib/turnkit/turn.rb CHANGED
@@ -45,13 +45,13 @@ module TurnKit
45
45
  TurnKit::Compaction.maybe_compact!(self)
46
46
 
47
47
  request = model_request
48
- emit("model.requested", model: request.model, tool_names: request.tool_names)
48
+ emit_model_requested("model.requested", request)
49
49
  result = call_client(request)
50
- emit("model.completed", model: result.model || model, tool_call_count: result.tool_calls.length)
51
50
  result_cost = Cost.from_usage(result.usage, model: result.model || model)
52
51
 
53
- budget.add_cost!(result_cost.total)
54
52
  add_usage!(result.usage, cost: result_cost)
53
+ emit_model_completed("model.completed", result, result_cost, model: model)
54
+ budget.add_cost!(result_cost.total)
55
55
  persist_assistant_message(result)
56
56
 
57
57
  if result.tool_calls?
@@ -62,8 +62,7 @@ module TurnKit
62
62
  break
63
63
  end
64
64
  else
65
- update!(status: "completed", output_text: result.text, output_data: result.output_data, completed_at: Clock.now)
66
- emit("turn.completed", status: status, output_text: result.text)
65
+ complete_with_output(result.text, output_data: result.output_data)
67
66
  break
68
67
  end
69
68
  end
@@ -96,6 +95,10 @@ module TurnKit
96
95
  @record["output_data"]
97
96
  end
98
97
 
98
+ def output_audit
99
+ (@record["options"] || {})["output_audit"]
100
+ end
101
+
99
102
  def usage
100
103
  Usage.from_h(@record["usage"] || {})
101
104
  end
@@ -125,6 +128,28 @@ module TurnKit
125
128
  emit_event(Event.new(type: type, turn_id: id, conversation_id: conversation.id, payload: payload))
126
129
  end
127
130
 
131
+ def internal_model_call(model:, messages:, instructions:, tools: [], thinking: nil, output_schema: nil, metadata: {}, purpose:, client: nil)
132
+ request = ModelRequest.new(
133
+ model: model,
134
+ messages: messages,
135
+ tools: tools,
136
+ instructions: instructions,
137
+ thinking: thinking,
138
+ output_schema: output_schema,
139
+ metadata: { purpose: purpose.to_s, turn_id: id, conversation_id: conversation.id }.merge(metadata || {})
140
+ )
141
+ model_client = client || agent.effective_client
142
+ model_client.validate!(model: request.model)
143
+
144
+ emit_model_requested("#{purpose}.model.requested", request)
145
+ result = call_client(request, client: model_client)
146
+ result_cost = Cost.from_usage(result.usage, model: result.model || request.model)
147
+ add_usage!(result.usage, cost: result_cost)
148
+ emit_model_completed("#{purpose}.model.completed", result, result_cost, model: request.model)
149
+ budget.add_cost!(result_cost.total)
150
+ result
151
+ end
152
+
128
153
  private
129
154
  def model_request
130
155
  prompt = SystemPrompt.new(agent: agent, turn: self, conversation: conversation, mode: prompt_mode || agent.effective_prompt_mode(turn: self))
@@ -148,7 +173,7 @@ module TurnKit
148
173
  )
149
174
  end
150
175
 
151
- def call_client(request)
176
+ def call_client(request, client: agent.effective_client)
152
177
  kwargs = {
153
178
  model: request.model,
154
179
  messages: request.messages,
@@ -159,9 +184,9 @@ module TurnKit
159
184
  metadata: request.metadata,
160
185
  on_event: ->(event) { emit_event(event) }
161
186
  }
162
- accepted = chat_keyword_names(agent.effective_client)
187
+ accepted = chat_keyword_names(client)
163
188
  kwargs = kwargs.slice(*accepted) unless accepted.include?(:keyrest)
164
- agent.effective_client.chat(**kwargs)
189
+ client.chat(**kwargs)
165
190
  end
166
191
 
167
192
  def chat_keyword_names(client)
@@ -176,6 +201,26 @@ module TurnKit
176
201
  MessageProjection.for(TurnKit::Compaction.project(conversation.messages_for_turn(self)))
177
202
  end
178
203
 
204
+ def emit_model_requested(type, request)
205
+ emit(
206
+ type,
207
+ model: request.model,
208
+ tool_names: request.tool_names,
209
+ message_count: request.messages.length,
210
+ prompt: request.report
211
+ )
212
+ end
213
+
214
+ def emit_model_completed(type, result, cost, model: self.model)
215
+ emit(
216
+ type,
217
+ model: result.model || model,
218
+ tool_call_count: result.tool_calls.length,
219
+ usage: result.usage.to_h,
220
+ cost: cost.to_h
221
+ )
222
+ end
223
+
179
224
  def thinking_from_options
180
225
  options = (@record["options"] || {}).transform_keys(&:to_s)
181
226
  return Agent.normalize_thinking(options["thinking"]) if options.key?("thinking")
@@ -219,8 +264,40 @@ module TurnKit
  message = runner.completion_message(execution)
  assistant = conversation.append_message(role: "assistant", kind: "text", text: message, turn_id: id)
  emit("message.created", message_id: assistant.id, role: assistant.role, kind: assistant.kind)
- update!(status: "completed", output_text: message, completed_at: Clock.now)
- emit("turn.completed", status: status, output_text: message)
+ complete_with_output(message)
+ end
+
+ def complete_with_output(text, output_data: nil)
+ audit = audit_output(text, output_data: output_data)
+ attrs = { output_text: text, output_data: output_data, completed_at: Clock.now }
+ if audit && !audit.clean? && agent.output_audit_mode == :fail
+ attrs[:status] = "failed"
+ attrs[:error] = { "class" => "TurnKit::OutputAudit", "message" => audit.messages.join("; "), "output_audit" => audit.to_h }
+ else
+ attrs[:status] = "completed"
+ end
+ update!(attrs)
+ persist_output_audit(audit) if audit
+
+ if failed?
+ emit("turn.failed", error: @record["error"])
+ else
+ emit("turn.completed", status: status, output_text: text)
+ end
+ end
+
+ def audit_output(text, output_data: nil)
+ constraints = agent.effective_output_audit
+ return nil if constraints.empty?
+
+ output = output_data.nil? ? text : output_data
+ TurnKit.audit_output(output, constraints: constraints, context: { turn: self, output_text: text, output_data: output_data })
+ end
+
+ def persist_output_audit(audit)
+ options = (@record["options"] || {}).merge("output_audit" => audit.to_h)
+ update!(options: options)
+ emit("output_audit.completed", clean: audit.clean?, violation_count: audit.violations.length)
  end

  def add_usage!(usage, cost: nil)
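The status decision that the new `complete_with_output` makes can be tried in isolation. The following is a minimal sketch, not the gem's implementation: `Audit` and `completion_attrs` are hypothetical stand-ins that model only the calls visible in the hunk above (`clean?`, `messages`, `to_h`), and the `:warn` mode is an assumed example of "any mode other than `:fail`".

```ruby
# Hypothetical stand-in for the audit result; only the methods the diff
# calls on it (clean?, messages, to_h) are modeled here.
Audit = Struct.new(:violations) do
  def clean?
    violations.empty?
  end

  def messages
    violations.map { |violation| violation[:message] }
  end

  def to_h
    { "clean" => clean?, "violations" => violations }
  end
end

# Mirrors the branch in complete_with_output: a dirty audit fails the turn
# only when the mode is :fail; every other case still completes.
def completion_attrs(text, audit:, mode:)
  attrs = { output_text: text }
  if audit && !audit.clean? && mode == :fail
    attrs[:status] = "failed"
    attrs[:error] = {
      "class" => "TurnKit::OutputAudit",
      "message" => audit.messages.join("; "),
      "output_audit" => audit.to_h
    }
  else
    attrs[:status] = "completed"
  end
  attrs
end

dirty = Audit.new([{ message: "too long" }, { message: "missing citation" }])
puts completion_attrs("draft", audit: dirty, mode: :fail)[:status] # failed
puts completion_attrs("draft", audit: dirty, mode: :warn)[:status] # completed (:warn is an assumed mode name)
puts completion_attrs("draft", audit: nil, mode: :fail)[:status]   # completed (no constraints, so no audit)
```

Note that in the diff the audit is persisted and `output_audit.completed` is emitted in both branches, so a non-failing mode still records violations.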
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module TurnKit
- VERSION = "0.2.9"
+ VERSION = "0.2.10"
  end
@@ -6,7 +6,9 @@ module TurnKit
  class Workflow
  attr_reader :name, :description, :instructions, :tools, :skills, :available_skills
  attr_reader :model, :client, :store, :prompt_mode, :thinking, :compaction, :output_schema
- attr_reader :max_iterations, :timeout, :cost_limit, :max_depth, :max_tool_executions
+ attr_reader :on_event
+ attr_reader :max_iterations, :timeout, :cost_limit, :max_depth, :max_tool_executions, :max_tool_executions_by_name
+ attr_reader :output_audit, :output_audit_mode, :output_policy, :output_policy_mode, :output_policy_model, :output_policy_thinking

  DEFAULT_INSTRUCTIONS = <<~TEXT.strip
  You are an autonomous task orchestrator. Navigate from the application
@@ -17,6 +19,10 @@ module TurnKit
  patterns. Iterate when work needs missing context, critique, revision, or
  verification.

+ When multiple independent items need the same kind of fetch or read, and
+ an available batch tool can handle them in one call, prefer the batch tool
+ over repeated one-item tool calls.
+
  Stop when the task is complete, when the available context and tools are
  sufficient for the best possible answer, or when further iteration would
  not materially improve the result. Respect runtime, cost, and iteration
@@ -25,9 +31,10 @@ module TurnKit

  def initialize(name: "workflow", description: "", instructions: nil,
  tools: [], skills: [], available_skills: [], model: nil, client: nil,
- store: nil, prompt_mode: :task, thinking: nil, compaction: nil,
+ store: nil, prompt_mode: :task, thinking: nil, compaction: nil, on_event: nil,
  output_schema: nil, max_iterations: nil, timeout: nil, max_spend: nil,
- cost_limit: nil, max_depth: nil, max_tool_executions: nil)
+ cost_limit: nil, max_depth: nil, max_tool_executions: nil, max_tool_executions_by_name: nil,
+ output_audit: nil, output_audit_mode: nil, output_policy: nil, output_policy_mode: nil, output_policy_model: nil, output_policy_thinking: nil)

  @name = name.to_s
  @description = description.to_s
@@ -41,14 +48,22 @@ module TurnKit
  @prompt_mode = prompt_mode
  @thinking = thinking
  @compaction = compaction
+ @on_event = on_event
  @output_schema = output_schema
  @max_iterations = max_iterations
  @timeout = timeout
  @cost_limit = cost_limit || max_spend
  @max_depth = max_depth
  @max_tool_executions = max_tool_executions
+ @max_tool_executions_by_name = max_tool_executions_by_name
+ @output_audit = output_audit
+ @output_audit_mode = output_audit_mode
+ @output_policy = output_policy
+ @output_policy_mode = output_policy_mode
+ @output_policy_model = output_policy_model
+ @output_policy_thinking = output_policy_thinking
  raise ArgumentError, "name is required" if @name.empty?
- build_agent
+ @agent = build_agent
  end

  def run(prompt = nil, task: nil, input: nil, async: false, subject: nil, metadata: {},
@@ -57,7 +72,13 @@ module TurnKit
  task = task || prompt
  raise ArgumentError, "task is required" if task.to_s.empty?

- build_agent(cost_limit: cost_limit || max_spend, **options).run(
+ runtime_agent = if options.empty? && cost_limit.nil? && max_spend.nil?
+ @agent
+ else
+ build_agent(cost_limit: cost_limit || max_spend, **options)
+ end
+
+ runtime_agent.run(
  task,
  input: input,
  async: async,
@@ -67,7 +88,7 @@ module TurnKit
  end

  def agent(**options)
- build_agent(**options)
+ options.empty? ? @agent : build_agent(**options)
  end

  def max_spend
@@ -89,12 +110,20 @@ module TurnKit
  prompt_mode: prompt_mode,
  thinking: thinking,
  compaction: compaction,
+ on_event: on_event,
  output_schema: output_schema,
  max_iterations: max_iterations,
  timeout: timeout,
  cost_limit: cost_limit,
  max_depth: max_depth,
- max_tool_executions: max_tool_executions
+ max_tool_executions: max_tool_executions,
+ max_tool_executions_by_name: max_tool_executions_by_name,
+ output_audit: output_audit,
+ output_audit_mode: output_audit_mode,
+ output_policy: output_policy,
+ output_policy_mode: output_policy_mode,
+ output_policy_model: output_policy_model,
+ output_policy_thinking: output_policy_thinking
  }
  attrs.merge!(overrides.compact)
  Agent.new(**attrs)
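The workflow hunks above switch from building an agent on every call to building one in `initialize` and reusing it whenever no overrides are passed. A minimal sketch of that reuse pattern follows; `MiniWorkflow` is a hypothetical class, a plain Hash stands in for `Agent.new(**attrs)`, and the `builds` counter exists only to make the behavior observable.

```ruby
# Hypothetical miniature of the reuse pattern in the diff: the default agent
# is built once, and a fresh one is built only when overrides are supplied.
class MiniWorkflow
  attr_reader :builds

  def initialize(cost_limit: nil)
    @cost_limit = cost_limit
    @builds = 0
    @agent = build_agent
  end

  def agent(**options)
    options.empty? ? @agent : build_agent(**options)
  end

  private

  def build_agent(**overrides)
    @builds += 1
    attrs = { cost_limit: @cost_limit }
    # As in the diff, nil overrides are dropped so they cannot erase defaults.
    attrs.merge!(overrides.compact)
    attrs # a Hash stands in for Agent.new(**attrs)
  end
end

workflow = MiniWorkflow.new(cost_limit: 2.0)
puts workflow.agent.equal?(workflow.agent)      # true: same cached object
puts workflow.agent(cost_limit: 5.0).inspect    # fresh agent with the override applied
puts workflow.agent(cost_limit: nil).inspect    # fresh agent, default cost_limit kept
puts workflow.builds                            # 3
```

One consequence worth noting: passing an option whose value is `nil` still counts as a non-empty `options` hash, so it triggers a rebuild even though `compact` then discards the override.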
data/lib/turnkit.rb CHANGED
@@ -5,6 +5,7 @@ require "digest"
  require "securerandom"
  require "time"
  require "date"
+ require "pathname"

  require_relative "turnkit/version"
  require_relative "turnkit/error"
@@ -22,6 +23,8 @@ require_relative "turnkit/message"
  require_relative "turnkit/record"
  require_relative "turnkit/result"
  require_relative "turnkit/skill"
+ require_relative "turnkit/output_audit"
+ require_relative "turnkit/output_policy"
  require_relative "turnkit/prompt_data"
  require_relative "turnkit/prompt_context"
  require_relative "turnkit/prompt_contribution"
@@ -48,8 +51,10 @@ module TurnKit
  class << self
  attr_accessor :default_model, :client, :store, :logger
  attr_accessor :max_iterations, :timeout, :max_depth, :max_tool_executions
+ attr_accessor :max_tool_executions_by_name
  attr_accessor :cost_limit, :prompt_cache
  attr_accessor :compaction
+ attr_accessor :output_policy_model, :output_policy_thinking
  attr_accessor :cost_rates, :cost_calculator
  attr_accessor :prompt_sections, :prompt_behavior, :available_skills
  attr_accessor :prompt_data_max_chars, :context_contributors
@@ -66,6 +71,7 @@ module TurnKit
  self.timeout = 300
  self.max_depth = 3
  self.max_tool_executions = 100
+ self.max_tool_executions_by_name = {}
  self.prompt_cache = :auto
  self.compaction = true
  self.cost_rates = {}
@@ -76,6 +82,8 @@ module TurnKit
  self.system_prompt_contributors = []
  self.model_prompt_contributors = {}
  self.on_event = nil
+ self.output_policy_model = nil
+ self.output_policy_thinking = { effort: :low }

  def self.configure
  yield self
@@ -102,4 +110,8 @@ module TurnKit
  store.update_turn(turn.fetch("id"), "status" => "stale", "completed_at" => Clock.now)
  end
  end
+
+ def self.audit_output(output, constraints: [], context: {})
+ OutputAudit.check(output, constraints: constraints, context: context)
+ end
  end
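The module-level defaults above (`max_tool_executions = 100`, `max_tool_executions_by_name = {}`) pair with the existing `configure` block. The sketch below reproduces that accessor-plus-`configure` shape in a hypothetical `MiniKit` module so it runs without the gem. The hash shape (tool name mapped to an integer limit), the tool name `fetch_page`, and the `limit_for` helper are all assumptions made for illustration; the diff does not show how the gem resolves or enforces per-tool limits.

```ruby
# Hypothetical stand-in for the module-level settings shown in the diff.
module MiniKit
  class << self
    attr_accessor :max_tool_executions, :max_tool_executions_by_name
  end

  # Defaults taken from the diff.
  self.max_tool_executions = 100
  self.max_tool_executions_by_name = {}

  def self.configure
    yield self
  end

  # Assumed lookup rule: use the per-tool limit when one is set,
  # otherwise fall back to the global limit.
  def self.limit_for(tool_name)
    max_tool_executions_by_name.fetch(tool_name.to_s, max_tool_executions)
  end
end

MiniKit.configure do |config|
  config.max_tool_executions_by_name = { "fetch_page" => 5 }
end

puts MiniKit.limit_for("fetch_page") # 5
puts MiniKit.limit_for(:search)      # 100
```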
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: turnkit
  version: !ruby/object:Gem::Version
- version: 0.2.9
+ version: 0.2.10
  platform: ruby
  authors:
  - Sam Couch
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2026-06-08 00:00:00.000000000 Z
+ date: 2026-06-10 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: ruby_llm
@@ -61,6 +61,8 @@ files:
  - lib/turnkit/message.rb
  - lib/turnkit/message_projection.rb
  - lib/turnkit/model_request.rb
+ - lib/turnkit/output_audit.rb
+ - lib/turnkit/output_policy.rb
  - lib/turnkit/prompt_context.rb
  - lib/turnkit/prompt_contribution.rb
  - lib/turnkit/prompt_data.rb