legion-llm 0.6.20 → 0.6.23
- checksums.yaml +4 -4
- data/CHANGELOG.md +35 -0
- data/README.md +1 -1
- data/docs/llm-schema-spec.md +145 -2
- data/lib/legion/llm/pipeline/executor.rb +150 -13
- data/lib/legion/llm/pipeline/request.rb +35 -16
- data/lib/legion/llm/pipeline/response.rb +9 -1
- data/lib/legion/llm/pipeline/steps/classification.rb +2 -2
- data/lib/legion/llm/routes.rb +143 -56
- data/lib/legion/llm/version.rb +1 -1
- metadata +1 -1
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bd422bcc5c5b6da0dbd4906df8ac394e5c712e709eb8cb367cc676fbf6e45f97
+  data.tar.gz: 148e5741014313918781e757c87a50e40b2d5e5ef164631b71959f6027c70316
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: dc80d32daf35e53bfe514a0e318911c97e9e3971374eb711128c68db6a02084cc9fd259f68ccf6e0242fb10ff1cbccf2a1ca9132b37e2aa1e6ee23cd5cbe0b5d
+  data.tar.gz: 5673f3536126bc1d3e17e69ab2892edb1b8bd9524bdc6c38993e85ac0869e48a42bd82f9d8691923527f701940a42969052bbbf9db40976434f1ad00210f3934
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,40 @@
 # Legion LLM Changelog
 
+## [0.6.23] - 2026-04-07
+
+### Fixed
+- `build_response_routing` now always sets `routing[:escalated]` (defaults to `false`) instead of conditionally omitting the key
+- Schema spec annotations updated: Thinking, Cache, Config(Generation) corrected to reflect `from_chat_args` first-class field mapping; ErrorResponse annotation updated with complete error hierarchy including `EscalationExhausted`, `PrivacyModeError`, `TokenBudgetExceeded`, `DaemonDeniedError`, `DaemonRateLimitedError`
+
+## [0.6.22] - 2026-04-07
+
+### Fixed
+- Classification LEVELS ordering: swapped `[:public, :internal, :restricted, :confidential]` to correct `[:public, :internal, :confidential, :restricted]` so severity comparisons work properly
+- `Response.from_ruby_llm` now extracts actual `stop_reason` from provider response instead of hardcoding `:end_turn`
+- `Request.from_chat_args` maps 16 fields (`tool_choice`, `generation`, `thinking`, `response_format`, `context_strategy`, `cache`, `fork`, `tokens`, `stop`, `modality`, `hooks`, `idempotency_key`, `ttl`, `metadata`, `enrichments`, `predictions`) to first-class struct members instead of dumping into `extra`
+- `build_response` populates routing details (strategy, tier, escalation chain, latency), cost estimation via `CostEstimator`, and actual stop reason instead of hardcoded defaults
+- `response_tool_calls` merges execution data (exchange_id, source, status, duration_ms, result) from timeline events into tool call hashes
+- `step_conversation_uuid` now auto-generates `conv_<hex>` when no conversation_id is provided (was a no-op)
+- `step_response_normalization` now normalizes all enrichment keys to string format (was a no-op)
+- Enrichment key `[:conversation_history]` corrected to `['context:conversation_history']` for consistent `source:type` pattern
+
+### Changed
+- Schema spec (`docs/llm-schema-spec.md`) updated: ToolCall, Config(Generation), Cost, Routing(response), Stop status changed from Partial/Not-implemented to Implemented
+
+## [0.6.21] - 2026-04-07
+
+### Added
+- Real-time tool call SSE streaming: tool-call, tool-result, and tool-error events emitted during execution, not after completion
+- `ClientToolMethods` module extracted from inline tool class for cleaner separation
+- Rich tool execution logging: command, path, pattern, url shown per tool type instead of just key names
+- `summarize_tool_args` produces structured log details per tool type (sh, file_read, file_write, file_edit, grep, glob, web_fetch, list_directory)
+- `tool_event_handler` callback on `Pipeline::Executor` for real-time tool event forwarding via `Thread.current`
+
+### Fixed
+- `install_tool_loop_guard` now uses `session.on_tool_call` instead of `session.on(:tool_call)` — RubyLLM callback was never firing, tool_call_id was always nil
+- `list_directory` tool now expands `~` via `File.expand_path` — previously failed with `ENOENT` on tilde paths
+- SSE text-delta events logged at debug level instead of info to reduce log noise
+
 ## [0.6.20] - 2026-04-06
 
 ### Added
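Note on the 0.6.22 `LEVELS` entry above: the effect of the reordering is easiest to see with a position-based comparison. The sketch below is not the gem's code. Only the two orderings come from the changelog; `severity_at_least?` is a hypothetical helper, and it assumes severity is compared by array position, which the changelog implies but does not show.

```ruby
# Hypothetical illustration; only the two orderings come from the changelog.
OLD_LEVELS = %i[public internal restricted confidential].freeze
NEW_LEVELS = %i[public internal confidential restricted].freeze

# Assumed comparison style: a level is "at least" a threshold when it sits
# at the same or a later position in the ordered list.
def severity_at_least?(levels, level, threshold)
  levels.index(level) >= levels.index(threshold)
end

# Old ordering: :restricted ranks below :confidential, so the check fails.
puts severity_at_least?(OLD_LEVELS, :restricted, :confidential) # false
# Corrected ordering: :restricted is the most severe level.
puts severity_at_least?(NEW_LEVELS, :restricted, :confidential) # true
```

Any code that gates on "at least confidential" would have let `:restricted` data through under the old ordering, which is presumably what the fix addresses.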
data/README.md
CHANGED
@@ -2,7 +2,7 @@
 
 LLM integration for the [LegionIO](https://github.com/LegionIO/LegionIO) framework. Wraps [ruby_llm](https://github.com/crmne/ruby_llm) to provide chat, embeddings, tool use, and agent capabilities to any Legion extension.
 
-**Version**: 0.6.
+**Version**: 0.6.23
 
 ## Installation
 
data/docs/llm-schema-spec.md
CHANGED
@@ -1,6 +1,75 @@
 # Legion::LLM Schema Specification
 
-## Status:
+## Status: Mixed — Envelope Implemented, Inner Types Aspirational
+
+**Implemented in**: `Pipeline::Request` and `Pipeline::Response` (`lib/legion/llm/pipeline/request.rb`, `response.rb`)
+**Version**: 1.0.0 (schema_version field on all payloads)
+**Last verified**: 2026-04-07
+
+The outer envelope is implemented: all 32 `Request` fields and 34 `Response` fields exist as `Data.define` members. However, many inner types (Message, ContentBlock, ToolCall, Chunk, Conversation, Feedback, ErrorResponse) are **not yet implemented as dedicated structs** — they are plain hashes or strings in the current code. Several Response fields are **always nil or empty** in the pipeline today.
+
+This document serves as both the **canonical reference** for what is implemented and the **target specification** for what inner types should look like. Sections are annotated with implementation status.
+
+For the AMQP wire protocol (exchange topology, queue configuration, message envelope, routing keys), see the Legion Wire Protocol spec in the LegionIO docs repo.
+
+### Implementation Status Matrix
+
+| Section | Status | Notes |
+|---------|--------|-------|
+| **Request (envelope)** | Implemented | All 32 fields exist on `Data.define`. `from_chat_args` maps all to first-class fields. |
+| **Response (envelope)** | Partial | All 34 fields exist. 10 fields always nil/empty (see below). |
+| **Message** | Not implemented | Plain `{ role:, content: }` hashes. No struct, no id/seq/status/version. |
+| **ContentBlock** | Not implemented | Content is always String. Only `:text` block used (system prompt caching). |
+| **Tool** | Partial | `ToolAdapter` has name/description/parameters. No `source` on object, no `version`. |
+| **ToolCall** | Partial | `id`, `name`, `arguments` + `exchange_id`, `source`, `status`, `duration_ms`, `result` merged from Timeline. `error` field never populated. Timeline lookup by tool name, not call ID (breaks duplicate tool calls). |
+| **ToolChoice** | Stub | Field exists on Request, defaults to `{ mode: :auto }`, never forwarded to provider. |
+| **Enrichment** | Implemented | RAG/GAIA enrichments work. Value shapes vary between steps. |
+| **Prediction** | Partial | Request-side works. Response-side actuals never filled in. |
+| **Tracing** | Implemented | trace_id, span_id, exchange_id all generated and propagated. |
+| **Classification** | Partial | Labels applied but routing restrictions not enforced. |
+| **Caller** | Implemented | Identity propagated, Profile derived. |
+| **Agent** | Not implemented | Response `agent` field always nil. |
+| **Billing** | Partial | Per-request cap only. No cumulative budget enforcement. |
+| **Test** | Implemented | Test mode flags propagated. |
+| **Modality** | Not implemented | Field exists, not acted upon. |
+| **Hooks** | Partial | Pre/post hooks on Request. Response hooks not fired. |
+| **Feedback** | Not implemented | No struct, class, or storage. Spec only. |
+| **Audit** | Implemented | Uses symbol keys (not string keys as spec claims). |
+| **Timeline** | Implemented | Event recording works. Participant tracking works. |
+| **Participants** | Implemented | Tracked via Timeline. |
+| **Wire Capture** | Not implemented | Response `wire` field always nil. |
+| **Retry** | Not implemented | Response `retry` field always nil. |
+| **Safety** | Not implemented | Response `safety` field always nil. |
+| **Rate Limit** | Not implemented | Response `rate_limit` field always nil. |
+| **Thinking** | Partial | Request thinking config mapped to first-class field. Response thinking **never populated** by executor (always nil). |
+| **Context Window** | Not implemented | `tokens.context_window`, `utilization`, `headroom` never populated. |
+| **Validation** | Not implemented | Response `validation` field always nil. |
+| **Provider Features** | Not implemented | Response `features` field always nil. |
+| **Model Deprecation** | Not implemented | Response `deprecation` field always nil. |
+| **Cache** | Partial | Request cache mapped to first-class field. Response `cache` always `{}`. |
+| **Chunk (Streaming)** | Not implemented | Raw RubyLLM chunks passed through; no spec-compliant Chunk struct. |
+| **ErrorResponse** | Not implemented | No struct; only exception classes (`LLMError` hierarchy). |
+| **Conversation** | Partial | `ConversationStore` exists but no `Conversation` struct. Limited fields. |
+| **Config (Generation)** | Implemented | `from_chat_args` now maps generation, thinking, response_format, etc. to first-class fields. |
+| **Quality** | Implemented | Returns `{ score:, band:, source: }` (not `{ score:, acceptable:, checker: }` as spec says). |
+| **Cost** | Implemented | Populated via `CostEstimator.estimate` with `estimated_usd`, `provider`, `model`. |
+| **Routing (response)** | Implemented | `provider`, `model`, `strategy`, `tier`, `escalated`, `escalation_chain`, `latency_ms` populated. |
+| **Stop** | Implemented | `stop.reason` extracted from provider response (`:end_turn`, `:tool_use`, etc.). |
+| **Metering** | Not implemented | Module exists but not wired into pipeline steps. |
+
+#### Response Fields Always Nil/Empty
+
+These Response fields exist on the `Data.define` but are **never populated** by the executor today:
+
+- `agent` — always nil
+- `cache` — always `{}`
+- `safety` — always nil
+- `rate_limit` — always nil
+- `features` — always nil
+- `deprecation` — always nil
+- `validation` — always nil
+- `wire` — always nil
+- `retry` — always nil
 
 ## Design Principles
 
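Note on the ToolCall row in the matrix above: it says Timeline lookup is by tool name rather than call ID. The sketch below shows why a name-keyed lookup cannot tell duplicate invocations apart. The hash shapes are hypothetical: field names follow the ToolCall row, but the real Timeline event structure is not shown in this diff, so `tool_call_id` and `tool_name` are assumptions.

```ruby
# Hypothetical data shapes; the real Timeline event structure is an assumption.
tool_calls = [
  { id: 'call_1', name: 'grep', arguments: { pattern: 'foo' } },
  { id: 'call_2', name: 'grep', arguments: { pattern: 'bar' } }
]

events = [
  { tool_call_id: 'call_1', tool_name: 'grep', status: :success, duration_ms: 12 },
  { tool_call_id: 'call_2', tool_name: 'grep', status: :error,   duration_ms: 40 }
]

# Keyed by name: a Hash holds one value per key, so the later event wins.
by_name = events.to_h { |e| [e[:tool_name], e] }
# Keyed by call ID: each invocation keeps its own event.
by_id = events.to_h { |e| [e[:tool_call_id], e] }

merged_by_name = tool_calls.map do |tc|
  tc.merge(by_name.fetch(tc[:name], {}).slice(:status, :duration_ms))
end
merged_by_id = tool_calls.map do |tc|
  tc.merge(by_id.fetch(tc[:id], {}).slice(:status, :duration_ms))
end

p merged_by_name.map { |tc| tc[:status] } # [:error, :error]
p merged_by_id.map { |tc| tc[:status] }   # [:success, :error]
```

In this sketch the first `grep` call picks up the second call's status when keyed by name. The gem's exact failure mode may differ, but the underlying limit is the same: one key, one event.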
@@ -27,6 +96,8 @@ schema_version: "1.0.0" # semver -- major.minor.patch
 
 ## Message
 
+> **Implementation status: NOT IMPLEMENTED** — No `Message` struct exists. Messages are plain hashes with only `role` and `content` in the pipeline. `ConversationStore` persists additional fields (`id`, `seq`, `parent_id`, `agent_id`, `created_at`) in its DB rows, but these are not surfaced as a structured Message object.
+
 The atomic unit of conversation. Every exchange between user, assistant, and tools is a Message.
 
 ```
@@ -78,6 +149,8 @@ message.text # returns text content regardless of String vs Array<ContentBlock>
 
 ## Content Blocks
 
+> **Implementation status: NOT IMPLEMENTED** — No `ContentBlock` struct exists. Content is always a plain String in the pipeline. The only place a typed block hash is constructed is for system prompt caching (`{ type: :text, content: ..., cache_control: ... }`). No image, audio, video, document, tool_use, tool_result, citation, or error block handling exists.
+
 Multimodal content. When `Message.content` is an array, each element is a ContentBlock.
 
 ### Block Types
@@ -199,6 +272,8 @@ data: Hash? # structured error data
 
 ## Tool
 
+> **Implementation status: PARTIAL** — `ToolAdapter` wraps `RubyLLM::Tool` with `name`, `description`, `parameters`. The `source` field exists as a parallel lookup in `find_tool_source` (not on the tool object). `version` does not exist.
+
 Tool definitions available to the LLM.
 
 ```
@@ -227,6 +302,8 @@ Used by RBAC (can this caller use tools from this source?) and audit (which syst
 
 ## ToolCall
 
+> **Implementation status: PARTIAL** — Tool calls are hashes with `id`, `name`, `arguments` and optionally `exchange_id`, `source`, `status`, `duration_ms`, `result` merged from matching Timeline events. The `error` field is never populated. Timeline lookup uses tool name (not call ID), so duplicate invocations of the same tool in one response will only have execution data for the last invocation.
+
 A tool invocation made by the assistant, with execution results.
 
 ```
@@ -251,6 +328,8 @@ Always a parsed Hash, never a JSON string. Provider adapters that receive argume
 
 ## ToolChoice
 
+> **Implementation status: STUB** — Field exists on Request, defaults to `{ mode: :auto }`. The `:specific` mode's `name` field is not handled. The `tool_choice` value is never forwarded to the underlying RubyLLM provider call.
+
 Controls how the LLM uses available tools.
 
 ```
@@ -263,6 +342,8 @@ ToolChoice
 
 ## Enrichment
 
+> **Implementation status: IMPLEMENTED** — RAG and GAIA enrichments work. Note: value shapes are inconsistent across pipeline steps — not all enrichments include `content:`, `data:`, `duration_ms:`, `timestamp:` as spec describes.
+
 Things that *shaped* the request during processing. Any system can contribute enrichments without schema changes. Enrichments modify or observe the request -- for decisions and outcomes, see [Audit](#audit).
 
 Enrichments are a **Hash keyed by `"source:type"`**, not an array. This enables direct lookup and clean request-vs-response comparison without looping.
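Note on the Enrichment hunk above: direct lookup by `"source:type"` only works if every key has the same type, which is what the 0.6.22 changelog entry about normalizing enrichment keys to strings is about. A minimal sketch, assuming a hypothetical helper `normalize_enrichment_keys` (the gem's `step_response_normalization` is not shown in this diff) and illustrative key names:

```ruby
# Hypothetical helper; stands in for whatever step_response_normalization does.
def normalize_enrichment_keys(enrichments)
  enrichments.transform_keys(&:to_s)
end

# Illustrative keys only: one symbol key and one string key, as might
# happen when different pipeline steps write enrichments.
enrichments = {
  'rag:context_retrieval': { content: '3 documents' },
  'gaia:system_prompt' => { content: 'advisory text' }
}

normalized = normalize_enrichment_keys(enrichments)

p normalized.keys
# Direct lookup by "source:type" string, no looping required.
puts normalized['rag:context_retrieval'][:content]
```

Before normalization, `enrichments['rag:context_retrieval']` would return `nil` because that entry is stored under a symbol key.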
@@ -319,6 +400,8 @@ Adding a new system requires zero schema changes -- just add a new key.
 
 ## Prediction
 
+> **Implementation status: PARTIAL** — Request-side predictions work (components can contribute predictions). Response-side actuals (`actual_value`, `accurate`) are never filled in — no post-execution comparison occurs.
+
 Hypothesis recorded before execution, compared to reality after execution. Enables self-improving systems. Any component in the pipeline can contribute predictions.
 
 Predictions are a **Hash keyed by `"source:type"`**, same pattern as enrichments. Direct lookup, no looping.
@@ -395,6 +478,8 @@ response.predictions.count { |_, v| v[:correct] }.to_f / response.predictions.si
 
 ## Tracing & Correlation
 
+> **Implementation status: IMPLEMENTED** — `trace_id`, `span_id`, `exchange_id` all generated and propagated via `Pipeline::Tracing`.
+
 OpenTelemetry-compatible distributed tracing. Groups related requests across agentic loops, forks, and multi-step tasks.
 
 ```
@@ -424,6 +509,8 @@ Tracing is present on Request, Response, ErrorResponse, and Chunk.
 
 ## Exchange (Per-Hop Tracking)
 
+> **Implementation status: IMPLEMENTED** — `conversation_id`, `request_id` (mapped to `id`), and `exchange_id` all generated via `Pipeline::Tracing` and propagated through the pipeline.
+
 Three-level ID hierarchy inspired by SIP's Call-ID / CSeq / Branch/Via model. Tracks every hop within a single request.
 
 ```
@@ -487,6 +574,8 @@ In practice, each exchange would become a child span under the request's span in
 
 ## Data Classification & Compliance
 
+> **Implementation status: PARTIAL** — Classification labels are applied to requests. However, routing restrictions (e.g., preventing PHI-tagged data from going to certain providers) are not enforced.
+
 Data governance for enterprise adoption. Controls where data can be processed, how long it's retained, and what it contains.
 
 ```
@@ -538,6 +627,8 @@ Provider registry includes each provider's processing jurisdiction. Router match
 
 ## Caller
 
+> **Implementation status: IMPLEMENTED** — Caller identity propagated through the pipeline. `Profile.derive` reads `caller[:requested_by][:type]` to determine step skipping.
+
 Auth-level identity tracking. Who authenticated to make this request, and on whose behalf. Separate from `agent` (which tracks AI entity identity).
 
 ```
@@ -602,6 +693,8 @@ RBAC checks `caller.requested_by` for permission evaluation. If `requested_for`
 
 ## Agent Identity
 
+> **Implementation status: NOT IMPLEMENTED** — The `agent` field exists on both Request and Response but is always nil. No agent identity is attached during pipeline execution.
+
 Tracks which AI entity is executing the request. Not about auth (that's `caller`) -- about the AI agent doing the work.
 
 ```
@@ -648,6 +741,8 @@ Multiple LLM requests can share a `task_id`, enabling: "Show me everything that
 
 ## Billing & Budget
 
+> **Implementation status: PARTIAL** — Per-request cost cap works. Cumulative budget tracking (daily/monthly limits) is not implemented. Metering module exists but is not wired into pipeline steps.
+
 Cost tracking, budget enforcement, and rate limiting.
 
 ```
@@ -690,6 +785,8 @@ Checked in the pipeline before the provider call:
 
 ## Test & Evaluation Mode
 
+> **Implementation status: IMPLEMENTED** — Test mode flags propagated through the pipeline.
+
 Controls for testing, benchmarking, replay, and experimentation.
 
 ```
@@ -743,6 +840,8 @@ Experiment results are tracked via predictions (expected: better quality with GA
 
 ## Modality
 
+> **Implementation status: NOT IMPLEMENTED** — The `modality` field exists on Request but is not acted upon by the pipeline or provider adapters.
+
 Declares input and output modality expectations. Guides routing (not all providers support all combinations) and future-proofs for multimodal evolution.
 
 ```
@@ -799,6 +898,8 @@ Provider capabilities:
 
 ## Lifecycle Hooks
 
+> **Implementation status: PARTIAL** — Pre/post hooks on Request are supported. Response-side hook firing is not implemented.
+
 Caller-declared injection points in the pipeline. Named hooks registered by extensions or configuration.
 
 ```
@@ -839,6 +940,8 @@ Hooks receive the full request/response context and can add enrichments, but can
 
 ## Feedback
 
+> **Implementation status: NOT IMPLEMENTED** — No Feedback struct, class, or storage exists. No code submits, receives, or stores feedback.
+
 User or automated quality feedback on specific messages. Lives on the Conversation, not on individual requests. Closes the learning loop.
 
 ```
@@ -884,6 +987,8 @@ Quality checkers and GAIA can also submit feedback:
 
 ## Audit
 
+> **Implementation status: IMPLEMENTED** — Audit records are populated by the pipeline. Note: uses symbol keys (`:step`, `:action`), not string keys as some examples in this spec show.
+
 Record of what *happened* during pipeline processing -- decisions, actions, outcomes. Separate from enrichments (which record what *shaped* the request). Response-only.
 
 Audit is a **Hash keyed by `"step:action"`**, same pattern as enrichments and predictions.
@@ -990,6 +1095,8 @@ response.audit[:"persistence:store"][:data][:method] # => :direct
 
 ## Pipeline Timeline
 
+> **Implementation status: IMPLEMENTED** — `Pipeline::Timeline` records ordered events with participant tracking.
+
 Inspired by [Homer/SIPCAPTURE](https://github.com/sipcapture/homer) call flow diagrams. A unified, globally-sequenced timeline of **everything** that happened during a request. Reconstructs the full call flow across all systems -- enrichments, audit, tool calls, provider calls, connections -- in one ordered record.
 
 This is the **one place an array is correct**. Timeline is ordered data, not lookup data. You iterate it in sequence to reconstruct the call flow, like Homer's ladder diagram.
@@ -1125,6 +1232,8 @@ The timeline is built during pipeline execution and returned on the response. It
 
 ## Participants
 
+> **Implementation status: IMPLEMENTED** — Tracked via `Pipeline::Timeline`.
+
 All systems that touched this request. Enables Homer-style column headers for call flow visualization. Response-only, populated by the pipeline.
 
 ```
@@ -1154,6 +1263,8 @@ Auto-populated: every unique `from` and `to` value in the timeline becomes a par
 
 ## Wire Capture
 
+> **Implementation status: NOT IMPLEMENTED** — Response `wire` field is always nil. No capture of raw provider payloads occurs.
+
 Raw request and response payloads as sent to/received from the provider. For debugging translator issues, you need both sides of the wire. Opt-in (can be expensive to store).
 
 Keyed by `exchange_id` -- one capture per provider call, not per request. A request with retries or tool loops produces multiple wire captures.
@@ -1223,6 +1334,8 @@ This lives on `response.routing.connection` since it's part of the routing outco
 
 ## Retry
 
+> **Implementation status: NOT IMPLEMENTED** — Response `retry` field is always nil. Retry logic exists in the executor (rate limit rescue) but results are not captured in the retry struct.
+
 Distinct from escalation. Retries are the same provider/model attempted again after a transient failure. Escalation is switching to a different provider/model.
 
 ```
@@ -1269,6 +1382,8 @@ response.retry = {
 
 ## Content Safety
 
+> **Implementation status: NOT IMPLEMENTED** — Response `safety` field is always nil. Provider safety results are not captured.
+
 Provider-reported content filtering results. Different from classification (which is our data governance). This is the provider saying "I evaluated this content against my safety policies."
 
 Response-only. Not all providers return this.
@@ -1322,6 +1437,8 @@ response.safety = {
 
 ## Rate Limit State
 
+> **Implementation status: NOT IMPLEMENTED** — Response `rate_limit` field is always nil. Provider rate limit headers are not captured (rate limit errors are rescued and retried, but quota state is not stored).
+
 Provider quota state returned in response headers. Structured and always captured (not opt-in like wire). Critical for routing decisions.
 
 ```
@@ -1361,6 +1478,8 @@ end
 
 ## Thinking & Reasoning
 
+> **Implementation status: PARTIAL** — Request-side thinking configuration is mapped to the first-class `thinking` field by `from_chat_args`. Response-side `thinking` field exists on the Response struct but is **never populated** by the executor — it is always nil.
+
 Controls for extended thinking, chain-of-thought, and reasoning behavior. Separate from generation parameters (temperature, top_p) because reasoning is about *how deeply* the model thinks, not *how randomly* it samples.
 
 ### Request side
@@ -1395,6 +1514,8 @@ Thinking tokens are tracked separately from regular output tokens because they h
 
 ## Context Window Utilization
 
+> **Implementation status: NOT IMPLEMENTED** — `tokens.context_window`, `tokens.utilization`, and `tokens.headroom` are never populated on the Response. Only `input_tokens` and `output_tokens` are set.
+
 Expands response-side tokens with capacity information. Drives context strategy decisions.
 
 Added to `response.tokens`:
@@ -1439,6 +1560,8 @@ end
 
 ## Structured Output Validation
 
+> **Implementation status: NOT IMPLEMENTED** — Response `validation` field is always nil. `StructuredOutput` module exists for enforcing schemas but does not populate this struct.
+
 When `response_format.type` is `:json` or `:json_schema`, reports whether the response actually validated.
 
 Response-only. Added to response alongside quality.
@@ -1479,6 +1602,8 @@ response.validation = {
 
 ## Provider Features
 
+> **Implementation status: NOT IMPLEMENTED** — Response `features` field is always nil.
+
 Post-hoc report of which provider-specific features actually activated on this request. Different from capabilities (what the provider CAN do) -- this is what it DID.
 
 Response-only. Hash-keyed by feature name.
@@ -1519,6 +1644,8 @@ end
 
 ## Model Deprecation
 
+> **Implementation status: NOT IMPLEMENTED** — Response `deprecation` field is always nil.
+
 Structured deprecation warnings from providers. Separate from the `warnings` array because automated systems need to act on these programmatically.
 
 Response-only.
@@ -1564,6 +1691,8 @@ end
 
 ## Cache
 
+> **Implementation status: PARTIAL** — Request-side `cache` field is mapped to the first-class field by `from_chat_args` (defaults to `{ strategy: :default, cacheable: true }`). Response-side `cache` field is always `{}`.
+
 Symmetric caching controls on request and response. Replaces a flat strategy symbol with structured metadata.
 
 ### Request side (what I want)
@@ -1631,6 +1760,8 @@ Response: cache: { hit: true, key: "sha256:abc123", tier: :local, age: 45, expir
 
 ## Request
 
+> **Implementation status: IMPLEMENTED (envelope)** — All 32 fields exist as `Data.define` members with `.build` and `.from_chat_args` constructors. All fields including `generation`, `thinking`, `response_format`, `context_strategy`, `cache`, `fork`, `tokens`, `stop`, `modality`, `hooks`, `idempotency_key`, `ttl`, `metadata`, `enrichments`, and `predictions` are mapped to first-class struct members. Convenience accessors (`.model`, `.provider`) described in the spec are not defined.
+
 What goes into the Legion::LLM pipeline.
 
 ```
@@ -1795,6 +1926,8 @@ For queue ordering when requests go through RMQ:
 
 ## Response
 
+> **Implementation status: PARTIAL (envelope)** — All 34 fields exist as `Data.define` members. 9 fields are always nil/empty (see status matrix above). `routing` populates `provider`, `model`, `strategy`, `tier`, `escalated`, `escalation_chain`, `latency_ms`. `stop.reason` extracted from provider response (falls back to `:end_turn`). `quality` returns `{ score:, band:, source:, signals: }` from `ConfidenceScorer` (not `{ score:, acceptable:, checker: }` as the Response struct below shows). `cost` populated via `CostEstimator.estimate` with `estimated_usd`, `provider`, `model`. Convenience accessors (`.model`, `.provider`) are not defined.
+
 What comes back from the Legion::LLM pipeline.
 
 ```
@@ -1843,7 +1976,7 @@ Response
 
 # Stop (symmetric with request)
 stop: Hash
-reason: Symbol # :end_turn, :
+reason: Symbol # :end_turn, :tool_use, :max_tokens, :safety, :stop_sequence
 sequence: String? # which stop sequence was hit (nil if none)
 
 # Tools (symmetric with request)
@@ -1984,6 +2117,8 @@ response.participants # ["pipeline", "rbac", "provider:claude", ...]
 
 ## Chunk (Streaming)
 
+> **Implementation status: NOT IMPLEMENTED** — No `Chunk` struct exists. Streaming (`call_stream`) yields raw RubyLLM chunk objects directly to callers with no translation to the spec format.
+
 Incremental data during a streamed response.
 
 ```
@@ -2015,6 +2150,8 @@ Chunk
 
 ## ErrorResponse
 
+> **Implementation status: NOT IMPLEMENTED** — No `ErrorResponse` struct exists. Errors are raised as exceptions from the `Legion::LLM` error hierarchy: `LLMError` (base), `AuthError`, `RateLimitError`, `ContextOverflow`, `ProviderError`, `ProviderDown`, `UnsupportedCapability`, `PipelineError`, `TokenBudgetExceeded`, `EmbeddingUnavailableError`. Additionally, `EscalationExhausted`, `DaemonDeniedError`, `DaemonRateLimitedError`, and `PrivacyModeError` inherit from `StandardError` directly (not `LLMError`). These are Ruby exceptions, not structured response payloads.
+
 Standard error format for failed requests.
 
 ```
@@ -2059,6 +2196,8 @@ ErrorResponse
 
 ## Conversation
 
+> **Implementation status: PARTIAL** — `ConversationStore` exists as an in-memory LRU (256 slots) with optional DB persistence. No `Conversation` struct — conversations are plain hashes (`{ messages: [], metadata: {}, lru_tick: N }`). DB persistence stores `id`, `caller_identity`, `metadata` (JSON blob), `created_at`, `updated_at`. Most spec fields (`title`, `summary`, `state`, `shared`, `participants`, `tags`, `pinned`, `usage_total`, `routing_history`) exist only as arbitrary metadata blob entries, not first-class fields.
+
 The persistent conversation object stored in the ConversationStore.
 
 ```
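The status note describes an in-memory LRU of plain hashes shaped like `{ messages: [], metadata: {}, lru_tick: N }`. A minimal sketch of that shape follows, assuming eviction removes the entry with the smallest `lru_tick`. `TinyConversationStore` and its methods are hypothetical and are not the gem's `ConversationStore` API.

```ruby
# Hypothetical illustration of an LRU keyed by conversation id.
class TinyConversationStore
  def initialize(max_slots: 256)
    @max_slots = max_slots
    @tick = 0
    @conversations = {}
  end

  # Append a message, creating the conversation hash on first use.
  def append(conv_id, message)
    conv = (@conversations[conv_id] ||= { messages: [], metadata: {}, lru_tick: 0 })
    conv[:messages] << message
    touch(conv)
    evict_oldest while @conversations.size > @max_slots
    conv
  end

  # Reading also counts as a use and refreshes the tick.
  def messages(conv_id)
    conv = @conversations[conv_id]
    return [] unless conv

    touch(conv)
    conv[:messages]
  end

  def size
    @conversations.size
  end

  private

  def touch(conv)
    @tick += 1
    conv[:lru_tick] = @tick
  end

  # Drop the conversation that was used least recently.
  def evict_oldest
    oldest_id, = @conversations.min_by { |_id, conv| conv[:lru_tick] }
    @conversations.delete(oldest_id)
  end
end
```

With `max_slots: 2`, appending to `a` and `b`, reading `a`, then appending to `c` evicts `b`, since `b` has the oldest tick.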
@@ -2119,6 +2258,8 @@ Legion::LLM.chat(
 
 ## Config (Generation Parameters)
 
+> **Implementation status: PARTIAL** — Generation parameters are mapped to the first-class `generation` field by `from_chat_args`. However, provider adapters only forward `model` and `provider` to RubyLLM, not temperature/top_p/etc from the `generation` hash.
+
 Sent in `request.generation`. Provider adapters map supported parameters and ignore unsupported ones.
 
 ```
@@ -2162,6 +2303,8 @@ response_format:
 
 ## Provider Adapter Contract
 
+> **Implementation status: PARTIAL** — Provider LEXs (extensions-ai/) exist and work for chat/embed. The formal `ProviderAdapter` interface with `Translator` is not enforced — providers integrate via RubyLLM's native provider system.
+
 Every provider LEX must implement `Legion::LLM::ProviderAdapter` including a `Translator`.
 
 ### Required methods
data/lib/legion/llm/pipeline/executor.rb
CHANGED
@@ -17,6 +17,7 @@ module Legion
 attr_reader :request, :profile, :timeline, :tracing, :enrichments,
             :audit, :warnings, :discovered_tools, :confidence_score,
             :escalation_chain
+attr_accessor :tool_event_handler
 
 include Steps::ToolDiscovery
 include Steps::ToolCalls
@@ -67,6 +68,7 @@ module Legion
   @escalation_chain = nil
   @escalation_history = []
   @proactive_tier_assignment = nil
+  @tool_event_handler = nil
 end
 
 def call
@@ -164,7 +166,11 @@ module Legion
 
 def step_idempotency; end
 
-def step_conversation_uuid
+def step_conversation_uuid
+  return if @request.conversation_id
+
+  @request = @request.with(conversation_id: "conv_#{SecureRandom.hex(8)}")
+end
 
 def step_context_load
   conv_id = @request.conversation_id
@@ -187,7 +193,7 @@ module Legion
   maybe_compact_history(conv_id, history)
 end
 
-@enrichments[:conversation_history] = history
+@enrichments['context:conversation_history'] = history
 @timeline.record(
   category: :internal, key: 'context:loaded',
   direction: :internal, detail: "loaded #{history.size} prior messages",
@@ -656,7 +662,15 @@ module Legion
 
 session, message_content = build_ruby_llm_session
 install_tool_loop_guard(session)
-
+
+Thread.current[:legion_tool_event_handler] = @tool_event_handler
+begin
+  @raw_response = message_content ? session.ask(message_content, &) : session
+ensure
+  Thread.current[:legion_tool_event_handler] = nil
+  Thread.current[:legion_current_tool_call_id] = nil
+  Thread.current[:legion_current_tool_name] = nil
+end
 
 @timestamps[:provider_end] = Time.now
 record_provider_response
@@ -690,18 +704,47 @@ module Legion
 end
 
 def install_tool_loop_guard(session)
-
+  unless session.respond_to?(:on_tool_call)
+    log.warn('[pipeline] tool loop guard unavailable: ruby_llm session does not respond to on_tool_call')
+    return
+  end
 
   tool_round = 0
-  session.
+  session.on_tool_call do |tool_call|
     tool_round += 1
     if tool_round > MAX_RUBY_LLM_TOOL_ROUNDS
       log.warn("[pipeline] tool loop cap hit: #{tool_round} rounds, halting")
       raise Legion::LLM::PipelineError, "tool loop exceeded #{MAX_RUBY_LLM_TOOL_ROUNDS} rounds"
     end
+
+    emit_tool_call_event(tool_call, tool_round)
   end
 end
 
+def emit_tool_call_event(tool_call, round)
+  tc_id = tool_call_field(tool_call, :id)
+  tc_name = tool_call_field(tool_call, :name)
+  tc_args = tool_call_field(tool_call, :arguments)
+
+  log.info("[pipeline][tool-call] round=#{round} id=#{tc_id} tool=#{tc_name}")
+
+  Thread.current[:legion_current_tool_call_id] = tc_id
+  Thread.current[:legion_current_tool_name] = tc_name
+
+  @tool_event_handler&.call(
+    type: :tool_call, tool_call_id: tc_id, tool_name: tc_name,
+    arguments: tc_args, round: round
+  )
+end
+
+def tool_call_field(tool_call, field)
+  return tool_call.public_send(field) if tool_call.respond_to?(field)
+
+  tool_call[field]
+rescue StandardError
+  nil
+end
+
 def apply_ruby_llm_instructions(session)
   injected_system = EnrichmentInjector.inject(
     system: @request.system,
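The `tool_call_field` helper added in this hunk reads a field from either an object with accessors or a hash-like value, and returns `nil` on any failure. The standalone sketch below reproduces that lookup outside the executor so it can be run in isolation. `ToolCall` and `emit_tool_call` are hypothetical stand-ins; the handler receives a single hash, mirroring the lambda assigned in `routes.rb`.

```ruby
# Hypothetical stand-in for a provider tool call object.
ToolCall = Struct.new(:id, :name, :arguments)

# Prefer an accessor method, fall back to [] lookup, swallow errors.
def tool_call_field(tool_call, field)
  return tool_call.public_send(field) if tool_call.respond_to?(field)

  tool_call[field]
rescue StandardError
  nil
end

# Build the event hash and pass it to an optional handler.
def emit_tool_call(handler, tool_call, round)
  handler&.call(
    type: :tool_call,
    tool_call_id: tool_call_field(tool_call, :id),
    tool_name: tool_call_field(tool_call, :name),
    arguments: tool_call_field(tool_call, :arguments),
    round: round
  )
end

events = []
handler = lambda { |event| events << event }
emit_tool_call(handler, ToolCall.new('call_1', 'grep', { pattern: 'foo' }), 1)
emit_tool_call(handler, { id: 'call_2', name: 'glob', arguments: {} }, 2)
puts events.map { |e| e[:tool_name] }.join(',') # prints "grep,glob"
```

Note that a hash with string keys (`{ 'id' => 'x' }`) yields `nil` here, because the lookup uses the symbol it was given.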
@@ -758,7 +801,7 @@ module Legion
   attrs = Steps::SpanAnnotator.attributes_for(step_name, audit: @audit, enrichments: @enrichments)
   attrs.each { |key, val| span.set_attribute(key, val) unless val.nil? }
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.pipeline.annotate_span', step: step_name)
   nil
 end
 
@@ -783,7 +826,7 @@ module Legion
     span.set_attribute('routing.tier', data[:tier].to_s) if data[:tier]
   end
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.pipeline.annotate_top_level_span')
   nil
 end
 
@@ -800,7 +843,14 @@ module Legion
   nil
 end
 
-def step_response_normalization
+def step_response_normalization
+  # Normalize enrichment keys to consistent string "source:type" format
+  normalized = {}
+  @enrichments.each do |key, value|
+    normalized[key.to_s] = value
+  end
+  @enrichments = normalized
+end
 
 def step_context_store
   conv_id = @request.conversation_id
@@ -865,10 +915,11 @@ module Legion
 request_id: @request.id,
 conversation_id: @request.conversation_id || "conv_#{SecureRandom.hex(8)}",
 message: msg,
-routing:
+routing: build_response_routing,
 tokens: extract_tokens,
-stop:
+stop: extract_stop_reason,
 tools: response_tool_calls,
+cost: estimate_response_cost,
 timestamps: @timestamps,
 enrichments: @enrichments,
 audit: @audit,
@@ -890,17 +941,103 @@ module Legion
         Array(requested).map { |name| name.to_s.tr('.', '_') }.reject(&:empty?)
       end
 
+      def build_response_routing
+        routing = { provider: @resolved_provider, model: @resolved_model }
+
+        routing_audit = @audit[:'routing:provider_selection']
+        if routing_audit.is_a?(Hash) && routing_audit[:data].is_a?(Hash)
+          routing[:strategy] = routing_audit[:data][:strategy]
+          routing[:tier] = routing_audit[:data][:tier]
+        end
+
+        routing[:escalated] = @escalation_history.size > 1
+        routing[:escalation_chain] = @escalation_history if @escalation_history.any?
+
+        if @timestamps[:provider_start] && @timestamps[:provider_end]
+          routing[:latency_ms] = ((@timestamps[:provider_end] - @timestamps[:provider_start]) * 1000).round
+        end
+
+        routing
+      end
+
+      def extract_stop_reason
+        reason = if @raw_response.respond_to?(:stop_reason)
+                   @raw_response.stop_reason&.to_sym
+                 elsif @raw_response.respond_to?(:tool_calls) && @raw_response.tool_calls&.any?
+                   :tool_use
+                 end
+        { reason: reason || :end_turn }
+      rescue StandardError
+        { reason: :end_turn }
+      end
+
+      def estimate_response_cost
+        tokens = extract_tokens
+        input = tokens.respond_to?(:input_tokens) ? tokens.input_tokens : tokens[:input].to_i
+        output = tokens.respond_to?(:output_tokens) ? tokens.output_tokens : tokens[:output].to_i
+        return {} unless @resolved_model && (input + output).positive?
+
+        estimated = CostEstimator.estimate(
+          model_id: @resolved_model,
+          input_tokens: input,
+          output_tokens: output
+        )
+        { estimated_usd: estimated, provider: @resolved_provider, model: @resolved_model }
+      rescue StandardError
+        {}
+      end
+
       def response_tool_calls
         return [] unless @raw_response.respond_to?(:tool_calls) && @raw_response.tool_calls
 
+        tool_timeline = build_tool_timeline_index
+
         Array(@raw_response.tool_calls).map do |tool_call|
-
-
-
+          tc_id = tool_call[:id] || tool_call['id']
+          tc_name = tool_call[:name] || tool_call['name']
+
+          entry = {
+            id: tc_id,
+            name: tc_name,
             arguments: tool_call[:arguments] || tool_call['arguments'] || {}
           }
+
+          # Merge execution data from timeline if available
+          timeline_data = tool_timeline[tc_name]
+          if timeline_data
+            entry[:exchange_id] = timeline_data[:exchange_id]
+            entry[:source] = timeline_data[:source]
+            entry[:status] = timeline_data[:status]
+            entry[:duration_ms] = timeline_data[:duration_ms]
+            entry[:result] = timeline_data[:result]
+          end
+
+          entry
         end
       end
+
+      def build_tool_timeline_index
+        index = {}
+        @timeline.events.each do |event|
+          key = event[:key]
+          data = event[:data] || {}
+
+          if key&.start_with?('tool:execute:')
+            tool_name = key.sub('tool:execute:', '')
+            index[tool_name] = {
+              exchange_id: event[:exchange_id],
+              source: data[:source],
+              status: data[:status],
+              duration_ms: event[:duration_ms]
+            }
+          elsif key&.start_with?('tool:result:')
+            tool_name = key.sub('tool:result:', '')
+            index[tool_name][:result] = data[:result] if index[tool_name]
+          end
+        end
+
+        index
+      end
     end
   end
 end
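The `build_tool_timeline_index` method in the hunk above folds `tool:execute:*` and `tool:result:*` timeline events into one hash per tool name. The sketch below runs the same logic against an explicit event array instead of `@timeline.events`. The sample event values (`'ex_1'`, `:client`, `:ok`) are made up for illustration and are not taken from the gem.

```ruby
# Same folding logic as the diff, with events passed in explicitly.
def build_tool_timeline_index(events)
  index = {}
  events.each do |event|
    key = event[:key]
    data = event[:data] || {}

    if key&.start_with?('tool:execute:')
      tool_name = key.sub('tool:execute:', '')
      index[tool_name] = {
        exchange_id: event[:exchange_id],
        source: data[:source],
        status: data[:status],
        duration_ms: event[:duration_ms]
      }
    elsif key&.start_with?('tool:result:')
      tool_name = key.sub('tool:result:', '')
      # A result is attached only when an execute event was seen first.
      index[tool_name][:result] = data[:result] if index[tool_name]
    end
  end
  index
end

sample_events = [
  { key: 'tool:execute:grep', exchange_id: 'ex_1', duration_ms: 12, data: { source: :client, status: :ok } },
  { key: 'tool:result:grep', data: { result: '3 matches' } },
  { key: 'tool:result:glob', data: { result: 'dropped, no execute event' } },
  { key: 'context:loaded' }
]

index = build_tool_timeline_index(sample_events)
puts index.keys.inspect # prints ["grep"]
```

Because the index is keyed by tool name rather than tool call id, a second `tool:execute:grep` event would overwrite the first entry, so repeated calls to the same tool share one set of execution data.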
data/lib/legion/llm/pipeline/request.rb
CHANGED
@@ -67,26 +67,45 @@ module Legion
 
 extra = kwargs.except(
   :message, :messages, :model, :provider, :system,
-  :tools, :stream, :caller, :classification, :billing,
+  :tools, :tool_choice, :stream, :caller, :classification, :billing,
   :agent, :test, :tracing, :priority, :conversation_id,
-  :request_id, :id
+  :request_id, :id, :generation, :thinking, :response_format,
+  :context_strategy, :cache, :fork, :tokens, :stop,
+  :modality, :hooks, :idempotency_key, :ttl, :metadata,
+  :enrichments, :predictions
 )
 
 build_args = {
-  messages:
-  system:
-  routing:
-  tools:
-
-
-
-
-
-
-
-
-
-
+  messages: messages,
+  system: kwargs[:system],
+  routing: routing,
+  tools: kwargs.fetch(:tools, []),
+  tool_choice: kwargs[:tool_choice] || { mode: :auto },
+  stream: kwargs.fetch(:stream, false),
+  generation: kwargs[:generation] || {},
+  thinking: kwargs[:thinking],
+  response_format: kwargs[:response_format] || { type: :text },
+  context_strategy: kwargs.fetch(:context_strategy, :auto),
+  cache: kwargs[:cache] || { strategy: :default, cacheable: true },
+  fork: kwargs[:fork],
+  tokens: kwargs[:tokens] || { max: 4096 },
+  stop: kwargs[:stop] || { sequences: [] },
+  modality: kwargs[:modality],
+  hooks: kwargs[:hooks],
+  caller: kwargs[:caller],
+  classification: kwargs[:classification],
+  billing: kwargs[:billing],
+  agent: kwargs[:agent],
+  test: kwargs[:test],
+  tracing: kwargs[:tracing],
+  priority: kwargs.fetch(:priority, :normal),
+  conversation_id: kwargs[:conversation_id],
+  idempotency_key: kwargs[:idempotency_key],
+  ttl: kwargs[:ttl],
+  metadata: kwargs[:metadata] || {},
+  enrichments: kwargs[:enrichments] || {},
+  predictions: kwargs[:predictions] || {},
+  extra: extra
 }
 build_args[:id] = request_id if request_id
 build(**build_args)
data/lib/legion/llm/pipeline/response.rb
CHANGED
@@ -55,13 +55,21 @@ module Legion
   input = msg.respond_to?(:input_tokens) ? msg.input_tokens.to_i : 0
   output = msg.respond_to?(:output_tokens) ? msg.output_tokens.to_i : 0
 
+  stop_reason = if msg.respond_to?(:stop_reason)
+                  msg.stop_reason&.to_sym || :end_turn
+                elsif msg.respond_to?(:tool_calls) && msg.tool_calls&.any?
+                  :tool_use
+                else
+                  :end_turn
+                end
+
   build(
     request_id: request_id,
     conversation_id: conversation_id,
     message: { role: :assistant, content: msg.content },
     routing: { provider: provider, model: model || (msg.respond_to?(:model_id) ? msg.model_id : nil) },
     tokens: { input: input, output: output, total: input + output },
-    stop: { reason:
+    stop: { reason: stop_reason },
     **extra
   )
 end
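The `stop_reason` selection in the hunk above prefers an explicit `stop_reason` on the provider message, then infers `:tool_use` from pending tool calls, and otherwise falls back to `:end_turn`. A runnable sketch of that decision order follows. The `WithStop`, `WithTools`, and `Plain` structs and the `stop_reason_for` name are hypothetical stand-ins for provider message objects.

```ruby
# Hypothetical message shapes; real messages come from the provider library.
WithStop = Struct.new(:stop_reason)
WithTools = Struct.new(:tool_calls)
Plain = Struct.new(:content)

# Mirror the three-way decision from the diff.
def stop_reason_for(msg)
  if msg.respond_to?(:stop_reason)
    msg.stop_reason&.to_sym || :end_turn
  elsif msg.respond_to?(:tool_calls) && msg.tool_calls&.any?
    :tool_use
  else
    :end_turn
  end
end

puts stop_reason_for(WithStop.new('max_tokens')).inspect # prints :max_tokens
puts stop_reason_for(WithTools.new([{ id: 'c1' }])).inspect # prints :tool_use
puts stop_reason_for(Plain.new('hello')).inspect # prints :end_turn
```

One consequence of the ordering: a message that responds to `stop_reason` but returns `nil` resolves to `:end_turn` without consulting `tool_calls`.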
data/lib/legion/llm/pipeline/steps/classification.rb
CHANGED
@@ -9,7 +9,7 @@ module Legion
 module Classification
   include Legion::Logging::Helper
 
-  LEVELS = %i[public internal restricted
+  LEVELS = %i[public internal confidential restricted].freeze
 
   PII_PATTERNS = {
     ssn: /\b\d{3}-\d{2}-\d{4}\b/,
@@ -105,7 +105,7 @@ module Legion
 
     { level: level.to_sym }
   rescue StandardError => e
-    handle_exception(e, level: :
+    handle_exception(e, level: :warn, operation: 'llm.pipeline.steps.classification.default')
     nil
   end
 end
data/lib/legion/llm/routes.rb
CHANGED
|
@@ -15,6 +15,93 @@ require 'legion/logging/helper'
 module Legion
   module LLM
     module Routes
+      # Mixin for dynamically-built client tool classes — keeps build_client_tool_class small.
+      module ClientToolMethods
+        private
+
+        def log_tool(level, ref, status, **details)
+          return unless defined?(Legion::Logging)
+
+          parts = ["[tool][#{ref}] #{status}"]
+          details.each { |k, v| parts << "#{k}=#{v}" }
+          Legion::Logging.send(level, parts.join(' '))
+        end
+
+        def summarize_tool_arg_keys(kwargs)
+          kwargs.keys.map(&:to_s).sort.join(',')
+        end
+
+        def summarize_tool_args(ref, kwargs)
+          case ref
+          when 'sh'
+            { args: summarize_tool_arg_keys(kwargs), command_provided: kwargs.key?(:command) || kwargs.key?(:cmd) || !kwargs.empty? }
+          when 'file_write'
+            content = kwargs[:content] || kwargs[:contents]
+            { args: summarize_tool_arg_keys(kwargs), bytes: content.to_s.bytesize }
+          when 'file_edit'
+            { args: summarize_tool_arg_keys(kwargs),
+              old_len: kwargs[:old_text].to_s.length, new_len: kwargs[:new_text].to_s.length }
+          else
+            { args: summarize_tool_arg_keys(kwargs) }
+          end
+        end
+
+        def dispatch_client_tool(ref, **kwargs)
+          case ref
+          when 'sh'
+            cmd = kwargs[:command] || kwargs[:cmd] || kwargs.values.first.to_s
+            output, status = ::Open3.capture2e(cmd, chdir: Dir.pwd)
+            "exit=#{status.exitstatus}\n#{output}"
+          when 'file_read'
+            path = kwargs[:path] || kwargs[:file_path] || kwargs.values.first.to_s
+            ::File.exist?(path) ? ::File.read(path, encoding: 'utf-8') : "File not found: #{path}"
+          when 'file_write'
+            path = kwargs[:path] || kwargs[:file_path]
+            content = kwargs[:content] || kwargs[:contents]
+            ::File.write(path, content)
+            "Written #{content.to_s.bytesize} bytes to #{path}"
+          when 'file_edit'
+            path = kwargs[:path] || kwargs[:file_path]
+            old_text = kwargs[:old_text] || kwargs[:search]
+            new_text = kwargs[:new_text] || kwargs[:replace]
+            content = ::File.read(path, encoding: 'utf-8')
+            content.sub!(old_text, new_text)
+            ::File.write(path, content)
+            "Edited #{path}"
+          when 'list_directory'
+            path = ::File.expand_path(kwargs[:path] || kwargs[:dir] || Dir.pwd)
+            Dir.entries(path).reject { |e| e.start_with?('.') }.sort.join("\n")
+          when 'grep'
+            pattern = kwargs[:pattern] || kwargs[:query] || kwargs.values.first.to_s
+            path = kwargs[:path] || Dir.pwd
+            output, = ::Open3.capture2e('grep', '-rn', '--include=*.rb', pattern, path)
+            output.lines.first(50).join
+          when 'glob'
+            pattern = kwargs[:pattern] || kwargs.values.first.to_s
+            Dir.glob(pattern).first(100).join("\n")
+          when 'web_fetch'
+            url = kwargs[:url] || kwargs.values.first.to_s
+            require 'net/http'
+            uri = URI(url)
+            Net::HTTP.get(uri)
+          else
+            "Tool #{ref} is not executable server-side. Use a legion_ prefixed tool instead."
+          end
+        end
+
+        def notify_tool_event(type, ref, **data)
+          handler = Thread.current[:legion_tool_event_handler]
+          return unless handler
+
+          handler.call(
+            type: type,
+            tool_call_id: Thread.current[:legion_current_tool_call_id],
+            tool_name: ref,
+            **data
+          )
+        end
+      end
+
       def self.registered(app) # rubocop:disable Metrics/CyclomaticComplexity,Metrics/PerceivedComplexity,Metrics/AbcSize,Metrics/MethodLength
         app.helpers do # rubocop:disable Metrics/BlockLength
           include Legion::Logging::Helper
@@ -31,7 +118,7 @@ module Legion
 begin
   parsed = Legion::JSON.load(raw)
 rescue StandardError => e
-  handle_exception(e, level: :
+  handle_exception(e, level: :warn, operation: 'llm.routes.parse_request_body')
   halt 400, { 'Content-Type' => 'application/json' },
        Legion::JSON.dump({ error: { code: 'invalid_json', message: 'request body is not valid JSON' } })
 end
@@ -140,55 +227,31 @@ module Legion
   end
 end
 
-# rubocop:disable Metrics/BlockLength
 define_method(:build_client_tool_class) do |tname, tdesc, tschema|
+  tool_ref = tname
   klass = Class.new(RubyLLM::Tool) do
+    include Legion::LLM::Routes::ClientToolMethods
+
     description tdesc
-    define_method(:name) {
-    tool_ref = tname
+    define_method(:name) { tool_ref }
 
     define_method(:execute) do |**kwargs|
-
-
-
-
-
-
-
-
-      when 'file_write'
-        path = kwargs[:path] || kwargs[:file_path]
-        content = kwargs[:content] || kwargs[:contents]
-        ::File.write(path, content)
-        "Written #{content.to_s.bytesize} bytes to #{path}"
-      when 'file_edit'
-        path = kwargs[:path] || kwargs[:file_path]
-        old_text = kwargs[:old_text] || kwargs[:search]
-        new_text = kwargs[:new_text] || kwargs[:replace]
-        content = ::File.read(path, encoding: 'utf-8')
-        content.sub!(old_text, new_text)
-        ::File.write(path, content)
-        "Edited #{path}"
-      when 'list_directory'
-        path = kwargs[:path] || kwargs[:dir] || Dir.pwd
-        Dir.entries(path).reject { |e| e.start_with?('.') }.sort.join("\n")
-      when 'grep'
-        pattern = kwargs[:pattern] || kwargs[:query] || kwargs.values.first.to_s
-        path = kwargs[:path] || Dir.pwd
-        output, = ::Open3.capture2e('grep', '-rn', '--include=*.rb', pattern, path)
-        output.lines.first(50).join
-      when 'glob'
-        pattern = kwargs[:pattern] || kwargs.values.first.to_s
-        Dir.glob(pattern).first(100).join("\n")
-      when 'web_fetch'
-        url = kwargs[:url] || kwargs.values.first.to_s
-        require 'net/http'
-        uri = URI(url)
-        Net::HTTP.get(uri)
-      else
-        "Tool #{tool_ref} is not executable server-side. Use a legion_ prefixed tool instead."
-      end
+      summary = summarize_tool_args(tool_ref, kwargs)
+      log_tool(:info, tool_ref, 'executing', **summary)
+      t0 = ::Process.clock_gettime(::Process::CLOCK_MONOTONIC)
+      result = dispatch_client_tool(tool_ref, **kwargs)
+      ms = ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - t0) * 1000).round(1)
+      log_tool(:info, tool_ref, 'completed', duration_ms: ms, result_size: result.to_s.bytesize)
+      notify_tool_event(:tool_result, tool_ref, result: result.to_s[0, 4096])
+      result
     rescue StandardError => e
+      ms = begin
+        ((::Process.clock_gettime(::Process::CLOCK_MONOTONIC) - t0) * 1000).round(1)
+      rescue StandardError
+        nil
+      end
+      log_tool(:error, tool_ref, 'failed', duration_ms: ms, error: e.message)
+      notify_tool_event(:tool_error, tool_ref, error: e.message)
       if defined?(Legion::Logging) && Legion::Logging.respond_to?(:log_exception)
         Legion::Logging.log_exception(e, payload_summary: "client tool #{tool_ref} failed", component_type: :api)
       end
@@ -201,7 +264,6 @@ module Legion
   handle_exception(e, level: :warn, operation: "llm.routes.build_client_tool_class.#{tname}")
   nil
 end
-# rubocop:enable Metrics/BlockLength
 
 define_method(:extract_tool_calls) do |pipeline_response|
   tools_data = pipeline_response.tools
@@ -217,10 +279,12 @@ module Legion
 end
 
 define_method(:emit_sse_event) do |stream, event_name, payload|
+  level = event_name == 'text-delta' ? :debug : :info
+  log.send(level, "[sse][emit] event=#{event_name} keys=#{payload.is_a?(Hash) ? payload.keys.join(',') : 'n/a'}")
   stream << "event: #{event_name}\ndata: #{Legion::JSON.dump(payload)}\n\n"
 end
 
-define_method(:emit_timeline_tool_events) do |stream, pipeline_response|
+define_method(:emit_timeline_tool_events) do |stream, pipeline_response, skip_tool_results: false|
   timeline = Array(pipeline_response.timeline)
   timeline.each do |event|
     key = event[:key].to_s
@@ -230,6 +294,9 @@ module Legion
 next if name.to_s.empty?
 
 if key.start_with?('tool:result:')
+  # Skip replay when real-time tool events already emitted these during streaming
+  next if skip_tool_results
+
   event_name = data[:status].to_s == 'error' ? 'tool-error' : 'tool-result'
   emit_sse_event(stream, event_name, {
     toolCallId: data[:tool_call_id],
@@ -520,6 +587,35 @@ module Legion
 # rubocop:disable Metrics/BlockLength
 stream do |out|
   full_text = +''
+
+  executor.tool_event_handler = lambda { |event|
+    log.info("[inference][tool-event] type=#{event[:type]} tool=#{event[:tool_name]} id=#{event[:tool_call_id]}")
+    case event[:type]
+    when :tool_call
+      emit_sse_event(out, 'tool-call', {
+        toolCallId: event[:tool_call_id],
+        toolName: event[:tool_name],
+        args: event[:arguments],
+        timestamp: Time.now.utc.iso8601
+      })
+    when :tool_result
+      emit_sse_event(out, 'tool-result', {
+        toolCallId: event[:tool_call_id],
+        toolName: event[:tool_name],
+        result: event[:result],
+        timestamp: Time.now.utc.iso8601
+      })
+    when :tool_error
+      emit_sse_event(out, 'tool-error', {
+        toolCallId: event[:tool_call_id],
+        toolName: event[:tool_name],
+        result: event[:error],
+        status: 'error',
+        timestamp: Time.now.utc.iso8601
+      })
+    end
+  }
+
   pipeline_response = executor.call_stream do |chunk|
     text = chunk.respond_to?(:content) ? chunk.content.to_s : chunk.to_s
     next if text.empty?
@@ -528,16 +624,7 @@ module Legion
   emit_sse_event(out, 'text-delta', { delta: text })
 end
 
-
-  emit_sse_event(out, 'tool-call', {
-    toolCallId: tool_call[:id],
-    toolName: tool_call[:name],
-    args: tool_call[:arguments],
-    timestamp: Time.now.utc.iso8601
-  })
-end
-
-emit_timeline_tool_events(out, pipeline_response)
+emit_timeline_tool_events(out, pipeline_response, skip_tool_results: !executor.tool_event_handler.nil?)
 
 enrichments = pipeline_response.enrichments
 emit_sse_event(out, 'enrichment', enrichments) if enrichments.is_a?(Hash) && !enrichments.empty?
data/lib/legion/llm/version.rb
CHANGED