llm_meta_client 1.0.1 → 1.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/CHANGELOG.md +49 -0
- data/app/assets/stylesheets/llm_meta_client/generation_settings.css +15 -0
- data/lib/generators/llm_meta_client/scaffold/scaffold_generator.rb +5 -0
- data/lib/generators/llm_meta_client/scaffold/templates/app/controllers/chat_streams_controller.rb +92 -0
- data/lib/generators/llm_meta_client/scaffold/templates/app/controllers/chats_controller.rb +13 -24
- data/lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/chats_form_controller.js +15 -1
- data/lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/generation_settings_controller.js +71 -0
- data/lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/message_stream_controller.js +101 -0
- data/lib/generators/llm_meta_client/scaffold/templates/app/models/chat.rb +53 -15
- data/lib/generators/llm_meta_client/scaffold/templates/app/views/chats/_chat_sidebar.html.erb +8 -0
- data/lib/generators/llm_meta_client/scaffold/templates/app/views/chats/_streaming_message.html.erb +7 -0
- data/lib/generators/llm_meta_client/scaffold/templates/app/views/chats/create.turbo_stream.erb +4 -6
- data/lib/generators/llm_meta_client/scaffold/templates/app/views/chats/update.turbo_stream.erb +3 -3
- data/lib/generators/llm_meta_client/scaffold/templates/app/views/shared/_generation_settings_field.html.erb +3 -1
- data/lib/llm_meta_client/server_query.rb +149 -1
- data/lib/llm_meta_client/version.rb +1 -1
- metadata +14 -4
checksums.yaml
CHANGED

```diff
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 5529e0613f2103802abfbf448ef5d030f63a7788e8fde7440d732fa94ced7378
+  data.tar.gz: 38223a0a3ff727e9a649fc38db922e5f89aaaa858118464ef9d410a727fff5f1
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 2d86c22b05ff9991ff7087a370de97fc90adaf766f2a13745511d2ce7beb129828e169cbeb601b9d2996836f62fb79ca398a66a2e1fbd09a20cf78c148d5fb8a
+  data.tar.gz: 55cc853db63cca50ace6693e4b4a124ee2923371fd6709b2ef1d92e3fdcb4bc62ad433533c2e832dfff132d6ab91bf608847d583d0fdc1104beb38c54597db8d
```
data/CHANGELOG.md
CHANGED

```diff
@@ -5,6 +5,55 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [1.2.0] - 2026-05-10
+
+### Added
+
+- End-to-end SSE streaming for chat completions:
+  - `ServerQuery#stream` consumes SSE from the new `chat_streams` endpoint on `llm_meta_server` and yields parsed events. Returns the assembled content; absorbs upstream `done` markers and raises `ServerError` on upstream `error` events.
+  - `Chat#stream_assistant_response` and `Chat#finalize_streamed_response` for streaming generation with persistence at stream close (assistant message saved only on success — disconnects mid-stream don't persist).
+  - Scaffold now generates `ChatStreamsController`, a `_streaming_message.html.erb` partial, a `message_stream_controller.js` Stimulus controller, and a shared `_chat_sidebar.html.erb` partial. Routes add `resource :stream` nested under `chats`.
+  - Streaming bubble swaps to the host-rendered `_message` partial on save, so any markdown / syntax-highlighting customization in the host's `_message.html.erb` applies post-stream.
+  - The `event: title` SSE event includes a turbo_stream snippet that updates the chat-sidebar in place when a brand-new chat gets its auto-generated title.
+
+### Changed
+
+- `ServerQuery` error messages parse the JSON response body from `llm_meta_server` and surface friendlier text (rate limits, auth errors, upstream unavailable) instead of bare `HTTP <code>`.
+- Streaming endpoint v1 does not pass `tool_ids`. The synchronous `#call` path is unchanged and still supports tool calls.
+
+### Notes
+
+- The streaming endpoint requires `llm_meta_server` with the matching `chat_streams` route. SSE delivery through reverse proxies needs `proxy_buffering off` (nginx) or `flushpackets=on` + `SetEnv no-gzip 1` (Apache).
+
+## [1.1.1] - 2026-04-22
+
+### Added
+
+- `ServerQuery#call` now surfaces tool calls from the LLM server response. When the response includes a `tool_calls` array, a markdown-formatted "Tool calls" section (name + JSON args) is appended to the returned content (separated by a horizontal rule). This lets host apps display which tools the LLM invoked without any schema or view changes; existing markdown renderers pick it up automatically. Previously, tool calls were silently dropped.
+
+## [1.1.0] - 2026-04-22
+
+### Changed
+
+- Widen the `prompt_navigator` dependency constraint from `~> 1.0` to `>= 1.0, < 3.0` so host apps can opt into `prompt_navigator` 2.0 (which requires Ruby 3.4.9+ and adds `PromptExecution.delete_set!`). Existing hosts on `prompt_navigator` 1.x keep resolving unchanged.
+
+## [1.0.2] - 2026-03-27
+
+### Added
+
+- Add client-side validation for Generation Settings JSON
+
+## [1.0.1] - 2026-03-25
+
+### Fixed
+
+- Fix: normalize Ollama llm_type in server resource options
+- Fix: update branch_from_uuid after LLM response
+
+### Changed
+
+- Refactor: move llm_uuid and model from Chat to PromptExecution
+
 ## [1.0.0] - 2026-03-25
 
 ### Changed
```
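The 1.2.0 entry above introduces the streaming API surface. As a minimal sketch of how a host app might drive `ServerQuery#stream` directly, based on the signature shown in the server_query.rb diff below (all variable names here are illustrative, not from the gem):

```ruby
# Minimal sketch. jwt_token, llm_uuid, model_id, context, and prompt are assumed to exist.
query = LlmMetaClient::ServerQuery.new
begin
  assembled = query.stream(jwt_token, llm_uuid, model_id, context, prompt) do |event|
    # Each yielded delta event looks like { event: "message", data: { "delta" => "..." } };
    # upstream "done" markers are absorbed before the method returns.
    print event[:data]["delta"]
  end
  # assembled now holds the full concatenated response text.
rescue LlmMetaClient::Exceptions::ServerError => e
  warn "stream failed: #{e.message}"
end
```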
data/app/assets/stylesheets/llm_meta_client/generation_settings.css
CHANGED

```diff
@@ -62,6 +62,21 @@
   }
 }
 
+.generation-settings-json-input--invalid {
+  border-color: #ef4444;
+
+  &:focus {
+    border-color: #ef4444;
+    box-shadow: 0 0 0 3px rgba(239, 68, 68, 0.1);
+  }
+}
+
+.generation-settings-error {
+  font-size: 12px;
+  color: #ef4444;
+  margin-top: 4px;
+}
+
 .generation-settings-hint {
   font-size: 11px;
   color: #9ca3af;
```
data/lib/generators/llm_meta_client/scaffold/scaffold_generator.rb
CHANGED

```diff
@@ -19,6 +19,7 @@ module LlmMetaClient
 
     def create_controllers
      template "app/controllers/chats_controller.rb"
+      template "app/controllers/chat_streams_controller.rb"
       template "app/controllers/prompts_controller.rb"
       template "app/controllers/api/mcp_servers_controller.rb"
     end

@@ -29,6 +30,8 @@ module LlmMetaClient
       template "app/views/chats/create.turbo_stream.erb"
       template "app/views/chats/update.turbo_stream.erb"
       template "app/views/chats/_message.html.erb"
+      template "app/views/chats/_streaming_message.html.erb"
+      template "app/views/chats/_chat_sidebar.html.erb"
       template "app/views/chats/_messages_list.html.erb"
       template "app/views/shared/_family_field.html.erb"
       template "app/views/shared/_api_key_field.html.erb"

@@ -46,6 +49,7 @@ module LlmMetaClient
       template "app/javascript/controllers/chat_title_edit_controller.js"
       template "app/javascript/controllers/tool_selector_controller.js"
       template "app/javascript/controllers/generation_settings_controller.js"
+      template "app/javascript/controllers/message_stream_controller.js"
       copy_file "app/javascript/popover.js"
     end
 
@@ -73,6 +77,7 @@ module LlmMetaClient
         patch :update_title
         get :download_csv
       end
+      resource :stream, only: [ :show ], controller: "chat_streams"
     end
     resources :prompts, only: [ :show ]
 
```
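For orientation: under standard Rails routing conventions, the singular `resource :stream` nested inside `resources :chats` should map `GET /chats/:chat_id/stream` to `ChatStreamsController#show` and generate a `chat_stream_path` helper, the same helper used by the `_streaming_message.html.erb` partial below. A sketch (the concrete URL shape is an assumption from those conventions, not shown in the diff):

```ruby
# Hypothetical usage of the generated route helper; extra keys become query params.
chat_stream_path(
  chat_id: chat.uuid,
  execution_id: prompt_execution.execution_id
)
# => "/chats/<chat-uuid>/stream?execution_id=<execution-id>" (assumed shape)
```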
data/lib/generators/llm_meta_client/scaffold/templates/app/controllers/chat_streams_controller.rb
ADDED

```diff
@@ -0,0 +1,92 @@
+class ChatStreamsController < ApplicationController
+  include ActionController::Live
+
+  skip_before_action :authenticate_user!, raise: false
+  skip_before_action :verify_authenticity_token
+
+  def show
+    response.headers["Content-Type"] = "text/event-stream"
+    response.headers["Cache-Control"] = "no-cache"
+    response.headers["X-Accel-Buffering"] = "no"
+
+    chat = find_chat
+    prompt_execution = PromptNavigator::PromptExecution.find_by!(execution_id: params[:execution_id])
+    unless chat.messages.exists?(prompt_navigator_prompt_execution_id: prompt_execution.id)
+      raise ActiveRecord::RecordNotFound
+    end
+
+    jwt_token = current_user.id_token if user_signed_in?
+    generation_settings = parse_generation_settings(params[:generation_settings_json])
+
+    assembled = chat.stream_assistant_response(prompt_execution, jwt_token, generation_settings: generation_settings) do |event|
+      forward(event)
+    end
+
+    if assembled.present?
+      assistant_message = chat.finalize_streamed_response(prompt_execution, assembled, jwt_token)
+      if assistant_message
+        forward(event: "saved", data: {
+          message_id: assistant_message.id,
+          execution_id: prompt_execution.execution_id,
+          html: view_context.render(partial: "chats/message", locals: { message: assistant_message })
+        })
+      end
+
+      title_before = chat.title
+      chat.generate_title(prompt_execution.prompt, jwt_token)
+      if chat.reload.title.present? && chat.title != title_before
+        forward(event: "title", data: {
+          title: chat.title,
+          chat_uuid: chat.uuid,
+          turbo_stream: render_sidebar_update(chat)
+        })
+      end
+    end
+
+    forward(event: "done", data: {})
+  rescue ActionController::Live::ClientDisconnected
+    Rails.logger.info "[ChatStream] client disconnected"
+  rescue ActiveRecord::RecordNotFound
+    forward(event: "error", data: { code: "not_found", message: "Chat or prompt execution not found" }) rescue nil
+  rescue StandardError => e
+    Rails.logger.error "[ChatStream] #{e.class}: #{e.message}"
+    forward(event: "error", data: { code: e.class.name, message: e.message }) rescue nil
+  ensure
+    response.stream.close
+  end
+
+  private
+
+  def find_chat
+    scope = user_signed_in? ? current_user.chats : Chat.where(user_id: nil)
+    scope.find_by!(uuid: params[:chat_id])
+  end
+
+  def forward(event)
+    name = event[:event]
+    payload = event[:data].to_json
+    if name.nil? || name == "message"
+      response.stream.write "data: #{payload}\n\n"
+    else
+      response.stream.write "event: #{name}\ndata: #{payload}\n\n"
+    end
+  end
+
+  def parse_generation_settings(raw)
+    return {} if raw.blank?
+    parsed = JSON.parse(raw)
+    parsed.is_a?(Hash) ? parsed.symbolize_keys : {}
+  rescue JSON::ParserError
+    {}
+  end
+
+  def render_sidebar_update(chat)
+    initialize_chat(user_signed_in? ? current_user.chats : nil)
+    add_chat(chat)
+    view_context.turbo_stream.replace(
+      "chat-sidebar",
+      partial: "chats/chat_sidebar",
+      locals: { chat: chat }
+    ).to_s
+  end
+end
```
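For reference, `#forward` emits standard SSE framing: an optional `event:` name line, a `data:` line carrying a JSON payload, and a blank-line terminator. Illustrative frames (the payload values here are made up):

```ruby
# What #forward writes to response.stream for each event type (illustrative values):
response.stream.write "data: {\"delta\":\"Hel\"}\n\n"                # unnamed => default "message" event
response.stream.write "event: saved\ndata: {\"message_id\":42}\n\n"  # named event
response.stream.write "event: done\ndata: {}\n\n"                    # end-of-stream marker
```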
data/lib/generators/llm_meta_client/scaffold/templates/app/controllers/chats_controller.rb
CHANGED

```diff
@@ -70,9 +70,10 @@ class ChatsController < ApplicationController
     initialize_history @chat&.ordered_by_descending_prompt_executions
 
     if params[:message].present?
-      # Validate generation settings before proceeding
+      # Validate generation settings before proceeding (raises if invalid).
+      # The streaming controller re-parses them from the URL.
       begin
-
+        generation_settings_param
       rescue InvalidGenerationSettingsError => e
         @error_message = e.message
         respond_to do |format|

@@ -92,15 +93,10 @@ class ChatsController < ApplicationController
     # Set active message UUID for highlighting in UI
     set_active_message_uuid(@prompt_execution&.execution_id || params.dig(:chat, :branch_from_uuid))
 
-    #
-
-
-
-    @chat.generate_title(params[:message], jwt_token)
-  rescue StandardError => e
-    Rails.logger.error "Error in chat response: #{e.class} - #{e.message}\n#{e.backtrace&.join("\n")}"
-    @error_message = "An error occurred while getting the response. Please try again."
-  end
+    # The assistant response is streamed by ChatStreamsController (SSE).
+    # The streaming bubble is rendered by create.turbo_stream.erb and opens
+    # the EventSource on connect; persistence + title gen happen at stream close.
+    @generation_settings_json = params[:generation_settings_json]
   end
 
   # Return turbo stream to render both messages

@@ -167,9 +163,10 @@ class ChatsController < ApplicationController
     initialize_history @chat&.ordered_by_descending_prompt_executions
 
     if params[:message].present?
-      # Validate generation settings before proceeding
+      # Validate generation settings before proceeding (raises if invalid).
+      # The streaming controller re-parses them from the URL.
       begin
-
+        generation_settings_param
       rescue InvalidGenerationSettingsError => e
         @error_message = e.message
         respond_to do |format|

@@ -189,13 +186,9 @@ class ChatsController < ApplicationController
     # Set active message UUID for highlighting in UI
     set_active_message_uuid(@prompt_execution&.execution_id || params.dig(:chat, :branch_from_uuid))
 
-    #
-
-
-  rescue StandardError => e
-    Rails.logger.error "Error in chat response: #{e.class} - #{e.message}\n#{e.backtrace&.join("\n")}"
-    @error_message = "An error occurred while getting the response. Please try again."
-  end
+    # The assistant response is streamed by ChatStreamsController (SSE).
+    # See create action for details.
+    @generation_settings_json = params[:generation_settings_json]
   end
 
   # Return turbo stream to render both messages

@@ -207,10 +200,6 @@ class ChatsController < ApplicationController
 
   private
 
-  def tool_ids_param
-    params[:tool_ids].presence || []
-  end
-
   ALLOWED_GENERATION_KEYS = %w[temperature top_k top_p max_tokens repeat_penalty].freeze
 
   class InvalidGenerationSettingsError < StandardError; end
```
data/lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/chats_form_controller.js
CHANGED

```diff
@@ -13,7 +13,15 @@ export default class extends Controller {
   }
 
   // Handle form submission to show user message immediately
-  submit() {
+  submit(event) {
+    // Check generation settings validity before submitting
+    const gsController = this.#generationSettingsController()
+    if (gsController && !gsController.isValid) {
+      event.preventDefault()
+      gsController.validate()
+      return
+    }
+
     // Don't prevent default - let Turbo handle the form submission
     // Just add the user message to the DOM immediately
     const messageContent = this.promptTarget.value.trim()

@@ -64,6 +72,12 @@ export default class extends Controller {
     }
   }
 
+  #generationSettingsController() {
+    const el = this.element.querySelector('[data-controller*="generation-settings"]')
+    if (!el) return null
+    return this.application.getControllerForElementAndIdentifier(el, "generation-settings")
+  }
+
   #canSubmit() {
     // Text field and prompt field can be validated using HTML5's required attribute,
     // so we delegate to checkValidity() to utilize standard validation
```
data/lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/generation_settings_controller.js
CHANGED

```diff
@@ -1,5 +1,7 @@
 import { Controller } from "@hotwired/stimulus"
 
+const ALLOWED_KEYS = ["temperature", "top_k", "top_p", "max_tokens", "repeat_penalty"]
+
 // Connects to data-controller="generation-settings"
 export default class extends Controller {
   static targets = [

@@ -7,6 +9,7 @@ export default class extends Controller {
     "toggleIcon",
     "panel",
     "jsonInput",
+    "error",
   ]
 
   connect() {

@@ -24,4 +27,72 @@
       this.toggleIconTarget.classList.toggle("bi-chevron-up", this.expanded)
     }
   }
+
+  validate() {
+    const input = this.jsonInputTarget.value.trim()
+
+    if (!input) {
+      this.#clearError()
+      return
+    }
+
+    let parsed
+    try {
+      parsed = JSON.parse(input)
+    } catch (e) {
+      this.#showError("Invalid JSON syntax")
+      return
+    }
+
+    if (typeof parsed !== "object" || Array.isArray(parsed) || parsed === null) {
+      this.#showError("Must be a JSON object (e.g. {\"temperature\": 0.7})")
+      return
+    }
+
+    const unknownKeys = Object.keys(parsed).filter(k => !ALLOWED_KEYS.includes(k))
+    if (unknownKeys.length > 0) {
+      this.#showError(`Unknown keys: ${unknownKeys.join(", ")}`)
+      return
+    }
+
+    const nonNumeric = Object.entries(parsed).filter(([, v]) => typeof v !== "number")
+    if (nonNumeric.length > 0) {
+      this.#showError(`Values must be numeric: ${nonNumeric.map(([k]) => k).join(", ")}`)
+      return
+    }
+
+    this.#clearError()
+  }
+
+  get isValid() {
+    if (!this.hasJsonInputTarget) return true
+    const input = this.jsonInputTarget.value.trim()
+    if (!input) return true
+
+    try {
+      const parsed = JSON.parse(input)
+      if (typeof parsed !== "object" || Array.isArray(parsed) || parsed === null) return false
+      if (Object.keys(parsed).some(k => !ALLOWED_KEYS.includes(k))) return false
+      if (Object.values(parsed).some(v => typeof v !== "number")) return false
+      return true
+    } catch {
+      return false
+    }
+  }
+
+  #showError(message) {
+    if (this.hasErrorTarget) {
+      this.errorTarget.textContent = message
+      this.errorTarget.style.display = "block"
+    }
+    this.jsonInputTarget.classList.add("generation-settings-json-input--invalid")
+  }
+
+  #clearError() {
+    if (this.hasErrorTarget) {
+      this.errorTarget.textContent = ""
+      this.errorTarget.style.display = "none"
+    }
+    this.jsonInputTarget.classList.remove("generation-settings-json-input--invalid")
+  }
 }
```
data/lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/message_stream_controller.js
ADDED

```diff
@@ -0,0 +1,101 @@
+import { Controller } from "@hotwired/stimulus"
+
+// Connects to data-controller="message-stream"
+// Opens an EventSource on connect, appends each delta to the content target,
+// closes on `done` / `error`.
+export default class extends Controller {
+  static targets = ["content"]
+  static values = { url: String }
+
+  connect() {
+    this.completed = false
+    this.source = new EventSource(this.urlValue)
+    this.source.addEventListener("message", (e) => this.#onDelta(e))
+    this.source.addEventListener("done", () => this.#onDone())
+    this.source.addEventListener("title", (e) => this.#onTitle(e))
+    this.source.addEventListener("saved", (e) => this.#onSaved(e))
+    this.source.addEventListener("error", (e) => this.#onError(e))
+  }
+
+  disconnect() {
+    this.#close()
+  }
+
+  #onDelta(event) {
+    let delta
+    try { delta = JSON.parse(event.data).delta } catch { return }
+    if (!delta) return
+    this.contentTarget.append(delta)
+    this.#scrollToBottom()
+  }
+
+  #onTitle(event) {
+    try {
+      const data = JSON.parse(event.data)
+      if (data.turbo_stream && window.Turbo) {
+        window.Turbo.renderStreamMessage(data.turbo_stream)
+      }
+    } catch {}
+  }
+
+  #onSaved(event) {
+    try {
+      const data = JSON.parse(event.data)
+      this.element.dataset.savedExecutionId = data.execution_id
+      if (data.html) this.#swapInRenderedMessage(data.html)
+    } catch {}
+  }
+
+  // Swap the streaming bubble's role + content with the host-rendered _message
+  // partial output so any markdown / syntax highlighting / partial customizations
+  // applied on reload also apply right after the stream finishes. We don't
+  // replace the whole element — that would disconnect this controller and
+  // close the EventSource before `title` / `done` arrive.
+  #swapInRenderedMessage(html) {
+    const doc = new DOMParser().parseFromString(html, "text/html")
+    const newBubble = doc.querySelector(".message")
+    if (!newBubble) return
+
+    const newRole = newBubble.querySelector(".message-role")
+    const oldRole = this.element.querySelector(".message-role")
+    if (newRole && oldRole) oldRole.innerHTML = newRole.innerHTML
+
+    const newContent = newBubble.querySelector(".message-content")
+    if (newContent) this.contentTarget.innerHTML = newContent.innerHTML
+
+    this.element.classList.remove("streaming")
+    if (newBubble.id) this.element.id = newBubble.id
+  }
+
+  #onDone() {
+    this.completed = true
+    this.#close()
+  }
+
+  #onError(event) {
+    // EventSource fires onerror whenever the connection closes — including
+    // immediately after a clean `event: done`. Suppress those.
+    if (this.completed) {
+      this.#close()
+      return
+    }
+    let message = "Stream interrupted."
+    try { if (event.data) message = JSON.parse(event.data).message || message } catch {}
+    const errEl = document.createElement("p")
+    errEl.className = "stream-error"
+    errEl.textContent = `[error] ${message}`
+    this.contentTarget.appendChild(errEl)
+    this.#close()
+  }
+
+  #close() {
+    if (this.source && this.source.readyState !== EventSource.CLOSED) {
+      this.source.close()
+    }
+  }
+
+  #scrollToBottom() {
+    const chatMessages = document.getElementById("chat-messages")
+    if (chatMessages) chatMessages.scrollTop = chatMessages.scrollHeight
+  }
+}
```
data/lib/generators/llm_meta_client/scaffold/templates/app/models/chat.rb
CHANGED

```diff
@@ -68,6 +68,35 @@ class Chat < ApplicationRecord
     new_message
   end
 
+  # Stream the assistant response from the LLM. Yields each parsed SSE event.
+  # Returns the assembled content. Caller is responsible for persistence.
+  def stream_assistant_response(prompt_execution, jwt_token, generation_settings: {}, &block)
+    summarized_context, prompt = build_streaming_context(prompt_execution, jwt_token)
+    LlmMetaClient::ServerQuery.new.stream(
+      jwt_token,
+      prompt_execution.llm_uuid,
+      prompt_execution.model,
+      summarized_context,
+      prompt,
+      generation_settings: generation_settings,
+      &block
+    )
+  end
+
+  # Persist the streamed assistant response. Skips persistence if content is blank.
+  def finalize_streamed_response(prompt_execution, content, jwt_token)
+    return nil if content.blank?
+
+    prompt_execution.update!(
+      llm_platform: resolve_llm_type(prompt_execution.llm_uuid, jwt_token),
+      response: content
+    )
+    messages.create!(
+      role: "assistant",
+      prompt_navigator_prompt_execution: prompt_execution
+    )
+  end
+
   # Get all messages in order
   def ordered_messages
     messages

@@ -115,31 +144,40 @@ class Chat < ApplicationRecord
 
   # Send messages to LLM and get response
   def send_to_llm(prompt_execution, jwt_token, tool_ids: [], generation_settings: {})
-
-
+    summarized_context, prompt = build_streaming_context(prompt_execution, jwt_token)
+    LlmMetaClient::ServerQuery.new.call(
+      jwt_token,
+      prompt_execution.llm_uuid,
+      prompt_execution.model,
+      summarized_context,
+      prompt,
+      tool_ids: tool_ids,
+      generation_settings: generation_settings
+    )
+  end
 
-
+  # Build the (summarized_context, prompt) tuple for an LLM call.
+  # Shared by both the synchronous and streaming paths.
+  def build_streaming_context(prompt_execution, jwt_token)
     llm_options = LlmMetaClient::ServerResource.available_llm_options(jwt_token)
-
-    # Error if no LLM is available
     raise LlmMetaClient::Exceptions::OllamaUnavailableError, "No LLM available" if llm_options.empty?
 
-    # Build prompt and context from direct lineage via PromptExecution
     last_msg = ordered_messages.last
     pe = last_msg.prompt_navigator_prompt_execution
-
     prompt = { role: last_msg.role, prompt: pe.prompt }
     context = pe.build_context(limit: Rails.configuration.summarize_conversation_count)
 
-
-
-
-
-
-
+    summarized_context =
+      if context.empty?
+        "No context available."
+      else
+        LlmMetaClient::ServerQuery.new.call(
+          jwt_token, prompt_execution.llm_uuid, prompt_execution.model,
+          context, "Please summarize the context"
+        )
+      end
     summarized_context += "Additional prompt: Responses from the assistant must consist solely of the response body."
 
-
-    LlmMetaClient::ServerQuery.new.call(jwt_token, llm_uuid, model, summarized_context, prompt, tool_ids: tool_ids, generation_settings: generation_settings)
+    [ summarized_context, prompt ]
   end
 end
```
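Taken together, the two new model methods split the old one-shot send/persist step into stream-then-persist. A minimal sketch of the calling pattern, mirroring what `ChatStreamsController#show` does above (variable setup assumed):

```ruby
# Stream first; persist only if content actually arrived.
assembled = chat.stream_assistant_response(prompt_execution, jwt_token) do |event|
  print event[:data]["delta"] if event[:event] == "message"
end

# Persistence happens at stream close and only on success:
# a mid-stream disconnect leaves no assistant message saved.
chat.finalize_streamed_response(prompt_execution, assembled, jwt_token) if assembled.present?
```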
data/lib/generators/llm_meta_client/scaffold/templates/app/views/chats/_streaming_message.html.erb
ADDED

```diff
@@ -0,0 +1,7 @@
+<%% stream_url = chat_stream_path(chat_id: chat.uuid, execution_id: prompt_execution.execution_id, generation_settings_json: @generation_settings_json.presence) %>
+<div class="message assistant streaming"
+     data-controller="message-stream"
+     data-message-stream-url-value="<%%= stream_url %>">
+  <div class="message-role">🤖 streaming…</div>
+  <div class="message-content" data-message-stream-target="content"></div>
+</div>
```
data/lib/generators/llm_meta_client/scaffold/templates/app/views/chats/create.turbo_stream.erb
CHANGED

```diff
@@ -5,10 +5,10 @@
 <%% # User message is already shown by JavaScript on form submit %>
 <%% # Only render assistant message here %>
 
-
-<%% if @
+<%%# Render streaming assistant placeholder; the message-stream Stimulus controller opens an EventSource and appends deltas as they arrive. %>
+<%% if @prompt_execution && @error_message.blank? %>
   <%%= turbo_stream.append "messages-list" do %>
-    <%%= render partial: "chats/
+    <%%= render partial: "chats/streaming_message", locals: { chat: @chat, prompt_execution: @prompt_execution } %>
   <%% end %>
 <%% end %>
 

@@ -25,9 +25,7 @@
 
 <%% # Update chat sidebar %>
 <%%= turbo_stream.replace "chat-sidebar" do %>
-
-  <%%= chat_list(->(id) { chat_path(id) }, active_uuid: @chat&.uuid, download_csv_path: ->(id) { download_csv_chat_path(id) }, download_all_csv_path: download_all_csv_chats_path) %>
-  </div>
+  <%%= render partial: "chats/chat_sidebar", locals: { chat: @chat } %>
 <%% end %>
 
 <%% # Update history sidebar - replace entire content to ensure update %>
```
data/lib/generators/llm_meta_client/scaffold/templates/app/views/chats/update.turbo_stream.erb
CHANGED

```diff
@@ -5,10 +5,10 @@
 <%% # User message is already shown by JavaScript on form submit %>
 <%% # Only render assistant message here %>
 
-
-<%% if @
+<%%# Render streaming assistant placeholder; the message-stream Stimulus controller opens an EventSource and appends deltas as they arrive. %>
+<%% if @prompt_execution && @error_message.blank? %>
   <%%= turbo_stream.append "messages-list" do %>
-    <%%= render partial: "chats/
+    <%%= render partial: "chats/streaming_message", locals: { chat: @chat, prompt_execution: @prompt_execution } %>
   <%% end %>
 <%% end %>
 
```
data/lib/generators/llm_meta_client/scaffold/templates/app/views/shared/_generation_settings_field.html.erb
CHANGED

```diff
@@ -20,7 +20,9 @@
       class="generation-settings-json-input"
       rows="8"
       placeholder='{"temperature": 0.7, "top_k": 40, "top_p": 0.9, "max_tokens": 4096, "repeat_penalty": 1.1}'
-      data-<%%= stimulus_controller %>-target="jsonInput"
+      data-<%%= stimulus_controller %>-target="jsonInput"
+      data-action="input-><%%= stimulus_controller %>#validate"></textarea>
+    <div class="generation-settings-error" data-<%%= stimulus_controller %>-target="error" style="display: none;"></div>
     <div class="generation-settings-hint">
       Available keys: temperature, top_k, top_p, max_tokens, repeat_penalty
     </div>
```
data/lib/llm_meta_client/server_query.rb
CHANGED

```diff
@@ -1,5 +1,39 @@
+require "net/http"
+require "uri"
+require "json"
+
 module LlmMetaClient
   class ServerQuery
+    # Stream LLM responses incrementally. Yields each content delta event
+    # ({ event: "message", data: { "delta" => "..." } }) to the caller's block.
+    # Upstream "done" markers are absorbed (end-of-stream is signaled by the
+    # block returning); upstream "error" events raise ServerError.
+    # Returns the assembled content string. Tool calls are not supported here.
+    def stream(id_token, api_key_uuid, model_id, context, user_content, generation_settings: {})
+      context_and_user_content = "Context:#{context}, User Prompt: #{user_content}"
+      debug_log "Streaming request to LLM: \n===>\n#{context_and_user_content}\n===>"
+
+      body = { prompt: context_and_user_content }
+      body[:generation_settings] = generation_settings if generation_settings.present?
+
+      assembled = +""
+      request_stream(api_key_uuid, id_token, model_id, body) do |event|
+        case event[:event]
+        when "message"
+          assembled << event[:data]["delta"].to_s
+          yield event if block_given?
+        when "done"
+          # End-of-stream marker from upstream; no-op here.
+        when "error"
+          raise Exceptions::ServerError, format_stream_error(event[:data])
+        else
+          yield event if block_given?
+        end
+      end
+
+      assembled
+    end
+
     def call(id_token, api_key_uuid, model_id, context, user_content, tool_ids: [], generation_settings: {})
       debug_log "Context: #{context}"
       context_and_user_content = "Context:#{context}, User Prompt: #{user_content}"

@@ -7,13 +41,17 @@ module LlmMetaClient
 
       response = request(api_key_uuid, id_token, model_id, context_and_user_content, tool_ids, generation_settings)
 
-
+      unless response.success?
+        raise Exceptions::ServerError, build_error_message(response.code.to_i, response.parsed_response)
+      end
 
       response_body = response.parsed_response
 
       raise Exceptions::InvalidResponseError, "LLM server returned non-JSON response" unless response_body.is_a?(Hash)
 
       content = response_body.dig("response", "message") || ""
+      tool_calls = response_body.dig("response", "tool_calls")
+      content = combine_with_tool_calls(content, tool_calls) if tool_calls.is_a?(Array) && tool_calls.any?
 
       raise Exceptions::EmptyResponseError, "LLM server returned empty response" if content.blank?
 

@@ -28,6 +66,28 @@ module LlmMetaClient
       Rails.logger.info(message) if Rails.env.development?
     end
 
+    def combine_with_tool_calls(message, tool_calls)
+      tool_section = format_tool_calls(tool_calls)
+      return tool_section if message.blank?
+      "#{message}\n\n---\n\n#{tool_section}"
+    end
+
+    def format_tool_calls(tool_calls)
+      lines = [ "**Tool calls**", "" ]
+      tool_calls.each do |tc|
+        name = tc["name"] || tc[:name] || "(unknown)"
+        args = tc["arguments"] || tc[:arguments]
+        args_str =
+          case args
+          when Hash, Array then args.to_json
+          when nil then ""
+          else args.to_s
+          end
+        lines << (args_str.empty? ? "- `#{name}`" : "- `#{name}` — `#{args_str}`")
+      end
+      lines.join("\n")
+    end
+
     def request(api_key_uuid, id_token, model_id, user_content, tool_ids, generation_settings)
       headers = { "Content-Type" => "application/json" }
       headers["Authorization"] = "Bearer #{id_token}" if id_token.present?

@@ -47,5 +107,93 @@ module LlmMetaClient
     def url(api_key_uuid, model_id)
       "#{Rails.application.config.llm_service_base_url}/api/llm_api_keys/#{api_key_uuid}/models/#{model_id}/chats"
     end
+
+    def stream_url(api_key_uuid, model_id)
+      "#{Rails.application.config.llm_service_base_url}/api/llm_api_keys/#{api_key_uuid}/models/#{model_id}/chat_streams"
+    end
+
+    def request_stream(api_key_uuid, id_token, model_id, body)
+      uri = URI(stream_url(api_key_uuid, model_id))
+
+      Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == "https", read_timeout: 600) do |http|
+        req = Net::HTTP::Post.new(uri)
+        req["Content-Type"] = "application/json"
+        req["Accept"] = "text/event-stream"
+        req["Authorization"] = "Bearer #{id_token}" if id_token.present?
+        req.body = body.to_json
+
+        http.request(req) do |response|
+          unless response.is_a?(Net::HTTPSuccess)
+            body = JSON.parse(response.read_body.to_s) rescue nil
+            raise Exceptions::ServerError, build_error_message(response.code.to_i, body)
+          end
+
+          buffer = +""
+          response.read_body do |chunk|
+            buffer << chunk
+            while (boundary = buffer.index("\n\n"))
+              raw_event = buffer.slice!(0, boundary + 2)
+              parsed = parse_sse_event(raw_event)
+              yield parsed if parsed
+            end
+          end
+        end
+      end
+    end
+
+    # Format an `event: error` SSE payload from llm_meta_server into a
+    # user-facing string. Payload shape: { "code" => "rate_limit", "message" => "..." }
+    def format_stream_error(data)
+      code = data["code"]
+      message = data["message"]
+      case code
+      when "rate_limit"
+        suffix = message.present? ? ": #{message}" : ""
+        "Rate limit exceeded — check your provider plan or retry shortly#{suffix}"
+      when "api_key_required"
+        message.presence || "API key required for this model"
+      else
+        message.presence || "Upstream stream error"
+      end
+    end
+
+    # Turn a non-success HTTP response from llm_meta_server into a user-facing
+    # error string. The server returns JSON like
+    # { "error" => "LLM API Rate limit exceeded", "message" => "Too many requests" }
+    # for known error classes; fall back to a generic message otherwise.
+    def build_error_message(status_code, body)
+      if body.is_a?(Hash)
+        err = body["error"]
+        msg = body["message"]
+        return "#{err}: #{msg}" if err.present? && msg.present?
+        return err if err.present?
+        return msg if msg.present?
+      end
+      case status_code
+      when 429 then "Rate limit exceeded — check your provider plan or retry shortly (HTTP 429)"
+      when 401, 403 then "LLM service rejected the request (HTTP #{status_code}) — check your API key"
+      when 502, 503, 504 then "LLM service is unavailable (HTTP #{status_code})"
+      else "LLM server returned HTTP #{status_code}"
+      end
+    end
+
+    def parse_sse_event(raw)
+      event_name = "message"
+      data_lines = []
+      raw.each_line(chomp: true) do |line|
+        next if line.empty?
+        if line.start_with?("event:")
+          event_name = line.sub(/^event:\s*/, "")
+        elsif line.start_with?("data:")
+          data_lines << line.sub(/^data:\s*/, "")
+        end
+      end
+      return nil if data_lines.empty?
+
+      data = JSON.parse(data_lines.join("\n"))
+      { event: event_name, data: data }
+    rescue JSON::ParserError
+      nil
+    end
   end
 end
```
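To make the parser's contract concrete, here is what `parse_sse_event` returns for the frame shapes it handles (illustrative payloads; `send` is used here on the assumption the helper is not public API):

```ruby
query = LlmMetaClient::ServerQuery.new
query.send(:parse_sse_event, "data: {\"delta\":\"Hi\"}\n\n")
# => { event: "message", data: { "delta" => "Hi" } }
query.send(:parse_sse_event, "event: error\ndata: {\"code\":\"rate_limit\"}\n\n")
# => { event: "error", data: { "code" => "rate_limit" } }
query.send(:parse_sse_event, "event: done\n\n")  # no data: line
# => nil
```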
metadata
CHANGED

```diff
@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: llm_meta_client
 version: !ruby/object:Gem::Version
-  version: 1.0.1
+  version: 1.2.0
 platform: ruby
 authors:
 - dhq_boiler

@@ -47,16 +47,22 @@ dependencies:
   name: prompt_navigator
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - "~>"
+    - - ">="
       - !ruby/object:Gem::Version
         version: '1.0'
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '3.0'
   type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
-    - - "~>"
+    - - ">="
       - !ruby/object:Gem::Version
         version: '1.0'
+    - - "<"
+      - !ruby/object:Gem::Version
+        version: '3.0'
 - !ruby/object:Gem::Dependency
   name: chat_manager
   requirement: !ruby/object:Gem::Requirement

@@ -103,18 +109,22 @@ files:
 - lib/generators/llm_meta_client/authentication/templates/db/migrate/create_users.rb
 - lib/generators/llm_meta_client/scaffold/scaffold_generator.rb
 - lib/generators/llm_meta_client/scaffold/templates/app/controllers/api/mcp_servers_controller.rb
+- lib/generators/llm_meta_client/scaffold/templates/app/controllers/chat_streams_controller.rb
 - lib/generators/llm_meta_client/scaffold/templates/app/controllers/chats_controller.rb
 - lib/generators/llm_meta_client/scaffold/templates/app/controllers/prompts_controller.rb
 - lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/chat_title_edit_controller.js
 - lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/chats_form_controller.js
 - lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/generation_settings_controller.js
 - lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/llm_selector_controller.js
+- lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/message_stream_controller.js
 - lib/generators/llm_meta_client/scaffold/templates/app/javascript/controllers/tool_selector_controller.js
 - lib/generators/llm_meta_client/scaffold/templates/app/javascript/popover.js
 - lib/generators/llm_meta_client/scaffold/templates/app/models/chat.rb
 - lib/generators/llm_meta_client/scaffold/templates/app/models/message.rb
+- lib/generators/llm_meta_client/scaffold/templates/app/views/chats/_chat_sidebar.html.erb
 - lib/generators/llm_meta_client/scaffold/templates/app/views/chats/_message.html.erb
 - lib/generators/llm_meta_client/scaffold/templates/app/views/chats/_messages_list.html.erb
+- lib/generators/llm_meta_client/scaffold/templates/app/views/chats/_streaming_message.html.erb
 - lib/generators/llm_meta_client/scaffold/templates/app/views/chats/create.turbo_stream.erb
 - lib/generators/llm_meta_client/scaffold/templates/app/views/chats/edit.html.erb
 - lib/generators/llm_meta_client/scaffold/templates/app/views/chats/new.html.erb

@@ -163,7 +173,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
 - !ruby/object:Gem::Version
   version: '0'
 requirements: []
-rubygems_version:
+rubygems_version: 3.6.9
 specification_version: 4
 summary: A Rails Engine for integrating multiple LLM providers into your application.
 test_files: []
```
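A quick note on the widened dependency seen above: under `>= 1.0, < 3.0`, host apps can stay on `prompt_navigator` 1.x or opt into 2.0. A hypothetical host Gemfile illustrating both resolutions:

```ruby
gem "llm_meta_client", "~> 1.2"
gem "prompt_navigator", "~> 1.9"  # keeps resolving on 1.x, unchanged
# or, opting in (2.0 requires Ruby 3.4.9+ per the changelog):
# gem "prompt_navigator", "~> 2.0"
```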