openclacky 0.9.5 → 0.9.6

Files changed (52)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +27 -0
  3. data/lib/clacky/agent/llm_caller.rb +11 -0
  4. data/lib/clacky/agent/message_compressor_helper.rb +2 -4
  5. data/lib/clacky/agent/session_serializer.rb +62 -46
  6. data/lib/clacky/agent/time_machine.rb +1 -1
  7. data/lib/clacky/agent.rb +154 -33
  8. data/lib/clacky/cli.rb +20 -12
  9. data/lib/clacky/client.rb +12 -2
  10. data/lib/clacky/default_agents/base_prompt.md +1 -0
  11. data/lib/clacky/default_skills/product-help/SKILL.md +91 -0
  12. data/lib/clacky/default_skills/skill-add/SKILL.md +24 -24
  13. data/lib/clacky/default_skills/skill-add/scripts/install_from_zip.rb +49 -20
  14. data/lib/clacky/default_skills/skill-creator/SKILL.md +5 -2
  15. data/lib/clacky/json_ui_controller.rb +5 -3
  16. data/lib/clacky/message_history.rb +31 -16
  17. data/lib/clacky/plain_ui_controller.rb +3 -4
  18. data/lib/clacky/server/channel/adapters/feishu/adapter.rb +40 -28
  19. data/lib/clacky/server/channel/adapters/feishu/file_processor.rb +14 -7
  20. data/lib/clacky/server/channel/adapters/wecom/adapter.rb +22 -10
  21. data/lib/clacky/server/channel/adapters/wecom/ws_client.rb +173 -13
  22. data/lib/clacky/server/channel/channel_manager.rb +150 -63
  23. data/lib/clacky/server/channel/channel_ui_controller.rb +29 -14
  24. data/lib/clacky/server/http_server.rb +35 -36
  25. data/lib/clacky/server/web_ui_controller.rb +4 -4
  26. data/lib/clacky/skill.rb +7 -4
  27. data/lib/clacky/tools/glob.rb +3 -2
  28. data/lib/clacky/tools/safe_shell.rb +21 -6
  29. data/lib/clacky/tools/web_fetch.rb +3 -1
  30. data/lib/clacky/ui2/components/input_area.rb +33 -38
  31. data/lib/clacky/ui2/components/message_component.rb +10 -11
  32. data/lib/clacky/ui2/ui_controller.rb +4 -4
  33. data/lib/clacky/ui2/view_renderer.rb +3 -3
  34. data/lib/clacky/ui_interface.rb +3 -1
  35. data/lib/clacky/utils/environment_detector.rb +94 -0
  36. data/lib/clacky/utils/file_parser/docx_parser.rb +156 -0
  37. data/lib/clacky/utils/file_parser/pptx_parser.rb +116 -0
  38. data/lib/clacky/utils/file_parser/xlsx_parser.rb +95 -0
  39. data/lib/clacky/utils/file_parser/zip_parser.rb +60 -0
  40. data/lib/clacky/utils/file_processor.rb +243 -203
  41. data/lib/clacky/version.rb +1 -1
  42. data/lib/clacky/web/app.css +159 -9
  43. data/lib/clacky/web/app.js +103 -25
  44. data/lib/clacky/web/brand.js +1 -1
  45. data/lib/clacky/web/i18n.js +18 -12
  46. data/lib/clacky/web/index.html +42 -14
  47. data/lib/clacky/web/sessions.js +16 -2
  48. data/lib/clacky/web/skills.js +161 -136
  49. data/lib/clacky.rb +2 -1
  50. data/scripts/install.sh +19 -35
  51. metadata +7 -2
  52. data/lib/clacky/utils/file_attachment.rb +0 -105
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: a50886ecfabfb60ea86a139a0180fc64803d1853d49aee14b201aa4e1d14a907
- data.tar.gz: 7979255d8dc2113189a5934081a8b5cd24cce0e84aad9ab3deda97116924c6a5
+ metadata.gz: a499294341fb7b3fd0f4884ecc672705317c1f33d4ead1531ebdd34465a8f5f8
+ data.tar.gz: a2c023146c5ed2b91c0777e31266800ff933fd053bee146430c0587cc8fc1999
  SHA512:
- metadata.gz: '0882ad06699e96e87581066d2be75755ccc0b82bd2bf1997fad7155f493733015c8abe5fb857391796157173c1408b9d011d6cc731b18123890cd33c2c9f34e4'
- data.tar.gz: 78e98487a191ba8014fb1d6b84d518bd1949b3724e90ab49523919bbbf0a17cfd1596b00ddf7576727fe96cf5e2d6783a17855c9be940fd5d9f9b7ea6304ee96
+ metadata.gz: 32640a8ff88ebfe3c37f69c9362c6935a876515a30a9fbff14d6496af6dbc2ec3c0df227db4a62d362c80d44a117f63e5c0ce48dc96f4229d1a03c1761b01eae
+ data.tar.gz: 72ed3bf45504176b76904cbeb90bcf32e5606c65405deb4b26fa3e6923e1e977c99615f5221d27a318e7585c5831523770ddae97b23affc2306efc2ef00fa9e4
data/CHANGELOG.md CHANGED
@@ -7,6 +7,33 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
  ## [Unreleased]
 
+ ## [0.9.6] - 2026-03-18
+
+ ### Added
+ - **Environment-aware context injection**: the agent now automatically detects your OS, desktop environment, and screen info and includes it in every session — so it can give OS-specific advice without you having to explain your setup
+ - **File attachments via IM channels**: you can now send images and documents directly through Feishu or WeCom to the agent, which processes them just like files sent via the Web UI
+ - **Unified file attachment pipeline for Web UI**: images and Office/PDF documents can now be attached in the web chat interface with automatic image compression before upload
+ - **Skills can now be installed from local zip files**: `skill-add` now accepts a local file path (not just a URL), so you can install skills from a downloaded zip without hosting it anywhere
+ - **Skill import bar in Web UI**: the Skills settings page now has an import bar where you can paste a URL or upload a local zip file directly — no terminal needed to install new skills
+ - **`$SKILL_DIR` available in skill instructions**: skill files can now reference `$SKILL_DIR` to get the absolute path to their own directory, making it easy to reference supporting files with correct paths
+ - **`product-help` built-in skill**: the agent can now answer questions about Clacky's own features, configuration, and usage through a dedicated built-in skill
+
+ ### Fixed
+ - **PDF and Office files now appear in glob results**: file discovery tools no longer skip `.pdf`, `.docx`, and other document formats — they show up correctly in file listings
+ - **Chat history visible after message compression**: sessions where all user messages were compressed no longer show a blank history — prior conversation is now correctly replayed
+ - **Stale message reference in task history**: an internal bug (`@messages` vs `@history`) that could cause incorrect task history in compressed sessions is fixed
+ - **File-only messages handled correctly in channel UI**: sending a file without text via IM channels no longer causes a display issue in the channel UI
+ - **WeCom WebSocket client stability**: fixed async dispatch and frame acknowledgment in the WeCom WS client to reduce dropped messages and connection issues
+ - **Session serializer variable fix**: corrected a stale variable reference in session replay that could cause errors when restoring sessions
+ - **`web_fetch` compatibility improved**: better request headers make web page fetching more reliable across more sites
+ - **Reasoning content preserved in API messages**: `reasoning_content` fields are no longer stripped from messages, fixing potential issues with reasoning-capable models
+
+ ### More
+ - Markdown links in chat now open in a new tab
+ - Removed public skill store tab from the Skills panel (store content is now integrated differently)
+ - Reduce WebSocket ping log noise in HTTP server
+ - Centralize message cleanup logic in `MessageHistory`
+
  ## [0.9.5] - 2026-03-17
 
  ### Added
data/lib/clacky/agent/llm_caller.rb CHANGED
@@ -45,6 +45,17 @@ module Clacky
  @ui&.show_error("Network failed after #{max_retries} retries: #{e.message}")
  raise AgentError, "Network connection failed after #{max_retries} retries: #{e.message}"
  end
+ rescue RetryableError => e
+ @ui&.clear_progress
+ retries += 1
+ if retries <= max_retries
+ @ui&.show_warning("#{e.message} (#{retries}/#{max_retries})")
+ sleep retry_delay
+ retry
+ else
+ @ui&.show_error("LLM service unavailable after #{max_retries} retries. Please try again later.")
+ raise AgentError, "LLM service unavailable after #{max_retries} retries"
+ end
  ensure
  @ui&.clear_progress
  end
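The added `rescue RetryableError` branch follows Ruby's `begin`/`rescue`/`retry` idiom. A minimal standalone sketch of that pattern is shown here; the `with_retries` helper and the local `RetryableError` class are hypothetical stand-ins for illustration, not Clacky's actual code, and the UI calls are omitted.

```ruby
# Hypothetical stand-in for Clacky's RetryableError (assumption for this sketch).
class RetryableError < StandardError; end

# Run the block, re-running it when it raises RetryableError,
# up to max_retries additional attempts.
def with_retries(max_retries: 3, retry_delay: 0)
  retries = 0
  begin
    yield
  rescue RetryableError
    retries += 1
    # Give up once the retry budget is spent.
    raise "service unavailable after #{max_retries} retries" if retries > max_retries

    sleep retry_delay
    retry # re-executes the begin block, which yields again
  end
end

attempts = 0
result = with_retries(max_retries: 3) do
  attempts += 1
  raise RetryableError, "503" if attempts < 3

  :ok
end
puts "#{result} after #{attempts} attempts"
```

With `max_retries: 3` the block runs at most four times: the initial attempt plus three retries.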
data/lib/clacky/agent/message_compressor_helper.rb CHANGED
@@ -34,13 +34,11 @@ module Clacky
  true
  rescue Clacky::AgentInterrupted => e
  @ui&.log("Idle compression canceled: #{e.message}", level: :info)
- @history.pop_while { |m| m[:system_injected] && !m.equal?(compression_message) }
- @history.pop_last if @history.to_a.last&.equal?(compression_message)
+ @history.rollback_before(compression_message)
  false
  rescue => e
  @ui&.log("Idle compression failed: #{e.message}", level: :error)
- @history.pop_while { |m| m[:system_injected] && !m.equal?(compression_message) }
- @history.pop_last if @history.to_a.last&.equal?(compression_message)
+ @history.rollback_before(compression_message)
  false
  end
  end
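The body of `MessageHistory#rollback_before` is not part of this hunk. One plausible reading, sketched below, is that it removes the given message and anything appended after it, matching by object identity as the removed `pop_while`/`pop_last` lines did. The `TinyHistory` class is a hypothetical stand-in, and the real method may behave differently (the removed code only popped trailing `system_injected` messages).

```ruby
# Hypothetical, simplified stand-in for Clacky::MessageHistory (assumption).
class TinyHistory
  def initialize(messages = [])
    @messages = messages
  end

  def append(msg)
    @messages << msg
    self
  end

  def to_a
    @messages.dup
  end

  # Remove `message` and every message appended after it.
  # Matching uses equal? (object identity), not ==, so a different
  # hash with the same contents is never removed by accident.
  def rollback_before(message)
    idx = @messages.rindex { |m| m.equal?(message) }
    return self if idx.nil? # message not present: leave history untouched

    @messages.slice!(idx..)
    self
  end
end

history = TinyHistory.new([{ role: "user", content: "hi" }])
compression = { role: "user", content: "Please summarize", system_injected: true }
history.append(compression).append({ role: "user", content: "note", system_injected: true })
history.rollback_before(compression)
puts history.to_a.size
```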
data/lib/clacky/agent/session_serializer.rb CHANGED
@@ -90,7 +90,8 @@ module Clacky
  active_task_id: @active_task_id || 0
  },
  config: {
- models: @config.models,
+ # NOTE: api_key and other sensitive credentials are intentionally excluded
+ # to prevent leaking secrets into session files on disk.
  permission_mode: @config.permission_mode.to_s,
  enable_compression: @config.enable_compression,
  enable_prompt_caching: @config.enable_prompt_caching,
@@ -121,7 +122,7 @@ module Clacky
  # created_at < before. Pass nil to get the most recent rounds.
  # @return [Hash] { has_more: Boolean } — whether older rounds exist beyond this page
  def replay_history(ui, limit: 20, before: nil)
- # Split @messages into rounds, each starting at a real user message
+ # Split @history into rounds, each starting at a real user message
  rounds = []
  current_round = nil
 
@@ -155,62 +156,31 @@ module Clacky
  rounds = rounds.select { |r| r[:user_msg][:created_at] && r[:user_msg][:created_at] < before }
  end
 
+ # Fallback: when the conversation was compressed and no user messages remain in the
+ # kept slice, render the surviving assistant/tool messages directly so the user can
+ # still see the last visible state of the chat (e.g. compressed summary + recent work).
+ if rounds.empty?
+ visible = @history.to_a.reject { |m| m[:role].to_s == "system" || m[:system_injected] }
+ visible.each { |msg| _replay_single_message(msg, ui) }
+ return { has_more: false }
+ end
+
  has_more = rounds.size > limit
  # Take the most recent `limit` rounds
  page = rounds.last(limit)
 
  page.each do |round|
  msg = round[:user_msg]
- display_text = extract_text_from_content(msg[:content])
- # Extract image data URLs from multipart content (for history replay rendering)
- images = extract_images_from_content(msg[:content])
- # Emit user message with its timestamp for dedup on the frontend
- ui.show_user_message(display_text, created_at: msg[:created_at], images: images)
+ raw_text = extract_text_from_content(msg[:content])
+ # Files are stored as system_injected messages (skipped below), not embedded in user text.
+ ui.show_user_message(raw_text, created_at: msg[:created_at])
 
  round[:events].each do |ev|
  # Skip system-injected messages (e.g. synthetic skill content, memory prompts)
  # — they are internal scaffolding and must not be shown to the user.
  next if ev[:system_injected]
 
- case ev[:role].to_s
- when "assistant"
- # Text content
- text = extract_text_from_content(ev[:content]).to_s.strip
- ui.show_assistant_message(text) unless text.empty?
-
- # Tool calls embedded in assistant message
- Array(ev[:tool_calls]).each do |tc|
- name = tc[:name] || tc.dig(:function, :name) || ""
- args_raw = tc[:arguments] || tc.dig(:function, :arguments) || {}
- args = args_raw.is_a?(String) ? (JSON.parse(args_raw) rescue args_raw) : args_raw
-
- # Special handling: request_user_feedback question is shown as an
- # assistant message (matching real-time behavior), not as a tool call.
- if name == "request_user_feedback"
- question = args.is_a?(Hash) ? (args[:question] || args["question"]).to_s : ""
- ui.show_assistant_message(question) unless question.empty?
- else
- ui.show_tool_call(name, args)
- end
- end
-
- # Emit token usage stored on this message (for history replay display)
- ui.show_token_usage(ev[:token_usage]) if ev[:token_usage]
-
- when "user"
- # Anthropic-format tool results (role: user, content: array of tool_result blocks)
- next unless ev[:content].is_a?(Array)
-
- ev[:content].each do |blk|
- next unless blk.is_a?(Hash) && blk[:type] == "tool_result"
-
- ui.show_tool_result(blk[:content].to_s)
- end
-
- when "tool"
- # OpenAI-format tool result
- ui.show_tool_result(ev[:content].to_s)
- end
+ _replay_single_message(ev, ui)
  end
  end
 
@@ -219,6 +189,52 @@ module Clacky
 
  private
 
+ # Render a single non-user message into the UI.
+ # Used by both the normal round-based replay and the compressed-session fallback.
+ def _replay_single_message(msg, ui)
+ return if msg[:system_injected]
+
+ case msg[:role].to_s
+ when "assistant"
+ # Text content
+ text = extract_text_from_content(msg[:content]).to_s.strip
+ ui.show_assistant_message(text, files: []) unless text.empty?
+
+ # Tool calls embedded in assistant message
+ Array(msg[:tool_calls]).each do |tc|
+ name = tc[:name] || tc.dig(:function, :name) || ""
+ args_raw = tc[:arguments] || tc.dig(:function, :arguments) || {}
+ args = args_raw.is_a?(String) ? (JSON.parse(args_raw) rescue args_raw) : args_raw
+
+ # Special handling: request_user_feedback question is shown as an
+ # assistant message (matching real-time behavior), not as a tool call.
+ if name == "request_user_feedback"
+ question = args.is_a?(Hash) ? (args[:question] || args["question"]).to_s : ""
+ ui.show_assistant_message(question, files: []) unless question.empty?
+ else
+ ui.show_tool_call(name, args)
+ end
+ end
+
+ # Emit token usage stored on this message (for history replay display)
+ ui.show_token_usage(msg[:token_usage]) if msg[:token_usage]
+
+ when "user"
+ # Anthropic-format tool results (role: user, content: array of tool_result blocks)
+ return unless msg[:content].is_a?(Array)
+
+ msg[:content].each do |blk|
+ next unless blk.is_a?(Hash) && blk[:type] == "tool_result"
+
+ ui.show_tool_result(blk[:content].to_s)
+ end
+
+ when "tool"
+ # OpenAI-format tool result
+ ui.show_tool_result(msg[:content].to_s)
+ end
+ end
+
  # Replace the system message in @messages with a freshly built system prompt.
  # Called after restore_session so newly installed skills and any other
  # configuration changes since the session was saved take effect immediately.
data/lib/clacky/agent/time_machine.rb CHANGED
@@ -147,7 +147,7 @@ module Clacky
  tasks = []
  (1..@current_task_id).to_a.reverse.take(limit).reverse.each do |task_id|
  # Find first user message for this task
- first_user_msg = @messages.find do |msg|
+ first_user_msg = @history.to_a.find do |msg|
  msg[:task_id] == task_id && msg[:role] == "user"
  end
 
data/lib/clacky/agent.rb CHANGED
@@ -6,6 +6,7 @@ require "tty-prompt"
  require "set"
  require_relative "utils/arguments_parser"
  require_relative "utils/file_processor"
+ require_relative "utils/environment_detector"
 
  # Load all agent modules
  require_relative "agent/message_compressor"
@@ -151,7 +152,7 @@ module Clacky
  @name = new_name.to_s.strip
  end
 
- def run(user_input, images: [], files: [])
+ def run(user_input, files: [])
  # Start new task for Time Machine
  task_id = start_new_task
 
@@ -178,11 +179,38 @@ module Clacky
  # Inject session context (date + model) if not yet present or date has changed
  inject_session_context_if_needed
 
- # Format user message with images and files if provided
- user_content = format_user_content(user_input, images, files)
+ # Split files into vision images and disk files; downgrade oversized images to disk
+ image_files, disk_files = partition_files(Array(files))
+ vision_urls, downgraded = resolve_vision_images(image_files)
+ all_disk_files = disk_files + downgraded
+
+ # Format user message — text + inline vision images
+ user_content = format_user_content(user_input, vision_urls)
  @history.append({ role: "user", content: user_content, task_id: task_id, created_at: Time.now.to_f })
  @total_tasks += 1
 
+ # Inject disk file references as a system_injected message so:
+ # - LLM sees the file info (system_injected is NOT stripped from to_api)
+ # - replay_history skips it (next if ev[:system_injected]), keeping the user bubble clean
+ unless all_disk_files.empty?
+ file_prompt = all_disk_files.filter_map do |f|
+ path = f[:path] || f["path"]
+ name = f[:name] || f["name"]
+ type = f[:type] || f["type"]
+ preview_path = f[:preview_path] || f["preview_path"]
+ next unless path && name
+
+ lines = ["[File: #{name}]", "Type: #{type || "file"}"]
+ lines << "Original: #{path}"
+ lines << "Preview (Markdown): #{preview_path}" if preview_path
+ lines.join("\n")
+ end.join("\n\n")
+
+ unless file_prompt.empty?
+ @history.append({ role: "user", content: file_prompt, system_injected: true, task_id: task_id })
+ end
+ end
+
  # If the user typed a slash command targeting a skill with disable-model-invocation: true,
  # inject the skill content as a synthetic assistant message so the LLM can act on it.
  # Skills already in the system prompt (model_invocation_allowed?) are skipped.
@@ -218,7 +246,7 @@ module Clacky
  if @memory_updating && response[:content] && !response[:content].empty?
  @ui&.show_info(response[:content].strip)
  elsif response[:content] && !response[:content].empty?
- @ui&.show_assistant_message(response[:content])
+ emit_assistant_message(response[:content])
  end
 
  # Show token usage after the assistant message so WebUI renders it below the bubble
@@ -243,7 +271,7 @@ module Clacky
  # Show assistant message if there's content before tool calls
  # During memory update phase, suppress text output (only tool calls matter)
  if response[:content] && !response[:content].empty? && !@memory_updating
- @ui&.show_assistant_message(response[:content])
+ emit_assistant_message(response[:content])
  end
 
  # Show token usage after assistant message (or immediately if no message).
@@ -277,7 +305,7 @@ module Clacky
  next
  else
  # User just said "no" without feedback - stop and wait
- @ui&.show_assistant_message("Tool execution was denied. Please give more instructions...")
+ @ui&.show_assistant_message("Tool execution was denied. Please give more instructions...", files: [])
  break
  end
  end
@@ -350,12 +378,9 @@ module Clacky
  handle_compression_response(response, compression_context)
  compression_handled = true
  ensure
- # If interrupted or failed, remove the dangling compression message so it
- # doesn't pollute future conversation turns.
- unless compression_handled
- @history.pop_while { |m| m[:system_injected] && !m.equal?(compression_message) }
- @history.pop_last if @history.to_a.last&.equal?(compression_message)
- end
+ # If interrupted or failed, roll back the speculative compression message
+ # so it doesn't pollute future conversation turns.
+ @history.rollback_before(compression_message) unless compression_handled
  end
  return nil
  end
@@ -565,7 +590,7 @@ module Clacky
  # Special handling for request_user_feedback: show directly as message
  if call[:name] == "request_user_feedback"
  if result.is_a?(Hash) && result[:message]
- @ui&.show_assistant_message(result[:message])
+ @ui&.show_assistant_message(result[:message], files: [])
  end
 
  if @config.permission_mode == :auto_approve
@@ -756,13 +781,9 @@ module Clacky
  subagent.instance_variable_set(:@previous_total_tokens, @previous_total_tokens)
 
  # Deep clone history to avoid cross-contamination.
- # to_api already strips trailing orphaned tool_calls; we use to_a here so the
- # subagent gets the full internal list and its own to_api handles the strip on send.
+ # Dangling tool_calls (no tool_result yet) are cleaned up automatically by
+ # MessageHistory#append when the subagent appends its first user message.
  cloned_messages = deep_clone(@history.to_a)
- # Strip pending tool_calls (no tool_result yet) — fork happens inside act(),
- # before observe() has appended tool results. Anthropic rejects orphaned tool_use.
- cloned_messages.pop if cloned_messages.last&.dig(:role) == "assistant" &&
- cloned_messages.last[:tool_calls]&.any?
  subagent.instance_variable_set(:@history, MessageHistory.new(cloned_messages))
 
  # Append system prompt suffix as user message (for cache reuse)
@@ -861,24 +882,85 @@ module Clacky
  # @param images [Array<String>] Array of image file paths or data: URLs
  # @param files [Array] Unused — kept for signature compatibility
  # @return [String|Array] String if no images, Array with content blocks otherwise
- private def format_user_content(text, images, files = [])
- images ||= []
+ # Partition files array into [image_files, non_image_files].
+ # Image files: have mime_type starting with "image/" OR have data_url present.
+ private def partition_files(files)
+ image_files = []
+ non_image_files = []
+ files.each do |f|
+ mime = f[:mime_type] || f["mime_type"] || ""
+ data_url = f[:data_url] || f["data_url"]
+ if mime.start_with?("image/") || data_url
+ image_files << f
+ else
+ non_image_files << f
+ end
+ end
+ [image_files, non_image_files]
+ end
+
+ # Resolve image files to vision data_urls.
+ # Files with data_url: use as-is (already compressed by frontend or adapter).
+ # Files with path: convert to data_url via FileProcessor.
+ # Oversized images (> MAX_IMAGE_BYTES) are downgraded to disk file refs.
+ # @return [Array<String>, Array<Hash>] [vision_urls, downgraded_disk_files]
+ private def resolve_vision_images(image_files)
+ require "base64"
+ max_bytes = Utils::FileProcessor::MAX_IMAGE_BYTES
+ vision_urls = []
+ downgraded = []
+
+ image_files.each do |f|
+ name = f[:name] || f["name"] || "image.jpg"
+ mime = f[:mime_type] || f["mime_type"] || "image/jpeg"
+ data_url = f[:data_url] || f["data_url"]
+ path = f[:path] || f["path"]
+
+ if data_url
+ # Strip header to check byte size: "data:image/jpeg;base64,<data>"
+ b64_data = data_url.split(",", 2).last.to_s
+ byte_size = (b64_data.bytesize * 3) / 4
+ if byte_size > max_bytes
+ # Downgrade: save to disk
+ raw = Base64.decode64(b64_data)
+ file_ref = Utils::FileProcessor.save_image_to_disk(body: raw, mime_type: mime, filename: name)
+ downgraded << { name: name, path: file_ref.original_path, type: "image", mime_type: mime }
+ else
+ vision_urls << data_url
+ end
+ elsif path
+ begin
+ data_url_from_path = Utils::FileProcessor.image_path_to_data_url(path)
+ b64_data = data_url_from_path.split(",", 2).last.to_s
+ byte_size = (b64_data.bytesize * 3) / 4
+ if byte_size > max_bytes
+ raw = Base64.decode64(b64_data)
+ file_ref = Utils::FileProcessor.save_image_to_disk(body: raw, mime_type: mime, filename: name)
+ downgraded << { name: name, path: file_ref.original_path, type: "image", mime_type: mime }
+ else
+ vision_urls << data_url_from_path
+ end
+ rescue => e
+ @ui&.log("Failed to load image #{name}: #{e.message}", level: :warn)
+ end
+ end
+ end
+
+ [vision_urls, downgraded]
+ end
+
+ # Build user message content for LLM.
+ # Returns plain String when no vision images; Array of content parts otherwise.
+ private def format_user_content(text, vision_urls)
+ vision_urls ||= []
 
- return text if images.empty?
+ return text if vision_urls.empty?
 
  content = []
  content << { type: "text", text: text } unless text.nil? || text.empty?
-
- images.each do |image|
- # Accept both file paths and pre-encoded data: URLs (e.g. from Web UI)
- image_url = if image.start_with?("data:")
- image
- else
- Utils::FileProcessor.image_path_to_data_url(image)
- end
- content << { type: "image_url", image_url: { url: image_url } }
+ vision_urls.each do |url|
+ content << { type: "image_url", image_url: { url: url } }
  end
-
  content
  end
 
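The size check in `resolve_vision_images` estimates the decoded size of a base64 payload as `bytesize * 3 / 4` without decoding it first. The sketch below isolates that arithmetic; the `estimated_bytes` helper name and the sample payloads are illustrative assumptions, not part of the gem.

```ruby
require "base64"

# Estimate the decoded byte size of a data URL's base64 payload.
# Every 4 base64 characters encode 3 bytes; padding ("=") makes the
# estimate overshoot the true size by at most 2 bytes.
def estimated_bytes(data_url)
  b64 = data_url.split(",", 2).last.to_s
  (b64.bytesize * 3) / 4
end

raw = "x" * 300 # 300 bytes encode to exactly 400 base64 chars, no padding
data_url = "data:image/png;base64,#{Base64.strict_encode64(raw)}"
puts estimated_bytes(data_url)

padded = "data:image/png;base64,#{Base64.strict_encode64('x' * 301)}"
puts estimated_bytes(padded)
```

Because the estimate never undershoots, an image right at the limit may occasionally be treated as oversized, which errs on the side of saving it to disk.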
@@ -896,7 +978,16 @@ module Clacky
  # Skip if we already have a context for today
  return if @history.last_session_context_date == today
 
- content = "[Session context: Today is #{Time.now.strftime('%Y-%m-%d, %A')}. Current model: #{current_model}]"
+ os = Clacky::Utils::EnvironmentDetector.os_type
+ desktop = Clacky::Utils::EnvironmentDetector.desktop_path
+ parts = [
+ "Today is #{Time.now.strftime('%Y-%m-%d, %A')}",
+ "Current model: #{current_model}",
+ os != :unknown ? "OS: #{Clacky::Utils::EnvironmentDetector.os_label}" : nil,
+ desktop ? "Desktop: #{desktop}" : nil
+ ].compact.join(". ")
+
+ content = "[Session context: #{parts}]"
  @history.append({
  role: "user",
  content: content,
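The new context string is assembled from optional parts with `compact.join`, so fields that could not be detected simply drop out. A small standalone sketch of that assembly follows; the `session_context` helper and its keyword arguments are hypothetical, and the real code reads the values from `EnvironmentDetector`, whose implementation is not shown in this part of the diff.

```ruby
# Build the session context line from required and optional parts.
# nil entries are removed by compact before joining.
def session_context(date:, model:, os_label: nil, desktop: nil)
  parts = [
    "Today is #{date}",
    "Current model: #{model}",
    os_label ? "OS: #{os_label}" : nil,
    desktop ? "Desktop: #{desktop}" : nil
  ].compact.join(". ")

  "[Session context: #{parts}]"
end

# Sample values below are made up for illustration.
puts session_context(date: "2026-03-18, Wednesday", model: "example-model")
puts session_context(date: "2026-03-18, Wednesday", model: "example-model",
                     os_label: "macOS", desktop: "/Users/me/Desktop")
```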
@@ -906,6 +997,36 @@ module Clacky
  })
  end
 
+ # Parse markdown file:// links from assistant message content.
+ # Handles both regular links and inline images:
+ # [Download report](file:///path/to/file.pdf)
+ # ![chart](file:///path/to/chart.png)
+ #
+ # Returns { text: String, files: Array<{name:, path:, inline:}> }
+ # File links are stripped from the returned text.
+ private def parse_file_links(content)
+ return { text: content, files: [] } if content.nil? || content.empty?
+
+ files = []
+ text = content.gsub(/(!?)\[([^\]]*)\]\(file:\/\/([^)]+)\)/) do
+ inline = $1 == "!"
+ name = $2.empty? ? File.basename($3) : $2
+ path = File.expand_path($3)
+ Clacky::Logger.info("[parse_file_links] raw=#{$3.inspect} expanded=#{path.inspect} exist=#{File.exist?(path)}")
+ files << { name: name, path: path, inline: inline }
+ ""
+ end
+ { text: text.strip, files: files }
+ end
+
+ # Emit assistant message to UI, parsing any embedded file:// links first.
+ private def emit_assistant_message(content)
+ return if content.nil? || content.empty?
+
+ parsed = parse_file_links(content)
+ @ui&.show_assistant_message(parsed[:text], files: parsed[:files])
+ end
+
  # Track modified files for Time Machine snapshots
  # @param tool_name [String] Name of the tool that was executed
  # @param args [Hash] Arguments passed to the tool
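To see what the `file://` link regex captures, the following standalone sketch re-runs the same pattern outside the agent. It is adapted from the added method with the logging call removed; the sample paths are made up, and the expanded paths assume a Unix-like system.

```ruby
# Same pattern as the added parse_file_links: optional "!", link text, file:// target.
FILE_LINK = /(!?)\[([^\]]*)\]\(file:\/\/([^)]+)\)/

def parse_file_links(content)
  return { text: content, files: [] } if content.nil? || content.empty?

  files = []
  text = content.gsub(FILE_LINK) do
    inline = $1 == "!"                         # "![...]" marks an inline image
    name = $2.empty? ? File.basename($3) : $2  # fall back to the file name
    files << { name: name, path: File.expand_path($3), inline: inline }
    ""                                         # strip the link from the text
  end
  { text: text.strip, files: files }
end

p parse_file_links("Report is ready. [Download report](file:///tmp/report.pdf)")
p parse_file_links("![](file:///tmp/chart.png)")
```

Since the target goes through `File.expand_path`, a link written as `file://~/path/to/file` would resolve against the current user's home directory.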
data/lib/clacky/cli.rb CHANGED
@@ -53,7 +53,8 @@ module Clacky
  option :attach, type: :string, aliases: "-a", desc: "Attach to session by number or keyword"
  option :json, type: :boolean, default: false, desc: "Output NDJSON to stdout (for scripting/piping)"
  option :message, type: :string, aliases: "-m", desc: "Run non-interactively with this message and exit"
- option :image, type: :array, aliases: "-i", desc: "Image file path(s) to attach (use with -m; can be specified multiple times)"
+ option :file, type: :array, aliases: "-f", desc: "File path(s) to attach (use with -m; supports images and documents)"
+ option :image, type: :array, aliases: "-i", desc: "Image file path(s) to attach (alias for --file, kept for compatibility)"
  option :agent, type: :string, default: "coding", desc: "Agent profile to use: coding, general, or any custom profile name (default: coding)"
  option :help, type: :boolean, aliases: "-h", desc: "Show this help message"
  def agent
@@ -111,7 +112,8 @@ module Clacky
  Dir.chdir(working_dir) if should_chdir
  begin
  if options[:message]
- run_non_interactive(agent, options[:message], Array(options[:image]), agent_config, session_manager)
+ file_paths = Array(options[:file]) + Array(options[:image])
+ run_non_interactive(agent, options[:message], file_paths, agent_config, session_manager)
  elsif options[:json]
  run_agent_with_json(agent, working_dir, agent_config, session_manager, client, profile: agent_profile)
  else
@@ -446,20 +448,26 @@ module Clacky
  # Run agent non-interactively with a single message, then exit.
  # Forces auto_approve mode so no human confirmation is needed.
  # Output goes directly to stdout; exits with code 0 on success, 1 on error.
- def run_non_interactive(agent, message, images, agent_config, session_manager)
+ def run_non_interactive(agent, message, file_paths, agent_config, session_manager)
  # Force auto-approve — no one is around to confirm anything
  agent_config.permission_mode = :auto_approve
 
- # Validate image paths up-front so we fail fast with a clear message
- images.each do |path|
- raise ArgumentError, "Image file not found: #{path}" unless File.exist?(path)
+ # Validate paths up-front so we fail fast with a clear message
+ file_paths.each do |path|
+ raise ArgumentError, "File not found: #{path}" unless File.exist?(path)
+ end
+
+ # Convert file paths to file hashes — agent.run decides how to handle each
+ files = file_paths.map do |path|
+ mime = Utils::FileProcessor.detect_mime_type(path) rescue "application/octet-stream"
+ { name: File.basename(path), mime_type: mime, path: path }
  end
 
  # Wire up plain-text stdout UI so all agent output is visible
  plain_ui = Clacky::PlainUIController.new
  agent.instance_variable_set(:@ui, plain_ui)
 
- agent.run(message, images: images)
+ agent.run(message, files: files)
  session_manager&.save(agent.to_session_data(status: :success))
  exit(0)
  rescue Clacky::AgentInterrupted
@@ -476,7 +484,7 @@ module Clacky
  #
  # Input protocol (one JSON per line on stdin):
  # {"type":"message","content":"..."} — run agent with this message
- # {"type":"message","content":"...","images":["path"]} — with images
+ # {"type":"message","content":"...","files":[{"name":"x.jpg","mime_type":"image/jpeg","data_url":"data:..."}]} — with files
  # {"type":"exit"} — graceful shutdown
  # {"type":"confirmation","id":"conf_1","result":"yes"} — answer to request_confirmation
  #
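A client of the NDJSON protocol now sends attachments as objects under `files` rather than paths under `images`. A hedged sketch of producing such an input line is shown below; the `file_entry` and `message_line` helper names are hypothetical, and only the `name`, `mime_type`, and `data_url` keys come from the protocol comment above.

```ruby
require "json"
require "base64"

# Build one attachment entry in the shape shown in the protocol comment.
def file_entry(name, mime_type, bytes)
  encoded = Base64.strict_encode64(bytes)
  { name: name, mime_type: mime_type, data_url: "data:#{mime_type};base64,#{encoded}" }
end

# Serialize a "message" command as a single NDJSON line (one JSON object + newline).
def message_line(content, files = [])
  JSON.generate({ type: "message", content: content, files: files }) + "\n"
end

# The three bytes below are just the JPEG magic number, used as a tiny sample payload.
line = message_line("Describe this image", [file_entry("x.jpg", "image/jpeg", "\xFF\xD8\xFF".b)])
puts line
```

Writing such a line to the agent's stdin in JSON mode would correspond to the `"files"` form of the `message` command.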
@@ -522,8 +530,8 @@ module Clacky
  next
  end
 
- images = input["images"] || []
- run_json_task(agent, json_ui, session_manager) { agent.run(content, images: images) }
+ files = input["files"] || []
+ run_json_task(agent, json_ui, session_manager) { agent.run(content, files: files) }
  when "exit"
  break
  else
@@ -645,7 +653,7 @@ module Clacky
  end
 
  # Set up input handler
- ui_controller.on_input do |input, images, display: nil|
+ ui_controller.on_input do |input, files, display: nil|
  # Handle commands
  case input.downcase.strip
  when "/config"
@@ -693,7 +701,7 @@ module Clacky
 
  # Run agent (Agent will call @ui methods directly)
  # Agent internally tracks total_tasks and total_cost
- result = agent.run(input, images: images)
+ result = agent.run(input, files: files)
 
  # Save session after each task
  if session_manager
data/lib/clacky/client.rb CHANGED
@@ -112,6 +112,7 @@ module Clacky
  response = anthropic_connection.post("v1/messages") { |r| r.body = body.to_json }
 
  raise_error(response) unless response.status == 200
+ check_html_response(response)
  MessageFormat::Anthropic.parse_response(JSON.parse(response.body))
  end
 
@@ -132,6 +133,7 @@ module Clacky
  response = openai_connection.post("chat/completions") { |r| r.body = body.to_json }
 
  raise_error(response) unless response.status == 200
+ check_html_response(response)
  MessageFormat::OpenAI.parse_response(JSON.parse(response.body))
  end
 
@@ -227,12 +229,20 @@ module Clacky
  when 401 then raise AgentError, "Invalid API key"
  when 403 then raise AgentError, "Access denied: #{error_message}"
  when 404 then raise AgentError, "API endpoint not found: #{error_message}"
- when 429 then raise AgentError, "Rate limit exceeded"
- when 500..599 then raise AgentError, "Server error (#{response.status}): #{error_message}"
+ when 429 then raise RetryableError, "Rate limit exceeded, please wait a moment"
+ when 500..599 then raise RetryableError, "LLM service temporarily unavailable (#{response.status}), retrying..."
  else raise AgentError, "Unexpected error (#{response.status}): #{error_message}"
  end
  end
 
+ # Raise a friendly error if the response body is HTML (e.g. gateway error page returned with 200)
+ def check_html_response(response)
+ body = response.body.to_s.lstrip
+ if body.start_with?("<!DOCTYPE", "<!doctype", "<html", "<HTML")
+ raise RetryableError, "LLM service temporarily unavailable (received HTML error page), retrying..."
+ end
+ end
+
  def extract_error_message(error_body, raw_body)
  if raw_body.is_a?(String) && raw_body.strip.start_with?("<!DOCTYPE", "<html")
  return "Invalid API endpoint or server error (received HTML instead of JSON)"
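The new `check_html_response` guards against a gateway returning an HTML error page with a 200 status, which would otherwise fail inside `JSON.parse`. The sketch below isolates the detection step as a predicate; the `html_body?` name is a hypothetical helper, not part of the gem.

```ruby
# Return true when a response body looks like an HTML page rather than JSON.
# Uses the same prefixes as the added check, after skipping leading whitespace.
def html_body?(body)
  body.to_s.lstrip.start_with?("<!DOCTYPE", "<!doctype", "<html", "<HTML")
end

puts html_body?("\n  <!DOCTYPE html><html><body>502 Bad Gateway</body></html>")
puts html_body?('{"choices": []}')
```

The prefix list is case-sensitive, so a mixed-case opening such as `<Html>` would not match.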
data/lib/clacky/default_agents/base_prompt.md CHANGED
@@ -4,6 +4,7 @@
  - Break down complex tasks into manageable steps
  - **USE TOOLS to create/modify files** — don't just return content
  - Provide brief explanations after completing actions
+ - When the user asks to send/download a file or you generate one for them, append `[filename](file://~/path/to/file)` at the end of your reply
 
  ## Tool Usage Rules