openclacky 1.0.0.beta.5 → 1.0.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 39e25cd04a3d01fdacbb0382c2c367a1e72e8d2be88408e7fb29f804b3af1ba6
4
- data.tar.gz: 492ca66bcfb55a6cfc3f2cf38f171ce983f142a7a4b0f8655e5aafa317b79a69
3
+ metadata.gz: 49800afa935670c288d9f421595df4246b61e76ed0f2a74e1a7a754e85e26162
4
+ data.tar.gz: dba09cac5a79485b743aaad4568ce2e4fe2e13772d6b8c43a360ec11eca7c762
5
5
  SHA512:
6
- metadata.gz: 014eeb8227bcc4cd94104a1da3bb2877083a1c70c4baaaf408233eec57ef684cbc2bcbac632ca52a771e2f1a8f436f2a09d89b697a165f1147891cabfe3708a0
7
- data.tar.gz: cc54f77d960bfd2db73906b713a84d0da6465fc18c65d9ec3ceb75d250bf426adaf4d9ba42c71900beab889bb6acf6a6472fa3843420fec8bbd3460a13f00088
6
+ metadata.gz: 2b723771f71d880d99582f6bfd4d23a66f54ee3caa87f7ed228360f015cadb52a20be9d6869c6e35612740ddb889ceb762efa541a41bc25810f5897d47a333e1
7
+ data.tar.gz: 5c425e94d2bf4c4d68175b740d840b9cd6270ef91f2e68e6d8403fbb6fbc5336b07bd65308907dbb8d8c3cd1cb906c4c5f64ae7710a7e0619ab2aaae0ddc278b
data/CHANGELOG.md CHANGED
@@ -5,7 +5,28 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
- ## [Unreleased]
8
+ ## [1.0.0] - 2026-04-30
9
+
10
+ ### Added
11
+ - **Speed test tool in Web UI.** Test API response latency for different models and providers directly from the settings panel, making it easy to find the fastest endpoint for your region.
12
+ - **History chunk loading.** Previously compressed conversation chunks can now be loaded back into the session when needed, so long-running conversations don't lose context.
13
+ - **Default model changed to 4.5.** The new default model provides a better balance of speed, quality, and cost for most tasks.
14
+
15
+ ### Improved
16
+ - **Thinking indicator now visible for more steps.** The "thinking..." indicator stays visible longer during complex operations, giving better feedback about what the agent is doing.
17
+ - **Message timestamps display correctly in Web UI.** User message times now show properly without layout issues, and the scroll behavior is smoother.
18
+
19
+ ### Fixed
20
+ - **Scroll position no longer jumps unexpectedly** in the Web UI when loading session history.
21
+
22
+ ## [1.0.0.beta.6] - 2026-04-30
23
+
24
+ ### Fixed
25
+ - **Compression chunk indexing now uses disk-based discovery.** Chunk files are no longer incorrectly overwritten after the second compression. Previously, chunk index was counted from compressed_summary messages in history — which caps at 1 after rebuild — causing chunk-2.md to be overwritten on every subsequent compression. Now uses durable disk-based chunk discovery via SessionManager, ensuring all compressed chunks are preserved.
26
+ - **Skill evolution no longer creates duplicate skills.** The reflect and auto-create scenarios in skill evolution are now mutually exclusive: when a skill was just used, only reflection runs; when no skill was used, only auto-creation is considered. This prevents near-duplicate "auto-*" skills from being extracted from tasks already served by an existing skill.
27
+
28
+ ### Improved
29
+ - **Slash commands no longer misinterpret filesystem paths.** Pasted paths like `/Users/alice/foo` or `/tmp/bar` are no longer mistaken for slash commands, avoiding confusing "skill not found" notices.
9
30
 
10
31
  ## [1.0.0.beta.5] - 2026-04-29
11
32
 
@@ -86,7 +86,45 @@ module Clacky
86
86
  # Successful response — if we were probing, confirm primary is healthy.
87
87
  handle_probe_success if @config.probing?
88
88
 
89
- rescue Faraday::ConnectionFailed, Faraday::TimeoutError, Faraday::SSLError, Errno::ECONNREFUSED, Errno::ETIMEDOUT => e
89
+ rescue Faraday::TimeoutError => e
90
+ # ── Read-timeout path (distinct from connection-level failures) ──
91
+ # Faraday::TimeoutError on our non-streaming POST almost always means
92
+ # the *response* took longer than the 300s read-timeout to come back —
93
+ # i.e. the model is trying to produce a huge output in one shot
94
+ # (e.g. "write me a 2000-line snake game"). Blindly retrying the same
95
+ # request with the same prompt reproduces the same timeout.
96
+ #
97
+ # Strategy:
98
+ # 1. On the FIRST timeout in a task, inject a `[SYSTEM]` user message
99
+ # telling the model to break the work into smaller steps, then
100
+ # retry. The history edit changes the prompt, so the retry is
101
+ # materially different from the failed attempt.
102
+ # 2. On subsequent timeouts in the same task, fall back to the
103
+ # generic "just retry" behaviour (the model may have ignored
104
+ # the hint; don't pile on duplicate hints).
105
+ # 3. Probing-mode timeouts still go through handle_probe_failure.
106
+ retries += 1
107
+
108
+ if @config.probing?
109
+ handle_probe_failure
110
+ retry
111
+ end
112
+
113
+ if retries <= max_retries
114
+ inject_large_output_hint_if_first_timeout(e)
115
+ @ui&.show_progress(
116
+ "Response too slow (likely generating too much at once): #{e.message}",
117
+ progress_type: "retrying",
118
+ phase: "active",
119
+ metadata: { attempt: retries, total: max_retries }
120
+ )
121
+ sleep retry_delay
122
+ retry
123
+ else
124
+ raise AgentError, "[LLM] Request timed out after #{max_retries} retries: #{e.message}"
125
+ end
126
+
127
+ rescue Faraday::ConnectionFailed, Faraday::SSLError, Errno::ECONNREFUSED, Errno::ETIMEDOUT => e
90
128
  retries += 1
91
129
 
92
130
  # Probing failure: primary still down — renew cooling-off and retry with fallback.
@@ -95,9 +133,10 @@ module Clacky
95
133
  retry
96
134
  end
97
135
 
98
- # Network-level errors (timeouts, connection failures) are likely transient
99
- # infrastructure blips — do NOT trigger fallback. Just retry on the current
100
- # model (primary or already-active fallback) up to max_retries.
136
+ # Connection-level errors (DNS, TCP refused, open-timeout, TLS) are
137
+ # likely transient infrastructure blips — do NOT trigger fallback, and do
138
+ # NOT inject the "break into steps" hint (the model did nothing wrong).
139
+ # Just retry on the current model up to max_retries.
101
140
  if retries <= max_retries
102
141
  @ui&.show_progress(
103
142
  "Network failed: #{e.message}",
@@ -229,6 +268,50 @@ module Clacky
229
268
  (msg.include?("thinking") || msg.include?("must be passed back") ||
230
269
  msg.include?("must be provided"))
231
270
  end
271
+
272
+ # On the FIRST Faraday::TimeoutError within a task, append a [SYSTEM]
273
+ # user message to the history instructing the model to break its work
274
+ # into smaller steps. Subsequent timeouts in the same task are ignored
275
+ # here (caller just retries) so we don't pollute history with duplicate
276
+ # hints.
277
+ #
278
+ # The injected message carries `system_injected: true` so it is:
279
+ # - Hidden from UI replay (session_serializer / replay_history filters)
280
+ # - Skipped by prompt-caching marker placement (client.rb)
281
+ # - Skipped by message compression's "recent user turn" protection
282
+ # (message_compressor_helper.rb)
283
+ #
284
+ # Reset per-task via Agent#run (see @task_timeout_hint_injected = false).
285
+ private def inject_large_output_hint_if_first_timeout(err)
286
+ return if @task_timeout_hint_injected
287
+
288
+ @task_timeout_hint_injected = true
289
+
290
+ hint = "[SYSTEM] The previous LLM response timed out (read timeout after ~300s). " \
291
+ "This usually means the model was trying to produce too much output in a single response. " \
292
+ "Please change your approach:\n" \
293
+ "- Break the task into multiple smaller steps, each producing a short response.\n" \
294
+ "- For long files: first create a skeleton with `write` (structure + placeholder comments only), " \
295
+ "then fill in each section with separate `edit` calls.\n" \
296
+ "- Keep each single tool-call argument (especially file content) well under ~500 lines.\n" \
297
+ "- Do NOT attempt to output the entire deliverable in one response."
298
+
299
+ @history.append({
300
+ role: "user",
301
+ content: hint,
302
+ system_injected: true,
303
+ task_id: @current_task_id
304
+ })
305
+
306
+ Clacky::Logger.info(
307
+ "[llm_caller] Read-timeout detected — injected 'break into smaller steps' hint " \
308
+ "(error=#{err.class}: #{err.message})"
309
+ )
310
+
311
+ @ui&.show_warning(
312
+ "LLM response timed out — asking model to break the task into smaller steps and retrying..."
313
+ )
314
+ end
232
315
  end
233
316
  end
234
317
  end
@@ -154,12 +154,22 @@ module Clacky
154
154
  # Note: we need to remove the compression instruction message we just added
155
155
  original_messages = @history.to_a[0..-2] # All except the last (compression instruction)
156
156
 
157
- # Archive compressed messages to a chunk MD file before discarding them
158
- # Count existing compressed_summary messages in history to determine the next chunk index.
159
- # Using @compressed_summaries.size would reset to 0 on process restart and overwrite existing
160
- # chunk files, creating circular chunk references. Counting from history is always accurate.
161
- existing_chunk_count = original_messages.count { |m| m[:compressed_summary] }
162
- chunk_index = existing_chunk_count + 1
157
+ # Archive compressed messages to a chunk MD file before discarding them.
158
+ #
159
+ # IMPORTANT: chunk_index and previous_chunks MUST come from disk, not from
160
+ # message history. Each compression's rebuild_with_compression keeps only
161
+ # ONE compressed_summary message (the new one), dropping older summaries
162
+ # and embedding their references into the new summary's content. So
163
+ # counting compressed_summary messages in history caps at 1 from the
164
+ # second compression onward — causing chunk-2.md to be overwritten on
165
+ # every subsequent compression, and losing references to chunk-1.md.
166
+ #
167
+ # Disk is the only durable source of truth: chunk files survive process
168
+ # restarts, session reloads, and message rebuilds. SessionManager owns
169
+ # all chunk file I/O (naming, writing, discovery) — we just ask it.
170
+ sm = session_manager
171
+ existing_chunks = sm.chunks_for_current(@session_id, @created_at)
172
+ chunk_index = sm.next_chunk_index(@session_id, @created_at)
163
173
 
164
174
  # Extract topics from the LLM response to store in both the chunk MD front
165
175
  # matter and the compressed_summary message hash (for future chunk indexing).
@@ -173,14 +183,13 @@ module Clacky
173
183
  topics: topics
174
184
  )
175
185
 
176
- # Collect previous chunk references so the new summary carries a complete
177
- # index of all older archives. Without this, each new compression would
178
- # lose all prior chunk references leaving only the newest chunk reachable
179
- # via replay_history. The AI can still access older chunks via file_reader
180
- # using the embedded basenames and topics.
181
- previous_chunks = original_messages
182
- .select { |m| m[:compressed_summary] && m[:chunk_path] }
183
- .map { |m| { basename: File.basename(m[:chunk_path]), path: m[:chunk_path], topics: m[:topics] } }
186
+ # Build previous_chunks index from the disk-discovered chunks (already
187
+ # sorted by index ascending). This gives the new summary a complete
188
+ # chronological index of all older archives so the AI can recall any
189
+ # past chunk via file_reader, not just the most recent one.
190
+ previous_chunks = existing_chunks.map do |c|
191
+ { basename: c[:basename], path: c[:path], topics: c[:topics] }
192
+ end
184
193
 
185
194
  @history.replace_all(@message_compressor.rebuild_with_compression(
186
195
  compressed_content,
@@ -348,8 +357,22 @@ module Clacky
348
357
  end
349
358
  end
350
359
 
351
- # Save the messages being compressed to a chunk MD file for future recall
352
- # File path: ~/.clacky/sessions/{datetime}-{short_id}-chunk-{n}.md
360
+ # Lazy accessor for a SessionManager instance used by compression chunk I/O.
361
+ # We keep this local to the helper rather than threading a manager instance
362
+ # through the Agent constructor — Agent itself doesn't persist sessions
363
+ # (CLI / HTTP server do that), but the compression archive lives in the
364
+ # same directory under SessionManager's ownership.
365
+ #
366
+ # NOTE: Uses Clacky::SessionManager::SESSIONS_DIR by default. Tests can
367
+ # stub that constant to point at a tmpdir.
368
+ private def session_manager
369
+ @session_manager ||= Clacky::SessionManager.new
370
+ end
371
+
372
+ # Save the messages being compressed to a chunk MD file for future recall.
373
+ # The filesystem concerns (path, write, chmod) are delegated to SessionManager;
374
+ # this method is responsible only for the business rules of WHAT gets archived.
375
+ #
353
376
  # @param original_messages [Array<Hash>] All messages before compression (excluding compression instruction)
354
377
  # @param recent_messages [Array<Hash>] Recent messages being kept (to exclude from chunk)
355
378
  # @param chunk_index [Integer] Sequential chunk number
@@ -373,19 +396,14 @@ module Clacky
373
396
 
374
397
  return nil if messages_to_archive.empty?
375
398
 
376
- sessions_dir = Clacky::SessionManager::SESSIONS_DIR
377
- datetime = Time.parse(@created_at).strftime("%Y-%m-%d-%H-%M-%S")
378
- short_id = @session_id[0..7]
379
- base_name = "#{datetime}-#{short_id}"
380
- chunk_filename = "#{base_name}-chunk-#{chunk_index}.md"
381
- chunk_path = File.join(sessions_dir, chunk_filename)
382
-
383
- md_content = build_chunk_md(messages_to_archive, chunk_index: chunk_index, compression_level: compression_level, topics: topics)
384
-
385
- File.write(chunk_path, md_content)
386
- FileUtils.chmod(0o600, chunk_path)
399
+ md_content = build_chunk_md(messages_to_archive,
400
+ chunk_index: chunk_index,
401
+ compression_level: compression_level,
402
+ topics: topics)
387
403
 
388
- chunk_path
404
+ # Delegate filesystem concerns (path assembly, write, chmod) to SessionManager —
405
+ # it owns the on-disk layout for sessions and their chunk archives.
406
+ session_manager.write_chunk(@session_id, @created_at, chunk_index, md_content)
389
407
  rescue => e
390
408
  @ui&.log("Failed to save chunk MD: #{e.message}", level: :warn)
391
409
  nil
@@ -36,6 +36,15 @@ module Clacky
36
36
  # Restore previous_total_tokens for accurate delta calculation across sessions
37
37
  @previous_total_tokens = session_data.dig(:stats, :previous_total_tokens) || 0
38
38
 
39
+ # Recover the latest latency metric from the most recent assistant message
40
+ # that carries a :latency field. This is the source of truth for the status-bar
41
+ # signal — no separate session-level field is needed. Older sessions (pre-feature)
42
+ # simply start with nil; the signal stays hidden until the next LLM call populates it.
43
+ last_assistant_with_latency = @history.to_a.reverse.find do |m|
44
+ m[:role].to_s == "assistant" && m[:latency]
45
+ end
46
+ @latest_latency = last_assistant_with_latency&.dig(:latency)
47
+
39
48
  # Restore Time Machine state
40
49
  @task_parents = session_data.dig(:time_machine, :task_parents) || {}
41
50
  @current_task_id = session_data.dig(:time_machine, :current_task_id) || 0
@@ -178,8 +187,18 @@ module Clacky
178
187
  elsif current_round
179
188
  current_round[:events] << msg
180
189
  elsif msg[:compressed_summary] && msg[:chunk_path]
181
- # Compressed summary sitting before any user rounds — expand it from chunk md
182
- chunk_rounds = parse_chunk_md_to_rounds(msg[:chunk_path])
190
+ # Compressed summary sitting before any user rounds — expand ALL chunk
191
+ # MD files that belong to the same session (siblings of chunk_path),
192
+ # in chunk-index ascending order.
193
+ #
194
+ # Under the current "single summary + previous_chunks index" scheme,
195
+ # session.json only keeps the newest compressed_summary message (which
196
+ # points at the newest chunk). Older chunks (chunk-1..chunk-N-1) are
197
+ # referenced only as basenames inside the summary text. Expanding just
198
+ # msg[:chunk_path] would therefore lose all prior chunks on replay.
199
+ chunk_rounds = sibling_chunks_of(msg[:chunk_path]).flat_map { |p|
200
+ parse_chunk_md_to_rounds(p)
201
+ }
183
202
  rounds.concat(chunk_rounds)
184
203
  # After expanding, treat the last chunk round as the current round so that
185
204
  # any orphaned assistant/tool messages that follow in session.json (belonging
@@ -243,6 +262,32 @@ module Clacky
243
262
  { has_more: has_more }
244
263
  end
245
264
 
265
+ # Return all chunk MD file paths that belong to the same session as
266
+ # +chunk_path+, sorted by chunk index ascending (chunk-1, chunk-2, …).
267
+ # Uses the filename convention "<base>-chunk-<N>.md".
268
+ #
269
+ # Handles path resolution the same way parse_chunk_md_to_rounds does:
270
+ # if the stored path doesn't exist, fall back to SESSIONS_DIR + basename
271
+ # (cross-machine / cross-user session bundles).
272
+ private def sibling_chunks_of(chunk_path)
273
+ return [] unless chunk_path
274
+
275
+ resolved = chunk_path.to_s
276
+ unless File.exist?(resolved)
277
+ resolved = File.join(Clacky::SessionManager::SESSIONS_DIR, File.basename(resolved))
278
+ end
279
+ return [] unless File.exist?(resolved)
280
+
281
+ dir = File.dirname(resolved)
282
+ base = File.basename(resolved).sub(/-chunk-\d+\.md\z/, "")
283
+ return [resolved] if base == File.basename(resolved) # unconventional name — just use as-is
284
+
285
+ Dir.glob(File.join(dir, "#{base}-chunk-*.md")).sort_by do |p|
286
+ m = File.basename(p).match(/-chunk-(\d+)\.md\z/)
287
+ m ? m[1].to_i : Float::INFINITY
288
+ end
289
+ end
290
+
246
291
  # Parse a chunk MD file into an array of rounds compatible with replay_history.
247
292
  # Each round is { user_msg: Hash, events: Array<Hash> }.
248
293
  # Timestamps are synthesised from the chunk's archived_at, spread backwards.
@@ -10,16 +10,31 @@ module Clacky
10
10
  # Triggered at the end of Agent#run (post-run hooks), only for main agents.
11
11
  module SkillEvolution
12
12
  # Main entry point - runs all skill evolution checks
13
- # Called from Agent#run after the main loop completes
13
+ # Called from Agent#run after the main loop completes.
14
+ #
15
+ # The two scenarios are mutually exclusive by design:
16
+ #
17
+ # * If a skill just ran (@skill_execution_context is set), the user's
18
+ # need was already served by an existing skill. Run Scenario 2
19
+ # (reflect + possibly improve that skill) and skip Scenario 1 —
20
+ # otherwise we would auto-extract a near-duplicate "auto-*" skill
21
+ # from the same task, polluting the skills directory.
22
+ #
23
+ # * If no skill ran, the task was solved with raw tools. That is the
24
+ # signal for Scenario 1: if the pattern is complex/repeatable enough,
25
+ # consider extracting it into a new skill.
14
26
  def run_skill_evolution_hooks
15
27
  return unless skill_evolution_enabled?
16
28
  return if @is_subagent
17
29
 
18
- # Scenario 2: Reflect on executed skill (if one just ran)
19
- maybe_reflect_on_skill if @skill_execution_context
20
-
21
- # Scenario 1: Auto-create new skill from complex task
22
- maybe_create_skill_from_task
30
+ if @skill_execution_context
31
+ # Scenario 2: Reflect on executed skill (may invoke skill-creator
32
+ # to UPDATE the existing skill, but will not create a new one).
33
+ maybe_reflect_on_skill
34
+ else
35
+ # Scenario 1: Auto-create new skill from complex task.
36
+ maybe_create_skill_from_task
37
+ end
23
38
  end
24
39
 
25
40
  # Check if skill evolution is enabled in config
@@ -33,12 +33,46 @@ module Clacky
33
33
  def parse_skill_command(input)
34
34
  return { matched: false } unless input.start_with?("/")
35
35
 
36
- match = input.match(%r{^/(\S+)(?:\s+(.*))?$})
36
+ # Split off the first whitespace-delimited token after the leading "/".
37
+ # Shape of a slash command:
38
+ # /<command>
39
+ # /<command> <arguments...>
40
+ #
41
+ # The key distinction we need to make is "slash command" vs. "filesystem
42
+ # path starting with /". Paths look like "/xxx/yyy", "/Users/alice/foo",
43
+ # "/tmp/bar" — what they all share is a *second* "/" inside the first
44
+ # token. Slash commands, on the other hand, may legitimately contain
45
+ # non-slug characters like ':' or '.' (e.g. "/guizang-ppt-skill:create"),
46
+ # so we deliberately DO NOT require the command to be a clean slug here —
47
+ # find_by_command handles the lookup, and a pilot-error like "/foo.bar"
48
+ # should still surface a friendly "skill not found" notice.
49
+ #
50
+ # Rejected as slash commands (treated as plain user messages):
51
+ # - "/", "//", "/*.rb" — token is empty or begins with a separator/glob
52
+ # - "/ leading space" — whitespace immediately after /
53
+ # - "/Users/alice/foo" — second "/" inside the first token ⇒ a path
54
+ # - "/xxxx/zzzz/" — same
55
+ #
56
+ # Accepted (routed to find_by_command, may yield :not_found notice):
57
+ # - "/commit"
58
+ # - "/skill-add https://…" — "/" appears only in arguments, fine
59
+ # - "/guizang-ppt-skill:create", "/foo.bar" — non-slug but no path shape
60
+ match = input.match(%r{^/(\S+?)(?:\s+(.*))?$})
37
61
  return { matched: false } unless match
38
62
 
39
63
  skill_name = match[1]
40
64
  arguments = match[2] || ""
41
65
 
66
+ # Reject path-like first tokens: anything containing a "/" after the
67
+ # leading one belongs to the filesystem, not the command namespace.
68
+ # This also naturally rejects "" (from "/" alone) and "*…" / ".…" style
69
+ # tokens because they won't be registered as a command — but those edge
70
+ # cases fall through to :not_found which is acceptable. The main goal is
71
+ # to stop pasted paths like "/Users/foo/bar" from producing a bogus
72
+ # "skill /Users/foo/bar not found" reply.
73
+ return { matched: false } if skill_name.include?("/")
74
+ return { matched: false } if skill_name.empty?
75
+
42
76
  skill = @skill_loader.find_by_command("/#{skill_name}")
43
77
  return { matched: true, found: false, skill_name: skill_name, reason: :not_found } unless skill
44
78
 
data/lib/clacky/agent.rb CHANGED
@@ -42,7 +42,8 @@ module Clacky
42
42
 
43
43
  attr_reader :session_id, :name, :history, :iterations, :total_cost, :working_dir, :created_at, :total_tasks, :todos,
44
44
  :cache_stats, :cost_source, :ui, :skill_loader, :agent_profile,
45
- :status, :error, :updated_at, :source
45
+ :status, :error, :updated_at, :source,
46
+ :latest_latency # Hash of latency metrics from the most recent LLM call (see Client#send_messages_with_tools)
46
47
  attr_accessor :pinned
47
48
 
48
49
  def permission_mode
@@ -78,6 +79,7 @@ module Clacky
78
79
  @task_cost_source = :estimated # Track cost source for current task
79
80
  @previous_total_tokens = 0 # Track tokens from previous iteration for delta calculation
80
81
  @interrupted = false # Flag for user interrupt
82
+ @latest_latency = nil # Most recent LLM call's latency metrics (see Client#send_messages_with_tools)
81
83
  @ui = ui # UIController for direct UI interaction
82
84
  @debug_logs = [] # Debug logs for troubleshooting
83
85
  @pending_injections = [] # Pending inline skill injections to flush after observe()
@@ -208,6 +210,7 @@ module Clacky
208
210
 
209
211
  @start_time = Time.now
210
212
  @task_truncation_count = 0 # Reset truncation counter for each task
213
+ @task_timeout_hint_injected = false # Reset read-timeout hint injection (see LlmCaller)
211
214
  @task_cost_source = :estimated # Reset for new task
212
215
  # Note: Do NOT reset @previous_total_tokens here - it should maintain the value from the last iteration
213
216
  # across tasks to correctly calculate delta tokens in each iteration
@@ -681,6 +684,17 @@ module Clacky
681
684
  end
682
685
  # Store token_usage in the message so replay_history can re-emit it
683
686
  msg[:token_usage] = response[:token_usage] if response[:token_usage]
687
+ # Store per-message latency — this is the source of truth (session.json)
688
+ # for all time-to-first-token / duration / throughput info. The status
689
+ # bar signal reads the last assistant message's latency; no separate
690
+ # config file or top-level session field is introduced.
691
+ if response[:latency]
692
+ msg[:latency] = response[:latency]
693
+ @latest_latency = response[:latency]
694
+ # Push to UI so the status-bar signal updates immediately after the
695
+ # model finishes (before any tool execution delays the next event).
696
+ @ui&.update_sessionbar(latency: response[:latency])
697
+ end
684
698
  # Preserve reasoning_content from the real LLM response.
685
699
  # This is the authoritative signal used by MessageHistory#to_api to
686
700
  # detect thinking-mode providers (DeepSeek V4, Kimi K2 thinking, etc.)
data/lib/clacky/client.rb CHANGED
@@ -89,18 +89,54 @@ module Clacky
89
89
  # ── Agent main path ───────────────────────────────────────────────────────
90
90
 
91
91
  # Send messages with tool-calling support.
92
- # Returns canonical response hash: { content:, tool_calls:, finish_reason:, usage: }
92
+ # Returns canonical response hash: { content:, tool_calls:, finish_reason:, usage:, latency: }
93
+ #
94
+ # Latency measurement:
95
+ # Because the current HTTP path is *non-streaming* (plain POST, response
96
+ # body read in one shot), TTFB (time to response headers) is not exposed
97
+ # by Faraday's default adapter without extra plumbing. What we CAN measure
98
+ # cheaply — and what users actually feel — is total request duration,
99
+ # which for a non-streaming call equals the time from "hit Enter" to
100
+ # "first token visible" (since we receive everything at once).
101
+ #
102
+ # So we record `duration_ms` as the authoritative number and alias it to
103
+ # `ttft_ms` for downstream consumers (status bar uses ttft_ms as its
104
+ # signal metric — see docs). When we migrate to streaming later, this
105
+ # same `ttft_ms` field will start carrying the *actual* first-token
106
+ # latency without any schema change.
93
107
  def send_messages_with_tools(messages, model:, tools:, max_tokens:, enable_caching: false)
94
108
  caching_enabled = enable_caching && supports_prompt_caching?(model)
95
109
  cloned = deep_clone(messages)
96
110
 
97
- if bedrock?
98
- send_bedrock_request(cloned, model, tools, max_tokens, caching_enabled)
99
- elsif anthropic_format?
100
- send_anthropic_request(cloned, model, tools, max_tokens, caching_enabled)
101
- else
102
- send_openai_request(cloned, model, tools, max_tokens, caching_enabled)
103
- end
111
+ t0 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
112
+ response =
113
+ if bedrock?
114
+ send_bedrock_request(cloned, model, tools, max_tokens, caching_enabled)
115
+ elsif anthropic_format?
116
+ send_anthropic_request(cloned, model, tools, max_tokens, caching_enabled)
117
+ else
118
+ send_openai_request(cloned, model, tools, max_tokens, caching_enabled)
119
+ end
120
+ t1 = Process.clock_gettime(Process::CLOCK_MONOTONIC)
121
+
122
+ duration_ms = ((t1 - t0) * 1000).round
123
+ # Throughput is only meaningful with a reasonable output size; below ~10
124
+ # tokens the sample is too small to be informative and the result is
125
+ # wildly high (e.g. 1 token / 50ms → 20 tok/s is meaningless).
126
+ # Canonical usage hashes from message_format/* all use :completion_tokens.
127
+ output_tokens = response[:usage]&.dig(:completion_tokens).to_i
128
+ tps = (output_tokens >= 10 && duration_ms > 0) ? (output_tokens * 1000.0 / duration_ms).round(1) : nil
129
+
130
+ response[:latency] = {
131
+ ttft_ms: duration_ms, # non-streaming: TTFT == full duration
132
+ duration_ms: duration_ms,
133
+ output_tokens: output_tokens,
134
+ tps: tps,
135
+ model: model,
136
+ measured_at: Time.now.to_f,
137
+ streaming: false # future flag — true when we migrate
138
+ }
139
+ response
104
140
  end
105
141
 
106
142
  # Format tool results into canonical messages ready to append to @messages.
@@ -134,12 +134,13 @@ module Clacky
134
134
 
135
135
  # === State updates ===
136
136
 
137
- def update_sessionbar(tasks: nil, cost: nil, cost_source: nil, status: nil)
137
+ def update_sessionbar(tasks: nil, cost: nil, cost_source: nil, status: nil, latency: nil)
138
138
  data = {}
139
139
  data[:tasks] = tasks if tasks
140
140
  data[:cost] = cost if cost
141
141
  data[:cost_source] = cost_source if cost_source
142
142
  data[:status] = status if status
143
+ data[:latency] = latency if latency
143
144
  emit("session_update", **data) unless data.empty?
144
145
  end
145
146
 
@@ -136,7 +136,7 @@ module Clacky
136
136
 
137
137
  # === State updates (no-ops) ===
138
138
 
139
- def update_sessionbar(tasks: nil, cost: nil, cost_source: nil, status: nil); end
139
+ def update_sessionbar(tasks: nil, cost: nil, cost_source: nil, status: nil, latency: nil); end
140
140
  def update_todos(todos); end
141
141
  def set_working_status; end
142
142
  def set_idle_status; end
@@ -22,7 +22,7 @@ module Clacky
22
22
  "name" => "OpenClacky",
23
23
  "base_url" => "https://api.openclacky.com",
24
24
  "api" => "bedrock",
25
- "default_model" => "abs-claude-sonnet-4-6",
25
+ "default_model" => "abs-claude-sonnet-4-5",
26
26
  "models" => [
27
27
  "abs-claude-opus-4-7",
28
28
  "abs-claude-opus-4-6",
@@ -131,7 +131,7 @@ module Clacky
131
131
  }.freeze,
132
132
 
133
133
  "clackyai-sea" => {
134
- "name" => "ClackyAI( Sea )",
134
+ "name" => "ClackyAI(Sea)",
135
135
  "base_url" => "https://api.clacky.ai",
136
136
  "api" => "bedrock",
137
137
  "default_model" => "abs-claude-sonnet-4-5",
@@ -152,7 +152,7 @@ module Clacky
152
152
 
153
153
  # === State updates (no-ops for IM) ===
154
154
 
155
- def update_sessionbar(tasks: nil, cost: nil, cost_source: nil, status: nil); end
155
+ def update_sessionbar(tasks: nil, cost: nil, cost_source: nil, status: nil, latency: nil); end
156
156
  def update_todos(todos); end
157
157
  def set_working_status; end
158
158
  def set_idle_status; end