RubyGems - openclacky - Versions diffs - 0.8.5 → 0.8.7 - Mend

openclacky 0.8.5 → 0.8.7

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (69)

checksums.yaml +4 -4
data/CHANGELOG.md +52 -0
data/docs/channel-architecture.md +235 -0
data/lib/clacky/agent/memory_updater.rb +3 -2
data/lib/clacky/agent/session_serializer.rb +48 -3
data/lib/clacky/agent/skill_manager.rb +1 -1
data/lib/clacky/agent.rb +34 -15
data/lib/clacky/brand_config.rb +352 -43
data/lib/clacky/cli.rb +5 -4
data/lib/clacky/client.rb +2 -2
data/lib/clacky/default_skills/channel-setup/SKILL.md +204 -0
data/lib/clacky/default_skills/cron-task-creator/SKILL.md +250 -0
data/lib/clacky/default_skills/cron-task-creator/evals/evals.json +38 -0
data/lib/clacky/default_skills/cron-task-creator/scripts/list_tasks.rb +121 -0
data/lib/clacky/default_skills/cron-task-creator/scripts/manage_schedule.rb +149 -0
data/lib/clacky/default_skills/cron-task-creator/scripts/manage_task.rb +81 -0
data/lib/clacky/default_skills/cron-task-creator/scripts/task_history.rb +137 -0
data/lib/clacky/default_skills/pdf-reader/SKILL.md +90 -0
data/lib/clacky/default_skills/skill-add/SKILL.md +29 -252
data/lib/clacky/default_skills/skill-add/scripts/install_from_zip.rb +233 -0
data/lib/clacky/default_skills/skill-creator/SKILL.md +547 -0
data/lib/clacky/default_skills/skill-creator/agents/analyzer.md +274 -0
data/lib/clacky/default_skills/skill-creator/agents/comparator.md +202 -0
data/lib/clacky/default_skills/skill-creator/agents/grader.md +223 -0
data/lib/clacky/default_skills/skill-creator/eval-viewer/generate_review.py +471 -0
data/lib/clacky/default_skills/skill-creator/eval-viewer/viewer.html +1325 -0
data/lib/clacky/default_skills/skill-creator/references/schemas.md +430 -0
data/lib/clacky/default_skills/skill-creator/scripts/__init__.py +0 -0
data/lib/clacky/default_skills/skill-creator/scripts/aggregate_benchmark.py +401 -0
data/lib/clacky/default_skills/skill-creator/scripts/generate_report.py +326 -0
data/lib/clacky/default_skills/skill-creator/scripts/improve_description.py +310 -0
data/lib/clacky/default_skills/skill-creator/scripts/quick_validate.py +103 -0
data/lib/clacky/default_skills/skill-creator/scripts/run_eval.py +317 -0
data/lib/clacky/default_skills/skill-creator/scripts/run_loop.py +331 -0
data/lib/clacky/default_skills/skill-creator/scripts/utils.py +47 -0
data/lib/clacky/server/channel/adapters/base.rb +82 -0
data/lib/clacky/server/channel/adapters/feishu/adapter.rb +172 -0
data/lib/clacky/server/channel/adapters/feishu/bot.rb +191 -0
data/lib/clacky/server/channel/adapters/feishu/message_parser.rb +106 -0
data/lib/clacky/server/channel/adapters/feishu/ws_client.rb +385 -0
data/lib/clacky/server/channel/adapters/wecom/adapter.rb +106 -0
data/lib/clacky/server/channel/adapters/wecom/ws_client.rb +188 -0
data/lib/clacky/server/channel/channel_config.rb +146 -0
data/lib/clacky/server/channel/channel_manager.rb +230 -0
data/lib/clacky/server/channel/channel_ui_controller.rb +179 -0
data/lib/clacky/server/channel.rb +29 -0
data/lib/clacky/server/http_server.rb +401 -12
data/lib/clacky/server/web_ui_controller.rb +73 -1
data/lib/clacky/skill.rb +25 -11
data/lib/clacky/skill_loader.rb +15 -7
data/lib/clacky/tools/browser.rb +300 -43
data/lib/clacky/tools/file_reader.rb +3 -3
data/lib/clacky/tools/shell.rb +22 -0
data/lib/clacky/utils/file_processor.rb +2 -2
data/lib/clacky/utils/logger.rb +20 -0
data/lib/clacky/version.rb +1 -1
data/lib/clacky/web/app.css +509 -17
data/lib/clacky/web/app.js +143 -34
data/lib/clacky/web/channels.js +196 -0
data/lib/clacky/web/icon-dark.svg +23 -0
data/lib/clacky/web/icon.svg +26 -0
data/lib/clacky/web/index.html +31 -7
data/lib/clacky/web/sessions.js +14 -1
data/lib/clacky/web/settings.js +2 -2
data/lib/clacky/web/skills.js +353 -108
data/lib/clacky/web/tasks.js +2 -2
metadata +40 -3
data/lib/clacky/default_skills/create-task/SKILL.md +0 -102
data/lib/clacky/default_skills/skill-add/scripts/install_from_github.rb +0 -189

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: eafcf68d56923cdd3aaacc446277756c77661eaf5da6348ac7f54c565a141bd3
-  data.tar.gz: 5f09c5ffdfba608554be327b9e7bf6ea1e7df57ceedf2ead6402ae8f74ce8780
+  metadata.gz: 38f9805e951dec0f87bda1b64033e0ea7f0c5c6d1c4fd2427f57dfc13aec0835
+  data.tar.gz: f6f0d08206ead392ffbbc073bb92c5b8e5b4c9f4ecf37172153c4bf46f4963e0
 SHA512:
-  metadata.gz: 223dc788074dc74f3e61f1980eb9bab3ef0fb1d15d1644bf500013ae058212d52b8184be51762c08adeeee3d856d2eab4c78d7ea142fa69b3273e81003c80d7c
-  data.tar.gz: 58fbeba9eb4f17f02d61a61707c82e16f3a5d152d0ae95cc9ad745d356470a677ab471093271f54ca3e25ddffe9203060e8612698269c96d18015f24288f912a
+  metadata.gz: d7400735f1f2cbf9fa6b74e56aaa9264e881ab0618885e87b9757458b3b87bde01c5319db6d6f6833573792229c8aa635d5c09bab43cdde15e8cddfe2ce3e418
+  data.tar.gz: ef4dede49038208ff386f5b536ba4c64158e5b72f5599694f14ecf83bd3259b51be6af52bef10fbdea88fbc23f2b2b11c9316e1bdbb1f350c355a0fedeb23bd1
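
The digests above can be checked locally once the gem archive has been unpacked. The following is a minimal sketch, assuming the `checksums.yaml` layout shown in this hunk; the helper name `checksum_matches?` and the sample payload are hypothetical and are not part of the gem.

```ruby
require "digest"
require "yaml"

# Hypothetical helper: compares the SHA256 digest of some bytes against the
# value recorded under the "SHA256" key of a checksums.yaml document.
def checksum_matches?(checksums_yaml, entry_name, file_bytes)
  expected = YAML.safe_load(checksums_yaml).dig("SHA256", entry_name)
  return false if expected.nil?

  Digest::SHA256.hexdigest(file_bytes) == expected
end

# Demonstration with made-up bytes (not the real gem payload).
sample_bytes = "example payload"
sample_yaml = <<~YAML
  ---
  SHA256:
    data.tar.gz: #{Digest::SHA256.hexdigest(sample_bytes)}
YAML

puts checksum_matches?(sample_yaml, "data.tar.gz", sample_bytes)  # true
puts checksum_matches?(sample_yaml, "data.tar.gz", "tampered")    # false
```

For a real check, `file_bytes` would come from something like `File.binread("data.tar.gz")` and `checksums_yaml` from the unpacked `checksums.yaml`.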

data/CHANGELOG.md CHANGED

@@ -7,6 +7,58 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 ## [Unreleased]
+## [0.8.7] - 2026-03-13
+### Added
+- **PDF file upload and reading**: users can now upload PDF files directly in the WebUI chat; the agent reads and analyzes the content via the built-in `pdf-reader` skill
+- **WebUI favicon and SVG icons**: browser tab now shows the Clacky icon
+- **Public skill store install**: skills from the public store can be installed directly via the WebUI without a GitHub URL
+- **Auto-kill previous server on startup**: launching `clacky serve` now automatically kills any previously running instance via pidfile, preventing port conflicts
+### Improved
+- **Brand skill loading speed**: loading brand skills no longer triggers a network decryption request — name and description are now read from the local `brand_skills.json` cache, making New Session significantly faster
+- **Memory update UX**: memory update step now shows a spinner and info-style message instead of a bare log line
+- **Browser snapshot output**: snapshot output is compressed to reduce token cost when the agent uses browser tools
+- **Subagent output**: subagent task completion now shows a brief info line instead of a full "Task Complete" block, reducing noise in the parent agent's context
+### Fixed
+- **Subagent token delta on first iteration**: subagent now inherits `previous_total_tokens` correctly, fixing an inflated token count on the first tool iteration
+- **Chrome DevTools inspect URL**: updated the remote debugging URL to include the `#remote-debugging` fragment for correct navigation
+- **Shell output token explosion**: long lines in shell output are now truncated to prevent excessive token usage
+### More
+- Binary file size limit lowered from 5 MB to 512 KB to reduce accidental token cost
+- `kill_existing_server` logic moved from CLI into `HttpServer` for cleaner separation
+- Browser tool prefers `snapshot -i` over `screenshot` for lower token cost
+- Cross-platform PID file path using `Dir.tmpdir` instead of hardcoded `/tmp`
+## [0.8.6] - 2026-03-12
+### Added
+- **Channel system with Feishu & WeCom support**: integrated IM platform adapters — agents can now receive and reply to messages via Feishu (WebSocket) and WeCom channels
+- **Skill encryption (brand skills)**: brand skills can be distributed as encrypted `.enc` files, decrypted on-the-fly using license keys; includes a full key management and manifest system
+- **Cron task creator & skill creator default skills**: two new built-in skills for creating scheduled tasks and new skills directly from chat
+- **Image messages in session history restore**: session restore now correctly replays image-containing messages, including thumbnail display in the UI
+- **Skill auto-upload to cloud**: skills can be uploaded to the cloud store from within the UI
+### Improved
+- **WeCom setup flow**: improved step-by-step WeCom channel configuration UX (#11)
+- **Skill autocomplete UI**: enhanced slash-command autocomplete interaction — better keyboard navigation, input behavior, and visual feedback (#6)
+- **Chrome setup UX**: simplified Chrome installation flow with improved error messages and progress indicators (#8)
+- **WebUI colors and layout**: polished light/dark mode colors, sidebar alignment, and badge styles for a more consistent look
+- **Test suite speed**: `CLACKY_TEST` guard prevents brand skill network calls during tests — suite now runs ~60× faster per example
+### Fixed
+- **Duplicate user bubble on skill install**: prevented an extra chat bubble appearing when installing a skill from the store
+- **Image thumbnails in session replay**: restored missing image thumbnails when replaying historical sessions
+- **WebUI permission mode**: Web UI sessions now correctly use `confirm_all` permission mode
+- **Feishu WS log noise**: removed emoji characters from WebSocket connection log messages
+### More
+- Subagent memory update disabled to reduce noise
+- Ping request `max_tokens` bumped from 10 to 16
+- WebUI updated to use new cron-task-creator and skill-creator skills
 ## [0.8.5] - 2026-03-11
 ### Fixed
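
The 0.8.7 entries mention a pidfile under `Dir.tmpdir` that is used to find a previously running server. The following is a rough sketch of how such a path could be built and read back; the file name, the port argument, and both helper names are assumptions, since the `HttpServer` hunk is not shown in this part of the diff.

```ruby
require "tmpdir"

# Hypothetical file name; the gem's actual pidfile name is not shown here.
def pid_file_path(port)
  File.join(Dir.tmpdir, "clacky-server-#{port}.pid")
end

# Returns the PID recorded in the file, or nil when the file is missing
# or does not contain an integer.
def read_pid(path)
  Integer(File.read(path).strip)
rescue Errno::ENOENT, ArgumentError
  nil
end

path = pid_file_path(7070)
File.write(path, Process.pid.to_s)
puts read_pid(path) == Process.pid  # true
File.delete(path)
puts read_pid(path).inspect         # nil
```

Using `Dir.tmpdir` rather than a literal `/tmp` keeps the path valid on platforms where the temporary directory lives elsewhere.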

data/docs/channel-architecture.md ADDED

@@ -0,0 +1,235 @@
+# Channel Architecture
+## Overview
+Channel is a feature that bridges Clacky's Server Sessions to IM platforms
+(Feishu, WeCom, DingTalk, etc.). It reuses the existing Agent + SessionRegistry
+infrastructure — the Agent knows nothing about IM; the Channel layer is purely
+a transport adapter.
+## Design Principles
+- **Zero Agent intrusion** — Agent only speaks `UIInterface`; swap the controller, get IM output
+- **Reuse SessionRegistry** — IM chats resolve to the same `SessionRegistry` sessions as Web UI
+- **WebSocket long connection** — No public domain required; adapters hold a persistent WSS connection to the IM platform
+- **One platform = 2 threads** — read loop thread + ping/heartbeat thread (constant, small footprint)
+---
+## Layer Diagram
+```
+IM Platforms (Feishu / WeCom / DingTalk)
+      │  WebSocket long connection (wss://)
+      ▼
+┌─────────────────────────────────────┐
+│       Channel Adapter Layer         │
+│  Feishu::Adapter                    │
+│    ├── WSClient   (read loop + ping) │
+│    ├── Bot        (send API)         │
+│    └── MessageParser                │
+│  Wecom::Adapter                     │
+│    └── WSClient   (read loop + ping) │
+│  (future) Dingtalk::Adapter         │
+└──────────────┬──────────────────────┘
+               │ standardized event Hash
+               ▼
+┌─────────────────────────────────────┐
+│          ChannelManager             │
+│  • Owns adapter threads             │
+│  • Routes inbound event →           │
+│    ChannelBinding → session_id      │
+│  • Calls agent.run in Thread.new    │
+└──────────────┬──────────────────────┘
+               │
+       ┌───────┴────────┐
+       ▼                ▼
+SessionRegistry    ChannelUIController
+(existing)         (implements UIInterface)
+       │                │
+       ▼                ▼
+    Agent            IM Platform reply
+  (unchanged)       via adapter.send_text
+```
+---
+## File Structure
+```
+lib/clacky/channel/
+├── adapters/
+│   ├── base.rb                  # Adapter abstract base + registry
+│   ├── feishu/
+│   │   ├── adapter.rb           # Feishu::Adapter < Base
+│   │   ├── bot.rb               # HTTP send API (token cache, Markdown/card)
+│   │   ├── message_parser.rb    # Raw WS event → standardized Hash
+│   │   └── ws_client.rb         # Feishu protobuf WS long connection
+│   └── wecom/
+│       ├── adapter.rb           # Wecom::Adapter < Base
+│       └── ws_client.rb         # WeCom JSON WS long connection
+├── channel_message.rb           # Struct: standardized inbound message
+├── channel_binding.rb           # (platform, user_id) → session_id mapping
+├── channel_ui_controller.rb     # UIInterface impl — pushes events to IM
+└── channel_manager.rb           # Lifecycle: start/stop adapters, route messages
+lib/clacky/channel.rb            # Top-level require entry point
+```
+---
+## Standardized Inbound Event
+All adapters yield the same Hash shape to `ChannelManager`:
+```ruby
+{
+  platform:   :feishu,          # Symbol
+  chat_id:    "oc_xxx",         # String — IM chat/group identifier
+  user_id:    "ou_xxx",         # String — IM user identifier
+  text:       "deploy now",     # String — cleaned user text
+  message_id: "om_xxx",         # String — for threading / update
+  timestamp:  Time,             # Time object
+  chat_type:  :direct | :group, # Symbol
+  raw:        { ... }           # Original platform payload
+}
+```
+---
+## Adapter Interface (Base)
+```ruby
+class Adapters::Base
+  def self.platform_id → Symbol
+  def self.platform_config(raw_config) → Hash   # symbol-keyed
+  def self.env_keys → Array<String>             # for config serialization
+  def start(&on_message)   # blocks; yields event Hash per inbound message
+  def stop                 # graceful shutdown
+  def send_text(chat_id, text, reply_to: nil) → Hash
+  def update_message(chat_id, message_id, text) → Boolean
+  def supports_message_updates? → Boolean
+  def validate_config(config) → Array<String>   # error messages
+end
+```
+---
+## ChannelManager
+```ruby
+class ChannelManager
+  def initialize(session_registry:, session_builder:, channel_config:, agent_config:)
+  def start   # Thread.new per enabled platform adapter
+  def stop    # kills all adapter threads gracefully
+  private
+  def route_message(adapter, event)
+    session_id = @binding.resolve_or_create(event, session_builder: @session_builder)
+    ui         = ChannelUIController.new(event, adapter)
+    Thread.new { run_agent(session_id, event[:text], ui) }
+  end
+end
+```
+---
+## ChannelBinding
+Maps `(platform, user_id)` → `session_id`. Persisted to `~/.clacky/channel_bindings.yml`.
+Binding modes (configurable per platform):
+| Mode | Key | Description |
+|------|-----|-------------|
+| `user` | `(platform, user_id)` | Each IM user gets their own session (default) |
+| `chat` | `(platform, chat_id)` | Whole group shares one session |
+---
+## ChannelUIController
+Implements `UIInterface`. Key behaviours:
+- `show_assistant_message` → `adapter.send_text(chat_id, content)`
+- `show_tool_call` → buffers as `⚙️ \`tool summary\`` (flushed on next message)
+- `show_progress` → `adapter.update_message(...)` if `supports_message_updates?`
+- `show_complete` → sends `✅ Complete • N iterations • $cost`
+- `request_confirmation` → **not supported in IM** (returns auto-approved / raises)
+---
+## Thread Model
+```
+Main thread  (WEBrick server.start — blocks)
+├── WEBrick request threads    (existing)
+├── Agent task threads         (existing, per task)
+├── Scheduler thread           (existing, clacky-scheduler)
+└── ChannelManager
+    ├── feishu-adapter thread  (WSClient read loop, constant)
+    │   └── feishu-ping thread (heartbeat, 90s)
+    └── wecom-adapter thread   (WSClient read loop, constant)
+        └── wecom-ping thread  (heartbeat, 30s)
+```
+Per enabled platform: **2 constant threads**. Agent task threads are spawned
+on demand (same as Web UI path) and exit when done.
+---
+## Configuration
+Channel credentials live in `~/.clacky/channels.yml` (managed by `ChannelConfig`
+which already exists in main branch):
+```yaml
+channels:
+  feishu:
+    enabled: true
+    app_id: cli_xxx
+    app_secret: xxx
+    allowed_users:
+      - ou_xxx
+  wecom:
+    enabled: false
+    bot_id: xxx
+    secret: xxx
+```
+`ChannelManager` reads this via `ChannelConfig#platform_config(platform)`.
+---
+## Integration with HttpServer
+```ruby
+# HttpServer#initialize
+@channel_manager = ChannelManager.new(
+  session_registry: @registry,
+  session_builder:  method(:build_session),
+  channel_config:   Clacky::ChannelConfig.load,
+  agent_config:     @agent_config
+)
+# HttpServer#start  (after scheduler.start)
+@channel_manager.start
+```
+`ChannelManager#start` is non-blocking (spawns threads internally),
+mirroring `Scheduler#start` behaviour.
+---
+## Future: DingTalk
+DingTalk also supports a WebSocket Stream mode. Adding it means:
+1. `lib/clacky/channel/adapters/dingtalk/adapter.rb` inheriting `Base`
+2. `lib/clacky/channel/adapters/dingtalk/ws_client.rb`
+3. Register: `Adapters.register(:dingtalk, Adapter)`
+4. Add credentials to `ChannelConfig`
+No changes needed to `ChannelManager`, `ChannelUIController`, or `ChannelBinding`.
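
The two binding modes described in the ChannelBinding table can be illustrated with a small in-memory stand-in. This is only a sketch: `SketchBinding`, its methods, and the session id format are hypothetical, and the real `ChannelBinding` (which persists to `~/.clacky/channel_bindings.yml`) may expose a different API.

```ruby
# Hypothetical in-memory stand-in for ChannelBinding.
class SketchBinding
  def initialize(mode: :user)
    @mode = mode
    @sessions = {}
  end

  # Build the lookup key: (platform, user_id) in :user mode,
  # (platform, chat_id) in :chat mode.
  def key_for(event)
    id = @mode == :chat ? event[:chat_id] : event[:user_id]
    [event[:platform], id]
  end

  # Return the session id bound to this key, creating one via the block
  # only when no binding exists yet.
  def resolve_or_create(event)
    @sessions[key_for(event)] ||= yield
  end
end

counter = 0
new_session = -> { "session-#{counter += 1}" }

# Two users writing in the same group chat.
alice = { platform: :feishu, chat_id: "oc_1", user_id: "ou_a", text: "hi" }
bob   = { platform: :feishu, chat_id: "oc_1", user_id: "ou_b", text: "hello" }

per_user = SketchBinding.new(mode: :user)
per_chat = SketchBinding.new(mode: :chat)

p [per_user.resolve_or_create(alice, &new_session), per_user.resolve_or_create(bob, &new_session)]
# => ["session-1", "session-2"]
p [per_chat.resolve_or_create(alice, &new_session), per_chat.resolve_or_create(bob, &new_session)]
# => ["session-3", "session-3"]
```

In `user` mode each sender ends up with a separate session, while in `chat` mode both senders share the session bound to the group.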

data/lib/clacky/agent/memory_updater.rb CHANGED

@@ -26,6 +26,7 @@ module Clacky
       # @return [Boolean]
       def should_update_memory?
         return false unless memory_update_enabled?
+        return false if @is_subagent  # Subagents never update memory
         task_iterations = @iterations - (@task_start_iterations || 0)
         task_iterations >= MEMORY_UPDATE_MIN_ITERATIONS
@@ -41,7 +42,7 @@ module Clacky
         @memory_prompt_injected = true
         @memory_updating = true
-        @ui&.show_info("Updating long-term memory...")
+        @ui&.show_progress("Updating long-term memory…")
         @messages << {
           role: "user",
@@ -61,7 +62,7 @@ module Clacky
         @messages.reject! { |m| m[:memory_update] }
         @memory_prompt_injected = false
         @memory_updating = false
-        @ui&.show_info("Memory updated.")
+        @ui&.clear_progress
       end
       private def memory_update_enabled?

data/lib/clacky/agent/session_serializer.rb CHANGED

@@ -153,8 +153,20 @@ module Clacky
         @messages.each do |msg|
           role = msg[:role].to_s
-          if role == "user" && !msg[:system_injected] && msg[:content].is_a?(String) &&
-             !msg[:content].to_s.start_with?("[SYSTEM]")
+          # A real user message can have either a String content or an Array content
+          # (Array = multipart: text + image blocks). Exclude system-injected messages
+          # and synthetic [SYSTEM] text messages.
+          is_real_user_msg = role == "user" && !msg[:system_injected] &&
+            if msg[:content].is_a?(String)
+              !msg[:content].start_with?("[SYSTEM]")
+            elsif msg[:content].is_a?(Array)
+              # Must contain at least one text or image block (not a tool_result array)
+              msg[:content].any? { |b| b.is_a?(Hash) && %w[text image].include?(b[:type].to_s) }
+            else
+              false
+            end
+          if is_real_user_msg
             # Start a new round at each real user message
             current_round = { user_msg: msg, events: [] }
             rounds << current_round
@@ -175,8 +187,10 @@ module Clacky
         page.each do |round|
           msg = round[:user_msg]
           display_text = extract_text_from_content(msg[:content])
+          # Extract image data URLs from multipart content (for history replay rendering)
+          images = extract_images_from_content(msg[:content])
           # Emit user message with its timestamp for dedup on the frontend
-          ui.show_user_message(display_text, created_at: msg[:created_at])
+          ui.show_user_message(display_text, created_at: msg[:created_at], images: images)
           round[:events].each do |ev|
             # Skip system-injected messages (e.g. synthetic skill content, memory prompts)
@@ -241,6 +255,37 @@ module Clacky
         Clacky::Logger.warn("refresh_system_prompt failed during session restore: #{e.message}")
       end
+      # Extract base64 data URLs from multipart content (image blocks).
+      # Returns an empty array when there are no images or content is plain text.
+      # @param content [String, Array, Object] Message content
+      # @return [Array<String>] Array of data URLs (e.g. "data:image/png;base64,...")
+      def extract_images_from_content(content)
+        return [] unless content.is_a?(Array)
+        content.filter_map do |block|
+          next unless block.is_a?(Hash)
+          case block[:type].to_s
+          when "image_url"
+            # OpenAI format: { type: "image_url", image_url: { url: "data:image/png;base64,..." } }
+            block.dig(:image_url, :url)
+          when "image"
+            # Anthropic format: { type: "image", source: { type: "base64", media_type: "image/png", data: "..." } }
+            source = block[:source]
+            next unless source.is_a?(Hash) && source[:type].to_s == "base64"
+            "data:#{source[:media_type]};base64,#{source[:data]}"
+          when "document"
+            # Anthropic PDF document block — return a sentinel string for frontend display
+            source = block[:source]
+            next unless source.is_a?(Hash) && source[:media_type].to_s == "application/pdf"
+            # Return a special marker so the frontend can render a PDF badge instead of an <img>
+            "pdf:#{source[:data]&.then { |d| d[0, 32] }}"  # prefix to identify without full payload
+          end
+        end
+      end
       # Extract text from message content (handles string and array formats)
       # @param content [String, Array, Object] Message content
       # @return [String] Extracted text
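
To see what the new `extract_images_from_content` method returns for each block type, the logic from the hunk above can be lifted into a standalone method and run against sample content. The method body mirrors the added lines; the sample payloads are made up for illustration.

```ruby
# Standalone copy of the branch logic added in this hunk, lifted out of
# the Clacky module so it can run on its own.
def extract_images_from_content(content)
  return [] unless content.is_a?(Array)

  content.filter_map do |block|
    next unless block.is_a?(Hash)

    case block[:type].to_s
    when "image_url"
      # OpenAI-style block: the URL is already a data URL.
      block.dig(:image_url, :url)
    when "image"
      # Anthropic-style block: rebuild a data URL from the base64 source.
      source = block[:source]
      next unless source.is_a?(Hash) && source[:type].to_s == "base64"

      "data:#{source[:media_type]};base64,#{source[:data]}"
    when "document"
      # PDF block: emit a short "pdf:" marker instead of the full payload.
      source = block[:source]
      next unless source.is_a?(Hash) && source[:media_type].to_s == "application/pdf"

      "pdf:#{source[:data]&.then { |d| d[0, 32] }}"
    end
  end
end

# Made-up sample content covering every branch plus a plain text block.
content = [
  { type: "text", text: "What is in these files?" },
  { type: "image_url", image_url: { url: "data:image/png;base64,AAAA" } },
  { type: "image", source: { type: "base64", media_type: "image/jpeg", data: "BBBB" } },
  { type: "document", source: { type: "base64", media_type: "application/pdf", data: "JVBERi0xLjQ=" } }
]

p extract_images_from_content(content)
# => ["data:image/png;base64,AAAA", "data:image/jpeg;base64,BBBB", "pdf:JVBERi0xLjQ="]
p extract_images_from_content("plain text")
# => []
```

Text blocks and unrecognized block types fall through the `case` and are dropped by `filter_map`, so only image and PDF entries reach the frontend.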

data/lib/clacky/agent/skill_manager.rb CHANGED

@@ -184,7 +184,7 @@ module Clacky
           system_injected: true
         }
-        @ui&.log("Injected skill content for /#{skill.identifier}", level: :info)
+        @ui&.show_info("Injected skill content for /#{skill.identifier}")
       end
       private

data/lib/clacky/agent.rb CHANGED

@@ -141,7 +141,7 @@ module Clacky
       @config.model_name
     end
-    def run(user_input, images: [])
+    def run(user_input, images: [], files: [])
       # Start new task for Time Machine
       task_id = start_new_task
@@ -172,8 +172,8 @@ module Clacky
         @messages << system_message
       end
-      # Format user message with images if provided
-      user_content = format_user_content(user_input, images)
+      # Format user message with images and files if provided
+      user_content = format_user_content(user_input, images, files)
       @messages << { role: "user", content: user_content, task_id: task_id, created_at: Time.now.to_f }
       @total_tasks += 1
@@ -208,7 +208,12 @@ module Clacky
           # Check if done (no more tool calls needed)
           if response[:finish_reason] == "stop" || response[:tool_calls].nil? || response[:tool_calls].empty?
-            @ui&.show_assistant_message(response[:content]) if response[:content] && !response[:content].empty?
+            # During memory update phase, show LLM response as info (not a chat bubble)
+            if @memory_updating && response[:content] && !response[:content].empty?
+              @ui&.show_info("🧠 " + response[:content].strip)
+            elsif response[:content] && !response[:content].empty?
+              @ui&.show_assistant_message(response[:content])
+            end
             # Debug: log why we're stopping
             if @config.verbose && (response[:tool_calls].nil? || response[:tool_calls].empty?)
@@ -227,7 +232,8 @@ module Clacky
           end
           # Show assistant message if there's content before tool calls
-          if response[:content] && !response[:content].empty?
+          # During memory update phase, suppress text output (only tool calls matter)
+          if response[:content] && !response[:content].empty? && !@memory_updating
             @ui&.show_assistant_message(response[:content])
           end
@@ -272,13 +278,17 @@ module Clacky
           @modified_files_in_task = []  # Reset for next task
         end
-        @ui&.show_complete(
-          iterations: result[:iterations],
-          cost: result[:total_cost_usd],
-          duration: result[:duration_seconds],
-          cache_stats: result[:cache_stats],
-          awaiting_user_feedback: awaiting_user_feedback
-        )
+        if @is_subagent
+          @ui&.show_info("Subagent done (#{result[:iterations]} iterations, $#{result[:total_cost_usd].round(4)})")
+        else
+          @ui&.show_complete(
+            iterations: result[:iterations],
+            cost: result[:total_cost_usd],
+            duration: result[:duration_seconds],
+            cache_stats: result[:cache_stats],
+            awaiting_user_feedback: awaiting_user_feedback
+          )
+        end
         @hooks.trigger(:on_complete, result)
         result
       rescue Clacky::AgentInterrupted
@@ -714,6 +724,10 @@ module Clacky
         ui: @ui,
         profile: @agent_profile.name
       )
+      subagent.instance_variable_set(:@is_subagent, true)
+      # Inherit previous_total_tokens so the first iteration delta is calculated correctly
+      subagent.instance_variable_set(:@previous_total_tokens, @previous_total_tokens)
       # Deep clone messages to avoid cross-contamination
       subagent.instance_variable_set(:@messages, deep_clone(@messages))
@@ -809,11 +823,16 @@ module Clacky
     end
     # Format user content with optional images
+    # PDF files are handled upstream (server injects file path into message text),
+    # so this method only needs to handle images.
     # @param text [String] User's text input
     # @param images [Array<String>] Array of image file paths or data: URLs
-    # @return [String|Array] String if no images, Array with text and image_url objects if images present
-    private def format_user_content(text, images)
-      return text if images.nil? || images.empty?
+    # @param files [Array] Unused — kept for signature compatibility
+    # @return [String|Array] String if no images, Array with content blocks otherwise
+    private def format_user_content(text, images, files = [])
+      images ||= []
+      return text if images.empty?
       content = []
       content << { type: "text", text: text } unless text.nil? || text.empty?