RubyGems - openclacky - Versions diffs - 0.7.0 → 0.7.2 - Mend

openclacky 0.7.0 → 0.7.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (78)

checksums.yaml +4 -4
data/.clacky/skills/commit/SKILL.md +29 -4
data/.clackyrules +3 -1
data/CHANGELOG.md +103 -2
data/README.md +70 -161
data/bin/clarky +11 -0
data/docs/HOW-TO-USE-CN.md +96 -0
data/docs/HOW-TO-USE.md +94 -0
data/docs/config.example.yml +27 -0
data/docs/deploy_subagent_design.md +540 -0
data/docs/time_machine_design.md +247 -0
data/docs/why-openclacky.md +0 -1
data/lib/clacky/agent/cost_tracker.rb +180 -0
data/lib/clacky/agent/llm_caller.rb +54 -0
data/lib/clacky/{message_compressor.rb → agent/message_compressor.rb} +12 -36
data/lib/clacky/agent/message_compressor_helper.rb +534 -0
data/lib/clacky/agent/session_serializer.rb +152 -0
data/lib/clacky/agent/skill_manager.rb +138 -0
data/lib/clacky/agent/system_prompt_builder.rb +96 -0
data/lib/clacky/agent/time_machine.rb +199 -0
data/lib/clacky/agent/tool_executor.rb +434 -0
data/lib/clacky/{tool_registry.rb → agent/tool_registry.rb} +1 -1
data/lib/clacky/agent.rb +260 -1370
data/lib/clacky/agent_config.rb +447 -10
data/lib/clacky/cli.rb +275 -98
data/lib/clacky/client.rb +12 -2
data/lib/clacky/default_skills/code-explorer/SKILL.md +34 -0
data/lib/clacky/default_skills/deploy/SKILL.md +13 -0
data/lib/clacky/default_skills/deploy/scripts/rails_deploy.rb +383 -0
data/lib/clacky/default_skills/deploy/tools/check_health.rb +116 -0
data/lib/clacky/default_skills/deploy/tools/execute_deployment.rb +174 -0
data/lib/clacky/default_skills/deploy/tools/fetch_runtime_logs.rb +67 -0
data/lib/clacky/default_skills/deploy/tools/list_services.rb +80 -0
data/lib/clacky/default_skills/deploy/tools/report_deploy_status.rb +67 -0
data/lib/clacky/default_skills/deploy/tools/set_deploy_variables.rb +138 -0
data/lib/clacky/default_skills/new/SKILL.md +2 -2
data/lib/clacky/json_ui_controller.rb +195 -0
data/lib/clacky/providers.rb +107 -0
data/lib/clacky/skill.rb +48 -7
data/lib/clacky/skill_loader.rb +7 -0
data/lib/clacky/tools/edit.rb +105 -48
data/lib/clacky/tools/file_reader.rb +44 -73
data/lib/clacky/tools/invoke_skill.rb +89 -0
data/lib/clacky/tools/list_tasks.rb +54 -0
data/lib/clacky/tools/redo_task.rb +41 -0
data/lib/clacky/tools/safe_shell.rb +1 -1
data/lib/clacky/tools/shell.rb +74 -62
data/lib/clacky/tools/trash_manager.rb +1 -1
data/lib/clacky/tools/undo_task.rb +32 -0
data/lib/clacky/tools/web_fetch.rb +2 -1
data/lib/clacky/ui2/components/command_suggestions.rb +13 -3
data/lib/clacky/ui2/components/inline_input.rb +23 -2
data/lib/clacky/ui2/components/input_area.rb +65 -21
data/lib/clacky/ui2/components/modal_component.rb +199 -62
data/lib/clacky/ui2/layout_manager.rb +75 -25
data/lib/clacky/ui2/line_editor.rb +23 -2
data/lib/clacky/ui2/markdown_renderer.rb +31 -10
data/lib/clacky/ui2/screen_buffer.rb +2 -0
data/lib/clacky/ui2/ui_controller.rb +316 -37
data/lib/clacky/ui2.rb +2 -0
data/lib/clacky/ui_interface.rb +50 -0
data/lib/clacky/utils/arguments_parser.rb +31 -3
data/lib/clacky/utils/file_processor.rb +13 -18
data/lib/clacky/version.rb +1 -1
data/lib/clacky.rb +19 -9
data/scripts/install.sh +274 -97
data/scripts/uninstall.sh +12 -12
metadata +40 -13
data/.clacky/skills/test-skill/SKILL.md +0 -15
data/lib/clacky/compression/base.rb +0 -231
data/lib/clacky/compression/standard.rb +0 -339
data/lib/clacky/config.rb +0 -117
data/lib/clacky/{hook_manager.rb → agent/hook_manager.rb} +0 -0
data/lib/clacky/{progress_indicator.rb → ui2/progress_indicator.rb} +0 -0
data/lib/clacky/{thinking_verbs.rb → ui2/thinking_verbs.rb} +0 -0
data/lib/clacky/{gitignore_parser.rb → utils/gitignore_parser.rb} +0 -0
data/lib/clacky/{model_pricing.rb → utils/model_pricing.rb} +0 -0
data/lib/clacky/{trash_directory.rb → utils/trash_directory.rb} +0 -0

data/docs/time_machine_design.md ADDED

@@ -0,0 +1,247 @@
+# Time Machine Design Documentation
+## Overview
+Time Machine is a feature that lets users navigate the agent's task execution history, providing undo/redo capabilities and branch exploration. Users can open it with the ESC key or the `/undo` command to view an interactive menu of past tasks.
+## Core Data Structure Design
+### Task History Graph
+The Time Machine uses a minimal tree-based data structure to track task relationships:
+**Three Core State Variables:**
+1. **task_parents** (Hash): Maps each task_id to its parent_id
+   - Forms a tree structure where each task points to its predecessor
+   - Root tasks have parent_id = 0
+   - Enables traversal in both directions (parent→children, child→parent)
+2. **current_task_id** (Integer): The latest created task ID
+   - Always increments when new tasks are created
+   - Never decreases, even during undo operations
+   - Represents the "tip" of the execution timeline
+3. **active_task_id** (Integer): The current active position in history
+   - Can move backward/forward during undo/redo
+   - Determines which messages are visible to the LLM
+   - When active_task_id < current_task_id, we're viewing "past" state
+### Task Metadata Structure
+Each task in the history contains:
+- **task_id**: Unique identifier (auto-incrementing integer)
+- **summary**: Brief description (first 80 chars of user's message)
+- **status**: One of three states
+  - `:past` - Task is before the current active position
+  - `:current` - Task is the active position (marked with `→`)
+  - `:future` - Task exists but is after active position (marked with `↯`)
+- **has_branches**: Boolean indicating if multiple children exist (marked with `⎇`)
+## Snapshot Strategy
+### File State Preservation
+**Complete AFTER-State Snapshots:**
+- After each successful task execution, all modified files are saved
+- Storage location: `~/.clacky/snapshots/{session_id}/task-{id}/`
+- Each file is stored with its full relative path from working directory
+- Only files modified during that task are snapshotted
+**Why AFTER-state instead of BEFORE-state:**
+- Simpler restoration logic (just copy files back)
+- No need to track "what changed" - the snapshot IS the state
+- Easier to verify correctness (snapshot = expected state)
+**File Restoration Process:**
+- When switching to a task, iterate through all its snapshotted files
+- Copy each file from snapshot directory to working directory
+- File permissions and timestamps are preserved
+### Message Filtering
+**Active Messages Concept:**
+- Messages array contains ALL messages (past, current, future)
+- `active_messages()` method filters out "future" messages
+- LLM only sees messages with `task_id <= active_task_id`
+- This creates the illusion of time travel without data deletion
+**Why Keep All Messages:**
+- Enables redo operations (future messages preserved)
+- Allows branch switching (alternative futures available)
+- Simplifies session serialization (single source of truth)
+## Session Persistence
+### State Serialization
+Time Machine state is saved under `:time_machine` key in session data:
+- task_parents hash (complete tree structure)
+- current_task_id (latest task number)
+- active_task_id (current viewing position)
+**Restoration Guarantees:**
+- Complete task tree is rebuilt
+- Active position is restored
+- Snapshot files remain available across sessions
+- User can continue undo/redo from where they left off
+## Critical Test Scenarios
+### 1. Basic Undo/Redo Flow
+**Test Focus:**
+- Sequential task creation increments task IDs correctly
+- Undo moves active_task_id backward (current_task_id unchanged)
+- Redo moves active_task_id forward
+- File snapshots are correctly restored at each step
+- Cannot undo beyond root task (task_id = 0)
+- Cannot redo beyond current_task_id
+**Edge Cases:**
+- Undoing at root task should fail gracefully
+- Redoing when already at tip should fail gracefully
+- Multiple consecutive undos should work correctly
+### 2. Branching Scenarios
+**Test Focus:**
+- After undo, creating new task creates a branch
+- New branch starts from active_task_id, not current_task_id
+- Original future branch is preserved (for potential redo)
+- Parent task is marked with `has_branches: true`
+- Child tasks list should include both branches
+**Branch Navigation:**
+- Switching between branches restores correct file states
+- Each branch maintains independent history
+- Message filtering correctly shows only relevant messages
+### 3. Message Filtering and Task IDs
+**Test Focus:**
+- Every message is tagged with task_id (user, assistant, tool results)
+- Active messages only include those with task_id <= active_task_id
+- LLM never sees "future" messages during undo state
+- After redo, future messages become visible again
+- New tasks created after undo get fresh task IDs (not reused)
+**Message Consistency:**
+- Tool results are associated with correct task
+- Multi-turn conversations maintain task association
+- Error messages don't break task ID tagging
+### 4. File Snapshot Integrity
+**Test Focus:**
+- Only modified files are snapshotted (not entire project)
+- File content is exactly preserved (byte-for-byte)
+- Nested directory structures are correctly recreated
+- Multiple files in single task are all snapshotted
+- Snapshot directory naming prevents collisions
+**Restoration Accuracy:**
+- After undo + file restore, file content matches expected state
+- Subsequent task execution works with restored files
+- Binary files are handled correctly (not corrupted)
+### 5. Session Persistence and Recovery
+**Test Focus:**
+- Save session, restart, restore session preserves Time Machine state
+- Task tree structure is fully rebuilt
+- Active position is correctly restored
+- Snapshot files are accessible after restart
+- Undo/redo operations work identically after restore
+**Persistence Edge Cases:**
+- Empty task history (new session)
+- Session with complex branching
+- Session saved while in "undo" state (active_task_id < current_task_id)
+### 6. AI Tool Integration
+**Test Focus:**
+- Tools are correctly registered in tool registry
+- AI can invoke undo_task, redo_task, list_tasks
+- Agent parameter is correctly injected (similar to TodoManager pattern)
+- Tool execution returns success/failure messages
+- Tools respect permission modes (confirm_all, auto_approve, etc.)
+**Tool Interaction:**
+- AI calling undo_task modifies agent state correctly
+- Subsequent AI responses use filtered messages
+- Tool results are included in task history
+- Multiple tool calls in sequence work correctly
+### 7. UI and User Interaction
+**Test Focus:**
+- ESC key triggers time machine menu
+- `/undo` command works identically to ESC
+- Menu displays correct task list with status indicators
+- Visual markers: `→` current, `↯` future, `⎇` branches
+- User selection triggers correct task switch
+- Menu updates after undo/redo operations
+**User Experience:**
+- Task summaries are readable (truncated to 80 chars)
+- Menu is responsive with large task histories
+- Cancel/exit returns to normal operation
+- Error messages are clear and actionable
+### 8. Integration with Existing Features
+**Test Focus:**
+- Works with message compression (no dependency on tool_calls)
+- Compatible with session serialization
+- Doesn't interfere with cost tracking
+- Works with both UI modes (UI1 and UI2)
+- Subagent forking doesn't inherit Time Machine state
+**Feature Compatibility:**
+- Todo manager works normally during undo state
+- Web search tools work correctly
+- File tools (write, edit) trigger snapshots
+- Shell commands can be undone via file snapshots
+## Design Principles
+### Minimal Invasiveness
+- Only 3 new instance variables in Agent class
+- No changes to core message structure (only adds task_id field)
+- Existing tools unaware of Time Machine existence
+- No performance impact when not in use
+### Data Integrity
+- Never delete messages or snapshots (immutable history)
+- File restoration is idempotent (can redo multiple times)
+- Task IDs never reused (prevents confusion)
+- Snapshot isolation (each task has independent directory)
+### User Control
+- Explicit user action required (ESC or /undo)
+- Clear visual feedback on current position
+- Cannot accidentally lose work (future preserved)
+- Can explore branches without commitment
+### Developer Friendly
+- Simple tree data structure (easy to reason about)
+- Comprehensive test coverage (55 test cases)
+- Clear separation of concerns (module-based design)
+- Well-documented edge cases
+## Future Enhancement Possibilities
+### Potential Improvements
+- Automatic snapshot garbage collection (old sessions)
+- Diff view between task states
+- Named checkpoints (user-defined bookmarks)
+- Merge branches functionality
+- Export task history as replay script
+- Snapshot compression for large files
+### Scalability Considerations
+- Large file handling (incremental snapshots)
+- Long session histories (pagination in UI)
+- Multiple simultaneous branches (better visualization)
+- Remote collaboration (shared task history)

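Note on the design above: the task history graph and message filtering described in `time_machine_design.md` can be sketched as follows. This is a minimal illustration under stated assumptions, not the gem's `time_machine.rb`. The class name `TaskHistorySketch` and all of its method names are hypothetical, and the redo policy (move to the most recently created child) is an assumption made for the sketch.

```ruby
# Hypothetical sketch of the three state variables from the design doc.
# Not the actual Clacky::Agent::TimeMachine implementation.
class TaskHistorySketch
  attr_reader :task_parents, :current_task_id, :active_task_id, :messages

  def initialize
    @task_parents = {}     # task_id => parent_id (0 means "no parent")
    @current_task_id = 0   # latest created ID, never decreases
    @active_task_id = 0    # current position, moves on undo/redo
    @messages = []         # every message ever recorded, never deleted
  end

  # Create a task as a child of the active position and move there.
  def start_task(user_text)
    @current_task_id += 1
    @task_parents[@current_task_id] = @active_task_id
    @active_task_id = @current_task_id
    @messages << { role: "user", content: user_text, task_id: @active_task_id }
    @active_task_id
  end

  # Step back to the parent; returns false when already at the root (0).
  def undo
    return false if @active_task_id.zero?

    @active_task_id = @task_parents.fetch(@active_task_id)
    true
  end

  # Step forward to the most recently created child; false at a tip.
  # (Assumed policy for this sketch.)
  def redo
    child = children_of(@active_task_id).max
    return false if child.nil?

    @active_task_id = child
    true
  end

  # Parent -> children lookup by scanning the child -> parent hash.
  def children_of(task_id)
    @task_parents.select { |_id, parent| parent == task_id }.keys
  end

  def has_branches?(task_id)
    children_of(task_id).size > 1
  end

  # Messages visible to the LLM, using the doc's rule literally.
  def active_messages
    @messages.select { |m| m[:task_id] <= @active_task_id }
  end
end

history = TaskHistorySketch.new
history.start_task("add login page")   # task 1
history.start_task("add tests")        # task 2
history.undo                           # active = 1, current stays 2
history.start_task("add logout page")  # task 3, branches from task 1
```

After branching, the literal `task_id <= active_task_id` rule would also admit messages from a sibling branch with a lower ID. A branch-aware filter might instead walk `task_parents` from the active task back to the root; whether the gem does so is not shown in this part of the diff.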
data/docs/why-openclacky.md CHANGED

@@ -79,7 +79,6 @@ A command-line AI assistant that's approachable for non-technical users but powe
    |------|----------|----------|
    | `auto_approve` | Execute all tools automatically | Batch operations |
    | `confirm_safes` | Auto-approve safe operations | Daily development |
-   | `confirm_edits` | Confirm file modifications | Careful work |
    | `plan_only` | Generate plans only | Code review |
 5. **Session Recovery**

data/lib/clacky/agent/cost_tracker.rb ADDED

@@ -0,0 +1,180 @@
+# frozen_string_literal: true
+module Clacky
+  class Agent
+    # Cost tracking and token usage statistics
+    # Manages cost calculation, token estimation, and usage display
+    module CostTracker
+      # Track cost from API usage
+      # Updates total cost and displays iteration statistics
+      # @param usage [Hash] Usage data from API response
+      # @param raw_api_usage [Hash, nil] Raw API usage data for debugging
+      def track_cost(usage, raw_api_usage: nil)
+        # Priority 1: Use API-provided cost if available (OpenRouter, LiteLLM, etc.)
+        iteration_cost = nil
+        if usage[:api_cost]
+          @total_cost += usage[:api_cost]
+          @cost_source = :api
+          @task_cost_source = :api
+          iteration_cost = usage[:api_cost]
+          @ui&.log("Using API-provided cost: $#{usage[:api_cost]}", level: :debug) if @config.verbose
+        else
+          # Priority 2: Calculate from tokens using ModelPricing
+          result = ModelPricing.calculate_cost(model: current_model, usage: usage)
+          cost = result[:cost]
+          pricing_source = result[:source]
+          @total_cost += cost
+          iteration_cost = cost
+          # Map pricing source to cost source: :price or :default
+          @cost_source = pricing_source
+          @task_cost_source = pricing_source
+          if @config.verbose
+            source_label = pricing_source == :price ? "model pricing" : "default pricing"
+            @ui&.log("Calculated cost for #{@config.model_name} using #{source_label}: $#{cost.round(6)}", level: :debug)
+            @ui&.log("Usage breakdown: prompt=#{usage[:prompt_tokens]}, completion=#{usage[:completion_tokens]}, cache_write=#{usage[:cache_creation_input_tokens] || 0}, cache_read=#{usage[:cache_read_input_tokens] || 0}", level: :debug)
+          end
+        end
+        # Display token usage statistics for this iteration
+        display_iteration_tokens(usage, iteration_cost)
+        # Update session bar cost in real-time (don't wait for agent.run to finish)
+        @ui&.update_sessionbar(cost: @total_cost)
+        # Track cache usage statistics (global)
+        @cache_stats[:total_requests] += 1
+        if usage[:cache_creation_input_tokens]
+          @cache_stats[:cache_creation_input_tokens] += usage[:cache_creation_input_tokens]
+        end
+        if usage[:cache_read_input_tokens]
+          @cache_stats[:cache_read_input_tokens] += usage[:cache_read_input_tokens]
+          @cache_stats[:cache_hit_requests] += 1
+        end
+        # Store raw API usage samples (keep last 3 for debugging)
+        if raw_api_usage
+          @cache_stats[:raw_api_usage_samples] ||= []
+          @cache_stats[:raw_api_usage_samples] << raw_api_usage
+          @cache_stats[:raw_api_usage_samples] = @cache_stats[:raw_api_usage_samples].last(3)
+        end
+        # Track cache usage for current task
+        if @task_cache_stats
+          @task_cache_stats[:total_requests] += 1
+          if usage[:cache_creation_input_tokens]
+            @task_cache_stats[:cache_creation_input_tokens] += usage[:cache_creation_input_tokens]
+          end
+          if usage[:cache_read_input_tokens]
+            @task_cache_stats[:cache_read_input_tokens] += usage[:cache_read_input_tokens]
+            @task_cache_stats[:cache_hit_requests] += 1
+          end
+        end
+      end
+      # Estimate token count for a message content
+      # Simple approximation: characters / 4 (English text)
+      # For Chinese/other languages, characters / 2 is more accurate
+      # This is a rough estimate for compression triggering purposes
+      # @param content [String, Array, Object] Message content
+      # @return [Integer] Estimated token count
+      def estimate_tokens(content)
+        return 0 if content.nil?
+        text = if content.is_a?(String)
+                 content
+               elsif content.is_a?(Array)
+                 # Handle content arrays (e.g., with images)
+                 # Add safety check to prevent nil.compact error
+                 mapped = content.map { |c| c[:text] if c.is_a?(Hash) }
+                 (mapped || []).compact.join
+               else
+                 content.to_s
+               end
+        return 0 if text.empty?
+        # Detect language mix - count ASCII bytes
+        ascii_count = text.bytes.count { |b| b < 128 }
+        total_bytes = text.bytes.length
+        # Mix ratio (1.0 = all ASCII; 0.0 = no ASCII bytes, e.g. all Chinese)
+        mix_ratio = total_bytes > 0 ? ascii_count.to_f / total_bytes : 1.0
+        # English: ~4 chars/token, Chinese: ~2 chars/token
+        base_chars_per_token = mix_ratio * 4 + (1 - mix_ratio) * 2
+        (text.length / base_chars_per_token).to_i + 50 # Add overhead for message structure
+      end
+      # Calculate total token count for all messages
+      # Returns estimated tokens and breakdown by category
+      # @return [Hash] Token counts by role and total
+      def total_message_tokens
+        system_tokens = 0
+        user_tokens = 0
+        assistant_tokens = 0
+        tool_tokens = 0
+        summary_tokens = 0
+        @messages.each do |msg|
+          tokens = estimate_tokens(msg[:content])
+          case msg[:role]
+          when "system"
+            system_tokens += tokens
+          when "user"
+            user_tokens += tokens
+          when "assistant"
+            assistant_tokens += tokens
+          when "tool"
+            tool_tokens += tokens
+          end
+        end
+        {
+          total: system_tokens + user_tokens + assistant_tokens + tool_tokens,
+          system: system_tokens,
+          user: user_tokens,
+          assistant: assistant_tokens,
+          tool: tool_tokens
+        }
+      end
+      private
+      # Display token usage for current iteration
+      # @param usage [Hash] Usage data from API
+      # @param cost [Float] Cost for this iteration
+      def display_iteration_tokens(usage, cost)
+        prompt_tokens = usage[:prompt_tokens] || 0
+        completion_tokens = usage[:completion_tokens] || 0
+        total_tokens = usage[:total_tokens] || (prompt_tokens + completion_tokens)
+        cache_write = usage[:cache_creation_input_tokens] || 0
+        cache_read = usage[:cache_read_input_tokens] || 0
+        # Calculate token delta from previous iteration
+        delta_tokens = total_tokens - @previous_total_tokens
+        @previous_total_tokens = total_tokens  # Update for next iteration
+        # Prepare data for UI to format and display
+        token_data = {
+          delta_tokens: delta_tokens,
+          prompt_tokens: prompt_tokens,
+          completion_tokens: completion_tokens,
+          total_tokens: total_tokens,
+          cache_write: cache_write,
+          cache_read: cache_read,
+          cost: cost
+        }
+        # Let UI handle formatting and display
+        @ui&.show_token_usage(token_data)
+      end
+    end
+  end
+end

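Note on `estimate_tokens` above: the arithmetic is easier to follow with concrete inputs. The snippet below restates the heuristic as a standalone method for string input only. The names `estimate_tokens_sketch` and `MESSAGE_OVERHEAD` are labels chosen for this sketch and do not appear in the gem.

```ruby
# Standalone restatement of the heuristic in estimate_tokens (strings only).
# MESSAGE_OVERHEAD and the method name are hypothetical labels.
MESSAGE_OVERHEAD = 50

def estimate_tokens_sketch(text)
  return 0 if text.nil? || text.empty?

  # Share of bytes that are ASCII: 1.0 for plain English, 0.0 for pure CJK.
  ascii_bytes = text.bytes.count { |b| b < 128 }
  ascii_ratio = ascii_bytes.to_f / text.bytesize

  # Blend between ~4 chars/token (ASCII) and ~2 chars/token (non-ASCII).
  chars_per_token = ascii_ratio * 4 + (1 - ascii_ratio) * 2
  (text.length / chars_per_token).to_i + MESSAGE_OVERHEAD
end

puts estimate_tokens_sketch("hello world")  # 11 chars / 4.0 -> 2, plus 50 = 52
puts estimate_tokens_sketch("你好世界")      # 4 chars / 2.0 -> 2, plus 50 = 52
```

Because the ratio is computed over bytes while the division uses character count, multi-byte text pulls the blend toward 2 chars/token faster than a per-character ratio would. For short strings the fixed overhead of 50 dominates the estimate.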
data/lib/clacky/agent/llm_caller.rb ADDED

@@ -0,0 +1,54 @@
+# frozen_string_literal: true
+module Clacky
+  class Agent
+    # LLM API call management
+    # Handles API calls with retry logic and progress indication
+    module LlmCaller
+      # Execute LLM API call with progress indicator, retry logic, and cost tracking
+      # This method is shared by both normal think() and compression flows
+      # @return [Hash] API response with :content, :tool_calls, :usage, etc.
+      private def call_llm
+        @ui&.show_progress
+        tools_to_send = @tool_registry.all_definitions
+        # Retry logic for network failures
+        max_retries = 10
+        retry_delay = 5
+        retries = 0
+        begin
+          # Use active_messages to filter out "future" messages after undo
+          messages_to_send = respond_to?(:active_messages) ? active_messages : @messages
+          response = @client.send_messages_with_tools(
+            messages_to_send,
+            model: current_model,
+            tools: tools_to_send,
+            max_tokens: @config.max_tokens,
+            enable_caching: @config.enable_prompt_caching
+          )
+        rescue Faraday::ConnectionFailed, Faraday::TimeoutError, Errno::ECONNREFUSED, Errno::ETIMEDOUT => e
+          @ui&.clear_progress
+          retries += 1
+          if retries <= max_retries
+            @ui&.show_warning("Network failed: #{e.message}. Retry #{retries}/#{max_retries}...")
+            sleep retry_delay
+            retry
+          else
+            @ui&.show_error("Network failed after #{max_retries} retries: #{e.message}")
+            raise AgentError, "Network connection failed after #{max_retries} retries: #{e.message}"
+          end
+        ensure
+          @ui&.clear_progress
+        end
+        # Track cost for all LLM calls
+        track_cost(response[:usage], raw_api_usage: response[:raw_api_usage])
+        response
+      end
+    end
+  end
+end

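Note on `call_llm` above: the retry flow relies on Ruby's `begin`/`rescue`/`retry`, which re-runs the whole `begin` body after a rescued error. The pattern can be isolated as below. The helper `with_retries` is hypothetical and rescues only `Errno` errors so that the sketch runs without Faraday; the defaults are chosen for illustration, not taken from the gem.

```ruby
# Hypothetical helper illustrating begin/rescue/retry as used in call_llm.
# Only Errno errors are rescued here so the sketch needs no gems.
def with_retries(max_retries: 3, delay: 0)
  retries = 0
  begin
    yield retries
  rescue Errno::ECONNREFUSED, Errno::ETIMEDOUT => e
    retries += 1
    # Give up once the retry budget is spent.
    raise "failed after #{max_retries} retries: #{e.message}" if retries > max_retries

    sleep delay
    retry # jumps back to the start of the begin block
  end
end

attempts = 0
result = with_retries(max_retries: 3) do
  attempts += 1
  raise Errno::ECONNREFUSED if attempts < 3 # fail twice, then succeed

  :ok
end
puts "#{result} after #{attempts} attempts"
```

With `max_retries: 3` the block may run up to four times in total: one initial attempt plus three retries. The same counting applies to `call_llm`, where `max_retries = 10` allows eleven attempts.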
data/lib/clacky/{message_compressor.rb → agent/message_compressor.rb} RENAMED

@@ -57,6 +57,12 @@ module Clacky
     # Generate compression instruction message to be inserted into conversation
     # This enables cache reuse by using the same API call with tools
+    #
+    # SIMPLIFIED APPROACH:
+    # - Don't duplicate conversation history in the compression message
+    # - LLM can already see all messages, just ask it to compress
+    # - Keep the instruction small for better cache efficiency
+    #
     # @param messages [Array<Hash>] Original conversation messages
     # @param recent_messages [Array<Hash>] Recent messages to keep uncompressed (optional)
     # @return [Hash] Compression instruction message to insert, or nil if nothing to compress
@@ -67,12 +73,12 @@ module Clacky
       # If nothing to compress, return nil
       return nil if messages_to_compress.empty?
-      # Build compression prompt with instruction and conversation
-      content = build_compression_content(messages_to_compress)
-      full_prompt = "#{COMPRESSION_PROMPT}\n\nConversation to compress:\n\n#{content}"
-      # Return the compression instruction as a user message with system_injected marker
-      { role: "user", content: full_prompt, system_injected: true }
+      # Simple compression instruction - LLM can see the history already
+      {
+        role: "user",
+        content: COMPRESSION_PROMPT,
+        system_injected: true
+      }
     end
     # Parse LLM response and rebuild message list with compression
@@ -98,36 +104,6 @@ module Clacky
     private
-    def build_compression_content(messages)
-      # Format messages as readable text for compression
-      messages.map do |msg|
-        role = msg[:role]
-        content = format_content(msg[:content])
-        "[#{role.upcase}] #{content}"
-      end.join("\n\n")
-    end
-    def format_content(content)
-      return content if content.is_a?(String)
-      if content.is_a?(Array)
-        content.map do |block|
-          case block[:type]
-          when "text"
-            block[:text]
-          when "tool_use"
-            "TOOL: #{block[:name]}(#{block[:input]})"
-          when "tool_result"
-            "RESULT: #{block[:content]}"
-          else
-            block.to_s
-          end
-        end.join("\n")
-      else
-        content.to_s
-      end
-    end
     def parse_compressed_result(result)
       # Return the compressed result as a single assistant message
       # Keep the <analysis> or <summary> tags as they provide semantic context