openclacky 1.2.12 → 1.2.14

Files changed (46)
  1. checksums.yaml +4 -4
  2. data/.clacky/skills/gem-release/SKILL.md +5 -1
  3. data/.clacky/skills/gem-release/scripts/release.sh +4 -1
  4. data/CHANGELOG.md +39 -0
  5. data/lib/clacky/agent/llm_caller.rb +40 -25
  6. data/lib/clacky/agent/memory_updater.rb +12 -0
  7. data/lib/clacky/agent/session_serializer.rb +1 -0
  8. data/lib/clacky/agent/skill_auto_creator.rb +7 -4
  9. data/lib/clacky/agent/skill_evolution.rb +23 -5
  10. data/lib/clacky/agent/skill_manager.rb +86 -1
  11. data/lib/clacky/agent/skill_reflector.rb +18 -23
  12. data/lib/clacky/agent.rb +132 -15
  13. data/lib/clacky/agent_config.rb +183 -22
  14. data/lib/clacky/cli.rb +55 -0
  15. data/lib/clacky/client.rb +11 -1
  16. data/lib/clacky/default_parsers/pdf_parser.rb +70 -86
  17. data/lib/clacky/default_parsers/pdf_parser_vlm.py +136 -0
  18. data/lib/clacky/default_skills/persist-memory/SKILL.md +4 -3
  19. data/lib/clacky/default_skills/search-skills/SKILL.md +61 -0
  20. data/lib/clacky/idle_compression_timer.rb +1 -1
  21. data/lib/clacky/message_format/open_ai.rb +7 -1
  22. data/lib/clacky/openai_stream_aggregator.rb +4 -1
  23. data/lib/clacky/providers.rb +77 -12
  24. data/lib/clacky/server/http_server.rb +296 -7
  25. data/lib/clacky/server/session_registry.rb +30 -8
  26. data/lib/clacky/server/web_ui_controller.rb +24 -1
  27. data/lib/clacky/session_manager.rb +120 -0
  28. data/lib/clacky/tools/web_search.rb +59 -8
  29. data/lib/clacky/ui2/layout_manager.rb +15 -5
  30. data/lib/clacky/ui2/progress_handle.rb +18 -8
  31. data/lib/clacky/ui2/ui_controller.rb +27 -0
  32. data/lib/clacky/ui_interface.rb +22 -0
  33. data/lib/clacky/utils/model_pricing.rb +96 -0
  34. data/lib/clacky/version.rb +1 -1
  35. data/lib/clacky/vision/resolver.rb +157 -0
  36. data/lib/clacky/web/app.css +209 -4
  37. data/lib/clacky/web/app.js +6 -5
  38. data/lib/clacky/web/i18n.js +22 -6
  39. data/lib/clacky/web/index.html +2 -1
  40. data/lib/clacky/web/sessions.js +408 -80
  41. data/lib/clacky/web/settings.js +241 -60
  42. data/lib/clacky/web/skills.js +5 -14
  43. data/lib/clacky/web/utils.js +57 -0
  44. data/lib/clacky/web/ws-dispatcher.js +136 -0
  45. data/lib/clacky.rb +1 -0
  46. metadata +6 -2
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 451817565cffdf7b1efcdf5e741cea76af0451a8d9900804e2aa3c6a5384ba4a
- data.tar.gz: 0232ede01332162004abc1638a8a03b41095c44c68198ce6327e5f5fc815f49a
+ metadata.gz: 82874a3ac7c623672bd09b5fa1be1c5dd70b1f223119a1b58b86f85417e46f1c
+ data.tar.gz: ba5f1cc02f50a0bee31e24a6ad009c265881eef8b8b9efa6f17b5bec29124414
  SHA512:
- metadata.gz: 3a88a963b238a35fc25d5791b752981179290a88b27d176285647668711b91360a1d4c656677536fda84d18d0fa05fe67ad8364bdf0f7dbbba0a31a007156cbd
- data.tar.gz: 124b77cbeec34494c8d35d58f1735459ba50da7c112a13a35c8038bbe89ee81412cf60107f4f926ad870efbbf65621b369f4bb5f1de67b477032b14dbad338d3
+ metadata.gz: 5535350a83909fffe2471ab0f6505d54f9bc2436826636eacb1c8d6bbbd84e554087b31b9503d6671449789160072eb743711556f302dc42b820637b7edab83d
+ data.tar.gz: bcdec5ed7e56cfc27ee2370ae46582239fa425254e013a7577f162b28c6e2d88b821768f2e367fe2b33dd6773804740692f1f29cba0ca2d73f40d38f6b8e2243
data/.clacky/skills/gem-release/SKILL.md CHANGED
@@ -25,7 +25,7 @@ Automates the complete openclacky gem release workflow via `SKILL_DIR/scripts/re
  The release script (`SKILL_DIR/scripts/release.sh`) handles everything end-to-end:

  1. Pre-release checks (clean working directory, required tools)
- 2. Run test suite (`bundle exec rspec`)
+ 2. Run test suite (`bundle exec rspec`) + web search smoke tests (real network — verifies Bing/DDG parsers still work against live HTML)
  3. Bump version in `lib/clacky/version.rb`
  4. Update `Gemfile.lock` via `bundle install`
  5. Commit and push to origin, wait for CI
@@ -177,6 +177,10 @@ Ask the user whether to use `--update-latest` before running the script.
  The script uses `set -euo pipefail` and stops on any failure. Common issues:

  - **Tests fail** → fix tests before re-running
+ - **Web search smoke test fails (Bing)** → This often happens due to datacenter IP fingerprinting (anti-scrape blocking) returning irrelevant top-domain filler (like Mr.Bricolage). If you see "No ruby-related result from bing" during the smoke test:
+ 1. Manually run `bundle exec rspec spec/integration/web_search_smoke_spec.rb --tag smoke` to verify
+ 2. If it's the anti-scrape block, temporarily edit `spec/integration/web_search_smoke_spec.rb` to skip the relevance check on failure (e.g., using `skip "Bing returned anti-scrape garbage..."`)
+ 3. Commit the change ("ci: skip bing smoke test relevance check on anti-scrape") and re-run the release script
  - **CI fails** → script pushes then watches CI; fix and re-push if needed
  - **gem push fails** → check RubyGems credentials (`gem signin`)
  - **gh release fails** → check `gh auth status`
data/.clacky/skills/gem-release/scripts/release.sh CHANGED
@@ -116,10 +116,13 @@ step 2 "Running test suite"

  if [[ "$DRY_RUN" == true ]]; then
  echo -e " ${YELLOW}[dry-run]${NC} bundle exec rspec"
+ echo -e " ${YELLOW}[dry-run]${NC} bundle exec rspec spec/integration/web_search_smoke_spec.rb --tag smoke"
  else
  bundle exec rspec || die "Tests failed — aborting release"
+ bundle exec rspec spec/integration/web_search_smoke_spec.rb --tag smoke \
+ || die "Web search smoke tests failed — a provider parser may be broken on real network. Aborting release."
  fi
- success "All tests passed"
+ success "All tests passed (including web search smoke)"

  # ════════════════════════════════════════════════════════════════════════
  # Step 3: Bump version
data/CHANGELOG.md CHANGED
@@ -5,6 +5,45 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

+ ## [1.2.14] - 2026-06-08
+
+ ### Added
+ - OCR support for scanned PDFs (optical character recognition)
+ - VLM-based PDF parser for improved document understanding
+
+ ### Improved
+ - PDF OCR processing quality
+
+ ### Fixed
+ - PDF processing not appearing in session history
+ - Stale progress indicator that wouldn't dismiss
+
+ ### More
+ - Document Bing smoke test anti-scrape failure handling in gem-release
+
+ ## [1.2.13] - 2026-06-08
+
+ ### Added
+ - Session forking capability (Fork any message to a new session)
+ - Gemini Flash 3.5 support and MIMO model pricing
+ - Web search content capability and search skill LRU caching
+ - Token usage visibility after tool calls
+ - Subagent UI formatting for better readability
+
+ ### Improved
+ - Web search performance using Bing race search strategy
+ - Input box automatically clears when switching sessions
+ - Skill evolution info display simplified
+ - TUI adds an extra progress bar for better visual feedback
+
+ ### Fixed
+ - Dir-picker path input synchronization on directory navigation
+ - Thinking mode silent retries
+ - IME (Input Method Editor) input check issues
+ - WebUI reflect bug
+ - Upstream JSON loading stability
+ - Prevent skill evolution when the last message is incomplete
+
  ## [1.2.12] - 2026-06-05

  ### Fixed
data/lib/clacky/agent/llm_caller.rb CHANGED
@@ -144,6 +144,28 @@ module Clacky
  raise RetryableError, "[LLM] Model returned empty response (no content, no tool_calls), retrying..."
  end

+ # Thinking-mode silent response detector. DeepSeek V4 / Kimi K2 /
+ # other reasoning models occasionally spend all output tokens inside
+ # `reasoning_content` and emit `content=""` + no tool_calls +
+ # `finish_reason="stop"`. Protocol-legal under OpenAI semantics
+ # (stop = model done), but semantically the model "thought and went
+ # silent" — agent main loop would treat it as task completion and
+ # exit. Reuse RetryableError so the existing retry + fallback
+ # pipeline handles it identically to 5xx/429.
+ if response[:content].to_s.strip.empty? &&
+ (response[:tool_calls].nil? || response[:tool_calls].empty?) &&
+ response[:reasoning_content].to_s.strip.length > 0 &&
+ response[:finish_reason].to_s == "stop"
+ reasoning_str = response[:reasoning_content].to_s
+ Clacky::Logger.warn("llm.thinking_mode_silent_response_detected",
+ model: api_call_model,
+ reasoning_len: reasoning_str.length,
+ reasoning_tail: reasoning_str[-200, 200] || reasoning_str,
+ completion_tokens: response.dig(:token_usage, :completion_tokens)
+ )
+ raise RetryableError, "[LLM] Thinking-mode model produced reasoning but empty content/tool_calls, retrying..."
+ end
+
  rescue Faraday::TimeoutError => e
  # Faraday::TimeoutError on our non-streaming POST almost always means
  # the *response* took longer than the 300s read-timeout to come back —
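The detector added in this hunk reduces to a four-part predicate over the response hash. The following is a minimal standalone sketch, assuming the same `:content`, `:tool_calls`, `:reasoning_content`, and `:finish_reason` keys seen in the diff. `SilentResponseError`, `silent_thinking_response?`, and `check_response!` are hypothetical names used for illustration only; the gem itself raises its own `RetryableError` and also logs a warning, which is omitted here.

```ruby
# Hypothetical stand-in for the gem's RetryableError.
class SilentResponseError < StandardError; end

# True when the model produced reasoning text but no visible content and
# no tool calls, while still reporting finish_reason "stop".
def silent_thinking_response?(response)
  content_empty = response[:content].to_s.strip.empty?
  no_tool_calls = response[:tool_calls].nil? || response[:tool_calls].empty?
  has_reasoning = !response[:reasoning_content].to_s.strip.empty?
  stopped       = response[:finish_reason].to_s == "stop"

  content_empty && no_tool_calls && has_reasoning && stopped
end

# Raise so that a surrounding retry loop (not shown) can try again.
def check_response!(response)
  return response unless silent_thinking_response?(response)

  raise SilentResponseError, "reasoning present but content/tool_calls empty"
end

silent = { content: "", tool_calls: [], reasoning_content: "let me think", finish_reason: "stop" }
normal = { content: "done", tool_calls: nil, reasoning_content: "let me think", finish_reason: "stop" }
```

Note that a response cut off by a token limit (for example `finish_reason: "length"`) does not match this predicate; only the "stop" case is treated as a silent response in the diff.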
@@ -612,17 +634,10 @@ module Clacky
  # stream mid-tool_use (observed with Anthropic at ~127 s TTFT under
  # load), OpenRouter does NOT surface an error — it emits a valid
  # `tool_calls[]` whose `arguments` is empty, `"{}"`, or non-parseable
- # JSON. Without this check the agent would either execute the tool with
- # empty args or (worse) silently exit thinking the task finished.
- #
- # Rule is deliberately narrow: we only intercept the case where the
- # model streamed literally nothing into the tool_call arguments —
- # i.e. `nil`, empty string, or the placeholder `"{}"`. Partial/invalid
- # JSON (e.g. `{"path": "/tmp/x"`) is left to the existing
- # ArgumentsParser → BadArgumentsError path, because the model already
- # committed to specific values and feeding the parse error back as a
- # tool_result lets it self-correct in one round-trip (faster than a
- # blind retry from scratch).
+ # JSON. Without this check the agent would either execute the tool
+ # with empty args, or write the broken arguments string back into
+ # history and have the NEXT request rejected by the upstream proxy
+ # with a 400 BadRequest at the json.loads boundary.
  private def detect_upstream_truncation!(response)
  tool_calls = response[:tool_calls]
  return if tool_calls.nil? || tool_calls.empty?
@@ -653,22 +668,23 @@ module Clacky
  "(args=#{args_str[0, 40].inspect}). Retrying..."
  end

- # True when a tool_call's arguments field looks COMPLETELY empty
- # i.e. the upstream stream was cut before the model wrote any real
- # content into the arguments JSON.
+ # True when a tool_call's arguments field is unusable — either empty
+ # or not a complete, parseable JSON object.
  #
  # Rules:
- # - nil / non-String / empty string → truncated (nothing at all)
+ # - nil / non-String / empty string → truncated
  # - parses to {} (empty object) → truncated (placeholder only)
- # - anything else (including partial/invalid JSON like `{"path":
- # "/tmp/x"` where the model already started writing) → NOT
- # truncated by this detector
+ # - JSON::ParserError (partial JSON) truncated
+ # - valid non-empty JSON object → NOT truncated
  #
- # Partial-JSON cases are deliberately left to the existing
- # ArgumentsParser BadArgumentsError path, which surfaces the parse
- # error back to the LLM as a tool_result so it can self-correct. That
- # is more efficient than a blind retry when the model already wrote
- # most of the args.
+ # Why partial JSON counts as truncated: even though ArgumentsParser
+ # could repair it for the current turn, the original broken string
+ # still ends up in history (agent.rb#format_tool_calls_for_api keeps
+ # arguments verbatim). The next turn's request body would then carry
+ # an invalid JSON in tool_calls[].function.arguments, which upstream
+ # proxies (LiteLLM, OpenRouter, etc.) reject with a 400 BadRequest
+ # before the model ever sees it. Retrying from a clean state is the
+ # only path that actually recovers.
  private def tool_call_args_truncated?(args)
  return true if args.nil?
  return true unless args.is_a?(String)
@@ -677,8 +693,7 @@ module Clacky
  parsed = begin
  JSON.parse(args)
  rescue JSON::ParserError
- # Partial/invalid JSON — let ArgumentsParser handle it downstream.
- return false
+ return true
  end

  parsed.is_a?(Hash) && parsed.empty?
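The net effect of the last two hunks is that partial or invalid JSON now counts as truncated instead of being passed downstream. The rule table in the comment can be sketched as a standalone predicate. `args_unusable?` is a hypothetical name, and the blank-string guard is an assumption, since the lines between the two hunks are not shown in this diff.

```ruby
require "json"

# Sketch of the stricter rule set described in the comment above.
def args_unusable?(args)
  return true unless args.is_a?(String) # nil and any non-String value
  return true if args.strip.empty?      # assumed guard: nothing was streamed

  parsed = begin
    JSON.parse(args)
  rescue JSON::ParserError
    return true                         # partial or invalid JSON
  end

  parsed.is_a?(Hash) && parsed.empty?   # only the "{}" placeholder
end

examples = {
  nil                    => args_unusable?(nil),
  ""                     => args_unusable?(""),
  "{}"                   => args_unusable?("{}"),
  '{"path": "/tmp/x"'    => args_unusable?('{"path": "/tmp/x"'),
  '{"path": "/tmp/x"}'   => args_unusable?('{"path": "/tmp/x"}')
}
```

Only the last entry, a complete non-empty JSON object, is treated as usable; under the 1.2.12 rule the partial string on the fourth line would have returned `false` and been left to the arguments parser.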
data/lib/clacky/agent/memory_updater.rb CHANGED
@@ -68,6 +68,18 @@ module Clacky
  def run_memory_update_subagent
  return unless should_update_memory?

+ with_memory_update_phase do
+ run_memory_update_subagent_inner
+ end
+ end
+
+ private def with_memory_update_phase
+ return yield unless @ui.respond_to?(:with_phase)
+
+ @ui.with_phase(kind: "memory_update", label: "Updating long-term memory") { yield }
+ end
+
+ private def run_memory_update_subagent_inner
  handle = @ui&.start_progress(message: "Updating long-term memory…", style: :primary)

  # Fork subagent inheriting main agent's model, tools, and history.
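The wrapper added here runs the work inside a UI "phase" when the UI supports one and runs it directly otherwise. The sketch below illustrates that fallback pattern under stated assumptions: `PhaseUI`, `PlainUI`, and `with_optional_phase` are hypothetical, and the real `with_phase` implementation lives in files not shown in this excerpt, so its behavior here is only a guess at the contract (yield once, return the block's value).

```ruby
# Hypothetical UI that supports phases and records what it was asked to show.
class PhaseUI
  attr_reader :events

  def initialize
    @events = []
  end

  def with_phase(kind:, label:)
    @events << [:start, kind, label]
    yield
  ensure
    @events << [:finish, kind]
  end
end

# Hypothetical UI without phase support.
class PlainUI; end

# Run the block inside a phase if the UI offers one, otherwise just run it.
# A nil UI also falls through, because nil does not respond to :with_phase.
def with_optional_phase(ui, kind:, label:)
  return yield unless ui.respond_to?(:with_phase)

  ui.with_phase(kind: kind, label: label) { yield }
end

phase_ui = PhaseUI.new
with_phase_result = with_optional_phase(phase_ui, kind: "memory_update", label: "Updating long-term memory") { 42 }
plain_result      = with_optional_phase(PlainUI.new, kind: "memory_update", label: "ignored") { 7 }
nil_result        = with_optional_phase(nil, kind: "memory_update", label: "ignored") { :ok }
```

Checking `respond_to?` rather than the UI's class keeps older or minimal UI implementations working without changes, which appears to be the intent of the guard in the diff.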
data/lib/clacky/agent/session_serializer.rb CHANGED
@@ -272,6 +272,7 @@ module Clacky
  # Disk files (PDF, doc, etc.): stored in display_files on the user message at send time
  disk_files = Array(msg[:display_files]).map { |f|
  { name: f[:name] || f["name"], type: f[:type] || f["type"] || "file",
+ path: f[:path] || f["path"],
  preview_path: f[:preview_path] || f["preview_path"] }
  }
  all_files = image_files + disk_files
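The new `path` field follows the same lookup pattern as its neighbours: try the symbol key first, then the string key. A plausible reason, though the diff does not state it, is that file entries may arrive either as in-memory hashes or as hashes parsed back from JSON. A small sketch with a hypothetical `normalize_display_file` helper:

```ruby
require "json"

# Accept either symbol or string keys and always return symbol keys.
def normalize_display_file(f)
  {
    name: f[:name] || f["name"],
    type: f[:type] || f["type"] || "file", # default when no type is recorded
    path: f[:path] || f["path"],
    preview_path: f[:preview_path] || f["preview_path"]
  }
end

from_memory = normalize_display_file({ name: "a.pdf", path: "/tmp/a.pdf" })
from_json   = normalize_display_file(JSON.parse('{"name":"b.pdf","type":"pdf","path":"/tmp/b.pdf"}'))
```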
data/lib/clacky/agent/skill_auto_creator.rb CHANGED
@@ -73,11 +73,14 @@ module Clacky

  ## Decision Criteria (ALL must be true)

- 1. **Reusable**: The workflow could apply to similar tasks in the future
+ 1. **Turn is actually finished**: The assistant's last message is
+ not a question back to the user, and the user wasn't just asking
+ /discussing/exploring (Q&A is not work to capture).
+ 2. **Reusable**: The workflow could apply to similar tasks in the future
  (not a one-off, project-specific task)
- 2. **Well-defined**: Clear steps with consistent logic, not just exploratory conversation
- 3. **Valuable**: Would save more than 5 minutes of work if reused
- 4. **Generalizable**: Can be parameterized for different inputs/contexts
+ 3. **Well-defined**: Clear steps with consistent logic, not just exploratory conversation
+ 4. **Valuable**: Would save more than 5 minutes of work if reused
+ 5. **Generalizable**: Can be parameterized for different inputs/contexts

  ## Action

data/lib/clacky/agent/skill_evolution.rb CHANGED
@@ -26,17 +26,35 @@ module Clacky
  def run_skill_evolution_hooks
  return unless skill_evolution_enabled?
  return if @is_subagent
+ return unless skill_evolution_visible? || skill_evolution_has_work?

+ with_skill_evolution_phase do
+ if @skill_execution_context
+ maybe_reflect_on_skill
+ else
+ maybe_create_skill_from_task
+ end
+ end
+ end
+
+ private def skill_evolution_visible?
+ @config.respond_to?(:verbose) && @config.verbose
+ end
+
+ private def skill_evolution_has_work?
  if @skill_execution_context
- # Scenario 2: Reflect on executed skill (may invoke skill-creator
- # to UPDATE the existing skill, but will not create a new one).
- maybe_reflect_on_skill
+ should_reflect_on_skill?
  else
- # Scenario 1: Auto-create new skill from complex task.
- maybe_create_skill_from_task
+ should_auto_create_skill?
  end
  end

+ private def with_skill_evolution_phase
+ return yield unless @ui.respond_to?(:with_phase)
+
+ @ui.with_phase(kind: "skill_evolution", label: "Reflecting on this task") { yield }
+ end
+
  # Check if skill evolution is enabled in config
  # @return [Boolean]
  private def skill_evolution_enabled?
data/lib/clacky/agent/skill_manager.rb CHANGED
@@ -1,5 +1,7 @@
  # frozen_string_literal: true

+ require "fileutils"
+
  module Clacky
  class Agent
  # Skill management and execution
@@ -128,6 +130,32 @@ module Clacky
  s.identifier.to_s.start_with?("mcp:")
  end

+ # Sort normal skills so AVAILABLE SKILLS prioritises what the user
+ # actually relies on:
+ # 1. default skills first (alphabetical, stable) — the always-present
+ # built-in baseline; they don't participate in LRU.
+ # 2. user-installed (project + brand + global) after, ordered by the
+ # skill directory's mtime descending (LRU). touch_skill_for_lru
+ # bumps mtime on every invocation; freshly installed skills also
+ # naturally float to the top.
+ # 3. search-skills is pinned to the very end (after truncation) so it
+ # sits next to the "(N more skills installed)" hint and is the
+ # last thing the LLM sees when scanning the list — maximising the
+ # chance it remembers to search before building a duplicate skill.
+ default_skills, user_skills = normal_skills.partition { |s| s.source == :default }
+ search_skill, default_skills = default_skills.partition { |s| s.identifier.to_s == "search-skills" }
+ default_skills = default_skills.sort_by { |s| s.identifier.to_s }
+ user_skills = user_skills.sort_by { |s|
+ mt = File.mtime(s.directory.to_s).to_f rescue 0.0
+ [-mt, s.identifier.to_s]
+ }
+ normal_skills = default_skills + user_skills
+
+ # Track total before truncation so we can hint the agent that more
+ # skills exist beyond the window.
+ total_normal_skills = normal_skills.size
+ truncated_skill_count = 0
+
  # Enforce system prompt injection limit to control token usage.
  # Warn at most once per process per dropped-set signature — build_skill_context
  # runs on every system-prompt assembly and is invoked from many short-lived
@@ -135,6 +163,7 @@ module Clacky
  if normal_skills.size > MAX_CONTEXT_SKILLS
  kept = normal_skills.first(MAX_CONTEXT_SKILLS)
  dropped = normal_skills.drop(MAX_CONTEXT_SKILLS)
+ truncated_skill_count = dropped.size
  dropped_names = dropped.map(&:identifier)
  signature = dropped_names.sort.join(",")

@@ -150,6 +179,8 @@ module Clacky
  normal_skills = kept
  end

+ normal_skills += search_skill unless search_skill.empty?
+
  if mcp_skills.size > MAX_CONTEXT_MCP_SERVERS
  dropped = mcp_skills.drop(MAX_CONTEXT_MCP_SERVERS).map(&:identifier)
  signature = "mcp:" + dropped.sort.join(",")
@@ -194,6 +225,12 @@ module Clacky
  end
  end

+ if truncated_skill_count > 0
+ context += "(#{truncated_skill_count} more skill(s) installed but not shown here. " \
+ "If the listed skills don't fit the task, invoke the `search-skills` skill " \
+ "to look them up by keyword BEFORE deciding to build a new skill.)\n\n"
+ end
+
  context += "\n"
  sections << context
  end
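Taken together, the four hunks above order the skill list as: default skills alphabetically, then user skills by most recent directory mtime, then truncation to a cap, with `search-skills` re-appended after the cut. The sketch below reproduces that ordering on in-memory data. `SkillStub`, `order_for_prompt`, and `MAX_SHOWN` are hypothetical; the real cap is `MAX_CONTEXT_SKILLS`, whose value is not shown in this diff, and the real code reads mtimes from disk rather than from a struct field.

```ruby
# Hypothetical stand-in for a skill record; mtime is a plain float here.
SkillStub = Struct.new(:identifier, :source, :mtime)

MAX_SHOWN = 3 # assumed cap for the example only

# Returns [identifiers to show, number of skills hidden by the cap].
def order_for_prompt(skills, limit: MAX_SHOWN)
  defaults, users  = skills.partition { |s| s.source == :default }
  pinned, defaults = defaults.partition { |s| s.identifier == "search-skills" }

  defaults = defaults.sort_by(&:identifier)
  users    = users.sort_by { |s| [-s.mtime, s.identifier] } # newest first

  ordered = defaults + users
  hidden  = [ordered.size - limit, 0].max
  shown   = ordered.first(limit) + pinned # pinned skill survives truncation

  [shown.map(&:identifier), hidden]
end

skills = [
  SkillStub.new("search-skills", :default, 0.0),
  SkillStub.new("persist-memory", :default, 0.0),
  SkillStub.new("deploy", :global, 100.0),
  SkillStub.new("lint", :global, 300.0),
  SkillStub.new("backup", :global, 200.0)
]

shown, hidden = order_for_prompt(skills)
# shown  is ["persist-memory", "lint", "backup", "search-skills"]
# hidden is 1 ("deploy", the least recently used user skill)
```

Because the pinned entry is removed before the cap is applied and added back afterwards, it never counts toward the hidden total, which matches how `truncated_skill_count` is computed from `dropped.size` in the diff.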
@@ -296,6 +333,8 @@ module Clacky
  # @param task_id [Integer] Current task ID (for message tagging)
  # @return [void]
  def inject_skill_as_assistant_message(skill, arguments, task_id, slash_command: false)
+ touch_skill_for_lru(skill)
+
  # Track skill execution context for self-evolution system
  @skill_execution_context = {
  skill_name: skill.identifier,
@@ -413,10 +452,42 @@ module Clacky
  # @return [Hash<String, Proc>]
  def build_template_context
  {
- "memories_meta" => -> { load_memories_meta }
+ "memories_meta" => -> { load_memories_meta },
+ "all_skills_meta" => -> { load_all_skills_meta }
  }
  end

+ # Render a complete list of installed skills (no MAX_CONTEXT_SKILLS cap)
+ # for skills like `search-skills` that need to see every available skill.
+ # Brand skill names + descriptions are pulled from cached_metadata so this
+ # is safe to inject without touching encrypted SKILL.md.enc content.
+ # @return [String]
+ def load_all_skills_meta
+ all = @skill_loader.load_all
+ all = filter_skills_by_profile(all)
+ all = all.reject(&:invalid?)
+ all = all.reject { |s| s.identifier.to_s.start_with?("mcp:") }
+
+ return "(No skills installed.)" if all.empty?
+
+ default_skills, user_skills = all.partition { |s| s.source == :default }
+ default_skills = default_skills.sort_by { |s| s.identifier.to_s }
+ user_skills = user_skills.sort_by { |s|
+ mt = File.mtime(s.directory.to_s).to_f rescue 0.0
+ [-mt, s.identifier.to_s]
+ }
+ ordered = default_skills + user_skills
+
+ lines = ["All installed skills (#{ordered.size} total):", ""]
+ ordered.each do |skill|
+ lines << "- name: #{skill.identifier}"
+ lines << " source: #{skill.source}"
+ lines << " description: #{skill.context_description}"
+ lines << ""
+ end
+ lines.join("\n")
+ end
+
  # Scan ~/.clacky/memories/ and return a formatted summary of all memory files.
  # Parses YAML frontmatter (same pattern as Skill#parse_frontmatter) for each file.
  # @return [String] Formatted list of memory topics and descriptions
@@ -488,11 +559,25 @@ module Clacky
  FileUtils.remove_dir(dir, true) rescue nil
  end

+ # Bump a skill's directory mtime so user-installed skills sort by recent
+ # use (LRU) when assembling AVAILABLE SKILLS. Touches the directory, NOT
+ # SKILL.md — the WebUI creator center uses SKILL.md mtime to detect local
+ # edits, and we must not produce false positives there.
+ # default-source skills are skipped: they don't participate in LRU and
+ # often live in a read-only gem path.
+ def touch_skill_for_lru(skill)
+ return if skill.source == :default
+ FileUtils.touch(skill.directory.to_s)
+ rescue StandardError
+ nil
+ end
+
  # Execute a skill in a forked subagent
  # @param skill [Skill] The skill to execute
  # @param arguments [String] Arguments for the skill
  # @return [String] Summary of subagent execution
  def execute_skill_with_subagent(skill, arguments)
+ touch_skill_for_lru(skill)
  # For encrypted brand skills with supporting scripts: decrypt to a tmpdir.
  # Subagent path has a clear boundary (subagent.run returns), so we shred inline
  # rather than registering on the parent agent.
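The comment on `touch_skill_for_lru` relies on a filesystem detail: touching a directory updates the directory's mtime without changing the mtime of files inside it. The sketch below demonstrates this in a temporary directory. `bump_directory_mtime` is a hypothetical helper that mirrors only the touch-and-swallow-errors part of the method; the `:default` source check is omitted.

```ruby
require "fileutils"
require "tmpdir"

# Touch the directory itself; swallow errors such as read-only paths.
def bump_directory_mtime(dir)
  FileUtils.touch(dir)
rescue StandardError
  nil
end

old = Time.now - 3600 # pretend the skill was last used an hour ago

result = Dir.mktmpdir do |root|
  skill_dir = File.join(root, "my-skill")
  FileUtils.mkdir_p(skill_dir)
  skill_md = File.join(skill_dir, "SKILL.md")
  File.write(skill_md, "# demo\n")

  # Backdate both the file and the directory so the change is observable.
  File.utime(old, old, skill_md)
  File.utime(old, old, skill_dir)

  bump_directory_mtime(skill_dir)

  {
    dir_bumped: File.mtime(skill_dir) > old + 1800,
    file_untouched: (File.mtime(skill_md) - old).abs < 2
  }
end
```

This is why a directory touch can drive LRU ordering while leaving `SKILL.md` usable as an "edited locally" signal.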
data/lib/clacky/agent/skill_reflector.rb CHANGED
@@ -19,45 +19,35 @@ module Clacky
  # Check if we should reflect on the skill that just executed
  # Called from SkillEvolution#run_skill_evolution_hooks
  def maybe_reflect_on_skill
- return unless @skill_execution_context
-
- # Only reflect on skills that the user explicitly invoked via slash command.
- # Skills triggered by the LLM itself (e.g. as part of a broader task) or
- # platform-management skills invoked incidentally should not be reflected on.
- return unless @skill_execution_context[:slash_command]
-
- # Skip default and brand skills — they are system-owned and should not be
- # auto-improved by the evolution system.
- source = @skill_execution_context[:source]
- return if source == :default || source == :brand
+ return unless should_reflect_on_skill?

  skill_name = @skill_execution_context[:skill_name]
- start_iteration = @skill_execution_context[:start_iteration]
-
- # Calculate iterations within the skill execution (not session-cumulative)
- iterations = @iterations - start_iteration
-
- # Only reflect if the skill actually ran for a meaningful number of iterations
- return if iterations < MIN_SKILL_ITERATIONS

- # Fork an isolated subagent to reflect + improve — does NOT touch main history
  @ui&.show_info("Reflecting on skill execution: #{skill_name}")
  subagent = fork_subagent
  result = subagent.run(build_skill_reflection_prompt(skill_name))

- # Merge subagent cost into parent's cumulative session spend so the
- # sessionbar reflects the real total. Without this, reflection cost
- # silently disappears from the user's visible total.
  if result
  subagent_cost = result[:total_cost_usd] || 0.0
  @total_cost += subagent_cost
  @ui&.update_sessionbar(cost: @total_cost, cost_source: @cost_source)
  end

- # Clear the context so we don't reflect again
  @skill_execution_context = nil
  end

+ private def should_reflect_on_skill?
+ return false unless @skill_execution_context
+ return false unless @skill_execution_context[:slash_command]
+
+ source = @skill_execution_context[:source]
+ return false if source == :default || source == :brand
+
+ start_iteration = @skill_execution_context[:start_iteration]
+ iterations = @iterations - start_iteration
+ iterations >= MIN_SKILL_ITERATIONS
+ end
+
  # Build the reflection prompt content
  # @param skill_name [String]
  # @return [String]
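This hunk moves the guard clauses out of `maybe_reflect_on_skill` into a side-effect-free predicate, which is what lets `skill_evolution_has_work?` ask the question before any UI phase is opened. A standalone sketch of the same gating logic follows; `reflect?` is a hypothetical name and `MIN_ITERATIONS = 3` is an assumed threshold, since the value of `MIN_SKILL_ITERATIONS` is not shown in this diff.

```ruby
MIN_ITERATIONS = 3 # assumed threshold for the example only

# Pure predicate: no UI calls, no subagent, no mutation of the context.
def reflect?(context, current_iteration, min_iterations: MIN_ITERATIONS)
  return false unless context
  return false unless context[:slash_command]            # user-invoked only
  return false if %i[default brand].include?(context[:source])

  (current_iteration - context[:start_iteration]) >= min_iterations
end

user_skill    = { slash_command: true,  source: :global,  start_iteration: 10 }
default_skill = { slash_command: true,  source: :default, start_iteration: 10 }
llm_triggered = { slash_command: false, source: :global,  start_iteration: 10 }
```

With these inputs, `reflect?(user_skill, 13)` is true, while the default-source skill, the LLM-triggered skill, and a run that lasted fewer than three iterations are all rejected.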
@@ -79,6 +69,11 @@ module Clacky

  ## Decision

+ If the assistant's last message is a question back to the user
+ (the turn isn't actually finished), or the user was just asking/
+ discussing rather than finishing a task:
+ → Respond briefly: "Skill #{skill_name} worked well, no improvements needed."
+
  If you identified **concrete, actionable improvements**:
  → Call invoke_skill("skill-creator", task: "Improve skill #{skill_name}: [describe specific improvements needed]")