openclacky 1.0.0 → 1.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (60)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +21 -0
  3. data/README.md +87 -53
  4. data/lib/clacky/agent/cost_tracker.rb +19 -2
  5. data/lib/clacky/agent/llm_caller.rb +33 -0
  6. data/lib/clacky/agent/message_compressor_helper.rb +32 -2
  7. data/lib/clacky/agent.rb +1 -20
  8. data/lib/clacky/client.rb +44 -5
  9. data/lib/clacky/default_parsers/pdf_parser.rb +58 -17
  10. data/lib/clacky/default_parsers/pdf_parser_ocr.py +103 -0
  11. data/lib/clacky/default_parsers/pdf_parser_plumber.py +62 -0
  12. data/lib/clacky/default_skills/deploy/SKILL.md +201 -77
  13. data/lib/clacky/default_skills/new/SKILL.md +3 -114
  14. data/lib/clacky/default_skills/onboard/SKILL.md +340 -133
  15. data/lib/clacky/default_skills/onboard/scripts/import_external_skills.rb +371 -0
  16. data/lib/clacky/message_format/anthropic.rb +72 -8
  17. data/lib/clacky/message_format/bedrock.rb +6 -3
  18. data/lib/clacky/providers.rb +89 -0
  19. data/lib/clacky/server/http_server.rb +736 -7
  20. data/lib/clacky/server/session_registry.rb +55 -24
  21. data/lib/clacky/skill.rb +10 -9
  22. data/lib/clacky/skill_loader.rb +23 -11
  23. data/lib/clacky/tools/file_reader.rb +232 -127
  24. data/lib/clacky/tools/security.rb +42 -64
  25. data/lib/clacky/tools/terminal/persistent_session.rb +15 -4
  26. data/lib/clacky/tools/terminal/safe_rm.sh +106 -0
  27. data/lib/clacky/tools/terminal/session_manager.rb +8 -3
  28. data/lib/clacky/tools/terminal.rb +263 -16
  29. data/lib/clacky/ui2/layout_manager.rb +8 -1
  30. data/lib/clacky/ui2/output_buffer.rb +83 -23
  31. data/lib/clacky/ui2/ui_controller.rb +74 -7
  32. data/lib/clacky/utils/model_pricing.rb +120 -0
  33. data/lib/clacky/utils/parser_manager.rb +70 -6
  34. data/lib/clacky/utils/string_matcher.rb +23 -1
  35. data/lib/clacky/version.rb +1 -1
  36. data/lib/clacky/web/app.css +574 -0
  37. data/lib/clacky/web/app.js +40 -1608
  38. data/lib/clacky/web/i18n.js +195 -0
  39. data/lib/clacky/web/index.html +158 -0
  40. data/lib/clacky/web/profile.js +442 -0
  41. data/lib/clacky/web/sessions.js +1032 -0
  42. data/lib/clacky/web/sidebar.js +39 -0
  43. data/lib/clacky/web/skills.js +456 -0
  44. data/lib/clacky/web/trash.js +343 -0
  45. data/lib/clacky/web/ws-dispatcher.js +255 -0
  46. data/lib/clacky.rb +0 -3
  47. metadata +15 -17
  48. data/lib/clacky/clacky_auth_client.rb +0 -152
  49. data/lib/clacky/clacky_cloud_config.rb +0 -123
  50. data/lib/clacky/cloud_project_client.rb +0 -169
  51. data/lib/clacky/default_skills/deploy/scripts/rails_deploy.rb +0 -1377
  52. data/lib/clacky/default_skills/deploy/tools/check_health.rb +0 -116
  53. data/lib/clacky/default_skills/deploy/tools/create_database_service.rb +0 -341
  54. data/lib/clacky/default_skills/deploy/tools/execute_deployment.rb +0 -99
  55. data/lib/clacky/default_skills/deploy/tools/fetch_runtime_logs.rb +0 -77
  56. data/lib/clacky/default_skills/deploy/tools/list_services.rb +0 -67
  57. data/lib/clacky/default_skills/deploy/tools/report_deploy_status.rb +0 -67
  58. data/lib/clacky/default_skills/deploy/tools/set_deploy_variables.rb +0 -189
  59. data/lib/clacky/default_skills/new/scripts/cloud_project_init.sh +0 -74
  60. data/lib/clacky/deploy_api_client.rb +0 -484
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 49800afa935670c288d9f421595df4246b61e76ed0f2a74e1a7a754e85e26162
- data.tar.gz: dba09cac5a79485b743aaad4568ce2e4fe2e13772d6b8c43a360ec11eca7c762
+ metadata.gz: 9d6ba5a62f7a352730705db11aff8ab76af059764903eb4413bd5a0aa835fecf
+ data.tar.gz: 58ba8fdcf23b5dabcc4a8ed709be0f34a9d27a5be83601fee685a638eb3ff445
  SHA512:
- metadata.gz: 2b723771f71d880d99582f6bfd4d23a66f54ee3caa87f7ed228360f015cadb52a20be9d6869c6e35612740ddb889ceb762efa541a41bc25810f5897d47a333e1
- data.tar.gz: 5c425e94d2bf4c4d68175b740d840b9cd6270ef91f2e68e6d8403fbb6fbc5336b07bd65308907dbb8d8c3cd1cb906c4c5f64ae7710a7e0619ab2aaae0ddc278b
+ metadata.gz: 00e3f00119cad74d7da43519a1a12332e509c0050946d713dea17db539bbadf0099e96ea5369cc19046fd0bc1c224849cbbaf43addfe0708858780a370067b3b
+ data.tar.gz: 4e7888c952dd49c664c67212c0986b62bd7745887dae7d85bce14b3f36c544fc5bd9ca27f1851f04e14477cfd9316938605b6ae0f89b19652cadd1442c6dc564
data/CHANGELOG.md CHANGED
@@ -5,6 +5,27 @@ All notable changes to this project will be documented in this file.
5
5
  The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6
6
  and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7
7
 
8
+ ## [1.0.1] - 2026-05-06
9
+
10
+ ### Added
11
+ - **OpenRouter Anthropic API support.** You can now route Claude model requests through OpenRouter, giving access to Anthropic models via a single OpenRouter API key — useful when Anthropic direct access is limited in your region.
12
+ - **GPT provider support.** Direct GPT provider configuration is now available alongside other providers, making it easier to switch between different OpenAI-compatible endpoints.
13
+ - **OCR-powered PDF reading.** PDF files that contain scanned images (non-text PDFs) are now readable via OCR, allowing the agent to extract content from scanned documents, invoices, and image-heavy PDFs.
14
+ - **Terminal output size control.** The agent now limits terminal output to a configurable size, preventing token overflows when running commands that produce very long output.
15
+ - **Memories & Trash manager in Web UI.** A new management panel lets you browse, review, and delete agent memories and trashed files directly from the Web UI.
16
+ - **Watchdog for interrupt messages.** A background watchdog ensures interrupt signals reliably stop the agent even when it's deep in a tool execution loop.
17
+ - **Skill import with category directory scanning.** When importing skills from openclaw packages, nested category directories are now scanned automatically, so all skills in a category bundle are imported at once.
18
+
19
+ ### Improved
20
+ - **Deploy skill simplified.** The deploy skill now uses Railway CLI directly without custom helper tools, making deployments more reliable and the codebase significantly lighter.
21
+ - **Fix double-render of progress indicators.** Progress spinners and status lines no longer render twice in quick succession, keeping the Web UI output clean.
22
+ - **Session idle status tracking and file descriptor cleanup.** Sessions now correctly report idle state when the agent finishes, and open file descriptors are properly closed to avoid resource leaks.
23
+ - **GPT-4.1 and GPT-5 pricing added.** Model cost tracking now includes the latest GPT-4.1 and GPT-5 pricing tiers.
24
+
25
+ ### Fixed
26
+ - **UTF-8 encoding error in file preview.** Opening files with non-UTF-8 characters no longer crashes the preview — they are now handled gracefully.
27
+ - **Expand `~` in openfile path.** The "open file in editor" API endpoint now correctly expands `~` to the user's home directory.
28
+
8
29
  ## [1.0.0] - 2026-04-30
9
30
 
10
31
  ### Added
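The "UTF-8 encoding error in file preview" fix listed above can be pictured with Ruby's standard `String#scrub`. This is a minimal sketch and not the gem's actual code: the helper name `safe_preview` and the `?` replacement character are assumptions made for illustration.

```ruby
# Hypothetical helper (not taken from the gem): make arbitrary file bytes
# safe to render as UTF-8 text in a preview.
def safe_preview(bytes)
  text = bytes.dup.force_encoding(Encoding::UTF_8)
  # String#scrub replaces each invalid byte sequence with the given string
  # instead of raising when the text is later processed.
  text.valid_encoding? ? text : text.scrub("?")
end

latin1_bytes = "caf\xE9".b        # "cafe" with an accented e, encoded as ISO-8859-1
puts safe_preview(latin1_bytes)   # the invalid byte is replaced, nothing raises
```

Replacing invalid bytes keeps the rest of the file visible, which matches the "handled gracefully" wording in the changelog entry; whether the gem replaces, drops, or transcodes such bytes is not stated in this diff.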
data/README.md CHANGED
@@ -6,77 +6,79 @@
6
6
  [![Downloads](https://img.shields.io/gem/dt/openclacky?label=downloads&style=flat-square&color=brightgreen)](https://rubygems.org/gems/openclacky)
7
7
  [![License](https://img.shields.io/badge/license-MIT-lightgrey?style=flat-square)](LICENSE.txt)
8
8
 
9
- **From expertise to business: turn your professional knowledge into a monetizable OpenClaw Skill.**
9
+ **The most Token-efficient open-source AI Agent.**
10
10
 
11
- OpenClacky is the creator-side platform for the OpenClaw ecosystem. Package your methods and workflows into encrypted, white-labeled Skills that your clients install and use under your name, your brand, your price.
11
+ OpenClacky matches Claude Code on capability at comparable cost, and saves significantly against other open-source agents (~50% vs OpenClaw, ~3× cheaper than Hermes). 100% open source (MIT), BYOK with any OpenAI-compatible model, built on two years of Agentic R&D and harness engineering.
12
12
 
13
- ## Why OpenClacky?
13
+ > Website: https://www.openclacky.com/ · Backed by **MiraclePlus · ZhenFund · Sequoia China · Hillhouse Capital**
14
14
 
15
- The OpenClaw ecosystem has 5,700+ Skills and growing. But almost all of them are open-sourced, free, and easily copied. The real scarcity isn't more Skills — it's **expertise-backed, production-grade Skills worth paying for**.
15
+ ## Why OpenClacky?
16
16
 
17
- OpenClacky is built for the people who have that expertise.
17
+ Same task, how much do you pay? Under comparable agent workloads, OpenClacky saves a large amount of Token spend compared to mainstream alternatives.
18
18
 
19
- | | **Openclaw** | **OpenClacky** |
19
+ | Agent | Relative cost | Notes |
20
20
  |---|---|---|
21
- | **Core model** | Open sharing | Encrypted & protected |
22
- | **Primary users** | Users who install Skills | Creators who sell Skills |
23
- | **Revenue** | None | Creator-defined pricing |
24
- | **Brand** | Platform brand | Your own brand |
25
- | **Driven by** | Technical contributors | Domain expertise |
21
+ | **OpenClacky** | **~0.8–1.2×** | 16 tools · ~100% cache hit · subagent routing |
22
+ | Claude Code | 1.0× (baseline) | World-class harness, closed-source subscription |
23
+ | OpenClaw | ~1.5× | Comparable harness agent |
24
+ | Hermes | ~3× | 52 built-in tools — schema bloat ~3–4× |
26
25
 
27
- ## How It Works
26
+ *Numbers are averages measured on internal common agent tasks, using Claude Code as the baseline. Full benchmark reports will be published on GitHub.*
28
27
 
29
- **Four steps from capability to business:**
28
+ ## Feature comparison
30
29
 
31
- 1. **Craft your Skill**: Turn your domain methodology into a repeatable AI workflow
32
- 2. **Encrypt & protect** — Your logic stays yours; clients can't inspect or copy it
33
- 3. **Package your brand** — Ship under your name, your logo, your onboarding experience
34
- 4. **Launch & acquire** — One-click sales page, built-in SEO, start converting traffic
30
+ Core agent capability is roughly on par across the field; the real differentiators are **cost, openness, Skill evolution, and integrations**.
35
31
 
36
- ## Who It's For
32
+ | Feature | Claude Code | OpenClaw | Hermes | **OpenClacky** |
33
+ |---|:---:|:---:|:---:|:---:|
34
+ | Token cost | 1.0× | ~1.5× | ~3× | **~0.8–1.2×** |
35
+ | Open source | ❌ Closed | ✅ Open | ✅ Open | ✅ MIT |
36
+ | BYOK / model freedom | ❌ Anthropic only | ✅ | ✅ | ✅ |
37
+ | Skill self-evolution | ❌ | ❌ | ✅ | ✅ |
38
+ | IM integration (Feishu / WeCom / WeChat) | ❌ | ✅ | ✅ | ✅ |
37
39
 
38
- OpenClacky is built for domain experts whose knowledge can be expressed as *information processing + executable actions*:
40
+ ## How we get the cost down
39
41
 
40
- - **SEO specialists**: keyword research, content scoring, rank monitoring
41
- - **Lawyers** — contract review, case retrieval, risk flagging
42
- - **Traders** — signal detection, strategy backtesting, automated execution
43
- - **Data analysts** — cleaning, modeling, report generation
44
- - **Content strategists** — topic selection, outlines, drafts at scale
42
+ Not by cutting features, but by compounding the right choice at every layer.
45
43
 
46
- ## Features
44
+ ### 1. Ultra-high cache hit rate
45
+ Sessions never restart, double cache markers, **Insert-then-Compress** — the system prompt is never mutated, so compression still reuses the cache. **Measured cache hit rate: near 100%.**
47
46
 
48
- - [x] **Skill builder** — Create AI workflows via conversation or UI, iterate and ship fast
49
- - [x] **Encryption**: Protect your knowledge assets; end users cannot read your Skill source
50
- - [x] **White-label packaging** — Your brand, your product line, your client experience
51
- - [x] **Auto-update delivery** — Push updates to all users seamlessly, with version control
52
- - [x] **Cross-platform distribution** — Windows, macOS, Linux — one Skill, every platform
53
- - [x] **Sales page generator** — Launch your storefront fast, with built-in SEO foundations
54
- - [x] **Cost monitoring** — Real-time token tracking, automatic compression (up to 90% savings)
55
- - [x] **Multi-provider support** — OpenAI, Anthropic, DeepSeek, and any OpenAI-compatible API
56
- - [ ] **Skill marketplace** — Discover and distribute premium Skills *(coming soon)*
47
+ ### 2. Minimal tool set
48
+ Only **16 core tools**. Capabilities are offloaded to the Skill ecosystem via a single `invoke_skill` meta-tool. Tool count is not the metric — task completion rate is.
57
49
 
58
- ## Coding Support
50
+ | OpenClacky | Claude Code | OpenClaw | Hermes |
51
+ |:--:|:--:|:--:|:--:|
52
+ | **16** | 40+ | 23 | 52 |
59
53
 
60
- OpenClacky also works as a general AI coding assistant — scaffold full-stack Rails apps, add features, or explore an unfamiliar codebase:
54
+ ### 3. Idle-time auto-compression
55
+ Go to a meeting, grab coffee — the agent compresses long context in the background and pre-warms the cache. Your first message back hits the cache directly. **Cold-start first-token cost reduced by 50%+.**
61
56
 
62
- ```bash
63
- $ openclacky
64
- > /new my-app # scaffold a full-stack Rails app
65
- > Add user auth with email and password
66
- > How does the payment module work?
67
- ```
57
+ ### 4. BYOK — you pick the model, you set the cost
58
+ Any OpenAI-compatible API, plug and play. Official direct, aggregate routing, compatible relays — the choice is 100% yours. Use Claude for code, auto-route subtasks to DeepSeek, save another chunk of tokens.
59
+
60
+ Built on **2 years · 3 generations of agentic architecture · 6 core harness engineering decisions**.
61
+
62
+ ## Skills — the soul of the agent
68
63
 
69
- Built on a production-ready Rails architecture with one-click deployment, dev/prod isolation, and automatic backups.
64
+ - **Invoke with `/`**: instant browse, fuzzy search, direct call. Hundreds of Skills at your fingertips.
65
+ - **Create Skills in natural language** — just describe what you want; the agent drafts `SKILL.md`, breaks down steps, and runs validation. No code required.
66
+ - **Self-evolving** — after each run, the agent updates the Skill based on execution context and results. The next call is more stable and more accurate.
67
+ - **Open & compatible** — supports Claude Skills / Markdown Pack / custom formats.
68
+ - **Monetizable** — polished Skills can be packaged for sale, with encrypted distribution, License management, and creator-defined pricing.
70
69
 
71
70
  ## Installation
72
71
 
73
- ### Method 1: One-line Install (Recommended)
72
+ ### Desktop installer (recommended)
74
73
 
75
- ```bash
76
- /bin/bash -c "$(curl -sSL https://raw.githubusercontent.com/clacky-ai/openclacky/main/scripts/install.sh)"
77
- ```
74
+ Double-click to install — environment, dependencies, and Skills all set up automatically.
75
+
76
+ - **macOS** — [Download `.dmg`](https://oss.1024code.com/openclacky-installer/official/openclacky-installer.dmg) (Apple Silicon / Intel)
77
+ - **Windows** — [Download `.exe`](https://oss.1024code.com/openclacky-installer/official/openclacky-installer.exe) (Windows 10 2004+ / Windows 11)
78
78
 
79
- ### Method 2: RubyGems
79
+ More options: https://www.openclacky.com/
80
+
81
+ ### Command line
80
82
 
81
83
  **Requirements:** Ruby >= 3.1.0
82
84
 
@@ -84,6 +86,12 @@ Built on a production-ready Rails architecture with one-click deployment, dev/pr
84
86
  gem install openclacky
85
87
  ```
86
88
 
89
+ Or one-line install:
90
+
91
+ ```bash
92
+ /bin/bash -c "$(curl -sSL https://raw.githubusercontent.com/clacky-ai/openclacky/main/scripts/install.sh)"
93
+ ```
94
+
87
95
  ## Quick Start
88
96
 
89
97
  ### Terminal (CLI)
@@ -95,16 +103,16 @@ openclacky # start interactive agent in current directory
95
103
  ### Web UI
96
104
 
97
105
  ```bash
98
- openclacky server # start the web server (default: http://localhost:7070)
106
+ openclacky server # default: http://localhost:7070
99
107
  ```
100
108
 
101
- Then open **http://localhost:7070** in your browser. You'll get a full-featured chat interface with multi-session support — run separate sessions for coding, copywriting, research, and more, all in parallel.
109
+ Open **http://localhost:7070** for a full chat interface with multi-session support — run coding, copywriting, research sessions in parallel.
102
110
 
103
111
  Options:
104
112
 
105
113
  ```bash
106
- openclacky server --port 8080 # custom port
107
- openclacky server --host 0.0.0.0 # listen on all interfaces (e.g. remote access)
114
+ openclacky server --port 8080 # custom port
115
+ openclacky server --host 0.0.0.0 # listen on all interfaces (remote access)
108
116
  ```
109
117
 
110
118
  ## Configuration
@@ -114,7 +122,26 @@ $ openclacky
114
122
  > /config
115
123
  ```
116
124
 
117
- You'll be prompted to set your **API Key**, **Model**, and **Base URL** (any OpenAI-compatible provider).
125
+ Set your **API Key**, **Model**, and **Base URL** (any OpenAI-compatible provider).
126
+
127
+ Supported out of the box: **Claude (Anthropic) · GPT (OpenAI) · DeepSeek · Kimi (Moonshot) · MiniMax · OpenRouter** — or any custom endpoint.
128
+
129
+ ## Coding use case
130
+
131
+ OpenClacky works as a general AI coding assistant — scaffold full-stack apps, add features, or explore unfamiliar codebases:
132
+
133
+ ```bash
134
+ $ openclacky
135
+ > /new my-app # scaffold a new project
136
+ > Add user auth with email and password
137
+ > How does the payment module work?
138
+ ```
139
+
140
+ ## Advanced — Creator Program
141
+
142
+ Already power users are turning their workflows into vertical AI experts on OpenClacky — encrypted distribution, License management, self-set pricing. Legal, healthcare, financial planning, and more.
143
+
144
+ Learn more: https://www.openclacky.com/ → Creators
118
145
 
119
146
  ## Install from Source
120
147
 
@@ -125,6 +152,13 @@ bundle install
125
152
  bin/clacky
126
153
  ```
127
154
 
155
+ ## Trust & Credibility
156
+
157
+ - **100% open source** — MIT License, all code public, all decisions traceable
158
+ - **2 years of Agentic R&D** — 3 generations of architecture
159
+ - **16 core tools** — minimal by design
160
+ - **Backed by** MiraclePlus · ZhenFund · Sequoia China · Hillhouse Capital
161
+
128
162
  ## Contributing
129
163
 
130
164
  Bug reports and pull requests are welcome on GitHub at https://github.com/clacky-ai/openclacky. Contributors are expected to adhere to the [code of conduct](https://github.com/clacky-ai/openclacky/blob/main/CODE_OF_CONDUCT.md).
data/lib/clacky/agent/cost_tracker.rb CHANGED
@@ -105,8 +105,25 @@ module Clacky
  cache_write = usage[:cache_creation_input_tokens] || 0
  cache_read = usage[:cache_read_input_tokens] || 0
 
- # Calculate token delta from previous iteration
- delta_tokens = total_tokens - @previous_total_tokens
+ # Calculate token delta from previous iteration.
+ #
+ # Two conventions exist for total_tokens across providers:
+ # - OpenAI (default): cumulative per-request input+output (grows
+ # with history every turn). Delta = total - prev.
+ # - Anthropic direct: already the per-turn new compute
+ # (raw_input + cache_creation + output).
+ # The MessageFormat sets :total_is_per_turn so
+ # we use total_tokens directly as the delta.
+ #
+ # Without this branch, Anthropic's per-turn total would be treated as
+ # cumulative and produce negative / nonsensical deltas whenever cached
+ # prefixes make the per-turn new-compute smaller than the previous turn.
+ delta_tokens =
+ if usage[:total_is_per_turn]
+ total_tokens
+ else
+ total_tokens - @previous_total_tokens
+ end
  @previous_total_tokens = total_tokens # Update for next iteration
 
  {
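The effect of the `:total_is_per_turn` branch in the hunk above can be seen in a small standalone sketch. The class name `DeltaTracker`, the method `delta_for`, and the `:total_tokens` key are assumptions for illustration; only `:total_is_per_turn` and `@previous_total_tokens` come from the diff.

```ruby
# Hypothetical, stripped-down model of the delta calculation in the hunk.
class DeltaTracker
  def initialize
    @previous_total_tokens = 0
  end

  # usage: Hash with :total_tokens (assumed key) and optional :total_is_per_turn.
  def delta_for(usage)
    total = usage[:total_tokens] || 0
    delta =
      if usage[:total_is_per_turn]
        total                          # total already is this turn's new compute
      else
        total - @previous_total_tokens # cumulative convention: subtract last total
      end
    @previous_total_tokens = total
    delta
  end
end

cumulative = DeltaTracker.new
puts cumulative.delta_for(total_tokens: 100)
puts cumulative.delta_for(total_tokens: 250)

per_turn = DeltaTracker.new
puts per_turn.delta_for(total_tokens: 120, total_is_per_turn: true)
puts per_turn.delta_for(total_tokens: 40, total_is_per_turn: true)
```

In the per-turn case the second call reports 40; under the old subtraction it would have reported 40 - 120, the negative delta that the new comment describes.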
data/lib/clacky/agent/llm_caller.rb CHANGED
@@ -54,6 +54,20 @@ module Clacky
54
54
  max_retries = 10
55
55
  retry_delay = 5
56
56
  retries = 0
57
+
58
+ # Track whether any of the retry/fallback branches below opened a
59
+ # "retrying" progress slot via show_progress(progress_type:
60
+ # "retrying", phase: "active"). If so, we MUST close it before
61
+ # leaving call_llm — otherwise the UI's legacy shim in
62
+ # UI2::UIController keeps the :quiet ProgressHandle alive, its
63
+ # ticker thread keeps running, and the user sees a frozen
64
+ # "Network failed: ... (681s)" line long after the task finished.
65
+ #
66
+ # The close is done in the outer ensure below so it runs on:
67
+ # - normal success (response returned)
68
+ # - unrecoverable failure (raise propagates out)
69
+ # - BadRequestError reasoning-content retry success
70
+ retrying_progress_opened = false
57
71
  # One-shot flag set by the BadRequestError rescue below when the server
58
72
  # complained about missing reasoning_content. The subsequent retry will
59
73
  # pad every assistant message's reasoning_content, which satisfies
@@ -67,6 +81,7 @@ module Clacky
67
81
  thinking_retry_attempted = false
68
82
 
69
83
  begin
84
+ begin
70
85
  # Use active_messages (Time Machine) when undone, otherwise send full history.
71
86
  # to_api strips internal fields and handles orphaned tool_calls.
72
87
  messages_to_send = if respond_to?(:active_messages)
@@ -118,6 +133,7 @@ module Clacky
118
133
  phase: "active",
119
134
  metadata: { attempt: retries, total: max_retries }
120
135
  )
136
+ retrying_progress_opened = true
121
137
  sleep retry_delay
122
138
  retry
123
139
  else
@@ -144,6 +160,7 @@ module Clacky
144
160
  phase: "active",
145
161
  metadata: { attempt: retries, total: max_retries }
146
162
  )
163
+ retrying_progress_opened = true
147
164
  sleep retry_delay
148
165
  retry
149
166
  else
@@ -180,6 +197,7 @@ module Clacky
180
197
  phase: "active",
181
198
  metadata: { attempt: retries, total: current_max }
182
199
  )
200
+ retrying_progress_opened = true
183
201
  sleep retry_delay
184
202
  retry
185
203
  else
@@ -213,6 +231,21 @@ module Clacky
213
231
  response[:token_usage] = token_data
214
232
 
215
233
  response
234
+ ensure
235
+ # Close any "retrying" progress slot that was opened during the
236
+ # retry/fallback loop above. The legacy UI shim allocates a
237
+ # separate :quiet ProgressHandle under the "retrying" key; if it
238
+ # is never finished its ticker thread keeps running and the user
239
+ # sees a stale "Network failed: ... (NNN s)" line long after the
240
+ # task has completed. This ensure runs on:
241
+ # - successful retry → close the slot, message is "Recovered"
242
+ # so the final frame is informative rather than blank
243
+ # - unrecoverable failure that raises out → close the slot so
244
+ # the spinner doesn't linger while the error bubbles up
245
+ if retrying_progress_opened
246
+ @ui&.show_progress(progress_type: "retrying", phase: "done")
247
+ end
248
+ end
216
249
  end
217
250
 
218
251
  # Attempt to activate the provider fallback model for the given primary model.
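The nested `begin` plus outer `ensure` used in the hunks above can be reduced to a runnable sketch. `RetryUI`, `call_with_retries`, and the use of `IOError` as the retryable error are assumptions for illustration; the real `call_llm` handles several error classes, sleeps between attempts, and talks to the actual UI controller.

```ruby
# Hypothetical UI stand-in that records show_progress calls.
class RetryUI
  attr_reader :events

  def initialize
    @events = []
  end

  def show_progress(progress_type:, phase:)
    @events << [progress_type, phase]
  end
end

# Retries the block on IOError and guarantees that an opened "retrying"
# progress slot is closed exactly once, on success or on final failure.
def call_with_retries(ui, max_retries: 3)
  retries = 0
  retrying_progress_opened = false
  begin
    begin
      yield retries
    rescue IOError
      retries += 1
      raise if retries > max_retries
      ui.show_progress(progress_type: "retrying", phase: "active")
      retrying_progress_opened = true
      retry # re-runs only the inner begin block
    end
  ensure
    # Runs once, when control finally leaves the outer block.
    ui.show_progress(progress_type: "retrying", phase: "done") if retrying_progress_opened
  end
end

ui = RetryUI.new
result = call_with_retries(ui) { |attempt| attempt < 2 ? raise(IOError, "network") : :ok }
p result
p ui.events
```

Keeping the `ensure` on an outer block, as the diff does, means the close step is tied to leaving the whole call rather than to any single attempt.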
data/lib/clacky/agent/message_compressor_helper.rb CHANGED
@@ -47,11 +47,41 @@ module Clacky
47
47
  handle_compression_response(response, compression_context, progress: handle)
48
48
  true
49
49
  rescue Clacky::AgentInterrupted => e
50
- @ui&.log("Idle compression canceled: #{e.message}", level: :info)
50
+ # User cancelled the idle compression: finish the quiet progress
51
+ # slot in place so the user sees exactly what happened (rather
52
+ # than the "Idle detected..." line being silently removed).
53
+ final = "Idle compression cancelled: #{e.message}"
54
+ if handle
55
+ handle.finish(final_message: final)
56
+ else
57
+ @ui&.log(final, level: :info)
58
+ end
51
59
  @history.rollback_before(compression_message)
60
+ Clacky::Logger.info("[idle-compress] cancelled: #{e.message}")
52
61
  false
53
62
  rescue => e
54
- @ui&.log("Idle compression failed: #{e.message}", level: :error)
63
+ # Compression failed (most commonly: network errors after all
64
+ # LlmCaller retries exhausted). Previously this only wrote an
65
+ # @ui.log(:error) that was easy to miss — especially when no
66
+ # other output followed. Now we:
67
+ # 1. Replace the active quiet progress line with the error so
68
+ # the user always sees *something* where the spinner was.
69
+ # 2. Emit a show_warning for a more prominent entry.
70
+ # 3. Persist to Clacky::Logger so post-mortem is possible even
71
+ # if the terminal scrollback has rolled past.
72
+ final = "Idle compression failed: #{e.message}"
73
+ if handle
74
+ handle.finish(final_message: final)
75
+ else
76
+ @ui&.log(final, level: :error)
77
+ end
78
+ @ui&.show_warning(final)
79
+ Clacky::Logger.warn(
80
+ "[idle-compress] failed",
81
+ error_class: e.class.name,
82
+ error_message: e.message,
83
+ backtrace: e.backtrace&.first(5)
84
+ )
55
85
  @history.rollback_before(compression_message)
56
86
  false
57
87
  end
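Both rescue branches in the hunk above follow the same shape: finish the progress handle in place when one exists, otherwise fall back to a log line. A minimal sketch of that shape follows; `ProgressHandleStub`, `LogUI`, and `report_outcome` are hypothetical names, and only the `finish(final_message:)` and `log(message, level:)` call shapes are taken from the diff.

```ruby
# Hypothetical stand-in for the quiet progress handle.
class ProgressHandleStub
  attr_reader :final_message

  def finish(final_message:)
    @final_message = final_message
  end
end

# Hypothetical stand-in for the UI logger.
class LogUI
  attr_reader :logged

  def initialize
    @logged = []
  end

  def log(message, level:)
    @logged << [level, message]
  end
end

# Prefer replacing the spinner line; log only when no handle is available.
def report_outcome(handle, ui, message, level:)
  if handle
    handle.finish(final_message: message)
  else
    ui&.log(message, level: level)
  end
end

handle = ProgressHandleStub.new
report_outcome(handle, LogUI.new, "Idle compression cancelled: user input", level: :info)
puts handle.final_message
```

The safe-navigation call `ui&.log` mirrors the diff and keeps the fallback harmless when no UI is attached.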
data/lib/clacky/agent.rb CHANGED
@@ -78,7 +78,6 @@ module Clacky
  @cost_source = :estimated # Track whether cost is from API or estimated
  @task_cost_source = :estimated # Track cost source for current task
  @previous_total_tokens = 0 # Track tokens from previous iteration for delta calculation
- @interrupted = false # Flag for user interrupt
  @latest_latency = nil # Most recent LLM call's latency metrics (see Client#send_messages_with_tools)
  @ui = ui # UIController for direct UI interaction
  @debug_logs = [] # Debug logs for troubleshooting
@@ -360,9 +359,6 @@ module Clacky
  task_interrupted = false
 
  loop do
-
- break if should_stop?
-
  @iterations += 1
  @hooks.trigger(:on_iteration, @iterations)
 
@@ -929,12 +925,6 @@ module Clacky
  end
  end
 
- # Interrupt the agent's current run
- # Called when user presses Ctrl+C during agent execution
- def interrupt!
- @interrupted = true
- end
-
  # Enqueue an inline skill injection to be flushed after observe().
  # Called by InvokeSkill#execute to avoid injecting during tool execution,
  # which would break Bedrock's toolUse/toolResult pairing requirement.
@@ -1001,16 +991,7 @@ module Clacky
 
  # Check if agent is currently running
  def running?
- @start_time != nil && !should_stop?
- end
-
- private def should_stop?
- if @interrupted
- @interrupted = false # Reset for next run
- return true
- end
-
- false
+ !@start_time.nil?
  end
 
  private def build_result(status = :success, error: nil)
data/lib/clacky/client.rb CHANGED
@@ -12,14 +12,29 @@ module Clacky
12
12
  @api_key = api_key
13
13
  @base_url = base_url
14
14
  @model = model
15
- @use_anthropic_format = anthropic_format
16
15
  # Detect Bedrock: ABSK key prefix (native AWS) or abs- model prefix (Clacky AI proxy)
17
16
  @use_bedrock = MessageFormat::Bedrock.bedrock_api_key?(api_key, model)
18
17
 
18
+ # Resolve provider once — reused for capability + api-type lookups.
19
+ provider_id = Providers.resolve_provider(base_url: @base_url, api_key: @api_key)
20
+
21
+ # Decide anthropic_format dynamically based on provider+model, falling
22
+ # back to the explicit constructor flag for unknown providers / custom
23
+ # base_urls. This lets e.g. OpenRouter's Claude models auto-route to the
24
+ # native /v1/messages endpoint (preserving cache_control byte-for-byte)
25
+ # without requiring any change to user YAML.
26
+ provider_prefers_anthropic = provider_id &&
27
+ Providers.anthropic_format_for_model?(provider_id, @model)
28
+ @use_anthropic_format = provider_prefers_anthropic || anthropic_format
29
+
30
+ # Remember the provider id so we can tune connection headers below
31
+ # (OpenRouter's /v1/messages accepts either Bearer or x-api-key, but
32
+ # some OpenRouter-compatible relays only honour Bearer — send both).
33
+ @provider_id = provider_id
34
+
19
35
  # Determine vision support once at construction time.
20
36
  # Non-vision models (DeepSeek, Kimi, MiniMax, etc.) reject image_url
21
37
  # content blocks; the conversion layer strips them when this is false.
22
- provider_id = Providers.resolve_provider(base_url: @base_url, api_key: @api_key)
23
38
  @vision_supported = Providers.supports?(provider_id, :vision, model_name: @model)
24
39
  end
25
40
 
@@ -47,7 +62,7 @@ module Clacky
47
62
  elsif anthropic_format?
48
63
  minimal_body = { model: model, max_tokens: 16,
49
64
  messages: [{ role: "user", content: "hi" }] }.to_json
50
- response = anthropic_connection.post("v1/messages") { |r| r.body = minimal_body }
65
+ response = anthropic_connection.post(anthropic_messages_path) { |r| r.body = minimal_body }
51
66
  else
52
67
  minimal_body = { model: model, max_tokens: 16,
53
68
  messages: [{ role: "user", content: "hi" }] }.to_json
@@ -77,7 +92,7 @@ module Clacky
77
92
  parse_simple_bedrock_response(response)
78
93
  elsif anthropic_format?
79
94
  body = MessageFormat::Anthropic.build_request_body(messages, model, [], max_tokens, false)
80
- response = anthropic_connection.post("v1/messages") { |r| r.body = body.to_json }
95
+ response = anthropic_connection.post(anthropic_messages_path) { |r| r.body = body.to_json }
81
96
  parse_simple_anthropic_response(response)
82
97
  else
83
98
  body = { model: model, max_tokens: max_tokens, messages: messages }
@@ -206,7 +221,7 @@ module Clacky
206
221
  messages = apply_message_caching(messages) if caching_enabled
207
222
 
208
223
  body = MessageFormat::Anthropic.build_request_body(messages, model, tools, max_tokens, caching_enabled)
209
- response = anthropic_connection.post("v1/messages") { |r| r.body = body.to_json }
224
+ response = anthropic_connection.post(anthropic_messages_path) { |r| r.body = body.to_json }
210
225
 
211
226
  raise_error(response) unless response.status == 200
212
227
  check_html_response(response)
@@ -333,6 +348,14 @@ module Clacky
333
348
  conn.headers["x-api-key"] = @api_key
334
349
  conn.headers["anthropic-version"] = "2023-06-01"
335
350
  conn.headers["anthropic-dangerous-direct-browser-access"] = "true"
351
+ # OpenRouter's /v1/messages endpoint authenticates with a Bearer
352
+ # token (the OpenRouter API key), not Anthropic's x-api-key. We send
353
+ # both so the same connection code works for direct Anthropic and
354
+ # for OpenRouter-proxied Claude — each endpoint ignores the header
355
+ # it doesn't recognise.
356
+ if @provider_id == "openrouter"
357
+ conn.headers["Authorization"] = "Bearer #{@api_key}"
358
+ end
336
359
  conn.options.timeout = 300
337
360
  conn.options.open_timeout = 10
338
361
  conn.ssl.verify = false
@@ -340,6 +363,22 @@ module Clacky
340
363
  end
341
364
  end
342
365
 
366
+ # Correct relative path for the Anthropic /v1/messages endpoint, accounting
367
+ # for whether the configured base_url already includes a "/v1" segment.
368
+ #
369
+ # Examples:
370
+ # base_url = "https://api.anthropic.com" → "v1/messages"
371
+ # base_url = "https://openrouter.ai/api/v1" → "messages"
372
+ # base_url = "https://openrouter.ai/api/v1/" → "messages"
373
+ #
374
+ # Without this, OpenRouter would receive POST /api/v1/v1/messages → 404
375
+ # (HTML error page), which bubbles up as the infamous
376
+ # "Invalid API endpoint or server error (received HTML instead of JSON)".
377
+ private def anthropic_messages_path
378
+ base = @base_url.to_s.chomp("/")
379
+ base.end_with?("/v1") ? "messages" : "v1/messages"
380
+ end
381
+
343
382
  # ── Error handling ────────────────────────────────────────────────────────
344
383
 
345
384
  def handle_test_response(response)
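The path logic of `anthropic_messages_path` from the hunk above can be checked in isolation. The sketch below is a standalone variant that takes the base URL as a parameter (the method in the diff reads `@base_url`), run against the three example URLs listed in the diff's own comment.

```ruby
# Standalone variant of the helper shown in the diff; the parameter replaces
# the instance variable so the function can be called without a Client.
def anthropic_messages_path(base_url)
  base = base_url.to_s.chomp("/")                    # drop one trailing slash
  base.end_with?("/v1") ? "messages" : "v1/messages" # avoid a doubled /v1
end

[
  "https://api.anthropic.com",
  "https://openrouter.ai/api/v1",
  "https://openrouter.ai/api/v1/"
].each do |url|
  puts "#{url} -> #{anthropic_messages_path(url)}"
end
```

Because the result is a relative path, joining it to a base that already ends in `/v1` no longer yields `/api/v1/v1/messages`, the 404 case the comment describes.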
data/lib/clacky/default_parsers/pdf_parser.rb CHANGED
@@ -12,15 +12,33 @@
  # exit 0 — success
  # exit 1 — failure
  #
- # This file lives in ~/.clacky/parsers/ and can be modified by the LLM
- # to add new capabilities (e.g. OCR for scanned PDFs).
+ # This file lives in ~/.clacky/parsers/ and can be modified by the LLM.
  #
- # VERSION: 1
+ # Extraction pipeline (first successful step wins):
+ #   1. pdftotext (poppler) — fastest, text-based PDFs
+ #   2. pdfplumber (Python) — handles more layouts
+ #      (→ pdf_parser_plumber.py)
+ #   3. OCR (tesseract) — scanned / image-only PDFs
+ #      (→ pdf_parser_ocr.py)
+ #
+ # Each extractor is a plain, self-contained function. Python-backed steps
+ # shell out to a sibling .py script so the LLM can edit them directly
+ # (with proper syntax highlighting, linters, and per-file run/debug)
+ # instead of wrestling with embedded heredocs.
+ #
+ # VERSION: 3

  require "open3"

+ # Minimum useful output (in bytes). Below this, a step is considered a
+ # miss and the next fallback is tried.
  MIN_CONTENT_BYTES = 20

+ # Script directory — resolve sibling .py helpers relative to this file
+ # so it works both from the gem's default_parsers/ dir and from the
+ # copied-to-user ~/.clacky/parsers/ dir.
+ SCRIPT_DIR = File.dirname(File.expand_path(__FILE__))
+
  def try_pdftotext(path)
    stdout, _stderr, status = Open3.capture3("pdftotext", "-layout", "-enc", "UTF-8", path, "-")
    return nil unless status.success?
@@ -32,18 +50,10 @@ rescue Errno::ENOENT
  end

  def try_pdfplumber(path)
-   script = <<~PYTHON
-     import sys, pdfplumber
-     with pdfplumber.open(sys.argv[1]) as pdf:
-         pages = []
-         for i, page in enumerate(pdf.pages, 1):
-             t = page.extract_text()
-             if t and t.strip():
-                 pages.append(f"--- Page {i} ---\\n{t.strip()}")
-         print("\\n\\n".join(pages))
-   PYTHON
+   script = File.join(SCRIPT_DIR, "pdf_parser_plumber.py")
+   return nil unless File.exist?(script)

-   stdout, _stderr, status = Open3.capture3("python3", "-c", script, path)
+   stdout, _stderr, status = Open3.capture3("python3", script, path)
    return nil unless status.success?
    text = stdout.strip
    return nil if text.bytesize < MIN_CONTENT_BYTES
@@ -52,6 +62,34 @@ rescue Errno::ENOENT
    nil # python3 not available
  end

+ # OCR fallback for scanned/image-only PDFs.
+ # See pdf_parser_ocr.py for the actual extraction logic.
+ #
+ # Installation hints (also printed on final failure):
+ #   macOS: brew install tesseract tesseract-lang poppler
+ #          pip3 install pytesseract pdf2image
+ #   Linux: apt install tesseract-ocr tesseract-ocr-chi-sim poppler-utils
+ #          pip3 install pytesseract pdf2image
+ def try_ocr(path)
+   # Quick capability check — avoid spawning python if tesseract is missing.
+   _stdout, _stderr, status = Open3.capture3("tesseract", "--version")
+   return nil unless status.success?
+
+   script = File.join(SCRIPT_DIR, "pdf_parser_ocr.py")
+   return nil unless File.exist?(script)
+
+   stdout, stderr, status = Open3.capture3("python3", script, path)
+   unless status.success?
+     warn stderr.strip unless stderr.strip.empty?
+     return nil
+   end
+   text = stdout.strip
+   return nil if text.bytesize < MIN_CONTENT_BYTES
+   text
+ rescue Errno::ENOENT
+   nil # tesseract or python3 not available
+ end
+
  # --- main ---

  path = ARGV[0]
@@ -66,14 +104,17 @@ unless File.exist?(path)
    exit 1
  end

- text = try_pdftotext(path) || try_pdfplumber(path)
+ # Try each extractor in order; first non-nil result wins.
+ text = try_pdftotext(path) || try_pdfplumber(path) || try_ocr(path)

  if text
    print text
    exit 0
  else
    warn "Could not extract text from PDF."
-   warn "Tip: install poppler for text-based PDFs: brew install poppler"
-   warn "For scanned PDFs, consider adding OCR support (e.g. tesseract)."
+   warn "For text-based PDFs, install poppler: brew install poppler (macOS) / apt install poppler-utils (Linux)"
+   warn "For scanned PDFs (OCR):"
+   warn " macOS: brew install tesseract tesseract-lang poppler && pip3 install pytesseract pdf2image"
+   warn " Linux: apt install tesseract-ocr tesseract-ocr-chi-sim poppler-utils && pip3 install pytesseract pdf2image"
    exit 1
  end
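
The "first non-nil result wins" chain above can be illustrated without any PDF tooling installed. This is a rough sketch under stated assumptions: `usable`, `first_usable`, and the two lambdas are hypothetical stand-ins for the real extractors, and `MIN_BYTES` mirrors the `MIN_CONTENT_BYTES = 20` threshold from the diff.

```ruby
MIN_BYTES = 20 # mirrors MIN_CONTENT_BYTES in the diff

# Treat output shorter than the threshold as a miss (nil).
def usable(text)
  text = text.to_s.strip
  text.bytesize < MIN_BYTES ? nil : text
end

# Call each extractor in order and stop at the first usable result,
# so later (slower) steps never run once an earlier one succeeds.
def first_usable(path, extractors)
  extractors.each do |extractor|
    result = usable(extractor.call(path))
    return result if result
  end
  nil
end

# Hypothetical extractors: the first finds nothing, the second succeeds.
empty_step    = ->(_path) { "" }
fallback_step = ->(_path) { "--- Page 1 ---\nRecovered text from a fallback step." }

puts first_usable("example.pdf", [empty_step, fallback_step])
```

The script in the diff gets the same short-circuit behaviour from Ruby's `||` operator, because each `try_*` function returns `nil` on a miss.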