rubino-agent 0.4.0 → 0.5.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (322)
  1. checksums.yaml +4 -4
  2. data/.rubocop.yml +6 -0
  3. data/.rubocop_todo.yml +12 -2
  4. data/AGENTS.md +1 -1
  5. data/CHANGELOG.md +454 -1
  6. data/CONTRIBUTING.md +10 -1
  7. data/README.md +69 -11
  8. data/Rakefile +48 -0
  9. data/docs/agents.md +82 -48
  10. data/docs/architecture.md +4 -11
  11. data/docs/commands.md +46 -7
  12. data/docs/configuration.md +174 -30
  13. data/docs/getting-started.md +5 -3
  14. data/docs/mcp.md +3 -3
  15. data/docs/memory.md +3 -3
  16. data/docs/security.md +17 -6
  17. data/docs/tools.md +45 -49
  18. data/docs/troubleshooting.md +1 -1
  19. data/exe/rubino +16 -2
  20. data/ext/landlock/extconf.rb +78 -0
  21. data/ext/landlock/landlock.c +253 -0
  22. data/install.sh +715 -54
  23. data/lib/rubino/active_agent.rb +73 -0
  24. data/lib/rubino/agent/action_claim_guard.rb +913 -0
  25. data/lib/rubino/agent/agent_registry.rb +5 -2
  26. data/lib/rubino/agent/definition.rb +4 -28
  27. data/lib/rubino/agent/fallback_chain.rb +0 -6
  28. data/lib/rubino/agent/iteration_budget.rb +109 -3
  29. data/lib/rubino/agent/loop.rb +664 -42
  30. data/lib/rubino/agent/model_call_runner.rb +81 -3
  31. data/lib/rubino/agent/prompts/build.txt +55 -7
  32. data/lib/rubino/agent/prompts/general.txt +8 -3
  33. data/lib/rubino/agent/response_validator.rb +8 -0
  34. data/lib/rubino/agent/runner.rb +307 -13
  35. data/lib/rubino/agent/tool_executor.rb +368 -31
  36. data/lib/rubino/agent/truncation_continuation.rb +11 -5
  37. data/lib/rubino/api/operations/approvals/decide_operation.rb +0 -4
  38. data/lib/rubino/api/operations/clarifications/decide_operation.rb +0 -4
  39. data/lib/rubino/api/operations/cron_jobs/create_operation.rb +0 -4
  40. data/lib/rubino/api/operations/cron_jobs/delete_operation.rb +0 -4
  41. data/lib/rubino/api/operations/cron_jobs/list_operation.rb +0 -4
  42. data/lib/rubino/api/operations/cron_jobs/pause_operation.rb +1 -5
  43. data/lib/rubino/api/operations/cron_jobs/resume_operation.rb +1 -5
  44. data/lib/rubino/api/operations/cron_jobs/show_operation.rb +0 -4
  45. data/lib/rubino/api/operations/cron_jobs/trigger_operation.rb +0 -4
  46. data/lib/rubino/api/operations/cron_jobs/update_operation.rb +0 -4
  47. data/lib/rubino/api/operations/files/read_operation.rb +1 -5
  48. data/lib/rubino/api/operations/files/upload_operation.rb +0 -4
  49. data/lib/rubino/api/operations/health_operation.rb +1 -5
  50. data/lib/rubino/api/operations/memory/delete_operation.rb +0 -4
  51. data/lib/rubino/api/operations/memory/index_operation.rb +0 -4
  52. data/lib/rubino/api/operations/memory/stats_operation.rb +0 -4
  53. data/lib/rubino/api/operations/metrics_operation.rb +1 -1
  54. data/lib/rubino/api/operations/mode/show_operation.rb +0 -4
  55. data/lib/rubino/api/operations/mode/update_operation.rb +0 -4
  56. data/lib/rubino/api/operations/models/list_operation.rb +0 -4
  57. data/lib/rubino/api/operations/oauth/connections/disconnect_operation.rb +0 -4
  58. data/lib/rubino/api/operations/oauth/connections/list_operation.rb +0 -4
  59. data/lib/rubino/api/operations/oauth/providers/callback_operation.rb +0 -4
  60. data/lib/rubino/api/operations/oauth/providers/connect_operation.rb +0 -4
  61. data/lib/rubino/api/operations/oauth/providers/list_operation.rb +0 -4
  62. data/lib/rubino/api/operations/runs/create_operation.rb +0 -4
  63. data/lib/rubino/api/operations/runs/events_operation.rb +0 -4
  64. data/lib/rubino/api/operations/runs/stop_operation.rb +0 -4
  65. data/lib/rubino/api/operations/sessions/create_operation.rb +0 -4
  66. data/lib/rubino/api/operations/sessions/delete_operation.rb +0 -4
  67. data/lib/rubino/api/operations/sessions/index_operation.rb +0 -4
  68. data/lib/rubino/api/operations/sessions/retry_operation.rb +0 -4
  69. data/lib/rubino/api/operations/sessions/show_operation.rb +0 -4
  70. data/lib/rubino/api/operations/sessions/undo_operation.rb +0 -4
  71. data/lib/rubino/api/operations/skills/list_operation.rb +0 -4
  72. data/lib/rubino/api/operations/skills/toggle_operation.rb +0 -4
  73. data/lib/rubino/api/operations/tasks/index_operation.rb +0 -4
  74. data/lib/rubino/api/operations/tasks/show_operation.rb +0 -4
  75. data/lib/rubino/api/operations/tasks/stop_operation.rb +0 -4
  76. data/lib/rubino/api/router.rb +2 -2
  77. data/lib/rubino/api/server.rb +19 -0
  78. data/lib/rubino/attachments/policy.rb +8 -0
  79. data/lib/rubino/attachments/preamble.rb +16 -8
  80. data/lib/rubino/boot/config_guard.rb +71 -0
  81. data/lib/rubino/cli/chat/completion_builder.rb +44 -8
  82. data/lib/rubino/cli/chat/idle_card_host.rb +7 -1
  83. data/lib/rubino/cli/chat/session_resolver.rb +186 -50
  84. data/lib/rubino/cli/chat_command.rb +1724 -91
  85. data/lib/rubino/cli/commands.rb +373 -1
  86. data/lib/rubino/cli/config_command.rb +118 -11
  87. data/lib/rubino/cli/doctor_command.rb +268 -23
  88. data/lib/rubino/cli/jobs_command.rb +42 -3
  89. data/lib/rubino/cli/memory_command.rb +76 -23
  90. data/lib/rubino/cli/onboarding_wizard.rb +85 -7
  91. data/lib/rubino/cli/server_command.rb +43 -1
  92. data/lib/rubino/cli/session_command.rb +272 -18
  93. data/lib/rubino/cli/setup_command.rb +293 -8
  94. data/lib/rubino/cli/skills_command.rb +88 -20
  95. data/lib/rubino/cli/trust_gate.rb +16 -7
  96. data/lib/rubino/commands/built_ins.rb +4 -2
  97. data/lib/rubino/commands/command.rb +12 -2
  98. data/lib/rubino/commands/executor.rb +161 -19
  99. data/lib/rubino/commands/handlers/agent_switch.rb +100 -0
  100. data/lib/rubino/commands/handlers/agents.rb +324 -60
  101. data/lib/rubino/commands/handlers/config.rb +8 -1
  102. data/lib/rubino/commands/handlers/display.rb +50 -0
  103. data/lib/rubino/commands/handlers/help.rb +106 -14
  104. data/lib/rubino/commands/handlers/mcp.rb +7 -32
  105. data/lib/rubino/commands/handlers/memory.rb +23 -38
  106. data/lib/rubino/commands/handlers/sessions.rb +70 -33
  107. data/lib/rubino/commands/handlers/skills.rb +47 -28
  108. data/lib/rubino/commands/handlers/status.rb +65 -10
  109. data/lib/rubino/commands/loader.rb +12 -0
  110. data/lib/rubino/compression/compression_result.rb +35 -0
  111. data/lib/rubino/compression/compressor.rb +109 -0
  112. data/lib/rubino/compression/content_router.rb +240 -0
  113. data/lib/rubino/compression/diff_compressor.rb +252 -0
  114. data/lib/rubino/compression/javascript_code_skeleton.rb +15 -0
  115. data/lib/rubino/compression/json_compressor.rb +274 -0
  116. data/lib/rubino/compression/line_skeleton.rb +92 -0
  117. data/lib/rubino/compression/log_compressor.rb +299 -0
  118. data/lib/rubino/compression/python_code_skeleton.rb +122 -0
  119. data/lib/rubino/compression/ruby_code_skeleton.rb +80 -0
  120. data/lib/rubino/compression/tree_sitter_code_skeleton.rb +118 -0
  121. data/lib/rubino/compression/tsx_code_skeleton.rb +15 -0
  122. data/lib/rubino/compression/typescript_code_skeleton.rb +15 -0
  123. data/lib/rubino/config/configuration.rb +151 -105
  124. data/lib/rubino/config/defaults.rb +369 -41
  125. data/lib/rubino/config/loader.rb +71 -13
  126. data/lib/rubino/config/reasoning_prefs.rb +23 -0
  127. data/lib/rubino/config/validator.rb +384 -0
  128. data/lib/rubino/config/writer.rb +123 -31
  129. data/lib/rubino/context/compressor.rb +185 -23
  130. data/lib/rubino/context/file_discovery.rb +0 -8
  131. data/lib/rubino/context/message_boundary.rb +26 -5
  132. data/lib/rubino/context/project_languages.rb +83 -0
  133. data/lib/rubino/context/prompt_assembler.rb +110 -22
  134. data/lib/rubino/context/summary_builder.rb +77 -27
  135. data/lib/rubino/context/token_budget.rb +38 -13
  136. data/lib/rubino/context/token_estimate.rb +45 -0
  137. data/lib/rubino/context/tool_result_pruner.rb +81 -0
  138. data/lib/rubino/database/connection.rb +154 -3
  139. data/lib/rubino/database/migrations/001_create_initial_schema.rb +314 -40
  140. data/lib/rubino/database/migrator.rb +81 -14
  141. data/lib/rubino/documents/cap_exceeded.rb +13 -0
  142. data/lib/rubino/documents/converters/csv.rb +4 -3
  143. data/lib/rubino/documents/converters/docx.rb +29 -5
  144. data/lib/rubino/documents/converters/html.rb +5 -1
  145. data/lib/rubino/documents/converters/json.rb +2 -1
  146. data/lib/rubino/documents/converters/pdf.rb +11 -2
  147. data/lib/rubino/documents/converters/plain.rb +2 -1
  148. data/lib/rubino/documents/converters/pptx.rb +11 -2
  149. data/lib/rubino/documents/converters/xlsx.rb +35 -4
  150. data/lib/rubino/documents/converters/xml.rb +2 -1
  151. data/lib/rubino/documents/limits.rb +210 -0
  152. data/lib/rubino/documents.rb +10 -3
  153. data/lib/rubino/errors.rb +36 -5
  154. data/lib/rubino/files/workspace.rb +2 -2
  155. data/lib/rubino/interaction/cancel_token.rb +19 -3
  156. data/lib/rubino/interaction/events.rb +13 -3
  157. data/lib/rubino/interaction/input_queue.rb +11 -0
  158. data/lib/rubino/interaction/lifecycle.rb +238 -33
  159. data/lib/rubino/interaction/polishing.rb +184 -0
  160. data/lib/rubino/interaction/probe.rb +1 -1
  161. data/lib/rubino/jobs/cron_job_repository.rb +5 -12
  162. data/lib/rubino/jobs/handlers/cleanup_sessions_job.rb +11 -0
  163. data/lib/rubino/jobs/handlers/distill_skill_job.rb +67 -21
  164. data/lib/rubino/jobs/queue.rb +133 -13
  165. data/lib/rubino/jobs/runner.rb +24 -6
  166. data/lib/rubino/jobs/worker.rb +1 -5
  167. data/lib/rubino/llm/adapter_factory.rb +1 -1
  168. data/lib/rubino/llm/adapter_response.rb +47 -4
  169. data/lib/rubino/llm/auxiliary_client.rb +63 -3
  170. data/lib/rubino/llm/cache_breakpoint_middleware.rb +194 -0
  171. data/lib/rubino/llm/credential_check.rb +76 -20
  172. data/lib/rubino/llm/error_classifier.rb +186 -77
  173. data/lib/rubino/llm/fake_provider.rb +3 -3
  174. data/lib/rubino/llm/inline_think_filter.rb +103 -15
  175. data/lib/rubino/llm/reasoning_manager.rb +3 -26
  176. data/lib/rubino/llm/request.rb +26 -15
  177. data/lib/rubino/llm/ruby_llm_adapter.rb +623 -67
  178. data/lib/rubino/llm/scenario_loader.rb +10 -17
  179. data/lib/rubino/llm/scenarios/glued-table-prose.yml +36 -0
  180. data/lib/rubino/llm/scenarios/growing-table.yml +49 -0
  181. data/lib/rubino/llm/scenarios/narrow-terminal-table.yml +47 -0
  182. data/lib/rubino/llm/scenarios/streamed-table.yml +55 -0
  183. data/lib/rubino/llm/scenarios/table-then-prose.yml +34 -0
  184. data/lib/rubino/llm/scenarios/too-wide-table.yml +47 -0
  185. data/lib/rubino/llm/scenarios/wide-table.yml +1 -1
  186. data/lib/rubino/llm/thinking_support.rb +17 -12
  187. data/lib/rubino/llm/tool_bridge.rb +200 -32
  188. data/lib/rubino/mcp/manager.rb +71 -10
  189. data/lib/rubino/mcp/mcp_tool_wrapper.rb +38 -3
  190. data/lib/rubino/memory/aux_retry.rb +107 -0
  191. data/lib/rubino/memory/backends/sqlite.rb +104 -67
  192. data/lib/rubino/memory/backends.rb +26 -10
  193. data/lib/rubino/memory/deduplicator.rb +22 -0
  194. data/lib/rubino/memory/flusher.rb +35 -1
  195. data/lib/rubino/memory/salience_gate.rb +129 -0
  196. data/lib/rubino/memory/sqlite_extraction.rb +70 -0
  197. data/lib/rubino/memory/sqlite_extraction_prompt.rb +16 -1
  198. data/lib/rubino/memory/store.rb +48 -20
  199. data/lib/rubino/memory/threat_scanner.rb +60 -0
  200. data/lib/rubino/memory.rb +47 -0
  201. data/lib/rubino/oauth/provider.rb +0 -5
  202. data/lib/rubino/output/cost.rb +52 -0
  203. data/lib/rubino/output/headless_block_latch.rb +53 -0
  204. data/lib/rubino/output/result_serializer.rb +222 -0
  205. data/lib/rubino/output/turn_recorder.rb +77 -0
  206. data/lib/rubino/run/event_store.rb +1 -6
  207. data/lib/rubino/run/repository.rb +0 -14
  208. data/lib/rubino/security/approval_policy.rb +314 -33
  209. data/lib/rubino/security/command_allowlist.rb +79 -4
  210. data/lib/rubino/security/command_normalizer.rb +36 -0
  211. data/lib/rubino/security/dangerous_patterns.rb +17 -4
  212. data/lib/rubino/security/doom_loop_detector.rb +21 -2
  213. data/lib/rubino/security/hardline_guard.rb +190 -16
  214. data/lib/rubino/security/pattern_matcher.rb +28 -5
  215. data/lib/rubino/security/prefix_deriver.rb +25 -6
  216. data/lib/rubino/security/readonly_commands.rb +442 -18
  217. data/lib/rubino/security/redactor.rb +272 -0
  218. data/lib/rubino/security/sandbox.rb +460 -0
  219. data/lib/rubino/security/secret_detector.rb +110 -0
  220. data/lib/rubino/security/secret_path.rb +263 -0
  221. data/lib/rubino/security/url_safety.rb +255 -0
  222. data/lib/rubino/session/lock.rb +91 -0
  223. data/lib/rubino/session/message.rb +38 -3
  224. data/lib/rubino/session/picker.rb +95 -0
  225. data/lib/rubino/session/repository.rb +249 -31
  226. data/lib/rubino/session/store.rb +135 -21
  227. data/lib/rubino/skills/installer.rb +116 -32
  228. data/lib/rubino/skills/prompt_index.rb +2 -2
  229. data/lib/rubino/skills/registry.rb +56 -6
  230. data/lib/rubino/skills/skill.rb +94 -12
  231. data/lib/rubino/skills/skill_tool.rb +21 -25
  232. data/lib/rubino/skills/state_repository.rb +0 -4
  233. data/lib/rubino/tools/background_tasks.rb +299 -47
  234. data/lib/rubino/tools/base.rb +219 -4
  235. data/lib/rubino/tools/edit_tool.rb +116 -31
  236. data/lib/rubino/tools/fuzzy_match.rb +212 -0
  237. data/lib/rubino/tools/glob_tool.rb +52 -9
  238. data/lib/rubino/tools/grep_tool.rb +71 -11
  239. data/lib/rubino/tools/multi_edit_tool.rb +88 -20
  240. data/lib/rubino/tools/patch_tool.rb +56 -10
  241. data/lib/rubino/tools/probe_tool.rb +0 -20
  242. data/lib/rubino/tools/question_tool.rb +54 -2
  243. data/lib/rubino/tools/read_attachment_tool.rb +24 -12
  244. data/lib/rubino/tools/read_tool.rb +159 -35
  245. data/lib/rubino/tools/read_tracker.rb +189 -35
  246. data/lib/rubino/tools/registry.rb +151 -31
  247. data/lib/rubino/tools/result.rb +48 -9
  248. data/lib/rubino/tools/retrieve_output_tool.rb +70 -0
  249. data/lib/rubino/tools/ruby_tool.rb +0 -0
  250. data/lib/rubino/tools/shell_kill_tool.rb +6 -2
  251. data/lib/rubino/tools/shell_output_tool.rb +7 -1
  252. data/lib/rubino/tools/shell_registry.rb +229 -5
  253. data/lib/rubino/tools/shell_tail_tool.rb +6 -1
  254. data/lib/rubino/tools/shell_tool.rb +523 -54
  255. data/lib/rubino/tools/steer_tool.rb +2 -21
  256. data/lib/rubino/tools/subagent_probe.rb +1 -1
  257. data/lib/rubino/tools/summarize_file_tool.rb +12 -0
  258. data/lib/rubino/tools/task_result_tool.rb +8 -2
  259. data/lib/rubino/tools/task_stop_tool.rb +15 -22
  260. data/lib/rubino/tools/task_tool.rb +229 -104
  261. data/lib/rubino/tools/vision_tool.rb +37 -4
  262. data/lib/rubino/tools/webfetch_tool.rb +184 -7
  263. data/lib/rubino/tools/websearch_tool.rb +92 -30
  264. data/lib/rubino/tools/write_tool.rb +24 -5
  265. data/lib/rubino/ui/agent_menu.rb +179 -0
  266. data/lib/rubino/ui/api.rb +12 -3
  267. data/lib/rubino/ui/base.rb +13 -2
  268. data/lib/rubino/ui/bottom_composer.rb +1483 -203
  269. data/lib/rubino/ui/cli.rb +1340 -272
  270. data/lib/rubino/ui/completion_menu.rb +35 -50
  271. data/lib/rubino/ui/composer/input_line.rb +131 -0
  272. data/lib/rubino/ui/composer/subagent_panel.rb +35 -0
  273. data/lib/rubino/ui/headless_trace.rb +63 -0
  274. data/lib/rubino/ui/input_history.rb +90 -5
  275. data/lib/rubino/ui/live_region.rb +82 -7
  276. data/lib/rubino/ui/markdown_renderer.rb +214 -17
  277. data/lib/rubino/ui/menu_view.rb +117 -0
  278. data/lib/rubino/ui/notifier.rb +0 -2
  279. data/lib/rubino/ui/null.rb +53 -6
  280. data/lib/rubino/ui/paste_store.rb +49 -3
  281. data/lib/rubino/ui/printer_base.rb +135 -8
  282. data/lib/rubino/ui/queued_indicators.rb +6 -1
  283. data/lib/rubino/ui/status_bar.rb +61 -7
  284. data/lib/rubino/ui/streaming_markdown.rb +148 -6
  285. data/lib/rubino/ui/subagent_cards.rb +126 -25
  286. data/lib/rubino/ui/tool_label.rb +52 -0
  287. data/lib/rubino/update_check.rb +39 -4
  288. data/lib/rubino/util/atomic_file.rb +129 -0
  289. data/lib/rubino/util/duration.rb +8 -5
  290. data/lib/rubino/util/ignore_rules.rb +120 -0
  291. data/lib/rubino/util/output.rb +275 -13
  292. data/lib/rubino/util/secrets_mask.rb +70 -7
  293. data/lib/rubino/util/spill_store.rb +153 -0
  294. data/lib/rubino/version.rb +7 -1
  295. data/lib/rubino/workspace.rb +74 -3
  296. data/lib/rubino.rb +216 -25
  297. data/rubino-agent.gemspec +28 -1
  298. data/skills/ruby-expert/SKILL.md +1 -0
  299. metadata +116 -29
  300. data/docs/plugins.md +0 -195
  301. data/lib/rubino/agent/router.rb +0 -65
  302. data/lib/rubino/database/migrations/002_create_runs.rb +0 -45
  303. data/lib/rubino/database/migrations/003_create_skill_states.rb +0 -15
  304. data/lib/rubino/database/migrations/004_create_cron_jobs.rb +0 -36
  305. data/lib/rubino/database/migrations/005_create_oauth_connections.rb +0 -27
  306. data/lib/rubino/database/migrations/006_create_webhook_deliveries.rb +0 -34
  307. data/lib/rubino/database/migrations/007_create_messages_fts.rb +0 -59
  308. data/lib/rubino/database/migrations/008_create_memory_facts.rb +0 -75
  309. data/lib/rubino/database/migrations/009_create_memory_graph.rb +0 -55
  310. data/lib/rubino/database/migrations/010_add_owner_pid_to_sessions.rb +0 -20
  311. data/lib/rubino/interaction/state.rb +0 -56
  312. data/lib/rubino/memory/backends/default.rb +0 -101
  313. data/lib/rubino/memory/extractor.rb +0 -85
  314. data/lib/rubino/memory/retriever.rb +0 -50
  315. data/lib/rubino/plugins/registry.rb +0 -75
  316. data/lib/rubino/plugins.rb +0 -86
  317. data/lib/rubino/tools/answer_child_tool.rb +0 -83
  318. data/lib/rubino/tools/ask_parent_tool.rb +0 -232
  319. data/lib/rubino/tools/git_tool.rb +0 -71
  320. data/lib/rubino/tools/github_tool.rb +0 -233
  321. data/lib/rubino/tools/test_tool.rb +0 -454
  322. data/lib/rubino/ui/subagent_view.rb +0 -266
data/README.md CHANGED
@@ -5,12 +5,62 @@ A coding & automation **agent** — small, self-contained, and built to run *whe
  ## Why rubino

  - **Runs where the work is** — a single gem on the machine (or VM) that holds the code, not a remote service you pipe files to.
- - **Persistent memory** — a tiny SQLite "Zep"-style fact store that learns about you and the project across sessions.
+ - **Persistent memory** — a tiny SQLite fact store that learns about you and the project across sessions.
  - **Context compaction** — automatic compression with session lineage when the conversation outgrows the window.
  - **CLI *and* HTTP API** — an interactive terminal session for humans, a bearer-protected JSON + SSE API for programs.
- - **Real tools, gated** — read/write/edit, shell, ruby, git/github, grep/glob, a structured test runner, vision, and more, behind an approval model with a non-bypassable hardline floor.
+ - **Real tools, gated** — read/write/edit, shell, ruby, grep/glob, apply_patch, vision, and more (git, GitHub, and tests run through the hardened shell), behind an approval model with a non-bypassable hardline floor.
  - **Built on ruby_llm** — provider-agnostic: MiniMax, OpenAI, Anthropic, Gemini, or an OpenAI-compatible gateway.

+ ## Cache-friendly compaction (measured)
+
+ A long agent session only stays cheap if the cached prompt prefix survives
+ compaction. rubino is built so that when the conversation is compressed into a
+ summary, the summary lands *after* the cached head (system + tools + stable
+ history) — so the provider's prompt cache keeps **hitting** the head instead of
+ re-encoding it cold every time the session is compacted.
+
+ Measured with the model held fixed (local oMLX `Qwen3.6-35B-A3B`,
+ Anthropic-style `cache_control`) on a 25-turn coding session that triggers
+ compaction **9 times**:
+
+ | metric | rubino |
+ |---|---|
+ | cached prefix retained right after each compaction | **44–94%** (survives — never resets to 0) |
+ | cumulative cache-read over the whole session | **88%** |
+ | prefix byte-stability across turns | **0.95** |
+ | task solved through all 9 compactions | **10/10** hidden tests, 0 wasted work |
+
+ Holding the model fixed isolates the **engine** — any difference is the
+ scaffolding (prompt assembly, where the compaction summary is placed, cache
+ breakpoints), not the model. This is a single model and a single scenario:
+ indicative of the design, not a leaderboard. The harness lives in a separate
+ benchmark project.
+
+ ## Tool-output compression (measured)
+
+ Test logs, diffs and large command dumps are mostly noise. rubino can route
+ each tool output through a **deterministic (no-ML)** compressor that keeps the
+ signal and drops the rest — opt-in (`tool_output_compression`), with a
+ byte-identical passthrough for anything already small and a `retrieve_output`
+ pointer back to the full text. Token-honest: counts are the **exact**
+ `prompt_tokens` reported by the server (local oMLX `Qwen3.6-35B-A3B`), not
+ chars/4 estimates.
+
+ | tool output | reduction | fidelity (verified) |
+ |---|---:|---|
+ | rspec full suite (21 failures, ~8k lines) | **97%** | all 21 failures + the tally kept |
+ | `git log --stat` / `ls -R` | **94%** | boundary/keyword lines kept |
+ | large source diff (9 files) | **42%** | all 575 ± lines, 13 hunks, 9 headers |
+ | `package-lock.json` diff (60 bumps) | **99%** | file header + summary (body elided) |
+ | whole-file Ruby read → skeleton | **27%** | signatures + structure kept |
+ | JSON (kubectl / docker / gh, uniform rows) | **40–88%** | error rows + outliers always kept |
+ | rubocop (already signal-dense) | 11% | floor — every offense kept |
+
+ End-to-end A/B on real edit tasks: **12/12 tasks passed with compression ON and
+ OFF** — it never broke a task, and every forced-failure run still recovered the
+ single failing line out of a long log. Routing is verified (each output goes to
+ the right strategy) and small inputs pass through **byte-identical**.
+
  ## Install

  One line, Linux and macOS (x86_64 / arm64). Installs a compatible Ruby, then the gem — all in user space, no sudo:
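As a rough illustration of the tool-output compression section added in this hunk, the sketch below shows the two behaviours it describes: a byte-identical passthrough for small outputs and a deterministic, pattern-based filter for large logs. It is a minimal sketch, not rubino's compressor: `compress_log`, the `SIGNAL` pattern, and the 2,000-byte threshold are all assumptions made for the example, and the `retrieve_output` pointer back to the full text is left out.

```ruby
# Hypothetical sketch of a deterministic (no-ML) log compressor.
# The pattern and the threshold are assumptions, not rubino's routing rules.
SIGNAL = /fail|error|\d+ examples?/i
MIN_BYTES = 2_000 # assumed passthrough threshold

def compress_log(text, min_bytes: MIN_BYTES)
  # Small outputs are returned untouched, so they stay byte-identical.
  return text if text.bytesize < min_bytes

  kept = []
  dropped = 0
  text.each_line(chomp: true) do |line|
    if line.match?(SIGNAL)
      # Flush a marker for the run of noise lines that preceded this one.
      kept << "[... #{dropped} lines elided ...]" if dropped.positive?
      dropped = 0
      kept << line
    else
      dropped += 1
    end
  end
  kept << "[... #{dropped} lines elided ...]" if dropped.positive?
  kept.join("\n")
end

noise = Array.new(500) { |i| "  spec #{i} passed in 0.01s" }
log = (noise + ["Failure/Error: expect(x).to eq(1)"] + noise +
       ["1001 examples, 1 failure"]).join("\n")
puts compress_log(log)
```

Because the filter is a fixed pattern match, the same input always yields the same output, which is the property the README calls deterministic. A real router would pick a different strategy per content type (diff, JSON, source file) rather than one regex.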
@@ -19,7 +69,15 @@ One line, Linux and macOS (x86_64 / arm64). Installs a compatible Ruby, then the
  curl -fsSL https://raw.githubusercontent.com/Jhonnyr97/rubino-agent/main/install.sh | bash
  ```

- On **Linux** the installer fetches a precompiled Ruby via [`rv`](https://github.com/spinel-coop/rv). On **macOS**, if [Homebrew](https://brew.sh) is present it asks whether to use Homebrew (`brew install ruby`) or `rv`; without Homebrew it uses `rv` directly. Skip the prompt with `RUBINO_INSTALL_METHOD=brew` or `=rv`.
+ The installer supports **three** methods for getting a compatible Ruby + the gem:
+
+ - **`rv`** ([`rv`](https://github.com/spinel-coop/rv)) — fetches a precompiled Ruby into user space.
+ - **Homebrew** (`brew install ruby`) — offered on **macOS** when [Homebrew](https://brew.sh) is present.
+ - **`mise`** ([mise](https://mise.jdx.dev)) — a polyglot tool manager; installs `rubino` via its `gem:` backend and pins the latest published gem version.
+
+ On **macOS** (interactive) you're asked to pick Homebrew / `rv` / `mise`; on **Linux** (interactive) you pick `rv` / `mise` (Homebrew is offered only if `brew` is already on PATH). Skip the prompt with `RUBINO_INSTALL_METHOD=brew`, `=rv`, or `=mise`. For the **mise** method, `RUBINO_INSTALL_SCOPE=global` (default, user-wide `~/.config/mise/config.toml`) or `=local` (this directory only, `./mise.toml`) chooses the scope.
+
+ On **Debian 12 / old-glibc** systems `rv` would install a musl Ruby this glibc box can't execute; the installer detects that and **steers you from `rv` to `mise`** (precompiled, glibc-correct) so you don't land on a broken `rubino`.

  > **Review before you pipe.** Piping a script into your shell runs whatever it contains. Read it first:
  > ```bash
@@ -27,7 +85,7 @@ On **Linux** the installer fetches a precompiled Ruby via [`rv`](https://github.
  > less install.sh && bash install.sh
  > ```

- The installer is idempotent — safe to re-run and prints the exact `PATH` line for the `rubino` executable plus the next step.
+ The installer is idempotent — safe to re-run. It **persists the activation / `PATH` line to your shell rc** (`.zshrc` / `.bashrc` / `.profile`) and then runs a **fresh-shell verification gate** — it opens a clean login shell and fails loudly if `rubino` isn't on `PATH` there, instead of merely printing a hint you might miss. Opt out of any rc modification with `RUBINO_NO_MODIFY_RC=1` (the installer then prints the line for you to add yourself).

  **Manual install** (if you'd rather not pipe, or already manage Ruby yourself):

@@ -99,11 +157,11 @@ model:

  agent:
    max_turns: 90
-   max_tool_iterations: 8
+   max_tool_iterations: 25

  memory:
    enabled: true
-   backend: "sqlite" # tiny-Zep FTS5 + graph-lite recall (default)
+   backend: "sqlite" # SQLite FTS5 + graph-lite recall (default)
    auto_extract: true

  compression:
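The cache-friendly compaction section in the first hunk of this README diff hinges on where the summary is placed. The following is a minimal sketch of that idea, assuming a prompt is just an ordered list of message strings; `assemble_prompt` and `shared_prefix_ratio` are hypothetical helpers for illustration, not rubino APIs, and real provider caches work on tokens rather than whole messages.

```ruby
# Hypothetical sketch: not rubino's prompt assembler.
# A prompt is modelled as an ordered array of message strings.

# Build the post-compaction prompt: the stable head is left untouched and
# the summary replaces only the older middle of the conversation.
def assemble_prompt(head:, summary:, recent:)
  head + ["[summary] #{summary}"] + recent
end

# Fraction of the previous prompt that the new prompt still starts with,
# i.e. the part a prefix cache could reuse.
def shared_prefix_ratio(previous, current)
  return 0.0 if previous.empty?

  shared = previous.zip(current).take_while { |a, b| a == b }.length
  shared.to_f / previous.length
end

head   = ["system: you are a coding agent", "tools: read, edit, shell"]
middle = ["user: fix the bug", "assistant: reading files", "tool: long output"]
recent = ["user: now add a test"]

before = head + middle + recent
after  = assemble_prompt(head: head, summary: "bug fixed in parser", recent: recent)
summary_first = ["[summary] bug fixed in parser"] + head + recent

puts shared_prefix_ratio(before, after)         # head is still a shared prefix
puts shared_prefix_ratio(before, summary_first) # first message changed
```

Placing the summary before the head changes the very first message, so nothing is shared with the previous prompt; that is the cold re-encode the README says the design avoids.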
@@ -118,7 +176,7 @@ tools:
    git: true
    shell: true # ON by default; every command is still approval-gated
    ruby: true
-   web: false # gates BOTH webfetch and websearch
+   web: true # ON by default (keyless DuckDuckGo backend); gates BOTH webfetch and websearch
    memory: true
  ```

@@ -134,7 +192,7 @@ Full reference (every key, env vars, precedence): **[docs/configuration.md](docs
  - **[Configuration](docs/configuration.md)** — full config + env vars + precedence
  - **[Tools](docs/tools.md)** — the built-in tool set and approval behavior
  - **[Skills](docs/skills.md)** — reusable instruction packs, the 3-level disclosure, and `SKILL_LOADED` observability
- - **[Memory](docs/memory.md)** — the SQLite tiny-Zep backend
+ - **[Memory](docs/memory.md)** — the SQLite memory backend
  - **[Security](docs/security.md)** — approval model, hardline floor, TLS
  - **[Troubleshooting](docs/troubleshooting.md)** — keyed on the exact error strings
  - **[HTTP API](docs/api/v1.md)** · **[Jobs & cron](docs/jobs.md)** · **[OAuth providers](docs/oauth-providers.md)** · **[Architecture](docs/architecture.md)**
@@ -142,7 +200,7 @@ Full reference (every key, env vars, precedence): **[docs/configuration.md](docs

  ## Built-in tools

- The agent ships **33 built-in tools**: `read`, `summarize_file`, `write`, `edit`, `multi_edit`, `grep`, `glob`, `git`, `github`, `shell`, `shell_output`, `shell_tail`, `shell_input`, `shell_kill`, `ruby`, `run_tests`, `apply_patch`, `webfetch`, `websearch`, `question`, `todowrite`, `memory`, `session_search`, `attach_file`, `vision`, `skill`, `task`, `task_result`, `task_stop`, `ask_parent`, `steer`, `probe`, `answer_child`. Each is gated by a `tools.<key>` config flag (opt-out) and the approval model. See **[docs/tools.md](docs/tools.md)**.
+ The agent ships **27 built-in tools** (the set `rubino tools` lists): `read`, `read_attachment`, `summarize_file`, `write`, `edit`, `multi_edit`, `apply_patch`, `grep`, `glob`, `git`, `github`, `shell`, `shell_output`, `shell_tail`, `shell_input`, `shell_kill`, `ruby`, `run_tests`, `web`, `question`, `todowrite`, `memory`, `session_search`, `attach_file`, `vision`, `skill`, `task`. A single `web` tool gates both fetching a URL and searching (config key `tools.web`, on by default via the keyless DuckDuckGo backend; it degrades gracefully when no search backend is reachable). Each tool is gated by a `tools.<key>` config flag (opt-out) and the approval model. See **[docs/tools.md](docs/tools.md)**.

  ## Skills

@@ -183,13 +241,13 @@ These are designed-in but not fully wired yet — don't depend on them in produc

  - **MCP Support** — connect to Model Context Protocol servers via [ruby_llm-mcp](https://github.com/patvice/ruby_llm-mcp) ([docs/mcp.md](docs/mcp.md)).
  - **Multi-Agent** — Build / Plan / Explore agents with `@mention` routing ([docs/agents.md](docs/agents.md)).
- - **Plugin Hooks** — event hooks for extending behavior ([docs/plugins.md](docs/plugins.md)).

  ## Development

  ```bash
  bundle install
- bundle exec rspec # run tests
+ bundle exec rspec # run tests (sequential, with coverage)
+ bundle exec rake parallel:spec # run tests across all CPU cores
  bundle exec rubino doctor # verify setup
  ```

data/Rakefile CHANGED
@@ -6,3 +6,51 @@ require "rspec/core/rake_task"
  RSpec::Core::RakeTask.new(:spec)

  task default: :spec
+
+ # API documentation. `rake rdoc` regenerates the HTML API docs into doc/rdoc
+ # (gitignored); the .github/workflows/docs.yml workflow publishes the same
+ # output to GitHub Pages. Guarded so the Rakefile still loads if the `rdoc`
+ # default gem is somehow absent.
+ begin
+   require "rdoc/task"
+
+   RDoc::Task.new(:rdoc) do |rdoc|
+     rdoc.rdoc_dir = "doc/rdoc"
+     rdoc.main = "README.md"
+     rdoc.title = "rubino-agent API documentation"
+     rdoc.rdoc_files.include("lib/**/*.rb", "exe/*", "README.md", "CHANGELOG.md", "docs/*.md")
+   end
+ rescue LoadError
+   # `rdoc` unavailable -> the `rake rdoc` task is simply not defined.
+ end
+
+ # Parallel test execution across CPU cores via the `parallel_tests` gem.
+ #
+ #   rake parallel:spec      # auto: one worker per core
+ #   rake parallel:spec[4]   # force 4 workers
+ #
+ # Each worker is its own process with a distinct TEST_ENV_NUMBER, so the
+ # per-process isolation already baked into spec/spec_helper.rb (RUBINO_HOME,
+ # document fixtures, example-status file) keeps workers from colliding.
+ # SimpleCov is skipped in parallel (workers would race the resultset); run the
+ # plain sequential `rake spec` / `bundle exec rspec` for a coverage report.
+ #
+ # Balancing: we use parallel_tests' default **filesize** grouping rather than
+ # runtime grouping. Runtime grouping (`--group-by runtime`) is strict — it
+ # aborts with RuntimeLogTooSmallError whenever the recorded log is missing an
+ # entry for any current spec file (i.e. the first run after ANY new spec is
+ # added), which makes the entrypoint brittle. The wall-clock floor here is a
+ # single ~70s example (agent_e2e error-retry) that cannot be split across
+ # workers regardless of grouping, so filesize grouping already lands the
+ # longest worker on essentially that floor while staying deterministic and
+ # never breaking on a freshly-added spec.
+ namespace :parallel do
+   desc "Run the RSpec suite in parallel across CPU cores (rake parallel:spec[N])"
+   task :spec, [:count] do |_t, args|
+     count = args[:count]
+     cmd = %w[bundle exec parallel_rspec]
+     cmd += ["-n", count.to_s] if count && !count.empty?
+     cmd += ["--", "spec"]
+     sh(*cmd)
+   end
+ end
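The argument handling in the `parallel:spec` task above can be pulled out into a pure function to see which command line each invocation produces, without running anything. This is only a sketch: `parallel_rspec_command` is a hypothetical helper name, and it calls `count.to_s` before `empty?` so that an Integer is accepted as well as the String that Rake passes.

```ruby
# Hypothetical helper mirroring the command built by the parallel:spec task.
# It only builds the argument list; nothing is executed.
def parallel_rspec_command(count = nil)
  cmd = %w[bundle exec parallel_rspec]
  # An absent or empty count leaves the worker count to parallel_tests.
  cmd += ["-n", count.to_s] if count && !count.to_s.empty?
  cmd += ["--", "spec"]
  cmd
end

p parallel_rspec_command       # rake parallel:spec
p parallel_rspec_command("4")  # rake parallel:spec[4]
```

Building the command as an array and passing it to `sh(*cmd)` avoids a shell string, so the count is handed over as a single argument and is never interpolated into a command line.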
data/docs/agents.md CHANGED
@@ -1,15 +1,22 @@
1
1
  # Agents & Subagents
2
2
 
3
- rubino has two distinct multi-agent surfaces. Only the first one ships today:
3
+ rubino has two distinct multi-agent surfaces, and **both ship today**:
4
4
 
5
5
  1. **Background subagents** (✅ shipping) — the agent delegates bounded sub-tasks
6
6
  to isolated subagent runs via its `task` tool, and you supervise them with
7
- `/agents` and `/reply`. This is the surface you will actually use.
8
- 2. **Primary-agent switching** ( not yet wired) — Tab-cycling between primary
9
- agents and `@mention` routing. The machinery exists (`Agent::Router`,
10
- `Agent::Definition`, `AgentRegistry`) but no call site passes an agent
11
- definition yet, so the default (build) agent handles every turn. See
12
- [the last section](#planned-primary-agent-switching--mentions-not-yet-wired).
7
+ `/agents` and `/reply`.
8
+ 2. **Primary-agent switching** (✅ shipping) — pick the primary agent that
9
+ handles your turns: `/agent <name>` (or a bare `/<name>` for a primary)
10
+ pins it for the session, **Tab** cycles through the primaries, and a one-shot
11
+ `/<name> <message>` routes a single message to any agent. The selected agent's
12
+ Definition (its system prompt and tool scope) is threaded into the runner each
13
+ turn, so the choice actually changes the model's persona/tools. See
14
+ [Primary-agent switching](#primary-agent-switching) below.
15
+
16
+ > **Channels are cleanly separated:** `@` is the **workspace file** picker
17
+ > (`@path/to/file`), and `/` is the **agent/command** channel. There are no
18
+ > `@mention` agent routes — a filename like `@explore.rb` is always a file, never
19
+ > an agent. Use `/explore`, `/plan`, etc. to reach an agent.
13
20
 
14
21
  ---
15
22
 
@@ -61,15 +68,21 @@ message instead of fanning out unbounded work:
61
68
  | Glyph | Status | Meaning | You act via |
62
69
  |---|---|---|---|
63
70
  | `●` | `running` | Working (last activity shown) | — |
64
- | `●` | `needs_approval` | A child tool needs your approval | `/agents <id>` |
65
- | `⛔` | `blocked_on_human` | Asked a question only YOU can answer (`ask_parent` escalated to the human) | `/reply <id> <answer>` |
66
- | `◷` | `blocked_on_parent` | Asked its agent-parent a question the PARENT MODEL answers (`answer_child`); not your job unless you choose to step in with `/reply` | (optional) `/reply <id>` |
71
+ | `●` | `needs_approval` | A child tool needs your approval (or a budget request) | `/agents <id>` or `/reply <id>` |
72
+ | `⛔` | `blocked_on_human` | Vocabulary glyph for a child parked on the human (not raised in normal operation now that subagents are non-blocking) | `/reply <id> <answer>` |
73
+ | `◷` | `blocked_on_parent` | Vocabulary glyph for a child parked on its agent-parent (likewise not raised now that subagents are non-blocking) | (optional) `/reply <id>` |
67
74
  | `◌` | `stopping` | Stop requested; unwinding at its next checkpoint | — |
68
75
  | `✓` | `done` | Finished; result available | `/agents <id>` |
69
76
  | `✗` | `failed` | Errored; error available | `/agents <id>` |
70
- | `⊘` | `stopped` | Cancelled by you (`--stop`); blocked descendants unwound; tools that completed before the stop may have left side effects | `/agents <id>` |
77
+ | `⊘` | `stopped` | Cancelled by you (`--stop`); descendants unwound; tools that completed before the stop may have left side effects | `/agents <id>` |
71
78
 
72
- A `⛔ N subagent waiting on you` marker persists until you `/reply`.
79
+ Subagents are **non-blocking** background workers: they never pause to ask you a
80
+ mid-task question. The one way a child waits on you is an **approval** — its next
81
+ tool needs your go-ahead, so it parks as `needs_approval` and a marker persists
82
+ until you resolve it (via `/agents <id>` or `/reply <id>`). The `⛔
83
+ blocked_on_human` / `◷ blocked_on_parent` glyphs remain in the status vocabulary
84
+ the `/agents` surface can render, but with the child→parent ask channel removed
85
+ they are no longer raised in normal operation.
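As a rough illustration of the paragraph above, the sketch below filters a list of children down to the ones that are actually waiting on you. The constant, the method name, and the `sa_0001`-style ids are assumptions made for the example, not rubino internals; only the status names and glyphs come from the table.

```ruby
# Illustrative lookup built from the status table; names here are assumptions.
STATUS_GLYPHS = {
  "running" => "●", "needs_approval" => "●", "blocked_on_human" => "⛔",
  "blocked_on_parent" => "◷", "stopping" => "◌", "done" => "✓",
  "failed" => "✗", "stopped" => "⊘"
}.freeze

# With non-blocking subagents, an approval is the only state that waits on you.
def waiting_on_you(children)
  children.select { |child| child[:status] == "needs_approval" }
end

# Hypothetical snapshot of three background children.
children = [
  { id: "sa_0001", status: "running" },
  { id: "sa_0002", status: "needs_approval" },
  { id: "sa_0003", status: "done" }
]

waiting_on_you(children).each do |child|
  puts "#{STATUS_GLYPHS.fetch(child[:status])} #{child[:id]} needs approval: /agents #{child[:id]}"
end
```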
73
86
 
74
87
  ### Supervising from the CLI: `/agents` and `/reply`
75
88
 
@@ -79,12 +92,31 @@ A `⛔ N subagent waiting on you` marker persists until you `/reply`.
79
92
  /agents <id> --stop # cancel a running subagent (blocked descendants unwind too)
80
93
  /agents <id> steer "note" # park a note folded into the child's context at its next turn
81
94
  /agents <id> probe "question" # ephemeral read-only peek — nothing is saved to the child
82
- /reply <id> <answer> # answer a child blocked on an ask_parent question
95
+ /reply <id> <answer> # answer a child blocked on you (e.g. an approval)
83
96
  /reply # bare: list the subagents currently blocked on you
84
97
  ```
85
98
 
86
99
  `/tasks` is an alias for `/agents`. Stopping a node cancels its descendants'
87
- ask-gates too, so a blocking question anywhere in the subtree unwinds at once.
100
+ approval gates too, so anything parked anywhere in the subtree unwinds at once.
101
+
102
+ #### Attach to a subagent (agent-view)
103
+
104
+ The typed forms above work by id from anywhere, but the fastest way to focus on
105
+ one running child is to **attach**. At the idle prompt press `↓` to open the
106
+ subagent picker, arrow to one, and `Enter`:
107
+
108
+ - the screen switches to that agent's **own full timeline** — its tool calls and
109
+ what it said, replayed from its session (not the bounded activity snapshot the
110
+ picker used to show);
111
+ - the prompt becomes **scoped** to it: `sa_xxxx ❯`;
112
+ - while attached, just **type** to steer the running child (or answer it if it's
113
+ blocked on you) — no id needed; `←` on the empty prompt (or `/detach`) returns
114
+ to the main timeline.
115
+
116
+ So attaching makes `/agents <id> steer/probe` and `/reply <id>` redundant for the
117
+ focused child — they're the same operations, just addressed by id. Attach is a
118
+ between-turns action (it owns the screen): while a parent turn is still streaming
119
+ the picker's `Enter` toasts "attach when the turn ends" — attach once it's idle.
88
120
 
89
121
  **steer** is a persistent course-correction: the note enters the child's context
90
122
  at its next turn boundary and changes its trajectory.
@@ -92,12 +124,15 @@ at its next turn boundary and changes its trajectory.
92
124
  child's transcript; the answer is shown to you and discarded — nothing is
93
125
  appended to the child's history.
94
126
 
95
- ### Parent↔child channels (model-driven)
127
+ ### Parent→child channels (model-driven)
96
128
 
97
- The same three verbs are MODEL-callable tools, so an agent-parent can supervise
98
- its own children the way you supervise yours. All are gated by `tools.task` and
99
- **ownership-scoped at call time**: a caller can only touch its own direct
100
- children (see [tools.md](tools.md) for parameters):
129
+ Both verbs are MODEL-callable tools, so an agent-parent can supervise its own
130
+ children the way you supervise yours. They are **parent→child only**: a
131
+ subagent has no channel to ask its parent a question mid-task (subagents are
132
+ non-blocking; they make sensible default calls and surface open decisions in
133
+ their result instead). Both are gated by `tools.task` and **ownership-scoped at
134
+ call time** — a caller can only touch its own direct children (see
135
+ [tools.md](tools.md) for parameters):
101
136
 
102
137
  - **`steer(task_id, note)`** — park a persistent note on one of your running
103
138
  children; it folds into the child's context at its next turn.
@@ -106,19 +141,6 @@ children (see [tools.md](tools.md) for parameters):
106
141
  activity, recent lines); `live: true` is a billed one-shot model peek over the
107
142
  child's transcript, budgeted per child (`tasks.max_live_probes_per_child`,
108
143
  default 5).
109
- - **`ask_parent(question, blocking:)`** — the child→parent escalation (only
110
- available to subagents). `blocking: false` (default) keeps the child working
111
- and folds the answer in later; `blocking: true` parks the child until answered,
112
- bounded by `tasks.ask_parent_timeout` (default 900s — on expiry the child
113
- proceeds with its best judgement instead of hanging).
114
- Routing depends on who spawned the child: an agent-parent gets the question as
115
- a note and answers with `answer_child` (child shows `◷ blocked_on_parent`); a
116
- human-spawned child escalates straight to you (`⛔ blocked_on_human`, answered
117
- via `/reply`). A parent that cannot answer from its own context escalates by
118
- calling its OWN `ask_parent` — questions bubble up the tree to the human.
119
- - **`answer_child(task_id, answer)`** — the agent-parent's `/reply`: delivers
120
- the answer into the asking child's context (unblocks a blocking ask, folds in
121
- for a non-blocking one).
122
144
 
123
145
  ### Approvals inside a background child
124
146
 
@@ -132,14 +154,15 @@ apply (hardline floor still enforced — see [security.md](security.md)).
132
154
  ## Built-in agent definitions
133
155
 
134
156
  These definitions exist in `Agent::AgentRegistry` today. The two *subagents*
135
- are live as `task` targets; the two *primary* agents are only reachable as the
136
- default (`build`) or via the plan **mode** (`/mode plan`), not via agent
137
- switching; the *utility* agents are internal.
157
+ are live as `task` targets; the two *primary* agents are switchable per session
158
+ (`/agent <name>`, a bare `/<name>`, or Tab; see
159
+ [Primary-agent switching](#primary-agent-switching)); the *utility* agents are
160
+ internal.
138
161
 
139
162
  | Agent | Type | Access | Description |
140
163
  |-------|------|--------|-------------|
141
- | **build** | primary | Full tools | Default development agent. Handles every turn today. |
142
- | **plan** | primary | Read-only | Analysis/planning definition (the shipping read-only surface is `/mode plan`). |
164
+ | **build** | primary | Full tools | Default development agent (the registry default). |
165
+ | **plan** | primary | Read-only | Analysis/planning agent. Switch to it with `/agent plan`; `/mode plan` is the orthogonal read-only run **mode**. |
143
166
  | **explore** | subagent | Read-only | Fast codebase search and navigation (`task` target). |
144
167
  | **general** | subagent | Full tools | Complex multi-step tasks (`task` target). |
145
168
  | **compaction** | utility | None | Internal: compresses context. Hidden. |
@@ -173,18 +196,29 @@ pattern-based permission overrides (merged over the global rules by
173
196
 
174
197
  ---
175
198
 
176
- ## Planned: primary-agent switching & @mentions (not yet wired)
199
+ ## Primary-agent switching
177
200
 
178
- > **Status:** the machinery exists: `Agent::Router` (@mention detection,
179
- > Tab-cycling, default routing) and the `agent_definition:` plumbing through the
180
- > runner, but **no call site passes an agent definition**, so Tab and
181
- > `@explore`/`@plan`/`@general` mentions currently do nothing. Use the
182
- > background-subagent surface above for real work.
183
-
184
- The intended design: press **Tab** to cycle through primary agents, or route a
185
- single message with an `@mention`:
201
+ You choose which primary agent handles your turns. The pinned agent is a
202
+ process-level slot (`Rubino::ActiveAgent`, sibling to `Rubino::Modes`): a fresh
203
+ `rubino chat` boots on the registry default (`build`), and an explicit switch
204
+ takes effect for the rest of that process (no premature persistence). Switching
205
+ is entirely on the **slash** channel and **Tab** — there is no `@mention` agent
206
+ routing (`@` is the workspace file picker).
186
207
 
187
208
  ```
188
- you > @explore Where is the database connection configured?
189
- you > @plan How should we restructure the auth module?
209
+ you > /agent plan # pin a primary agent for the session
210
+ you > /plan # bare /<name>: same, for a primary agent
211
+ you > <Tab> # cycle through the primary agents, wrapping around
212
+ you > /explore Where is the database connection configured? # one-shot: route a single message
190
213
  ```
214
+
215
+ - **`/agent <name>`** (or a bare **`/<name>`** when `<name>` is a primary) pins
216
+ the agent for the session. Only **primary** agents are switchable; subagents
217
+ (`explore`/`general`) are never pinned.
218
+ - **Tab** cycles through the primary agents.
219
+ - **`/<name> <message>`** routes a single message to any agent (primary or
220
+ subagent) without changing the sticky selection.
221
+
222
+ The selected agent's Definition — its system prompt and tool scope — is threaded
223
+ into the runner on every turn, so switching actually changes the model's
224
+ persona and the tools it can call, not just a cosmetic label.
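The three switching rules above can be sketched as a small state holder. This is a minimal model under stated assumptions: `AgentSlotSketch` and its method names are hypothetical and are not the `Rubino::ActiveAgent` API; only the agent names and the pin / cycle / one-shot behaviour are taken from the text.

```ruby
# Hypothetical sketch of a sticky primary-agent slot; class and method names
# are assumptions, not the Rubino::ActiveAgent API.
class AgentSlotSketch
  PRIMARIES = %w[build plan].freeze
  SUBAGENTS = %w[explore general].freeze

  attr_reader :current

  def initialize
    @current = PRIMARIES.first # a fresh process boots on the default (build)
  end

  # /agent <name> or a bare /<name>: only primaries can be pinned.
  def pin(name)
    raise ArgumentError, "#{name} is not a primary agent" unless PRIMARIES.include?(name)

    @current = name
  end

  # Tab: advance to the next primary, wrapping around.
  def cycle
    @current = PRIMARIES[(PRIMARIES.index(@current) + 1) % PRIMARIES.size]
  end

  # /<name> <message>: choose the agent for one turn without touching the pin.
  def route_once(name)
    raise ArgumentError, "unknown agent #{name}" unless (PRIMARIES + SUBAGENTS).include?(name)

    name
  end
end

slot = AgentSlotSketch.new
slot.cycle                        # pinned agent is now "plan"
one_shot = slot.route_once("explore")
puts "pinned=#{slot.current} one_shot=#{one_shot}"
```

Keeping the one-shot route as a pure return value is one way to guarantee it never disturbs the sticky selection.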
data/docs/architecture.md CHANGED
@@ -17,9 +17,8 @@ Infrastructure Layer → LLM Adapter, Database, MCP, OAuth
17
17
  1. **All output goes through UI** — No `puts`/`print` in core modules
18
18
  2. **LLM is isolated** — Only `LLM::RubyLLMAdapter` talks to ruby_llm
19
19
  3. **SQLite is the single database** — Sessions, memory, jobs, events
20
- 4. **Event-driven** — Core emits events, UI/plugins subscribe
21
- 5. **Plugin hooks** — 38 declared extension points for customization (design surface; few are wired today)
22
- 6. **Config is not architecture** — Configuration describes what; architecture decides how
20
+ 4. **Event-driven** — Core emits events, UI subscribes
21
+ 5. **Config is not architecture** — Configuration describes what; architecture decides how
23
22
 
24
23
  ## Module Map
25
24
 
@@ -96,11 +95,6 @@ Experimental — booted at chat startup when `mcp.servers` is configured
96
95
  - `DoomLoopDetector` — Detects repeated identical tool calls
97
96
  - `CommandAllowlist` — Pre-approved shell commands
98
97
 
99
- ### `plugins/`
100
- - `Registry` — Central hook registry; the hook set (38 points) is declared in
101
- `plugins.rb` as a design surface, with few hooks wired today
102
- - Loaded from `.rubino/plugins/`
103
-
104
98
  ### `skills/`
105
99
  - `Skill` — Parsed SKILL.md with YAML frontmatter
106
100
  - `Registry` — Discovery from configured paths
@@ -148,8 +142,8 @@ User Input
148
142
  ├─→ Commands::Executor (if /command)
149
143
  │ └─→ Render template → feed to agent
150
144
 
151
- ├─→ Agent::Router (if @mention)
152
- │ └─→ Select agent definition
145
+ ├─→ ActiveAgent (if /agent, /<name>, or Tab)
146
+ │ └─→ Select primary agent definition
153
147
 
154
148
  └─→ Interaction::Lifecycle
155
149
 
@@ -167,7 +161,6 @@ User Input
167
161
  │ │ ├─ Check permissions (ApprovalPolicy)
168
162
  │ │ ├─ Check doom loop (DoomLoopDetector)
169
163
  │ │ ├─ Execute tool (ToolExecutor)
170
- │ │ ├─ Run plugin hooks
171
164
  │ │ └─ Loop back to LLM
172
165
  │ └─ Final text response
173
166
 
data/docs/commands.md CHANGED
@@ -60,7 +60,9 @@ is set.
60
60
  | `--new` | | Start a fresh session (bare `chat` resumes the last one by default) |
61
61
  | `--model` | `-m` | Override the model (e.g. `claude-sonnet-4-5`) |
62
62
  | `--provider` | | Override the provider (e.g. `anthropic`, `bedrock`) |
63
- | `--yolo` | | Skip approval prompts (equivalent to `/mode yolo`) |
63
+ | `--yolo` | | Skip approval prompts (equivalent to `/mode yolo`). Honored **only** as a CLI flag — cannot be set from config |
64
+ | `--no-yolo` | | Force fail-closed approvals even over a yolo default (the security half of [#260](#exit-codes-scripting-around-prompt--one-shot)) |
65
+ | `--add-dir` | | Add an extra allowed workspace directory that the write/edit tools can reach (repeatable) |
64
66
  | `--max-turns` | | Max tool iterations per turn |
65
67
  | `--ignore-rules` | | Skip `AGENTS.md` and context files |
66
68
 
@@ -83,6 +85,7 @@ Pasting **text** into the chat input goes through the file-backed paste pipeline
83
85
  - `--new` forces a fresh session; `--continue`/`-c` resumes the latest; `--resume`/`-r <id|title>` resumes a specific one.
84
86
  - `--resume` matches an ID prefix first, then a case-insensitive substring of the session title **or its full first prompt** — so a memorable phrase from the tail of a long first message works even though the stored title is truncated. More than one match is an error listing the candidates; no match exits non-zero with a pointer to `rubino sessions list`.
85
87
  - One-shot mode (`-q` / `prompt`) does **not** auto-resume — automation isn't silently hijacked onto a past session; pass `--resume`/`--continue` explicitly if you want it.
88
+ - A bare **`rubino sessions`** on a real terminal opens an arrow-key resume picker over this directory's sessions (`--all` for every dir); ↑↓ select, Enter loads the chosen session into the chat REPL (the same as `rubino chat --session <id>`), Esc cancels. Off a TTY (piped/redirected) it prints the static `list` table so scripts stay deterministic; `rubino sessions list` is always the table.
86
89
  - One-shot output: when stdout is a **terminal** the answer renders through the same markdown pipeline as interactive chat (styled text, fitted tables, wrapping); when stdout is **piped/redirected** the answer stays plain raw text and diagnostics go to stderr, so `$(rubino prompt …)` stays clean.
87
90
  - Sessions are marked ended on clean exit, terminal close (SIGHUP), or kill (SIGTERM), so a closed window doesn't leave a session looking active.
88
91
 
@@ -93,6 +96,18 @@ Pasting **text** into the chat input goes through the file-backed paste pipeline
93
96
  policy along the way (a write outside the workspace boundary, a denied
94
97
  approval, a hardline-blocked command). A refusal the agent handled and
95
98
  explained is expected behavior, not an error.
99
+ - **Headless approvals fail closed (security).** A one-shot / scripted run has
100
+ no interactive session, so a tool that would otherwise prompt for approval —
101
+ a write/edit, or a shell command **not** covered by your `permissions` /
102
+ command allowlist / read-only auto-allow — is **blocked, not run**. A
103
+ single-line `blocked: <tool> needs approval but no interactive session (use
104
+ --yolo to allow, or allowlist it)` goes to stderr and the run exits
105
+ **non-zero (2)**, so automation/CI fails loudly instead of silently skipping
106
+ (or, worse, auto-executing) the action. Anything you already allowlisted, and
107
+ every read-only command, still runs unprompted. Pass `--yolo` to opt back
108
+ into full auto-execute; `--no-yolo` forces fail-closed even if a yolo default
109
+ was set. `--yolo` is honored **only** as a CLI flag — a project-local config
110
+ can never grant it.
96
111
  - It exits **non-zero** when the run itself fails: no usable credentials, the
97
112
  `--resume`/`--session` target doesn't exist or is ambiguous, or the provider
98
113
  call errors out. The reason is printed to stderr; the answer (when any) stays
@@ -120,6 +135,7 @@ rubino memory show ID
120
135
  rubino memory delete ID
121
136
  rubino memory backend [NAME] # show the active memory backend, or switch to NAME
122
137
 
138
+ rubino sessions # on a TTY: arrow-key resume picker (Enter loads, Esc cancels); piped: lists
123
139
  rubino sessions list
124
140
  rubino sessions show ID
125
141
  rubino sessions compact ID
@@ -160,9 +176,11 @@ Type these inside `rubino chat`. Generated from `BuiltIns::DESCRIPTIONS` (drift-
160
176
  | `/compact` | Compact the context now: older turns become a summary |
161
177
  | `/export` | Write the session transcript as markdown (/export [path]) |
162
178
  | `/memory` | Inspect/search/forget what the agent remembers (show ID, backend, --all) |
163
- | `/agents` | List background subagents; steer/probe a running one, or view output |
179
+ | `/agent` | Switch the primary agent (/agent <name>; a bare /<name> or Tab cycles) |
180
+ | `/agents` | List background subagents; ↓+Enter to attach & steer one live, or steer/probe/view by id |
164
181
  | `/tasks` | Alias for /agents |
165
- | `/reply` | Answer a subagent that is blocked waiting on you (ask_parent) |
182
+ | `/reply` | Answer a subagent that is blocked waiting on you (e.g. an approval) |
183
+ | `/stop` | Stop a running subagent (/stop <id>; alias for /agents <id> --stop) |
166
184
  | `/jobs` | List the background job queue (status counts); /jobs <id> for detail |
167
185
  | `/skills` | List skills; activate one ('none' clears), or enable/disable NAME |
168
186
  | `/mcp` | List MCP servers and their tools; restart or disable one |
@@ -186,9 +204,10 @@ Type these inside `rubino chat`. Generated from `BuiltIns::DESCRIPTIONS` (drift-
186
204
 
187
205
  You can keep typing while a turn is running — the pinned input stays live:
188
206
 
189
- - **Enter** interrupts the current turn and runs your line as the **next** turn (the partial answer is kept and marked `⎿ interrupted`).
190
- - **Alt+Enter** queues the line **without** interrupting: it runs after the current turn finishes, with a live `⏳ queued:` indicator above the input until it does. At idle (no turn running) there is nothing to queue behind, so Alt+Enter submits the line immediately, same as Enter.
191
- - **`/queued <message>`** is the terminal-independent fallback for Alt+Enter (some terminals don't deliver the chord) — it queues the message the same way.
207
+ - **Enter** queues the line **without** interrupting (the queue-by-default / type-ahead model, #421): the current turn keeps running, the line waits behind any earlier-queued items (FIFO) with a live `⏳ queued:` indicator above the input, and it is committed as a normal message when its turn runs. At idle (no turn running) Enter submits immediately.
208
+ - **Esc** interrupts the current turn (the partial answer is kept and marked `⎿ interrupted`); any queued lines then run as the next turns.
209
+ - **`/queued <message>`** queues a message explicitly — the terminal-independent way to enqueue without typing it into the live input.
210
+ - **Read-only meta-commands run immediately, mid-turn.** A small set of non-mutating slash commands — `/agents` (and `/tasks`), `/stop`, `/status`, `/jobs`, `/help`, `/commands`, `/dirs` — execute **right away** while a turn is running, so you can drill into a sub-agent, stop the run, or check status without interrupting. State-mutating commands (`/model`, `/clear`, `/new`, `/config`, `/mode`, `/reasoning`, `/think`, …) are not available mid-turn: they show a transient `⚠ <cmd> is not available during an active turn — press Esc to interrupt first` notice instead of running.
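The mid-turn input rules above can be modelled in a few lines. The sketch below is a simplification under stated assumptions: `TurnInputSketch` and its method names are hypothetical, and it treats every slash command outside the read-only list (other than `/queued`) as state-mutating, which is stricter than the text strictly says.

```ruby
# Hypothetical model of the type-ahead rules; names are illustrative only.
IMMEDIATE_COMMANDS = %w[/agents /tasks /stop /status /jobs /help /commands /dirs].freeze

class TurnInputSketch
  attr_reader :queue

  def initialize
    @queue = []
    @turn_running = false
  end

  def start_turn
    @turn_running = true
  end

  # Enter: what happens to a submitted line.
  def enter(line)
    return :submitted unless @turn_running # idle: runs right away

    command = line.split.first.to_s
    if command == "/queued"
      @queue << line.sub(%r{\A/queued\s*}, "")
      return :queued
    end
    if command.start_with?("/")
      # Assumption: anything not in the read-only list counts as state-mutating.
      return IMMEDIATE_COMMANDS.include?(command) ? :ran_immediately : :rejected
    end

    @queue << line # FIFO: waits behind earlier-queued lines
    :queued
  end

  # Esc: interrupt the turn; queued lines become the next turns, in order.
  def escape
    @turn_running = false
    pending = @queue.dup
    @queue.clear
    pending
  end
end

input = TurnInputSketch.new
input.start_turn
input.enter("fix the failing spec")
input.enter("/queued then update the changelog")
puts input.escape.inspect
```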
192
211
 
193
212
  ### Keys: `Esc Esc` — rewind to an earlier message
194
213
 
@@ -308,12 +327,21 @@ The agent spawns background subagents with its `task` tool; these commands are t
308
327
  /agents <id> --stop # cancel a running subagent (blocked descendants unwind too)
309
328
  /agents <id> steer "note" # park a note folded into the child's context at its next turn
310
329
  /agents <id> probe "question" # ephemeral read-only peek — nothing is saved to the child
311
- /reply <id> <answer> # answer a subagent blocked on an ask_parent question
330
+ /reply <id> <answer> # answer a subagent blocked on you (e.g. an approval)
312
331
  /reply # bare: list the subagents currently blocked on you
313
332
  ```
314
333
 
315
334
  `/tasks` is an alias for `/agents`.
316
335
 
336
+ **Attach to a subagent (agent-view).** Instead of typing ids, press `↓` at the
337
+ idle prompt to open the subagent picker, arrow to one, and `Enter` to **attach**:
338
+ the screen switches to that agent's own full timeline (its tool calls and what it
339
+ said, replayed) and the prompt becomes scoped — `sa_xxxx ❯`. While attached, just
340
+ type to steer the running child (or answer it if it's blocked on you); `←` on the
341
+ empty prompt (or `/detach`) returns to the main timeline. The scoped prompt makes
342
+ the global `/agents <id> steer/probe` and `/reply <id>` forms redundant — they're
343
+ the same operations, by id, from anywhere.
344
+
317
345
  ### Workspace roots: `/add-dir` and `/dirs`
318
346
 
319
347
  The workspace sandbox confines write/edit/delete tools to the workspace roots. `/add-dir <path>` adds an extra allowed root mid-session (and runs the one-time folder-trust gate, so the new root's `AGENTS.md`/skills are only honored once vouched for); `/dirs` lists the current roots and their trust state. Typing `/add-dir ` opens a directory-path dropdown (relative, absolute, and `~` paths complete as you type).
@@ -371,6 +399,17 @@ Custom commands live as Markdown templates in `.rubino/commands/` (project) or `
371
399
 
372
400
  `/commands` lists the available custom commands and explains how to author them. See the [README](../README.md) for the template format (`$ARGUMENTS`, YAML frontmatter).
373
401
 
402
+ ### Primary agents: `/agent`, `/<name>`, and Tab
403
+
404
+ Each turn runs under an **agent** — a persona with its own system prompt and tool scope. The built-ins are `build` (full access, the default) and `plan` (read-only analysis); `explore` and `general` are subagents you invoke one-shot. Switching the primary agent changes who answers the *next* turn:
405
+
406
+ - `/agent` lists the switchable primaries (the current one marked `▸`) and the one-shot subagents.
407
+ - `/agent <name>` — or a bare `/<name>` for a primary — **pins** that agent for the rest of the session (sticky). The active agent shows as an `agent <name>` chip in the status bar (omitted when it's the default `build`).
408
+ - **Tab** on an empty prompt cycles the primary agents (the agent counterpart of Shift+Tab's mode cycle), updating the chip live.
409
+ - `/<name> <message>` routes a **single** turn to that agent — any visible agent, primary or subagent (e.g. `/explore where is the parser`) — without disturbing your sticky pick.
410
+
411
+ Distinct from `/agents` (plural), which drills into the background `task` subagents. `@` is the file picker, so a filename like `@explore.rb` is never shadowed by an agent named `explore`; agent switching lives entirely on the slash channel and Tab.
412
+
374
413
  ### Modes
375
414
 
376
415
  `/mode` (or the `--yolo` flag) switches between the modes below. **Shift+Tab** cycles them from the prompt (default → plan → yolo), updates the mode token that LEADS the status bar under the input (dim `default`, yellow `plan`, red `yolo`), and shows a transient `mode <old> → <new>` footer. Entering `yolo` from the cycle takes a second deliberate Shift+Tab to confirm (the toast says so, and warns when running background subagents would lose their approval gates); an explicit `/mode yolo` switches directly.