@nanhara/hara 0.0.2 → 0.48.0

Files changed (60)
  1. package/CHANGELOG.md +582 -0
  2. package/CLA.md +1 -1
  3. package/README.md +207 -10
  4. package/dist/activity.js +30 -0
  5. package/dist/agent/loop.js +184 -0
  6. package/dist/config.js +114 -0
  7. package/dist/context/agents-md.js +64 -0
  8. package/dist/context/mentions.js +90 -0
  9. package/dist/diff.js +103 -0
  10. package/dist/fs-walk.js +103 -0
  11. package/dist/fuzzy.js +62 -0
  12. package/dist/images.js +146 -0
  13. package/dist/index.js +1589 -0
  14. package/dist/mcp/client.js +54 -0
  15. package/dist/md.js +52 -0
  16. package/dist/memory/guard.js +51 -0
  17. package/dist/memory/store.js +93 -0
  18. package/dist/org/planner.js +174 -0
  19. package/dist/org/roles.js +140 -0
  20. package/dist/org/router.js +39 -0
  21. package/dist/plugins/plugins.js +124 -0
  22. package/dist/providers/anthropic.js +83 -0
  23. package/dist/providers/openai.js +125 -0
  24. package/dist/providers/qwen-oauth.js +139 -0
  25. package/dist/providers/types.js +2 -0
  26. package/dist/recall.js +76 -0
  27. package/dist/sandbox.js +78 -0
  28. package/dist/search/embed.js +42 -0
  29. package/dist/search/hybrid.js +38 -0
  30. package/dist/search/semindex.js +192 -0
  31. package/dist/session/store.js +109 -0
  32. package/dist/skills/skills.js +141 -0
  33. package/dist/statusbar.js +69 -0
  34. package/dist/tools/agent.js +26 -0
  35. package/dist/tools/apply-core.js +63 -0
  36. package/dist/tools/builtin.js +106 -0
  37. package/dist/tools/codebase.js +102 -0
  38. package/dist/tools/computer.js +376 -0
  39. package/dist/tools/edit.js +62 -0
  40. package/dist/tools/memory.js +147 -0
  41. package/dist/tools/patch.js +123 -0
  42. package/dist/tools/registry.js +18 -0
  43. package/dist/tools/search.js +176 -0
  44. package/dist/tools/skill.js +30 -0
  45. package/dist/tools/web.js +73 -0
  46. package/dist/tui/App.js +200 -0
  47. package/dist/tui/InputBox.js +208 -0
  48. package/dist/tui/run.js +10 -0
  49. package/dist/tui/theme.js +11 -0
  50. package/dist/ui.js +17 -0
  51. package/dist/undo.js +40 -0
  52. package/dist/vision.js +130 -0
  53. package/package.json +34 -9
  54. package/plugins/browser/.hara-plugin/plugin.json +9 -0
  55. package/plugins/browser/skills/web/SKILL.md +27 -0
  56. package/plugins/chrome/.hara-plugin/plugin.json +9 -0
  57. package/plugins/chrome/skills/chrome/SKILL.md +26 -0
  58. package/LICENSE-MIT +0 -21
  59. package/bin/hara.mjs +0 -25
  60. /package/{LICENSE-APACHE → LICENSE} +0 -0
package/CHANGELOG.md ADDED
@@ -0,0 +1,582 @@
1
+ # Changelog
2
+
3
+ All notable changes to `@nanhara/hara`.
4
+
5
+ > Versioning (pre-1.0, SemVer-style): the **minor** (middle) number bumps for a **new feature**; the
6
+ > **patch** (last) number bumps for **optimizations/fixes of existing features**.
7
+
8
+ ## 0.48.0 — unreleased (chrome plugin: drive your real logged-in Chrome)
9
+
10
+ - New first-party **`chrome` plugin** — web automation via **`chrome-devtools-mcp`** against a **real Chrome with
11
+ a persistent-login profile** (sign into a site once, reused across runs), or attach to your running Chrome via
12
+ `--browserUrl http://127.0.0.1:9222`. The "drive my actual sessions" complement to the isolated-Playwright
13
+ `browser` plugin (enable one, not both) — this is the openclaw/cc-haha route.
14
+ - Shipped as an option (not auto-installed — `browser` stays the default). `chrome-devtools-mcp` verified
15
+ resolvable; both plugin manifests validated.
16
+
17
+ ## 0.47.0 — unreleased (browser plugin: reliable web automation via Playwright MCP)
18
+
19
+ - New first-party **`browser` plugin** wires the **Playwright MCP** (`@playwright/mcp`) into hara → the agent gets
20
+ reliable web automation: `mcp__browser__navigate / snapshot / click / type / fill_form …` acting on the page's
21
+ **DOM/accessibility tree** (selectors, auto-waiting), NOT screenshots or pixel coordinates. This is the
22
+ reliable counterpart to the fragile desktop `computer` tool — no permission walls, no coordinate-guessing.
23
+ - Ships a `web-automation` skill (snapshot-driven workflow; notes the `chrome-devtools-mcp` alternative for
24
+ driving your real logged-in Chrome, à la openclaw/cc-haha).
25
+ - Install: `hara plugin add file:<repo>/plugins/browser`; `npx playwright install chromium` once. Verified
26
+ `@playwright/mcp@0.0.76` resolves + the plugin loads (`hara doctor` → plugins: browser).
27
+
28
+ ## 0.46.0 — unreleased (screen control: bounded-failure circuit breaker)
29
+
30
+ - The `computer` tool now **stops after 3 consecutive failures** instead of letting the agent loop forever on a
31
+ broken setup (learned from codex, which bounds Computer Use attempts then gives up). After 3 in a row it
32
+ returns a clear stop + the likely cause (missing Accessibility/Screen Recording permission, or the app isn't
33
+ reachable) + how to fix; resets on any success. Each failure shows the running `[n/3]` count.
34
+
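The bounded-failure behaviour in the 0.46.0 entry can be sketched as a small counter. This is a minimal illustration, not the code in `computer.ts`: the class name `FailureBreaker`, its methods, and the message wording are hypothetical; only the limit of 3, the reset on success, and the `[n/3]` tag come from the entry.

```typescript
// Hypothetical sketch of a consecutive-failure circuit breaker.
// Names and message text are illustrative assumptions.
const MAX_CONSECUTIVE_FAILURES = 3;

class FailureBreaker {
  private failures = 0;

  // Record the outcome of one attempt and return a status line to surface.
  record(ok: boolean): string {
    if (ok) {
      this.failures = 0; // any success resets the running count
      return "ok";
    }
    this.failures = Math.min(this.failures + 1, MAX_CONSECUTIVE_FAILURES);
    const tag = `[${this.failures}/${MAX_CONSECUTIVE_FAILURES}]`;
    if (this.isTripped()) {
      return `${tag} stop: check Accessibility / Screen Recording permission, or whether the app is reachable`;
    }
    return `${tag} failed`;
  }

  // True once the limit of consecutive failures has been reached.
  isTripped(): boolean {
    return this.failures >= MAX_CONSECUTIVE_FAILURES;
  }
}
```

A caller would check `isTripped()` before each new attempt and return the stop message instead of retrying.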
35
+ ## 0.45.1 — unreleased (activate via `open -a`; Accessibility gotcha)
36
+
37
+ - `activateApp` uses `open -a <app>` on macOS — `osascript … to activate` often left another window on top.
38
+ - Documented (gotcha #0 in `computer.ts`) that **cliclick needs the Accessibility permission, separate from
39
+ Screen Recording** — without it, clicks/keys silently no-op (the #1 cause of "it does nothing").
40
+
41
+ ## 0.45.0 — unreleased (screen control: activate, IME-safe typing)
42
+
43
+ - **`activate` action** — bring the target app to the foreground before screenshot/click. Fixes clicks landing
44
+ on the terminal hara runs in (the "Ghostty" problem): the agent must `activate WeChat` *first*.
45
+ - **IME-safe typing** — `type` now sets the clipboard and pastes (Cmd/Ctrl+V) instead of injecting keystrokes,
46
+ which a Chinese input method garbles. Reliable for **CJK + emoji** (verified pbcopy round-trip: `你好 hello 😀`);
47
+ falls back to keystrokes for ASCII if the clipboard set fails.
48
+ - The hard-won **RPA gotchas** (foreground trap, IME, Retina coords, grounding fragility, placeholder text like
49
+ "AAAA") are documented at the top of `computer.ts`.
50
+ - TUI: the type-ahead pool shows each queued line **highlighted** (accent color) above the input — no verbose
51
+ header (per feedback).
52
+
53
+ ## 0.44.0 — unreleased (type-ahead pool: visible + coalesced)
54
+
55
+ - The type-ahead queue is now a **visible pool**: messages typed while the agent works are listed above the
56
+ input (`📥 pool (N) — sent together when this turn finishes`), so Enter visibly *enters the pool* instead of
57
+ appearing to vanish (the reported "pressing Enter made it disappear / it never showed up in the conversation pool").
58
+ - On turn-end the pool is **coalesced into one turn** — your "also do X" / "and Y" additions reach the agent
59
+ together, in order, rather than as separate sequential turns.
60
+ - Esc still clears the pool (stop means stop). 130 tests (+1 coalesce; existing type-ahead tests updated).
61
+
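The pool semantics in the 0.44.0 entry (FIFO order, one coalesced turn on turn-end, Esc clears) can be sketched as follows. The class `TypeAheadPool` and its method names are hypothetical, and joining the queued lines with newlines is an assumption; the entry does not say how the messages are merged.

```typescript
// Hypothetical sketch of a type-ahead pool; not the TUI's actual implementation.
class TypeAheadPool {
  private lines: string[] = [];

  // Enter while the agent is working: the line joins the pool; returns the new depth.
  enqueue(line: string): number {
    this.lines.push(line);
    return this.lines.length;
  }

  // Turn finished: hand back everything as ONE message, in order.
  // Returns null when empty, so a repeated call cannot send the same lines twice.
  drain(): string | null {
    if (this.lines.length === 0) return null;
    const merged = this.lines.join("\n"); // assumption: newline-joined
    this.lines = [];
    return merged;
  }

  // Esc: stop means stop, so queued lines are discarded.
  clear(): void {
    this.lines = [];
  }
}
```

Emptying the pool inside `drain()` is what makes a second drain harmless, which is the property the 0.42.0 entry calls idempotent.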
62
+ ## 0.43.0 — unreleased (grounding for screen control — accurate clicks)
63
+
64
+ - The `computer` tool now **locates UI elements by description** instead of guessing pixels from a text read.
65
+ Pass `target` to `click`/`move` (e.g. "the Send button") — hara screenshots, asks a vision model for the
66
+ element's position (resolution-independent fractions, Retina-safe), and clicks there. New **`find`** action
67
+ returns coordinates without clicking.
68
+ - This is codex's "native computer-use" lesson applied **locally**: codex's `computer_use` is a remote browser
69
+ sandbox; hara grounds against your own screen + apps. Needs a grounding-capable vision model (e.g. a qwen-VL).
70
+ - `screenSize()` per OS converts fractions → click coords; `parseLocate` accepts per-mille/percent/fraction
71
+ replies (tested). cliclick installed → `hara doctor` shows screencapture ✓ + cliclick ✓.
72
+ - **Still requires you to grant macOS Screen Recording + Accessibility** to actually drive the screen — those
73
+ toggles can only be set by you in System Settings.
74
+
75
+ ## 0.42.0 — unreleased (type-ahead: keep typing while the agent works)
76
+
77
+ - You can now **type while the agent is working** — the message enters a **FIFO queue** and is sent
78
+ automatically when the current turn finishes (the input box stays active mid-turn; a "⌨ working — Enter
79
+ queues" hint shows the depth). Fixes the "input does nothing while working" dogfooding feedback.
80
+ - **Esc stops everything** — interrupts the turn AND clears the queue, so a stopped turn never fires queued
81
+ messages. The queue drain is idempotent (guarded against double-send under React StrictMode).
82
+ - Expert-reviewed for queue correctness (FIFO, exactly-once), the Esc/abort UX, and input-handler conflicts.
83
+
84
+ ## 0.41.0 — unreleased (English session names, auto-summarized)
85
+
86
+ - After the first turn a session gets a short **English kebab-case name** summarizing what it's about
87
+ (e.g. `add-semantic-search`) via one tiny model call — replacing the literal first-message title. A non-English
88
+ conversation is translated to an English gist (pinyin only if untranslatable). Names stay short + ASCII.
89
+ - The stable session **id is still the UUID** (unchanged — this only improves the human-friendly name); falls
90
+ back to the lexical title if the naming call fails. New `slugify()` helper (tested).
91
+
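As a rough illustration of the short, ASCII, kebab-case shape described in the 0.41.0 entry, a slug helper might look like the sketch below. It is not the package's `slugify()`: the name `slugifySketch` and the 40-character cap are assumptions.

```typescript
// Hypothetical sketch; the real slugify() helper may differ.
function slugifySketch(text: string, maxLen: number = 40): string {
  const slug = text
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // every run of non-ASCII-alphanumerics becomes one hyphen
    .replace(/^-+|-+$/g, ""); // trim leading and trailing hyphens
  // Cut to length, then trim again in case the cut landed on a hyphen.
  return slug.slice(0, maxLen).replace(/-+$/g, "");
}
```

Purely non-ASCII input yields an empty string here, so a caller would need a fallback name in that case.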
92
+ ## 0.40.0 — unreleased (TUI polish: markdown rendering + numbered choices)
93
+
94
+ - The ink TUI now **renders assistant Markdown** (headers, bold, inline code, bullets; code fences kept
95
+ verbatim) instead of showing raw `**`/`##`/backticks. The renderer (`md.ts`) had only been wired into the
96
+ classic REPL; the default TUI showed markdown literally.
97
+ - **Selection prompts are numbered**: each choice shows `1.`, `2.`, … and you can **press the number to pick it
98
+ directly** (in addition to ↑↓ + Enter). The hint reads "↑↓ or 1–N to choose".
99
+
100
+ ## 0.39.0 — unreleased (hara commit — AI commit messages)
101
+
102
+ - **`hara commit`** generates a conventional-commits message from your staged diff, shows it, and commits after
103
+ a `Y/n` confirm. `-a` stages tracked changes first; the global `-y` skips the confirm. Pairs with `hara
104
+ review` (review → commit). Verified live (glm-5): generated `feat(util): add mul function` and committed it.
105
+ - Note: the skip-confirm reuses the global `-y/--yes` (a subcommand `-y` would collide with it — same lesson as
106
+ `hara plan resume`).
107
+
108
+ ## 0.38.0 — unreleased (hara review — review your changes)
109
+
110
+ - **`hara review`** reviews your uncommitted changes (`git diff HEAD`) for correctness bugs, security issues,
111
+ missing error handling, naming, and missing tests — grouped by severity (**Blocker / Should-fix / Nit**) with
112
+ file:line and concrete fixes. **Read-only**: it can read files for context but never edits. `--staged`
113
+ reviews staged changes; `--base <ref>` reviews against a ref (e.g. `main`).
114
+ - Verified live (glm-5): on a planted diff it flagged a hardcoded secret (Blocker), an unguarded divide, and
115
+ dead code, then gave a clear "do not merge" verdict.
116
+ - `codebase_search` added to the read-only tool set (so reviewers / sub-agents can search the repo).
117
+
118
+ ## 0.37.0 — unreleased (task-aware screenshots for screen control)
119
+
120
+ - Screenshots from the `computer` tool are now read with a **screenshot-tuned prompt** aimed at *acting*, not
121
+ transcribing: interactive elements (buttons/fields/menus) with labels and approximate positions, the active
122
+ element, and any errors. A text-only main model driving the desktop gets something it can actually click.
123
+ - New optional **`focus`** on the screenshot action ("the Login button") narrows the read to the current goal.
124
+ - Internal: `describeImages` gains `system`/`hint` options, `SCREENSHOT_SYSTEM` added, `ctx.describeImage`
125
+ takes a hint. (For contrast: codex's `computer_use` is a remote/hosted *browser* MCP plugin with no local
126
+ syscalls — hara stays **native + local** so it can operate your own desktop software.)
127
+
128
+ ## 0.36.0 — unreleased (resumable plans)
129
+
130
+ - **`hara plan resume`** continues the saved plan (`.hara/org/plan.json`): atoms already marked done are
131
+ skipped, pending/failed ones run. When a verify gate stops a plan midway, fix the issue and resume instead
132
+ of starting from scratch. Interrupted atoms (running/failed) reset to pending; works with `--parallel` too.
133
+ - Internal: execution extracted into a shared `executePlan` (skips completed atoms) used by both fresh runs and
134
+ resume; `loadPlan` wired into the CLI. Verified: a half-done plan resumed, skipped the done atom, ran only
135
+ the pending one.
136
+
137
+ ## 0.35.0 — unreleased (parallel plan execution — the org works in parallel)
138
+
139
+ - **`hara plan --parallel`** runs independent atoms concurrently. The planner already builds a dependency DAG;
140
+ now `topoWaves` groups atoms into dependency *waves* (every atom in a wave depends only on earlier waves), and
141
+ each wave's atoms execute at the same time. A diamond plan `a1 → (a2,a3) → a4` runs a2 and a3 together.
142
+ - This is the org differentiator made literal: not one agent stepping through a list, but a team working the
143
+ independent parts at once. Verified live (glm-5): two independent atoms ran in one wave and completed
144
+ out-of-order; both check-gates passed.
145
+ - Sequential remains the default (and is what interactive approval uses, since concurrent atoms can't share a
146
+ prompt). `hara plan` is full-auto, so `--parallel` is safe there. A wave stops the run if any of its atoms fail.
147
+ - Internal: `executeAtom` extracted (shared by both paths); `topoWaves(atoms)` added alongside `topoOrder`.
148
+
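The wave grouping described in the 0.35.0 entry can be sketched as repeated selection of atoms whose dependencies are already done. This is an illustration under assumed types: the `Atom` shape and the name `topoWavesSketch` are hypothetical, not the package's `topoWaves`.

```typescript
// Hypothetical atom shape: an id plus the ids it depends on.
interface Atom {
  id: string;
  deps: string[];
}

// Group atoms into waves; each wave depends only on earlier waves.
function topoWavesSketch(atoms: Atom[]): string[][] {
  const done: Record<string, boolean> = {};
  let remaining = atoms.slice();
  const waves: string[][] = [];
  while (remaining.length > 0) {
    // Ready = every dependency was placed in an earlier wave.
    const ready = remaining.filter((a) => a.deps.every((d) => done[d] === true));
    if (ready.length === 0) {
      throw new Error("cycle or missing dependency");
    }
    waves.push(ready.map((a) => a.id));
    for (const a of ready) done[a.id] = true;
    remaining = remaining.filter((a) => done[a.id] !== true);
  }
  return waves;
}
```

For the diamond plan `a1 → (a2,a3) → a4` this yields three waves, with `a2` and `a3` sharing the middle one; a runner could then await all atoms of a wave together before starting the next.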
149
+ ## 0.34.0 — unreleased (incremental indexing)
150
+
151
+ - **`hara index` is now incremental.** Re-running it re-embeds only the files whose mtime changed since the
152
+ last build; unchanged files keep their existing vectors, and deleted files drop out. A changed embedding
153
+ model still forces a full rebuild. Output reports `(N embedded, M reused)`.
154
+ - Turns indexing from a run-once-and-go-stale command into something you can re-run after every edit. Measured
155
+ on hara's own repo with local `bge-m3`: full build **~68s** → unchanged rebuild **~0.4s** (~150×); editing one
156
+ file re-embeds just that file's chunks.
157
+ - Internal: each chunk records its source file's mtime; `buildIndex` returns `{total, embedded, reused}`.
158
+
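The mtime-based reuse in the 0.34.0 entry can be sketched at file granularity. This is a simplified illustration: the real index works per chunk, and the types `IndexEntry` / `IndexSketch`, the function `rebuildSketch`, and the injected `embed` callback are hypothetical.

```typescript
// Hypothetical, file-level sketch of an incremental rebuild.
interface IndexEntry {
  mtimeMs: number;
  vector: number[];
}

interface IndexSketch {
  model: string;
  files: Record<string, IndexEntry>;
}

function rebuildSketch(
  prev: IndexSketch | null,
  model: string,
  mtimes: Record<string, number>, // path -> current mtime of every file on disk
  embed: (path: string) => number[],
): { index: IndexSketch; total: number; embedded: number; reused: number } {
  // A different embedding model invalidates every stored vector.
  const reusable: Record<string, IndexEntry> =
    prev !== null && prev.model === model ? prev.files : {};
  const files: Record<string, IndexEntry> = {};
  let embedded = 0;
  let reused = 0;
  for (const path of Object.keys(mtimes)) {
    const old = Object.prototype.hasOwnProperty.call(reusable, path) ? reusable[path] : undefined;
    if (old !== undefined && old.mtimeMs === mtimes[path]) {
      files[path] = old; // unchanged file: keep its vector
      reused += 1;
    } else {
      files[path] = { mtimeMs: mtimes[path], vector: embed(path) };
      embedded += 1;
    }
  }
  // Deleted files drop out because only paths present in `mtimes` are copied.
  return { index: { model, files }, total: embedded + reused, embedded, reused };
}
```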
159
+ ## 0.33.0 — 2026-06-20 · first public release (semantic recall + memory)
160
+
161
+ - **`recall` and `memory_search` go hybrid too.** The semantic layer added in 0.32 now also powers your
162
+ code-asset library and durable memory — `hara index --assets` embeds `~/.hara/code-assets`, global skills,
163
+ and `~/.hara/memory` into `assets` + `memory` indexes. `hara recall`, `/recall`, and the `memory_search` tool
164
+ then blend meaning-based hits with lexical (semantic leads, lexical fills, deduped by path).
165
+ - **`hara index [--repo|--assets|--all]`** — `--repo` (default) for `codebase_search`, `--assets` for recall +
166
+ memory, `--all` for everything. Each index is still a self-`.gitignore`d derived artifact; `hara doctor` lists
167
+ which of `repo / assets / memory` are built.
168
+ - **Lexical stays the default everywhere** — with no index/embedder, recall and memory behave exactly as before.
169
+ Capture/dedup (`skill_create`) stays purely lexical by design (saving shouldn't depend on an embedding model).
170
+ - Verified end-to-end with local `bge-m3`: "retrying a request that failed" → a backoff snippet; "how do I ship
171
+ a release" → the deploy note — both matched by meaning, not keywords.
172
+ - **License simplified to Apache-2.0** (from `MIT OR Apache-2.0`). Apache-2.0 adds an explicit patent grant +
173
+ trademark protection — the right fit for a company-backed tool with a commercial future, and matches the peer
174
+ norm (Codex, Goose). `LICENSE-MIT` removed; `LICENSE-APACHE` → `LICENSE`.
175
+
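The blend rule stated in the 0.33.0 entry ("semantic leads, lexical fills, deduped by path") can be sketched as an ordered merge. The `Hit` shape, the `limit` parameter, and the name `blendSketch` are assumptions for illustration.

```typescript
// Hypothetical hit shape; the real result type is not shown in the changelog.
interface Hit {
  path: string;
  score: number;
}

// Semantic hits come first, lexical hits fill remaining slots, and a path appears once.
function blendSketch(semantic: Hit[], lexical: Hit[], limit: number): Hit[] {
  const seen: Record<string, boolean> = {};
  const out: Hit[] = [];
  for (const hit of semantic.concat(lexical)) {
    if (out.length >= limit) break;
    if (seen[hit.path] === true) continue; // already contributed by an earlier hit
    seen[hit.path] = true;
    out.push(hit);
  }
  return out;
}
```

With an empty `semantic` list this returns the lexical hits unchanged, which matches the entry's note that lexical remains the default when no index exists.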
176
+ ## 0.32.0 — unreleased (semantic search for `codebase_search`)
177
+
178
+ - **Opt-in semantic index — `hara index`.** `codebase_search` (the "this repo is a knowledge base" tool) can
179
+ now blend **meaning-based** results with its lexical ranking. Build the index once with `hara index`; queries
180
+ then find the right file even when they share no keywords with the code (e.g. "read an image pasted from the
181
+ clipboard" → `src/images.ts`).
182
+ - **Zero new dependency, lexical stays the default.** The store is a built-in JSON cosine index (fine for repo /
183
+ code-asset scale); when no index or embedding provider is configured, `codebase_search` is exactly as before.
184
+ No native vector DB is required (zvec remains the documented scale-up path).
185
+ - **Bring your own embeddings**: `hara config set embedProvider ollama` (local & offline — e.g. `bge-m3`,
186
+ `nomic-embed-text`), `qwen` (DashScope `text-embedding-v3`), or any OpenAI-compatible `/embeddings` endpoint
187
+ (`embedModel` / `embedBaseURL` / `embedApiKey`). Embeddings never run unless you opt in.
188
+ - The index is a **derived, rebuildable artifact** — written under `.hara/index/` with a self-`.gitignore` so it
189
+ can never be committed (it may embed file contents). `hara doctor` shows the search/semantic/index state.
190
+
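A JSON cosine index of the kind the 0.32.0 entry mentions comes down to a similarity function and a sort. The sketch below is a generic illustration, not the package's store: the `Chunk` shape and the names `cosine` and `topK` are assumptions.

```typescript
// Cosine similarity between two equal-length vectors; 0 if either is all zeros.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let na = 0;
  let nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return na === 0 || nb === 0 ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Hypothetical stored chunk: a source file plus its embedding vector.
interface Chunk {
  file: string;
  vector: number[];
}

// Brute-force scan: score every chunk, sort descending, keep the best k files.
function topK(query: number[], chunks: Chunk[], k: number): string[] {
  return chunks
    .map((c) => ({ file: c.file, score: cosine(query, c.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((s) => s.file);
}
```

A linear scan like this is reasonable at repo scale, which is consistent with the entry's remark that a native vector DB is only the scale-up path.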
191
+ ## 0.31.0 — unreleased (native screen control)
192
+
193
+ - **`computer` tool — operate desktop software, not just the browser.** Screenshot → read → click / move /
194
+ type / press keys at coordinates. Native shell-out per OS (no heavy deps): macOS `screencapture` + `cliclick`,
195
+ Windows PowerShell (.NET / user32, built-in), Linux `scrot` + `xdotool`.
196
+ - **Strict, opt-in safety**: `computerUse: off|read|click|full` (default **off**) gates capability tiers;
197
+ `computerApps` is a frontmost-window **allowlist** checked before any click/type (the key guard against
198
+ wrong-window actions); a **dangerous-key blocklist** (cmd+q, ctrl+alt+del…); and a **once-per-session grant**
199
+ (the `computer` tool kind always confirms once, even in full-auto).
200
+ - Screenshots are **read via the vision sidecar** (a screenshot is described to text) so a text-only main
201
+ model can still act on what's on screen. `hara doctor` shows the tier, per-OS backend availability, and the
202
+ app allowlist.
203
+
204
+ ## 0.30.1 — unreleased
205
+
206
+ - Capture honors `assetCapture`: in **ask** (default) the end-of-session distill now **prompts before saving**
207
+ each skill/memory — the "remind me to confirm" flow — instead of writing silently; **auto** stays silent;
208
+ **off** disables proactive capture. `hara doctor` shows the capture mode.
209
+
210
+ ## 0.30.0 — unreleased (codebase search — the repo as a knowledge base)
211
+
212
+ - **`codebase_search`** — the current project is now a searchable knowledge base. Relevance-ranked **lexical**
213
+ search over the repo's code/text (respects `.gitignore` via `listProjectFiles`), returning the top files +
214
+ their densest snippet (`file:line`). Distinct from `grep` (exact pattern): the agent finds *related* code
215
+ from a natural-language query ("where's auth handled?") while working. Zero new deps; it's the interface a
216
+ semantic (zvec) index slots into later.
217
+
218
+ ## 0.29.0 — unreleased (asset capture & curation — phase 1)
219
+
220
+ - **Unified asset search** (the fix that enables the rest): `recall` / `searchAssets` now cover **skills +
221
+ code-assets** as one corpus — they were disconnected, so agent-saved skills were invisible to recall (and
222
+ dedup was impossible).
223
+ - **`skill_create` is now curated capture:**
224
+ - **`scope`** — `project` (this repo's `.hara/skills`) or `personal` (`~/.hara/skills`, default). Sharing to
225
+ company / public stays a separate, human-confirmed step.
226
+ - **Sanitize on save** — secrets are **redacted** to typed placeholders (`<REDACTED:sk-key>`…) rather than
227
+ blocking the whole save; local identifiers are generalized (`<project>` / `~` / `<email>`); injection
228
+ phrases are still hard-blocked.
229
+ - **Dedup signal** — searches the unified corpus before saving and flags a near-duplicate so you update
230
+ instead of piling up.
231
+ - **`assetCapture: off | ask | auto`** gates proactive end-of-session capture (the distill turn).
232
+ - `guard.ts` gains `redactSecrets()` / `scrubLocal()` — redact on the way in; `scanMemory` still blocks on load.
233
+
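The redact-on-save idea from the 0.29.0 entry can be sketched with a small pattern table. Only the `<REDACTED:sk-key>` placeholder appears in the changelog; the other placeholder names, the exact regular expressions, and the function name `redactSketch` are assumptions, not the contents of `guard.ts`.

```typescript
// Hypothetical pattern table. Only "sk-key" is a placeholder name taken from the changelog.
const SECRET_PATTERNS: { kind: string; re: RegExp }[] = [
  { kind: "sk-key", re: /\bsk-[A-Za-z0-9_-]{16,}/g },
  { kind: "aws-key", re: /\bAKIA[A-Z0-9]{16}\b/g }, // assumed name and shape
  { kind: "github-token", re: /\bghp_[A-Za-z0-9]{36}\b/g }, // assumed name and shape
];

// Replace each secret-shaped token with a typed placeholder instead of rejecting the save.
function redactSketch(text: string): string {
  let out = text;
  for (const p of SECRET_PATTERNS) {
    out = out.replace(p.re, `<REDACTED:${p.kind}>`);
  }
  return out;
}
```

Keeping the secret's type in the placeholder leaves the saved skill readable, since a reader can still tell which kind of credential belongs in that position.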
234
+ ## 0.28.0 — unreleased (plugins)
235
+
236
+ - **Plugins** — a distribution unit bundling skills + roles + MCP servers; it owns nothing at runtime, the
237
+ existing loaders pick its contents up. Manifest is **Claude-Code-compatible** (`.claude-plugin/plugin.json`,
238
+ `.hara-plugin/plugin.json`, or bare `plugin.json`) so hara can consume community plugins.
239
+ - `hara plugin add file:<path> | github:<owner/repo> | git:<url>` installs into `~/.hara/plugins/<name>`;
240
+ `hara plugin` lists; `hara plugin enable/disable/remove`. Enabled plugins' skills/roles/MCP auto-contribute
241
+ (lowest precedence — project & global override). `hara doctor` shows them.
242
+ - **Claude-Code subagent interop**: `.claude/agents/*.md` load as roles (`tools:` → allowTools).
243
+
244
+ ## 0.27.0 — unreleased (skills)
245
+
246
+ - **Skills** — agentskills.io-standard reusable capabilities at `~/.hara/skills/<name>/SKILL.md` (+ project
247
+ `.hara/skills`). The system prompt lists each skill (id + description); the agent calls the new **`skill`**
248
+ tool to load a skill's full instructions on demand — progressive disclosure (the body returns as a tool
249
+ result, keeping the prompt cache stable). `context: fork` runs the skill as a sub-agent; `allowed-tools` /
250
+ `when_to_use` / `paths` / `user-invocable` frontmatter supported (Claude-Code-compatible).
251
+ - **`skill_create`** replaces `playbook_save` — the agent saves a reusable how-to as a real SKILL.md (lexical
252
+ guard scans it). Playbooks are now just the agent-authored corner of the one skills system.
253
+ - **`hara skills` / `hara skills init`**, plus `/skills` (list) and `/skill <id>` (load into your next
254
+ message). `hara doctor` lists your skills. Reuses the existing recall lexical engine — no new deps.
255
+
256
+ ## 0.26.0 — unreleased (inline image tokens + session UUID & auto-name)
257
+
258
+ - **Pasted images are inline `[Image #N]` tokens** (Claude Code / codex style) — highlighted in the input
259
+ where you paste, carried inline in the message; **backspace over a token removes it + its attachment**
260
+ (and renumbers the rest). Replaces the chip experiment (a desktop-GUI pattern) with the terminal-native
261
+ one both reference tools use.
262
+ - **Sessions now have a full UUID** (was an 8-char stub) + an **auto-summarized name** from the first
263
+ message that's **language-aware (keeps CJK)** — a Chinese first line names the session meaningfully
264
+ instead of a random word; it never shows "new session" (falls back to the short id).
265
+ - Startup header shows `session <uuid>`; the top border shows the name (or short id); `/sessions` + `/name`
266
+ show the short id / full UUID; **`--resume` accepts a short-id prefix**, not just the full UUID.
267
+
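Accepting a short-id prefix for `--resume`, as the 0.26.0 entry describes, amounts to a prefix match over known session ids. In the sketch below the function name `resolveSession` is hypothetical, and returning `null` for an ambiguous prefix is an assumption; the entry does not say how ambiguity is handled.

```typescript
// Hypothetical sketch: resolve a full UUID or a short prefix to one session id.
function resolveSession(ids: string[], input: string): string | null {
  if (input.length === 0) return null;
  const matches = ids.filter((id) => id.indexOf(input) === 0);
  // Assumption: refuse to guess when the prefix matches zero or several sessions.
  return matches.length === 1 ? matches[0] : null;
}
```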
268
+ ## 0.25.0 — unreleased (vision UX polish + ground-truth capability map)
269
+
270
+ - **Header shows image routing at startup** — the banner now states whether the main model reads images
271
+ directly, routes them through a describer (`👁 glm-5 is text-only → images read by qwen3.7-plus`), or will
272
+ ask on first paste.
273
+ - **Cleaner paste** — a pasted/dragged image is a 🖼 **chip** below the prompt (no more `[Image #N]` token in
274
+ your text); the input stays clean, you can submit an image with no text, and **backspace on empty input
275
+ removes the last attachment** (cc-haha style).
276
+ - **Capability map corrected to the Alibaba Coding Plan** (ground truth): `qwen3.5/3.6/3.7-plus` + `kimi-k2.5`
277
+ → vision; `qwen3-max`, `qwen3-coder-*`, `glm-5`, `glm-4.7`, `MiniMax-M2.5` → text-only. So `glm-5` no longer
278
+ hits the "unknown" prompt — it routes straight to the describer.
279
+ - **Hardening** (expert review): `/vision` is now one implementation shared by both REPLs; setting a
280
+ non-vision describer warns; `/model` resets the describer cache + reminder; Esc during describe reads as
281
+ "cancelled" not "failed".
282
+
283
+ ## 0.24.1 — unreleased
284
+
285
+ - Capability map: recognize the Alibaba coding-plan **Qwen3 flagships** (`qwen3.x-plus` / `qwen3-max`) as
286
+ **vision-capable** — verified `qwen3.7-plus` accepts image input and describes/OCRs accurately. (As a
287
+ `visionModel` describer it already worked; this corrects its classification when used as the *main* model.)
288
+
289
+ ## 0.24.0 — unreleased (auto-detect vision capability)
290
+
291
+ - **Automatic** image routing — hara classifies the main model and decides each turn:
292
+ - vision-capable (Claude, gpt-4o, qwen-vl, glm-4v…) → image sent **inline**, describer suspended;
293
+ - text-only (DeepSeek, qwen-coder, glm-4-flash…) → image **auto-described** by `visionModel` into text,
294
+ or — if none set — a **reminder** to add one (`/vision <model>`);
295
+ - **unknown** model → hara **asks once** ("Can <model> see images? Yes / No / Skip") and remembers the
296
+ answer per-model.
297
+ - Built-in, extensible **capability map** (`classifyVision`) for the major families — Claude / GPT / Qwen /
298
+ GLM / DeepSeek / Gemini / Mistral / Llama / Kimi / Grok / Pixtral·Llava·InternVL.
299
+ - **`/vision <model>`** sets the describer in-place; **`/vision main yes|no|auto`** overrides/clears the
300
+ current model's detected capability (stored per-model in `modelVision`). `hara doctor` shows it.
301
+
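The three-way routing in the 0.24.0 entry reduces to a small decision function. The type names and `routeImage` below are hypothetical; the four outcomes (inline, describe, remind, ask) follow the entry's bullets.

```typescript
// Hypothetical types mirroring the three capability classes and four outcomes.
type VisionCapability = "vision" | "text-only" | "unknown";
type ImageRoute = "inline" | "describe" | "remind" | "ask";

function routeImage(capability: VisionCapability, visionModel?: string): ImageRoute {
  if (capability === "vision") return "inline"; // main model sees the image directly
  if (capability === "unknown") return "ask"; // ask once, then remember per model
  // Text-only main model: use the describer if one is configured, else remind the user.
  return visionModel ? "describe" : "remind";
}
```

Under this sketch, `routeImage("text-only", "qwen-vl-max")` returns `"describe"`, while the same call without a describer returns `"remind"`; the model name is only an example value.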
302
+ ## 0.23.0 — unreleased (vision sidecar for text-only models)
303
+
304
+ - **Use pasted images with text-only models** (DeepSeek, coding models, …) via a configurable vision
305
+ **sidecar**: `hara config set visionModel <model>` (e.g. a `qwen-vl-*` on the same Alibaba plan) — hara
306
+ OCRs/describes each pasted image into text with that model, then your main model continues. Reuses the
307
+ main provider's endpoint + key; override with `visionBaseURL` / `visionApiKey` if vision lives elsewhere.
308
+ Unset = images go inline (needs a vision main model).
309
+ - The describe prompt is coding-tuned: verbatim transcription of text/code in fenced blocks, plus UI /
310
+ diagram / error description. `hara doctor` shows the vision status.
311
+
312
+ ## 0.22.0 — unreleased (image paste / vision)
313
+
314
+ - **Paste images into the prompt** (ink TUI) — **Ctrl+V** pastes an image from the OS clipboard (a
315
+ screenshot, or an image copied from a browser); **dragging an image file** into the terminal (or
316
+ pasting its path) attaches it too. Each shows as an `[Image #N]` token in the input with a 🖼 chip
317
+ below the box. Zero new deps — shells out to `osascript`/`sips` (macOS), `wl-paste`/`xclip` (Linux),
318
+ or PowerShell (Windows), the same posture as the sandbox.
319
+ - **Vision on every provider** — attachments are sent as image blocks: base64 `image` blocks for
320
+ Anthropic (Claude), `image_url` data-URLs for OpenAI-compatible endpoints (Qwen-VL / GLM-4V /
321
+ OpenAI). Use a vision-capable model. Oversized images are auto-downsized (macOS `sips`, ≤1568 px)
322
+ and capped at ~5 MB.
323
+ - Only image **paths** ride in the conversation/session JSON (sessions stay small); bytes are read +
324
+ base64-encoded at request time. `@image.png` mentions no longer inline binary — they hint to paste.
325
+ - 85 offline tests (clipboard capture, path detection, provider image blocks, TUI paste).
326
+
327
+ ## 0.21.2 — unreleased (memory everywhere)
328
+
329
+ - Memory now injects into **every execution mode** — `hara -p` one-shot, `hara org`, `hara plan` atoms,
330
+ and sub-agents — not only the interactive REPL (M1 had wired just the interactive turns).
331
+ - `hara doctor` / `/doctor` shows memory status + the `evolve` level.
332
+
333
+ ## 0.21.1 — unreleased (TUI command parity)
334
+
335
+ - Wire the missing slash commands into the default ink TUI: **`/compact`** (with the proactive pre-compact
336
+ flush + working-set distill), **`/sessions`**, **`/usage`**, **`/doctor`**, **`/roles`**, **`/approval [mode]`**.
337
+ (`runDoctor` now returns a string so both the classic REPL and the TUI can render it.) `/org` and `/plan`
338
+ remain `hara org`/`hara plan` subcommands.
339
+
340
+ ## 0.21.0 — unreleased (self-evolution · M2)
341
+
342
+ - **`playbook_save`** — the agent grows its own reusable playbooks (`~/.hara/code-assets/playbooks/<slug>.md`,
343
+ frontmatter + body), found later by `recall` / `memory_search`.
344
+ - **AGENTS.md self-refinement** — the agent may propose AGENTS.md edits via `edit_file`, reviewed through the
345
+ normal diff/approval gate (no new write path).
346
+ - **Guard** (`src/memory/guard.ts`) — a lexical scan on agent-written memory + playbooks blocks prompt-injection
347
+ phrases, secret-shaped tokens (`sk-…`/`AKIA…`/PEM/`ghp_…`), and `file://` URLs before they hit disk.
348
+ - **Session-end distill** — with `evolve: proactive` (default), `/exit` runs one reflection turn that persists
349
+ durable learnings via `memory_write` / `playbook_save`. Set `evolve: light` (no distill) or `off` to disable.
350
+ - 76 offline tests.
351
+
352
+ ## 0.20.0 — unreleased (memory + self-evolution · M1)
353
+
354
+ - **Long-term memory** — a lexical, file-backed store (no embeddings): global `~/.hara/memory/` + project
355
+ `<root>/.hara/memory/` (`MEMORY.md` / `USER.md` / daily logs). Tools: `memory_search`, `memory_get`,
356
+ `memory_write`, `memory_forget`. The agent recalls before answering about prior decisions and is nudged to
357
+ **proactively save** durable facts (conventions, your preferences, tricky solutions).
358
+ - **Injection** — a capped MEMORY/USER digest is added to the system prompt (frozen snapshot at session
359
+ start), reusing the `recall` lexical engine over the memory roots.
360
+ - **Short-term working memory** — `SessionMeta.workingSet` survives `/compact` (which used to wipe it) and
361
+ resume; `/compact` distills its summary into it.
362
+ - **Global roles** — `~/.hara/roles/*.md` (reusable personas) alongside project `.hara/roles/`; project wins
363
+ on name clash — the same global/project scoping as memory + config.
364
+ - 74 offline tests; zero new runtime deps. (M2 = playbooks + AGENTS.md self-refine + a guard + session-end distill.)
365
+
366
+ ## 0.19.0 — unreleased (plan mode + theme)
367
+
368
+ - **Plan mode** — a 4th `shift+tab` mode. hara goes **read-only** (`read_file`/`grep`/`glob`/`ls`/`web_fetch`),
369
+ investigates, and proposes a step-by-step plan; then a **selectable "proceed?"** prompt — *Yes, auto-apply
370
+ edits · Yes, approve each edit · No, keep planning* — flips the approval mode and executes the plan.
371
+ Matches codex (`Default`+`Plan`) / Claude Code.
372
+ - **Selectable prompts** — the tool-approval confirm and the plan-proceed share one `↑↓` / Enter / shortcut
373
+ select component; the input box stays visible underneath.
374
+ - **Theme switch** — `hara config set theme dark|light` (or `HARA_THEME`). Banner/accent is the brand
375
+ vermilion **#FF6B5C** on dark, **#C0392B** on light. Truecolor; chalk degrades on 256/16-color terminals.
+
+ ## 0.18.0 — unreleased (ink TUI)
+
+ - **New terminal UI — a real TUI (ink 6 + React 19).** The interactive REPL is now a **bordered input
+ box pinned at the bottom**: the session name sits in the top-right corner, and the approval modes +
+ token usage + concurrent-agent count live in the bottom border, with the conversation scrolling above.
+ Streaming assistant text, dim reasoning, tool calls, and colored diffs render as live blocks; a spinner
+ shows while a turn runs (**Esc** interrupts); tool-approval prompts appear inline (y/N); **shift+tab**
+ cycles the approval mode. Same approach Claude Code itself uses (ink). `HARA_TUI=0` falls back to the
+ classic readline REPL.
+ - The agent loop + tools now emit through a `UiSink` so output is rendered by ink (not raw stdout),
+ keeping the TUI uncorrupted; the plain path is unchanged when no sink is present (`-p`, pipes, sub-agents).
+ - TUI slash commands: `/help` `/tools` `/model` `/undo` `/recall` `/reset` `/exit` (for the others, fall back with `HARA_TUI=0`).
+
+ ## 0.17.1 — unreleased (status bar actually renders)
+
+ - **Fix: the status bar now shows.** The pinned footer (v0.6) used a terminal scroll region that
+ doesn't compose with Node's `readline`, so it silently never rendered. It's now a status **header
+ printed above each prompt** — session · the three approval modes · tokens + ctx% · concurrent ops —
+ visible in any terminal. (True bottom-pinning needs a full TUI; deferred.) `HARA_FOOTER=0` hides it.
+
+ ## 0.17.0 — unreleased (doctor + command completion)
+
+ - **`hara doctor` / `/doctor`** — a setup health check: Node version, provider + model, whether auth
+ is configured (with a fix hint), config path, code-assets, roles, MCP servers. Diagnoses the common
+ "not authenticated / wrong model" pitfalls at a glance.
+ - **`/command` Tab-completion** — typing `/` (or `/mo`) + Tab completes slash-command names in the REPL.
+
+ ## 0.16.1 — unreleased (terminal UX polish)
+
+ - **`@<dir>` loads a directory** — mentioning a directory now attaches a listing of its files (the
+ agent can then read specific ones); previously `@dir` did nothing.
+ - **`@src/` Tab drills in** — completing a path that ends in `/` lists that folder's immediate
+ children (directories first), like a file picker.
+ - **Tool calls show their argument** — `↳ read_file src/x.ts`, `↳ bash npm test`, `↳ grep TODO`
+ instead of a bare tool name.
+ - **"working Ns" spinner** while a turn is in flight (cleared the moment output/reasoning streams).
+
+ ## 0.16.0 — unreleased (parallel sub-agents)
+
+ - **`agent` tool** — delegate an independent sub-task to a fresh sub-agent; spawn several in one turn
+ to run them **in parallel** (the footer's `⛁ N agents` count is now real). Sub-agents are read-only
+ by default (analysis/search/review/web), so they're safe to parallelize; pass a `role` id to use
+ that role's persona + tools. The agent loop gained a `quiet` mode so parallel sub-agents don't
+ interleave output — only their results return to the parent. Sub-agents can't recurse (no nested
+ fan-out).
+
+ ## 0.15.0 — unreleased (code-asset recall)
+
+ - **`hara recall "<query>"` / `/recall`** — a personal, git-versionable library of snippets/playbooks
+ at `~/.hara/code-assets` (override with `HARA_ASSETS`). Lexical search ranks `*.md` assets by
+ query-word matches; in the REPL `/recall` pulls the top matches into your **next message's context**.
+ `hara recall --init` scaffolds the directory with an example. Phase-C v0 — lexical-first (embeddings
+ deferred until proven necessary).
+
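The "ranks `*.md` assets by query-word matches" behavior of 0.15.0 could look roughly like the sketch below. It assumes assets are already loaded into memory; `Asset`, `rankAssets`, and the scoring rule (count of distinct query words present) are illustrative assumptions, not the real `recall` engine.

```typescript
// Hypothetical in-memory form of one *.md asset.
interface Asset {
  name: string;
  text: string;
}

// Score each asset by how many distinct query words it contains
// (case-insensitive substring match), drop zero scores, return the top `limit`.
function rankAssets(query: string, assets: Asset[], limit: number = 3): Asset[] {
  const words = Array.from(
    new Set(query.toLowerCase().split(/\s+/).filter((w) => w.length > 0))
  );
  return assets
    .map((asset) => {
      const hay = (asset.name + "\n" + asset.text).toLowerCase();
      const score = words.filter((w) => hay.indexOf(w) !== -1).length;
      return { asset, score };
    })
    .filter((x) => x.score > 0)
    .sort((x, y) => y.score - x.score)
    .slice(0, limit)
    .map((x) => x.asset);
}
```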
+ ## 0.14.1 — unreleased (planner: objective verify gate)
+
+ - **`hara plan` verify can run a command** — an atom may carry a `check` shell command; the verify
+ gate passes only if it exits 0 (objective), falling back to the LLM self-check when no `check` is
+ given. Makes plans trustworthy — e.g. `npm test`, `tsc --noEmit`, `test -f path`.
+
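The 0.14.1 verify gate can be pictured as the small decision function below. This is a sketch under assumptions: the `Atom` shape, `Verdict`, and the injected `runCheck` / `selfCheck` callbacks are hypothetical, standing in for a real shell runner and the LLM self-check.

```typescript
// Hypothetical atom: `check` is the optional shell command from the changelog entry.
interface Atom {
  id: string;
  check?: string;
}

type Verdict = { ok: boolean; via: "check" | "self-check" };

// If the atom carries a check command, the gate passes only on exit code 0.
// Otherwise it falls back to the (subjective) self-check.
function verifyAtom(
  atom: Atom,
  runCheck: (command: string) => number, // returns the command's exit code
  selfCheck: (atom: Atom) => boolean
): Verdict {
  if (atom.check !== undefined && atom.check.trim() !== "") {
    return { ok: runCheck(atom.check) === 0, via: "check" };
  }
  return { ok: selfCheck(atom), via: "self-check" };
}
```

Injecting the runner keeps the gate logic testable without spawning a process.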
+ ## 0.14.0 — unreleased (web_fetch)
+
+ - **`web_fetch`** — fetch an `http(s)` URL and return its text (HTML reduced to readable text), for
+ pulling docs / references / pages into context. Read-only, follows redirects, 30s timeout,
+ size-capped. Not sandboxed (network egress is in-process, not via `bash`).
+
+ ## 0.13.0 — unreleased (context management)
+
+ - **`/compact`** — summarize the conversation so far into a brief and replace the history with it, to
+ free up context in long sessions (preserves goal, decisions, files changed, next steps).
+ - **Context budget warning** — after a turn, if the context reaches ≥80% of the model's window, hara
+ warns and suggests `/compact` / `/reset`. (The status bar already shows live `ctx %`.)
+
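The ≥80% context budget rule from 0.13.0 is simple arithmetic. A minimal sketch, assuming token counts are available after each turn; `contextWarning` and its message text are hypothetical.

```typescript
// Returns a warning string once usage reaches the threshold share of the
// model's context window, or null while there is still headroom.
function contextWarning(
  usedTokens: number,
  windowTokens: number,
  threshold: number = 0.8
): string | null {
  if (windowTokens <= 0) return null; // unknown window size: nothing to compare against
  const ratio = usedTokens / windowTokens;
  if (ratio < threshold) return null;
  const pct = Math.round(ratio * 100);
  return "context at " + pct + "% of window; consider /compact or /reset";
}
```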
+ ## 0.12.0 — unreleased (rendered output + visible reasoning)
+
+ - **Markdown rendering** — assistant output renders in the terminal: headers, **bold**, `inline
+ code`, and bullets are styled; code fences pass through verbatim (copy-paste accurate). Line-buffered
+ streaming (`src/md.ts`); interactive terminal only — pipes/`-p` stay raw, disable with `HARA_MD=0`.
+ - **Reasoning/thinking display** — when a model streams reasoning (GLM-5 / DeepSeek `reasoning_content`,
+ or Anthropic thinking), hara shows it dimmed before the answer. Interactive terminal only.
+
+ ## 0.11.0 — unreleased (undo + live shell output)
+
+ - **`/undo`** — revert the last file change(s) made this session. Every edit tool
+ (`write_file`/`edit_file`/`apply_patch`) records the prior file state; `/undo` restores it (and
+ deletes files that were freshly created). In-session, up to 50 steps. (`src/undo.ts`)
+ - **Live bash output** — the `bash` tool now streams stdout/stderr **as the command runs**
+ (interactive terminal only) instead of waiting for completion. `runShell` rewritten on `spawn` with
+ an `onData` hook; the full output is still captured for the model.
+
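The `/undo` behavior in 0.11.0 (record prior state, restore it, delete freshly created files, cap at 50 steps) can be sketched against an in-memory file map. `UndoStack` and `Snapshot` are hypothetical names; the real `src/undo.ts` presumably works on disk.

```typescript
// prior === null means the file did not exist before the edit.
type Snapshot = { path: string; prior: string | null };

class UndoStack {
  private steps: Snapshot[] = [];
  private files: Map<string, string>;
  private limit: number;

  constructor(files: Map<string, string>, limit: number = 50) {
    this.files = files;
    this.limit = limit;
  }

  // Call before an edit tool writes: remember what was there.
  record(path: string): void {
    const prior = this.files.get(path);
    this.steps.push({ path, prior: prior === undefined ? null : prior });
    if (this.steps.length > this.limit) this.steps.shift(); // drop the oldest step
  }

  // Restore the most recent snapshot; returns false when nothing is left to undo.
  undo(): boolean {
    const step = this.steps.pop();
    if (step === undefined) return false;
    if (step.prior === null) this.files.delete(step.path); // file was freshly created
    else this.files.set(step.path, step.prior);
    return true;
  }
}
```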
+ ## 0.10.0 — unreleased (multi-file patches + interrupt)
+
+ - **`apply_patch`** — change several files in one **atomic** step (all-or-nothing). `changes` is an
+ array of `{path, type:'update'|'create'|'delete', edits?|content?}`; everything is validated and
+ computed in memory first, and **nothing is written if any change fails**. Shows a diff per file.
+ Prefer it over multiple `edit_file` calls for multi-file work. (Shared edit core extracted to
+ `src/tools/apply-core.ts`, reused by `edit_file`.)
+ - **Esc interrupts a running turn** — press Esc while the agent is working to abort the in-flight
+ request and return to the prompt (the session is kept). Plumbed via `AbortSignal` through both
+ providers; an interrupt renders as a dim `(interrupted)`, not an error.
+
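The all-or-nothing behavior of `apply_patch` in 0.10.0 can be illustrated by staging every change on a copy and only handing the copy back if nothing failed. This sketch uses an in-memory file map; the `Edit` shape (`oldText`/`newText`) and the function name `applyPatch` are assumptions, since the changelog only gives the outer `changes` shape.

```typescript
type Edit = { oldText: string; newText: string }; // assumed edit shape
type Change =
  | { path: string; type: "update"; edits: Edit[] }
  | { path: string; type: "create"; content: string }
  | { path: string; type: "delete" };

// Apply every change to a staged copy. Any failure throws before the caller
// ever sees the copy, so the original `files` map is never partially modified.
function applyPatch(files: Map<string, string>, changes: Change[]): Map<string, string> {
  const staged = new Map(files);
  for (const c of changes) {
    if (c.type === "create") {
      if (staged.has(c.path)) throw new Error("already exists: " + c.path);
      staged.set(c.path, c.content);
    } else if (c.type === "delete") {
      if (!staged.delete(c.path)) throw new Error("not found: " + c.path);
    } else {
      let text = staged.get(c.path);
      if (text === undefined) throw new Error("not found: " + c.path);
      for (const e of c.edits) {
        const at = text.indexOf(e.oldText);
        if (at === -1) throw new Error("no match in " + c.path);
        text = text.slice(0, at) + e.newText + text.slice(at + e.oldText.length);
      }
      staged.set(c.path, text);
    }
  }
  return staged; // the caller commits this (or writes it out) only on success
}
```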
+ ## 0.9.0 — unreleased (daily-driver polish: streaming + diffs)
+
+ - **Streaming for OpenAI-compatible providers** — Qwen/GLM/OpenAI now stream tokens live (the whole
+ response used to appear at once). Tool calls are accumulated from the stream by index, and usage is
+ read from the final chunk (`stream_options.include_usage`). Anthropic already streamed.
+ - **Diff display on edits** — after `edit_file`/`write_file`, hara prints a colored unified diff
+ (`◇ path +N -M` with `+`/`-` lines) so you see exactly what changed. Zero-dependency line diff
+ (`src/diff.ts`); shown in an interactive terminal only (pipes/scripts stay clean).
+ - **Sturdier retries** — both SDK clients now retry transient errors (429/5xx/network) up to 4×.
+
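"Tool calls are accumulated from the stream by index" (0.9.0) refers to how OpenAI-compatible streams split one tool call across many chunks. A sketch of that accumulation follows; the delta shape mirrors the chat-completions streaming format, while `accumulateToolCalls` and `ToolCall` are illustrative names, not hara's code.

```typescript
// One streamed fragment of a tool call. `index` identifies which call it belongs to.
interface ToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

interface ToolCall {
  id: string;
  name: string;
  arguments: string; // JSON text, complete only after the stream ends
}

// Fold fragments into whole calls: id and name are set when they appear,
// argument text is concatenated in arrival order.
function accumulateToolCalls(deltas: ToolCallDelta[]): ToolCall[] {
  const calls: ToolCall[] = [];
  for (const d of deltas) {
    const call = calls[d.index] ?? (calls[d.index] = { id: "", name: "", arguments: "" });
    if (d.id) call.id = d.id;
    if (d.function?.name) call.name = d.function.name;
    if (d.function?.arguments) call.arguments += d.function.arguments;
  }
  return calls;
}
```

Parsing `arguments` as JSON is only safe once the stream has finished, because intermediate states are partial text.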
+ ## 0.8.0 — unreleased (atomization planner — the org plans, not just routes)
+
+ - **`hara plan "<task>"` / `/plan`** — decompose a task into atoms, sequence them as a DAG, then
+ execute each step (optionally routed to a role) behind a **verify gate**. This is the execution
+ methodology made real: frame → atomize → sequence → execute → verify.
+ - **Planner** (`src/org/planner.ts`): `decompose` (LLM → atoms + deps), `topoOrder` (Kahn ordering +
+ cycle detection), per-atom `verify` (checks the step's done-criteria), and an SSOT plan state at
+ `.hara/org/plan.json` — inspectable, and execution stops on the first failed verification.
+ - Atoms may carry a `role`, so the planner routes steps to the org's role-agents
+ (implementer/reviewer/docs) with their persona, tool subset, and model.
+
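The planner's `topoOrder` is described as Kahn ordering with cycle detection. A minimal sketch of Kahn's algorithm is given below; the `Atom` shape (`id` plus `deps`) is an assumption, and the actual signature in `src/org/planner.ts` may differ.

```typescript
// Hypothetical atom: `deps` lists the ids that must finish first.
interface Atom {
  id: string;
  deps: string[];
}

// Kahn's algorithm: repeatedly emit atoms whose dependencies are all done.
// If some atoms can never be emitted, the graph contains a cycle.
function topoOrder(atoms: Atom[]): string[] {
  const indegree = new Map<string, number>();
  const dependents = new Map<string, string[]>();
  for (const a of atoms) {
    indegree.set(a.id, 0);
    dependents.set(a.id, []);
  }
  for (const a of atoms) {
    for (const d of a.deps) {
      const list = dependents.get(d);
      if (list === undefined) throw new Error("unknown dependency: " + d);
      list.push(a.id);
      indegree.set(a.id, (indegree.get(a.id) as number) + 1);
    }
  }
  const ready = atoms.filter((a) => indegree.get(a.id) === 0).map((a) => a.id);
  const order: string[] = [];
  while (ready.length > 0) {
    const id = ready.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) as string[]) {
      const remaining = (indegree.get(next) as number) - 1;
      indegree.set(next, remaining);
      if (remaining === 0) ready.push(next);
    }
  }
  if (order.length !== atoms.length) throw new Error("cycle detected in plan");
  return order;
}
```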
+ ## 0.7.0 — unreleased (fuzzy matching + did-you-mean)
+
+ - **Fuzzy `@file` completion** — `@path` now ranks by a built-in subsequence fuzzy matcher (zero new
+ deps): `@idx` finds `src/index.ts`, `@sc` finds `src/`. Handles insertions/skips (not transpositions).
+ - **Path did-you-mean** — when `read_file`/`edit_file` get a path that doesn't exist, the error now
+ suggests the nearest real project files ("Did you mean: src/index.ts?") instead of just failing.
+ - **Slash-command did-you-mean** — a mistyped command suggests the closest one ("`/modl` → Did you
+ mean /model?").
+ - New `src/fuzzy.ts` (`fuzzyScore`/`fuzzyRank`/`nearest`) + `nearestPaths` in `fs-walk.ts`.
+
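A subsequence fuzzy matcher of the kind 0.7.0 describes can be sketched as follows. The function names echo `fuzzyScore`/`fuzzyRank` from the entry, but the signatures and scoring weights here are assumptions for illustration, not the contents of `src/fuzzy.ts`.

```typescript
// Returns -1 when `query` is not a subsequence of `target` (case-insensitive),
// otherwise a non-negative score that rewards consecutive character runs.
function fuzzyScore(query: string, target: string): number {
  const q = query.toLowerCase();
  const t = target.toLowerCase();
  let score = 0;
  let from = 0;
  let prevMatch = -2;
  for (let qi = 0; qi < q.length; qi++) {
    const found = t.indexOf(q.charAt(qi), from);
    if (found === -1) return -1; // a character is missing or out of order
    score += found === prevMatch + 1 ? 2 : 1;
    prevMatch = found;
    from = found + 1;
  }
  return score;
}

// Keep only candidates that match, best score first.
function fuzzyRank(query: string, candidates: string[]): string[] {
  return candidates
    .map((c) => ({ c, s: fuzzyScore(query, c) }))
    .filter((x) => x.s >= 0)
    .sort((x, y) => y.s - x.s)
    .map((x) => x.c);
}
```

Because characters must appear in order, skipped characters are tolerated but a transposed query such as `ixd` does not match `src/index.ts`.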
+ ## 0.6.0 — unreleased (CLI UX + search tools)
+
+ - **Status bar** — a persistent footer pinned below the REPL transcript (terminal scroll region):
+ session name · the three approval modes with the current one highlighted · live token usage + ctx% ·
+ a concurrent-operation count (`⛁ N`). TTY-only; degrades to the plain after-turn status line when
+ piped. Disable with `HARA_FOOTER=0`.
+ - **Approval mode switching** — bare `/approval` now cycles suggest → auto-edit → full-auto (still
+ `/approval <mode>` to set); **shift+tab** cycles it from anywhere (TTY).
+ - **Search tools** — `grep` (regex across files, `path:line: text`), `glob` (`**`/`*`/`?` path
+ patterns), `ls` (one directory). All read-only, so they never prompt and run in parallel.
+ - **Parallel safe-tool execution** — read-only tool calls in a turn now run concurrently (edit/exec
+ still run alone, in order); the footer's `⛁` count reflects live concurrency.
+ - **`edit_file` hardened** — accepts multiple `edits` applied in order, and falls back to
+ quote-insensitive matching (straight ↔ curly) when an exact match isn't found.
+ - **`@file` completion fixed** — now walks subdirectories (git-tracked + untracked, or a filesystem
+ walk outside git), drills into directories (`@src/…`), and works in non-git projects. Previously it
+ only consulted `git ls-files` and silently returned nothing otherwise.
+
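The quote-insensitive fallback that hardened `edit_file` in 0.6.0 might work along these lines. A sketch only: `normalizeQuotes` and `findWithQuoteFallback` are hypothetical helpers, and the real tool may normalize differently.

```typescript
// Map curly quotes to straight ones. Each replacement is one character for one
// character, so indices in the normalized text line up with the original.
function normalizeQuotes(s: string): string {
  return s.replace(/[\u2018\u2019]/g, "'").replace(/[\u201C\u201D]/g, '"');
}

// Try an exact match first; only if that fails, compare quote-normalized text.
// Returns the match index in `haystack`, or -1.
function findWithQuoteFallback(haystack: string, needle: string): number {
  const exact = haystack.indexOf(needle);
  if (exact !== -1) return exact;
  return normalizeQuotes(haystack).indexOf(normalizeQuotes(needle));
}
```

Keeping the exact match as the first attempt means the fallback never changes the result for edits that already matched.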
+ ## 0.5.0 — unreleased (Phase 2: governed role-agent org — the differentiator)
+
+ - **Roles** — markdown role-agents in `.hara/roles/*.md` (frontmatter: `name`, `description`, `owns[]`,
+ `rejects[]`, `model?`, `allowTools[]`/`denyTools[]`; body = persona). `hara roles` lists, `hara roles init` scaffolds.
+ - **Dispatcher** — `hara org "<task>"` routes a task to the role that **owns** it (keyword match → LLM
+ fallback), or `--role <id>` to force one; runs that role's agent with its persona, tool subset, and model.
+ `/org` and `/roles` in the REPL.
+ - hara now runs like an engineering org, not a single agent — a read-only `reviewer` vs an editing
+ `implementer`, each owning its slice of the work.
+
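The keyword stage of the 0.5.0 dispatcher could be approximated as below. This is a guess at the mechanics: it assumes `owns[]` and `rejects[]` hold plain keywords matched as case-insensitive substrings, and `routeByKeyword` is a hypothetical name. A `null` result stands for "no keyword match", where the entry says an LLM fallback takes over.

```typescript
// Subset of the role frontmatter relevant to routing.
interface RoleSpec {
  id: string;
  owns: string[];
  rejects: string[];
}

// Return the first role that owns a keyword found in the task and rejects none,
// or null when no role claims the task by keyword.
function routeByKeyword(task: string, roles: RoleSpec[]): string | null {
  const text = task.toLowerCase();
  const hit = (words: string[]): boolean =>
    words.some((w) => text.indexOf(w.toLowerCase()) !== -1);
  for (const role of roles) {
    if (hit(role.owns) && !hit(role.rejects)) return role.id;
  }
  return null;
}
```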
+ ## 0.4.0 — unreleased (Tier-3)
+
+ - **Sessions & resume** — conversations saved under `~/.hara/sessions`; `-c`/`--continue` resumes the latest
+ in the cwd, `--resume <id>` a specific one, `hara sessions` / `/sessions` list them.
+ - **MCP client** — connect stdio MCP servers via an `mcpServers` map in config (global or project);
+ their tools register as `mcp__<server>__<tool>` and become available to the agent.
+ - **OS sandboxing** — `--sandbox` / `config set sandbox` (`off` | `workspace-write` | `read-only`): the
+ `bash` tool runs under macOS Seatbelt — workspace-write confines writes to the project (+ temp),
+ read-only blocks writes. Non-macOS runs unsandboxed (the approval gate still applies).
+
+ ## 0.3.0 — unreleased (Tier-2 coding-CLI polish)
+
+ - **Approval modes** — `suggest` (confirm edits & shell), `auto-edit` (auto file edits, confirm shell),
+ `full-auto` (no prompts). Set via `--approval`, `hara config set approval`, or `/approval`; `-y` = full-auto.
+ - **Slash-command registry** — `/help` `/init` `/tools` `/model` `/approval` `/usage` `/reset` `/exit`,
+ data-driven (auto-listed in `/help`).
+ - **Config profiles & project config** — named `profiles` in `~/.hara/config.json` (`--profile` /
+ `HARA_PROFILE`), plus a project-level `.hara/config.json` that overrides the global config.
+ - **Status line** — model + cumulative token usage (`↑in ↓out`) after each turn and in `-p` output;
+ `/usage` shows it on demand.
+
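The three approval modes from 0.3.0 reduce to a small decision table. The sketch below encodes it, plus the suggest → auto-edit → full-auto cycling that a later release adds. `ToolKind`, `needsConfirm`, and `nextMode` are hypothetical names, and the sketch assumes every tool is classified as read, edit, or shell.

```typescript
type ApprovalMode = "suggest" | "auto-edit" | "full-auto";
type ToolKind = "read" | "edit" | "shell"; // assumed classification of tools

// suggest: confirm edits and shell; auto-edit: confirm shell only;
// full-auto: never confirm. Read-only tools never prompt.
function needsConfirm(mode: ApprovalMode, kind: ToolKind): boolean {
  if (kind === "read") return false;
  if (mode === "full-auto") return false;
  if (mode === "auto-edit") return kind === "shell";
  return true;
}

const MODE_ORDER: ApprovalMode[] = ["suggest", "auto-edit", "full-auto"];

// Advance to the next mode, wrapping from full-auto back to suggest.
function nextMode(mode: ApprovalMode): ApprovalMode {
  return MODE_ORDER[(MODE_ORDER.indexOf(mode) + 1) % MODE_ORDER.length];
}
```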
+ ## 0.2.0 — unreleased (coding-CLI features, borrowed from Codex)
+
+ - **Project context (`AGENTS.md`)** — auto-loaded each run (walks up to the project root, concatenates,
+ 32 KiB cap). On first run in a project with no `AGENTS.md`, hara offers to analyze the repo and write
+ one; `hara init` / `/init` (re)generate it. Uses the cross-tool `AGENTS.md` standard.
+ - **`@file` mentions** — `@path` in the REPL or `-p` attaches that file's contents to your message;
+ Tab-completes `@paths` from `git ls-files`.
+ - **`edit_file` tool** — surgical exact-string edits to existing files (unique-match guard / `replace_all`),
+ instead of overwriting whole files with `write_file`. Behind the same confirm gate.
+
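The `AGENTS.md` loading rule in 0.2.0 (walk up to the project root, concatenate, cap at 32 KiB) might be sketched as follows. Several details are assumptions: POSIX-style paths, root-first ordering, a cap counted in UTF-16 code units rather than bytes, and the injected `readAgentsMd` reader, which is hypothetical.

```typescript
const CAP = 32 * 1024; // approximates the 32 KiB cap in characters, not bytes

// Collect AGENTS.md text from `root` down to `cwd`, join it, and truncate.
// `readAgentsMd(dir)` returns that directory's AGENTS.md text, or null if absent.
function collectAgentsMd(
  cwd: string,
  root: string,
  readAgentsMd: (dir: string) => string | null
): string {
  const dirs: string[] = [];
  let dir = cwd;
  while (true) {
    dirs.unshift(dir); // ends up root first, cwd last
    if (dir === root) break;
    const parent = dir.slice(0, dir.lastIndexOf("/")) || "/";
    if (parent === dir) break; // hit the filesystem root without seeing `root`
    dir = parent;
  }
  const parts: string[] = [];
  for (const d of dirs) {
    const text = readAgentsMd(d);
    if (text !== null) parts.push(text);
  }
  return parts.join("\n\n").slice(0, CAP);
}
```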
+ ## 0.1.0 — unreleased (first functional release)
+
+ - Streaming **agentic loop** with a manual tool-use cycle.
+ - Built-in tools: `read_file`, `write_file`, `bash`, with a **human-in-the-loop confirmation gate**
+ on the dangerous ones (`write_file`, `bash`) unless `-y` is passed.
+ - Interactive **REPL** (`/help`, `/tools`, `/model`, `/reset`, `/exit`), one-shot `-p` mode, `-y`/`-m` flags.
+ - **Multi-provider**: Anthropic (Claude — streaming + adaptive thinking) and any OpenAI-compatible
+ endpoint (Qwen/DashScope, GLM, Kimi, OpenAI) via a provider-neutral conversation core.
+ - **`hara config`** (`provider` / `apiKey` / `model` / `baseURL`) → `~/.hara/config.json`; env vars override.
+ - Offline **test suite** for the built-in tools.
+ - Dual-licensed **MIT OR Apache-2.0**; CLA in place.
+
+ ## 0.0.2
+
+ - Placeholder package reserving `@nanhara/hara` on npm (dual MIT/Apache + CLA, functional stub).
package/CLA.md CHANGED
@@ -26,7 +26,7 @@ agree that:
 
  3. **Right to relicense.** You agree the Maintainer may license Your
  contributions to third parties under the Project's then-current open-source
- license(s) (**MIT OR Apache-2.0**) **and** under separate terms, including
+ license (**Apache-2.0**) **and** under separate terms, including
  commercial/proprietary licenses. This lets the Project sustain itself via an
  open-core model without re-contacting every contributor.