@nanhara/hara 0.33.0 → 0.48.0

package/CHANGELOG.md CHANGED
@@ -5,7 +5,158 @@ All notable changes to `@nanhara/hara`.
  > Versioning (pre-1.0, SemVer-style): the **minor** (middle) number bumps for a **new feature**; the
  > **patch** (last) number bumps for **optimizations/fixes of existing features**.
 
- ## 0.33.0 — unreleased (semantic recall + memory)
+ ## 0.48.0 — unreleased (chrome plugin: drive your real logged-in Chrome)
+
+ - New first-party **`chrome` plugin** — web automation via **`chrome-devtools-mcp`** against a **real Chrome with
+ a persistent-login profile** (sign into a site once, reused across runs), or attach to your running Chrome via
+ `--browserUrl http://127.0.0.1:9222`. The "drive my actual sessions" complement to the isolated-Playwright
+ `browser` plugin (enable one, not both) — this is the openclaw/cc-haha route.
+ - Shipped as an option (not auto-installed — `browser` stays the default). `chrome-devtools-mcp` verified
+ resolvable; both plugin manifests validated.
+
+ ## 0.47.0 — unreleased (browser plugin: reliable web automation via Playwright MCP)
+
+ - New first-party **`browser` plugin** wires the **Playwright MCP** (`@playwright/mcp`) into hara → the agent gets
+ reliable web automation: `mcp__browser__navigate / snapshot / click / type / fill_form …` acting on the page's
+ **DOM/accessibility tree** (selectors, auto-waiting), NOT screenshots or pixel coordinates. This is the
+ reliable counterpart to the fragile desktop `computer` tool — no permission walls, no coordinate-guessing.
+ - Ships a `web-automation` skill (snapshot-driven workflow; notes the `chrome-devtools-mcp` alternative for
+ driving your real logged-in Chrome, à la openclaw/cc-haha).
+ - Install: `hara plugin add file:<repo>/plugins/browser`; `npx playwright install chromium` once. Verified
+ `@playwright/mcp@0.0.76` resolves + the plugin loads (`hara doctor` → plugins: browser).
+
+ ## 0.46.0 — unreleased (screen control: bounded-failure circuit breaker)
+
+ - The `computer` tool now **stops after 3 consecutive failures** instead of letting the agent loop forever on a
+ broken setup (learned from codex, which bounds Computer Use attempts then gives up). After 3 in a row it
+ returns a clear stop + the likely cause (missing Accessibility/Screen Recording permission, or the app isn't
+ reachable) + how to fix; resets on any success. Each failure shows the running `[n/3]` count.
+
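The bounded-failure behaviour described in 0.46.0 can be sketched as a small counter. This is a minimal illustration, not hara's implementation: `FailureBreaker`, `MAX_FAILURES`, and the message wording are hypothetical names chosen for the example. Only the limit of 3, the reset on success, and the `[n/3]` count come from the changelog.

```typescript
// Illustrative sketch only: FailureBreaker and MAX_FAILURES are hypothetical names,
// not taken from hara's computer.ts.
const MAX_FAILURES = 3;

class FailureBreaker {
  private consecutive = 0;

  /** True once MAX_FAILURES failures have happened in a row. */
  get tripped(): boolean {
    return this.consecutive >= MAX_FAILURES;
  }

  /** Record one outcome. Returns null on success, otherwise a message carrying the [n/3] count. */
  record(ok: boolean, error = "unknown error"): string | null {
    if (ok) {
      this.consecutive = 0; // any success resets the streak
      return null;
    }
    // Clamp so repeated calls after tripping still report [3/3].
    this.consecutive = Math.min(this.consecutive + 1, MAX_FAILURES);
    const tag = `[${this.consecutive}/${MAX_FAILURES}]`;
    return this.tripped
      ? `${tag} stopping: ${error}. Check permissions and that the app is reachable.`
      : `${tag} failed: ${error}`;
  }
}
```

A caller would check `tripped` before each action and return the stop message instead of retrying.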
+ ## 0.45.1 — unreleased (activate via `open -a`; Accessibility gotcha)
+
+ - `activateApp` uses `open -a <app>` on macOS — `osascript … to activate` often left another window on top.
+ - Documented (gotcha #0 in `computer.ts`) that **cliclick needs the Accessibility permission, separate from
+ Screen Recording** — without it, clicks/keys silently no-op (the #1 cause of "it does nothing").
+
+ ## 0.45.0 — unreleased (screen control: activate, IME-safe typing)
+
+ - **`activate` action** — bring the target app to the foreground before screenshot/click. Fixes clicks landing
+ on the terminal hara runs in (the "Ghostty" problem): the agent must `activate WeChat` *first*.
+ - **IME-safe typing** — `type` now sets the clipboard and pastes (Cmd/Ctrl+V) instead of injecting keystrokes,
+ which a Chinese input method garbles. Reliable for **CJK + emoji** (verified pbcopy round-trip: `你好 hello 😀`);
+ falls back to keystrokes for ASCII if the clipboard set fails.
+ - The hard-won **RPA gotchas** (foreground trap, IME, Retina coords, grounding fragility, placeholder text like
+ "AAAA") are documented at the top of `computer.ts`.
+ - TUI: the type-ahead pool shows each queued line **highlighted** (accent color) above the input — no verbose
+ header (per feedback).
+
+ ## 0.44.0 — unreleased (type-ahead pool: visible + coalesced)
+
+ - The type-ahead queue is now a **visible pool**: messages typed while the agent works are listed above the
+ input (`📥 pool (N) — sent together when this turn finishes`), so Enter visibly *enters the pool* instead of
+ appearing to vanish (the reported "Enter disappeared / it didn't show up in the conversation pool").
+ - On turn-end the pool is **coalesced into one turn** — your "also do X" / "and Y" additions reach the agent
+ together, in order, rather than as separate sequential turns.
+ - Esc still clears the pool (stop means stop). 130 tests (+1 coalesce; existing type-ahead tests updated).
+
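The pool semantics above (FIFO entry, one coalesced message at turn end, Esc clears) can be sketched as follows. `TypeAheadPool` and its method names are hypothetical, and joining lines with a newline is an assumption; the changelog only says the additions arrive together and in order.

```typescript
// Illustrative sketch: TypeAheadPool is a hypothetical name, not hara's actual TUI code.
class TypeAheadPool {
  private lines: string[] = [];

  /** Enter while the agent is busy: the line joins the pool in FIFO order. */
  enqueue(line: string): number {
    this.lines.push(line);
    return this.lines.length; // current depth, e.g. for a "pool (N)" label
  }

  /** Esc: stop means stop, so nothing queued may fire later. */
  clear(): void {
    this.lines = [];
  }

  /** Turn end: hand back all queued lines as ONE message, then empty the pool. */
  drain(): string | null {
    if (this.lines.length === 0) return null; // a second drain is a no-op
    const message = this.lines.join("\n"); // assumption: newline-joined
    this.lines = [];
    return message;
  }
}
```

Because `drain` empties the pool before returning, calling it twice for the same turn end cannot send the same lines twice.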
+ ## 0.43.0 — unreleased (grounding for screen control — accurate clicks)
+
+ - The `computer` tool now **locates UI elements by description** instead of guessing pixels from a text read.
+ Pass `target` to `click`/`move` (e.g. "the Send button") — hara screenshots, asks a vision model for the
+ element's position (resolution-independent fractions, Retina-safe), and clicks there. New **`find`** action
+ returns coordinates without clicking.
+ - This is codex's "native computer-use" lesson applied **locally**: codex's `computer_use` is a remote browser
+ sandbox; hara grounds against your own screen + apps. Needs a grounding-capable vision model (e.g. a qwen-VL).
+ - `screenSize()` per OS converts fractions → click coords; `parseLocate` accepts per-mille/percent/fraction
+ replies (tested). cliclick installed → `hara doctor` shows screencapture ✓ + cliclick ✓.
+ - **Still requires you to grant macOS Screen Recording + Accessibility** to actually drive the screen — those
+ toggles can only be set by you in System Settings.
+
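To make the fraction-based grounding concrete, here is one way a reply in fraction, percent, or per-mille form could be normalised and then scaled to click coordinates. This is a sketch under stated assumptions: `parseLocateSketch`, `toPixels`, and the reply formats they accept are hypothetical, and hara's real `parseLocate` may use different rules. In particular, treating bare values above 1 as per-mille is a guess made for the example.

```typescript
// Illustrative sketch: the accepted reply formats are assumptions, not hara's parseLocate.
interface Fraction { x: number; y: number }
interface ScreenSize { width: number; height: number }

/** Normalise one token: "42%" is percent, "0.42" is a fraction, "420" is per-mille. */
function toFraction(token: string): number | null {
  const trimmed = token.trim();
  const value = parseFloat(trimmed);
  if (Number.isNaN(value)) return null;
  if (trimmed.endsWith("%")) return value <= 100 ? value / 100 : null;
  if (value <= 1) return value;
  return value <= 1000 ? value / 1000 : null;
}

/** Take the first two numbers in the reply as x and y; null if the reply is unusable. */
function parseLocateSketch(reply: string): Fraction | null {
  const tokens = reply.match(/\d+(?:\.\d+)?\s*%?/g);
  if (tokens === null || tokens.length < 2) return null;
  const x = toFraction(tokens[0]);
  const y = toFraction(tokens[1]);
  return x === null || y === null ? null : { x, y };
}

/** Scale fractions by the screen size (in points) to get click coordinates. */
function toPixels(at: Fraction, size: ScreenSize): Fraction {
  return { x: Math.round(at.x * size.width), y: Math.round(at.y * size.height) };
}
```

Working in fractions keeps the model's answer independent of the screenshot's pixel density, which is why the same reply can serve both Retina and non-Retina displays.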
+ ## 0.42.0 — unreleased (type-ahead: keep typing while the agent works)
+
+ - You can now **type while the agent is working** — the message enters a **FIFO queue** and is sent
+ automatically when the current turn finishes (the input box stays active mid-turn; a "⌨ working — Enter
+ queues" hint shows the depth). Fixes the "input does nothing while working" dogfooding feedback.
+ - **Esc stops everything** — interrupts the turn AND clears the queue, so a stopped turn never fires queued
+ messages. The queue drain is idempotent (guarded against double-send under React StrictMode).
+ - Expert-reviewed for queue correctness (FIFO, exactly-once), the Esc/abort UX, and input-handler conflicts.
+
+ ## 0.41.0 — unreleased (English session names, auto-summarized)
+
+ - After the first turn a session gets a short **English kebab-case name** summarizing what it's about
+ (e.g. `add-semantic-search`) via one tiny model call — replacing the literal first-message title. A non-English
+ conversation is translated to an English gist (pinyin only if untranslatable). Names stay short + ASCII.
+ - The stable session **id is still the UUID** (unchanged — this only improves the human-friendly name); falls
+ back to the lexical title if the naming call fails. New `slugify()` helper (tested).
+
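A kebab-case, ASCII-only slug of the kind described in 0.41.0 might be produced like this. The function below is a hypothetical stand-in (`slugifySketch`), not hara's `slugify()`; the 40-character default limit is an assumption, since the changelog only says names stay short.

```typescript
// Illustrative sketch: not hara's slugify(); maxLen = 40 is an assumed default.
function slugifySketch(title: string, maxLen = 40): string {
  const slug = title
    .normalize("NFKD")              // split accented letters into base letter + combining mark
    .replace(/[^\x00-\x7F]/g, "")   // drop everything that is not ASCII
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")    // collapse runs of other characters into one hyphen
    .replace(/^-+|-+$/g, "");       // trim hyphens at both ends
  // Truncate, then make sure the cut did not leave a trailing hyphen.
  return slug.slice(0, maxLen).replace(/-+$/, "");
}
```

An input with no ASCII letters or digits yields an empty string, so a caller would still need a fallback such as the lexical title mentioned above.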
+ ## 0.40.0 — unreleased (TUI polish: markdown rendering + numbered choices)
+
+ - The ink TUI now **renders assistant Markdown** (headers, bold, inline code, bullets; code fences kept
+ verbatim) instead of showing raw `**`/`##`/backticks. The renderer (`md.ts`) had only been wired into the
+ classic REPL; the default TUI showed markdown literally.
+ - **Selection prompts are numbered**: each choice shows `1.`, `2.`, … and you can **press the number to pick it
+ directly** (in addition to ↑↓ + Enter). The hint reads "↑↓ or 1–N to choose".
+
+ ## 0.39.0 — unreleased (hara commit — AI commit messages)
+
+ - **`hara commit`** generates a conventional-commits message from your staged diff, shows it, and commits after
+ a `Y/n` confirm. `-a` stages tracked changes first; the global `-y` skips the confirm. Pairs with `hara
+ review` (review → commit). Verified live (glm-5): generated `feat(util): add mul function` and committed it.
+ - Note: the skip-confirm reuses the global `-y/--yes` (a subcommand `-y` would collide with it — same lesson as
+ `hara plan resume`).
+
+ ## 0.38.0 — unreleased (hara review — review your changes)
+
+ - **`hara review`** reviews your uncommitted changes (`git diff HEAD`) for correctness bugs, security issues,
+ missing error handling, naming, and missing tests — grouped by severity (**Blocker / Should-fix / Nit**) with
+ file:line and concrete fixes. **Read-only**: it can read files for context but never edits. `--staged`
+ reviews staged changes; `--base <ref>` reviews against a ref (e.g. `main`).
+ - Verified live (glm-5): on a planted diff it flagged a hardcoded secret (Blocker), an unguarded divide, and
+ dead code, then gave a clear "do not merge" verdict.
+ - `codebase_search` added to the read-only tool set (so reviewers / sub-agents can search the repo).
+
+ ## 0.37.0 — unreleased (task-aware screenshots for screen control)
+
+ - Screenshots from the `computer` tool are now read with a **screenshot-tuned prompt** aimed at *acting*, not
+ transcribing: interactive elements (buttons/fields/menus) with labels and approximate positions, the active
+ element, and any errors. A text-only main model driving the desktop gets something it can actually click.
+ - New optional **`focus`** on the screenshot action ("the Login button") narrows the read to the current goal.
+ - Internal: `describeImages` gains `system`/`hint` options, `SCREENSHOT_SYSTEM` added, `ctx.describeImage`
+ takes a hint. (For contrast: codex's `computer_use` is a remote/hosted *browser* MCP plugin with no local
+ syscalls — hara stays **native + local** so it can operate your own desktop software.)
+
+ ## 0.36.0 — unreleased (resumable plans)
+
+ - **`hara plan resume`** continues the saved plan (`.hara/org/plan.json`): atoms already marked done are
+ skipped, pending/failed ones run. When a verify gate stops a plan midway, fix the issue and resume instead
+ of starting from scratch. Interrupted atoms (running/failed) reset to pending; works with `--parallel` too.
+ - Internal: execution extracted into a shared `executePlan` (skips completed atoms) used by both fresh runs and
+ resume; `loadPlan` wired into the CLI. Verified: a half-done plan resumed, skipped the done atom, ran only
+ the pending one.
+
+ ## 0.35.0 — unreleased (parallel plan execution — the org works in parallel)
+
+ - **`hara plan --parallel`** runs independent atoms concurrently. The planner already builds a dependency DAG;
+ now `topoWaves` groups atoms into dependency *waves* (every atom in a wave depends only on earlier waves), and
+ each wave's atoms execute at the same time. A diamond plan `a1 → (a2,a3) → a4` runs a2 and a3 together.
+ - This is the org differentiator made literal: not one agent stepping through a list, but a team working the
+ independent parts at once. Verified live (glm-5): two independent atoms ran in one wave and completed
+ out-of-order; both check-gates passed.
+ - Sequential remains the default (and is what interactive approval uses, since concurrent atoms can't share a
+ prompt). `hara plan` is full-auto, so `--parallel` is safe there. A wave stops the run if any of its atoms fail.
+ - Internal: `executeAtom` extracted (shared by both paths); `topoWaves(atoms)` added alongside `topoOrder`.
+
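The wave grouping described in 0.35.0 can be sketched as a layered topological sort. The code below is an illustration, not hara's `topoWaves`: the `Atom` shape (`id`, `deps`), the names `topoWavesSketch` and `runWaves`, and the error handling are all assumptions made for the example.

```typescript
// Illustrative sketch: the Atom shape and function names are hypothetical.
interface Atom { id: string; deps: string[] }

/** Group atoms into waves: every atom in a wave depends only on atoms in earlier waves. */
function topoWavesSketch(atoms: Atom[]): string[][] {
  const done = new Set<string>();
  let remaining = [...atoms];
  const waves: string[][] = [];
  while (remaining.length > 0) {
    const ready = remaining.filter((a) => a.deps.every((d) => done.has(d)));
    if (ready.length === 0) throw new Error("cycle or unknown dependency");
    waves.push(ready.map((a) => a.id));
    for (const a of ready) done.add(a.id);
    remaining = remaining.filter((a) => !done.has(a.id));
  }
  return waves;
}

/** Run each wave concurrently; stop the run if any atom in a wave fails. */
async function runWaves(atoms: Atom[], run: (id: string) => Promise<boolean>): Promise<boolean> {
  for (const wave of topoWavesSketch(atoms)) {
    const results = await Promise.all(wave.map((id) => run(id)));
    if (results.some((ok) => !ok)) return false;
  }
  return true;
}
```

For the diamond plan from the changelog, `a2` and `a3` land in the same wave, so `Promise.all` starts both before `a4` is considered.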
+ ## 0.34.0 — unreleased (incremental indexing)
+
+ - **`hara index` is now incremental.** Re-running it re-embeds only the files whose mtime changed since the
+ last build; unchanged files keep their existing vectors, and deleted files drop out. A changed embedding
+ model still forces a full rebuild. Output reports `(N embedded, M reused)`.
+ - Turns indexing from a run-once-and-go-stale command into something you can re-run after every edit. Measured
+ on hara's own repo with local `bge-m3`: full build **~68s** → unchanged rebuild **~0.4s** (~150×); editing one
+ file re-embeds just that file's chunks.
+ - Internal: each chunk records its source file's mtime; `buildIndex` returns `{total, embedded, reused}`.
+
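The mtime-based reuse described in 0.34.0 can be sketched in memory as follows. This is a simplified illustration, not hara's `buildIndex`: it assumes one chunk per file, a synchronous `embed` callback, and hypothetical type names (`IndexChunk`, `SourceFile`). Only the idea of recording each chunk's source mtime and the `{total, embedded, reused}` counts come from the changelog.

```typescript
// Illustrative sketch: one chunk per file and a synchronous embed() are simplifying assumptions.
interface IndexChunk { file: string; mtimeMs: number; vector: number[] }
interface SourceFile { path: string; mtimeMs: number; text: string }

function buildIndexSketch(
  files: SourceFile[],
  previous: IndexChunk[],
  embed: (text: string) => number[],
) {
  const previousByFile = new Map<string, IndexChunk>();
  for (const chunk of previous) previousByFile.set(chunk.file, chunk);

  const chunks: IndexChunk[] = [];
  let embedded = 0;
  let reused = 0;
  // Only current files are visited, so chunks of deleted files drop out on their own.
  for (const file of files) {
    const old = previousByFile.get(file.path);
    if (old !== undefined && old.mtimeMs === file.mtimeMs) {
      chunks.push(old); // unchanged mtime: keep the existing vector
      reused += 1;
    } else {
      chunks.push({ file: file.path, mtimeMs: file.mtimeMs, vector: embed(file.text) });
      embedded += 1;
    }
  }
  return { chunks, total: chunks.length, embedded, reused };
}
```

A real index would also store the embedding model's name and discard `previous` when it changes, matching the full-rebuild rule above.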
+ ## 0.33.0 — 2026-06-20 · first public release (semantic recall + memory)
 
  - **`recall` and `memory_search` go hybrid too.** The semantic layer added in 0.32 now also powers your
  code-asset library and durable memory — `hara index --assets` embeds `~/.hara/code-assets`, global skills,
package/README.md CHANGED
@@ -9,7 +9,7 @@
  🚧 **v0.33** · TypeScript · local-first · Apache-2.0
 
  **Highlights**
- - **An org, not just an agent** — `hara org "<task>"` routes work to the role that *owns* it; `hara plan "<task>"` decomposes a task into a verified DAG of atoms (frame → atomize → sequence → execute → **verify gate**).
+ - **An org, not just an agent** — `hara org "<task>"` routes work to the role that *owns* it; `hara plan "<task>"` decomposes a task into a verified DAG of atoms (frame → atomize → sequence → execute → **verify gate**), and `hara plan --parallel` runs independent atoms concurrently.
  - **Real terminal UX** — an **ink TUI**: bottom-pinned input box, **plan mode** (read-only → propose a plan → approve → execute), selectable approvals with "don't ask again", windowed reasoning, **paste images** (Ctrl+V) for vision models, light/dark theme.
  - **Persistent memory + self-evolution** — `memory_*` tools over global/project `MEMORY.md`; the agent recalls before acting, **proactively saves** durable facts, and grows its own playbooks (a lexical guard screens what it writes).
  - **Multi-provider, all streamed** — Anthropic (Claude) or any OpenAI-compatible endpoint (Qwen/DashScope, GLM, Kimi, OpenAI) with live Markdown + visible reasoning.
@@ -112,6 +112,10 @@ hara doctor # check your setup (auth / model / node / assets / ro
  hara roles init # scaffold role-agents (implementer / reviewer / docs)
  hara org "review src/ for bugs" # dispatch a task to the role that owns it (or --role <id>)
  hara plan "add a /health endpoint with a test" # decompose → sequence (DAG) → run each step + verify
+ hara plan --parallel "..." # run independent atoms concurrently · hara plan resume # continue a stopped plan
+ hara review # review uncommitted changes for bugs/security/missing tests (--staged · --base main)
+ hara commit # AI commit message from staged changes, then commit (-a to stage all · -y to skip confirm)
+ hara index # build the semantic search index (after: hara config set embedProvider ollama|qwen)
  hara -p "summarize @README.md and fix the lint errors in src/" # one-shot; @path attaches a file
  hara --approval auto-edit # suggest (default) | auto-edit | full-auto (-y = full-auto)
  hara --sandbox workspace-write # confine shell writes to the project (macOS Seatbelt)
@@ -163,14 +167,16 @@ not just keywords. By default they're lexical (zero setup). Configure an embeddi
  then `hara index` (repo, for `codebase_search`) / `hara index --assets` (code-assets, skills & memory) / `hara
  index --all`. A query like "read an image pasted from the clipboard" then surfaces `src/images.ts` even with no
  shared words. Indexes are rebuildable `.hara/index/` artifacts (self-`.gitignore`d, never committed); no native
- vector DB needed, and lexical still works when there's no index.
+ vector DB needed, and lexical still works when there's no index. Re-running `hara index` is **incremental** —
+ only changed files re-embed (a full repo rebuild that takes ~a minute re-runs in well under a second).
 
  **Approval modes**: `suggest` confirms edits & shell · `auto-edit` auto-applies file edits but confirms shell · `full-auto` runs everything.
  **Sandbox** (macOS): `--sandbox workspace-write|read-only` runs the `bash` tool under Seatbelt (writes confined to the project / blocked).
  **Screen control** (opt-in): the `computer` tool drives desktop software (screenshot → click/type), native per OS
  (mac `screencapture`+`cliclick` · Windows PowerShell · Linux `scrot`+`xdotool`). Off by default — enable a tier with
  `hara config set computerUse read|click|full` and allowlist apps with `hara config set computerApps "App, …"`. Guarded
- by the tier, the frontmost-app allowlist, a dangerous-key blocklist, and a once-per-session grant; screenshots are read via your vision model.
+ by the tier, the frontmost-app allowlist, a dangerous-key blocklist, and a once-per-session grant. Screenshots are read via your
+ vision model into **actionable** output — interactive elements + positions (pass `focus` to target what you're after) — so even a text-only main model can click.
  **Sessions**: conversations are saved automatically — `-c` / `--resume <id>` to continue, `hara sessions` to list.
  **MCP**: add an `mcpServers` map to config (global or project `.hara/config.json`); their tools appear to the agent as `mcp__<server>__<tool>`.
  **Profiles**: add a `profiles` map to `~/.hara/config.json` (`--profile <name>`), or drop a project-level `.hara/config.json` that overrides the global config.
@@ -190,7 +196,9 @@ sequences them as a DAG, and executes each step (optionally routed to a role) be
  **verify gate** — frame → atomize → sequence → execute → verify. Each atom may carry a `check` shell
  command, so verification is **objective** (e.g. `npm test`, `tsc --noEmit`) rather than a
  self-assessment. Plan state is the SSOT at `.hara/org/plan.json` (inspectable; execution stops on the
- first failed verification).
+ first failed verification — fix it and **`hara plan resume`** continues, skipping the atoms already done).
+ With **`hara plan --parallel`**, independent atoms (the same dependency wave) run **concurrently** — the org
+ works the independent parts at once, not one step at a time.
 
  ### What it can do