@link-assistant/hive-mind 1.58.0 → 1.59.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/CHANGELOG.md CHANGED
@@ -1,5 +1,150 @@
  # @link-assistant/hive-mind
 
+ ## 1.59.0
+
+ ### Minor Changes
+
+ - 903b10e: Add `--auto-input-until-mergeable` (issue #1708): a new experimental
+   mode that extends a single Claude session for as long as possible by
+   streaming PR/issue comments, CI/CD failures, uncommitted-changes
+   status, and PR/issue title/body updates as NDJSON `user` frames into
+   the live `claude --input-format stream-json` process — instead of
+   killing the process and restarting with the feedback prepended to a
+   fresh prompt.
+
+   What it ships:
+   - Three new flags in `src/solve.config.lib.mjs`, all defaulting to
+     `false` and marked `[EXPERIMENTAL]`:
+     - `--auto-input-until-mergeable` — top-level opt-in for the new
+       behavior. Implies `--accept-incomming-comments-as-input` and
+       defaults to `--queue-comments-to-input` so the AI can finish its
+       current step before being interrupted.
+     - `--stream-comments-to-input` — forward each comment immediately
+       as it arrives. Default for `--accept-incomming-comments-as-input`
+       on its own (preserves the existing #817 behavior).
+     - `--queue-comments-to-input` — buffer comments while the AI is
+       busy and flush them only on `result` events. Default delivery
+       mode for `--auto-input-until-mergeable`. Mutually exclusive with
+       `--stream-comments-to-input`; queue mode wins if both are set.
+   - Queue-vs-stream delivery wired into
+     `src/bidirectional-interactive.lib.mjs#createBidirectionalHandler`:
+     - New `deliveryMode` option (`'stream'` / `'queue'`) plus
+       `markAiBusy()` / `markAiIdle()` lifecycle methods exposed on the
+       handler.
+     - In queue mode, comment frames and status frames are buffered in
+       `pendingFrames` while busy and FIFO-flushed to stdin on the next
+       `result` event. In stream mode, frames go to stdin immediately as
+       today.
+   - Status streaming (only when `--auto-input-until-mergeable` is on)
+     in `src/bidirectional-interactive.lib.mjs#checkForStatusChanges`:
+     - New parallel poller emits one-shot NDJSON frames for: PR
+       title/body changes, issue title/body changes (Issue #1708 G1),
+       uncommitted local changes (`git status --porcelain`), and CI
+       blockers (via `getMergeBlockers`).
+     - Each change is keyed by a stable signature so the same failing
+       check doesn't re-emit on every poll; failures in any sub-check
+       are swallowed and logged so the poller never breaks the live
+       Claude session.
+   - Stream parser in `src/claude.lib.mjs#executeClaudeCommand` now
+     signals `markAiBusy()` on `assistant` / `tool_use` / `tool_result`
+     events and `markAiIdle()` on `result` events, so queue-mode
+     buffering tracks the actual AI lifecycle.
+   - `src/solve.auto-merge.lib.mjs#watchUntilMergeable` logs a
+     "streaming-first" banner when `--auto-input-until-mergeable` was
+     active, so it is clear the auto-restart loop is the fallback rather
+     than the primary handler.
+   - For non-Claude tools, the validator continues to warn and disable
+     all four flags — the existing #817 fallback path. The default
+     behavior of every existing flag
+     (`--auto-restart-until-mergeable`, `--auto-merge`, etc.) is
+     preserved (R4: "must not break any existing features").
+   - Tests:
+     `tests/test-auto-input-until-mergeable-1708.mjs` (59 assertions)
+     and 11 new assertions in
+     `tests/test-bidirectional-interactive.mjs` cover flag composition,
+     queue-vs-stream routing, FIFO flushing on idle, busy-flag
+     preservation across stream-mode writes, default-deliveryMode is
+     stream, status-frame stamping with the right header per kind
+     (`comment` / `ci` / `uncommitted` / `metadata`), and metadata
+     diff/snapshot helpers.
+
+   The case study at `docs/case-studies/issue-1708/` is updated to
+   reflect that R1, R2 (Claude path), R3 (PR/issue title+body, CI,
+   uncommitted, comments), R4, R5, R6, plus G1, G5, G7 are addressed
+   here. Codex/Agent/OpenCode still degrade gracefully (no mid-session
+   NDJSON channel upstream) and use the existing `watchUntilMergeable`
+   loop as documented in G4.
+
78
+ - 6efcab4: Fix cost / token calculation correctness, unify Total / sub-session format,
+   add verbose budget trace, and case study for issue #1710
+
+   Resolves the four "strange things" the issue reported by changing both the
+   public-pricing math and the rendered output:
+   - **R1 — `$0.040000` residual eliminated.** `calculateModelCost`
+     ([`src/claude.lib.mjs`](./src/claude.lib.mjs)) now bills Anthropic
+     server-side tools. `web_search` is charged at the documented
+     $10 / 1 000 requests rate (= $0.01 / req) via the new constants module
+     [`src/anthropic-server-tool-pricing.lib.mjs`](./src/anthropic-server-tool-pricing.lib.mjs).
+     For the issue's PR #1707 run that comes out to exactly the previously-shown
+     $0.040000 / +0.16% delta, so the public-pricing total now reconciles with
+     Anthropic's reported `total_cost_usd`. `accumulateModelUsage`
+     ([`src/claude.budget-stats.lib.mjs`](./src/claude.budget-stats.lib.mjs))
+     also picks up `usage.server_tool_use.web_search_requests` from JSONL.
+   - **R2 — Haiku sub-session line includes input information.** Sub-agent
+     models never appear as the responding model in the parent JSONL, so
+     `peakContextUsage` stays at `0`. The fallback in `buildBudgetStatsString`
+     now emits the cumulative `(X new + Y cache writes [+ Z cache reads])`
+     phrase instead of dropping the input information entirely.
+   - **R3/R5 — Sub-session and Total reconcile.** The bullet line is now
+     labelled `peak request: …` so it cannot be confused with the cumulative
+     Total line. `requestContext` (the source of `peakContextByModel`) excludes
+     cache reads, so the bullet figure is `input + cache_creation` and is
+     reconcilable with the cumulative non-cached total. Cache reads remain
+     visible — and visible separately — on the Total line.
+   - **R4 — Total always splits cache reads / cache writes when present.**
+     The conditional that previously keyed on `cacheReadTokens` only is replaced
+     with a `buildCumulativeInputPhrase` helper that emits
+     `(X new + W cache writes + Y cache reads) input tokens` when both kinds of
+     cache activity exist, `(X new + W cache writes)` when only writes exist
+     (the Haiku case that triggered the issue), and the back-compat
+     `(X + Y cached)` form when only reads exist (so common Opus-only output
+     is unchanged). Cache writes are billed at 1.25× / 2× of input — fusing
+     them silently into the input figure was a real semantic bug, not a
+     cosmetic one.
+
+   Both `displayBudgetStats` (solver-log renderer) and `buildBudgetStatsString`
+   (PR-comment renderer) share the helper, so the two paths render identically.
+
+   Also adds **`dumpBudgetTrace`**
+   ([`src/claude.budget-stats.lib.mjs`](./src/claude.budget-stats.lib.mjs)),
+   a verbose-only structured per-model trace (peak request, cumulative
+   input/cache_write 5m+1h split/cache_read/output, server-tool counts with
+   implied dollar cost, public and Anthropic-reported costs, and the data
+   source) that fires from `displayBudgetStats` only when `{verbose: true}` is
+   set, so the default solver output is unchanged. The trace captures all the
+   inputs that drive the renderer in one place, so the next "calculation
+   correctness" report can be triaged from a saved log alone.
+
+   Tests:
+   - `tests/test-issue-1710-budget-trace.mjs` — 10 cases for the verbose trace.
+   - `tests/test-issue-1710-format-fixes.mjs` — 8 cases locking each requirement
+     to numbers from `docs/case-studies/issue-1710/facts.md` (the actual
+     PR #1707 result event the issue quotes).
+
+   Documentation: `docs/case-studies/issue-1710/` contains the root-cause
+   analysis (per symptom, with file:line citations), the captured facts, and
+   the (now-implemented) solution plans.
+
+   Also fixes the hosted-CI flake that surfaced while validating this PR:
+   `use-m` occasionally hands back a truncated/corrupt global package after
+   `npm install -g`, surfacing as either
+   `Failed to import module from '...': SyntaxError: Unexpected end of input`
+   or `Failed to resolve the path to '<pkg>'` when use-m loads `getenv` /
+   `links-notation` from `src/config.lib.mjs` and `src/lino.lib.mjs`. Adds
+   `src/use-with-retry.lib.mjs`, a small wrapper around `use(...)` that
+   recognises both flake modes, removes the broken alias directory, and
+   re-fetches once. Covered by `tests/test-use-with-retry.mjs` (13 cases).
+
  ## 1.58.0
 
  ### Minor Changes
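The queue-vs-stream delivery described in the 1.59.0 notes above can be sketched in a few lines. Only the names `deliveryMode`, `markAiBusy()`, `markAiIdle()`, and `pendingFrames` come from the changelog; everything else here (the factory name, the frame shape, the mock stdin) is a hypothetical stand-in for the real `createBidirectionalHandler`, not its implementation:

```javascript
// Simplified sketch of queue-vs-stream frame delivery. `stdin` is anything
// with a write(string) method; the real handler writes NDJSON frames to the
// live `claude --input-format stream-json` process.
function createHandlerSketch(stdin, { deliveryMode = 'stream' } = {}) {
  let aiBusy = false;
  const pendingFrames = []; // FIFO buffer, used only in queue mode

  const writeFrame = frame => stdin.write(JSON.stringify(frame) + '\n');

  return {
    deliver(frame) {
      if (deliveryMode === 'queue' && aiBusy) {
        pendingFrames.push(frame); // hold until the AI reports a result
      } else {
        writeFrame(frame); // stream mode: forward immediately
      }
    },
    // Called on `assistant` / `tool_use` / `tool_result` stream events.
    markAiBusy() {
      aiBusy = true;
    },
    // Called on `result` events: flush everything buffered, in arrival order.
    markAiIdle() {
      aiBusy = false;
      while (pendingFrames.length > 0) writeFrame(pendingFrames.shift());
    },
  };
}

// Tiny demo: queue mode buffers while busy, then flushes FIFO on idle.
const written = [];
const handler = createHandlerSketch(
  { write: s => written.push(s.trim()) },
  { deliveryMode: 'queue' },
);
handler.markAiBusy();
handler.deliver({ type: 'user', text: 'comment A' });
handler.deliver({ type: 'user', text: 'comment B' });
const writtenWhileBusy = written.length; // 0: nothing forwarded while busy
handler.markAiIdle(); // both frames flushed, in order
```

Flushing only on `result` events mirrors the lifecycle wiring the changelog describes: the stream parser marks the AI busy on `assistant` / `tool_use` / `tool_result` events and idle on `result` events, so queued feedback lands exactly between the AI's steps.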
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@link-assistant/hive-mind",
-   "version": "1.58.0",
+   "version": "1.59.0",
    "description": "AI-powered issue solver and hive mind for collaborative problem solving",
    "main": "src/hive.mjs",
    "type": "module",
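The three output shapes the R4 fix in the changelog above describes for `buildCumulativeInputPhrase` can be illustrated with a small sketch. The helper name and the three phrase forms come from the changelog; the parameter names, the raw number formatting, and the choice of `X` in the reads-only back-compat form (taken here to be new tokens) are assumptions, not the shipped code:

```javascript
// Illustrative sketch of the three phrase shapes described for R4.
// Token counts are printed raw; the real renderer may format them.
function buildCumulativeInputPhraseSketch(newTokens, cacheWrites, cacheReads) {
  if (cacheWrites > 0 && cacheReads > 0) {
    // Both kinds of cache activity: split all three figures out.
    return `(${newTokens} new + ${cacheWrites} cache writes + ${cacheReads} cache reads) input tokens`;
  }
  if (cacheWrites > 0) {
    // Writes only: the Haiku case that triggered the issue.
    return `(${newTokens} new + ${cacheWrites} cache writes) input tokens`;
  }
  if (cacheReads > 0) {
    // Reads only: back-compat `(X + Y cached)` form, so common
    // Opus-only output is unchanged.
    return `(${newTokens} + ${cacheReads} cached) input tokens`;
  }
  return `${newTokens} input tokens`;
}

console.log(buildCumulativeInputPhraseSketch(100, 50, 0));
// "(100 new + 50 cache writes) input tokens"
```

The point of splitting writes out, per the changelog, is that cache writes are billed at 1.25× / 2× of input, so fusing them into a single input figure misstated cost, not just presentation.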
package/src/anthropic-server-tool-pricing.lib.mjs ADDED
@@ -0,0 +1,34 @@
+ #!/usr/bin/env node
+
+ /**
+  * Issue #1710: Anthropic server-side tool pricing.
+  *
+  * `calculateModelCost` historically only billed token-based usage (input,
+  * cache_creation, cache_read, output). When a sub-agent uses Anthropic's
+  * server-side web_search tool, the result event reports `webSearchRequests`,
+  * which Anthropic bills at $10 / 1 000 searches ($0.01 / request) per
+  * <https://platform.claude.com/docs/en/about-claude/pricing#web-search-tool>.
+  *
+  * Without billing it locally, the public-pricing estimate disagreed with
+  * Anthropic's reported `total_cost_usd` by exactly that amount — the
+  * "Difference: $0.040000 (+0.16%)" line that issue #1710 quotes.
+  *
+  * Centralising the constants in this module keeps the source-of-truth in one
+  * file: bumping a price is a one-line edit, and `calculateModelCost` /
+  * `dumpBudgetTrace` both read from the same map.
+  */
+ export const SERVER_TOOL_PRICING_USD = Object.freeze({
+   // $10 per 1 000 searches = $0.01 per request.
+   // https://platform.claude.com/docs/en/about-claude/pricing#web-search-tool
+   web_search: { costPerRequest: 0.01, source: 'https://platform.claude.com/docs/en/about-claude/pricing#web-search-tool' },
+   // web_fetch is currently free for paying customers; kept here for
+   // completeness and so a future price change is a one-line edit.
+   web_fetch: { costPerRequest: 0, source: 'https://platform.claude.com/docs/en/about-claude/pricing#web-fetch-tool' },
+ });
+
+ /**
+  * Returns the per-request USD price for a server-side tool, or 0 if unknown.
+  * @param {string} tool - canonical tool name (e.g. "web_search")
+  * @returns {number} per-request price in USD
+  */
+ export const getServerToolPrice = tool => SERVER_TOOL_PRICING_USD[tool]?.costPerRequest || 0;
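Given the module above, folding server-tool usage into the public-pricing total reduces to a multiplication. In this sketch the pricing map and accessor are inlined so the example runs standalone; the four-request count is back-solved from the $0.040000 delta the changelog quotes (4 × $0.01), and the field name follows the JSONL path the changelog cites (`usage.server_tool_use.web_search_requests`). The surrounding variable names are illustrative, not the shipped `calculateModelCost` internals:

```javascript
// Inlined from src/anthropic-server-tool-pricing.lib.mjs above so this
// sketch runs standalone; real callers import from that module.
const SERVER_TOOL_PRICING_USD = Object.freeze({
  web_search: { costPerRequest: 0.01 },
  web_fetch: { costPerRequest: 0 },
});
const getServerToolPrice = tool => SERVER_TOOL_PRICING_USD[tool]?.costPerRequest || 0;

// Hypothetical slice of a result event's usage, shaped after the JSONL
// path `usage.server_tool_use.web_search_requests` from the changelog.
const serverToolUse = { web_search_requests: 4 };

// 4 searches x $0.01/request = $0.04: the "Difference: $0.040000 (+0.16%)"
// residual from issue #1710, now added to the public-pricing total.
const serverToolCost =
  serverToolUse.web_search_requests * getServerToolPrice('web_search');

console.log(serverToolCost.toFixed(6)); // "0.040000"
```

Because `getServerToolPrice` returns 0 for unknown tools, a caller can sum over whatever keys appear in `server_tool_use` without guarding against future tool names.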