npm - @link-assistant/hive-mind - Versions diffs - 1.58.0 → 1.59.1 - Mend

@link-assistant/hive-mind 1.58.0 → 1.59.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (16)

package/CHANGELOG.md +217 -0
package/package.json +1 -1
package/src/anthropic-server-tool-pricing.lib.mjs +34 -0
package/src/bidirectional-interactive.lib.mjs +392 -21
package/src/claude.budget-stats.lib.mjs +151 -26
package/src/claude.cost.lib.mjs +88 -0
package/src/claude.lib.mjs +46 -55
package/src/config.lib.mjs +5 -1
package/src/github-merge-repo-actions.lib.mjs +54 -0
package/src/github-merge.lib.mjs +24 -6
package/src/lino.lib.mjs +3 -1
package/src/queue-config.lib.mjs +7 -2
package/src/solve.auto-merge-helpers.lib.mjs +89 -7
package/src/solve.auto-merge.lib.mjs +27 -2
package/src/solve.config.lib.mjs +29 -0
package/src/use-with-retry.lib.mjs +107 -0

package/CHANGELOG.md CHANGED

@@ -1,5 +1,222 @@
 # @link-assistant/hive-mind
+## 1.59.1
+### Patch Changes
+- 65d7b99: Fix misleading `/merge` verbose logs that read as "no CI configured" when CI was actually
+  running — addresses issue [#1712](https://github.com/link-assistant/hive-mind/issues/1712)
+  where a user mistakenly Ctrl+C'd the auto-restart-until-mergeable watcher after seeing:
+  ```
+  [VERBOSE] /merge: PR #83 has no CI checks yet - treating as no_checks
+  [VERBOSE] /merge: PR #83 has no CI check-runs yet, but 1 workflow run(s) were triggered ...
+    ⏳ Waiting for CI:         Build and Release Docker Image
+  ```
+  The classification logic was correct — `/merge` was waiting on the legitimate 30-120s gap
+  between GitHub registering a `workflow_run` and publishing the corresponding `check_runs`.
+  The wording was the bug: "no CI checks yet" is parseable as "this repo has no CI", and the
+  listing showed run IDs without URLs, so the user couldn't quickly verify what `/merge` was
+  watching.
+  Changes:
+  - **`src/github-merge.lib.mjs`** — `getDetailedCIStatus` and `checkPRCIStatus` reword the
+    `no_checks` verbose lines to "has no check-runs or commit statuses registered yet",
+    including the short SHA. `getWorkflowRunsForSha` now appends `run.html_url` to every
+    entry. Normalized check-run / commit-status entries carry an `html_url` field
+    (falling back to `details_url` / `target_url`).
+  - **`src/solve.auto-merge-helpers.lib.mjs::getMergeBlockers`** — the `no_checks`,
+    `pending`, and `cancelled` branches now produce blocker `details` strings of the form
+    `"<name> [<status>] — <html_url>"`. The user-facing `⏳ Waiting for CI: …` line in
+    `solve.auto-merge.lib.mjs` (which joins `details` with commas) automatically picks up
+    the URLs, so the user can click through to the run.
+  - **`tests/test-misleading-merge-logs-1712.mjs`** — 13 unit tests covering the wording
+    guard, blocker enrichment for the no_checks / pending / cancelled paths, regression
+    guard for #1466, and the joined user-facing line format.
+  - **`docs/case-studies/issue-1712/README.md`** — full case study with raw logs, timeline,
+    root cause, fix description, and verification on the original PR
+    [link-foundation/box#83](https://github.com/link-foundation/box/pull/83) (where CI
+    passed after the user had killed the watcher prematurely).
+  Also extends the `useWithRetry` helper (originally added in #1710 to recover from corrupt
+  hosted-CI npm-install state) with a third failure mode: `ERR_INVALID_PACKAGE_CONFIG` —
+  seen in this branch's own CI run when Node refused to parse a truncated
+  `getenv-v-latest/package.json`. `src/queue-config.lib.mjs` now loads `getenv` and
+  `links-notation` through the retry wrapper, matching `config.lib.mjs` and `lino.lib.mjs`.
+  Three new unit tests in `tests/test-use-with-retry.mjs` cover the new mode.
+  No upstream issue is needed — the bug was entirely in `link-assistant/hive-mind`. The
+  external workflow finished successfully (`check-runs-dfc4c14.json` shows `total_count: 22`).
+  **Follow-up round** (after review feedback in
+  [PR #1713 comment](https://github.com/link-assistant/hive-mind/pull/1713#issuecomment-4342387674)):
+  - **List active runs across ALL PR commits, not just HEAD.** New
+    `getActivePRWorkflowRuns()` in `src/github-merge-repo-actions.lib.mjs` walks every
+    commit on the PR (`/repos/.../pulls/N/commits`), dedupes by `run.id`, returns groups
+    marked `head` / `older`. The verbose log now lists active runs on older commits under
+    per-commit URL headers, so the GitHub Actions tab (which shows yellow dots for older
+    commits) reconciles with the log.
+  - **Eliminate duplicate logging.** `getWorkflowRunsForSha(verbose=true)` already prints
+    every run; the no_checks branch no longer re-iterates `workflowRuns`, just emits a
+    single explanatory summary line.
+  - **Commit URLs instead of short SHAs.** Verbose lines that referenced
+    `${sha.substring(0, 7)}` now use `https://github.com/${owner}/${repo}/commit/${sha}`
+    (or `/pull/N/commits/${sha}` where the PR context matters).
+  - **Inline plain-English explanations.** New `STATUS_HINTS` / `CONCLUSION_HINTS`
+    dictionaries plus `explainStatus()` helper — verbose lines read
+    `[in_progress] (currently executing)` instead of bare `in_progress`.
+  - **Multi-line user-facing waiting message.** The `⏳ Waiting for CI:` line is now
+    rendered by `renderBlocker()` — single-line for the common case (one run), but each
+    detail on its own indented line when there are multiple.
+  - 8 new tests added to `tests/test-misleading-merge-logs-1712.mjs` (Groups 5–8); 21
+    total. #1480 (31/31) and #1466 (14/14) regression suites still pass.
+## 1.59.0
+### Minor Changes
+- 903b10e: Add `--auto-input-until-mergeable` (issue #1708): a new experimental
+  mode that extends a single Claude session for as long as possible by
+  streaming PR/issue comments, CI/CD failures, uncommitted-changes
+  status, and PR/issue title/body updates as NDJSON `user` frames into
+  the live `claude --input-format stream-json` process — instead of
+  killing the process and restarting with the feedback prepended to a
+  fresh prompt.
+  What it ships:
+  - Three new flags in `src/solve.config.lib.mjs`, all defaulting to
+    `false` and marked `[EXPERIMENTAL]`:
+    - `--auto-input-until-mergeable` — top-level opt-in for the new
+      behavior. Implies `--accept-incomming-comments-as-input` and
+      defaults to `--queue-comments-to-input` so the AI can finish its
+      current step before being interrupted.
+    - `--stream-comments-to-input` — forward each comment immediately
+      as it arrives. Default for `--accept-incomming-comments-as-input`
+      on its own (preserves the existing #817 behavior).
+    - `--queue-comments-to-input` — buffer comments while the AI is
+      busy and flush them only on `result` events. Default delivery
+      mode for `--auto-input-until-mergeable`. Mutually exclusive with
+      `--stream-comments-to-input`; queue mode wins if both are set.
+  - Queue-vs-stream delivery wired into
+    `src/bidirectional-interactive.lib.mjs#createBidirectionalHandler`:
+    - New `deliveryMode` option (`'stream'` / `'queue'`) plus
+      `markAiBusy()` / `markAiIdle()` lifecycle methods exposed on the
+      handler.
+    - In queue mode, comment frames and status frames are buffered in
+      `pendingFrames` while busy and FIFO-flushed to stdin on the next
+      `result` event. In stream mode, frames go to stdin immediately as
+      today.
+  - Status streaming (only when `--auto-input-until-mergeable` is on)
+    in `src/bidirectional-interactive.lib.mjs#checkForStatusChanges`:
+    - New parallel poller emits one-shot NDJSON frames for: PR
+      title/body changes, issue title/body changes (Issue #1708 G1),
+      uncommitted local changes (`git status --porcelain`), and CI
+      blockers (via `getMergeBlockers`).
+    - Each change is keyed by a stable signature so the same failing
+      check doesn't re-emit on every poll; failures in any sub-check
+      are swallowed and logged so the poller never breaks the live
+      Claude session.
+  - Stream parser in `src/claude.lib.mjs#executeClaudeCommand` now
+    signals `markAiBusy()` on `assistant` / `tool_use` / `tool_result`
+    events and `markAiIdle()` on `result` events, so queue-mode
+    buffering tracks the actual AI lifecycle.
+  - `src/solve.auto-merge.lib.mjs#watchUntilMergeable` logs a
+    "streaming-first" banner when `--auto-input-until-mergeable` was
+    active, so it is clear the auto-restart loop is the fallback rather
+    than the primary handler.
+  - For non-Claude tools, the validator continues to warn and disable
+    all four flags — the existing #817 fallback path. The default
+    behavior of every existing flag
+    (`--auto-restart-until-mergeable`, `--auto-merge`, etc.) is
+    preserved (R4: "must not break any existing features").
+  - Tests:
+    `tests/test-auto-input-until-mergeable-1708.mjs` (59 assertions)
+    and 11 new assertions in
+    `tests/test-bidirectional-interactive.mjs` cover flag composition,
+    queue-vs-stream routing, FIFO flushing on idle, busy-flag
+    preservation across stream-mode writes, `deliveryMode` defaulting to
+    `'stream'`, status-frame stamping with the right header per kind
+    (`comment` / `ci` / `uncommitted` / `metadata`), and metadata
+    diff/snapshot helpers.
+  The case study at `docs/case-studies/issue-1708/` is updated to
+  reflect that R1, R2 (Claude path), R3 (PR/issue title+body, CI,
+  uncommitted, comments), R4, R5, R6, plus G1, G5, G7 are addressed
+  here. Codex/Agent/OpenCode still degrade gracefully (no mid-session
+  NDJSON channel upstream) and use the existing `watchUntilMergeable`
+  loop as documented in G4.
+- 6efcab4: Fix cost / token calculation correctness, unify Total / sub-session format,
+  add verbose budget trace, and add a case study for issue #1710
+  Resolves the four "strange things" the issue reported by changing both the
+  public-pricing math and the rendered output:
+  - **R1 — `$0.040000` residual eliminated.** `calculateModelCost`
+    ([`src/claude.lib.mjs`](./src/claude.lib.mjs)) now bills Anthropic
+    server-side tools. `web_search` is charged at the documented
+    $10 / 1 000 requests rate (= $0.01 / req) via the new constants module
+    [`src/anthropic-server-tool-pricing.lib.mjs`](./src/anthropic-server-tool-pricing.lib.mjs).
+    For the issue's PR #1707 run that comes out to exactly the previously-shown
+    $0.040000 / +0.16% delta, so the public-pricing total now reconciles with
+    Anthropic's reported `total_cost_usd`. `accumulateModelUsage`
+    ([`src/claude.budget-stats.lib.mjs`](./src/claude.budget-stats.lib.mjs))
+    also picks up `usage.server_tool_use.web_search_requests` from JSONL.
+  - **R2 — Haiku sub-session line includes input information.** Sub-agent
+    models never appear as the responding model in the parent JSONL, so
+    `peakContextUsage` stays at `0`. The fallback in `buildBudgetStatsString`
+    now emits the cumulative `(X new + Y cache writes [+ Z cache reads])`
+    phrase instead of dropping the input information entirely.
+  - **R3/R5 — Sub-session and Total reconcile.** The bullet line is now
+    labelled `peak request: …` so it cannot be confused with the cumulative
+    Total line. `requestContext` (the source of `peakContextByModel`) excludes
+    cache reads, so the bullet figure is `input + cache_creation` and is
+    reconcilable with the cumulative non-cached total. Cache reads remain
+    visible — and visible separately — on the Total line.
+  - **R4 — Total always splits cache reads / cache writes when present.**
+    The conditional that previously keyed on `cacheReadTokens` only is replaced
+    with a `buildCumulativeInputPhrase` helper that emits
+    `(X new + W cache writes + Y cache reads) input tokens` when both kinds of
+    cache activity exist, `(X new + W cache writes)` when only writes exist
+    (the Haiku case that triggered the issue), and the back-compat
+    `(X + Y cached)` form when only reads exist (so common Opus-only output
+    is unchanged). Cache writes are billed at 1.25× (5-minute TTL) or 2×
+    (1-hour TTL) the base input price, so fusing them silently into the
+    input figure was a real semantic bug, not a cosmetic one.
+  Both `displayBudgetStats` (solver-log renderer) and `buildBudgetStatsString`
+  (PR-comment renderer) share the helper, so the two paths render identically.
+  Also adds **`dumpBudgetTrace`**
+  ([`src/claude.budget-stats.lib.mjs`](./src/claude.budget-stats.lib.mjs)),
+  a verbose-only structured per-model trace (peak request, cumulative
+  input/cache_write 5m+1h split/cache_read/output, server-tool counts with
+  implied dollar cost, public and Anthropic-reported costs, and the data
+  source) that fires from `displayBudgetStats` only when `{verbose: true}` is
+  set, so the default solver output is unchanged. The trace captures all the
+  inputs that drive the renderer in one place, so the next "calculation
+  correctness" report can be triaged from a saved log alone.
+  Tests:
+  - `tests/test-issue-1710-budget-trace.mjs` — 10 cases for the verbose trace.
+  - `tests/test-issue-1710-format-fixes.mjs` — 8 cases locking each requirement
+    to numbers from `docs/case-studies/issue-1710/facts.md` (the actual
+    PR #1707 result event the issue quotes).
+  Documentation: `docs/case-studies/issue-1710/` contains the root-cause
+  analysis (per symptom, with file:line citations), the captured facts, and
+  the (now-implemented) solution plans.
+  Also fixes the hosted-CI flake that surfaced while validating this PR:
+  `use-m` occasionally hands back a truncated/corrupt global package after
+  `npm install -g`, surfacing as either
+  `Failed to import module from '...': SyntaxError: Unexpected end of input`
+  or `Failed to resolve the path to '<pkg>'` when use-m loads `getenv` /
+  `links-notation` from `src/config.lib.mjs` and `src/lino.lib.mjs`. Adds
+  `src/use-with-retry.lib.mjs`, a small wrapper around `use(...)` that
+  recognises both flake modes, removes the broken alias directory, and
+  re-fetches once. Covered by `tests/test-use-with-retry.mjs` (13 cases).
 ## 1.58.0
 ### Minor Changes
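
The queue-versus-stream delivery described in the 1.59.0 entry above can be sketched roughly as follows. This is an illustrative reconstruction from the changelog text only, not the code in `src/bidirectional-interactive.lib.mjs`: the factory name `createDeliverySketch`, the `write` callback, the `send` and `pendingCount` methods, and the frame shape are all assumptions. Only `deliveryMode`, `pendingFrames`, `markAiBusy()` and `markAiIdle()` are names taken from the changelog.

```javascript
// Hypothetical sketch of queue-mode vs stream-mode frame delivery.
// `write` stands in for writing one NDJSON line to the Claude process's stdin.
function createDeliverySketch({ deliveryMode = 'stream', write }) {
  const pendingFrames = []; // FIFO buffer, used only in queue mode
  let aiBusy = false;

  // Serialize each buffered frame as one NDJSON line, oldest first.
  const flush = () => {
    while (pendingFrames.length > 0) {
      write(JSON.stringify(pendingFrames.shift()) + '\n');
    }
  };

  return {
    // Would be signalled on `assistant` / `tool_use` / `tool_result` events.
    markAiBusy() {
      aiBusy = true;
    },
    // Would be signalled on `result` events: the AI is idle, so flush the buffer.
    markAiIdle() {
      aiBusy = false;
      flush();
    },
    // Queue mode buffers while busy; stream mode always writes immediately.
    send(frame) {
      if (deliveryMode === 'queue' && aiBusy) {
        pendingFrames.push(frame);
      } else {
        write(JSON.stringify(frame) + '\n');
      }
    },
    pendingCount: () => pendingFrames.length,
  };
}

// Usage: two frames arrive while the AI is busy, then a `result` event arrives.
const written = [];
const handler = createDeliverySketch({
  deliveryMode: 'queue',
  write: line => written.push(line),
});
handler.markAiBusy();
handler.send({ type: 'user', text: 'first comment' }); // frame shape is a placeholder
handler.send({ type: 'user', text: 'CI failed' });
const bufferedWhileBusy = handler.pendingCount();
handler.markAiIdle();
console.log(bufferedWhileBusy, written.length); // 2 2
```

In this sketch the only difference between the two modes is the branch inside `send`, which is one way to keep the existing stream behavior untouched when queue mode is not requested.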

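The three output forms that the 1.59.0 entry attributes to `buildCumulativeInputPhrase` (requirement R4) can be illustrated with a small stand-in. The function below is a hypothetical sketch written from the changelog wording, not the package's helper: its name, its parameter names, the trailing `input tokens` suffix on every form, the unformatted numbers, and the no-cache fallback are assumptions, and the token counts in the usage lines are made up.

```javascript
// Hypothetical stand-in for the phrase selection described under R4.
function cumulativeInputPhraseSketch({ inputTokens = 0, cacheWriteTokens = 0, cacheReadTokens = 0 }) {
  const hasWrites = cacheWriteTokens > 0;
  const hasReads = cacheReadTokens > 0;

  if (hasWrites && hasReads) {
    // Both kinds of cache activity: show all three figures separately.
    return `(${inputTokens} new + ${cacheWriteTokens} cache writes + ${cacheReadTokens} cache reads) input tokens`;
  }
  if (hasWrites) {
    // Writes only: the case the changelog calls "the Haiku case".
    return `(${inputTokens} new + ${cacheWriteTokens} cache writes) input tokens`;
  }
  if (hasReads) {
    // Reads only: the back-compat form, so existing output is unchanged.
    return `(${inputTokens} + ${cacheReadTokens} cached) input tokens`;
  }
  // Assumption: the changelog does not describe the no-cache case.
  return `${inputTokens} input tokens`;
}

// Made-up token counts, purely to show each branch.
console.log(cumulativeInputPhraseSketch({ inputTokens: 120, cacheWriteTokens: 800, cacheReadTokens: 5000 }));
// (120 new + 800 cache writes + 5000 cache reads) input tokens
console.log(cumulativeInputPhraseSketch({ inputTokens: 120, cacheWriteTokens: 800 }));
// (120 new + 800 cache writes) input tokens
console.log(cumulativeInputPhraseSketch({ inputTokens: 120, cacheReadTokens: 5000 }));
// (120 + 5000 cached) input tokens
```

Keeping the selection in one function mirrors the changelog's point that both renderers share the helper and therefore cannot drift apart.
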
package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@link-assistant/hive-mind",
-  "version": "1.58.0",
+  "version": "1.59.1",
   "description": "AI-powered issue solver and hive mind for collaborative problem solving",
   "main": "src/hive.mjs",
   "type": "module",

package/src/anthropic-server-tool-pricing.lib.mjs ADDED

@@ -0,0 +1,34 @@
+#!/usr/bin/env node
+/**
+ * Issue #1710: Anthropic server-side tool pricing.
+ *
+ * `calculateModelCost` historically only billed token-based usage (input,
+ * cache_creation, cache_read, output). When a sub-agent uses Anthropic's
+ * server-side web_search tool, the result event reports `webSearchRequests`,
+ * which Anthropic bills at $10 / 1 000 searches ($0.01 / request) per
+ * <https://platform.claude.com/docs/en/about-claude/pricing#web-search-tool>.
+ *
+ * Without billing it locally, the public-pricing estimate disagreed with
+ * Anthropic's reported `total_cost_usd` by exactly that amount — the
+ * "Difference: $0.040000 (+0.16%)" line that issue #1710 quotes.
+ *
+ * Centralising the constants in this module keeps the source-of-truth in one
+ * file: bumping a price is a one-line edit, and `calculateModelCost` /
+ * `dumpBudgetTrace` both read from the same map.
+ */
+export const SERVER_TOOL_PRICING_USD = Object.freeze({
+  // $10 per 1 000 searches = $0.01 per request.
+  // https://platform.claude.com/docs/en/about-claude/pricing#web-search-tool
+  web_search: { costPerRequest: 0.01, source: 'https://platform.claude.com/docs/en/about-claude/pricing#web-search-tool' },
+  // web_fetch is currently free for paying customers; kept here for
+  // completeness and so a future price change is a one-line edit.
+  web_fetch: { costPerRequest: 0, source: 'https://platform.claude.com/docs/en/about-claude/pricing#web-fetch-tool' },
+});
+/**
+ * Returns the per-request USD price for a server-side tool, or 0 if unknown.
+ * @param {string} tool - canonical tool name (e.g. "web_search")
+ * @returns {number} per-request price in USD
+ */
+export const getServerToolPrice = tool => SERVER_TOOL_PRICING_USD[tool]?.costPerRequest || 0;
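
As a rough illustration of how the per-request prices above could feed a cost total, the sketch below redeclares a trimmed copy of the map (so the snippet runs on its own instead of importing the module) and sums the charges for a set of request counts. The helper `serverToolCostSketch` and the shape of its `counts` argument are hypothetical. The count of four `web_search` requests is inferred from the $0.040000 delta at $0.01 per request, not quoted from the issue.

```javascript
// Trimmed local copy of the pricing map from the module above (the `source` URLs are omitted).
const SERVER_TOOL_PRICING_USD = Object.freeze({
  web_search: { costPerRequest: 0.01 },
  web_fetch: { costPerRequest: 0 },
});

// Same lookup as the module: per-request USD price, or 0 for unknown tools.
const getServerToolPrice = tool => SERVER_TOOL_PRICING_USD[tool]?.costPerRequest || 0;

// Hypothetical helper: sums the charge over a { toolName: requestCount } object.
function serverToolCostSketch(counts) {
  return Object.entries(counts).reduce(
    (sum, [tool, requests]) => sum + requests * getServerToolPrice(tool),
    0
  );
}

// Four web searches at $0.01 each; web_fetch contributes nothing at a price of 0.
const cost = serverToolCostSketch({ web_search: 4, web_fetch: 2 });
console.log(cost.toFixed(6)); // 0.040000
```

Because unknown tool names resolve to a price of 0, a counter for a tool missing from the map would be silently unbilled in this sketch rather than raising an error.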