@link-assistant/hive-mind 1.59.4 → 1.59.5
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +122 -0
- package/package.json +1 -1
- package/src/github-merge-repo-actions.lib.mjs +53 -15
- package/src/github-merge.lib.mjs +61 -27
package/CHANGELOG.md
CHANGED

@@ -1,5 +1,127 @@
 # @link-assistant/hive-mind
 
+## 1.59.5
+
+### Patch Changes
+
+- bb24175: Fix `/merge` to correctly detect active CI runs on the default branch — issue
+  #1722.
+
+  The `/merge` command merged PR #1719 even though a CI/CD workflow run was
+  still in progress on `main`. The merge triggered a new run, which cancelled
+  the previous one. Verbose log:
+
+  ```
+  [VERBOSE] /merge: Checking for active CI runs on link-assistant/hive-mind branch main...
+  [VERBOSE] /merge: Error checking active runs on main: stdout maxBuffer length exceeded
+  [VERBOSE] /merge: No active CI runs on main branch. Ready to proceed.
+  ```
+
+  Two compounding root causes in
+  [`src/github-merge.lib.mjs`](./src/github-merge.lib.mjs)
+  `getActiveBranchRuns()` (and the parallel
+  [`src/github-merge-repo-actions.lib.mjs`](./src/github-merge-repo-actions.lib.mjs)
+  `getAllActiveRepoRuns()` introduced by issue #1503):
+
+  1. **No `maxBuffer` override on `gh api --paginate --slurp`.** Node's default
+     `child_process.exec` buffer is 1 MB; the unfiltered `actions/runs` response
+     on this repo's `main` was 12.7 MB, so `exec` rejected with
+     `stdout maxBuffer length exceeded`.
+  2. **Fetch errors became "no active runs".** The `catch` block returned
+     `hasActiveRuns: false`, which the caller (`waitForBranchCI`) interpreted as
+     "branch CI is idle, ready to merge". A transient fetch/buffer/parse error
+     was indistinguishable from genuine idleness.
+
+  Fix:
+  - **Server-side `?status=` filter**, looped over the active set
+    (`in_progress`, `queued`, `waiting`, `requested`, `pending`) with run-id
+    dedup. Response size scales with active-run count, not with historical-run
+    count — typically a few KB instead of 12+ MB.
+  - **Raise `exec` `maxBuffer` to `githubLimits.bufferMaxSize`** (10 MB, env
+    `HIVE_MIND_GITHUB_BUFFER_MAX_SIZE`) for all `gh` calls in
+    `github-merge.lib.mjs` and `github-merge-repo-actions.lib.mjs`. The existing
+    `githubLimits` infrastructure was already used in `github.batch.lib.mjs`;
+    this just wires it into the `/merge` paths.
+  - **Stop swallowing fetch errors as "idle".** Errors now propagate. The
+    surrounding `waitForBranchCI` / `waitForAllRepoActions` poll loops already
+    retry on the next tick; the timeout-final check has its own try/catch that
+    returns an explicit failure (instead of a false-positive "ready to merge").
+
+  Tests:
+  [`tests/test-active-branch-runs-buffer-1722.mjs`](./tests/test-active-branch-runs-buffer-1722.mjs)
+  shadows `gh` on `PATH` with a Node script that scripts active-run responses,
+  and asserts: (a) every call uses `?status=`, (b) duplicate runs across
+  statuses are deduplicated, (c) >1 MB responses are handled cleanly, (d)
+  `gh` failures throw rather than report idle, (e) `waitForBranchCI` keeps
+  polling on errors, (f) idle branches still resolve as ready,
+  (g) `getAllActiveRepoRuns` parity.
+
+  Documentation:
+  [`docs/case-studies/issue-1722/`](./docs/case-studies/issue-1722/README.md)
+  contains the timeline (with downloaded bot log, cancelled-run logs, run
+  metadata), facts, per-symptom root-cause analysis, and solution plan.
+  [`experiments/issue-1722-buffer-overflow.mjs`](./experiments/issue-1722-buffer-overflow.mjs)
+  is a minimal reproduction. No upstream report required — the fix lives
+  entirely in this repo.
+
+- 1a92ca1: Fix flaky CI `test-suites` job caused by `use-m`'s no-retry global npm install
+  — issue #1724.
+
+  CI run [25109962685](https://github.com/link-assistant/hive-mind/actions/runs/25109962685/job/73581228475)
+  on `main` failed in the `test-suites` job at the third test file
+  (`tests/test-active-branch-runs-buffer-1722.mjs`) with:
+
+  ```
+  Error: Failed to install command-stream@latest globally.
+  [cause]: Error: Command failed: npm install -g command-stream-v-latest@npm:command-stream@latest
+  npm error code ENOTEMPTY
+  npm error path /opt/hostedtoolcache/node/24.14.1/x64/lib/node_modules/command-stream-v-latest/js/src/commands
+  ```
+
+  Root cause: `src/github.lib.mjs` and `src/playwright-mcp.lib.mjs` call
+  `await use('command-stream')` at module top level (via `use-m`). Every test
+  file that transitively imports either module re-runs
+  `npm install -g command-stream-v-latest@npm:command-stream@latest`. `use-m`'s
+  `ensurePackageInstalled` issues a single `npm install -g` with no retry, and
+  npm intermittently fails with `ENOTEMPTY: directory not empty, rmdir` on
+  GitHub-hosted Ubuntu runners (a long-standing npm rmdir race against itself
+  when the previous global install left files behind).
+
+  Fix:
+  - New
+    [`scripts/preinstall-use-m-packages.mjs`](./scripts/preinstall-use-m-packages.mjs)
+    pre-installs every package the codebase loads through `use-m @latest`
+    (`command-stream`, `getenv`, `links-notation`, `@dotenvx/dotenvx`,
+    `telegraf`, `zx`, `yargs`) using the same alias scheme `use-m` does
+    (`<pkg-without-@-or-/>-v-latest`), with exponential-backoff retry on the
+    flake symptoms (`ENOTEMPTY` / `EBUSY` / `EPERM` / `ECONNRESET` / `ETIMEDOUT`
+    / `EAI_AGAIN` / `429` / `503`). After this step, `use-m`'s
+    `installedVersion === latestVersion` early-return path skips the install at
+    test time, so test imports never touch `npm install -g` again.
+  - The script also satisfies the case-study "verbose mode for next iteration"
+    requirement via `PREINSTALL_USE_M_VERBOSE=1` (or `RUNNER_DEBUG=1`), which
+    logs each attempt's command, stdout, stderr, and backoff delay, and
+    recognizes "package present on disk after a flake" as recovered success.
+  - Wires `node scripts/preinstall-use-m-packages.mjs` into the `test-suites`
+    and `test-execution` jobs in
+    [`.github/workflows/release.yml`](./.github/workflows/release.yml) right
+    after `npm install`, before any step that runs test files or `solve.mjs`.
+
+  Tests:
+  [`tests/test-preinstall-use-m-packages-1724.mjs`](./tests/test-preinstall-use-m-packages-1724.mjs)
+  covers the alias scheme, retryable-error matcher, exponential backoff, and
+  the four `installWithRetry` paths (first-success, retry-then-succeed,
+  non-retryable-abort, recovered-from-disk) deterministically (no real npm
+  calls). Marked `@hive-mind-test-suite default` so it runs in the same job
+  that previously flaked.
+
+  Documentation:
+  [`docs/case-studies/issue-1724/`](./docs/case-studies/issue-1724/README.md)
+  contains the timeline, verbatim error, downloaded failed-run logs, the
+  no-retry snippet from the live `use-m` source
+  (`logs/use-m-source.js`), the comparison with both pipeline templates
+  (JS/Rust — neither template uses `use-m @latest` at module load yet, so the
+  flake is hive-mind-specific until they do), and the implementation plan.
+
 ## 1.59.4
 
 ### Patch Changes
package/package.json
CHANGED

-  "version": "1.59.4",
+  "version": "1.59.5",

package/src/github-merge-repo-actions.lib.mjs
CHANGED

@@ -11,7 +11,14 @@
 
 import { promisify } from 'util';
 import { exec as execCallback } from 'child_process';
-const exec = promisify(execCallback);
+import { githubLimits } from './config.lib.mjs';
+const execRaw = promisify(execCallback);
+// Issue #1722: raise exec maxBuffer above Node's 1 MB default for paginated gh
+// API responses (workflow runs can easily exceed that on busy repos).
+const exec = (cmd, opts = {}) => execRaw(cmd, { maxBuffer: githubLimits.bufferMaxSize, ...opts });
+
+// Statuses we treat as "not yet finished".
+const ACTIVE_RUN_STATUSES = ['in_progress', 'queued', 'waiting', 'requested', 'pending'];
 
 /**
  * Get ALL active workflow runs across the entire repository (no branch filter).

@@ -21,20 +28,34 @@ const exec = promisify(execCallback);
  * @returns {Promise<{runs: Array, hasActiveRuns: boolean, count: number}>}
  */
 export async function getAllActiveRepoRuns(owner, repo, verbose = false) {
-
-
-
-
-
-
-
-
-
+  // Issue #1722: filter on the server side per status to avoid pulling the full
+  // history of workflow runs (which can exceed exec maxBuffer). Also: do not
+  // swallow errors as "no active runs" — bubble them up so callers can retry
+  // instead of merging on top of a still-running CI run.
+  const seen = new Set();
+  const runs = [];
+  for (const status of ACTIVE_RUN_STATUSES) {
+    const { stdout } = await exec(`gh api "repos/${owner}/${repo}/actions/runs?status=${status}&per_page=100" --paginate --slurp`);
+    const pages = JSON.parse(stdout.trim() || '[]');
+    for (const page of pages) {
+      for (const run of page.workflow_runs || []) {
+        if (seen.has(run.id)) continue;
+        seen.add(run.id);
+        runs.push({
+          id: run.id,
+          name: run.name,
+          status: run.status,
+          head_branch: run.head_branch,
+          head_sha: run.head_sha?.slice(0, 7),
+        });
+      }
     }
-    return { runs, hasActiveRuns: runs.length > 0, count: runs.length };
-  } catch {
-    return { runs: [], hasActiveRuns: false, count: 0 };
   }
+  if (verbose && runs.length > 0) {
+    console.log(`[VERBOSE] repo-actions: ${runs.length} active run(s) in ${owner}/${repo}`);
+    for (const r of runs) console.log(`[VERBOSE] repo-actions: ${r.name} (${r.status}) on ${r.head_branch}`);
+  }
+  return { runs, hasActiveRuns: runs.length > 0, count: runs.length };
 }
 
 /**

@@ -52,7 +73,16 @@ export async function waitForAllRepoActions(owner, repo, options = {}, verbose =
   let peakRunCount = 0;
 
   while (Date.now() - startTime < timeout) {
-
+    let active;
+    try {
+      active = await getAllActiveRepoRuns(owner, repo, verbose);
+    } catch (error) {
+      // Issue #1722: do not silently treat fetch errors as "no active runs".
+      // Log and retry on the next poll instead.
+      console.error(`[ERROR] repo-actions: Error checking repo CI: ${error.message}`);
+      await new Promise(resolve => setTimeout(resolve, pollInterval));
+      continue;
+    }
     if (onStatusUpdate) {
       try {
         await onStatusUpdate({ ...active, elapsedMs: Date.now() - startTime });

@@ -66,7 +96,15 @@ export async function waitForAllRepoActions(owner, repo, options = {}, verbose =
     peakRunCount = Math.max(peakRunCount, active.count);
     await new Promise(resolve => setTimeout(resolve, pollInterval));
   }
-
+  // Issue #1722: if the timeout-final check throws, surface that as an error
+  // rather than reporting "no remaining runs".
+  let finalRuns;
+  try {
+    finalRuns = await getAllActiveRepoRuns(owner, repo, verbose);
+  } catch (error) {
+    console.error(`[ERROR] repo-actions: Final CI check failed after timeout: ${error.message}`);
+    return { success: false, waitedForRuns: true, timedOut: true, remainingRuns: [] };
+  }
   return { success: false, waitedForRuns: true, timedOut: true, remainingRuns: finalRuns.runs };
 }
 
package/src/github-merge.lib.mjs
CHANGED

@@ -14,9 +14,17 @@
 import { promisify } from 'util';
 import { exec as execCallback } from 'child_process';
 
-const exec = promisify(execCallback);
+const execRaw = promisify(execCallback);
 
 import { parseGitHubUrl } from './github.lib.mjs';
+import { githubLimits } from './config.lib.mjs';
+
+// Issue #1722: gh api `--paginate --slurp` responses for repos with many
+// historical workflow runs can easily exceed Node's default 1 MB exec buffer
+// (observed 12.7 MB on this repo's main branch). Default to the configured
+// githubLimits.bufferMaxSize (10 MB; HIVE_MIND_GITHUB_BUFFER_MAX_SIZE) for all
+// gh calls in this file.
+const exec = (cmd, opts = {}) => execRaw(cmd, { maxBuffer: githubLimits.bufferMaxSize, ...opts });
 
 // Issue #1413: Import ready tag sync, timeline, and label constant from separate module
 // to keep this file under the 1500 line limit

@@ -674,9 +682,20 @@ export function parseRepositoryUrl(url) {
   };
 }
 
+/**
+ * Statuses we treat as "still running" / "not yet finished".
+ * Issue #1722: be exhaustive — GitHub uses several non-completed statuses.
+ */
+const ACTIVE_RUN_STATUSES = ['in_progress', 'queued', 'waiting', 'requested', 'pending'];
+
 /**
  * Get active workflow runs on a specific branch
  * Issue #1307: Used to check if there are any in-progress or queued runs on the target branch
+ * Issue #1722: Filter on the server side per status, otherwise the unfiltered
+ * `--paginate --slurp` response can overflow exec maxBuffer on busy repos
+ * (observed 12.7 MB on link-assistant/hive-mind main). Also: errors are now
+ * surfaced rather than swallowed as `hasActiveRuns: false`, which previously
+ * caused /merge to merge on top of a still-running CI run.
  * @param {string} owner - Repository owner
  * @param {string} repo - Repository name
  * @param {string} branch - Branch name (default: main)

@@ -684,36 +703,38 @@ export function parseRepositoryUrl(url) {
  * @returns {Promise<{runs: Array<Object>, hasActiveRuns: boolean, count: number}>}
  */
 export async function getActiveBranchRuns(owner, repo, branch = 'main', verbose = false) {
-
-
-
-const
-
-
-
-
-
-
-
-
+  const seen = new Set();
+  const runs = [];
+  for (const status of ACTIVE_RUN_STATUSES) {
+    const { stdout } = await exec(`gh api "repos/${owner}/${repo}/actions/runs?branch=${branch}&status=${status}&per_page=100" --paginate --slurp`);
+    const pages = JSON.parse(stdout.trim() || '[]');
+    for (const page of pages) {
+      for (const run of page.workflow_runs || []) {
+        if (seen.has(run.id)) continue;
+        seen.add(run.id);
+        runs.push({
+          id: run.id,
+          name: run.name,
+          status: run.status,
+          created_at: run.created_at,
+          html_url: run.html_url,
+        });
       }
     }
+  }
 
-
-
-
-
-    };
-  } catch (error) {
-    if (verbose) {
-      console.log(`[VERBOSE] /merge: Error checking active runs on ${branch}: ${error.message}`);
+  if (verbose) {
+    console.log(`[VERBOSE] /merge: Found ${runs.length} active runs on ${owner}/${repo} branch ${branch}`);
+    for (const run of runs) {
+      console.log(`[VERBOSE] /merge: - Run #${run.id}: ${run.name} (${run.status})`);
     }
-    return {
-      runs: [],
-      hasActiveRuns: false,
-      count: 0,
-    };
   }
+
+  return {
+    runs,
+    hasActiveRuns: runs.length > 0,
+    count: runs.length,
+  };
 }
 
 /**

@@ -788,7 +809,20 @@ export async function waitForBranchCI(owner, repo, branch = 'main', options = {}
   }
 
   // Timeout reached
-
+  // Issue #1722: if the final check throws, do NOT silently report "ready".
+  // Treat it the same as still-active (force a timeout failure), so /merge
+  // waits/retries instead of merging on top of a still-running CI run.
+  let finalCheck;
+  try {
+    finalCheck = await getActiveBranchRuns(owner, repo, branch, verbose);
+  } catch (error) {
+    return {
+      success: false,
+      waitedForRuns: true,
+      completedRuns: totalWaitedRuns,
+      error: `Timeout reached and final CI check failed on ${branch}: ${error.message}`,
+    };
+  }
   if (finalCheck.hasActiveRuns) {
     return {
       success: false,