@link-assistant/hive-mind 1.25.7 → 1.25.8

package/CHANGELOG.md CHANGED
@@ -1,5 +1,21 @@
  # @link-assistant/hive-mind

+ ## 1.25.8
+
+ ### Patch Changes
+
+ - fix: update system messages to use authenticated curl for private GitHub issue images
+
+ Images attached to GitHub issues/PRs (github.com/user-attachments/assets/\*) require authentication. Without auth, GitHub returns "Not Found" (9 bytes ASCII) with HTTP 200 — a silent failure. The AI would then call Read on the non-image file, encoding "Not Found" as base64, causing Anthropic API to return "Could not process image" (HTTP 400), crashing the session.
+
+ Updated system messages in all 4 prompt files (claude, agent, codex, opencode) to explicitly identify user-attachments URLs as requiring GitHub authentication and provide the exact authenticated curl command using `gh auth token`.
+
+ fix: auto-restart with --resume on "Request timed out" in --tool claude (Issue #1353)
+
+ When Claude CLI encounters a network timeout, it exhausts its own internal retries and emits a synthetic result event: `{"type":"result","is_error":true,"result":"Request timed out","session_id":"..."}`. Previously hive-mind treated this as a fatal failure and exited, losing all session context (conversation history, cached tokens, partially completed work).
+
+ This fix detects the timeout pattern and automatically retries with `--resume <session-id>` to preserve the session, using exponential backoff starting at 5 minutes (increasing to max 1 hour) — longer than regular API errors since Claude CLI has already exhausted its own retries before reporting the timeout.
+
  ## 1.25.7

  ### Patch Changes
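The changelog entry above describes a synthetic result event that carries the timeout and the session ID. As a minimal sketch of how such a line could be recognized, assuming one JSON event per line, the helper below parses it and returns the session ID needed for `--resume`. The name `parseTimeoutEvent` is hypothetical and does not come from the package.

```javascript
// Sketch only: parseTimeoutEvent is a hypothetical helper, not hive-mind code.
// It recognizes the event shape quoted in the changelog:
// {"type":"result","is_error":true,"result":"Request timed out","session_id":"..."}
function parseTimeoutEvent(line) {
  let event;
  try {
    event = JSON.parse(line);
  } catch {
    return null; // not a JSON line
  }
  if (event === null || typeof event !== 'object') return null;

  const isTimeout =
    event.type === 'result' &&
    event.is_error === true &&
    typeof event.result === 'string' &&
    event.result.includes('Request timed out');
  if (!isTimeout) return null;

  // The session ID is what a later `--resume <session-id>` run would need.
  return { sessionId: event.session_id ?? null };
}
```

A caller would keep the returned `sessionId` and pass it to the next invocation rather than starting a fresh session.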
package/README.md CHANGED
@@ -192,6 +192,28 @@ docker attach hive-mind
  # Run bot here

  # Press Ctrl + P, Ctrl + Q to detach without destroying the container (no stopping of main bash process)
+
+ # --- Persisting auth data across restarts ---
+
+ # Extract auth data from a running (or stopped) container to the host:
+ mkdir -p ~/.hive-mind
+ docker cp hive-mind:/home/hive/.claude ~/.hive-mind/claude
+ docker cp hive-mind:/home/hive/.claude.json ~/.hive-mind/claude.json
+ docker cp hive-mind:/home/hive/.config/gh ~/.hive-mind/gh
+
+ # Fix ownership to match the hive user inside the container:
+ HIVE_UID=$(docker exec hive-mind id -u hive)
+ chown -R $HIVE_UID:$HIVE_UID ~/.hive-mind/claude ~/.hive-mind/gh
+ chown $HIVE_UID:$HIVE_UID ~/.hive-mind/claude.json
+
+ # On subsequent runs, mount the auth data to keep it between restarts:
+ docker run -dit \
+ --name hive-mind \
+ --restart unless-stopped \
+ -v /root/.hive-mind/claude:/home/hive/.claude \
+ -v /root/.hive-mind/claude.json:/home/hive/.claude.json \
+ -v /root/.hive-mind/gh:/home/hive/.config/gh \
+ konard/hive-mind:latest
  ```

  **Benefits of Docker:**
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "@link-assistant/hive-mind",
- "version": "1.25.7",
+ "version": "1.25.8",
  "description": "AI-powered issue solver and hive mind for collaborative problem solving",
  "main": "src/hive.mjs",
  "type": "module",
@@ -144,7 +144,7 @@ ${getExperimentsExamplesSubPrompt(argv)}
  Initial research.
  - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
  - When you read issue, read all details and comments thoroughly.
- - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, use WebFetch tool to download the image first, then use Read tool to view and analyze it.
+ - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. If the file command shows "HTML", "text", or "ASCII text", the download FAILED — do NOT call Read on this file. For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
  - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
  - When you need related code, use gh search code --owner ${owner} [keywords].
  - When you need repo context, read files in your working directory.${
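The updated prompt asks the agent to confirm that a downloaded file is really an image before reading it. The sketch below illustrates the idea behind that check by inspecting leading signature bytes, roughly what the `file` command does for these formats. `detectImageType` is a hypothetical helper, not part of the package, and it covers only PNG, JPEG, GIF and WebP.

```javascript
// Sketch only: detectImageType is a hypothetical helper, not hive-mind code.
// It classifies bytes by their leading "magic" signature and returns null
// for anything it does not recognize (for example a text error body).
function detectImageType(bytes) {
  const startsWith = (sig, offset = 0) =>
    bytes.length >= offset + sig.length &&
    sig.every((b, i) => bytes[offset + i] === b);

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'png';
  if (startsWith([0xff, 0xd8, 0xff])) return 'jpeg';
  if (startsWith([0x47, 0x49, 0x46, 0x38])) return 'gif'; // "GIF8"
  // WebP: "RIFF" at offset 0 and "WEBP" at offset 8
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) return 'webp';
  return null;
}

// The 9-byte "Not Found" body from the changelog is not an image.
const notFound = new TextEncoder().encode('Not Found');
console.log(detectImageType(notFound)); // null
```

A `null` result corresponds to the "download FAILED" case in the prompt: the file should not be passed to an image reader.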
@@ -378,12 +378,6 @@ export const executeClaude = async params => {
  prNumber,
  });
  };
- /**
- * Calculate total token usage from a session's JSONL file
- * @param {string} sessionId - The session ID
- * @param {string} tempDir - The temporary directory where the session ran
- * @returns {Object} Token usage statistics
- */
  /**
  * Fetches model information from pricing API
  * @param {string} modelId - The model ID (e.g., "claude-sonnet-4-5-20250929")
@@ -845,6 +839,7 @@ export const executeClaudeCommand = async params => {
  let isOverloadError = false;
  let is503Error = false;
  let isInternalServerError = false; // Issue #1331: Track 500 Internal server error
+ let isRequestTimeout = false; // Issue #1353: Track "Request timed out" from Claude CLI
  let stderrErrors = [];
  let resultSuccessReceived = false; // Issue #1354: Track if result success event was received
  let anthropicTotalCostUSD = null; // Capture Anthropic's official total_cost_usd from result
@@ -1050,6 +1045,11 @@ export const executeClaudeCommand = async params => {
  if (lastMessage.includes('Internal server error') && !lastMessage.includes('Overloaded')) {
  isInternalServerError = true;
  }
+ // Issue #1353: Detect "Request timed out" — Claude CLI emits {type:"result",is_error:true,result:"Request timed out"} after exhausting retries
+ if (lastMessage === 'Request timed out' || lastMessage.includes('Request timed out')) {
+ isRequestTimeout = true;
+ await log('⏱️ Detected request timeout from Claude CLI (will retry with --resume)', { verbose: true });
+ }
  }
  }
  // Store last message for error detection
@@ -1082,6 +1082,12 @@ export const executeClaudeCommand = async params => {
  lastMessage = item.text;
  await log('⚠️ Detected 503 network error', { verbose: true });
  }
+ // Issue #1353: Detect "Request timed out" in assistant text content
+ if (item.text === 'Request timed out' || item.text.includes('Request timed out')) {
+ isRequestTimeout = true;
+ lastMessage = item.text;
+ await log('⏱️ Detected request timeout in assistant message (will retry with --resume)', { verbose: true });
+ }
  }
  }
  }
@@ -1188,13 +1194,19 @@ export const executeClaudeCommand = async params => {
  }

  // Issue #1331: Unified handler for all transient API errors (Overloaded, 503, Internal Server Error)
- // All use same params: 10 retries, 1min initial, 30min max, exponential backoff, session preserved
- const isTransientError = isOverloadError || isInternalServerError || is503Error || (lastMessage.includes('API Error: 500') && (lastMessage.includes('Overloaded') || lastMessage.includes('Internal server error'))) || (lastMessage.includes('api_error') && lastMessage.includes('Overloaded')) || lastMessage.includes('API Error: 503') || (lastMessage.includes('503') && (lastMessage.includes('upstream connect error') || lastMessage.includes('remote connection failure')));
+ // Issue #1353: Also handle "Request timed out" (Claude CLI times out after exhausting its own retries)
+ // All use exponential backoff with session preservation via --resume
+ const isTransientError = isOverloadError || isInternalServerError || is503Error || isRequestTimeout || (lastMessage.includes('API Error: 500') && (lastMessage.includes('Overloaded') || lastMessage.includes('Internal server error'))) || (lastMessage.includes('api_error') && lastMessage.includes('Overloaded')) || lastMessage.includes('API Error: 503') || (lastMessage.includes('503') && (lastMessage.includes('upstream connect error') || lastMessage.includes('remote connection failure'))) || lastMessage === 'Request timed out' || lastMessage.includes('Request timed out');
  if ((commandFailed || isTransientError) && isTransientError) {
- if (retryCount < retryLimits.maxTransientErrorRetries) {
- const delay = Math.min(retryLimits.initialTransientErrorDelayMs * Math.pow(retryLimits.retryBackoffMultiplier, retryCount), retryLimits.maxTransientErrorDelayMs);
- const errorLabel = isOverloadError || (lastMessage.includes('API Error: 500') && lastMessage.includes('Overloaded')) ? 'API overload (500)' : isInternalServerError || lastMessage.includes('Internal server error') ? 'Internal server error (500)' : '503 network error';
- await log(`\n⚠️ ${errorLabel} detected. Retry ${retryCount + 1}/${retryLimits.maxTransientErrorRetries} in ${Math.round(delay / 60000)} min (session preserved)...`, { level: 'warning' });
+ // Issue #1353: Use timeout-specific backoff params (5min–1hr) vs general transient params (1min–30min)
+ // Timeouts indicate network instability; Claude CLI already exhausted its own retries, so we need longer waits
+ const maxRetries = isRequestTimeout ? retryLimits.maxRequestTimeoutRetries : retryLimits.maxTransientErrorRetries;
+ const initialDelay = isRequestTimeout ? retryLimits.initialRequestTimeoutDelayMs : retryLimits.initialTransientErrorDelayMs;
+ const maxDelay = isRequestTimeout ? retryLimits.maxRequestTimeoutDelayMs : retryLimits.maxTransientErrorDelayMs;
+ if (retryCount < maxRetries) {
+ const delay = Math.min(initialDelay * Math.pow(retryLimits.retryBackoffMultiplier, retryCount), maxDelay);
+ const errorLabel = isRequestTimeout ? 'Request timeout' : isOverloadError || (lastMessage.includes('API Error: 500') && lastMessage.includes('Overloaded')) ? 'API overload (500)' : isInternalServerError || lastMessage.includes('Internal server error') ? 'Internal server error (500)' : '503 network error';
+ await log(`\n⚠️ ${errorLabel} detected. Retry ${retryCount + 1}/${maxRetries} in ${Math.round(delay / 60000)} min (session preserved)...`, { level: 'warning' });
  await log(` Error: ${lastMessage.substring(0, 200)}`, { verbose: true });
  if (sessionId && !argv.resume) argv.resume = sessionId; // preserve session for resume
  await waitWithCountdown(delay, log);
@@ -1202,7 +1214,7 @@ export const executeClaudeCommand = async params => {
  retryCount++;
  return await executeWithRetry();
  } else {
- await log(`\n\n❌ Transient API error persisted after ${retryLimits.maxTransientErrorRetries} retries\n Please try again later or check https://status.anthropic.com/`, { level: 'error' });
+ await log(`\n\n❌ Transient API error persisted after ${maxRetries} retries\n Please try again later or check https://status.anthropic.com/`, { level: 'error' });
  return {
  success: false,
  sessionId,
@@ -1247,28 +1259,9 @@ export const executeClaudeCommand = async params => {
  }
  }
  }
- // Additional failure detection: if no messages were processed and there were stderr errors,
- // or if the command produced no output at all, treat it as a failure
- //
- // This is critical for detecting "silent failures" where:
- // 1. Claude CLI encounters an internal error (e.g., "kill EPERM" from timeout)
- // 2. The error is logged to stderr but exit code is 0 or exit event is never sent
- // 3. Result: messageCount=0, toolUseCount=0, but stderrErrors has content
- //
- // Common cause: sudo commands that timeout
- // - Timeout triggers process.kill() in Claude CLI
- // - If child process runs with sudo (root), parent can't kill it → EPERM error
- // - Error logged to stderr, but command doesn't properly fail
- //
- // Workaround (applied in system prompt):
- // - Instruct Claude to run sudo commands (installations) in background
- // - Background processes avoid timeout kill mechanism
- // - Prevents EPERM errors and false success reports
- //
- // See: docs/dependencies-research/claude-code-issues/README.md for full details
- // Issue #1354: Do not trigger if the result event already confirmed success.
- // A successful result event is definitive proof the command succeeded, regardless of
- // messageCount (which may be 0 if "assistant" events were counted instead of "message" type).
+ // Additional failure detection: silent failures (no messages + stderr errors).
+ // E.g., sudo timeout causing "kill EPERM" stderr error but exit code 0.
+ // Issue #1354: Skip if result event confirmed success (definitive proof regardless of messageCount).
  if (!commandFailed && !resultSuccessReceived && stderrErrors.length > 0 && messageCount === 0 && toolUseCount === 0) {
  commandFailed = true;
  const errorsPreview = stderrErrors
@@ -1377,13 +1370,19 @@ export const executeClaudeCommand = async params => {
  });
  const errorStr = error.message || error.toString();
  // Issue #1331: Unified handler for all transient API errors in exception block
- // (Overloaded, 503, Internal Server Error) - same params, all with session preservation
- const isTransientException = (errorStr.includes('API Error: 500') && (errorStr.includes('Overloaded') || errorStr.includes('Internal server error'))) || (errorStr.includes('api_error') && errorStr.includes('Overloaded')) || errorStr.includes('API Error: 503') || (errorStr.includes('503') && (errorStr.includes('upstream connect error') || errorStr.includes('remote connection failure')));
+ // Issue #1353: Also handle "Request timed out" in exception block
+ // (Overloaded, 503, Internal Server Error, Request timed out) - all with session preservation
+ const isTimeoutException = errorStr === 'Request timed out' || errorStr.includes('Request timed out');
+ const isTransientException = isTimeoutException || (errorStr.includes('API Error: 500') && (errorStr.includes('Overloaded') || errorStr.includes('Internal server error'))) || (errorStr.includes('api_error') && errorStr.includes('Overloaded')) || errorStr.includes('API Error: 503') || (errorStr.includes('503') && (errorStr.includes('upstream connect error') || errorStr.includes('remote connection failure')));
  if (isTransientException) {
- if (retryCount < retryLimits.maxTransientErrorRetries) {
- const delay = Math.min(retryLimits.initialTransientErrorDelayMs * Math.pow(retryLimits.retryBackoffMultiplier, retryCount), retryLimits.maxTransientErrorDelayMs);
- const errorLabel = errorStr.includes('Overloaded') ? 'API overload (500)' : errorStr.includes('Internal server error') ? 'Internal server error (500)' : '503 network error';
- await log(`\n⚠️ ${errorLabel} in exception. Retry ${retryCount + 1}/${retryLimits.maxTransientErrorRetries} in ${Math.round(delay / 60000)} min (session preserved)...`, { level: 'warning' });
+ // Issue #1353: Use timeout-specific backoff for request timeouts
+ const maxRetries = isTimeoutException ? retryLimits.maxRequestTimeoutRetries : retryLimits.maxTransientErrorRetries;
+ const initialDelay = isTimeoutException ? retryLimits.initialRequestTimeoutDelayMs : retryLimits.initialTransientErrorDelayMs;
+ const maxDelay = isTimeoutException ? retryLimits.maxRequestTimeoutDelayMs : retryLimits.maxTransientErrorDelayMs;
+ if (retryCount < maxRetries) {
+ const delay = Math.min(initialDelay * Math.pow(retryLimits.retryBackoffMultiplier, retryCount), maxDelay);
+ const errorLabel = isTimeoutException ? 'Request timeout' : errorStr.includes('Overloaded') ? 'API overload (500)' : errorStr.includes('Internal server error') ? 'Internal server error (500)' : '503 network error';
+ await log(`\n⚠️ ${errorLabel} in exception. Retry ${retryCount + 1}/${maxRetries} in ${Math.round(delay / 60000)} min (session preserved)...`, { level: 'warning' });
  if (sessionId && !argv.resume) argv.resume = sessionId;
  await waitWithCountdown(delay, log);
  await log('\n🔄 Retrying now...');
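The long boolean and ternary chains in the exception block can be hard to follow in diff form. As a readability aid, the sketch below restates the same string checks from the added lines as a standalone function. `labelTransientError` is a hypothetical name and the package does not structure its code this way; the conditions and labels are copied from the hunk above.

```javascript
// Sketch only: labelTransientError is a hypothetical restatement of the
// isTransientException / errorLabel logic shown in the hunk above.
// Returns a label for transient errors, or null when the error is not transient.
function labelTransientError(errorStr) {
  const isTimeout = errorStr.includes('Request timed out');
  const isTransient =
    isTimeout ||
    (errorStr.includes('API Error: 500') &&
      (errorStr.includes('Overloaded') || errorStr.includes('Internal server error'))) ||
    (errorStr.includes('api_error') && errorStr.includes('Overloaded')) ||
    errorStr.includes('API Error: 503') ||
    (errorStr.includes('503') &&
      (errorStr.includes('upstream connect error') || errorStr.includes('remote connection failure')));

  if (!isTransient) return null;
  if (isTimeout) return 'Request timeout'; // checked first, as in the diff
  if (errorStr.includes('Overloaded')) return 'API overload (500)';
  if (errorStr.includes('Internal server error')) return 'Internal server error (500)';
  return '503 network error';
}
```

Because the timeout check runs first, a message that mentions both a timeout and another error is treated as a timeout and gets the longer backoff parameters.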
@@ -171,7 +171,7 @@ Initial research.
  - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
  - When user mentions CI failures or asks to investigate logs, consider adding these todos to track the investigation: (1) List recent CI runs with timestamps, (2) Download logs from failed runs to ci-logs/ directory, (3) Analyze error messages and identify root cause, (4) Implement fix, (5) Verify fix resolves the specific errors found in logs.
  - When you read issue, read all details and comments thoroughly.
- - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, use WebFetch tool (or fetch tool) to download the image first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. Reading corrupted or non-image files (like GitHub's HTML 404 pages saved as .png) can cause "Could not process image" errors and may crash the AI solver process. If the file command shows "HTML" or "text", the download failed and you should retry or skip the image.
+ - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. Reading corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) can cause "Could not process image" errors and will crash the AI solver process. If the file command shows "HTML", "text", or "ASCII text", the download FAILED; do NOT call Read on this file. Instead: (1) For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) If retry still fails, skip the image and note it was unavailable.
  - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
  - When you need related code, use gh search code --owner ${owner} [keywords].
  - When you need repo context, read files in your working directory.${
@@ -152,7 +152,7 @@ Initial research.
  - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
  - When user mentions CI failures or asks to investigate logs, consider adding these todos to track the investigation: (1) List recent CI runs with timestamps, (2) Download logs from failed runs to ci-logs/ directory, (3) Analyze error messages and identify root cause, (4) Implement fix, (5) Verify fix resolves the specific errors found in logs.
  - When you read issue, read all details and comments thoroughly.
- - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, use WebFetch tool (or fetch tool) to download the image first, then use Read tool to view and analyze it.
+ - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. If the file command shows "HTML", "text", or "ASCII text", the download FAILED — do NOT call Read on this file. For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — use: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>"
  - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
  - When you need related code, use gh search code --owner ${owner} [keywords].
  - When you need repo context, read files in your working directory.${
@@ -103,6 +103,11 @@ export const retryLimits = {
  maxTransientErrorRetries: parseIntWithDefault('HIVE_MIND_MAX_TRANSIENT_ERROR_RETRIES', 10),
  initialTransientErrorDelayMs: parseIntWithDefault('HIVE_MIND_INITIAL_TRANSIENT_ERROR_DELAY_MS', 60 * 1000), // 1 minute
  maxTransientErrorDelayMs: parseIntWithDefault('HIVE_MIND_MAX_TRANSIENT_ERROR_DELAY_MS', 30 * 60 * 1000), // 30 minutes
+ // Request timeout retry configuration (Issue #1353)
+ // Network timeouts need longer waits than API errors — Claude CLI already exhausted its own retries
+ maxRequestTimeoutRetries: parseIntWithDefault('HIVE_MIND_MAX_REQUEST_TIMEOUT_RETRIES', 10),
+ initialRequestTimeoutDelayMs: parseIntWithDefault('HIVE_MIND_INITIAL_REQUEST_TIMEOUT_DELAY_MS', 5 * 60 * 1000), // 5 minutes
+ maxRequestTimeoutDelayMs: parseIntWithDefault('HIVE_MIND_MAX_REQUEST_TIMEOUT_DELAY_MS', 60 * 60 * 1000), // 1 hour
  };

  // Claude Code CLI configurations
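To make the new defaults concrete, the sketch below computes the wait before each retry using the same `Math.min(initial * multiplier^retryCount, max)` formula that appears in the retry handler. `backoffSchedule` is a hypothetical helper. The multiplier of 2 is an assumption for illustration only, since the value of `retryLimits.retryBackoffMultiplier` is not shown in this diff.

```javascript
// Sketch only: backoffSchedule is a hypothetical helper, not hive-mind code.
// It applies the capped exponential formula from the retry handler.
function backoffSchedule({ initialDelayMs, maxDelayMs, multiplier, maxRetries }) {
  const delays = [];
  for (let retryCount = 0; retryCount < maxRetries; retryCount++) {
    delays.push(Math.min(initialDelayMs * Math.pow(multiplier, retryCount), maxDelayMs));
  }
  return delays;
}

// Request-timeout defaults from the diff: 5 min initial, 1 hour cap, 10 retries.
const timeoutMinutes = backoffSchedule({
  initialDelayMs: 5 * 60 * 1000,
  maxDelayMs: 60 * 60 * 1000,
  multiplier: 2, // ASSUMPTION: the real retryBackoffMultiplier is not shown in the diff
  maxRetries: 10,
}).map(ms => ms / 60000);

console.log(timeoutMinutes.join(', '));
// 5, 10, 20, 40, 60, 60, 60, 60, 60, 60
```

Under that assumed multiplier the cap is reached on the fifth retry, and a run that exhausts all ten retries would wait a little over seven hours in total. Each limit can be overridden through the `HIVE_MIND_*` environment variables named in the hunk.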
@@ -146,7 +146,7 @@ ${workspaceInstructions}
  Initial research.
  - When you start, make sure you create detailed plan for yourself and follow your todo list step by step, make sure that as many points from these guidelines are added to your todo list to keep track of everything that can help you solve the issue with highest possible quality.
  - When you read issue, read all details and comments thoroughly.
- - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, use WebFetch tool to download the image first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. Reading corrupted or non-image files (like GitHub's HTML 404 pages saved as .png) can cause "Could not process image" errors and may crash the AI solver process. If the file command shows "HTML" or "text", the download failed and you should retry or skip the image.
+ - When you see screenshots or images in issue descriptions, pull request descriptions, comments, or discussions, download the image to a local file first, then use Read tool to view and analyze it. IMPORTANT: Before reading downloaded images with the Read tool, verify the file is a valid image (not HTML). Use a CLI tool like 'file' command to check the actual file format. Reading corrupted or non-image files (like GitHub's "Not Found" pages saved as .png) can cause "Could not process image" errors and will crash the AI solver process. If the file command shows "HTML", "text", or "ASCII text", the download FAILED; do NOT call Read on this file. Instead: (1) For images from GitHub issues/PRs (URLs containing "github.com/user-attachments"), these require authentication — retry with: curl -L -H "Authorization: token $(gh auth token)" -o <filename> "<url>" (2) If retry still fails, skip the image and note it was unavailable.
  - When you need issue details, use gh issue view https://github.com/${owner}/${repo}/issues/${issueNumber}.
  - When you need related code, use gh search code --owner ${owner} [keywords].
  - When you need repo context, read files in your working directory.${