@link-assistant/hive-mind 1.73.4 → 1.73.6

package/CHANGELOG.md CHANGED
@@ -1,5 +1,65 @@
1
1
  # @link-assistant/hive-mind
2
2
 
3
+ ## 1.73.6
4
+
5
+ ### Patch Changes
6
+
7
+ - defa8c4: fix(claude): repair corrupted thinking-block transcripts so resume preserves context (#1834)
8
+
9
+ Follow-up to the Issue #1834 recovery ("can we do even better?"). The previous
10
+ recovery (PR #1835) was reactive: a plain resume of a transcript poisoned by a
11
+ corrupted extended-thinking block (`{ "type": "thinking", "thinking": "" }` with a
12
+ kept signature) just repeats the `400 ... thinking blocks ... cannot be modified`
13
+ error, so recovery almost always fell through to a **fresh restart that discards
14
+ dozens of turns** of accumulated context (50 turns / $3.84 in the second
15
+ reproduction log).
16
+
17
+ Recovery Phase 1 now **proactively repairs the on-disk session transcript** before
18
+ resuming: `repairCorruptedThinkingBlocks` (new
19
+ `src/claude.session-transcript-repair.lib.mjs`) strips the empty-text
20
+ `thinking`/`redacted_thinking` blocks from the session JSONL — a workaround proven
21
+ upstream (the Anthropic API permits _omitting_ earlier thinking, just not
22
+ _modifying_ it). When repair succeeds the resume keeps all accumulated context;
23
+ when it can't help, recovery still falls back to a fresh restart, so there is no
24
+ regression.
25
+
26
+ The repair is conservative: it never throws, only removes empty-text blocks (valid
27
+ signed thinking is untouched), never empties an assistant message, and writes a
28
+ one-time `<session>.jsonl.pre-repair-backup` before rewriting. The case study under
29
+ `docs/case-studies/issue-1834` is updated with a second reproduction log and the
30
+ new repair-then-resume design.
31
+
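The repair rules in the 1.73.6 entry above can be illustrated on an in-memory JSONL string. This is a minimal sketch, not the shipped module: `stripEmptyThinking` and `isEmptyThinking` are hypothetical names, and the file handling and backup step are left out.

```javascript
// Sketch only: mirrors the rules described in the changelog, not the shipped module.
// `stripEmptyThinking` and `isEmptyThinking` are hypothetical names.
const isEmptyThinking = block =>
  Boolean(block) &&
  typeof block === 'object' &&
  ((block.type === 'thinking' && !block.thinking) ||
    (block.type === 'redacted_thinking' && !block.data));

const stripEmptyThinking = jsonl => {
  let removed = 0;
  const lines = jsonl.split('\n').map(line => {
    if (!line.trim()) return line; // keep blank lines as they are
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      return line; // keep unparseable lines verbatim
    }
    const content = entry?.message?.content;
    if (!Array.isArray(content)) return line;
    const kept = content.filter(block => !isEmptyThinking(block));
    // Leave the line alone when nothing changed or when the message would end up empty.
    if (kept.length === content.length || kept.length === 0) return line;
    removed += content.length - kept.length;
    entry.message.content = kept;
    return JSON.stringify(entry);
  });
  return { text: lines.join('\n'), removed };
};
```

Returning the untouched line whenever the filtered array would be empty is what keeps an assistant message from ending up with no content, and a `removed` count of zero tells the caller that there was nothing to repair.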
32
+ ## 1.73.5
33
+
34
+ ### Patch Changes
35
+
36
+ - 7cb9b7e: fix(claude): recover from corrupted extended-thinking blocks instead of looping (#1834)
37
+
38
+ A long Claude (Opus) agentic run with extended thinking + tool use can leave a
39
+ thinking block in the session transcript corrupted (text emptied while the
40
+ original signature is kept). The Anthropic API then rejects every following turn
41
+ with `400 ... `thinking` or `redacted_thinking` blocks in the latest assistant
42
+ message cannot be modified`, permanently poisoning the on-disk session — so any
43
+ `--resume` retry fails forever. This is an upstream Claude Code bug
44
+ (anthropics/claude-code#63147).
45
+
46
+ Hive Mind now detects this terminal error (`classifyRetryableError` →
47
+ `requiresFreshSession`) and recovers with a two-phase escalation: it **tries to
48
+ resume the existing session first** (capped by
49
+ `HIVE_MIND_MAX_THINKING_BLOCK_RESUMES`, default 1) and only when resume is not
50
+ possible does it **discard the un-resumable session and restart fresh** (capped
51
+ by `HIVE_MIND_MAX_THINKING_BLOCK_RESTARTS`, default 2) — rather than retrying the
52
+ dead session or failing outright.
53
+
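The two-phase escalation described above can be sketched as a small stateful function. The names `createEscalation` and `capFromEnv` are hypothetical, and the way the real package parses `HIVE_MIND_MAX_THINKING_BLOCK_RESUMES` and `HIVE_MIND_MAX_THINKING_BLOCK_RESTARTS` is an assumption here; only the defaults (1 and 2) come from the changelog.

```javascript
// Sketch only: `createEscalation` and its option names are hypothetical.
// Defaults follow the changelog: 1 resume attempt, then up to 2 fresh restarts.
const createEscalation = ({ maxResumes = 1, maxRestarts = 2 } = {}) => {
  let resumes = 0;
  let restarts = 0;
  // Returns the next action for a failed session, or 'fail' once both caps are spent.
  return sessionId => {
    if (sessionId && resumes < maxResumes) {
      resumes++;
      return 'resume';
    }
    if (restarts < maxRestarts) {
      restarts++;
      return 'restart';
    }
    return 'fail';
  };
};

// Assumption: caps arrive as decimal strings in an env-like object;
// missing or invalid values fall back to the default.
const capFromEnv = (env, name, fallback) => {
  const parsed = Number.parseInt(env[name] ?? '', 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : fallback;
};
```

Because the counters live in the closure, creating the handler once and calling it on every retry keeps the caps in force across recursive re-runs.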
54
+ Additionally, on **all** critical errors Hive Mind now auto-commits (and
55
+ best-effort pushes) any uncommitted changes by default before recovery resets
56
+ the session, so partial work is preserved in the PR branch history. This is
57
+ on by default and can be toggled with `HIVE_MIND_AUTO_COMMIT_ON_CRITICAL_ERROR`.
58
+
59
+ Verbose logging records the `request_id` and `messages.N.content.N` path for
60
+ diagnostics. A deep case study with the full reproduction log is added under
61
+ `docs/case-studies/issue-1834`.
62
+
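As a rough illustration of the diagnostics mentioned above, the `messages.N.content.N` path can be pulled out of an error string with a regular expression. `extractDiagnostics` is a hypothetical helper, and any error string passed to it in an example is illustrative rather than a captured log.

```javascript
// Sketch only: `extractDiagnostics` is a hypothetical helper. The regex matches the
// `messages.N.content.N` path named in the changelog.
const extractDiagnostics = (message, requestId) => {
  const mentionsThinking = message.includes('thinking'); // also true for `redacted_thinking`
  const isCorruption = mentionsThinking && message.includes('cannot be modified');
  const match = message.match(/messages\.\d+\.content\.\d+/);
  return {
    isCorruption,
    contentPath: match ? match[0] : 'unknown',
    requestId: requestId || 'unknown',
  };
};
```

Falling back to `'unknown'` for both fields keeps the log line well-formed even when the API response carries no request id or content path.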
3
63
  ## 1.73.4
4
64
 
5
65
  ### Patch Changes
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@link-assistant/hive-mind",
3
- "version": "1.73.4",
3
+ "version": "1.73.6",
4
4
  "description": "AI-powered issue solver and hive mind for collaborative problem solving",
5
5
  "main": "src/hive.mjs",
6
6
  "type": "module",
@@ -2,7 +2,8 @@
2
2
  // Token budget statistics display module
3
3
  // Extracted from claude.lib.mjs to maintain file line limits
4
4
 
5
- import { formatNumber } from './claude.lib.mjs';
5
+ import { formatNumber, calculateSessionTokens } from './claude.lib.mjs';
6
+ import { reportError } from './sentry.lib.mjs';
6
7
  import Decimal from 'decimal.js-light';
7
8
  import { getCacheReadTokenCount, getCacheWriteTokenCount, getCumulativeContextInputTokens, getDisplayContextInputTokens, getExplicitContextFillInputTokens, getInputTokenCount, getOutputTokenCount, getRestoredContextInputTokens } from './context-fill.lib.mjs';
8
9
 
@@ -306,6 +307,81 @@ export const displayBudgetStats = async (usage, tokenUsage, log) => {
306
307
  await dumpBudgetTrace(usage, tokenUsage, log);
307
308
  };
308
309
 
310
+ /**
311
+ * Calculate and display the total token-usage summary for a finished Claude session.
312
+ * Extracted from claude.lib.mjs to keep that file under the 1500-line limit (Issue #1834).
313
+ * Reads the session JSONL, logs the per-model breakdown, cost comparison and (optionally)
314
+ * budget stats. Failures are reported but never thrown — token reporting is best-effort.
315
+ * @param {Object} params
316
+ * @param {string} params.sessionId - Claude session id (skips when falsy)
317
+ * @param {string} params.tempDir - Working directory containing the session JSONL (skips when falsy)
318
+ * @param {Object|null} params.resultModelUsage - Authoritative per-model usage from the result JSON event
319
+ * @param {number} params.anthropicTotalCostUSD - Anthropic's official total cost (for the comparison line)
320
+ * @param {Object} params.argv - Parsed CLI args (reads argv.tokensBudgetStats)
321
+ * @param {Function} params.log - Logger
322
+ */
323
+ export const displaySessionTokenUsage = async ({ sessionId, tempDir, resultModelUsage, anthropicTotalCostUSD, argv, log }) => {
324
+ if (!sessionId || !tempDir) return;
325
+ try {
326
+ const tokenUsage = await calculateSessionTokens(sessionId, tempDir, resultModelUsage);
327
+ if (!tokenUsage) return;
328
+ // Issue #1501: Log deduplication stats in verbose mode
329
+ if (tokenUsage.duplicateEntriesSkipped > 0) {
330
+ await log(`\n⚠️ JSONL deduplication: skipped ${tokenUsage.duplicateEntriesSkipped} duplicate entries (upstream: anthropics/claude-code#6805)`, { verbose: true });
331
+ }
332
+ if (tokenUsage.peakContextUsage > 0) {
333
+ await log(`📊 Peak restored-context input: ${formatNumber(tokenUsage.peakContextUsage)} tokens`, { verbose: true });
334
+ }
335
+ await log('\n💰 Token Usage Summary:');
336
+ // Display per-model breakdown
337
+ if (tokenUsage.modelUsage) {
338
+ const modelIds = Object.keys(tokenUsage.modelUsage);
339
+ const modelsFromResult = modelIds.filter(id => tokenUsage.modelUsage[id]._sourceResultJson);
340
+ if (modelsFromResult.length > 0) {
341
+ await log(`📊 Token data supplemented from result JSON for: ${modelsFromResult.join(', ')}`, { verbose: true });
342
+ }
343
+ for (const modelId of modelIds) {
344
+ const usage = tokenUsage.modelUsage[modelId];
345
+ const sourceNote = usage._sourceResultJson ? ' (from result JSON)' : '';
346
+ await log(`\n 📊 ${usage.modelName || modelId}:${sourceNote}`);
347
+ await displayModelUsage(usage, log);
348
+ // Display budget stats if flag is enabled
349
+ if (argv.tokensBudgetStats && usage.modelInfo?.limit) {
350
+ await displayBudgetStats(usage, tokenUsage, log);
351
+ }
352
+ }
353
+ // Show totals if multiple models were used
354
+ if (modelIds.length > 1) {
355
+ await log('\n 📈 Total across all models:');
356
+ }
357
+ // Show cost comparison (for both single and multiple models)
358
+ await displayCostComparison(tokenUsage.totalCostUSD, anthropicTotalCostUSD, log);
359
+ // Show total tokens for single model only
360
+ if (modelIds.length === 1) {
361
+ await log(` Total tokens: ${formatNumber(tokenUsage.totalTokens)}`);
362
+ }
363
+ } else {
364
+ // Fallback to old format if modelUsage is not available
365
+ await log(` Input tokens: ${formatNumber(tokenUsage.inputTokens)}`);
366
+ if (tokenUsage.cacheCreationTokens > 0) {
367
+ await log(` Cache creation tokens: ${formatNumber(tokenUsage.cacheCreationTokens)}`);
368
+ }
369
+ if (tokenUsage.cacheReadTokens > 0) {
370
+ await log(` Cache read tokens: ${formatNumber(tokenUsage.cacheReadTokens)}`);
371
+ }
372
+ await log(` Output tokens: ${formatNumber(tokenUsage.outputTokens)}`);
373
+ await log(` Total tokens: ${formatNumber(tokenUsage.totalTokens)}`);
374
+ }
375
+ } catch (tokenError) {
376
+ reportError(tokenError, {
377
+ context: 'calculate_session_tokens',
378
+ sessionId,
379
+ operation: 'read_session_jsonl',
380
+ });
381
+ await log(` ⚠️ Could not calculate token usage: ${tokenError.message}`, { verbose: true });
382
+ }
383
+ };
384
+
309
385
  /**
310
386
  * Merge resultModelUsage from Claude Code result JSON into JSONL-based modelUsage map.
311
387
  * Issue #1508: The JSONL file may miss sub-agent model entries (e.g., Haiku used internally),
@@ -15,7 +15,7 @@ import { setupBidirectionalHandler, finalizeBidirectionalHandler, validateBidire
15
15
  import { initProgressMonitoring } from './solve.progress-monitoring.lib.mjs';
16
16
  import { sanitizeObjectStrings } from './unicode-sanitization.lib.mjs';
17
17
  import Decimal from 'decimal.js-light';
18
- import { displayBudgetStats, createEmptySubSessionUsage, accumulateModelUsage, displayModelUsage, displayCostComparison, mergeResultModelUsage, createSubAgentCallEntry, accumulateSubAgentUsage, getRawRequestInputTokens } from './claude.budget-stats.lib.mjs';
18
+ import { createEmptySubSessionUsage, accumulateModelUsage, mergeResultModelUsage, createSubAgentCallEntry, accumulateSubAgentUsage, getRawRequestInputTokens, displaySessionTokenUsage } from './claude.budget-stats.lib.mjs';
19
19
  import { buildClaudeResumeCommand, buildClaudeAutonomousResumeCommand } from './claude.command-builder.lib.mjs';
20
20
  import { buildSolveResumeCommand } from './solve.resume-command.lib.mjs'; // Issue #942
21
21
  import { SESSION_FORCE_KILLED_MARKER, postTrackedComment } from './tool-comments.lib.mjs'; // Issue #1625
@@ -25,9 +25,10 @@ import { buildMcpConfigWithoutPlaywright } from './playwright-mcp.lib.mjs';
25
25
  import { resolveClaudeSessionToolFlags } from './useless-tools.lib.mjs';
26
26
  import { ensureClaudeQuietConfig } from './claude-quiet-config.lib.mjs';
27
27
  import { fetchModelInfo } from './model-info.lib.mjs';
28
- import { classifyRetryableError, maybeSwitchToFallbackModel } from './tool-retry.lib.mjs';
28
+ import { classifyRetryableError, maybeSwitchToFallbackModel, waitWithCountdown } from './tool-retry.lib.mjs';
29
29
  import { resolveSubSessionSize } from './sub-session-size.lib.mjs'; // Issue #1706
30
30
  import { withAgentsMdAsClaudeMd } from './agents-md-claude-support.lib.mjs';
31
+ import { createThinkingBlockRecovery } from './claude.thinking-block-recovery.lib.mjs'; // Issue #1834 (PR #1835 feedback)
31
32
  export { availableModels, fetchModelInfo }; // Re-export for backward compatibility
32
33
  const showResumeCommand = async (sessionId, tempDir, claudePath, model, log, argv = null) => {
33
34
  if (!sessionId || !tempDir) return;
@@ -607,20 +608,12 @@ export const executeClaudeCommand = async params => {
607
608
  // Issue #1331: Unified retry configuration for all transient API errors
608
609
  // (Overloaded, 503 Network Error, Internal Server Error) - same params, all with session preservation
609
610
  let retryCount = 0;
610
- // Helper: wait with per-minute countdown for delays >1 minute (Issue #1331)
611
- const waitWithCountdown = async (delayMs, log) => {
612
- if (delayMs <= 60000) {
613
- await new Promise(resolve => setTimeout(resolve, delayMs));
614
- return;
615
- }
616
- let remaining = delayMs;
617
- const timer = setInterval(async () => {
618
- remaining -= 60000;
619
- if (remaining > 0) await log(`⏳ ${Math.round(remaining / 60000)} min remaining...`);
620
- }, 60000);
621
- await new Promise(resolve => setTimeout(resolve, delayMs));
622
- clearInterval(timer);
623
- };
611
+ // Issue #1834 (PR #1835 feedback): corrupted-thinking-block recovery resumes the session first,
612
+ // then escalate to a fresh restart, auto-committing uncommitted work before each attempt. Created
613
+ // once so its resume/restart caps persist across recursive retry calls.
614
+ const tryThinkingBlockRecovery = createThinkingBlockRecovery({ argv, tempDir, branchName, $, log });
615
+ // Helper `waitWithCountdown` (per-minute countdown for delays >1 minute, Issue #1331) is shared
616
+ // from tool-retry.lib.mjs so claude/codex/gemini/qwen/opencode all use one implementation.
624
617
  // Function to execute with retry logic
625
618
  const executeWithRetry = async () => {
626
619
  // Execute claude command from the cloned repository directory
@@ -981,6 +974,12 @@ export const executeClaudeCommand = async params => {
981
974
  isRequestTimeout = true;
982
975
  await log('⏱️ Detected request timeout from Claude CLI (will retry with --resume)', { verbose: true });
983
976
  }
977
+ // Issue #1834: Detect corrupted extended-thinking-block 400 (un-resumable session).
978
+ // Capture diagnostics (request id, content path) to aid debugging and upstream reports.
979
+ if ((lastMessage.includes('thinking') || lastMessage.includes('redacted_thinking')) && lastMessage.includes('cannot be modified')) {
980
+ const contentPath = (lastMessage.match(/messages\.\d+\.content\.\d+/) || [])[0] || 'unknown';
981
+ await log(`🧠 Detected corrupted thinking-block error (un-resumable session). request_id=${data.request_id || 'unknown'}, at=${contentPath}. Will discard the session and restart fresh (Issue #1834, upstream anthropics/claude-code#63147).`, { verbose: true });
982
+ }
984
983
  }
985
984
  }
986
985
  if (data.type === 'text' && data.text) lastMessage = data.text;
@@ -1160,6 +1159,13 @@ export const executeClaudeCommand = async params => {
1160
1159
  // Issue #817: Stop bidirectional mode monitoring and collect queued feedback
1161
1160
  queuedFeedback = await finalizeBidirectionalHandler(bidirectionalHandler, log);
1162
1161
  const retryableLastError = classifyRetryableError(lastMessage);
1162
+ // Issue #1834: Corrupted extended-thinking blocks → try to resume the session first, then fall
1163
+ // back to a fresh restart (PR #1835 feedback). When both caps are reached, tryThinkingBlockRecovery
1164
+ // logs the failure and returns false; we fall through to the normal commandFailed return below
1165
+ // (the 400 is not a transient pattern, so it is not retried).
1166
+ if (commandFailed && retryableLastError.requiresFreshSession && (await tryThinkingBlockRecovery({ classified: retryableLastError, source: 'result', sessionId }))) {
1167
+ return await executeWithRetry();
1168
+ }
1163
1169
  // Issues #1331, #1353, #1472/#1475: Unified transient error retry (exponential backoff, session preservation)
1164
1170
  const isTransientError = isStartupTimeout || isActivityTimeout || isOverloadError || isInternalServerError || is503Error || isRequestTimeout || retryableLastError.isRetryable || (lastMessage.includes('API Error: 500') && (lastMessage.includes('Overloaded') || lastMessage.includes('Internal server error'))) || (lastMessage.includes('API Error: 529') && (lastMessage.includes('overloaded_error') || lastMessage.includes('Overloaded'))) || (lastMessage.includes('api_error') && lastMessage.includes('Overloaded')) || (lastMessage.includes('overloaded_error') && lastMessage.includes('Overloaded')) || lastMessage.includes('API Error: 503') || (lastMessage.includes('503') && (lastMessage.includes('upstream connect error') || lastMessage.includes('remote connection failure'))) || lastMessage === 'Request timed out' || lastMessage.includes('Request timed out');
1165
1171
  if ((commandFailed || isTransientError) && isTransientError) {
@@ -1300,68 +1306,9 @@ export const executeClaudeCommand = async params => {
1300
1306
  await log('\n\n✅ Claude command completed');
1301
1307
  }
1302
1308
  await log(`📊 Total messages: ${messageCount}, Tool uses: ${toolUseCount}`);
1303
- // Calculate and display total token usage from session JSONL file
1304
- if (sessionId && tempDir) {
1305
- try {
1306
- const tokenUsage = await calculateSessionTokens(sessionId, tempDir, resultModelUsage);
1307
- if (tokenUsage) {
1308
- // Issue #1501: Log deduplication stats in verbose mode
1309
- if (tokenUsage.duplicateEntriesSkipped > 0) {
1310
- await log(`\n⚠️ JSONL deduplication: skipped ${tokenUsage.duplicateEntriesSkipped} duplicate entries (upstream: anthropics/claude-code#6805)`, { verbose: true });
1311
- }
1312
- if (tokenUsage.peakContextUsage > 0) {
1313
- await log(`📊 Peak restored-context input: ${formatNumber(tokenUsage.peakContextUsage)} tokens`, { verbose: true });
1314
- }
1315
- await log('\n💰 Token Usage Summary:');
1316
- // Display per-model breakdown
1317
- if (tokenUsage.modelUsage) {
1318
- const modelIds = Object.keys(tokenUsage.modelUsage);
1319
- const modelsFromResult = modelIds.filter(id => tokenUsage.modelUsage[id]._sourceResultJson);
1320
- if (modelsFromResult.length > 0) {
1321
- await log(`📊 Token data supplemented from result JSON for: ${modelsFromResult.join(', ')}`, { verbose: true });
1322
- }
1323
- for (const modelId of modelIds) {
1324
- const usage = tokenUsage.modelUsage[modelId];
1325
- const sourceNote = usage._sourceResultJson ? ' (from result JSON)' : '';
1326
- await log(`\n 📊 ${usage.modelName || modelId}:${sourceNote}`);
1327
- await displayModelUsage(usage, log);
1328
- // Display budget stats if flag is enabled
1329
- if (argv.tokensBudgetStats && usage.modelInfo?.limit) {
1330
- await displayBudgetStats(usage, tokenUsage, log);
1331
- }
1332
- }
1333
- // Show totals if multiple models were used
1334
- if (modelIds.length > 1) {
1335
- await log('\n 📈 Total across all models:');
1336
- }
1337
- // Show cost comparison (for both single and multiple models)
1338
- await displayCostComparison(tokenUsage.totalCostUSD, anthropicTotalCostUSD, log);
1339
- // Show total tokens for single model only
1340
- if (modelIds.length === 1) {
1341
- await log(` Total tokens: ${formatNumber(tokenUsage.totalTokens)}`);
1342
- }
1343
- } else {
1344
- // Fallback to old format if modelUsage is not available
1345
- await log(` Input tokens: ${formatNumber(tokenUsage.inputTokens)}`);
1346
- if (tokenUsage.cacheCreationTokens > 0) {
1347
- await log(` Cache creation tokens: ${formatNumber(tokenUsage.cacheCreationTokens)}`);
1348
- }
1349
- if (tokenUsage.cacheReadTokens > 0) {
1350
- await log(` Cache read tokens: ${formatNumber(tokenUsage.cacheReadTokens)}`);
1351
- }
1352
- await log(` Output tokens: ${formatNumber(tokenUsage.outputTokens)}`);
1353
- await log(` Total tokens: ${formatNumber(tokenUsage.totalTokens)}`);
1354
- }
1355
- }
1356
- } catch (tokenError) {
1357
- reportError(tokenError, {
1358
- context: 'calculate_session_tokens',
1359
- sessionId,
1360
- operation: 'read_session_jsonl',
1361
- });
1362
- await log(` ⚠️ Could not calculate token usage: ${tokenError.message}`, { verbose: true });
1363
- }
1364
- }
1309
+ // Calculate and display total token usage from session JSONL file.
1310
+ // Extracted to claude.budget-stats.lib.mjs to keep this file under the line limit (Issue #1834).
1311
+ await displaySessionTokenUsage({ sessionId, tempDir, resultModelUsage, anthropicTotalCostUSD, argv, log });
1365
1312
  await showResumeCommand(sessionId, tempDir, claudePath, argv.model, log, argv);
1366
1313
  return {
1367
1314
  success: true,
@@ -1388,6 +1335,12 @@ export const executeClaudeCommand = async params => {
1388
1335
  });
1389
1336
  const errorStr = error.message || error.toString();
1390
1337
  const retryableException = classifyRetryableError(errorStr);
1338
+ // Issue #1834: Corrupted extended-thinking blocks surfaced as a thrown exception. Same recovery
1339
+ // as the streamed-result path: resume the session first, then fall back to a fresh restart.
1340
+ if (retryableException.requiresFreshSession && (await tryThinkingBlockRecovery({ classified: retryableException, source: 'exception', sessionId }))) {
1341
+ retryCount++;
1342
+ return await executeWithRetry();
1343
+ }
1391
1344
  // Issue #1331: Unified handler for all transient API errors in exception block
1392
1345
  // Issue #1353: Also handle "Request timed out" in exception block
1393
1346
  // (Overloaded, 503, Internal Server Error, Request timed out) - all with session preservation
@@ -0,0 +1,150 @@
1
+ #!/usr/bin/env node
2
+
3
+ // Issue #1834 (PR #1836): repair a Claude Code session transcript that was poisoned by a
4
+ // corrupted extended-thinking block, so the session can be RESUMED (context preserved) instead
5
+ // of being discarded entirely.
6
+ //
7
+ // Root cause (upstream anthropics/claude-code#63147, #46843, #24662, #41992): when extended
8
+ // thinking is combined with tool use, Claude Code can persist a thinking block to the on-disk
9
+ // session JSONL with its `thinking` text emptied to "" while keeping the original `signature`:
10
+ //
11
+ // { "type": "thinking", "thinking": "", "signature": "Eyc…" }
12
+ //
13
+ // On resume/continue the API replays that block and validates the signature against the now-empty
14
+ // text, rejecting every following turn with a 400:
15
+ // `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified.
16
+ //
17
+ // The proven community workaround (anthropics/claude-code#46843, miteshashar/claude-code-thinking-
18
+ // blocks-fix) is to STRIP the corrupted (empty-text) thinking blocks from the transcript — the API
19
+ // permits omitting earlier-turn thinking, so once the offending blocks are gone the session resumes
20
+ // cleanly with all of its text/tool-use history intact. This is strictly better than throwing the
21
+ // whole session away: when the repair succeeds we keep the accumulated context (worth many dollars
22
+ // and dozens of turns); when it can't help we still fall back to a fresh restart.
23
+
24
+ import { promises as fs } from 'fs';
25
+ import os from 'os';
26
+ import path from 'path';
27
+
28
+ /**
29
+ * Resolve the on-disk session transcript path for a Claude Code session. Claude Code stores each
30
+ * session as `~/.claude/projects/<cwd-with-slashes-as-dashes>/<sessionId>.jsonl` (mirrors the
31
+ * path logic already used by getModelUsageFromSession in claude.lib.mjs).
32
+ *
33
+ * @param {string} tempDir - the working directory the Claude session ran in.
34
+ * @param {string} sessionId - the Claude Code session id.
35
+ * @param {string} [homeDir] - override home dir (tests).
36
+ * @returns {string} absolute path to the session JSONL file.
37
+ */
38
+ export const resolveSessionTranscriptPath = (tempDir, sessionId, homeDir = os.homedir()) => {
39
+ const projectDirName = String(tempDir).replace(/\//g, '-');
40
+ return path.join(homeDir, '.claude', 'projects', projectDirName, `${sessionId}.jsonl`);
41
+ };
42
+
43
+ /**
44
+ * True when a content block is a corrupted thinking block: an extended-thinking block whose text
45
+ * was emptied (the upstream corruption) — `{ type: 'thinking', thinking: '' }` (optionally with a
46
+ * stale `signature`) or the redacted variant `{ type: 'redacted_thinking', data: '' }`.
47
+ */
48
+ const isCorruptedThinkingBlock = block => {
49
+ if (!block || typeof block !== 'object') return false;
50
+ if (block.type === 'thinking') return !block.thinking; // '' / undefined / null
51
+ if (block.type === 'redacted_thinking') return !block.data;
52
+ return false;
53
+ };
54
+
55
+ /**
56
+ * Strip corrupted (empty-text) thinking blocks from a Claude Code session transcript so the session
57
+ * can be resumed. Conservative and side-effect-safe:
58
+ * - never throws (returns a result object describing what happened);
59
+ * - only removes blocks whose thinking text is empty (legitimate signed thinking is untouched);
60
+ * - never empties an assistant message (if removing the blocks would leave a message with no
61
+ * content, that message is left exactly as-is);
62
+ * - writes a one-time backup (`<file>.pre-repair-backup`) before modifying the transcript.
63
+ *
64
+ * @param {object} opts
65
+ * @param {string} opts.tempDir - working directory the session ran in.
66
+ * @param {string} opts.sessionId - Claude Code session id.
67
+ * @param {string} [opts.homeDir] - override home dir (tests).
68
+ * @param {Function} [opts.log] - async logger.
69
+ * @returns {Promise<{ repaired: boolean, removedBlocks: number, scannedLines: number, sessionFile: string|null, reason?: string }>}
70
+ */
71
+ export const repairCorruptedThinkingBlocks = async ({ tempDir, sessionId, homeDir, log = async () => {} } = {}) => {
72
+ const result = { repaired: false, removedBlocks: 0, scannedLines: 0, sessionFile: null };
73
+ if (!tempDir || !sessionId) {
74
+ return { ...result, reason: 'missing tempDir or sessionId' };
75
+ }
76
+ const sessionFile = resolveSessionTranscriptPath(tempDir, sessionId, homeDir);
77
+ result.sessionFile = sessionFile;
78
+ let fileContent;
79
+ try {
80
+ fileContent = await fs.readFile(sessionFile, 'utf8');
81
+ } catch {
82
+ // No transcript on disk (e.g. fresh run never persisted, or path mismatch) — nothing to repair.
83
+ return { ...result, reason: 'session transcript not found' };
84
+ }
85
+
86
+ try {
87
+ const lines = fileContent.split('\n');
88
+ const out = [];
89
+ let removedBlocks = 0;
90
+ let scannedLines = 0;
91
+ for (const line of lines) {
92
+ if (!line.trim()) {
93
+ out.push(line);
94
+ continue;
95
+ }
96
+ scannedLines++;
97
+ let entry;
98
+ try {
99
+ entry = JSON.parse(line);
100
+ } catch {
101
+ out.push(line); // preserve anything we can't parse verbatim
102
+ continue;
103
+ }
104
+ const content = entry?.message?.content;
105
+ if (Array.isArray(content)) {
106
+ const corrupted = content.filter(isCorruptedThinkingBlock).length;
107
+ if (corrupted > 0) {
108
+ const cleaned = content.filter(b => !isCorruptedThinkingBlock(b));
109
+ // Never leave an assistant message with an empty content array (invalid for the API).
110
+ if (cleaned.length > 0) {
111
+ entry.message.content = cleaned;
112
+ removedBlocks += corrupted;
113
+ out.push(JSON.stringify(entry));
114
+ continue;
115
+ }
116
+ }
117
+ }
118
+ out.push(line);
119
+ }
120
+
121
+ result.scannedLines = scannedLines;
122
+ if (removedBlocks === 0) {
123
+ return { ...result, reason: 'no corrupted thinking blocks found' };
124
+ }
125
+
126
+ // Back up the original transcript exactly once before rewriting it.
127
+ const backupFile = `${sessionFile}.pre-repair-backup`;
128
+ try {
129
+ await fs.access(backupFile);
130
+ } catch {
131
+ try {
132
+ await fs.copyFile(sessionFile, backupFile);
133
+ } catch {
134
+ // Best effort — a missing backup must not block the repair.
135
+ }
136
+ }
137
+
138
+ await fs.writeFile(sessionFile, out.join('\n'), 'utf8');
139
+ result.repaired = true;
140
+ result.removedBlocks = removedBlocks;
141
+ await log(`🩹 Repaired session transcript: stripped ${removedBlocks} corrupted thinking block(s) from ${scannedLines} message line(s) (Issue #1834). Backup: ${backupFile}`, { verbose: true });
142
+ return result;
143
+ } catch (error) {
144
+ // Defensive: any unexpected failure degrades gracefully to "no repair" so the caller can fall
145
+ // back to a fresh restart.
146
+ return { ...result, reason: `repair failed: ${error?.message || error}` };
147
+ }
148
+ };
149
+
150
+ export default { repairCorruptedThinkingBlocks, resolveSessionTranscriptPath };
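To make the path mapping in `resolveSessionTranscriptPath` concrete, the sketch below reproduces the slash-to-dash encoding without the `os` and `path` modules. `transcriptPathFor` is a hypothetical, POSIX-only stand-in that takes the home directory explicitly; like the diffed code, it rewrites only forward slashes.

```javascript
// Sketch only: `transcriptPathFor` is a hypothetical, POSIX-only stand-in that avoids
// the `path`/`os` modules; the home directory is passed in explicitly.
const transcriptPathFor = (homeDir, cwd, sessionId) => {
  const projectDirName = String(cwd).replace(/\//g, '-'); // every "/" becomes "-"
  return [homeDir, '.claude', 'projects', projectDirName, `${sessionId}.jsonl`].join('/');
};

// Example: a session that ran in /tmp/work under the home directory /home/u
// maps to /home/u/.claude/projects/-tmp-work/abc.jsonl
const examplePath = transcriptPathFor('/home/u', '/tmp/work', 'abc');
```

If the computed path does not exist on disk, the repair function above reports `session transcript not found` and recovery proceeds without a repair.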
@@ -0,0 +1,96 @@
1
+ #!/usr/bin/env node
2
+
3
+ // Issue #1834: recovery for corrupted extended-thinking blocks.
4
+ //
5
+ // When extended thinking is combined with tool use, Claude Code can persist a thinking block to the
6
+ // on-disk session transcript with the `thinking` text emptied to "" while keeping the original
7
+ // `signature`. On resume/continue the API validates the signature against the now-empty text and
8
+ // rejects the turn with a 400:
9
+ // API Error: 400 ... `thinking` or `redacted_thinking` blocks in the latest assistant message
10
+ // cannot be modified. These blocks must remain as they were in the original response.
11
+ // Upstream: https://github.com/anthropics/claude-code/issues/63147
12
+ //
13
+ // PR #1835 feedback: "in case of this specific error we should try resume first, and if not possible
14
+ // try to restart." Recovery is therefore a two-phase escalation:
15
+ // Phase 1 — REPAIR the on-disk transcript (strip the corrupted empty-text thinking blocks) and
16
+ // resume the existing session (context-preserving). Plain resume of a poisoned
17
+ // transcript is futile — the 400 just repeats — so we first remove the offending blocks,
18
+ // which the API permits omitting. When repair succeeds the resume keeps all accumulated
19
+ // text/tool-use history (Issue #1834 "can we do even better?").
20
+ // Phase 2 — repair/resume unavailable or already failed → discard the session and start fresh.
21
+ // On every attempt we first auto-commit any uncommitted work (Issue #1834 / PR #1835 feedback:
22
+ // "on all critical errors we auto commit uncommitted changes by default") so nothing is lost when
23
+ // the session context resets.
24
+
25
+ import { retryLimits, criticalErrorRecovery } from './config.lib.mjs';
26
+ import { waitWithCountdown } from './tool-retry.lib.mjs';
27
+ import { commitUncommittedChangesOnCriticalError } from './critical-error-commit.lib.mjs';
28
+ import { repairCorruptedThinkingBlocks } from './claude.session-transcript-repair.lib.mjs';
29
+
30
+ /**
31
+ * Create a stateful corrupted-thinking-block recovery handler. The returned function persists its
32
+ * resume/restart counters across calls (so the caps survive recursive retries) and mutates
33
+ * `argv.resume` to drive the next session: setting it to the session id resumes, clearing it forces
34
+ * a fresh session.
35
+ *
36
+ * @param {object} ctx
37
+ * @param {object} ctx.argv - parsed CLI args (argv.resume is mutated to choose resume vs fresh).
38
+ * @param {string} ctx.tempDir - working tree for auto-committing uncommitted work.
39
+ * @param {string} [ctx.branchName] - branch to push preserved work to.
40
+ * @param {Function} ctx.$ - command-stream executor.
41
+ * @param {Function} ctx.log - async logger.
42
+ * @param {number} [ctx.waitMs=5000] - settle delay before re-running (overridable for tests).
43
+ * @param {Function} [ctx.repair=repairCorruptedThinkingBlocks] - transcript repair (injectable for tests).
44
+ * @param {string} [ctx.homeDir] - override home dir for transcript lookup (tests).
45
+ * @returns {(opts: {classified: object, source: string, sessionId: string|null}) => Promise<boolean>}
46
+ * Resolves true when a recovery attempt was initiated (caller should re-run); false when
47
+ * both caps are exhausted (caller should fail).
48
+ */
49
+ export const createThinkingBlockRecovery = ({ argv, tempDir, branchName, $, log, waitMs = 5000, repair = repairCorruptedThinkingBlocks, homeDir }) => {
50
+ let resumeCount = 0;
51
+ let restartCount = 0;
52
+ return async ({ classified, source, sessionId }) => {
53
+ const preserveWork = async () => {
54
+ if (criticalErrorRecovery.autoCommitUncommittedChanges) {
55
+ await commitUncommittedChangesOnCriticalError({ tempDir, branchName, $, log, reason: `${classified.label} (${source})` });
56
+ }
57
+ };
58
+ // Phase 1 — repair the on-disk transcript, then resume (keeps accumulated context).
59
+ if (sessionId && resumeCount < retryLimits.maxThinkingBlockResumes) {
60
+ resumeCount++;
61
+ await preserveWork();
62
+ await log(`\n⚠️ ${classified.label} (${source}). Resume attempt ${resumeCount}/${retryLimits.maxThinkingBlockResumes} — repairing the corrupted transcript then resuming the existing session before discarding it (Issue #1834)...`, { level: 'warning' });
63
+ // Strip the corrupted (empty-text) thinking blocks so resume isn't doomed to repeat the 400.
64
+ try {
65
+ const repairResult = await repair({ tempDir, sessionId, homeDir, log });
66
+ if (repairResult?.repaired) {
67
+ await log(` 🩹 Stripped ${repairResult.removedBlocks} corrupted thinking block(s) from the transcript — resume will preserve context (Issue #1834).`, { verbose: true });
68
+ } else {
69
+ await log(` ℹ️ Transcript repair made no change (${repairResult?.reason || 'unknown'}) — resuming as-is (Issue #1834).`, { verbose: true });
70
+ }
71
+ } catch {
72
+ // Repair must never block recovery — fall through to a plain resume attempt.
73
+ }
74
+ argv.resume = sessionId;
75
+ await waitWithCountdown(waitMs, log);
76
+ await log('\n🔄 Resuming the session now...');
77
+ return true;
78
+ }
79
+ // Phase 2 — resume not possible / already failed → discard the session and start fresh.
80
+ if (restartCount < retryLimits.maxThinkingBlockRestarts) {
81
+ restartCount++;
82
+ await preserveWork();
83
+ await log(`\n⚠️ ${classified.label} (${source}). Resume not possible — restart ${restartCount}/${retryLimits.maxThinkingBlockRestarts} with a fresh session (Issue #1834)...`, { level: 'warning' });
84
+ await log(` Discarding session ${argv.resume || sessionId || '(none)'} and starting fresh — the corrupted thinking blocks can never be replayed (upstream anthropics/claude-code#63147).`, { verbose: true });
85
+ // Force a fresh session — do NOT resume the corrupted one, otherwise the 400 repeats forever.
86
+ argv.resume = undefined;
87
+ await waitWithCountdown(waitMs, log);
88
+ await log('\n🔄 Restarting with a fresh session now...');
89
+ return true;
90
+ }
91
+ await log(`\n\n❌ Corrupted thinking blocks persisted after ${resumeCount} resume + ${restartCount} fresh-session attempt(s) (Issue #1834).\n This is an upstream Claude Code bug (anthropics/claude-code#63147). Failing to avoid an endless recovery loop.`, { level: 'error' });
92
+ return false;
93
+ };
94
+ };
95
+
96
+ export default { createThinkingBlockRecovery };
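The escalation order implemented by `createThinkingBlockRecovery` can be illustrated with a stripped-down model. This is only a sketch: `makeRecoverySketch` is a hypothetical name, and the git auto-commit, transcript repair, logging, and settle delay are all omitted. The default caps mirror the `maxThinkingBlockResumes` (1) and `maxThinkingBlockRestarts` (2) defaults from the config hunk.

```javascript
// Hypothetical, simplified model of the two-phase recovery handler.
// It keeps only the counters and the argv.resume mutation.
const makeRecoverySketch = ({ argv, maxResumes = 1, maxRestarts = 2 }) => {
  let resumeCount = 0;
  let restartCount = 0;
  return ({ sessionId }) => {
    // Phase 1: resume the existing session while the resume cap allows it.
    if (sessionId && resumeCount < maxResumes) {
      resumeCount++;
      argv.resume = sessionId;
      return true;
    }
    // Phase 2: discard the session and start fresh while the restart cap allows it.
    if (restartCount < maxRestarts) {
      restartCount++;
      argv.resume = undefined;
      return true;
    }
    // Both caps exhausted: the caller should fail instead of looping.
    return false;
  };
};

// Simulate a session that hits the same error on every attempt.
const argv = { resume: undefined };
const recover = makeRecoverySketch({ argv });
const decisions = [];
while (recover({ sessionId: 'session-abc' })) {
  decisions.push(argv.resume ? 'resume' : 'fresh');
}
console.log(decisions); // [ 'resume', 'fresh', 'fresh' ]
```

Because the counters live in the closure, the caps apply per handler instance, so a caller that creates a new handler for each failure would never reach the final `false`.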
@@ -137,6 +137,27 @@ export const retryLimits = {
137
137
  // Default: 5 — retry generously even when API signals not retryable, since the signal can be wrong
138
138
  // for transient backend glitches (e.g. overloaded errors observed as non-retryable 500s).
139
139
  maxNotRetryableAttempts: parseIntWithDefault('HIVE_MIND_MAX_NOT_RETRYABLE_ATTEMPTS', 5),
140
+ // Corrupted extended-thinking-block recovery (Issue #1834)
141
+ // When Claude Code returns a 400 "`thinking` or `redacted_thinking` blocks ... cannot be modified",
142
+ // the on-disk session is permanently un-resumable (upstream anthropics/claude-code#63147: the
143
+ // transcript stores thinking text as "" but keeps the original signature, so every resumed turn
144
+ // fails signature validation). The only recovery is to discard the session and start a fresh one
145
+ // (equivalent to `/clear`). Cap fresh restarts to avoid expensive re-run loops.
146
+ maxThinkingBlockRestarts: parseIntWithDefault('HIVE_MIND_MAX_THINKING_BLOCK_RESTARTS', 2),
147
+ // PR #1835 feedback: "in case of this specific error we should try resume first, and if not
148
+ // possible try to restart." Before discarding the session we first attempt to resume it this many
149
+ // times (context-preserving). Only after these resume attempts also fail do we fall back to a
150
+ // fresh restart. Default: 1 — one cheap resume attempt, then escalate to a fresh session.
151
+ maxThinkingBlockResumes: parseIntWithDefault('HIVE_MIND_MAX_THINKING_BLOCK_RESUMES', 1),
152
+ };
153
+
154
+ // Critical-error recovery behaviour (Issue #1834, PR #1835 feedback)
155
+ // "On all critical errors we auto commit uncommitted changes by default." When a critical error
156
+ // forces the tool to discard/restart a session, any uncommitted work on disk would be lost when the
157
+ // session context resets. Auto-committing (and pushing) preserves it in the PR branch. On by default;
158
+ // set HIVE_MIND_AUTO_COMMIT_ON_CRITICAL_ERROR=false to disable.
159
+ export const criticalErrorRecovery = {
160
+ autoCommitUncommittedChanges: getenv('HIVE_MIND_AUTO_COMMIT_ON_CRITICAL_ERROR', 'true').toLowerCase() === 'true',
140
161
  };
141
162
 
142
163
  // Claude Code CLI configurations
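The opt-out flag above is compared against the literal string `'true'` after lowercasing, so values such as `1` or `yes` would disable auto-commit rather than enable it. The sketch below illustrates that comparison. `getenvSketch` is a hypothetical stand-in for the package's `getenv` helper, whose real implementation is not part of this diff and may behave differently (for example around empty or whitespace-padded values).

```javascript
// Hypothetical stand-in for the package's getenv helper: returns the variable's
// value when it is set, otherwise the supplied default.
const getenvSketch = (env, name, fallback) => (env[name] === undefined ? fallback : env[name]);

const FLAG = 'HIVE_MIND_AUTO_COMMIT_ON_CRITICAL_ERROR';

// Same comparison as in the hunk above: only a case-insensitive "true" enables the feature.
const autoCommitEnabled = env => getenvSketch(env, FLAG, 'true').toLowerCase() === 'true';

const samples = [{}, { [FLAG]: 'TRUE' }, { [FLAG]: 'false' }, { [FLAG]: '1' }, { [FLAG]: 'yes' }];
const results = samples.map(autoCommitEnabled);
console.log(results); // [ true, true, false, false, false ]
```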
@@ -0,0 +1,70 @@
1
+ #!/usr/bin/env node
2
+
3
+ // Issue #1834 (PR #1835 feedback): "On all critical errors we auto commit uncommitted changes by
4
+ // default." When the tool hits a critical error and has to discard/restart a session (e.g. the
5
+ // corrupted extended-thinking-block 400, anthropics/claude-code#63147), any work the agent already
6
+ // made on disk would otherwise be silently lost when the session context is reset. This helper
7
+ // commits — and best-effort pushes — those uncommitted changes so the partial work is preserved in
8
+ // the PR branch history before recovery proceeds.
9
+ //
10
+ // It is intentionally dependency-light (receives `$` and `log`) and NEVER throws: a failure to
11
+ // commit must not mask the original critical error or break the recovery flow.
12
+
13
+ import { reportError } from './sentry.lib.mjs';
14
+
15
+ /**
16
+ * Commit (and optionally push) any uncommitted changes in a working tree before critical-error
17
+ * recovery resets the session.
18
+ *
19
+ * @param {object} params
20
+ * @param {string} params.tempDir - Working tree (git clone) to inspect.
21
+ * @param {string} [params.branchName] - Branch to push to (push skipped when absent).
22
+ * @param {Function} params.$ - command-stream tagged-template executor.
23
+ * @param {Function} params.log - async logger.
24
+ * @param {string} [params.reason] - Short human-readable reason, recorded in the commit message.
25
+ * @param {boolean} [params.push=true] - Whether to push after committing.
26
+ * @returns {Promise<{committed: boolean, pushed: boolean}>}
27
+ */
28
+ export const commitUncommittedChangesOnCriticalError = async ({ tempDir, branchName, $, log, reason = 'critical error', push = true }) => {
29
+ if (!tempDir || typeof $ !== 'function') {
30
+ return { committed: false, pushed: false };
31
+ }
32
+ try {
33
+ const statusResult = await $({ cwd: tempDir })`git status --porcelain 2>&1`;
34
+ const statusOutput = statusResult.stdout?.toString().trim() || '';
35
+ if (!statusOutput) {
36
+ await log(' ℹ️ No uncommitted changes to preserve before recovery.', { verbose: true });
37
+ return { committed: false, pushed: false };
38
+ }
39
+ await log(`💾 Critical error (${reason}) — auto-committing uncommitted changes to preserve work before recovery...`);
40
+ for (const line of statusOutput.split('\n')) await log(` ${line}`, { verbose: true });
41
+ const addResult = await $({ cwd: tempDir })`git add -A`;
42
+ if (addResult.code !== 0) {
43
+ await log(`⚠️ Could not stage changes before recovery: ${addResult.stderr?.toString().trim()}`, { level: 'warning' });
44
+ return { committed: false, pushed: false };
45
+ }
46
+ const commitMessage = `🛟 Auto-commit before critical-error recovery (${reason})`;
47
+ const commitResult = await $({ cwd: tempDir })`git commit -m ${commitMessage}`;
48
+ if (commitResult.code !== 0) {
49
+ await log(`⚠️ Could not commit changes before recovery: ${commitResult.stderr?.toString().trim() || commitResult.stdout?.toString().trim()}`, { level: 'warning' });
50
+ return { committed: false, pushed: false };
51
+ }
52
+ await log('✅ Uncommitted changes committed before recovery.');
53
+ if (!push || !branchName) {
54
+ return { committed: true, pushed: false };
55
+ }
56
+ const pushResult = await $({ cwd: tempDir })`git push origin ${branchName} 2>&1`;
57
+ if (pushResult.code === 0) {
58
+ await log('✅ Preserved work pushed to remote.');
59
+ return { committed: true, pushed: true };
60
+ }
61
+ await log(`⚠️ Committed locally but could not push preserved work: ${pushResult.stderr?.toString().trim() || pushResult.stdout?.toString().trim()}`, { level: 'warning' });
62
+ return { committed: true, pushed: false };
63
+ } catch (error) {
64
+ reportError(error, { context: 'commit_uncommitted_on_critical_error', tempDir, operation: 'auto_commit_recovery' });
65
+ await log(`⚠️ Error while auto-committing before recovery (continuing anyway): ${error.message}`, { level: 'warning' });
66
+ return { committed: false, pushed: false };
67
+ }
68
+ };
69
+
70
+ export default { commitUncommittedChangesOnCriticalError };
package/src/solve.mjs CHANGED
@@ -1136,6 +1136,20 @@ try {
1136
1136
  }
1137
1137
  }
1138
1138
 
1139
+ // Issue #1834 (PR #1835 feedback): "on all critical errors we auto commit uncommitted changes
1140
+ // by default." A failed session is a critical error and exits here before the normal
1141
+ // auto-commit chokepoint below, so preserve (commit + push) any work the agent left on disk
1142
+ // first. On by default; disable via HIVE_MIND_AUTO_COMMIT_ON_CRITICAL_ERROR=false. Never throws.
1143
+ try {
1144
+ const { criticalErrorRecovery } = await import('./config.lib.mjs');
1145
+ if (criticalErrorRecovery.autoCommitUncommittedChanges) {
1146
+ const { commitUncommittedChangesOnCriticalError } = await import('./critical-error-commit.lib.mjs');
1147
+ await commitUncommittedChangesOnCriticalError({ tempDir, branchName, $, log, reason: `${argv.tool || 'claude'} execution failed` });
1148
+ }
1149
+ } catch (preserveError) {
1150
+ await log(` ⚠️ Could not auto-commit before failure exit: ${preserveError.message}`, { verbose: true });
1151
+ }
1152
+
1139
1153
  await safeExit(1, `${argv.tool.toUpperCase()} execution failed`);
1140
1154
  }
1141
1155
 
@@ -1159,8 +1173,13 @@ try {
1159
1173
  await log('ℹ️ Playwright MCP auto-cleanup disabled via --no-playwright-mcp-auto-cleanup', { verbose: true });
1160
1174
  }
1161
1175
 
1162
- // When limit is reached, force auto-commit of any uncommitted changes to preserve work
1163
- const shouldAutoCommit = argv['auto-commit-uncommitted-changes'] || limitReached;
1176
+ // When limit is reached, force auto-commit of any uncommitted changes to preserve work.
1177
+ // Issue #1834 (PR #1835 feedback): "on all critical errors we auto commit uncommitted changes by
1178
+ // default." A failed/errored session is a critical error, so auto-commit (and push) to preserve any
1179
+ // work the agent left on disk. On by default; disable via HIVE_MIND_AUTO_COMMIT_ON_CRITICAL_ERROR=false.
1180
+ const { criticalErrorRecovery } = await import('./config.lib.mjs');
1181
+ const criticalError = success === false || errorDuringExecution === true;
1182
+ const shouldAutoCommit = argv['auto-commit-uncommitted-changes'] || limitReached || (criticalError && criticalErrorRecovery.autoCommitUncommittedChanges);
1164
1183
  const autoRestartEnabled = argv['autoRestartOnUncommittedChanges'] !== false;
1165
1184
  const shouldRestart = await checkForUncommittedChanges(tempDir, owner, repo, branchName, $, log, shouldAutoCommit, autoRestartEnabled);
1166
1185
 
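The new `shouldAutoCommit` expression can be read as a small decision function. The sketch below restates it with hypothetical parameter names (`explicitFlag`, `autoCommitOnCriticalError`) standing in for `argv['auto-commit-uncommitted-changes']` and `criticalErrorRecovery.autoCommitUncommittedChanges`. The `Boolean(...)` wrapper is added here only to make the printed results uniform.

```javascript
// Pure restatement of the shouldAutoCommit condition from the hunk above.
const decideAutoCommit = ({ explicitFlag, limitReached, success, errorDuringExecution, autoCommitOnCriticalError }) => {
  // Strict comparisons: an undefined outcome is not treated as a critical error.
  const criticalError = success === false || errorDuringExecution === true;
  return Boolean(explicitFlag || limitReached || (criticalError && autoCommitOnCriticalError));
};

const base = { explicitFlag: false, limitReached: false, success: true, errorDuringExecution: false, autoCommitOnCriticalError: true };

const cases = {
  cleanRun: decideAutoCommit(base),
  failedRun: decideAutoCommit({ ...base, success: false }),
  failedRunOptOut: decideAutoCommit({ ...base, success: false, autoCommitOnCriticalError: false }),
  limitReachedOptOut: decideAutoCommit({ ...base, limitReached: true, autoCommitOnCriticalError: false }),
  unknownOutcome: decideAutoCommit({ ...base, success: undefined }),
};
console.log(cases);
```

Under these assumptions, the environment opt-out only suppresses the critical-error trigger: an explicit flag or a reached limit still forces the auto-commit.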
@@ -43,6 +43,21 @@ export const classifyRetryableError = value => {
43
43
  return { message, isRetryable: true, isCapacity: false, label: 'Stream disconnected before completion' };
44
44
  }
45
45
 
46
+ // Issue #1834: Corrupted extended-thinking blocks. When extended thinking is combined with tool
47
+ // use, Claude Code can persist a thinking block to the session transcript with the `thinking`
48
+ // text emptied to "" while retaining the original `signature`. On resume/continue the block is
49
+ // replayed as `{ type: 'thinking', thinking: '', signature: <original> }`; the API validates the
50
+ // signature against the (now empty) text and rejects every subsequent turn with:
51
+ // 400 ... `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be
52
+ // modified. These blocks must remain as they were in the original response.
53
+ // The session is therefore permanently un-resumable — retrying with --resume always fails. The
54
+ // only recovery is to discard the session and start fresh (equivalent to `/clear`), so this is
55
+ // flagged with `requiresFreshSession` rather than the plain `isRetryable` retry-with-resume path.
56
+ // Upstream: https://github.com/anthropics/claude-code/issues/63147
57
+ if ((lower.includes('thinking') || lower.includes('redacted_thinking')) && lower.includes('cannot be modified')) {
58
+ return { message, isRetryable: false, isCapacity: false, requiresFreshSession: true, label: 'Corrupted thinking blocks (un-resumable session)' };
59
+ }
60
+
46
61
  if (lower.includes('api error: 503') || (lower.includes('503') && (lower.includes('upstream connect error') || lower.includes('remote connection failure')))) {
47
62
  return { message, isRetryable: true, isCapacity: false, label: '503 network error' };
48
63
  }
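The match condition for the corrupted-thinking-block error can be exercised in isolation. This is a minimal sketch: `isCorruptedThinkingBlockError` is a hypothetical helper, and it assumes `lower` is simply the lowercased error text (how the real classifier derives `message` and `lower` is not shown in this hunk). Since the string `redacted_thinking` already contains `thinking`, the sketch checks only the shorter substring.

```javascript
// Hypothetical helper reproducing the substring test from the classifier branch above.
const isCorruptedThinkingBlockError = value => {
  const lower = String(value).toLowerCase();
  // 'redacted_thinking' contains 'thinking', so one substring check covers both block types.
  return lower.includes('thinking') && lower.includes('cannot be modified');
};

// Error text as quoted in the source comment (the "..." is part of the quoted text).
const sample = '400 ... `thinking` or `redacted_thinking` blocks in the latest assistant message cannot be modified. These blocks must remain as they were in the original response.';

const checks = {
  quotedError: isCorruptedThinkingBlockError(sample),
  redactedOnly: isCorruptedThinkingBlockError('redacted_thinking blocks cannot be modified'),
  networkError: isCorruptedThinkingBlockError('API Error: 503 upstream connect error'),
  unrelatedThinking: isCorruptedThinkingBlockError('thinking budget exceeded'),
};
console.log(checks);
```

The test is deliberately broad: any message containing both substrings is classified this way, which is one reason the branch's position relative to the other checks in `classifyRetryableError` may matter.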