@link-assistant/hive-mind 2.0.1 → 2.0.2
- package/CHANGELOG.md +37 -0
- package/package.json +1 -1
- package/src/tool-retry.lib.mjs +16 -0
package/CHANGELOG.md
CHANGED
@@ -1,5 +1,42 @@
 # @link-assistant/hive-mind
 
+## 2.0.2
+
+### Patch Changes
+
+- 19aea85: fix(retry): auto-resume on "Stream idle timeout - partial response received" (#1937)
+
+A long-running solve session (391 turns, ~$34.11) had its streaming response
+stall mid-answer. The Claude CLI surfaced it as a `result` event with
+`is_error: true`, `subtype: "success"`, and:
+
+```
+API Error: Stream idle timeout - partial response received
+```
+
+Instead of retrying with the session preserved, the harness fell straight
+through to the generic failure path and exited with code 1 after **zero
+retries** — abandoning the whole session even though it had a valid session ID
+and printed the exact `--resume` command needed to continue.
+
+Root cause: the shared retry classifier `classifyRetryableError()`
+(`src/tool-retry.lib.mjs`) had no branch for the stream-idle-timeout family, so
+`isRetryable` was false, `isTransientError` evaluated to false, and the unified
+exponential-backoff retry block was never entered.
+
+This error is a transient transport-level stall (a slow/stuck server-sent-events
+socket), not a request-content rejection — the on-disk session transcript stays
+valid, which is why a manual `--resume` works. The fix adds one branch to
+`classifyRetryableError()` returning
+`{ isRetryable: true, isCapacity: false, label: 'Stream idle timeout (partial response)' }`,
+so the existing retry block resumes the session with the same context after an
+exponential backoff. Because the classifier is shared, this fixes the behaviour
+for **all** tools (claude/codex/gemini/opencode/qwen/agent) at once.
+
+Added `tests/test-issue-1937-stream-idle-timeout-retry.mjs` (17 assertions) and a
+full case study with timeline, root-cause analysis, upstream references, and the
+captured logs under `docs/case-studies/issue-1937`.
+
 ## 2.0.1
 
 ### Patch Changes
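
The 2.0.2 changelog entry describes two cooperating pieces: a classifier that marks the stream-idle-timeout message as retryable, and a retry block that resumes the same session after an exponential backoff. The following is a minimal sketch of how those pieces might fit together, not the package's actual retry block. Only the matching condition and the returned object shape mirror the diff; `runTurn`, `runWithRetry`, `backoffDelayMs`, and the delay constants are hypothetical names and values introduced for illustration.

```javascript
// Simplified stand-in for the shared classifier. Only the branch added in
// 2.0.2 is reproduced here; the real function handles other error families.
const classifyRetryableError = value => {
  const message = String(value ?? '');
  const lower = message.toLowerCase();
  if (lower.includes('stream idle timeout') || (lower.includes('idle timeout') && lower.includes('partial response'))) {
    return { message, isRetryable: true, isCapacity: false, label: 'Stream idle timeout (partial response)' };
  }
  // Fallback shape is an assumption made for this sketch.
  return { message, isRetryable: false, isCapacity: false, label: null };
};

// Hypothetical backoff schedule: baseMs * 2^attempt, capped at maxMs.
const backoffDelayMs = (attempt, baseMs = 1000, maxMs = 60000) => Math.min(maxMs, baseMs * 2 ** attempt);

// Hypothetical retry loop. `runTurn(sessionId)` is assumed to resolve to
// { ok, error, sessionId }; a retry passes the last known session ID back in
// so the turn is resumed with the same context instead of starting over.
const runWithRetry = async (runTurn, { maxRetries = 3, sleep = ms => new Promise(resolve => setTimeout(resolve, ms)) } = {}) => {
  let sessionId = null;
  for (let attempt = 0; ; attempt++) {
    const result = await runTurn(sessionId);
    if (result.ok) return { ...result, attempts: attempt + 1 };
    sessionId = result.sessionId ?? sessionId;
    const { isRetryable } = classifyRetryableError(result.error);
    if (!isRetryable || attempt >= maxRetries) {
      throw new Error(`Giving up after ${attempt + 1} attempt(s): ${result.error}`);
    }
    await sleep(backoffDelayMs(attempt));
  }
};
```

Because `isCapacity` stays false for this error family, a loop of this shape would retry against the same model rather than switching to another one, which matches the reasoning in the changelog that the stall sits in the response stream and not in model capacity.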
package/package.json
CHANGED
package/src/tool-retry.lib.mjs
CHANGED
@@ -43,6 +43,22 @@ export const classifyRetryableError = value => {
     return { message, isRetryable: true, isCapacity: false, label: 'Stream disconnected before completion' };
   }
 
+  // Issue #1937: Stream idle timeout. When the Anthropic streaming response stalls
+  // (no bytes for the SDK's idle window) after the model has already emitted part of
+  // its answer, the Claude CLI aborts the turn and surfaces a synthetic assistant /
+  // result message:
+  // "API Error: Stream idle timeout - partial response received"
+  // This is a transient network/streaming stall (a slow or stuck server-sent-events
+  // socket), not a request-content error, so the session is still valid and safe to
+  // resume. Before this branch classifyRetryableError() did not recognise it, so
+  // isRetryable was false and the whole solve session aborted with exit code 1 even
+  // though `--resume <sessionId>` could continue with the same context. Switching
+  // models does not help (the stall is in the response stream, not model capacity),
+  // so isCapacity is false → retry with the session preserved after a backoff.
+  if (lower.includes('stream idle timeout') || (lower.includes('idle timeout') && lower.includes('partial response'))) {
+    return { message, isRetryable: true, isCapacity: false, label: 'Stream idle timeout (partial response)' };
+  }
+
   // Issue #1881: Transient socket / network disconnects from the SDK's underlying fetch.
   // When the HTTP(S)/streaming socket drops mid-request, the Claude/Codex CLI surfaces a
   // synthetic assistant message such as: