claude-coder 1.4.0 → 1.5.0
- package/README.md +4 -1
- package/docs/ARCHITECTURE.md +35 -8
- package/docs/README.en.md +2 -1
- package/package.json +1 -1
- package/src/config.js +1 -1
- package/src/indicator.js +42 -7
- package/src/runner.js +19 -0
- package/src/session.js +30 -7
package/README.md
CHANGED
@@ -98,7 +98,8 @@ your-project/
 tests.json # Verification records
 test.env # Test credentials (API keys, optional)
 playwright-auth.json # Playwright login state (optional, generated by the auth command)
-.runtime/ #
+.runtime/ # Temp files
+logs/ # Per-session logs + activity log
 requirements.md # Requirements document (optional)
 ```

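@@ -108,6 +109,8 @@ your-project/

 **Resuming after an interruption**: simply re-run `claude-coder run`; it continues from where the last run was interrupted.

+**Long periods without a response**: when processing complex files the model may pause to think for 10-20 minutes (the spinner shows a red warning); this is normal behavior. If there is no tool call for more than 30 minutes, the Harness interrupts the session automatically and retries. The threshold can be adjusted with `SESSION_STALL_TIMEOUT=<seconds>` in `.env`.
+
 **Skipping a task**: change the task's `status` to `done` in `.claude-coder/tasks.json`.

 **Windows support**: fully supported; the implementation is pure Node.js.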
package/docs/ARCHITECTURE.md
CHANGED
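@@ -21,9 +21,12 @@ The Agent should maximize task progress within a single session. **Any non-fatal
 - Missing API key → substitute a mock or verify the code logic instead, record it in `test.env`, and keep going
 - Test environment not ready → skip that test step and complete the remaining verifiable steps
 - Service fails to start → try to diagnose and fix it; if it cannot be fixed, record the problem and move on with the code implementation
+- **Long thinking is normal behavior**: when the model processes a large file (e.g. a code file of 500+ lines), thinking gaps of 10-20 minutes can occur; this does not mean it is stuck

 **Counter-example**: the Agent marks a task `failed` outright because `OPENAI_API_KEY` is missing → the whole session is wasted

+> Harness fallback mechanism: when the gap between tool calls exceeds `SESSION_STALL_TIMEOUT` (default 30 minutes), the session is interrupted automatically and rollback + retry is triggered. The threshold is deliberately set far above normal thinking time so that it only catches genuine stalls.
+
 ### Rule 2: A rollback is a complete rollback

 `git reset --hard` is a full rollback; no files are selectively protected.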
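@@ -56,6 +59,16 @@ The Agent should maximize task progress within a single session. **Any non-fatal

 The Agent should not spend tool calls reading data the harness already knows. All context that can be pre-read is injected through prompt hints (see Section 5, Prompt injection architecture).

+### Rule 6: Stall detection and automatic recovery when the model hangs
+
+The model may "think" for a long time without calling any tool (20+ minutes with no progress). The Harness tracks the time of the last tool call through the PreToolUse hook:
+
+- No tool call for longer than `SESSION_STALL_TIMEOUT` (default 1800 seconds / 30 minutes) → the session is interrupted automatically
+- After the interruption, the runner's retry logic takes over (≥ 3 consecutive failures → the task is marked failed)
+- The spinner shows a red warning once there has been no tool call for more than 2 minutes
+
+Configuration: set `SESSION_STALL_TIMEOUT=1800` (seconds) in `.claude-coder/.env`
+
 ---

 ## 1. Core architecture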
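The thresholds in Rule 6 can be read as a three-way classification of idle time. The following is a minimal sketch under stated assumptions, not the package's code: `classifyIdle` and `WARN_AFTER_MS` are hypothetical names, and only the 2-minute warning and the 1800-second default come from the text above.

```javascript
// Hypothetical helper: classify how long a session has gone without a tool call.
// WARN_AFTER_MS mirrors the 2-minute spinner warning described in Rule 6;
// stallTimeoutSec mirrors SESSION_STALL_TIMEOUT (default 1800 seconds).
const WARN_AFTER_MS = 2 * 60 * 1000;

function classifyIdle(lastToolTime, now, stallTimeoutSec = 1800) {
  const idleMs = now - lastToolTime;
  if (idleMs > stallTimeoutSec * 1000) return 'stalled'; // harness would interrupt the session
  if (idleMs >= WARN_AFTER_MS) return 'warn';            // spinner would show the red warning
  return 'ok';
}

const t0 = 0;
console.log(classifyIdle(t0, 1 * 60 * 1000));  // ok
console.log(classifyIdle(t0, 15 * 60 * 1000)); // warn
console.log(classifyIdle(t0, 31 * 60 * 1000)); // stalled
```

Passing the clock in as `now` keeps the sketch deterministic; a 15-minute thinking gap lands in the warning band without being treated as a stall, which is the behavior the rule describes.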
@@ -79,7 +92,7 @@ flowchart TB
 direction TB
 profile["project_profile.json<br/>tasks.json"]
 runtime["session_result.json<br/>progress.json"]
-phase[".runtime/<br/>phase / step /
+phase[".runtime/<br/>phase / step / logs/"]
 end

 scan -->|"systemPrompt =<br/>CLAUDE.md + SCAN_PROTOCOL.md"| query
@@ -96,7 +109,8 @@ flowchart TB
 **Key characteristics:**
 - **Project-agnostic**: project information is scanned by the Agent and stored in `project_profile.json`; the harness contains no project-specific logic
 - **Resumable**: `session_result.json` carries memory across sessions, so any session can resume from a checkpoint
-- **Observable**: inline SDK `PreToolUse` hook
+- **Observable**: an inline SDK `PreToolUse` hook shows the current tool, its target, and stall warnings in real time
+- **Self-healing**: edit-loop detection + automatic interruption on stall timeout + runner retry mechanism
 - **Cross-platform**: pure Node.js implementation that runs on macOS / Linux / Windows
 - **Zero dependencies**: `dependencies` is empty; the Claude Agent SDK is a peerDependency

@@ -144,14 +158,14 @@ bin/cli.js CLI entry point: argument parsing, command routing, SDK peerDep
 src/
 config.js Configuration management: .env loading, model mapping, environment variable construction, global sync
 runner.js Main loop: scan → session → validate → retry/rollback
-session.js SDK interaction: query() calls, hook
+session.js SDK interaction: query() calls, hook binding, stall detection, log streaming
 prompts.js Prompt construction: system prompt composition + conditional hints + task decomposition guidance
 init.js Environment initialization: reads the profile to install dependencies, start services, and run health checks
 scanner.js Initial scan: calls runScanSession + retries
 validator.js Validation engine: tiered validation (fatal/recoverable/pass) + git checks + test coverage
 tasks.js Task management: CRUD + state machine + progress display
 auth.js Playwright credentials: export login state + MCP configuration + gitignore
-indicator.js Progress indicator: terminal spinner + phase/step file writes
+indicator.js Progress indicator: terminal spinner + tool target display + stall warnings + phase/step file writes
 setup.js Interactive configuration: model selection, API key, MCP tools
 templates/
 CLAUDE.md Agent protocol (injected as systemPrompt)
@@ -193,7 +207,7 @@ templates/
 | `session_result.json` | At the end of every session | Result of the current session (flat format, backward compatible with the old `current` wrapper) |
 | `playwright-auth.json` | `claude-coder auth` | Playwright login state (cookies + localStorage) |
 | `tests.json` | On the first test run | Verification records (prevents repeated testing) |
-| `.runtime/` | Runtime | Temp files (phase, step, activity.log
+| `.runtime/` | Runtime | Temp files (phase, step, logs/session_N.activity.log) |

 ---

@@ -299,17 +313,27 @@ sequenceDiagram
 participant SDK as Claude Agent SDK
 participant Hook as inferPhaseStep()
 participant Ind as Indicator (setInterval)
+participant Stall as stallChecker (30s)
 participant Term as Terminal

 SDK->>Hook: PreToolUse callback<br/>{tool_name: "Edit", tool_input: {path: "src/app.tsx"}}
 Hook->>Hook: Infer phase: Edit → coding
 Hook->>Ind: updatePhase("coding")
+Hook->>Ind: lastToolTime = now
+Hook->>Ind: toolTarget = "src/app.tsx"
 Hook->>Ind: appendActivity("Edit", "src/app.tsx")

 Note over SDK,Hook: Synchronous callback, return {decision: "allow"}

 loop every 500ms
-Ind->>Term: ⠋ [Session 3] coding 02:15 |
+Ind->>Term: ⠋ [Session 3] coding 02:15 | reading file: ppt_generator.py
+end
+
+loop every 30s
+Stall->>Ind: check now - lastToolTime
+alt exceeds STALL_TIMEOUT
+Stall->>SDK: stallDetected = true → break for-await
+end
 end
 ```

@@ -364,9 +388,12 @@ The Harness pre-reads `tasks.json` in `buildCodingPrompt()` and the next pending

 The Harness pre-reads `session_result.json` in `buildCodingPrompt()` and injects the previous session's task_id, result, and a summary of its notes into the user prompt. The Agent does not need to read historical session data itself.

-###
+### Self-healing mechanisms
+
+**Edit-loop detection**: the PreToolUse hook tracks the number of edits per file; more than 5 Write/Edit calls on the same file → `decision: "block"`.

-
+**Stall timeout detection**: every 30 seconds `indicator.lastToolTime` is checked; if more than `SESSION_STALL_TIMEOUT` (default 1800 seconds / 30 minutes) has passed since the last tool call, the message loop exits via `break` and rollback + retry is triggered.
+> Note: when processing complex files the model may think for 10-20 minutes at a stretch; this is normal behavior. The timeout is set to 30 minutes to avoid killing normal thinking. It can be customized with `SESSION_STALL_TIMEOUT=<seconds>` in `.env`.

 ### File permission model

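The edit-loop rule from the self-healing section can be sketched in isolation. This is an illustrative sketch only: `makeEditGuard` is a hypothetical name, the threshold of 5 comes from the text above, and the `{ decision: ... }` return shape follows the documentation in this diff rather than a verified Claude Agent SDK contract.

```javascript
// Hypothetical factory: returns a checker that counts edits per file and
// starts blocking once a single file has been edited more than `threshold` times.
function makeEditGuard(threshold = 5) {
  const editCounts = new Map();
  const EDIT_TOOLS = new Set(['Write', 'Edit', 'MultiEdit']);

  return function check(toolName, filePath) {
    // Tools that do not edit files, or calls without a path, are never counted.
    if (!EDIT_TOOLS.has(toolName) || !filePath) return { decision: 'allow' };

    const count = (editCounts.get(filePath) || 0) + 1;
    editCounts.set(filePath, count);

    return count > threshold
      ? { decision: 'block', message: `${filePath} edited ${count} times; possible edit loop` }
      : { decision: 'allow' };
  };
}

const check = makeEditGuard();
for (let i = 1; i <= 6; i++) {
  console.log(i, check('Edit', 'src/app.tsx').decision);
}
```

In this sketch the first five edits of a file are allowed and the sixth is blocked; counts are kept per path, so editing a different file starts from zero.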
package/docs/README.en.md
CHANGED
@@ -86,7 +86,8 @@ your-project/
 tests.json # Verification records
 test.env # Test credentials (API keys, optional)
 playwright-auth.json # Playwright login state (optional, via auth command)
-.runtime/ # Temp files
+.runtime/ # Temp files
+logs/ # Per-session logs + activity logs
 requirements.md # Requirements (optional)
 ```

package/package.json
CHANGED
package/src/config.js
CHANGED
@@ -63,7 +63,6 @@ function paths() {
     runtime,
     phaseFile: path.join(runtime, 'phase'),
     stepFile: path.join(runtime, 'step'),
-    activityLog: path.join(runtime, 'activity.log'),
     logsDir: path.join(runtime, 'logs'),
   };
 }
@@ -107,6 +106,7 @@ function loadConfig() {
     defaultSonnet: env.ANTHROPIC_DEFAULT_SONNET_MODEL || '',
     defaultHaiku: env.ANTHROPIC_DEFAULT_HAIKU_MODEL || '',
     thinkingBudget: env.ANTHROPIC_THINKING_BUDGET || '',
+    stallTimeout: parseInt(env.SESSION_STALL_TIMEOUT, 10) || 1800,
     raw: env,
   };

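The `parseInt(...) || 1800` expression added to `loadConfig()` has a few edge cases worth seeing side by side. The sketch below reproduces that expression in a standalone function and contrasts it with a stricter variant; both function names are hypothetical, and the stricter variant is an illustration rather than something the package ships.

```javascript
// Same expression as the added config line, wrapped for experimentation.
function parseStallTimeout(raw) {
  return parseInt(raw, 10) || 1800;
}

// Hypothetical stricter variant: also rejects zero and negative values.
function parseStallTimeoutStrict(raw, fallback = 1800) {
  const n = Number.parseInt(raw, 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

for (const raw of [undefined, '600', '0', 'abc', '90s', '-5']) {
  console.log(JSON.stringify(raw), parseStallTimeout(raw), parseStallTimeoutStrict(raw));
}
```

With the original expression, an unset, non-numeric, or `0` value falls back to 1800, `90s` is read as 90, and a negative value such as `-5` passes through unchanged. Since `session.js` compares idle time against `stallTimeout * 1000`, a negative setting would presumably trip the stall check on its first 30-second tick, which is the case the stricter variant guards against.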
package/src/indicator.js
CHANGED
@@ -9,15 +9,18 @@ class Indicator {
   constructor() {
     this.phase = 'thinking';
     this.step = '';
+    this.toolTarget = '';
     this.spinnerIndex = 0;
     this.timer = null;
     this.lastActivity = '';
+    this.lastToolTime = Date.now();
     this.sessionNum = 0;
     this.startTime = Date.now();
   }

-  start(sessionNum) {
+  start(sessionNum, activityLogPath) {
     this.sessionNum = sessionNum;
+    this.activityLogPath = activityLogPath || null;
     this.startTime = Date.now();
     this.timer = setInterval(() => this._render(), 500);
   }
@@ -45,8 +48,9 @@ class Indicator {
     const entry = `[${ts}] ${toolName}: ${summary}`;
     this.lastActivity = entry;
     try {
-
-
+      if (this.activityLogPath) {
+        fs.appendFileSync(this.activityLogPath, entry + '\n', 'utf8');
+      }
     } catch { /* ignore */ }
   }

@@ -74,8 +78,17 @@ class Indicator {
       ? `${COLOR.yellow}思考中${COLOR.reset}` // "thinking"
       : `${COLOR.green}编码中${COLOR.reset}`; // "coding"

+    const idleMs = Date.now() - this.lastToolTime;
+    const idleMin = Math.floor(idleMs / 60000);
+
     let line = `${spinner} [Session ${this.sessionNum}] ${clock} ${phaseLabel} ${mm}:${ss}`;
-    if (
+    if (idleMin >= 2) {
+      line += ` | ${COLOR.red}${idleMin}分无工具调用${COLOR.reset}`; // "<N> min without a tool call"
+    }
+    if (this.step) {
+      line += ` | ${this.step}`;
+      if (this.toolTarget) line += `: ${this.toolTarget}`;
+    }
     return line;
   }

@@ -94,6 +107,14 @@ class Indicator {
 function inferPhaseStep(indicator, toolName, toolInput) {
   const name = (toolName || '').toLowerCase();

+  indicator.lastToolTime = Date.now();
+
+  const rawTarget = typeof toolInput === 'object'
+    ? (toolInput.file_path || toolInput.path || toolInput.command || toolInput.pattern || '')
+    : String(toolInput || '');
+  const shortTarget = rawTarget.split('/').slice(-2).join('/').slice(0, 40);
+  indicator.toolTarget = shortTarget;
+
   if (name === 'write' || name === 'edit' || name === 'multiedit' || name === 'str_replace_editor' || name === 'strreplace') {
     indicator.updatePhase('coding');
   } else if (name === 'bash' || name === 'shell') {
@@ -119,9 +140,23 @@ function inferPhaseStep(indicator, toolName, toolInput) {
     indicator.updateStep('查阅文档'); // "consulting docs"
   }

-
-
-
+  let summary;
+  if (typeof toolInput === 'object') {
+    const target = toolInput.file_path || toolInput.path || '';
+    const cmd = toolInput.command || '';
+    const pattern = toolInput.pattern || '';
+    if (target) {
+      summary = target;
+    } else if (cmd) {
+      summary = cmd.slice(0, 200);
+    } else if (pattern) {
+      summary = `pattern: ${pattern}`;
+    } else {
+      summary = JSON.stringify(toolInput).slice(0, 200);
+    }
+  } else {
+    summary = String(toolInput || '').slice(0, 200);
+  }
   indicator.appendActivity(toolName, summary);
 }

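The short target shown next to the spinner is derived by keeping the last two `/`-separated segments of the tool input and capping the result at 40 characters. The standalone sketch below mirrors that derivation; `shortTarget` as a function is a hypothetical wrapper, and the explicit null guard is an addition for the sketch (in JavaScript `typeof null === 'object'`, so a bare `typeof` check alone would not exclude `null`).

```javascript
// Hypothetical wrapper around the target-shortening logic from the diff.
// The `toolInput &&` guard is an assumption added here to make null inputs safe.
function shortTarget(toolInput) {
  const raw = toolInput && typeof toolInput === 'object'
    ? (toolInput.file_path || toolInput.path || toolInput.command || toolInput.pattern || '')
    : String(toolInput || '');
  // Keep the last two path segments, then cap the length at 40 characters.
  return String(raw).split('/').slice(-2).join('/').slice(0, 40);
}

console.log(shortTarget({ file_path: '/home/me/proj/src/app.tsx' })); // src/app.tsx
console.log(shortTarget({ command: 'npm test' }));                    // npm test
console.log(shortTarget({ command: 'cat /etc/hosts' }));              // etc/hosts
console.log(JSON.stringify(shortTarget(null)));                       // ""
```

Because the split is on `/` only, a shell command that contains a path is shortened like a path (the third call), and Windows-style backslash paths would pass through unsplit and only be length-capped.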
package/src/runner.js
CHANGED
@@ -301,6 +301,25 @@ async function run(requirement, opts = {}) {
       lastValidateLog: consecutiveFailures > 0 ? '上次校验失败' : '', // "previous validation failed"
     });

+    if (sessionResult.stalled) {
+      // "Session N was interrupted by the stall timeout; skipping validation and retrying"
+      log('warn', `Session ${session} 因停顿超时中断,跳过校验直接重试`);
+      consecutiveFailures++;
+      rollback(headBefore, '停顿超时'); // "stall timeout"
+      if (consecutiveFailures >= MAX_RETRY) {
+        // "failed MAX_RETRY times in a row; skipping the current task"
+        log('error', `连续失败 ${MAX_RETRY} 次,跳过当前任务`);
+        markTaskFailed();
+        consecutiveFailures = 0;
+      }
+      appendProgress({
+        session,
+        timestamp: new Date().toISOString(),
+        result: 'stalled',
+        cost: sessionResult.cost,
+        taskId,
+      });
+      continue;
+    }
+
     // Validate
     log('info', '开始 harness 校验 ...'); // "starting harness validation ..."
     const validateResult = await validate(headBefore);
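The interaction between stalled sessions and the consecutive-failure counter can be traced with a small simulation. This is a simplified sketch, not the runner itself: `simulateSessions` is a hypothetical name, `MAX_RETRY = 3` is assumed from the "≥ 3 consecutive failures" rule in ARCHITECTURE.md, and a non-stalled session is assumed to pass validation and reset the counter.

```javascript
// Assumed value, taken from the "≥ 3 consecutive failures" rule in the docs.
const MAX_RETRY = 3;

// Hypothetical simulation: each entry in `stalledFlags` is one session,
// true meaning the session ended with `stalled: true`.
function simulateSessions(stalledFlags) {
  let consecutiveFailures = 0;
  const events = [];
  for (const stalled of stalledFlags) {
    if (!stalled) {
      // Simplification: assume validation passes and clears the counter.
      events.push('validate');
      consecutiveFailures = 0;
      continue;
    }
    consecutiveFailures++;
    events.push('rollback');
    if (consecutiveFailures >= MAX_RETRY) {
      events.push('markTaskFailed');
      consecutiveFailures = 0;
    }
  }
  return events;
}

console.log(simulateSessions([true, true, true]));
console.log(simulateSessions([true, true, false, true]));
```

In the first run the third stall in a row marks the task failed; in the second run the successful session in between resets the counter, so the later stall only triggers a rollback.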
package/src/session.js
CHANGED
@@ -99,12 +99,23 @@ async function runCodingSession(sessionNum, opts = {}) {
   const taskId = opts.taskId || 'unknown';
   const dateStr = new Date().toISOString().slice(0, 10).replace(/-/g, '');
   const logFile = path.join(p.logsDir, `${taskId}_session_${sessionNum}_${dateStr}.log`);
+  const activityLogFile = path.join(p.logsDir, `session_${sessionNum}.activity.log`);
   const logStream = fs.createWriteStream(logFile, { flags: 'a' });

-  indicator.start(sessionNum);
+  indicator.start(sessionNum, activityLogFile);

   const editCounts = {};
   const EDIT_THRESHOLD = 5;
+  const stallTimeoutMs = config.stallTimeout * 1000;
+  let stallDetected = false;
+
+  const stallChecker = setInterval(() => {
+    const idleMs = Date.now() - indicator.lastToolTime;
+    if (idleMs > stallTimeoutMs && !stallDetected) {
+      stallDetected = true;
+      // "no new tool call for more than N minutes; interrupting the session automatically"
+      log('warn', `无新工具调用超过 ${Math.floor(idleMs / 60000)} 分钟,自动中断 session`);
+    }
+  }, 30000);

   try {
     const queryOpts = buildQueryOptions(config, opts);
@@ -115,13 +126,18 @@ async function runCodingSession(sessionNum, opts = {}) {
         hooks: [async (input) => {
           inferPhaseStep(indicator, input.tool_name, input.tool_input);

-          const
-
-
-
+          const target = input.tool_input?.file_path || input.tool_input?.path || '';
+          const toolSummary = target ? target.split('/').slice(-2).join('/') : '';
+          if (toolSummary) {
+            logStream.write(`[${new Date().toISOString()}] ${input.tool_name}: ${toolSummary}\n`);
+          }
+
+          if (['Write', 'Edit', 'MultiEdit'].includes(input.tool_name) && target) {
+            editCounts[target] = (editCounts[target] || 0) + 1;
+            if (editCounts[target] > EDIT_THRESHOLD) {
               return {
                 decision: 'block',
-                message: `已对 ${
+                // "<target> has been edited <N> times; suspected edit loop. Re-examine the approach before continuing."
+                message: `已对 ${target} 编辑 ${editCounts[target]} 次,疑似死循环。请重新审视方案后再继续。`,
               };
             }
           }
@@ -135,21 +151,28 @@ async function runCodingSession(sessionNum, opts = {}) {

     const collected = [];
     for await (const message of session) {
+      if (stallDetected) {
+        log('warn', '停顿超时,中断消息循环'); // "stall timeout; breaking out of the message loop"
+        break;
+      }
       collected.push(message);
       logMessage(message, logStream, indicator);
     }

+    clearInterval(stallChecker);
     logStream.end();
     indicator.stop();

     const result = extractResult(collected);
     return {
-      exitCode: 0,
+      exitCode: stallDetected ? 2 : 0,
       cost: result?.total_cost_usd ?? null,
       tokenUsage: result?.usage ?? null,
       logFile,
+      stalled: stallDetected,
     };
   } catch (err) {
+    clearInterval(stallChecker);
     logStream.end();
     indicator.stop();
     log('error', `Claude SDK 错误: ${err.message}`); // "Claude SDK error"
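In the hunk above, the `stallDetected` check sits at the top of the `for await` body, so it is evaluated only when the session iterator yields a message. For comparison, the sketch below shows a different technique, not what the package does: it races each `next()` call against a timer, so the loop can give up even when nothing arrives. `consumeWithIdleLimit` is a hypothetical helper, and it measures the gap between messages rather than between tool calls.

```javascript
// Hypothetical helper: consume an async iterable, but stop when no item
// arrives within `idleLimitMs`. Returns { stalled: true } on timeout.
async function consumeWithIdleLimit(iterable, idleLimitMs, onItem) {
  const iterator = iterable[Symbol.asyncIterator]();
  const TIMEOUT = Symbol('timeout');

  while (true) {
    let timer;
    const timeout = new Promise((resolve) => {
      timer = setTimeout(() => resolve(TIMEOUT), idleLimitMs);
    });
    const next = iterator.next();

    let winner;
    try {
      winner = await Promise.race([next, timeout]);
    } finally {
      clearTimeout(timer); // never leave the timer pending
    }

    if (winner === TIMEOUT) {
      next.catch(() => {}); // the abandoned read may still reject later
      if (typeof iterator.return === 'function') {
        // Ask the producer to clean up, without waiting for it.
        Promise.resolve(iterator.return()).catch(() => {});
      }
      return { stalled: true };
    }
    if (winner.done) return { stalled: false };
    onItem(winner.value);
  }
}

// Demo: a producer that yields once, then goes quiet for longer than the limit.
async function* quietProducer() {
  yield 'first message';
  await new Promise((resolve) => setTimeout(resolve, 300));
  yield 'late message';
}

consumeWithIdleLimit(quietProducer(), 50, (m) => console.log('got:', m))
  .then((result) => console.log(result));
```

Whether this trade-off suits the harness depends on how the SDK's session iterator behaves during long thinking phases, which this diff does not show; the sketch only illustrates the racing pattern.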