openclaw-diag-cli 0.1.3 → 0.2.1
- package/README.md +84 -71
- package/bin/openclaw-diag.js +65 -176
- package/diag/01_sys_health.py +0 -2
- package/diag/02_environment.py +32 -6
- package/diag/03_configuration.py +4 -1
- package/diag/04_gateway.py +30 -8
- package/diag/05_recent_errors.py +24 -14
- package/diag/06_cron_jobs.py +4 -41
- package/diag/07_performance.py +114 -42
- package/diag/08_sessions.py +2 -54
- package/diag/09_plugin_diag.py +52 -25
- package/diag/10_shell_history.py +28 -10
- package/lib/bundle.py +6 -13
- package/ocdiag/__init__.py +1 -1
- package/ocdiag/cli.py +16 -1
- package/ocdiag/dispatcher.py +140 -53
- package/ocdiag/doctor.py +162 -0
- package/ocdiag/jsonlog.py +0 -5
- package/ocdiag/paths.py +0 -1
- package/ocdiag/recent_logs.py +0 -3
- package/ocdiag/sensitive.py +95 -1
- package/ocdiag/timeutil.py +0 -11
- package/ocdiag/tokens.py +0 -4
- package/package.json +2 -2
- package/tools/oc_session_extract.py +75 -7
- package/tools/oc_session_trace.py +31 -9
package/README.md
CHANGED
@@ -1,49 +1,89 @@
-#
+# openclaw-diag
 
-
+When OpenClaw misbehaves, **run this command before opening a ticket**:
 
-
+```bash
+npx openclaw-diag-cli all
+```
+
+Zero install, zero dependencies, observer-only: read-only probes that never modify your state.
+
+## What this is
+
+A command-line toolbox for troubleshooting [OpenClaw](https://github.com/openclaw/openclaw) runtime problems.
+
+It splits routine troubleshooting into 12 atomic diagnostics. Each answers one concrete question ("Is the Gateway up?", "Which cron job keeps failing?", "Which model has the highest P95 latency?") and can be run on its own, run all at once, or piped through jq into your monitoring.
+
+**Who it is for**
+
+- Ops / SRE teams running OpenClaw: find out whether a production component is healthy
+- Incident responders: when a user reports a failure, work out within 5 minutes what broke, when it broke, and who touched it
+- Automation platforms: feed OpenClaw health metrics into your own monitoring
+
+**What it is not**
+
+- Not a repair tool: it tells you what went wrong but changes nothing
+- Not a replacement for the built-in `openclaw doctor` checks: it performs deeper incident diagnosis
+- Not a load-testing tool: it reads real runtime data and never generates load
+
+## Installation and quick start
+
+Requires Node 18+ and Python 3.8+.
 
 ```bash
-#
+# One-off run (downloaded to the npm cache on first use, available offline afterwards)
 npx openclaw-diag-cli
 
-#
+# Or install onto PATH
 npm install -g openclaw-diag-cli
-openclaw-diag
 ```
 
-Dependencies: Node 18+ and Python 3.8+.
-
-## Five-minute quick start
-
 ```bash
-#
-openclaw-diag
-
-# 2. Check that the environment is ready
+# Check the tool's own environment
 openclaw-diag doctor
 
-#
+# Run one specific diagnostic (check Gateway status)
 openclaw-diag gateway
 
-#
+# Run all state collectors in one go (a crash in one does not affect the others)
 openclaw-diag all
 
-#
+# Emit structured JSON (suitable for jq / monitoring)
 openclaw-diag gateway --json
+
+# Trace the full timeline of one user message from arrival to response
+openclaw-diag trace <session-uuid>
+```
+
+The output looks roughly like this (excerpt from `openclaw-diag gateway`):
+
+```
+── Module 4: Gateway status ──
+
+• Process / port
+PID 12847 (uptime 3d 2h), listening on :8080, HTTP /healthz → 200
+• Restarts (24h)
+No restart events
+• Model API
+amazon-bedrock reachable (DNS + HTTP + auth all OK)
+• WS lifecycle
+134 connections in the last 1h, average lifetime 47s, no abnormal closes
 ```
 
+With `--json` the output is strictly structured (same fields, same values), which makes it easy to pipe.
+
 ## Diagnostics
 
-
+```bash
+openclaw-diag list # show the full list
+```
 
-
+**Scanners (no arguments; scan the system's current state)**
 
 | Diagnostic | What it checks |
 |---|---|
 | `sys_health` | DNS / network / CPU / memory / disk / IO / processes / time sync |
-| `environment` | OpenClaw version consistency, Gateway
+| `environment` | OpenClaw version consistency, Gateway process environment variables |
 | `configuration` | Flattened `openclaw.json` (sensitive fields masked) |
 | `gateway` | Gateway process, port, 24h starts/stops, WS lifecycle, error codes |
 | `recent_errors` | Aggregated errors from application logs / journalctl / session tool calls |
@@ -53,74 +93,62 @@ openclaw-diag gateway --json
 | `plugin_diag` | Plugin state consistency, ERROR/WARN, hook anomalies, channels, external-dependency DNS |
 | `shell_history` | High-risk commands, openclaw commands, recent operations |
 
-
+**Object inspectors (require a session uuid)**
 
 | Diagnostic | What it checks |
 |---|---|
-| `trace <uuid>` |
-| `extract <uuid>` |
+| `trace <uuid>` | Full timeline of one user message from arrival to response |
+| `extract <uuid>` | Export session.jsonl in a readable format (all states: active / reset / deleted / backup) |
 
-
+**Other commands**
 
 | Command | Purpose |
 |---|---|
 | `openclaw-diag all` | Run all state collectors |
-| `openclaw-diag
-| `openclaw-diag
-| `openclaw-diag bundle <id>` | Bundle into a self-contained single-file .py |
+| `openclaw-diag doctor` | Check the Node / Python / openclaw-diag / OpenClaw environment |
+| `openclaw-diag bundle <id>` | Bundle into a single-file .py that runs with zero dependencies on offline machines |
 
-##
+## Recipes (jq pipelines)
 
 ```bash
-#
+# Which cron jobs have problems
 openclaw-diag cron_jobs --json | jq '.data.jobs[] | select(.status!="ok")'
 
-#
-openclaw-diag performance |
-
-# Which plugins logged ERROR today
-openclaw-diag plugin_diag --json | jq '.data.plugin_errors | to_entries[] | select(.value.error_count > 0)'
+# Top 3 models by P95 latency
+openclaw-diag performance --json | jq '.data.models | to_entries | sort_by(-.value.p95_s) | .[0:3]'
 
-#
-openclaw-diag all --json 2>/dev/null | jq -s '.' > report.json
-
-# Find events with stuck sessions
+# Find agents with stuck sessions
 openclaw-diag sessions --json | jq '.data.stuck_sessions'
 
-#
-openclaw-diag
-
-# Export a session in a readable format
-openclaw-diag extract <session-uuid> --summary
+# Aggregate all diagnostics into an NDJSON report (crashed modules still emit an error line, so nothing is lost)
+openclaw-diag all --json 2>/dev/null > report.ndjson
 ```
 
-##
+## Offline machines and configuration overrides
+
+If npm cannot be installed on the target machine, first `bundle` a single file on a machine with network access:
 
 ```bash
-#
+# On a machine with network access
 openclaw-diag bundle gateway > standalone-gateway.py
 
-#
+# Copy to the target machine (only Python 3.8+ required, nothing to install)
 scp standalone-gateway.py prod-server:/tmp/
 ssh prod-server "python3 /tmp/standalone-gateway.py --json"
 ```
 
-
-
-## Configuration overrides
-
-When diagnosing someone else's machine or a container, no code changes are needed:
+When diagnosing someone else's machine or a container, no code changes are needed; override the default paths with environment variables or flags:
 
 | Environment variable | Default | Description |
 |---|---|---|
 | `OPENCLAW_HOME` | `~/.openclaw` | OpenClaw home directory |
 | `OPENCLAW_CONFIG` | `$OPENCLAW_HOME/openclaw.json` | Configuration file |
 | `OPENCLAW_LOG_DIR` | `/tmp/openclaw` | Log directory |
-| `OPENCLAW_SESSIONS` | `$OPENCLAW_HOME/agents` | Session
+| `OPENCLAW_SESSIONS` | `$OPENCLAW_HOME/agents` | Session root directory |
 
-
+Or use flags for a single run: `--config /path/to/file --log-dir /path/to/logs`.
 
-##
+## Exit codes and design
 
 | rc | Meaning |
 |---|---|
@@ -128,23 +156,8 @@ ssh prod-server "python3 /tmp/standalone-gateway.py --json"
 | 1 | Diagnostic ran successfully but reported `status: "error"` (missing data source, etc.) |
 | 2 | Diagnostic crashed (isolated; does not affect `all`) |
 
-
-
-| | |
-|---|---|
-| **Read-only** | Never modifies files or restarts services |
-| **Zero dependencies** | Python 3.8+ standard library only |
-| **Fault isolation** | A single diagnostic crashing does not take down `all` |
-| **Reliable data** | Every field is traceable to its source |
-| **Composable** | Dual text + JSON output; stderr and stdout kept separate |
-
-Detailed design → [docs/DESIGN.md](docs/DESIGN.md) (axiom derivation, directory layout, extension guide)
+The design follows 7 axioms: observer-only, zero runtime dependencies, self-contained within the repository, dual-view output (text/JSON with the same fields and values), data provenance, explicit failure, and masking by default. See [docs/DESIGN.md](docs/DESIGN.md) for the detailed derivation.
 
 ## Feedback
 
-
-- Origin: split out and rewritten from the 4391-line `openclaw-diag.sh`
-
-## License
-
-MIT
+Issues: https://github.com/wujiaming88/openclaw-diag-cli/issues. License: MIT.
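The environment-variable defaults listed in the README's override table could be resolved roughly as follows. This is a minimal sketch under stated assumptions, not the package's own `ocdiag/paths.py`: the function name `resolve_paths` and the dictionary keys are hypothetical, and only the variable names and default values come from the table.

```python
import os
from pathlib import Path


def resolve_paths(env=None):
    """Resolve OpenClaw paths from environment variables.

    Defaults mirror the README table; the function name and the
    returned keys are illustrative, not part of the package.
    """
    env = os.environ if env is None else env
    # OPENCLAW_HOME defaults to ~/.openclaw
    home = Path(env.get("OPENCLAW_HOME") or Path.home() / ".openclaw")
    return {
        "home": home,
        # OPENCLAW_CONFIG defaults to $OPENCLAW_HOME/openclaw.json
        "config": Path(env.get("OPENCLAW_CONFIG") or home / "openclaw.json"),
        # OPENCLAW_LOG_DIR defaults to /tmp/openclaw
        "log_dir": Path(env.get("OPENCLAW_LOG_DIR") or "/tmp/openclaw"),
        # OPENCLAW_SESSIONS defaults to $OPENCLAW_HOME/agents
        "sessions": Path(env.get("OPENCLAW_SESSIONS") or home / "agents"),
    }
```

Passing an explicit mapping, for example `resolve_paths({"OPENCLAW_HOME": "/srv/oc"})`, makes it easy to check what a container or remote mount would resolve to without touching the real environment.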
package/bin/openclaw-diag.js
CHANGED
@@ -1,8 +1,12 @@
 #!/usr/bin/env node
 // openclaw-diag — Node entry shell.
-//
-//
-//
+//
+// All real logic lives in Python (ocdiag.dispatcher, ocdiag.doctor). This
+// shell exists for one reason only: npx-friendly install. It locates a
+// suitable python3, hands argv to the Python dispatcher, and forwards stdio
+// + exit code transparently. The single source of truth for the module
+// catalogue is `ocdiag/dispatcher.py`; the Node shell pulls the list from
+// `ocdiag list --json` instead of duplicating it (axiom #3).
 
 'use strict';
 
@@ -12,27 +16,12 @@ const fs = require('fs');
 
 const REPO_ROOT = path.resolve(__dirname, '..');
 const PKG = JSON.parse(fs.readFileSync(path.join(REPO_ROOT, 'package.json'), 'utf8'));
+const DISPATCHER = path.join(REPO_ROOT, 'bin', 'ocdiag');
 
 const PYTHON_CANDIDATES = process.platform === 'win32'
   ? ['python3', 'python', 'py']
   : ['python3', 'python'];
 
-// Keep these in sync with ocdiag/dispatcher.py.
-const STATE_COLLECTORS = [
-  'sys_health', 'environment', 'configuration', 'gateway', 'recent_errors',
-  'cron_jobs', 'performance', 'sessions', 'plugin_diag', 'shell_history',
-];
-const OBJECT_INSPECTORS = ['trace', 'extract'];
-const MODULE_IDS = new Set([...STATE_COLLECTORS, ...OBJECT_INSPECTORS]);
-
-const STATE_SCRIPTS = [
-  '01_sys_health.py', '02_environment.py', '03_configuration.py',
-  '04_gateway.py', '05_recent_errors.py', '06_cron_jobs.py',
-  '07_performance.py', '08_sessions.py', '09_plugin_diag.py',
-  '10_shell_history.py',
-];
-const OBJECT_SCRIPTS = ['oc_session_trace.py', 'oc_session_extract.py'];
-
 function findPython() {
   for (const cmd of PYTHON_CANDIDATES) {
     try {
@@ -49,55 +38,68 @@ function findPython() {
         }
       }
     } catch (_) {
-      // try next
+      // try next candidate
     }
   }
   return null;
 }
 
 function pythonNotFound() {
-  console.error('Error: Python 3.8+
-  console.error('
+  console.error('Error: Python 3.8+ is required but was not found.');
+  console.error(' Linux: sudo apt install python3 / sudo yum install python3');
+  console.error(' macOS: brew install python3 / or install from https://www.python.org/downloads/');
+  console.error(' Windows: https://www.python.org/downloads/ (remember to tick "Add to PATH")');
+  console.error(' Then run openclaw-diag again.');
   process.exit(127);
 }
 
+function fetchModules(pyCmd) {
+  const r = spawnSync(pyCmd, [DISPATCHER, 'list', '--json'], {
+    stdio: ['ignore', 'pipe', 'pipe'],
+  });
+  if (r.status !== 0) return null;
+  try {
+    return JSON.parse((r.stdout || '').toString());
+  } catch (_) {
+    return null;
+  }
+}
+
 function printVersion() {
   console.log(PKG.version);
 }
 
-function printHelp() {
+function printHelp(modules) {
+  const state = modules ? modules.state_collectors.map((m) => m.id) : [];
+  const obj = modules ? modules.object_inspectors.map((m) => m.id) : [];
   const lines = [
     'openclaw-diag - OpenClaw diagnostic toolbox',
     '',
-    '
+    'Usage:',
     ' openclaw-diag print banner + diagnostics catalogue',
-    ' openclaw-diag list list all diagnostics (grouped by type)',
     ' openclaw-diag <id> [args...] run a single diagnostic',
     ' openclaw-diag all [--skip a,b] run all state collectors',
-    ' openclaw-diag
-    ' openclaw-diag
-    ' openclaw-diag
+    ' openclaw-diag list list all diagnostics',
+    ' openclaw-diag doctor check Node / Python / environment',
+    ' openclaw-diag bundle <id> generate a single-file .py (for offline machines)',
     ' openclaw-diag --version print the version number',
     ' openclaw-diag --help this help',
     '',
-    '
-    ' ' +
+    'Scanners (no arguments):',
+    ' ' + (state.length ? state.join(' ') : '(cannot reach Python)'),
     '',
-    '
-    ' ' +
+    'Object inspectors (require a session uuid):',
+    ' ' + (obj.length ? obj.join(' ') : '(cannot reach Python)'),
     '',
-    '
+    'Common flags: --json (structured output) --no-color (disable colours) --unmask (no masking)',
   ];
   console.log(lines.join('\n'));
 }
 
-function
-  const
-  if (!py) pythonNotFound();
-  const dispatcher = path.join(REPO_ROOT, 'bin', 'ocdiag');
-  const child = spawn(py.cmd, [dispatcher, ...args], { stdio: 'inherit' });
+function spawnDispatcher(pyCmd, args) {
+  const child = spawn(pyCmd, [DISPATCHER, ...args], { stdio: 'inherit' });
   child.on('error', (err) => {
-    console.error(`Error: failed to spawn ${
+    console.error(`Error: failed to spawn ${pyCmd}: ${err.message}`);
     process.exit(1);
   });
   child.on('exit', (code, signal) => {
@@ -109,12 +111,16 @@ function runDispatcher(args) {
   });
 }
 
-function
-
-
-
+function runBundle(pyCmd, args) {
+  if (args.length === 0) {
+    console.error('Error: bundle requires a module id (e.g. `openclaw-diag bundle gateway`)');
+    process.exit(2);
+  }
+  const child = spawn(pyCmd, [path.join(REPO_ROOT, 'lib', 'bundle.py'), ...args], {
+    stdio: 'inherit',
+  });
   child.on('error', (err) => {
-    console.error(`Error: failed to spawn ${
+    console.error(`Error: failed to spawn ${pyCmd}: ${err.message}`);
     process.exit(1);
   });
   child.on('exit', (code, signal) => {
@@ -126,142 +132,21 @@ function runScript(scriptPath, args) {
   });
 }
 
-function
-
-
-
-  }
-  runScript(path.join(REPO_ROOT, 'lib', 'bundle.py'), args);
-}
-
-// ── doctor ──
-
-function nodeVersionOk() {
-  const m = process.versions.node.match(/^(\d+)\./);
-  return m && parseInt(m[1], 10) >= 18;
-}
-
-function checkOcdiagImport(pyCmd) {
-  const r = spawnSync(
-    pyCmd,
-    ['-c', 'import sys, os; sys.path.insert(0, os.environ["OCDIAG_REPO_ROOT"]); import ocdiag; print(ocdiag.__version__)'],
-    {
-      stdio: ['ignore', 'pipe', 'pipe'],
-      env: { ...process.env, OCDIAG_REPO_ROOT: REPO_ROOT },
-    },
-  );
-  if (r.status === 0) {
-    return { ok: true, version: (r.stdout || '').toString().trim() };
-  }
-  return { ok: false, error: ((r.stderr || '') + (r.stdout || '')).toString().trim() };
-}
-
-function checkDiagScripts(pyCmd) {
-  const failed = [];
-  const all = [
-    ...STATE_SCRIPTS.map((n) => ({ name: n, path: path.join(REPO_ROOT, 'diag', n) })),
-    ...OBJECT_SCRIPTS.map((n) => ({ name: n, path: path.join(REPO_ROOT, 'tools', n) })),
-  ];
-  for (const item of all) {
-    const r = spawnSync(pyCmd, [item.path, '--help'], {
-      stdio: ['ignore', 'pipe', 'pipe'],
-      timeout: 10000,
-    });
-    if (r.status !== 0) {
-      failed.push({ script: item.name, status: r.status, stderr: ((r.stderr || '').toString().trim()).slice(0, 200) });
-    }
-  }
-  return { failed, total: all.length };
-}
-
-function checkOpenclawConfig() {
-  const home = process.env.HOME || require('os').homedir();
-  const cfg = process.env.OPENCLAW_CONFIG
-    || path.join(process.env.OPENCLAW_HOME || path.join(home, '.openclaw'), 'openclaw.json');
-  return { path: cfg, exists: fs.existsSync(cfg) };
-}
-
-function runDoctor(args) {
-  const jsonMode = args.includes('--json');
-  const result = {
-    node: { version: process.versions.node, ok: nodeVersionOk() },
-    python: null,
-    ocdiag: null,
-    diag_scripts: null,
-    openclaw_config: null,
-  };
-
-  const py = findPython();
-  if (!py) {
-    result.python = { ok: false, error: 'Python 3.8+ not found in PATH' };
-    if (jsonMode) {
-      console.log(JSON.stringify(result, null, 2));
-    } else {
-      console.log(`✓ Node v${result.node.version}${result.node.ok ? '' : ' (need >= 18)'}`);
-      console.log('✗ Python 3.8+ not found in PATH');
-      console.log(' Install: https://www.python.org/downloads/ or apt install python3');
-    }
-    process.exit(1);
-  }
-  result.python = { ok: true, version: py.version, cmd: py.cmd };
-
-  const ocdiag = checkOcdiagImport(py.cmd);
-  result.ocdiag = ocdiag;
-
-  const { failed, total } = checkDiagScripts(py.cmd);
-  result.diag_scripts = {
-    ok: failed.length === 0,
-    total,
-    failed,
-  };
-
-  const cfg = checkOpenclawConfig();
-  result.openclaw_config = cfg;
-
-  if (jsonMode) {
-    console.log(JSON.stringify(result, null, 2));
-  } else {
-    console.log(`${result.node.ok ? '✓' : '✗'} Node v${result.node.version}${result.node.ok ? '' : ' (need >= 18)'}`);
-    console.log(`✓ Python ${py.version} (${py.cmd})`);
-    if (ocdiag.ok) {
-      console.log(`✓ ocdiag package importable (version ${ocdiag.version})`);
-    } else {
-      console.log('✗ ocdiag package not importable');
-      if (ocdiag.error) {
-        console.log(' ' + ocdiag.error.split('\n').slice(-3).join(' | '));
-      }
-    }
-    if (failed.length === 0) {
-      console.log(`✓ All ${total} diagnostics respond to --help`);
-    } else {
-      console.log(`✗ ${failed.length}/${total} diagnostics failed --help:`);
-      for (const f of failed) {
-        console.log(` ${f.script} (rc=${f.status})`);
-      }
-    }
-    if (cfg.exists) {
-      console.log(`✓ OpenClaw config present (${cfg.path})`);
-    } else {
-      console.log(`ℹ OpenClaw config not found (${cfg.path}) — diagnostics will run but report missing`);
-    }
-  }
-
-  const ok = result.node.ok && result.python.ok && ocdiag.ok && failed.length === 0;
-  process.exit(ok ? 0 : 1);
+function runDoctor(pyCmd, args) {
+  // Forward Node version into the Python doctor so it can include it in the
+  // report. ocdiag.doctor handles the actual logic; we just spawn it.
+  spawnDispatcher(pyCmd, ['doctor', '--node-version', process.versions.node, ...args]);
 }
 
-// ── main ──
-
 function main() {
   const argv = process.argv.slice(2);
+  const py = findPython();
 
   if (argv.length === 0) {
-    const py = findPython();
     if (!py) pythonNotFound();
     console.log(`openclaw-diag v${PKG.version} - OpenClaw diagnostic toolbox`);
     console.log('');
-
-    spawnSync(py.cmd, [dispatcher, 'list'], { stdio: 'inherit' });
+    spawnSync(py.cmd, [DISPATCHER, 'list'], { stdio: 'inherit' });
     console.log('');
     console.log('Common commands:');
     console.log(' openclaw-diag gateway run a single state collector');
@@ -279,20 +164,24 @@ function main() {
     process.exit(0);
   }
   if (head === '--help' || head === '-h') {
-
+    if (!py) pythonNotFound();
+    printHelp(fetchModules(py.cmd));
     process.exit(0);
   }
+
+  if (!py) pythonNotFound();
+
   if (head === 'doctor') {
-    runDoctor(argv.slice(1));
+    runDoctor(py.cmd, argv.slice(1));
     return;
   }
   if (head === 'bundle') {
-    runBundle(argv.slice(1));
+    runBundle(py.cmd, argv.slice(1));
     return;
   }
 
-  // Pass
-
+  // Pass everything else (flat ids, `all`, `list`, `run` alias, unknown) to dispatcher.
+  spawnDispatcher(py.cmd, argv);
 }
 
 main();
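The header comment of the Node shell describes three steps: locate a suitable Python, hand argv to the dispatcher, and forward the exit code. The diff does not show the body of `findPython`, so the following Python sketch is only an illustration of that flow under assumptions: `find_python` and `forward` are hypothetical names, and the version probe is one plausible way to enforce the 3.8+ requirement, not necessarily what the package does.

```python
import shutil
import subprocess
import sys


def find_python(candidates=("python3", "python"), minimum=(3, 8)):
    """Return the first candidate on PATH reporting version >= minimum, else None."""
    probe = "import sys; print('%d.%d' % sys.version_info[:2])"
    for cmd in candidates:
        exe = shutil.which(cmd)
        if exe is None:
            continue  # not on PATH, try the next candidate
        try:
            r = subprocess.run([exe, "-c", probe], capture_output=True,
                               text=True, timeout=10)
        except (OSError, subprocess.TimeoutExpired):
            continue  # could not run it, try the next candidate
        if r.returncode != 0:
            continue
        try:
            major, minor = (int(x) for x in r.stdout.strip().split("."))
        except ValueError:
            continue  # unexpected probe output
        if (major, minor) >= minimum:
            return exe
    return None


def forward(exe, argv):
    """Run exe with argv, inheriting stdio, and return the child's exit code."""
    return subprocess.run([exe, *argv]).returncode
```

A caller would typically end with `sys.exit(forward(exe, sys.argv[1:]))`, so that the README's exit-code contract (0, 1, 2) reaches the invoking shell unchanged.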
package/diag/01_sys_health.py
CHANGED
@@ -7,7 +7,6 @@ import json
 import os
 import re
 import shutil
-import socket
 import subprocess
 import sys
 import time
@@ -412,7 +411,6 @@ def section_time_sync(out: output.Output) -> None:
 def main() -> int:
     parser = cli.build_common_parser(
         description="Module 1: system health check",
-        prog="01_sys_health",
     )
     args = parser.parse_args()
 
package/diag/02_environment.py
CHANGED
@@ -7,7 +7,6 @@ import json
 import os
 import re
 import shlex
-import shutil
 import subprocess
 import sys
 from pathlib import Path
@@ -16,7 +15,7 @@ from typing import Optional
 sys.path.insert(0, str(Path(__file__).resolve().parent.parent))
 
 from ocdiag import cli, output, paths
-from ocdiag.sensitive import safe_val
+from ocdiag.sensitive import safe_val, sanitize_text
 
 
 def run(cmd, timeout=5):
@@ -110,7 +109,6 @@ def parse_proc_environ(pid: str) -> Optional[list]:
 def main() -> int:
     parser = cli.build_common_parser(
         description="Module 2: collect the basic OpenClaw environment",
-        prog="02_environment",
     )
     args = parser.parse_args()
     out = output.init("environment", json_mode=args.json, no_color=args.no_color)
@@ -123,6 +121,12 @@ def main() -> int:
         out.item("OpenClaw version: could not be determined")
         out.evidence("openclaw --version", "command not found or no output")
     out.set_data("oc_version", oc_version)
+    if not oc_version:
+        out.set_data("oc_version_status", {
+            "found": False,
+            "reason": "command_not_found",
+            "checked": "openclaw --version + pnpm/global node_modules",
+        })
 
     service_file = paths.SERVICE_FILE
     svc_version = None
@@ -153,6 +157,10 @@ def main() -> int:
         out.item("Node.js: not found")
         out.evidence("node --version", "command not found")
     out.set_data("node_version", node_ver)
+    if not node_ver:
+        out.set_data("node_version_status", {
+            "found": False, "reason": "command_not_found", "checked": "node --version",
+        })
 
     rc, stdout, _ = run(["free", "-m"])
     mem_avail = ""
@@ -166,6 +174,10 @@ def main() -> int:
     if mem_avail:
         out.item(f"Available memory: {mem_avail} MB")
     out.set_data("memory_available_mb", mem_avail)
+    if not mem_avail:
+        out.set_data("memory_status", {
+            "found": False, "reason": "free_unavailable", "checked": "free -m",
+        })
 
     rc, stdout, _ = run(["df", "-m", paths.OPENCLAW_HOME])
     disk_avail = ""
@@ -178,6 +190,11 @@ def main() -> int:
     if disk_avail:
         out.item(f"Disk available ({paths.OPENCLAW_HOME}): {disk_avail} MB")
     out.set_data("disk_available_mb", disk_avail)
+    if not disk_avail:
+        out.set_data("disk_status", {
+            "found": False, "reason": "df_unavailable",
+            "checked": f"df -m {paths.OPENCLAW_HOME}",
+        })
 
     gw_status = gateway_systemctl_status()
     if gw_status:
@@ -245,8 +262,16 @@ def main() -> int:
         out.set_data("gateway_env", [{"key": k, "value": v} for k, v in env_pairs])
     elif pid:
         out.item(f"Cannot read /proc/{pid}/environ (insufficient permissions?)")
+        out.set_data("gateway_env_status", {
+            "found": False, "reason": "proc_unreadable",
+            "checked": f"/proc/{pid}/environ",
+        })
     else:
         out.item("Gateway process not running, skipping")
+        out.set_data("gateway_env_status", {
+            "found": False, "reason": "process_not_running",
+            "checked": "pgrep -f openclaw.*gateway",
+        })
 
     if os.path.isfile(paths.SERVICE_ENV_FILE):
         out.line("")
@@ -281,9 +306,10 @@ def main() -> int:
         try:
             with open(service_file) as f:
                 for line in f:
-
-
-
+                    raw = line.rstrip("\n")
+                    out.item(raw if args.unmask else sanitize_text(raw))
+        except OSError as e:
+            out.item(f"Read failed: {e}")
 
     return out.done()
 
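The hunks above repeat one pattern: whenever a probe yields nothing, the module records an explicit `*_status` entry with `found`, `reason`, and `checked`, so a missing value in the JSON output can be told apart from a value that was never collected. A minimal sketch of that pattern as a reusable helper follows. The helper name `record` and its `status_key` parameter are hypothetical; the diff itself calls `out.set_data` inline each time.

```python
def record(data, key, value, *, reason, checked, status_key=None):
    """Store value under key; if it is empty, add an explicit status entry.

    `data` stands in for the module's JSON payload. `status_key` defaults to
    "<key>_status", but can be overridden, since the diff uses names such as
    "memory_status" for the key "memory_available_mb".
    """
    data[key] = value
    if not value:
        data[status_key or f"{key}_status"] = {
            "found": False,
            "reason": reason,    # machine-readable cause, e.g. "command_not_found"
            "checked": checked,  # what was probed, e.g. "node --version"
        }
    return data
```

For example, `record({}, "node_version", None, reason="command_not_found", checked="node --version")` yields both the empty value and a `node_version_status` entry explaining why it is empty.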