direxio-deployer 0.1.2 → 0.1.3

package/AGENTS.md CHANGED
@@ -42,7 +42,7 @@ If a change writes a path into `state.json`, `credentials.json`, `env`, `cc-conn
  ## Direxio Connect Wiring
 
  - S5/S6 must fail closed when `agent_room_id` is missing or uses a legacy pseudo id such as `!agent:<domain>`.
- - S6 must create a Matrix session through `agent.matrix_session.create` and require `@agent:<server>` for the bridge. Returning `@owner:<server>` is a server-side compatibility failure.
+ - S6 must create a Matrix session through `agent.matrix_session.create` using `agent_token`, not owner `access_token`, and require `@agent:<server>` for the bridge. Returning `@owner:<server>` is a server-side compatibility failure.
  - The generated cc-connect config must contain one Matrix platform and must restrict sync/replies to the real `agent_room_id`.
  - The generated agent config must preserve the selected connect agent type and optional agent-specific TOML. Some providers require more than `cmd`; for example `reasonix` needs `serve_url`, `tmux` needs `session`, and generic `acp` may need command/args.
  - `DIREXIO_AGENT_INSTALL=auto` may install/start `direxio-connect`; `recommend` must only write files and print commands.
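The fail-closed rule for `agent_room_id` in the wiring list above can be illustrated with a small shell check. This is a minimal sketch, not the deployer's actual code: the helper name `is_real_agent_room` is hypothetical, and it assumes a real room id has the shape `!<opaque>:<server>` while the legacy pseudo id uses the literal localpart `agent`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): returns 0 only for a
# plausible real room id, so callers can fail closed on anything else.
is_real_agent_room() {
  local room_id=${1:-}
  case "$room_id" in
    "")         return 1 ;;  # missing
    '!agent:'*) return 1 ;;  # legacy pseudo id such as !agent:<domain>
    '!'?*:?*)   return 0 ;;  # assumed shape: !<opaque>:<server>
    *)          return 1 ;;  # anything else is rejected
  esac
}

for candidate in '' '!agent:example.test' '!AbCdEf123:example.test'; do
  if is_real_agent_room "$candidate"; then
    echo "accept: $candidate"
  else
    echo "reject: ${candidate:-<empty>}"
  fi
done
```

The default branch rejects, so an unexpected value is treated the same way as a missing one.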
package/README.md CHANGED
@@ -23,12 +23,14 @@ Install the deployer skill from npm, then place it into the current agent runtim
 
  The GitHub repository keeps tests for maintainers and CI, but the published npm package and installed skill copy exclude `tests/` to keep user installs small.
 
- If you want Codex to install and deploy in one instruction, do not say "install skills <GitHub URL>". That triggers GitHub skill installation and can place the skill in the global `~/.codex` directory. Say this instead:
+ If you want Codex to install and deploy in one instruction, do not say "install skills <GitHub URL>". That triggers GitHub skill installation and can place the skill in the global `~/.codex` directory. Use a short instruction that tells the agent to read this README's npm/project-local install rule:
 
  ```text
- Install the Direxio deployer skill in the current project via npm: run npm install -g direxio-deployer@latest, then run direxio-deployer skill install --agent codex --scope project --project .; do not install the skill from GitHub and do not install it globally under ~/.codex. After installation, use that skill to deploy Direxio with domain __DOMAIN__.
+ Follow the direxio-deployer README npm/project-local install rule, then deploy Direxio with domain __DOMAIN__.
  ```
 
+ After reading this instruction, the agent should run the npm install commands below; it should not switch to a GitHub skill installer.
+
  POSIX shells:
 
  ```bash
@@ -59,7 +61,7 @@ direxio-deployer skill install --agent codex --scope global
  The installer writes `.direxio-skill-install.json` into the target directory and refuses to overwrite unmanaged existing content unless `--force` is provided. To pin a version, install that package version first:
 
  ```bash
- npm install -g direxio-deployer@0.1.2
+ npm install -g direxio-deployer@0.1.3
  direxio-deployer skill update --agent codex --scope project --project .
  ```
 
package/README_zh.md CHANGED
@@ -21,12 +21,14 @@
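 
  The GitHub repository keeps tests for maintenance and CI, but the package published to npm and the copy installed into the agent skill directory do not include `tests/`, to reduce user install size.
 
- If you want Codex to install and start deploying in one sentence, do not say "install skills <GitHub link>". That triggers the GitHub skill installer and tends to install into the global `~/.codex`. Say this instead:
+ If you want Codex to install and start deploying in one sentence, do not say "install skills <GitHub link>". That triggers the GitHub skill installer and tends to install into the global `~/.codex`. Use only a short sentence that has the agent first read the npm/project-local install rule in this README:
 
  ```text
- Install the Direxio deployer skill in the current project with npm: run npm install -g direxio-deployer@latest, then run direxio-deployer skill install --agent codex --scope project --project .; do not install the skill from GitHub and do not install it into the global ~/.codex. After installation, use that skill to deploy the Direxio service with domain __DOMAIN__.
+ Install the skill following the direxio-deployer README npm/project-local rule, then deploy Direxio with domain __DOMAIN__.
  ```
 
+ After reading this sentence, the agent should run the npm install commands below; do not switch to the GitHub skill installer.
+
  POSIX shell:
 
  ```bash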
@@ -57,7 +59,7 @@ direxio-deployer skill install --agent codex --scope global
  The installer writes `.direxio-skill-install.json` into the target directory and refuses to overwrite an existing directory that lacks that manifest unless `--force` is passed explicitly. To pin a version, first install the specific npm version:
 
  ```bash
- npm install -g direxio-deployer@0.1.2
+ npm install -g direxio-deployer@0.1.3
  direxio-deployer skill update --agent codex --scope project --project .
  ```
 
package/SKILL.md CHANGED
@@ -355,7 +355,7 @@ The local MCP tool surface is `direxio-mcp`, installed from `direxio-mcp@latest`
 
  `DIREXIO_AGENT_PLATFORM` describes the host runtime following the skill, while `DIREXIO_CC_CONNECT_AGENT` describes the local agent backend that `direxio-connect` should launch. Host runtimes such as Hermes or OpenClaw are not native cc-connect backend types; S6 maps them to the generic ACP backend by default and records `cc_connect_agent=acp`. Override `DIREXIO_CC_CONNECT_AGENT` only when the operator intentionally wants a different local backend.
 
- `DIREXIO_AGENT_INSTALL` may be `skip`, `recommend`, or `auto`. Only `auto` attempts to run `npm install -g direxio-connent@latest` and `direxio-connect daemon install --config ~/.direxio/nodes/<service_id>/cc-connect/config.toml --service-name <service_id> --force`; the default `recommend` records and prints the command without mutating local daemon state. An automatic install is reported as installed only when `direxio-connect daemon status --service-name <service_id>` returns `Status: Running` and recent daemon logs do not show ACP session initialization failure; otherwise S6 records `agent_install_status=install_failed`.
+ `DIREXIO_AGENT_INSTALL` may be `skip`, `recommend`, or `auto`. Only `auto` attempts to run `npm install -g direxio-connent@latest` and `direxio-connect daemon install --config ~/.direxio/nodes/<service_id>/cc-connect/config.toml --service-name <service_id> --force`; the default `recommend` records and prints the command without mutating local daemon state. An automatic install is reported as installed only when `direxio-connect daemon status --service-name <service_id>` returns `Status: Running` and recent daemon logs do not show ACP session initialization failure; otherwise S6 records `agent_install_status=install_failed`. S6 calls `agent.matrix_session.create` with `agent_token` and retries transient HTTP 000/5xx responses before failing, because the Matrix action can become reachable a few seconds after `/healthz`.
 
  Voice input is supported through `direxio-connect` speech-to-text. When `DIREXIO_SPEECH_API_KEY` or a provider-specific key such as `DIREXIO_SPEECH_QWEN_API_KEY`, `OPENAI_API_KEY`, `GROQ_API_KEY`, `DASHSCOPE_API_KEY`, `GEMINI_API_KEY`, or `GOOGLE_API_KEY` is present, S6 writes `[speech] enabled = true` into the generated config. Without an STT key, do not claim voice input is enabled.
 
@@ -398,9 +398,12 @@ DOMAIN=<DOMAIN> bash scripts/orchestrate.sh verify mcp_tools
  DOMAIN=<DOMAIN> bash scripts/orchestrate.sh verify mcp_smoke
  ```
 
- Use `verify runtime` as the normal aggregate check. It runs the service-scoped
- connect daemon check plus MCP doctor, MCP `tools/list`, and read-only backend
- smoke, then writes `runtime_checks.summary`. The individual commands are useful
+ Use `verify runtime` as the normal aggregate check. It runs MCP doctor, MCP
+ `tools/list`, and read-only backend smoke. It also runs the service-scoped
+ connect daemon check when the daemon was expected to be installed; when S6
+ recorded `agent_install_status=recommend` or `skip`, the aggregate marks
+ `runtime_checks.connect_daemon.status=manual_pending` and does not fail the
+ summary for that explicit operator action. The individual commands are useful
  when diagnosing one layer. These commands write `runtime_checks.connect_daemon`,
  `runtime_checks.mcp_doctor`, `runtime_checks.mcp_tools`, and
  `runtime_checks.mcp_smoke` into `state.json` and the operation report.
@@ -503,7 +506,8 @@ NS nameservers before authoritative DNS can resolve. Never use temporary
  **Credential freshness:** The synced `password` and owner `access_token`
  are one-time/volatile values. User login or token exchange can reset them
  on the server. Before reporting the eight-digit app initialization code or using an owner
- `access_token` for API calls, rerun the credential sync path or pull the
+ `access_token` for owner API calls, or using `agent_token` for
+ `agent.matrix_session.create`, rerun the credential sync path or pull the
  latest `/opt/p2p/bootstrap.json` from the server; do not reuse values from
  old chat output, old `state.json`, or stale local `credentials.json`.
  **Runtime detection note:** S6 checks active-process signals before stale
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "direxio-deployer",
- "version": "0.1.2",
+ "version": "0.1.3",
  "description": "Versioned Direxio deployer agent skill and portable deployment orchestration tools.",
  "type": "module",
  "bin": {
@@ -18,7 +18,7 @@
  "scripts/"
  ],
  "scripts": {
- "test": "bash tests/npm_skill_distribution_test.sh && bash tests/skill_structure_test.sh && bash tests/s6_wire_local_test.sh && bash tests/render_userdata_remote_nodes_test.sh"
+ "test": "bash tests/npm_skill_distribution_test.sh && bash tests/skill_structure_test.sh && bash tests/private_file_permissions_test.sh && bash tests/s6_wire_local_test.sh && bash tests/render_userdata_remote_nodes_test.sh"
  },
  "engines": {
  "node": ">=18"
@@ -6,10 +6,10 @@ Use this file when installing or updating this skill and when reviewing S6 local
 
  Prefer a project-local npm-managed install when a project or workspace exists. Install the versioned package, then let the CLI copy the skill bundle into the runtime-specific target:
 
- Do not use a generic "install skills <GitHub URL>" instruction for normal users. That can invoke a host's GitHub skill installer and place this repository under the global runtime directory before the npm-managed installer runs. For Codex, the project-local instruction is:
+ Do not use a generic "install skills <GitHub URL>" instruction for normal users. That can invoke a host's GitHub skill installer and place this repository under the global runtime directory before the npm-managed installer runs. A short user prompt should point the agent back to this npm/project-local rule:
 
  ```text
- Install the Direxio deployer skill in the current project via npm: run npm install -g direxio-deployer@latest, then run direxio-deployer skill install --agent codex --scope project --project .; do not install the skill from GitHub and do not install it globally under ~/.codex.
+ Follow the direxio-deployer README npm/project-local install rule, then deploy Direxio with domain __DOMAIN__.
  ```
 
  POSIX shells:
@@ -145,7 +145,10 @@ When the user or runtime evidence confirms a manual product gate, write it back
  to state before regenerating the report. Connect daemon status is a
  service-scoped local bridge check, MCP doctor is a non-polluting runtime check,
  MCP tools is stdio `tools/list` discovery, and MCP smoke is a read-only backend
- call. They are not the full runtime product gate:
+ call. In the default `DIREXIO_AGENT_INSTALL=recommend` path, `verify runtime`
+ records `connect_daemon=manual_pending` instead of failing the aggregate,
+ because daemon installation is an explicit operator action. These checks are
+ not the full runtime product gate:
 
  ```bash
  DOMAIN=__DOMAIN__ bash scripts/orchestrate.sh verify runtime
@@ -74,7 +74,7 @@ DIREXIO_CREDENTIALS_FILE=~/.direxio/nodes/<service_id>/credentials.json direxio-
 
  ## cc-connect Matrix Bridge
 
- S6 calls `agent.matrix_session.create` with the owner token. Current message-server builds must return a Matrix session for `@agent:<server>`, not for `@owner:<server>`. The resulting session is stored at:
+ S6 calls `agent.matrix_session.create` with the backend `agent_token`, not the owner `access_token`. Current message-server builds must return a Matrix session for `@agent:<server>`, not for `@owner:<server>`. S6 retries transient HTTP 000/5xx responses before failing, because the Matrix action can become reachable shortly after `/healthz`. The resulting session is stored at:
 
  ```text
  ~/.direxio/nodes/<service_id>/cc-connect/matrix-session.json
@@ -137,7 +137,7 @@ Defaults:
  - `DIREXIO_CC_CONNECT_AGENT_OPTIONS_TOML` appends agent-specific options under `[projects.agent.options]`; use it for agents with required non-command options such as `reasonix` (`serve_url`) or `tmux` (`session`).
  - OpenClaw Gateway ACP auto-detects the Gateway from `~/.openclaw/openclaw.json` when `DIREXIO_OPENCLAW_ACP_URL` and `DIREXIO_OPENCLAW_ACP_TOKEN_FILE` are unset. It uses `DIREXIO_OPENCLAW_ACP_SESSION` when provided, otherwise `agent:main:main`. To force explicit Gateway settings, complete OpenClaw pairing first and set all three real values: `DIREXIO_OPENCLAW_ACP_URL`, `DIREXIO_OPENCLAW_ACP_TOKEN_FILE`, and `DIREXIO_OPENCLAW_ACP_SESSION`.
  - `DIREXIO_OPENCLAW_ACP_ARGS_TOML` replaces the generated OpenClaw ACP args array, for example `["acp", "--url", "wss://gateway.example.test:18789", "--token-file", "$HOME/.openclaw/gateway.token", "--session", "agent:main:main"]`. `DIREXIO_HERMES_ACP_ARGS_TOML` supplies the child Hermes args; S6 prefixes `["hermes-acp-adapter", "--", "<hermes-command>"]` automatically.
- - `DIREXIO_AGENT_INSTALL=recommend` prints and records the command only.
+ - `DIREXIO_AGENT_INSTALL=recommend` prints and records the command only. `verify runtime` records the daemon check as `manual_pending` in this mode and still verifies MCP doctor/tools/smoke.
  - `DIREXIO_AGENT_INSTALL=auto` runs `npm install -g direxio-connent@latest` and then installs the `direxio-connect` daemon with the generated config and `--service-name <service_id>`. It is recorded as installed only when `direxio-connect daemon status --service-name <service_id>` reports `Status: Running` and recent daemon logs do not show ACP session initialization failure; otherwise S6 records `agent_install_status=install_failed`.
  - `DIREXIO_AGENT_INSTALL_MODE=recommended` maps every supported local runtime to `cc-connect`.
  - Speech defaults to `DIREXIO_SPEECH_PROVIDER=openai` and `DIREXIO_SPEECH_LANGUAGE=zh`. Provider-specific keys are also accepted: `DIREXIO_SPEECH_OPENAI_API_KEY` or `OPENAI_API_KEY`, `DIREXIO_SPEECH_GROQ_API_KEY` or `GROQ_API_KEY`, `DIREXIO_SPEECH_QWEN_API_KEY` or `DASHSCOPE_API_KEY`, and `DIREXIO_SPEECH_GEMINI_API_KEY`, `GEMINI_API_KEY`, or `GOOGLE_API_KEY`. Set `DIREXIO_SPEECH_ENABLED=false` to suppress speech config generation even when a key exists.
@@ -9,8 +9,8 @@
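  - **S2_DOMAIN**: Confirm that the formal long-term domain and the Matrix `server_name` are irreversibly bound.
  - **S3_PROVISION**: Create the EC2 instance, key pair, security group, and Elastic IP; depending on the DNS mode, handle the Route53 hosted zone/A record or wait for external DNS; render cloud-init. Default image: `MESSAGE_SERVER_IMAGE=direxio/message-server:latest`.
  - **S4_BOOTSTRAP_STACK**: Wait for cloud-init to install Docker and start `postgres:18 + message-server + caddy + coturn`, polling `https://<domain>/healthz`.
- - **S5_INIT_TOKENS**: Read over SSH the `/opt/p2p/bootstrap.json` generated by the cloud-side `init-tokens.sh`, and normalize `password`, `access_token`, `agent_token`, and the real `agent_room_id`. The cloud-side script first calls `portal.bootstrap`, and when the server does not return a room it uses the Matrix Client API to create the real agent room and write it back. `password` and the owner `access_token` are treated as one-time/volatile credentials; before logging in or calling an API with a token, you must pull the latest `/opt/p2p/bootstrap.json` from the server again, and must not reuse old output.
- - **S6_WIRE_LOCAL**: Write local credentials, create the `@agent:<server>` Matrix session, write `cc-connect/config.toml`, write the MCP config snippet, and install or recommend `direxio-connect` according to policy.
+ - **S5_INIT_TOKENS**: Read over SSH the `/opt/p2p/bootstrap.json` generated by the cloud-side `init-tokens.sh`, and normalize `password`, `access_token`, `agent_token`, and the real `agent_room_id`. The cloud-side script first calls `portal.bootstrap`, creates the `@agent:<server>` Matrix session with `agent_token`, then uses the owner Matrix token to create the room and invite/join the agent, and finally writes back the real agent room. `password`, the owner `access_token`, and `agent_token` are treated as one-time/volatile credentials; before logging in or calling an API with a token, you must pull the latest `/opt/p2p/bootstrap.json` from the server again, and must not reuse old output.
+ - **S6_WIRE_LOCAL**: Write local credentials, create the `@agent:<server>` Matrix session with `agent_token`, write `cc-connect/config.toml`, write the MCP config snippet, and install or recommend `direxio-connect` according to policy.
  - **S7_VERIFY_E2E**: Verify `/_p2p`, Matrix versions, well-known, owner.json+CORS, and TURN.
 
  ## Cloud compose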
@@ -2,7 +2,7 @@
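 
  After every redeploy or data-volume wipe, `password`, the owner `access_token`, `agent_token`, and the cc-connect Matrix session all change. State-machine phase S6 backfills them automatically; for manual recovery, check the items here.
 
- The `password` and owner `access_token` synced from the server must be treated as one-time/volatile credentials. `password` is the backend field name; when shown to users it must be called the eight-digit App initialization code. After the user completes initialization or token exchange, the server may reset these values immediately; any operation that needs to fetch the initialization code again, or needs to call `/_p2p/command`, the Matrix Client API, or similar interfaces with `access_token`, must first pull the latest `/opt/p2p/bootstrap.json` from the server and then update the local `credentials.json`. Do not reuse the password/access token from chat history, an old `state.json`, an old `credentials.json`, or historical deployment output.
+ The `password` and owner `access_token` synced from the server must be treated as one-time/volatile credentials. `password` is the backend field name; when shown to users it must be called the eight-digit App initialization code. After the user completes initialization or token exchange, the server may reset these values immediately; any operation that needs to fetch the initialization code again, or needs to call owner-identity APIs/the Matrix Client API with `access_token`, or needs to call `agent.matrix_session.create` with `agent_token`, must first pull the latest `/opt/p2p/bootstrap.json` from the server and then update the local `credentials.json`. Do not reuse the password/access token from chat history, an old `state.json`, an old `credentials.json`, or historical deployment output.
 
  After `scripts/update.sh` or `scripts/reset-app-data.sh` runs on an existing node, old local evidence must also be invalidated. The scripts clear the old `password`, `access_token`, `agent_token`, `agent_room_id`, `user_confirmations`, and `runtime_checks`, mark `agent_install_status` as `refresh_pending`, stop the corresponding local bridge only when `WorkDir` matches the current service (stops only the matching service-scoped direxio-connect daemon), and then mark S4-S7 back to pending. This way, old user confirmations, MCP discovery, Agent runtime probes, or old bridge install status cannot be misapplied to the updated/reset node. `status` shows `Local refresh:` as a reminder that update/reset has cleared the old credentials, user confirmations, runtime checks, and bridge install proof; the next step must rerun the deployment workflow to refresh S4-S7, local credentials, MCP snippets, and runtime checks. Afterwards, `scripts/orchestrate.sh` must be resumed so that S5/S6/S7 and `verify runtime` write current evidence again.
 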
@@ -109,7 +109,7 @@ Use the Git Bash `$HOME` path for files generated by the deployer. If running `d
 
  ## EC2 SSH Key Paths
 
- SSH key files are written with Windows-compatible paths such as `C:/Users/.../.direxio/deploy/p2p-*.pem`. The SSH command printed in the delivery summary works in Git Bash. If using PowerShell or cmd, convert forward slashes to backslashes.
+ SSH key files are written with Windows-compatible paths such as `C:/Users/.../.direxio/deploy/p2p-*.pem`. The deployer removes inherited Windows ACLs where possible so OpenSSH does not reject the private key as too open. The SSH command printed in the delivery summary works in Git Bash. If using PowerShell or cmd, convert forward slashes to backslashes.
 
  ## Verifying Deployment
 
@@ -149,7 +149,7 @@ PY
  }
 
  ensure_agent_room() {
- local owner_token agent_user session room_resp join_resp agent_token room_id room_path
+ local owner_token agent_auth_token agent_user session room_resp join_resp matrix_agent_token room_id room_path
  if copy_bootstrap_file && bootstrap_has_real_agent_room "$BOOTSTRAP_FILE"; then
  log "agent_room_id is already present."
  return 0
@@ -160,15 +160,20 @@ ensure_agent_room() {
  log "FATAL: access_token is missing; cannot create agent room"
  return 1
  fi
+ agent_auth_token=$(json_string agent_token "$BOOTSTRAP_FILE")
+ if [ -z "$agent_auth_token" ]; then
+ log "FATAL: agent_token is missing; cannot create agent Matrix session"
+ return 1
+ fi
  agent_user="@agent:${DOMAIN}"
  session=$(mktemp)
- if ! container_post_json "/_p2p/command" '{"action":"agent.matrix_session.create","params":{"device_id":"DIREXIO_DEPLOY_BOOTSTRAP"}}' "$owner_token" > "$session" 2>/dev/null; then
+ if ! container_post_json "/_p2p/command" '{"action":"agent.matrix_session.create","params":{"device_id":"DIREXIO_DEPLOY_BOOTSTRAP"}}' "$agent_auth_token" > "$session" 2>/dev/null; then
  log "FATAL: agent.matrix_session.create failed: $(head -c 160 "$session" 2>/dev/null)"
  rm -f "$session"
  return 1
  fi
- agent_token=$(json_string access_token "$session")
- if [ -z "$agent_token" ]; then
+ matrix_agent_token=$(json_string access_token "$session")
+ if [ -z "$matrix_agent_token" ]; then
  log "FATAL: agent.matrix_session.create did not return access_token: $(head -c 160 "$session" 2>/dev/null)"
  rm -f "$session"
  return 1
@@ -190,7 +195,7 @@ ensure_agent_room() {
 
  room_path=$(matrix_room_path "$room_id")
  join_resp=$(mktemp)
- if ! container_post_json "/_matrix/client/v3/rooms/${room_path}/join" '{}' "$agent_token" > "$join_resp" 2>/dev/null; then
+ if ! container_post_json "/_matrix/client/v3/rooms/${room_path}/join" '{}' "$matrix_agent_token" > "$join_resp" 2>/dev/null; then
  log "FATAL: agent join failed for ${room_id}: $(head -c 160 "$join_resp" 2>/dev/null)"
  rm -f "$join_resp"
  return 1
@@ -46,6 +46,28 @@ is_yes() {
  esac
  }
 
+ restrict_private_file() {
+ local file=$1 uname_s win_file user user_domain
+ chmod 600 "$file" 2>/dev/null || true
+ uname_s=$(uname -s 2>/dev/null || printf unknown)
+ case "$uname_s" in
+ MINGW*|MSYS*|CYGWIN*)
+ command -v icacls >/dev/null 2>&1 || return 0
+ win_file=$file
+ if command -v cygpath >/dev/null 2>&1; then
+ win_file=$(cygpath -w "$file")
+ fi
+ user=$(cmd.exe /c whoami 2>/dev/null | tr -d '\r' | tail -n 1 || true)
+ user_domain=${USERDOMAIN:-}
+ icacls "$win_file" /inheritance:r >/dev/null 2>&1 || true
+ icacls "$win_file" /remove:g \
+ "Users" "Authenticated Users" "Everyone" "CodexSandboxUsers" \
+ "${user_domain}\\CodexSandboxUsers" >/dev/null 2>&1 || true
+ [ -n "$user" ] && icacls "$win_file" /grant:r "$user:R" >/dev/null 2>&1 || true
+ ;;
+ esac
+ }
+
  # Initialize state.json for a new deployment.
  state_init() {
  mkdir -p "$P2P_WORKDIR"
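The POSIX half of the `restrict_private_file` idea in the hunk above can be tried in isolation. The following is only a sketch under stated assumptions: `write_private_file` is a hypothetical helper that does not exist in the package, the restrictive `umask` is an addition of this example rather than something the diff does, and the Windows `icacls` branch is deliberately left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the package): create a file that is
# owner-only from the moment it exists, then chmod 600 as a second guard.
write_private_file() {
  local file=$1 content=$2
  (
    umask 077                          # subshell keeps the umask change local
    printf '%s\n' "$content" > "$file"
  )
  chmod 600 "$file"                    # also covers a pre-existing file
}

demo_dir=$(mktemp -d)
write_private_file "$demo_dir/demo.pem" "not-a-real-key"
# Print the mode string of the demo file (first ten characters of ls -l).
ls -l "$demo_dir/demo.pem" | cut -c1-10
```

The demo directory is left in place so the result can be inspected; remove it with `rm -rf "$demo_dir"` when done.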
@@ -966,15 +966,35 @@ runtime_check_status() {
  json_get "$STATE_JSON" "runtime_checks.$check.status" "not_run"
  }
 
+ runtime_status_counts_as_failure() {
+ local status=$1
+ case "$status" in
+ passed|manual_pending|skipped) return 1 ;;
+ *) return 0 ;;
+ esac
+ }
+
  cmd_verify_runtime() {
  [ -f "$STATE_JSON" ] || {
  warn "state.json not found: $STATE_JSON"
  return 1
  }
 
- local rc=0 failed_count=0 connect_status doctor_status tools_status smoke_status status
+ local rc=0 failed_count=0 connect_status doctor_status tools_status smoke_status status install_status install_policy service_name
 
- cmd_verify_connect_daemon >/dev/null || rc=1
+ install_status=$(json_get "$STATE_JSON" agent_install_status)
+ install_policy=$(json_get "$STATE_JSON" agent_install_policy)
+ service_name=$(json_get "$STATE_JSON" agent_service_id)
+ [ -n "$service_name" ] || service_name=$(json_get "$STATE_JSON" domain)
+ if [ "$install_status" = "recommend" ] || { [ "$install_status" = "skip" ] && [ "${install_policy:-skip}" = "skip" ]; }; then
+ state_set_object runtime_checks.connect_daemon \
+ status=manual_pending \
+ "ts=$(_now)" \
+ "evidence=direxio-connect daemon install is an explicit operator action for policy=$install_status" \
+ "service_name=${service_name:-cc-connect}"
+ else
+ cmd_verify_connect_daemon >/dev/null || rc=1
+ fi
  cmd_verify_mcp_doctor >/dev/null || rc=1
  cmd_verify_mcp_tools >/dev/null || rc=1
  cmd_verify_mcp_smoke >/dev/null || rc=1
@@ -985,7 +1005,7 @@ cmd_verify_runtime() {
  smoke_status=$(runtime_check_status mcp_smoke)
 
  for status in "$connect_status" "$doctor_status" "$tools_status" "$smoke_status"; do
- [ "$status" = "passed" ] || failed_count=$((failed_count + 1))
+ runtime_status_counts_as_failure "$status" && failed_count=$((failed_count + 1))
  done
 
  if [ "$failed_count" -eq 0 ]; then
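The effect of the new status classification on the aggregate can be reproduced outside the deployer. This sketch copies `runtime_status_counts_as_failure` from the hunk above; the wrapper `count_runtime_failures` is a hypothetical helper introduced only for illustration.

```shell
#!/usr/bin/env bash
# Same classification as in the diff: these three statuses are not failures.
runtime_status_counts_as_failure() {
  case "$1" in
    passed|manual_pending|skipped) return 1 ;;
    *) return 0 ;;
  esac
}

# Hypothetical helper: print how many of the given statuses count as failures.
count_runtime_failures() {
  local status failed_count=0
  for status in "$@"; do
    if runtime_status_counts_as_failure "$status"; then
      failed_count=$((failed_count + 1))
    fi
  done
  printf '%s\n' "$failed_count"
}

count_runtime_failures manual_pending passed passed passed   # prints 0
count_runtime_failures failed passed not_run passed          # prints 2
```

With the previous `[ "$status" = "passed" ]` test, the first call would have counted `manual_pending` as one failure. The sketch uses `if` rather than `&&` so that the loop's own exit status does not depend on the last status examined.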
@@ -57,7 +57,7 @@ run_phase() {
  if [ -z "$(res_get key_name)" ]; then
  log "Creating key pair $name ..."
  aws ec2 create-key-pair --key-name "$name" --query KeyMaterial --output text > "$keyfile"
- chmod 600 "$keyfile"
+ restrict_private_file "$keyfile"
  res_set key_name "$name"; res_set key_file "$keyfile"
  else
  log "Key pair already exists; skipping."; keyfile=$(res_get key_file)
@@ -885,25 +885,51 @@ EOF
  }
 
  _create_cc_connect_matrix_session() {
- local asurl=$1 access_token=$2 device_id=$3 out=$4 body code http_body
+ local asurl=$1 agent_auth_token=$2 device_id=$3 out=$4 body code http_body
+ local max_attempts interval attempt preview
  body=$(json_build matrix-session-create "$device_id")
- http_body=$(mktemp)
- code=$(curl -sk -o "$http_body" -w '%{http_code}' -X POST "$asurl/_p2p/command" \
- -H 'Content-Type: application/json' \
- -H "Authorization: Bearer $access_token" \
- -d "$body" 2>/dev/null || true)
- if [ "$code" != "200" ]; then
- warn "agent.matrix_session.create returned HTTP ${code:-000}: $(head -c 200 "$http_body" 2>/dev/null)"
- rm -f "$http_body"
- return 1
- fi
- if ! json_assert "$http_body" matrix-session >/dev/null; then
- warn "agent.matrix_session.create response is missing Matrix session fields: $(head -c 200 "$http_body" 2>/dev/null)"
+ max_attempts=${DIREXIO_MATRIX_SESSION_CREATE_MAX:-4}
+ interval=${DIREXIO_MATRIX_SESSION_RETRY_INTERVAL:-2}
+ attempt=1
+ while [ "$attempt" -le "$max_attempts" ]; do
+ http_body=$(mktemp)
+ code=$(curl -sk \
+ --connect-timeout "${DIREXIO_MATRIX_SESSION_CURL_CONNECT_TIMEOUT:-10}" \
+ --max-time "${DIREXIO_MATRIX_SESSION_CURL_MAX_TIME:-20}" \
+ -o "$http_body" -w '%{http_code}' -X POST "$asurl/_p2p/command" \
+ -H 'Content-Type: application/json' \
+ -H "Authorization: Bearer $agent_auth_token" \
+ -d "$body" 2>/dev/null || true)
+ if [ "$code" = "200" ]; then
+ if ! json_assert "$http_body" matrix-session >/dev/null; then
+ warn "agent.matrix_session.create response is missing Matrix session fields: $(head -c 200 "$http_body" 2>/dev/null)"
+ rm -f "$http_body"
+ return 1
+ fi
+ mv "$http_body" "$out"
+ chmod 600 "$out" 2>/dev/null || true
+ return 0
+ fi
+ preview=$(head -c 200 "$http_body" 2>/dev/null || true)
  rm -f "$http_body"
+ case "${code:-000}" in
+ 000|5*)
+ if [ "$attempt" -lt "$max_attempts" ]; then
+ warn "agent.matrix_session.create returned HTTP ${code:-000} on attempt $attempt/$max_attempts; retrying."
+ sleep "$interval"
+ attempt=$((attempt + 1))
+ continue
+ fi
+ ;;
+ 401)
+ warn "agent.matrix_session.create rejected agent_token. Refresh bootstrap credentials or deploy a message-server build that allows agent_token for this action."
+ ;;
+ *) ;;
+ esac
+ warn "agent.matrix_session.create returned HTTP ${code:-000}: $preview"
  return 1
- fi
- mv "$http_body" "$out"
- chmod 600 "$out" 2>/dev/null || true
+ done
+ return 1
  }
 
  _write_cc_connect_config() {
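The retry policy in the hunk above (retry on HTTP 000 or 5xx, stop immediately on anything else) can be exercised without a server. This is a minimal sketch, not the deployer's code: `is_transient_http_code`, `retry_on_transient`, and `fake_probe` are hypothetical names, and `fake_probe` stands in for the `curl` call by printing a status code.

```shell
#!/usr/bin/env bash
# Hypothetical helper: 000 (no HTTP response) and 5xx are treated as transient.
is_transient_http_code() {
  case "${1:-000}" in
    000|5*) return 0 ;;
    *)      return 1 ;;
  esac
}

# Hypothetical wrapper: args are max attempts, sleep interval, then a command
# that prints an HTTP status code on stdout.
retry_on_transient() {
  local max_attempts=$1 interval=$2 attempt=1 code
  shift 2
  while [ "$attempt" -le "$max_attempts" ]; do
    code=$("$@")
    if [ "$code" = "200" ]; then
      echo "ok after $attempt attempt(s)"
      return 0
    fi
    if is_transient_http_code "$code" && [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$interval"
      attempt=$((attempt + 1))
      continue
    fi
    echo "giving up: HTTP ${code:-000} on attempt $attempt" >&2
    return 1
  done
  return 1
}

# Stand-in for curl: answers 503 twice, then 200. The call count lives in a
# file because the probe runs inside a command substitution (a subshell).
fake_probe() {
  local n
  n=$(cat "$counter_file")
  n=$((n + 1))
  echo "$n" > "$counter_file"
  if [ "$n" -lt 3 ]; then echo 503; else echo 200; fi
}

counter_file=$(mktemp)
echo 0 > "$counter_file"
retry_on_transient 4 0 fake_probe   # prints: ok after 3 attempt(s)
```

A 401 is not transient under this classification, so the wrapper returns after the first attempt, which matches the diff's choice to print a credential hint instead of retrying.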
@@ -1334,7 +1360,7 @@ run_phase() {
 
  mkdir -p "$workspace"
  mkdir -p "$cc_runtime_dir"
- if ! _create_cc_connect_matrix_session "$asurl" "$access_token" "DIREXIO_CC_CONNECT_${node_id}" "$cc_session"; then
+ if ! _create_cc_connect_matrix_session "$asurl" "$token" "DIREXIO_CC_CONNECT_${node_id}" "$cc_session"; then
  phase_set S6_WIRE_LOCAL failed "agent Matrix session creation failed"
  fail "failed to create cc-connect Matrix session via agent.matrix_session.create."
  fi