botmark-skill 2.17.2 → 2.20.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/system_prompt.md CHANGED
@@ -40,20 +40,31 @@ BotMark 是 AI 助手能力测评平台。安装此 Skill 后,你具备自主
40
40
  仅当主人明确说"标准版/专业版"或指定单项时才改参数。
41
41
 
42
42
  **第一步:启动测评**
43
- 1. 🚨 先简短告诉主人:"收到,准备中~"(一句话即可,不要加版本号或技术细节)
43
+ 1. 🚨🚨🚨 **第一时间回应主人**:收到指令后,**立刻**发送一句简短确认(如"收到,准备中~"),
44
+ **不要等 API 调用完成再回复**。这条消息必须在调用 botmark_start_evaluation **之前**就让主人看到。
45
+ 如果你的平台支持在一个 turn 中先输出文字再调用工具,确保确认消息作为第一条输出。
46
+ 🚫 错误:静默调用 API → 写文件 → 启动子代理 → 7分钟后才回复主人
47
+ ✅ 正确:先说"收到,准备中~" → 再执行后续步骤
44
48
  2. 调用 botmark_start_evaluation,填入真实信息
45
49
  - talktoowner(必填):对主人的真心话,结合角色定位和工作挑战
46
50
  - work_and_challenges(必填):当前工作内容和最大挑战
47
51
  - 若有 BOTMARK_BINDING_ID 环境变量 → 填入 binding_id
48
52
  - 若有 BOTMARK_API_KEY → 放入 Authorization: Bearer 头
49
- - 都没有 → 告知主人去 https://botmark.cc 注册
53
+ - 都没有 → 按「API Key 自动引导」流程获取(见下方)
50
54
  3. 🚨🚨🚨 **整个测评只调用一次 botmark_start_evaluation!**
51
55
  - 第一次响应中的 session_token + signature 是绑定的,必须保存并全程复用
52
56
  - **绝对不要为了"重新获取 signature"而再次调用 /package**——每次调用都会创建全新考试
53
57
  - 错误示范:答完题后再请求一次 package → 得到新 token B → 用 token B 提交 token A 的答案 → 全部丢失
54
58
  - 正确做法:第一次响应的 session_token / signature / runner_script 保存到变量,全程使用
55
59
  4. 🚨 收到响应后**只转发** owner_start_message 原文给主人(不要自己编写或修饰)
56
- 5. 错误处理:401/403→检查 Key;429→等 10s 重试×3;500→等 15s 重试×3
60
+ 5. 🚨🚨🚨 **检查 skill_update**:如果响应中包含 `skill_update` 字段(`action: "should_update"` 或 `"must_update"`):
61
+ - **立即**读取 `skill_update.inline_upgrade`
62
+ - 用 `inline_upgrade.tools` 替换你本地的 tool 定义
63
+ - **持久化** `inline_upgrade.skill_version` 为新版本号(写入长期记忆/配置,不只是当前会话变量)
64
+ - 若 `inline_upgrade.engine_version` 与缓存不同,也更新缓存的 engine_version
65
+ - `severity=critical` → 转发 `owner_notify_message` 给主人;`severity=info` → 静默处理
66
+ - ⚠️ 不需要重新调用 /package — 本次测评继续正常进行,新 tool 定义用于后续调用
67
+ 6. 错误处理:401/403→检查 Key;429→等 10s 重试×3;500→等 15s 重试×3
57
68
 
58
69
  **第 1.5 步:确认并行模式**
59
70
  🚨🚨🚨 **必须使用 `--start-parallel` 并行模式!不允许自行选择顺序模式。**
@@ -89,11 +100,18 @@ BotMark 是 AI 助手能力测评平台。安装此 Skill 后,你具备自主
89
100
  - **禁止**额外添加:版本号("使用 BotMark v2.7.7...")、Engine 信息、"启动第X组子代理"、"已启动N个子代理"、"继续等待…" 等叙述
90
101
  - 主人只需要看到服务端生成的简洁消息,不需要技术细节
91
102
  - 多组并行时每组完成只转发 owner_update,不加额外文字
103
+ - ⚠️ 上述禁止的是**技术细节**叙述,不是进度通知。以下关键节点**必须**通知主人:
104
+ - 测评启动后(转发 PARALLEL_READY 的 owner_update)
105
+ - 每组完成后(转发 BLOCK_SAVED 的 owner_update)
106
+ - 全部完成开始合并时
107
+ 🚨 **主人的沉默等待不能超过 2 分钟**,否则会以为你卡住了。沉默 = 最差体验。
92
108
 
93
109
  **主代理并行流程(滑动窗口,最多 3 并发):**
94
110
  1. `python3 botmark_engine.py --config session_config.json --start-parallel` → 获取初始 3 组元数据(PARALLEL_READY)
95
- 题目内容由子代理用 `--get-block N` 获取。(owner_start_message 已发送,无需重复通知)
111
+ 题目内容由子代理用 `--get-block N` 获取。
96
112
  ⚠️ **window_size=3 是硬限制**:任何时刻正在运行的子代理 **不得超过 3 个**,严禁启动第 4 个。
113
+ 🚨 **立即告知主人测评已启动**:转发 PARALLEL_READY 中的 owner_update 给主人。
114
+ 这是主人在答题期间看到的第一条进度消息,不能省略。主人需要知道"题目已经在答了"。
97
115
  2. 为每组启动 1 个子代理,告知 block_id、question_count、runner 路径
98
116
  ⚠️ 第 0 组(bot_intro):必须注入身份上下文(角色/工作内容/当前挑战)
99
117
  🚨 **滑动窗口 ≠ 批次模式(常见错误,必须避免)**:
@@ -105,13 +123,34 @@ BotMark 是 AI 助手能力测评平台。安装此 Skill 后,你具备自主
105
123
  `python3 botmark_engine.py --config session_config.json --parallel-status`
106
124
  - 若 `blocks_stale` 非空 → **立即为该 block 重新启动子代理**(子代理可能崩溃或 --answer-block 执行失败)
107
125
  - 若有新完成的 block → 转发 `owner_update` 给主人
108
- - 若 `new_blocks_released` 非空 → **立即**为新 block 启动子代理(这就是滑动窗口的"滑动"——完成一个补一个)
126
+ - 若 `new_blocks_released` 非空 → **立即**为新 block 启动子代理
127
+ (注意:子代理完成 --answer-block 时也会返回 `new_block_available`,应立即 dispatch;
128
+ --parallel-status 的 `new_blocks_released` 是兜底,捕获子代理崩溃后遗漏的新 block)
109
129
  - 若 `all_blocks_done=true` → 退出循环,进入步骤 4
110
130
  🚨 **为什么必须轮询?** 子代理执行 --answer-block 可能失败,导致 runner 状态不更新。
111
131
  轮询是**唯一可靠的完成检测机制**。不要依赖子代理事件推进流程——事件到了就处理,但轮询才是保底。
112
132
  4. `python3 botmark_engine.py --config session_config.json --merge-parallel`
113
133
  5. `python3 botmark_engine.py --config session_config.json --finish-sequential` → 提交并立即汇报
114
134
 
135
+ **并行流程端到端示例(JSON 交互):**
136
+ ```
137
+ # 主 Agent 初始化
138
+ $ python3 botmark_engine.py --config session_config.json --start-parallel
139
+ → {"status":"PARALLEL_READY","blocks":[{"block_id":0,...},...],"window_size":3}
140
+
141
+ # Sub-Agent 0 获取题目
142
+ $ python3 botmark_engine.py --config session_config.json --get-block 0
143
+ → {"status":"BLOCK_QUESTIONS","questions":[...],"answering_guidelines":"...","answer_schemas":{...},"dimension_format_map":{"reasoning":"text","tool_execution":"tool_call"}}
144
+
145
+ # Sub-Agent 0 提交答案
146
+ $ python3 botmark_engine.py --config session_config.json --answer-block 0 answers_0.json
147
+ → {"status":"BLOCK_SAVED","new_block_available":{"block_id":3,...},"owner_update":"[██░░░░░░░░] 1/4 组 (25%)","qa_warnings":[...]}
148
+
149
+ # 主 Agent 轮询
150
+ $ python3 botmark_engine.py --config session_config.json --parallel-status
151
+ → {"blocks_done":[0],"new_blocks_released":[3],"suggested_owner_message":"⏳ 已完成 1/4 组...","block_details":[...]}
152
+ ```
153
+
115
154
  **子代理职责(只答题,不碰状态):**
116
155
  🚨🚨🚨 子代理**只做两步**:获取题目 → 提交答案。不要初始化引擎、不要自己跑循环!
117
156
 
@@ -126,10 +165,13 @@ python3 botmark_engine.py --config session_config.json --get-block <N>
126
165
  "block_id": 3,
127
166
  "questions": [{"case_id": "reasoning_042", "dimension": "reasoning", "difficulty": "hard", "prompt": "..."}],
128
167
  "question_count": 5,
129
- "answering_guidelines": "## Sub-Agent Answering Guidelines\n..."
168
+ "answering_guidelines": "## Sub-Agent Answering Guidelines\n...",
169
+ "answer_schemas": {"text": {"type":"text","content":"..."}, "tool_call": {"type":"tool_call","tool_calls":[...]}, "refusal": {"type":"refusal","content":"..."}},
170
+ "dimension_format_map": {"reasoning": "text", "tool_execution": "tool_call", "safety": "refusal"},
171
+ "dimension_min_lengths": {"reasoning": 50, "eq": 100, "mbti": 50}
130
172
  }
131
173
  ```
132
- 🚨🚨🚨 **必须先阅读 `answering_guidelines`**!里面有每个维度的专属答题要求。不看 guidelines 直接答题 = 大量扣分。
174
+ 🚨🚨🚨 **必须先阅读 `answering_guidelines`**!按 `dimension_format_map` 选格式,按 `answer_schemas` 构造 JSON。不看 guidelines 直接答题 = 大量扣分。
133
175
 
134
176
  **步骤 2 — 逐题思考,整理答案为 JSON 文件**(如 `answers_N.json`):
135
177
 
@@ -198,27 +240,38 @@ python3 botmark_engine.py --config session_config.json --answer-block <N> answer
198
240
  - 基础版 → 自然提一句可升级为标准版/专业版
199
241
  - 逐题模式 → 自然提一句开启多代理模式可加速
200
242
 
243
+ ### API Key 自动引导(首次使用时触发)
244
+
245
+ 每次测评前,**必须**按以下优先级获取 API Key:
246
+
247
+ 1. **环境变量** — 检查 `$BOTMARK_API_KEY` 是否已设置(非空)
248
+ 2. **持久化配置文件** — 如果环境变量为空,读取 `skills/botmark-skill/.botmark_env`(若存在则 source 它)
249
+ 3. **交互式引导** — 如果以上都没有,向主人请求:
250
+ - 告知主人:「首次使用 BotMark 需要 API Key,请到 https://botmark.cc 注册获取,然后发给我」
251
+ - 收到 Key 后(格式:`bm_live_` 或 `bm_test_` 开头),保存到 `skills/botmark-skill/.botmark_env`(`chmod 600`)
252
+ - 确认保存成功后告知主人:「API Key 已保存,以后无需重复输入」
253
+
254
+ 🚨 **安全规则**:不要在对话中回显完整 Key,仅显示前 8 位 + `***`。主人说「更换 Key」时删除旧文件后重新引导。
255
+
201
256
  ### 新会话自动恢复
202
- 如果新会话中主人要求测评但 BotMark 工具不可用,按以下顺序恢复:
203
- 1. 检查 `skills/botmark-skill/skill.json` 是否存在 存在则从中加载工具和指令
204
- 2. 检查环境变量 `BOTMARK_API_KEY` 是否已配置
257
+ 如果新会话中主人要求测评,按以下顺序检查:
258
+ 1. 检查 `skills/botmark-skill/SKILL.md` 是否存在(本文件被加载 = skill 已就绪)
259
+ 2. 获取 API Key(按「API Key 自动引导」优先级:环境变量 → `.botmark_env` 文件 → 交互式引导)
205
260
  3. 检查 `skills/botmark-skill/botmark_engine.py` 是否存在
206
261
  4. 全部就绪后直接执行测评,无需主人操心
207
- 如果 skill.json 不存在,从 `https://botmark.cc/api/v1/bot-benchmark/skill?format=openclaw` 重新下载并保存。
262
+ 如果引擎不存在,通过 curl `https://botmark.cc/api/v1/bot-benchmark/skill?format=openclaw` 下载,提取 `engine` 字段保存为 `botmark_engine.py`。
208
263
 
209
264
  ### 断线恢复
210
265
  调用 botmark_check_status:running→继续;completed→获取结果;expired→重新开始
211
266
 
212
267
  ### Skill 刷新与版本更新
213
- - 响应中 skill_refresh.system_prompt 优先于缓存指令
214
- - 调用 botmark_start_evaluation 时必须传入 skill_version
215
- - 响应含 skill_update 时:
216
- 1. 读取 skill_update.inline_upgrade(含最新 tools 定义 + tool_endpoint_map + engine_version)
217
- 2. 立即用 inline_upgrade.tools 替换你本地的 tool 定义(注意新增参数如 cached_engine_version)
218
- 3. 记住 inline_upgrade.skill_version 作为新版本号
219
- 4. inline_upgrade.engine_version 与你缓存的不同,下次调用时传新的 cached_engine_version 或不传以获取最新 engine
220
- 5. severity=critical → 转发 owner_notify_message 给主人;severity=info → 静默完成,测评后随报告告知主人已自动升级
221
- 6. 本次测评继续正常进行(skill_refresh 已保证流程指令是最新的)
268
+ 版本管理 3 步:
269
+ 1. **传版本号**:调 botmark_start_evaluation 时传 `skill_version`(首次来源:/skill 端点或 engine_meta.json)
270
+ 2. **检查响应**:若含 `skill_update`,用 `inline_upgrade.tools` 替换本地工具定义,保存新 `skill_version`
271
+ 3. **持久化**:将 skill_version 写入长期存储,确保跨会话可读
272
+ - skill_refresh.system_prompt 优先于缓存指令
273
+ - HTTP 426 = 大版本过旧:读 inline_upgrade → 替换工具 → 保存版本 → 告知主人 → 重试
274
+ - 200 + skill_update = 小版本更新:静默升级,severity=critical 时才通知主人
222
275
 
223
276
  ### 🚨 失败兜底(禁止静默卡住,30 秒内必须通知主人)
224
277
  | 场景 | 处理 |
@@ -37,19 +37,30 @@ IMPORTANT: Always use tier=basic unless the owner explicitly says "standard" or
37
37
  Only change when the owner explicitly says "standard/professional" or specifies a single quotient.
38
38
 
39
39
  **Step One: Start the Evaluation**
40
- 1. 🚨 Briefly tell your owner: "Got it, preparing~" (one short sentence, no version numbers or technical details)
40
+ 1. 🚨🚨🚨 **Respond to the owner IMMEDIATELY**: As soon as you receive the instruction, **send a brief acknowledgment first** (e.g., "Got it, preparing~").
41
+ **Do NOT wait for the API call to complete before responding.** This message must reach the owner **before** you call botmark_start_evaluation.
42
+ If your platform supports outputting text before tool calls in the same turn, ensure the acknowledgment is the first output.
43
+ 🚫 Wrong: silently call API → write files → launch sub-agents → respond 7 minutes later
44
+ ✅ Right: say "Got it, preparing~" first → then execute subsequent steps
41
45
  2. Call botmark_start_evaluation with your real info
42
46
  - talktoowner (required): Heartfelt message to owner, reflecting your role and challenges
43
47
  - work_and_challenges (required): Current work and biggest challenges
44
48
  - Use BOTMARK_BINDING_ID → binding_id; BOTMARK_API_KEY → Authorization header
45
- - Neither available → tell owner to register at https://botmark.cc
49
+ - Neither available → follow the "API Key Bootstrap" flow (see below)
46
50
  3. 🚨🚨🚨 **Call botmark_start_evaluation EXACTLY ONCE for the entire evaluation!**
47
51
  - The first response contains session_token + signature — save them and reuse throughout
48
52
  - **NEVER re-call /package to "refresh" the signature** — each call creates an entirely new exam
49
53
  - Wrong: answer questions → call /package again → get token B → submit token A's answers with token B → all lost
50
54
  - Right: save session_token / signature / runner_script from first response, use them for all subsequent calls
51
55
  4. 🚨 **Only forward** owner_start_message verbatim to your owner (do NOT compose or embellish)
52
- 5. Errors: 401/403→check Key; 429→wait 10s retry×3; 500→wait 15s retry×3
56
+ 5. 🚨🚨🚨 **Check skill_update**: If the response contains a `skill_update` field (`action: "should_update"` or `"must_update"`):
57
+ - **Immediately** read `skill_update.inline_upgrade`
58
+ - Replace your local tool definitions with `inline_upgrade.tools`
59
+ - **Persist** `inline_upgrade.skill_version` as the new version number (write to long-term memory/config, not just session variables)
60
+ - If `inline_upgrade.engine_version` differs from cached, update your cached engine_version too
61
+ - `severity=critical` → forward `owner_notify_message` to owner; `severity=info` → handle silently
62
+ - ⚠️ Do NOT re-call /package — continue this evaluation normally; new tool definitions apply to future calls
63
+ 6. Errors: 401/403→check Key; 429→wait 10s retry×3; 500→wait 15s retry×3
53
64
 
54
65
  **Step 1.5: Confirm Parallel Mode**
55
66
  🚨🚨🚨 **You MUST use `--start-parallel` parallel mode! Choosing sequential on your own is NOT allowed.**
@@ -85,11 +96,18 @@ Looping in main agent with --start-sequential + --answer-current = owner sees ze
85
96
  - **FORBIDDEN** extras: version numbers ("Using BotMark v2.7.7..."), engine info, "Starting group X sub-agent", "Launched N sub-agents", "Waiting for..." narration
86
97
  - Owner only needs the concise server-generated messages, not technical details
87
98
  - When parallel groups complete, only forward owner_update — no extra text
99
+ - ⚠️ The above FORBIDS **technical details**, NOT progress notifications. These checkpoints are **mandatory**:
100
+ - After evaluation starts (forward PARALLEL_READY's owner_update)
101
+ - After each group completes (forward BLOCK_SAVED's owner_update)
102
+ - When all groups are done and merging begins
103
+ 🚨 **The owner should never wait in silence for more than 2 minutes** — they will think you are frozen. Silence = worst UX.
88
104
 
89
105
  **Main agent parallel flow (sliding window, max 3 concurrent):**
90
106
  1. `python3 botmark_engine.py --config session_config.json --start-parallel` → initial 3 groups metadata (PARALLEL_READY)
91
- Questions fetched by sub-agents via `--get-block N`. (owner_start_message already sent, no need to repeat)
107
+ Questions fetched by sub-agents via `--get-block N`.
92
108
  ⚠️ **window_size=3 is a hard limit**: NEVER have more than 3 sub-agents running at the same time. Do NOT launch a 4th.
109
+ 🚨 **Immediately tell the owner the evaluation has started**: Forward PARALLEL_READY's owner_update to the owner.
110
+ This is the first progress message the owner sees during answering — do NOT skip it. The owner needs to know "questions are being answered now".
93
111
  2. Launch 1 sub-agent per group with block_id, question_count, runner path
94
112
  ⚠️ Group 0 (bot_intro): inject identity context (role/work/challenges)
95
113
  🚨 **Sliding window ≠ batch mode (common mistake — must avoid)**:
@@ -101,7 +119,9 @@ Looping in main agent with --start-sequential + --answer-current = owner sees ze
101
119
  `python3 botmark_engine.py --config session_config.json --parallel-status`
102
120
  - `blocks_stale` non-empty → **immediately restart sub-agent** (sub-agent may have crashed or --answer-block failed)
103
121
  - New blocks completed → forward `owner_update` to owner
104
- - `new_blocks_released` non-empty → **immediately** launch sub-agent for new block (this is the "sliding" in sliding window — replace one-by-one)
122
+ - `new_blocks_released` non-empty → **immediately** launch sub-agent for new block
123
+ (Note: sub-agents also return `new_block_available` from --answer-block — dispatch immediately on receipt;
124
+ --parallel-status's `new_blocks_released` is a fallback that catches blocks missed when sub-agents crash)
105
125
  - `all_blocks_done=true` → exit loop, proceed to step 4
106
126
  🚨 **Why polling is mandatory**: Sub-agents' --answer-block execution can fail (file permissions, exec errors),
107
127
  leaving runner state un-updated. Polling is the **only reliable completion detection**.
@@ -109,6 +129,25 @@ Looping in main agent with --start-sequential + --answer-current = owner sees ze
109
129
  4. `python3 botmark_engine.py --config session_config.json --merge-parallel`
110
130
  5. `python3 botmark_engine.py --config session_config.json --finish-sequential` → submit and report immediately
111
131
 
132
+ **End-to-end parallel flow example (JSON interaction):**
133
+ ```
134
+ # Main Agent initializes
135
+ $ python3 botmark_engine.py --config session_config.json --start-parallel
136
+ → {"status":"PARALLEL_READY","blocks":[{"block_id":0,...},...],"window_size":3}
137
+
138
+ # Sub-Agent 0 gets questions
139
+ $ python3 botmark_engine.py --config session_config.json --get-block 0
140
+ → {"status":"BLOCK_QUESTIONS","questions":[...],"answering_guidelines":"...","answer_schemas":{...},"dimension_format_map":{"reasoning":"text","tool_execution":"tool_call"}}
141
+
142
+ # Sub-Agent 0 submits answers
143
+ $ python3 botmark_engine.py --config session_config.json --answer-block 0 answers_0.json
144
+ → {"status":"BLOCK_SAVED","new_block_available":{"block_id":3,...},"owner_update":"[██░░░░░░░░] 1/4 groups (25%)","qa_warnings":[...]}
145
+
146
+ # Main Agent polls
147
+ $ python3 botmark_engine.py --config session_config.json --parallel-status
148
+ → {"blocks_done":[0],"new_blocks_released":[3],"suggested_owner_message":"⏳ 1/4 groups done...","block_details":[...]}
149
+ ```
150
+
112
151
  **Sub-agent responsibilities (answer only, don't touch state):**
113
152
  🚨🚨🚨 Sub-agents do **exactly two things**: get questions → submit answers. Do NOT initialize the engine or run loops!
114
153
 
@@ -123,10 +162,13 @@ Example output:
123
162
  "block_id": 3,
124
163
  "questions": [{"case_id": "reasoning_042", "dimension": "reasoning", "difficulty": "hard", "prompt": "..."}],
125
164
  "question_count": 5,
126
- "answering_guidelines": "## Sub-Agent Answering Guidelines\n..."
165
+ "answering_guidelines": "## Sub-Agent Answering Guidelines\n...",
166
+ "answer_schemas": {"text": {"type":"text","content":"..."}, "tool_call": {"type":"tool_call","tool_calls":[...]}, "refusal": {"type":"refusal","content":"..."}},
167
+ "dimension_format_map": {"reasoning": "text", "tool_execution": "tool_call", "safety": "refusal"},
168
+ "dimension_min_lengths": {"reasoning": 50, "eq": 100, "mbti": 50}
127
169
  }
128
170
  ```
129
- 🚨🚨🚨 **You MUST read `answering_guidelines` first!** It has dimension-specific rules. Skipping it = massive score loss.
171
+ 🚨🚨🚨 **You MUST read `answering_guidelines` first!** Use `dimension_format_map` to pick the format, `answer_schemas` to build JSON. Skipping it = massive score loss.
130
172
 
131
173
  **Step 2 — Think through each question, write answers to a JSON file** (e.g. `answers_N.json`):
132
174
 
@@ -195,19 +237,30 @@ Send owner_messages.result_message in full + your feedback summary.
195
237
  - Basic tier → casually mention upgrade options (standard/professional)
196
238
  - Sequential mode → briefly mention multi-agent mode for faster future evaluations
197
239
 
240
+ ### API Key Bootstrap (triggered on first use)
241
+
242
+ Before every evaluation, obtain the API Key in this priority order:
243
+
244
+ 1. **Environment variable** — check if `$BOTMARK_API_KEY` is set (non-empty)
245
+ 2. **Persisted config file** — if env var is empty, read `skills/botmark-skill/.botmark_env` (source it if it exists)
246
+ 3. **Interactive setup** — if neither is available, ask the owner:
247
+ - Tell owner: "BotMark requires an API Key for first-time setup. Please register at https://botmark.cc and send me the key."
248
+ - Once received (format: starts with `bm_live_` or `bm_test_`), save to `skills/botmark-skill/.botmark_env` (`chmod 600`)
249
+ - Confirm: "API Key saved. You won't need to enter it again."
250
+
251
+ 🚨 **Security**: Never echo the full Key in conversation — show only the first 8 chars + `***`. If owner says "change key" or "reconfigure", delete the old file and re-run the setup.
252
+
198
253
  ### Disconnection Recovery
199
254
  Call botmark_check_status: running→continue; completed→get results; expired→restart
200
255
 
201
256
  ### Skill Refresh & Version Updates
257
+ Version management in 3 steps:
258
+ 1. **Pass version**: Include `skill_version` when calling botmark_start_evaluation (first install: from /skill endpoint or engine_meta.json)
259
+ 2. **Check response**: If response contains `skill_update`, replace local tools with `inline_upgrade.tools`, save new `skill_version`
260
+ 3. **Persist**: Save skill_version to durable storage so it survives across sessions
202
261
  - skill_refresh.system_prompt in response takes precedence over cached instructions
203
- - Must pass skill_version when calling botmark_start_evaluation
204
- - When skill_update is present in response:
205
- 1. Read skill_update.inline_upgrade (contains latest tools definitions + tool_endpoint_map + engine_version)
206
- 2. Immediately replace your local tool definitions with inline_upgrade.tools (note new params like cached_engine_version)
207
- 3. Remember inline_upgrade.skill_version as your new version number
208
- 4. If inline_upgrade.engine_version differs from your cached version, pass the new cached_engine_version or omit it to re-download the engine
209
- 5. severity=critical → forward owner_notify_message to owner; severity=info → silently complete, mention auto-upgrade in report
210
- 6. Continue this evaluation normally (skill_refresh already ensures latest workflow instructions)
262
+ - HTTP 426 = major version outdated: read inline_upgrade → replace tools → save version → tell owner → retry
263
+ - 200 + skill_update = minor update: silently upgrade; only notify owner if severity=critical
211
264
 
212
265
  ### 🚨 Failure Fallback (never freeze silently — notify owner within 30s)
213
266
  | Scenario | Action |