@chanlerdev/scorel 0.0.6 → 0.0.8
- package/README.md +5 -3
- package/dist/index.js +599 -67
- package/dist/index.js.map +4 -4
- package/docs/CHANGELOG.md +53 -0
- package/docs/ROADMAP.md +5 -4
- package/docs/spec/channels.md +1 -1
- package/docs/spec/events.md +28 -10
- package/docs/spec/runtime.md +3 -3
- package/docs/spec/session.md +4 -4
- package/docs/spec/ship/S0106-snip-context-control.md +1 -1
- package/docs/spec/ship/S0107-system-reminder-unification.md +112 -51
- package/docs/spec/ship/S0108-gui-bundled-cli-runtime.md +1 -1
- package/docs/spec/ship/S0109-scorel-run-headless-task-runner.md +172 -0
- package/package.json +1 -1
package/docs/CHANGELOG.md
CHANGED
|
@@ -2,6 +2,59 @@
|
|
|
2
2
|
|
|
3
3
|
## Unreleased
|
|
4
4
|
|
|
5
|
+
## 0.0.8 - 2026-06-26
|
|
6
|
+
|
|
7
|
+
### Highlights
|
|
8
|
+
|
|
9
|
+
- New `scorel run` command for headless, non-interactive task execution
|
|
10
|
+
- GUI now correctly loads memory status for the selected project on session preload
|
|
11
|
+
- Sessions can be attached even when provider credentials are missing
|
|
12
|
+
|
|
13
|
+
### Changes
|
|
14
|
+
|
|
15
|
+
- Added `scorel run` command with multiple prompt sources, execution options, provider overrides, and machine-readable summary output
|
|
16
|
+
- GUI now fetches and displays memory status for the selected project during app initialization
|
|
17
|
+
- Provider settings UI no longer autosaves on credential mode change, preventing incomplete data saves
|
|
18
|
+
- Daemon session loading no longer requires a runtime; sessions can be attached without provider API keys configured
|
|
19
|
+
|
|
20
|
+
### Fixes
|
|
21
|
+
|
|
22
|
+
- Fixed GUI not loading memory status for the selected project on session preload
|
|
23
|
+
- Fixed provider attach failure when provider config is incomplete (e.g., missing API key)
|
|
24
|
+
|
|
25
|
+
### Verification
|
|
26
|
+
|
|
27
|
+
- Tests cover `scorel run` prompt sources, output formats, summary, timeout exit code, and provider overrides
|
|
28
|
+
- Unit test confirms memory status is fetched for correct project on app mount
|
|
29
|
+
- Tests verify sessions can be loaded and messages sent even when provider credentials are missing
|
|
30
|
+
|
|
31
|
+
## 0.0.7 - 2026-06-24
|
|
32
|
+
|
|
33
|
+
### Highlights
|
|
34
|
+
|
|
35
|
+
- Introduce structured system_reminder content blocks across CLI, GUI, WebUI, and daemon, replacing ad-hoc XML strings.
|
|
36
|
+
- Bundle GUI runtime dependencies for self-contained execution.
|
|
37
|
+
|
|
38
|
+
### Changes
|
|
39
|
+
|
|
40
|
+
- System reminders now use structured `system_reminder` content blocks with origin, visibility, and scope, enabling consistent handling across all interfaces.
|
|
41
|
+
- GUI's CLI runtime is now bundled with all dependencies, eliminating the need for node_modules.
|
|
42
|
+
|
|
43
|
+
### Fixes
|
|
44
|
+
|
|
45
|
+
- Snip tool result no longer exposes internal span IDs or event counts to the model, keeping output concise.
|
|
46
|
+
|
|
47
|
+
### Breaking Changes
|
|
48
|
+
|
|
49
|
+
- Protocol version incremented from 4 to 5; session headers must now carry version 5.
|
|
50
|
+
|
|
51
|
+
### Verification
|
|
52
|
+
|
|
53
|
+
- All existing tests pass, with new tests covering system_reminder lowering, message-attached reminders, projector filtering, and bundled runtime integrity.
|
|
54
|
+
|
|
55
|
+
- Protocol version incremented to 5 with structured `system_reminder` content blocks.
|
|
56
|
+
- Snip tool results now return a concise model-visible confirmation while keeping internal span details out of provider context.
|
|
57
|
+
|
|
5
58
|
## 0.0.6 - 2026-06-23
|
|
6
59
|
|
|
7
60
|
### Highlights
|
package/docs/ROADMAP.md
CHANGED
|
@@ -601,8 +601,8 @@ The official product direction for M5 WebUI is recorded in [`S0030`](spec/ship/S0030-webui-product-
|
|
|
601
601
|
| M9.F1.25 | [`S0103`](spec/ship/S0103-daemon-lifecycle-and-settings-resilience.md) | Distinguish the daemon lifecycle by entry point, and fix the black-screen risk when switching to remote in GUI Settings | Done |
|
|
602
602
|
| M9.F1.26 | [`S0105`](spec/ship/S0105-cli-update-and-gui-release.md) | Unify and complete the CLI command surface, NPM manual/automatic updates, GUI release packaging, and an incremental update framework | Done |
|
|
603
603
|
| M9.F1.27 | [`S0106`](spec/ship/S0106-snip-context-control.md) | `context_control` persistent event and `snip` tool, letting the agent hide completed user turns from future LLM context projection | Done |
|
|
604
|
-
| M9.F1.28 | [`S0107`](spec/ship/S0107-system-reminder-unification.md) | Unify the persistence, construction, LLM projection, and UI visibility semantics of system reminders |
|
|
605
|
-
| M9.F1.29 | [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | The GUI release bundles the same-version CLI runtime; the packaged GUI starts the local Host from the executable inside the bundle |
|
|
604
|
+
| M9.F1.28 | [`S0107`](spec/ship/S0107-system-reminder-unification.md) | Unify the persistence, construction, LLM projection, and UI visibility semantics of system reminders | Done |
|
|
605
|
+
| M9.F1.29 | [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | The GUI release bundles the same-version CLI runtime; the packaged GUI starts the local Host from the executable inside the bundle | Done |
|
|
606
606
|
|
|
607
607
|
**Not in M9 Follow-up**:
|
|
608
608
|
|
|
@@ -783,8 +783,9 @@ The HTTP adapter must map existing Host use cases and must not duplicate domain logic.
|
|
|
783
783
|
| [`S0104`](spec/ship/S0104-tool-result-artifacts.md) | Tool result artifacts for oversized Bash output | Done |
|
|
784
784
|
| [`S0105`](spec/ship/S0105-cli-update-and-gui-release.md) | CLI update and GUI release | Done |
|
|
785
785
|
| [`S0106`](spec/ship/S0106-snip-context-control.md) | Snip context control | Done |
|
|
786
|
-
| [`S0107`](spec/ship/S0107-system-reminder-unification.md) | System reminder unification |
|
|
787
|
-
| [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | GUI bundled CLI runtime |
|
|
786
|
+
| [`S0107`](spec/ship/S0107-system-reminder-unification.md) | System reminder unification | Done |
|
|
787
|
+
| [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | GUI bundled CLI runtime | Done |
|
|
788
|
+
| [`S0109`](spec/ship/S0109-scorel-run-headless-task-runner.md) | `scorel run` headless task runner | Active |
|
|
788
789
|
|
|
789
790
|
---
|
|
790
791
|
|
package/docs/spec/channels.md
CHANGED
|
@@ -94,7 +94,7 @@ running default -> follow_up queue
|
|
|
94
94
|
|
|
95
95
|
## 6. Source Reminder
|
|
96
96
|
|
|
97
|
-
Each IM turn injects, before the user message, a hidden
|
|
97
|
+
Each IM turn injects, before the user message, a hidden `harness_item kind="channel_context"`. `buildContext()` converts it into a `system_reminder` block, and the provider adapter finally lowers it into model input text like the following:
|
|
98
98
|
|
|
99
99
|
```xml
|
|
100
100
|
<system-reminder>
|
package/docs/spec/events.md
CHANGED
|
@@ -340,7 +340,7 @@ interface EventTypeHandler<T extends PersistentEvent> {
|
|
|
340
340
|
```typescript
|
|
341
341
|
type LlmAction =
|
|
342
342
|
| { action: "include"; message: ScorelMessage } // include as a normal message
|
|
343
|
-
| { action: "merge_prev";
|
|
343
|
+
| { action: "merge_prev"; reminder: SystemReminderContentBlock } // merge into the previous tool_result
|
|
344
344
|
| { action: "skip" } // not included in the LLM context
|
|
345
345
|
| { action: "barrier"; summary: ScorelMessage } // replace all messages above, inject the summary, stop traversal
|
|
346
346
|
```
|
|
@@ -350,7 +350,7 @@ type LlmAction =
|
|
|
350
350
|
| Event type | convertToLlm | convertToDisplay |
|
|
351
351
|
|---|---|---|
|
|
352
352
|
| `message` (user/assistant/tool_result) | `include`: included as-is | Normal bubble |
|
|
353
|
-
| `
|
|
353
|
+
| `harness_item` (steer / runtime guidance) | `merge_prev`: merged into the previous tool_result as a `system_reminder` block; if there is no tool_result, sent as a standalone user msg | Inline small-print hint |
|
|
354
354
|
| `message` (meta.source = "followUp") | Same as steer | Inline "follow-up task" marker |
|
|
355
355
|
| `rewind` | `skip` | "Rewound to here" marker |
|
|
356
356
|
| `branch` | `skip` | "Switched branch" marker |
|
|
@@ -397,23 +397,40 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
|
|
|
397
397
|
|
|
398
398
|
---
|
|
399
399
|
|
|
400
|
-
## 7.
|
|
400
|
+
## 7. `system_reminder`: General Harness Injection Mechanism
|
|
401
401
|
|
|
402
402
|
### 7.1 Purpose
|
|
403
403
|
|
|
404
|
-
|
|
404
|
+
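`system_reminder` is the structured content block that the Scorel harness uses to pass side-channel information to the LLM. Any system-level content that is not direct user input but that the LLM needs to see is first expressed as a structured block; `<system-reminder>` is only the transport envelope that the provider adapter generates at the end.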
|
|
405
405
|
|
|
406
406
|
### 7.2 Use Cases
|
|
407
407
|
|
|
408
408
|
| Scenario | Injected content | Injection location |
|
|
409
409
|
|------|---------|---------|
|
|
410
410
|
| Steer (user interjects mid-turn) | The user's guidance text | Merged into the previous tool_result |
|
|
411
|
-
|
|
|
411
|
+
| User message sidecar | Current turn time, references to user input, `snip.userMessageId` | `content` of the same user message |
|
|
412
|
+
| Hook context (UserPromptSubmit, etc.) | Hook output | user message / tool_result |
|
|
412
413
|
| Memory recall | Memory content | End of tool_result |
|
|
413
414
|
| System reminders (timeout, quota, etc.) | Notification text | End of tool_result |
|
|
414
415
|
| Channel source annotation | Which group/channel it came from | Inside the user message |
|
|
415
416
|
|
|
416
|
-
### 7.3 Format
|
|
417
|
+
### 7.3 Canonical Format
|
|
418
|
+
|
|
419
|
+
```typescript
|
|
420
|
+
interface SystemReminderContentBlock {
|
|
421
|
+
type: "system_reminder";
|
|
422
|
+
kind: SystemReminderKind;
|
|
423
|
+
origin: "system" | "user" | "tool" | "skill";
|
|
424
|
+
text: string;
|
|
425
|
+
visibility: "model" | "display" | "compact";
|
|
426
|
+
scope: "message" | "turn" | "next_model_call" | "session";
|
|
427
|
+
data?: Record<string, unknown>;
|
|
428
|
+
}
|
|
429
|
+
```
|
|
430
|
+
|
|
431
|
+
### 7.4 Provider lowering
|
|
432
|
+
|
|
433
|
+
The provider adapter lowers the canonical block into the current model input format. The default text envelope is:
|
|
417
434
|
|
|
418
435
|
```xml
|
|
419
436
|
<system-reminder>
|
|
@@ -421,12 +438,13 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
|
|
|
421
438
|
</system-reminder>
|
|
422
439
|
```
|
|
423
440
|
|
|
424
|
-
### 7.
|
|
441
|
+
### 7.5 Injection Rules
|
|
425
442
|
|
|
426
|
-
-
|
|
427
|
-
-
|
|
443
|
+
- **Attached to a user message**: a sidecar block in the same `user_message.content`, fixed at creation time so the prompt cache stays stable.
|
|
444
|
+
- **Inside the tool loop**: merged onto the end of the most recent tool_result's content; if the provider does not support this, fall back to a user message immediately after the tool result batch.
|
|
445
|
+
- **No tool_result (idle / after the turn ends)**: sent as a standalone user message.
|
|
428
446
|
|
|
429
|
-
### 7.
|
|
447
|
+
### 7.6 LLM System Prompt Declaration
|
|
430
448
|
|
|
431
449
|
The LLM is told in the system prompt:
|
|
432
450
|
|
package/docs/spec/runtime.md
CHANGED
|
@@ -174,15 +174,15 @@ Steer messages are persisted as a **standalone PersistentEvent** (role = "user", `meta.sour
|
|
|
174
174
|
|
|
175
175
|
| Preceded by a tool_result | Behavior |
|
|
176
176
|
|---|---|
|
|
177
|
-
| ✅ Yes | `merge_prev`: merged into the previous tool_result content
|
|
177
|
+
| ✅ Yes | `merge_prev`: merged onto the end of the previous tool_result content, as a structured `system_reminder` block |
|
|
178
178
|
| ❌ No (idle state) | `include`: sent as a standalone user message |
|
|
179
179
|
|
|
180
|
-
What the LLM finally sees (inside the tool loop):
|
|
180
|
+
What the LLM finally sees after provider lowering (inside the tool loop):
|
|
181
181
|
```
|
|
182
182
|
tool_result: "file contents...\n\n<system-reminder>\nStop editing, just run the tests\n</system-reminder>"
|
|
183
183
|
```
|
|
184
184
|
|
|
185
|
-
What the LLM finally sees (when idle):
|
|
185
|
+
What the LLM finally sees after provider lowering (when idle):
|
|
186
186
|
```
|
|
187
187
|
user: "Stop editing, just run the tests"
|
|
188
188
|
```
|
package/docs/spec/session.md
CHANGED
|
@@ -198,8 +198,8 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
|
|
|
198
198
|
messages.unshift(result.message);
|
|
199
199
|
break;
|
|
200
200
|
case "merge_prev":
|
|
201
|
-
// merge into the content of the last tool_result in messages
|
|
202
|
-
mergeIntoPrevToolResult(messages, result.
|
|
201
|
+
// merge onto the end of the content of the last tool_result in messages (system_reminder block)
|
|
202
|
+
mergeIntoPrevToolResult(messages, result.reminder);
|
|
203
203
|
break;
|
|
204
204
|
case "skip":
|
|
205
205
|
break;
|
|
@@ -216,7 +216,7 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
|
|
|
216
216
|
|
|
217
217
|
LlmAction for each event type:
|
|
218
218
|
- `message` (ordinary) → `include`
|
|
219
|
-
- `
|
|
219
|
+
- `harness_item` / runtime guidance → `merge_prev` (when preceded by a tool_result) or a standalone user message, carrying a structured `system_reminder` block
|
|
220
220
|
- `compact` → `barrier` (inject the summary, stop traversal)
|
|
221
221
|
- `rewind` / `branch` / `channel_inject` / `session_info` / `custom` → `skip`
|
|
222
222
|
- `custom_message` → `include`
|
|
@@ -367,7 +367,7 @@ if (estimateTokens(compactCandidates) > threshold) {
|
|
|
367
367
|
- `rewind` / `branch` / `channel_inject` / `session_info` / `custom` → `skip` (never reaches the LLM)
|
|
368
368
|
- `compact` → `barrier` (inject the summary, stop walking upward)
|
|
369
369
|
- `context_control` → `filter` (exclude the specified user turn span from future LLM context)
|
|
370
|
-
- `
|
|
370
|
+
- `harness_item` / runtime guidance → `merge_prev` or a standalone user message, carrying a structured `system_reminder` block
|
|
371
371
|
|
|
372
372
|
In other words, the application layer has a lot of room to maneuver, but the LLM only ever sees what the handlers declare should be exposed.
|
|
373
373
|
|
package/docs/spec/ship/S0106-snip-context-control.md
CHANGED
|
@@ -56,7 +56,7 @@ Expose a lazily available `Snip` runtime tool that lets the agent request hiding
|
|
|
56
56
|
{ "userMessageId": "u_...", "reason": "optional short reason" }
|
|
57
57
|
```
|
|
58
58
|
|
|
59
|
-
The Host validates the request, resolves the model-visible short alias to a target span, appends a `context_control` event, and returns a tool result
|
|
59
|
+
The Host validates the request, resolves the model-visible short alias to a target span, appends a `context_control` event, and returns a brief model-visible confirmation. Internal span details such as `anchorUserEventId`, `throughEventId`, and hidden event count may remain in structured tool result details for diagnostics, but provider context must only receive the concise confirmation. The tool result is still part of the current turn; the hidden span disappears on the next context build.
|
|
60
60
|
|
|
61
61
|
The tool is session-context control, not a generic coding tool. It must be registered by the Host with access to the current lane, not by `createCodingTools()`.
|
|
62
62
|
|
package/docs/spec/ship/S0107-system-reminder-unification.md
CHANGED
|
@@ -2,98 +2,154 @@
|
|
|
2
2
|
|
|
3
3
|
## Goal
|
|
4
4
|
|
|
5
|
-
Unify
|
|
5
|
+
Unify Scorel's runtime reminders as structured, model-facing context fragments.
|
|
6
6
|
|
|
7
|
-
The business value is prompt and transcript hygiene
|
|
7
|
+
The business value is prompt and transcript hygiene on a high-frequency path. Scorel will routinely attach reminders to user messages, inject reminders while a turn is running, and route reminders through different provider message formats. This must be one stable product contract, not ad-hoc `<system-reminder>` strings scattered across daemon, session replay, tool-result merge paths, UI projectors, or provider adapters.
|
|
8
8
|
|
|
9
9
|
## Scope
|
|
10
10
|
|
|
11
|
-
### Reminder
|
|
11
|
+
### Structured Reminder Block
|
|
12
|
+
|
|
13
|
+
Add a protocol-level content block:
|
|
14
|
+
|
|
15
|
+
```typescript
|
|
16
|
+
type SystemReminderKind =
|
|
17
|
+
| "attachment"
|
|
18
|
+
| "time"
|
|
19
|
+
| "message_ref"
|
|
20
|
+
| "skill_listing"
|
|
21
|
+
| "skill_delta"
|
|
22
|
+
| "memory"
|
|
23
|
+
| "channel_context"
|
|
24
|
+
| "steer"
|
|
25
|
+
| "todo_nudge"
|
|
26
|
+
| "runtime_notice"
|
|
27
|
+
| "compact_summary";
|
|
28
|
+
|
|
29
|
+
type SystemReminderOrigin = "system" | "user" | "tool" | "skill";
|
|
30
|
+
|
|
31
|
+
type SystemReminderScope =
|
|
32
|
+
| "message" // travels with one persisted message whenever that message is in context
|
|
33
|
+
| "turn" // relevant to the user turn that created it
|
|
34
|
+
| "next_model_call" // runtime nudge, consumed by the next provider call
|
|
35
|
+
| "session"; // durable session context such as initial memory
|
|
36
|
+
|
|
37
|
+
type SystemReminderVisibility = "model" | "display" | "compact";
|
|
38
|
+
|
|
39
|
+
interface SystemReminderContentBlock {
|
|
40
|
+
type: "system_reminder";
|
|
41
|
+
kind: SystemReminderKind;
|
|
42
|
+
origin: SystemReminderOrigin;
|
|
43
|
+
text: string;
|
|
44
|
+
visibility: SystemReminderVisibility;
|
|
45
|
+
scope: SystemReminderScope;
|
|
46
|
+
data?: Record<string, unknown>;
|
|
47
|
+
}
|
|
48
|
+
```
|
|
12
49
|
|
|
13
|
-
|
|
50
|
+
`<system-reminder>` remains the model-facing transport envelope, but it is no longer stored or hand-written by feature code. Callers create structured reminder blocks; core/provider projection renders the envelope.
|
|
14
51
|
|
|
15
|
-
|
|
52
|
+
### Two Placement Modes
|
|
16
53
|
|
|
17
|
-
|
|
18
|
-
- Compact summary messages.
|
|
19
|
-
- Model-only metadata attached to a specific `user_message`, such as `snip.userMessageId`.
|
|
20
|
-
- Future runtime guidance that should be visible to the model but not necessarily displayed as transcript text.
|
|
54
|
+
System reminders can appear in two product situations:
|
|
21
55
|
|
|
22
|
-
|
|
56
|
+
1. **Message-attached reminders**: created together with a `user_message` and persisted in that message's `content`.
|
|
57
|
+
- Examples: current time for the submitted turn, `snip.userMessageId`, references to prior user messages, channel context that explains the submitted text.
|
|
58
|
+
- These are stable sidecars. They are created when the message is persisted, so replaying the same message later does not mutate historical content or break prompt-cache prefix stability.
|
|
23
59
|
|
|
24
|
-
|
|
25
|
-
-
|
|
26
|
-
-
|
|
60
|
+
2. **Runtime injected reminders**: appended while a turn is running or between provider calls.
|
|
61
|
+
- Examples: steer, skill delta, a nudge that the model has not used `TodoWrite`, runtime notices.
|
|
62
|
+
- These remain standalone `harness_item` events because they are independent session facts. `buildContext()` lowers them into structured reminder blocks and then places them according to provider-safe rules.
|
|
27
63
|
|
|
28
|
-
###
|
|
64
|
+
### Canonical Context And Provider Lowering
|
|
29
65
|
|
|
30
|
-
|
|
66
|
+
Scorel keeps a provider-neutral context:
|
|
31
67
|
|
|
32
|
-
|
|
33
|
-
|
|
34
|
-
|
|
35
|
-
</system-reminder>
|
|
36
|
-
```
|
|
68
|
+
- `ScorelMessage.content` may contain `system_reminder` blocks.
|
|
69
|
+
- UI/display projectors use block type and `visibility`; they must not parse `<system-reminder>` text.
|
|
70
|
+
- Provider adapters receive canonical `ScorelMessage[]` and lower `system_reminder` blocks to the provider's legal representation.
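The display-side rule above can be sketched as a filter that looks only at block type and `visibility`. This is a minimal sketch, assuming that `visibility: "model"` marks a model-only block and that message content is a flat array of blocks. `TextBlock`, `ContentBlock`, and `projectForDisplay` are hypothetical names for illustration, not Scorel's actual projector API.

```typescript
// Reminder shape follows the spec; `kind` is widened to string for brevity.
interface SystemReminderContentBlock {
  type: "system_reminder";
  kind: string;
  origin: "system" | "user" | "tool" | "skill";
  text: string;
  visibility: "model" | "display" | "compact";
  scope: "message" | "turn" | "next_model_call" | "session";
  data?: Record<string, unknown>;
}

// Hypothetical plain-text block used only in this sketch.
interface TextBlock {
  type: "text";
  text: string;
}

type ContentBlock = TextBlock | SystemReminderContentBlock;

// Keep ordinary text plus reminders that are not model-only.
// The decision uses `type` and `visibility`; reminder text is never parsed.
function projectForDisplay(content: ContentBlock[]): ContentBlock[] {
  return content.filter(
    (block) => block.type !== "system_reminder" || block.visibility !== "model",
  );
}

const messageContent: ContentBlock[] = [
  { type: "text", text: "Please fix the failing test." },
  {
    type: "system_reminder",
    kind: "message_ref",
    origin: "system",
    text: "internal reference for the model",
    visibility: "model",
    scope: "message",
  },
  {
    type: "system_reminder",
    kind: "runtime_notice",
    origin: "system",
    text: "Provider quota is running low.",
    visibility: "display",
    scope: "turn",
  },
];

const visibleBlocks = projectForDisplay(messageContent);
```

Because the filter never inspects `text`, a reminder whose body happens to contain `<system-reminder>` markup is handled the same way as any other block.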
|
|
37
71
|
|
|
38
|
-
|
|
72
|
+
Default lowering keeps the current prompt contract:
|
|
39
73
|
|
|
40
|
-
```
|
|
74
|
+
```xml
|
|
41
75
|
<system-reminder>
|
|
42
76
|
content
|
|
43
77
|
</system-reminder>
|
|
44
78
|
```
|
|
45
79
|
|
|
46
|
-
|
|
80
|
+
Provider placement rules:
|
|
81
|
+
|
|
82
|
+
- User-message sidecars are rendered inside the same user message after visible user text.
|
|
83
|
+
- Runtime reminders prefer merge-after-tool-result when a valid previous tool result exists.
|
|
84
|
+
- If a provider cannot legally merge after a tool result, fallback to a standalone user message immediately after the tool result batch.
|
|
85
|
+
- Provider-level system/developer prompt is not used for runtime reminders.
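The runtime placement rules above can be sketched as a single decision. This is a simplified illustration, assuming that "a valid previous tool result" means the last message in the context is a tool result and that provider capability is passed in as a boolean. `Message`, `ReminderBlock`, and `placeRuntimeReminder` are hypothetical names, not Scorel's actual types or helpers.

```typescript
interface TextBlock {
  type: "text";
  text: string;
}

// Trimmed reminder shape; the full block in the spec carries more fields.
interface ReminderBlock {
  type: "system_reminder";
  text: string;
}

type Block = TextBlock | ReminderBlock;

interface Message {
  role: "user" | "assistant" | "tool_result";
  content: Block[];
}

// Returns a new message list; the input list and its messages are not mutated.
function placeRuntimeReminder(
  messages: readonly Message[],
  reminder: ReminderBlock,
  canMergeAfterToolResult: boolean,
): Message[] {
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === "tool_result" && canMergeAfterToolResult) {
    // Preferred path: append the reminder to the end of the previous tool result.
    const merged: Message = { role: last.role, content: [...last.content, reminder] };
    return [...messages.slice(0, -1), merged];
  }
  // Fallback (provider cannot merge) or idle case: standalone user message.
  const standalone: Message = { role: "user", content: [reminder] };
  return [...messages, standalone];
}

const history: Message[] = [
  { role: "assistant", content: [{ type: "text", text: "Reading the file." }] },
  { role: "tool_result", content: [{ type: "text", text: "file contents" }] },
];
const steer: ReminderBlock = { type: "system_reminder", text: "Run the tests next." };

const mergedContext = placeRuntimeReminder(history, steer, true);
const fallbackContext = placeRuntimeReminder(history, steer, false);
```

Neither branch touches a provider-level system or developer prompt, which matches the last rule in the list.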
|
|
47
86
|
|
|
48
|
-
###
|
|
87
|
+
### Core Helper Surface
|
|
49
88
|
|
|
50
|
-
|
|
89
|
+
Core owns reminder construction and rendering:
|
|
51
90
|
|
|
52
|
-
- `
|
|
53
|
-
-
|
|
54
|
-
-
|
|
55
|
-
-
|
|
91
|
+
- `createSystemReminderBlock(input)`
|
|
92
|
+
- `renderSystemReminder(block | text)`
|
|
93
|
+
- `renderSystemReminderText(text)`
|
|
94
|
+
- `appendSystemReminderToToolResult(message, block)`
|
|
95
|
+
- `systemReminderMessage(block, meta?)`
|
|
56
96
|
|
|
57
|
-
|
|
97
|
+
Feature code must pass semantic fields: `kind`, `origin`, `scope`, `visibility`, `text`, optional `data`. Feature code must not write `<system-reminder>` tags.
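One possible shape for two of these helpers is sketched below. The helper names and block fields come from the spec; the signatures, the `SystemReminderInput` type, and the function bodies are assumptions made for illustration and may differ from the shipped core code.

```typescript
type SystemReminderKind =
  | "attachment" | "time" | "message_ref" | "skill_listing" | "skill_delta"
  | "memory" | "channel_context" | "steer" | "todo_nudge"
  | "runtime_notice" | "compact_summary";

interface SystemReminderContentBlock {
  type: "system_reminder";
  kind: SystemReminderKind;
  origin: "system" | "user" | "tool" | "skill";
  text: string;
  visibility: "model" | "display" | "compact";
  scope: "message" | "turn" | "next_model_call" | "session";
  data?: Record<string, unknown>;
}

// Hypothetical input type: every semantic field except the fixed discriminator.
type SystemReminderInput = Omit<SystemReminderContentBlock, "type">;

// Feature code supplies semantic fields only; the discriminator is added here.
function createSystemReminderBlock(input: SystemReminderInput): SystemReminderContentBlock {
  return { type: "system_reminder", ...input };
}

// The only place in this sketch that writes the transport envelope.
function renderSystemReminder(block: SystemReminderContentBlock | string): string {
  const text = typeof block === "string" ? block : block.text;
  return `<system-reminder>\n${text}\n</system-reminder>`;
}

// The field values chosen for this steer reminder are illustrative assumptions.
const steerBlock = createSystemReminderBlock({
  kind: "steer",
  origin: "user",
  text: "Run the tests before editing further.",
  visibility: "model",
  scope: "next_model_call",
});

const renderedSteer = renderSystemReminder(steerBlock);
```

Keeping the envelope string in one renderer is what makes the static regression check in the testing requirements practical: any other `<system-reminder>` literal in runtime code is a violation.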
|
|
58
98
|
|
|
59
|
-
|
|
99
|
+
### Existing Source Migration
|
|
60
100
|
|
|
61
|
-
|
|
101
|
+
Migrate these sources to structured reminder blocks:
|
|
102
|
+
|
|
103
|
+
- `snip.userMessageId` model-only block attached to every persisted user message.
|
|
104
|
+
- `harness_item` conversion for memory, channel context, skill listing, skill delta, steer, runtime notice, and future todo nudges.
|
|
105
|
+
- compact summary injection.
|
|
106
|
+
- GUI and WebUI transcript projection for model-only blocks.
|
|
107
|
+
|
|
108
|
+
`harness_item` remains the persistent event for standalone runtime/session injections. S0107 does not need a new event type unless the implementation proves `harness_item` cannot carry the contract.
|
|
62
109
|
|
|
63
110
|
## Not In Scope
|
|
64
111
|
|
|
65
|
-
- Changing snip
|
|
66
|
-
-
|
|
67
|
-
- Changing provider-level system prompt assembly.
|
|
68
|
-
- UI controls for browsing hidden reminders.
|
|
112
|
+
- Changing `snip` behavior from S0106.
|
|
113
|
+
- Adding UI controls for browsing hidden reminders.
|
|
69
114
|
- Backfilling or migrating old session JSONL files.
|
|
70
115
|
- Renaming `<system-reminder>` in the model-facing prompt.
|
|
116
|
+
- Moving runtime reminders into provider-level system/developer prompts.
|
|
117
|
+
- Replacing all event-type conversion with a full handler registry.
|
|
71
118
|
|
|
72
119
|
## Acceptance Criteria
|
|
73
120
|
|
|
121
|
+
- Protocol supports `system_reminder` content blocks.
|
|
74
122
|
- No daemon or feature code hand-writes `<system-reminder>` strings.
|
|
75
|
-
- `buildContext()` uses
|
|
76
|
-
-
|
|
77
|
-
-
|
|
123
|
+
- `buildContext()` uses shared core helpers for `harness_item` and compact summary conversion.
|
|
124
|
+
- `snip.userMessageId` is a message-attached `system_reminder` block with stable prompt-cache behavior across later turns.
|
|
125
|
+
- Provider adapters lower `system_reminder` blocks through the shared renderer, including reminders inside user messages and merged tool results.
|
|
126
|
+
- WebUI and GUI hide model-only reminder blocks without parsing reminder text.
|
|
78
127
|
- Existing harness visibility behavior stays intact:
|
|
79
|
-
- hidden harness items do not render as visible turns;
|
|
80
|
-
- display harness items still render as lightweight transcript evidence.
|
|
81
|
-
- Prompt-cache stability is preserved for message-attached reminders: replaying the same persisted user message in later provider calls produces the same content.
|
|
82
|
-
- Provider adapters do not own system-reminder formatting rules.
|
|
128
|
+
- hidden harness items do not render as visible transcript turns;
|
|
129
|
+
- display/compact harness items can still render as lightweight transcript evidence.
|
|
83
130
|
- `pnpm typecheck && pnpm test` passes.
|
|
84
131
|
|
|
85
132
|
## Testing Requirements
|
|
86
133
|
|
|
87
|
-
-
|
|
134
|
+
- Protocol tests for `system_reminder` content block round trip / exhaustiveness.
|
|
135
|
+
- Core session tests for:
|
|
136
|
+
- message-attached reminder blocks rendering to `<system-reminder>` in provider context;
|
|
137
|
+
- `harness_item` conversion producing structured reminder blocks;
|
|
138
|
+
- merge-after-tool-result behavior preserving tool result content;
|
|
139
|
+
- compact summary using the shared reminder renderer.
|
|
88
140
|
- Daemon embedded test proving snip's message-attached reminder remains stable across later turns.
|
|
89
|
-
-
|
|
90
|
-
-
|
|
141
|
+
- Provider adapter test proving `system_reminder` blocks are lowered to `<system-reminder>` text.
|
|
142
|
+
- WebUI and GUI projector tests proving model-only reminder blocks are hidden while display harness items remain visible.
|
|
143
|
+
- Static regression check that common runtime paths no longer hand-write `<system-reminder>` literals outside tests/docs and the shared renderer.
|
|
91
144
|
|
|
92
145
|
## Impacted Files
|
|
93
146
|
|
|
147
|
+
- `packages/protocol/src/messages.ts`
|
|
148
|
+
- `packages/protocol/src/index.test.ts`
|
|
94
149
|
- `packages/core/src/session/index.ts`
|
|
95
150
|
- `packages/core/src/session/session.test.ts`
|
|
96
|
-
- `packages/core/src/
|
|
151
|
+
- `packages/core/src/provider/pi-ai.ts`
|
|
152
|
+
- `packages/core/src/provider/pi-ai.test.ts`
|
|
97
153
|
- `packages/daemon/src/index.ts`
|
|
98
154
|
- `packages/daemon/src/embedded/embedded.test.ts`
|
|
99
155
|
- `apps/webui/lib/events/projector.ts`
|
|
@@ -106,7 +162,12 @@ For reminders attached to a specific persisted message, the model-facing block m
|
|
|
106
162
|
|
|
107
163
|
## Risks And Boundaries
|
|
108
164
|
|
|
109
|
-
- Reminder placement affects prompt-cache behavior.
|
|
110
|
-
- Tool-result merge behavior
|
|
111
|
-
-
|
|
112
|
-
-
|
|
165
|
+
- Reminder placement affects prompt-cache behavior. Do not move message-attached reminders into dynamic `buildContext()` injection.
|
|
166
|
+
- Tool-result merge behavior must preserve valid assistant tool-call / tool-result replay.
|
|
167
|
+
- Runtime reminders can be frequent. The data model must keep origin, kind, scope, and visibility explicit so future skill, time, todo, IM, and provider-specific rules do not become string parsing.
|
|
168
|
+
- UI must use explicit metadata, not text parsing.
|
|
169
|
+
- A full event handler registry remains a later refactor; S0107 should ship the stable reminder contract first.
|
|
170
|
+
|
|
171
|
+
## Status
|
|
172
|
+
|
|
173
|
+
Done.
|
|
@@ -0,0 +1,172 @@
|
|
|
1
|
+
# S0109: Scorel Run Headless Task Runner
|
|
2
|
+
|
|
3
|
+
## Goal
|
|
4
|
+
|
|
5
|
+
Add a non-interactive `scorel run` command for one-shot agent tasks, matching the headless shape of tools such as `claude -p` and `codex exec`.
|
|
6
|
+
|
|
7
|
+
The command must run through Scorel's existing embedded Host, daemon/client, runtime, tool, project registry, and JSONL session path. It must be usable by external harnesses such as Harbor / Terminal-Bench without entering the interactive `scorel chat` REPL.
|
|
8
|
+
|
|
9
|
+
## Scope
|
|
10
|
+
|
|
11
|
+
- Add `scorel run [prompt]`.
|
|
12
|
+
- Add prompt input forms:
|
|
13
|
+
- positional prompt
|
|
14
|
+
- `--prompt <text>`
|
|
15
|
+
- `--prompt-file <path>`
|
|
16
|
+
- `--stdin`
|
|
17
|
+
- Add execution options:
|
|
18
|
+
- `--cwd <dir>`
|
|
19
|
+
- `--state-dir <dir>`
|
|
20
|
+
- `--sessions-dir <dir>`
|
|
21
|
+
- `--session <id>`
|
|
22
|
+
- `--timeout-ms <ms>`
|
|
23
|
+
- `--output-format text|json|stream-json|none`
|
|
24
|
+
- `--summary <path>`
|
|
25
|
+
- `--quiet`
|
|
26
|
+
- `--model <role-or-id>`
|
|
27
|
+
- `--provider <name>`
|
|
28
|
+
- `--api <openai-completions|openai-responses|google-generative-ai|anthropic-messages>` / `--protocol <...>`
|
|
29
|
+
- `--base-url <url>` / `--baseurl <url>`
|
|
30
|
+
- `--api-key <key>` / `--apikey <key>`
|
|
31
|
+
- Reuse the same product path as `scorel chat`: embedded `ScorelHost`, `DaemonClient`, project registration, real runtime, and append-only session JSONL.
|
|
32
|
+
- Return only after the submitted user turn finishes, errors, or times out.
|
|
33
|
+
- Write an optional summary JSON containing status, session id, project id, cwd, state/sessions paths, session JSONL path, elapsed time, output format, and error details.
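A harness reading the summary file might model it roughly as follows. The spec lists what the summary contains but not its JSON keys, so every field name and status value below is hypothetical, and `parseRunSummary` checks only the two fields a harness would most likely branch on.

```typescript
// Hypothetical status values, loosely mirroring the documented exit codes.
type RunStatus = "completed" | "error" | "timeout";

// Hypothetical key names for the contents listed in the spec.
interface RunSummary {
  status: RunStatus;
  sessionId: string;
  projectId: string;
  cwd: string;
  stateDir: string;
  sessionsDir: string;
  sessionJsonlPath: string;
  elapsedMs: number;
  outputFormat: "text" | "json" | "stream-json" | "none";
  error?: { message: string };
}

// Parse the summary text and validate the fields the harness depends on.
function parseRunSummary(raw: string): RunSummary {
  const value: unknown = JSON.parse(raw);
  if (typeof value !== "object" || value === null) {
    throw new Error("summary is not a JSON object");
  }
  const record = value as Record<string, unknown>;
  const status = record["status"];
  if (status !== "completed" && status !== "error" && status !== "timeout") {
    throw new Error("summary has an unknown status");
  }
  if (typeof record["sessionJsonlPath"] !== "string") {
    throw new Error("summary is missing the session JSONL path");
  }
  // Remaining fields are trusted as-is in this sketch.
  return record as unknown as RunSummary;
}

const sampleSummary = parseRunSummary(
  JSON.stringify({
    status: "timeout",
    sessionId: "s_example",
    projectId: "p_example",
    cwd: "/workspace",
    stateDir: "/tmp/scorel-state",
    sessionsDir: "/tmp/scorel-state/sessions",
    sessionJsonlPath: "/tmp/scorel-state/sessions/s_example.jsonl",
    elapsedMs: 60000,
    outputFormat: "none",
    error: { message: "timed out" },
  }),
);
```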
|
|
34
|
+
|
|
35
|
+
## Product Boundary
|
|
36
|
+
|
|
37
|
+
This spec targets the minimum complete command contract needed for Terminal-Bench / Harbor installed-agent integration. The command must be stable enough for an external harness to:
|
|
38
|
+
|
|
39
|
+
1. provide one task instruction;
|
|
40
|
+
2. run Scorel in a specific task workspace;
|
|
41
|
+
3. isolate state and session artifacts per trial;
|
|
42
|
+
4. pin provider protocol, base URL, API key, and model from the harness;
|
|
43
|
+
5. wait for one agent turn to finish or time out;
|
|
44
|
+
6. read deterministic summary/session artifacts without parsing human-oriented stdout.
|
|
45
|
+
|
|
46
|
+
This is intentionally narrower than the full non-interactive command surface exposed by mature coding agents such as Claude Code `-p` or Codex `exec`.
|
|
47
|
+
|
|
48
|
+
Current required parity:
|
|
49
|
+
|
|
50
|
+
- one-shot prompt execution;
|
|
51
|
+
- workspace selection;
|
|
52
|
+
- model/provider selection;
|
|
53
|
+
- output format selection;
|
|
54
|
+
- timeout;
|
|
55
|
+
- stable exit codes;
|
|
56
|
+
- session artifact persistence;
|
|
57
|
+
- machine-readable summary file.
|
|
58
|
+
|
|
59
|
+
Known gaps versus Claude Code / Codex that are not required for this first Terminal-Bench integration:
|
|
60
|
+
|
|
61
|
+
- explicit permission modes and tool allow/deny lists;
|
|
62
|
+
- sandbox / approval policy flags;
|
|
63
|
+
- system prompt and append-system-prompt overrides;
|
|
64
|
+
- structured input protocol beyond plain prompt/stdin;
|
|
65
|
+
- budget and cost limits;
|
|
66
|
+
- MCP config injection;
|
|
67
|
+
- debug file / verbose diagnostic switches;
|
|
68
|
+
- partial-message streaming controls;
|
|
69
|
+
- tool-set selection;
|
|
70
|
+
- no-persistence mode;
|
|
71
|
+
- full resume/continue UX beyond explicit `--session` load-or-create behavior.
|
|
72
|
+
|
|
73
|
+
These gaps should be prioritized from real Terminal-Bench failure evidence, not copied wholesale from other CLIs.
|
|
74
|
+
|
|
75
|
+
## Command Contract
|
|
76
|
+
|
|
77
|
+
Examples:
|
|
78
|
+
|
|
79
|
+
```bash
|
|
80
|
+
scorel run "Fix the failing test and run the relevant verification command."
|
|
81
|
+
scorel run --prompt "Summarize this project" --output-format json
|
|
82
|
+
scorel run --prompt-file /tmp/instruction.txt --cwd /workspace --state-dir /tmp/scorel-state --summary /logs/agent/scorel-summary.json --output-format none
|
|
83
|
+
scorel run --prompt-file /tmp/instruction.txt --api openai-completions --base-url https://api.example.test/v1 --api-key "$API_KEY" --model gpt-5.4-mini
|
|
84
|
+
cat instruction.txt | scorel run --stdin --output-format stream-json
|
|
85
|
+
```
|
|
86
|
+
|
|
87
|
+
Prompt precedence is strict:
|
|
88
|
+
|
|
89
|
+
1. positional prompt
|
|
90
|
+
2. `--prompt`
|
|
91
|
+
3. `--prompt-file`
|
|
92
|
+
4. `--stdin`
|
|
93
|
+
|
|
94
|
+
Exactly one prompt source is allowed.
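The "exactly one source" rule could be enforced along these lines. This sketch assumes that zero sources and multiple sources are both usage errors (exit code `2`); `PromptOptions`, `PromptSource`, and `resolvePromptSource` are hypothetical names, not the CLI's actual internals.

```typescript
// Hypothetical parsed-argument shape for the four prompt inputs.
interface PromptOptions {
  positional?: string;
  prompt?: string;
  promptFile?: string;
  stdin?: boolean;
}

type PromptSource =
  | { kind: "positional"; text: string }
  | { kind: "prompt"; text: string }
  | { kind: "prompt-file"; path: string }
  | { kind: "stdin" };

type PromptResolution =
  | { ok: true; source: PromptSource }
  | { ok: false; exitCode: 2; message: string };

// Collect every source that was supplied, in the documented order,
// then require that exactly one is present.
function resolvePromptSource(opts: PromptOptions): PromptResolution {
  const candidates: PromptSource[] = [];
  if (opts.positional !== undefined) candidates.push({ kind: "positional", text: opts.positional });
  if (opts.prompt !== undefined) candidates.push({ kind: "prompt", text: opts.prompt });
  if (opts.promptFile !== undefined) candidates.push({ kind: "prompt-file", path: opts.promptFile });
  if (opts.stdin === true) candidates.push({ kind: "stdin" });

  const [only, ...rest] = candidates;
  if (only === undefined) {
    return { ok: false, exitCode: 2, message: "no prompt source given" };
  }
  if (rest.length > 0) {
    return { ok: false, exitCode: 2, message: "more than one prompt source given" };
  }
  return { ok: true, source: only };
}

const fromFile = resolvePromptSource({ promptFile: "/tmp/instruction.txt" });
const conflicting = resolvePromptSource({ prompt: "Summarize this project", stdin: true });
const missing = resolvePromptSource({});
```

Returning a result value instead of throwing keeps the exit-code decision in one place at the top of the command.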
|
|
95
|
+
|
|
96
|
+
Exit codes:
|
|
97
|
+
|
|
98
|
+
- `0`: run completed.
|
|
99
|
+
- `1`: runtime / provider / agent error.
|
|
100
|
+
- `2`: command usage or configuration error.
|
|
101
|
+
- `124`: timeout.
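On the harness side, the exit codes above map to outcomes as in the sketch below. The numeric codes are taken from the list; the outcome labels and the `classifyExitCode` name are hypothetical.

```typescript
// Hypothetical outcome labels for the documented exit codes.
type RunOutcome = "completed" | "runtime_error" | "usage_error" | "timeout" | "unknown";

// Translate a process exit code into an outcome a harness can branch on.
function classifyExitCode(code: number): RunOutcome {
  switch (code) {
    case 0:
      return "completed";
    case 1:
      return "runtime_error"; // runtime / provider / agent error
    case 2:
      return "usage_error"; // command usage or configuration error
    case 124:
      return "timeout";
    default:
      return "unknown"; // anything the spec does not define
  }
}

const timeoutOutcome = classifyExitCode(124);
```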
|
|
102
|
+
|
|
103
|
+
Output formats:
|
|
104
|
+
|
|
105
|
+
- `text`: stream assistant text deltas and tool summaries, like a compact non-interactive `scorel chat`.
|
|
106
|
+
- `json`: print one final JSON object.
|
|
107
|
+
- `stream-json`: print newline-delimited JSON events for live deltas and final status.
|
|
108
|
+
- `none`: print no stdout except unexpected lower-level output; intended for benchmark harnesses that use files and container state.
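A consumer of `stream-json` has to cope with stdout chunks that end in the middle of a line. The sketch below buffers partial lines and yields one parsed value per complete line. The spec does not define the event schema, so the sample events are hypothetical, as is the `JsonLineBuffer` name.

```typescript
// Accumulates stdout chunks and returns one parsed value per complete line.
class JsonLineBuffer {
  private pending = "";

  push(chunk: string): unknown[] {
    this.pending += chunk;
    const lines = this.pending.split("\n");
    // The last element is either "" (chunk ended on a newline) or a partial line.
    this.pending = lines.pop() ?? "";
    return lines
      .filter((line) => line.trim().length > 0)
      .map((line) => JSON.parse(line) as unknown);
  }
}

const lineBuffer = new JsonLineBuffer();

// Hypothetical events, split across two chunks in the middle of the second line.
const firstBatch = lineBuffer.push('{"type":"delta","text":"Hi"}\n{"type":"fi');
const secondBatch = lineBuffer.push('nal","status":"completed"}\n');
```

Here `firstBatch` holds only the first event; the second event appears in `secondBatch` once its closing newline arrives.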
|
|
109
|
+
|
|
110
|
+
## Not In Scope
|
|
111
|
+
|
|
112
|
+
- Harbor agent adapter.
|
|
113
|
+
- Terminal-Bench dataset or leaderboard submission.
|
|
114
|
+
- ATIF trajectory export.
|
|
115
|
+
- Background Bash / long-running command lifecycle.
|
|
116
|
+
- Permission sandbox policy.
|
|
117
|
+
- Resuming previous headless runs beyond explicit `--session` load-or-create behavior.
|
|
118
|
+
- Replacing `scorel chat`.
|
|
119
|
+
|
|
120
|
+
## Acceptance Criteria
|
|
121
|
+
|
|
122
|
+
- `scorel run --prompt ...` creates or resumes a session and submits exactly one user message.
|
|
123
|
+
- `scorel run --base-url ... --api-key ... --api ... --model ...` uses a run-local provider config without writing Scorel config files.
|
|
124
|
+
- The command exits after `DaemonClient.sendMessage()` resolves.
|
|
125
|
+
- `--cwd` controls the registered project workdir and runtime tool cwd.
|
|
126
|
+
- `--state-dir` isolates project registry and Scorel home.
|
|
127
|
+
- `--sessions-dir` controls where `{sessionId}.jsonl` is written.
|
|
128
|
+
- `--summary` writes deterministic JSON on success, runtime error, and timeout.
|
|
129
|
+
- `--output-format none` produces no normal stdout on success.
|
|
130
|
+
- `--output-format json` produces parseable final JSON.
|
|
131
|
+
- `--output-format stream-json` emits parseable JSONL progress/final events.
|
|
132
|
+
- Timeout returns exit code `124` and best-effort cancels the active session.
|
|
133
|
+
- Usage errors return exit code `2` and print a concise error.
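The timeout criterion can be sketched with `Promise.race`. In this illustration `turn` stands in for the promise returned by the send-message call and `cancel` for whatever cancels the active session; both parameters, and the `startTimer` and `runWithTimeout` names, are hypothetical rather than the CLI's actual code.

```typescript
// Creates a timer whose promise resolves to "timeout" after `ms` milliseconds.
function startTimer(ms: number): { expired: Promise<"timeout">; clear: () => void } {
  let handle: ReturnType<typeof setTimeout> | undefined;
  const expired = new Promise<"timeout">((resolve) => {
    handle = setTimeout(() => resolve("timeout"), ms);
  });
  return {
    expired,
    clear: () => {
      if (handle !== undefined) clearTimeout(handle);
    },
  };
}

// Resolves to the exit code: 0 on completion, 1 on error, 124 on timeout.
async function runWithTimeout(
  turn: Promise<void>,
  timeoutMs: number,
  cancel: () => Promise<void>,
): Promise<number> {
  const timer = startTimer(timeoutMs);
  try {
    const winner = await Promise.race([turn.then(() => "done" as const), timer.expired]);
    if (winner === "done") return 0;
    try {
      await cancel(); // best effort: a failed cancel does not change the exit code
    } catch {
      // ignored on purpose
    }
    return 124;
  } catch {
    return 1; // the turn itself rejected
  } finally {
    timer.clear(); // avoid keeping the process alive after an early finish
  }
}

const timedOutRun = runWithTimeout(new Promise<void>(() => {}), 10, async () => {});
const completedRun = runWithTimeout(Promise.resolve(), 1000, async () => {});
const failedRun = runWithTimeout(Promise.reject(new Error("boom")), 1000, async () => {});
```

Clearing the timer in `finally` matters for a one-shot command: a pending timer would otherwise delay process exit until the full timeout elapsed.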
|
|
134
|
+
|
|
135
|
+
## Testing
|
|
136
|
+
|
|
137
|
+
- Extend `apps/cli/src/index.test.ts`.
|
|
138
|
+
- Add focused tests for:
|
|
139
|
+
- prompt via `--prompt`.
|
|
140
|
+
- prompt file.
|
|
141
|
+
- stdin prompt.
|
|
142
|
+
- output format `none`.
|
|
143
|
+
- output format `json`.
|
|
144
|
+
- summary file content and session JSONL path.
|
|
145
|
+
- prompt source conflict.
|
|
146
|
+
- timeout exit code and summary.
|
|
147
|
+
|
|
148
|
+
Run:
|
|
149
|
+
|
|
150
|
+
```bash
|
|
151
|
+
pnpm --filter @scorel/app-cli test -- index
|
|
152
|
+
pnpm --filter @scorel/app-cli typecheck
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
Before completion, run the repository check:
|
|
156
|
+
|
|
157
|
+
```bash
|
|
158
|
+
pnpm typecheck && pnpm test
|
|
159
|
+
```
|
|
160
|
+
|
|
161
|
+
## Affected Paths
|
|
162
|
+
|
|
163
|
+
- `apps/cli/src/index.ts`
|
|
164
|
+
- `apps/cli/src/index.test.ts`
|
|
165
|
+
- `docs/ROADMAP.md`
|
|
166
|
+
- `docs/spec/ship/S0109-scorel-run-headless-task-runner.md`
|
|
167
|
+
|
|
168
|
+
## Risks
|
|
169
|
+
|
|
170
|
+
- Treating `scorel run` as a wrapper around REPL stdin would make completion unreliable. It must call the daemon/client request path directly.
|
|
171
|
+
- Parsing stdout in external harnesses would be fragile. Summary JSON and session JSONL are the durable artifacts.
|
|
172
|
+
- `--state-dir` and `--sessions-dir` must remain explicit to support one-task-per-container benchmark isolation.
|