@chanlerdev/scorel 0.0.6 → 0.0.8

package/docs/CHANGELOG.md CHANGED
@@ -2,6 +2,59 @@
2
2
 
3
3
  ## Unreleased
4
4
 
5
+ ## 0.0.8 - 2026-06-26
6
+
7
+ ### Highlights
8
+
9
+ - New `scorel run` command for headless, non-interactive task execution
10
+ - GUI now correctly loads memory status for the selected project on session preload
11
+ - Sessions can be attached even when provider credentials are missing
12
+
13
+ ### Changes
14
+
15
+ - Added `scorel run` command with multiple prompt sources, execution options, provider overrides, and machine-readable summary output
16
+ - GUI now fetches and displays memory status for the selected project during app initialization
17
+ - Provider settings UI no longer autosaves on credential mode change, preventing incomplete data saves
18
+ - Daemon session loading no longer requires a runtime; sessions can be attached without provider API keys configured
19
+
20
+ ### Fixes
21
+
22
+ - Fixed GUI not loading memory status for the selected project on session preload
23
+ - Fixed provider attach failure when provider config is incomplete (e.g., missing API key)
24
+
25
+ ### Verification
26
+
27
+ - Tests cover `scorel run` prompt sources, output formats, summary, timeout exit code, and provider overrides
28
+ - Unit test confirms memory status is fetched for the correct project on app mount
29
+ - Tests verify sessions can be loaded and messages sent even when provider credentials are missing
30
+
31
+ ## 0.0.7 - 2026-06-24
32
+
33
+ ### Highlights
34
+
35
+ - Introduce structured system_reminder content blocks across CLI, GUI, WebUI, and daemon, replacing ad-hoc XML strings.
36
+ - Bundle GUI runtime dependencies for self-contained execution.
37
+
38
+ ### Changes
39
+
40
+ - System reminders now use structured `system_reminder` content blocks with origin, visibility, and scope, enabling consistent handling across all interfaces.
41
+ - GUI's CLI runtime is now bundled with all dependencies, eliminating the need for node_modules.
42
+
43
+ ### Fixes
44
+
45
+ - Snip tool result no longer exposes internal span IDs or event counts to the model, keeping output concise.
46
+
47
+ ### Breaking Changes
48
+
49
+ - Protocol version incremented from 4 to 5; session headers must now carry version 5.
50
+
51
+ ### Verification
52
+
53
+ - All existing tests pass, with new tests covering system_reminder lowering, message-attached reminders, projector filtering, and bundled runtime integrity.
54
+
55
+ - Protocol version incremented to 5 with structured `system_reminder` content blocks.
56
+ - Snip tool results now return a concise model-visible confirmation while keeping internal span details out of provider context.
57
+
5
58
  ## 0.0.6 - 2026-06-23
6
59
 
7
60
  ### Highlights
package/docs/ROADMAP.md CHANGED
@@ -601,8 +601,8 @@ The official product direction for the M5 WebUI is recorded in [`S0030`](spec/ship/S0030-webui-product-
601
601
  | M9.F1.25 | [`S0103`](spec/ship/S0103-daemon-lifecycle-and-settings-resilience.md) | Distinguish the daemon lifecycle by entry point, and fix the black-screen risk when switching remote in GUI Settings | Done |
602
602
  | M9.F1.26 | [`S0105`](spec/ship/S0105-cli-update-and-gui-release.md) | Unified completion of the CLI command surface, NPM manual/automatic updates, GUI release packaging, and an incremental update framework | Done |
603
603
  | M9.F1.27 | [`S0106`](spec/ship/S0106-snip-context-control.md) | `context_control` persistent event and `snip` tool, letting the agent hide completed user turns from future LLM context projection | Done |
604
- | M9.F1.28 | [`S0107`](spec/ship/S0107-system-reminder-unification.md) | Unify system reminder persistence, construction, LLM projection, and UI visibility semantics | Planned |
605
- | M9.F1.29 | [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | GUI release bundles the same-version CLI runtime; the packaged GUI starts the local Host from the executable inside the bundle | Active |
604
+ | M9.F1.28 | [`S0107`](spec/ship/S0107-system-reminder-unification.md) | Unify system reminder persistence, construction, LLM projection, and UI visibility semantics | Done |
605
+ | M9.F1.29 | [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | GUI release bundles the same-version CLI runtime; the packaged GUI starts the local Host from the executable inside the bundle | Done |
606
606
 
607
607
  **Not in M9 Follow-up**:
608
608
 
@@ -783,8 +783,9 @@ The HTTP adapter must map existing Host use cases and must not duplicate domain logic.
783
783
  | [`S0104`](spec/ship/S0104-tool-result-artifacts.md) | Tool result artifacts for oversized Bash output | Done |
784
784
  | [`S0105`](spec/ship/S0105-cli-update-and-gui-release.md) | CLI update and GUI release | Done |
785
785
  | [`S0106`](spec/ship/S0106-snip-context-control.md) | Snip context control | Done |
786
- | [`S0107`](spec/ship/S0107-system-reminder-unification.md) | System reminder unification | Planned |
787
- | [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | GUI bundled CLI runtime | Active |
786
+ | [`S0107`](spec/ship/S0107-system-reminder-unification.md) | System reminder unification | Done |
787
+ | [`S0108`](spec/ship/S0108-gui-bundled-cli-runtime.md) | GUI bundled CLI runtime | Done |
788
+ | [`S0109`](spec/ship/S0109-scorel-run-headless-task-runner.md) | `scorel run` headless task runner | Active |
788
789
 
789
790
  ---
790
791
 
@@ -94,7 +94,7 @@ running default -> follow_up queue
94
94
 
95
95
  ## 6. Source Reminder
96
96
 
97
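- Each IM turn injects a hidden channel reminder before the user message:
97
+ Each IM turn injects a hidden `harness_item kind="channel_context"` before the user message. `buildContext()` converts it into a `system_reminder` block, and the provider adapter finally lowers it into model input text like the following: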
98
98
 
99
99
  ```xml
100
100
  <system-reminder>
@@ -340,7 +340,7 @@ interface EventTypeHandler<T extends PersistentEvent> {
340
340
  ```typescript
341
341
  type LlmAction =
342
342
  | { action: "include"; message: ScorelMessage } // include as a regular message
343
- | { action: "merge_prev"; content: string } // merge into the previous message (wrapped in <system-reminder>)
343
+ | { action: "merge_prev"; reminder: SystemReminderContentBlock } // merge into the previous tool_result
344
344
  | { action: "skip" } // excluded from the LLM context
345
345
  | { action: "barrier"; summary: ScorelMessage } // replace all messages above, inject the summary, stop traversal
346
346
  ```
@@ -350,7 +350,7 @@ type LlmAction =
350
350
  | Event type | convertToLlm | convertToDisplay |
351
351
  |---|---|---|
352
352
  | `message` (user/assistant/tool_result) | `include`: included as-is | Normal bubble |
353
- | `message` (meta.source = "steer") | `merge_prev`: merged into the previous tool_result's `<system-reminder>`; with no tool_result, `include` as a standalone user message | Inline small-text hint |
353
+ | `harness_item` (steer / runtime guidance) | `merge_prev`: merged into the previous tool_result as a `system_reminder` block; with no tool_result, sent as a standalone user message | Inline small-text hint |
354
354
  | `message` (meta.source = "followUp") | Same as steer | Inline "follow-up task" marker |
355
355
  | `rewind` | `skip` | "Rewound to here" marker |
356
356
  | `branch` | `skip` | "Switched branch" marker |
@@ -397,23 +397,40 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
397
397
 
398
398
  ---
399
399
 
400
- ## 7. `<system-reminder>`: General Harness Injection Mechanism
400
+ ## 7. `system_reminder`: General Harness Injection Mechanism
401
401
 
402
402
  ### 7.1 Purpose
403
403
 
404
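- `<system-reminder>` is the unified format the Scorel harness uses to pass side-channel information to the LLM. All system-level content that is not direct user input but must be visible to the LLM is wrapped in this tag.
404
+ `system_reminder` is the structured content block the Scorel harness uses to pass side-channel information to the LLM. All system-level content that is not direct user input but must be visible to the LLM is first expressed as a structured block; `<system-reminder>` is only the transport envelope that the provider adapter generates at the end.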
405
405
 
406
406
  ### 7.2 Use Cases
407
407
 
408
408
  | Scenario | Injected content | Injection position |
409
409
  |------|---------|---------|
410
410
  | Steer (user interjects mid-turn) | The user's guidance text | Merged into the previous tool_result |
411
- | Hook context (UserPromptSubmit, etc.) | Hook output | End of user message / tool_result |
411
+ | User message sidecar | Current turn time, references to user input, `snip.userMessageId` | `content` of the same user message |
412
+ | Hook context (UserPromptSubmit, etc.) | Hook output | user message / tool_result |
412
413
  | Memory recall | Memory content | End of tool_result |
413
414
  | System notices (timeout, quota, etc.) | Notification text | End of tool_result |
414
415
  | Channel source annotation | Originating group/channel | Inside the user message |
415
416
 
416
- ### 7.3 Format
417
+ ### 7.3 Canonical Format
418
+
419
+ ```typescript
420
+ interface SystemReminderContentBlock {
421
+ type: "system_reminder";
422
+ kind: SystemReminderKind;
423
+ origin: "system" | "user" | "tool" | "skill";
424
+ text: string;
425
+ visibility: "model" | "display" | "compact";
426
+ scope: "message" | "turn" | "next_model_call" | "session";
427
+ data?: Record<string, unknown>;
428
+ }
429
+ ```
430
+
431
+ ### 7.4 Provider lowering
432
+
433
+ The provider adapter lowers the canonical block into the current model input format. The default text envelope is:
417
434
 
418
435
  ```xml
419
436
  <system-reminder>
@@ -421,12 +438,13 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
421
438
  </system-reminder>
422
439
  ```
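The lowering step in 7.4 can be illustrated with a small sketch. It assumes a hypothetical `lowerSystemReminder` helper and a blank-line separator between content parts; neither is confirmed as Scorel's actual renderer, and `kind` is widened to `string` to keep the example short.

```typescript
// Sketch only: `lowerSystemReminder` and `lowerContent` are hypothetical names, not Scorel's API.
type SystemReminderContentBlock = {
  type: "system_reminder";
  kind: string; // SystemReminderKind in the spec; widened here for brevity
  origin: "system" | "user" | "tool" | "skill";
  text: string;
  visibility: "model" | "display" | "compact";
  scope: "message" | "turn" | "next_model_call" | "session";
  data?: Record<string, unknown>;
};
type TextBlock = { type: "text"; text: string };
type ContentBlock = TextBlock | SystemReminderContentBlock;

// Render one canonical block as the default text envelope.
function lowerSystemReminder(block: SystemReminderContentBlock): string {
  return `<system-reminder>\n${block.text}\n</system-reminder>`;
}

// Lower a whole content array to plain text, separating parts with a blank line (assumed).
function lowerContent(content: ContentBlock[]): string {
  return content
    .map((b) => (b.type === "system_reminder" ? lowerSystemReminder(b) : b.text))
    .join("\n\n");
}

const steer: SystemReminderContentBlock = {
  type: "system_reminder",
  kind: "steer",
  origin: "user",
  text: "Stop editing and run the tests",
  visibility: "model",
  scope: "next_model_call",
};

const lowered = lowerContent([{ type: "text", text: "file contents..." }, steer]);
```

Feature code only ever builds the structured block; the envelope string appears in exactly one function, so a future format change would touch one place.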
423
440
 
424
- ### 7.4 Injection Rules
441
+ ### 7.5 Injection Rules
425
442
 
426
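- - **During the tool loop**: merge into the end of the most recent tool_result content
427
- - **When there is no tool_result (idle / after the turn ends)**: send as a standalone user message (or attach it inside a user message)
443
+ - **Accompanying a user message**: added as a sidecar block in the same `user_message.content`, fixed at creation time to keep the prompt cache stable.
444
+ - **During the tool loop**: merge into the end of the most recent tool_result content; if the provider does not support this, fall back to a user message immediately after the tool result batch.
445
+ - **When there is no tool_result (idle / after the turn ends)**: send as a standalone user message.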
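The runtime placement rules above can be sketched as a pure function. `Msg`, `Block`, and `placeRuntimeReminder` are hypothetical names for illustration, and the provider capability is modelled as a plain boolean, which is an assumption.

```typescript
// Sketch only: `Msg`, `Block`, and `placeRuntimeReminder` are hypothetical names.
type Reminder = { type: "system_reminder"; kind: string; text: string };
type Block = { type: "text"; text: string } | Reminder;
type Msg = { role: "user" | "assistant" | "tool_result"; content: Block[] };

// Returns a new array and leaves the input messages untouched.
function placeRuntimeReminder(
  messages: Msg[],
  reminder: Reminder,
  providerCanMergeIntoToolResult: boolean,
): Msg[] {
  const last = messages[messages.length - 1];
  if (last !== undefined && last.role === "tool_result" && providerCanMergeIntoToolResult) {
    // Tool loop: append the block to the end of the most recent tool_result.
    const merged: Msg = { ...last, content: [...last.content, reminder] };
    return [...messages.slice(0, -1), merged];
  }
  // Idle, or the provider cannot merge: standalone user message at the end,
  // which sits right after the tool result batch when one exists.
  return [...messages, { role: "user", content: [reminder] }];
}

const history: Msg[] = [
  { role: "user", content: [{ type: "text", text: "Refactor the parser" }] },
  { role: "tool_result", content: [{ type: "text", text: "file contents..." }] },
];
const steerReminder: Reminder = {
  type: "system_reminder",
  kind: "steer",
  text: "Stop editing and run the tests",
};

const mergedHistory = placeRuntimeReminder(history, steerReminder, true);
const fallbackHistory = placeRuntimeReminder(history, steerReminder, false);
```

Message-attached sidecars are deliberately absent from this function: they are fixed when the user message is persisted, so they never go through dynamic placement.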
428
446
 
429
- ### 7.5 LLM System Prompt Declaration
447
+ ### 7.6 LLM System Prompt Declaration
430
448
 
431
449
  The LLM is told in the system prompt:
432
450
 
@@ -174,15 +174,15 @@ Steer messages are persisted as **standalone PersistentEvent** (role = "user", `meta.sour
174
174
 
175
175
  | Preceded by a tool_result | Behavior |
176
176
  |---|---|
177
- | ✅ Yes | `merge_prev`: merged into the end of the previous tool_result content, wrapped in `<system-reminder>` |
177
+ | ✅ Yes | `merge_prev`: merged into the end of the previous tool_result content as a structured `system_reminder` block |
178
178
  | ❌ No (idle state) | `include`: sent as a standalone user message |
179
179
 
180
- What the LLM ultimately sees (during the tool loop):
180
+ What the LLM ultimately sees after provider lowering (during the tool loop):
181
181
  ```
182
182
  tool_result: "File contents...\n\n<system-reminder>\nStop editing and just run the tests\n</system-reminder>"
183
183
  ```
184
184
 
185
- What the LLM ultimately sees (when idle):
185
+ What the LLM ultimately sees after provider lowering (when idle):
186
186
  ```
187
187
  user: "Stop editing and just run the tests"
188
188
  ```
@@ -198,8 +198,8 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
198
198
  messages.unshift(result.message);
199
199
  break;
200
200
  case "merge_prev":
201
- // Merge into the end of the content of the last tool_result in messages (wrapped in <system-reminder>)
202
- mergeIntoPrevToolResult(messages, result.content);
201
+ // Merge into the end of the content of the last tool_result in messages (as a system_reminder block)
202
+ mergeIntoPrevToolResult(messages, result.reminder);
203
203
  break;
204
204
  case "skip":
205
205
  break;
@@ -216,7 +216,7 @@ function buildContext(tree: SessionTree, leafId: EventId): ScorelMessage[] {
216
216
 
217
217
  LlmAction for each event type:
218
218
  - `message` (regular) → `include`
219
- - `message` (meta.source = "steer"/"followUp") → `merge_prev` (when preceded by a tool_result) or `include` (when not)
219
+ - `harness_item` / runtime guidance → `merge_prev` (when preceded by a tool_result) or a standalone user message, carrying a structured `system_reminder` block
220
220
  - `compact` → `barrier` (inject the summary, stop traversal)
221
221
  - `rewind` / `branch` / `channel_inject` / `session_info` / `custom` → `skip`
222
222
  - `custom_message` → `include`
@@ -367,7 +367,7 @@ if (estimateTokens(compactCandidates) > threshold) {
367
367
  - `rewind` / `branch` / `channel_inject` / `session_info` / `custom` → `skip` (not sent to the LLM)
368
368
  - `compact` → `barrier` (inject the summary, stop walking upward)
369
369
  - `context_control` → `filter` (exclude the specified user turn span from future LLM context)
370
- - `message` (meta.source = "steer") → `merge_prev` (merged into the previous tool_result's `<system-reminder>`)
370
+ - `harness_item` / runtime guidance → `merge_prev` or a standalone user message, carrying a structured `system_reminder` block
371
371
 
372
372
  In other words, the application layer has plenty of room to maneuver, but the LLM only ever sees what the handlers declare should be exposed.
373
373
 
@@ -56,7 +56,7 @@ Expose a lazily available `Snip` runtime tool that lets the agent request hiding
56
56
  { "userMessageId": "u_...", "reason": "optional short reason" }
57
57
  ```
58
58
 
59
- The Host validates the request, resolves the model-visible short alias to a target span, appends a `context_control` event, and returns a tool result describing what changed. The tool result is still part of the current turn; the hidden span disappears on the next context build.
59
+ The Host validates the request, resolves the model-visible short alias to a target span, appends a `context_control` event, and returns a brief model-visible confirmation. Internal span details such as `anchorUserEventId`, `throughEventId`, and hidden event count may remain in structured tool result details for diagnostics, but provider context must only receive the concise confirmation. The tool result is still part of the current turn; the hidden span disappears on the next context build.
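The split between the model-visible confirmation and the diagnostic details might look like the sketch below. `SnipToolResult`, `toProviderText`, and `hiddenEventCount` are hypothetical names, and the example values are made up; only `anchorUserEventId` and `throughEventId` come from the text above.

```typescript
// Sketch only: the result shape and helper are assumptions, not Scorel's tool result type.
interface SnipToolResult {
  confirmation: string; // concise text the model is allowed to see
  details: {
    anchorUserEventId: string;
    throughEventId: string;
    hiddenEventCount: number; // hypothetical field name for the hidden event count
  }; // kept for diagnostics, never forwarded to the provider
}

// Only the confirmation crosses into provider context.
function toProviderText(result: SnipToolResult): string {
  return result.confirmation;
}

const snipResult: SnipToolResult = {
  confirmation: "Snipped. The selected turn will be hidden from the next context build.",
  details: { anchorUserEventId: "evt_anchor", throughEventId: "evt_through", hiddenEventCount: 7 },
};

const providerText = toProviderText(snipResult);
```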
60
60
 
61
61
  The tool is session-context control, not a generic coding tool. It must be registered by the Host with access to the current lane, not by `createCodingTools()`.
62
62
 
@@ -2,98 +2,154 @@
2
2
 
3
3
  ## Goal
4
4
 
5
- Unify how Scorel represents, persists, projects, and displays system reminders.
5
+ Unify Scorel's runtime reminders as structured, model-facing context fragments.
6
6
 
7
- The business value is prompt and transcript hygiene: runtime guidance should reach the model through one stable contract, without ad-hoc `<system-reminder>` string construction scattered across daemon, session replay, tool-result merge paths, or future context-control features. UI should consistently hide or display reminder evidence based on explicit visibility, not by parsing model-facing text.
7
+ The business value is prompt and transcript hygiene on a high-frequency path. Scorel will routinely attach reminders to user messages, inject reminders while a turn is running, and route reminders through different provider message formats. This must be one stable product contract, not ad-hoc `<system-reminder>` strings scattered across daemon, session replay, tool-result merge paths, UI projectors, or provider adapters.
8
8
 
9
9
  ## Scope
10
10
 
11
- ### Reminder Source Model
11
+ ### Structured Reminder Block
12
+
13
+ Add a protocol-level content block:
14
+
15
+ ```typescript
16
+ type SystemReminderKind =
17
+ | "attachment"
18
+ | "time"
19
+ | "message_ref"
20
+ | "skill_listing"
21
+ | "skill_delta"
22
+ | "memory"
23
+ | "channel_context"
24
+ | "steer"
25
+ | "todo_nudge"
26
+ | "runtime_notice"
27
+ | "compact_summary";
28
+
29
+ type SystemReminderOrigin = "system" | "user" | "tool" | "skill";
30
+
31
+ type SystemReminderScope =
32
+ | "message" // travels with one persisted message whenever that message is in context
33
+ | "turn" // relevant to the user turn that created it
34
+ | "next_model_call" // runtime nudge, consumed by the next provider call
35
+ | "session"; // durable session context such as initial memory
36
+
37
+ type SystemReminderVisibility = "model" | "display" | "compact";
38
+
39
+ interface SystemReminderContentBlock {
40
+ type: "system_reminder";
41
+ kind: SystemReminderKind;
42
+ origin: SystemReminderOrigin;
43
+ text: string;
44
+ visibility: SystemReminderVisibility;
45
+ scope: SystemReminderScope;
46
+ data?: Record<string, unknown>;
47
+ }
48
+ ```
12
49
 
13
- Define one internal reminder representation for model-facing non-user guidance.
50
+ `<system-reminder>` remains the model-facing transport envelope, but it is no longer stored or hand-written by feature code. Callers create structured reminder blocks; core/provider projection renders the envelope.
14
51
 
15
- Current sources include:
52
+ ### Two Placement Modes
16
53
 
17
- - `harness_item` events such as memory, channel context, skill listing, skill delta, and steer.
18
- - Compact summary messages.
19
- - Model-only metadata attached to a specific `user_message`, such as `snip.userMessageId`.
20
- - Future runtime guidance that should be visible to the model but not necessarily displayed as transcript text.
54
+ System reminders can appear in two product situations:
21
55
 
22
- The new contract must preserve the existing product distinction:
56
+ 1. **Message-attached reminders**: created together with a `user_message` and persisted in that message's `content`.
57
+ - Examples: current time for the submitted turn, `snip.userMessageId`, references to prior user messages, channel context that explains the submitted text.
58
+ - These are stable sidecars. They are created when the message is persisted, so replaying the same message later does not mutate historical content or break prompt-cache prefix stability.
23
59
 
24
- - Some reminders are standalone session events (`harness_item`).
25
- - Some reminders are attached to a specific message to preserve prompt-cache prefix stability.
26
- - Some reminders can merge into a previous `tool_result`.
60
+ 2. **Runtime injected reminders**: appended while a turn is running or between provider calls.
61
+ - Examples: steer, skill delta, a nudge that the model has not used `TodoWrite`, runtime notices.
62
+ - These remain standalone `harness_item` events because they are independent session facts. `buildContext()` lowers them into structured reminder blocks and then places them according to provider-safe rules.
27
63
 
28
- ### Single Renderer
64
+ ### Canonical Context And Provider Lowering
29
65
 
30
- Move `<system-reminder>` wrapping behind a single public core helper or equivalent abstraction. Callers should provide reminder content and placement intent, not hand-write:
66
+ Scorel keeps a provider-neutral context:
31
67
 
32
- ```text
33
- <system-reminder>
34
- ...
35
- </system-reminder>
36
- ```
68
+ - `ScorelMessage.content` may contain `system_reminder` blocks.
69
+ - UI/display projectors use block type and `visibility`; they must not parse `<system-reminder>` text.
70
+ - Provider adapters receive canonical `ScorelMessage[]` and lower `system_reminder` blocks to the provider's legal representation.
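A display projector that follows these rules could be as small as the sketch below. `projectForDisplay` is a hypothetical name, and treating `visibility: "model"` as "hidden from the transcript" is an assumption drawn from the model-only wording elsewhere in this spec.

```typescript
// Sketch only: `projectForDisplay` is a hypothetical name; the visibility rule is an assumption.
type Visibility = "model" | "display" | "compact";
type ReminderBlock = { type: "system_reminder"; text: string; visibility: Visibility };
type TextBlock = { type: "text"; text: string };
type ContentBlock = TextBlock | ReminderBlock;

// Keep user-visible text and reminders marked for display; drop model-only reminders.
// The decision uses block metadata only and never inspects `<system-reminder>` text.
function projectForDisplay(content: ContentBlock[]): ContentBlock[] {
  return content.filter((b) => b.type !== "system_reminder" || b.visibility !== "model");
}

const messageContent: ContentBlock[] = [
  { type: "text", text: "Fix the failing test" },
  { type: "system_reminder", text: "model-only sidecar", visibility: "model" },
  { type: "system_reminder", text: "channel context shown as evidence", visibility: "display" },
];

const visibleBlocks = projectForDisplay(messageContent);
```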
37
71
 
38
- The renderer must keep the existing prompt contract:
72
+ Default lowering keeps the current prompt contract:
39
73
 
40
- ```text
74
+ ```xml
41
75
  <system-reminder>
42
76
  content
43
77
  </system-reminder>
44
78
  ```
45
79
 
46
- Any future format change must happen in one place.
80
+ Provider placement rules:
81
+
82
+ - User-message sidecars are rendered inside the same user message after visible user text.
83
+ - Runtime reminders prefer merge-after-tool-result when a valid previous tool result exists.
84
+ - If a provider cannot legally merge after a tool result, fallback to a standalone user message immediately after the tool result batch.
85
+ - Provider-level system/developer prompt is not used for runtime reminders.
47
86
 
48
- ### Visibility And Projection
87
+ ### Core Helper Surface
49
88
 
50
- Clarify and consolidate visibility semantics:
89
+ Core owns reminder construction and rendering:
51
90
 
52
- - `harness_item.visibility` controls whether the harness event is displayed as transcript evidence.
53
- - Message-level model-only blocks are included in LLM context but hidden from WebUI and GUI transcript projection.
54
- - Display projectors must not parse `<system-reminder>` text to decide visibility.
55
- - Provider adapters should receive already-rendered model-facing text or a normalized reminder block from core, not duplicate reminder formatting.
91
+ - `createSystemReminderBlock(input)`
92
+ - `renderSystemReminder(block | text)`
93
+ - `renderSystemReminderText(text)`
94
+ - `appendSystemReminderToToolResult(message, block)`
95
+ - `systemReminderMessage(block, meta?)`
56
96
 
57
- ### Prompt Cache Stability
97
+ Feature code must pass semantic fields: `kind`, `origin`, `scope`, `visibility`, `text`, optional `data`. Feature code must not write `<system-reminder>` tags.
58
98
 
59
- Reminder placement must not rewrite older model context on later turns.
99
+ ### Existing Source Migration
60
100
 
61
- For reminders attached to a specific persisted message, the model-facing block must be created when that message is persisted. Later `buildContext()` calls may clone or filter it, but must not mutate historical messages based on later session state.
101
+ Migrate these sources to structured reminder blocks:
102
+
103
+ - `snip.userMessageId` model-only block attached to every persisted user message.
104
+ - `harness_item` conversion for memory, channel context, skill listing, skill delta, steer, runtime notice, and future todo nudges.
105
+ - compact summary injection.
106
+ - GUI and WebUI transcript projection for model-only blocks.
107
+
108
+ `harness_item` remains the persistent event for standalone runtime/session injections. S0107 does not need a new event type unless the implementation proves `harness_item` cannot carry the contract.
62
109
 
63
110
  ## Not In Scope
64
111
 
65
- - Changing snip semantics from S0106.
66
- - Replacing `harness_item` with a new event type unless the implementation proves the existing event cannot express the contract.
67
- - Changing provider-level system prompt assembly.
68
- - UI controls for browsing hidden reminders.
112
+ - Changing `snip` behavior from S0106.
113
+ - Adding UI controls for browsing hidden reminders.
69
114
  - Backfilling or migrating old session JSONL files.
70
115
  - Renaming `<system-reminder>` in the model-facing prompt.
116
+ - Moving runtime reminders into provider-level system/developer prompts.
117
+ - Replacing all event-type conversion with a full handler registry.
71
118
 
72
119
  ## Acceptance Criteria
73
120
 
121
+ - Protocol supports `system_reminder` content blocks.
74
122
  - No daemon or feature code hand-writes `<system-reminder>` strings.
75
- - `buildContext()` uses the shared reminder renderer for `harness_item` and compact summaries.
76
- - Snip's model-only user-message id block uses the shared reminder renderer or normalized reminder block.
77
- - WebUI and GUI hide model-only message blocks without parsing reminder text.
123
+ - `buildContext()` uses shared core helpers for `harness_item` and compact summary conversion.
124
+ - `snip.userMessageId` is a message-attached `system_reminder` block with stable prompt-cache behavior across later turns.
125
+ - Provider adapters lower `system_reminder` blocks through the shared renderer, including reminders inside user messages and merged tool results.
126
+ - WebUI and GUI hide model-only reminder blocks without parsing reminder text.
78
127
  - Existing harness visibility behavior stays intact:
79
- - hidden harness items do not render as visible turns;
80
- - display harness items still render as lightweight transcript evidence.
81
- - Prompt-cache stability is preserved for message-attached reminders: replaying the same persisted user message in later provider calls produces the same content.
82
- - Provider adapters do not own system-reminder formatting rules.
128
+ - hidden harness items do not render as visible transcript turns;
129
+ - display/compact harness items can still render as lightweight transcript evidence.
83
130
  - `pnpm typecheck && pnpm test` passes.
84
131
 
85
132
  ## Testing Requirements
86
133
 
87
- - Core session tests for the shared reminder renderer and `buildContext()` conversion.
134
+ - Protocol tests for `system_reminder` content block round trip / exhaustiveness.
135
+ - Core session tests for:
136
+ - message-attached reminder blocks rendering to `<system-reminder>` in provider context;
137
+ - `harness_item` conversion producing structured reminder blocks;
138
+ - merge-after-tool-result behavior preserving tool result content;
139
+ - compact summary using the shared reminder renderer.
88
140
  - Daemon embedded test proving snip's message-attached reminder remains stable across later turns.
89
- - WebUI and GUI projector tests proving model-only blocks are hidden while display harness items remain visible.
90
- - Regression test or static check that common runtime paths no longer hand-write `<system-reminder>` literals outside the shared renderer and tests/docs.
141
+ - Provider adapter test proving `system_reminder` blocks are lowered to `<system-reminder>` text.
142
+ - WebUI and GUI projector tests proving model-only reminder blocks are hidden while display harness items remain visible.
143
+ - Static regression check that common runtime paths no longer hand-write `<system-reminder>` literals outside tests/docs and the shared renderer.
91
144
 
92
145
  ## Impacted Files
93
146
 
147
+ - `packages/protocol/src/messages.ts`
148
+ - `packages/protocol/src/index.test.ts`
94
149
  - `packages/core/src/session/index.ts`
95
150
  - `packages/core/src/session/session.test.ts`
96
- - `packages/core/src/tools/index.ts` or a new core reminder module
151
+ - `packages/core/src/provider/pi-ai.ts`
152
+ - `packages/core/src/provider/pi-ai.test.ts`
97
153
  - `packages/daemon/src/index.ts`
98
154
  - `packages/daemon/src/embedded/embedded.test.ts`
99
155
  - `apps/webui/lib/events/projector.ts`
@@ -106,7 +162,12 @@ For reminders attached to a specific persisted message, the model-facing block m
106
162
 
107
163
  ## Risks And Boundaries
108
164
 
109
- - Reminder placement affects prompt-cache behavior. A cleanup that moves snip ids from persisted user messages into later dynamic `buildContext()` injection would regress S0106.
110
- - Tool-result merge behavior is easy to break. The implementation must preserve valid assistant tool-call / tool-result replay.
111
- - UI should use explicit visibility metadata, not text parsing. Parsing `<system-reminder>` in UI would make display behavior depend on prompt formatting.
112
- - A broader content-block redesign may be attractive, but S0107 should stay focused on unifying reminder construction and projection.
165
+ - Reminder placement affects prompt-cache behavior. Do not move message-attached reminders into dynamic `buildContext()` injection.
166
+ - Tool-result merge behavior must preserve valid assistant tool-call / tool-result replay.
167
+ - Runtime reminders can be frequent. The data model must keep origin, kind, scope, and visibility explicit so future skill, time, todo, IM, and provider-specific rules do not become string parsing.
168
+ - UI must use explicit metadata, not text parsing.
169
+ - A full event handler registry remains a later refactor; S0107 should ship the stable reminder contract first.
170
+
171
+ ## Status
172
+
173
+ Done.
@@ -105,4 +105,4 @@ pnpm release patch --dry-run
105
105
 
106
106
  ## Status
107
107
 
108
- Active.
108
+ Done.
@@ -0,0 +1,172 @@
1
+ # S0109: Scorel Run Headless Task Runner
2
+
3
+ ## Goal
4
+
5
+ Add a non-interactive `scorel run` command for one-shot agent tasks, matching the headless shape of tools such as `claude -p` and `codex exec`.
6
+
7
+ The command must run through Scorel's existing embedded Host, daemon/client, runtime, tool, project registry, and JSONL session path. It must be usable by external harnesses such as Harbor / Terminal-Bench without entering the interactive `scorel chat` REPL.
8
+
9
+ ## Scope
10
+
11
+ - Add `scorel run [prompt]`.
12
+ - Add prompt input forms:
13
+ - positional prompt
14
+ - `--prompt <text>`
15
+ - `--prompt-file <path>`
16
+ - `--stdin`
17
+ - Add execution options:
18
+ - `--cwd <dir>`
19
+ - `--state-dir <dir>`
20
+ - `--sessions-dir <dir>`
21
+ - `--session <id>`
22
+ - `--timeout-ms <ms>`
23
+ - `--output-format text|json|stream-json|none`
24
+ - `--summary <path>`
25
+ - `--quiet`
26
+ - `--model <role-or-id>`
27
+ - `--provider <name>`
28
+ - `--api <openai-completions|openai-responses|google-generative-ai|anthropic-messages>` / `--protocol <...>`
29
+ - `--base-url <url>` / `--baseurl <url>`
30
+ - `--api-key <key>` / `--apikey <key>`
31
+ - Reuse the same product path as `scorel chat`: embedded `ScorelHost`, `DaemonClient`, project registration, real runtime, and append-only session JSONL.
32
+ - Return only after the submitted user turn finishes, errors, or times out.
33
+ - Write an optional summary JSON containing status, session id, project id, cwd, state/sessions paths, session JSONL path, elapsed time, output format, and error details.
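As an illustration of the summary file, the sketch below types the listed fields and derives the session JSONL path. All key names are hypothetical (the spec lists the information, not the keys), and the `/` path join assumes a POSIX-style filesystem.

```typescript
// Sketch only: field names are hypothetical, not the summary schema Scorel writes.
interface RunSummary {
  status: "completed" | "error" | "timeout";
  sessionId: string;
  projectId: string;
  cwd: string;
  stateDir: string;
  sessionsDir: string;
  sessionJsonlPath: string;
  elapsedMs: number;
  outputFormat: "text" | "json" | "stream-json" | "none";
  error?: { message: string };
}

// Derive the JSONL path from the sessions directory and session id.
function buildSummary(input: Omit<RunSummary, "sessionJsonlPath">): RunSummary {
  return { ...input, sessionJsonlPath: `${input.sessionsDir}/${input.sessionId}.jsonl` };
}

const summary = buildSummary({
  status: "timeout",
  sessionId: "s_1",
  projectId: "p_1",
  cwd: "/workspace",
  stateDir: "/tmp/scorel-state",
  sessionsDir: "/tmp/scorel-state/sessions",
  elapsedMs: 60000,
  outputFormat: "none",
  error: { message: "timed out" },
});

const summaryJson = JSON.stringify(summary, null, 2);
```

A harness would read this file instead of parsing stdout, which is the point of pairing `--summary` with `--output-format none`.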
34
+
35
+ ## Product Boundary
36
+
37
+ This spec targets the minimum complete command contract needed for Terminal-Bench / Harbor installed-agent integration. The command must be stable enough for an external harness to:
38
+
39
+ 1. provide one task instruction;
40
+ 2. run Scorel in a specific task workspace;
41
+ 3. isolate state and session artifacts per trial;
42
+ 4. pin provider protocol, base URL, API key, and model from the harness;
43
+ 5. wait for one agent turn to finish or time out;
44
+ 6. read deterministic summary/session artifacts without parsing human-oriented stdout.
45
+
46
+ This is intentionally narrower than the full non-interactive command surface exposed by mature coding agents such as Claude Code `-p` or Codex `exec`.
47
+
48
+ Current required parity:
49
+
50
+ - one-shot prompt execution;
51
+ - workspace selection;
52
+ - model/provider selection;
53
+ - output format selection;
54
+ - timeout;
55
+ - stable exit codes;
56
+ - session artifact persistence;
57
+ - machine-readable summary file.
58
+
59
+ Known gaps versus Claude Code / Codex that are not required for this first Terminal-Bench integration:
60
+
61
+ - explicit permission modes and tool allow/deny lists;
62
+ - sandbox / approval policy flags;
63
+ - system prompt and append-system-prompt overrides;
64
+ - structured input protocol beyond plain prompt/stdin;
65
+ - budget and cost limits;
66
+ - MCP config injection;
67
+ - debug file / verbose diagnostic switches;
68
+ - partial-message streaming controls;
69
+ - tool-set selection;
70
+ - no-persistence mode;
71
+ - full resume/continue UX beyond explicit `--session` load-or-create behavior.
72
+
73
+ These gaps should be prioritized from real Terminal-Bench failure evidence, not copied wholesale from other CLIs.
74
+
75
+ ## Command Contract
76
+
77
+ Examples:
78
+
79
+ ```bash
80
+ scorel run "Fix the failing test and run the relevant verification command."
81
+ scorel run --prompt "Summarize this project" --output-format json
82
+ scorel run --prompt-file /tmp/instruction.txt --cwd /workspace --state-dir /tmp/scorel-state --summary /logs/agent/scorel-summary.json --output-format none
83
+ scorel run --prompt-file /tmp/instruction.txt --api openai-completions --base-url https://api.example.test/v1 --api-key "$API_KEY" --model gpt-5.4-mini
84
+ cat instruction.txt | scorel run --stdin --output-format stream-json
85
+ ```
86
+
87
+ Prompt precedence is strict:
88
+
89
+ 1. positional prompt
90
+ 2. `--prompt`
91
+ 3. `--prompt-file`
92
+ 4. `--stdin`
93
+
94
+ Exactly one prompt source is allowed.
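The "exactly one prompt source" rule can be sketched as a small resolver that checks the sources in the listed order and rejects zero or multiple sources. Option names, `PromptSource`, and `UsageError` are hypothetical; a caller would presumably map `UsageError` to exit code `2`.

```typescript
// Sketch only: option and type names are hypothetical, not Scorel's argument parser.
type PromptOptions = {
  positional?: string;
  prompt?: string;
  promptFile?: string;
  stdin?: boolean;
};
type PromptSource =
  | { kind: "positional" | "prompt"; text: string }
  | { kind: "prompt-file"; path: string }
  | { kind: "stdin" };

class UsageError extends Error {}

function resolvePromptSource(opts: PromptOptions): PromptSource {
  // Collect sources in the documented order: positional, --prompt, --prompt-file, --stdin.
  const found: PromptSource[] = [];
  if (opts.positional !== undefined) found.push({ kind: "positional", text: opts.positional });
  if (opts.prompt !== undefined) found.push({ kind: "prompt", text: opts.prompt });
  if (opts.promptFile !== undefined) found.push({ kind: "prompt-file", path: opts.promptFile });
  if (opts.stdin === true) found.push({ kind: "stdin" });

  const [only] = found;
  if (only === undefined || found.length !== 1) {
    throw new UsageError(`expected exactly one prompt source, got ${found.length}`);
  }
  return only;
}

const fromFile = resolvePromptSource({ promptFile: "/tmp/instruction.txt" });
```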
95
+
96
+ Exit codes:
97
+
98
+ - `0`: run completed.
99
+ - `1`: runtime / provider / agent error.
100
+ - `2`: command usage or configuration error.
101
+ - `124`: timeout.
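For a harness consuming these codes, a lookup in both directions may be convenient. The outcome names below are hypothetical labels; only the numeric values come from the list above.

```typescript
// Sketch only: `RunOutcome` and the helper names are hypothetical.
type RunOutcome = "completed" | "runtime_error" | "usage_error" | "timeout";

const EXIT_CODES: Record<RunOutcome, number> = {
  completed: 0,
  runtime_error: 1, // runtime / provider / agent error
  usage_error: 2, // command usage or configuration error
  timeout: 124,
};

function exitCodeFor(outcome: RunOutcome): number {
  return EXIT_CODES[outcome];
}

// Harness side: map a process exit code back to an outcome, or undefined if unknown.
function outcomeFromExitCode(code: number): RunOutcome | undefined {
  const outcomes = Object.keys(EXIT_CODES) as RunOutcome[];
  return outcomes.find((o) => EXIT_CODES[o] === code);
}
```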
102
+
103
+ Output formats:
104
+
105
+ - `text`: stream assistant text deltas and tool summaries, like a compact non-interactive `scorel chat`.
106
+ - `json`: print one final JSON object.
107
+ - `stream-json`: print newline-delimited JSON events for live deltas and final status.
108
+ - `none`: print no stdout except unexpected lower-level output; intended for benchmark harnesses that use files and container state.
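The four formats differ mainly in how the same event stream is serialized. The sketch below assumes a hypothetical two-variant event shape; the real `stream-json` event schema is not specified here.

```typescript
// Sketch only: event shapes and `renderOutput` are hypothetical, not Scorel's wire format.
type OutputFormat = "text" | "json" | "stream-json" | "none";
type RunEvent = { type: "delta"; text: string } | { type: "final"; status: string };

function renderOutput(format: OutputFormat, events: RunEvent[]): string {
  switch (format) {
    case "text":
      // Concatenate assistant text deltas.
      return events.map((e) => (e.type === "delta" ? e.text : "")).join("");
    case "json": {
      // One final JSON object.
      const final = events.find((e) => e.type === "final");
      return JSON.stringify(final ?? { type: "final", status: "unknown" });
    }
    case "stream-json":
      // Newline-delimited JSON: one event per line.
      return events.map((e) => JSON.stringify(e)).join("\n");
    case "none":
      return "";
  }
}

const runEvents: RunEvent[] = [
  { type: "delta", text: "Hel" },
  { type: "delta", text: "lo" },
  { type: "final", status: "completed" },
];
```

A real implementation would write `stream-json` lines as events arrive rather than building one string; the batch form here only keeps the sketch testable.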
109
+
110
+ ## Not In Scope
111
+
112
+ - Harbor agent adapter.
113
+ - Terminal-Bench dataset or leaderboard submission.
114
+ - ATIF trajectory export.
115
+ - Background Bash / long-running command lifecycle.
116
+ - Permission sandbox policy.
117
+ - Resuming previous headless runs beyond explicit `--session` load-or-create behavior.
118
+ - Replacing `scorel chat`.
119
+
120
+ ## Acceptance Criteria
121
+
122
+ - `scorel run --prompt ...` creates or resumes a session and submits exactly one user message.
123
+ - `scorel run --base-url ... --api-key ... --api ... --model ...` uses a run-local provider config without writing Scorel config files.
124
+ - The command exits after `DaemonClient.sendMessage()` resolves.
125
+ - `--cwd` controls the registered project workdir and runtime tool cwd.
126
+ - `--state-dir` isolates project registry and Scorel home.
127
+ - `--sessions-dir` controls where `{sessionId}.jsonl` is written.
128
+ - `--summary` writes deterministic JSON on success, runtime error, and timeout.
129
+ - `--output-format none` produces no normal stdout on success.
130
+ - `--output-format json` produces parseable final JSON.
131
+ - `--output-format stream-json` emits parseable JSONL progress/final events.
132
+ - Timeout returns exit code `124` and best-effort cancels the active session.
133
+ - Usage errors return exit code `2` and print a concise error.
134
+
135
+ ## Testing
136
+
137
+ - Extend `apps/cli/src/index.test.ts`.
138
+ - Add focused tests for:
139
+ - prompt via `--prompt`.
140
+ - prompt file.
141
+ - stdin prompt.
142
+ - output format `none`.
143
+ - output format `json`.
144
+ - summary file content and session JSONL path.
145
+ - prompt source conflict.
146
+ - timeout exit code and summary.
147
+
148
+ Run:
149
+
150
+ ```bash
151
+ pnpm --filter @scorel/app-cli test -- index
152
+ pnpm --filter @scorel/app-cli typecheck
153
+ ```
154
+
155
+ Before completion, run the repository check:
156
+
157
+ ```bash
158
+ pnpm typecheck && pnpm test
159
+ ```
160
+
161
+ ## Affected Paths
162
+
163
+ - `apps/cli/src/index.ts`
164
+ - `apps/cli/src/index.test.ts`
165
+ - `docs/ROADMAP.md`
166
+ - `docs/spec/ship/S0109-scorel-run-headless-task-runner.md`
167
+
168
+ ## Risks
169
+
170
+ - Treating `scorel run` as a wrapper around REPL stdin would make completion unreliable. It must call the daemon/client request path directly.
171
+ - Parsing stdout in external harnesses would be fragile. Summary JSON and session JSONL are the durable artifacts.
172
+ - `--state-dir` and `--sessions-dir` must remain explicit to support one-task-per-container benchmark isolation.
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@chanlerdev/scorel",
3
- "version": "0.0.6",
3
+ "version": "0.0.8",
4
4
  "description": "Replayable, recoverable, remotely controllable AI Agent workspace.",
5
5
  "type": "module",
6
6
  "packageManager": "pnpm@11.1.2",