xitto-kernel 0.9.5 → 0.9.6
- package/CHANGELOG.md +15 -0
- package/README.md +172 -159
- package/README.zh-TW.md +281 -0
- package/package.json +6 -2
- package/src/app/server.js +3 -2
- package/src/app/web/index.html +24 -10
- package/src/kernel/compaction.js +1 -1
- package/src/kernel/goal-loop.js +1 -1
- package/src/kernel/provider.js +1 -1
package/CHANGELOG.md
CHANGED
|
@@ -1,5 +1,20 @@
|
|
|
1
1
|
# Changelog
|
|
2
2
|
|
|
3
|
+
## 0.9.6
|
|
4
|
+
|
|
5
|
+
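- **Dependency migrated to the maintained `@earendil-works/pi-ai` (fixes moderate security vulnerabilities)**: the old `@mariozechner/pi-ai` is deprecated and frozen (stopped at 0.73.1), and 0.70.6 pulls in 2 moderate vulnerabilities via `@anthropic-ai/sdk` (GHSA-p7fg-763f-g4gf).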
|
|
6
|
+
- Switched to `@earendil-works/pi-ai@^0.80.2`, with imports pointing at its `/compat` compatibility entry point; `streamSimple`/`completeSimple` keep the same signatures, so there are **zero logic changes**.
|
|
7
|
+
- Verified: `npm audit` reports 0 vulnerabilities, no deprecation warnings, 202 tests green, and live runs of `examples/live.js` (streamSimple) and `checkGoal` (completeSimple) pass.
|
|
8
|
+
- `/compat` is the official transition layer; when it is removed, a deeper migration to `createModels()` will be needed (tracked in #7).
|
|
9
|
+
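- **Wishboard: the local folder picker supports hidden folders**: adds a "Show hidden folders" toggle (`/v1/fs` accepts `hidden=1`; `node_modules` is still always excluded); the preference is stored in localStorage, and hidden folders are shown dimmed.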
|
|
10
|
+
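- **Fix: Chinese could not be typed into the in-progress follow-up (steer) / answer (needs-input) boxes**: the Enter check now includes `!isComposing` (confirming an IME candidate no longer submits by mistake); the 1.2s polling re-render is paused during composition so that rebuilding the input does not wipe unconfirmed pinyin.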
|
|
11
|
+
- **Open-source governance and presentation**:
|
|
12
|
+
- The README is now English by default (`README.md`); the Traditional Chinese version moves to `README.zh-TW.md`, and the two link to each other.
|
|
13
|
+
- Added `SECURITY.md` (private vulnerability reporting + threat model), `CODE_OF_CONDUCT.md` (Contributor Covenant 2.1), issue/PR templates, and `dependabot.yml`.
|
|
14
|
+
- CI hardening: `npm install` → `npm ci`, plus least-privilege `permissions: contents:read`.
|
|
15
|
+
- Fixed outdated test counts in the README and removed the dead `../xitto-code` link.
|
|
16
|
+
- Tests: 202/202.
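
The IME fix in this release comes down to ignoring Enter while a composition is active and not redrawing the input during composition. A minimal sketch of that idea, assuming hypothetical helpers `shouldSubmit` and `createCompositionGate` (these names are illustrative and are not taken from `index.html`); `isComposing` itself is the standard `KeyboardEvent` property:

```javascript
// Hypothetical helper: decide whether a keydown event should submit the box.
// `isComposing` is true while an IME candidate is still being confirmed.
function shouldSubmit(event) {
  return event.key === 'Enter' && !event.isComposing;
}

// Hypothetical composition tracker: polling redraws are skipped while composing,
// so the input element is not rebuilt under the user's unconfirmed text.
function createCompositionGate() {
  let composing = false;
  return {
    onCompositionStart() { composing = true; },
    onCompositionEnd() { composing = false; },
    canRedraw() { return !composing; },
  };
}
```

In a page, the gate's two handlers would be attached to the input's `compositionstart` and `compositionend` events, and the polling timer would check `canRedraw()` before rebuilding the element.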
|
|
17
|
+
|
|
3
18
|
## 0.9.5
|
|
4
19
|
|
|
5
20
|
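- **Automatic domain detection (auto-routing)**: non-technical users no longer need to know which pack to pick. The default is "🪄 Auto-detect domain": the system picks the best-fitting domain from the wish text, shows "Automatically using the 'Research' domain", and lets you override it in the dropdown.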
|
package/README.md
CHANGED
|
@@ -1,131 +1,133 @@
|
|
|
1
1
|
# xitto-kernel
|
|
2
2
|
|
|
3
|
+
**English** · [繁體中文](./README.zh-TW.md)
|
|
4
|
+
|
|
3
5
|
[](https://www.npmjs.com/package/xitto-kernel)
|
|
4
6
|
[](https://github.com/ishoplus/xitto-kernel/actions/workflows/ci.yml)
|
|
5
7
|
[](./LICENSE)
|
|
6
8
|
[](https://nodejs.org)
|
|
7
9
|
|
|
8
|
-
>
|
|
10
|
+
> A domain-agnostic agent foundation (**usable as a dependency** — your domain agent is a standalone project that imports the kernel rather than cloning it, so upgrades don't get frozen in).
|
|
9
11
|
|
|
10
|
-
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
12
|
+
Takes `xitto-code`, a complete coding agent, and abstracts it into a **domain-agnostic agent kernel** + pluggable **DomainPacks**.
|
|
13
|
+
The same kernel (multi-step tool loop, guard chain, permissions/sandbox, provider abstraction) can host an agent for any domain;
|
|
14
|
+
"coding" is just one DomainPack — swap it for "data query", "knowledge base", "support/ops", etc. by replacing the pack.
|
|
15
|
+
The interactive CLI lives in the app layer (thin); a richer TUI or other frontends can be another app consuming the same kernel events.
|
|
14
16
|
|
|
15
|
-
##
|
|
17
|
+
## In one line
|
|
16
18
|
|
|
17
|
-
> **kernel
|
|
19
|
+
> **The kernel provides "how to run an agent"; a DomainPack provides "what this agent can do, and what it guards."**
|
|
18
20
|
|
|
19
|
-
##
|
|
21
|
+
## Where the design comes from
|
|
20
22
|
|
|
21
|
-
xitto-code
|
|
22
|
-
`read-before-edit
|
|
23
|
+
After scanning xitto-code, roughly **80% was already a domain-agnostic kernel**; only three things were truly coupled to coding:
|
|
24
|
+
`read-before-edit`, `lint/type auto-verification`, and `git integration`. This design peels those three out of the kernel and into the pack's responsibilities.
|
|
23
25
|
|
|
24
|
-
##
|
|
26
|
+
## Quick start
|
|
25
27
|
|
|
26
|
-
|
|
28
|
+
**Prerequisite**: Node.js ≥ 20
|
|
27
29
|
|
|
28
|
-
**1.
|
|
30
|
+
**1. Install** (published on npm)
|
|
29
31
|
```bash
|
|
30
|
-
npm install -g xitto-kernel #
|
|
32
|
+
npm install -g xitto-kernel # global command: xitto-kernel
|
|
31
33
|
```
|
|
32
|
-
>
|
|
34
|
+
> Developing this repo: `cd xitto-kernel && npm install && npm link`.
|
|
33
35
|
|
|
34
|
-
**2.
|
|
36
|
+
**2. First-time setup** (interactive guide, generates `~/.xitto-code/providers.json`)
|
|
35
37
|
```bash
|
|
36
38
|
xitto-kernel init
|
|
37
39
|
```
|
|
38
|
-
|
|
39
|
-
|
|
40
|
-
|
|
40
|
+
Walks you through picking a provider (MiniMax / Anthropic / OpenAI / DeepSeek / custom) → filling in the model →
|
|
41
|
+
setting the API key (recommended: reference an env var via `${NAME}` so the key never lands on disk). Existing xitto-code users can reuse their config and skip this step.
|
|
42
|
+
(Starting without config prompts you to run `init`; existing config is never overwritten — only `--force` merges in new providers.)
|
|
41
43
|
|
|
42
|
-
**3.
|
|
44
|
+
**3. Run a built-in pack (interactive CLI)**
|
|
43
45
|
```bash
|
|
44
|
-
xitto-kernel # coding agent
|
|
45
|
-
xitto-kernel --tui #
|
|
46
|
-
xitto-kernel --pack notes #
|
|
46
|
+
xitto-kernel # coding agent (read/write files, run commands)
|
|
47
|
+
xitto-kernel --tui # full Ink TUI (persistent status bar, streaming, Esc to interrupt, tool cards ⏺/⎿, colored diffs, todos ☑; needs a real terminal)
|
|
48
|
+
xitto-kernel --pack notes # notes / knowledge-base agent
|
|
47
49
|
xitto-kernel --pack data-query
|
|
48
|
-
xitto-kernel --sandbox #
|
|
50
|
+
xitto-kernel --sandbox # open the Seatbelt sandbox on startup
|
|
49
51
|
```
|
|
50
52
|
|
|
51
|
-
**CLI
|
|
53
|
+
**Inside the CLI**: just type what you want (the model calls tools itself); commands `/help` `/goal <goal>` `/sandbox` `/plan` `/undo` `/tools` `/trust` `/memory` `/sessions` `/resume` `/exit`; `Ctrl+C` interrupts the current turn, press again while idle to exit.
|
|
52
54
|
|
|
53
|
-
|
|
55
|
+
**Progressive trust (accumulates as you go)**: mutating/dangerous tools ask for confirmation before running; when you approve you can choose `[a]` to trust the whole tool, or `[c]` to trust only "this command-signature class" (e.g. `git status`, `npm test` — fine-grained; `npm install` still asks). Choices are **persisted to `.xitto-kernel/<pack>/allow.json` and remembered across sessions**, so next time the same class auto-passes and is marked "✓ trusted". `/trust` to view, `/trust forget <item>` to revoke, `/trust clear` to wipe. Cautious at first, smoother as you go — dangerous commands are never written into trust, every one is gated each time.
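
One way to picture "command-signature class" trust is as a lookup keyed on the leading tokens of a command. The sketch below is an assumption about the shape of that idea, not the kernel's actual `allow.js`: `signatureOf`, `createTrustStore`, and the two-token rule are all hypothetical.

```javascript
// Hypothetical: reduce a shell command to a coarse "signature" (first two tokens),
// so `npm test --watch` and `npm test` share the class `npm test`.
function signatureOf(command) {
  return command.trim().split(/\s+/).slice(0, 2).join(' ');
}

// Hypothetical in-memory trust store; the README says the real one persists
// to `.xitto-kernel/<pack>/allow.json`.
function createTrustStore() {
  const tools = new Set();       // tools trusted as a whole ([a])
  const signatures = new Set();  // trusted command-signature classes ([c])
  return {
    trustTool(name) { tools.add(name); },
    trustSignature(command) { signatures.add(signatureOf(command)); },
    forget(item) { tools.delete(item); signatures.delete(item); },
    // Dangerous commands are never auto-approved, whatever has been trusted.
    isTrusted(toolName, command, isDangerous) {
      if (isDangerous) return false;
      return tools.has(toolName) || signatures.has(signatureOf(command));
    },
  };
}

const store = createTrustStore();
store.trustSignature('npm test');
console.log(store.isTrusted('bash', 'npm test --watch', false));     // true
console.log(store.isTrusted('bash', 'npm install left-pad', false)); // false
```

Keying on a signature rather than the full command line is what keeps the trust fine-grained: approving `npm test` says nothing about `npm install`.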
|
|
54
56
|
|
|
55
|
-
|
|
57
|
+
**Sedimenting experience while running (project playbook)**: when the agent figures out "how this project does things" (build/test/deploy commands, conventions, required steps, pitfalls and fixes), it uses `playbook_update` to record it by topic into `.xitto-kernel/<pack>/playbook.md` (same topic overwrites — naturally deduplicated); **the next session auto-loads it into the system prompt, so it doesn't have to rediscover everything**. Because the file is bound to cwd, the playbook naturally only applies to this project. `/playbook` to view, `/playbook forget <topic>`, `/playbook clear`. Division of labor: `memory` stores facts/preferences/decisions (flat), `playbook` stores repeatable procedural knowledge (by topic).
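
The "same topic overwrites" behaviour can be sketched as a keyed upsert that is rendered back to markdown. This is an illustration only: `upsertTopic`, `renderPlaybook`, and the `## topic` layout are assumptions, not the actual format of `playbook.md`.

```javascript
// Hypothetical: the playbook as a Map from topic to note; writing an existing
// topic replaces its note, so entries deduplicate by construction.
function upsertTopic(playbook, topic, note) {
  const next = new Map(playbook);
  next.set(topic.trim().toLowerCase(), note.trim());
  return next;
}

// Hypothetical markdown layout: one `## topic` heading per entry.
function renderPlaybook(playbook) {
  return [...playbook]
    .map(([topic, note]) => `## ${topic}\n${note}\n`)
    .join('\n');
}

let playbook = new Map();
playbook = upsertTopic(playbook, 'test', 'Run `npm test`.');
playbook = upsertTopic(playbook, 'Test', 'Run `npm ci && npm test`.');
console.log(playbook.size); // 1
```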
|
|
56
58
|
|
|
57
|
-
|
|
59
|
+
**Self-crystallizing skills (crystallization layer, must be verified)**: once it works out a repeatable procedure/SOP, the agent uses `skill_save` to **write it as a new skill** (markdown) into `.xitto-kernel/<pack>/skills/`. **Policy gate: every new skill must include (1) a clear `goal` and (2) one `verify` command — verify actually runs in the sandbox, and the skill only lands if it passes (exit 0)**, otherwise it's rejected and the output is returned for the agent to fix (dangerous commands are always blocked). This ensures what crystallizes is "verified success", not "claimed success". **The skill is usable immediately this session via `skill` loaded by name (hot scan), and future sessions list it automatically under "available skills"** (progressive disclosure: the prompt only lists name + summary, loading the full text on demand). **Self-maintenance**: loading records usage (`usedCount`); `skills_check`/`/skills check` re-runs each skill's stored verify to detect **drift** — ones invalidated by project changes surface as `⚠ stale` for you to fix or delete, keeping the skill library trustworthy (stale ones are flagged in the prompt so they aren't misused). `/skills` to view (incl. usage/staleness), `/skills forget <name>` to remove. Division of labor: `playbook` is project-factual know-how, `skill` is a cross-task reusable and **verified** procedure. This layer lets the agent **grow its own skill library like Voyager** — but every entry is verified, self-checks, and runs inside the kernel's sandbox + progressive-trust governance.
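
The policy gate described above (goal + verify, and the skill only lands on exit 0) can be sketched as follows. Everything here is hypothetical: `saveSkillIfVerified`, `checkSkills`, the skill object's fields, and the injected `runVerify` function, which stands in for "run the command inside the sandbox" and is assumed to return `{ exitCode, output }`.

```javascript
// Hypothetical policy gate: a skill is stored only if it has a goal, a verify
// command, and that command exits with code 0.
function saveSkillIfVerified(library, skill, runVerify) {
  if (!skill.goal || !skill.verify) {
    return { saved: false, reason: 'skill needs both a goal and a verify command' };
  }
  const { exitCode, output } = runVerify(skill.verify);
  if (exitCode !== 0) {
    // Rejected: hand the output back so the agent can fix the procedure.
    return { saved: false, reason: 'verify failed', output };
  }
  library.set(skill.name, { ...skill, usedCount: 0, stale: false });
  return { saved: true };
}

// Hypothetical drift check: re-run each stored verify and flag failures as stale.
function checkSkills(library, runVerify) {
  for (const skill of library.values()) {
    skill.stale = runVerify(skill.verify).exitCode !== 0;
  }
  return [...library.values()].filter((s) => s.stale).map((s) => s.name);
}
```

Injecting `runVerify` keeps the gate itself free of process handling, so the same function can be exercised with a stub in tests and with a sandboxed runner in real use.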
|
|
58
60
|
|
|
59
|
-
|
|
61
|
+
**Episodic memory + relevance recall (episodic layer)**: after finishing a valuable task, the agent uses `episode_record` to log an episode (summary + tags + success/failure) into `.xitto-kernel/<pack>/episodes.jsonl`. **The key is recall, not storage**: on a similar task, the kernel **automatically** injects the top-K past episodes most relevant to the current input (relevance score: keyword / Chinese bigram overlap + tag weighting + slight recency bias) into that turn's prompt — **only the few most relevant, not the whole dump** (to avoid diluting context or misleading). You can also recall actively via `episode_recall`. Logging does Jaccard dedup to avoid bloat. `/episodes` lists recent, `/episodes <keyword>` tests recall, `/episodes clear`. This directly solves the real bottleneck of every memory system — **recalling the right few** (zero-dependency, explainable scoring, not black-box embeddings).
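
A relevance score of the kind described (bigram overlap, tag weighting, slight recency bias) might look like the sketch below. The weights `0.2` and `0.05`, the episode fields `summary`, `tags`, and `at`, and the function names are all assumptions chosen for illustration; the kernel's real scoring may differ.

```javascript
// Character bigrams work for both English and Chinese text (no tokenizer needed).
function bigrams(text) {
  const s = text.toLowerCase().replace(/\s+/g, '');
  const out = new Set();
  for (let i = 0; i < s.length - 1; i++) out.add(s.slice(i, i + 2));
  return out;
}

// Jaccard similarity: |A ∩ B| / |A ∪ B|.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  for (const x of a) if (b.has(x)) shared++;
  return shared / (a.size + b.size - shared);
}

// Hypothetical weights: text overlap + a bonus per matching tag + a small recency bias.
function scoreEpisode(query, episode, now = Date.now()) {
  const overlap = jaccard(bigrams(query), bigrams(episode.summary));
  const q = query.toLowerCase();
  const tagHits = episode.tags.filter((t) => q.includes(t.toLowerCase())).length;
  const ageDays = (now - episode.at) / 86_400_000;
  return overlap + 0.2 * tagHits + 0.05 / (1 + ageDays);
}

// Return only the top-K most relevant episodes, not the whole log.
function recall(query, episodes, k = 3, now = Date.now()) {
  return episodes
    .map((e) => ({ episode: e, score: scoreEpisode(query, e, now) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((r) => r.episode);
}
```

Because every term in the score is a plain count or ratio, a surprising recall can be explained by printing the three components, which is the trade-off the paragraph above makes against embeddings.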
|
|
60
62
|
|
|
61
|
-
|
|
63
|
+
**Automatic fact extraction (fact layer)**: after each turn, the kernel uses one lightweight LLM call to **automatically** extract "persistent facts worth remembering across sessions" (preferences, identity, long-term decisions, stable settings) into `memory`, so it no longer relies only on the agent voluntarily calling `memory_save`. One-off task details and small talk are skipped (that's the episodic layer's job), and already-known facts are filtered out as duplicates. **Non-blocking**: the extraction is exposed as the `memoryExtraction` promise returned by `runTurn`, so it doesn't stall the reply. Toggle it via `config.autoExtractMemory` (on by default in the CLI); `api.extractMemory()` triggers it manually. Mirrors xitto's extractMemory.
|
|
62
64
|
|
|
63
|
-
###
|
|
65
|
+
### Sedimenting experience: all five layers
|
|
64
66
|
|
|
65
|
-
agent
|
|
67
|
+
The agent accumulates experience automatically while running, and every layer is governed:
|
|
66
68
|
|
|
67
|
-
|
|
|
69
|
+
| Layer | What it sediments | Mechanism |
|
|
68
70
|
|---|---|---|
|
|
69
|
-
|
|
|
70
|
-
|
|
|
71
|
-
|
|
|
72
|
-
|
|
|
73
|
-
|
|
|
71
|
+
| Reflex layer | What's safe | progressive trust (per-pattern, across sessions) |
|
|
72
|
+
| Fact layer | Things to remember | per-turn auto-extraction of persistent facts into memory |
|
|
73
|
+
| Procedure layer | How to do this project | playbook (by topic, auto-injected) |
|
|
74
|
+
| Episodic layer | What it has done | episodes + **relevance recall** (inject only the most relevant few) |
|
|
75
|
+
| Crystallization layer | Reusable procedures | self-written skills (must verify + self-check for drift) |
|
|
74
76
|
|
|
75
|
-
|
|
77
|
+
**General autonomous agent (give a goal, it finishes it itself)**
|
|
76
78
|
```bash
|
|
77
|
-
xitto-kernel --pack general --yes --goal "
|
|
79
|
+
xitto-kernel --pack general --yes --goal "Fetch a summary of example.com and write it in Traditional Chinese into summary.txt"
|
|
78
80
|
```
|
|
79
|
-
`general` pack
|
|
81
|
+
The `general` pack (files/shell/web_fetch) + the kernel's **goal loop** (repeated runTurn + LLM self-verification until done / no progress / limit). In interactive mode use `/goal <goal>`.
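
The goal loop's three exits (done, no progress, limit) can be sketched with injected stand-ins. This is not the code in `goal-loop.js`: the signatures `runTurn(goal, round) -> { text }` and `checkGoal(goal, text) -> { done }`, the `maxRounds` option, and the "identical output twice" test for no progress are assumptions made for the example.

```javascript
// Hypothetical goal loop: run a turn, let a verifier judge the result, and stop
// when the goal is met, when nothing changes, or when the round limit is hit.
async function runGoalLoop(goal, { runTurn, checkGoal, maxRounds = 5 }) {
  let lastText = null;
  for (let round = 1; round <= maxRounds; round++) {
    const { text } = await runTurn(goal, round);
    const { done } = await checkGoal(goal, text);
    if (done) return { done: true, rounds: round, reason: 'goal met' };
    // Identical output twice in a row is treated as "no progress".
    if (text === lastText) return { done: false, rounds: round, reason: 'no progress' };
    lastText = text;
  }
  return { done: false, rounds: maxRounds, reason: 'round limit' };
}
```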
|
|
80
82
|
|
|
81
|
-
|
|
83
|
+
**Outcome-oriented: conversation is just the process, the deliverable is the product**
|
|
82
84
|
|
|
83
|
-
|
|
85
|
+
For non-technical users, what they really want isn't "chatting with an AI" — it's "get it done, give me the result." `api.runOutcome(goal)` runs the goal loop and returns not a conversation but a **deliverable**:
|
|
84
86
|
```js
|
|
85
|
-
const o = await kernel.runOutcome('
|
|
86
|
-
// → { done, summary
|
|
87
|
+
const o = await kernel.runOutcome('Create greet.js and write an example to verify it');
|
|
88
|
+
// → { done, summary (what it did), artifacts: { created:[...], modified:[...] }, rounds }
|
|
87
89
|
```
|
|
88
|
-
`--goal`
|
|
90
|
+
Both `--goal` and the server's `POST /v1/tasks` (mode=goal) return deliverables — **the files produced/changed** (diffing the working directory before/after, catching even what bash wrote) + a summary + whether the goal was met. Conversation is demoted to process; the result (files/completion) is put front and center. Background-task webhooks also carry `artifacts`.
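
Detecting deliverables by diffing the working directory before and after a run can be sketched as a comparison of two snapshots. The snapshot shape (a `Map` from relative path to a content fingerprint such as a hash) and the name `diffSnapshots` are assumptions for illustration; how the kernel actually fingerprints files is not stated here.

```javascript
// Hypothetical: a snapshot maps relative path -> content fingerprint.
// Comparing two snapshots is tool-agnostic, so files written by a shell
// command are caught the same way as files written by an edit tool.
function diffSnapshots(before, after) {
  const created = [];
  const modified = [];
  for (const [path, fingerprint] of after) {
    if (!before.has(path)) created.push(path);
    else if (before.get(path) !== fingerprint) modified.push(path);
  }
  return { created: created.sort(), modified: modified.sort() };
}

const before = new Map([['README.md', 'a1']]);
const after = new Map([['README.md', 'b2'], ['greet.js', 'c3']]);
console.log(diffSnapshots(before, after));
// { created: [ 'greet.js' ], modified: [ 'README.md' ] }
```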
|
|
89
91
|
|
|
90
|
-
|
|
91
|
-
- **CLI
|
|
92
|
-
-
|
|
92
|
+
**Clarification channel (interrupts you only when it truly must)**: the risk of autonomous delivery is "autonomously going wrong." The `ask_user` tool lets the agent pause and ask when **key information is missing and can't be reasonably inferred** — rather than blindly guessing or constantly interrupting (the prompt explicitly guides: if a reasonable default works, don't ask). The app injects `config.askUser` to decide the form of "asking":
|
|
93
|
+
- **CLI**: inline question, you type the answer, the agent continues
|
|
94
|
+
- **Background task**: the task moves to `needs-input` state and parks the question → you `POST /v1/tasks/:id/answer` → the pause is released and it continues (you can answer hours later, fully asynchronous)
|
|
93
95
|
|
|
94
|
-
|
|
96
|
+
In practice: given "create a config file but I haven't decided the filename/content yet" → the agent doesn't guess, pauses to ask for the filename and content → only delivers the correct `app.config.json` after you answer. This makes "wish → deliver" both autonomous and in control.
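
The background-task form of the clarification channel (park the question, resume on answer) can be sketched with a pending promise. `createTask`, its fields, and the `answer` method are hypothetical stand-ins; only the `needs-input` state name and the `POST /v1/tasks/:id/answer` route come from the text above.

```javascript
// Hypothetical background-task wrapper: `askUser` parks a question and returns
// a promise that stays pending until `answer` is called.
function createTask() {
  const task = { state: 'running', question: null };
  let resume = null;
  return {
    task,
    askUser(question) {
      task.state = 'needs-input';
      task.question = question;
      return new Promise((resolve) => { resume = resolve; });
    },
    // Stand-in for the handler behind `POST /v1/tasks/:id/answer`.
    answer(text) {
      if (task.state !== 'needs-input') return false;
      task.state = 'running';
      task.question = null;
      resume(text);
      return true;
    },
  };
}
```

Because the agent is simply awaiting a promise, the answer can arrive seconds or hours later without the task needing any polling logic of its own.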
|
|
95
97
|
|
|
96
|
-
**🪄
|
|
98
|
+
**🪄 Wishboard web UI (for non-technical users: open a browser and go)**
|
|
97
99
|
```bash
|
|
98
|
-
XITTO_SERVER_TOKEN=secret npm run serve #
|
|
99
|
-
#
|
|
100
|
-
npm run serve:local # = LOCAL=1 SANDBOX=off,token
|
|
100
|
+
XITTO_SERVER_TOKEN=secret npm run serve # then open http://localhost:8787/ in a browser
|
|
101
|
+
# Local in-place mode (pick a real folder, edit files in place, sandbox off):
|
|
102
|
+
npm run serve:local # = LOCAL=1 SANDBOX=off, token defaults to secret (override with XITTO_SERVER_TOKEN)
|
|
101
103
|
```
|
|
102
|
-
|
|
103
|
-
-
|
|
104
|
-
-
|
|
105
|
-
-
|
|
106
|
-
-
|
|
107
|
-
-
|
|
108
|
-
-
|
|
109
|
-
-
|
|
110
|
-
-
|
|
111
|
-
-
|
|
104
|
+
No terminal, no touching keys (managed server-side). The interface centers on **results**, not chat:
|
|
105
|
+
- **Wish**: type one line of "what you want done" → submit (runs the goal loop in the background)
|
|
106
|
+
- **In progress**: **live progress + proof of life** — a heartbeat clock ticking "elapsed Ns" every second, the current phase (thinking / acting / verifying), the agent's current **thinking text** (💭), tool actions translated into plain language, the round number + action count. You can see what it's thinking and doing
|
|
107
|
+
- **Todo checklist**: when the agent plans a multi-step task with `todo_write`, it shows a ☐/◐/☑ list, turning "unknown duration" into "visible remaining steps" (à la Claude Code)
|
|
108
|
+
- **Stop anytime**: every in-progress task has a "Stop" button → `POST /v1/tasks/:id/cancel` (aborts the running agent). Control stays with the user, reducing the anxiety of "starting something you can't control"
|
|
109
|
+
- **Expand the process**: quiet by default (just progress and deliverables); want details, click "Expand process" → full step cards (read/edit/run, in plain language) + **colored diffs of edits** (green +/red -). One screen serves both "just give me the result" and "I want to see the details" (à la Claude Code's ⏺/⎿ + ctrl+r expand)
|
|
110
|
+
- **Needs your answer**: when the agent pauses to ask, a question + answer box pops up (clarification channel)
|
|
111
|
+
- **Collect deliverables**: on completion it shows a summary + **the files produced**, click a filename to view its content directly (`GET /v1/tasks/:id/file`, path-traversal protected)
|
|
112
|
+
- **Continue / adjust (iteration with context)**: each deliverable has "↳ Continue / adjust this result" — type one line of what to change / dig deeper into, submitting a **follow-up task** that **continues this conversation (sessionId) + the same workspace**. The agent has both "the files + the discussion and reasoning at the time", not just the files. By default each wish is a clean new conversation (no bloat); clicking "Continue" picks up that thread (like ChatGPT starting a new chat vs continuing one). History marks continuation chains with `↳`
|
|
113
|
+
- **Deliverable history**: a list of past wishes (the wish + status), not a chat thread
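
The path-traversal protection mentioned for `GET /v1/tasks/:id/file` can be illustrated with a dependency-free check. This is a sketch of the general technique, not the server's code: `safeRelativePath` is a hypothetical name, and a real Node server would more likely resolve the path with `node:path` and compare it against the workspace root.

```javascript
// Hypothetical guard: walk a requested relative path segment by segment and
// reject it if it is absolute or ever climbs above the workspace root.
// Returns the normalized relative path, or null when the request is rejected.
function safeRelativePath(requested) {
  if (typeof requested !== 'string' || requested === '') return null;
  if (requested.includes('\0')) return null;
  const parts = requested.replace(/\\/g, '/').split('/');
  if (parts[0] === '' || /^[A-Za-z]:$/.test(parts[0])) return null; // absolute path
  const stack = [];
  for (const part of parts) {
    if (part === '' || part === '.') continue;
    if (part === '..') {
      if (stack.length === 0) return null; // would escape the root
      stack.pop();
    } else {
      stack.push(part);
    }
  }
  return stack.length > 0 ? stack.join('/') : null;
}

console.log(safeRelativePath('out/summary.txt'));  // out/summary.txt
console.log(safeRelativePath('../../etc/passwd')); // null
console.log(safeRelativePath('a/../../secret'));   // null
```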
|
|
112
114
|
|
|
113
|
-
|
|
115
|
+
**Single-page layout (no tabs, everything at a glance)**: a **wish input** at the top + a left column (**deliverable history** + **📂 file browser**, each scrolling internally) + a main area (shared by **current task / progress / deliverables / file preview**). No tab switching — submitting a task, viewing history, browsing workspace files, and previewing content all on one page. The file browser navigates **level by level** (like a file explorer, not recursively flattened all at once), and clicking any file (deliverable or workspace) previews it in the main area. Container is 1180px, narrow screens (≤860px) collapse to a single column automatically. Deliberately kept lightweight (not an IDE).
|
|
114
116
|
|
|
115
|
-
|
|
117
|
+
**Persistent workspace (relationships between deliverables)**: each deliverable is an **independent conversation** (doesn't continue the previous one, avoiding context bloat), but they **share one persistent workspace** (`.xitto-server/ws/<workspace>`, default `default`) — so ① **files persist**, and later tasks can build on earlier results ("translate the plan.md I made last time into English"); ② **the five experience layers accumulate across deliverables** (preferences/skills/episodes/trust) — it **understands you better the more you use it**, no longer a stranger starting from scratch each time. `workspace` can be specified at POST time (one per user for multi-user); the web UI has a "Project" dropdown to switch, and each deliverable card marks its `📁 owning workspace`.
|
|
116
118
|
|
|
117
|
-
|
|
119
|
+
**Local in-place mode (edit a real folder you pick, like Claude Code)**: with `XITTO_SERVER_LOCAL=1`, the web UI gains a "**📁 Pick folder**" button — **click your way** from the home directory into your real folder and select it (no typing paths; the browser can't get absolute paths, so the local server lists folders), or "New project" by pasting an absolute path directly. The task then **edits the files in that folder in place** (no separate isolated copy), and the workbench lists it too. This bridges two models — "Wishboard (isolated, serving non-technical users)" and "Claude Code (in-place, editing your existing codebase)": **local self-use, want in-place → give a path; isolated/hosted → give a name**. **Safety**: absolute paths are only honored in `local` mode; **in hosted mode an absolute path is sanitized into a managed workspace and cannot escape to arbitrary host paths**.
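
The safety rule above (absolute paths honored only in local mode, otherwise sanitized into a managed workspace) might be sketched like this. `resolveWorkspace`, its options, and the exact sanitizing rules are assumptions; only the `.xitto-server/ws/<workspace>` layout and the `default` name come from the README.

```javascript
// Hypothetical: turn a user-supplied workspace value into a directory.
// In local mode an absolute path is used in place; in hosted mode every value
// is reduced to a safe name under the managed root.
function resolveWorkspace(value, { local = false, managedRoot = '.xitto-server/ws' } = {}) {
  const isAbsolute = value.startsWith('/') || /^[A-Za-z]:[\\/]/.test(value);
  if (local && isAbsolute) return { inPlace: true, dir: value };
  const name = value
    .replace(/[^A-Za-z0-9_-]+/g, '-') // drop separators, dots, and anything unusual
    .replace(/^-+|-+$/g, '')
    .slice(0, 64) || 'default';
  return { inPlace: false, dir: `${managedRoot}/${name}` };
}

console.log(resolveWorkspace('/home/me/project', { local: true }).dir); // /home/me/project
console.log(resolveWorkspace('/etc/passwd').dir);                       // .xitto-server/ws/etc-passwd
console.log(resolveWorkspace('../../secret').dir);                      // .xitto-server/ws/secret
```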
|
|
118
120
|
|
|
119
|
-
|
|
121
|
+
**History survives restarts (persistence)**: the task list is persisted to `.xitto-server/tasks/` and conversation sessions to `.xitto-server/sessions/`, reloaded on startup — so **after a restart, deliverable history shows automatically and old deliverables can still be "continued/adjusted"** (the conversation context is there too). Tasks still running/awaiting-answer at restart are marked "interrupted (restart)". Mirrors Claude Code's "conversations auto-persist", but the Wishboard **auto-displays history** (a deliverable list) rather than Claude Code's explicit `--resume`.
|
|
120
122
|
|
|
121
|
-
|
|
123
|
+
**Provenance / file location**: a deliverable records its **logical location (workspace)**; the **physical absolute path** is hidden by default (hosted mode doesn't leak server paths), shown only in **local mode** (`XITTO_SERVER_LOCAL=1`), where a deliverable carries a "📂 file location" so you can find the file in Finder/Explorer.
|
|
122
124
|
|
|
123
|
-
|
|
125
|
+
A zero-dependency single HTML file (`src/app/web/index.html`), using polling rather than SSE. The token is injected into the page for same-origin calls — zero-config for local self-use; **put real authentication in front for production deployment**.
|
|
124
126
|
|
|
125
|
-
##
|
|
127
|
+
## Running it as a service (not just a CLI)
|
|
126
128
|
|
|
127
|
-
kernel
|
|
128
|
-
|
|
129
|
+
The kernel is UI-agnostic; the CLI is just one app. `src/app/server.js` is a PoC that wraps it into an **HTTP service**
|
|
130
|
+
(zero-dependency `node:http`) — proving "personal tool → serviceable foundation":
|
|
129
131
|
|
|
130
132
|
```bash
|
|
131
133
|
XITTO_SERVER_TOKEN=secret npm run serve # http://localhost:8787
|
|
@@ -134,135 +136,146 @@ curl -s -XPOST localhost:8787/v1/run -H "Authorization: Bearer secret" \
|
|
|
134
136
|
-H content-type:application/json -d '{"pack":"general","sessionId":"s1","input":"..."}'
|
|
135
137
|
```
|
|
136
138
|
|
|
137
|
-
|
|
138
|
-
|
|
139
|
-
|
|
139
|
+
Features: bearer-token auth, **per-session isolated working directory + history** (multi-turn remembers context), sandbox (Seatbelt),
|
|
140
|
+
structured JSON logs (audit/observability), 6 packs to choose from, JSON or SSE (`/v1/stream`) streaming.
|
|
141
|
+
"Personal vs production" is an **app-layer** concern — same kernel, the CLI and the server are two apps.
|
|
140
142
|
|
|
141
|
-
|
|
143
|
+
**Background tasks + completion notification (asynchronous interaction)** — dispatch a task, get a `taskId` immediately, and have a webhook called on completion, without watching it constantly:
|
|
142
144
|
```bash
|
|
143
|
-
#
|
|
145
|
+
# Dispatch a task (returns 202 + taskId immediately), POSTs the result to the webhook on completion
|
|
144
146
|
curl -s -XPOST localhost:8787/v1/tasks -H "Authorization: Bearer secret" \
|
|
145
147
|
-H content-type:application/json \
|
|
146
|
-
-d '{"pack":"general","mode":"goal","goal":"...","webhook":"https
|
|
148
|
+
-d '{"pack":"general","mode":"goal","goal":"...","webhook":"https://your-service/done"}'
|
|
147
149
|
|
|
148
|
-
curl -s localhost:8787/v1/tasks -H "Authorization: Bearer secret" #
|
|
149
|
-
curl -s localhost:8787/v1/tasks/<id> -H "Authorization: Bearer secret" #
|
|
150
|
-
curl -sN localhost:8787/v1/tasks/<id>/events -H "Authorization: Bearer secret" #
|
|
150
|
+
curl -s localhost:8787/v1/tasks -H "Authorization: Bearer secret" # list
|
|
151
|
+
curl -s localhost:8787/v1/tasks/<id> -H "Authorization: Bearer secret" # status + result
|
|
152
|
+
curl -sN localhost:8787/v1/tasks/<id>/events -H "Authorization: Bearer secret" # attach to event stream (SSE, replay + live)
|
|
151
153
|
```
|
|
152
|
-
|
|
153
|
-
|
|
154
|
+
Concurrency limited by `XITTO_SERVER_CONCURRENCY` (default 2); on completion the webhook receives `{taskId,status,text,usage,rounds,done}`.
|
|
155
|
+
This extends "watch it live" into a "dispatch → notify" asynchronous form (like treating the agent as a coworker).
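
A concurrency cap like `XITTO_SERVER_CONCURRENCY` is commonly implemented as a small promise queue. The sketch below shows that general pattern under stated assumptions: `createQueue` and `webhookPayload` are hypothetical names, and only the payload field list is taken from the sentence above.

```javascript
// Hypothetical task queue: at most `limit` jobs run at once; the rest wait.
function createQueue(limit = 2) {
  let active = 0;
  const waiting = [];
  const next = () => {
    if (active >= limit || waiting.length === 0) return;
    active++;
    const { job, resolve, reject } = waiting.shift();
    Promise.resolve()
      .then(job)
      .then(resolve, reject)
      .finally(() => { active--; next(); });
  };
  return {
    run(job) {
      return new Promise((resolve, reject) => {
        waiting.push({ job, resolve, reject });
        next();
      });
    },
    get active() { return active; },
  };
}

// Payload shape from the field list above; the values are whatever the task produced.
function webhookPayload(taskId, result) {
  const { status, text, usage, rounds, done } = result;
  return { taskId, status, text, usage, rounds, done };
}
```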
|
|
154
156
|
|
|
155
|
-
##
|
|
157
|
+
## Build your own domain agent (without freezing in)
|
|
156
158
|
|
|
157
|
-
kernel
|
|
159
|
+
The kernel is a **depended-on package**, not a template to clone. Your agent is a small standalone project:
|
|
158
160
|
|
|
159
161
|
```bash
|
|
160
|
-
xitto-kernel new-agent my-bot #
|
|
162
|
+
xitto-kernel new-agent my-bot # produces a standalone project (imports the kernel, doesn't modify it)
|
|
161
163
|
cd my-bot && npm install && npm start
|
|
162
164
|
```
|
|
163
165
|
|
|
164
|
-
|
|
165
|
-
runtime
|
|
166
|
+
The generated `my-bot/` has only: `pack.js` (your domain: what it can do / what it guards) + `index.js` (a few lines to start) + `package.json` (`"xitto-kernel": "file:…"`).
|
|
167
|
+
The runtime (multi-step loop / streaming / permissions / sandbox / CLI) all lives in the kernel; `npm update xitto-kernel` upgrades the foundation and **your agent doesn't get frozen in**.
|
|
166
168
|
|
|
167
169
|
```
|
|
168
|
-
my-bot/ ←
|
|
170
|
+
my-bot/ ← your standalone project
|
|
169
171
|
├── package.json dependencies: { xitto-kernel: file:… }
|
|
170
|
-
├── pack.js ←
|
|
172
|
+
├── pack.js ← your DomainPack
|
|
171
173
|
└── index.js import { runCli, loadModel } from 'xitto-kernel/app'
|
|
172
174
|
```
|
|
173
175
|
|
|
174
|
-
>
|
|
176
|
+
> The built-in coding / data-query / notes packs are "official example packs" that live in the kernel repo; your pack lives in your own project. They coexist without freezing each other in.
|
|
175
177
|
|
|
176
|
-
##
|
|
178
|
+
## Build status
|
|
177
179
|
|
|
178
180
|
```
|
|
179
181
|
xitto-kernel/
|
|
180
182
|
├── src/
|
|
181
|
-
│ ├── types.js
|
|
182
|
-
│ ├── index.js
|
|
183
|
+
│ ├── types.js type definitions (DomainPack / Tool / KernelServices …)
|
|
184
|
+
│ ├── index.js public API (createKernel / loadPack / defineDomainPack …)
|
|
183
185
|
│ ├── kernel/
|
|
184
|
-
│ │ ├── pack-loader.js ✅ pack
|
|
185
|
-
│ │ ├── tool-registry.js ✅
|
|
186
|
-
│ │ ├── guard-chain.js ✅
|
|
187
|
-
│ │ ├── agent-loop.js ✅
|
|
188
|
-
│ │ ├── provider.js ✅ provider
|
|
189
|
-
│ │ ├── security/ ✅
|
|
190
|
-
│ │ │ ├── sandbox.js ✅
|
|
191
|
-
│ │ │ ├── danger.js ✅
|
|
192
|
-
│ │ │ ├── allow.js ✅
|
|
193
|
-
│ │ │ └── permission-step.js ✅
|
|
194
|
-
│ │ └── index.js ✅ createKernel
|
|
195
|
-
│ ├── app/ ✅ app
|
|
196
|
-
│ │ ├── index.js ✅ xitto-kernel/app
|
|
197
|
-
│ │ ├── cli.js ✅
|
|
198
|
-
│ │ ├── main.js ✅
|
|
199
|
-
│ │ ├── scaffold.js ✅
|
|
200
|
-
│ │ ├── templates/ ✅
|
|
201
|
-
│ │ └── providers.js ✅ providers.json
|
|
186
|
+
│ │ ├── pack-loader.js ✅ pack loading/validation
|
|
187
|
+
│ │ ├── tool-registry.js ✅ tool-metadata driven (replaces a hard-coded list)
|
|
188
|
+
│ │ ├── guard-chain.js ✅ fixed-order beforeToolCall guard chain
|
|
189
|
+
│ │ ├── agent-loop.js ✅ Agent ported from xitto-code (streaming + multi-step tool loop)
|
|
190
|
+
│ │ ├── provider.js ✅ provider-call adapter (pi-ai streamSimple + cache compatible)
|
|
191
|
+
│ │ ├── security/ ✅ real sandbox (guard-chain slot 5)
|
|
192
|
+
│ │ │ ├── sandbox.js ✅ static policy + macOS Seatbelt OS-level isolation
|
|
193
|
+
│ │ │ ├── danger.js ✅ dangerous-command detection (rm -rf / fork bomb / curl|sh …)
|
|
194
|
+
│ │ │ ├── allow.js ✅ command-signature allowlist
|
|
195
|
+
│ │ │ └── permission-step.js ✅ slot 5: deny→static policy→danger→confirm (metadata driven)
|
|
196
|
+
│ │ └── index.js ✅ createKernel: runTool + runTurn + sandbox wiring
|
|
197
|
+
│ ├── app/ ✅ app layer (thin; the TUI is not inside the kernel)
|
|
198
|
+
│ │ ├── index.js ✅ xitto-kernel/app public API (runCli/loadModel/newAgent)
|
|
199
|
+
│ │ ├── cli.js ✅ interactive CLI: streaming text + tool display + /commands + Ctrl+C interrupt
|
|
200
|
+
│ │ ├── main.js ✅ entry point + new-agent subcommand
|
|
201
|
+
│ │ ├── scaffold.js ✅ scaffolding: produce a standalone agent project (doesn't modify the kernel)
|
|
202
|
+
│ │ ├── templates/ ✅ standalone-project templates (package.json/index.js/pack.js…)
|
|
203
|
+
│ │ └── providers.js ✅ providers.json loading (provider config is an app concern, not the kernel's)
|
|
202
204
|
│ └── packs/
|
|
203
|
-
│ ├── coding/ ✅
|
|
204
|
-
│ ├── data-query/ ✅
|
|
205
|
-
│ ├── notes/ ✅
|
|
206
|
-
│ ├── general/ ✅
|
|
207
|
-
│ ├── deep-research/ ✅
|
|
208
|
-
│ └── devops/ ✅
|
|
209
|
-
├── bin/xitto-kernel.js ✅ CLI
|
|
210
|
-
├── test/ ✅
|
|
205
|
+
│ ├── coding/ ✅ reference pack (read/ls/write/edit/bash/git)
|
|
206
|
+
│ ├── data-query/ ✅ second domain (proves orthogonality)
|
|
207
|
+
│ ├── notes/ ✅ third domain (knowledge base)
|
|
208
|
+
│ ├── general/ ✅ general autonomous agent (files/shell/web/http + goal loop)
|
|
209
|
+
│ ├── deep-research/ ✅ deep research (multi-source search → verify → cited conclusion)
|
|
210
|
+
│ └── devops/ ✅ ops/SRE (shell + bash_bg + config + logs + health checks)
|
|
211
|
+
├── bin/xitto-kernel.js ✅ CLI entry point (run / new-agent)
|
|
212
|
+
├── test/ ✅ all tests green (runTurn + Seatbelt isolation + scaffolding + …)
|
|
211
213
|
└── examples/
|
|
212
|
-
├── demo.js ✅
|
|
213
|
-
└── live.js ✅
|
|
214
|
+
├── demo.js ✅ no LLM: same kernel, two domains, guards genuinely in effect
|
|
215
|
+
└── live.js ✅ real LLM (MiniMax): the model actually calls tools to finish a task
|
|
214
216
|
```
|
|
215
217
|
|
|
216
|
-
|
|
217
|
-
**runTurn
|
|
218
|
-
|
|
219
|
-
|
|
218
|
+
**Also runnable**: `npm test` (200+ tests, all green), `npm run demo` (no LLM), `node examples/live.js` (real LLM).
|
|
219
|
+
**runTurn is ported**: the multi-step loop of stream → tool call (through the kernel guard chain) → feed back → stream again, drivable by a real provider.
|
|
220
|
+
**Real sandbox is wired into guard-chain slot 5**: (A) static policy blocks network/privilege-escalation/dangerous commands; (B) macOS Seatbelt provides runtime OS-level isolation, catching obfuscated out-of-bounds writes the static policy missed. `sandboxable` tools are auto-wrapped, `tool.readOnly` is auto-passed — all metadata driven, no domain lists.
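
The metadata-driven ordering named for slot 5 (deny → static policy → danger → confirm) can be sketched as one decision function. The patterns, the `decide` name, the returned shape, and the use of `sudo` as the static-policy example are all assumptions for illustration; real detection in `danger.js` and `sandbox.js` would need far more care.

```javascript
// Hypothetical static policy: privilege escalation is refused outright.
const STATIC_POLICY = [/\bsudo\b/];

// Hypothetical danger patterns, loosely following the examples named in the
// file tree (rm -rf, fork bomb, curl | sh).
const DANGER_PATTERNS = [
  /\brm\s+-[a-z]*r[a-z]*f|\brm\s+-[a-z]*f[a-z]*r/i,
  /:\(\)\s*\{\s*:\|:&\s*\};:/,
  /\b(curl|wget)\b[^|]*\|\s*(sh|bash)\b/i,
];

// Decide from tool metadata alone; no per-domain tool lists are consulted.
function decide(tool, command, { denied = new Set(), sandbox = false } = {}) {
  const wrap = sandbox && Boolean(tool.sandboxable); // run inside the OS sandbox
  const cmd = command || '';
  if (denied.has(tool.name)) return { action: 'deny', reason: 'denied tool', wrap };
  if (STATIC_POLICY.some((p) => p.test(cmd))) return { action: 'deny', reason: 'static policy', wrap };
  if (DANGER_PATTERNS.some((p) => p.test(cmd))) return { action: 'deny', reason: 'dangerous command', wrap };
  if (tool.readOnly) return { action: 'allow', reason: 'read-only', wrap };
  return { action: 'confirm', reason: 'mutating tool', wrap };
}
```

Static checks like these only catch what they can see in the command text, which is why the paragraph above pairs them with OS-level isolation for anything obfuscated.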
|
|
221
|
+
**Remaining seams (for later)**: in-turn compaction, hooks/skills/MCP/subagent, contextFiles loading, interactive permission confirmation (the CLI currently passes mutating tools headlessly; dangerous commands are still blocked). A richer Ink TUI can be another app consuming the same kernel events.
|
|
220
222
|
|
|
221
|
-
##
|
|
223
|
+
## Documentation index
|
|
222
224
|
|
|
223
|
-
|
|
|
225
|
+
| Doc | Contents |
|
|
224
226
|
|------|------|
|
|
225
|
-
| [01-architecture.md](docs/01-architecture.md) |
|
|
226
|
-
| [02-domain-pack-spec.md](docs/02-domain-pack-spec.md) | `DomainPack`
|
|
227
|
-
| [03-kernel-contract.md](docs/03-kernel-contract.md) | kernel
|
|
228
|
-
| [04-migration-from-xitto-code.md](docs/04-migration-from-xitto-code.md) |
|
|
229
|
-
| [05-example-packs.md](docs/05-example-packs.md) |
|
|
230
|
-
| [06-authoring-a-pack.md](docs/06-authoring-a-pack.md) |
|
|
227
|
+
| [01-architecture.md](docs/01-architecture.md) | Layered architecture, kernel module list, the lifecycle of one turn, and the kernel/pack boundary |
|
|
228
|
+
| [02-domain-pack-spec.md](docs/02-domain-pack-spec.md) | Full spec of the `DomainPack` interface (per field, required/optional, defaults) |
|
|
229
|
+
| [03-kernel-contract.md](docs/03-kernel-contract.md) | The services the kernel provides to a pack (`KernelServices`) and lifecycle hooks |
|
|
230
|
+
| [04-migration-from-xitto-code.md](docs/04-migration-from-xitto-code.md) | Concrete steps for extracting from xitto-code: how each coupling point moves, and the risks |
|
|
231
|
+
| [05-example-packs.md](docs/05-example-packs.md) | Example pack comparison (coding / data-query built in + ops sketch), proving the same interface runs different domains |
|
|
232
|
+
| [06-authoring-a-pack.md](docs/06-authoring-a-pack.md) | **How to build a new domain agent on the foundation**: minimal pack, tool shape, three steps, tool vs prompt |
|
|
231
233
|
|
|
232
|
-
##
|
|
234
|
+
## Status and next steps
|
|
233
235
|
|
|
234
|
-
|
|
235
|
-
|
|
236
|
-
|
|
237
|
-
**git
|
|
238
|
-
**skills
|
|
236
|
+
**Done**: the pack system, tool-metadata-driven tools, fixed-order guard chain, agent loop (real-LLM multi-step loop),
|
|
237
|
+
real sandbox (static policy + macOS Seatbelt), pack.verify self-acceptance, pack.contextFiles loading,
|
|
238
|
+
**cross-session memory + resume**, **interactive permission confirmation** (/auto, --yes), **/plan plan mode + /undo**,
|
|
239
|
+
**git capabilities** (coding pack), **spawn_agent subagents**, **PreToolUse/PostToolUse hooks**,
|
|
240
|
+
**skills progressive disclosure**, **MCP tool integration**, the interactive CLI, scaffolding (`new-agent` produces a standalone project). All tests green (200+).
|
|
239
241
|
|
|
240
|
-
|
|
241
|
-
|
|
242
|
+
**Published on npm**: `npm install -g xitto-kernel`; projects produced by `new-agent` depend on `^0.1.0` by default (`--local` uses file: for development).
|
|
243
|
+
**Optional next**: a full-featured Ink TUI as another app (the CLI already has lightweight streaming markdown + colored diffs).
|
|
242
244
|
|
|
243
|
-
|
|
245
|
+
**Design stance**: stays on Node ESM + the pi-ai provider abstraction; doesn't rewrite xitto-code (the kernel is an abstraction; xitto-code can still exist independently).
|
|
244
246
|
|
|
245
|
-
##
|
|
247
|
+
## Evaluation (capability is quantifiable)
|
|
246
248
|
|
|
247
|
-
|
|
248
|
-
|
|
249
|
+
Each pack ships with an EvalSuite (`eval/`, sharing `eval/framework.js`, not part of the npm package).
|
|
250
|
+
Paradigm: **a new-domain agent = a new pack (what it can do) + a new EvalSuite (how to score it)**.
|
|
249
251
|
|
|
250
|
-
| Suite |
|
|
252
|
+
| Suite | Benchmarked against | Scoring | How to run | Reference result* |
|
|
251
253
|
|------|------|------|------|------|
|
|
252
|
-
| coding | SWE-bench Verified |
|
|
253
|
-
| coding
|
|
254
|
-
| general | GAIA
|
|
255
|
-
| data-query | Spider/BIRD
|
|
256
|
-
| deep-research | GAIA
|
|
257
|
-
| devops | Terminal-Bench
|
|
258
|
-
|
|
|
254
|
+
| coding | SWE-bench Verified | hidden tests fail→pass (Docker) | `eval/swebench-generate.js` + official harness | 3/8 resolved (real subset) |
|
|
255
|
+
| coding (mini) | SWE-bench style | hidden tests (no Docker) | `npm run eval` | 4/4 |
|
|
256
|
+
| general | GAIA style | answer match / state check | `node eval/general-run.js` | 4/4 |
|
|
257
|
+
| data-query | Spider/BIRD style | real SQLite + answer match | `node eval/data-query-run.js` | 4/4 |
|
|
258
|
+
| deep-research | GAIA/research | factual correctness + genuine verification (allOf) | `node eval/deep-research-run.js` | 3/3 |
|
|
259
|
+
| devops | Terminal-Bench style | state check (system/files meet target) | `node eval/devops-run.js` | 4/4 |
|
|
260
|
+
| tool calling | BFCL style | trajectory check (calls the right tool/params) | `node eval/tool-calling-run.js` | 6/6 |
|
|
261
|
+
|
|
262
|
+
\* Reference numbers run with MiniMax-M2.7 (small sample); for swapping models / expanding the sample see `eval/README.md`. Scorer types: `answerMatch` / `stateCheck` / `toolCalled`.
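To make the three scorer types more concrete, here is a minimal sketch of how such scorers could compose. Only the names `answerMatch`, `stateCheck`, `toolCalled`, and `allOf` come from the table and footnote above; the function signatures and the shape of the `run` record (`text`, `toolCalls`, `files`) are assumptions, and the real `eval/framework.js` may look quite different.

```js
// Hypothetical scorer shapes: each scorer takes a finished run and returns a boolean.
const answerMatch = (expected) => (run) =>
  run.text.toLowerCase().includes(String(expected).toLowerCase());

// Trajectory check: did the agent call the named tool at least once?
const toolCalled = (name) => (run) => run.toolCalls.some((call) => call.name === name);

// State check: delegate to a caller-supplied predicate over the end state.
const stateCheck = (predicate) => (run) => Boolean(predicate(run));

// Combinator: every scorer must pass.
const allOf = (...scorers) => (run) => scorers.every((score) => score(run));

// Example run record (assumed shape).
const run = {
  text: 'The answer is 42.',
  toolCalls: [{ name: 'web_fetch', args: { url: 'https://example.com' } }],
  files: { 'summary.txt': 'ok' },
};

const verified = allOf(
  answerMatch('42'),
  toolCalled('web_fetch'),
  stateCheck((r) => 'summary.txt' in r.files),
);
console.log(verified(run)); // true
```

Composing with `allOf` mirrors the deep-research row, where a correct fact only counts if the agent also performed a genuine verification step.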
|
|
263
|
+
|
|
264
|
+
## Security
|
|
265
|
+
|
|
266
|
+
xitto-kernel runs an agent that **executes commands and edits files chosen by an LLM**. Treat it like running code you didn't write. Key caveats before you deploy:
|
|
267
|
+
|
|
268
|
+
- **OS sandbox is macOS-only.** The real isolation layer is macOS Seatbelt. On **Linux/Windows there is no OS-level sandbox** — the agent runs commands with your user's privileges. Run untrusted goals inside a container/VM or a throwaway environment.
|
|
269
|
+
- **The example HTTP server is an unhardened PoC.** The bearer token is injected into the page for same-origin calls and there is no rate limiting. **Never expose it unauthenticated to the public internet** — put real authentication and TLS in front, and prefer running it locally.
|
|
270
|
+
- **Prompt injection is a real surface.** Web pages, files, and tool output the agent reads can carry adversarial instructions. The command-danger detector (`rm -rf`, fork bombs, `curl | sh`, …), the command-signature allowlist, and progressive trust reduce the blast radius but do not eliminate it. Dangerous commands are always gated; review what you grant trust to.
|
|
271
|
+
- **Keys never need to land on disk.** Reference API keys via env vars (`${NAME}`) in `providers.json`, which is git-ignored.
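The `${NAME}` convention in the last bullet can be sketched as a small expansion step applied when the config is loaded. The helper name `resolveEnvRefs` and its error behavior are assumptions for illustration, not the loader shipped in `providers.js`.

```js
// Hypothetical helper: expand ${NAME} references using an environment map.
function resolveEnvRefs(value, env = process.env) {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (match, name) => {
    const resolved = env[name];
    // Fail loudly rather than silently sending an empty key to the provider.
    if (resolved === undefined) {
      throw new Error(`Missing environment variable: ${name}`);
    }
    return resolved;
  });
}

// The file on disk only ever contains the reference, never the secret itself.
const stored = { apiKey: '${MY_PROVIDER_KEY}' };
const apiKey = resolveEnvRefs(stored.apiKey, { MY_PROVIDER_KEY: 'sk-example' });
console.log(apiKey === 'sk-example'); // true
```

Resolving at load time keeps the secret in process memory only, so a leaked or committed `providers.json` exposes a variable name rather than a key.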
|
|
259
272
|
|
|
260
|
-
|
|
273
|
+
Found a vulnerability? Please report it privately — see [SECURITY.md](SECURITY.md). Do not open a public issue.
|
|
261
274
|
|
|
262
|
-
##
|
|
275
|
+
## Contributing
|
|
263
276
|
|
|
264
|
-
|
|
277
|
+
See [CONTRIBUTING.md](CONTRIBUTING.md). Core principle: the kernel must stay domain-agnostic (safety behavior comes from tool metadata, not hard-coded domain lists); a new domain = adding a pack, with zero kernel changes.
|
|
265
278
|
|
|
266
|
-
##
|
|
279
|
+
## License
|
|
267
280
|
|
|
268
281
|
[MIT](LICENSE) © ishoplus
|
package/README.zh-TW.md
ADDED
|
@@ -0,0 +1,281 @@
|
|
|
1
|
+
# xitto-kernel
|
|
2
|
+
|
|
3
|
+
[English](./README.md) · **繁體中文**
|
|
4
|
+
|
|
5
|
+
[npm package](https://www.npmjs.com/package/xitto-kernel)
|
|
6
|
+
[CI](https://github.com/ishoplus/xitto-kernel/actions/workflows/ci.yml)
|
|
7
|
+
[License](./LICENSE)
|
|
8
|
+
[Node.js](https://nodejs.org)
|
|
9
|
+
|
|
10
|
+
> 領域無關的 agent 底座(**可當依賴套件** — 你的領域 agent 是獨立專案,import kernel 而非 clone,升級不固化)
|
|
11
|
+
|
|
12
|
+
Abstracts `xitto-code`, a complete coding agent, into a **domain-agnostic agent kernel** plus pluggable **DomainPacks**.
|
|
13
|
+
The same kernel (multi-step tool loop, guard chain, permissions/sandbox, provider abstraction) can host an agent for any domain;
|
|
14
|
+
"coding" is just one DomainPack, and switching to data query, knowledge base, customer support/ops, and so on only requires swapping the pack.
|
|
15
|
+
The interactive CLI lives in the (thin) app layer; a richer TUI or any other frontend can be another app consuming the same kernel events.
|
|
16
|
+
|
|
17
|
+
## 一句話
|
|
18
|
+
|
|
19
|
+
> **kernel 提供「怎麼跑一個 agent」,DomainPack 提供「這個 agent 會什麼、守什麼」。**
|
|
20
|
+
|
|
21
|
+
## Where the design comes from
|
|
22
|
+
|
|
23
|
+
A scan of xitto-code showed that roughly **80% of it is already a domain-agnostic kernel**; only three things are truly tied to coding:
|
|
24
|
+
`read-before-edit`, automatic lint/type-check acceptance, and git integration. This design moves those three out of the kernel and makes them the pack's responsibility.
|
|
25
|
+
|
|
26
|
+
## 快速開始
|
|
27
|
+
|
|
28
|
+
**前置需求**:Node.js ≥ 20
|
|
29
|
+
|
|
30
|
+
**1. 安裝**(已發佈 npm)
|
|
31
|
+
```bash
|
|
32
|
+
npm install -g xitto-kernel # 全域命令 xitto-kernel
|
|
33
|
+
```
|
|
34
|
+
> 開發本倉庫:`cd xitto-kernel && npm install && npm link`。
|
|
35
|
+
|
|
36
|
+
**2. 首次設定**(互動導引,產生 `~/.xitto-code/providers.json`)
|
|
37
|
+
```bash
|
|
38
|
+
xitto-kernel init
|
|
39
|
+
```
|
|
40
|
+
引導你選 provider(MiniMax / Anthropic / OpenAI / DeepSeek / 自訂)→ 填 model →
|
|
41
|
+
設定 API key(建議用環境變數參照 `${NAME}`,金鑰不落地)。已是 xitto-code 使用者可直接共用既有設定、跳過此步。
|
|
42
|
+
(沒設定就啟動會提示你跑 `init`;既有設定不會被覆寫,`--force` 才會合併新 provider。)
|
|
43
|
+
|
|
44
|
+
**3. 跑內建 pack(互動 CLI)**
|
|
45
|
+
```bash
|
|
46
|
+
xitto-kernel # coding agent(讀寫檔案、跑命令)
|
|
47
|
+
xitto-kernel --tui # 完整 Ink TUI(持久狀態列、串流、Esc 中斷、工具卡⏺/⎿、彩色 diff、待辦☑;需真實終端)
|
|
48
|
+
xitto-kernel --pack notes # 筆記 / 知識庫 agent
|
|
49
|
+
xitto-kernel --pack data-query
|
|
50
|
+
xitto-kernel --sandbox # 啟動就開 Seatbelt 沙箱
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
**CLI 內操作**:直接打需求(模型會自己呼叫工具);指令 `/help` `/goal <目標>` `/sandbox` `/plan` `/undo` `/tools` `/trust` `/memory` `/sessions` `/resume` `/exit`;`Ctrl+C` 中斷該輪、閒置時再按一次離開。
|
|
54
|
+
|
|
55
|
+
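**Progressive trust (trust accumulates with use)**: mutating/dangerous tools ask for confirmation before they run. When approving, you can choose `[a]` to trust the whole tool, or `[c]` to trust only that command-signature class (for example `git status` or `npm test`; this is fine-grained, so `npm install` still asks). The choice is **persisted to `.xitto-kernel/<pack>/allow.json` and remembered across sessions**: the next command of the same class is allowed automatically and marked "✓ 已信任" (trusted). Use `/trust` to view, `/trust forget <item>` to revoke, and `/trust clear` to reset. It starts cautious and gets smoother the more you use it, while dangerous commands are never written to the trust store and are gated every time.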
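A minimal sketch of the command-signature idea: trust attaches to a class such as `git status`, not to the whole `bash` tool. The helper names `commandSignature` and `isTrusted`, the two-token signature rule, and the handling of shell metacharacters are assumptions for illustration; the kernel's `allow.js` may derive signatures differently.

```js
// Hypothetical: characters that chain or redirect commands, so one approved
// prefix cannot smuggle a second command past the allowlist.
const SHELL_META = [';', '&', '|', '`', '$', '<', '>', '\n'];

function commandSignature(command) {
  // Compound or redirected commands never map to a trusted signature.
  if (SHELL_META.some((ch) => command.includes(ch))) return null;
  const [program, sub] = command.trim().split(/\s+/);
  // Use "program subcommand" when a subcommand exists, else just the program.
  return sub && !sub.startsWith('-') ? `${program} ${sub}` : program;
}

function isTrusted(command, allow) {
  const signature = commandSignature(command);
  return signature !== null && allow.has(signature);
}

const allow = new Set(['git status', 'npm test']);
console.log(isTrusted('git status --short', allow)); // true
console.log(isTrusted('npm install left-pad', allow)); // false
console.log(isTrusted('npm test && rm -rf /', allow)); // false
```

Returning `null` for chained commands is a deliberate design choice in this sketch: a trusted `npm test` should not authorize whatever follows an `&&`.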
|
|
56
|
+
|
|
57
|
+
**執行中沉澱經驗(專案手冊)**:agent 摸清「這個專案怎麼做事」(建置/測試/部署指令、慣例、必經步驟、踩過的坑與修法)時,會用 `playbook_update` 按 topic 記進 `.xitto-kernel/<pack>/playbook.md`(同 topic 覆蓋,天然去重);**下次 session 自動載入系統提示,不必重新摸索**。因檔案綁 cwd,手冊天然只對這個專案生效。`/playbook` 查看、`/playbook forget <主題>`、`/playbook clear`。分工:`memory` 存事實/偏好/決策(扁平),`playbook` 存可重複的程序知識(按主題)。
|
|
58
|
+
|
|
59
|
+
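**Self-crystallized skills (crystallization layer, verification required)**: when the agent works out a repeatable procedure/SOP, it uses `skill_save` to **write it as a new skill** (markdown) under `.xitto-kernel/<pack>/skills/`. **Policy gate: every new skill must include (1) `goal`, an explicit objective, and (2) `verify`, a verification command. The verify command is actually executed in the sandbox, and the skill is saved only if it passes (exit 0)**; otherwise it is rejected and the output is returned so the agent can fix it (dangerous commands are always blocked). This ensures that what gets crystallized is verified success, not claimed success. **In the current session the skill can be loaded by name with `skill` right away (hot scan), and future sessions list it automatically under available skills** (progressive disclosure: the prompt lists only the name and a short description, and the full text is loaded on demand). **Self-maintenance**: each load records usage (`usedCount`); `skills_check` / `/skills check` re-runs every skill's stored verify command to detect **drift**. Skills that stopped working after the project changed are flagged `⚠ stale` so you can fix or delete them, which keeps the skill library trustworthy (stale ones are marked in the prompt so they are not misused). Use `/skills` to view (with usage and stale status) and `/skills forget <name>` to remove. Division of labor: `playbook` holds factual project know-how, while `skill` holds operating procedures that are reusable across tasks and **verified**. This layer lets the agent **grow its own skill library** the way Voyager does, except that every entry is verified, checks itself for drift, and runs under the kernel's sandbox and progressive-trust governance.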
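The policy gate can be sketched as follows: a skill is stored only when it declares a goal and its verify command exits with 0. The function name `saveSkill`, the injected `run` and `store` collaborators, and the result shapes are assumptions for illustration; in the kernel the verify command runs inside the sandbox, which this sketch replaces with a stand-in.

```js
// Hypothetical policy gate for saving a skill; `run` and `store` are injected stand-ins.
function saveSkill(skill, { run, store }) {
  const { name, goal, verify, body } = skill;
  if (!name || !goal || !verify) {
    return { saved: false, reason: 'name, goal, and verify are all required' };
  }
  // `run` is assumed to execute the command in the sandbox and return { status, output }.
  const result = run(verify);
  if (result.status !== 0) {
    // Hand the output back so the agent can fix the skill and retry.
    return { saved: false, reason: 'verify failed', output: result.output };
  }
  store.set(name, { goal, verify, body, usedCount: 0 });
  return { saved: true };
}

// Stand-in runner: only `npm test` succeeds in this example.
const run = (command) =>
  command === 'npm test' ? { status: 0, output: 'ok' } : { status: 1, output: 'boom' };
const store = new Map();

const accepted = saveSkill(
  { name: 'release', goal: 'ship a patch release', verify: 'npm test', body: '# steps' },
  { run, store },
);
console.log(accepted.saved, store.has('release')); // true true
```

Keeping the stored `verify` command alongside the skill is what makes a later drift check possible: re-running it tells you whether the procedure still works.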
|
|
60
|
+
|
|
61
|
+
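**Episodic memory + relevance recall (episode layer)**: after finishing a worthwhile task, the agent uses `episode_record` to log an episode (summary + tags + success/failure) to `.xitto-kernel/<pack>/episodes.jsonl`. **The key is recall, not storage**: on a similar task the kernel **automatically** injects the top-K past episodes most relevant to the current input into that turn's prompt (relevance score: keyword / Chinese-bigram overlap + tag weighting + a slight recency tilt). **Only the few most relevant episodes are injected, never the full log** (to avoid diluting the context or misleading the model). The agent can also call `episode_recall` explicitly. Recording applies Jaccard deduplication to avoid bloat. Use `/episodes` to list recent ones, `/episodes <keyword>` to test recall, and `/episodes clear` to reset. This directly targets the real bottleneck of every memory system, **recalling the right few entries**, with a zero-dependency, explainable score rather than a black-box embedding.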
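The relevance score described above (keyword and Chinese-bigram overlap, tag weighting, a slight recency tilt, top-K cut) can be sketched like this. The episode fields (`summary`, `tags`, `at`), the weights, and the function names are assumptions for illustration and are not taken from the kernel's implementation.

```js
// Hypothetical tokenizer: Latin words plus overlapping bigrams for CJK runs.
function tokenize(text) {
  const tokens = new Set();
  const lower = text.toLowerCase();
  for (const word of lower.match(/[a-z0-9_]+/g) ?? []) tokens.add(word);
  for (const cjk of lower.match(/[\u4e00-\u9fff]+/g) ?? []) {
    for (let i = 0; i + 1 < cjk.length; i++) tokens.add(cjk.slice(i, i + 2));
  }
  return tokens;
}

// Jaccard similarity between two token sets.
function jaccard(a, b) {
  let shared = 0;
  for (const token of a) if (b.has(token)) shared++;
  const union = a.size + b.size - shared;
  return union === 0 ? 0 : shared / union;
}

// Return at most k episodes, most relevant first; unrelated episodes are dropped.
function recallEpisodes(input, episodes, { k = 3, tagWeight = 0.5, now = Date.now() } = {}) {
  const query = tokenize(input);
  return episodes
    .map((episode) => {
      const overlap = jaccard(query, tokenize(episode.summary));
      const tagHits = (episode.tags ?? []).filter((t) => query.has(t.toLowerCase())).length;
      const ageDays = Math.max(0, (now - episode.at) / 86_400_000);
      const recency = 0.05 / (1 + ageDays); // small tilt toward recent episodes
      return { episode, relevant: overlap > 0 || tagHits > 0, score: overlap + tagWeight * tagHits + recency };
    })
    .filter((scored) => scored.relevant)
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((scored) => scored.episode);
}
```

Because recency is only a small additive tilt, it breaks ties between similarly relevant episodes but cannot pull an unrelated recent episode into the prompt, since the `relevant` filter runs first.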
|
|
62
|
+
|
|
63
|
+
**事實自動萃取(事實層)**:每輪對話後,kernel 用一次輕量 LLM **自動**把「值得跨 session 記住的持久事實」(偏好、身分、長期決策、穩定設定)抽出來存進 `memory`——不再只靠 agent 自覺呼叫 `memory_save`。一次性的任務細節/閒聊會被略過(那是情節層的事),已知事實會過濾不重複。**非阻塞**(掛在 `runTurn` 回傳的 `memoryExtraction` promise,不卡回覆);`config.autoExtractMemory` 開關(CLI 預設開),`api.extractMemory()` 也可手動觸發。對標 xitto 的 extractMemory。
|
|
64
|
+
|
|
65
|
+
### 沉澱經驗:五層完整
|
|
66
|
+
|
|
67
|
+
agent 執行中自動累積經驗,且每層都有治理:
|
|
68
|
+
|
|
69
|
+
| 層 | 沉澱什麼 | 機制 |
|
|
70
|
+
|---|---|---|
|
|
71
|
+
| 反射層 | 什麼安全 | 漸進信任(per-pattern,跨 session) |
|
|
72
|
+
| 事實層 | 記住的事 | 每輪自動萃取持久事實進 memory |
|
|
73
|
+
| 程序層 | 這專案怎麼做 | playbook(按 topic,自動注入) |
|
|
74
|
+
| 情節層 | 做過什麼 | episodes + **相關性召回**(只注入最相關幾條) |
|
|
75
|
+
| 結晶層 | 可複用流程 | 自寫 skill(須驗證 + 自我體檢漂移) |
|
|
76
|
+
|
|
77
|
+
**通用自主 agent(給目標、自己做到完成)**
|
|
78
|
+
```bash
|
|
79
|
+
xitto-kernel --pack general --yes --goal "抓取 example.com 摘要成繁中寫進 summary.txt"
|
|
80
|
+
```
|
|
81
|
+
`general` pack(檔案/shell/web_fetch)+ kernel 的 **goal loop**(反覆 runTurn + LLM 自我驗收,直到達成/無進展/上限)。互動模式用 `/goal <目標>`。
|
|
82
|
+
|
|
83
|
+
**結果導向:對話只是過程,交付物才是產品**
|
|
84
|
+
|
|
85
|
+
對非技術使用者,真正要的不是「跟 AI 聊天」,是「把事做完、給我結果」。`api.runOutcome(goal)` 跑 goal loop,回傳的不是對話而是**交付物**:
|
|
86
|
+
```js
|
|
87
|
+
const o = await kernel.runOutcome('建立 greet.js 並寫個範例驗證');
|
|
88
|
+
// → { done, summary(做了什麼), artifacts: { created:[...], modified:[...] }, rounds }
|
|
89
|
+
```
|
|
90
|
+
`--goal` 與 server 的 `POST /v1/tasks`(mode=goal)都會回交付物——**產出/改動的檔案**(掃工作目錄前後 diff,連 bash 寫的也抓)+ 摘要 + 是否達成。對話被降格成過程,結果(檔案/達成)被擺到最前面。背景任務的 webhook 也帶 `artifacts`。
|
|
91
|
+
|
|
92
|
+
**澄清通道(只在非問不可時才打斷你)**:自主交付的風險是「自主走錯」。`ask_user` 工具讓 agent 在**缺少關鍵資訊、無法合理推斷**時暫停提問——而非盲猜或頻繁打擾(prompt 明確引導:能用合理預設就別問)。由 app 注入 `config.askUser` 決定「問」的形態:
|
|
93
|
+
- **CLI**:內嵌提問,你打字回答,agent 續跑
|
|
94
|
+
- **背景任務**:任務轉 `needs-input` 狀態並掛起問題 → 你 `POST /v1/tasks/:id/answer` 回答 → 解除暫停、續跑(可隔數小時才答,完全非同步)
|
|
95
|
+
|
|
96
|
+
實測:給「建個設定檔但檔名/內容我還沒決定」→ agent 不亂猜,暫停問你檔名與內容 → 答完才交付正確的 `app.config.json`。這讓「許願→交付」既自主又不失控。
|
|
97
|
+
|
|
98
|
+
**🪄 許願台網頁(給非技術使用者:瀏覽器打開就用)**
|
|
99
|
+
```bash
|
|
100
|
+
XITTO_SERVER_TOKEN=secret npm run serve # 然後瀏覽器開 http://localhost:8787/
|
|
101
|
+
# 本地就地模式(可選真實資料夾、就地改檔,沙箱關):
|
|
102
|
+
npm run serve:local # = LOCAL=1 SANDBOX=off,token 預設 secret(可用 XITTO_SERVER_TOKEN 覆寫)
|
|
103
|
+
```
|
|
104
|
+
不用終端機、不用碰金鑰(伺服器端管)。介面以**結果**為中心,不是聊天:
|
|
105
|
+
- **許願**:打一句「你想完成什麼」→ 交辦(背景跑 goal loop)
|
|
106
|
+
- **進行中**:**即時進度 + 活著的證明**——每秒跳動的「已進行 Ns」心跳時鐘、目前階段(思考中/執行中/驗收中)、agent 當下的**思考文字**(💭)、工具動作翻成人話、第幾輪 + 動作數。看得到它在想什麼、做什麼
|
|
107
|
+
- **待辦打勾**:agent 用 `todo_write` 規劃多步任務時,顯示 ☐/◐/☑ 清單,把「未知時長」變成「看得到的剩餘步數」(對標 Claude Code)
|
|
108
|
+
- **隨時可停**:每個進行中任務有「停止」鈕 → `POST /v1/tasks/:id/cancel`(abort 正在跑的 agent)。控制權在使用者手上,降低「啟動了控制不了的東西」的焦慮
|
|
109
|
+
- **展開過程**:預設安靜(只給進度與成品);想看細節按「展開過程」→ 完整步驟卡(讀/改/跑,人話)+ **編輯的彩色 diff**(綠 +/紅 -)。同一畫面服務「只要結果」與「想看細節」兩種人(對標 Claude Code 的 ⏺/⎿ + ctrl+r 展開)
|
|
110
|
+
- **需要你回答**:agent 暫停提問時,跳出問題 + 回答框(澄清通道)
|
|
111
|
+
- **收成品**:完成後顯示摘要 + **產出的檔案**,點檔名可直接看內容(`GET /v1/tasks/:id/file`,防路徑穿越)
|
|
112
|
+
- **繼續/調整(迭代有脈絡)**:成品上有「↳ 繼續/調整這個成果」——打一句想改什麼/想深入什麼,送出一個**後續任務**,**接續這次的對話(sessionId)+ 同工作區**。agent 同時有「檔案 + 當時的討論與理由」,不只是檔案。預設每個許願是乾淨新對話(不暴脹),按「繼續」才接續那條線(像 ChatGPT 開新對話 vs 接著聊)。歷史以 `↳` 標出接續鏈
|
|
113
|
+
- **歷史成品**:過往交辦的清單(願望 + 狀態),不是聊天串
|
|
114
|
+
|
|
115
|
+
**單頁佈局(無分頁,一眼看全部)**:頂部**許願輸入** + 左欄(**歷史成品** + **📂 檔案瀏覽器**,各自內捲)+ 主區(**當前任務/進度/成品/檔案預覽**共用)。不用切分頁——交辦任務、看歷史、瀏覽工作區檔案、預覽內容都在同一頁。檔案瀏覽器**逐層導航**(像檔案總管,不一次遞迴攤平),點任一檔(成品或工作區)都在主區預覽。容器 1180px,窄螢幕(≤860px)自動收成單欄。刻意保持輕量(不是 IDE)。
|
|
116
|
+
|
|
117
|
+
**持久工作空間(成品間的關係)**:每個成品是**獨立的對話**(不續接前一個,避免 context 暴脹),但**共用一個持久工作空間**(`.xitto-server/ws/<workspace>`,預設 `default`)——所以 ① **檔案留存**,後面的任務能接續前面的成果(「把我上次做的 plan.md 翻成英文」);② **五層沉澱跨成品累積**(偏好/技能/經驗/信任)——它**越用越懂你**,不再是每次都從零開始的陌生人。`workspace` 可在 POST 時指定(多使用者各自一個);網頁有「專案」下拉切換,每份成品卡標出 `📁 所屬空間`。
|
|
118
|
+
|
|
119
|
+
**本地就地模式(像 Claude Code 改你選的真實資料夾)**:`XITTO_SERVER_LOCAL=1` 時,網頁多一個「**📁 選資料夾**」鈕——**用點的**從家目錄瀏覽進你的真實資料夾並選定(不用打路徑;瀏覽器拿不到絕對路徑,所以由 local server 端列資料夾),或「新專案」直接貼絕對路徑也行。任務就**就地改那個資料夾的檔**(不另開隔離副本),工作台列的也是它。這把「許願台(隔離,服務非技術使用者)」和「Claude Code(就地,改你現有的 codebase)」兩個模型打通:**本機自用想就地 → 給路徑;隔離/託管 → 給名稱**。**安全**:只在 `local` 模式才認絕對路徑;**託管模式收到絕對路徑會被消毒成管理空間,不會逃逸到主機任意路徑**。
|
|
120
|
+
|
|
121
|
+
**重啟後歷史還在(持久化)**:任務清單落地 `.xitto-server/tasks/`、對話 session 落地 `.xitto-server/sessions/`,啟動時載回——所以**重啟後歷史成品自動顯示、舊成品仍能「繼續/調整」**(對話脈絡也在)。重啟時還在跑/待答的任務會標「已中斷(重啟)」。對標 Claude Code「對話自動落地」,但許願台是**自動顯示歷史**(成品清單),而非 Claude Code 的明確 `--resume`。
|
|
122
|
+
|
|
123
|
+
**溯源/檔案位置**:成品記錄它的**邏輯位置(workspace)**;**實體絕對路徑**預設不外露(託管不洩漏伺服器路徑),只在**本地模式**(`XITTO_SERVER_LOCAL=1`)才在成品附「📂 檔案位置」供你到 Finder/Explorer 找檔。
|
|
124
|
+
|
|
125
|
+
零依賴單一 HTML(`src/app/web/index.html`),polling 不靠 SSE。token 注入頁面供同源呼叫——本地自用零設定;**正式部署請前置真實認證**。
|
|
126
|
+
|
|
127
|
+
## 當成服務跑(不只 CLI)
|
|
128
|
+
|
|
129
|
+
kernel 是 UI 無關的,CLI 只是其中一個 app。`src/app/server.js` 是把它包成 **HTTP 服務**的 PoC
|
|
130
|
+
(零依賴 `node:http`)—— 證明「個人工具 → 可服務化底座」:
|
|
131
|
+
|
|
132
|
+
```bash
|
|
133
|
+
XITTO_SERVER_TOKEN=secret npm run serve # http://localhost:8787
|
|
134
|
+
curl -s localhost:8787/health
|
|
135
|
+
curl -s -XPOST localhost:8787/v1/run -H "Authorization: Bearer secret" \
|
|
136
|
+
-H content-type:application/json -d '{"pack":"general","sessionId":"s1","input":"..."}'
|
|
137
|
+
```
|
|
138
|
+
|
|
139
|
+
特性:bearer token 認證、**per-session 隔離工作目錄 + 歷史**(多輪記得上文)、沙箱(Seatbelt)、
|
|
140
|
+
結構化 JSON 日誌(審計/觀測)、6 個 pack 可選、JSON 或 SSE(`/v1/stream`)串流。
|
|
141
|
+
「個人 vs 生產」是 **app 層**的事 —— 同一個 kernel,CLI 與 server 是兩個 app。
|
|
142
|
+
|
|
143
|
+
**背景任務 + 完成通知(非同步交互)** —— 派任務出去、立刻拿到 `taskId`、做完回呼 webhook,不用一直盯著:
|
|
144
|
+
```bash
|
|
145
|
+
# 派任務(立刻回 202 + taskId),完成時 POST 結果到 webhook
|
|
146
|
+
curl -s -XPOST localhost:8787/v1/tasks -H "Authorization: Bearer secret" \
|
|
147
|
+
-H content-type:application/json \
|
|
148
|
+
-d '{"pack":"general","mode":"goal","goal":"...","webhook":"https://你的服務/done"}'
|
|
149
|
+
|
|
150
|
+
curl -s localhost:8787/v1/tasks -H "Authorization: Bearer secret" # 列表
|
|
151
|
+
curl -s localhost:8787/v1/tasks/<id> -H "Authorization: Bearer secret" # 狀態 + 結果
|
|
152
|
+
curl -sN localhost:8787/v1/tasks/<id>/events -H "Authorization: Bearer secret" # 附掛事件流(SSE,replay+即時)
|
|
153
|
+
```
|
|
154
|
+
Concurrency is capped by `XITTO_SERVER_CONCURRENCY` (default 2); on completion the webhook receives `{taskId,status,text,usage,rounds,done}`.
|
|
155
|
+
This extends "watching in real time" to an asynchronous "dispatch a task, get notified" mode (like treating the agent as a colleague).
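On the receiving side, a webhook handler would typically validate the completion payload before acting on it. The sketch below assumes the documented fields `{taskId,status,text,usage,rounds,done}`, and further assumes that `taskId` and `status` are strings and that `done` is a boolean; the function name `parseTaskWebhook` and the normalized result shape are hypothetical.

```js
// Hypothetical validator for the completion payload posted to the webhook.
function parseTaskWebhook(rawBody) {
  let payload;
  try {
    payload = JSON.parse(rawBody);
  } catch {
    return { ok: false, error: 'body is not valid JSON' };
  }
  if (payload === null || typeof payload !== 'object') {
    return { ok: false, error: 'payload must be an object' };
  }
  if (typeof payload.taskId !== 'string' || typeof payload.status !== 'string') {
    return { ok: false, error: 'taskId and status are required' };
  }
  return {
    ok: true,
    taskId: payload.taskId,
    status: payload.status,
    succeeded: payload.done === true, // assumed: done marks goal achievement
    summary: typeof payload.text === 'string' ? payload.text : '',
    rounds: Number.isInteger(payload.rounds) ? payload.rounds : null,
  };
}

const parsed = parseTaskWebhook(
  JSON.stringify({ taskId: 't1', status: 'done', text: 'wrote summary.txt', rounds: 3, done: true }),
);
console.log(parsed.ok, parsed.succeeded, parsed.rounds); // true true 3
```

Treating a malformed body as a soft failure rather than throwing keeps one bad delivery from taking down the receiver.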
|
|
156
|
+
|
|
157
|
+
## 做你自己的領域 agent(不固化)
|
|
158
|
+
|
|
159
|
+
kernel 是**被依賴的套件**,不是被 clone 的範本。你的 agent 是獨立小專案:
|
|
160
|
+
|
|
161
|
+
```bash
|
|
162
|
+
xitto-kernel new-agent my-bot # 產出獨立專案(import kernel,不改 kernel)
|
|
163
|
+
cd my-bot && npm install && npm start
|
|
164
|
+
```
|
|
165
|
+
|
|
166
|
+
產出的 `my-bot/` 只有:`pack.js`(你的領域:會什麼/守什麼)+ `index.js`(幾行啟動)+ `package.json`(`"xitto-kernel": "file:…"`)。
|
|
167
|
+
runtime(多步循環/串流/權限/沙箱/CLI)全在 kernel;`npm update xitto-kernel` 升級底座,**你的 agent 不會被固化**。
|
|
168
|
+
|
|
169
|
+
```
|
|
170
|
+
my-bot/ ← 你的獨立專案
|
|
171
|
+
├── package.json dependencies: { xitto-kernel: file:… }
|
|
172
|
+
├── pack.js ← 你的 DomainPack
|
|
173
|
+
└── index.js import { runCli, loadModel } from 'xitto-kernel/app'
|
|
174
|
+
```
|
|
175
|
+
|
|
176
|
+
> 內建的 coding / data-query / notes 是「官方範例 pack」,住在 kernel repo 裡;你的 pack 住在你自己的專案裡。兩者並存、互不固化。
|
|
177
|
+
|
|
178
|
+
## 搭建狀態
|
|
179
|
+
|
|
180
|
+
```
|
|
181
|
+
xitto-kernel/
|
|
182
|
+
├── src/
|
|
183
|
+
│ ├── types.js 型別定義(DomainPack / Tool / KernelServices …)
|
|
184
|
+
│ ├── index.js 公開 API(createKernel / loadPack / defineDomainPack …)
|
|
185
|
+
│ ├── kernel/
|
|
186
|
+
│ │ ├── pack-loader.js ✅ pack 載入/驗證
|
|
187
|
+
│ │ ├── tool-registry.js ✅ 工具 metadata 驅動(取代寫死名單)
|
|
188
|
+
│ │ ├── guard-chain.js ✅ 固定順序 beforeToolCall 守衛鏈
|
|
189
|
+
│ │ ├── agent-loop.js ✅ 移植自 xitto-code 的 Agent(串流 + 多步工具循環)
|
|
190
|
+
│ │ ├── provider.js ✅ provider 呼叫適配(pi-ai streamSimple + cache 相容)
|
|
191
|
+
│ │ ├── security/ ✅ 真實 sandbox(守衛鏈第 5 格)
|
|
192
|
+
│ │ │ ├── sandbox.js ✅ 靜態策略 + macOS Seatbelt OS 級隔離
|
|
193
|
+
│ │ │ ├── danger.js ✅ 危險命令偵測(rm -rf / fork bomb / curl|sh …)
|
|
194
|
+
│ │ │ ├── allow.js ✅ 命令簽章白名單
|
|
195
|
+
│ │ │ └── permission-step.js ✅ 第 5 格:deny→靜態策略→危險→確認(metadata 驅動)
|
|
196
|
+
│ │ └── index.js ✅ createKernel:runTool + runTurn + sandbox 接線
|
|
197
|
+
│ ├── app/ ✅ app 層(薄;TUI 不在 kernel 內)
|
|
198
|
+
│ │ ├── index.js ✅ xitto-kernel/app 公開 API(runCli/loadModel/newAgent)
|
|
199
|
+
│ │ ├── cli.js ✅ 互動 CLI:串流文字 + 工具顯示 + /指令 + Ctrl+C 中斷
|
|
200
|
+
│ │ ├── main.js ✅ 進入點 + new-agent 子指令
|
|
201
|
+
│ │ ├── scaffold.js ✅ 腳手架:產出獨立 agent 專案(不改 kernel)
|
|
202
|
+
│ │ ├── templates/ ✅ 獨立專案樣板(package.json/index.js/pack.js…)
|
|
203
|
+
│ │ └── providers.js ✅ providers.json 載入(provider 設定屬 app,非 kernel)
|
|
204
|
+
│ └── packs/
|
|
205
|
+
│ ├── coding/ ✅ 參考 pack(read/ls/write/edit/bash/git)
|
|
206
|
+
│ ├── data-query/ ✅ 第二領域(證明正交)
|
|
207
|
+
│ ├── notes/ ✅ 第三領域(知識庫)
|
|
208
|
+
│ ├── general/ ✅ 通用自主 agent(檔案/shell/web/http + goal loop)
|
|
209
|
+
│ ├── deep-research/ ✅ 深度研究(多來源搜尋→查證→有引用結論)
|
|
210
|
+
│ └── devops/ ✅ 維運/SRE(shell + bash_bg + 設定 + 日誌 + 健康檢查)
|
|
211
|
+
├── bin/xitto-kernel.js ✅ CLI 進入點(run / new-agent)
|
|
212
|
+
├── test/ ✅ 測試全綠(runTurn + Seatbelt 隔離 + 腳手架 + …)
|
|
213
|
+
└── examples/
|
|
214
|
+
├── demo.js ✅ 不靠 LLM:同 kernel、兩領域、守衛真實生效
|
|
215
|
+
└── live.js ✅ 真實 LLM(MiniMax):模型實際呼叫工具完成任務
|
|
216
|
+
```
|
|
217
|
+
|
|
218
|
+
**也可跑**:`npm test`(200+ 測試全綠)、`npm run demo`(不靠 LLM)、`node examples/live.js`(真實 LLM)。
|
|
219
|
+
**runTurn 已移植**:串流 → 工具呼叫(過 kernel 守衛鏈)→ 回灌 → 再串流的多步循環,能用真實 provider 驅動。
|
|
220
|
+
**真實 sandbox 已接守衛鏈第 5 格**:(A) 靜態策略擋網路/提權/危險命令;(B) macOS Seatbelt 在執行期 OS 級隔離,擋下靜態策略漏掉的混淆越界寫入。`sandboxable` 工具自動包裹,`tool.readOnly` 自動放行——全 metadata 驅動,無領域名單。
|
|
221
|
+
**仍為接縫(後續)**:回合內壓縮、hooks/skills/MCP/subagent、contextFiles 載入、互動權限確認(CLI 目前 headless 放行 mutating、危險命令仍擋)。更豐富的 Ink TUI 可作為另一個 app 消費同一組 kernel 事件。
|
|
222
|
+
|
|
223
|
+
## 文件索引
|
|
224
|
+
|
|
225
|
+
| 文件 | 內容 |
|
|
226
|
+
|------|------|
|
|
227
|
+
| [01-architecture.md](docs/01-architecture.md) | 分層架構、kernel 模組清單、一次 turn 的生命週期與 kernel/pack 交界 |
|
|
228
|
+
| [02-domain-pack-spec.md](docs/02-domain-pack-spec.md) | `DomainPack` 介面完整規格(逐欄位、必填/選填、預設)|
|
|
229
|
+
| [03-kernel-contract.md](docs/03-kernel-contract.md) | kernel 對 pack 提供的服務(`KernelServices`)與生命週期 hook |
|
|
230
|
+
| [04-migration-from-xitto-code.md](docs/04-migration-from-xitto-code.md) | 從 xitto-code 抽離的具體步驟:每個耦合點怎麼搬、風險 |
|
|
231
|
+
| [05-example-packs.md](docs/05-example-packs.md) | 範例 pack 對照(coding / data-query 已內建 + ops 示意)驗證同介面能跑不同領域 |
|
|
232
|
+
| [06-authoring-a-pack.md](docs/06-authoring-a-pack.md) | **怎麼用底座做一個新領域 agent**:最小 pack、工具形狀、三步驟、放工具 vs prompt |
|
|
233
|
+
|
|
234
|
+
## 現況與後續
|
|
235
|
+
|
|
236
|
+
**已完成**:pack 系統、工具 metadata 驅動、固定順序守衛鏈、agent loop(真實 LLM 多步循環)、
|
|
237
|
+
真實 sandbox(靜態策略 + macOS Seatbelt)、pack.verify 自我驗收、pack.contextFiles 載入、
|
|
238
|
+
**跨 session 記憶 + resume**、**互動權限確認**(/auto、--yes)、**/plan 計劃模式 + /undo**、
|
|
239
|
+
**git 能力**(coding pack)、**spawn_agent 子 agent**、**PreToolUse/PostToolUse hooks**、
|
|
240
|
+
**skills 漸進揭露**、**MCP 工具接入**、互動 CLI、腳手架(`new-agent` 產出獨立專案)。測試全綠(200+)。
|
|
241
|
+
|
|
242
|
+
**已發佈 npm**:`npm install -g xitto-kernel`;`new-agent` 產出的專案預設依賴 `^0.1.0`(`--local` 用 file: 開發)。
|
|
243
|
+
**可選後續**:Ink 全功能 TUI 可作為另一個 app(目前 CLI 已有輕量串流 markdown + 彩色 diff)。
|
|
244
|
+
|
|
245
|
+
**設計取向**:沿用 Node ESM + pi-ai provider 抽象;不重寫 xitto-code(kernel 是抽象,xitto-code 仍可獨立存在)。
|
|
246
|
+
|
|
247
|
+
## 評估(能力可量化)
|
|
248
|
+
|
|
249
|
+
每個 pack 配一個 EvalSuite(`eval/`,共用 `eval/framework.js`,不進 npm 包)。
|
|
250
|
+
範式:**新領域 agent = 新 pack(會什麼)+ 新 EvalSuite(怎麼打分)**。
|
|
251
|
+
|
|
252
|
+
| Suite | 對標 | 評分方式 | 跑法 | 參考結果* |
|
|
253
|
+
|------|------|------|------|------|
|
|
254
|
+
| coding | SWE-bench Verified | 隱藏測試 fail→pass(Docker)| `eval/swebench-generate.js` + 官方 harness | 3/8 resolved(真實子集)|
|
|
255
|
+
| coding(迷你)| SWE-bench 風格 | 隱藏測試(免 Docker)| `npm run eval` | 4/4 |
|
|
256
|
+
| general | GAIA 風格 | 答案比對 / 狀態檢查 | `node eval/general-run.js` | 4/4 |
|
|
257
|
+
| data-query | Spider/BIRD 風格 | 真實 SQLite + 答案比對 | `node eval/data-query-run.js` | 4/4 |
|
|
258
|
+
| deep-research | GAIA/研究 | 事實正確 + 真的查證(allOf)| `node eval/deep-research-run.js` | 3/3 |
|
|
259
|
+
| devops | Terminal-Bench 風格 | 狀態檢查(系統/檔案達標)| `node eval/devops-run.js` | 4/4 |
|
|
260
|
+
| 工具呼叫 | BFCL 風格 | 軌跡檢查(呼叫對工具/參數)| `node eval/tool-calling-run.js` | 6/6 |
|
|
261
|
+
|
|
262
|
+
\* 用 MiniMax-M2.7 跑的參考數字(小樣本);換模型/擴樣本見 `eval/README.md`。scorer 型:`answerMatch` / `stateCheck` / `toolCalled`。
|
|
263
|
+
|
|
264
|
+
## 安全(Security)
|
|
265
|
+
|
|
266
|
+
xitto-kernel 跑的 agent 會**執行 LLM 決定的命令、修改檔案**,請當成「跑你沒寫過的程式碼」看待。部署前的關鍵須知:
|
|
267
|
+
|
|
268
|
+
- **OS 沙箱只有 macOS。** 真正的隔離層是 macOS Seatbelt;在 **Linux/Windows 沒有 OS 級沙箱**——agent 以你的使用者權限執行命令。不信任的任務請在容器/VM 或拋棄式環境裡跑。
|
|
269
|
+
- **範例 HTTP server 是未加固的 PoC。** bearer token 注入頁面供同源呼叫、且無速率限制。**切勿未認證就暴露到公網**——前面要加真實認證與 TLS,優先本機自用。
|
|
270
|
+
- **Prompt injection 是真實攻擊面。** agent 讀到的網頁、檔案、工具輸出可能夾帶惡意指令。危險命令偵測(`rm -rf`、fork bomb、`curl | sh`…)、命令簽章白名單、漸進信任能縮小波及範圍但無法根除。危險命令一律把關;你授予信任前請審視。
|
|
271
|
+
- **金鑰不必落地。** 在 `providers.json` 用環境變數 `${NAME}` 參照 API key,該檔已被 git-ignore。
|
|
272
|
+
|
|
273
|
+
發現漏洞?請走私密回報——見 [SECURITY.md](SECURITY.md),**勿開公開 issue**。
|
|
274
|
+
|
|
275
|
+
## 貢獻
|
|
276
|
+
|
|
277
|
+
見 [CONTRIBUTING.md](CONTRIBUTING.md)。核心原則:kernel 必須領域無關(安全行為靠工具 metadata,不寫死領域名單);新領域 = 新增一個 pack,kernel 零改動。
|
|
278
|
+
|
|
279
|
+
## 授權
|
|
280
|
+
|
|
281
|
+
[MIT](LICENSE) © ishoplus
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "xitto-kernel",
|
|
3
|
-
"version": "0.9.
|
|
3
|
+
"version": "0.9.6",
|
|
4
4
|
"type": "module",
|
|
5
5
|
"description": "領域無關的 agent 底座(kernel + 可插拔 DomainPack),從 xitto-code 抽象而來",
|
|
6
6
|
"keywords": [
|
|
@@ -34,9 +34,13 @@
|
|
|
34
34
|
"bin",
|
|
35
35
|
"docs",
|
|
36
36
|
"README.md",
|
|
37
|
+
"README.zh-TW.md",
|
|
37
38
|
"LICENSE",
|
|
38
39
|
"CHANGELOG.md"
|
|
39
40
|
],
|
|
41
|
+
"publishConfig": {
|
|
42
|
+
"registry": "https://registry.npmjs.org"
|
|
43
|
+
},
|
|
40
44
|
"scripts": {
|
|
41
45
|
"test": "node --test",
|
|
42
46
|
"demo": "node examples/demo.js",
|
|
@@ -56,7 +60,7 @@
|
|
|
56
60
|
"./packs/devops": "./src/packs/devops/index.js"
|
|
57
61
|
},
|
|
58
62
|
"dependencies": {
|
|
59
|
-
"@
|
|
63
|
+
"@earendil-works/pi-ai": "^0.80.2",
|
|
60
64
|
"@modelcontextprotocol/sdk": "^1.29.0",
|
|
61
65
|
"cli-highlight": "^2.1.11",
|
|
62
66
|
"ink": "^5.2.1",
|
package/src/app/server.js
CHANGED
|
@@ -7,7 +7,7 @@ import { mkdirSync, readFileSync, writeFileSync, existsSync, rmSync, readdirSync
|
|
|
7
7
|
import { join, dirname, isAbsolute, relative, basename, resolve } from 'node:path';
|
|
8
8
|
import { fileURLToPath } from 'node:url';
|
|
9
9
|
import { homedir } from 'node:os';
|
|
10
|
-
import { completeSimple } from '@
|
|
10
|
+
import { completeSimple } from '@earendil-works/pi-ai/compat';
|
|
11
11
|
import { createKernel } from '../kernel/index.js';
|
|
12
12
|
import { cacheRetentionFor } from '../kernel/provider.js';
|
|
13
13
|
import { loadModel } from './providers.js';
|
|
@@ -458,8 +458,9 @@ export function createServerApp({ model, getApiKey, token, baseDir = '.xitto-ser
|
|
|
458
458
|
if (req.method === 'GET' && path === '/v1/fs') {
|
|
459
459
|
if (!local) return json(res, 403, { error: '僅本地模式可瀏覽資料夾' });
|
|
460
460
|
const dir = resolve(url.searchParams.get('path') || homedir());
|
|
461
|
+
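const showHidden = url.searchParams.get('hidden') === '1'; // dot-prefixed folders are hidden by default; the frontend sends hidden=1 only when the "顯示隱藏資料夾" (show hidden folders) checkbox is ticked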
|
|
461
462
|
try {
|
|
462
|
-
const dirs = readdirSync(dir, { withFileTypes: true }).filter((e) => e.isDirectory() && e.name !== 'node_modules' && !e.name.startsWith('.')).map((e) => e.name).sort();
|
|
463
|
+
const dirs = readdirSync(dir, { withFileTypes: true }).filter((e) => e.isDirectory() && e.name !== 'node_modules' && (showHidden || !e.name.startsWith('.'))).map((e) => e.name).sort();
|
|
463
464
|
return json(res, 200, { path: dir, parent: dirname(dir), home: homedir(), dirs });
|
|
464
465
|
} catch (e) { return json(res, 400, { error: '無法讀取:' + e.message }); }
|
|
465
466
|
}
|
package/src/app/web/index.html
CHANGED
|
@@ -102,7 +102,8 @@
|
|
|
102
102
|
.fspath { font:12px ui-monospace,Menlo,monospace; color:var(--dim); margin-bottom:8px; word-break:break-all; }
|
|
103
103
|
.fslist { flex:1; overflow:auto; border:1px solid var(--line); border-radius:8px; }
|
|
104
104
|
.fsrow { padding:8px 12px; cursor:pointer; border-bottom:1px solid var(--line); font-size:14px; }
|
|
105
|
-
.fsrow:hover { background:#0c0e12; } .fsrow.up { color:var(--dim); }
|
|
105
|
+
.fsrow:hover { background:#0c0e12; } .fsrow.up { color:var(--dim); } .fsrow.hid { color:var(--dim); }
|
|
106
|
+
.fshidden { font-size:12.5px; color:var(--dim); display:flex; align-items:center; gap:5px; cursor:pointer; user-select:none; } .fshidden input { cursor:pointer; }
|
|
106
107
|
.fsbar { display:flex; align-items:center; gap:8px; margin-top:10px; }
|
|
107
108
|
.viewer { margin-top:10px; border:1px solid var(--line); border-radius:10px; padding:12px; background:#0c0e12; }
|
|
108
109
|
.vbar { font-size:12px; color:var(--dim); margin-bottom:8px; }
|
|
@@ -137,6 +138,7 @@
|
|
|
137
138
|
<div class="fslist" id="fslist"></div>
|
|
138
139
|
<div class="fsbar">
|
|
139
140
|
<button class="ghost" onclick="fsHome()">🏠 家目錄</button>
|
|
141
|
+
<label class="fshidden"><input type="checkbox" id="fshidden" onchange="fsToggleHidden()"> 顯示隱藏資料夾</label>
|
|
140
142
|
<button class="ghost" onclick="closeFs()">取消</button>
|
|
141
143
|
<span class="spacer"></span>
|
|
142
144
|
<button onclick="chooseFs()">✓ 選這個資料夾</button>
|
|
@@ -223,16 +225,18 @@ renderSpaces();
|
|
|
223
225
|
|
|
224
226
|
// Folder browser (local mode, pick by clicking): the server lists folders and the page lets you click in and choose one
|
|
225
227
|
if(LOCAL) $("#browsebtn").style.display="";
|
|
226
|
-
$("#browsebtn").onclick = ()=>{ $("#fsmodal").style.display="flex"; fsGo(null); };
|
|
227
|
-
let fsPath=null;
|
|
228
|
+
$("#browsebtn").onclick = ()=>{ $("#fsmodal").style.display="flex"; $("#fshidden").checked=fsShowHidden; fsGo(null); };
|
|
229
|
+
let fsPath=null, fsShowHidden=localStorage.getItem("xk_fshidden")==="1";
|
|
228
230
|
async function fsGo(p){
|
|
229
|
-
const
|
|
231
|
+
const qs=[]; if(p) qs.push("path="+encodeURIComponent(p)); if(fsShowHidden) qs.push("hidden=1");
|
|
232
|
+
const r=await api("/v1/fs"+(qs.length?("?"+qs.join("&")):"")).then(r=>r.json()).catch(()=>({error:"讀取失敗"}));
|
|
230
233
|
if(r.error){ alert(r.error); return; }
|
|
231
234
|
fsPath=r.path; window._fsHome=r.home;
|
|
232
235
|
$("#fspath").textContent="📂 "+r.path;
|
|
233
236
|
const base=r.path.endsWith("/")?r.path:r.path+"/";
|
|
234
|
-
$("#fslist").innerHTML=`<div class="fsrow up" onclick="fsGo('${esc(r.parent)}')">⬆ 上一層</div>`+(r.dirs.length?r.dirs.map(d=>`<div class="fsrow" onclick="fsGo('${esc(base+d)}')">📁 ${esc(d)}</div>`).join(""):`<div class="empty" style="padding:10px">(沒有子資料夾,可直接選這個)</div>`);
|
|
237
|
+
$("#fslist").innerHTML=`<div class="fsrow up" onclick="fsGo('${esc(r.parent)}')">⬆ 上一層</div>`+(r.dirs.length?r.dirs.map(d=>`<div class="fsrow${d.startsWith(".")?" hid":""}" onclick="fsGo('${esc(base+d)}')">📁 ${esc(d)}</div>`).join(""):`<div class="empty" style="padding:10px">(沒有子資料夾,可直接選這個)</div>`);
|
|
235
238
|
}
|
|
239
|
+
function fsToggleHidden(){ fsShowHidden=$("#fshidden").checked; localStorage.setItem("xk_fshidden", fsShowHidden?"1":"0"); fsGo(fsPath); }
|
|
236
240
|
function fsHome(){ fsGo(window._fsHome||null); }
|
|
237
241
|
function closeFs(){ $("#fsmodal").style.display="none"; }
|
|
238
242
|
function chooseFs(){ if(!fsPath) return; const p=fsPath; if(!spaces.includes(p)) spaces.push(p); curSpace=p; localStorage.setItem("xk_spaces",JSON.stringify(spaces)); localStorage.setItem("xk_space",curSpace); renderSpaces(); closeFs(); $("#current").innerHTML=""; refreshAll(); }
|
|
@@ -334,12 +338,21 @@ const CANCELLABLE = ["queued","running","needs-input"];
|
|
|
334
338
|
function todosHtml(p){ if(!p||!(p.todos||[]).length) return ""; const ic=s=>s==="completed"?"☑":s==="in_progress"?"◐":"☐"; return `<div class="todos">${p.todos.map(td=>`<div class="todo ${td.status}">${ic(td.status)} ${esc(td.content)}</div>`).join("")}</div>`; }
|
|
335
339
|
async function cancelTask(id){ await api("/v1/tasks/"+id+"/cancel",{method:"POST"}); for(let i=0;i<10;i++){ await new Promise(r=>setTimeout(r,600)); const t=await api("/v1/tasks/"+id).then(r=>r.json()); liveTask=t; renderCurrent(t); if(["done","error","cancelled"].includes(t.status)){loadHistory();break;} } }
|
|
336
340
|
|
|
341
|
+
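// IME composition flag: while a Chinese/Japanese input method is composing, pause the polling re-render so that rebuilding the input does not wipe the unconfirmed pinyin (which is not in value yet)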
|
|
342
|
+
let composing = false;
|
|
343
|
+
function wireComposition(el){
|
|
344
|
+
if(!el) return;
|
|
345
|
+
el.addEventListener("compositionstart", ()=>{ composing=true; });
|
|
346
|
+
el.addEventListener("compositionend", ()=>{ composing=false; });
|
|
347
|
+
}
|
|
348
|
+
|
|
337
349
|
async function poll() {
|
|
338
350
|
clearTimeout(polling);
|
|
339
351
|
const t = await api("/v1/tasks/"+activeId).then(r=>r.json());
|
|
340
352
|
liveTask = t;
|
|
341
|
-
|
|
342
|
-
if (
|
|
353
|
+
const terminal = t.status==="done" || t.status==="error";
|
|
354
|
+
if (!composing || terminal) renderCurrent(t); // skip re-rendering while composing (but still render once when the task has finished)
|
|
355
|
+
if (terminal) { loadHistory(); loadFiles(); return; }
|
|
343
356
|
if (t.status==="needs-input") return; // wait for the user's answer
|
|
344
357
|
polling = setTimeout(poll, 1200);
|
|
345
358
|
}
|
|
@@ -376,8 +389,8 @@ function renderCurrent(t) {
|
|
|
376
389
|
</div>`:""}
|
|
377
390
|
</div>`;
|
|
378
391
|
if (t.status==="needs-input") {
|
|
379
|
-
const inp = $("#ans"); inp.focus();
|
|
380
|
-
inp.onkeydown = async (e) => { if (e.key==="Enter" && inp.value.trim()) {
|
|
392
|
+
const inp = $("#ans"); inp.focus(); wireComposition(inp);
|
|
393
|
+
inp.onkeydown = async (e) => { if (e.key==="Enter" && !e.isComposing && inp.value.trim()) {
|
|
381
394
|
const ans = inp.value.trim(); inp.disabled = true;
|
|
382
395
|
await api("/v1/tasks/"+t.taskId+"/answer", { method:"POST", body: JSON.stringify({ answer: ans }) });
|
|
383
396
|
poll();
|
|
@@ -387,7 +400,8 @@ function renderCurrent(t) {
|
|
|
387
400
|
const si = $("#steerin");
|
|
388
401
|
if (si) {
|
|
389
402
|
if (ps) { si.value = ps.v; if (ps.f) { si.focus(); try { si.setSelectionRange(ps.s, ps.s); } catch {} } } // restore the half-typed text and caret position
|
|
390
|
-
|
|
403
|
+
wireComposition(si);
|
|
404
|
+
si.onkeydown = async (e) => { if (e.key==="Enter" && !e.isComposing && si.value.trim()) {
|
|
391
405
|
const txt = si.value.trim(); si.value = ""; si.disabled = true;
|
|
392
406
|
await api("/v1/tasks/"+t.taskId+"/steer", { method:"POST", body: JSON.stringify({ text: txt }) });
|
|
393
407
|
si.disabled = false; si.focus();
|
package/src/kernel/compaction.js
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
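// In-turn context compaction, built into the kernel. When the context approaches the model window, older conversation is summarized into one block and the most recent turns are kept,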
|
|
2
2
|
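// so long conversations do not overflow the window. Mirrors xitto-code's compaction.js (self-contained: tokens are roughly estimated as characters/4, with no dependency on pi-coding-agent).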
|
|
3
|
-
import { completeSimple } from '@
|
|
3
|
+
import { completeSimple } from '@earendil-works/pi-ai/compat';
|
|
4
4
|
import { cacheRetentionFor } from './provider.js';
|
|
5
5
|
|
|
6
6
|
export const DEFAULT_COMPACTION = { enabled: true, reserveTokens: 16384, keepRecentTokens: 20000 };
|
package/src/kernel/goal-loop.js
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
// Goal-driven autonomous loop, built into the kernel (domain-agnostic). Given a goal, it repeats runTurn plus an LLM self-check
|
|
2
2
|
// until the goal is met, the round limit is hit, or no progress is made. Mirrors xitto-code's /loop. checkGoal uses the LLM to judge completion.
|
|
3
|
-
import { completeSimple } from '@
|
|
3
|
+
import { completeSimple } from '@earendil-works/pi-ai/compat';
|
|
4
4
|
import { cacheRetentionFor } from './provider.js';
|
|
5
5
|
|
|
6
6
|
const JUDGE_SYS = '你是嚴格的驗收員。依「目標」與「對話進展」判斷目標是否已達成。' +
|
package/src/kernel/provider.js
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
// Provider call adapter: how the kernel invokes the LLM provider correctly (distinct from provider configuration, which belongs to the app).
|
|
2
2
|
// The default streamFn wraps pi-ai's streamSimple and handles prompt-caching compatibility for Anthropic-compatible endpoints.
|
|
3
|
-
import { streamSimple } from '@
|
|
3
|
+
import { streamSimple } from '@earendil-works/pi-ai/compat';
|
|
4
4
|
|
|
5
5
|
// Whether prompt caching should be turned off for this model: 'none' = off.
|
|
6
6
|
// pi-ai adds cache_control for every anthropic-messages provider by default, but only genuine Anthropic endpoints support it;
|