sentinel-agentos 0.1.0
- package/LICENSE +21 -0
- package/README.md +636 -0
- package/dist/api.d.ts +151 -0
- package/dist/api.d.ts.map +1 -0
- package/dist/api.js +179 -0
- package/dist/api.js.map +1 -0
- package/dist/cli.d.ts +14 -0
- package/dist/cli.d.ts.map +1 -0
- package/dist/cli.js +182 -0
- package/dist/cli.js.map +1 -0
- package/dist/core.d.ts +139 -0
- package/dist/core.d.ts.map +1 -0
- package/dist/core.js +247 -0
- package/dist/core.js.map +1 -0
- package/dist/evaluator/exec-evaluator.d.ts +102 -0
- package/dist/evaluator/exec-evaluator.d.ts.map +1 -0
- package/dist/evaluator/exec-evaluator.js +266 -0
- package/dist/evaluator/exec-evaluator.js.map +1 -0
- package/dist/evaluator/feedback.d.ts +66 -0
- package/dist/evaluator/feedback.d.ts.map +1 -0
- package/dist/evaluator/feedback.js +195 -0
- package/dist/evaluator/feedback.js.map +1 -0
- package/dist/evaluator/profiler.d.ts +53 -0
- package/dist/evaluator/profiler.d.ts.map +1 -0
- package/dist/evaluator/profiler.js +108 -0
- package/dist/evaluator/profiler.js.map +1 -0
- package/dist/guard/audit-log.d.ts +75 -0
- package/dist/guard/audit-log.d.ts.map +1 -0
- package/dist/guard/audit-log.js +207 -0
- package/dist/guard/audit-log.js.map +1 -0
- package/dist/guard/risk-gate.d.ts +97 -0
- package/dist/guard/risk-gate.d.ts.map +1 -0
- package/dist/guard/risk-gate.js +160 -0
- package/dist/guard/risk-gate.js.map +1 -0
- package/dist/guard/sandbox.d.ts +112 -0
- package/dist/guard/sandbox.d.ts.map +1 -0
- package/dist/guard/sandbox.js +379 -0
- package/dist/guard/sandbox.js.map +1 -0
- package/dist/guard/schema-gate.d.ts +90 -0
- package/dist/guard/schema-gate.d.ts.map +1 -0
- package/dist/guard/schema-gate.js +452 -0
- package/dist/guard/schema-gate.js.map +1 -0
- package/dist/guard/snapshot-verify.d.ts +111 -0
- package/dist/guard/snapshot-verify.d.ts.map +1 -0
- package/dist/guard/snapshot-verify.js +578 -0
- package/dist/guard/snapshot-verify.js.map +1 -0
- package/dist/index.d.ts +28 -0
- package/dist/index.d.ts.map +1 -0
- package/dist/index.js +59 -0
- package/dist/index.js.map +1 -0
- package/dist/memory/episodic.d.ts +76 -0
- package/dist/memory/episodic.d.ts.map +1 -0
- package/dist/memory/episodic.js +289 -0
- package/dist/memory/episodic.js.map +1 -0
- package/dist/memory/semantic.d.ts +69 -0
- package/dist/memory/semantic.d.ts.map +1 -0
- package/dist/memory/semantic.js +243 -0
- package/dist/memory/semantic.js.map +1 -0
- package/dist/memory/working.d.ts +53 -0
- package/dist/memory/working.d.ts.map +1 -0
- package/dist/memory/working.js +150 -0
- package/dist/memory/working.js.map +1 -0
- package/dist/middleware/openclaw.d.ts +45 -0
- package/dist/middleware/openclaw.d.ts.map +1 -0
- package/dist/middleware/openclaw.js +95 -0
- package/dist/middleware/openclaw.js.map +1 -0
- package/dist/middleware/wrapper.d.ts +54 -0
- package/dist/middleware/wrapper.d.ts.map +1 -0
- package/dist/middleware/wrapper.js +155 -0
- package/dist/middleware/wrapper.js.map +1 -0
- package/dist/server.d.ts +45 -0
- package/dist/server.d.ts.map +1 -0
- package/dist/server.js +229 -0
- package/dist/server.js.map +1 -0
- package/dist/types/index.d.ts +201 -0
- package/dist/types/index.d.ts.map +1 -0
- package/dist/types/index.js +4 -0
- package/dist/types/index.js.map +1 -0
- package/package.json +64 -0
package/LICENSE
ADDED
|
@@ -0,0 +1,21 @@
|
|
|
1
|
+
MIT License
|
|
2
|
+
|
|
3
|
+
Copyright (c) 2026 jishuanjimingtian
|
|
4
|
+
|
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
|
7
|
+
in the Software without restriction, including without limitation the rights
|
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
|
10
|
+
furnished to do so, subject to the following conditions:
|
|
11
|
+
|
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
|
13
|
+
copies or substantial portions of the Software.
|
|
14
|
+
|
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
|
21
|
+
SOFTWARE.
|
package/README.md
ADDED
|
@@ -0,0 +1,636 @@
|
|
|
1
|
+
# Sentinel AgentOS — AI Agent 操作系统 · AI Agent Operating System
|
|
2
|
+
|
|
3
|
+
> **确定性 Guard 层 + 分层记忆 + 自动评估,让任何 Agent 变得可靠、可审计、可改进。**
|
|
4
|
+
> *Deterministic Guard Layer + Layered Memory + Automated Evaluation — making any Agent reliable, auditable, and self-improving.*
|
|
5
|
+
|
|
6
|
+
[](https://www.typescriptlang.org/)
|
|
7
|
+
[](https://github.com/jishuanjimingtian/sentinel-agentos/actions)
|
|
8
|
+
[](./LICENSE)
|
|
9
|
+
|
|
10
|
+
---
|
|
11
|
+
|
|
12
|
+
## 🤔 为什么需要 Sentinel AgentOS · Why Sentinel AgentOS
|
|
13
|
+
|
|
14
|
+
AI Agent 面临五大核心问题,现有框架都没能真正解决。
|
|
15
|
+
*AI Agents face five critical problems that no existing framework truly solves.*
|
|
16
|
+
|
|
17
|
+
| 痛点 · Pain | 现状 · Status Quo | Sentinel AgentOS 方案 · Solution |
|
|
18
|
+
|------|------|-------------|
|
|
19
|
+
| 🔴 **幻觉导致错误操作** · *Hallucinated operations* | Prompt 里说"不要删文件"——这是愿望,不是约束 · *"Please don't delete files" — that's a wish, not a constraint* | Guard 层确定性校验,不依赖 LLM 判断 · *Deterministic Guard checks, zero LLM dependency* |
|
|
20
|
+
| 🔴 **越权/危险操作** · *Over-privileged ops* | 无分级控制,要么全禁要么全放 · *All-or-nothing access control* | Risk Gate 四维数学公式,0-100 自动分级 · *4-dimensional risk formula with auto-thresholding* |
|
|
21
|
+
| 🔴 **记不住、记不对** · *Poor memory* | 把对话扔进向量库——只有检索没有理解 · *Dump conversation into vector DB — retrieval, not understanding* | 三层记忆,像人脑一样评级、压缩、遗忘 · *3-layer memory: rate, compress, forget like a brain* |
|
|
22
|
+
| 🔴 **出事查不到原因** · *No audit trail* | Agent 做了什么、为什么做——全不可追溯 · *What the agent did and why — completely untraceable* | 每次操作前后 diff,JSONL 不可篡改审计 · *Pre/post diff per operation, immutable JSONL audit* |
|
|
23
|
+
| 🔴 **不知道 Agent 好不好** · *No quality measurement* | 最多有个 success rate 计数器 · *At best, a success-rate counter* | 三阶段评估 + 隐性反馈 + 质量画像 · *3-phase evaluation + implicit feedback + quality profile* |
|
|
24
|
+
|
|
25
|
+
---
|
|
26
|
+
|
|
27
|
+
## 🏗️ 架构设计 · Architecture
|
|
28
|
+
|
|
29
|
+
```
|
|
30
|
+
┌─────────────────────────────────────────────────────────────────┐
|
|
31
|
+
│ Sentinel AgentOS 架构 │
|
|
32
|
+
│ │
|
|
33
|
+
│ 任意 Agent 框架 · Any Agent Framework │
|
|
34
|
+
│ (OpenClaw / LangChain / CrewAI / 自研 · Custom) │
|
|
35
|
+
│ │ │
|
|
36
|
+
│ ▼ │
|
|
37
|
+
│ ┌───────────────────────────────────────────────────────────┐ │
|
|
38
|
+
│ │ Sentinel AgentOS 内核 · Kernel │ │
|
|
39
|
+
│ │ │ │
|
|
40
|
+
│ │ ┌─────────┐ ┌──────────┐ ┌────────────┐ │ │
|
|
41
|
+
│ │ │ Guard │ │ Memory │ │ Evaluator │ │ │
|
|
42
|
+
│ │ │ 守卫层 │ │ 记忆层 │ │ 评估层 │ │ │
|
|
43
|
+
│ │ ├─────────┤ ├──────────┤ ├────────────┤ │ │
|
|
44
|
+
│ │ │ Schema │ │ Working │ │ Pre-exec │ │ │
|
|
45
|
+
│ │ │ Gate │ │ 工作记忆 │ │ 执行前评估 │ │ │
|
|
46
|
+
│ │ │ ↓ │ │ ↓ │ │ ↓ │ │ │
|
|
47
|
+
│ │ │ Risk │ │ Episodic │ │ Runtime │ │ │
|
|
48
|
+
│ │ │ Gate │ │ 情景记忆 │ │ 执行中评估 │ │ │
|
|
49
|
+
│ │ │ ↓ │ │ ↓ │ │ ↓ │ │ │
|
|
50
|
+
│ │ │Snapshot │ │ Semantic │ │ Post-exec │ │ │
|
|
51
|
+
│ │ │ Gate │ │ 语义记忆 │ │ 执行后评估 │ │ │
|
|
52
|
+
│ │ │ ↓ │ └──────────┘ │ ↓ │ │ │
|
|
53
|
+
│ │ │ Verify │ │ Feedback │ │ │
|
|
54
|
+
│ │ │ Gate │ │ 反馈引擎 │ │ │
|
|
55
|
+
│ │ │ ↓ │ │ ↓ │ │ │
|
|
56
|
+
│ │ │ Audit │ │ Profiler │ │ │
|
|
57
|
+
│ │ │ Log │ │ 质量画像 │ │ │
|
|
58
|
+
│ │ └─────────┘ └────────────┘ │ │
|
|
59
|
+
│ │ │ │
|
|
60
|
+
│ │ ┌─────────────────────────────────────────────────────┐ │ │
|
|
61
|
+
│ │ │ Sandbox 沙箱 (direct/sandbox/dry-run) │ │ │
|
|
62
|
+
│ │ └─────────────────────────────────────────────────────┘ │ │
|
|
63
|
+
│ └───────────────────────────────────────────────────────────┘ │
|
|
64
|
+
│ │ │
|
|
65
|
+
│ ┌───────────────┴───────────────┐ │
|
|
66
|
+
│ ▼ ▼ │
|
|
67
|
+
│ 安全执行 · Safe Execution 可靠记忆 · Reliable Memory │
|
|
68
|
+
│ 全程审计 · Full Audit 持续改进 · Continuous Improve│
|
|
69
|
+
└─────────────────────────────────────────────────────────────────┘
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
### 设计哲学 · Design Philosophy
|
|
73
|
+
|
|
74
|
+
| 原则 · Principle | 含义 · Meaning | 反例 · Counter-example |
|
|
75
|
+
|------|------|------|
|
|
76
|
+
| **确定性优先** · *Determinism first* | 能不用 LLM 就不用,用确定性代码 · *Use deterministic code whenever possible* | 用 LLM 做安全判断 · *Using LLM for security* |
|
|
77
|
+
| **可审计优先** · *Auditability first* | 所有操作可追溯、可回滚、可解释 · *Every operation traceable, rollbackable, explainable* | Agent 操作后无日志 · *No log after agent operations* |
|
|
78
|
+
| **渐进增强** · *Progressive enhancement* | 框架无关,可增量接入,不要求替换现有架构 · *Framework-agnostic, incremental adoption* | LangChain 的强绑定 · *Hard vendor lock-in* |
|
|
79
|
+
|
|
80
|
+
### 不是 Agent,是 Agent 的操作系统 · *Not an Agent — an OS for Agents*
|
|
81
|
+
|
|
82
|
+
| 类比 · Analogy | 对应 · Implementation |
|
|
83
|
+
|------|------|
|
|
84
|
+
| 应用程序 · Applications | 任意 Agent 框架 · Any Agent Framework |
|
|
85
|
+
| 操作系统 · Operating System | Sentinel AgentOS |
|
|
86
|
+
| 内核 · Kernel | Schema Gate + Risk Gate (deterministic code, zero LLM) |
|
|
87
|
+
| 文件系统 · File System | Layered Memory Store |
|
|
88
|
+
| 日志系统 · Logging | Audit Log(不可篡改,支持回滚 · *immutable, supports rollback*) |
|
|
89
|
+
| 性能监控 · Performance Monitor | Evaluator 三阶段评估 · *3-phase evaluation* |
|
|
90
|
+
|
|
91
|
+
---
|
|
92
|
+
|
|
93
|
+
## ✨ 功能介绍 · Features
|
|
94
|
+
|
|
95
|
+
### 🛡️ Guard 守卫层(6 个组件,零 LLM 依赖 · *6 components, zero LLM*)
|
|
96
|
+
|
|
97
|
+
#### Schema Gate — 参数格式校验 · Parameter Validation
|
|
98
|
+
|
|
99
|
+
| 校验项 · Check | 说明 · Description | 示例 · Example |
|
|
100
|
+
|--------|------|------|
|
|
101
|
+
| 必填 · required | 字段必须存在 · *Field must exist* | `delete_file` must provide `path` |
|
|
102
|
+
| 类型 · types | string/number/boolean/object/array | `path` must be a string |
|
|
103
|
+
| 允许值 · allowedValues | 枚举约束 · *Enum constraint* | `mode` must be one of `read`/`write`/`append` |
|
|
104
|
+
| 数值范围 · min/max | 数字范围 · *Numeric range* | `max_tokens` 1-100000 |
|
|
105
|
+
| 长度范围 · Length range | 字符串/数组长度 · *String/array length* | `query` must be at least 3 characters |
|
|
106
|
+
| 正则 · patterns | 格式校验 · *Pattern match* | `email` must match an email format |
|
|
107
|
+
| 路径约束 · pathScope | 限制在 workspace 内 · *Workspace boundary* | 禁止写系统目录 · *No system dir writes* |
|
|
108
|
+
| 路径黑白名单 · pathAllow/Deny | 允许/禁止的文件模式 · *File pattern allow/block* | Deny `.env`, `*.key`, `*.pem` |
|
|
109
|
+
| 参数依赖 · dependsOn | 条件必填 · *Conditional required* | `auto_merge=true` → `base_branch` is required |
|
|
110
|
+
| 参数互斥 · mutuallyExclusive | 互斥参数 · *Mutually exclusive* | `content` and `file_path` must not both be present |
|
|
111
|
+
| 参数大小 · maxSize | 内容字节上限 · *Max content bytes* | `content` ≤ 1MB |
|
|
112
|
+
| 敏感标记 · secrets | 日志中脱敏 · *Redact in logs* | |
|
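A few of the checks in the table above can be sketched in plain TypeScript. This is a minimal illustration, not the package's implementation: `SketchRule`, `validateParams`, and the error strings are hypothetical names, and only the rule keys (`required`, `types`, `allowedValues`, `mutuallyExclusive`) are taken from the table.

```typescript
// Hypothetical rule shape. Key names follow the table; everything else is assumed.
type ParamType = 'string' | 'number' | 'boolean' | 'object' | 'array';

interface SketchRule {
  tool: string;
  required?: string[];
  types?: Record<string, ParamType>;
  allowedValues?: Record<string, unknown[]>;
  mutuallyExclusive?: string[][];
}

// Map a runtime value to one of the table's type names (arrays are not "object").
function typeOf(value: unknown): string {
  if (value === null) return 'null';
  return Array.isArray(value) ? 'array' : typeof value;
}

// Returns a list of violations; an empty list means the parameters pass.
function validateParams(rule: SketchRule, params: Record<string, unknown>): string[] {
  const errors: string[] = [];
  for (const key of rule.required ?? []) {
    if (params[key] === undefined) errors.push(`missing required parameter: ${key}`);
  }
  for (const [key, expected] of Object.entries(rule.types ?? {})) {
    if (params[key] !== undefined && typeOf(params[key]) !== expected) {
      errors.push(`${key} must be ${expected}`);
    }
  }
  for (const [key, allowed] of Object.entries(rule.allowedValues ?? {})) {
    if (params[key] !== undefined && !allowed.includes(params[key])) {
      errors.push(`${key} has a value outside the allowed set`);
    }
  }
  for (const group of rule.mutuallyExclusive ?? []) {
    const present = group.filter((key) => params[key] !== undefined);
    if (present.length > 1) errors.push(`mutually exclusive: ${present.join(', ')}`);
  }
  return errors;
}
```

Because every check is a pure function of the rule and the parameters, the same input always yields the same verdict, which is the property the Guard layer relies on.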
|
113
|
+
|
|
114
|
+
#### Risk Gate — 风险分级 · Risk Scoring
|
|
115
|
+
|
|
116
|
+
四维数学公式,零 LLM · *4-dimensional formula:*
|
|
117
|
+
|
|
118
|
+
```
|
|
119
|
+
RiskScore = Impact × (1 - Reversibility) × Sensitivity × (1 + ErrorRate)
|
|
120
|
+
```
|
|
121
|
+
|
|
122
|
+
| 分数区间 · Score | 动作 · Action |
|
|
123
|
+
|----------|------|
|
|
124
|
+
| ≤ 0.5 | 🟢 自动放行 · *Auto-approve* |
|
|
125
|
+
| ≤ 1.0 | 🔵 执行后通知 · *Notify after execution* |
|
|
126
|
+
| ≤ 3.0 | 🟡 暂停等待确认 · *Pause for confirmation* |
|
|
127
|
+
| > 8.0 | 🔴 直接拒绝 · *Deny* |
|
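The formula and the threshold table can be read as two small functions. The sketch below is an illustration under stated assumptions: the input ranges are guesses (the README does not give the scale of each factor), the `notify` action name is hypothetical (`auto`, `confirm`, and `deny` appear in the usage examples), and scores in (3.0, 8.0], which the table does not cover, are assumed to still require confirmation.

```typescript
// Inputs to the documented formula. Ranges are assumptions for illustration.
interface RiskFactors {
  impact: number;        // assumed >= 0
  reversibility: number; // assumed 0..1, where 1 means fully reversible
  sensitivity: number;   // assumed >= 0
  errorRate: number;     // assumed 0..1, historical failure rate
}

type RiskAction = 'auto' | 'notify' | 'confirm' | 'deny';

// RiskScore = Impact x (1 - Reversibility) x Sensitivity x (1 + ErrorRate)
function riskScore(f: RiskFactors): number {
  return f.impact * (1 - f.reversibility) * f.sensitivity * (1 + f.errorRate);
}

// Thresholds from the table. The (3.0, 8.0] band is not documented;
// this sketch assumes it maps to 'confirm'.
function riskAction(score: number): RiskAction {
  if (score <= 0.5) return 'auto';
  if (score <= 1.0) return 'notify';
  if (score <= 8.0) return 'confirm';
  return 'deny';
}
```

For example, `impact = 4`, `reversibility = 0.5`, `sensitivity = 2`, `errorRate = 0.25` gives 4 × 0.5 × 2 × 1.25 = 5, which this sketch maps to `confirm`.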
|
128
|
+
|
|
129
|
+
#### Snapshot Gate — 执行前快照 · Pre-exec Snapshot
|
|
130
|
+
|
|
131
|
+
只记录文件 SHA-256 hash + git 状态,不做全量备份,极快。
|
|
132
|
+
*Records only SHA-256 hashes + git state, no full backup — extremely fast.*
|
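Since a snapshot holds only hashes, comparing the state before and after an operation reduces to comparing two small maps. The sketch below assumes a snapshot is a plain `path -> digest` record with `null` for a missing file; `HashSnapshot`, `SnapshotDiff`, and `diffSnapshots` are hypothetical names, and computing the SHA-256 digests themselves is left out.

```typescript
// Hypothetical snapshot shape: path -> hex digest, or null if the file did not exist.
type HashSnapshot = Record<string, string | null>;

interface SnapshotDiff {
  created: string[];
  deleted: string[];
  modified: string[];
  unchanged: string[];
}

// Classify every path seen in either snapshot by comparing digests only.
function diffSnapshots(before: HashSnapshot, after: HashSnapshot): SnapshotDiff {
  const diff: SnapshotDiff = { created: [], deleted: [], modified: [], unchanged: [] };
  const paths = Array.from(new Set(Object.keys(before).concat(Object.keys(after)))).sort();
  for (const path of paths) {
    const a = before[path] ?? null;
    const b = after[path] ?? null;
    if (a === null && b === null) continue; // never existed
    if (a === null) diff.created.push(path);
    else if (b === null) diff.deleted.push(path);
    else if (a !== b) diff.modified.push(path);
    else diff.unchanged.push(path);
  }
  return diff;
}
```

A post-execution check can then confirm that a file the agent claims to have written shows up under `created` or `modified`.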
|
133
|
+
|
|
134
|
+
#### Verify Gate — 执行后校验 · Post-exec Verification
|
|
135
|
+
|
|
136
|
+
**8 项确定性校验** · *8 deterministic checks:*
|
|
137
|
+
|
|
138
|
+
| 校验项 · Check | 说明 · Description |
|
|
139
|
+
|--------|------|
|
|
140
|
+
| 文件存在 · File existence | `fs.existsSync()` confirms that files claimed as created exist |
|
|
141
|
+
| 文件变更 · File changed | Compares against the Snapshot hash to confirm the file really changed |
|
|
142
|
+
| Lint | ESLint checks code files |
|
|
143
|
+
| TypeCheck | `tsc --noEmit` TypeScript check |
|
|
144
|
+
| 格式合法 · Valid format | `JSON.parse` validates results claimed to be JSON |
|
|
145
|
+
| 返回值非空 · Non-empty result | Empty where a value is expected → WARN |
|
|
146
|
+
| npm 发布 · npm publish | Verified against the registry with `npm view` |
|
|
147
|
+
| git push | Compares HEAD against `git ls-remote` |
|
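Two of the checks above need no external tools and can be sketched directly. This is an illustration only: `CheckResult`, the check identifiers, and the `PASS` status name are assumptions (the README itself names only `WARN` and `FAIL`), and combining results by taking the worst status is one plausible policy rather than the documented one.

```typescript
type CheckStatus = 'PASS' | 'WARN' | 'FAIL';

interface CheckResult {
  check: string;
  status: CheckStatus;
  detail?: string;
}

// "Valid format": a result that claims to be JSON must actually parse.
function checkJsonFormat(claimedJson: string): CheckResult {
  try {
    JSON.parse(claimedJson);
    return { check: 'valid_format', status: 'PASS' };
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    return { check: 'valid_format', status: 'FAIL', detail };
  }
}

// "Non-empty result": an empty value where one is expected is a warning, not a failure.
function checkNonEmpty(result: unknown): CheckResult {
  const empty =
    result === null ||
    result === undefined ||
    (typeof result === 'string' && result.trim() === '') ||
    (Array.isArray(result) && result.length === 0);
  return { check: 'non_empty_result', status: empty ? 'WARN' : 'PASS' };
}

// Assumed policy: the overall status is the worst individual status.
function combineChecks(results: CheckResult[]): CheckStatus {
  if (results.some((r) => r.status === 'FAIL')) return 'FAIL';
  if (results.some((r) => r.status === 'WARN')) return 'WARN';
  return 'PASS';
}
```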
|
148
|
+
|
|
149
|
+
#### Audit Log — 不可篡改审计 · Immutable Audit
|
|
150
|
+
|
|
151
|
+
追加写入 JSONL 文件,每次操作前后完整记录(Schema + Risk + Snapshot + Verify + Diff)。支持按 session/tool/status 查询。
|
|
152
|
+
*Append-only JSONL with full pre/post state per operation. Query by session/tool/status.*
|
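The append-only JSONL idea is easy to model in memory: one JSON document per line, lines are only added, and queries filter on a few fields. The sketch below is not the package's audit log. `AuditEntrySketch`, `InMemoryAuditLog`, and the `status` values are hypothetical, and real entries carry far more (schema, risk, snapshot, verify, and diff data).

```typescript
// Hypothetical, reduced entry shape: only the queryable fields are modelled.
interface AuditEntrySketch {
  id: string;
  sessionId: string;
  toolName: string;
  status: 'success' | 'failure';
}

class InMemoryAuditLog {
  // Each element is one JSONL line. Lines are appended, never edited or removed.
  private readonly lines: string[] = [];

  append(entry: AuditEntrySketch): void {
    this.lines.push(JSON.stringify(entry));
  }

  // Filter by any combination of session, tool, and status.
  query(filter: Partial<Omit<AuditEntrySketch, 'id'>>): AuditEntrySketch[] {
    return this.lines
      .map((line) => JSON.parse(line) as AuditEntrySketch)
      .filter((e) =>
        (filter.sessionId === undefined || e.sessionId === filter.sessionId) &&
        (filter.toolName === undefined || e.toolName === filter.toolName) &&
        (filter.status === undefined || e.status === filter.status));
  }

  // The text that would be written to a .jsonl file: one entry per line.
  toJsonl(): string {
    return this.lines.map((line) => line + '\n').join('');
  }
}
```

Note that an in-process array is append-only by convention only; tamper resistance on disk would need additional measures that this sketch does not attempt.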
|
153
|
+
|
|
154
|
+
#### Rollback
|
|
155
|
+
|
|
156
|
+
基于 Snapshot + Git 自动回滚。Verify Gate FAIL + 高风险 → 自动触发。
|
|
157
|
+
*Automatic rollback based on Snapshot + Git. Triggered automatically when the Verify Gate fails on a high-risk operation.*
|
|
158
|
+
|
|
159
|
+
#### Sandbox Executor
|
|
160
|
+
|
|
161
|
+
三种模式 · *Three modes:* **direct** · **sandbox** · **dry-run**
|
|
162
|
+
|
|
163
|
+
- 网络策略 · *Network policy:* none / localhost / whitelist
|
|
164
|
+
- 文件系统策略 · *Filesystem policy:* writablePaths / readonlyPaths
|
|
165
|
+
- 内置危险命令检测(`rm -rf /`、`sudo`、fork bomb、`curl\|bash` 等)· *Built-in dangerous command detection*
|
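Pattern-based detection of dangerous commands, as listed above, can be sketched with a small table of regular expressions. The labels and patterns here are hypothetical and far from exhaustive; they illustrate the technique and are not the package's built-in list.

```typescript
// Hypothetical pattern table covering the four examples named in the list above.
const DANGEROUS_PATTERNS: Array<{ label: string; pattern: RegExp }> = [
  // rm -rf (or -fr) aimed at an absolute path, including "/"
  { label: 'rm-rf-absolute-path', pattern: /\brm\s+-(?:rf|fr)\s+\// },
  { label: 'sudo', pattern: /\bsudo\b/ },
  // classic shell fork bomb  :(){ :|:& };:
  { label: 'fork-bomb', pattern: /:\(\)\s*\{\s*:\|:&\s*\}\s*;\s*:/ },
  // downloading a script and piping it straight into a shell
  { label: 'pipe-download-to-shell', pattern: /\b(?:curl|wget)\b[^|]*\|\s*(?:ba)?sh\b/ },
];

// Returns the labels of all patterns that match; an empty array means no match.
function detectDangerous(command: string): string[] {
  return DANGEROUS_PATTERNS
    .filter(({ pattern }) => pattern.test(command))
    .map(({ label }) => label);
}
```

Pattern matching of this kind catches obvious cases but is simple to evade (aliases, encodings, indirect invocation), which fits the README's own caveat that the sandbox is not container-level isolation.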
|
166
|
+
|
|
167
|
+
---
|
|
168
|
+
|
|
169
|
+
### 🧠 Memory 记忆层(3 层,人脑模型 · *3-layer brain model*)
|
|
170
|
+
|
|
171
|
+
```
|
|
172
|
+
Working Memory → Episodic Memory → Semantic Memory
|
|
173
|
+
工作记忆 → 情景记忆 → 语义记忆
|
|
174
|
+
──────────────────────────────────────────────────────
|
|
175
|
+
当前会话 → 跨会话事件线 → 永久知识
|
|
176
|
+
Current session → Cross-session → Permanent knowledge
|
|
177
|
+
Lives 1 session → Weeks to months → Permanent
|
|
178
|
+
< 50KB → < 500KB → < 100KB
|
|
179
|
+
```
|
|
180
|
+
|
|
181
|
+
| 层 · Layer | 用途 · Purpose | 关键能力 · Key Ability |
|
|
182
|
+
|----|------|---------|
|
|
183
|
+
| **Working** | 当前会话实时上下文 · *Live session context* | 消息/任务/工具缓存/文件/token 预算 · *Messages, tasks, tool cache, open files, token budget* |
|
|
184
|
+
| **Episodic** | 跨会话事件时间线 · *Cross-session timeline* | 9 event types, automatic importance scoring, progressive compression (full→summary→one-liner→forgotten) |
|
|
185
|
+
| **Semantic** | 提炼后的永久知识 · *Distilled permanent knowledge* | User preferences and facts, project context, learned rules (with confidence), glossary |
|
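The progressive compression in the Episodic row (full→summary→one-liner→forgotten) can be pictured as a function of an event's age and importance. The policy below is purely an assumption for illustration: the README does not state the real cut-offs, so the 7/30/90-day steps, the 0..1 importance scale, and the 0.8 threshold are all hypothetical.

```typescript
type CompressionLevel = 'full' | 'summary' | 'one-liner' | 'forgotten';

const LEVELS: CompressionLevel[] = ['full', 'summary', 'one-liner', 'forgotten'];

// Assumed policy: an event drops one level at 7, 30, and 90 days of age, and
// important events (importance >= 0.8 on an assumed 0..1 scale) lag one level behind.
function compressionLevel(ageDays: number, importance: number): CompressionLevel {
  let step = 0;
  if (ageDays >= 7) step = 1;
  if (ageDays >= 30) step = 2;
  if (ageDays >= 90) step = 3;
  if (importance >= 0.8) step = Math.max(0, step - 1);
  return LEVELS[step];
}
```

Under this assumed policy a routine event from 100 days ago is forgotten, while an important one of the same age survives as a one-liner.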
|
186
|
+
|
|
187
|
+
#### Session 启动上下文注入 · Startup Context Injection
|
|
188
|
+
|
|
189
|
+
新 session 自动从 Semantic + Episodic 注入最相关上下文:
|
|
190
|
+
*Auto-injects relevant context at every new session:*
|
|
191
|
+
|
|
192
|
+
```
|
|
193
|
+
[Sentinel AgentOS Memory Context]
|
|
194
|
+
你正在帮助用户"老板"处理项目"coderev"。
|
|
195
|
+
You are helping user "Boss" with project "coderev".
|
|
196
|
+
上次会话讨论了 Guard 层设计。
|
|
197
|
+
Last session discussed Guard layer design.
|
|
198
|
+
关键提醒 · Key reminders:
|
|
199
|
+
- 老板偏好直接、不说废话 · Boss prefers direct, no fluff
|
|
200
|
+
- 发布 npm 前必须更新 CHANGELOG.md · Update CHANGELOG before npm publish
|
|
201
|
+
```
|
|
202
|
+
|
|
203
|
+
---
|
|
204
|
+
|
|
205
|
+
### 📊 Evaluator 评估层(三阶段 + 隐性反馈 · *3-phase + Implicit Feedback*)
|
|
206
|
+
|
|
207
|
+
#### 三阶段评估 · Three-Phase Evaluation
|
|
208
|
+
|
|
209
|
+
```
|
|
210
|
+
Pre-exec 评估 → Runtime 评估 → Post-exec 评估
|
|
211
|
+
执行前评估 执行中评估 执行后评估
|
|
212
|
+
↓ ↓ ↓
|
|
213
|
+
Parameter quality Retry count Verification result
|
|
214
|
+
Risk score Adaptive scoring User acceptance
|
|
215
|
+
Context utilization Tool-selection accuracy Result utilization
|
|
216
|
+
```
|
|
217
|
+
|
|
218
|
+
#### 隐性反馈捕获 · Implicit Feedback(核心差异点 · *Key differentiator*)
|
|
219
|
+
|
|
220
|
+
不靠"👍👎",靠行为推断满意度。*No thumbs up/down — infer satisfaction from behavior.*
|
|
221
|
+
|
|
222
|
+
| 用户行为 · User Behavior | 隐性信号 · Signal | 强度 · Strength |
|
|
223
|
+
|----------|---------|------|
|
|
224
|
+
| 用户删除了 Agent 创建的代码 · *User deleted agent's code* | `user_deleted_code` | -0.8 |
|
|
225
|
+
| 用户打断了 Agent · *User interrupted agent* | `user_interrupted` | -0.6 |
|
|
226
|
+
| 用户修改了 Agent 输出 · *User modified agent output* | `user_modified_output` | -0.5 |
|
|
227
|
+
| 用户重复了相同指令 · *User repeated same command* | `user_repeated_instruction` | -0.3 |
|
|
228
|
+
| 用户立即继续对话 · *User immediately continued* | `user_immediate_continue` | +0.3 |
|
|
229
|
+
| 用户说"做得好" · *User said "good job"* | `user_explicit_approval` | +0.6 |
|
|
230
|
+
| 用户使用了 Agent 的结果 · *User used agent's result* | `user_used_result` | +0.7 |
|
|
231
|
+
| 用户分享了 Agent 输出 · *User shared agent output* | `user_shared_output` | +0.8 |
|
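The signal table maps directly onto a lookup plus a simple aggregate. In the sketch below the signal names and strengths are copied from the table, and the field names `totalSignals`, `negativeSignals`, and `averageStrength` match the stats shown in the examples later in this README. The aggregation itself (a plain mean) is an assumption, and `feedbackStats` is a hypothetical helper, not the package's API.

```typescript
// Signal names and strengths copied from the table above.
const SIGNAL_STRENGTH = {
  user_deleted_code: -0.8,
  user_interrupted: -0.6,
  user_modified_output: -0.5,
  user_repeated_instruction: -0.3,
  user_immediate_continue: 0.3,
  user_explicit_approval: 0.6,
  user_used_result: 0.7,
  user_shared_output: 0.8,
} as const;

type FeedbackSignal = keyof typeof SIGNAL_STRENGTH;

interface FeedbackStats {
  totalSignals: number;
  negativeSignals: number;
  averageStrength: number;
}

// Assumed aggregation: count the signals and take the unrounded mean strength.
function feedbackStats(signals: FeedbackSignal[]): FeedbackStats {
  const strengths: number[] = signals.map((s) => SIGNAL_STRENGTH[s]);
  const sum = strengths.reduce((acc, x) => acc + x, 0);
  return {
    totalSignals: strengths.length,
    negativeSignals: strengths.filter((x) => x < 0).length,
    averageStrength: strengths.length === 0 ? 0 : sum / strengths.length,
  };
}
```

For one `user_deleted_code` followed by three `user_modified_output` signals, the mean is (-0.8 - 1.5) / 4 = -0.575 before any rounding.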
|
232
|
+
|
|
233
|
+
#### Agent 质量画像 · Quality Profile
|
|
234
|
+
|
|
235
|
+
```
|
|
236
|
+
=== Sentinel AgentOS Status Report ===
|
|
237
|
+
|
|
238
|
+
Quality Score: 85/100 📈
|
|
239
|
+
Total Operations: 156 (12 in last 24h)
|
|
240
|
+
|
|
241
|
+
--- Breakdown ---
|
|
242
|
+
Pre-Exec: 92/100
|
|
243
|
+
Runtime: 88/100
|
|
244
|
+
Post-Exec: 85/100
|
|
245
|
+
Satisfaction: 82/100
|
|
246
|
+
|
|
247
|
+
--- Audit ---
|
|
248
|
+
Total: 156 | Failures: 2 | High-Risk: 3
|
|
249
|
+
|
|
250
|
+
--- ⚠️ Warnings ---
|
|
251
|
+
- 2 verify failures in last 24h — review session #3
|
|
252
|
+
24小时内2次校验失败——检查session #3
|
|
253
|
+
|
|
254
|
+
--- ✅ Strengths 强项 ---
|
|
255
|
+
- Excellent execution reliability · 优秀的执行可靠性
|
|
256
|
+
- Strong positive user feedback · 强烈正向用户反馈
|
|
257
|
+
```
|
|
258
|
+
|
|
259
|
+
---
|
|
260
|
+
|
|
261
|
+
## 📦 安装 · Installation
|
|
262
|
+
|
|
263
|
+
```bash
|
|
264
|
+
npm install sentinel-agentos
|
|
265
|
+
```
|
|
266
|
+
|
|
267
|
+
⚠️ **npm 包尚未发布 · *not yet published***,当前从源码使用 · *use from source:*
|
|
268
|
+
|
|
269
|
+
```bash
|
|
270
|
+
git clone git@github.com:jishuanjimingtian/sentinel-agentos.git
|
|
271
|
+
cd sentinel-agentos
|
|
272
|
+
npm install
|
|
273
|
+
npm test # 99 tests, all passing · 99个测试全部通过
|
|
274
|
+
npm run build # compile to dist/
|
|
275
|
+
```
|
|
276
|
+
|
|
277
|
+
---
|
|
278
|
+
|
|
279
|
+
## 🚀 使用说明 · Usage
|
|
280
|
+
|
|
281
|
+
### 基础用法 · Basic
|
|
282
|
+
|
|
283
|
+
```typescript
|
|
284
|
+
import { AgentOS } from 'sentinel-agentos';
|
|
285
|
+
|
|
286
|
+
const aos = new AgentOS({
|
|
287
|
+
workspaceRoot: process.cwd(),
|
|
288
|
+
maxWorkingTokens: 50000,
|
|
289
|
+
maxEpisodicSizeKb: 500,
|
|
290
|
+
});
|
|
291
|
+
|
|
292
|
+
// 设置记忆 · Configure memory
|
|
293
|
+
aos.memory.semantic.setPreference('language', 'zh-CN');
|
|
294
|
+
aos.memory.semantic.addFact('用户在北京,偏好简洁沟通');
|
|
295
|
+
aos.memory.semantic.learnRule('提交前运行 npm test', 'session_1');
|
|
296
|
+
|
|
297
|
+
// 设置 Schema 规则 · Register schema rules
|
|
298
|
+
aos.guard.schema.registerRule({
|
|
299
|
+
tool: 'write_file',
|
|
300
|
+
required: ['path', 'content'],
|
|
301
|
+
types: { path: 'string', content: 'string' },
|
|
302
|
+
pathDeny: { path: ['.env', '*.key', '*.pem', '.git/**'] },
|
|
303
|
+
maxSize: { content: 1048576 },
|
|
304
|
+
secrets: ['content'],
|
|
305
|
+
});
|
|
306
|
+
|
|
307
|
+
// Pre-exec:校验 + 快照 · Validate + snapshot
|
|
308
|
+
const { preExec, snapshot } = aos.executePipeline({
|
|
309
|
+
sessionId: 'session_1',
|
|
310
|
+
agentId: 'main_agent',
|
|
311
|
+
toolName: 'write_file',
|
|
312
|
+
parameters: { path: 'src/main.ts', content: 'console.log("hello");' },
|
|
313
|
+
affectedFiles: ['src/main.ts'],
|
|
314
|
+
});
|
|
315
|
+
|
|
316
|
+
console.log(`Risk: ${preExec.riskScore.score} → ${preExec.riskScore.action}`);
|
|
317
|
+
// → Risk: 0.19 → auto
|
|
318
|
+
|
|
319
|
+
// Post-exec:验证 + 审计 · Verify + audit
|
|
320
|
+
const result = aos.completeExecution({
|
|
321
|
+
sessionId: 'session_1', agentId: 'main_agent',
|
|
322
|
+
toolName: 'write_file',
|
|
323
|
+
toolParameters: { path: 'src/main.ts', content: 'console.log("hello");' },
|
|
324
|
+
toolResult: 'file written',
|
|
325
|
+
snapshot,
|
|
326
|
+
startTime: Date.now() - 500, endTime: Date.now(),
|
|
327
|
+
retryCount: 0, wasSelfCorrected: false, hadTimeout: false,
|
|
328
|
+
userAccepted: true, userProvidedEdit: false, resultWasUsed: true,
|
|
329
|
+
});
|
|
330
|
+
|
|
331
|
+
console.log(`Post-exec: ${result.postExec.outcomeScore}`);
|
|
332
|
+
console.log(`Audit: ${result.auditEntry.id}`);
|
|
333
|
+
|
|
334
|
+
// 记录反馈 · Record feedback
|
|
335
|
+
aos.recordFeedback('user_immediate_continue', 'session_1');
|
|
336
|
+
|
|
337
|
+
// 查看报告 · Status report
|
|
338
|
+
console.log(aos.statusReport());
|
|
339
|
+
```
|
|
340
|
+
|
|
341
|
+
### 接入现有 Agent 框架 · Integrate with any Agent framework
|
|
342
|
+
|
|
343
|
+
```typescript
|
|
344
|
+
import { AgentOS } from 'sentinel-agentos';
|
|
345
|
+
|
|
346
|
+
const aos = new AgentOS({ workspaceRoot: process.cwd() });
|
|
347
|
+
|
|
348
|
+
async function safeToolCall(toolName: string, params: Record<string, unknown>) {
|
|
349
|
+
const sessionId = getCurrentSessionId();
|
|
350
|
+
|
|
351
|
+
// 1. 校验 + 风险 + 快照 · Validate + Risk + Snapshot
|
|
352
|
+
const { preExec, snapshot } = aos.executePipeline({
|
|
353
|
+
sessionId, agentId: 'my_agent', toolName, parameters: params,
|
|
354
|
+
});
|
|
355
|
+
|
|
356
|
+
if (preExec.riskScore.action === 'deny') {
|
|
357
|
+
throw new Error(`Rejected: risk ${preExec.riskScore.score}`);
|
|
358
|
+
}
|
|
359
|
+
|
|
360
|
+
if (preExec.riskScore.action === 'confirm') {
|
|
361
|
+
const ok = await askUser(`Risk ${preExec.riskScore.score}. Proceed?`);
|
|
362
|
+
if (!ok) return;
|
|
363
|
+
}
|
|
364
|
+
|
|
365
|
+
const t0 = Date.now();
|
|
366
|
+
const result = await yourActualCall(toolName, params);
|
|
367
|
+
|
|
368
|
+
// 2. 验证 + 审计 · Verify + Audit
|
|
369
|
+
return aos.completeExecution({
|
|
370
|
+
sessionId, agentId: 'my_agent', toolName,
|
|
371
|
+
toolParameters: params, toolResult: result, snapshot,
|
|
372
|
+
startTime: t0, endTime: Date.now(),
|
|
373
|
+
retryCount: 0, wasSelfCorrected: false, hadTimeout: false,
|
|
374
|
+
userAccepted: true, userProvidedEdit: false, resultWasUsed: false,
|
|
375
|
+
});
|
|
376
|
+
}
|
|
377
|
+
```
|
|
378
|
+
|
|
379
|
+
### 沙箱 · Sandbox
|
|
380
|
+
|
|
381
|
+
```typescript
|
|
382
|
+
import { SandboxExecutor } from 'sentinel-agentos';
|
|
383
|
+
|
|
384
|
+
const sandbox = new SandboxExecutor({
|
|
385
|
+
mode: 'sandbox',
|
|
386
|
+
workspaceRoot: process.cwd(),
|
|
387
|
+
timeoutMs: 30000,
|
|
388
|
+
networkAccess: 'whitelist',
|
|
389
|
+
networkWhitelist: ['api.github.com', 'registry.npmjs.org'],
|
|
390
|
+
writablePaths: ['src/', 'tests/', 'dist/'],
|
|
391
|
+
allowedTools: ['read_file', 'write_file', 'edit', 'exec'],
|
|
392
|
+
forbiddenTools: ['rm', 'unlink'],
|
|
393
|
+
});
|
|
394
|
+
|
|
395
|
+
// Pre-flight · 预检
|
|
396
|
+
sandbox.validate('rm', { path: 'src/main.ts' });
|
|
397
|
+
// → { success: false, sandboxRejectReason: 'Tool "rm" is forbidden' }
|
|
398
|
+
|
|
399
|
+
// Execute · 执行
|
|
400
|
+
await sandbox.execute('exec', { command: 'npm test', cwd: process.cwd() });
|
|
401
|
+
```
|
|
402
|
+
|
|
403
|
+
### API 层 · SDK API
|
|
404
|
+
|
|
405
|
+
```typescript
|
|
406
|
+
import { AgentOS, AgentOSAPI } from 'sentinel-agentos';
|
|
407
|
+
|
|
408
|
+
const api = new AgentOSAPI(new AgentOS());
|
|
409
|
+
|
|
410
|
+
// Guard
|
|
411
|
+
api.guardRegisterRule({ tool: 'delete_file', required: ['path'] });
|
|
412
|
+
|
|
413
|
+
// Memory
|
|
414
|
+
api.memorySetPreference('language', 'zh-CN');
|
|
415
|
+
api.memoryLearnRule('Never push directly to main', 'cr_1');
|
|
416
|
+
|
|
417
|
+
// Pipeline + Audit
|
|
418
|
+
const result = await api.pipelineExecute({...});
|
|
419
|
+
const audit = api.auditQuery({ toolName: 'write_file', limit: 10 });
|
|
420
|
+
|
|
421
|
+
// Profile
|
|
422
|
+
const report = api.getStatusReport();
|
|
423
|
+
api.endSession('session_1');
|
|
424
|
+
```
|
|
425
|
+
|
|
426
|
+
---
|
|
427
|
+
|
|
428
|
+
## 📖 使用案例 · Examples
|
|
429
|
+
|
|
430
|
+
### 1. 拦截危险命令 · Blocking dangerous commands
|
|
431
|
+
|
|
432
|
+
```typescript
|
|
433
|
+
const { preExec } = aos.executePipeline({
|
|
434
|
+
sessionId: 's1', agentId: 'a1',
|
|
435
|
+
toolName: 'exec',
|
|
436
|
+
parameters: { command: 'rm -rf /home' },
|
|
437
|
+
});
|
|
438
|
+
// → { score: 9.18, action: 'deny' } 🔴 自动拒绝 · Auto-denied
|
|
439
|
+
```
|
|
440
|
+
|
|
441
|
+
### 2. 检测 Agent 幻觉 · Detecting hallucination
|
|
442
|
+
|
|
443
|
+
```typescript
|
|
444
|
+
// Agent 声称"写入了文件",但实际没有 · Agent claims "file written" but didn't
|
|
445
|
+
const result = aos.completeExecution({
|
|
446
|
+
...,
|
|
447
|
+
toolResult: 'file written successfully', // ← Agent hallucination!
|
|
448
|
+
snapshot,
|
|
449
|
+
...
|
|
450
|
+
});
|
|
451
|
+
console.log(result.postExec.verifyPassed); // → false
|
|
452
|
+
// The Verify Gate detects that the file does not exist → FAIL
|
|
453
|
+
```
|
|
454
|
+
|
|
455
|
+
### 3. 跨会话记忆 · Cross-session memory
|
|
456
|
+
|
|
457
|
+
```typescript
|
|
458
|
+
// Session 1: 学到偏好 · Learned preferences
|
|
459
|
+
const aos1 = new AgentOS();
|
|
460
|
+
aos1.memory.semantic.setPreference('language', 'zh-CN');
|
|
461
|
+
aos1.memory.semantic.learnRule('测试前先编译', 'session_1');
|
|
462
|
+
aos1.endSession('session_1');
|
|
463
|
+
|
|
464
|
+
// Session 2: 自动注入 · Auto-injected
|
|
465
|
+
const aos2 = new AgentOS();
|
|
466
|
+
console.log(aos2.injectContext());
|
|
467
|
+
// [Sentinel AgentOS Semantic Memory]
|
|
468
|
+
// ## Preferences
|
|
469
|
+
// - language: "zh-CN"
|
|
470
|
+
// ## Learned Rules · 学习到的规则
|
|
471
|
+
// - [50%] 测试前先编译
|
|
472
|
+
```
|
|
473
|
+
|
|
474
|
+
### 4. 隐性反馈驱动改进 · Feedback-driven improvement
|
|
475
|
+
|
|
476
|
+
```typescript
|
|
477
|
+
// 用户删除了 Agent 创建的代码 · User deleted agent-created code
|
|
478
|
+
aos.recordFeedback('user_deleted_code', 's1');
|
|
479
|
+
|
|
480
|
+
// 连续3次修改 · Modified 3 times
|
|
481
|
+
aos.recordFeedback('user_modified_output', 's1');
|
|
482
|
+
aos.recordFeedback('user_modified_output', 's1');
|
|
483
|
+
aos.recordFeedback('user_modified_output', 's1');
|
|
484
|
+
|
|
485
|
+
const stats = aos.evaluator.feedback.stats();
|
|
486
|
+
// → { totalSignals: 4, negativeSignals: 4, averageStrength: -0.57 }
|
|
487
|
+
|
|
488
|
+
const profile = aos.getProfile();
|
|
489
|
+
// → warnings: ["User satisfaction declining — review recent sessions"]
|
|
490
|
+
```
|
|
491
|
+
|
|
492
|
+
### 5. 沙箱保护 · Sandbox protection
|
|
493
|
+
|
|
494
|
+
```typescript
|
|
495
|
+
const sandbox = new SandboxExecutor({
|
|
496
|
+
mode: 'sandbox',
|
|
497
|
+
networkAccess: 'none', // 禁止网络 · No network
|
|
498
|
+
writablePaths: ['src/', 'logs/'], // 只能写这里 · Only writable here
|
|
499
|
+
forbiddenTools: ['rm', 'git_push'],
|
|
500
|
+
});
|
|
501
|
+
|
|
502
|
+
await sandbox.execute('exec', { command: 'npm test' }); // ✅ OK
|
|
503
|
+
await sandbox.execute('exec', { command: 'curl evil.com' }); // ❌ 网络被黑洞 · Blocked
|
|
504
|
+
```
|
|
505
|
+
|
|
506
|
+
---
|
|
507
|
+
|
|
508
|
+
## 🧪 测试 · Tests
|
|
509
|
+
|
|
510
|
+
```bash
|
|
511
|
+
npm test
|
|
512
|
+
```
|
|
513
|
+
|
|
514
|
+
```
|
|
515
|
+
PASS tests/guard/schema-gate.test.ts (21 tests)
|
|
516
|
+
PASS tests/guard/risk-gate.test.ts (20 tests)
|
|
517
|
+
PASS tests/guard/snapshot-verify-audit.test.ts (17 tests)
|
|
518
|
+
PASS tests/memory/memory.test.ts (29 tests)
|
|
519
|
+
PASS tests/core.test.ts (12 tests)
|
|
520
|
+
────────────────────────────────────────────────
|
|
521
|
+
Test Suites: 5 passed, 5 total
|
|
522
|
+
Tests: 99 passed, 99 total
|
|
523
|
+
```
|
|
524
|
+
|
|
525
|
+
---
|
|
526
|
+
|
|
527
|
+
## ⚠️ 常见问题 · FAQ
|
|
528
|
+
|
|
529
|
+
<details>
|
|
530
|
+
<summary><b>Q: How does Sentinel AgentOS relate to LangChain / CrewAI?</b></summary>
|
|
531
|
+
|
|
532
|
+
Sentinel AgentOS **不是竞争对手**,是基础设施层。LangChain/CrewAI 是 Agent 框架,Sentinel AgentOS 是给它们提供安全、记忆、评估的操作系统。可以增量接入任何框架。
|
|
533
|
+
*Not a competitor — infrastructure. LangChain/CrewAI are agent frameworks; Sentinel AgentOS provides safety + memory + evaluation as an OS layer. Incrementally pluggable into any framework.*
|
|
534
|
+
</details>
|
|
535
|
+
|
|
536
|
+
<details>
|
|
537
|
+
<summary><b>Q: 为什么 Guard 层不用 LLM? · Why no LLM in Guard?</b></summary>
|
|
538
|
+
|
|
539
|
+
LLM 做安全判断 = 用问题制造者来解决问题。Schema 校验是纯工程数学——类型检查、范围检查、hash 对比——这些 LLM 反而做不好(会幻觉)。确定性代码 = 0 幻觉。
|
|
540
|
+
*Using LLM for security = solving problems with the problem-maker. Schema validation is pure engineering — type/range/hash checks — things LLMs are bad at. Deterministic code = zero hallucination.*
|
|
541
|
+
</details>
|
|
542
|
+
|
|
543
|
+
<details>
|
|
544
|
+
<summary><b>Q: Memory 层和 RAG 有什么区别? · How is this different from RAG?</b></summary>
|
|
545
|
+
|
|
546
|
+
RAG = 把对话扔进向量库做检索。Sentinel AgentOS Memory = 人脑模型:Working(当前会话)、Episodic(自动评分+压缩)、Semantic(提炼后的永久知识)。最重要的是:**Sentinel AgentOS 会自动写入记忆,不需要 Agent 手动管理**。
|
|
547
|
+
*RAG = dump conversation into vector DB. Sentinel AgentOS Memory = brain model. Most importantly: **Sentinel AgentOS auto-writes memory; agents don't need to manage it manually.***
|
|
548
|
+
</details>
|
|
549
|
+
|
|
550
|
+
<details>
|
|
551
|
+
<summary><b>Q: 会不会很慢? · Is it slow?</b></summary>
|
|
552
|
+
|
|
553
|
+
不会。Guard 层所有校验都是 `fs.existsSync()`、hash 对比、数学公式,每个校验 **< 1ms**。Snapshot 只记录 hash 不复制文件。整个流水线开销可忽略。
|
|
554
|
+
*No. All Guard checks are fs.existsSync(), hash comparison, math formulas — each < 1ms. Snapshot records hashes only, no file copy. Pipeline overhead is negligible.*
|
|
555
|
+
</details>
|
|
556
|
+
|
|
557
|
+
<details>
|
|
558
|
+
<summary><b>Q: 能用在生产环境吗? · Production-ready?</b></summary>
|
|
559
|
+
|
|
560
|
+
v1.0 已完成 100% 设计文档覆盖率、99 个测试全通过、TypeScript 严格模式。API 稳定,可以集成。但建议先在测试环境跑一段时间。
|
|
561
|
+
*v1.0 has 100% design coverage, 99 passing tests, strict TypeScript. API is stable and integrable. Recommend testing before production.*
|
|
562
|
+
</details>
|
|
563
|
+
|
|
564
|
+
<details>
|
|
565
|
+
<summary><b>Q: npm 包什么时候发布? · When npm publish?</b></summary>
|
|
566
|
+
|
|
567
|
+
TODO。当前可以直接 `git clone` + `npm link` 使用。
|
|
568
|
+
*Currently use via `git clone` + `npm link`.*
|
|
569
|
+
</details>
|
|
570
|
+
|
|
571
|
+
<details>
|
|
572
|
+
<summary><b>Q: 沙箱模式安全吗? · Is sandbox truly secure?</b></summary>
|
|
573
|
+
|
|
574
|
+
v1.0 沙箱基于环境变量 + 路径校验 + 命令模式检测,不是容器级隔离。v2.0 计划支持 Docker 沙箱。
|
|
575
|
+
*v1.0 sandbox uses env vars + path validation + command pattern detection — not container-level isolation. Docker sandbox planned for v2.0.*
|
|
576
|
+
</details>
|
|
577
|
+
|
|
578
|
+
<details>
|
|
579
|
+
<summary><b>Q: 怎么看 Audit Log? · How to view audit logs?</b></summary>
|
|
580
|
+
|
|
581
|
+
```bash
|
|
582
|
+
cat '.Sentinel AgentOS/audit.jsonl' | jq '.'
|
|
583
|
+
```
|
|
584
|
+
|
|
585
|
+
或通过 API · *or via API:*
|
|
586
|
+
|
|
587
|
+
```typescript
|
|
588
|
+
const entries = api.auditQuery({ minScore: 3.0 }); // 高风险操作 · High-risk ops
|
|
589
|
+
```
|
|
590
|
+
</details>
|
|
591
|
+
|
|
592
|
+
---
|
|
593
|
+
|
|
594
|
+
## 🗺️ 路线图 · Roadmap
|
|
595
|
+
|
|
596
|
+
| Version | Content | Status |
|
|
597
|
+
|------|------|:--:|
|
|
598
|
+
| v0.1 | 项目脚手架 + 类型定义 · *Scaffold + types* | ✅ |
|
|
599
|
+
| v0.2 | Guard 层(6 组件)· *Guard layer (6 components)* | ✅ |
|
|
600
|
+
| v0.3 | Memory 层(3 层)· *Memory layer (3 layers)* | ✅ |
|
|
601
|
+
| v0.4 | Evaluator 层(评估 + 反馈 + 画像)· *Evaluator* | ✅ |
|
|
602
|
+
| v1.0 | 沙箱 + API + x- 扩展 + 校验补齐 · *Sandbox + API + x-ext* | ✅ |
|
|
603
|
+
| v1.1 | npm 发布 · *npm publish* | 📋 |
|
|
604
|
+
| v2.0 | Docker 沙箱、Dashboard、SaaS · *Docker sandbox, Dashboard, SaaS* | 📋 |
|
|
605
|
+
|
|
606
|
+
---
|
|
607
|
+
|
|
608
|
+
## 📂 源码结构 · Source Layout
|
|
609
|
+
|
|
610
|
+
```
|
|
611
|
+
src/
|
|
612
|
+
├── index.ts # 导出入口 · Exports (30+)
|
|
613
|
+
├── core.ts # Sentinel AgentOS 主循环 · Main loop
|
|
614
|
+
├── api.ts # SDK 协议层 · API layer (25+ methods)
|
|
615
|
+
├── types/index.ts # 完整类型定义 · Type definitions
|
|
616
|
+
├── guard/
|
|
617
|
+
│ ├── schema-gate.ts # Schema 校验 · Validation (12 checks)
|
|
618
|
+
│ ├── risk-gate.ts # 风险评分 · Risk scoring
|
|
619
|
+
│ ├── snapshot-verify.ts # 快照 + 验证 + 回滚 · Snapshot + Verify + Rollback
|
|
620
|
+
│ ├── audit-log.ts # 审计日志 · Audit
|
|
621
|
+
│ └── sandbox.ts # 沙箱执行器 · Sandbox executor
|
|
622
|
+
├── memory/
|
|
623
|
+
│ ├── working.ts # 工作记忆 · Working
|
|
624
|
+
│ ├── episodic.ts # 情景记忆 · Episodic
|
|
625
|
+
│ └── semantic.ts # 语义记忆 · Semantic
|
|
626
|
+
└── evaluator/
|
|
627
|
+
├── exec-evaluator.ts # 三阶段评估器 · 3-phase evaluator
|
|
628
|
+
├── feedback.ts # 隐性反馈引擎 · Implicit feedback
|
|
629
|
+
└── profiler.ts # Agent 质量画像 · Quality profiler
|
|
630
|
+
```
|
|
631
|
+
|
|
632
|
+
---
|
|
633
|
+
|
|
634
|
+
## 📄 License
|
|
635
|
+
|
|
636
|
+
MIT
|