neurain 0.1.0-alpha.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/CHANGELOG.md +19 -0
- package/LICENSE +57 -0
- package/README.md +205 -0
- package/SECURITY.md +22 -0
- package/bin/neurain.mjs +7 -0
- package/docs/comparison-mem0.en.md +22 -0
- package/docs/connect-claude.en.md +48 -0
- package/docs/connect-claude.kr.md +51 -0
- package/docs/connect-codex.en.md +38 -0
- package/docs/connect-codex.kr.md +40 -0
- package/docs/connect-gemini.en.md +71 -0
- package/docs/connect-gemini.kr.md +71 -0
- package/docs/connect-runtime.en.md +61 -0
- package/docs/connect-runtime.kr.md +61 -0
- package/docs/development-status.en.md +157 -0
- package/docs/development-status.kr.md +157 -0
- package/docs/knowledge-os.en.md +105 -0
- package/docs/knowledge-os.kr.md +106 -0
- package/docs/pricing.en.md +14 -0
- package/docs/privacy-and-data-flow.en.md +25 -0
- package/docs/public-saas-readiness.en.md +39 -0
- package/docs/quickstart.en.md +64 -0
- package/docs/quickstart.kr.md +64 -0
- package/docs/release-checklist.en.md +38 -0
- package/docs/safety.en.md +36 -0
- package/docs/self-improvement-90-roadmap.en.md +429 -0
- package/docs/self-improvement-90-roadmap.kr.md +429 -0
- package/docs/self-improving-workflows.en.md +163 -0
- package/docs/self-improving-workflows.kr.md +163 -0
- package/docs/support.en.md +17 -0
- package/docs/troubleshooting.en.md +35 -0
- package/package.json +36 -0
- package/src/cli.mjs +261 -0
- package/src/core/adopt.mjs +304 -0
- package/src/core/answer_eval.mjs +450 -0
- package/src/core/capabilities.mjs +217 -0
- package/src/core/capture_durable.mjs +181 -0
- package/src/core/classify.mjs +237 -0
- package/src/core/compile_desk.mjs +324 -0
- package/src/core/complete.mjs +108 -0
- package/src/core/config.mjs +142 -0
- package/src/core/connect.mjs +355 -0
- package/src/core/curator.mjs +351 -0
- package/src/core/daemon.mjs +536 -0
- package/src/core/digest.mjs +155 -0
- package/src/core/doctor.mjs +115 -0
- package/src/core/durable.mjs +96 -0
- package/src/core/envelope.mjs +97 -0
- package/src/core/flush.mjs +190 -0
- package/src/core/fs.mjs +121 -0
- package/src/core/init.mjs +194 -0
- package/src/core/journal.mjs +269 -0
- package/src/core/labels.mjs +117 -0
- package/src/core/lessons.mjs +793 -0
- package/src/core/lifecycle.mjs +1138 -0
- package/src/core/link_check.mjs +180 -0
- package/src/core/live_cases.mjs +221 -0
- package/src/core/onboard.mjs +175 -0
- package/src/core/plan_receipt.mjs +177 -0
- package/src/core/plan_writeback.mjs +176 -0
- package/src/core/queue.mjs +62 -0
- package/src/core/queue_archive.mjs +87 -0
- package/src/core/queue_model.mjs +161 -0
- package/src/core/queue_write.mjs +28 -0
- package/src/core/recall.mjs +1802 -0
- package/src/core/recall_bench.mjs +275 -0
- package/src/core/recall_corpus.mjs +152 -0
- package/src/core/recall_facts.mjs +233 -0
- package/src/core/recall_intel.mjs +233 -0
- package/src/core/recall_lexical.mjs +269 -0
- package/src/core/recap.mjs +78 -0
- package/src/core/review_queue.mjs +131 -0
- package/src/core/review_worker.mjs +284 -0
- package/src/core/route.mjs +73 -0
- package/src/core/safety.mjs +57 -0
- package/src/core/scheduler.mjs +697 -0
- package/src/core/search.mjs +54 -0
- package/src/core/secret_scan.mjs +143 -0
- package/src/core/semantic.mjs +187 -0
- package/src/core/source_digest.mjs +56 -0
- package/src/core/source_digest_gen.mjs +311 -0
- package/src/core/stage.mjs +105 -0
- package/src/core/status.mjs +175 -0
- package/src/core/vault_state.mjs +115 -0
- package/src/core/watch.mjs +282 -0
- package/src/core/wiki_log.mjs +29 -0
- package/src/core/wrap.mjs +62 -0
- package/src/mcp/server.mjs +865 -0
- package/templates/starter-vault/README.md +9 -0
|
@@ -0,0 +1,429 @@
|
|
|
1
|
+
# [kr] Self-Improvement 90 로드맵
|
|
2
|
+
|
|
3
|
+
버전: 1.0
|
|
4
|
+
|
|
5
|
+
이 문서는 Neurain Knowledge OS가 Agent Runtime의 공개 구조에서 좋은 점을 흡수하는 계획입니다. 단, Neurain을 LLM처럼 설명하거나 Runtime runtime을 그대로 복제하지 않습니다.
|
|
6
|
+
|
|
7
|
+
Shipped vs planned:
|
|
8
|
+
|
|
9
|
+
- 이미 구현된 alpha: lesson list, lesson candidates, synthetic mode와 reviewed case-file mode를 지원하는 lesson candidate detection eval, recap, wrap dry-run, capabilities, narrow CLI promotion, rollback receipt, event journal add/list/verify, journal receipts, read-only watch report, read-only review worker report, read-only scheduler tick, synthetic mode와 reviewed case-file mode를 지원하는 scheduler trigger eval, bounded foreground scheduler monitor, user-started foreground continuous daemon, append-only lifecycle emit, read-only lifecycle lineage report, Claude Code lifecycle hook preview, Codex lifecycle hook preview, E21 native lifecycle automation eval, snapshot-gated curator lifecycle, optional SQLite FTS5 recall DB, E22 local lexical-semantic recall, hybrid recall, E23 live-content recall coverage, E23 redacted live-case scaffold, E24 read-only onboarding, E25 publish-ready CLI gates, E26 Codex/Claude/Gemini/Runtime MCP connector surface, cross-host recall eval smoke, larger 100-case cross-host recall regression eval, reviewed recall case-file eval, synthetic mode와 reviewed case-file mode를 지원하는 answer-quality eval, MCP preview-only boundary.
|
|
10
|
+
- 이 roadmap의 남은 항목: shipped host-proxy contract를 넘어서는 더 깊은 native Runtime lifecycle event automation과 external user walkthrough evidence.
|
|
11
|
+
- Planned section의 command는 목표 interface이며, 오늘 이미 존재한다는 뜻이 아닙니다.
|
|
12
|
+
|
|
13
|
+
Naming note: "Agent Runtime"은 prompt assembly, model call, tool execution, session lifecycle을 소유하는 일반 AI host를 뜻합니다. "Self-Improvement 90"은 Neurain 내부 milestone 이름이며, upstream runtime release version에 대한 주장이 아닙니다.
|
|
14
|
+
|
|
15
|
+
Distribution note: 이 roadmap은 alpha planning document입니다. Customer-facing launch page는 architecture와 promise를 사용할 수 있지만, internal readiness estimate와 미구현 gap은 제거해야 합니다.
|
|
16
|
+
|
|
17
|
+
## 카테고리 정정
|
|
18
|
+
|
|
19
|
+
여기서 Agent Runtime은 일반 AI host를 뜻하며, model family가 아닙니다. Agent Runtime은 Claude, Gemini, GPT 같은 모델과 비교할 대상이 아닙니다. 실행 runtime입니다.
|
|
20
|
+
|
|
21
|
+
올바른 비교는 다음과 같습니다.
|
|
22
|
+
|
|
23
|
+
- Agent Runtime vs Codex vs Claude Code vs Gemini CLI vs OpenClaw
|
|
24
|
+
- Neurain Knowledge OS는 이 runtime들이 공통으로 사용할 수 있는 local-first knowledge layer
|
|
25
|
+
|
|
26
|
+
## 목표 업그레이드
|
|
27
|
+
|
|
28
|
+
| 기능 | 목표 |
|
|
29
|
+
|---|---:|
|
|
30
|
+
| Host-agnostic lesson 후보 감지 | 90 |
|
|
31
|
+
| 안전한 promotion과 rollback | 95 |
|
|
32
|
+
| background 자동 review | 90 |
|
|
33
|
+
| skill curator 대응 기능 | 90 |
|
|
34
|
+
| cross-session recall | 90 |
|
|
35
|
+
| 사용자 체감 자동화 | 90 |
|
|
36
|
+
|
|
37
|
+
이 점수는 모델 지능 점수가 아니라 제품 준비도 목표입니다. Internal baseline estimate는 published package 밖에서 관리합니다.
|
|
38
|
+
|
|
39
|
+
## Measurement Plan
|
|
40
|
+
|
|
41
|
+
| Capability | Measurement method | Minimum fixture |
|
|
42
|
+
|---|---|---|
|
|
43
|
+
| Lesson candidate detection | reviewed candidate fixture의 precision과 recall | 100 events, positive 40, negative 60 |
|
|
44
|
+
| Safe promotion and rollback | synthetic promotion, partial-write, drift, rollback test | 50 writes |
|
|
45
|
+
| Background review | trigger precision, trigger recall, no-recursion check | 100 event sequences |
|
|
46
|
+
| Curator | lifecycle, pinned protection, snapshot, rollback, no-auto-delete test | 50 lessons/playbooks |
|
|
47
|
+
| Cross-session recall | Hit@5, source support, private-boundary test, rebuild test | 100 recall prompts |
|
|
48
|
+
| User-perceived automation | first-run task completion과 comprehension survey | non-developer walkthrough 10명 |
|
|
49
|
+
|
|
50
|
+
## 일반화할 runtime 패턴
|
|
51
|
+
|
|
52
|
+
현대적인 self-improving agent runtime 기준으로 일반화할 만한 제품 패턴은 5개입니다.
|
|
53
|
+
|
|
54
|
+
1. Agent loop ownership: prompt assembly, model call, tool execution, retry, fallback, compression, persistence를 runtime이 관리.
|
|
55
|
+
2. Skills as procedural memory: cover는 가볍게 보고, 필요할 때만 body를 로드.
|
|
56
|
+
3. Background self-improvement review: 주기적 또는 idle-triggered review가 memory/skill 개선 후보를 제안.
|
|
57
|
+
4. Curator lifecycle: agent-created skill을 active, stale, archived 상태로 관리하고 snapshot/rollback 보호.
|
|
58
|
+
5. Session search: SQLite와 FTS5로 과거 session message와 lineage를 빠르게 검색.
|
|
59
|
+
|
|
60
|
+
Neurain은 이 패턴을 흡수하되, Runtime runtime 자체에 종속되지 않아야 합니다.
|
|
61
|
+
|
|
62
|
+
## Neurain식 적용
|
|
63
|
+
|
|
64
|
+
### 1. Neurain Work Loop
|
|
65
|
+
|
|
66
|
+
전용 runtime은 agent loop를 직접 소유하므로 model response가 끝나는 순간을 정확히 압니다. 하지만 Neurain은 Codex, Claude Code, Gemini CLI, generic runtime session을 기본적으로 직접 소유하지 않습니다.
|
|
67
|
+
|
|
68
|
+
그래서 Neurain 적용은 여러 layer로 나뉩니다.
|
|
69
|
+
|
|
70
|
+
- Manual mode: `neurain wrap <folder> --dry-run` 또는 `!wrap`.
|
|
71
|
+
- Watch report, shipped alpha: `neurain watch <folder> --poll-once`가 local receipt, file change, journal event, recap hint, lesson candidate를 관찰하고 write 없이 review 후보를 제안.
|
|
72
|
+
- Review worker report, shipped alpha: `neurain review <folder> --json`이 watch와 journal signal을 model call이나 write 없이 manual improvement proposal로 변환.
|
|
73
|
+
- Scheduler tick, shipped alpha: `neurain scheduler tick <folder> --json`이 local review를 실행할지 판단하고 threshold를 넘을 때만 review worker report를 포함.
|
|
74
|
+
- Scheduler trigger eval, shipped alpha: `neurain scheduler eval <folder> --fixture-size 100 --json`이 trigger precision, trigger recall, no-recursion, private-boundary handling, target-root non-write를 측정.
|
|
75
|
+
- Scheduler monitor, shipped alpha: `neurain scheduler monitor <folder> --max-ticks 3`가 background job 설치 없이 bounded foreground run에서 read-only scheduler tick을 반복.
|
|
76
|
+
- Lifecycle report, shipped alpha: `neurain lifecycle emit`이 검토된 host boundary event를 기록하고, `neurain lifecycle report`가 turn completion, compaction, resume, parent-session lineage를 요약합니다. Neurain이 host loop를 직접 소유한다고 과장하지 않습니다.
|
|
77
|
+
- Claude Code lifecycle hook preview, shipped alpha: `neurain connect claude <folder> --lifecycle-hooks --dry-run`이 SessionStart, UserPromptSubmit, Stop, SessionEnd를 Neurain lifecycle receipt로 매핑하는 settings snippet을 출력합니다. Prompt body, transcript path, 성공 stdout은 저장하거나 model context에 주입하지 않습니다.
|
|
78
|
+
- Continuous daemon, shipped alpha: `neurain daemon run <folder>`이 user-started foreground loop로 계속 실행되며 durable knowledge write 없이 scheduler tick을 호출할 수 있음.
|
|
79
|
+
- Host-proxy mode: 미래 connector에서 host의 turn lifecycle event를 Neurain에 전달.
|
|
80
|
+
|
|
81
|
+
목표는 모든 host를 통제한다고 과장하지 않으면서, 사용자 입장에서는 자동으로 좋아지는 것처럼 느끼게 만드는 것입니다.
|
|
82
|
+
|
|
83
|
+
완료 기준:
|
|
84
|
+
|
|
85
|
+
- Watch mode는 durable knowledge를 조용히 쓰지 않습니다.
|
|
86
|
+
- 모든 제안은 receipt 또는 candidate id를 가집니다.
|
|
87
|
+
- Host-proxy mode는 선택 사항이며 connector별로 다릅니다.
|
|
88
|
+
|
|
89
|
+
### 2. Lessons And Playbooks
|
|
90
|
+
|
|
91
|
+
Runtime skill은 Neurain lesson/playbook으로 매핑합니다.
|
|
92
|
+
|
|
93
|
+
Neurain 구조:
|
|
94
|
+
|
|
95
|
+
- cover: title, trigger, scope, sensitivity, confidence, last verified
|
|
96
|
+
- body: procedure, pitfalls, tests, rollback, examples
|
|
97
|
+
- source: raw capture, receipt, review finding, user correction
|
|
98
|
+
- lifecycle: candidate, active, stale, archived
|
|
99
|
+
|
|
100
|
+
완료 기준:
|
|
101
|
+
|
|
102
|
+
- 검토된 sample 기준 candidate precision 90 percent 이상.
|
|
103
|
+
- `neurain lessons eval --fixture-size 100`이 recall, precision, false-positive, unsafe-blocking, target-root non-write gate를 통과.
|
|
104
|
+
- `neurain lessons eval --case-file <json>`이 검토된 lesson candidate case를 target root write 없이 실행 가능.
|
|
105
|
+
- private source 기반 candidate는 global lesson으로 승격 불가.
|
|
106
|
+
- prompt injection과 secret-like content는 prompt context 사용 전에 차단.
|
|
107
|
+
|
|
108
|
+
### 3. Review Worker
|
|
109
|
+
|
|
110
|
+
전용 runtime은 background review fork를 실행할 수 있습니다. Neurain은 이를 local review worker로 일반화합니다.
|
|
111
|
+
|
|
112
|
+
Trigger:
|
|
113
|
+
|
|
114
|
+
- 의미 있는 event N개마다
|
|
115
|
+
- `wrap` 이후
|
|
116
|
+
- 실패한 test가 수정되어 성공한 뒤
|
|
117
|
+
- Claude review finding이 해결된 뒤
|
|
118
|
+
- 반복되는 사용자 correction 이후
|
|
119
|
+
- idle watch mode 중
|
|
120
|
+
|
|
121
|
+
Output:
|
|
122
|
+
|
|
123
|
+
- lesson candidate
|
|
124
|
+
- stale lesson warning
|
|
125
|
+
- capability routing hint
|
|
126
|
+
- recall gap
|
|
127
|
+
- rollback risk
|
|
128
|
+
|
|
129
|
+
완료 기준:
|
|
130
|
+
|
|
131
|
+
- Review worker trigger precision 85 percent 이상.
|
|
132
|
+
- 자기 자신을 재귀적으로 다시 실행하지 않음.
|
|
133
|
+
- 외부 write 실행 불가.
|
|
134
|
+
- source, receipt, log, handoff에 들어온 injected instruction을 실행하거나 승격하지 않음.
|
|
135
|
+
- durable promotion은 여전히 사용자 확인 필요.
|
|
136
|
+
|
|
137
|
+
### 4. Curator
|
|
138
|
+
|
|
139
|
+
Runtime curator는 agent-created skill을 관리합니다. Neurain curator는 Neurain-created lesson/playbook을 관리해야 합니다.
|
|
140
|
+
|
|
141
|
+
규칙:
|
|
142
|
+
|
|
143
|
+
- 설정된 non-use 기간 후 active에서 stale로 전환
|
|
144
|
+
- 추가 non-use 기간 후 stale에서 archived로 전환
|
|
145
|
+
- pinned item 보호
|
|
146
|
+
- human-authored docs 보호
|
|
147
|
+
- mutating curator pass 전 snapshot 생성
|
|
148
|
+
- rollback은 정확한 이전 상태 복원
|
|
149
|
+
- 실제 변경 전 dry-run 제공
|
|
150
|
+
|
|
151
|
+
완료 기준:
|
|
152
|
+
|
|
153
|
+
- accidental destructive curator change: 0
|
|
154
|
+
- synthetic test에서 snapshot restore success: 100 percent
|
|
155
|
+
- pinned item mutation refusal: 100 percent
|
|
156
|
+
|
|
157
|
+
### 5. Recall DB
|
|
158
|
+
|
|
159
|
+
Runtime은 SQLite와 FTS5로 session search를 구현할 수 있습니다. Neurain은 markdown을 canonical truth로 유지하면서 optional local recall index를 추가해야 합니다.
|
|
160
|
+
|
|
161
|
+
설계:
|
|
162
|
+
|
|
163
|
+
- SQLite FTS5가 session event, receipt, lesson cover, handoff summary, source reference를 저장.
|
|
164
|
+
- Markdown은 계속 source of truth.
|
|
165
|
+
- WAL mode로 concurrent read 지원.
|
|
166
|
+
- Codex, Claude Code, Agent Runtime, Gemini CLI trace를 host별로 구분.
|
|
167
|
+
- compaction 또는 resume된 작업은 session lineage로 연결.
|
|
168
|
+
- indexing 전 sensitivity filter 실행.
|
|
169
|
+
- private-area fact는 기본적으로 global/cross-host recall에서 제외.
|
|
170
|
+
- secret-like content와 instruction-injection content는 indexing 전 redaction.
|
|
171
|
+
|
|
172
|
+
완료 기준:
|
|
173
|
+
|
|
174
|
+
- Cross-session recall Hit@5 90 percent 이상.
|
|
175
|
+
- Recall-backed answer의 citation/source support 95 percent 이상.
|
|
176
|
+
- Index를 삭제해도 markdown fallback으로 복구 가능.
|
|
177
|
+
- markdown과 receipt에서 rebuild했을 때 synthetic test의 indexed id와 source pointer가 유지.
|
|
178
|
+
- global/cross-host recall fixture에서 private 또는 secret leakage 0.
|
|
179
|
+
|
|
180
|
+
### 6. 사용자 체감 자동화
|
|
181
|
+
|
|
182
|
+
사용자는 내부 명령어를 몰라도 개선을 체감해야 합니다.
|
|
183
|
+
|
|
184
|
+
목표 경험:
|
|
185
|
+
|
|
186
|
+
- `npx neurain init <folder>`로 시작.
|
|
187
|
+
- host 연결.
|
|
188
|
+
- 평소처럼 작업.
|
|
189
|
+
- Neurain이 "무엇이 바뀌었는지, 무엇을 기억해야 하는지, 무엇이 위험한지, 무엇을 검토해야 하는지"를 보여줌.
|
|
190
|
+
- 오래 남는 변경만 사용자 승인을 요청.
|
|
191
|
+
|
|
192
|
+
완료 기준:
|
|
193
|
+
|
|
194
|
+
- first-run에서 useful status까지 5분 이내.
|
|
195
|
+
- 비개발자 onboarding copy는 check state와 wrap work 중심.
|
|
196
|
+
- 고급 명령어는 유지하되 onboarding 전면에 두지 않음.
|
|
197
|
+
|
|
198
|
+
## 구현 단계
|
|
199
|
+
|
|
200
|
+
### Phase 0: Branding And Category Clarity
|
|
201
|
+
|
|
202
|
+
- "Neurain Knowledge OS"를 공식 브랜드로 승격.
|
|
203
|
+
- Runtime category 설명 수정.
|
|
204
|
+
- 3-layer model 배포.
|
|
205
|
+
|
|
206
|
+
### Phase 1: Event Journal
|
|
207
|
+
|
|
208
|
+
- Status: alpha shipped.
|
|
209
|
+
- `neurain journal add`는 `--confirm "1건 저장 진행"`이 있을 때만 reviewed event를 append합니다.
|
|
210
|
+
- `neurain journal list`와 `neurain journal verify`는 read-only입니다.
|
|
211
|
+
- MCP는 journal list/verify만 노출하고 append는 노출하지 않습니다.
|
|
212
|
+
- Secret-like와 instruction-injection summary는 redaction되고 prompt context나 cross-host indexing에 unsafe로 표시됩니다.
|
|
213
|
+
|
|
214
|
+
### Phase 2 / E3: Watch Mode
|
|
215
|
+
|
|
216
|
+
- Status: read-only watch report alpha shipped.
|
|
217
|
+
- `neurain watch <folder> --poll-once`가 안전한 local signal을 관찰.
|
|
218
|
+
- 최근 text file, event journal entry, recap hint, lesson candidate를 읽음.
|
|
219
|
+
- candidate report만 생성.
|
|
220
|
+
- Alpha에서는 daemon을 시작하지 않고, event를 append하지 않고, lesson을 promote하지 않고, durable wiki knowledge를 쓰지 않음.
|
|
221
|
+
- MCP는 read-only `neurain_watch_report`만 노출.
|
|
222
|
+
|
|
223
|
+
### Phase 3: Review Worker
|
|
224
|
+
|
|
225
|
+
- Status: read-only review worker report alpha shipped.
|
|
226
|
+
- `neurain review <folder> --json`이 watch report, journal event sequence, recap hint, lesson candidate를 review item, blocked item, suggested action으로 변환.
|
|
227
|
+
- promotion path 전에 candidate usefulness와 risk를 점수화.
|
|
228
|
+
- private, unsafe, withheld signal에 대한 blocked reason 포함.
|
|
229
|
+
- model을 호출하지 않고, nested review worker를 시작하지 않고, external tool을 실행하지 않고, lesson을 promote하지 않고, durable wiki knowledge를 쓰지 않음.
|
|
230
|
+
- MCP는 read-only `neurain_review_worker`만 노출.
|
|
231
|
+
- connector-triggered automatic scheduling은 아직 planned. user-started foreground daemon check는 별도로 shipped.
|
|
232
|
+
|
|
233
|
+
### Phase 3b: Scheduler Tick
|
|
234
|
+
|
|
235
|
+
- Status: read-only scheduler tick alpha shipped.
|
|
236
|
+
- `neurain scheduler tick <folder> --json`이 watch signal을 확인하고 local review를 실행할지 판단합니다.
|
|
237
|
+
- `neurain scheduler status <folder> --json`은 같은 판단을 report하지만 review worker report는 포함하지 않습니다.
|
|
238
|
+
- Background job을 설치하지 않고, daemon을 시작하지 않고, model을 호출하지 않고, lesson을 promote하지 않고, durable wiki knowledge를 쓰지 않습니다.
|
|
239
|
+
- MCP는 scheduler tick용 read-only `neurain_scheduler_tick`을 노출합니다.
|
|
240
|
+
|
|
241
|
+
### Phase 3b2 / E20: Scheduler Trigger Eval
|
|
242
|
+
|
|
243
|
+
- Status: read-only scheduler trigger eval alpha shipped.
|
|
244
|
+
- `neurain scheduler eval <folder> --fixture-size 100 --json`은 target root를 건드리지 않고 synthetic background-review trigger regression을 실행합니다.
|
|
245
|
+
- `neurain scheduler eval <folder> --case-file scheduler-cases.json --min-cases 5 --json`은 사람이 검토한 scheduler trigger case를 실행합니다.
|
|
246
|
+
- Trigger precision, trigger recall, false positive, false negative, no-recursion, private-boundary handling, target-root non-write를 gate로 확인합니다.
|
|
247
|
+
- E20 safety hardening으로 broad target-root snapshot, temp cleanup proof, explicit private/no-recursion denominator count, case-file size cap, traversal refusal test, MCP positive-integer validation을 추가했습니다.
|
|
248
|
+
- MCP는 read-only `neurain_scheduler_eval`을 노출합니다.
|
|
249
|
+
|
|
250
|
+
### Phase 3c: Scheduler Monitor
|
|
251
|
+
|
|
252
|
+
- Status: bounded foreground scheduler monitor alpha shipped.
|
|
253
|
+
- `neurain scheduler monitor <folder> --interval-seconds 60 --max-ticks 3 --json`이 사용자가 지정한 tick 수만큼 read-only scheduler tick을 반복합니다.
|
|
254
|
+
- Background job을 설치하지 않고, daemon을 시작하지 않고, 명령 종료 뒤 계속 실행되지 않고, model을 호출하지 않고, lesson을 promote하지 않고, durable wiki knowledge를 쓰지 않습니다.
|
|
255
|
+
- Alpha에서 MCP는 scheduler monitor를 노출하지 않습니다.
|
|
256
|
+
|
|
257
|
+
### Phase 3d: Lifecycle Lineage
|
|
258
|
+
|
|
259
|
+
- Status: append-only lifecycle event와 read-only lineage report alpha shipped.
|
|
260
|
+
- `neurain lifecycle emit <folder> --host codex --event turn_end --session-id <id> --turn-id <id> --confirm "1건 저장 진행"`이 검토된 host boundary event를 event journal에 기록합니다.
|
|
261
|
+
- 지원하는 lifecycle event는 `session_start`, `turn_start`, `turn_end`, `wrap_complete`, `review_due`, `review_complete`, `compaction`, `resume`, `session_end`입니다.
|
|
262
|
+
- `neurain lifecycle report <folder> --json`은 completed turn, open turn, review-due event, compaction, resume, parent session, journal integrity를 요약합니다.
|
|
263
|
+
- MCP는 read-only `neurain_lifecycle_report`만 노출합니다. lifecycle emit은 노출하지 않습니다.
|
|
264
|
+
- 이것은 Neurain이 Codex, Claude Code, Gemini CLI, Agent Runtime, OpenClaw 아래에 붙어 있을 때 Runtime agent-loop ownership을 대체하는 현재 방식입니다.
|
|
265
|
+
- Codex lifecycle hook preview와 Runtime host-proxy lifecycle contract는 alpha shipped 상태입니다. 더 깊은 native Runtime lifecycle emission은 connector-specific future work입니다.
|
|
266
|
+
|
|
267
|
+
### Phase 3d2 / E21: Native Lifecycle Automation Eval
|
|
268
|
+
|
|
269
|
+
- Status: read-only native lifecycle automation eval alpha shipped.
|
|
270
|
+
- `neurain lifecycle eval <folder> --fixture-size 100 --json`이 Claude Code, Codex, Runtime, generic host의 synthetic lifecycle payload를 격리된 임시 root에서 실제 hook adapter로 재생합니다.
|
|
271
|
+
- `neurain lifecycle eval <folder> --case-file lifecycle-cases.json --min-cases 6 --json`이 사람이 검토한 lifecycle payload case를 실행합니다.
|
|
272
|
+
- host coverage(claude, codex, runtime, generic), lifecycle event coverage(session_start, turn_start, turn_end, review_due, compaction, resume, session_end), 그리고 malformed/unsupported/missing-session payload가 durable write 없이 ignored 처리되는지를 gate로 확인합니다.
|
|
273
|
+
- 모든 durable artifact를 다시 읽어 prompt body, transcript path, tool stdout, tool stderr, secret, private payload가 전혀 저장되지 않음을 증명하고, path-traversal containment, broad target-root non-write snapshot, temp cleanup을 함께 검증합니다.
|
|
274
|
+
- model이나 external tool을 호출하지 않고 target root에 write하지 않습니다.
|
|
275
|
+
- MCP는 read-only `neurain_lifecycle_eval`만 노출합니다. lifecycle emit이나 lifecycle write path는 노출하지 않습니다.
|
|
276
|
+
- E21은 lifecycle 관찰 안전 gate입니다. 완전한 background self-improvement loop를 만드는 단계가 아니며, 자동 background review, 자동 lesson promotion, 자동 rollback을 주장하지 않습니다. 다음 epic인 E22 semantic recall quality가 여기서 이어집니다.
|
|
277
|
+
|
|
278
|
+
### Phase 5b / E22: Semantic Recall Quality
|
|
279
|
+
|
|
280
|
+
- Status: read-only local lexical-semantic recall alpha가 2026-06-09에 shipped.
|
|
281
|
+
- `neurain recall semantic-search <folder> <query> --json`이 exact-token recall 위에 lexical-semantic 계층(어간, 큐레이션 동의어, char-trigram 퍼지)을 얹어 의역 쿼리를 찾습니다.
|
|
282
|
+
- `neurain recall eval <folder> --semantic --fixture-size 60 --min-cases 50 --json`이 synthetic 의역 fixture에서 semantic Hit@top이 exact-token 기준선을 명확히 능가함을 증명합니다.
|
|
283
|
+
- `neurain recall eval <folder> --semantic --case-file semantic-recall-cases.json --min-cases 50 --json`이 reviewed semantic recall case를 실행합니다.
|
|
284
|
+
- 기본 `local-lexical` provider는 결정론적이고, model call이 없고, 외부 의존성이 없고, 별도 generated index가 필요 없으며(markdown canonical), 특정 LLM에 종속되지 않습니다. embedding provider는 교체 가능해서 나중에 진짜 벡터모델을 canonical 변경 없이 끼울 수 있습니다.
|
|
285
|
+
- Gates: semantic Hit@top >= 0.9 이고 exact baseline + 0.2 이상, source support, host isolation, private 제외, no-answer abstention, rebuild 동등성, target-root non-write.
|
|
286
|
+
- MCP는 read-only `neurain_recall_semantic_search`와 `neurain_recall_semantic_eval`만 노출.
|
|
287
|
+
- 정직한 범위: 형태소 변형/큐레이션 동의어/오타까지 다루며 neural의 임의 개념유사는 아닙니다. synthetic eval은 메커니즘을 결정론적으로 증명할 뿐 실사용자 benchmark는 아니고, 맵에 없는 개념은 pluggable embedding provider가 필요하며 실사용자 semantic recall 증명은 후속 과제입니다.
|
|
288
|
+
|
|
289
|
+
### Phase 6 / E23: Live User Eval Pack (first increment)
|
|
290
|
+
|
|
291
|
+
- Status: 2026-06-09 첫 increments shipped - read-only live-content recall coverage와 redacted live case scaffold.
|
|
292
|
+
- `neurain recall live-eval <folder>`이 실제 폴더의 자기 콘텐츠가 의역 쿼리(실제 단어를 동의어로 교체)로 얼마나 회수되는지 측정해 hybrid coverage(권장 exact-union-semantic), exact-token coverage, semantic-only coverage, kind별 coverage를 보고합니다.
|
|
293
|
+
- 실제 콘텐츠 finding: live-eval을 실제 폴더에 돌리니 순수 semantic만으로는 exact-token보다 recall이 더 낮을 수 있음이 드러났습니다(실제 내부 코퍼스에서 semantic 0.767 vs exact 0.814). lexical-semantic 계층이 exact-token BM25의 희귀도 가중치가 없기 때문입니다. 이를 E22 후속 수정으로 추가: hybrid recall(`recall hybrid-search`, MCP `neurain_recall_hybrid_search`)은 exact-token 합집합 semantic이라 exact보다 절대 나쁘지 않고 의역 케이스를 더합니다(같은 코퍼스에서 hybrid 0.86 vs exact 0.814). hybrid search에서 `--top`은 각 branch의 후보 깊이라 semantic-only catch가 있으면 반환되는 합집합은 `--top`보다 길어질 수 있습니다. synthetic fixture가 놓치고 실제 콘텐츠 eval이 잡아낸 정확한 사례입니다.
|
|
294
|
+
- read-only이고 메트릭만 반환하며 콘텐츠를 저장하지 않습니다(private vault에서 안전). model/external call 없음. readiness는 hybrid coverage가 exact-token coverage보다 나쁘지 않은지 gate.
|
|
295
|
+
- 목적: alpha 증거를 synthetic fixture에서 실제 콘텐츠로 옮기고, 실사용자가 자기 자료에서 recall이 충분한지/무엇을 놓치는지 직접 보게 함.
|
|
296
|
+
- 정직한 범위와 남은 것: 쿼리는 자동 생성이지 사람이 판단한 relevance가 아니고, 폴더 하나는 3명이 아닙니다. synthetic-only 주장을 줄이지만 제거하진 못합니다. 완전한 E23(최소 3개 실제 폴더의 reviewed live case, redacted, readiness 연결)은 사람의 몫(실제 walkthrough)입니다. 에이전트는 사용자 증거를 지어내지 않습니다.
|
|
297
|
+
- Storage-safety bridge shipped: `neurain live-cases scaffold <folder>`가 hash-only source ref를 쓰고 raw source text와 absolute path를 저장하지 않는 redacted reviewed-case pack scaffold를 준비합니다. 사람이 안전한 local case file을 채우고 eval을 실행하기 전까지 `reviewed_live_user_evidence: false`, `human_judged: false`를 보고합니다.
|
|
298
|
+
- 남은 E23 increment: 실제 폴더 기준 answer/lesson live-eval, 최소 3개 실제 업무 폴더의 filled reviewed live case file, 그리고 그 reviewed receipt가 있어야 외부 사용자 증거를 주장할 수 있게 하는 readiness gate.
|
|
299
|
+
|
|
300
|
+
### Phase 6b / E24: Non-Developer Onboarding
|
|
301
|
+
|
|
302
|
+
- Status: read-only first-run onboarding alpha가 2026-06-09에 shipped.
|
|
303
|
+
- `neurain onboard <folder>`는 신규 사용자에게 다음 행동이 `init`, `adopt --dry-run`, host connection, `wrap --dry-run` 중 무엇인지 알려줍니다.
|
|
304
|
+
- Neurain이 저장하는 것과 저장하지 않는 것을 보고합니다.
|
|
305
|
+
- Model call, external call, durable write가 없습니다.
|
|
306
|
+
- Readiness는 non-developer onboarding dry-run gate를 포함합니다.
|
|
307
|
+
|
|
308
|
+
### Phase 6c / E25: Installable CLI Readiness
|
|
309
|
+
|
|
310
|
+
- Status: publish-ready alpha CLI gate가 2026-06-09에 shipped.
|
|
311
|
+
- Package는 `neurain` binary를 노출하고 public docs, CI workflow, template, license, security doc을 포함합니다.
|
|
312
|
+
- Full readiness는 npm audit, npm pack dry-run, temporary tarball install smoke를 검증합니다.
|
|
313
|
+
- 정직한 범위: 실제 npm publish 또는 package name reservation은 implementation claim이 아니라 별도 release action입니다.
|
|
314
|
+
|
|
315
|
+
### Phase 6d / E26: Thin MCP Connector
|
|
316
|
+
|
|
317
|
+
- Status: Codex, Claude Code, Gemini CLI, Agent Runtime MCP surface가 2026-06-09에 shipped.
|
|
318
|
+
- `neurain connect codex <folder> --dry-run`, `neurain connect claude <folder> --dry-run`, `neurain connect gemini <folder> --dry-run`은 host CLI setup command를 출력합니다.
|
|
319
|
+
- `neurain connect runtime <folder> --dry-run`은 bounded config snippet을 출력합니다.
|
|
320
|
+
- Gemini는 bounded read-first allowlist를 사용합니다. Raw capture는 Gemini allowlist에 기본 포함되지 않습니다.
|
|
321
|
+
- MCP는 bounded read, scan, eval, live-case scaffold, preview tool을 노출합니다. Silent durable wiki write, lifecycle emit, daemon run/stop, curator write, recall rebuild write, lesson promotion은 노출하지 않습니다.
|
|
322
|
+
- Gemini lifecycle automation은 alpha에서 manual-only입니다. 필요할 때 explicit `lifecycle emit --host gemini`이 receipt path입니다.
|
|
323
|
+
|
|
324
|
+
### Phase 3e / E11: Continuous Daemon
|
|
325
|
+
|
|
326
|
+
- Status: user-started foreground continuous daemon alpha shipped.
|
|
327
|
+
- `neurain daemon run <folder> --interval-seconds 300 --json`은 process가 멈출 때까지 scheduler tick을 반복합니다.
|
|
328
|
+
- `neurain daemon run <folder> --max-ticks 2 --json`은 test와 demo를 위한 bounded proof path입니다.
|
|
329
|
+
- `neurain daemon status <folder> --json`은 마지막 operational state를 읽습니다.
|
|
330
|
+
- `neurain daemon stop <folder> --json`은 matching foreground daemon이 실행 중일 때 cooperative stop을 요청합니다.
|
|
331
|
+
- Stop request는 sleep interval 중에도 약 1초 단위로 확인합니다.
|
|
332
|
+
- Daemon은 `00_system/neurain/daemon-state.json` operational state만 씁니다.
|
|
333
|
+
- Background job을 설치하지 않고, durable wiki knowledge를 쓰지 않고, lesson을 promote하지 않고, model을 호출하지 않고, external tool을 호출하지 않고, state에 private path를 저장하지 않고, MCP tool로 노출되지 않습니다.
|
|
334
|
+
|
|
335
|
+
### Phase 4: Curator
|
|
336
|
+
|
|
337
|
+
- Status: snapshot-gated curator lifecycle alpha shipped.
|
|
338
|
+
- `neurain curator status <folder>`가 write 없이 planned lifecycle change를 report.
|
|
339
|
+
- `neurain curator run <folder> --dry-run`이 같은 plan을 preview.
|
|
340
|
+
- `neurain curator run <folder> --confirm "1건 저장 진행"`이 snapshot receipt를 쓰고 lesson `Status` field만 변경.
|
|
341
|
+
- `neurain curator rollback <folder> --receipt <receipt>`가 registry drift가 없을 때 정확한 이전 registry snapshot을 복원.
|
|
342
|
+
- Curator는 lesson을 삭제하지 않고 pinned, human-authored, private, non-agent-created lesson을 보호.
|
|
343
|
+
- MCP는 read-only `neurain_curator_status`와 `neurain_curator_run_preview`만 노출.
|
|
344
|
+
|
|
345
|
+
### Phase 5: Recall DB
|
|
346
|
+
|
|
347
|
+
- Status: optional SQLite FTS5 recall DB alpha shipped.
|
|
348
|
+
- `neurain recall status <folder>`가 write 없이 cache state, row count, manifest hash, markdown fallback availability를 report.
|
|
349
|
+
- `neurain recall rebuild <folder> --dry-run`이 write 없이 indexed count를 preview.
|
|
350
|
+
- `neurain recall rebuild <folder>`는 `00_system/neurain/recall.sqlite`와 rebuild receipt만 씁니다.
|
|
351
|
+
- `neurain recall search <folder> <query>`는 SQLite FTS5 cache가 있으면 이를 사용하고, cache가 없으면 markdown fallback으로 검색합니다.
|
|
352
|
+
- `neurain recall verify <folder>`는 현재 markdown, safe event, safe receipt와 cache manifest를 비교합니다.
|
|
353
|
+
- Markdown은 계속 canonical이고, SQLite file은 rebuildable cache입니다.
|
|
354
|
+
- Private path, raw source body, secret-like content, instruction-injection content, recall rebuild receipt는 index에서 제외됩니다.
|
|
355
|
+
- MCP는 read-only `neurain_recall_status`, `neurain_recall_rebuild_preview`, `neurain_recall_search`, `neurain_recall_verify`, `neurain_recall_cross_host_eval`을 노출합니다.
|
|
356
|
+
- MCP는 answer-quality fixture regression용 read-only `neurain_answer_quality_eval`도 노출합니다.
|
|
357
|
+
- MCP는 scheduler trigger regression용 read-only `neurain_scheduler_eval`도 노출합니다.
|
|
358
|
+
- `neurain recall eval <folder>`은 safe host-tagged journal event를 대상으로 하는 read-only alpha smoke eval로 shipped.
|
|
359
|
+
- `neurain recall eval <folder> --fixture-size 100 --private-probes 20`은 larger synthetic cross-host regression eval로 shipped. Target root를 건드리지 않고 exact-token host filtering, source-supporting snippet, host isolation, private leakage를 gate로 확인합니다.
|
|
360
|
+
- `neurain recall eval <folder> --case-file recall-cases.json`은 reviewed case-file eval로 shipped. Target root를 건드리지 않고 사람이 검토한 recall case 기준으로 host filtering, source-supporting snippet, host isolation, private leakage를 gate로 확인합니다.
|
|
361
|
+
- 이 recall eval mode들은 recall plumbing, reviewed case handling, privacy filter를 검증합니다. 이 둘만으로 90 percent 목표의 semantic recall quality evidence라고 주장하지는 않습니다.
|
|
362
|
+
- `neurain answer eval <folder> --fixture-size 120`은 answer faithfulness, citation accuracy, conflict surfacing, abstention, private boundary, stale-source handling을 확인하는 read-only alpha fixture로 shipped 상태입니다.
|
|
363
|
+
- `neurain answer eval <folder> --case-file answer-cases.json`은 answer faithfulness, citation accuracy, conflict surfacing, abstention, private boundary, stale-source handling을 확인하는 read-only reviewed case-file eval로 shipped 상태입니다.
|
|
364
|
+
- 이 answer eval mode들은 policy gate와 reviewed case handling을 검증하지만 live production answer quality를 증명하지는 않습니다. 90 percent answer quality를 주장하려면 external user walkthrough가 아직 필요합니다.
|
|
365
|
+
|
|
366
|
+
### Phase 6: Runtime Connector
|
|
367
|
+
|
|
368
|
+
- Status: Runtime MCP config-preview connector alpha shipped.
|
|
369
|
+
- `neurain connect runtime <folder> --dry-run`은 host-managed config용 bounded MCP server snippet을 출력합니다.
|
|
370
|
+
- Snippet은 Neurain을 local stdio MCP server로 쓰며 작은 read-heavy tool allowlist만 노출합니다.
|
|
371
|
+
- Alpha는 Runtime config를 직접 수정하지 않습니다.
|
|
372
|
+
- Neurain adoption에 Runtime를 필수로 만들지 않음.
|
|
373
|
+
- 더 깊은 Runtime lifecycle-event connector automation은 아직 planned입니다.
|
|
374
|
+
|
|
375
|
+
## 90 percent 주장 전 hard gate
|
|
376
|
+
|
|
377
|
+
- product package leakage 없음.
|
|
378
|
+
- secret 또는 prompt-injection lesson이 prompt context에 들어가지 않음.
|
|
379
|
+
- source, receipt, log, handoff, recall-index에서 온 instruction injection을 실행하거나 승격하지 않음.
|
|
380
|
+
- global/cross-host recall fixture에 private-area fact 또는 secret-like string이 나오지 않음.
|
|
381
|
+
- private 또는 area-only source에서 global promotion 불가.
|
|
382
|
+
- rollback은 receipt-listed write만 제거.
|
|
383
|
+
- curator는 dry-run, snapshot, rollback, pinned protection 보유.
|
|
384
|
+
- recall DB는 canonical markdown과 receipt에서 삭제 후 rebuild 가능.
|
|
385
|
+
- recall DB rebuild equivalence와 markdown-only fallback이 synthetic test를 통과.
|
|
386
|
+
- background review는 trigger precision과 trigger recall evidence를 모두 보유.
|
|
387
|
+
- background review trigger eval은 synthetic mode와 reviewed case-file gate를 모두 통과.
|
|
388
|
+
- 사용자 체감 자동화는 non-developer walkthrough evidence를 보유.
|
|
389
|
+
- Claude MAX EFFORT review가 PASS 또는 PASS-WITH-CHANGES with blocking findings resolved 상태.
|
|
390
|
+
- 구현 후 product test와 readiness check 통과.
|
|
391
|
+
|
|
392
|
+
## 현재 상태
|
|
393
|
+
|
|
394
|
+
현재 alpha는 첫 번째 안전한 slice를 구현했습니다.
|
|
395
|
+
|
|
396
|
+
- lesson list
|
|
397
|
+
- event journal add/list/verify
|
|
398
|
+
- lesson candidates
|
|
399
|
+
- recap
|
|
400
|
+
- wrap dry-run
|
|
401
|
+
- capabilities
|
|
402
|
+
- read-only watch report
|
|
403
|
+
- read-only review worker report
|
|
404
|
+
- read-only scheduler tick
|
|
405
|
+
- synthetic mode와 reviewed case-file mode를 지원하는 scheduler trigger eval alpha
|
|
406
|
+
- bounded foreground scheduler monitor
|
|
407
|
+
- user-started foreground continuous daemon
|
|
408
|
+
- append-only lifecycle emit
|
|
409
|
+
- read-only lifecycle lineage report
|
|
410
|
+
- Claude Code lifecycle hook preview
|
|
411
|
+
- snapshot-gated curator lifecycle
|
|
412
|
+
- optional SQLite FTS5 recall DB
|
|
413
|
+
- E22 local lexical-semantic recall
|
|
414
|
+
- hybrid recall
|
|
415
|
+
- E23 live-content recall coverage
|
|
416
|
+
- E23 redacted live-case scaffold
|
|
417
|
+
- E24 read-only onboarding
|
|
418
|
+
- E25 publish-ready CLI gates
|
|
419
|
+
- E26 Codex, Claude Code, Gemini CLI, Agent Runtime MCP connector surface
|
|
420
|
+
- cross-host recall eval smoke
|
|
421
|
+
- larger 100-case cross-host recall regression eval
|
|
422
|
+
- synthetic mode와 reviewed case-file mode를 지원하는 answer-quality eval alpha
|
|
423
|
+
- 좁은 CLI promotion
|
|
424
|
+
- rollback receipt
|
|
425
|
+
- MCP preview-only boundary
|
|
426
|
+
|
|
427
|
+
Claude Code connector-specific lifecycle hook preview는 alpha로 shipped 상태입니다. SessionStart, UserPromptSubmit, Stop, SessionEnd를 Neurain lifecycle receipt로 매핑하되 prompt body, transcript path, 성공 stdout을 저장하거나 model context에 주입하지 않습니다.
|
|
428
|
+
|
|
429
|
+
Codex lifecycle hook preview도 `.codex/hooks.json` SessionStart와 Edit 또는 Write PostToolUse review marker 기준으로 alpha shipped 상태입니다. Runtime host-proxy lifecycle contract도 direct lifecycle boundary event를 전달할 수 있는 agent loop 기준으로 alpha shipped 상태입니다. 더 깊은 native Runtime lifecycle-event automation은 Neurain이 Runtime agent loop를 소유하지 않기 때문에 future work로 남아 있습니다.
|
|
@@ -0,0 +1,163 @@
|
|
|
1
|
+
# [en] Self-Improving Workflows
|
|
2
|
+
|
|
3
|
+
Version: 0.14
|
|
4
|
+
|
|
5
|
+
Neurain Knowledge OS does not train or replace an LLM. It gives Codex, Claude Code, Gemini CLI, Agent Runtime, OpenClaw, or another AI host a safer local memory layer.
|
|
6
|
+
|
|
7
|
+
Agent Runtime is an agent runtime, not a model. Neurain's role is to provide the local-first Knowledge OS layer that such runtimes can use.
|
|
8
|
+
|
|
9
|
+
The Runtime-inspired loop is:
|
|
10
|
+
|
|
11
|
+
1. Work happens in a local folder.
|
|
12
|
+
2. Neurain captures or reads source-grounded evidence.
|
|
13
|
+
3. `neurain wrap <folder> --dry-run` summarizes recent signals.
|
|
14
|
+
4. `neurain journal add` can append a reviewed local event with a receipt.
|
|
15
|
+
5. `neurain watch <folder> --poll-once` produces a read-only candidate report from recent files, journal events, recap hints, and lesson candidates.
|
|
16
|
+
6. `neurain review <folder> --json` converts those signals into a read-only review worker report with manual improvement proposals.
|
|
17
|
+
7. `neurain scheduler tick <folder> --json` decides whether those signals are strong enough to include a review worker report.
|
|
18
|
+
8. `neurain scheduler eval <folder> --fixture-size 100` checks trigger precision, trigger recall, no-recursion, private boundaries, and target-root non-write before Neurain claims background review quality.
|
|
19
|
+
9. `neurain scheduler monitor <folder> --max-ticks 3` can repeat those read-only ticks in the foreground while the user keeps the command running.
|
|
20
|
+
10. `neurain daemon run <folder>` can keep the same scheduler checks running as a user-started foreground loop with operational state only.
|
|
21
|
+
11. `neurain lifecycle emit` can append reviewed host boundary events such as turn end, wrap complete, compaction, or resume.
|
|
22
|
+
12. `neurain connect claude <folder> --lifecycle-hooks --dry-run` can preview Claude Code hooks that map SessionStart, UserPromptSubmit, Stop, and SessionEnd into lifecycle receipts without storing prompt bodies, transcript paths, or success stdout in model context.
|
|
23
|
+
13. `neurain lifecycle report` summarizes session lineage and shows where review or recall should continue.
|
|
24
|
+
14. Neurain proposes lesson candidates when a recurring mistake or workflow pattern appears.
|
|
25
|
+
15. The user reviews the candidate before any durable lesson is promoted through CLI.
|
|
26
|
+
16. `neurain curator` can later mark old agent-created lessons stale or archived with snapshot rollback and pinned protection.
|
|
27
|
+
17. `neurain recall` can build an optional local SQLite FTS5 cache so future sessions can find prior handoffs, events, receipts, and lessons faster.
|
|
28
|
+
18. `neurain recall eval` can check whether safe host-tagged events are retrieved under the right host filter, using synthetic fixtures or reviewed case files.
|
|
29
|
+
19. `neurain answer eval` checks answer faithfulness, citation accuracy, conflict surfacing, abstention, private boundaries, and stale-source handling with synthetic fixtures or reviewed case files, without model calls or target-root writes.
|
|
30
|
+
20. `neurain lessons eval` checks candidate detection precision, recall, false-positive rate, and unsafe candidate blocking without model calls or target-root writes.
|
|
31
|
+
21. `neurain lifecycle eval` checks that host lifecycle hook payloads from Claude Code, Codex, Runtime, and generic hosts map to the right lifecycle events, that malformed, unsupported, and missing-session payloads become ignored results with no durable write, and that prompt bodies, transcript paths, tool stdout, tool stderr, secrets, and private payloads are never persisted, without model calls or target-root writes.
|
|
32
|
+
22. Agent Runtime can connect through a bounded MCP config snippet while Neurain remains the backend Knowledge OS.
|
|
33
|
+
23. Future sessions can read active lesson covers, lifecycle lineage, recall results, scheduler-trigger gates, lesson-detection gates, answer-quality gates, and lifecycle-automation gates first, then avoid repeating the same mistake.
|
|
34
|
+
|
|
35
|
+
## What Neurain Stores
|
|
36
|
+
|
|
37
|
+
- Active lessons in `00_system/neurain/lessons.md`.
|
|
38
|
+
- Append-only workflow events in `00_system/neurain/events.ndjson`.
|
|
39
|
+
- Append-only host lifecycle boundary events in the same journal, marked as `lifecycle` events.
|
|
40
|
+
- Optional detailed lesson cards under an area's `memory/lessons/` folder.
|
|
41
|
+
- Optional generated recall cache in `00_system/neurain/recall.sqlite`.
|
|
42
|
+
- Foreground daemon operational state in `00_system/neurain/daemon-state.json`.
|
|
43
|
+
- Raw source captures only when the user or MCP capture tool provides them.
|
|
44
|
+
- Receipts for safety-critical writes such as adoption, lesson promotion, and lesson rollback.
|
|
45
|
+
|
|
46
|
+
## What Neurain Does Not Do In Alpha
|
|
47
|
+
|
|
48
|
+
- It does not silently promote lesson candidates.
|
|
49
|
+
- MCP remains preview-only and cannot promote lessons.
|
|
50
|
+
- It does not overwrite durable wiki knowledge through MCP.
|
|
51
|
+
- It does not expose a web admin page.
|
|
52
|
+
- It does not upload your folder to a hosted memory service.
|
|
53
|
+
- It does not treat the recall SQLite cache as canonical truth.
|
|
54
|
+
|
|
55
|
+
## Commands
|
|
56
|
+
|
|
57
|
+
```bash
|
|
58
|
+
neurain onboard ~/NeurainDemo --lang en
|
|
59
|
+
neurain connect codex ~/NeurainDemo --dry-run
|
|
60
|
+
neurain connect claude ~/NeurainDemo --dry-run
|
|
61
|
+
neurain connect gemini ~/NeurainDemo --dry-run
|
|
62
|
+
neurain connect runtime ~/NeurainDemo --dry-run
|
|
63
|
+
neurain lessons list ~/NeurainDemo
|
|
64
|
+
neurain journal list ~/NeurainDemo
|
|
65
|
+
neurain journal add ~/NeurainDemo --type test --summary "npm test passed" --confirm "1건 저장 진행"
|
|
66
|
+
neurain journal verify ~/NeurainDemo
|
|
67
|
+
neurain watch ~/NeurainDemo --poll-once
|
|
68
|
+
neurain review ~/NeurainDemo --json
|
|
69
|
+
neurain scheduler tick ~/NeurainDemo --json
|
|
70
|
+
neurain scheduler status ~/NeurainDemo --json
|
|
71
|
+
neurain scheduler monitor ~/NeurainDemo --interval-seconds 60 --max-ticks 3 --json
|
|
72
|
+
neurain scheduler eval ~/NeurainDemo --fixture-size 100 --json
|
|
73
|
+
neurain scheduler eval ~/NeurainDemo --case-file scheduler-cases.json --min-cases 5 --json
|
|
74
|
+
neurain daemon run ~/NeurainDemo --interval-seconds 300 --max-ticks 2 --json
|
|
75
|
+
neurain daemon status ~/NeurainDemo --json
|
|
76
|
+
neurain daemon stop ~/NeurainDemo --json
|
|
77
|
+
neurain lifecycle emit ~/NeurainDemo --host codex --event turn_end --session-id codex-session-1 --turn-id turn-1 --confirm "1건 저장 진행" --json
|
|
78
|
+
neurain lifecycle report ~/NeurainDemo --json
|
|
79
|
+
neurain lifecycle eval ~/NeurainDemo --fixture-size 100 --json
|
|
80
|
+
neurain lifecycle eval ~/NeurainDemo --case-file lifecycle-cases.json --min-cases 6 --json
|
|
81
|
+
neurain curator status ~/NeurainDemo --json
|
|
82
|
+
neurain curator run ~/NeurainDemo --dry-run --json
|
|
83
|
+
neurain curator run ~/NeurainDemo --confirm "1건 저장 진행" --json
|
|
84
|
+
neurain curator rollback ~/NeurainDemo --receipt output/receipts/curator/<receipt>.json
|
|
85
|
+
neurain recall status ~/NeurainDemo --json
|
|
86
|
+
neurain recall rebuild ~/NeurainDemo --dry-run --json
|
|
87
|
+
neurain recall rebuild ~/NeurainDemo --json
|
|
88
|
+
neurain recall search ~/NeurainDemo "rollback lesson" --json
|
|
89
|
+
neurain recall semantic-search ~/NeurainDemo "fixed the login bug" --json
|
|
90
|
+
neurain recall hybrid-search ~/NeurainDemo "fixed the login bug" --json
|
|
91
|
+
neurain recall verify ~/NeurainDemo --json
|
|
92
|
+
neurain recall eval ~/NeurainDemo --json
|
|
93
|
+
neurain recall eval ~/NeurainDemo --fixture-size 100 --private-probes 20 --json
|
|
94
|
+
neurain recall eval ~/NeurainDemo --case-file recall-cases.json --min-cases 1 --json
|
|
95
|
+
neurain recall eval ~/NeurainDemo --semantic --fixture-size 60 --min-cases 50 --json
|
|
96
|
+
neurain recall eval ~/NeurainDemo --semantic --case-file semantic-recall-cases.json --min-cases 50 --json
|
|
97
|
+
neurain recall live-eval ~/NeurainDemo --sample-size 60 --json
|
|
98
|
+
neurain live-cases scaffold ~/NeurainDemo --sample-size 12 --json
|
|
99
|
+
neurain live-cases scaffold ~/NeurainDemo --write --confirm "1건 저장 진행" --output output/live-cases/e23-live-case-pack.json --json
|
|
100
|
+
neurain answer eval ~/NeurainDemo --fixture-size 120 --json
|
|
101
|
+
neurain answer eval ~/NeurainDemo --case-file answer-cases.json --min-cases 5 --json
|
|
102
|
+
neurain lessons eval ~/NeurainDemo --fixture-size 100 --json
|
|
103
|
+
neurain lessons eval ~/NeurainDemo --case-file lesson-cases.json --min-cases 4 --json
|
|
104
|
+
neurain connect runtime ~/NeurainDemo --dry-run
|
|
105
|
+
neurain lessons candidates ~/NeurainDemo
|
|
106
|
+
neurain lessons promote ~/NeurainDemo --candidate-id <candidate-id> --confirm "1건 저장 진행"
|
|
107
|
+
neurain lessons rollback ~/NeurainDemo --receipt output/receipts/lessons/<receipt>.json
|
|
108
|
+
neurain recap ~/NeurainDemo
|
|
109
|
+
neurain wrap ~/NeurainDemo --dry-run
|
|
110
|
+
neurain capabilities "rollback lesson"
|
|
111
|
+
```
|
|
112
|
+
|
|
113
|
+
For multi-area vaults, preview commands do not scan every area by default. Pass `--area <name>` to inspect one specific area. Unsafe preview lines are redacted when they contain secret-like or instruction-injection content.
|
|
114
|
+
|
|
115
|
+
## Safety Rules
|
|
116
|
+
|
|
117
|
+
Lesson candidates are blocked or marked for sanitization when they include secret-like content or instruction-injection language such as requests to ignore prior instructions. Private-derived lessons must stay private unless the user explicitly declassifies them. Promotion writes only an append block and receipt. Rollback removes only the exact promoted block recorded in that receipt.
|
|
118
|
+
|
|
119
|
+
Journal events follow the same safety direction. Secret-like or instruction-injection summaries are redacted before storage, marked as unsafe for prompt context, and blocked from cross-host recall indexing.
|
|
120
|
+
|
|
121
|
+
Dry-run journal output marks `line_hash_status` as `planned`. Durable journal writes create a pending receipt before the NDJSON append, then finalize the receipt after the append succeeds. A pending receipt marks `line_hash_status` as `expected_pending_append`; a finalized receipt marks it as `observed_after_append`.
|
|
122
|
+
|
|
123
|
+
`neurain journal verify` checks that stored event lines still match their event hashes and that each line is valid JSON. It is an alpha integrity check, not a full tamper-proof chain: deletion, reordering, or a fully forged event with a valid standalone hash requires later sequence-chain work.
|
|
124
|
+
|
|
125
|
+
Watch reports are read-only. They do not start a daemon, append journal events, promote lessons, or write durable wiki knowledge in alpha. Private or unsafe event summaries are withheld unless explicitly included by a local CLI user, and MCP watch reports never receive an append capability. Withheld events may still expose minimal routing metadata such as event id, event type, timestamp, and sensitivity so review triggers can be counted without exposing private content.
|
|
126
|
+
|
|
127
|
+
Review worker reports are read-only. They convert watch reports, event sequences, recap hints, and lesson candidates into manual review items, blocked items, and suggested actions. They do not call a model, start a nested review worker, run external tools, promote lessons, or write durable knowledge. MCP exposes `neurain_review_worker` as the same read-only report surface.
|
|
128
|
+
|
|
129
|
+
Scheduler ticks are read-only. They inspect watch signals, decide whether a local review fork should run, and include a review worker report only when thresholds are met. They do not install background jobs, start daemons, call models, promote lessons, or write durable knowledge. MCP exposes `neurain_scheduler_tick` as the same read-only tick surface.
|
|
130
|
+
|
|
131
|
+
Scheduler eval is read-only and separate from scheduler tick. It checks whether background review triggers have enough precision and recall across meaningful events, lesson candidates, journal-integrity failures, and ordinary no-review notes. It also verifies no-recursion on triggered review reports, private boundary handling only on private-marker cases, case-file size limits, temp cleanup, and broad target-root non-write snapshots. Synthetic fixtures use `--fixture-size`; reviewed live cases can be supplied with `--case-file`. MCP exposes `neurain_scheduler_eval` as the same read-only eval surface.
|
|
132
|
+
|
|
133
|
+
Scheduler monitors are bounded foreground runs. They repeat read-only scheduler ticks for a user-specified number of ticks, do not install background jobs, do not continue after the command exits, and do not write durable knowledge. MCP does not expose scheduler monitor in alpha.
|
|
134
|
+
|
|
135
|
+
Daemon run is a user-started foreground loop for continuous local review checks. It repeats scheduler ticks until the user stops it or until `--max-ticks` is reached. Stop requests are checked during the sleep interval with about one-second polling. It writes only operational state in `00_system/neurain/daemon-state.json`, stores no private paths in that state, does not install a background job, does not write durable wiki knowledge, does not promote lessons, does not call models, does not call external tools, and is not exposed through MCP.
|
|
136
|
+
|
|
137
|
+
Lifecycle events adapt agent-loop awareness to hosts Neurain does not own. `neurain lifecycle emit` appends reviewed boundary receipts for events such as `session_start`, `turn_start`, `turn_end`, `wrap_complete`, `review_due`, `review_complete`, `compaction`, `resume`, and `session_end`. Emits require the exact confirmation phrase. `neurain lifecycle report` is read-only and summarizes completed turns, open turns, parent sessions, compactions, resumes, and review-due events. MCP exposes only `neurain_lifecycle_report`, not lifecycle emit.
|
|
138
|
+
|
|
139
|
+
Lifecycle hook previews are connector-specific. `neurain connect claude <folder> --lifecycle-hooks --dry-run` prints a Claude Code settings snippet, `neurain connect codex <folder> --lifecycle-hooks --dry-run` prints a `.codex/hooks.json` snippet, and `neurain connect runtime <folder> --lifecycle-hooks --dry-run` prints a host-proxy contract. The adapters map host hook payloads into lifecycle receipts, store only operational metadata, and do not store prompt body text, transcript paths, tool stdout, tool stderr, or success stdout in model context.
|
|
140
|
+
|
|
141
|
+
Lifecycle eval is read-only and separate from lifecycle emit. It is the E21 native lifecycle automation gate: it does not build the full background self-improvement loop, it proves that the lifecycle observation layer is safe. It replays synthetic host lifecycle payloads from Claude Code, Codex, Runtime, and generic hosts, plus reviewed case files, through the real hook adapter inside isolated temporary roots, then reads back every durable artifact (journal, connector state, receipts) and proves no artifact contains the prompt body, transcript path, tool stdout, tool stderr, or any secret-like or instruction-injection content, including when that dangerous content is placed in the host-controlled source, reason, matcher, tool-name, or tool fields that do flow into the lifecycle summary. Short host-provided lineage labels in those fields persist in sanitized, lowercased, length-capped, redaction-filtered form by design, so the non-persistence claim is scoped to payload body content and secret or injection content rather than to those short trigger labels. Matching is case-insensitive, and a `non_persistence_proof_complete` flag reports whether every non-persistence dimension was actually exercised so a probe-less reviewed case file is never mistaken for a full proof. It also gates host coverage, lifecycle event coverage, ignored handling for malformed, unsupported, and missing-session payloads, path-traversal containment, broad target-root non-write snapshots, and temp cleanup. Synthetic fixtures use `--fixture-size`; reviewed live cases can be supplied with `--case-file`. MCP exposes `neurain_lifecycle_eval` as the same read-only eval surface and still cannot emit lifecycle events. It does not claim that Neurain owns the host loop, and it does not perform automatic background review, automatic lesson promotion, or automatic rollback.
|
|
142
|
+
|
|
143
|
+
Curator lifecycle changes are snapshot-gated. `neurain curator status` and `neurain curator run --dry-run` are read-only. A confirmed curator run changes only lesson `Status` fields, never deletes lessons, protects pinned, human-authored, private, and non-agent-created lessons, and writes a rollback receipt with the previous registry snapshot. Rollback refuses when the registry has changed after the curator run, so user edits are not overwritten. MCP exposes only read-only curator status and run-preview tools.
|
|
144
|
+
|
|
145
|
+
Recall DB is optional and rebuildable. `neurain recall status`, `neurain recall search`, `neurain recall verify`, and `neurain recall eval` are read-only. `neurain recall rebuild` writes only the generated SQLite cache and a rebuild receipt. Markdown, events, handoffs, and receipts remain canonical. Private paths, raw source bodies, secret-like content, instruction-injection content, and recall rebuild receipts are excluded from indexing. If the SQLite file is deleted, search falls back to markdown and the cache can be rebuilt. `neurain recall eval --fixture-size 100 --private-probes 20` runs a larger synthetic cross-host regression suite without touching the target root. `neurain recall eval --case-file recall-cases.json` runs reviewed recall cases supplied by a human reviewer, also without touching the target root. Both modes gate on host filtering, source-supporting snippets, host isolation, and private leakage. They are not semantic recall quality benchmarks by themselves.
|
|
146
|
+
|
|
147
|
+
Semantic recall is a separate, read-only layer (E22). `neurain recall semantic-search` and `neurain recall eval --semantic` add a local lexical-semantic provider on top of exact-token recall so paraphrased queries are found: it combines stemming (fix / fixed / fixing), a curated synonym map (resolved <-> fixed, defect <-> bug, authentication <-> login), and fuzzy character-trigram overlap for typos. The default `local-lexical` provider is fully deterministic, makes no model call, has no external dependency, needs no separate generated index (the markdown stays canonical), and does not lock Neurain to any LLM; the embedding provider is swappable, so a real vector model can be attached later without changing the canonical source. Honest scope: this layer handles morphological variants, curated synonyms, and typos, not arbitrary neural concept similarity; concepts outside the synonym map still need a pluggable embedding provider. The semantic eval is read-only and proves, on a synthetic paraphrase fixture, that semantic recall clearly beats the exact-token baseline (Hit@top improvement), while preserving host isolation, private exclusion, no-answer abstention, rebuild equivalence, and target-root non-write. It demonstrates the mechanism deterministically; it does not replace real-user recall benchmarks. MCP exposes read-only `neurain_recall_semantic_search` and `neurain_recall_semantic_eval` only.
|
|
148
|
+
|
|
149
|
+
Hybrid recall is the recommended robust recall. `neurain recall hybrid-search` returns exact-token results union local lexical-semantic results, so it is never worse than exact-token and adds paraphrase catches on top. `--top` is the candidate depth for each branch, so the returned union can be longer than `--top` when semantic-only catches exist. This exists because running live-eval (below) on real content surfaced that pure semantic alone can recall LESS than exact-token on real corpora: the lexical-semantic layer lacks the rarity weighting that exact-token BM25 has, so on a real corpus with overlapping vocabulary it discriminates worse. The hybrid union fixes that. MCP exposes read-only `neurain_recall_hybrid_search`.
|
|
150
|
+
|
|
151
|
+
Live-content recall coverage (E23, first increment) is the first step from synthetic fixtures toward real-user evidence. `neurain recall live-eval <folder>` is read-only and measures how much of a real folder's own indexed content is recallable when queried with auto-derived paraphrases (real terms swapped to synonyms), reporting hybrid coverage (the recommended exact-union-semantic strategy), exact-token coverage, semantic-only coverage, and per-kind coverage so a real user can see whether recall is good enough on their own material and what it misses. It gates on hybrid coverage being at least exact-token coverage. Running it on real content is exactly what surfaced that pure semantic-only coverage can be lower than exact-token, which led to the hybrid recall above; the per-term semantic win is validated separately by the E22 semantic eval. Output is metrics-only with no stored content, so it is safe to run on a private vault, and it makes no model or external calls. Honest scope: the queries are auto-derived, not human-judged relevance, and one folder is not three users; live-content coverage is a real-content step, not a substitute for external user walkthroughs, so the synthetic-only claim is reduced but not yet fully removed. Full E23 (reviewed live cases from at least three real work folders) remains the human step.
|
|
152
|
+
|
|
153
|
+
Live case scaffolding is the E23 storage-safety bridge for those reviewed live cases. `neurain live-cases scaffold <folder>` samples eligible local markdown and emits hash-only source refs plus recall, answer, and lesson templates. It does not store raw source text, absolute local paths, private path names, secret-like content, or instruction-injection content. It deliberately reports `reviewed_live_user_evidence: false` and `human_judged: false`; a human reviewer still has to fill safe local case files and run the evals before Neurain can claim reviewed live evidence. With `--write`, it requires the exact confirmation phrase and writes only the redacted pack under `output/live-cases/`.
|
|
154
|
+
|
|
155
|
+
Answer eval is separate from recall eval. `neurain answer eval` is read-only and checks policy-level answer quality: faithfulness to provided sources, citation accuracy, conflict surfacing, abstention when evidence is missing, private boundary handling, and stale-source handling. Synthetic fixtures use `--fixture-size`; reviewed live cases can be supplied with `--case-file`. It does not call a model, does not write to the target root, and does not prove production answer quality without external user walkthroughs.
|
|
156
|
+
|
|
157
|
+
Lessons eval is separate from lesson promotion. `neurain lessons eval` is read-only and checks candidate detection quality: positive lesson recall, safe candidate precision, false-positive control, and unsafe secret-like or instruction-injection blocking. Synthetic fixtures use `--fixture-size`; reviewed live cases can be supplied with `--case-file`. It does not call a model, does not write to the target root, and does not prove production lesson quality without external user cases.
|
|
158
|
+
|
|
159
|
+
Onboarding is the E24 first-run layer. `neurain onboard <folder>` is read-only and tells a non-developer whether the next action is `init`, `adopt --dry-run`, host connection, or `wrap --dry-run`. It makes no model call, no external call, and no durable write.
|
|
160
|
+
|
|
161
|
+
Gemini CLI connector support is shipped as part of E26. `neurain connect gemini <folder> --dry-run` prints a bounded `gemini mcp add` command and uses a read-first allowlist that includes status, search, adopt scan, recall, eval, live-case scaffold, lifecycle report/eval, lesson previews, scheduler previews, and wrap preview. Gemini lifecycle hook automation is manual-only in alpha, so explicit `lifecycle emit --host gemini` is the receipt path when needed.
|
|
162
|
+
|
|
163
|
+
Runtime connector alpha is config-preview only. `neurain connect runtime <folder> --dry-run` prints a bounded MCP server snippet with a small read-heavy Neurain allowlist for a host-managed config. It does not edit host config directly and does not make any single runtime a Neurain dependency.
|