leerness 1.32.0 → 1.34.0

package/README.md CHANGED
@@ -1,110 +1,120 @@
1
- # leerness
2
-
3
- ```
4
- ██╗ ███████╗███████╗██████╗ ███╗ ██╗███████╗███████╗
5
- ██║ ██╔════╝██╔════╝██╔══██╗████╗ ██║██╔════╝██╔════╝
6
- ██║ █████╗ █████╗ ██████╔╝██╔██╗ ██║█████╗ ███████╗
7
- ██║ ██╔══╝ ██╔══╝ ██╔══██╗██║╚██╗██║██╔══╝ ╚════██║
8
- ███████╗███████╗███████╗██║ ██║██║ ╚████║███████╗███████║
9
- ╚══════╝╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═══╝╚══════╝╚══════╝
10
- ```
11
-
12
- > **The AI-coding operations layer that makes "done" require evidence — for any language, any AI agent.**
13
- > leerness does not write code. It gives your AI agent persistent memory, verified completion, and clean handoffs — stored inside your repo as plain files, exposed via CLI + MCP.
14
-
15
- [![npm](https://img.shields.io/npm/v/leerness)](https://www.npmjs.com/package/leerness) · ![MCP tools](https://img.shields.io/badge/MCP--tools-85-blue) · **0 runtime deps** · **0 install scripts** · offline-first · Node ≥ 18 · MIT
16
-
17
- **🇰🇷 Full Korean version: [README.ko.md](./README.ko.md)**
18
-
19
- ---
20
-
21
- ## Try it in 30 seconds
22
-
23
- ```bash
24
- npx -y leerness init . --yes # adds .harness/ memory + guard files to your project
25
- npx leerness handoff . # everything your AI should know right now, in one call
26
- ```
27
-
28
- Your project now has agent-independent memory. To see the flagship feature — catching a false "done" claim:
29
-
30
- ```bash
31
- npx leerness task add "Implement payment API" # prints the new id, e.g. T-0002 — use it below
32
- npx leerness task update T-0002 --status done --evidence "payment.js implemented + tested"
33
- npx leerness verify-claim T-0002 # exit 1 — payment.js does not exist. Claim rejected.
34
- ```
35
-
36
- Now actually write `payment.js`, then run the **same** `verify-claim T-0002` → it exits 0. That is the whole idea: **"done" must match reality.**
37
-
38
- > Tip: if your evidence claims a specific test count (e.g. "5 tests passed"), leerness measures the real count and rejects a mismatch — so claim only what's true, or add `--run-tests --test-cmd "<your test cmd>"` to verify by running them.
39
-
40
- > Want a smaller footprint? `leerness init . --minimal` installs only the core memory + verification files instead of the full set.
41
-
42
- ---
43
-
44
- ## No terminal? Let your AI run it
45
-
46
- You never have to type a command yourself. Paste this into Claude Code, Cursor, Codex, or any coding agent:
47
-
48
- > Set up leerness in this project by running `npx -y leerness init . --yes`. From now on, run `leerness handoff .` at the start of every session, verify finished work with `leerness verify-claim`, and run `leerness session close .` before you finish.
49
-
50
- The agent installs and operates it for you — `leerness init` also writes the instructions into CLAUDE.md / AGENTS.md so future sessions pick them up automatically.
51
-
52
- Prefer pure natural language? leerness ships an **MCP server with 85 tools** (`leerness mcp serve`). Connect it once to Claude Desktop / Claude Code and just ask: *"what was I working on?"*, *"did the AI actually finish T-0001?"*
53
-
54
- ---
55
-
56
- ## Claude and Codex already have memory. Why leerness?
57
-
58
- Built-in harnesses remember what the AI **said**. leerness verifies what the AI **did** — and keeps working when you switch agents.
59
-
60
- | | Built-in (CLAUDE.md, agent memory) | leerness |
61
- |---|---|---|
62
- | Memory | per-agent, free-form notes | structured tasks / decisions / lessons / rules — agent-independent files in your repo |
63
- | "Done" claims | trusted as written | **evidence-gated**: claimed files, test counts, and run output are checked against reality — bluffs exit 1 |
64
- | Switching agents (Claude → Codex → Cursor) | context lost | same `.harness/` state, same one-call handoff |
65
- | Secrets · encoding · drift guards | none | `scan secrets` · `encoding check` · `drift check --auto-fix` — CI-ready |
66
- | Lock-in | one vendor | any agent, any language, 0 runtime dependencies |
67
-
68
- This positioning was verified by **independent clean-room evaluations** — fresh `npm install` into temp dirs, driven by behavior only, including adversarial attacks against the verifier itself (fake tests, comment-only stubs, inflated test counts — all rejected). Methodology, results, and honest limitations: **[docs/clean-room-evaluations.md](./docs/clean-room-evaluations.md)**.
69
-
70
- ---
71
-
72
- ## Guidance vs enforcement (be honest about this)
73
-
74
- By default leerness is **cooperative**: your AI agent runs the commands because CLAUDE.md / AGENTS.md tell it to. A determined agent could skip them. To make verification **enforced**, not optional:
75
-
76
- ```bash
77
- leerness ci init # writes .github/workflows/leerness-gate.yml — runs `leerness gate` on every PR
78
- ```
79
-
80
- Then make that check **required** in GitHub branch protection. Now a PR that skips verification (or whose claims fail) **cannot merge** — the gate runs independently of the agent, returns a non-zero exit code, and blocks. That is the difference between a guideline and a guardrail.
81
-
82
- ---
83
-
84
- ## What is inside (the 60-second tour)
85
-
86
- - **Memory** — `task` / `plan` / `decision` / `lesson` / `rule`: canonical JSON + markdown projections, archive/restore.
87
- - **Handoff** — `handoff` (session start context) · `session close` (closing report). Survives agent swaps.
88
- - **Verification** — `verify-claim` (evidence vs reality, stub/fake-test/inflated-count detection, `--run-tests --test-cmd` for any language) · `contract verify` (spec ↔ impl) · `gate` (one-call CI gate).
89
- - **Audit** — `audit` · `lazy detect` · `drift check` keep the workspace honest over time.
90
- - **Security** — `scan secrets` (committed-secret detection) · `encoding check` (BOM/CP949) — also runs at `session close`.
91
-
92
- Full command reference, workflows, and architecture: **[README.ko.md](./README.ko.md)** (Korean) · `leerness commands` · `leerness help`.
93
-
94
- ## Links
95
-
96
- - npm: https://www.npmjs.com/package/leerness
97
- - Site & release videos: https://leerness.pages.dev
98
- - Changelog: [CHANGELOG.md](./CHANGELOG.md)
99
-
100
- ## License
101
-
1
+ # leerness
2
+
3
+ ```
4
+ ██╗ ███████╗███████╗██████╗ ███╗ ██╗███████╗███████╗
5
+ ██║ ██╔════╝██╔════╝██╔══██╗████╗ ██║██╔════╝██╔════╝
6
+ ██║ █████╗ █████╗ ██████╔╝██╔██╗ ██║█████╗ ███████╗
7
+ ██║ ██╔══╝ ██╔══╝ ██╔══██╗██║╚██╗██║██╔══╝ ╚════██║
8
+ ███████╗███████╗███████╗██║ ██║██║ ╚████║███████╗███████║
9
+ ╚══════╝╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═══╝╚══════╝╚══════╝
10
+ ```
11
+
12
+ > **The AI-coding operations layer that makes "done" require evidence — for any language, any AI agent.**
13
+ > leerness does not write code. It gives your AI agent persistent memory, verified completion, and clean handoffs — stored inside your repo as plain files, exposed via CLI + MCP.
14
+
15
+ [![npm](https://img.shields.io/npm/v/leerness)](https://www.npmjs.com/package/leerness) · ![MCP tools](https://img.shields.io/badge/MCP--tools-86-blue) · **0 runtime deps** · **0 install scripts** · offline-first · Node ≥ 18 · MIT
16
+
17
+ **🇰🇷 Full Korean version: [README.ko.md](./README.ko.md)**
18
+
19
+ ---
20
+
21
+ ## Try it in 30 seconds
22
+
23
+ ```bash
24
+ npx -y leerness init . --yes # adds .harness/ memory + guard files to your project
25
+ npx leerness handoff . # everything your AI should know right now, in one call
26
+ ```
27
+
28
+ Your project now has agent-independent memory. To see the flagship feature — catching a false "done" claim:
29
+
30
+ ```bash
31
+ npx leerness task add "Implement payment API" # prints the new id, e.g. T-0002 — use it below
32
+ npx leerness task update T-0002 --status done --evidence "payment.js implemented + tested"
33
+ npx leerness verify-claim T-0002 # exit 1 — payment.js does not exist. Claim rejected.
34
+ ```
35
+
36
+ Now actually write `payment.js`, then run the **same** `verify-claim T-0002` → it exits 0. That is the whole idea: **"done" must match reality.**
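To make the idea concrete, here is a minimal shell sketch of an evidence-versus-reality check. It is not leerness's implementation: the function name `check_evidence_files` and the filename pattern are assumptions made for illustration only. It pulls file-like tokens out of an evidence string and fails when any of them is absent.

```bash
# Hypothetical sketch, NOT how leerness itself verifies claims.
# check_evidence_files EVIDENCE [ROOT]
# Returns 0 when every file-like token (name.ext) in EVIDENCE exists under ROOT,
# 1 otherwise. Variables are plain (global) shell variables to stay POSIX-friendly.
check_evidence_files() {
  evidence=$1
  root=${2:-.}
  status=0
  # Assumed pattern: path characters, a dot, then an alphanumeric extension.
  for f in $(printf '%s\n' "$evidence" | grep -oE '[A-Za-z0-9_./-]+\.[A-Za-z0-9]+'); do
    if [ ! -e "$root/$f" ]; then
      echo "missing: $f"
      status=1
    fi
  done
  return "$status"
}
```

Called with the evidence string from the example above, the function returns 1 until `payment.js` exists in the checked directory, and 0 afterwards.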
37
+
38
+ > Tip: if your evidence claims a specific test count (e.g. "5 tests passed"), leerness measures the real count and rejects a mismatch — so claim only what's true, or add `--run-tests --test-cmd "<your test cmd>"` to verify by running them.
39
+
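The test-count check in the tip can be pictured roughly as follows. This is a sketch under assumptions, not the tool's parser: `claimed_test_count` and `verify_test_count` are hypothetical helpers, the "N tests passed" wording is the only phrasing they recognize, and the measured count is passed in by the caller instead of being collected from a test run.

```bash
# Hypothetical sketch; leerness's real parsing and counting logic is not shown in this README.
# claimed_test_count EVIDENCE
# Prints the number from the first "N test(s) passed" phrase, or nothing if there is none.
claimed_test_count() {
  printf '%s\n' "$1" | grep -oE '[0-9]+ tests? passed' | head -n 1 | grep -oE '^[0-9]+' || true
}

# verify_test_count EVIDENCE ACTUAL
# Returns 0 when the claimed count equals ACTUAL, or when no count is claimed at all.
verify_test_count() {
  claimed=$(claimed_test_count "$1")
  if [ -z "$claimed" ]; then
    return 0
  fi
  [ "$claimed" -eq "$2" ]
}
```

For example, `verify_test_count "5 tests passed" 3` fails because the claim and the measured count disagree, while evidence that states no count passes this particular check.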
40
+ > Want a smaller footprint? `leerness init . --minimal` installs only the core memory + verification files instead of the full set.
41
+
42
+ ---
43
+
44
+ ## No terminal? Let your AI run it
45
+
46
+ You never have to type a command yourself. Paste this into Claude Code, Cursor, Codex, or any coding agent:
47
+
48
+ > Set up leerness in this project by running `npx -y leerness init . --yes`. From now on, run `leerness handoff .` at the start of every session, verify finished work with `leerness verify-claim`, and run `leerness session close .` before you finish.
49
+
50
+ The agent installs and operates it for you — `leerness init` also writes the instructions into CLAUDE.md / AGENTS.md so future sessions pick them up automatically.
51
+
52
+ Prefer pure natural language? leerness ships an **MCP server with 86 tools** (`leerness mcp serve`). Connect it once to Claude Desktop / Claude Code and just ask: *"what was I working on?"*, *"did the AI actually finish T-0001?"*
53
+
54
+ ---
55
+
56
+ ## Claude and Codex already have memory. Why leerness?
57
+
58
+ Built-in harnesses remember what the AI **said**. leerness verifies what the AI **did** — and keeps working when you switch agents.
59
+
60
+ | | Built-in (CLAUDE.md, agent memory) | leerness |
61
+ |---|---|---|
62
+ | Memory | per-agent, free-form notes | structured tasks / decisions / lessons / rules — agent-independent files in your repo |
63
+ | "Done" claims | trusted as written | **evidence-gated**: claimed files, test counts, and run output are checked against reality — bluffs exit 1 |
64
+ | Switching agents (Claude → Codex → Cursor) | context lost | same `.harness/` state, same one-call handoff |
65
+ | Secrets · encoding · drift guards | none | `scan secrets` · `encoding check` · `drift check --auto-fix` — CI-ready |
66
+ | Lock-in | one vendor | any agent, any language, 0 runtime dependencies |
67
+
68
+ This positioning is checked by **self-administered clean-room evaluations** — AI agents do a fresh `npm install` into temp dirs and drive it by behavior only, including adversarial attacks against the verifier itself (fake tests, comment-only stubs, inflated test counts — all rejected). To be clear: these are *AI* clean-room runs, **not third-party human audits or peer review** — they make the claim *checkable* rather than a marketing line. Methodology, results, and honest limitations: **[docs/clean-room-evaluations.md](./docs/clean-room-evaluations.md)**.
69
+
70
+ ---
71
+
72
+ ## Guidance vs enforcement (be honest about this)
73
+
74
+ By default leerness is **cooperative**: your AI agent runs the commands because CLAUDE.md / AGENTS.md tell it to. A determined agent could skip them. To make verification **enforced**, not optional:
75
+
76
+ ```bash
77
+ leerness ci init # writes .github/workflows/leerness-gate.yml — runs `leerness gate` on every PR
78
+ ```
79
+
80
+ The generated workflow is production-grade: it **pins the leerness version** (reproducible — the gate's verdict can't change from a silent upgrade), runs with **least-privilege permissions** (`contents: read`), and cancels superseded runs.
81
+
82
+ Then make that check **required** in GitHub branch protection. Now a PR that skips verification (or whose claims fail) **cannot merge** — the gate runs independently of the agent, returns a non-zero exit code, and blocks. That is the difference between a guideline and a guardrail. For exact per-claim enforcement, run `leerness gate --claims` — it adds a 6th check that runs `verify-claim` on **every** completed task and fails the gate if any "done" task's evidence doesn't match reality (the default 5-check gate already blocks false-done via heuristics; `--claims` makes it precise).
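The per-claim check can be pictured as a loop that runs a verifier over each completed task and fails if any single claim fails. The sketch below is an illustration only, not the gate's actual code: `gate_claims` is a hypothetical helper, and the task ids are passed in by hand because this README does not show how the gate enumerates completed tasks.

```bash
# Hypothetical sketch of an "every done task must verify" loop.
# gate_claims VERIFIER ID...
# Runs VERIFIER once per task id; returns 1 if any run fails, 0 otherwise.
gate_claims() {
  verifier=$1
  shift
  failed=0
  for id in "$@"; do
    # $verifier is left unquoted on purpose so a multi-word command splits into words.
    if ! $verifier "$id"; then
      echo "claim failed: $id"
      failed=1
    fi
  done
  return "$failed"
}
```

Assuming two completed tasks, `gate_claims "npx leerness verify-claim" T-0001 T-0002` would return non-zero as soon as either claim is rejected, which is the behavior a required CI check needs.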
83
+
84
+ ---
85
+
86
+ ## Maturity, and why trying it is still cheap
87
+
88
+ Be honest with yourself before you depend on this: leerness is **early and largely solo-maintained**, developed mostly through autonomous AI rounds, so its own `selftest` + e2e suites are the primary quality signal, and external adoption is still small. Don't make it load-bearing on faith: **pin a version**, and treat the differentiated slice (`verify-claim` + the CI `gate` as a required check) as the part worth relying on.
89
+
90
+ The asymmetry is what makes a trial reasonable anyway: MIT, **0 runtime dependencies**, offline-first, and all state is plain files in *your* repo. Lock-in is near zero: if it doesn't earn its place, remove the tool and your `task`/`decision`/`lesson` files stay. (For secret scanning specifically, mature dedicated tools like gitleaks/trufflehog exist, so use those if you need a hard guarantee; leerness's `scan secrets` is a convenience guard, not a replacement.)
91
+
92
+ ---
93
+
94
+ ## What is inside (the 60-second tour)
95
+
96
+ - **Memory** — `task` / `plan` / `decision` / `lesson` / `rule`: canonical JSON + markdown projections, archive/restore.
97
+ - **Handoff** — `handoff` (session start context) · `session close` (closing report). Survives agent swaps.
98
+ - **Verification** — `verify-claim` (evidence vs reality, stub/fake-test/inflated-count detection, `--run-tests --test-cmd` for any language; `--all` checks **every** completed claim at once for CI) · `contract verify` (spec ↔ impl) · `gate` (one-call CI gate).
99
+ - **Audit** — `audit` · `lazy detect` · `drift check` keep the workspace honest over time.
100
+ - **Security** — `scan secrets` (committed-secret detection) · `encoding check` (BOM/CP949) — also runs at `session close`.
101
+
102
+ Full command reference, workflows, and architecture: **[README.ko.md](./README.ko.md)** (Korean) · `leerness commands` · `leerness help`.
103
+
104
+ ## Links
105
+
106
+ - npm: https://www.npmjs.com/package/leerness
107
+ - Site & release videos: https://leerness.pages.dev
108
+ - Changelog: [CHANGELOG.md](./CHANGELOG.md)
109
+
110
+ ## License
111
+
102
112
  MIT
103
113
 
104
114
  <!-- leerness:project-readme:start -->
105
115
  ## Leerness Project Harness
106
116
 
107
- This project uses the Leerness v1.32.0 harness. AI agents must load context with `leerness handoff` before starting work, and run `leerness check`/`leerness audit`/`leerness session close` after finishing.
117
+ This project uses the Leerness v1.34.0 harness. AI agents must load context with `leerness handoff` before starting work, and run `leerness check`/`leerness audit`/`leerness session close` after finishing.
108
118
 
109
119
  ### Identity: AI agent operations layer (UR-0030)
110
120
 
@@ -158,7 +168,7 @@ leerness memory restore decision <date|title>
158
168
 
159
169
  ### MCP server (external AI integration)
160
170
 
161
- Leerness v1.32.0 ships a built-in stdio JSON-RPC MCP server that exposes **85 tools** to external AI clients such as Claude Code · Cursor · Codex CLI:
171
+ Leerness v1.34.0 ships a built-in stdio JSON-RPC MCP server that exposes **86 tools** to external AI clients such as Claude Code · Cursor · Codex CLI:
162
172
 
163
173
  ```jsonc
164
174
  // By category
@@ -171,7 +181,7 @@ Leerness v1.32.0 ships a built-in stdio JSON-RPC MCP server: Claude Code
171
181
  // • Workflow: session_close / agents_list / task_export / env_check / usage_stats / reuse_map / whats_new
172
182
 
173
183
  // Start the MCP server: leerness mcp serve
174
- // tools/list response: 85 tools
184
+ // tools/list response: 86 tools
175
185
  ```
176
186
 
177
187
  ### Autonomous mode
@@ -179,7 +189,7 @@ Leerness v1.32.0 ships a built-in stdio JSON-RPC MCP server: Claude Code
179
189
  Send only the `<<autonomous-loop-dynamic>>` signal and the AI will:
180
190
  1) pick the next-round candidate → 2) change the code → 3) write a new stress-v* suite + run the cumulative regression → 4) e2e 219/219 → 5) npm pack + git tag + GitHub release → 6) auto-push to main (1.9.140+) → 7) session close → 8) schedule the next round.
181
191
 
182
- Cumulative so far: **70 rounds (1.9.40 → 1.32.0)** · a GitHub release/tag is created every round · _reports/ is kept private.
192
+ Cumulative so far: **70 rounds (1.9.40 → 1.34.0)** · a GitHub release/tag is created every round · _reports/ is kept private.
183
193
 
184
194
  ### Performance guide (measured on 1.9.140)
185
195
 
@@ -217,6 +227,6 @@ leerness release pack --close --auto-main-push
217
227
  - `.harness/session-handoff.md`: handoff for the next session (written automatically)
218
228
  - `.harness/lessons.md` / `decisions.md` / `rules.md`: persistent memory (5 surfaces)
219
229
 
220
- Last synced by Leerness v1.32.0: 2026-06-16
230
+ Last synced by Leerness v1.34.0: 2026-06-19
221
231
  <!-- leerness:project-readme:end -->
222
232