claude-smart 0.1.3 → 0.1.5

package/README.md CHANGED
@@ -6,18 +6,21 @@
6
6
  claude-smart
7
7
  </h1>
8
8
 
9
- <h4 align="center">Self-improving <a href="https://claude.com/claude-code" target="_blank">Claude Code</a> plugin — learns from your corrections, not just remembers them.</h4>
9
+ <h4 align="center">The <a href="https://claude.com/claude-code" target="_blank">Claude Code</a> plugin that makes Claude Code self-improve as you use it, not by remembering past sessions, but by turning your corrections into rules it actually follows next time.</h4>
10
10
 
11
11
  <p align="center">
12
12
  <a href="LICENSE">
13
13
  <img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License">
14
14
  </a>
15
- <a href="pyproject.toml">
16
- <img src="https://img.shields.io/badge/version-0.1.2-green.svg" alt="Version">
15
+ <a href="plugin/pyproject.toml">
16
+ <img src="https://img.shields.io/badge/version-0.1.5-green.svg" alt="Version">
17
17
  </a>
18
- <a href="pyproject.toml">
18
+ <a href="plugin/pyproject.toml">
19
19
  <img src="https://img.shields.io/badge/python-%3E%3D3.12-brightgreen.svg" alt="Python">
20
20
  </a>
21
+ <a href="package.json">
22
+ <img src="https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg" alt="Node">
23
+ </a>
21
24
  <a href="#quick-start">
22
25
  <img src="https://img.shields.io/badge/llm-claude%20code%20cli-purple.svg" alt="LLM">
23
26
  </a>
@@ -34,23 +37,45 @@
34
37
  </p>
35
38
 
36
39
  <p align="center">
37
- claude-smart turns your Claude Code corrections into durable rules that shape <i>future</i> sessions. Instead of replaying past observations as context, it distils them into a project playbook and per-session preferences — so Claude stops repeating the same mistakes and adapts to how your codebase actually wants to be written.
40
+ It learns both corrections and successful execution patterns—so Claude Code avoids repeating mistakes and reuses what works. Instead of repeatedly explaining your stack, conventions, preferences, or the same gotchas, Claude Code steadily adapts to <i>how you like</i> to work—across projects, codebases, and sessions.
38
41
  </p>
39
42
 
43
+ <p align="center">
44
+ <b>Head-to-head vs <code>claude-mem</code></b>, evaluated by an LLM on how well each system’s reinjected context matched the expected rule: claude-smart achieved <b>~2.7× higher overall accuracy</b>, is better at <b>preventing Claude Code from repeating mistakes you have already corrected</b> rather than merely recalling that those mistakes happened, and is <b>3× better at converting past events into future-facing rules</b> — see <a href="benchmarks/memory_comparison/EXPERIMENT.md">EXPERIMENT.md</a> for details.
45
+ </p>
40
46
  ---
41
47
 
42
48
  ## Why Learning, Not Memory
43
49
 
44
- Plain memory solutions re-inject transcripts or summaries from prior sessions. That preserves continuity but has limits:
50
+ Most memory solutions re-inject transcripts or summaries from prior sessions, but that context is still mostly informative: Claude remembers what happened without necessarily changing what it does next.
51
+
52
+ `claude-smart` focuses on learning instead.
53
+
54
+ Four ways this changes what Claude Code can do for you:
55
+
56
+ - **Actionable, not just informative:** Produces rules Claude can follow next time; memory only records what happened.
57
+
58
+ > *Example:* you tell Claude to stop running `npm test` without `--run` because watch mode hangs.
59
+ > **Memory:** “user was annoyed about npm test hanging”
60
+ > **Learning:** “always pass `--run` to `npm test` in this repo — default watch mode blocks CI”
61
+
62
+ - **Optimized paths, not just past events:** Preserves successful execution paths so Claude can reuse what already works.
63
+ > *Example:* Claude spends several iterations trying to start the local dev environment before discovering that this repo requires `pnpm dev:all` instead of the usual `npm run dev`.
64
+ > **Memory:** “user mentioned that `npm run dev` did not work”
65
+ > **Learning:** “for this repo, always use `pnpm dev:all` to start the full local stack — `npm run dev` only starts the frontend and causes missing service errors”
66
+
67
+ Instead of re-exploring the same setup problem next time, Claude starts from the proven path—reducing planning steps, latency, and token usage.
68
+
69
+ - **Project-wide, not session-siloed:** Session memory disappears with the conversation. The project playbook persists and improves across every session in that repo.
45
70
 
46
- - **Actionable, not just informative.** Memory logs *what happened*; learning produces *rules to follow* that change the next decision.
47
- > *e.g.* you told Claude to stop running `npm test` without `--run` because it hangs on watch mode. **Memory** recalls "user was annoyed about npm test hanging". **Learning** writes the rule *"always pass `--run` to `npm test` in this repo — default watch mode blocks CI"* and applies it next session.
48
- - **Preferences, not events.** Memory records literal facts; learning abstracts the *why* into rules that generalize.
49
- > *e.g.* you reject Jest in favor of Vitest once. **Memory** stores that single choice. **Learning** derives *"prefer ESM-native test runners for this TypeScript monorepo"* — which also covers the next framework decision (e.g. picking `tsx` over `ts-node`) without waiting for the same correction to repeat.
50
- - **Carries across sessions and workspaces.** Playbooks are keyed to the project and surface in every future session against that repo.
51
- - **Compact.** Distilled, deduplicated rules stay in dozens of tokens — not thousands — even as the project grows.
71
+ - **Compact:** Distilled, deduplicated rules stay in dozens of tokens—not thousands—even as the project grows.
52
72
 
53
- claude-smart's approach: **extract, don't accumulate**. Corrections and successful patterns are distilled into two artifacts — a session-scoped **user profile** and a cross-session **project playbook** (each rule with explicit `trigger` and `rationale`, deduplicated and archived as they evolve) — and reinjected at the start of every session.
73
+ claude-smart turns corrections and successful execution patterns into two artifacts:
74
+
75
+ - **User Profile** → your preferences and working style across sessions
76
+ - **Project Playbook** → durable rules and optimized execution paths for how Claude should behave
77
+
78
+ Both are automatically reinjected at the start of every session, so Claude Code gets better the more you use it.
54
79
 
55
80
  ---
56
81
 
@@ -79,90 +104,57 @@ Developing the plugin itself? See [DEVELOPER.md](./DEVELOPER.md#developing-local
79
104
 
80
105
  - 🧠 **Learn, don't just remember** — Corrections become structured, deduplicated rules, not transcript replays.
81
106
  - ⚡ **Fully automatic learning** — Every user turn, tool call, and assistant response is captured via lifecycle hooks and extracted into rules without you running anything.
107
+ - 📈 **Compounds with every session** — Rules auto-merge, supersede, and archive as your project evolves — the playbook sharpens with use instead of bloating.
108
+ > *e.g.* you correct the same `npm test --run` gotcha twice → **claude-smart** consolidates them into one stronger rule. Later you switch the policy to `pnpm test` → the old rule is archived and the new one supersedes it, no manual cleanup.
82
109
  - 🎯 **Two-tier scope** — Per-session profiles for the current conversation; cross-session playbooks for the whole project.
83
- - 🔌 **Fully local — no external API call** — semantic search runs on an in-process ONNX embedder (all-MiniLM-L6-v2), and all data (profiles, playbooks, interaction buffers) is stored locally on your machine (`~/.reflexio/` and `~/.claude-smart/`). The whole stack works offline.
110
+ - 🔌 **No external API call** — semantic search runs on an in-process ONNX embedder (all-MiniLM-L6-v2), and all data (profiles, playbooks, interaction buffers) is stored locally on your machine (`~/.reflexio/` and `~/.claude-smart/`).
84
111
  - 🔎 **Hybrid search** — Playbooks and profiles are indexed with vector + BM25 search for fast, robust retrieval.
85
112
  - 🧪 **Offline resilience** — If the reflexio backend is down, hooks buffer to disk; the next successful publish drains them.
86
113
  - 🧰 **Manual correction tag** — `/claude-smart:tag` flags the last turn as a correction so the extractor weights it heavily.
87
114
 
88
115
  ---
89
116
 
90
- ## Slash Commands
91
-
92
- | Command | What it does |
93
- | --- | --- |
94
- | `/show` | Print the current project playbook plus the current session's user profiles (same markdown that `SessionStart` injects). Use it to audit what rules and preferences Claude is being told to follow. |
95
- | `/learn` | Force reflexio to run extraction *now* on the current session's unpublished interactions. Without this, extraction runs at the end of the session or on reflexio's batch interval. |
96
- | `/tag [note]` | Tag the most recent turn as a correction, for cases the automatic heuristic missed. The note becomes the correction description the extractor sees. |
97
-
98
- ---
99
-
100
117
  ## Dashboard
101
118
 
102
- A Next.js web UI lives in [`dashboard/`](dashboard/) for browsing session buffers, inspecting user profiles, and editing project playbooks. It auto-starts alongside the backend — just open **http://localhost:3001**.
119
+ A Next.js web UI lives in [`plugin/dashboard/`](plugin/dashboard/) for browsing session buffers, inspecting user profiles, and editing project playbooks. It auto-starts alongside the backend — just open **http://localhost:3001**.
120
+
121
+ <p align="center">
122
+ <img src="assets/profile_dashboard.png" alt="Profile dashboard" width="49%">
123
+ <img src="assets/playbook_dashboard.png" alt="Playbook dashboard" width="49%">
124
+ </p>
103
125
 
104
126
  ---
105
127
 
106
128
  ## How It Works
107
129
 
108
- **Core components:**
130
+ claude-smart builds two artifacts as you work and injects them into Claude at the start of every new session:
109
131
 
110
- 1. **5 lifecycle hooks** (`plugin/hooks/hooks.json`)
111
- - `SessionStart` fetches the project playbook from reflexio and injects it as `additionalContext`.
112
- - `UserPromptSubmit` — buffers each user turn, heuristically flags corrections.
113
- - `PostToolUse` — records tool invocations for later extraction.
114
- - `Stop` — finalizes the assistant turn from the transcript, publishes to reflexio.
115
- - `SessionEnd` — flushes the remaining buffer with `force_extraction=True`.
116
- 2. **Local state buffer** — JSONL per session at `~/.claude-smart/sessions/{session_id}.jsonl`. Offline-safe.
117
- 3. **Reflexio backend** (submodule at `reflexio/`) — SQLite storage, hybrid search, profile/playbook extraction, dedup, status lifecycle (`CURRENT` → `ARCHIVED`). Runs on `localhost:8081`.
118
- 4. **Claude Code LLM provider** — a LiteLLM custom provider registered inside reflexio. Every generation call (extraction, update, dedup, evaluation) subprocesses `claude -p --output-format json`, so no OpenAI/Anthropic key is needed for the learning loop.
132
+ - **User profile** — session-scoped preferences (stack, role, small quirks). *e.g.* "uses pnpm, not npm"; "prefers terse answers"; "backend engineer — explain frontend with backend analogues."
133
+ - **Project playbook**: durable, generalized rules accumulated across every session in the repo. Each says when it applies and why. *e.g.* "always pass `--run` to `npm test` — watch mode hangs CI"; "use real Postgres for integration tests — mocks once hid a broken migration."
119
134
 
120
- **Data flow:**
135
+ Rules clean themselves up: correct the same thing twice and they merge; change your mind and the old one is archived.
121
136
 
122
- ```
123
- Claude Code session
124
- ├─ UserPromptSubmit ─┐
125
- ├─ PostToolUse ─────┤ → JSONL buffer ─→ Stop ─→ reflexio publish_interaction
126
- └─ Stop ─────┘ │
127
-
128
- ┌─────────────────────────┐
129
- │ reflexio extractors │
130
- │ (run via claude -p) │
131
- │ → profiles + playbooks │
132
- └────────────┬────────────┘
133
-
134
-
135
- Next session → SessionStart → search_user_playbooks(agent_version=project_id)
136
- → additionalContext injected into Claude's system prompt
137
- ```
137
+ Under the hood: hooks watch your turns, tool calls, and Claude's replies, auto-flagging corrections (or anything you `/tag`). At session end (or on `/learn`), [reflexio](https://github.com/ReflexioAI/reflexio) — the self-improving engine that powers claude-smart — extracts profile entries and playbook rules. Next session, both get injected into the system prompt — run `/show` to see what Claude is being told. Everything runs on your machine.
138
138
 
139
- **Mapping to reflexio:**
139
+ See [ARCHITECTURE.md](./ARCHITECTURE.md) for hooks, data flow, and reflexio details.
140
140
 
141
- | Reflexio field | claude-smart value |
142
- | --- | --- |
143
- | `user_id` | Claude Code `session_id` — scopes profiles to the current conversation |
144
- | `agent_version` | `project_id` (git-toplevel basename) — stable across sessions, so playbooks accumulate project-wide |
145
- | `session_id` | Claude Code `session_id` — for reflexio's deferred success evaluation |
141
+ ---
146
142
 
147
- Cross-session playbook retrieval uses `search_user_playbooks(agent_version=project_id, user_id=None)` — playbooks written from any prior session in this project surface for every future session.
143
+ ## Slash Commands
144
+
145
+ | Command | What it does |
146
+ | --- | --- |
147
+ | `/show` | Print the current project playbook plus the current session's user profiles (same markdown that `SessionStart` injects). Use it to audit what rules and preferences Claude is being told to follow. |
148
+ | `/learn` | Force reflexio to run extraction *now* on the current session's unpublished interactions. Without this, extraction runs at the end of the session or on reflexio's batch interval. |
149
+ | `/tag [note]` | Tag the most recent turn as a correction, for cases the automatic heuristic missed. The note becomes the correction description the extractor sees. |
150
+ | `/restart` | Restart the reflexio backend and dashboard to pick up new changes (e.g. after upgrading the plugin or editing local reflexio code). |
151
+ | `/clear-all` | **Destructive.** Delete *all* reflexio interactions, profiles, and user playbooks. Use when you want to wipe learned state and start fresh. |
148
152
 
149
153
  ---
150
154
 
151
155
  ## Configuration
152
156
 
153
- ### Environment variables
154
-
155
- | Variable | Default | Purpose |
156
- | --- | --- | --- |
157
- | `CLAUDE_SMART_USE_LOCAL_CLI` | `0` (installer sets `1`) | Route generation through the local `claude` CLI. Written to `~/.reflexio/.env` by `claude-smart install`. |
158
- | `CLAUDE_SMART_USE_LOCAL_EMBEDDING` | `0` (installer sets `1`) | Use the in-process ONNX embedder (requires `chromadb`). Written to `~/.reflexio/.env` by `claude-smart install`. |
159
- | `CLAUDE_SMART_CLI_PATH` | `shutil.which("claude")` | Override the path to the `claude` binary. |
160
- | `CLAUDE_SMART_CLI_TIMEOUT` | `120` | Per-call subprocess timeout (seconds). Raise for slow prompts. |
161
- | `CLAUDE_SMART_STATE_DIR` | `~/.claude-smart/sessions/` | Where the per-session JSONL buffer lives. |
162
- | `CLAUDE_SMART_BACKEND_AUTOSTART` | `1` | Set to `0` to stop the SessionStart hook from spawning the reflexio backend on `localhost:8081`. |
163
- | `CLAUDE_SMART_DASHBOARD_AUTOSTART` | `1` | Set to `0` to stop the SessionStart hook from spawning the Next.js dashboard on `localhost:3001`. |
164
- | `CLAUDE_SMART_BACKEND_STOP_ON_END` | `0` | Set to `1` to tear down the backend at `SessionEnd` instead of leaving it long-lived. |
165
- | `REFLEXIO_URL` | `http://localhost:8081/` | Point the plugin at a non-local reflexio backend. |
157
+ Advanced users can tune claude-smart via environment variables — see [DEVELOPER.md](./DEVELOPER.md#environment-variables) for the full list.
166
158
 
167
159
  ### Where data lives
168
160
 
@@ -173,11 +165,6 @@ Cross-session playbook retrieval uses `search_user_playbooks(agent_version=proje
173
165
  | `~/.claude-smart/sessions/{session_id}.jsonl` | Per-session buffer. User turns, assistant turns, tool invocations, `{"published_up_to": N}` watermarks. Safe to inspect and safe to delete — everything past the latest watermark has already been written to reflexio's DB. |
174
166
  | `~/.cache/chroma/onnx_models/all-MiniLM-L6-v2/` | Cached ONNX weights (~86 MB, downloaded once). Delete to force a re-download. |
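To make the buffer layout described in the table above concrete, here is a minimal sketch (not claude-smart's own code) that scans the text of a session JSONL file for watermark rows. It assumes one JSON object per line and that watermarks have the shape `{"published_up_to": N}` as the table states; the function name `summarizeBuffer` and the choice to count every other row as an interaction are illustrative assumptions.

```javascript
// Hypothetical helper: summarize the text of a claude-smart session buffer.
// Assumption: one JSON object per line; watermark rows look like
// {"published_up_to": N}. Other row fields are left uninterpreted.
function summarizeBuffer(jsonlText) {
  let latestWatermark = null;
  let interactionRows = 0;

  for (const line of jsonlText.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue; // skip blank lines

    let row;
    try {
      row = JSON.parse(trimmed);
    } catch {
      continue; // tolerate a partially written trailing line
    }

    const isWatermark =
      row !== null &&
      typeof row === "object" &&
      typeof row.published_up_to === "number";

    if (isWatermark) {
      latestWatermark = row.published_up_to; // later rows win
    } else {
      interactionRows += 1;
    }
  }

  return { latestWatermark, interactionRows };
}
```

Reading a file's contents with `fs.readFileSync(path, "utf8")` and passing the result to `summarizeBuffer` gives a quick view of how far publishing has progressed without touching reflexio's database.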
175
167
 
176
- ### Scope: profile vs. playbook
177
-
178
- - **Profile** (`user_id = session_id`) — session-scoped preferences. Does not persist across sessions, but *is* reinjected if you resume the same session (`/resume`, `/clear`, `/compact`).
179
- - **Playbook** (`agent_version = project_id`) — cross-session. Every session in the same project — identified by git-toplevel basename — sees the accumulated playbook.
180
-
181
168
  ### Embeddings
182
169
 
183
170
  claude-smart uses an in-process ONNX embedder (Chroma's `all-MiniLM-L6-v2`, 384-dim, zero-padded to reflexio's 512-dim schema). The model weights are downloaded on first use (~80 MB, cached under `~/.cache/chroma/onnx_models/`) — after that, no network calls for embedding. Runtime cost is a few milliseconds per short document on CPU.
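The zero-padding step mentioned above can be illustrated with a short sketch. This is not reflexio's implementation; `zeroPad` and `cosine` are hypothetical names used only here. The sketch shows why padding is harmless for retrieval: appended zeros change neither dot products nor norms, so cosine similarity between padded vectors equals the unpadded value.

```javascript
// Zero-pad an embedding to a fixed target dimension (e.g. 384 -> 512).
// Hypothetical helper, not taken from claude-smart or reflexio.
function zeroPad(vector, targetDim) {
  if (vector.length > targetDim) {
    throw new RangeError(
      `vector has ${vector.length} dims, which exceeds target ${targetDim}`,
    );
  }
  const padded = new Float32Array(targetDim); // typed arrays start zero-filled
  padded.set(vector); // copy the original values into the leading slots
  return padded;
}

// Cosine similarity of two equal-length vectors.
// Returns NaN if either vector is all zeros.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

For example, `cosine(zeroPad(a, 512), zeroPad(b, 512))` gives the same value as `cosine(a, b)` for two 384-dimensional vectors `a` and `b` that are already stored as 32-bit floats.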
@@ -192,7 +179,7 @@ If you still want to use a cloud embedding provider (OpenAI, Gemini, etc.), omit
192
179
  Extraction is async by default. Run `/learn` to force it, wait ~20–30s, then run `/show` — no new session needed. `/show` shows whether the rule was actually extracted.
193
180
 
194
181
  **Reflexio refuses to boot with "no embedding-capable provider".**
195
- Check that `CLAUDE_SMART_USE_LOCAL_EMBEDDING=1` is in `~/.reflexio/.env` *and* that `chromadb` is installed in the venv (`uv run python -c "import chromadb"` should print nothing). If you'd rather use a cloud embedder instead, drop the env flag and set `OPENAI_API_KEY` or `GEMINI_API_KEY` in the same file.
182
+ Check that `CLAUDE_SMART_USE_LOCAL_EMBEDDING=1` is in `~/.reflexio/.env` *and* that `chromadb` is installed in the venv (`uv run --project plugin python -c "import chromadb"` should print nothing). If you'd rather use a cloud embedder instead, drop the env flag and set `OPENAI_API_KEY` or `GEMINI_API_KEY` in the same file.
196
183
 
197
184
  **`claude-smart` doesn't see my interactions.**
198
185
  Check `~/.claude-smart/sessions/`. If your current session's JSONL has no `User`/`Assistant` rows, the plugin isn't receiving hook events — verify `.claude/settings.local.json` has the right path and that `enabledPlugins` is `true`.
@@ -58,6 +58,9 @@ function printHelp() {
58
58
  " 3. Appends CLAUDE_SMART_USE_LOCAL_CLI=1 and CLAUDE_SMART_USE_LOCAL_EMBEDDING=1",
59
59
  " to ~/.reflexio/.env (idempotent).",
60
60
  "",
61
+ "Update:",
62
+ " npx claude-smart update Update to the latest version",
63
+ "",
61
64
  ].join("\n"),
62
65
  );
63
66
  }
@@ -73,6 +76,26 @@ function parseSource(args) {
73
76
  return value;
74
77
  }
75
78
 
79
+ function runUpdate() {
80
+ if (!hasClaudeCli()) {
81
+ process.stderr.write(
82
+ "error: 'claude' CLI not found on PATH. " +
83
+ "Install Claude Code first: https://claude.com/claude-code\n",
84
+ );
85
+ process.exit(1);
86
+ }
87
+
88
+ try {
89
+ execFileSync("claude", ["plugin", "update", PLUGIN_SPEC], { stdio: "inherit" });
90
+ } catch (err) {
91
+ const code = typeof err.status === "number" ? err.status : 1;
92
+ process.stderr.write(`error: \`claude plugin update ${PLUGIN_SPEC}\` failed (exit ${code})\n`);
93
+ process.exit(code);
94
+ }
95
+
96
+ process.stdout.write("\nclaude-smart updated. Restart Claude Code to apply.\n");
97
+ }
98
+
76
99
  function runInstall(args) {
77
100
  if (!hasClaudeCli()) {
78
101
  process.stderr.write(
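The `try`/`catch` in `runUpdate` relies on `execFileSync` throwing an error whose `status` property holds the child's exit code when the command exits non-zero. The sketch below isolates that pattern; the helper name `exitCodeOf` is hypothetical and not part of the package.

```javascript
const { execFileSync } = require("node:child_process");

// Run a command and return its exit code instead of throwing.
// Falls back to 1 when no numeric status is available, for example when
// the binary cannot be found or the child was killed by a signal.
function exitCodeOf(file, args) {
  try {
    execFileSync(file, args, { stdio: "ignore" });
    return 0;
  } catch (err) {
    return typeof err.status === "number" ? err.status : 1;
  }
}
```

Falling back to `1` keeps the wrapper's own exit status non-zero even when the failure did not come from the child process exiting normally.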
@@ -110,10 +133,9 @@ function runInstall(args) {
110
133
  process.stdout.write(
111
134
  [
112
135
  "",
113
- "claude-smart installed. Next steps:",
114
- " 1. Start the reflexio backend (leave it running in another terminal):",
115
- " uv run reflexio services start --only backend --no-reload",
116
- " 2. Restart Claude Code in your project.",
136
+ "claude-smart installed. Restart Claude Code in your project.",
137
+ "The reflexio backend and dashboard auto-start on session start.",
138
+ "Opt out with CLAUDE_SMART_BACKEND_AUTOSTART=0 or CLAUDE_SMART_DASHBOARD_AUTOSTART=0.",
117
139
  "",
118
140
  ].join("\n"),
119
141
  );
@@ -133,6 +155,11 @@ function main() {
133
155
  return;
134
156
  }
135
157
 
158
+ if (cmd === "update") {
159
+ runUpdate();
160
+ return;
161
+ }
162
+
136
163
  process.stderr.write(
137
164
  `claude-smart: unknown command '${cmd}'. Try 'npx claude-smart --help'.\n`,
138
165
  );
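Both `runUpdate` and `runInstall` call `hasClaudeCli()`, which this diff does not show. As a rough illustration of what such a check could look like, the sketch below scans the `PATH` directories for an executable file. The function name `isOnPath` and its approach are assumptions, not the package's actual helper.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical stand-in for a "is this CLI installed?" check.
// Scans each PATH directory for an executable regular file.
function isOnPath(command, pathEnv = process.env.PATH || "") {
  // On Windows, executables carry an extension listed in PATHEXT.
  const exts =
    process.platform === "win32"
      ? (process.env.PATHEXT || ".EXE;.CMD;.BAT").split(";")
      : [""];

  for (const dir of pathEnv.split(path.delimiter)) {
    if (dir === "") continue;
    for (const ext of exts) {
      const candidate = path.join(dir, command + ext);
      try {
        if (fs.statSync(candidate).isFile()) {
          fs.accessSync(candidate, fs.constants.X_OK);
          return true;
        }
      } catch {
        // missing or not executable here; keep scanning
      }
    }
  }
  return false;
}
```

With a helper like this, a call such as `isOnPath("claude")` avoids spawning a process just to find out whether the binary exists.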
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "claude-smart",
3
- "version": "0.1.3",
3
+ "version": "0.1.5",
4
4
  "description": "Self-improving Claude Code plugin — learns from corrections via reflexio",
5
5
  "keywords": [
6
6
  "claude",