npm - claude-smart - Versions diffs - 0.1.3 → 0.1.5 - Mend

claude-smart 0.1.3 → 0.1.5

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (3) hide show

package/README.md CHANGED Viewed

@@ -6,18 +6,21 @@
   claude-smart
 </h1>
-<h4 align="center">Self-improving <a href="https://claude.com/claude-code" target="_blank">Claude Code</a> plugin — learns from your corrections, not just remembers them.</h4>
+<h4 align="center">The <a href="https://claude.com/claude-code" target="_blank">Claude Code</a> plugin that makes Claude Code self-improve as you use it — not by remembering past sessions, but by turning your corrections into rules it actually follows next time.</h4>
 <p align="center">
   <a href="LICENSE">
     <img src="https://img.shields.io/badge/License-Apache%202.0-blue.svg" alt="License">
   </a>
-  <a href="pyproject.toml">
-    <img src="https://img.shields.io/badge/version-0.1.2-green.svg" alt="Version">
+  <a href="plugin/pyproject.toml">
+    <img src="https://img.shields.io/badge/version-0.1.5-green.svg" alt="Version">
   </a>
-  <a href="pyproject.toml">
+  <a href="plugin/pyproject.toml">
     <img src="https://img.shields.io/badge/python-%3E%3D3.12-brightgreen.svg" alt="Python">
   </a>
+  <a href="package.json">
+    <img src="https://img.shields.io/badge/node-%3E%3D18-brightgreen.svg" alt="Node">
+  </a>
   <a href="#quick-start">
     <img src="https://img.shields.io/badge/llm-claude%20code%20cli-purple.svg" alt="LLM">
   </a>
@@ -34,23 +37,45 @@
 </p>
 <p align="center">
-  claude-smart turns your Claude Code corrections into durable rules that shape <i>future</i> sessions. Instead of replaying past observations as context, it distils them into a project playbook and per-session preferences — so Claude stops repeating the same mistakes and adapts to how your codebase actually wants to be written.
+  It learns both corrections and successful execution patterns—so Claude Code avoids repeating mistakes and reuses what works. Instead of repeatedly explaining your stack, conventions, preferences, or the same gotchas, Claude Code steadily adapts to <i>how you like</i> to work—across projects, codebases, and sessions.
 </p>
+<p align="center">
+  <b>Head-to-head vs <code>claude-mem</code></b>, evaluated by an LLM on how well each system’s reinjected context matched the expected rule: claude-smart achieved <b>~2.7× higher overall accuracy</b>, is better at <b>preventing Claude Code from repeating mistakes you have already corrected</b> rather than merely recalling that those mistakes happened, and is <b>3× better at converting past events into future-facing rules</b> — see <a href="benchmarks/memory_comparison/EXPERIMENT.md">EXPERIMENT.md</a> for details.
+</p>
 ---
 ## Why Learning, Not Memory
-Plain memory solutions re-inject transcripts or summaries from prior sessions. That preserves continuity but has limits:
+Most memory solutions re-inject transcripts or summaries from prior sessions, but it is still mostly informative—Claude remembers what happened, without necessarily changing what it does next.
+`claude-smart` focuses on learning instead.
+Four ways this changes what Claude Code can do for you:
+- **Actionable, not just informative:** Produces rules Claude can follow next time; memory only records what happened;
+  > *Example:* you tell Claude to stop running `npm test` without `--run` because watch mode hangs.
+  > **Memory:** “user was annoyed about npm test hanging”
+  > **Learning:** “always pass `--run` to `npm test` in this repo — default watch mode blocks CI”
+- **Optimized paths, not just past events:** Preserves successful execution paths so Claude can reuse what already works.
+  > *Example:* Claude spends several iterations trying to start the local dev environment before discovering that this repo requires `pnpm dev:all` instead of the usual `npm run dev`.
+  > **Memory:** “user mentioned that `npm run dev` did not work”
+  > **Learning:** “for this repo, always use `pnpm dev:all` to start the full local stack — `npm run dev` only starts the frontend and causes missing service errors”
+  Instead of re-exploring the same setup problem next time, Claude starts from the proven path—reducing planning steps, latency, and token usage.
+- **Project-wide, not session-siloed:** Session memory disappears with the conversation. The project playbook persists and improves across every session in that repo.
-- **Actionable, not just informative.** Memory logs *what happened*; learning produces *rules to follow* that change the next decision.
-  > *e.g.* you told Claude to stop running `npm test` without `--run` because it hangs on watch mode. **Memory** recalls "user was annoyed about npm test hanging". **Learning** writes the rule *"always pass `--run` to `npm test` in this repo — default watch mode blocks CI"* and applies it next session.
-- **Preferences, not events.** Memory records literal facts; learning abstracts the *why* into rules that generalize.
-  > *e.g.* you reject Jest in favor of Vitest once. **Memory** stores that single choice. **Learning** derives *"prefer ESM-native test runners for this TypeScript monorepo"* — which also covers the next framework decision (e.g. picking `tsx` over `ts-node`) without waiting for the same correction to repeat.
-- **Carries across sessions and workspaces.** Playbooks are keyed to the project and surface in every future session against that repo.
-- **Compact.** Distilled, deduplicated rules stay in dozens of tokens — not thousands — even as the project grows.
+- **Compact:** Distilled, deduplicated rules stay in dozens of tokens—not thousands—even as the project grows.
-claude-smart's approach: **extract, don't accumulate**. Corrections and successful patterns are distilled into two artifacts — a session-scoped **user profile** and a cross-session **project playbook** (each rule with explicit `trigger` and `rationale`, deduplicated and archived as they evolve) — and reinjected at the start of every session.
+claude-smart turns corrections and successful execution patterns into two artifacts:
+- **User Profile** → your preferences and working style across sessions
+- **Project Playbook** → durable rules and optimized execution paths for how Claude should behave
+Both are automatically reinjected at the start of every session, so Claude Code gets better the more you use it.
 ---
@@ -79,90 +104,57 @@ Developing the plugin itself? See [DEVELOPER.md](./DEVELOPER.md#developing-local
 - 🧠 **Learn, don't just remember** — Corrections become structured, deduplicated rules, not transcript replays.
 - ⚡ **Fully automatic learning** — Every user turn, tool call, and assistant response is captured via lifecycle hooks and extracted into rules without you running anything.
+- 📈 **Compounds with every session** — Rules auto-merge, supersede, and archive as your project evolves — the playbook sharpens with use instead of bloating.
+  > *e.g.* you correct the same `npm test --run` gotcha twice → **claude-smart** consolidates them into one stronger rule. Later you switch the policy to `pnpm test` → the old rule is archived and the new one supersedes it, no manual cleanup.
 - 🎯 **Two-tier scope** — Per-session profiles for the current conversation; cross-session playbooks for the whole project.
-- 🔌 **Fully local — no external API call** — semantic search runs on an in-process ONNX embedder (all-MiniLM-L6-v2), and all data (profiles, playbooks, interaction buffers) is stored locally on your machine (`~/.reflexio/` and `~/.claude-smart/`). The whole stack works offline.
+- 🔌 **No external API call** — semantic search runs on an in-process ONNX embedder (all-MiniLM-L6-v2), and all data (profiles, playbooks, interaction buffers) is stored locally on your machine (`~/.reflexio/` and `~/.claude-smart/`).
 - 🔎 **Hybrid search** — Playbooks and profiles are indexed with vector + BM25 search for fast, robust retrieval.
 - 🧪 **Offline resilience** — If the reflexio backend is down, hooks buffer to disk; the next successful publish drains them.
 - 🧰 **Manual correction tag** — `/claude-smart:tag` flags the last turn as a correction so the extractor weights it heavily.
 ---
-## Slash Commands
-| Command | What it does |
-| --- | --- |
-| `/show` | Print the current project playbook plus the current session's user profiles (same markdown that `SessionStart` injects). Use it to audit what rules and preferences Claude is being told to follow. |
-| `/learn` | Force reflexio to run extraction *now* on the current session's unpublished interactions. Without this, extraction runs at the end of the session or on reflexio's batch interval. |
-| `/tag [note]` | Tag the most recent turn as a correction, for cases the automatic heuristic missed. The note becomes the correction description the extractor sees. |
----
 ## Dashboard
-A Next.js web UI lives in [`dashboard/`](dashboard/) for browsing session buffers, inspecting user profiles, and editing project playbooks. It auto-starts alongside the backend — just open **http://localhost:3001**.
+A Next.js web UI lives in [`plugin/dashboard/`](plugin/dashboard/) for browsing session buffers, inspecting user profiles, and editing project playbooks. It auto-starts alongside the backend — just open **http://localhost:3001**.
+<p align="center">
+  <img src="assets/profile_dashboard.png" alt="Profile dashboard" width="49%">
+  <img src="assets/playbook_dashboard.png" alt="Playbook dashboard" width="49%">
+</p>
 ---
 ## How It Works
-**Core components:**
+claude-smart builds two artifacts as you work and injects them into Claude at the start of every new session:
-1. **5 lifecycle hooks** (`plugin/hooks/hooks.json`)
-   - `SessionStart` — fetches the project playbook from reflexio and injects it as `additionalContext`.
-   - `UserPromptSubmit` — buffers each user turn, heuristically flags corrections.
-   - `PostToolUse` — records tool invocations for later extraction.
-   - `Stop` — finalizes the assistant turn from the transcript, publishes to reflexio.
-   - `SessionEnd` — flushes the remaining buffer with `force_extraction=True`.
-2. **Local state buffer** — JSONL per session at `~/.claude-smart/sessions/{session_id}.jsonl`. Offline-safe.
-3. **Reflexio backend** (submodule at `reflexio/`) — SQLite storage, hybrid search, profile/playbook extraction, dedup, status lifecycle (`CURRENT` → `ARCHIVED`). Runs on `localhost:8081`.
-4. **Claude Code LLM provider** — a LiteLLM custom provider registered inside reflexio. Every generation call (extraction, update, dedup, evaluation) subprocesses `claude -p --output-format json`, so no OpenAI/Anthropic key is needed for the learning loop.
+- **User profile** — session-scoped preferences (stack, role, small quirks). *e.g.* "uses pnpm, not npm"; "prefers terse answers"; "backend engineer — explain frontend with backend analogues."
+- **Project playbook** — durable, generalized rules accumulated across every session in the repo. Each says when it applies and why. *e.g.* "always pass `--run` to `npm test` — watch mode hangs CI"; "use real Postgres for integration tests — mocks once hid a broken migration."
-**Data flow:**
+Rules clean themselves up: correct the same thing twice and they merge; change your mind and the old one is archived.
-```
-Claude Code session
-  ├─ UserPromptSubmit ─┐
-  ├─ PostToolUse  ─────┤  → JSONL buffer ─→ Stop ─→ reflexio publish_interaction
-  └─ Stop         ─────┘                              │
-                                                      ▼
-                                        ┌─────────────────────────┐
-                                        │ reflexio extractors     │
-                                        │  (run via claude -p)    │
-                                        │  → profiles + playbooks │
-                                        └────────────┬────────────┘
-                                                     │
-                                                     ▼
-Next session → SessionStart → search_user_playbooks(agent_version=project_id)
-              → additionalContext injected into Claude's system prompt
-```
+Under the hood: hooks watch your turns, tool calls, and Claude's replies, auto-flagging corrections (or anything you `/tag`). At session end (or on `/learn`), [reflexio](https://github.com/ReflexioAI/reflexio) — the self-improving engine that powers claude-smart — extracts profile entries and playbook rules. Next session, both get injected into the system prompt — run `/show` to see what Claude is being told. Everything runs on your machine.
-**Mapping to reflexio:**
+See [ARCHITECTURE.md](./ARCHITECTURE.md) for hooks, data flow, and reflexio details.
-| Reflexio field | claude-smart value |
-| --- | --- |
-| `user_id` | Claude Code `session_id` — scopes profiles to the current conversation |
-| `agent_version` | `project_id` (git-toplevel basename) — stable across sessions, so playbooks accumulate project-wide |
-| `session_id` | Claude Code `session_id` — for reflexio's deferred success evaluation |
+---
-Cross-session playbook retrieval uses `search_user_playbooks(agent_version=project_id, user_id=None)` — playbooks written from any prior session in this project surface for every future session.
+## Slash Commands
+| Command | What it does |
+| --- | --- |
+| `/show` | Print the current project playbook plus the current session's user profiles (same markdown that `SessionStart` injects). Use it to audit what rules and preferences Claude is being told to follow. |
+| `/learn` | Force reflexio to run extraction *now* on the current session's unpublished interactions. Without this, extraction runs at the end of the session or on reflexio's batch interval. |
+| `/tag [note]` | Tag the most recent turn as a correction, for cases the automatic heuristic missed. The note becomes the correction description the extractor sees. |
+| `/restart` | Restart the reflexio backend and dashboard to pick up new changes (e.g. after upgrading the plugin or editing local reflexio code). |
+| `/clear-all` | **Destructive.** Delete *all* reflexio interactions, profiles, and user playbooks. Use when you want to wipe learned state and start fresh. |
 ---
 ## Configuration
-### Environment variables
-| Variable | Default | Purpose |
-| --- | --- | --- |
-| `CLAUDE_SMART_USE_LOCAL_CLI` | `0` (installer sets `1`) | Route generation through the local `claude` CLI. Written to `~/.reflexio/.env` by `claude-smart install`. |
-| `CLAUDE_SMART_USE_LOCAL_EMBEDDING` | `0` (installer sets `1`) | Use the in-process ONNX embedder (requires `chromadb`). Written to `~/.reflexio/.env` by `claude-smart install`. |
-| `CLAUDE_SMART_CLI_PATH` | `shutil.which("claude")` | Override the path to the `claude` binary. |
-| `CLAUDE_SMART_CLI_TIMEOUT` | `120` | Per-call subprocess timeout (seconds). Raise for slow prompts. |
-| `CLAUDE_SMART_STATE_DIR` | `~/.claude-smart/sessions/` | Where the per-session JSONL buffer lives. |
-| `CLAUDE_SMART_BACKEND_AUTOSTART` | `1` | Set to `0` to stop the SessionStart hook from spawning the reflexio backend on `localhost:8081`. |
-| `CLAUDE_SMART_DASHBOARD_AUTOSTART` | `1` | Set to `0` to stop the SessionStart hook from spawning the Next.js dashboard on `localhost:3001`. |
-| `CLAUDE_SMART_BACKEND_STOP_ON_END` | `0` | Set to `1` to tear down the backend at `SessionEnd` instead of leaving it long-lived. |
-| `REFLEXIO_URL` | `http://localhost:8081/` | Point the plugin at a non-local reflexio backend. |
+Advanced users can tune claude-smart via environment variables — see [DEVELOPER.md](./DEVELOPER.md#environment-variables) for the full list.
 ### Where data lives
@@ -173,11 +165,6 @@ Cross-session playbook retrieval uses `search_user_playbooks(agent_version=proje
 | `~/.claude-smart/sessions/{session_id}.jsonl` | Per-session buffer. User turns, assistant turns, tool invocations, `{"published_up_to": N}` watermarks. Safe to inspect and safe to delete — everything past the latest watermark has already been written to reflexio's DB. |
 | `~/.cache/chroma/onnx_models/all-MiniLM-L6-v2/` | Cached ONNX weights (~86 MB, downloaded once). Delete to force a re-download. |
-### Scope: profile vs. playbook
-- **Profile** (`user_id = session_id`) — session-scoped preferences. Does not persist across sessions, but *is* reinjected if you resume the same session (`/resume`, `/clear`, `/compact`).
-- **Playbook** (`agent_version = project_id`) — cross-session. Every session in the same project — identified by git-toplevel basename — sees the accumulated playbook.
 ### Embeddings
 claude-smart uses an in-process ONNX embedder (Chroma's `all-MiniLM-L6-v2`, 384-dim, zero-padded to reflexio's 512-dim schema). The model weights are downloaded on first use (~80 MB, cached under `~/.cache/chroma/onnx_models/`) — after that, no network calls for embedding. Runtime cost is a few milliseconds per short document on CPU.
@@ -192,7 +179,7 @@ If you still want to use a cloud embedding provider (OpenAI, Gemini, etc.), omit
 Extraction is async by default. Run `/learn` to force it, wait ~20–30s, then run `/show` — no new session needed. `/show` shows whether the rule was actually extracted.
 **Reflexio refuses to boot with "no embedding-capable provider".**
-Check that `CLAUDE_SMART_USE_LOCAL_EMBEDDING=1` is in `~/.reflexio/.env` *and* that `chromadb` is installed in the venv (`uv run python -c "import chromadb"` should print nothing). If you'd rather use a cloud embedder instead, drop the env flag and set `OPENAI_API_KEY` or `GEMINI_API_KEY` in the same file.
+Check that `CLAUDE_SMART_USE_LOCAL_EMBEDDING=1` is in `~/.reflexio/.env` *and* that `chromadb` is installed in the venv (`uv run --project plugin python -c "import chromadb"` should print nothing). If you'd rather use a cloud embedder instead, drop the env flag and set `OPENAI_API_KEY` or `GEMINI_API_KEY` in the same file.
 **`claude-smart` doesn't see my interactions.**
 Check `~/.claude-smart/sessions/`. If your current session's JSONL has no `User`/`Assistant` rows, the plugin isn't receiving hook events — verify `.claude/settings.local.json` has the right path and that `enabledPlugins` is `true`.

package/bin/claude-smart.js CHANGED Viewed

@@ -58,6 +58,9 @@ function printHelp() {
       "  3. Appends CLAUDE_SMART_USE_LOCAL_CLI=1 and CLAUDE_SMART_USE_LOCAL_EMBEDDING=1",
       "     to ~/.reflexio/.env (idempotent).",
       "",
+      "Update:",
+      "  npx claude-smart update                        Update to the latest version",
+      "",
     ].join("\n"),
   );
 }
@@ -73,6 +76,26 @@ function parseSource(args) {
   return value;
 }
+function runUpdate() {
+  if (!hasClaudeCli()) {
+    process.stderr.write(
+      "error: 'claude' CLI not found on PATH. " +
+        "Install Claude Code first: https://claude.com/claude-code\n",
+    );
+    process.exit(1);
+  }
+  try {
+    execFileSync("claude", ["plugin", "update", PLUGIN_SPEC], { stdio: "inherit" });
+  } catch (err) {
+    const code = typeof err.status === "number" ? err.status : 1;
+    process.stderr.write(`error: \`claude plugin update ${PLUGIN_SPEC}\` failed (exit ${code})\n`);
+    process.exit(code);
+  }
+  process.stdout.write("\nclaude-smart updated. Restart Claude Code to apply.\n");
+}
 function runInstall(args) {
   if (!hasClaudeCli()) {
     process.stderr.write(
@@ -110,10 +133,9 @@ function runInstall(args) {
   process.stdout.write(
     [
       "",
-      "claude-smart installed. Next steps:",
-      "  1. Start the reflexio backend (leave it running in another terminal):",
-      "       uv run reflexio services start --only backend --no-reload",
-      "  2. Restart Claude Code in your project.",
+      "claude-smart installed. Restart Claude Code in your project.",
+      "The reflexio backend and dashboard auto-start on session start.",
+      "Opt out with CLAUDE_SMART_BACKEND_AUTOSTART=0 or CLAUDE_SMART_DASHBOARD_AUTOSTART=0.",
       "",
     ].join("\n"),
   );
@@ -133,6 +155,11 @@ function main() {
     return;
   }
+  if (cmd === "update") {
+    runUpdate();
+    return;
+  }
   process.stderr.write(
     `claude-smart: unknown command '${cmd}'. Try 'npx claude-smart --help'.\n`,
   );

package/package.json CHANGED Viewed

@@ -1,6 +1,6 @@
 {
   "name": "claude-smart",
-  "version": "0.1.3",
+  "version": "0.1.5",
   "description": "Self-improving Claude Code plugin — learns from corrections via reflexio",
   "keywords": [
     "claude",