@martian-engineering/lossless-claw 0.1.2 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
package/README.md CHANGED
@@ -1,5 +1,7 @@
1
1
  # lossless-claw
2
2
 
3
+ > ⚠️ **Current requirement:** This plugin currently requires a custom OpenClaw build with [PR #22201](https://github.com/openclaw/openclaw/pull/22201) applied until that PR is merged upstream.
4
+
3
5
  Lossless Context Management plugin for [OpenClaw](https://github.com/openclaw/openclaw), based on the [LCM paper](https://voltropy.com/LCM). Replaces OpenClaw's built-in sliding-window compaction with a DAG-based summarization system that preserves every message while keeping active context within model token limits.
4
6
 
5
7
  ## What it does
@@ -26,45 +28,37 @@ Nothing is lost. Raw messages stay in the database. Summaries link back to their
26
28
 
27
29
  ### Install the plugin
28
30
 
29
- **From npm** (recommended):
31
+ Use OpenClaw's plugin installer (recommended):
30
32
 
31
33
  ```bash
32
- npm install @martian-engineering/lossless-claw
34
+ openclaw plugins install @martian-engineering/lossless-claw
33
35
  ```
34
36
 
35
- **From source** (for development):
37
+ If you're running from a local OpenClaw checkout, use:
36
38
 
37
39
  ```bash
38
- git clone https://github.com/Martian-Engineering/lossless-claw.git
39
- cd lossless-claw
40
- npm install
40
+ pnpm openclaw plugins install @martian-engineering/lossless-claw
41
41
  ```
42
42
 
43
- ### Configure OpenClaw
44
-
45
- Add the plugin to your OpenClaw config (`~/.openclaw/openclaw.json`):
43
+ For local plugin development, link your working copy instead of copying files:
46
44
 
47
- ```json
48
- {
49
- "plugins": {
50
- "paths": [
51
- "node_modules/@martian-engineering/lossless-claw"
52
- ],
53
- "slots": {
54
- "contextEngine": "lossless-claw"
55
- }
56
- }
57
- }
45
+ ```bash
46
+ openclaw plugins install --link /path/to/lossless-claw
47
+ # or from a local OpenClaw checkout:
48
+ # pnpm openclaw plugins install --link /path/to/lossless-claw
58
49
  ```
59
50
 
60
- If installed from source, use the absolute path to the cloned repo instead:
51
+ The install command records the plugin, enables it, and applies compatible slot selection (including `contextEngine` when applicable).
52
+
53
+ ### Configure OpenClaw
54
+
55
+ In most cases, no manual JSON edits are needed after `openclaw plugins install`.
56
+
57
+ If you need to set it manually, ensure the context engine slot points at lossless-claw:
61
58
 
62
59
  ```json
63
60
  {
64
61
  "plugins": {
65
- "paths": [
66
- "/path/to/lossless-claw"
67
- ],
68
62
  "slots": {
69
63
  "contextEngine": "lossless-claw"
70
64
  }
@@ -72,8 +66,6 @@ If installed from source, use the absolute path to the cloned repo instead:
72
66
  }
73
67
  ```
74
68
 
75
- The `slots.contextEngine` setting tells OpenClaw to route all context management through LCM instead of the built-in legacy engine.
76
-
77
69
  Restart OpenClaw after configuration changes.
78
70
 
79
71
  ## Configuration
@@ -67,8 +67,10 @@ The **leaf pass** converts raw messages into leaf summaries:
67
67
  3. Concatenate message content with timestamps.
68
68
  4. Resolve the most recent prior summary for continuity (passed as `previous_context` so the LLM avoids repeating known information).
69
69
  5. Send to the LLM with the leaf prompt.
70
- 6. If the summary is larger than the input (LLM failure), retry with the aggressive prompt. If still too large, fall back to deterministic truncation.
71
- 7. Persist the summary, link to source messages, and replace the message range in context_items.
70
+ 6. Normalize provider response blocks (Anthropic/OpenAI text, output_text, and nested content/summary shapes) into plain text.
71
+ 7. If normalization is empty, log provider/model/block-type diagnostics and fall back to deterministic truncation.
72
+ 8. If the summary is larger than the input (LLM failure), retry with the aggressive prompt. If still too large, fall back to deterministic truncation.
73
+ 9. Persist the summary, link to source messages, and replace the message range in context_items.
72
74
 
73
75
  ### Condensation
74
76
 
@@ -215,8 +217,8 @@ All mutating operations (ingest, compact) are serialized per-session using a pro
215
217
 
216
218
  LCM needs to call an LLM for summarization. It resolves credentials through a three-tier cascade:
217
219
 
218
- 1. **Explicit API key** — If provided in legacy params
220
+ 1. **Auth profiles** — OpenClaw's OAuth/token/API-key profile system (`auth-profiles.json`), checked in priority order
219
221
  2. **Environment variables** — Standard provider env vars (`ANTHROPIC_API_KEY`, etc.)
220
- 3. **Auth profiles** — OpenClaw's OAuth/token/API-key profile system (`auth-profiles.json`)
222
+ 3. **Custom provider key** — From models config (e.g., `models.json`)
221
223
 
222
224
  For OAuth providers (e.g., Anthropic via Claude Max), LCM handles token refresh and credential persistence automatically.
@@ -2,24 +2,25 @@
2
2
 
3
3
  ## Quick start
4
4
 
5
- Install the plugin and add it to your OpenClaw config:
5
+ Install the plugin with OpenClaw's plugin installer:
6
6
 
7
7
  ```bash
8
- npm install @martian-engineering/lossless-claw
8
+ openclaw plugins install @martian-engineering/lossless-claw
9
9
  ```
10
10
 
11
- ```json
12
- {
13
- "plugins": {
14
- "paths": ["node_modules/@martian-engineering/lossless-claw"],
15
- "slots": {
16
- "contextEngine": "lossless-claw"
17
- }
18
- }
19
- }
11
+ If you're running from a local OpenClaw checkout:
12
+
13
+ ```bash
14
+ pnpm openclaw plugins install @martian-engineering/lossless-claw
15
+ ```
16
+
17
+ For local development of this plugin, link your working copy:
18
+
19
+ ```bash
20
+ openclaw plugins install --link /path/to/lossless-claw
20
21
  ```
21
22
 
22
- If installed from source, use the absolute path to the repo instead of `node_modules/...`.
23
+ `openclaw plugins install` handles plugin registration/enabling and slot selection automatically.
23
24
 
24
25
  Set recommended environment variables:
25
26
 
package/docs/tui.md CHANGED
@@ -176,7 +176,7 @@ Lists files that exceeded the large file threshold (default 25k tokens) and were
176
176
  Re-summarizes a single summary node using the current depth-aware prompt templates. The process:
177
177
 
178
178
  1. **Preview** — shows the prompt that will be sent, including source material, target token count, previous context, and time range
179
- 2. **API call** — sends to Anthropic's API (Claude Sonnet by default)
179
+ 2. **API call** — sends to the configured provider API (Anthropic by default)
180
180
  3. **Review** — shows old and new content side-by-side with token delta. Toggle unified diff view with `d`. Scroll with `j`/`k`.
181
181
 
182
182
  | Key (Preview) | Action |
@@ -280,6 +280,9 @@ lcm-tui rewrite 44 --depth 0 --apply
280
280
  # Rewrite everything bottom-up
281
281
  lcm-tui rewrite 44 --all --apply --diff
282
282
 
283
+ # Rewrite with OpenAI Responses API
284
+ lcm-tui rewrite 44 --summary sum_abc123 --provider openai --model gpt-5.3-codex --apply
285
+
283
286
  # Use custom prompt templates
284
287
  lcm-tui rewrite 44 --all --apply --prompt-dir ~/.config/lcm-tui/prompts
285
288
  ```
@@ -292,7 +295,8 @@ lcm-tui rewrite 44 --all --apply --prompt-dir ~/.config/lcm-tui/prompts
292
295
  | `--apply` | Write changes to database |
293
296
  | `--dry-run` | Show before/after without writing (default) |
294
297
  | `--diff` | Show unified diff |
295
- | `--model <model>` | Anthropic model (default: `claude-sonnet-4-20250514`) |
298
+ | `--provider <id>` | API provider (inferred from `--model` when omitted) |
299
+ | `--model <model>` | API model (default depends on provider) |
296
300
  | `--prompt-dir <path>` | Custom prompt template directory |
297
301
  | `--timestamps` | Inject timestamps into source text (default: true) |
298
302
  | `--tz <timezone>` | Timezone for timestamps (default: system local) |
@@ -348,6 +352,56 @@ Everything runs in a single transaction.
348
352
  | `--apply` | Execute transplant |
349
353
  | `--dry-run` | Show what would be transplanted (default) |
350
354
 
355
+ ### `lcm-tui backfill`
356
+
357
+ Imports a pre-LCM JSONL session into `conversations/messages/context_items`, runs iterative depth-aware compaction with the configured provider and prompt templates, optionally forces a single-root fold, and can transplant the result to another conversation.
358
+
359
+ ```bash
360
+ # Preview import + compaction plan (no writes)
361
+ lcm-tui backfill my-agent session_abc123
362
+
363
+ # Import + compact
364
+ lcm-tui backfill my-agent session_abc123 --apply
365
+
366
+ # Re-run compaction for an already-imported session
367
+ lcm-tui backfill my-agent session_abc123 --apply --recompact
368
+
369
+ # Force a single summary root when possible
370
+ lcm-tui backfill my-agent session_abc123 --apply --recompact --single-root
371
+
372
+ # Import + compact + transplant into an active conversation
373
+ lcm-tui backfill my-agent session_abc123 --apply --transplant-to 653
374
+
375
+ # Backfill using OpenAI
376
+ lcm-tui backfill my-agent session_abc123 --apply --provider openai --model gpt-5.3-codex
377
+ ```
378
+
379
+ All write paths are transactional:
380
+ 1. Import transaction (conversation/messages/message_parts/context)
381
+ 2. Per-pass compaction transactions (leaf/condensed replacements)
382
+ 3. Optional transplant transaction (reuse of transplant command internals)
383
+
384
+ An idempotency guard prevents duplicate imports for the same `session_id`.
385
+
386
+ | Flag | Description |
387
+ |------|-------------|
388
+ | `--apply` | Execute import/compaction/transplant |
389
+ | `--dry-run` | Show what would run, without writes (default) |
390
+ | `--recompact` | Re-run compaction for already-imported sessions (message import remains idempotent) |
391
+ | `--single-root` | Force condensed folding until one summary remains when possible |
392
+ | `--transplant-to <conv_id>` | Transplant backfilled summaries into target conversation |
393
+ | `--title <text>` | Override imported conversation title |
394
+ | `--leaf-chunk-tokens <n>` | Max source tokens per leaf chunk |
395
+ | `--leaf-target-tokens <n>` | Target output tokens for leaf summaries |
396
+ | `--condensed-target-tokens <n>` | Target output tokens for condensed summaries |
397
+ | `--leaf-fanout <n>` | Min leaves required for d1 condensation |
398
+ | `--condensed-fanout <n>` | Min summaries required for d2+ condensation |
399
+ | `--hard-fanout <n>` | Min summaries for forced single-root passes |
400
+ | `--fresh-tail <n>` | Preserve freshest N raw messages from leaf compaction |
401
+ | `--provider <id>` | API provider (inferred from `--model` when omitted) |
402
+ | `--model <id>` | API model (default depends on provider) |
403
+ | `--prompt-dir <path>` | Custom depth-prompt directory |
404
+
351
405
  ### `lcm-tui prompts`
352
406
 
353
407
  Manage and inspect depth-aware prompt templates. Templates control how the LLM summarizes at each depth level.
@@ -404,21 +458,31 @@ All templates end with an `"Expand for details about:"` footer listing topics av
404
458
 
405
459
  ## Authentication
406
460
 
407
- The TUI needs an Anthropic API key for rewrite and repair operations. It resolves credentials in this order:
461
+ The TUI resolves API keys by provider for rewrite, repair, and backfill compaction operations.
462
+
463
+ - Anthropic: `ANTHROPIC_API_KEY`
464
+ - OpenAI: `OPENAI_API_KEY`
408
465
 
409
- 1. `ANTHROPIC_API_KEY` environment variable
410
- 2. OpenClaw config (`~/.openclaw/openclaw.json`) reads the `anthropic:default` auth profile mode
466
+ Resolution order:
467
+ 1. Provider API key environment variable
468
+ 2. OpenClaw config (`~/.openclaw/openclaw.json`) — checks matching provider auth profile mode
411
469
  3. OpenClaw env file
412
470
  4. `~/.zshrc` export
413
- 5. Various credential file candidates under `~/.openclaw/`
471
+ 5. Credential file candidates under `~/.openclaw/`
472
+
473
+ If the provider auth profile mode is `oauth` (not `api_key`), set the provider API key environment variable explicitly.
474
+
475
+ Interactive rewrite (`w`/`W`) can be configured with:
476
+ - `LCM_TUI_SUMMARY_PROVIDER`
477
+ - `LCM_TUI_SUMMARY_MODEL`
414
478
 
415
- If the auth profile mode is `oauth` (not `api_key`), the TUI cannot use it — set `ANTHROPIC_API_KEY` explicitly for repair/rewrite commands.
479
+ It also honors `LCM_SUMMARY_PROVIDER` / `LCM_SUMMARY_MODEL` as a fallback.
416
480
 
417
481
  ## Database
418
482
 
419
- The TUI operates directly on the SQLite database at `~/.openclaw/lcm.db`. All write operations (rewrite, dissolve, repair, transplant) use transactions. Changes take effect on the next conversation turn — the running OpenClaw instance picks up database changes automatically.
483
+ The TUI operates directly on the SQLite database at `~/.openclaw/lcm.db`. All write operations (rewrite, dissolve, repair, transplant, backfill) use transactions. Changes take effect on the next conversation turn — the running OpenClaw instance picks up database changes automatically.
420
484
 
421
- **Backup recommendation:** Before batch operations (repair `--all`, rewrite `--all`, transplant), copy the database:
485
+ **Backup recommendation:** Before batch operations (repair `--all`, rewrite `--all`, transplant, backfill), copy the database:
422
486
 
423
487
  ```bash
424
488
  cp ~/.openclaw/lcm.db ~/.openclaw/lcm.db.bak-$(date +%Y%m%d)
@@ -428,7 +492,7 @@ cp ~/.openclaw/lcm.db ~/.openclaw/lcm.db.bak-$(date +%Y%m%d)
428
492
 
429
493
  **"No LCM summaries found"** — The session may not have an associated conversation in the LCM database. Check that the `conv_id` column shows a non-zero value in the session list. Sessions without LCM tracking won't have summaries.
430
494
 
431
- **Rewrite returns empty/bad content** — Check the API key is valid and the model is accessible. The TUI uses `claude-sonnet-4-20250514` by default; override with `--model` if needed.
495
+ **Rewrite returns empty/bad content** — Check provider/model access and API key. If normalization still yields empty text, the TUI now returns diagnostics including `provider`, `model`, and response `block_types` to help pinpoint adapter mismatches.
432
496
 
433
497
  **Dissolve fails with "not condensed"** — Only condensed summaries (depth > 0) can be dissolved. Leaf summaries have no parent summaries to restore.
434
498
 
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@martian-engineering/lossless-claw",
3
- "version": "0.1.2",
3
+ "version": "0.1.4",
4
4
  "description": "Lossless Context Management plugin for OpenClaw — DAG-based conversation summarization with incremental compaction",
5
5
  "type": "module",
6
6
  "main": "index.ts",
package/src/summarize.ts CHANGED
@@ -78,13 +78,119 @@ function estimateTokens(text: string): number {
78
78
  return Math.ceil(text.length / 4);
79
79
  }
80
80
 
81
- /** Narrows completion response blocks to plain text blocks. */
82
- function isTextBlock(block: unknown): block is { type: string; text: string } {
83
- if (!block || typeof block !== "object" || Array.isArray(block)) {
84
- return false;
81
+ /** Narrow unknown values to plain object records. */
82
+ function isRecord(value: unknown): value is Record<string, unknown> {
83
+ return !!value && typeof value === "object" && !Array.isArray(value);
84
+ }
85
+
86
+ /**
87
+ * Normalize text fragments from provider-specific block shapes.
88
+ *
89
+ * Deduplicates exact repeated fragments while preserving first-seen order so
90
+ * providers that mirror output in multiple fields don't duplicate summaries.
91
+ */
92
+ function normalizeTextFragments(chunks: string[]): string {
93
+ const normalized: string[] = [];
94
+ const seen = new Set<string>();
95
+
96
+ for (const chunk of chunks) {
97
+ const trimmed = chunk.trim();
98
+ if (!trimmed || seen.has(trimmed)) {
99
+ continue;
100
+ }
101
+ seen.add(trimmed);
102
+ normalized.push(trimmed);
103
+ }
104
+ return normalized.join("\n").trim();
105
+ }
106
+
107
+ /** Collect all nested `type` labels for diagnostics on normalization failures. */
108
+ function collectBlockTypes(value: unknown, out: Set<string>): void {
109
+ if (Array.isArray(value)) {
110
+ for (const entry of value) {
111
+ collectBlockTypes(entry, out);
112
+ }
113
+ return;
114
+ }
115
+ if (!isRecord(value)) {
116
+ return;
117
+ }
118
+
119
+ if (typeof value.type === "string" && value.type.trim()) {
120
+ out.add(value.type.trim());
121
+ }
122
+ for (const nested of Object.values(value)) {
123
+ collectBlockTypes(nested, out);
124
+ }
125
+ }
126
+
127
+ /** Collect text payloads from common provider response shapes. */
128
+ function collectTextLikeFields(value: unknown, out: string[]): void {
129
+ if (Array.isArray(value)) {
130
+ for (const entry of value) {
131
+ collectTextLikeFields(entry, out);
132
+ }
133
+ return;
134
+ }
135
+ if (!isRecord(value)) {
136
+ return;
137
+ }
138
+
139
+ for (const key of ["text", "output_text", "thinking"]) {
140
+ appendTextValue(value[key], out);
141
+ }
142
+ for (const key of ["content", "summary", "output", "message", "response"]) {
143
+ if (key in value) {
144
+ collectTextLikeFields(value[key], out);
145
+ }
146
+ }
147
+ }
148
+
149
+ /** Append raw textual values and nested text wrappers (`value`, `text`). */
150
+ function appendTextValue(value: unknown, out: string[]): void {
151
+ if (typeof value === "string") {
152
+ out.push(value);
153
+ return;
154
+ }
155
+ if (Array.isArray(value)) {
156
+ for (const entry of value) {
157
+ appendTextValue(entry, out);
158
+ }
159
+ return;
160
+ }
161
+ if (!isRecord(value)) {
162
+ return;
163
+ }
164
+
165
+ if (typeof value.value === "string") {
166
+ out.push(value.value);
167
+ }
168
+ if (typeof value.text === "string") {
169
+ out.push(value.text);
170
+ }
171
+ }
172
+
173
+ /** Normalize provider completion content into a plain-text summary payload. */
174
+ function normalizeCompletionSummary(content: unknown): { summary: string; blockTypes: string[] } {
175
+ const chunks: string[] = [];
176
+ const blockTypeSet = new Set<string>();
177
+
178
+ collectTextLikeFields(content, chunks);
179
+ collectBlockTypes(content, blockTypeSet);
180
+
181
+ const blockTypes = [...blockTypeSet].sort((a, b) => a.localeCompare(b));
182
+ return {
183
+ summary: normalizeTextFragments(chunks),
184
+ blockTypes,
185
+ };
186
+ }
187
+
188
+ /** Format normalized block types for concise diagnostics. */
189
+ function formatBlockTypes(blockTypes: string[]): string {
190
+ if (blockTypes.length === 0) {
191
+ return "(none)";
85
192
  }
86
- const record = block as { type?: unknown; text?: unknown };
87
- return record.type === "text" && typeof record.text === "string";
193
+ return blockTypes.join(",");
88
194
  }
89
195
 
90
196
  /**
@@ -426,15 +532,15 @@ export async function createLcmSummarizeFromLegacyParams(params: {
426
532
  temperature: aggressive ? 0.1 : 0.2,
427
533
  });
428
534
 
429
- const summary = result.content
430
- .filter(isTextBlock)
431
- .map((block) => block.text.trim())
432
- .filter(Boolean)
433
- .join("\n")
434
- .trim();
535
+ const normalized = normalizeCompletionSummary(result.content);
536
+ const summary = normalized.summary;
435
537
 
436
538
  if (!summary) {
437
- console.error(`[lcm] summarize got empty content from LLM (${result.content.length} blocks, types: ${result.content.map(b => b.type).join(",")}), falling back to truncation`);
539
+ console.error(
540
+ `[lcm] summarize empty normalized summary; provider=${provider} model=${model} block_types=${formatBlockTypes(
541
+ normalized.blockTypes,
542
+ )}; response_blocks=${result.content.length}; falling back to truncation`,
543
+ );
438
544
  return buildDeterministicFallbackSummary(text, targetTokens);
439
545
  }
440
546