@martian-engineering/lossless-claw 0.6.3 → 0.8.0 (npm)

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (38)

package/README.md +26 -6
package/docs/agent-tools.md +16 -5
package/docs/configuration.md +223 -214
package/openclaw.plugin.json +123 -0
package/package.json +1 -1
package/skills/lossless-claw/SKILL.md +3 -2
package/skills/lossless-claw/references/architecture.md +12 -0
package/skills/lossless-claw/references/config.md +135 -3
package/skills/lossless-claw/references/diagnostics.md +13 -0
package/src/assembler.ts +17 -5
package/src/compaction.ts +161 -53
package/src/db/config.ts +102 -4
package/src/db/connection.ts +35 -7
package/src/db/features.ts +24 -5
package/src/db/migration.ts +257 -78
package/src/engine.ts +1007 -110
package/src/estimate-tokens.ts +80 -0
package/src/lcm-log.ts +37 -0
package/src/plugin/index.ts +493 -101
package/src/plugin/lcm-command.ts +288 -7
package/src/plugin/lcm-doctor-apply.ts +1 -3
package/src/plugin/lcm-doctor-cleaners.ts +655 -0
package/src/plugin/shared-init.ts +59 -0
package/src/prune.ts +391 -0
package/src/retrieval.ts +8 -9
package/src/startup-banner-log.ts +1 -0
package/src/store/compaction-telemetry-store.ts +156 -0
package/src/store/conversation-store.ts +6 -1
package/src/store/fts5-sanitize.ts +25 -4
package/src/store/full-text-sort.ts +21 -0
package/src/store/index.ts +8 -0
package/src/store/summary-store.ts +21 -14
package/src/summarize.ts +55 -34
package/src/tools/lcm-describe-tool.ts +9 -4
package/src/tools/lcm-expand-query-tool.ts +609 -200
package/src/tools/lcm-expand-tool.ts +9 -4
package/src/tools/lcm-grep-tool.ts +22 -8
package/src/types.ts +1 -0

package/README.md CHANGED

@@ -30,11 +30,25 @@ Nothing is lost. Raw messages stay in the database. Summaries link back to their
 ## Commands And Skill
-The plugin now ships a bundled `lossless-claw` skill plus a small native command surface:
+The plugin now ships a bundled `lossless-claw` skill plus a small plugin command surface for supported OpenClaw chat/native command providers:
 - `/lcm` shows version, enablement/selection state, DB path and size, summary counts, and summary-health status
 - `/lcm doctor` scans for broken or truncated summaries
-- `/lossless` is an alias for `/lcm` on native command surfaces
+- `/lcm doctor clean` shows read-only high-confidence junk diagnostics for archived subagents, cron sessions, and NULL-key orphaned subagent runs
+- `/lossless` is an alias for `/lcm` on supported native command surfaces
+These are plugin slash/native commands, not root shell CLI subcommands. Supported examples:
+- `/lcm`
+- `/lcm doctor`
+- `/lcm doctor clean`
+- `/lossless`
+Not currently supported as root CLI commands:
+- `openclaw lcm`
+- `openclaw lossless`
+- `openclaw /lcm`
 The bundled skill focuses on configuration, diagnostics, architecture, and recall-tool usage. Its reference set lives under `skills/lossless-claw/references/`.
@@ -70,7 +84,7 @@ openclaw plugins install --link /path/to/lossless-claw
 The install command records the plugin, enables it, and applies compatible slot selection (including `contextEngine` when applicable).
-> **Note:** If your OpenClaw config uses `plugins.allow`, make sure both `lossless-claw` and any active plugins you rely on remain allowlisted. In some setups, narrowing the allowlist can prevent plugin-backed integrations from loading, even if `lossless-claw` itself is installed correctly. Restart the gateway after plugin config changes.
+> **Note:** If your OpenClaw config uses `plugins.allow`, allowlist the plugin id `lossless-claw` plus any other active plugins you rely on. Do not add command tokens or aliases like `lossless` or `/lcm` to `plugins.allow`; that setting only accepts plugin ids. In some setups, narrowing the allowlist can prevent plugin-backed integrations from loading, even if `lossless-claw` itself is installed correctly. Restart the gateway after plugin config changes.
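
The allowlist rule in the note above can be checked mechanically. The following is a minimal sketch, not part of the package: `findSuspectAllowEntries` and `KNOWN_COMMAND_TOKENS` are hypothetical names, and treating a bare `lcm` as a command token is an assumption beyond the two examples the note gives (`lossless` and `/lcm`).

```typescript
// Hypothetical helper (not shipped by lossless-claw): flags `plugins.allow`
// entries that look like command tokens or aliases instead of plugin ids.
// Assumption: bare "lcm" is treated as a command token alongside "lossless".
const KNOWN_COMMAND_TOKENS = new Set<string>(["lcm", "lossless"]);

interface AllowlistIssue {
  entry: string;
  reason: string;
}

function findSuspectAllowEntries(allow: string[]): AllowlistIssue[] {
  const issues: AllowlistIssue[] = [];
  for (const entry of allow) {
    const trimmed = entry.trim();
    if (trimmed.startsWith("/")) {
      // Slash commands such as "/lcm" are never plugin ids.
      issues.push({ entry, reason: "looks like a slash command, not a plugin id" });
    } else if (KNOWN_COMMAND_TOKENS.has(trimmed)) {
      // Aliases such as "lossless" should be replaced by the plugin id.
      issues.push({ entry, reason: "is a command alias; use the plugin id lossless-claw" });
    }
  }
  return issues;
}

// Usage: the plugin id passes, the alias and the slash command are flagged.
const flagged = findSuspectAllowEntries(["lossless-claw", "lossless", "/lcm"]);
console.log(flagged.map((issue) => `${issue.entry}: ${issue.reason}`).join("\n"));
```

A check like this only catches the two mistakes the note names; it does not verify that an entry is a real installed plugin id.
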
 ### Configure OpenClaw
@@ -113,8 +127,8 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
           "ignoreSessionPatterns": [
             "agent:*:cron:**"
           ],
-          "summaryModel": "anthropic/claude-haiku-4-5",
-          "expansionModel": "anthropic/claude-haiku-4-5",
+          "summaryModel": "openai/gpt-5.4-mini",
+          "expansionModel": "openai/gpt-5.4-mini",
           "delegationTimeoutMs": 300000,
           "summaryTimeoutMs": 60000
         }
@@ -152,7 +166,7 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
 | `LCM_SUMMARY_MODEL` | `""` | Model override for compaction summarization; falls back to OpenClaw's default model when unset |
 | `LCM_SUMMARY_PROVIDER` | `""` | Provider override for compaction summarization; falls back to `OPENCLAW_PROVIDER` or the provider embedded in the model ref |
 | `LCM_SUMMARY_BASE_URL` | *(from OpenClaw / provider default)* | Base URL override for summarization API calls |
-| `LCM_EXPANSION_MODEL` | *(from OpenClaw)* | Model override for `lcm_expand_query` sub-agent (e.g. `anthropic/claude-haiku-4-5`) |
+| `LCM_EXPANSION_MODEL` | *(from OpenClaw)* | Model override for `lcm_expand_query` sub-agent (e.g. `openai/gpt-5.4-mini`) |
 | `LCM_EXPANSION_PROVIDER` | *(from OpenClaw)* | Provider override for `lcm_expand_query` sub-agent |
 | `LCM_DELEGATION_TIMEOUT_MS` | `120000` | Max time to wait for delegated `lcm_expand_query` sub-agent completion |
 | `LCM_SUMMARY_TIMEOUT_MS` | `60000` | Max time to wait for a single model-backed LCM summarizer call |
@@ -162,6 +176,8 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
 If you want `lcm_expand_query` to run on a dedicated model via `expansionModel` or `LCM_EXPANSION_MODEL`, OpenClaw must explicitly trust the plugin to request sub-agent model overrides.
+For most setups, `openai/gpt-5.4-mini` is a better starting point than Anthropic Haiku because it is cheap, fast, and does not depend on Anthropic quota remaining.
 Add a `subagent` policy under `plugins.entries.lossless-claw` and allowlist the canonical `provider/model` target you want the plugin to use:
 ```json
@@ -215,6 +231,8 @@ For compaction summarization, lossless-claw resolves the model in this order:
 If `summaryModel` already includes a provider prefix such as `anthropic/claude-sonnet-4-20250514`, `summaryProvider` is ignored for that choice. Otherwise, the provider falls back to the matching override, then `OPENCLAW_PROVIDER`, then the provider inferred by the caller.
+Runtime-managed OAuth providers are supported here too. In particular, `openai-codex` and `github-copilot` auth profiles can be used for summary and expansion calls without a separate API key.
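
The provider fallback described above (a provider prefix in `summaryModel` wins, otherwise the override, then `OPENCLAW_PROVIDER`, then the caller-inferred provider) can be sketched as follows. This is an illustration under stated assumptions, not the package's resolver: `resolveSummaryProvider` and `ProviderInputs` are hypothetical names, the prefix is assumed to be everything before the first `/`, and empty strings are assumed to count as unset.

```typescript
// Hypothetical inputs; field names are illustrative, not the package's types.
interface ProviderInputs {
  summaryModel: string;      // e.g. "openai/gpt-5.4-mini" or a bare model name
  summaryProvider?: string;  // explicit provider override
  openclawProvider?: string; // value of OPENCLAW_PROVIDER
  callerProvider?: string;   // provider inferred by the caller
}

interface ResolvedSummaryTarget {
  provider: string | undefined;
  model: string;
}

function resolveSummaryProvider(input: ProviderInputs): ResolvedSummaryTarget {
  // Assumption: a provider prefix is the text before the first "/".
  const slash = input.summaryModel.indexOf("/");
  if (slash > 0) {
    // A prefixed model ref wins; the summaryProvider override is ignored.
    return {
      provider: input.summaryModel.slice(0, slash),
      model: input.summaryModel.slice(slash + 1),
    };
  }
  // Otherwise take the first non-empty candidate in the documented order.
  const provider = [input.summaryProvider, input.openclawProvider, input.callerProvider]
    .find((candidate) => candidate !== undefined && candidate.trim() !== "");
  return { provider, model: input.summaryModel };
}

// Usage: the prefix overrides an explicit summaryProvider.
console.log(resolveSummaryProvider({
  summaryModel: "anthropic/claude-sonnet-4-20250514",
  summaryProvider: "openai",
}));
```
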
 ### Recommended starting configuration
 ```
@@ -222,6 +240,8 @@ LCM_FRESH_TAIL_COUNT=64
 LCM_LEAF_CHUNK_TOKENS=20000
 LCM_INCREMENTAL_MAX_DEPTH=1
 LCM_CONTEXT_THRESHOLD=0.75
+LCM_SUMMARY_MODEL=openai/gpt-5.4-mini
+LCM_EXPANSION_MODEL=openai/gpt-5.4-mini
 ```
 - **freshTailCount=64** protects the last 64 messages from compaction, giving the model more recent context for continuity.

package/docs/agent-tools.md CHANGED Viewed

@@ -24,7 +24,7 @@ Summaries are lossy by design. The "Expand for details about:" footer at the end
 - Tool call sequences and their outputs
 - Verbatim quotes or specific data points
-`lcm_expand_query` is bounded (~120s, scoped sub-agent) and relatively cheap. Don't ration it.
+`lcm_expand_query` is bounded (~120s, scoped sub-agent) and relatively cheap. Don't ration it, but use `lcm_grep` first when you need broad discovery across many sessions.
 ## Tool reference
@@ -32,6 +32,8 @@ Summaries are lossy by design. The "Expand for details about:" footer at the end
 Search across messages and/or summaries using regex or full-text search.
+Use `mode: "full_text"` for keyword or topical recall. Wrap exact multi-word phrases in quotes to preserve phrase matching. Keep the default `sort: "recency"` for recent events, switch to `sort: "relevance"` when looking for the best older match on a topic, and use `sort: "hybrid"` when you want relevance without giving up recency entirely.
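
To make the difference between the three `sort` values concrete, here is a small sketch of how such ranking could behave. The diff does not show the scoring used in `full-text-sort.ts`, so everything here is an assumption for illustration: `sortMatches` is a hypothetical name, `relevance` is assumed to be a score where higher is better, and the hybrid blend (relevance scaled by a half-life recency weight) is just one plausible choice.

```typescript
type SortMode = "recency" | "relevance" | "hybrid";

// Hypothetical match shape; `relevance` is assumed to be "higher is better".
interface Match {
  id: string;
  createdAt: number; // epoch milliseconds
  relevance: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

function sortMatches(
  matches: Match[],
  mode: SortMode,
  now: number,
  halfLifeMs: number = 7 * DAY_MS, // assumed decay half-life, not a real setting
): Match[] {
  // Recency weight halves every `halfLifeMs`: 1 for brand-new, toward 0 for old.
  const recencyWeight = (m: Match): number =>
    Math.pow(0.5, Math.max(0, now - m.createdAt) / halfLifeMs);

  const score = (m: Match): number => {
    switch (mode) {
      case "recency":
        return m.createdAt;
      case "relevance":
        return m.relevance;
      case "hybrid":
        // Relevance dominates, but old matches keep at most half their score.
        return m.relevance * (0.5 + 0.5 * recencyWeight(m));
    }
  };

  // Copy before sorting so the caller's array is left untouched.
  return [...matches].sort((a, b) => score(b) - score(a));
}

// Usage: the same three matches ordered under each mode.
const sampleNow = 100 * DAY_MS;
const sampleMatches: Match[] = [
  { id: "old", createdAt: 0, relevance: 0.9 },
  { id: "mid", createdAt: sampleNow - 7 * DAY_MS, relevance: 0.7 },
  { id: "new", createdAt: sampleNow, relevance: 0.5 },
];
for (const mode of ["recency", "relevance", "hybrid"] as SortMode[]) {
  console.log(mode, sortMatches(sampleMatches, mode, sampleNow).map((m) => m.id));
}
```
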
 **Parameters:**
 | Param | Type | Required | Default | Description |
@@ -44,6 +46,7 @@ Search across messages and/or summaries using regex or full-text search.
 | `since` | string | | — | ISO timestamp lower bound |
 | `before` | string | | — | ISO timestamp upper bound |
 | `limit` | number | | 50 | Max results (1–200) |
+| `sort` | string | | `"recency"` | `"recency"`, `"relevance"`, or `"hybrid"` for full-text ranking |
 **Returns:** Array of matches with:
 - `id` — Message or summary ID
@@ -59,6 +62,9 @@ Search across messages and/or summaries using regex or full-text search.
 # Full-text search across all conversations
 lcm_grep(pattern: "database migration", mode: "full_text", allConversations: true)
+# Older-topic recall ranked by FTS relevance
+lcm_grep(pattern: "\"error handling\" retries", mode: "full_text", sort: "relevance")
 # Regex search in summaries only
 lcm_grep(pattern: "config\\.threshold.*0\\.[0-9]+", scope: "summaries")
@@ -108,6 +114,8 @@ lcm_describe(id: "file_789abc012345")
 Answer a focused question by expanding summaries through the DAG. Spawns a bounded sub-agent that walks parent links down to source material and returns a compact answer.
+When `allConversations: true` is set, `lcm_expand_query` can now synthesize one answer across multiple conversations. That cross-conversation mode is bounded, not exhaustive: it ranks conversation buckets, expands only the top few, and marks the result truncated when lower-ranked buckets are skipped or fail.
 **Parameters:**
 | Param | Type | Required | Default | Description |
@@ -124,9 +132,11 @@ Answer a focused question by expanding summaries through the DAG. Spawns a bound
 **Returns:**
 - `answer` — The focused answer text
 - `citedIds` — Summary IDs that contributed to the answer
+- `sourceConversationIds` — Conversations that were successfully expanded
 - `expandedSummaryCount` — How many summaries were expanded
 - `totalSourceTokens` — Total tokens read from the DAG
 - `truncated` — Whether the answer was truncated to fit maxTokens
+- `conversationBreakdown` — Optional per-conversation success/failure diagnostics for bounded multi-conversation runs
 **Examples:**
@@ -143,7 +153,7 @@ lcm_expand_query(
   prompt: "What were the exact file changes?"
 )
-# Cross-conversation search
+# Cross-conversation synthesis
 lcm_expand_query(
   query: "deployment procedure",
   prompt: "What's the current deployment process?",
@@ -167,9 +177,9 @@ Add instructions to your agent's system prompt so it knows when to use LCM tools
 ## Memory & Context
 Use LCM tools for recall:
-1. `lcm_grep` — Search all conversations by keyword/regex
+1. `lcm_grep` — Search all conversations by keyword/regex. Prefer `mode: "full_text"` for topic recall, quote exact phrases, use `sort: "relevance"` for older-topic lookups, and `sort: "hybrid"` when recency should still matter.
 2. `lcm_describe` — Inspect a specific summary (cheap, no sub-agent)
-3. `lcm_expand_query` — Deep recall with sub-agent expansion
+3. `lcm_expand_query` — Deep recall with bounded sub-agent expansion
 When summaries in context have an "Expand for details about:" footer
 listing something you need, use `lcm_expand_query` to get the full detail.
@@ -177,7 +187,7 @@ listing something you need, use `lcm_expand_query` to get the full detail.
 ### Conversation scoping
-By default, tools operate on the current conversation. Use `allConversations: true` to search across all of them (all agents, all sessions). Use `conversationId` to target a specific conversation you already know about (from previous grep results).
+By default, tools operate on the current conversation. Use `lcm_grep(..., allConversations: true)` when you need broad global discovery. Use `lcm_expand_query(..., allConversations: true)` when you want bounded synthesis across sessions. Use `conversationId` when you already know the exact conversation to inspect or expand.
 ### Performance considerations
@@ -185,3 +195,4 @@ By default, tools operate on the current conversation. Use `allConversations: tr
 - `lcm_expand_query` spawns a sub-agent and takes ~30–120 seconds
 - The sub-agent has a 120-second timeout with cleanup guarantees
 - Token caps (`LCM_MAX_EXPAND_TOKENS`) prevent runaway expansion
+- Cross-conversation `lcm_expand_query` expands only a bounded set of top-ranked conversations
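
The bounded cross-conversation behavior described in this file (rank conversation buckets, expand only the top few, mark the result truncated when lower-ranked buckets are skipped or fail) can be sketched as below. This is an illustration, not the code in `lcm-expand-query-tool.ts`: `expandTopBuckets`, `maxBuckets`, the `score` field, and the synchronous `expand` callback are all hypothetical, and only the result field names `sourceConversationIds`, `truncated`, and `conversationBreakdown` come from the documented return shape.

```typescript
// Hypothetical bucket shape; `score` stands in for whatever ranking is used.
interface ConversationBucket {
  conversationId: string;
  score: number;
}

interface BucketOutcome {
  conversationId: string;
  status: "ok" | "failed" | "skipped";
}

interface BoundedResult {
  sourceConversationIds: string[];
  truncated: boolean;
  conversationBreakdown: BucketOutcome[];
}

function expandTopBuckets(
  buckets: ConversationBucket[],
  maxBuckets: number,
  expand: (conversationId: string) => boolean, // hypothetical: returns true on success
): BoundedResult {
  // Rank highest score first without mutating the caller's array.
  const ranked = [...buckets].sort((a, b) => b.score - a.score);

  const conversationBreakdown = ranked.map((bucket, index): BucketOutcome => {
    if (index >= maxBuckets) {
      // Lower-ranked buckets beyond the bound are never expanded.
      return { conversationId: bucket.conversationId, status: "skipped" };
    }
    let succeeded = false;
    try {
      succeeded = expand(bucket.conversationId);
    } catch {
      succeeded = false; // a thrown error counts as a failed bucket
    }
    return { conversationId: bucket.conversationId, status: succeeded ? "ok" : "failed" };
  });

  return {
    // Only successfully expanded conversations count as sources.
    sourceConversationIds: conversationBreakdown
      .filter((outcome) => outcome.status === "ok")
      .map((outcome) => outcome.conversationId),
    // Any skipped or failed bucket means the answer is not exhaustive.
    truncated: conversationBreakdown.some((outcome) => outcome.status !== "ok"),
    conversationBreakdown,
  };
}

// Usage: four buckets, a bound of two, and one simulated failure.
const demo = expandTopBuckets(
  [
    { conversationId: "a", score: 0.2 },
    { conversationId: "b", score: 0.9 },
    { conversationId: "c", score: 0.5 },
    { conversationId: "d", score: 0.1 },
  ],
  2,
  (conversationId) => conversationId !== "c",
);
console.log(demo);
```

A real implementation would presumably expand asynchronously through the sub-agent; the synchronous callback here only keeps the sketch short.
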