@martian-engineering/lossless-claw 0.6.3 → 0.8.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (38)
  1. package/README.md +26 -6
  2. package/docs/agent-tools.md +16 -5
  3. package/docs/configuration.md +223 -214
  4. package/openclaw.plugin.json +123 -0
  5. package/package.json +1 -1
  6. package/skills/lossless-claw/SKILL.md +3 -2
  7. package/skills/lossless-claw/references/architecture.md +12 -0
  8. package/skills/lossless-claw/references/config.md +135 -3
  9. package/skills/lossless-claw/references/diagnostics.md +13 -0
  10. package/src/assembler.ts +17 -5
  11. package/src/compaction.ts +161 -53
  12. package/src/db/config.ts +102 -4
  13. package/src/db/connection.ts +35 -7
  14. package/src/db/features.ts +24 -5
  15. package/src/db/migration.ts +257 -78
  16. package/src/engine.ts +1007 -110
  17. package/src/estimate-tokens.ts +80 -0
  18. package/src/lcm-log.ts +37 -0
  19. package/src/plugin/index.ts +493 -101
  20. package/src/plugin/lcm-command.ts +288 -7
  21. package/src/plugin/lcm-doctor-apply.ts +1 -3
  22. package/src/plugin/lcm-doctor-cleaners.ts +655 -0
  23. package/src/plugin/shared-init.ts +59 -0
  24. package/src/prune.ts +391 -0
  25. package/src/retrieval.ts +8 -9
  26. package/src/startup-banner-log.ts +1 -0
  27. package/src/store/compaction-telemetry-store.ts +156 -0
  28. package/src/store/conversation-store.ts +6 -1
  29. package/src/store/fts5-sanitize.ts +25 -4
  30. package/src/store/full-text-sort.ts +21 -0
  31. package/src/store/index.ts +8 -0
  32. package/src/store/summary-store.ts +21 -14
  33. package/src/summarize.ts +55 -34
  34. package/src/tools/lcm-describe-tool.ts +9 -4
  35. package/src/tools/lcm-expand-query-tool.ts +609 -200
  36. package/src/tools/lcm-expand-tool.ts +9 -4
  37. package/src/tools/lcm-grep-tool.ts +22 -8
  38. package/src/types.ts +1 -0
package/README.md CHANGED
@@ -30,11 +30,25 @@ Nothing is lost. Raw messages stay in the database. Summaries link back to their
30
30
 
31
31
  ## Commands And Skill
32
32
 
33
- The plugin now ships a bundled `lossless-claw` skill plus a small native command surface:
33
+ The plugin now ships a bundled `lossless-claw` skill plus a small plugin command surface for supported OpenClaw chat/native command providers:
34
34
 
35
35
  - `/lcm` shows version, enablement/selection state, DB path and size, summary counts, and summary-health status
36
36
  - `/lcm doctor` scans for broken or truncated summaries
37
- - `/lossless` is an alias for `/lcm` on native command surfaces
37
+ - `/lcm doctor clean` shows read-only high-confidence junk diagnostics for archived subagents, cron sessions, and NULL-key orphaned subagent runs
38
+ - `/lossless` is an alias for `/lcm` on supported native command surfaces
39
+
40
+ These are plugin slash/native commands, not root shell CLI subcommands. Supported examples:
41
+
42
+ - `/lcm`
43
+ - `/lcm doctor`
44
+ - `/lcm doctor clean`
45
+ - `/lossless`
46
+
47
+ Not currently supported as root CLI commands:
48
+
49
+ - `openclaw lcm`
50
+ - `openclaw lossless`
51
+ - `openclaw /lcm`
38
52
 
39
53
  The bundled skill focuses on configuration, diagnostics, architecture, and recall-tool usage. Its reference set lives under `skills/lossless-claw/references/`.
40
54
 
@@ -70,7 +84,7 @@ openclaw plugins install --link /path/to/lossless-claw
70
84
 
71
85
  The install command records the plugin, enables it, and applies compatible slot selection (including `contextEngine` when applicable).
72
86
 
73
- > **Note:** If your OpenClaw config uses `plugins.allow`, make sure both `lossless-claw` and any active plugins you rely on remain allowlisted. In some setups, narrowing the allowlist can prevent plugin-backed integrations from loading, even if `lossless-claw` itself is installed correctly. Restart the gateway after plugin config changes.
87
+ > **Note:** If your OpenClaw config uses `plugins.allow`, allowlist the plugin id `lossless-claw` plus any other active plugins you rely on. Do not add command tokens or aliases like `lossless` or `/lcm` to `plugins.allow`; that setting only accepts plugin ids. In some setups, narrowing the allowlist can prevent plugin-backed integrations from loading, even if `lossless-claw` itself is installed correctly. Restart the gateway after plugin config changes.
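Reviewer note: the rule in the new note (only plugin ids belong in `plugins.allow`, never command tokens or aliases) can be checked mechanically. The sketch below is a minimal illustration, not part of lossless-claw or OpenClaw. The helper name `checkPluginAllowlist`, the `AllowlistReport` shape, and the exact token list are assumptions chosen for this example; only `lossless-claw`, `lossless`, and `/lcm` come from the README text.

```typescript
// Hypothetical helper: not part of lossless-claw or OpenClaw.
// Command tokens and aliases named in the README; these are not plugin ids.
const COMMAND_TOKENS = new Set(["lcm", "lossless"]);

interface AllowlistReport {
  pluginIds: string[];      // entries that look like plugin ids
  suspectEntries: string[]; // entries that look like command tokens or aliases
}

function checkPluginAllowlist(allow: readonly string[]): AllowlistReport {
  const pluginIds: string[] = [];
  const suspectEntries: string[] = [];
  for (const entry of allow) {
    const trimmed = entry.trim();
    // Strip a leading slash so "/lcm" and "lcm" are treated the same way.
    const normalized = trimmed.replace(/^\//, "").toLowerCase();
    if (trimmed.startsWith("/") || COMMAND_TOKENS.has(normalized)) {
      suspectEntries.push(entry);
    } else {
      pluginIds.push(entry);
    }
  }
  return { pluginIds, suspectEntries };
}

const report = checkPluginAllowlist(["lossless-claw", "lossless", "/lcm"]);
console.log(report.suspectEntries);
```

A check like this could run before the gateway restart the note recommends, so that a mistyped allowlist entry is caught before plugins fail to load.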
74
88
 
75
89
  ### Configure OpenClaw
76
90
 
@@ -113,8 +127,8 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
113
127
  "ignoreSessionPatterns": [
114
128
  "agent:*:cron:**"
115
129
  ],
116
- "summaryModel": "anthropic/claude-haiku-4-5",
117
- "expansionModel": "anthropic/claude-haiku-4-5",
130
+ "summaryModel": "openai/gpt-5.4-mini",
131
+ "expansionModel": "openai/gpt-5.4-mini",
118
132
  "delegationTimeoutMs": 300000,
119
133
  "summaryTimeoutMs": 60000
120
134
  }
@@ -152,7 +166,7 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
152
166
  | `LCM_SUMMARY_MODEL` | `""` | Model override for compaction summarization; falls back to OpenClaw's default model when unset |
153
167
  | `LCM_SUMMARY_PROVIDER` | `""` | Provider override for compaction summarization; falls back to `OPENCLAW_PROVIDER` or the provider embedded in the model ref |
154
168
  | `LCM_SUMMARY_BASE_URL` | *(from OpenClaw / provider default)* | Base URL override for summarization API calls |
155
- | `LCM_EXPANSION_MODEL` | *(from OpenClaw)* | Model override for `lcm_expand_query` sub-agent (e.g. `anthropic/claude-haiku-4-5`) |
169
+ | `LCM_EXPANSION_MODEL` | *(from OpenClaw)* | Model override for `lcm_expand_query` sub-agent (e.g. `openai/gpt-5.4-mini`) |
156
170
  | `LCM_EXPANSION_PROVIDER` | *(from OpenClaw)* | Provider override for `lcm_expand_query` sub-agent |
157
171
  | `LCM_DELEGATION_TIMEOUT_MS` | `120000` | Max time to wait for delegated `lcm_expand_query` sub-agent completion |
158
172
  | `LCM_SUMMARY_TIMEOUT_MS` | `60000` | Max time to wait for a single model-backed LCM summarizer call |
@@ -162,6 +176,8 @@ Add a `lossless-claw` entry under `plugins.entries` in your OpenClaw config:
162
176
 
163
177
  If you want `lcm_expand_query` to run on a dedicated model via `expansionModel` or `LCM_EXPANSION_MODEL`, OpenClaw must explicitly trust the plugin to request sub-agent model overrides.
164
178
 
179
+ For most setups, `openai/gpt-5.4-mini` is a better starting point than Anthropic Haiku because it is cheap, fast, and does not depend on Anthropic quota remaining.
180
+
165
181
  Add a `subagent` policy under `plugins.entries.lossless-claw` and allowlist the canonical `provider/model` target you want the plugin to use:
166
182
 
167
183
  ```json
@@ -215,6 +231,8 @@ For compaction summarization, lossless-claw resolves the model in this order:
215
231
 
216
232
  If `summaryModel` already includes a provider prefix such as `anthropic/claude-sonnet-4-20250514`, `summaryProvider` is ignored for that choice. Otherwise, the provider falls back to the matching override, then `OPENCLAW_PROVIDER`, then the provider inferred by the caller.
217
233
 
234
+ Runtime-managed OAuth providers are supported here too. In particular, `openai-codex` and `github-copilot` auth profiles can be used for summary and expansion calls without a separate API key.
235
+
218
236
  ### Recommended starting configuration
219
237
 
220
238
  ```
@@ -222,6 +240,8 @@ LCM_FRESH_TAIL_COUNT=64
222
240
  LCM_LEAF_CHUNK_TOKENS=20000
223
241
  LCM_INCREMENTAL_MAX_DEPTH=1
224
242
  LCM_CONTEXT_THRESHOLD=0.75
243
+ LCM_SUMMARY_MODEL=openai/gpt-5.4-mini
244
+ LCM_EXPANSION_MODEL=openai/gpt-5.4-mini
225
245
  ```
226
246
 
227
247
  - **freshTailCount=64** protects the last 64 messages from compaction, giving the model more recent context for continuity.
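Reviewer note: the provider fallback described in the README hunk above (a provider prefix on `summaryModel` wins, then the matching override, then `OPENCLAW_PROVIDER`, then the provider inferred by the caller) can be sketched as follows. This is an illustration under stated assumptions, not the code in `package/src/summarize.ts`: the names `ProviderInputs` and `resolveProvider` are hypothetical, empty strings are treated as unset because the documented default for `LCM_SUMMARY_PROVIDER` is `""`, and the text before the first `/` is assumed to be the provider.

```typescript
// Hypothetical sketch of the documented provider fallback order.
// None of these names come from the lossless-claw source.
interface ProviderInputs {
  model: string;             // e.g. "openai/gpt-5.4-mini" or a bare model name
  providerOverride?: string; // e.g. summaryProvider / LCM_SUMMARY_PROVIDER
  openclawProvider?: string; // e.g. the OPENCLAW_PROVIDER value
  callerProvider?: string;   // provider inferred by the caller
}

function resolveProvider(inputs: ProviderInputs): string | undefined {
  const slash = inputs.model.indexOf("/");
  if (slash > 0) {
    // A provider prefix embedded in the model ref wins; the override is ignored.
    return inputs.model.slice(0, slash);
  }
  // Otherwise take the first non-empty candidate in the documented order.
  const candidates = [
    inputs.providerOverride,
    inputs.openclawProvider,
    inputs.callerProvider,
  ];
  return candidates.find((c) => c !== undefined && c.trim() !== "");
}

console.log(
  resolveProvider({ model: "anthropic/claude-sonnet-4-20250514", providerOverride: "openai" }),
);
```

Returning `undefined` when nothing matches leaves the final decision to the caller, which is one way to model "falls back to OpenClaw's default" without guessing what that default is.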
package/docs/agent-tools.md CHANGED
@@ -24,7 +24,7 @@ Summaries are lossy by design. The "Expand for details about:" footer at the end
24
24
  - Tool call sequences and their outputs
25
25
  - Verbatim quotes or specific data points
26
26
 
27
- `lcm_expand_query` is bounded (~120s, scoped sub-agent) and relatively cheap. Don't ration it.
27
+ `lcm_expand_query` is bounded (~120s, scoped sub-agent) and relatively cheap. Don't ration it, but use `lcm_grep` first when you need broad discovery across many sessions.
28
28
 
29
29
  ## Tool reference
30
30
 
@@ -32,6 +32,8 @@ Summaries are lossy by design. The "Expand for details about:" footer at the end
32
32
 
33
33
  Search across messages and/or summaries using regex or full-text search.
34
34
 
35
+ Use `mode: "full_text"` for keyword or topical recall. Wrap exact multi-word phrases in quotes to preserve phrase matching. Keep the default `sort: "recency"` for recent events, switch to `sort: "relevance"` when looking for the best older match on a topic, and use `sort: "hybrid"` when you want relevance without giving up recency entirely.
36
+
35
37
  **Parameters:**
36
38
 
37
39
  | Param | Type | Required | Default | Description |
@@ -44,6 +46,7 @@ Search across messages and/or summaries using regex or full-text search.
44
46
  | `since` | string | | — | ISO timestamp lower bound |
45
47
  | `before` | string | | — | ISO timestamp upper bound |
46
48
  | `limit` | number | | 50 | Max results (1–200) |
49
+ | `sort` | string | | `"recency"` | `"recency"`, `"relevance"`, or `"hybrid"` for full-text ranking |
47
50
 
48
51
  **Returns:** Array of matches with:
49
52
  - `id` — Message or summary ID
@@ -59,6 +62,9 @@ Search across messages and/or summaries using regex or full-text search.
59
62
  # Full-text search across all conversations
60
63
  lcm_grep(pattern: "database migration", mode: "full_text", allConversations: true)
61
64
 
65
+ # Older-topic recall ranked by FTS relevance
66
+ lcm_grep(pattern: "\"error handling\" retries", mode: "full_text", sort: "relevance")
67
+
62
68
  # Regex search in summaries only
63
69
  lcm_grep(pattern: "config\\.threshold.*0\\.[0-9]+", scope: "summaries")
64
70
 
@@ -108,6 +114,8 @@ lcm_describe(id: "file_789abc012345")
108
114
 
109
115
  Answer a focused question by expanding summaries through the DAG. Spawns a bounded sub-agent that walks parent links down to source material and returns a compact answer.
110
116
 
117
+ When `allConversations: true` is set, `lcm_expand_query` can now synthesize one answer across multiple conversations. That cross-conversation mode is bounded, not exhaustive: it ranks conversation buckets, expands only the top few, and marks the result truncated when lower-ranked buckets are skipped or fail.
118
+
111
119
  **Parameters:**
112
120
 
113
121
  | Param | Type | Required | Default | Description |
@@ -124,9 +132,11 @@ Answer a focused question by expanding summaries through the DAG. Spawns a bound
124
132
  **Returns:**
125
133
  - `answer` — The focused answer text
126
134
  - `citedIds` — Summary IDs that contributed to the answer
135
+ - `sourceConversationIds` — Conversations that were successfully expanded
127
136
  - `expandedSummaryCount` — How many summaries were expanded
128
137
  - `totalSourceTokens` — Total tokens read from the DAG
129
138
  - `truncated` — Whether the answer was truncated to fit maxTokens
139
+ - `conversationBreakdown` — Optional per-conversation success/failure diagnostics for bounded multi-conversation runs
130
140
 
131
141
  **Examples:**
132
142
 
@@ -143,7 +153,7 @@ lcm_expand_query(
143
153
  prompt: "What were the exact file changes?"
144
154
  )
145
155
 
146
- # Cross-conversation search
156
+ # Cross-conversation synthesis
147
157
  lcm_expand_query(
148
158
  query: "deployment procedure",
149
159
  prompt: "What's the current deployment process?",
@@ -167,9 +177,9 @@ Add instructions to your agent's system prompt so it knows when to use LCM tools
167
177
  ## Memory & Context
168
178
 
169
179
  Use LCM tools for recall:
170
- 1. `lcm_grep` — Search all conversations by keyword/regex
180
+ 1. `lcm_grep` — Search all conversations by keyword/regex. Prefer `mode: "full_text"` for topic recall, quote exact phrases, use `sort: "relevance"` for older-topic lookups, and `sort: "hybrid"` when recency should still matter.
171
181
  2. `lcm_describe` — Inspect a specific summary (cheap, no sub-agent)
172
- 3. `lcm_expand_query` — Deep recall with sub-agent expansion
182
+ 3. `lcm_expand_query` — Deep recall with bounded sub-agent expansion
173
183
 
174
184
  When summaries in context have an "Expand for details about:" footer
175
185
  listing something you need, use `lcm_expand_query` to get the full detail.
@@ -177,7 +187,7 @@ listing something you need, use `lcm_expand_query` to get the full detail.
177
187
 
178
188
  ### Conversation scoping
179
189
 
180
- By default, tools operate on the current conversation. Use `allConversations: true` to search across all of them (all agents, all sessions). Use `conversationId` to target a specific conversation you already know about (from previous grep results).
190
+ By default, tools operate on the current conversation. Use `lcm_grep(..., allConversations: true)` when you need broad global discovery. Use `lcm_expand_query(..., allConversations: true)` when you want bounded synthesis across sessions. Use `conversationId` when you already know the exact conversation to inspect or expand.
181
191
 
182
192
  ### Performance considerations
183
193
 
@@ -185,3 +195,4 @@ By default, tools operate on the current conversation. Use `allConversations: tr
185
195
  - `lcm_expand_query` spawns a sub-agent and takes ~30–120 seconds
186
196
  - The sub-agent has a 120-second timeout with cleanup guarantees
187
197
  - Token caps (`LCM_MAX_EXPAND_TOKENS`) prevent runaway expansion
198
+ - Cross-conversation `lcm_expand_query` expands only a bounded set of top-ranked conversations
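Reviewer note: the bounded cross-conversation behaviour documented above (rank conversation buckets, expand only the top few, and mark the result truncated when lower-ranked buckets are skipped or fail) can be sketched as below. This is a minimal illustration, not the implementation in `package/src/tools/lcm-expand-query-tool.ts`. The function name, the bucket and outcome shapes, the `maxBuckets` default of 3, and the synchronous `expand` callback are all assumptions; the real tool delegates to a sub-agent and would be asynchronous. The field names `sourceConversationIds`, `conversationBreakdown`, and `truncated` mirror the documented return values.

```typescript
// Hypothetical sketch; names and the default bucket limit are assumptions.
interface ConversationBucket {
  conversationId: string;
  score: number; // higher means more relevant to the query
}

interface BucketOutcome {
  conversationId: string;
  ok: boolean;
  error?: string;
}

interface BoundedResult {
  answers: string[];
  sourceConversationIds: string[];
  conversationBreakdown: BucketOutcome[];
  truncated: boolean;
}

function expandAcrossConversations(
  buckets: readonly ConversationBucket[],
  expand: (conversationId: string) => string,
  maxBuckets = 3,
): BoundedResult {
  // Rank buckets by score, then keep only the top few.
  const ranked = [...buckets].sort((a, b) => b.score - a.score);
  const selected = ranked.slice(0, maxBuckets);
  const result: BoundedResult = {
    answers: [],
    sourceConversationIds: [],
    conversationBreakdown: [],
    // Skipping lower-ranked buckets already makes the result non-exhaustive.
    truncated: ranked.length > selected.length,
  };
  for (const { conversationId } of selected) {
    try {
      result.answers.push(expand(conversationId));
      result.sourceConversationIds.push(conversationId);
      result.conversationBreakdown.push({ conversationId, ok: true });
    } catch (err) {
      // A failed bucket is recorded and the result is marked truncated.
      result.truncated = true;
      result.conversationBreakdown.push({
        conversationId,
        ok: false,
        error: err instanceof Error ? err.message : String(err),
      });
    }
  }
  return result;
}
```

Recording failures per bucket instead of aborting the whole call keeps partial answers usable, which matches the documented choice to report `truncated` rather than fail outright.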