npm - @possumtech/rummy - Versions diffs - 2.1.0 → 2.2.1 - Mend

@possumtech/rummy 2.1.0 → 2.2.1

Files changed (140)

package/.env.example +40 -15
package/.xai.key +1 -0
package/PLUGINS.md +169 -53
package/README.md +38 -32
package/SPEC.md +366 -179
package/bin/digest.js +1097 -0
package/biome/no-fallbacks.grit +2 -2
package/gemini.key +1 -0
package/lang/en.json +10 -1
package/migrations/001_initial_schema.sql +9 -2
package/package.json +19 -8
package/service.js +1 -0
package/src/agent/AgentLoop.js +76 -26
package/src/agent/ContextAssembler.js +2 -0
package/src/agent/Entries.js +238 -60
package/src/agent/ProjectAgent.js +44 -0
package/src/agent/TurnExecutor.js +99 -30
package/src/agent/XmlParser.js +206 -111
package/src/agent/errors.js +35 -0
package/src/agent/known_queries.sql +1 -1
package/src/agent/known_store.sql +3 -42
package/src/agent/materializeContext.js +30 -1
package/src/agent/runs.sql +8 -18
package/src/agent/tokens.js +0 -1
package/src/agent/turns.sql +1 -0
package/src/hooks/Hooks.js +26 -0
package/src/hooks/RummyContext.js +12 -1
package/src/lib/hedberg/README.md +60 -0
package/src/lib/hedberg/hedberg.js +60 -0
package/src/lib/hedberg/marker.js +158 -0
package/src/{plugins → lib}/hedberg/matcher.js +1 -2
package/src/llm/LlmProvider.js +41 -3
package/src/llm/openaiStream.js +17 -0
package/src/plugins/ask_user/ask_user.js +12 -2
package/src/plugins/ask_user/ask_userDoc.md +1 -5
package/src/plugins/budget/README.md +29 -24
package/src/plugins/budget/budget.js +166 -110
package/src/plugins/cli/README.md +3 -4
package/src/plugins/cli/cli.js +31 -5
package/src/plugins/cloudflare/cloudflare.js +136 -0
package/src/plugins/cp/cp.js +41 -4
package/src/plugins/cp/cpDoc.md +5 -6
package/src/plugins/engine/engine.sql +1 -1
package/src/plugins/env/README.md +5 -4
package/src/plugins/env/env.js +7 -4
package/src/plugins/env/envDoc.md +7 -8
package/src/plugins/error/error.js +56 -15
package/src/plugins/file/README.md +12 -3
package/src/plugins/file/file.js +2 -2
package/src/plugins/get/get.js +59 -36
package/src/plugins/get/getDoc.md +10 -34
package/src/plugins/google/google.js +115 -0
package/src/plugins/hedberg/hedberg.js +13 -56
package/src/plugins/helpers.js +66 -12
package/src/plugins/index.js +1 -2
package/src/plugins/instructions/README.md +44 -47
package/src/plugins/instructions/instructions-system.md +44 -0
package/src/plugins/instructions/instructions-user.md +53 -0
package/src/plugins/instructions/instructions.js +58 -189
package/src/plugins/known/README.md +6 -7
package/src/plugins/known/known.js +24 -30
package/src/plugins/log/log.js +41 -32
package/src/plugins/mv/mv.js +40 -1
package/src/plugins/mv/mvDoc.md +1 -8
package/src/plugins/ollama/ollama.js +4 -3
package/src/plugins/openai/openai.js +4 -3
package/src/plugins/openrouter/openrouter.js +14 -4
package/src/plugins/persona/README.md +11 -13
package/src/plugins/persona/default.md +29 -0
package/src/plugins/persona/persona.js +10 -66
package/src/plugins/policy/policy.js +23 -22
package/src/plugins/prompt/README.md +37 -27
package/src/plugins/prompt/prompt.js +13 -19
package/src/plugins/rm/rm.js +18 -0
package/src/plugins/rm/rmDoc.md +5 -6
package/src/plugins/rpc/rpc.js +3 -3
package/src/plugins/set/set.js +205 -323
package/src/plugins/set/setDoc.md +47 -17
package/src/plugins/sh/README.md +6 -5
package/src/plugins/sh/sh.js +8 -5
package/src/plugins/sh/shDoc.md +7 -8
package/src/plugins/skill/README.md +37 -14
package/src/plugins/skill/skill.js +200 -101
package/src/plugins/skill/skillDoc.js +3 -0
package/src/plugins/skill/skillDoc.md +9 -0
package/src/plugins/stream/README.md +7 -6
package/src/plugins/stream/finalize.js +100 -0
package/src/plugins/stream/stream.js +13 -45
package/src/plugins/telemetry/telemetry.js +27 -4
package/src/plugins/think/think.js +2 -3
package/src/plugins/think/thinkDoc.md +2 -4
package/src/plugins/unknown/README.md +1 -1
package/src/plugins/unknown/unknown.js +17 -19
package/src/plugins/update/update.js +4 -51
package/src/plugins/update/updateDoc.md +21 -6
package/src/plugins/xai/xai.js +68 -102
package/src/plugins/yolo/yolo.js +102 -75
package/src/sql/functions/hedmatch.js +1 -1
package/src/sql/functions/hedreplace.js +1 -1
package/src/sql/functions/hedsearch.js +1 -1
package/src/sql/functions/slugify.js +16 -2
package/BENCH_ENVIRONMENT.md +0 -230
package/CLIENT_INTERFACE.md +0 -396
package/last_run.txt +0 -5617
package/scriptify/ask_run.js +0 -77
package/scriptify/cache_probe.js +0 -66
package/scriptify/cache_probe_grok.js +0 -74
package/src/agent/budget.js +0 -33
package/src/agent/config.js +0 -38
package/src/plugins/hedberg/README.md +0 -71
package/src/plugins/hedberg/docs.md +0 -0
package/src/plugins/hedberg/edits.js +0 -55
package/src/plugins/hedberg/normalize.js +0 -17
package/src/plugins/hedberg/sed.js +0 -49
package/src/plugins/instructions/instructions.md +0 -34
package/src/plugins/instructions/instructions_104.md +0 -8
package/src/plugins/instructions/instructions_105.md +0 -39
package/src/plugins/instructions/instructions_106.md +0 -22
package/src/plugins/instructions/instructions_107.md +0 -17
package/src/plugins/instructions/instructions_108.md +0 -0
package/src/plugins/known/knownDoc.js +0 -3
package/src/plugins/known/knownDoc.md +0 -8
package/src/plugins/unknown/unknownDoc.js +0 -3
package/src/plugins/unknown/unknownDoc.md +0 -11
package/turns/cli_1777462658211/turn_001.txt +0 -772
package/turns/cli_1777462658211/turn_002.txt +0 -606
package/turns/cli_1777462658211/turn_003.txt +0 -667
package/turns/cli_1777462658211/turn_004.txt +0 -297
package/turns/cli_1777462658211/turn_005.txt +0 -301
package/turns/cli_1777462658211/turn_006.txt +0 -262
package/turns/cli_1777465095132/turn_001.txt +0 -715
package/turns/cli_1777465095132/turn_002.txt +0 -236
package/turns/cli_1777465095132/turn_003.txt +0 -287
package/turns/cli_1777465095132/turn_004.txt +0 -694
package/turns/cli_1777465095132/turn_005.txt +0 -422
package/turns/cli_1777465095132/turn_006.txt +0 -365
package/turns/cli_1777465095132/turn_007.txt +0 -885
package/turns/cli_1777465095132/turn_008.txt +0 -1277
package/turns/cli_1777465095132/turn_009.txt +0 -736
package/src/{plugins → lib}/hedberg/patterns.js +0 -0

package/.env.example CHANGED

@@ -17,16 +17,25 @@ RUMMY_MMAP_MB=0
 # Agent Loop Limits — per-loop cap (turns within a single loop).
 # No per-run cap; a run can comprise many loops.
-RUMMY_MAX_LOOP_TURNS=99
+RUMMY_MAX_LOOP_TURNS=999
 # Hard cap on commands per turn — high by design. The real cost
 # ceiling is the Token Budget; per-tool rate limits (e.g.
 # RUMMY_WEB_SEARCH_MAX) bound the expensive tools individually.
 RUMMY_MAX_COMMANDS=99
 # Per-turn cap on <search>. Refusals strike via 429.
 RUMMY_WEB_SEARCH_MAX=1
+# Default candidate count per <search>. Brave caps at 20; the model can
+# still narrow per-call via <search results="N">.
+RUMMY_WEB_SEARCH_RESULTS=20
 RUMMY_MAX_STRIKES=3
 RUMMY_MIN_CYCLES=3
 RUMMY_MAX_CYCLE_PERIOD=4
+# Free turns in an FCRM admin stage (Decomposition or Demotion) before
+# the stagnation strike fires. Turn N+1 in the same admin stage emits a
+# "N+1 turns in current stage" reminder and contributes to the strike
+# streak (3 strikes → MAX_STRIKES → abandon). Distillation and Deployment
+# are exempt — those phases can grind for many turns on hard tasks.
+RUMMY_STAGNATION_FREE_TURNS=3
 # Hygiene
 # Days to keep completed/aborted runs before purging
@@ -35,16 +44,25 @@ RUMMY_RETENTION_DAYS=31
 # Timeouts (ms)
 RUMMY_RPC_TIMEOUT=30000
 RUMMY_FETCH_TIMEOUT=300000
+RUMMY_WEB_FETCH_TIMEOUT=300000
 # Test harness — how long AuditClient waits for a single ask/act to reach
 # terminal status. Sized for full-context ingest on large-window models.
 RUMMY_TEST_RUN_TIMEOUT=3600000
-# rummy-cli watchdog — wall-clock budget for a one-shot CLI invocation.
-# Overridable per invocation via --RUMMY_RUN_TIMEOUT=<ms>.
-RUMMY_RUN_TIMEOUT=3600000
+# rummy-cli watchdog — wall-clock budget for a single loop (one ask/act
+# CLI invocation). Overridable per invocation via --RUMMY_LOOP_TIMEOUT=<ms>.
+RUMMY_LOOP_TIMEOUT=86400000
 # Plugin module load watchdog.
 RUMMY_PLUGINS_LOAD_TIMEOUT=10000
+# Per-entry storage cap (bytes). Generous by design — rummy is a
+# memory-resident workspace, not a chat buffer — but bounded so a
+# pathological capture (e.g. 100 MB of vim escape codes from a single
+# <sh>) becomes a healthy 413 strike instead of an unbounded write.
+# Enforced at the SQLite layer (entries.body CHECK) and surfaced to
+# the model as an error.log entry the strike system can act on.
+RUMMY_ENTRY_SIZE_MAX=104857600
 # LLM retry policy: time-bounded exponential backoff with full jitter.
 # DEADLINE is total wall-clock budget for an LLM call across all retries.
 # MAX_BACKOFF caps each inter-attempt sleep so a long deadline doesn't
@@ -55,9 +73,20 @@ RUMMY_LLM_MAX_BACKOFF=30000
 # Debug
 # RUMMY_DEBUG=true
-# Think tag: 1 = model uses <think> tags for reasoning (default)
-# 0 = disabled, model reasons via API reasoning_content field only
-RUMMY_THINK=1
+# Reasoning request flag forwarded to the LLM provider's `think`
+# parameter (gemma/llama.cpp: think=false skips server-side reasoning;
+# ollama: same). 1 asks the provider to reason and surface reasoning
+# content to rummy.
+#
+# Default 0 because forced reasoning on weak models (e.g. local gemma)
+# burns the n_ctx ceiling on `<think>` blocks and triggers reasoning-
+# runaway strikes. Opt in deliberately, per model.
+#
+# OpenRouter is intentionally orthogonal: rummy.web's openrouter plugin
+# always sends `include_reasoning: true` (relay-level always-on
+# telemetry — we pay the upstream cost regardless, so we keep the
+# reasoning bytes). RUMMY_THINK does not apply there.
+RUMMY_THINK=0
 # Budget
 # Fraction of context window used as ceiling. 0.9 = 90%, 10% reserved as headroom.
@@ -72,7 +101,7 @@ RUMMY_TOKEN_DIVISOR=2
 # Model Behavior
 # LLM temperature (0 = deterministic, 0.7 = creative). Client can override per-request.
-RUMMY_TEMPERATURE=0.5
+RUMMY_TEMPERATURE=0.1
 # Run Attribute Defaults
 # Per-run attributes (passed in the run-creation set call) trump these.
@@ -97,7 +126,7 @@ RUMMY_X_TITLE=RUMMY
 # OPENAI_BASE_URL="http://127.0.0.1:11434"
 # OPENAI_API_KEY=
-# XAI_BASE_URL="https://api.x.ai/v1/responses"
+# XAI_BASE_URL="https://api.x.ai/v1"
 # XAI_API_KEY=""
 # Model Aliases (Optional)
@@ -108,7 +137,7 @@ RUMMY_X_TITLE=RUMMY
 # RUMMY_MODEL_grok="xai/grok-4-1-fast-reasoning-latest"
 # RUMMY_MODEL_opus="openrouter/anthropic/claude-opus-4.6"
 # RUMMY_MODEL_gpro="openrouter/google/gemini-3.1-pro-preview"
-# RUMMY_MODEL_gemma="openai/gemma-4-26B-A4B-it-UD-Q3_K_XL.gguf"
+# RUMMY_MODEL_gemma="openai/macher.gguf"
 # RUMMY_MODEL_qwen="ollama/qwen:7b"
 # Necessary for automated testing
@@ -116,11 +145,7 @@ RUMMY_X_TITLE=RUMMY
 # Web Search
-# RUMMY_SEARCH="searxng"
-# RUMMY_SEARXNG_URL="http://127.0.0.1:8888"
-# RUMMY_SEARCH="brave"
-# BRAVE_API_KEY=""
+# RUMMY_WEB_SEARXNG_URL="http://127.0.0.1:8888"
 # External plugins: npm i -g <package>, then uncomment
 # RUMMY_PLUGIN_WEB="@possumtech/rummy.web"
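
Reviewer note on the retry-policy comment in the `.env.example` diff above ("time-bounded exponential backoff with full jitter"): the sketch below illustrates that general technique and is not rummy's implementation. The function names, the `baseMs` value, and the `deadlineMs` default are assumptions made for illustration. Only the 30000 ms backoff cap mirrors a value visible in the diff (`RUMMY_LLM_MAX_BACKOFF=30000`).

```javascript
// Sketch only: every name here is hypothetical, not taken from the rummy source.

// Full jitter: pick a uniform random sleep in [0, ceiling), where the ceiling
// doubles per attempt and is capped at maxBackoffMs.
function fullJitterDelay(attempt, baseMs, maxBackoffMs, random = Math.random) {
  const ceiling = Math.min(maxBackoffMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}

// Retry fn until it succeeds or the total wall-clock deadline is spent.
async function retryWithDeadline(fn, opts = {}) {
  const {
    deadlineMs = 60000,   // assumed total budget across all attempts
    maxBackoffMs = 30000, // mirrors RUMMY_LLM_MAX_BACKOFF=30000 in the diff
    baseMs = 500,         // assumed ceiling for the first retry sleep
    now = Date.now,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
    random = Math.random,
  } = opts;
  const start = now();
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn(attempt);
    } catch (err) {
      const remaining = deadlineMs - (now() - start);
      // Deadline exhausted: surface the last error instead of sleeping again.
      if (remaining <= 0) throw err;
      // Never sleep past the deadline, however large the jittered delay is.
      const delay = Math.min(
        fullJitterDelay(attempt, baseMs, maxBackoffMs, random),
        remaining,
      );
      await sleep(delay);
    }
  }
}
```

The cap matters because the exponential ceiling grows quickly: with a 500 ms base, attempt 6 already reaches 32000 ms, so without `maxBackoffMs` a long deadline would be spent mostly in a few very long sleeps.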

package/.xai.key ADDED

@@ -0,0 +1 @@
+export XAI_API_KEY="xai-<redacted>"

package/PLUGINS.md CHANGED

@@ -182,7 +182,7 @@ is a superset of what's below.
 | Event | Payload | Purpose |
 |-------|---------|---------|
 | `"handler"` | `(entry, rummy)` | Tool handler — called when model/client invokes this tool |
-| `"visible"` | `(entry)` | Visible-visibility projection — body shown in `<knowns>` / `<performed>` |
+| `"visible"` | `(entry)` | Visible-visibility projection — body shown in `<visible>` (data) or `<log>` (logging) |
 | `"summarized"` | `(entry)` | Summarized-visibility projection — path + summary only (body hidden) |
 | `"turn.started"` | `({rummy, mode, prompt, loopIteration, isContinuation})` | Turn beginning — plugins write prompt/instructions entries |
 | `"turn.response"` | `({rummy, turn, result, responseMessage, content, commands, ...})` | LLM responded — write audit entries, commit usage |
@@ -253,7 +253,7 @@ the message. Current `assembly.user` registrations:
 | Priority | Block | Plugin | Mutates per turn? |
 |---|---|---|---|
-| 50 | `<summarized>` | `known.js` | Slow — only on new entry |
+| 50 | `<summary>` | `known.js` | Slow — only on new entry |
 | 75 | `<visible>` | `known.js` | Fast — on every promote/demote |
 | 100 | `<log>` | `log.js` | Always — appends per action |
 | 200 | `<unknowns>` | `unknown.js` | On unknown lifecycle |
@@ -266,7 +266,7 @@ and predictable rendering position):
 | Range | Position | Use for |
 |---|---|---|
-| `0–49` | Top of user | Reserved (stable identity-tier blocks above `<summarized>`) |
+| `0–49` | Top of user | Reserved (stable identity-tier blocks above `<summary>`) |
 | `50–99` | Codebase data surface | Don't add here — owned by `known.js` |
 | `100–149` | History tier | Action history, timeline-style content |
 | `150–199` | Open slot | Inter-history blocks (e.g. recent-decisions, tracked progress) |
@@ -378,10 +378,10 @@ full contract and the rationale.
 Returns the string the model sees for this tool's entries at the
 given visibility. Every tool MUST register `full`. `summary` is
-optional — if unregistered, falls back to `attributes.summary`
+optional — if unregistered, falls back to `attributes.tags`
 (model-authored keyword description) or empty string.
-At summary visibility, `attributes.summary` is prepended above the
+At summary visibility, `attributes.tags` is prepended above the
 plugin's summary output automatically by ToolRegistry.view().
 ## Two Objects {#plugins_two_objects}
@@ -424,6 +424,23 @@ instead.
 | `rummy.getEntry(path)` | First matching entry or null |
 | `rummy.getEntries(pattern, bodyFilter?)` | Array of matching entries |
 | `rummy.setAttributes(path, attrs)` | Merge attributes via json_patch |
+| `rummy.entries.logPath(runId, turn, action, target)` | Build a `log://turn_N/<action>/<slug>` path, slugified + collision-safe |
+| `rummy.entries.slugPath(runId, scheme, content, summary?)` | Build a `<scheme>://<slug>` path, slugified + collision-safe |
+#### Path conventions {#plugins_path_conventions}
+Entry paths are bounded by a hard `length(path) <= 2048` DB
+CHECK constraint. In normal use, paths stay well under ~100 chars
+because plugins build them via `logPath` / `slugPath`, which run the
+target through `slugify` (80-char cap, `/` preserved as separator,
+URL-encoded per segment) and append an integer tie-breaker on
+collision (e.g. `log://turn_3/set/src/app.js_2`).
+Plugin authors should pass any model-supplied target straight
+through these helpers instead of stitching paths from the model's
+raw input. The helpers absorb arbitrary target length and exotic
+character composition without the caller having to defend against
+either. The 2048 limit is the outer wall, not the working budget.
 ### Properties {#plugins_rummy_properties}
@@ -460,23 +477,44 @@ handles all exclusions:
 ## Hedberg {#plugins_hedberg}
-The hedberg plugin exposes pattern matching and interpretation
-utilities on `core.hooks.hedberg` for all plugins to use:
+Hedberg has two faces. The implementation is a **library** at
+`src/lib/hedberg/` — pattern matching, fuzzy literal replacement,
+unified-diff generation. Internal plugins import these utilities
+directly:
 ```js
-const { match, search, replace, parseSed, parseEdits,
-    generatePatch } = core.hooks.hedberg;
+import { hedmatch, hedsearch } from "../../lib/hedberg/patterns.js";
+import Hedberg, { generatePatch } from "../../lib/hedberg/hedberg.js";
+```
+A thin **plugin shim** at `src/plugins/hedberg/` re-exposes the same
+surface on `core.hooks.hedberg` for external plugins shipped in
+separate packages (`rummy.repo`, `rummy.web`, etc.) that can't reach
+into rummy/main's internals via direct import.
+```js
+const { match, search, replace, generatePatch } = core.hooks.hedberg;
 ```
 | Method | Purpose |
 |--------|---------|
 | `match(pattern, string)` | Full-string pattern match (glob, regex, literal) |
 | `search(pattern, string)` | Substring search |
-| `replace(body, search, replacement, opts?)` | Apply replacement |
-| `parseSed(input)` | Parse sed syntax (any delimiter) |
-| `parseEdits(content)` | Detect edit format (merge conflict, udiff, sed) |
+| `replace(body, search, replacement)` | Fuzzy literal replacement (whitespace-tolerant) |
 | `generatePatch(path, old, new)` | Generate unified diff |
+Edit-shape parsing for `<set>` bodies (the `<<:::IDENT...:::IDENT`
+marker family — see SPEC.md "Edit Syntax") lives in
+`src/lib/hedberg/marker.js` and is invoked by the XmlParser at
+`<set>` resolution time. It's not on `core.hooks.hedberg` because no
+external plugin needs to re-parse model output.
+**The split is intentional.** `src/lib/` is for stateless utility
+modules anyone in the project can import. `src/plugins/` is for
+contracts exposed via the hook system. Hedberg is one of the few
+modules that has both shapes — same code, two access paths, one for
+internal consumers and one for cross-package consumers.
 ## Events & Filters {#plugins_events_overview}
 **Events** are fire-and-forget. All handlers run. Return values ignored.
@@ -501,6 +539,7 @@ All hooks are async.
 | `run.config` | filter | Before run config applied |
 | `run.progress` | event | Transient turn activity (`thinking` / `processing` / `retrying`) |
 | `run.state` | event | Turn conclusion, per-command incremental, or terminal run close — full state snapshot (status, history, unknowns, telemetry) |
+| `turn.verdict` | filter | Post-turn decision: continue / abandon / strike. Filter chain — multiple plugins (strike streak, cycle detect, stagnation today; future voters can join) each transform a verdict object. Initial value `{ continue: true }`; final value drives the loop's continue/abandon decision. |
 | `run.step.completed` | event | Turn verdict resolved (post-healer, pre-close) |
 | `loop.completed` | event | Loop exit — fires from `finally`, guaranteed on every exit path |
 | `ask.completed` | event | Ask-mode run finished |
@@ -510,32 +549,92 @@ All hooks are async.
 ### Turn Pipeline {#plugins_turn_pipeline}
-Hooks fire in this order every turn:
+Hooks fire in this order every turn. Type column legend:
+**event** = fire-and-forget, all handlers run, no return value;
+**filter** = chain transform, ordered by priority, return value carries forward;
+**call** = direct named-method invocation on a specific plugin.
+Exceptions for `call`-shaped hooks are documented under
+[Architectural exceptions](#plugins_architectural_exceptions).
 | # | Hook | Type | When |
 |---|------|------|------|
 | 1 | `turn.started` | event | Plugins write prompt/instructions entries |
-| 2 | `context.materialized` | event | turn_context populated from v_model_context |
-| 3 | `assembly.system` | filter | Build system message from entries |
-| 4 | `assembly.user` | filter | Build user message (prompt plugin adds `<prompt tokensFree tokenUsage>`) |
-| 5 | `budget.enforce` | call | Measure assembled tokens; if over and it's turn 1, demote prompt, re-materialize, re-check; still over → 413 |
-| 6 | `llm.messages` | filter | Transform messages before LLM call |
-| 7 | `llm.request.started` | event | LLM call about to fire |
-| 8 | `llm.response` | filter | Transform raw LLM response |
-| 9 | `llm.request.completed` | event | LLM call finished |
-| 10 | `turn.response` | event | Plugins write audit entries (telemetry) |
-| 11 | `entry.recording` | filter | Per command, during `#record()`. Returning an entry with `state: "failed"` (or `"cancelled"`) rejects it. |
-| 12 | Per recorded entry (sequential, abort-on-failure): | | |
+| 2 | `instructions.resolveSystemPrompt` | call ⚠ | System prompt assembly — single-owner exception (cache stability) |
+| 3 | `context.materialized` | event | turn_context populated from v_model_context |
+| 4 | `assembly.system` | filter | Build system message from entries (called from inside `materializeContext`) |
+| 5 | `assembly.user` | filter | Build user message (prompt plugin adds `<prompt tokensFree tokenUsage>`) |
+| 6 | `turn.beforeDispatch` | filter | Measure assembled tokens; if over and turn 1, demote prompt, re-materialize, re-check; still over → 413. Filter chain on the dispatch packet `{ messages, rows, contextSize, lastPromptTokens, assembledTokens, ok, overflow }`. Budget participates here; future plugins may trim, re-order, or annotate via the same surface. `ok=false` short-circuits dispatch. |
+| 7 | `llm.messages` | filter | Transform messages before LLM call |
+| 8 | `llm.request.started` | event | LLM call about to fire |
+| 9 | (LLM completion call) | — | Direct provider call. Errors caught: ContextExceededError → 413; TimeoutError/AbortError → 504 strike (unless drain). |
+| 10 | `llm.response` | filter | Transform raw LLM response |
+| 11 | `llm.request.completed` | event | LLM call finished |
+| 12 | (XML parse + parser-warning emission) | — | Synchronous; warnings emitted via `error.log` with `soft: true` — recoverable, no strike |
+| 13 | `llm.reasoning` | filter | Layer plugin reasoning contributions onto API-provided seed (used by `<think>` plugin to merge content-channel thinking into reasoning_content) |
+| 14 | `turn.response` | event | Plugins write audit entries (telemetry) |
+| 15 | `entry.recording` | filter | Per command, during `#record()`. Returning an entry with `state: "failed"` (or `"cancelled"`) rejects it. |
+| 16 | Per recorded entry (sequential, abort-on-failure): | | |
 |    | `tool.before` | event | Before handler dispatch |
-|    | `tools.dispatch` | — | Scheme's registered handler runs |
+|    | `tools.dispatch` | call (keyed) | Scheme's registered handler runs. Keyed dispatch is principled — multi-plugin contract by scheme name. |
 |    | `tool.after` | event | Handler finished |
 |    | `entry.created` | event | Entry written to store |
 |    | `run.state` | event | Incremental state push to connected clients |
 |    | `proposal.prepare` | event | This entry's dispatch may have created proposals (e.g. set → 202 revisions) |
 |    | `proposal.pending` | event | Per each materialized proposal — client is notified, dispatch awaits resolution |
-| 13 | `budget.postDispatch` | call | Re-materialize + check. If over ceiling → Turn Demotion (visibility=summarized on turn's visible rows) + emit 413 error. |
-| 14 | `hooks.update.resolve` | call | Update plugin classifies this turn's `<update>` (terminal/continuation, override-to-continuation if actions failed, heal from raw content if missing) |
-| 15 | `turn.completed` | event | Turn fully resolved with final status |
+| 17 | `turn.dispatched` | event | Post-dispatch cleanup. Budget subscribes for Turn Demotion (visibility=summarized on visible rows that overflow) + 413 `error://` emission via `hooks.error.log.emit`. Future plugins may subscribe for any post-dispatch concern. |
+| 18 | `update.resolve` | call ⚠ | Update plugin classifies this turn's `<update>` (terminal/continuation, override-to-continuation if actions failed, heal from raw content if missing). Single-owner exception — synchronous return value (`{ summaryText, updateText }`) is load-bearing. |
+| 19 | `turn.completed` | event | Turn fully resolved with final status |
+**Legend:** ⚠ = load-bearing exception (kept by design, see below); ✗ = refactor candidate (ceremonial coupling).
+### Architectural exceptions {#plugins_architectural_exceptions}
+The plugin contract aims for **events for emit, filters for transform,
+keyed dispatch for multi-plugin lookups by category**. Five points
+intentionally deviate. They're documented here so they aren't
+mistaken for ceremony and "fixed" in a way that breaks the
+load-bearing reason.
+**1. `instructions.resolveSystemPrompt(rummy)` — single-owner, cache-stable.**
+The system prompt is deliberately not a filter chain. Multiple
+participants would defeat prefix-cache reasoning ("Static base in
+system, phase-specific in user," see AGENTS.md instruction
+discipline). One plugin owns the surface; direct call enforces it.
+**2. `update.resolve({ recorded, ... })` — single-owner with
+synchronous return value.** Caller (`TurnExecutor`) needs
+`{ summaryText, updateText }` back to drive the resolve callback.
+Events emit but don't return; only the update plugin understands
+terminal-vs-continuation status semantics. Filter-chain shape
+would only have one element (still update), so the chain would be
+ceremony.
+**3. Static utility imports across plugins
+(`Entries.scheme`, `Entries.normalizePath`, `countTokens`,
+`stateToStatus`).** Pure stateless utilities. Routing through
+hooks adds a ceremony layer for zero capability gain — these aren't
+extension points; they're canonical implementations.
+**4. Hedberg lib + thin plugin shim.** The library lives at
+`src/lib/hedberg/` (pattern matching, sed parsing, merge handling).
+A thin plugin shim at `src/plugins/hedberg/hedberg.js` re-exposes
+the same surface on `core.hooks.hedberg` for external plugins
+(rummy.repo, rummy.web) that can't reach into rummy/main's
+internals via direct import. Internal plugins use direct imports
+from `src/lib/hedberg/`; external plugins use the hook namespace.
+See [Hedberg](#plugins_hedberg) for the API table.
+**5. Transport plugins (`cli`, `rpc`).** These are *interface*
+plugins, not action plugins. Their job is to bridge external
+interfaces (stdin/stdout, WebSocket) to the agent. Direct imports
+of `ProjectAgent` / `RummyContext` are what makes them transports;
+fitting them into the action-plugin shape would require running
+the agent over a back-channel to itself.
+**Anything else that looks like a direct named call into a plugin
+is a seam, not an exception** — see the ✗-marked entries in the
+Turn Pipeline above. Refactor surface tracked in AGENTS.md "Now"
+under Phase 2.
 `entry.changed` fires asynchronously from mutation points — not
 pipeline-ordered. Subscribe when you need to react to any entry
@@ -566,22 +665,41 @@ update, visibility change, state change, attribute update. Payload:
 | Hook | Type | When |
 |------|------|------|
-| `hooks.budget.enforce` | method | Pre-LLM ceiling check. On first-turn 413 → Prompt Demotion + re-check. |
-| `hooks.budget.postDispatch` | method | Post-dispatch re-check. On 413 → Turn Demotion + 413 `error://` entry via `hooks.error.log.emit`. |
+| `turn.beforeDispatch` filter | subscriber | Pre-LLM ceiling check on the dispatch packet. On first-turn 413 → Prompt Demotion + re-check; sets `ok=false` + `overflow` to short-circuit dispatch. |
+| `turn.dispatched` event | subscriber | Post-dispatch re-check. On 413 → Turn Demotion + 413 `error://` entry via `hooks.error.log.emit`. |
+| `assembly.user` filter | subscriber | Renders `<budget>` table into the user message. |
 The budget plugin measures tokens on the assembled messages — the
 actual content being sent to the LLM. No estimates at the ceiling,
 no SQL token sums. The assembled message IS the measurement. When
-turn 2+ information is available, `budget.enforce` prefers the actual
-API-reported token count (`turns.context_tokens` from the prior
-turn) over re-measuring the assembled string.
+turn 2+ information is available, the pre-LLM check prefers the
+actual API-reported token count (`turns.context_tokens` from the
+prior turn) over re-measuring the assembled string.
+**Use of the assembler.** Budget calls the context assembler in two
+spots — these are projections, not orchestration leaks:
+- **Pre-LLM Prompt Demotion (`turn.beforeDispatch`)** — when the
+  first-turn packet overflows, budget demotes the prompt entry in
+  the DB, swaps `body` from `vBody` to `sBody` on the local prompt
+  row, and re-runs `ContextAssembler.assembleFromTurnContext` on
+  the modified rows. No `materializeContext` round-trip — the row
+  already carries both projections.
+- **Post-dispatch projection (`turn.dispatched`)** — budget re-runs
+  `materializeContext` to project the *next* turn's packet
+  (entries written during dispatch need projection through
+  `hooks.tools.view`). If predicted next packet overflows, budget
+  demotes now so next turn's enforce isn't stuck with only the
+  prompt-demotion lever. Cost projection is the budget plugin's
+  job; the assembler is the measurement instrument.
 **DB tokens vs assembled tokens:** The `tokens` column on `entries`
-is strictly for DISPLAY — showing token costs in `<knowns>` tags so
-the model can reason about entry sizes. It is NEVER used for budget
-decisions. Budget math uses only assembled message token counts.
-These are two separate numbers that must never be conflated. See
-See [budget_enforcement](SPEC.md#budget_enforcement) for the three-measure table.
+is strictly for DISPLAY — showing token costs on entry tags in
+`<summary>` / `<visible>` so the model can reason about entry
+sizes. It is NEVER used for budget decisions. Budget math uses only
+assembled message token counts. These are two separate numbers that
+must never be conflated. See
+[budget_enforcement](SPEC.md#budget_enforcement) for the three-measure table.
 ### Client Notifications {#plugins_client_notifications}
@@ -611,7 +729,7 @@ Every entry follows the same lifecycle regardless of origin:
 Entries at `visibility = 'archived'` skip steps 4–6 (invisible to
 model, discoverable via pattern search). Entries at `visibility =
-'summarized'` render with `attributes.summary` (model-authored keyword
+'summarized'` render with `attributes.tags` (model-authored keyword
 description) prepended above the plugin's `summarized` view output —
 the body is hidden; promoting with `<get>` brings it back.
@@ -631,7 +749,7 @@ the projected body — they do NOT re-check `entry.visibility`.
 | `file` (bare paths) | data | `entry.body` | `""` | Same as known |
 Plugins providing only a `visible` hook fall back to
-`attributes.summary` (model-authored keyword description) at summarized;
+`attributes.tags` (model-authored keyword description) at summarized;
 the renderer inserts it automatically. Plugins providing neither
 default to empty body — the tag still renders with its attributes so
 the model can pattern-match the path.
@@ -653,7 +771,7 @@ state: "proposed" (user decision pending)
 1. On dispatch, create a **proposal entry** at `{scheme}://turn_N/{slug}`
    with `state: "proposed"`, category=logging. Body empty;
-   `summary=command` attr.
+   `tags=command` attr.
 2. On user accept (client sends `set { state: "resolved" }` on the
    proposal path), `AgentLoop.resolve()` transitions the proposal
    entry to `state: "resolved"` (it becomes the **log entry**) and
@@ -690,31 +808,31 @@ pure RPC plumbing shared across all streaming producers.
 |--------|------|-------------|
 | `get` | Core tool | Load file/entry into context |
 | `set` | Core tool | Edit file/entry, visibility control |
-| `known` | Core tool + Assembly | Save knowledge, render `<knowns>` section |
+| `known` | Core tool + Assembly | Save knowledge; renders `<summary>` (priority 50) and `<visible>` (priority 75) for all category=data entries |
 | `rm` | Core tool | Delete permanently |
 | `mv` | Core tool | Move entry |
 | `cp` | Core tool | Copy entry |
 | `sh` | Core tool | Shell command (act mode only). Streaming producer — see [plugins_streaming_entries](#plugins_streaming_entries) |
-| `env` | Core tool | Exploratory command. Streaming producer — see §8.1 |
+| `env` | Core tool | Exploratory command. Streaming producer — see [plugins_streaming_entries](#plugins_streaming_entries) |
 | `stream` | Internal | Generic streaming-entry RPC (`stream`, `stream/completed`, `stream/aborted`, `stream/cancel`) for sh/env and future producers |
 | `ask_user` | Core tool | Ask the user |
 | `search` | Core tool | Web search (via external plugin) |
 | `update` | Structural | Status report + lifecycle signal. `status="200\|204\|422"` terminates; `status="102"` continues. Exposes `hooks.update.resolve` for TurnExecutor. |
-| `unknown` | Structural + Assembly | Register unknowns, render `<unknowns>` |
-| `previous` | Assembly | Render `<previous>` loop history |
-| `performed` | Assembly | Render `<performed>` active loop work |
-| `prompt` | Assembly | Render `<prompt mode="ask\|act" tokensFree="N" tokenUsage="M">` tag |
+| `unknown` | Structural + Assembly | Register unknowns, render `<unknowns>` (priority 150) |
+| `log` | Assembly | Render `<log>` (priority 100) — all logging-category entries plus pre-latest prompts |
+| `prompt` | Assembly | Render `<prompt tokensFree="N" tokenUsage="M">` (priority 30, front of user message) |
 | `hedberg` | Utility | Pattern matching, interpretation, normalization |
-| `instructions` | Internal | Preamble + tool docs + persona assembly; exposes `hooks.instructions.resolveSystemPrompt` |
+| `instructions` | Internal | System prompt assembly (`instructions-system.md` + `[%TOOLS%]` + `[%TOOLDOCS%]` + persona); renders `<instructions>` (priority 165) from `instructions-user.md`; exposes `hooks.instructions.resolveSystemPrompt` |
 | `file` | Internal | File entry projections and constraints (`scheme IS NULL`) |
 | `rpc` | Internal | RPC method registration + tool-fallback dispatch |
 | `telemetry` | Internal | Audit entries, usage stats, reasoning_content |
-| `budget` | Internal | Context ceiling enforcement: Prompt Demotion (pre-LLM first-turn 413) + Turn Demotion (post-dispatch). Exposes `hooks.budget.enforce` / `hooks.budget.postDispatch`. |
+| `budget` | Internal | Context ceiling enforcement: Prompt Demotion (pre-LLM first-turn 413) + Turn Demotion (post-dispatch). Subscribes to `turn.beforeDispatch` (filter) + `turn.dispatched` (event) + `assembly.user` (filter, priority 175 — renders `<budget>`). |
 | `policy` | Internal | Ask-mode per-invocation rejections via `entry.recording` filter |
 | `error` | Internal | `error.log` hook → `error://` entries |
 | `think` | Tool | Private reasoning tag; contributes to `reasoning_content` via the `llm.reasoning` filter |
 | `openai` / `ollama` / `xai` / `openrouter` | LLM provider | Register with `hooks.llm.providers`; handle `{prefix}/...` model aliases. Silently inert if their env isn't configured. |
-| `persona` / `skill` | Internal | Runtime persona/skill management via RPC |
+| `persona` | Internal | Renders the persona body inside the system prompt; default at `persona/default.md`. Run-attribute `persona` overrides per run (1:1, immutable for the run's lifetime). |
+| `skill` | Internal | `<skill path="..."/>` tag handler + `skill://` scheme. Walks file/folder/`.zip` (local or URL); registers content under `skill://<name>/...`. |
 ## External Plugins
@@ -806,9 +924,7 @@ dedicated verbs with 1:1 plugin-API equivalents.
 | `file/constraint` | `{ pattern, visibility }` | Project-scoped: set overlay. `visibility ∈ {active, readonly, ignore}`. Patterns can be globs. `readonly` is enforced on `set://` accept in `AgentLoop.resolve()`. |
 | `file/drop` | `{ pattern }` | Project-scoped: remove overlay row. |
 | `getConstraints` | — | Project-scoped: returns `[{pattern, visibility}]`. |
-| `skill/add` / `skill/remove` / `getSkills` / `listSkills` | | Skill management |
-| `persona/set` / `listPersonas` | | Persona management |
-| `stream` / `stream/completed` / `stream/aborted` / `stream/cancel` | | Streaming RPC (§8.1) |
+| `stream` / `stream/completed` / `stream/aborted` / `stream/cancel` | | Streaming RPC — see [plugins_streaming_entries](#plugins_streaming_entries) |
 **Why file constraints are typed RPCs and not `set` entries:** they
 are project-scoped (no `run`), persist across runs, and `readonly`

package/README.md CHANGED

@@ -1,63 +1,69 @@
-# RUMMY: Relational Unknowns Memory Management Yoke
+# RUMMY: The General-Purpose Agent Kernel
-Rummy is the only LLM agent service inspired by and dedicated to the memory of former Secretary of Defense Donald "Rummy" Rumsfeld. Our unique fusion of apophatic and hedbergian engineering strategies yields more accurate and efficient results than any other agent. Our client/server and plugin architecture integrates it into more workflows than any other agent. It's also more flexible and lean than any other agent. Our dynamic cache management, model hot-swapping, and flexible router interface make it more affordable than any other agent.
+Rummy is a headless, metacognitive relational architecture for LLM agents. It is designed to be integrated into real-world workflows—from IDEs and CLI tools to autonomous research pipelines—where project state is complex and accuracy is non-negotiable.
-## Key Features
+While traditional agents "thrash" and fail under the weight of linear chat history, Rummy treats the LLM as a **program** executing on a **managed memory substrate**. This "Virtual Memory" architecture ensures that Rummy remains reliable in sessions that span hundreds of turns and tens of thousands of files.
+## The Architecture: Virtual Memory for Tokens
-- **The Rumsfeld Loop:** Forcing models to catalog what they don't know is a powerful weapon against hallucination and laziness. Every turn, the model registers gaps via `<set path="unknown://...">`, records findings via `<set path="known://...">`, and signals continuation or completion via `<update status="...">` — externalizing its reasoning into a persistent K/V store that survives across turns without message history.
+Rummy provides the memory hierarchy necessary to maintain high-fidelity reasoning over unlimited-turn sessions. This is not a benchmarking "harness," but a production-grade Operating System for AI agency:
-- **One K/V Store:** Files, knowledge, tool results, unknowns, user prompts — everything is a keyed entry. Content lives in `entries` (scope-owned), per-run fidelity / status / turn in `run_views`. No message history. No separate file listings. The model's entire context is assembled from the store each turn.
+*   **L1 Cache (`visible`):** High-fidelity, character-perfect context. This is the active "Working Set" the model is reasoning with right now.
+*   **RAM (`summarized`):** Folksonomic metadata and searchable indices. This allows the model to know *what* information exists and how to address it without consuming the L1 token budget.
+*   **The Disk (`archived`):** Persistent SQLite storage. A relational substrate where every historical finding, raw source document, and prior tool result is safely indexed and searchable, ready to be "paged" back into Cache on demand.
-- **Hedberg:** The interpretation boundary between stochastic model output and deterministic system operations. Models speak in whatever syntax they were trained on — sed regex, SEARCH/REPLACE blocks, escaped characters. Hedberg normalizes all of it. Available to all plugins via `core.hooks.hedberg`.
+## Key Features
-- **Folksonomic Memory:** The model organizes its own knowledge into navigable path hierarchies with searchable summary tags. Not RAG — the model builds and curates its own taxonomy using `<set path="known://project/architecture" summary="keywords,go,here">...</set>`.
+### Headless & RPC-First
+Rummy is a **headless service**. It exposes a JSON-RPC over WebSocket interface, allowing it to be embedded into any client (e.g., [rummy.nvim](https://github.com/possumtech/rummy.nvim)). The server manages the project state and the "Kernel" loop, while the client drives the UI and handles local proposal resolution.
-- **Fidelity System:** Every per-run view of an entry has a fidelity level: `promoted` (body visible), `demoted` (path + summary only), `archived` (invisible, retrievable via pattern search). The model manages its own context by promoting what it needs and demoting what it doesn't. Budget enforcement catches overflow post-dispatch — tools run uninterrupted, demotion happens after.
+### Extensible Plugin Architecture
+Rummy is built for integration. Every `<tag>` the model sees is a plugin. Every URI scheme (`known://`, `unknown://`, `sh://`) is registered by its owner. Developers can drop custom logic into `src/plugins/` to add new tools, filters, or event hooks. See [PLUGINS.md](PLUGINS.md) for details.
-- **Plugin Architecture:** Every `<tag>` the model sees is a plugin. Every scheme is registered by its owner. The prompt itself is assembled from plugins. Drop a directory into `~/.rummy/plugins/` or install via npm. See [PLUGINS.md](PLUGINS.md) for the complete plugin API.
+### The Six Primitives
+Every operation in Rummy reduces to one of six verbs over a single entry contract: `set` / `get` / `rm` / `mv` / `cp` / `update`. Tools (`<sh>`, `<search>`, `<known>`, `<unknown>`, …) are plugins that compose these primitives. Three actor surfaces — model XML tags, plugin RummyContext methods, JSON-RPC client calls — speak the same grammar at the store layer.
-- **Symbols Done Right:** Designed with universal language support in mind. Powered by [@possumtech/antlrmap](https://github.com/possumtech/antlrmap).
+### The Model Owns Its Context
+Visibility (`visible` / `summarized` / `archived`) is the model's exclusive lever. The engine never silently mutates an entry's visibility behind the model's back; the only enforcements that touch visibility (Turn Demotion at budget overflow, Prompt Demotion at context-exceeded) surface through `error://` so the model sees the trigger. No chat-waterfall horizon, no auto-prune — the model controls what it sees and what it doesn't.
-- **SQLite Done Right:** Async, compiled WAL-mode SQL engine in worker threads. Powered by [@possumtech/sqlrite](https://github.com/possumtech/sqlrite).
+### Apophatic Reasoning (The Rumsfeld Loop)
+Rummy turns "Not Knowing" into a formal state to be processed. By mapping **Unknowns** (`unknown://`) into verified **Knowns** (`known://`), Rummy provides a transparent, auditable trail of how the agent arrived at its conclusion.
 ## Installation
-Rummy loads configuration from exactly **one** directory per
-invocation:
+Rummy loads configuration from exactly **one** directory per invocation:
-1. The directory you run `rummy` from, if it contains `.env.example`.
+1. The current working directory (if it contains `.env.example`).
 2. Otherwise, `${RUMMY_HOME}` (default `~/.rummy`).
-`npm i -g @possumtech/rummy` runs a postinstall that seeds
-`${RUMMY_HOME}/.env.example` from the package defaults, so the
-out-of-the-box path works:
 ```bash
-# In your shell rc:
+# Set your RUMMY_HOME
 export RUMMY_HOME=~/.rummy
+# Install globally
 npm i -g @possumtech/rummy
-$EDITOR ~/.rummy/.env.example   # set a model alias, tweak defaults
+# Configure your environment
+$EDITOR ~/.rummy/.env.example   # set model aliases and keys
 rummy
 ```
-Within the chosen directory, `.env.example` is the baseline and `.env`
-(if present) overrides. Shell env beats both. The package's own
-`.env.example` is **never** loaded at runtime — if neither the cwd nor
-`${RUMMY_HOME}` has an `.env.example`, rummy crashes at startup. No
-silent defaults.
 ## Usage
-Rummy is just the service. You'll need to get (or vibe) yourself a client interface. We're partial the our Neovim plugin: [@possumtech/rummy.nvim](https://github.com/possumtech/rummy.nvim)
+Start the service and connect your preferred client. The server defaults to port `3044`.
+*   **Official Client:** [rummy.nvim](https://github.com/possumtech/rummy.nvim) (Neovim interface)
+*   **In-process CLI:** `rummy-cli` (one-shot ask/act invocations against a project; see `src/plugins/cli/`)
+*   **Diagnostic Suite:** `test/tbench/` and `test/programbench/` (autonomous diagnostic and benchmarking harnesses)
 ## Documentation
 | Document | Contents |
 |----------|----------|
-| [SPEC.md](SPEC.md) | System design: K/V store, dispatch, packet structure, RPC |
-| [PLUGINS.md](PLUGINS.md) | Plugin development: registration, events, filters, hedberg |
-| [AGENTS.md](AGENTS.md) | Planning and progress |
+| [SPEC.md](SPEC.md) | Technical Specification: K/V store, packet structure, dispatch path, and lifecycle contracts. |
+| [PLUGINS.md](PLUGINS.md) | Extensibility: Hook registry, event filtering, and custom scheme registration. |
+| [src/plugins/](src/plugins/**/README.md) | **Plugin Reference:** Internal documentation for each scheme and toolset. |
+| [AGENTS.md](AGENTS.md) | Project roadmap, planning history, and architectural lessons. |
-Each plugin has its own README at `src/plugins/{name}/README.md`.
-The `discover` RPC method returns the live protocol reference at runtime.
+---
+*Rummy: The Managed Operating System for AI Agency.*
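
Reviewer note on the Installation hunk above: both versions of the README state that configuration is loaded from exactly one directory, chosen by a two-step rule (the current working directory if it contains `.env.example`, otherwise `${RUMMY_HOME}`, default `~/.rummy`). The sketch below illustrates that rule only. The function name `resolveConfigDir` and its options object are hypothetical and are not taken from the package; the actual loader may differ, for example in how it handles a missing `.env.example` in both locations.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical helper (not part of @possumtech/rummy): pick the single
// directory that configuration would be loaded from, per the README rule.
function resolveConfigDir({ cwd = process.cwd(), env = process.env } = {}) {
  // Rule 1: the current working directory wins if it contains .env.example.
  if (fs.existsSync(path.join(cwd, ".env.example"))) {
    return cwd;
  }
  // Rule 2: otherwise use RUMMY_HOME, defaulting to ~/.rummy.
  return env.RUMMY_HOME ?? path.join(os.homedir(), ".rummy");
}

// Print the directory this sketch would choose for the current process.
console.log(resolveConfigDir());
```

The sketch deliberately stops at directory selection. The removed README text also described `.env` overriding `.env.example` and shell variables overriding both; whether that precedence still holds in 2.2.1 is not visible in this diff.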