npm - niahere - Versions diffs - 0.2.57 → 0.2.58 - Mend

niahere 0.2.57 → 0.2.58

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (12)

package/package.json +1 -1
package/skills/optimization-loop/SKILL.md +230 -0
package/skills/optimize/SKILL.md +238 -0
package/src/chat/engine.ts +94 -17
package/src/commands/backup.ts +26 -4
package/src/core/agents.ts +22 -8
package/src/core/consolidator.ts +52 -14
package/src/core/health.ts +92 -23
package/src/core/skills.ts +18 -6
package/src/core/summarizer.ts +33 -8
package/src/db/models/active_engine.ts +5 -3
package/src/utils/retry.ts +18 -0

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "niahere",
-  "version": "0.2.57",
+  "version": "0.2.58",
   "description": "A personal AI assistant daemon — chat, scheduled jobs, persona system, extensible via skills.",
   "type": "module",
   "scripts": {

package/skills/optimization-loop/SKILL.md ADDED

@@ -0,0 +1,230 @@
+---
+name: optimization-loop
+description: |
+  The iterative optimization pattern (Karpathy Loop / autoresearch). Reference for running
+  autonomous experiment loops on any target: modify → score → keep or revert → repeat.
+  Use when running multiple iterations of improvement against a measurable metric — code
+  benchmarks, prompt quality, copy effectiveness, config tuning, or any scorable target.
+  Also known as "autoresearch." Use this skill to understand the pattern and discipline.
+  For orchestration (scheduling, user confirmation, job setup), see the "optimize" skill.
+metadata:
+  version: 1.0.0
+---
+# Optimization Loop
+The Karpathy Loop: autonomous iterative optimization through disciplined experimentation.
+Modify a target, score the result, keep improvements, revert failures, repeat.
+This skill defines the **pattern and discipline**. For when/how to schedule and orchestrate
+optimization runs, see the `optimize` skill.
+## The Pattern
+```
+freeze contract + rubric
+save baseline (never touch again)
+copy baseline → current-best
+repeat:
+  1. read state — what's been tried, what worked
+  2. hypothesize — form a specific idea, informed by history
+  3. modify — produce a candidate version
+  4. gate check — hard constraints pass? if no → reject
+  5. score — compare candidate vs current-best (pairwise)
+  6. decide — clearly better? keep. otherwise revert.
+  7. log — append to results.jsonl
+  8. update state — what you tried, what happened, what next
+until: budget exhausted, target reached, or plateau detected
+notify user with summary
+```
+## Workspace Layout
+Each optimization run gets a dedicated, self-contained directory:
+```
+~/.niahere/optimizations/{slug}-{hex}/
+├── contract.md           # Frozen at start: objective, scope, constraints, metrics, budget
+├── rubric.md             # Frozen at start: scoring criteria (never modify during run)
+├── baseline.md           # Original version (never modify)
+├── current-best.md       # Best version so far (update only on accept)
+├── accepted/             # Every accepted candidate, numbered
+│   ├── 001.md
+│   ├── 002.md
+│   └── ...
+├── results.jsonl         # One JSON object per experiment (append-only)
+└── state.md              # Your working notebook
+```
+**The slug** is human-readable (e.g., `signup-prompt`). The hex suffix (4 chars) prevents
+collisions across multiple runs on the same target.
+## The Contract (contract.md)
+Freeze this at the start. Never modify during the run.
+```markdown
+# Optimization Contract
+## Objective
+[What we're optimizing and why — one sentence]
+## Target
+[File path or content being modified]
+[Which sections/parts are in scope — be specific]
+## Primary Metric
+[The metric being optimized — what "better" means]
+## Secondary Metrics (regression guards)
+[Metrics that must NOT degrade. Each with a threshold.]
+- [e.g., "Word count must stay under 200"]
+- [e.g., "All existing tests must pass"]
+- [e.g., "Readability score must stay above grade 8"]
+## Hard Constraints
+[Violations = automatic reject, no exceptions]
+- [e.g., "Must mention the free trial"]
+- [e.g., "Must pass lint and type check"]
+## Soft Preferences
+[Tiebreakers — not vetoes, but guide decisions]
+- [e.g., "Prefer shorter over longer"]
+- [e.g., "Prefer simple over clever"]
+## Budget
+- Max iterations: [N]
+- Max wall-clock time: [hours]
+## Stop Rules
+- All iterations completed
+- Target score reached: [if applicable]
+- Plateau: [N] consecutive discards (default 5)
+```
+## Scoring
+### For code targets
+Run a benchmark or test command. Extract the metric. The command is fixed in the contract
+and cannot be modified during the run.
+```
+1. Gate check: tests pass? lint clean? types check? → if any fail, reject immediately
+2. Run benchmark command → extract primary metric
+3. Check secondary metrics for regressions → if any violated, reject
+4. Compare primary metric against current-best
+5. Accept only if clearly improved (above noise floor)
+```
+### For content targets (prompts, copy, configs)
+Use pairwise comparison. Never absolute 1-10 scoring.
+```
+1. Gate check: hard constraints met? (word count, required elements, etc.)
+2. Present both versions side by side:
+   - Randomly assign which is "Version A" and "Version B"
+   - Do NOT label which is current-best vs candidate
+3. Evaluate using the frozen rubric criteria
+4. Pick the winner — candidate must be CLEARLY better, not just different
+5. If it's a toss-up, reject (bias toward stability)
+6. Check secondary metrics for regressions
+```
+**Anti-bias controls for LLM-as-judge:**
+- Randomize A/B order every time (prevents position bias)
+- Never reveal which version is "current" vs "candidate"
+- If the margin is slim, run the comparison twice with swapped order
+- The rubric is frozen in `rubric.md` — you cannot modify scoring criteria mid-run
+## Exploration Strategy
+Don't just make incremental tweaks. Use staged exploration:
+**Early phase (first ~30% of iterations):** Go broad. Try fundamentally different approaches.
+Different structures, different angles, different trade-offs. You're mapping the space.
+**Exploit phase (middle ~50%):** You've found something that works. Refine around it.
+Incremental improvements, wording tweaks, parameter tuning.
+**Escape phase (if plateaued):** If you hit 5 consecutive discards, try ONE radical departure
+from current-best — something completely different. If that fails too, stop. You've likely
+found a local optimum.
+## The Results Log (results.jsonl)
+Append one JSON object per experiment. Never edit previous entries.
+```json
+{"n": 1, "status": "keep", "hypothesis": "shorter opening hook", "score_note": "candidate clearly more direct", "duration_s": 45, "timestamp": "2026-04-07T02:14:00Z"}
+{"n": 2, "status": "discard", "hypothesis": "add social proof", "score_note": "toss-up, rejected for stability", "duration_s": 38, "timestamp": "2026-04-07T02:21:00Z"}
+{"n": 3, "status": "crash", "hypothesis": "doubled context window", "error": "benchmark timed out", "duration_s": 300, "timestamp": "2026-04-07T02:28:00Z"}
+```
+Every entry must include:
+- `n` — experiment number
+- `status` — `keep`, `discard`, or `crash`
+- `hypothesis` — what you tried and why (one line)
+- `score_note` — why you kept or discarded (one line)
+- `timestamp` — when the experiment completed
+## Resumability
+If the run crashes or is interrupted:
+1. Read `current-best.md` — this is always the last accepted version
+2. Read `results.jsonl` — count completed experiments, review what was tried
+3. Read `state.md` — pick up your thinking from where you left off
+4. Continue from the next experiment number
+5. Do NOT re-run completed experiments
+## Scoring Integrity
+**The scorer and the optimizer must be separated in intent.** You are both proposer and judge,
+so you must be disciplined:
+- The rubric is frozen. Do not adjust criteria because a candidate "almost" passes.
+- Do not add special cases to make a favorite candidate win.
+- Do not lower the bar after repeated failures. If nothing passes, that's a valid outcome.
+- If you notice you're gaming your own rubric, stop and note it in state.md.
+## When Finished
+1. Update `state.md` with a final summary:
+   - Baseline description vs final best description
+   - Total experiments: N run, X accepted, Y discarded, Z crashed
+   - Key findings: what worked, what didn't, surprises
+2. Send a message to the user (via `send_message`):
+   ```
+   [optimization] Done. Ran N experiments on [target].
+   X accepted, Y discarded. [One-line summary of the best version vs baseline].
+   Results: ~/.niahere/optimizations/{slug}-{hex}/
+   ```
+3. Do NOT auto-apply the result. The user reviews `current-best.md` and decides
+   whether to use it.
+## Principles
+- **Propose, never apply.** The optimization produces a candidate. The user promotes it.
+- **Simplicity criterion.** A marginal improvement that adds complexity isn't worth keeping.
+  Removing something while maintaining quality is always a win.
+- **Bias toward stability.** When in doubt, reject. Keeping a good version is better than
+  accepting a sideways move.
+- **One target, one metric, one run.** Don't try to optimize multiple things simultaneously.
+  Run separate optimizations for separate targets.
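
Note on the file above: the keep-or-revert loop and its plateau stop rule can be sketched in TypeScript. This is a minimal sketch only. `runLoop`, `LoopHooks`, `propose`, `passesGates`, and `isClearlyBetter` are hypothetical names chosen for illustration and do not appear in the package; the sketch also leaves out the "escape phase" and the on-disk workspace files.

```typescript
// Hypothetical sketch of the keep-or-revert loop; every name here is illustrative.
type Status = "keep" | "discard";

interface Experiment {
  n: number;
  status: Status;
  hypothesis: string;
}

interface LoopHooks<T> {
  // Produce a candidate from the current best, informed by the history so far.
  propose: (best: T, history: Experiment[]) => { candidate: T; hypothesis: string };
  // Hard constraints: a failure means automatic rejection.
  passesGates: (candidate: T) => boolean;
  // Pairwise comparison: true only when the candidate is clearly better.
  isClearlyBetter: (candidate: T, best: T) => boolean;
}

function runLoop<T>(
  baseline: T,
  hooks: LoopHooks<T>,
  maxIterations: number,
  plateauLimit = 5,
): { best: T; log: Experiment[] } {
  let best = baseline; // the baseline itself is never mutated
  const log: Experiment[] = [];
  let consecutiveDiscards = 0;

  for (let n = 1; n <= maxIterations; n++) {
    const { candidate, hypothesis } = hooks.propose(best, log);
    const accepted =
      hooks.passesGates(candidate) && hooks.isClearlyBetter(candidate, best);

    if (accepted) {
      best = candidate;
      consecutiveDiscards = 0;
    } else {
      consecutiveDiscards++; // candidate is dropped, best stays as-is
    }
    log.push({ n, status: accepted ? "keep" : "discard", hypothesis });

    // Plateau stop rule: too many discards in a row.
    if (consecutiveDiscards >= plateauLimit) break;
  }
  return { best, log };
}
```

A toss-up counts as a discard here because `isClearlyBetter` is expected to return `false` for it, which matches the skill's stated bias toward stability.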

package/skills/optimize/SKILL.md ADDED

@@ -0,0 +1,238 @@
+---
+name: optimize
+description: |
+  Schedule or run an iterative optimization pass on code, prompts, copy, or any scorable
+  target. Use when user asks to "optimize this", "run experiments", "autoresearch this",
+  "iterate on this overnight", "can this be better", or proactively suggest after completing
+  work that could benefit from further iteration. Also use when a job wants to self-optimize
+  something within its own run. Handles spec confirmation, scoring setup, job scheduling,
+  and result delivery. For the loop discipline itself, references the optimization-loop skill.
+metadata:
+  version: 1.0.0
+---
+# Optimize
+Schedule or run autonomous optimization passes. This skill handles the orchestration —
+when to use it, how to confirm specs, how to schedule, how to deliver results.
+For the loop discipline, scoring methods, and workspace layout, invoke the
+`optimization-loop` skill.
+## Two Entry Points
+### 1. User explicitly asks
+User says "autoresearch this", "optimize this overnight", "run experiments on this",
+"can you iterate on this more", or similar.
+**Don't suggest — confirm and schedule.** The user already wants this. Move to Step 1.
+### 2. Proactive suggestion (after immediate work)
+You just finished a task — rewrote copy, tuned a prompt, optimized a function. The result
+is good, but more iterations could find something better.
+Suggest briefly:
+> "This is solid. Want me to schedule an overnight optimization pass? I'll run ~30
+> experiments scoring each version against [brief criteria] and have the best version
+> ready by morning."
+**Rules for suggesting:**
+- Only suggest when there's a clear, scorable metric
+- Only suggest when the target is self-contained (one file, one prompt, one section)
+- Don't suggest for trivial tasks or quick fixes
+- Don't push if the user declines — move on immediately
+- Don't suggest if the user said they need this done now and can't wait
+## Step 1: Confirm the Setup
+Before scheduling, confirm these with the user. Be concise — a quick summary, not an
+interrogation.
+**Target** — What are we optimizing?
+- A file (code, config, prompt file)
+- A section of content (landing page hero, email subject line)
+- A prompt or template
+**Scoring method** — How do we know if a version is better?
+- Code: what benchmark or test command produces a number?
+- Content: what criteria matter? (clarity, persuasiveness, brevity, conversion, etc.)
+- Custom: does the user have a specific scoring script?
+**Constraints** — What can't change?
+- Hard constraints (must-haves, test requirements, word limits)
+- Soft preferences (shorter is better, simpler is better)
+**Secondary metrics** — What must NOT get worse?
+- Code: performance can't drop, memory can't increase, tests must pass
+- Content: readability, brand voice, required elements
+- These are regression guards — violations veto an otherwise good candidate
+**Iterations** — How many experiments? Default 30. User can adjust.
+**When** — Now, or schedule for later? If later, what time?
+Example confirmation:
+> "Here's the plan:
+>
+> - **Target**: signup prompt at `src/prompts/signup.md`
+> - **Scoring**: pairwise comparison on clarity, persuasiveness, and brevity
+> - **Constraints**: must mention free trial, keep under 150 words
+> - **Regression guards**: readability must stay above grade 8
+> - **Iterations**: 30 experiments
+> - **When**: tonight at midnight
+>
+> Sound right?"
+Wait for confirmation before proceeding.
+## Step 2: Set Up the Workspace
+Create the optimization directory:
+```
+~/.niahere/optimizations/{slug}-{hex}/
+```
+Where `{slug}` is a short descriptive name and `{hex}` is 4 random hex chars.
+Create the frozen files:
+1. **contract.md** — objective, target, primary metric, secondary metrics, constraints,
+   preferences, budget, stop rules (see optimization-loop skill for template)
+2. **rubric.md** — detailed scoring criteria
+   - For code: the benchmark command and how to extract the metric
+   - For content: the pairwise comparison rubric with specific criteria and weights
+3. **baseline.md** — copy the current version of the target (the starting point)
+4. **current-best.md** — copy of baseline (will be updated during the run)
+5. **state.md** — initialize with "Run starting. 0 experiments completed."
+6. **accepted/** — create empty directory
+## Step 3: Compose the Job Prompt
+Build a self-contained job prompt that encodes everything the agent needs to run
+the optimization loop autonomously. The prompt must include:
+```
+Job: optimization — {slug}
+You are running an optimization loop. Follow the optimization-loop pattern strictly.
+## Your workspace
+{absolute path to the optimization directory}
+## What to optimize
+{description of the target — file path, what it does, context}
+## Current version
+{full content of the target}
+## Contract
+{contents of contract.md}
+## Scoring rubric
+{contents of rubric.md}
+## Loop instructions
+Read your workspace files (contract.md, rubric.md, baseline.md, current-best.md,
+state.md, results.jsonl) to understand the current state.
+For each iteration:
+1. Read state.md for context on what's been tried
+2. Form a hypothesis — what to change and why
+3. Produce a candidate version
+4. Gate check — verify all hard constraints from the contract
+5. Score — compare candidate vs current-best using the rubric (pairwise, randomized order)
+6. If candidate is clearly better AND no secondary metric regressions:
+   - Update current-best.md
+   - Save candidate to accepted/{NNN}.md
+   - Log {"status": "keep", ...} to results.jsonl
+7. If not clearly better:
+   - Discard candidate
+   - Log {"status": "discard", ...} to results.jsonl
+8. Update state.md with what you tried and learned
+Stop when:
+- Completed {N} iterations, OR
+- {stop_count} consecutive discards (plateau), OR
+- Target score reached (if specified in contract)
+When finished, update state.md with a final summary and send a message to the user:
+"[optimization] Done. Ran N experiments on {target}. X accepted, Y discarded.
+{One-line summary}. Results: {workspace path}"
+IMPORTANT:
+- Do NOT modify contract.md or rubric.md
+- Do NOT auto-apply results to the original file
+- Do NOT stop to ask the user questions — run autonomously until done
+```
+## Step 4: Schedule the Job
+Use the `add_job` MCP tool (preferred) or `nia job add` CLI:
+- **name**: `optimize-{slug}` (e.g., `optimize-signup-prompt`)
+- **schedule**: ISO timestamp for the agreed time, or now
+- **schedule_type**: `once`
+- **prompt**: the composed job prompt from Step 3
+- **always**: `true` (overnight runs need to ignore active hours)
+- **stateless**: `yes` (the optimization uses its own workspace, not the job's state.md)
+Confirm to the user:
+> "Scheduled. The optimization run starts at {time} and will run ~{N} experiments.
+> I'll message you when it's done with the results."
+## Step 5: After Completion
+When the user asks about results, or when reviewing the notification:
+1. Read `~/.niahere/optimizations/{slug}-{hex}/state.md` for the summary
+2. Read `results.jsonl` for the experiment log
+3. Show `current-best.md` vs `baseline.md` — the diff is the value
+4. Show the accepted progression if the user wants to see the journey
+5. Ask if the user wants to apply the result to the original target
+## Running Now vs Later
+**"Run it now":** Schedule with the current timestamp. The user stays in the conversation
+and can check results when the job finishes. Good for shorter runs (10-15 iterations).
+**"Schedule for later":** Schedule for a specific time (midnight, after hours). The user
+goes about their day. The notification arrives when done. Good for longer runs (30+ iterations).
+**"Run it inline":** If the user wants to optimize something RIGHT NOW in this conversation
+(not as a job), you can run the optimization-loop pattern directly without scheduling a job.
+Use this for quick 5-10 iteration runs where the user is watching.
+## When a Job Self-Optimizes
+A running job (e.g., news-curator, prompt-generator) can use this pattern to improve
+its own approach. The flow:
+1. Job creates an optimization subdirectory in its workspace or in `~/.niahere/optimizations/`
+2. Runs the loop inline (not as a sub-job — within its own execution)
+3. Saves the best version in the workspace
+4. Does NOT auto-apply changes to its own prompt or config
+5. Sends a message: "I found a better approach for [X]. Review at [path]."
+6. The user decides whether to apply it (e.g., via `nia job update`)
+## What NOT to Optimize
+- Things without a clear metric (vague "make it better")
+- Targets that require human judgment with no proxy (art, brand voice decisions)
+- Multi-file changes with complex interdependencies
+- Anything where the scoring takes longer than the modification (defeats the loop)
+- Security-sensitive code where autonomous changes are risky
+If the target doesn't fit, say so. Not everything benefits from iterative optimization.
+Sometimes the first good version is the right answer.
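
Note on the file above: the naming conventions in Steps 2 and 4 (`{slug}-{hex}` for the workspace directory, `optimize-{slug}` for the job) can be sketched as small helpers. This is an illustrative sketch under assumptions: `randomHex4`, `workspaceDirName`, and `jobName` are hypothetical names, and the skill does not say how the 4 hex characters should be generated.

```typescript
// Hypothetical helpers; none of these names come from the package.

// Four random hex characters. The random source is injectable so the
// function can be exercised deterministically.
function randomHex4(rand: () => number = Math.random): string {
  return Math.floor(rand() * 0x10000)
    .toString(16)
    .padStart(4, "0");
}

// Directory name under ~/.niahere/optimizations/, following {slug}-{hex}.
function workspaceDirName(slug: string, hex: string = randomHex4()): string {
  return `${slug}-${hex}`;
}

// Job name following the optimize-{slug} convention from Step 4.
function jobName(slug: string): string {
  return `optimize-${slug}`;
}
```

For example, `workspaceDirName("signup-prompt", "a3f9")` yields `signup-prompt-a3f9`, and `jobName("signup-prompt")` yields `optimize-signup-prompt`.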

package/src/chat/engine.ts CHANGED

@@ -8,7 +8,15 @@ import { randomUUID } from "crypto";
 import { buildSystemPrompt, getSessionContext } from "./identity";
 import { getAgentDefinitions } from "../core/agents";
 import { Session, Message, ActiveEngine } from "../db/models";
-import type { Attachment, SendResult, StreamCallback, ActivityCallback, SendCallbacks, ChatEngine, EngineOptions } from "../types";
+import type {
+  Attachment,
+  SendResult,
+  StreamCallback,
+  ActivityCallback,
+  SendCallbacks,
+  ChatEngine,
+  EngineOptions,
+} from "../types";
 import { truncate, formatToolUse } from "../utils/format-activity";
 import { consolidateSession } from "../core/consolidator";
 import { summarizeSession } from "../core/summarizer";
@@ -25,10 +33,19 @@ interface SDKUserMessage {
 }
 /** Convert provider-agnostic attachments to Anthropic content blocks. */
-export function buildContentBlocks(text: string, attachments?: Attachment[]): MessageParam["content"] {
+export function buildContentBlocks(
+  text: string,
+  attachments?: Attachment[],
+): MessageParam["content"] {
   if (!attachments?.length) return text;
-  const blocks: Array<{ type: "text"; text: string } | { type: "image"; source: { type: "base64"; media_type: string; data: string } }> = [];
+  const blocks: Array<
+    | { type: "text"; text: string }
+    | {
+        type: "image";
+        source: { type: "base64"; media_type: string; data: string };
+      }
+  > = [];
   for (const att of attachments) {
     if (att.type === "image") {
@@ -94,6 +111,7 @@ class MessageStream {
 interface PendingResult {
   userMessage: string;
+  userSaved: boolean;
   onStream: StreamCallback | null;
   onActivity: ActivityCallback | null;
   accumulatedText: string;
@@ -103,15 +121,22 @@ interface PendingResult {
   reject: (error: Error) => void;
 }
 function sessionFileExists(sessionId: string, cwd: string): boolean {
   // SDK stores sessions at ~/.claude/projects/<encoded-cwd>/<session-id>.jsonl
   const encoded = cwd.replace(/\//g, "-");
-  const sessionFile = join(homedir(), ".claude", "projects", encoded, `${sessionId}.jsonl`);
+  const sessionFile = join(
+    homedir(),
+    ".claude",
+    "projects",
+    encoded,
+    `${sessionId}.jsonl`,
+  );
   return existsSync(sessionFile);
 }
-export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine> {
+export async function createChatEngine(
+  opts: EngineOptions,
+): Promise<ChatEngine> {
   const { room, channel, resume, mcpServers } = opts;
   let systemPrompt = buildSystemPrompt("chat", channel);
@@ -156,7 +181,10 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
     idleTimer = setTimeout(async () => {
       if (pending) {
         // Don't tear down while a request is in flight
-        log.warn({ room }, "idle timer fired while request pending, skipping teardown");
+        log.warn(
+          { room },
+          "idle timer fired while request pending, skipping teardown",
+        );
         return;
       }
       // Memory consolidation + session summary before "sleep"
@@ -165,7 +193,10 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
           log.error({ err, room }, "consolidation failed during idle teardown");
         });
         summarizeSession(sessionId, room).catch((err) => {
-          log.error({ err, room }, "session summary failed during idle teardown");
+          log.error(
+            { err, room },
+            "session summary failed during idle teardown",
+          );
         });
       }
       teardown();
@@ -185,7 +216,10 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
     longRunningTimer = setTimeout(() => {
       if (pending) {
         longRunningWarned = true;
-        log.warn({ room, elapsed: LONG_RUNNING_WARN / 1000 }, "engine request running for 30+ minutes");
+        log.warn(
+          { room, elapsed: LONG_RUNNING_WARN / 1000 },
+          "engine request running for 30+ minutes",
+        );
       }
     }, LONG_RUNNING_WARN);
   }
@@ -250,7 +284,7 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
               await Session.create(sessionId, room);
             }
-            if (pending) {
+            if (pending && !pending.userSaved) {
               await Message.save({
                 sessionId,
                 room,
@@ -258,6 +292,7 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
                 content: pending.userMessage,
                 isFromAgent: false,
               });
+              pending.userSaved = true;
               messageCount++;
             }
           }
@@ -279,7 +314,10 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
                 if (lines.length > 1) {
                   // Show the last complete line (not the partial one being typed)
                   const completeLine = lines[lines.length - 2]?.trim();
-                  if (completeLine && completeLine !== pending.lastThinkingLine) {
+                  if (
+                    completeLine &&
+                    completeLine !== pending.lastThinkingLine
+                  ) {
                     pending.lastThinkingLine = completeLine;
                     pending.onActivity?.(truncate(completeLine, 70));
                   }
@@ -364,15 +402,26 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
                 try {
                   messageId = await Message.save(saveParams);
                 } catch {
-                  messageId = await Message.save({ ...saveParams, metadata: undefined });
+                  messageId = await Message.save({
+                    ...saveParams,
+                    metadata: undefined,
+                  });
                 }
                 await Session.touch(sessionId);
-                Session.accumulateMetadata(sessionId, { ...metadata, channel }).catch(() => {});
+                Session.accumulateMetadata(sessionId, {
+                  ...metadata,
+                  channel,
+                }).catch(() => {});
               }
               await ActiveEngine.unregister(room);
               clearLongRunningTimer();
-              pending.resolve({ result: resultText, costUsd, turns, messageId });
+              pending.resolve({
+                result: resultText,
+                costUsd,
+                turns,
+                messageId,
+              });
               pending = null;
               resetIdleTimer();
             } else {
@@ -390,9 +439,16 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
         // Stream ended without a result — subprocess exited or was killed
         if (pending) {
           const partial = pending.accumulatedText;
-          log.error({ room, partialChars: partial.length }, "query stream ended without result, rejecting pending request");
+          log.error(
+            { room, partialChars: partial.length },
+            "query stream ended without result, rejecting pending request",
+          );
           await ActiveEngine.unregister(room).catch(() => {});
-          pending.reject(new Error(`stream ended without result (${partial.length} chars accumulated)`));
+          pending.reject(
+            new Error(
+              `stream ended without result (${partial.length} chars accumulated)`,
+            ),
+          );
           pending = null;
         }
       } catch (err) {
@@ -419,7 +475,11 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
       return room;
     },
-    async send(userMessage: string, callbacks?: SendCallbacks, attachments?: Attachment[]) {
+    async send(
+      userMessage: string,
+      callbacks?: SendCallbacks,
+      attachments?: Attachment[],
+    ) {
       // Clear idle timer — engine is not idle while processing a request
       clearIdleTimer();
       startLongRunningTimer();
@@ -430,9 +490,26 @@ export async function createChatEngine(opts: EngineOptions): Promise<ChatEngine>
         startQuery();
       }
+      // Save user message to DB if session already exists (resumed session).
+      // For new sessions, the init handler saves it once sessionId is known.
+      let userSaved = false;
+      if (sessionId) {
+        await Message.save({
+          sessionId,
+          room,
+          sender: "user",
+          content: userMessage,
+          isFromAgent: false,
+        });
+        await Session.touch(sessionId);
+        userSaved = true;
+        messageCount++;
+      }
       return new Promise<SendResult>((resolve, reject) => {
         pending = {
           userMessage,
+          userSaved,
           onStream: callbacks?.onStream || null,
           onActivity: callbacks?.onActivity || null,
           accumulatedText: "",
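
Note on the hunks above: the new `userSaved` flag appears to ensure the user message is written once, either in `send()` when the session already exists or in the init handler when it does not. The sketch below illustrates that guard in isolation. It is a simplification under assumptions: `onSend`, `onInit`, and the in-memory `savedMessages` array are hypothetical stand-ins, and the real code awaits `Message.save` against a database.

```typescript
// Hypothetical, synchronous stand-in for the save-once guard.
interface PendingRequest {
  userMessage: string;
  userSaved: boolean;
}

// In-memory substitute for the messages table.
const savedMessages: string[] = [];

// Mirrors send(): save immediately only when a session id is already known.
function onSend(userMessage: string, sessionId: string | null): PendingRequest {
  let userSaved = false;
  if (sessionId) {
    savedMessages.push(userMessage);
    userSaved = true;
  }
  return { userMessage, userSaved };
}

// Mirrors the init handler: save only if send() has not already done so.
function onInit(pending: PendingRequest | null): void {
  if (pending && !pending.userSaved) {
    savedMessages.push(pending.userMessage);
    pending.userSaved = true;
  }
}
```

With this shape, a resumed session saves in `onSend` and `onInit` becomes a no-op, while a new session saves in `onInit`; repeated init events do not duplicate the row.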

package/src/commands/backup.ts CHANGED

@@ -57,9 +57,25 @@ export async function createBackup(silent = false): Promise<string> {
   if (dbUrl) {
     const dumpPath = join(home, "tmp", "db-backup.sql");
     mkdirSync(join(home, "tmp"), { recursive: true });
-    const pg = Bun.spawn(["pg_dump", dbUrl, "-f", dumpPath], {
+    // Parse URL to avoid exposing password in process list (visible via ps)
+    const url = new URL(dbUrl);
+    const dbName = decodeURIComponent(url.pathname.replace(/^\//, ""));
+    const pgArgs = ["pg_dump", "-f", dumpPath];
+    if (url.hostname) pgArgs.push("-h", url.hostname);
+    if (url.port) pgArgs.push("-p", url.port);
+    if (url.username) pgArgs.push("-U", decodeURIComponent(url.username));
+    if (dbName) pgArgs.push("-d", dbName);
+    const pgEnv: Record<string, string> = { ...process.env } as Record<
+      string,
+      string
+    >;
+    if (url.password) pgEnv.PGPASSWORD = decodeURIComponent(url.password);
+    const sslmode = url.searchParams.get("sslmode");
+    if (sslmode) pgEnv.PGSSLMODE = sslmode;
+    const pg = Bun.spawn(pgArgs, {
       stdout: "pipe",
       stderr: "pipe",
+      env: pgEnv,
     });
     const exitCode = await pg.exited;
     if (exitCode === 0 && existsSync(dumpPath)) {
@@ -71,7 +87,9 @@ export async function createBackup(silent = false): Promise<string> {
       dbDumped = true;
     } else if (!silent) {
       const stderr = await new Response(pg.stderr).text();
-      console.log(`  ⚠ db dump skipped: ${stderr.trim() || `exit ${exitCode}`}`);
+      console.log(
+        `  ⚠ db dump skipped: ${stderr.trim() || `exit ${exitCode}`}`,
+      );
     }
   }
@@ -94,8 +112,12 @@ export async function createBackup(silent = false): Promise<string> {
   // Clean up temp db dump
   if (dbDumped) {
-    try { unlinkSync(join(home, "db-backup.sql")); } catch {}
-    try { unlinkSync(join(home, "tmp", "db-backup.sql")); } catch {}
+    try {
+      unlinkSync(join(home, "db-backup.sql"));
+    } catch {}
+    try {
+      unlinkSync(join(home, "tmp", "db-backup.sql"));
+    } catch {}
   }
   const size = statSync(outPath).size;
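
Note on the hunks above: the change moves the database password out of the `pg_dump` argument list and into the `PGPASSWORD` environment variable. The same parsing can be sketched as a pure function, which makes it easy to check that the password never reaches the arguments. `buildPgDumpInvocation` and `PgDumpInvocation` are hypothetical names for illustration; the flags and environment variables are the ones used in the diff.

```typescript
// Hypothetical pure-function version of the URL parsing shown in the diff.
interface PgDumpInvocation {
  args: string[];
  env: Record<string, string>;
}

function buildPgDumpInvocation(
  dbUrl: string,
  dumpPath: string,
  baseEnv: Record<string, string> = {},
): PgDumpInvocation {
  const url = new URL(dbUrl);
  const dbName = decodeURIComponent(url.pathname.replace(/^\//, ""));

  // Connection details go on the command line, except the password.
  const args = ["pg_dump", "-f", dumpPath];
  if (url.hostname) args.push("-h", url.hostname);
  if (url.port) args.push("-p", url.port);
  if (url.username) args.push("-U", decodeURIComponent(url.username));
  if (dbName) args.push("-d", dbName);

  // The password and sslmode travel through the environment instead.
  const env: Record<string, string> = { ...baseEnv };
  if (url.password) env.PGPASSWORD = decodeURIComponent(url.password);
  const sslmode = url.searchParams.get("sslmode");
  if (sslmode) env.PGSSLMODE = sslmode;

  return { args, env };
}
```

The result could then be passed to a process spawner, as the diff does with `Bun.spawn(pgArgs, { env: pgEnv })`.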

package/src/core/agents.ts CHANGED

@@ -48,19 +48,25 @@ export function scanAgents(): AgentInfo[] {
       try {
         meta = (yaml.load(fmMatch[1]) as Record<string, unknown>) || {};
       } catch (err) {
-        log.warn({ err, agent: entry.name, path: agentFile }, "failed to parse agent metadata, skipping");
+        log.warn(
+          { err, agent: entry.name, path: agentFile },
+          "failed to parse agent metadata, skipping",
+        );
         continue;
       }
-      const name = (typeof meta.name === "string" ? meta.name : "") || entry.name;
+      const name =
+        (typeof meta.name === "string" ? meta.name : "") || entry.name;
-      if (seen.has(name)) continue;
-      seen.add(name);
+      const key = name.toLowerCase();
+      if (seen.has(key)) continue;
+      seen.add(key);
       const body = content.replace(/^---\n[\s\S]*?\n---\n*/, "").trim();
       agents.push({
         name,
-        description: typeof meta.description === "string" ? meta.description : "",
+        description:
+          typeof meta.description === "string" ? meta.description : "",
         body,
         model: typeof meta.model === "string" ? meta.model : undefined,
         source,
@@ -74,13 +80,21 @@ export function scanAgents(): AgentInfo[] {
 export function getAgentsSummary(): string {
   const agents = scanAgents();
   if (agents.length === 0) return "";
-  const lines = agents.map((a) => a.description ? `- @${a.name}: ${a.description}` : `- @${a.name}`);
+  const lines = agents.map((a) =>
+    a.description ? `- @${a.name}: ${a.description}` : `- @${a.name}`,
+  );
   return `Available agents:\n${lines.join("\n")}`;
 }
-export function getAgentDefinitions(): Record<string, { description: string; prompt: string; model?: string }> {
+export function getAgentDefinitions(): Record<
+  string,
+  { description: string; prompt: string; model?: string }
+> {
   const agents = scanAgents();
-  const defs: Record<string, { description: string; prompt: string; model?: string }> = {};
+  const defs: Record<
+    string,
+    { description: string; prompt: string; model?: string }
+  > = {};
   for (const agent of agents) {
     defs[agent.name] = {
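
Note on the hunks above: `scanAgents` now deduplicates on a lowercased key, so `Coder` and `coder` count as the same agent and the first one found wins. A generic sketch of that behavior follows; `dedupeByNameCaseInsensitive` is a hypothetical helper, not a function in the package.

```typescript
// Hypothetical helper illustrating first-wins, case-insensitive dedup.
function dedupeByNameCaseInsensitive<T extends { name: string }>(items: T[]): T[] {
  const seen = new Set<string>();
  const out: T[] = [];
  for (const item of items) {
    const key = item.name.toLowerCase();
    if (seen.has(key)) continue; // a differently-cased duplicate is skipped
    seen.add(key);
    out.push(item); // the original casing of the first occurrence is kept
  }
  return out;
}
```

The kept entry retains its original `name`, which matches the diff: only the lookup key is lowercased, not the stored name.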

package/src/core/consolidator.ts CHANGED

@@ -22,8 +22,11 @@ import { runTask } from "./runner";
 import { log } from "../utils/log";
 import type { SessionMessage } from "../types";
-/** Track sessions already consolidated to prevent double runs. */
-const consolidated = new Set<string>();
+/** Bounded dedup: sessionId → message count at last consolidation. Prevents re-processing
+ *  the same messages while allowing re-consolidation when new turns arrive. */
+const processedCounts = new Map<string, number>();
+const inFlight = new Set<string>();
+const MAX_TRACKED = 500;
 /** Max messages to include in transcript (most recent). Keeps prompt size bounded. */
 const MAX_TRANSCRIPT_MESSAGES = 50;
@@ -37,11 +40,15 @@ function shouldSkip(room: string): boolean {
 function formatTranscript(messages: SessionMessage[]): string {
   const recent = messages.slice(-MAX_TRANSCRIPT_MESSAGES);
   const skipped = messages.length - recent.length;
-  const prefix = skipped > 0 ? `[...${skipped} earlier messages omitted]\n\n` : "";
-  return prefix + recent
-    .map((m) => `[${m.sender}] (${m.createdAt}): ${m.content.slice(0, 2000)}`)
-    .join("\n\n");
+  const prefix =
+    skipped > 0 ? `[...${skipped} earlier messages omitted]\n\n` : "";
+  return (
+    prefix +
+    recent
+      .map((m) => `[${m.sender}] (${m.createdAt}): ${m.content.slice(0, 2000)}`)
+      .join("\n\n")
+  );
 }
 /** Build the extraction prompt from a conversation transcript. */
@@ -80,7 +87,10 @@ Do NOT message the user about this. Save silently and report a brief summary of
 }
 /** Run the consolidation agent loop. */
-async function runConsolidation(transcript: string, source: string): Promise<void> {
+async function runConsolidation(
+  transcript: string,
+  source: string,
+): Promise<void> {
   await runTask({
     name: "consolidator",
     prompt: buildConsolidationPrompt(transcript, source),
@@ -91,21 +101,42 @@ async function runConsolidation(transcript: string, source: string): Promise<voi
  * Consolidate a chat session's conversation into memories.
  * Called when a chat engine goes idle or is explicitly closed.
  */
-export async function consolidateSession(sessionId: string, room: string): Promise<void> {
+export async function consolidateSession(
+  sessionId: string,
+  room: string,
+): Promise<void> {
   if (shouldSkip(room)) return;
-  if (consolidated.has(sessionId)) return;
-  consolidated.add(sessionId);
+  if (inFlight.has(sessionId)) return;
   try {
     const messages = await Message.getBySession(sessionId);
     if (messages.length < 2) return;
-    log.info({ sessionId, room, messageCount: messages.length }, "consolidator: extracting memories from chat");
+    // Skip if already processed this exact message count
+    if (processedCounts.get(sessionId) === messages.length) return;
+    inFlight.add(sessionId);
+    log.info(
+      { sessionId, room, messageCount: messages.length },
+      "consolidator: extracting memories from chat",
+    );
     const transcript = formatTranscript(messages);
     await runConsolidation(transcript, `chat session idle — ${room}`);
+    // Mark as processed only on success
+    processedCounts.set(sessionId, messages.length);
+    // Evict oldest entries when over cap
+    if (processedCounts.size > MAX_TRACKED) {
+      const firstKey = processedCounts.keys().next().value;
+      if (firstKey) processedCounts.delete(firstKey);
+    }
   } catch (err) {
     log.error({ err, sessionId, room }, "consolidator: chat extraction failed");
+  } finally {
+    inFlight.delete(sessionId);
   }
 }
@@ -113,7 +144,11 @@ export async function consolidateSession(sessionId: string, room: string): Promi
  * Consolidate a job run's output into memories.
  * Called after a job completes in the runner.
  */
-export async function consolidateJobRun(jobName: string, jobPrompt: string, result: string): Promise<void> {
+export async function consolidateJobRun(
+  jobName: string,
+  jobPrompt: string,
+  result: string,
+): Promise<void> {
   // Skip if the job itself is the consolidator (prevent infinite loop)
   if (jobName === "memory-consolidation") return;
@@ -123,7 +158,10 @@ export async function consolidateJobRun(jobName: string, jobPrompt: string, resu
   if (result.length < 50) return;
   try {
-    log.info({ jobName, resultChars: result.length }, "consolidator: extracting memories from job");
+    log.info(
+      { jobName, resultChars: result.length },
+      "consolidator: extracting memories from job",
+    );
     await runConsolidation(transcript, `job run — ${jobName}`);
   } catch (err) {
     log.error({ err, jobName }, "consolidator: job extraction failed");
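
Note on the dedup change in consolidateSession above: the new Map records the message count at the last successful run and is capped by deleting the first key the Map yields. The sketch below isolates that bookkeeping under stated assumptions. `createBoundedCounts`, `record`, `alreadyProcessed`, and `has` are hypothetical names used only for illustration and do not appear in the package. It relies on the fact that a JavaScript Map iterates keys in insertion order, and that updating an existing key does not move it.

```typescript
// Hypothetical stand-in for the module-level Map plus cap shown in the diff.
function createBoundedCounts(maxTracked: number) {
  const counts = new Map<string, number>();
  return {
    // True when this session was already processed at exactly this count.
    alreadyProcessed(id: string, messageCount: number): boolean {
      return counts.get(id) === messageCount;
    },
    // Record a successful run, then drop the first-inserted key if over cap.
    record(id: string, messageCount: number): void {
      counts.set(id, messageCount);
      if (counts.size > maxTracked) {
        const first = counts.keys().next().value;
        if (first !== undefined) counts.delete(first);
      }
    },
    has(id: string): boolean {
      return counts.has(id);
    },
  };
}

const tracker = createBoundedCounts(2);
tracker.record("s1", 4);
tracker.record("s2", 6);
tracker.record("s1", 8); // updates an existing key; its position is unchanged
tracker.record("s3", 2); // size is now 3, so the first-inserted key "s1" is dropped
```

In this sketch eviction follows first insertion rather than most recent use, so a session that keeps receiving new turns can still be the one evicted. An evicted session is simply treated as unprocessed the next time it is seen.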

package/src/core/health.ts CHANGED

@@ -5,6 +5,7 @@ import { getPaths } from "../utils/paths";
 import { isRunning, readPid } from "./daemon";
 import { errMsg } from "../utils/errors";
 import { localTime } from "../utils/time";
+import { withRetry } from "../utils/retry";
 export type CheckStatus = "ok" | "warn" | "fail";
 export type Check = { name: string; status: CheckStatus; detail: string };
@@ -22,9 +23,17 @@ export async function runHealthChecks(): Promise<Check[]> {
   // Daemon
   const pid = readPid();
   if (isRunning()) {
-    checks.push({ name: "daemon", status: "ok", detail: "running (pid: " + pid + ")" });
+    checks.push({
+      name: "daemon",
+      status: "ok",
+      detail: "running (pid: " + pid + ")",
+    });
   } else if (pid) {
-    checks.push({ name: "daemon", status: "fail", detail: "stale pid file (pid: " + pid + ", not running)" });
+    checks.push({
+      name: "daemon",
+      status: "fail",
+      detail: "stale pid file (pid: " + pid + ", not running)",
+    });
   } else {
     checks.push({ name: "daemon", status: "warn", detail: "not running" });
   }
@@ -32,19 +41,35 @@ export async function runHealthChecks(): Promise<Check[]> {
   // Config
   if (existsSync(paths.config)) {
     const raw = readRawConfig();
-    checks.push({ name: "config", status: "ok", detail: Object.keys(raw).length + " keys loaded" });
+    checks.push({
+      name: "config",
+      status: "ok",
+      detail: Object.keys(raw).length + " keys loaded",
+    });
   } else {
-    checks.push({ name: "config", status: "fail", detail: "missing (" + paths.config + ")" });
+    checks.push({
+      name: "config",
+      status: "fail",
+      detail: "missing (" + paths.config + ")",
+    });
   }
   // Database
   try {
     if (!config.database_url || !config.database_url.startsWith("postgres")) {
-      checks.push({ name: "database", status: "fail", detail: 'invalid url: "' + (config.database_url || "(empty)") + '"' });
+      checks.push({
+        name: "database",
+        status: "fail",
+        detail: 'invalid url: "' + (config.database_url || "(empty)") + '"',
+      });
     } else {
       const { checkDbHealth } = await import("../commands/health-db");
       const ok = await checkDbHealth(config.database_url);
-      checks.push({ name: "database", status: ok ? "ok" : "fail", detail: ok ? "connected" : "unreachable" });
+      checks.push({
+        name: "database",
+        status: ok ? "ok" : "fail",
+        detail: ok ? "connected" : "unreachable",
+      });
     }
   } catch (err) {
     checks.push({ name: "database", status: "fail", detail: errMsg(err) });
@@ -60,13 +85,26 @@ export async function runHealthChecks(): Promise<Check[]> {
     const tgToken = config.channels.telegram.bot_token;
     if (tgToken) {
       try {
-        const resp = await fetch(`https://api.telegram.org/bot${tgToken}/getMe`);
-        const data = await resp.json() as { ok: boolean };
+        const resp = await withRetry(() =>
+          fetch(`https://api.telegram.org/bot${tgToken}/getMe`, {
+            signal: AbortSignal.timeout(5000),
+          }),
+        );
+        const data = (await resp.json()) as { ok: boolean };
         results.push(data.ok ? "telegram: connected" : "telegram: auth failed");
-        if (!data.ok) checks.push({ name: "telegram", status: "fail", detail: "auth failed" });
+        if (!data.ok)
+          checks.push({
+            name: "telegram",
+            status: "fail",
+            detail: "auth failed",
+          });
       } catch {
         results.push("telegram: unreachable");
-        checks.push({ name: "telegram", status: "fail", detail: "unreachable" });
+        checks.push({
+          name: "telegram",
+          status: "warn",
+          detail: "unreachable",
+        });
       }
     }
@@ -74,31 +112,57 @@ export async function runHealthChecks(): Promise<Check[]> {
     const slToken = config.channels.slack.bot_token;
     if (slToken) {
       try {
-        const resp = await fetch("https://slack.com/api/auth.test", {
-          method: "POST",
-          headers: { Authorization: `Bearer ${slToken}`, "Content-Type": "application/json" },
-        });
-        const data = await resp.json() as { ok: boolean; error?: string };
-        results.push(data.ok ? "slack: connected" : `slack: ${data.error || "auth failed"}`);
-        if (!data.ok) checks.push({ name: "slack", status: "fail", detail: data.error || "auth failed" });
+        const resp = await withRetry(() =>
+          fetch("https://slack.com/api/auth.test", {
+            method: "POST",
+            headers: {
+              Authorization: `Bearer ${slToken}`,
+              "Content-Type": "application/json",
+            },
+            signal: AbortSignal.timeout(5000),
+          }),
+        );
+        const data = (await resp.json()) as { ok: boolean; error?: string };
+        results.push(
+          data.ok
+            ? "slack: connected"
+            : `slack: ${data.error || "auth failed"}`,
+        );
+        if (!data.ok)
+          checks.push({
+            name: "slack",
+            status: "fail",
+            detail: data.error || "auth failed",
+          });
       } catch {
         results.push("slack: unreachable");
-        checks.push({ name: "slack", status: "fail", detail: "unreachable" });
+        checks.push({ name: "slack", status: "warn", detail: "unreachable" });
       }
     }
     if (results.length === 0) {
-      checks.push({ name: "channels", status: "warn", detail: "enabled but no tokens configured" });
+      checks.push({
+        name: "channels",
+        status: "warn",
+        detail: "enabled but no tokens configured",
+      });
     } else {
       const allOk = results.every((r) => r.includes("connected"));
-      checks.push({ name: "channels", status: allOk ? "ok" : "warn", detail: results.join(", ") });
+      checks.push({
+        name: "channels",
+        status: allOk ? "ok" : "warn",
+        detail: results.join(", "),
+      });
     }
   }
   // API keys
   const geminiKey = config.gemini_api_key;
   const rawConfig = readRawConfig();
-  const openaiKey = typeof rawConfig.openai_api_key === "string" ? rawConfig.openai_api_key : null;
+  const openaiKey =
+    typeof rawConfig.openai_api_key === "string"
+      ? rawConfig.openai_api_key
+      : null;
   const apiKeys: string[] = [];
   if (geminiKey) apiKeys.push("gemini");
   if (openaiKey) apiKeys.push("openai");
@@ -110,11 +174,16 @@ export async function runHealthChecks(): Promise<Check[]> {
   // Persona files
   const personaFiles = ["identity.md", "owner.md", "soul.md"];
-  const missing = personaFiles.filter((f) => !existsSync(join(paths.selfDir, f)));
+  const missing = personaFiles.filter(
+    (f) => !existsSync(join(paths.selfDir, f)),
+  );
   checks.push({
     name: "persona",
     status: missing.length === 0 ? "ok" : "warn",
-    detail: missing.length === 0 ? "all files present" : "missing: " + missing.join(", "),
+    detail:
+      missing.length === 0
+        ? "all files present"
+        : "missing: " + missing.join(", "),
   });
   // Daemon log
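
Note on the channel checks above: after this change a thrown error (network failure, or the 5-second AbortSignal.timeout firing after retries are exhausted) is reported as "warn", while a response with ok: false is still reported as "fail". The sketch below restates that mapping as a pure function. `ProbeOutcome` and `classifyProbe` are hypothetical names, and the sketch is simplified: the diff only pushes a per-channel check on failure, whereas this function also returns a value for the success case.

```typescript
type CheckStatus = "ok" | "warn" | "fail";

// Hypothetical shape: either the API answered, or the request threw.
type ProbeOutcome =
  | { kind: "response"; ok: boolean; error?: string }
  | { kind: "threw" };

function classifyProbe(outcome: ProbeOutcome): {
  status: CheckStatus;
  detail: string;
} {
  // Unreachable is treated as a warning, not a hard failure.
  if (outcome.kind === "threw") {
    return { status: "warn", detail: "unreachable" };
  }
  if (outcome.ok) {
    return { status: "ok", detail: "connected" };
  }
  // The API answered but rejected the token.
  return { status: "fail", detail: outcome.error || "auth failed" };
}

const unreachable = classifyProbe({ kind: "threw" });
const rejected = classifyProbe({ kind: "response", ok: false });
```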

package/src/core/skills.ts CHANGED

@@ -40,15 +40,25 @@ export function scanSkills(): SkillInfo[] {
       try {
         meta = (yaml.load(fmMatch[1]) as Record<string, unknown>) || {};
       } catch (err) {
-        log.warn({ err, skill: entry.name, path: skillFile }, "failed to parse skill metadata, skipping");
+        log.warn(
+          { err, skill: entry.name, path: skillFile },
+          "failed to parse skill metadata, skipping",
+        );
         continue;
       }
-      const name = (typeof meta.name === "string" ? meta.name : "") || entry.name;
+      const name =
+        (typeof meta.name === "string" ? meta.name : "") || entry.name;
-      if (seen.has(name)) continue;
-      seen.add(name);
+      const key = name.toLowerCase();
+      if (seen.has(key)) continue;
+      seen.add(key);
-      skills.push({ name, description: typeof meta.description === "string" ? meta.description : "", source });
+      skills.push({
+        name,
+        description:
+          typeof meta.description === "string" ? meta.description : "",
+        source,
+      });
     }
   }
@@ -62,6 +72,8 @@ export function getSkillNames(): string[] {
 export function getSkillsSummary(): string {
   const skills = scanSkills();
   if (skills.length === 0) return "";
-  const lines = skills.map((s) => s.description ? `- /${s.name}: ${s.description}` : `- /${s.name}`);
+  const lines = skills.map((s) =>
+    s.description ? `- /${s.name}: ${s.description}` : `- /${s.name}`,
+  );
   return `Available skills:\n${lines.join("\n")}`;
 }
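
Note on the dedup key change in scanSkills (the same change appears in scanAgents): the seen-set now stores a lowercased key, while the stored name keeps its original casing. A minimal sketch of that behaviour follows. `dedupeByName` and the sample entries, including their `source` values, are hypothetical and chosen only for illustration.

```typescript
type Named = { name: string; source: string };

// Keep the first entry for each name, comparing names case-insensitively.
function dedupeByName<T extends Named>(entries: T[]): T[] {
  const seen = new Set<string>();
  const kept: T[] = [];
  for (const entry of entries) {
    const key = entry.name.toLowerCase();
    if (seen.has(key)) continue;
    seen.add(key);
    kept.push(entry); // original casing is preserved on the kept entry
  }
  return kept;
}

const deduped = dedupeByName([
  { name: "Optimize", source: "project" },
  { name: "optimize", source: "builtin" },
  { name: "optimization-loop", source: "builtin" },
]);
```

With this rule the first entry encountered wins, so the scan order decides which casing and which source survive a collision.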

package/src/core/summarizer.ts CHANGED

@@ -15,8 +15,10 @@ import { runTask } from "./runner";
 import { log } from "../utils/log";
 import type { SessionMessage } from "../types";
-/** Track sessions already summarized to prevent double runs. */
-const summarized = new Set<string>();
+/** Bounded dedup: sessionId → message count at last summarization. */
+const processedCounts = new Map<string, number>();
+const inFlight = new Set<string>();
+const MAX_TRACKED = 500;
 /** Max messages to include (most recent). */
 const MAX_MESSAGES = 30;
@@ -33,16 +35,26 @@ function formatTranscript(messages: SessionMessage[]): string {
  * Summarize a session and store the result in the sessions table.
  * Called when a chat engine goes idle — produces a context bridge for the next session.
  */
-export async function summarizeSession(sessionId: string, room: string): Promise<void> {
+export async function summarizeSession(
+  sessionId: string,
+  room: string,
+): Promise<void> {
   if (room.includes("placeholder")) return;
-  if (summarized.has(sessionId)) return;
-  summarized.add(sessionId);
+  if (inFlight.has(sessionId)) return;
   try {
     const messages = await Message.getBySession(sessionId);
     if (messages.length < 2) return;
-    log.info({ sessionId, room, messageCount: messages.length }, "summarizer: generating session summary");
+    // Skip if already processed this exact message count
+    if (processedCounts.get(sessionId) === messages.length) return;
+    inFlight.add(sessionId);
+    log.info(
+      { sessionId, room, messageCount: messages.length },
+      "summarizer: generating session summary",
+    );
     const transcript = formatTranscript(messages);
@@ -71,11 +83,24 @@ Keep it concise — a handoff note, not a report. Output ONLY the summary text.`
     const summary = output.agentText.trim();
     if (summary && summary.length > 10 && summary.length < 2000) {
       await Session.setSummary(sessionId, summary);
-      log.info({ sessionId, room, summaryChars: summary.length }, "summarizer: saved");
+      processedCounts.set(sessionId, messages.length);
+      if (processedCounts.size > MAX_TRACKED) {
+        const firstKey = processedCounts.keys().next().value;
+        if (firstKey) processedCounts.delete(firstKey);
+      }
+      log.info(
+        { sessionId, room, summaryChars: summary.length },
+        "summarizer: saved",
+      );
     } else {
-      log.warn({ sessionId, room, length: summary.length }, "summarizer: output too short or too long, skipped");
+      log.warn(
+        { sessionId, room, length: summary.length },
+        "summarizer: output too short or too long, skipped",
+      );
     }
   } catch (err) {
     log.error({ err, sessionId, room }, "summarizer: failed");
+  } finally {
+    inFlight.delete(sessionId);
   }
 }

package/src/db/models/active_engine.ts CHANGED

@@ -26,7 +26,9 @@ export async function unregister(room: string): Promise<void> {
   await sql`DELETE FROM active_engines WHERE room = ${room}`;
 }
-export async function clearStale(maxAgeMs: number = 5 * 60 * 1000): Promise<void> {
+export async function clearStale(
+  maxAgeMs: number = 5 * 60 * 1000,
+): Promise<void> {
   const sql = getSql();
   await sql`DELETE FROM active_engines WHERE last_ping < NOW() - ${maxAgeMs / 1000}::int * interval '1 second'`;
 }
@@ -38,8 +40,8 @@ export async function clearAll(): Promise<void> {
 export async function list(): Promise<ActiveEngine[]> {
   const sql = getSql();
-  await clearStale();
-  const rows = await sql`SELECT room, channel, started_at, last_ping FROM active_engines ORDER BY started_at`;
+  const rows =
+    await sql`SELECT room, channel, started_at, last_ping FROM active_engines ORDER BY started_at`;
   return rows.map((r) => ({
     room: r.room,
     channel: r.channel,

package/src/utils/retry.ts ADDED

@@ -0,0 +1,18 @@
+/** Retry a function with Fibonacci backoff. Only retries on thrown errors (not bad return values). */
+export async function withRetry<T>(
+  fn: () => Promise<T>,
+  retries = 3,
+): Promise<T> {
+  let a = 1,
+    b = 1;
+  for (let i = 0; i <= retries; i++) {
+    try {
+      return await fn();
+    } catch (err) {
+      if (i === retries) throw err;
+      await new Promise((r) => setTimeout(r, a * 1000));
+      [a, b] = [b, a + b];
+    }
+  }
+  throw new Error("unreachable"); // satisfies TS return type
+}
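
Note on the backoff schedule in withRetry above: with the default retries = 3 the function makes up to four attempts and sleeps after each failed attempt except the last. The sketch below reproduces only the delay arithmetic, using the same update rule as the diff. `fibonacciDelays` is a hypothetical helper for illustration and is not part of the package.

```typescript
// Lists the waits, in milliseconds, produced by the [a, b] = [b, a + b]
// update rule when every attempt fails.
function fibonacciDelays(retries: number): number[] {
  const delays: number[] = [];
  let a = 1;
  let b = 1;
  // One wait per failed attempt, except the final attempt, which rethrows.
  for (let i = 0; i < retries; i++) {
    delays.push(a * 1000);
    [a, b] = [b, a + b];
  }
  return delays;
}

const schedule = fibonacciDelays(3); // waits of 1000, 1000, 2000 ms
```

Under these assumptions the worst case for the default settings adds about 4 seconds of sleeping on top of the time spent in the attempts themselves. In the health checks each attempt is additionally bounded by its own 5-second AbortSignal.timeout.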