npm - open-research - Versions diffs - 0.1.3 → 0.1.4 - Mend

open-research 0.1.3 → 0.1.4


Files changed (3)

package/README.md CHANGED

@@ -74,36 +74,159 @@ It has tools that coding agents don't: federated academic paper search, PDF extr
 Everything stays local. Your workspace is a directory with `sources/`, `notes/`, `papers/`, `experiments/`. The agent reads and writes to it. Risky edits go to a review queue.
-## Skills
+## Agent Modes
-Built-in research methodologies. Type `/skill-name` to activate:
+Open Research operates in three modes. Cycle with `Shift+Tab`:
-- **source-scout** — find citation gaps, discover papers
-- **devils-advocate** — stress-test claims and assumptions
-- **methodology-critic** — critique research methodology
-- **evidence-adjudicator** — evaluate evidence quality
-- **experiment-designer** — design experiments
-- **draft-paper** — draft LaTeX papers from workspace evidence
-- **paper-explainer** — explain complex papers
-- **synthesis-updater** — update syntheses with new findings
+### Manual Review (default)
-Create custom skills in `~/.open-research/skills/`.
+The agent proposes changes. You review and accept (`a`) or reject (`r`) each one. Best for sensitive work where every edit matters.
+### Auto-Approve
+All file writes are applied immediately without review. Best for exploratory work where speed matters more than control.
+### Auto-Research
+The most powerful mode. A two-phase autonomous research workflow:
+**Phase 1 — Planning.** The agent enters read-only planning mode. It reads your workspace, searches academic databases, and asks you clarifying questions. It then produces a **Research Charter** — a structured contract defining:
+- The research question (precisely stated)
+- Success criteria (what "done" looks like)
+- Scope boundaries (what's explicitly out of scope)
+- Known starting points (papers, data, leads)
+- Proposed investigation steps
+You review the charter and either approve it, send it back for revision, or cancel.
+**Phase 2 — Execution.** Once approved, the agent executes the charter autonomously — searching papers, reading sources, running analysis code, writing notes, and producing artifacts. It runs until the success criteria are met or it hits a dead end and reports what it found.
+## Research Skills
+Skills are pluggable research methodologies — detailed workflow prompts that guide the agent through a specific research task. Type `/<skill-name>` to activate.
+### Discovery & Reading
+| Skill | What it does |
+|---|---|
+| **`/source-scout`** | Systematically finds papers the workspace is missing. Searches with multiple query variations, evaluates relevance by citation count and venue, fetches key papers, produces a prioritized scout report with gap analysis. |
+| **`/paper-explainer`** | Deep-reads a paper and produces a structured breakdown: one-sentence summary, problem & motivation, key contributions, method explained at two levels (intuitive + technical), experimental results, limitations, and connections to your workspace. |
+| **`/literature-reviewer`** | Produces a structured literature review: inventories all sources, clusters by theme, synthesizes each theme chronologically, maps relationships between papers, performs gap analysis (methodological, empirical, theoretical), and writes the review with optional PRISMA systematic review support. |
+### Critical Evaluation
+| Skill | What it does |
+|---|---|
+| **`/devils-advocate`** | Stress-tests every claim in the workspace. Attacks each one through six lenses: evidence gap, logical gap, scope overclaim, alternative explanation, replication concern, and statistical concern. Actively searches for counter-evidence. Rates each weakness as Critical/Significant/Minor. |
+| **`/methodology-critic`** | Reviews study design, sample selection, controls, measurement validity, statistical methods, and reporting completeness. If code is available, reproduces the analysis to verify results. Rates each study Rigorous/Acceptable/Concerning/Flawed. |
+| **`/evidence-adjudicator`** | Judges conflicting claims using a formal evidence hierarchy (meta-analysis → RCT → cohort → case study → opinion). Checks for bias and conflicts of interest. Delivers a clear verdict with evidence ratings: Strong/Moderate/Weak/Insufficient. |
+### Analysis & Experimentation
+| Skill | What it does |
+|---|---|
+| **`/experiment-designer`** | Autonomous proof engine. Takes a hypothesis and runs the full loop: formalize → design minimal experiment → write code → run it → analyze results → iterate (up to 5x) until proven or disproven. All artifacts saved to `experiments/` with versioned scripts. |
+| **`/data-analyst`** | End-to-end statistical analysis: explore data (distributions, missing values) → clean (with documented decisions) → analyze (appropriate tests, mandatory effect sizes and confidence intervals) → visualize (matplotlib/seaborn) → interpret with honest caveats. |
+### Synthesis & Writing
+| Skill | What it does |
+|---|---|
+| **`/synthesis-updater`** | Living-document management. Integrates new evidence into existing notes with full provenance tracking (`[Source: Author Year]`), confidence labels (`[Strong]`, `[Moderate]`, `[Weak]`, `[Contested]`), change trails, and a synthesis changelog. |
+| **`/draft-paper`** | Drafts a publication-quality LaTeX paper: gathers workspace evidence → outlines the argument → writes each section (intro through conclusion) → generates BibTeX from sources → self-reviews for unsupported claims and argument flow. |
+### Meta
+| Skill | What it does |
+|---|---|
+| **`/skill-creator`** | Create your own custom skills in `~/.open-research/skills/`. Each skill is a markdown file with a workflow prompt — no code needed. |
+## Memory
+The agent learns about you automatically. After each conversation, a background process identifies facts worth remembering — your research field, preferred tools, current projects, methodological preferences.
+Memories persist in `~/.open-research/memory.json` across sessions. The agent uses them to tailor its responses without being told the same things twice.
+```
+/memory              View all stored memories
+/memory clear        Delete everything
+/memory delete <id>  Remove a specific memory
+```
+## Live LaTeX Preview
+When the agent drafts a paper, preview it instantly:
+```
+/preview papers/draft.tex
+```
+Opens a localhost server in your browser with:
+- Sections, math (KaTeX), citations, lists rendered as styled HTML
+- Auto-reload — the page refreshes every time the file changes
+- Dark theme matching the CLI aesthetic
+- No LaTeX installation required for preview
+For final PDF output, the agent compiles with `pdflatex` or `tectonic` via `run_command`.
 ## Tools
+The agent has 13 tools with full filesystem and shell access:
 | Tool | Description |
 |---|---|
-| `read_file` | Read any file with streaming, binary detection |
-| `read_pdf` | Extract text from PDFs |
-| `run_command` | Shell execution — Python, R, LaTeX, anything |
-| `list_directory` | Explore directory trees |
-| `search_external_sources` | arXiv + Semantic Scholar + OpenAlex |
-| `fetch_url` | Fetch web pages and APIs |
+| `read_file` | Read any file — streaming, binary detection, `~` expansion |
+| `read_pdf` | Extract text from PDFs with page-range selection |
+| `run_command` | Shell execution — Python, R, LaTeX, curl, git, anything |
+| `list_directory` | Explore directory trees with depth control |
+| `search_external_sources` | Federated search: arXiv + Semantic Scholar + OpenAlex |
+| `fetch_url` | Fetch web pages and APIs, HTML auto-converted to text via cheerio |
 | `write_new_file` | Create workspace files |
-| `update_existing_file` | Edit with review policy |
-| `ask_user` | Pause and ask for clarification |
-| `search_workspace` | Full-text search across files |
-| `create_paper` | Create LaTeX drafts |
+| `update_existing_file` | Edit existing files with review policy |
+| `ask_user` | Pause and ask the user a question with selectable options |
+| `search_workspace` | Full-text search across workspace files |
+| `create_paper` | Create LaTeX paper drafts |
+| `load_skill` | Activate a research skill |
+| `read_skill_reference` | Read reference materials from active skills |
+## Commands
+| Command | Description |
+|---|---|
+| `/auth` | Connect OpenAI account via browser |
+| `/auth-codex` | Import existing Codex CLI auth |
+| `/init` | Initialize workspace in current directory |
+| `/skills` | List available research skills |
+| `/preview <file>` | Live-preview a LaTeX file in browser |
+| `/memory` | View or manage stored memories |
+| `/config` | View or change settings (model, theme, mode) |
+| `/resume` | Resume a previous session |
+| `/clear` | Start a new conversation |
+| `/help` | Show all commands |
+## Workspace
+```
+my-research/
+  sources/         # PDFs, papers, raw data
+  notes/           # Research notes, syntheses, reviews
+  artifacts/       # Generated outputs
+  papers/          # LaTeX paper drafts
+  experiments/     # Analysis scripts, results, hypotheses
+  .open-research/  # Workspace metadata and session logs
+```
+## Features
+- **Terminal markdown** — bold, italic, code blocks, headings rendered natively
+- **Autocomplete** — slash commands and skills in an arrow-key navigable dropdown
+- **@file mentions** — reference workspace files inline in prompts
+- **Shift+Enter** — multi-line input
+- **Context management** — automatic compaction when history exceeds 90% of context window
+- **Token tracking** — context usage visible in the status bar
+- **Tool activity streaming** — real-time display of what the agent is doing
+- **Update notifications** — checks for new versions on launch
 ## Development
@@ -112,7 +235,7 @@ git clone https://github.com/gangj277/open-research.git
 cd open-research
 npm install
 npm run dev          # dev mode
-npm test             # 63 tests
+npm test             # 80 tests
 npm run build        # production build
 ```

package/dist/cli.js CHANGED

@@ -1779,11 +1779,17 @@ function createOpenAIAuthProvider(credentials, onTokenRefresh, onValidationChang
         } else if (event.type === "response.completed") {
           const resp = event.data.response;
           if (resp?.usage) {
-            const usage = resp.usage;
+            const u = resp.usage;
+            const inputDetails = u.input_tokens_details;
+            const outputDetails = u.output_tokens_details;
+            const inputTokens = u.input_tokens ?? 0;
+            const outputTokens = u.output_tokens ?? 0;
             usageData = {
-              promptTokens: usage.input_tokens ?? 0,
-              completionTokens: usage.output_tokens ?? 0,
-              totalTokens: (usage.input_tokens ?? 0) + (usage.output_tokens ?? 0)
+              promptTokens: inputTokens,
+              completionTokens: outputTokens,
+              totalTokens: u.total_tokens ?? inputTokens + outputTokens,
+              cachedTokens: inputDetails?.cached_tokens ?? 0,
+              reasoningTokens: outputDetails?.reasoning_tokens ?? 0
             };
           }
           if (resp?.model) {
@@ -1895,11 +1901,17 @@ function createOpenAIAuthProvider(credentials, onTokenRefresh, onValidationChang
           case "response.completed": {
             const resp = event.data.response;
             if (resp?.usage) {
-              const responseUsage = resp.usage;
+              const u = resp.usage;
+              const inputDetails = u.input_tokens_details;
+              const outputDetails = u.output_tokens_details;
+              const inputTokens = u.input_tokens ?? 0;
+              const outputTokens = u.output_tokens ?? 0;
               usage = {
-                promptTokens: responseUsage.input_tokens ?? 0,
-                completionTokens: responseUsage.output_tokens ?? 0,
-                totalTokens: (responseUsage.input_tokens ?? 0) + (responseUsage.output_tokens ?? 0)
+                promptTokens: inputTokens,
+                completionTokens: outputTokens,
+                totalTokens: u.total_tokens ?? inputTokens + outputTokens,
+                cachedTokens: inputDetails?.cached_tokens ?? 0,
+                reasoningTokens: outputDetails?.reasoning_tokens ?? 0
               };
             }
             break;
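
Both `response.completed` handlers above now apply the same mapping from the API `usage` object. As a rough illustration, the mapping can be written as one standalone function. `normalizeUsage` and the sample payload are hypothetical names and values invented for this sketch; only the field names come from the hunks above.

```javascript
// Hypothetical helper: mirrors the field mapping shown in the two hunks above.
// It is not a function that exists in cli.js.
function normalizeUsage(u) {
  const inputTokens = u.input_tokens ?? 0;
  const outputTokens = u.output_tokens ?? 0;
  return {
    promptTokens: inputTokens,
    completionTokens: outputTokens,
    // `+` binds tighter than `??`, so the sum is only the fallback value.
    totalTokens: u.total_tokens ?? inputTokens + outputTokens,
    cachedTokens: u.input_tokens_details?.cached_tokens ?? 0,
    reasoningTokens: u.output_tokens_details?.reasoning_tokens ?? 0
  };
}

// Invented sample payload: no total_tokens and no output details.
const sampleUsage = {
  input_tokens: 1200,
  output_tokens: 300,
  input_tokens_details: { cached_tokens: 1000 }
};
const normalized = normalizeUsage(sampleUsage);
console.log(normalized);
```

With this sample, the missing `total_tokens` falls back to the sum of input and output, and the missing `output_tokens_details` yields zero reasoning tokens. `updateUsageFromApi`, changed in the next hunk, then subtracts `cachedTokens` from `promptTokens` and `reasoningTokens` from `completionTokens` before accumulating.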
@@ -4507,35 +4519,65 @@ var MODEL_CONTEXT_WINDOWS = {
   "o4-mini": 2e5
 };
 var DEFAULT_CONTEXT_WINDOW = 128e3;
-var COMPACT_THRESHOLD_PERCENT = 0.9;
+var AUTO_COMPACT_TOKEN_LIMIT = 25e4;
 function getContextWindow(model) {
   return MODEL_CONTEXT_WINDOWS[model] ?? DEFAULT_CONTEXT_WINDOW;
 }
 function getCompactThreshold(model) {
-  return Math.floor(getContextWindow(model) * COMPACT_THRESHOLD_PERCENT);
+  const window = getContextWindow(model);
+  return window > AUTO_COMPACT_TOKEN_LIMIT ? AUTO_COMPACT_TOKEN_LIMIT : Math.floor(window * 0.8);
+}
+function emptyBreakdown() {
+  return { input: 0, output: 0, reasoning: 0, cache: { read: 0, write: 0 }, total: 0 };
 }
 function createSessionUsage() {
   return {
+    cumulative: emptyBreakdown(),
+    lastTurn: emptyBreakdown(),
+    estimatedCurrentTokens: 0,
+    compactionCount: 0,
     inputTokens: 0,
     outputTokens: 0,
     totalTokens: 0,
-    lastTurnTokens: 0,
-    estimatedCurrentTokens: 0,
-    compactionCount: 0
+    lastTurnTokens: 0
   };
 }
 function updateUsageFromApi(usage, apiUsage) {
-  usage.inputTokens += apiUsage.promptTokens;
-  usage.outputTokens += apiUsage.completionTokens;
-  usage.totalTokens += apiUsage.totalTokens;
+  const cached = apiUsage.cachedTokens ?? 0;
+  const reasoning = apiUsage.reasoningTokens ?? 0;
+  const adjustedInput = Math.max(0, apiUsage.promptTokens - cached);
+  const adjustedOutput = Math.max(0, apiUsage.completionTokens - reasoning);
+  usage.cumulative.input += adjustedInput;
+  usage.cumulative.output += adjustedOutput;
+  usage.cumulative.reasoning += reasoning;
+  usage.cumulative.cache.read += cached;
+  usage.cumulative.total += apiUsage.totalTokens;
+  usage.lastTurn = {
+    input: adjustedInput,
+    output: adjustedOutput,
+    reasoning,
+    cache: { read: cached, write: 0 },
+    total: apiUsage.totalTokens
+  };
+  usage.inputTokens = usage.cumulative.input;
+  usage.outputTokens = usage.cumulative.output;
+  usage.totalTokens = usage.cumulative.total;
   usage.lastTurnTokens = apiUsage.totalTokens;
 }
 var PRUNE_PROTECT_TOKENS = 4e4;
 var PRUNE_MIN_SAVINGS = 2e4;
+var PRUNE_SKIP_RECENT_USER_TURNS = 2;
 function pruneToolOutputs(messages) {
+  const userIndices = [];
+  for (let i = messages.length - 1; i >= 0; i--) {
+    if (messages[i].role === "user") userIndices.push(i);
+  }
+  const protectBoundary = userIndices.length >= PRUNE_SKIP_RECENT_USER_TURNS ? userIndices[PRUNE_SKIP_RECENT_USER_TURNS - 1] : 0;
   const toolIndices = [];
   for (let i = 0; i < messages.length; i++) {
-    if (messages[i].role === "tool") toolIndices.push(i);
+    if (messages[i].role === "tool" && i < protectBoundary) {
+      toolIndices.push(i);
+    }
   }
   if (toolIndices.length === 0) return { messages, savedTokens: 0 };
   let protectedTokens = 0;
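
Two policy changes in this hunk are easier to see with concrete numbers: the new compaction threshold and the boundary that shields recent turns from pruning. The sketch below restates both as standalone functions under assumed inputs. `compactThresholdFor`, `protectBoundaryFor`, the 1,000,000-token window, and the sample role list are hypothetical and chosen only for illustration; the constants come from the hunk.

```javascript
const AUTO_COMPACT_TOKEN_LIMIT = 25e4;
const PRUNE_SKIP_RECENT_USER_TURNS = 2;

// Same rule as the new getCompactThreshold, taking the window size directly.
function compactThresholdFor(windowSize) {
  return windowSize > AUTO_COMPACT_TOKEN_LIMIT
    ? AUTO_COMPACT_TOKEN_LIMIT
    : Math.floor(windowSize * 0.8);
}

// Same rule as the new protectBoundary: the index of the second most recent
// user message, or 0 when fewer than two user messages exist.
function protectBoundaryFor(messages) {
  const userIndices = [];
  for (let i = messages.length - 1; i >= 0; i--) {
    if (messages[i].role === "user") userIndices.push(i);
  }
  return userIndices.length >= PRUNE_SKIP_RECENT_USER_TURNS
    ? userIndices[PRUNE_SKIP_RECENT_USER_TURNS - 1]
    : 0;
}

// Window sizes: the default, the o4-mini entry, and a hypothetical large window.
for (const size of [128e3, 2e5, 1e6]) {
  console.log(size, compactThresholdFor(size));
}

// Invented conversation: three user turns, each followed by one tool output.
const sampleMessages = ["user", "tool", "user", "tool", "user", "tool"].map((role) => ({ role }));
const boundary = protectBoundaryFor(sampleMessages);
const prunable = sampleMessages
  .map((m, i) => (m.role === "tool" && i < boundary ? i : -1))
  .filter((i) => i >= 0);
console.log(boundary, prunable);
```

In this sample only the tool output at index 1 is eligible for pruning. A conversation with a single user message gets a boundary of 0, so none of its tool outputs are pruned.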
@@ -4555,44 +4597,105 @@ function pruneToolOutputs(messages) {
     const idx = toolIndices[i];
     const msg = result[idx];
     const oldTokens = estimateMessageTokens(msg);
-    const pruned = "[output pruned \u2014 use read_file to re-read if needed]";
-    savedTokens += oldTokens - estimateTokens(pruned);
-    result[idx] = { ...msg, content: pruned };
+    const stub = "[output pruned \u2014 use read_file to re-read if needed]";
+    savedTokens += oldTokens - estimateTokens(stub);
+    result[idx] = { ...msg, content: stub };
   }
   if (savedTokens < PRUNE_MIN_SAVINGS) {
     return { messages, savedTokens: 0 };
   }
   return { messages: result, savedTokens };
 }
-async function compactConversation(messages, provider, model, signal) {
+var COMPACTION_SYSTEM_PROMPT = `You are a conversation summarizer for a research agent. Your job is to create a handoff summary that another agent instance can use to seamlessly continue the work.
+Do not respond to any questions in the conversation. Only output the summary.
+Respond in the same language the user used.`;
+var COMPACTION_USER_TEMPLATE = `Provide a detailed summary of our conversation above for handoff to another agent that will continue the work.
+Stick to this template:
+## Goal
+[What is the user trying to accomplish? Be specific.]
+## Instructions
+- [Important instructions or preferences the user gave]
+- [Research methodology constraints or requirements]
+- [If there is a research charter or plan, summarize its key points]
+## Discoveries
+- [Key findings from paper searches, data analysis, or experiments]
+- [Important facts, numbers, or evidence discovered]
+- [Any surprising or contradicting results]
+## Accomplished
+- [What work has been completed]
+- [What is currently in progress]
+- [What remains to be done]
+## Relevant Files
+[List workspace files that were read, created, or modified. Include what each contains.]
+- path/to/file.md \u2014 description of contents
+- experiments/script.py \u2014 what it does and its results
+## Active Context
+- [Current research question or hypothesis being investigated]
+- [Which skills are active]
+- [Any pending user decisions or questions]
+## Next Steps
+1. [Most immediate next action]
+2. [Following action]
+3. [And so on]
+{CUSTOM_INSTRUCTIONS}`;
+async function compactConversation(messages, provider, model, customInstructions, signal) {
   const systemMsg = messages.find((m) => m.role === "system");
   const conversationMsgs = messages.filter((m) => m.role !== "system");
   const conversationText = conversationMsgs.map((m) => {
     const role = m.role === "assistant" ? "Agent" : m.role === "user" ? "User" : "Tool";
-    const content = typeof m.content === "string" ? m.content : m.content ? JSON.stringify(m.content) : "[tool calls]";
-    return `[${role}]: ${content?.slice(0, 2e3)}`;
+    let content;
+    if (typeof m.content === "string") {
+      content = m.content.length > 3e3 ? m.content.slice(0, 3e3) + "\n[... truncated]" : m.content;
+    } else if (m.content) {
+      content = JSON.stringify(m.content).slice(0, 1e3);
+    } else if (m.tool_calls?.length) {
+      content = m.tool_calls.map((tc) => `[tool: ${tc.function.name}]`).join(", ");
+    } else {
+      content = "[empty]";
+    }
+    return `[${role}]: ${content}`;
   }).join("\n\n");
+  const customBlock = customInstructions ? `
+Additional instructions: ${customInstructions}` : "";
+  const userPrompt = COMPACTION_USER_TEMPLATE.replace("{CUSTOM_INSTRUCTIONS}", customBlock);
+  const compactionModel = model.includes("5.4") ? "gpt-5.4-mini" : model;
   const summaryResponse = await provider.callLLM({
     messages: [
-      {
-        role: "system",
-        content: "You are performing a CONTEXT COMPACTION. Summarize the conversation into a concise handoff document. Include:\n1. **Goal**: What the user is trying to accomplish\n2. **Key discoveries**: Important findings, file paths, data points\n3. **Work completed**: What has been done so far\n4. **Next steps**: What should happen next\n5. **Active files**: Key file paths and their contents summary\n\nBe concise but preserve all actionable information. This summary will replace the full conversation history."
-      },
+      { role: "system", content: COMPACTION_SYSTEM_PROMPT },
       {
         role: "user",
-        content: `Summarize this conversation:
+        content: `Here is the conversation to summarize:
+${conversationText.slice(0, 12e4)}
+---
-${conversationText.slice(0, 1e5)}`
+${userPrompt}`
       }
     ],
-    model,
+    model: compactionModel,
     maxTokens: 4096
   });
   const compacted = [];
   if (systemMsg) compacted.push(systemMsg);
   compacted.push({
     role: "user",
-    content: "[Context compacted \u2014 previous conversation summarized below]\n\n" + summaryResponse.content
+    content: "What have we accomplished so far in this research session?"
+  });
+  compacted.push({
+    role: "assistant",
+    content: summaryResponse.content
   });
   return compacted;
 }
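
The rewritten transcript builder in `compactConversation` branches on the shape of each message's content. A minimal sketch of that branch logic in isolation follows. `renderContent` is a hypothetical name for this illustration and the sample messages are invented; the limits and placeholder strings are the ones in the hunk.

```javascript
// Hypothetical extraction of the per-message branch added in this hunk.
function renderContent(m) {
  if (typeof m.content === "string") {
    // Long strings are cut at 3000 characters and marked as truncated.
    return m.content.length > 3e3
      ? m.content.slice(0, 3e3) + "\n[... truncated]"
      : m.content;
  }
  if (m.content) {
    // Structured content is serialized and cut at 1000 characters, unmarked.
    return JSON.stringify(m.content).slice(0, 1e3);
  }
  if (m.tool_calls?.length) {
    // Tool-call-only messages are reduced to the tool names.
    return m.tool_calls.map((tc) => `[tool: ${tc.function.name}]`).join(", ");
  }
  return "[empty]";
}

// Invented sample messages covering each branch.
const longText = "x".repeat(3500);
console.log(renderContent({ content: "short note" }));
console.log(renderContent({ content: longText }).length);
console.log(renderContent({
  content: null,
  tool_calls: [{ function: { name: "read_file" } }, { function: { name: "fetch_url" } }]
}));
console.log(renderContent({}));
```

One detail that follows from the branch order: an empty string is still a string, so it renders as an empty line rather than `[empty]`.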
@@ -4610,7 +4713,17 @@ async function maybeCompact(messages, model, provider, usage, signal) {
     usage.compactionCount++;
     return { messages: pruned, didCompact: true };
   }
-  const compacted = await compactConversation(pruned, provider, model, signal);
+  const compacted = await compactConversation(pruned, provider, model, void 0, signal);
+  usage.estimatedCurrentTokens = estimateConversationTokens(compacted);
+  usage.compactionCount++;
+  return { messages: compacted, didCompact: true };
+}
+async function manualCompact(messages, model, provider, usage, customInstructions, signal) {
+  if (messages.length <= 2) {
+    return { messages, didCompact: false };
+  }
+  const { messages: pruned } = pruneToolOutputs(messages);
+  const compacted = await compactConversation(pruned, provider, model, customInstructions, signal);
   usage.estimatedCurrentTokens = estimateConversationTokens(compacted);
   usage.compactionCount++;
   return { messages: compacted, didCompact: true };
@@ -5630,6 +5743,13 @@ var SLASH_COMMANDS = [
   { name: "clear", aliases: ["/new"], description: "Clear conversation and start fresh", category: "session" },
   { name: "help", aliases: ["/commands"], description: "Show available commands", category: "system" },
   { name: "config", aliases: ["/settings"], description: "View or change settings (e.g. /config theme dark)", category: "system" },
+  { name: "compact", aliases: [], description: "Manually compress conversation to save context (e.g. /compact keep the statistics)", category: "session" },
+  { name: "cost", aliases: ["/tokens", "/usage"], description: "Show token usage and cost for the current session", category: "system" },
+  { name: "context", aliases: [], description: "Show context window usage \u2014 how full it is", category: "system" },
+  { name: "btw", aliases: ["/aside"], description: "Ask a side question without affecting the main conversation", category: "session" },
+  { name: "export", aliases: [], description: "Export conversation as markdown to a file", category: "session" },
+  { name: "diff", aliases: ["/changes"], description: "Show files the agent has changed in this session", category: "workspace" },
+  { name: "doctor", aliases: [], description: "Diagnose auth, connectivity, and tool availability", category: "system" },
   { name: "preview", aliases: [], description: "Live preview a LaTeX file in browser (e.g. /preview papers/draft.tex)", category: "workspace" },
   { name: "memory", aliases: ["/memories"], description: "View or clear stored memories about you", category: "system" },
   { name: "exit", aliases: ["/quit", "/q"], description: "Exit Open Research", category: "system" }
@@ -6458,6 +6578,178 @@ function App({
         addSystemMessage("  Esc  unfocus prompt");
         break;
       }
+      case "compact": {
+        if (history.length === 0) {
+          addSystemMessage("Nothing to compact \u2014 conversation is empty.");
+          break;
+        }
+        const customInstructions = args || void 0;
+        addSystemMessage(customInstructions ? `Compacting conversation (preserving: ${customInstructions})...` : "Compacting conversation...");
+        setBusy(true);
+        try {
+          const provider = await createProviderFromStoredAuth({ homeDir });
+          const msgs = [{ role: "system", content: "compaction" }, ...history.map((m) => m)];
+          const { messages: compacted, didCompact } = await manualCompact(
+            msgs,
+            config?.defaults.model ?? "gpt-5.4",
+            provider,
+            sessionTokens,
+            customInstructions
+          );
+          if (didCompact) {
+            const newHistory = compacted.filter((m) => m.role !== "system").map((m) => ({
+              role: m.role,
+              content: m.content
+            }));
+            setHistory(newHistory);
+            const k = (n) => n >= 1e3 ? `${(n / 1e3).toFixed(1)}k` : String(n);
+            setTokenDisplay(`${k(sessionTokens.estimatedCurrentTokens)} ctx \xB7 ${k(sessionTokens.totalTokens)} total`);
+            addSystemMessage(`Compacted. Context reduced to ~${Math.round(sessionTokens.estimatedCurrentTokens / 1e3)}k tokens.`);
+          } else {
+            addSystemMessage("Nothing to compact \u2014 conversation too short.");
+          }
+        } catch (err) {
+          addSystemMessage(`Compaction failed: ${err instanceof Error ? err.message : String(err)}`);
+        } finally {
+          setBusy(false);
+        }
+        break;
+      }
+      case "cost": {
+        const k = (n) => n >= 1e3 ? `${(n / 1e3).toFixed(1)}k` : String(n);
+        const c = sessionTokens.cumulative;
+        addSystemMessage("Session token usage:");
+        addSystemMessage(`  Input:     ${k(c.input)} tokens`);
+        addSystemMessage(`  Output:    ${k(c.output)} tokens`);
+        if (c.reasoning > 0) addSystemMessage(`  Reasoning: ${k(c.reasoning)} tokens`);
+        if (c.cache.read > 0) addSystemMessage(`  Cache read:  ${k(c.cache.read)} tokens`);
+        if (c.cache.write > 0) addSystemMessage(`  Cache write: ${k(c.cache.write)} tokens`);
+        addSystemMessage(`  Total:     ${k(c.total)} tokens`);
+        addSystemMessage(`  Context:   ~${k(sessionTokens.estimatedCurrentTokens)} (current window)`);
+        addSystemMessage(`  Compactions: ${sessionTokens.compactionCount}`);
+        break;
+      }
+      case "context": {
+        const model = config?.defaults.model ?? "gpt-5.4";
+        const window = getContextWindow(model);
+        const threshold = getCompactThreshold(model);
+        const current = sessionTokens.estimatedCurrentTokens || estimateConversationTokens(
+          history.map((m) => m)
+        );
+        const pct = Math.round(current / window * 100);
+        const barWidth = 40;
+        const filled = Math.round(pct / 100 * barWidth);
+        const bar = "\u2588".repeat(filled) + "\u2591".repeat(barWidth - filled);
+        const color = pct > 90 ? "red" : pct > 70 ? "yellow" : "green";
+        addSystemMessage(`Context window: ${model} (${(window / 1e3).toFixed(0)}k)`);
+        addSystemMessage(`  [${bar}] ${pct}%`);
+        addSystemMessage(`  ${(current / 1e3).toFixed(1)}k / ${(window / 1e3).toFixed(0)}k tokens used`);
+        addSystemMessage(`  Auto-compact at ${(threshold / 1e3).toFixed(0)}k (90%)`);
+        if (pct > 80) {
+          addSystemMessage("  Tip: run /compact to free space, or /clear to start fresh.");
+        }
+        break;
+      }
+      case "btw": {
+        if (!args) {
+          addSystemMessage("Usage: /btw <your side question>");
+          break;
+        }
+        if (!hasAuth) {
+          addSystemMessage("Not connected. Run /auth first.");
+          break;
+        }
+        addSystemMessage(`Side question: ${args}`);
+        setBusy(true);
+        try {
+          const provider = await createProviderFromStoredAuth({ homeDir });
+          const response = await provider.callLLM({
+            messages: [
+              { role: "system", content: "Answer this quick side question concisely. Do not reference any prior conversation." },
+              { role: "user", content: args }
+            ],
+            model: config?.defaults.model ?? "gpt-5.4",
+            maxTokens: 1e3
+          });
+          addSystemMessage(`Answer: ${response.content}`);
+        } catch (err) {
+          addSystemMessage(`Error: ${err instanceof Error ? err.message : String(err)}`);
+        } finally {
+          setBusy(false);
+        }
+        break;
+      }
+      case "export": {
+        const fileName = args?.trim() || "conversation-export.md";
+        const exportPath = __require("path").resolve(workspacePath ?? process.cwd(), fileName);
+        const lines = [`# Open Research \u2014 Conversation Export
+`];
+        for (const msg of messages) {
+          if (msg.role === "user") lines.push(`## You
+${msg.text}
+`);
+          else if (msg.role === "assistant") lines.push(`## Agent
+${msg.text}
+`);
+          else lines.push(`> ${msg.text}
+`);
+        }
+        try {
+          const fsModule = __require("fs/promises");
+          await fsModule.writeFile(exportPath, lines.join("\n"), "utf8");
+          addSystemMessage(`Exported ${messages.length} messages to ${exportPath}`);
+        } catch (err) {
+          addSystemMessage(`Export failed: ${err instanceof Error ? err.message : String(err)}`);
+        }
+        break;
+      }
+      case "diff": {
+        if (!workspacePath) {
+          addSystemMessage("No workspace active.");
+          break;
+        }
+        try {
+          const { execSync } = __require("child_process");
+          const gitStatus = execSync("git status --short 2>/dev/null || echo 'Not a git repo'", {
+            cwd: workspacePath,
+            encoding: "utf8"
+          }).trim();
+          if (!gitStatus || gitStatus === "Not a git repo") {
+            addSystemMessage("No changes detected (not a git repo or no modifications).");
+          } else {
+            addSystemMessage("Changed files:");
+            for (const line of gitStatus.split("\n")) {
+              addSystemMessage(`  ${line}`);
+            }
+          }
+        } catch {
+          addSystemMessage("Could not check changes.");
+        }
+        break;
+      }
+      case "doctor": {
+        addSystemMessage("Running diagnostics...");
+        const authResult = await getAuthStatus({ homeDir });
+        addSystemMessage(`  Auth: ${authResult.connected ? "connected" : "not connected"} \u2014 ${authResult.message}`);
+        addSystemMessage(`  Workspace: ${workspacePath ? workspacePath : "none"}`);
+        addSystemMessage(`  Files: ${workspaceFiles.length}`);
+        addSystemMessage(`  Skills: ${skills2.length} loaded`);
+        const mems = await loadMemories({ homeDir });
+        addSystemMessage(`  Memories: ${mems.length} stored`);
+        addSystemMessage(`  Node: ${process.version}`);
+        const toolChecks = ["python3 --version", "pdflatex --version", "git --version"];
+        for (const cmd2 of toolChecks) {
+          try {
+            const { execSync } = __require("child_process");
+            const out = execSync(cmd2 + " 2>&1", { encoding: "utf8", timeout: 3e3 }).trim().split("\n")[0];
+            addSystemMessage(`  ${cmd2.split(" ")[0]}: ${out}`);
+          } catch {
+            addSystemMessage(`  ${cmd2.split(" ")[0]}: not found`);
+          }
+        }
+        addSystemMessage("Diagnostics complete.");
+        break;
+      }
       case "preview": {
         if (!args) {
           addSystemMessage("Usage: /preview <path-to-tex-file>");
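
The `/context` handler above derives a percentage and a 40-character bar from the estimated token count. The sketch below reproduces that arithmetic as a standalone function, with one deliberate deviation that is an assumption of this sketch and not in the diff: the filled width is clamped. In the hunk as written, an estimate well above the window size makes `barWidth - filled` negative, and `String.prototype.repeat` throws a `RangeError` for negative counts. `contextBar` and the sample values are hypothetical.

```javascript
// Hypothetical helper reproducing the /context bar arithmetic.
function contextBar(current, windowSize, barWidth = 40) {
  const pct = Math.round(current / windowSize * 100);
  const rawFilled = Math.round(pct / 100 * barWidth);
  // Clamp (not in the diff) so repeat() never receives a negative count.
  const filled = Math.min(barWidth, Math.max(0, rawFilled));
  const bar = "\u2588".repeat(filled) + "\u2591".repeat(barWidth - filled);
  const color = pct > 90 ? "red" : pct > 70 ? "yellow" : "green";
  return { pct, filled, bar, color };
}

// Invented sample values: half-full default window, then an over-full one.
const half = contextBar(64e3, 128e3);
const over = contextBar(3e5, 128e3);
console.log(`[${half.bar}] ${half.pct}%`);
console.log(`[${over.bar}] ${over.pct}%`);
```

Separately, the handler prints a fixed `(90%)` label next to the auto-compact threshold, while `getCompactThreshold` in this same release returns 80% of the window or the 250k cap, so the printed percentage may not match the threshold value shown beside it.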

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "open-research",
-  "version": "0.1.3",
+  "version": "0.1.4",
   "description": "Local-first research CLI agent — discover papers, synthesize notes, run analysis, and draft artifacts from your terminal.",
   "type": "module",
   "license": "MIT",