daemora 1.0.7 → 1.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -7,7 +7,7 @@
7
7
  <meta name="theme-color" content="#0a0a0f" />
8
8
  <meta name="description" content="Daemora — Self-hosted AI agent platform" />
9
9
  <title>Daemora</title>
10
- <script type="module" crossorigin src="/assets/index-BiMfB4bx.js"></script>
10
+ <script type="module" crossorigin src="/assets/index-AfA65HSy.js"></script>
11
11
  <link rel="stylesheet" crossorigin href="/assets/index-DP95eMOr.css">
12
12
  </head>
13
13
  <body>
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "daemora",
3
- "version": "1.0.7",
3
+ "version": "1.0.9",
4
4
  "description": "A powerful open-source AI agent that runs on your machine. Connects to any AI model, any MCP server, any channel. Fully autonomous - plans, codes, tests, browses, emails, and manages your tools without asking permission.",
5
5
  "main": "src/index.js",
6
6
  "bin": {
@@ -149,30 +149,39 @@ You MUST respond with a JSON object matching this exact schema on every turn:
149
149
  }
150
150
  \`\`\`
151
151
 
152
- ## Rules for each response type:
153
-
154
- ### When you need to use a tool (type = "tool_call"):
155
- - Set type to "tool_call"
156
- - Set tool_call.tool_name to the tool name
157
- - Set tool_call.params to an array of STRING arguments (even numbers must be strings)
158
- - Set text_content to null
159
- - Set finalResponse to false
160
- - You will receive the tool result in the next message, then continue
161
-
162
- ### When you are truly finished (type = "text"):
163
- - Set type to "text"
164
- - Set text_content to a brief summary of what you DID (past tense)
165
- - Set tool_call to null
166
- - Set finalResponse to true
167
-
168
- ## CRITICAL RULES:
169
- 1. NEVER set finalResponse to true unless the work is VERIFIED complete - not just written, but confirmed working.
170
- 2. If the user asks you to DO something (fix, create, edit, build, search, etc.), your FIRST response MUST be type "tool_call". Not text. Not a plan. A tool call.
171
- 3. Chain multiple tool calls across turns. After each tool result, decide: need more tools? Call another. Done with verification? Set finalResponse true.
172
- 4. If a tool fails, try an alternative approach. Do NOT give up and ask the user to do it manually.
173
- 5. After writing or editing any file, ALWAYS read it back to verify the content is correct before moving on.
174
- 6. After any coding task, run the build/test command. If it fails, fix the errors and run again. Repeat until it passes. NEVER set finalResponse true while a build is still failing.
175
- 7. NEVER claim you "fixed" or "created" something without having called writeFile or editFile. Saying it is not doing it.`;
152
+ ## When to use each type
153
+
154
+ ### type = "tool_call"
155
+ - User asks to DO something → FIRST response is always a tool call. Not text. Not a plan.
156
+ - Set tool_call.tool_name and tool_call.params (array of STRINGS).
157
+ - Set text_content to null, finalResponse to false.
158
+ - Chain tool calls across turns until the work is verified complete.
159
+
160
+ ### type = "text"
161
+ - Conversation (greetings, questions, chat) → reply naturally. finalResponse = true.
162
+ - Task complete and verified → concise outcome in 1-3 sentences. finalResponse = true.
163
+
164
+ ## Task execution rules
165
+ 1. Action requests start with a tool call immediately.
166
+ 2. Chain multiple tool calls. After each result: need more? Call another. Done? Verify first, then finalize.
167
+ 3. After writing/editing any file, read it back to verify.
168
+ 4. After code changes, run build/tests. Fix failures until clean.
169
+ 5. Tool fails → try a different approach. That fails → try another. Exhaust every option before reporting failure.
170
+ 6. Never give up. Never ask the user to do it manually. Never report a problem without attempting to solve it.
171
+ 7. Never claim you did something without actually calling the tool.
172
+ 8. Never set finalResponse=true while errors or failures exist.
173
+
174
+ ## Understanding user intent
175
+ - Read the full request carefully. Identify exactly what the user wants done.
176
+ - Infer context from conversation history, memory, and available information.
177
+ - If the request has multiple parts, handle all of them. Don't skip any.
178
+ - If genuinely ambiguous, ask ONE focused question. Otherwise just do it.
179
+
180
+ ## Final response format
181
+ - 1-3 sentences. What happened, from the user's perspective.
182
+ - Never dump tool output, full email bodies, API responses, status codes, message IDs, or JSON.
183
+ - Never ask what to do next or offer follow-up options.
184
+ - Never expose internal details (tool names, IDs, technical artifacts).`;
176
185
  }
177
186
 
178
187
  function renderToolDocs() {
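For reference, the response rules in the hunk above imply turns shaped like the following. A minimal sketch, not taken from the package, using the field names stated in the rules; the file path and summary text are invented:

```js
// Hedged sketch of two turns under the response schema described above (field names from the rules; values hypothetical).
const toolCallTurn = {
  type: "tool_call",
  tool_call: { tool_name: "readFile", params: ["/home/user/project/src/index.js"] }, // params are always strings
  text_content: null,
  finalResponse: false,
};
const finalTurn = {
  type: "text",
  tool_call: null,
  text_content: "Fixed the failing import in src/index.js and verified the build passes.",
  finalResponse: true, // only after the work is verified
};
```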
@@ -190,6 +199,7 @@ ${unconfigured.map(t => `- ${t} — needs: ${TOOL_REQUIRED_KEYS[t].join(" or ")}
190
199
  All tool params are STRINGS. Pass them as an array of strings.
191
200
 
192
201
  ## File Operations
202
+ Always use absolute paths. Resolve ~ and relative paths from the user's context before calling any file tool.
193
203
  - readFile(filePath, offset?, limit?) — Read file with line numbers. Always read before editing.
194
204
  - writeFile(filePath, content) — Create or overwrite file. Content is the complete file.
195
205
  - editFile(filePath, oldString, newString) — Find-and-replace (exactly 3 params). Read file first to get exact match string.
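To illustrate the read-before-edit rule above, a hypothetical sequence of calls a conforming agent might make (paths and strings invented; all params are strings):

```js
// Hypothetical tool_call sequence following the File Operations docs above.
const calls = [
  { tool_name: "readFile", params: ["/home/user/app/src/config.js"] },                             // read before editing
  { tool_name: "editFile", params: ["/home/user/app/src/config.js", "port: 3000", "port: 8080"] }, // exactly 3 params
  { tool_name: "readFile", params: ["/home/user/app/src/config.js"] },                             // read back to verify
];
```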
@@ -244,8 +254,9 @@ ${_isToolConfigured("textToSpeech") ? `- textToSpeech(text, optionsJson?) — Te
244
254
  - writeDailyLog(entry) — Append to today's daily log.
245
255
 
246
256
  ## Agents
257
+ For complex multi-agent tasks, load \`readFile("skills/orchestration.md")\` first — covers parallel execution, contract-based planning, workspace artifacts, and coordination patterns.
247
258
  - spawnAgent(taskDescription, optionsJson?) — Spawn sub-agent. opts: {"profile":"coder|researcher|writer|analyst","extraTools":[...],"skills":["skills/coding.md"],"parentContext":"...","model":"..."}. Pass skills array with skill paths from the Available Skills list — the skill content is injected directly into the sub-agent so it can follow the instructions without loading them. Task description must be comprehensive — sub-agent has no other context.
248
- - parallelAgents(tasksJson, sharedOptionsJson?) — Spawn multiple agents in parallel. tasksJson: [{"description":"...","options":{...}}]. sharedOptionsJson: {"sharedContext":"..."}. Always pass workspace path in sharedContext.
259
+ - parallelAgents(tasksJson, sharedOptionsJson?) — Spawn multiple agents in parallel. tasksJson: [{"description":"...","options":{...}}]. sharedOptionsJson: {"sharedContext":"..."}. Always pass workspace path and shared contract in sharedContext.
249
260
  - manageAgents(action, paramsJson?) — List, kill, or steer agents. action: "list"|"kill"|"steer".
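As a sketch of the Agents tools above (shown as a direct call for readability; the agent actually passes these as string params), a hypothetical parallelAgents invocation with a shared workspace and contract:

```js
// Hypothetical parallelAgents call following the docs above; descriptions, paths, and contract are invented.
parallelAgents(
  JSON.stringify([
    { description: "Implement GET /health in src/server.js returning {status:'ok'}; run the test suite.", options: { profile: "coder", skills: ["skills/coding.md"] } },
    { description: "Write an integration test for GET /health in tests/health.test.js and make it pass.", options: { profile: "coder" } },
  ]),
  JSON.stringify({ sharedContext: "Workspace: /tmp/daemora-task. Contract: GET /health responds 200 with {status:'ok'}." }),
);
```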
250
261
 
251
262
  ### useMCP(serverName, taskDescription)
@@ -285,13 +296,9 @@ The following MCP servers are connected. Use \`useMCP(serverName, taskDescriptio
285
296
 
286
297
  ${serverList}
287
298
 
288
- **IMPORTANT: ALWAYS prefer MCP server tools over built-in equivalents.** For example:
289
- - To send email → use \`useMCP("Fastn", ...)\` (gmail_send_mail) instead of \`sendEmail\`
290
- - To manage calendar → use \`useMCP("Fastn", ...)\` instead of built-in tools
291
- - If an MCP server provides a capability, ALWAYS use it via \`useMCP\` first. Only fall back to built-in tools if no MCP server offers that capability.
299
+ **Prefer MCP servers over built-in tools** when both can do the job. Route tasks through \`useMCP(serverName, taskDescription)\` — the specialist gets only that server's tools. Do not call mcp__ tools directly.
292
300
 
293
- Do NOT call mcp__ tools directly - always route through \`useMCP\`. The specialist agent receives only that server's tools for focused, efficient execution.
294
- Use \`manageMCP("list")\` to check server connection status at any time.`;
301
+ **Never expose MCP tool names to the user.** When describing capabilities, use natural language (e.g. "I can manage your calendar" not "I have google_calendar_create_event"). Internal tool names are implementation details.`;
295
302
  }
296
303
 
297
304
  function renderToolUsageRules() {
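A hedged sketch of the routing rule above: one useMCP call instead of a direct mcp__ tool call (server name, task, and address are hypothetical):

```js
// Hypothetical: route the work through useMCP so the specialist agent receives only that server's tools.
useMCP("calendar", "Create a 30-minute event titled 'Design review' tomorrow at 10:00 and invite alice@example.com.");
// Not: calling an mcp__* calendar tool directly from the main agent.
```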
@@ -372,30 +379,26 @@ function renderDailyLog() {
372
379
  function renderOperationalGuidelines() {
373
380
  return `# Operational Guidelines
374
381
 
375
- ## Tone & Style
376
- - Be concise. 1-3 lines per response. No filler phrases.
377
- - Report what you DID in past tense. Don't narrate tool calls.
378
- - Don't ask "shall I proceed?" just do the work. Only confirm before destructive actions.
379
-
380
- ## Understanding Requirements
381
- - Infer implied intent from vague requests. "make it look better" → spacing, typography, contrast, responsive.
382
- - If truly ambiguous (two valid outcomes), ask ONE focused question. Otherwise just do it.
383
- - Match existing code style, patterns, and conventions.
384
-
385
- ## Workflow: Read → Act → Verify → Fix → Report
386
- 1. **Read** every file before touching it.
387
- 2. **Act** with tools. editFile for small changes, writeFile for rewrites.
388
- 3. **Verify** — readFile after writes. Run build/tests after code changes.
389
- 4. **Fix** — if build/test fails, fix and re-verify. Loop until clean.
390
- 5. **Report** — set finalResponse true only after verification. Summarize in 1-3 sentences.
382
+ ## Workflow
383
+ 1. Read every file before touching it.
384
+ 2. Act with tools: editFile for small changes, writeFile for rewrites.
385
+ 3. Verify — readFile after writes. Run build/tests after code changes.
386
+ 4. Fix — if build/test fails, fix and re-verify until clean.
387
+ 5. Report — brief result in 1-3 sentences. Never expose internal details (tool names, IDs, JSON, raw API data).
391
388
  - NEVER set finalResponse true while a build error or test failure exists.
392
389
 
390
+ ## Requirements
391
+ - Infer implied intent from vague requests. Ask only when truly ambiguous.
392
+ - Match existing code style and conventions.
393
+
393
394
  ## When Blocked
394
- - Don't brute force. Read the error, try a different approach.
395
+ - Read the error, try a different approach. Don't brute force.
395
396
  - Tool fails twice with same params → stop and diagnose.
396
397
  - Never use destructive workarounds to clear a blocker.
397
398
 
398
399
  ## What NOT To Do
400
+ - NEVER expose raw API responses, status codes, message IDs, or internal artifacts to the user.
401
+ - NEVER ask the user what to do next or offer follow-up options. Either do it or don't.
399
402
  - NEVER claim "fixed" without calling writeFile/editFile. NEVER plan without executing.
400
403
  - NEVER ask user to do things manually. NEVER give up after one failure.
401
404
  - NEVER set finalResponse true without verification. NEVER over-engineer.`;
package/src/cli.js CHANGED
@@ -550,6 +550,12 @@ async function handleMCP(action, args) {
550
550
 
551
551
  let serverConfig;
552
552
  if (commandOrUrl.startsWith("http://") || commandOrUrl.startsWith("https://")) {
553
+ // Detect URLs that were truncated by shell (& splits in zsh/bash)
554
+ if (commandOrUrl.includes("?") && !commandOrUrl.includes("&") && restArgs.some(a => a.includes("="))) {
555
+ console.error(`\n ${S.cross} URL appears truncated by the shell. Wrap it in quotes:`);
556
+ console.error(` ${S.arrow} daemora mcp add ${name} "${commandOrUrl}&${restArgs.filter(a => a.includes("=")).join("&")}"\n`);
557
+ process.exit(1);
558
+ }
553
559
  const isSSE = restArgs.includes("--sse");
554
560
  serverConfig = { url: commandOrUrl, enabled: true };
555
561
  if (isSSE) serverConfig.transport = "sse";
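To make the truncation check above concrete: with an unquoted URL the shell treats `&` as a control character, so the CLI receives a URL containing `?` but no `&`, plus stray `key=value` arguments. A hypothetical trace (values invented, matching the quoted example added to the help text later in this diff):

```js
// Hypothetical argv split for: daemora mcp add myserver https://api.example.com/mcp?key=123&id=456  (unquoted)
// commandOrUrl = "https://api.example.com/mcp?key=123"   // contains "?" but no "&"
// restArgs     = ["id=456"]                               // contains "=", so the guard fires
// → the CLI prints the re-quoted command and exits instead of saving a broken server config.
```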
@@ -1880,114 +1886,42 @@ async function handleChannels() {
1880
1886
 
1881
1887
  async function handleModels() {
1882
1888
  const { select, isCancel } = await import("@clack/prompts");
1889
+ const { models: modelRegistry } = await import("./config/models.js");
1883
1890
  const w = 67;
1884
1891
  const line = chalk.hex(P.cyan)("━".repeat(w));
1885
1892
  const rowLine = chalk.hex(P.border)("─".repeat(w));
1886
1893
 
1887
- const PROVIDERS = [
1888
- {
1889
- name: "OpenAI", prefix: "openai", envKey: "OPENAI_API_KEY",
1890
- models: [
1891
- // GPT-5.4
1892
- { id: "gpt-5.4", desc: "GPT-5.4 flagship", price: "$2.50/$15", isNew: true },
1893
- { id: "gpt-5.4-pro", desc: "GPT-5.4 Pro — highest capability", price: "$30/$180", isNew: true },
1894
- // GPT-5.2
1895
- { id: "gpt-5.2", desc: "GPT-5.2 flagship (Dec 2025)", price: "$1.75/$14", isNew: true },
1896
- { id: "gpt-5.2-pro", desc: "GPT-5.2 Pro — extended reasoning", price: "$21/$168", isNew: true },
1897
- // GPT-5.1 / 5
1898
- { id: "gpt-5.1", desc: "GPT-5.1 (Nov 2025)", price: "$1.25/$10", isNew: true },
1899
- { id: "gpt-5", desc: "GPT-5 flagship (Aug 2025)", price: "$1.25/$10" },
1900
- { id: "gpt-5-pro", desc: "GPT-5 Pro — most powerful", price: "$15/$120" },
1901
- { id: "gpt-5-mini", desc: "GPT-5 Mini — fast & cheap", price: "$0.25/$2" },
1902
- { id: "gpt-5-nano", desc: "GPT-5 Nano — cheapest GPT-5", price: "$0.05/$0.40" },
1903
- // Codex
1904
- { id: "gpt-5.3-codex", desc: "Latest coding model (2025)", price: "$1.75/$14", isNew: true },
1905
- { id: "gpt-5.1-codex", desc: "GPT-5.1 Codex — coding", price: "$1.25/$10", isNew: true },
1906
- { id: "gpt-5-codex", desc: "GPT-5 Codex — coding", price: "$1.25/$10" },
1907
- // o-series reasoning
1908
- { id: "o3-pro", desc: "Best reasoning — most thorough", price: "$20/$80" },
1909
- { id: "o3", desc: "Advanced reasoning (Apr 2025)", price: "$2/$8" },
1910
- { id: "o4-mini", desc: "Fast reasoning (Apr 2025)", price: "$1.10/$4.40" },
1911
- { id: "o1-pro", desc: "o1 Pro powerful reasoning", price: "$150/$600" },
1912
- { id: "o1", desc: "o1 reasoning model", price: "$15/$60" },
1913
- { id: "o3-mini", desc: "Lightweight reasoning", price: "$1.10/$4.40" },
1914
- // GPT-4.1 (1M context)
1915
- { id: "gpt-4.1", desc: "1M context, best instruction following", price: "$2/$8" },
1916
- { id: "gpt-4.1-mini", desc: "1M context, fast & affordable (default)", price: "$0.40/$1.60" },
1917
- { id: "gpt-4.1-nano", desc: "1M context, fastest & cheapest", price: "$0.10/$0.40" },
1918
- // GPT-4o & specialized
1919
- { id: "gpt-4o", desc: "Vision + text (128K ctx)", price: "$2.50/$10" },
1920
- { id: "gpt-4o-mini", desc: "GPT-4o Mini (128K ctx)", price: "$0.15/$0.60" },
1921
- { id: "computer-use-preview", desc: "Computer use / GUI automation", price: "$3/$12" },
1922
- ],
1923
- },
1924
- {
1925
- name: "Anthropic", prefix: "anthropic", envKey: "ANTHROPIC_API_KEY",
1926
- models: [
1927
- { id: "claude-opus-4-6", desc: "Most intelligent — complex reasoning", price: "$5/$25", isNew: true },
1928
- { id: "claude-opus-4-5", desc: "Opus 4.5 — complex multi-step tasks", price: "$5/$25", isNew: true },
1929
- { id: "claude-opus-4-1", desc: "Opus 4.1 — long-duration complex tasks", price: "$15/$75" },
1930
- { id: "claude-opus-4", desc: "Opus 4 — extended thinking", price: "$15/$75" },
1931
- { id: "claude-sonnet-4-6", desc: "Best speed/intelligence — coding & agents", price: "$3/$15", isNew: true },
1932
- { id: "claude-sonnet-4-5", desc: "Sonnet 4.5 — coding & agentic tasks", price: "$3/$15" },
1933
- { id: "claude-sonnet-4", desc: "Sonnet 4 — balanced performance", price: "$3/$15" },
1934
- { id: "claude-haiku-4-5", desc: "Fastest — high-volume, cost-sensitive", price: "$1/$5" },
1935
- { id: "claude-haiku-3-5", desc: "3.5 Haiku — fast previous gen", price: "$0.80/$4" },
1936
- { id: "claude-haiku-3", desc: "Haiku 3 — cheapest Claude", price: "$0.25/$1.25" },
1937
- ],
1938
- },
1939
- {
1940
- name: "Google", prefix: "google", envKey: "GOOGLE_AI_API_KEY",
1941
- models: [
1942
- { id: "gemini-3.1-pro-preview", desc: "Latest — complex tasks, reasoning", price: "$2/$12", isNew: true },
1943
- { id: "gemini-3.1-flash-lite-preview", desc: "Latest — cost-efficient & fast", price: "$0.25/$1.50", isNew: true },
1944
- { id: "gemini-3-pro-preview", desc: "Gemini 3 Pro — advanced reasoning", price: "$2/$12", isNew: true },
1945
- { id: "gemini-3-flash-preview", desc: "Gemini 3 Flash — fast & cheap", price: "$0.50/$3", isNew: true },
1946
- { id: "gemini-2.5-pro", desc: "GA — complex reasoning & coding (1M)", price: "$1.25/$10" },
1947
- { id: "gemini-2.5-flash", desc: "Fast & cost-effective for high-volume", price: "$0.30/$2.50" },
1948
- { id: "gemini-2.5-flash-lite", desc: "Speed-optimised for high-throughput", price: "$0.10/$0.40" },
1949
- { id: "gemini-2.0-flash", desc: "Previous gen flash", price: "$0.15/$0.60" },
1950
- { id: "gemini-2.0-flash-lite", desc: "Cheapest Gemini", price: "$0.075/$0.30" },
1951
- ],
1952
- },
1953
- {
1954
- name: "xAI", prefix: "xai", envKey: "XAI_API_KEY",
1955
- models: [
1956
- { id: "grok-4", desc: "Grok 4 — latest & most capable (Jul 2025)", isNew: true },
1957
- { id: "grok-3-beta", desc: "Grok 3 Beta — 131K ctx" },
1958
- { id: "grok-3-mini-beta", desc: "Grok 3 Mini — fast, 131K ctx" },
1959
- ],
1960
- },
1961
- {
1962
- name: "DeepSeek", prefix: "deepseek", envKey: "DEEPSEEK_API_KEY",
1963
- models: [
1964
- { id: "deepseek-chat", desc: "DeepSeek V3 — excellent coder (128K ctx)" },
1965
- { id: "deepseek-reasoner", desc: "DeepSeek R1 — chain-of-thought reasoning" },
1966
- ],
1967
- },
1968
- {
1969
- name: "Mistral", prefix: "mistral", envKey: "MISTRAL_API_KEY",
1970
- models: [
1971
- { id: "mistral-large-2512", desc: "Flagship — best quality (Dec 2025)", isNew: true },
1972
- { id: "mistral-medium-3", desc: "Balanced capability & speed (May 2025)" },
1973
- { id: "codestral-2508", desc: "Code specialist (Aug 2025)" },
1974
- { id: "mistral-small-3.2-24b", desc: "Lightweight, runs locally (24B params)" },
1975
- ],
1976
- },
1977
- {
1978
- name: "Ollama (local)", prefix: "ollama", configured: true,
1979
- models: [
1980
- { id: "llama4-maverick", desc: "Llama 4 Maverick — 17B MoE, 1M ctx, multimodal", price: "free", isNew: true },
1981
- { id: "llama4-scout", desc: "Llama 4 Scout — 17B MoE, 10M ctx", price: "free", isNew: true },
1982
- { id: "llama3.3", desc: "Llama 3.3 70B — best open model (Dec 2024)", price: "free" },
1983
- { id: "qwen2.5", desc: "Qwen 2.5 72B — strong coder", price: "free" },
1984
- { id: "deepseek-r1", desc: "DeepSeek-R1 local — reasoning", price: "free" },
1985
- { id: "mistral", desc: "Mistral 7B — fast small model", price: "free" },
1986
- { id: "phi4", desc: "Phi-4 14B — Microsoft small model", price: "free" },
1987
- { id: "codellama", desc: "CodeLlama — code specialised", price: "free" },
1988
- ],
1989
- },
1990
- ];
1894
+ // ── Build providers dynamically from model registry ─────────────────────
1895
+ const providerEnvKeys = {
1896
+ openai: "OPENAI_API_KEY", anthropic: "ANTHROPIC_API_KEY", google: "GOOGLE_AI_API_KEY",
1897
+ xai: "XAI_API_KEY", deepseek: "DEEPSEEK_API_KEY", mistral: "MISTRAL_API_KEY", ollama: null,
1898
+ };
1899
+ const providerNames = {
1900
+ openai: "OpenAI", anthropic: "Anthropic", google: "Google", xai: "xAI",
1901
+ deepseek: "DeepSeek", mistral: "Mistral", ollama: "Ollama (local)",
1902
+ };
1903
+
1904
+ const providerMap = new Map();
1905
+ for (const [fullId, meta] of Object.entries(modelRegistry)) {
1906
+ const prov = meta.provider;
1907
+ if (!providerMap.has(prov)) {
1908
+ providerMap.set(prov, {
1909
+ name: providerNames[prov] || prov,
1910
+ prefix: prov,
1911
+ envKey: providerEnvKeys[prov] || `${prov.toUpperCase()}_API_KEY`,
1912
+ configured: prov === "ollama" ? true : undefined,
1913
+ models: [],
1914
+ });
1915
+ }
1916
+ const inputPrice = meta.costPer1kInput ? `$${(meta.costPer1kInput * 1000).toFixed(2)}` : null;
1917
+ const outputPrice = meta.costPer1kOutput ? `$${(meta.costPer1kOutput * 1000).toFixed(2)}` : null;
1918
+ const price = prov === "ollama" ? "free" : (inputPrice && outputPrice ? `${inputPrice}/${outputPrice}` : null);
1919
+ const ctx = meta.contextWindow ? `${Math.round(meta.contextWindow / 1000)}K ctx` : "";
1920
+ const caps = (meta.capabilities || []).filter(c => c !== "text" && c !== "tools").join(", ");
1921
+ const desc = [caps, ctx].filter(Boolean).join(" · ") || meta.model;
1922
+ providerMap.get(prov).models.push({ id: meta.model, desc, price });
1923
+ }
1924
+ const PROVIDERS = [...providerMap.values()];
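As a quick check of the derived price string (costs copied from the xai:grok-4 entry added to the model registry later in this diff), the formatting above yields "$3.00/$15.00":

```js
// Standalone check of the price formatting used above (costPer1k values are per 1K tokens, displayed per 1M).
const meta = { costPer1kInput: 0.003, costPer1kOutput: 0.015 }; // xai:grok-4
const inputPrice = `$${(meta.costPer1kInput * 1000).toFixed(2)}`;   // "$3.00"
const outputPrice = `$${(meta.costPer1kOutput * 1000).toFixed(2)}`; // "$15.00"
console.log(`${inputPrice}/${outputPrice}`); // "$3.00/$15.00"
```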
1991
1925
 
1992
1926
  const routingRows = [
1993
1927
  ["DEFAULT_MODEL", process.env.DEFAULT_MODEL || chalk.hex(P.muted)("openai:gpt-4.1-mini (built-in default)")],
@@ -2289,7 +2223,7 @@ ${line}
2289
2223
  ${t.dim("$")} daemora mcp env notion NOTION_TOKEN ntn_...
2290
2224
  ${t.dim("$")} daemora mcp env stripe STRIPE_SECRET_KEY sk_live_...
2291
2225
  ${t.dim("$")} daemora mcp enable notion
2292
- ${t.dim("$")} daemora mcp add myserver https://api.example.com/mcp
2226
+ ${t.dim("$")} daemora mcp add myserver "https://api.example.com/mcp?key=123&id=456"
2293
2227
  ${t.dim("$")} daemora mcp add mysse https://api.example.com/sse --sse
2294
2228
  ${t.dim("$")} daemora mcp remove github
2295
2229
  ${t.dim("$")} daemora mcp add (interactive - prompts for everything)
@@ -333,6 +333,78 @@ export const models = {
333
333
  tier: "cheap",
334
334
  },
335
335
 
336
+ // ─── xAI ───────────────────────────────────────────────────────────────────
337
+
338
+ "xai:grok-4": {
339
+ provider: "xai", model: "grok-4",
340
+ contextWindow: 131_072, compactAt: 90_000,
341
+ costPer1kInput: 0.003, costPer1kOutput: 0.015,
342
+ capabilities: ["text", "tools", "structured-output"],
343
+ tier: "standard",
344
+ },
345
+ "xai:grok-3-beta": {
346
+ provider: "xai", model: "grok-3-beta",
347
+ contextWindow: 131_072, compactAt: 90_000,
348
+ costPer1kInput: 0.003, costPer1kOutput: 0.015,
349
+ capabilities: ["text", "tools"],
350
+ tier: "standard",
351
+ },
352
+ "xai:grok-3-mini-beta": {
353
+ provider: "xai", model: "grok-3-mini-beta",
354
+ contextWindow: 131_072, compactAt: 90_000,
355
+ costPer1kInput: 0.0005, costPer1kOutput: 0.005,
356
+ capabilities: ["text", "tools", "reasoning"],
357
+ tier: "cheap",
358
+ },
359
+
360
+ // ─── DeepSeek ──────────────────────────────────────────────────────────────
361
+
362
+ "deepseek:deepseek-chat": {
363
+ provider: "deepseek", model: "deepseek-chat",
364
+ contextWindow: 128_000, compactAt: 90_000,
365
+ costPer1kInput: 0.00027, costPer1kOutput: 0.0011,
366
+ capabilities: ["text", "tools", "structured-output"],
367
+ tier: "cheap",
368
+ },
369
+ "deepseek:deepseek-reasoner": {
370
+ provider: "deepseek", model: "deepseek-reasoner",
371
+ contextWindow: 128_000, compactAt: 90_000,
372
+ costPer1kInput: 0.00055, costPer1kOutput: 0.0022,
373
+ capabilities: ["text", "reasoning"],
374
+ tier: "cheap",
375
+ },
376
+
377
+ // ─── Mistral ───────────────────────────────────────────────────────────────
378
+
379
+ "mistral:mistral-large-latest": {
380
+ provider: "mistral", model: "mistral-large-latest",
381
+ contextWindow: 128_000, compactAt: 90_000,
382
+ costPer1kInput: 0.002, costPer1kOutput: 0.006,
383
+ capabilities: ["text", "tools", "structured-output"],
384
+ tier: "standard",
385
+ },
386
+ "mistral:mistral-medium-latest": {
387
+ provider: "mistral", model: "mistral-medium-latest",
388
+ contextWindow: 128_000, compactAt: 90_000,
389
+ costPer1kInput: 0.0004, costPer1kOutput: 0.002,
390
+ capabilities: ["text", "tools"],
391
+ tier: "cheap",
392
+ },
393
+ "mistral:codestral-latest": {
394
+ provider: "mistral", model: "codestral-latest",
395
+ contextWindow: 256_000, compactAt: 180_000,
396
+ costPer1kInput: 0.0003, costPer1kOutput: 0.0009,
397
+ capabilities: ["text", "tools"],
398
+ tier: "cheap",
399
+ },
400
+ "mistral:mistral-small-latest": {
401
+ provider: "mistral", model: "mistral-small-latest",
402
+ contextWindow: 128_000, compactAt: 90_000,
403
+ costPer1kInput: 0.0001, costPer1kOutput: 0.0003,
404
+ capabilities: ["text", "tools"],
405
+ tier: "cheap",
406
+ },
407
+
336
408
  // ─── Ollama (local — no cost) ────────────────────────────────────────────────
337
409
 
338
410
  "ollama:llama3": {
@@ -57,7 +57,7 @@ All MCP tool params must be passed as a single JSON string (the first and only a
57
57
  - **Never ask for clarification.** You have everything you need in the task description. Make reasonable decisions and proceed.
58
58
  - **Handle errors yourself.** If a tool call fails, read the error, adjust your approach, try again. Do not give up and report failure unless you have exhausted all approaches.
59
59
  - **Be thorough.** If the task says "update all tasks in a project", update all of them. If it says "research X", gather enough detail to be useful. Don't do a half job.
60
- - **End with a useful summary.** When done, set finalResponse true and write a clear summary: what was done, what was created/updated/found, and any important details the main agent needs.`,
60
+ - **End with a concise summary.** When done, set finalResponse true. Write 1-3 sentences: what was done and key outcomes. Never dump raw API responses, full JSON payloads, message IDs, status codes, or technical artifacts. The main agent will relay your response to the user.`,
61
61
  };
62
62
  }
63
63
 
@@ -214,6 +214,20 @@ class MCPManager {
214
214
  }
215
215
  }
216
216
 
217
+ // Check args for placeholder patterns (e.g. connection strings, paths)
218
+ if (cfg.args && Array.isArray(cfg.args)) {
219
+ const hasArgPlaceholder = cfg.args.some(v =>
220
+ typeof v === "string" && (
221
+ /user:pass@/i.test(v) || /\/Users\/you\//i.test(v) || /YOUR_/i.test(v)
222
+ || /your-.*-here/i.test(v) || /example\.com/i.test(v) || /changeme/i.test(v)
223
+ )
224
+ );
225
+ if (hasArgPlaceholder) {
226
+ console.log(`[MCPManager] Skipping "${name}" - args contain placeholder values. Configure via UI or CLI.`);
227
+ return false;
228
+ }
229
+ }
230
+
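For illustration, a hypothetical server config that the placeholder check above would skip at startup (name and values invented to match the regexes):

```js
// Hypothetical mcp.json entry: the connection string still contains the user:pass@ placeholder.
const cfg = {
  command: "npx",
  args: ["-y", "@example/mcp-postgres", "postgresql://user:pass@localhost:5432/mydb"], // matches /user:pass@/
  enabled: true,
};
// → logs: [MCPManager] Skipping "postgres" - args contain placeholder values. Configure via UI or CLI.
```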
217
231
  if (cfg.headers) {
218
232
  const expandedHeaders = Object.entries(cfg.headers).map(([k, v]) => {
219
233
  if (typeof v === "string") {
@@ -87,104 +87,74 @@ export async function runSetupWizard() {
87
87
  ],
88
88
  }));
89
89
 
90
- if (provider === "openai") {
91
- const key = guard(await p.password({ message: "OpenAI API key (sk-...)", validate: (v) => !v ? "Required" : undefined }));
92
- envConfig.OPENAI_API_KEY = key;
93
- envConfig.DEFAULT_MODEL = guard(await p.select({
94
- message: "OpenAI model",
95
- options: [
96
- { value: "openai:gpt-4.1-mini", label: "gpt-4.1-mini", hint: "1M ctx \u2014 fast & affordable (recommended)" },
97
- { value: "openai:gpt-5.2-pro", label: "gpt-5.2-pro", hint: "GPT-5.2 Pro \u2014 highest capability [NEW]" },
98
- { value: "openai:gpt-5.2", label: "gpt-5.2", hint: "GPT-5.2 flagship (Dec 2025) [NEW]" },
99
- { value: "openai:gpt-5", label: "gpt-5", hint: "GPT-5 flagship (Aug 2025)" },
100
- { value: "openai:gpt-5-mini", label: "gpt-5-mini", hint: "GPT-5 Mini \u2014 fast & cheap" },
101
- { value: "openai:gpt-4.1", label: "gpt-4.1", hint: "1M ctx, best instruction following" },
102
- { value: "openai:gpt-4.1-nano", label: "gpt-4.1-nano", hint: "1M ctx, cheapest" },
103
- { value: "openai:o3-pro", label: "o3-pro", hint: "Best reasoning \u2014 most thorough" },
104
- { value: "openai:o4-mini", label: "o4-mini", hint: "Fast reasoning (Apr 2025)" },
105
- { value: "openai:gpt-4o", label: "gpt-4o", hint: "Vision + text (128K ctx)" },
106
- { value: "openai:gpt-4o-mini", label: "gpt-4o-mini", hint: "GPT-4o Mini \u2014 balanced" },
107
- ],
108
- }));
109
- } else if (provider === "anthropic") {
110
- const key = guard(await p.password({ message: "Anthropic API key (sk-ant-...)", validate: (v) => !v ? "Required" : undefined }));
111
- envConfig.ANTHROPIC_API_KEY = key;
112
- envConfig.DEFAULT_MODEL = guard(await p.select({
113
- message: "Claude model",
114
- options: [
115
- { value: "anthropic:claude-sonnet-4-6", label: "claude-sonnet-4-6", hint: "Best speed/intelligence \u2014 coding & agents [NEW]" },
116
- { value: "anthropic:claude-opus-4-6", label: "claude-opus-4-6", hint: "Most intelligent \u2014 extended thinking [NEW]" },
117
- { value: "anthropic:claude-haiku-4-5", label: "claude-haiku-4-5", hint: "Fastest \u2014 high-volume tasks" },
118
- { value: "anthropic:claude-sonnet-4-5-20250929", label: "claude-sonnet-4-5-20250929", hint: "Sonnet 4.5 \u2014 coding & agentic (200K ctx)" },
119
- { value: "anthropic:claude-3-5-sonnet-latest", label: "claude-3-5-sonnet-latest", hint: "3.5 Sonnet \u2014 widely used previous gen" },
120
- ],
121
- }));
122
- } else if (provider === "google") {
123
- const key = guard(await p.password({ message: "Google AI API key", validate: (v) => !v ? "Required" : undefined }));
124
- envConfig.GOOGLE_AI_API_KEY = key;
125
- envConfig.DEFAULT_MODEL = guard(await p.select({
126
- message: "Gemini model",
127
- options: [
128
- { value: "google:gemini-2.5-flash", label: "gemini-2.5-flash", hint: "Fast & cost-effective \u2014 recommended" },
129
- { value: "google:gemini-3.1-pro-preview", label: "gemini-3.1-pro-preview", hint: "Latest \u2014 complex tasks [NEW]" },
130
- { value: "google:gemini-3.1-flash-lite-preview", label: "gemini-3.1-flash-lite-preview", hint: "Latest lite \u2014 cost-efficient [NEW]" },
131
- { value: "google:gemini-2.5-pro", label: "gemini-2.5-pro", hint: "Complex reasoning & coding (1M ctx)" },
132
- { value: "google:gemini-2.5-flash-lite", label: "gemini-2.5-flash-lite", hint: "Speed-optimised high-throughput" },
133
- { value: "google:gemini-2.0-flash", label: "gemini-2.0-flash", hint: "Previous gen flash" },
134
- ],
135
- }));
136
- } else if (provider === "xai") {
137
- const key = guard(await p.password({ message: "xAI API key", validate: (v) => !v ? "Required" : undefined }));
138
- envConfig.XAI_API_KEY = key;
139
- envConfig.DEFAULT_MODEL = guard(await p.select({
140
- message: "Grok model",
141
- options: [
142
- { value: "xai:grok-4", label: "grok-4", hint: "Latest & most capable (Jul 2025) [NEW]" },
143
- { value: "xai:grok-3-beta", label: "grok-3-beta", hint: "Grok 3 Beta \u2014 131K ctx" },
144
- { value: "xai:grok-3-mini-beta", label: "grok-3-mini-beta", hint: "Grok 3 Mini \u2014 fast, 131K ctx" },
145
- ],
146
- }));
147
- } else if (provider === "deepseek") {
148
- const key = guard(await p.password({ message: "DeepSeek API key (sk-...)", validate: (v) => !v ? "Required" : undefined }));
149
- envConfig.DEEPSEEK_API_KEY = key;
150
- envConfig.DEFAULT_MODEL = guard(await p.select({
151
- message: "DeepSeek model",
152
- options: [
153
- { value: "deepseek:deepseek-chat", label: "deepseek-chat", hint: "V3 \u2014 excellent coder (128K ctx, recommended)" },
154
- { value: "deepseek:deepseek-reasoner", label: "deepseek-reasoner", hint: "R1 \u2014 chain-of-thought reasoning" },
155
- ],
156
- }));
157
- } else if (provider === "mistral") {
158
- const key = guard(await p.password({ message: "Mistral API key", validate: (v) => !v ? "Required" : undefined }));
159
- envConfig.MISTRAL_API_KEY = key;
160
- envConfig.DEFAULT_MODEL = guard(await p.select({
161
- message: "Mistral model",
162
- options: [
163
- { value: "mistral:mistral-large-2512", label: "mistral-large-2512", hint: "Flagship \u2014 best quality (Dec 2025) [NEW]" },
164
- { value: "mistral:mistral-medium-3", label: "mistral-medium-3", hint: "Balanced capability & speed" },
165
- { value: "mistral:codestral-2508", label: "codestral-2508", hint: "Code specialist (Aug 2025)" },
166
- { value: "mistral:mistral-small-3.2-24b", label: "mistral-small-3.2-24b", hint: "Lightweight, runs locally (24B)" },
167
- ],
168
- }));
169
- } else if (provider === "ollama") {
90
+ // Load model registry dynamically
91
+ const { models: modelRegistry } = await import("../config/models.js");
92
+
93
+ // Provider config: API key prompt + env var name
94
+ const providerKeys = {
95
+ openai: { env: "OPENAI_API_KEY", prompt: "OpenAI API key (sk-...)" },
96
+ anthropic: { env: "ANTHROPIC_API_KEY", prompt: "Anthropic API key (sk-ant-...)" },
97
+ google: { env: "GOOGLE_AI_API_KEY", prompt: "Google AI API key" },
98
+ xai: { env: "XAI_API_KEY", prompt: "xAI API key" },
99
+ deepseek: { env: "DEEPSEEK_API_KEY", prompt: "DeepSeek API key (sk-...)" },
100
+ mistral: { env: "MISTRAL_API_KEY", prompt: "Mistral API key" },
101
+ };
102
+
103
+ if (provider === "ollama") {
104
+ // Ollama: list known local models from registry + free text input
105
+ const ollamaModels = Object.entries(modelRegistry)
106
+ .filter(([, m]) => m.provider === "ollama")
107
+ .map(([, m]) => m.model);
108
+ const ollamaHint = ollamaModels.length ? ollamaModels.join(", ") : "llama3.1, qwen2.5-coder";
170
109
  p.note(
171
110
  [
172
111
  "Make sure Ollama is running: ollama serve",
173
- "Pull a model first: ollama pull llama4-maverick",
174
- "Recommended models:",
175
- " llama4-maverick \u2014 Llama 4, 17B MoE, multimodal, 1M ctx",
176
- " llama4-scout \u2014 Llama 4, 17B MoE, 10M ctx",
177
- " llama3.3 \u2014 best 70B open model",
178
- " qwen2.5 \u2014 strong coder",
112
+ "Pull a model first: ollama pull <model>",
113
+ `Known models: ${ollamaHint}`,
114
+ "You can use any model available in your Ollama installation.",
179
115
  ].join("\n"),
180
116
  "Ollama (local models)",
181
117
  );
182
118
  const model = guard(await p.text({
183
119
  message: "Ollama model name",
184
- initialValue: "llama4-maverick",
185
- placeholder: "e.g. llama4-maverick, llama3.3, qwen2.5",
120
+ initialValue: ollamaModels[0] || "llama3.1",
121
+ placeholder: `e.g. ${ollamaHint}`,
186
122
  }));
187
123
  envConfig.DEFAULT_MODEL = `ollama:${model}`;
124
+ } else {
125
+ // Cloud provider: ask for API key, then show models from registry
126
+ const keyInfo = providerKeys[provider];
127
+ if (keyInfo) {
128
+ const key = guard(await p.password({ message: keyInfo.prompt, validate: (v) => !v ? "Required" : undefined }));
129
+ envConfig[keyInfo.env] = key;
130
+ }
131
+
132
+ // Build model options from registry for this provider
133
+ const providerModels = Object.entries(modelRegistry)
134
+ .filter(([, m]) => m.provider === provider)
135
+ .map(([id, m]) => {
136
+ const ctx = m.contextWindow >= 1_000_000
137
+ ? `${(m.contextWindow / 1_000_000).toFixed(0)}M ctx`
138
+ : `${(m.contextWindow / 1_000).toFixed(0)}K ctx`;
139
+ const caps = (m.capabilities || []).filter(c => c !== "text" && c !== "tools").join(", ");
140
+ const price = m.costPer1kInput > 0 ? `$${m.costPer1kInput}/1k in` : "free";
141
+ const parts = [ctx, m.tier, caps, price].filter(Boolean);
142
+ return { value: id, label: m.model, hint: parts.join(" \u00b7 ") };
143
+ });
144
+
145
+ if (providerModels.length > 0) {
146
+ envConfig.DEFAULT_MODEL = guard(await p.select({
147
+ message: `${provider.charAt(0).toUpperCase() + provider.slice(1)} model`,
148
+ options: providerModels,
149
+ }));
150
+ } else {
151
+ // Provider not in registry — free text input
152
+ const model = guard(await p.text({
153
+ message: `${provider} model name (e.g. ${provider}:model-name)`,
154
+ validate: (v) => !v ? "Required" : undefined,
155
+ }));
156
+ envConfig.DEFAULT_MODEL = model.includes(":") ? model : `${provider}:${model}`;
157
+ }
188
158
  }
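To show what the option builder above produces, a worked example using the deepseek-chat entry added to the registry earlier in this diff:

```js
// Input registry entry (from the models config added in this diff):
//   "deepseek:deepseek-chat": { provider: "deepseek", model: "deepseek-chat", contextWindow: 128_000,
//                               costPer1kInput: 0.00027, capabilities: ["text","tools","structured-output"], tier: "cheap" }
// Resulting select option from the mapping above:
//   { value: "deepseek:deepseek-chat", label: "deepseek-chat", hint: "128K ctx · cheap · structured-output · $0.00027/1k in" }
```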
189
159
 
190
160
  p.log.success(`Provider: ${t.bold(provider)} Model: ${t.bold(envConfig.DEFAULT_MODEL)}`);