daemora 1.0.7 → 1.0.9

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -7,7 +7,7 @@
7
7
  <meta name="theme-color" content="#0a0a0f" />
8
8
  <meta name="description" content="Daemora — Self-hosted AI agent platform" />
9
9
  <title>Daemora</title>
10
- <script type="module" crossorigin src="/assets/index-BiMfB4bx.js"></script>
10
+ <script type="module" crossorigin src="/assets/index-AfA65HSy.js"></script>
11
11
  <link rel="stylesheet" crossorigin href="/assets/index-DP95eMOr.css">
12
12
  </head>
13
13
  <body>
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "daemora",
3
- "version": "1.0.7",
3
+ "version": "1.0.9",
4
4
  "description": "A powerful open-source AI agent that runs on your machine. Connects to any AI model, any MCP server, any channel. Fully autonomous - plans, codes, tests, browses, emails, and manages your tools without asking permission.",
5
5
  "main": "src/index.js",
6
6
  "bin": {
@@ -149,30 +149,39 @@ You MUST respond with a JSON object matching this exact schema on every turn:
149
149
  }
150
150
  \`\`\`
151
151
 
152
- ## Rules for each response type:
153
-
154
- ### When you need to use a tool (type = "tool_call"):
155
- - Set type to "tool_call"
156
- - Set tool_call.tool_name to the tool name
157
- - Set tool_call.params to an array of STRING arguments (even numbers must be strings)
158
- - Set text_content to null
159
- - Set finalResponse to false
160
- - You will receive the tool result in the next message, then continue
161
-
162
- ### When you are truly finished (type = "text"):
163
- - Set type to "text"
164
- - Set text_content to a brief summary of what you DID (past tense)
165
- - Set tool_call to null
166
- - Set finalResponse to true
167
-
168
- ## CRITICAL RULES:
169
- 1. NEVER set finalResponse to true unless the work is VERIFIED complete - not just written, but confirmed working.
170
- 2. If the user asks you to DO something (fix, create, edit, build, search, etc.), your FIRST response MUST be type "tool_call". Not text. Not a plan. A tool call.
171
- 3. Chain multiple tool calls across turns. After each tool result, decide: need more tools? Call another. Done with verification? Set finalResponse true.
172
- 4. If a tool fails, try an alternative approach. Do NOT give up and ask the user to do it manually.
173
- 5. After writing or editing any file, ALWAYS read it back to verify the content is correct before moving on.
174
- 6. After any coding task, run the build/test command. If it fails, fix the errors and run again. Repeat until it passes. NEVER set finalResponse true while a build is still failing.
175
- 7. NEVER claim you "fixed" or "created" something without having called writeFile or editFile. Saying it is not doing it.`;
152
+ ## When to use each type
153
+
154
+ ### type = "tool_call"
155
+ - User asks to DO something → FIRST response is always a tool call. Not text. Not a plan.
156
+ - Set tool_call.tool_name and tool_call.params (array of STRINGS).
157
+ - Set text_content to null, finalResponse to false.
158
+ - Chain tool calls across turns until the work is verified complete.
159
+
160
+ ### type = "text"
161
+ - Conversation (greetings, questions, chat) → reply naturally. finalResponse = true.
162
+ - Task complete and verified → concise outcome in 1-3 sentences. finalResponse = true.
163
+
164
+ ## Task execution rules
165
+ 1. Action requests start with a tool call immediately.
166
+ 2. Chain multiple tool calls. After each result: need more? Call another. Done? Verify first, then finalize.
167
+ 3. After writing/editing any file, read it back to verify.
168
+ 4. After code changes, run build/tests. Fix failures until clean.
169
+ 5. Tool fails → try a different approach. That fails → try another. Exhaust every option before reporting failure.
170
+ 6. Never give up. Never ask the user to do it manually. Never report a problem without attempting to solve it.
171
+ 7. Never claim you did something without actually calling the tool.
172
+ 8. Never set finalResponse=true while errors or failures exist.
173
+
174
+ ## Understanding user intent
175
+ - Read the full request carefully. Identify exactly what the user wants done.
176
+ - Infer context from conversation history, memory, and available information.
177
+ - If the request has multiple parts, handle all of them. Don't skip any.
178
+ - If genuinely ambiguous, ask ONE focused question. Otherwise just do it.
179
+
180
+ ## Final response format
181
+ - 1-3 sentences. What happened, from the user's perspective.
182
+ - Never dump tool output, full email bodies, API responses, status codes, message IDs, or JSON.
183
+ - Never ask what to do next or offer follow-up options.
184
+ - Never expose internal details (tool names, IDs, technical artifacts).`;
176
185
  }
177
186
 
178
187
  function renderToolDocs() {
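For reference, the response rules in the hunk above imply turns shaped like the following. A minimal sketch, not taken from the package, using the field names stated in the rules; the file path and summary text are invented:

```js
// Hedged sketch of two turns under the response schema described above (field names from the rules; values hypothetical).
const toolCallTurn = {
  type: "tool_call",
  tool_call: { tool_name: "readFile", params: ["/home/user/project/src/index.js"] }, // params are always strings
  text_content: null,
  finalResponse: false,
};
const finalTurn = {
  type: "text",
  tool_call: null,
  text_content: "Fixed the failing import in src/index.js and verified the build passes.",
  finalResponse: true, // only after the work is verified
};
```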
@@ -190,6 +199,7 @@ ${unconfigured.map(t => `- ${t} — needs: ${TOOL_REQUIRED_KEYS[t].join(" or ")}
190
199
  All tool params are STRINGS. Pass them as an array of strings.
191
200
 
192
201
  ## File Operations
202
+ Always use absolute paths. Resolve ~ and relative paths from the user's context before calling any file tool.
193
203
  - readFile(filePath, offset?, limit?) — Read file with line numbers. Always read before editing.
194
204
  - writeFile(filePath, content) — Create or overwrite file. Content is the complete file.
195
205
  - editFile(filePath, oldString, newString) — Find-and-replace (exactly 3 params). Read file first to get exact match string.
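To illustrate the read-before-edit rule above, a hypothetical sequence of calls a conforming agent might make (paths and strings invented; all params are strings):

```js
// Hypothetical tool_call sequence following the File Operations docs above.
const calls = [
  { tool_name: "readFile", params: ["/home/user/app/src/config.js"] },                             // read before editing
  { tool_name: "editFile", params: ["/home/user/app/src/config.js", "port: 3000", "port: 8080"] }, // exactly 3 params
  { tool_name: "readFile", params: ["/home/user/app/src/config.js"] },                             // read back to verify
];
```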
@@ -244,8 +254,9 @@ ${_isToolConfigured("textToSpeech") ? `- textToSpeech(text, optionsJson?) — Te
244
254
  - writeDailyLog(entry) — Append to today's daily log.
245
255
 
246
256
  ## Agents
257
+ For complex multi-agent tasks, load \`readFile("skills/orchestration.md")\` first — covers parallel execution, contract-based planning, workspace artifacts, and coordination patterns.
247
258
  - spawnAgent(taskDescription, optionsJson?) — Spawn sub-agent. opts: {"profile":"coder|researcher|writer|analyst","extraTools":[...],"skills":["skills/coding.md"],"parentContext":"...","model":"..."}. Pass skills array with skill paths from the Available Skills list — the skill content is injected directly into the sub-agent so it can follow the instructions without loading them. Task description must be comprehensive — sub-agent has no other context.
248
- - parallelAgents(tasksJson, sharedOptionsJson?) — Spawn multiple agents in parallel. tasksJson: [{"description":"...","options":{...}}]. sharedOptionsJson: {"sharedContext":"..."}. Always pass workspace path in sharedContext.
259
+ - parallelAgents(tasksJson, sharedOptionsJson?) — Spawn multiple agents in parallel. tasksJson: [{"description":"...","options":{...}}]. sharedOptionsJson: {"sharedContext":"..."}. Always pass workspace path and shared contract in sharedContext.
249
260
  - manageAgents(action, paramsJson?) — List, kill, or steer agents. action: "list"|"kill"|"steer".
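As a sketch of the Agents tools above (shown as a direct call for readability; the agent actually passes these as string params), a hypothetical parallelAgents invocation with a shared workspace and contract:

```js
// Hypothetical parallelAgents call following the docs above; descriptions, paths, and contract are invented.
parallelAgents(
  JSON.stringify([
    { description: "Implement GET /health in src/server.js returning {status:'ok'}; run the test suite.", options: { profile: "coder", skills: ["skills/coding.md"] } },
    { description: "Write an integration test for GET /health in tests/health.test.js and make it pass.", options: { profile: "coder" } },
  ]),
  JSON.stringify({ sharedContext: "Workspace: /tmp/daemora-task. Contract: GET /health responds 200 with {status:'ok'}." }),
);
```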
250
261
 
251
262
  ### useMCP(serverName, taskDescription)
@@ -285,13 +296,9 @@ The following MCP servers are connected. Use \`useMCP(serverName, taskDescriptio
285
296
 
286
297
  ${serverList}
287
298
 
288
- **IMPORTANT: ALWAYS prefer MCP server tools over built-in equivalents.** For example:
289
- - To send email → use \`useMCP("Fastn", ...)\` (gmail_send_mail) instead of \`sendEmail\`
290
- - To manage calendar → use \`useMCP("Fastn", ...)\` instead of built-in tools
291
- - If an MCP server provides a capability, ALWAYS use it via \`useMCP\` first. Only fall back to built-in tools if no MCP server offers that capability.
299
+ **Prefer MCP servers over built-in tools** when both can do the job. Route tasks through \`useMCP(serverName, taskDescription)\` — the specialist gets only that server's tools. Do not call mcp__ tools directly.
292
300
 
293
- Do NOT call mcp__ tools directly - always route through \`useMCP\`. The specialist agent receives only that server's tools for focused, efficient execution.
294
- Use \`manageMCP("list")\` to check server connection status at any time.`;
301
+ **Never expose MCP tool names to the user.** When describing capabilities, use natural language (e.g. "I can manage your calendar" not "I have google_calendar_create_event"). Internal tool names are implementation details.`;
295
302
  }
296
303
 
297
304
  function renderToolUsageRules() {
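A hedged sketch of the routing rule above: one useMCP call instead of a direct mcp__ tool call (server name, task, and address are hypothetical):

```js
// Hypothetical: route the work through useMCP so the specialist agent receives only that server's tools.
useMCP("calendar", "Create a 30-minute event titled 'Design review' tomorrow at 10:00 and invite alice@example.com.");
// Not: calling an mcp__* calendar tool directly from the main agent.
```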
@@ -372,30 +379,26 @@ function renderDailyLog() {
372
379
  function renderOperationalGuidelines() {
373
380
  return `# Operational Guidelines
374
381
 
375
- ## Tone & Style
376
- - Be concise. 1-3 lines per response. No filler phrases.
377
- - Report what you DID in past tense. Don't narrate tool calls.
378
- - Don't ask "shall I proceed?" just do the work. Only confirm before destructive actions.
379
-
380
- ## Understanding Requirements
381
- - Infer implied intent from vague requests. "make it look better" → spacing, typography, contrast, responsive.
382
- - If truly ambiguous (two valid outcomes), ask ONE focused question. Otherwise just do it.
383
- - Match existing code style, patterns, and conventions.
384
-
385
- ## Workflow: Read → Act → Verify → Fix → Report
386
- 1. **Read** every file before touching it.
387
- 2. **Act** with tools. editFile for small changes, writeFile for rewrites.
388
- 3. **Verify** — readFile after writes. Run build/tests after code changes.
389
- 4. **Fix** — if build/test fails, fix and re-verify. Loop until clean.
390
- 5. **Report** — set finalResponse true only after verification. Summarize in 1-3 sentences.
382
+ ## Workflow
383
+ 1. Read every file before touching it.
384
+ 2. Act with tools: editFile for small changes, writeFile for rewrites.
385
+ 3. Verify — readFile after writes. Run build/tests after code changes.
386
+ 4. Fix — if build/test fails, fix and re-verify until clean.
387
+ 5. Report — brief result in 1-3 sentences. Never expose internal details (tool names, IDs, JSON, raw API data).
391
388
  - NEVER set finalResponse true while a build error or test failure exists.
392
389
 
390
+ ## Requirements
391
+ - Infer implied intent from vague requests. Ask only when truly ambiguous.
392
+ - Match existing code style and conventions.
393
+
393
394
  ## When Blocked
394
- - Don't brute force. Read the error, try a different approach.
395
+ - Read the error, try a different approach. Don't brute force.
395
396
  - Tool fails twice with same params → stop and diagnose.
396
397
  - Never use destructive workarounds to clear a blocker.
397
398
 
398
399
  ## What NOT To Do
400
+ - NEVER expose raw API responses, status codes, message IDs, or internal artifacts to the user.
401
+ - NEVER ask the user what to do next or offer follow-up options. Either do it or don't.
399
402
  - NEVER claim "fixed" without calling writeFile/editFile. NEVER plan without executing.
400
403
  - NEVER ask user to do things manually. NEVER give up after one failure.
401
404
  - NEVER set finalResponse true without verification. NEVER over-engineer.`;
package/src/cli.js CHANGED
@@ -550,6 +550,12 @@ async function handleMCP(action, args) {
550
550
 
551
551
  let serverConfig;
552
552
  if (commandOrUrl.startsWith("http://") || commandOrUrl.startsWith("https://")) {
553
+ // Detect URLs that were truncated by shell (& splits in zsh/bash)
554
+ if (commandOrUrl.includes("?") && !commandOrUrl.includes("&") && restArgs.some(a => a.includes("="))) {
555
+ console.error(`\n ${S.cross} URL appears truncated by the shell. Wrap it in quotes:`);
556
+ console.error(` ${S.arrow} daemora mcp add ${name} "${commandOrUrl}&${restArgs.filter(a => a.includes("=")).join("&")}"\n`);
557
+ process.exit(1);
558
+ }
553
559
  const isSSE = restArgs.includes("--sse");
554
560
  serverConfig = { url: commandOrUrl, enabled: true };
555
561
  if (isSSE) serverConfig.transport = "sse";
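To make the truncation check above concrete: with an unquoted URL the shell treats `&` as a control character, so the CLI receives a URL containing `?` but no `&`, plus stray `key=value` arguments. A hypothetical trace (values invented, matching the quoted example added to the help text later in this diff):

```js
// Hypothetical argv split for: daemora mcp add myserver https://api.example.com/mcp?key=123&id=456  (unquoted)
// commandOrUrl = "https://api.example.com/mcp?key=123"   // contains "?" but no "&"
// restArgs     = ["id=456"]                               // contains "=", so the guard fires
// → the CLI prints the re-quoted command and exits instead of saving a broken server config.
```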
@@ -1880,114 +1886,42 @@ async function handleChannels() {
1880
1886
 
1881
1887
  async function handleModels() {
1882
1888
  const { select, isCancel } = await import("@clack/prompts");
1889
+ const { models: modelRegistry } = await import("./config/models.js");
1883
1890
  const w = 67;
1884
1891
  const line = chalk.hex(P.cyan)("━".repeat(w));
1885
1892
  const rowLine = chalk.hex(P.border)("─".repeat(w));
1886
1893
 
1887
- const PROVIDERS = [
1888
- {
1889
- name: "OpenAI", prefix: "openai", envKey: "OPENAI_API_KEY",
1890
- models: [
1891
- // GPT-5.4
1892
- { id: "gpt-5.4", desc: "GPT-5.4 flagship", price: "$2.50/$15", isNew: true },
1893
- { id: "gpt-5.4-pro", desc: "GPT-5.4 Pro — highest capability", price: "$30/$180", isNew: true },
1894
- // GPT-5.2
1895
- { id: "gpt-5.2", desc: "GPT-5.2 flagship (Dec 2025)", price: "$1.75/$14", isNew: true },
1896
- { id: "gpt-5.2-pro", desc: "GPT-5.2 Pro — extended reasoning", price: "$21/$168", isNew: true },
1897
- // GPT-5.1 / 5
1898
- { id: "gpt-5.1", desc: "GPT-5.1 (Nov 2025)", price: "$1.25/$10", isNew: true },
1899
- { id: "gpt-5", desc: "GPT-5 flagship (Aug 2025)", price: "$1.25/$10" },
1900
- { id: "gpt-5-pro", desc: "GPT-5 Pro — most powerful", price: "$15/$120" },
1901
- { id: "gpt-5-mini", desc: "GPT-5 Mini — fast & cheap", price: "$0.25/$2" },
1902
- { id: "gpt-5-nano", desc: "GPT-5 Nano — cheapest GPT-5", price: "$0.05/$0.40" },
1903
- // Codex
1904
- { id: "gpt-5.3-codex", desc: "Latest coding model (2025)", price: "$1.75/$14", isNew: true },
1905
- { id: "gpt-5.1-codex", desc: "GPT-5.1 Codex — coding", price: "$1.25/$10", isNew: true },
1906
- { id: "gpt-5-codex", desc: "GPT-5 Codex — coding", price: "$1.25/$10" },
1907
- // o-series reasoning
1908
- { id: "o3-pro", desc: "Best reasoning — most thorough", price: "$20/$80" },
1909
- { id: "o3", desc: "Advanced reasoning (Apr 2025)", price: "$2/$8" },
1910
- { id: "o4-mini", desc: "Fast reasoning (Apr 2025)", price: "$1.10/$4.40" },
1911
- { id: "o1-pro", desc: "o1 Pro powerful reasoning", price: "$150/$600" },
1912
- { id: "o1", desc: "o1 reasoning model", price: "$15/$60" },
1913
- { id: "o3-mini", desc: "Lightweight reasoning", price: "$1.10/$4.40" },
1914
- // GPT-4.1 (1M context)
1915
- { id: "gpt-4.1", desc: "1M context, best instruction following", price: "$2/$8" },
1916
- { id: "gpt-4.1-mini", desc: "1M context, fast & affordable (default)", price: "$0.40/$1.60" },
1917
- { id: "gpt-4.1-nano", desc: "1M context, fastest & cheapest", price: "$0.10/$0.40" },
1918
- // GPT-4o & specialized
1919
- { id: "gpt-4o", desc: "Vision + text (128K ctx)", price: "$2.50/$10" },
1920
- { id: "gpt-4o-mini", desc: "GPT-4o Mini (128K ctx)", price: "$0.15/$0.60" },
1921
- { id: "computer-use-preview", desc: "Computer use / GUI automation", price: "$3/$12" },
1922
- ],
1923
- },
1924
- {
1925
- name: "Anthropic", prefix: "anthropic", envKey: "ANTHROPIC_API_KEY",
1926
- models: [
1927
- { id: "claude-opus-4-6", desc: "Most intelligent — complex reasoning", price: "$5/$25", isNew: true },
1928
- { id: "claude-opus-4-5", desc: "Opus 4.5 — complex multi-step tasks", price: "$5/$25", isNew: true },
1929
- { id: "claude-opus-4-1", desc: "Opus 4.1 — long-duration complex tasks", price: "$15/$75" },
1930
- { id: "claude-opus-4", desc: "Opus 4 — extended thinking", price: "$15/$75" },
1931
- { id: "claude-sonnet-4-6", desc: "Best speed/intelligence — coding & agents", price: "$3/$15", isNew: true },
1932
- { id: "claude-sonnet-4-5", desc: "Sonnet 4.5 — coding & agentic tasks", price: "$3/$15" },
1933
- { id: "claude-sonnet-4", desc: "Sonnet 4 — balanced performance", price: "$3/$15" },
1934
- { id: "claude-haiku-4-5", desc: "Fastest — high-volume, cost-sensitive", price: "$1/$5" },
1935
- { id: "claude-haiku-3-5", desc: "3.5 Haiku — fast previous gen", price: "$0.80/$4" },
1936
- { id: "claude-haiku-3", desc: "Haiku 3 — cheapest Claude", price: "$0.25/$1.25" },
1937
- ],
1938
- },
1939
- {
1940
- name: "Google", prefix: "google", envKey: "GOOGLE_AI_API_KEY",
1941
- models: [
1942
- { id: "gemini-3.1-pro-preview", desc: "Latest — complex tasks, reasoning", price: "$2/$12", isNew: true },
1943
- { id: "gemini-3.1-flash-lite-preview", desc: "Latest — cost-efficient & fast", price: "$0.25/$1.50", isNew: true },
1944
- { id: "gemini-3-pro-preview", desc: "Gemini 3 Pro — advanced reasoning", price: "$2/$12", isNew: true },
1945
- { id: "gemini-3-flash-preview", desc: "Gemini 3 Flash — fast & cheap", price: "$0.50/$3", isNew: true },
1946
- { id: "gemini-2.5-pro", desc: "GA — complex reasoning & coding (1M)", price: "$1.25/$10" },
1947
- { id: "gemini-2.5-flash", desc: "Fast & cost-effective for high-volume", price: "$0.30/$2.50" },
1948
- { id: "gemini-2.5-flash-lite", desc: "Speed-optimised for high-throughput", price: "$0.10/$0.40" },
1949
- { id: "gemini-2.0-flash", desc: "Previous gen flash", price: "$0.15/$0.60" },
1950
- { id: "gemini-2.0-flash-lite", desc: "Cheapest Gemini", price: "$0.075/$0.30" },
1951
- ],
1952
- },
1953
- {
1954
- name: "xAI", prefix: "xai", envKey: "XAI_API_KEY",
1955
- models: [
1956
- { id: "grok-4", desc: "Grok 4 — latest & most capable (Jul 2025)", isNew: true },
1957
- { id: "grok-3-beta", desc: "Grok 3 Beta — 131K ctx" },
1958
- { id: "grok-3-mini-beta", desc: "Grok 3 Mini — fast, 131K ctx" },
1959
- ],
1960
- },
1961
- {
1962
- name: "DeepSeek", prefix: "deepseek", envKey: "DEEPSEEK_API_KEY",
1963
- models: [
1964
- { id: "deepseek-chat", desc: "DeepSeek V3 — excellent coder (128K ctx)" },
1965
- { id: "deepseek-reasoner", desc: "DeepSeek R1 — chain-of-thought reasoning" },
1966
- ],
1967
- },
1968
- {
1969
- name: "Mistral", prefix: "mistral", envKey: "MISTRAL_API_KEY",
1970
- models: [
1971
- { id: "mistral-large-2512", desc: "Flagship — best quality (Dec 2025)", isNew: true },
1972
- { id: "mistral-medium-3", desc: "Balanced capability & speed (May 2025)" },
1973
- { id: "codestral-2508", desc: "Code specialist (Aug 2025)" },
1974
- { id: "mistral-small-3.2-24b", desc: "Lightweight, runs locally (24B params)" },
1975
- ],
1976
- },
1977
- {
1978
- name: "Ollama (local)", prefix: "ollama", configured: true,
1979
- models: [
1980
- { id: "llama4-maverick", desc: "Llama 4 Maverick — 17B MoE, 1M ctx, multimodal", price: "free", isNew: true },
1981
- { id: "llama4-scout", desc: "Llama 4 Scout — 17B MoE, 10M ctx", price: "free", isNew: true },
1982
- { id: "llama3.3", desc: "Llama 3.3 70B — best open model (Dec 2024)", price: "free" },
1983
- { id: "qwen2.5", desc: "Qwen 2.5 72B — strong coder", price: "free" },
1984
- { id: "deepseek-r1", desc: "DeepSeek-R1 local — reasoning", price: "free" },
1985
- { id: "mistral", desc: "Mistral 7B — fast small model", price: "free" },
1986
- { id: "phi4", desc: "Phi-4 14B — Microsoft small model", price: "free" },
1987
- { id: "codellama", desc: "CodeLlama — code specialised", price: "free" },
1988
- ],
1989
- },
1990
- ];
1894
+ // ── Build providers dynamically from model registry ─────────────────────
1895
+ const providerEnvKeys = {
1896
+ openai: "OPENAI_API_KEY", anthropic: "ANTHROPIC_API_KEY", google: "GOOGLE_AI_API_KEY",
1897
+ xai: "XAI_API_KEY", deepseek: "DEEPSEEK_API_KEY", mistral: "MISTRAL_API_KEY", ollama: null,
1898
+ };
1899
+ const providerNames = {
1900
+ openai: "OpenAI", anthropic: "Anthropic", google: "Google", xai: "xAI",
1901
+ deepseek: "DeepSeek", mistral: "Mistral", ollama: "Ollama (local)",
1902
+ };
1903
+
1904
+ const providerMap = new Map();
1905
+ for (const [fullId, meta] of Object.entries(modelRegistry)) {
1906
+ const prov = meta.provider;
1907
+ if (!providerMap.has(prov)) {
1908
+ providerMap.set(prov, {
1909
+ name: providerNames[prov] || prov,
1910
+ prefix: prov,
1911
+ envKey: providerEnvKeys[prov] || `${prov.toUpperCase()}_API_KEY`,
1912
+ configured: prov === "ollama" ? true : undefined,
1913
+ models: [],
1914
+ });
1915
+ }
1916
+ const inputPrice = meta.costPer1kInput ? `$${(meta.costPer1kInput * 1000).toFixed(2)}` : null;
1917
+ const outputPrice = meta.costPer1kOutput ? `$${(meta.costPer1kOutput * 1000).toFixed(2)}` : null;
1918
+ const price = prov === "ollama" ? "free" : (inputPrice && outputPrice ? `${inputPrice}/${outputPrice}` : null);
1919
+ const ctx = meta.contextWindow ? `${Math.round(meta.contextWindow / 1000)}K ctx` : "";
1920
+ const caps = (meta.capabilities || []).filter(c => c !== "text" && c !== "tools").join(", ");
1921
+ const desc = [caps, ctx].filter(Boolean).join(" · ") || meta.model;
1922
+ providerMap.get(prov).models.push({ id: meta.model, desc, price });
1923
+ }
1924
+ const PROVIDERS = [...providerMap.values()];
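As a quick check of the derived price string (costs copied from the xai:grok-4 entry added to the model registry later in this diff), the formatting above yields "$3.00/$15.00":

```js
// Standalone check of the price formatting used above (costPer1k values are per 1K tokens, displayed per 1M).
const meta = { costPer1kInput: 0.003, costPer1kOutput: 0.015 }; // xai:grok-4
const inputPrice = `$${(meta.costPer1kInput * 1000).toFixed(2)}`;   // "$3.00"
const outputPrice = `$${(meta.costPer1kOutput * 1000).toFixed(2)}`; // "$15.00"
console.log(`${inputPrice}/${outputPrice}`); // "$3.00/$15.00"
```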
1991
1925
 
1992
1926
  const routingRows = [
1993
1927
  ["DEFAULT_MODEL", process.env.DEFAULT_MODEL || chalk.hex(P.muted)("openai:gpt-4.1-mini (built-in default)")],
@@ -2289,7 +2223,7 @@ ${line}
2289
2223
  ${t.dim("$")} daemora mcp env notion NOTION_TOKEN ntn_...
2290
2224
  ${t.dim("$")} daemora mcp env stripe STRIPE_SECRET_KEY sk_live_...
2291
2225
  ${t.dim("$")} daemora mcp enable notion
2292
- ${t.dim("$")} daemora mcp add myserver https://api.example.com/mcp
2226
+ ${t.dim("$")} daemora mcp add myserver "https://api.example.com/mcp?key=123&id=456"
2293
2227
  ${t.dim("$")} daemora mcp add mysse https://api.example.com/sse --sse
2294
2228
  ${t.dim("$")} daemora mcp remove github
2295
2229
  ${t.dim("$")} daemora mcp add (interactive - prompts for everything)
@@ -333,6 +333,78 @@ export const models = {
333
333
  tier: "cheap",
334
334
  },
335
335
 
336
+ // ─── xAI ───────────────────────────────────────────────────────────────────
337
+
338
+ "xai:grok-4": {
339
+ provider: "xai", model: "grok-4",
340
+ contextWindow: 131_072, compactAt: 90_000,
341
+ costPer1kInput: 0.003, costPer1kOutput: 0.015,
342
+ capabilities: ["text", "tools", "structured-output"],
343
+ tier: "standard",
344
+ },
345
+ "xai:grok-3-beta": {
346
+ provider: "xai", model: "grok-3-beta",
347
+ contextWindow: 131_072, compactAt: 90_000,
348
+ costPer1kInput: 0.003, costPer1kOutput: 0.015,
349
+ capabilities: ["text", "tools"],
350
+ tier: "standard",
351
+ },
352
+ "xai:grok-3-mini-beta": {
353
+ provider: "xai", model: "grok-3-mini-beta",
354
+ contextWindow: 131_072, compactAt: 90_000,
355
+ costPer1kInput: 0.0005, costPer1kOutput: 0.005,
356
+ capabilities: ["text", "tools", "reasoning"],
357
+ tier: "cheap",
358
+ },
359
+
360
+ // ─── DeepSeek ──────────────────────────────────────────────────────────────
361
+
362
+ "deepseek:deepseek-chat": {
363
+ provider: "deepseek", model: "deepseek-chat",
364
+ contextWindow: 128_000, compactAt: 90_000,
365
+ costPer1kInput: 0.00027, costPer1kOutput: 0.0011,
366
+ capabilities: ["text", "tools", "structured-output"],
367
+ tier: "cheap",
368
+ },
369
+ "deepseek:deepseek-reasoner": {
370
+ provider: "deepseek", model: "deepseek-reasoner",
371
+ contextWindow: 128_000, compactAt: 90_000,
372
+ costPer1kInput: 0.00055, costPer1kOutput: 0.0022,
373
+ capabilities: ["text", "reasoning"],
374
+ tier: "cheap",
375
+ },
376
+
377
+ // ─── Mistral ───────────────────────────────────────────────────────────────
378
+
379
+ "mistral:mistral-large-latest": {
380
+ provider: "mistral", model: "mistral-large-latest",
381
+ contextWindow: 128_000, compactAt: 90_000,
382
+ costPer1kInput: 0.002, costPer1kOutput: 0.006,
383
+ capabilities: ["text", "tools", "structured-output"],
384
+ tier: "standard",
385
+ },
386
+ "mistral:mistral-medium-latest": {
387
+ provider: "mistral", model: "mistral-medium-latest",
388
+ contextWindow: 128_000, compactAt: 90_000,
389
+ costPer1kInput: 0.0004, costPer1kOutput: 0.002,
390
+ capabilities: ["text", "tools"],
391
+ tier: "cheap",
392
+ },
393
+ "mistral:codestral-latest": {
394
+ provider: "mistral", model: "codestral-latest",
395
+ contextWindow: 256_000, compactAt: 180_000,
396
+ costPer1kInput: 0.0003, costPer1kOutput: 0.0009,
397
+ capabilities: ["text", "tools"],
398
+ tier: "cheap",
399
+ },
400
+ "mistral:mistral-small-latest": {
401
+ provider: "mistral", model: "mistral-small-latest",
402
+ contextWindow: 128_000, compactAt: 90_000,
403
+ costPer1kInput: 0.0001, costPer1kOutput: 0.0003,
404
+ capabilities: ["text", "tools"],
405
+ tier: "cheap",
406
+ },
407
+
336
408
  // ─── Ollama (local — no cost) ────────────────────────────────────────────────
337
409
 
338
410
  "ollama:llama3": {
@@ -57,7 +57,7 @@ All MCP tool params must be passed as a single JSON string (the first and only a
57
57
  - **Never ask for clarification.** You have everything you need in the task description. Make reasonable decisions and proceed.
58
58
  - **Handle errors yourself.** If a tool call fails, read the error, adjust your approach, try again. Do not give up and report failure unless you have exhausted all approaches.
59
59
  - **Be thorough.** If the task says "update all tasks in a project", update all of them. If it says "research X", gather enough detail to be useful. Don't do a half job.
60
- - **End with a useful summary.** When done, set finalResponse true and write a clear summary: what was done, what was created/updated/found, and any important details the main agent needs.`,
60
+ - **End with a concise summary.** When done, set finalResponse true. Write 1-3 sentences: what was done and key outcomes. Never dump raw API responses, full JSON payloads, message IDs, status codes, or technical artifacts. The main agent will relay your response to the user.`,
61
61
  };
62
62
  }
63
63
 
@@ -214,6 +214,20 @@ class MCPManager {
214
214
  }
215
215
  }
216
216
 
217
+ // Check args for placeholder patterns (e.g. connection strings, paths)
218
+ if (cfg.args && Array.isArray(cfg.args)) {
219
+ const hasArgPlaceholder = cfg.args.some(v =>
220
+ typeof v === "string" && (
221
+ /user:pass@/i.test(v) || /\/Users\/you\//i.test(v) || /YOUR_/i.test(v)
222
+ || /your-.*-here/i.test(v) || /example\.com/i.test(v) || /changeme/i.test(v)
223
+ )
224
+ );
225
+ if (hasArgPlaceholder) {
226
+ console.log(`[MCPManager] Skipping "${name}" - args contain placeholder values. Configure via UI or CLI.`);
227
+ return false;
228
+ }
229
+ }
230
+
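For illustration, a hypothetical server config that the placeholder check above would skip at startup (name and values invented to match the regexes):

```js
// Hypothetical mcp.json entry: the connection string still contains the user:pass@ placeholder.
const cfg = {
  command: "npx",
  args: ["-y", "@example/mcp-postgres", "postgresql://user:pass@localhost:5432/mydb"], // matches /user:pass@/
  enabled: true,
};
// → logs: [MCPManager] Skipping "postgres" - args contain placeholder values. Configure via UI or CLI.
```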
217
231
  if (cfg.headers) {
218
232
  const expandedHeaders = Object.entries(cfg.headers).map(([k, v]) => {
219
233
  if (typeof v === "string") {
@@ -87,104 +87,74 @@ export async function runSetupWizard() {
87
87
  ],
88
88
  }));
89
89
 
90
- if (provider === "openai") {
91
- const key = guard(await p.password({ message: "OpenAI API key (sk-...)", validate: (v) => !v ? "Required" : undefined }));
92
- envConfig.OPENAI_API_KEY = key;
93
- envConfig.DEFAULT_MODEL = guard(await p.select({
94
- message: "OpenAI model",
95
- options: [
96
- { value: "openai:gpt-4.1-mini", label: "gpt-4.1-mini", hint: "1M ctx \u2014 fast & affordable (recommended)" },
97
- { value: "openai:gpt-5.2-pro", label: "gpt-5.2-pro", hint: "GPT-5.2 Pro \u2014 highest capability [NEW]" },
98
- { value: "openai:gpt-5.2", label: "gpt-5.2", hint: "GPT-5.2 flagship (Dec 2025) [NEW]" },
99
- { value: "openai:gpt-5", label: "gpt-5", hint: "GPT-5 flagship (Aug 2025)" },
100
- { value: "openai:gpt-5-mini", label: "gpt-5-mini", hint: "GPT-5 Mini \u2014 fast & cheap" },
101
- { value: "openai:gpt-4.1", label: "gpt-4.1", hint: "1M ctx, best instruction following" },
102
- { value: "openai:gpt-4.1-nano", label: "gpt-4.1-nano", hint: "1M ctx, cheapest" },
103
- { value: "openai:o3-pro", label: "o3-pro", hint: "Best reasoning \u2014 most thorough" },
104
- { value: "openai:o4-mini", label: "o4-mini", hint: "Fast reasoning (Apr 2025)" },
105
- { value: "openai:gpt-4o", label: "gpt-4o", hint: "Vision + text (128K ctx)" },
106
- { value: "openai:gpt-4o-mini", label: "gpt-4o-mini", hint: "GPT-4o Mini \u2014 balanced" },
107
- ],
108
- }));
109
- } else if (provider === "anthropic") {
110
- const key = guard(await p.password({ message: "Anthropic API key (sk-ant-...)", validate: (v) => !v ? "Required" : undefined }));
111
- envConfig.ANTHROPIC_API_KEY = key;
112
- envConfig.DEFAULT_MODEL = guard(await p.select({
113
- message: "Claude model",
114
- options: [
115
- { value: "anthropic:claude-sonnet-4-6", label: "claude-sonnet-4-6", hint: "Best speed/intelligence \u2014 coding & agents [NEW]" },
116
- { value: "anthropic:claude-opus-4-6", label: "claude-opus-4-6", hint: "Most intelligent \u2014 extended thinking [NEW]" },
117
- { value: "anthropic:claude-haiku-4-5", label: "claude-haiku-4-5", hint: "Fastest \u2014 high-volume tasks" },
118
- { value: "anthropic:claude-sonnet-4-5-20250929", label: "claude-sonnet-4-5-20250929", hint: "Sonnet 4.5 \u2014 coding & agentic (200K ctx)" },
119
- { value: "anthropic:claude-3-5-sonnet-latest", label: "claude-3-5-sonnet-latest", hint: "3.5 Sonnet \u2014 widely used previous gen" },
120
- ],
121
- }));
122
- } else if (provider === "google") {
123
- const key = guard(await p.password({ message: "Google AI API key", validate: (v) => !v ? "Required" : undefined }));
124
- envConfig.GOOGLE_AI_API_KEY = key;
125
- envConfig.DEFAULT_MODEL = guard(await p.select({
126
- message: "Gemini model",
127
- options: [
128
- { value: "google:gemini-2.5-flash", label: "gemini-2.5-flash", hint: "Fast & cost-effective \u2014 recommended" },
129
- { value: "google:gemini-3.1-pro-preview", label: "gemini-3.1-pro-preview", hint: "Latest \u2014 complex tasks [NEW]" },
130
- { value: "google:gemini-3.1-flash-lite-preview", label: "gemini-3.1-flash-lite-preview", hint: "Latest lite \u2014 cost-efficient [NEW]" },
131
- { value: "google:gemini-2.5-pro", label: "gemini-2.5-pro", hint: "Complex reasoning & coding (1M ctx)" },
132
- { value: "google:gemini-2.5-flash-lite", label: "gemini-2.5-flash-lite", hint: "Speed-optimised high-throughput" },
133
- { value: "google:gemini-2.0-flash", label: "gemini-2.0-flash", hint: "Previous gen flash" },
134
- ],
135
- }));
136
- } else if (provider === "xai") {
137
- const key = guard(await p.password({ message: "xAI API key", validate: (v) => !v ? "Required" : undefined }));
138
- envConfig.XAI_API_KEY = key;
139
- envConfig.DEFAULT_MODEL = guard(await p.select({
140
- message: "Grok model",
141
- options: [
142
- { value: "xai:grok-4", label: "grok-4", hint: "Latest & most capable (Jul 2025) [NEW]" },
143
- { value: "xai:grok-3-beta", label: "grok-3-beta", hint: "Grok 3 Beta \u2014 131K ctx" },
144
- { value: "xai:grok-3-mini-beta", label: "grok-3-mini-beta", hint: "Grok 3 Mini \u2014 fast, 131K ctx" },
145
- ],
146
- }));
147
- } else if (provider === "deepseek") {
148
- const key = guard(await p.password({ message: "DeepSeek API key (sk-...)", validate: (v) => !v ? "Required" : undefined }));
149
- envConfig.DEEPSEEK_API_KEY = key;
150
- envConfig.DEFAULT_MODEL = guard(await p.select({
151
- message: "DeepSeek model",
152
- options: [
153
- { value: "deepseek:deepseek-chat", label: "deepseek-chat", hint: "V3 \u2014 excellent coder (128K ctx, recommended)" },
154
- { value: "deepseek:deepseek-reasoner", label: "deepseek-reasoner", hint: "R1 \u2014 chain-of-thought reasoning" },
155
- ],
156
- }));
157
- } else if (provider === "mistral") {
158
- const key = guard(await p.password({ message: "Mistral API key", validate: (v) => !v ? "Required" : undefined }));
159
- envConfig.MISTRAL_API_KEY = key;
160
- envConfig.DEFAULT_MODEL = guard(await p.select({
161
- message: "Mistral model",
162
- options: [
163
- { value: "mistral:mistral-large-2512", label: "mistral-large-2512", hint: "Flagship \u2014 best quality (Dec 2025) [NEW]" },
164
- { value: "mistral:mistral-medium-3", label: "mistral-medium-3", hint: "Balanced capability & speed" },
165
- { value: "mistral:codestral-2508", label: "codestral-2508", hint: "Code specialist (Aug 2025)" },
166
- { value: "mistral:mistral-small-3.2-24b", label: "mistral-small-3.2-24b", hint: "Lightweight, runs locally (24B)" },
167
- ],
168
- }));
169
- } else if (provider === "ollama") {
90
+ // Load model registry dynamically
91
+ const { models: modelRegistry } = await import("../config/models.js");
92
+
93
+ // Provider config: API key prompt + env var name
94
+ const providerKeys = {
95
+ openai: { env: "OPENAI_API_KEY", prompt: "OpenAI API key (sk-...)" },
96
+ anthropic: { env: "ANTHROPIC_API_KEY", prompt: "Anthropic API key (sk-ant-...)" },
97
+ google: { env: "GOOGLE_AI_API_KEY", prompt: "Google AI API key" },
98
+ xai: { env: "XAI_API_KEY", prompt: "xAI API key" },
99
+ deepseek: { env: "DEEPSEEK_API_KEY", prompt: "DeepSeek API key (sk-...)" },
100
+ mistral: { env: "MISTRAL_API_KEY", prompt: "Mistral API key" },
101
+ };
102
+
103
+ if (provider === "ollama") {
104
+ // Ollama: list known local models from registry + free text input
105
+ const ollamaModels = Object.entries(modelRegistry)
106
+ .filter(([, m]) => m.provider === "ollama")
107
+ .map(([, m]) => m.model);
108
+ const ollamaHint = ollamaModels.length ? ollamaModels.join(", ") : "llama3.1, qwen2.5-coder";
170
109
  p.note(
171
110
  [
172
111
  "Make sure Ollama is running: ollama serve",
173
- "Pull a model first: ollama pull llama4-maverick",
174
- "Recommended models:",
175
- " llama4-maverick \u2014 Llama 4, 17B MoE, multimodal, 1M ctx",
176
- " llama4-scout \u2014 Llama 4, 17B MoE, 10M ctx",
177
- " llama3.3 \u2014 best 70B open model",
178
- " qwen2.5 \u2014 strong coder",
112
+ "Pull a model first: ollama pull <model>",
113
+ `Known models: ${ollamaHint}`,
114
+ "You can use any model available in your Ollama installation.",
179
115
  ].join("\n"),
180
116
  "Ollama (local models)",
181
117
  );
182
118
  const model = guard(await p.text({
183
119
  message: "Ollama model name",
184
- initialValue: "llama4-maverick",
185
- placeholder: "e.g. llama4-maverick, llama3.3, qwen2.5",
120
+ initialValue: ollamaModels[0] || "llama3.1",
121
+ placeholder: `e.g. ${ollamaHint}`,
186
122
  }));
187
123
  envConfig.DEFAULT_MODEL = `ollama:${model}`;
124
+ } else {
125
+ // Cloud provider: ask for API key, then show models from registry
126
+ const keyInfo = providerKeys[provider];
127
+ if (keyInfo) {
128
+ const key = guard(await p.password({ message: keyInfo.prompt, validate: (v) => !v ? "Required" : undefined }));
129
+ envConfig[keyInfo.env] = key;
130
+ }
131
+
132
+ // Build model options from registry for this provider
133
+ const providerModels = Object.entries(modelRegistry)
134
+ .filter(([, m]) => m.provider === provider)
135
+ .map(([id, m]) => {
136
+ const ctx = m.contextWindow >= 1_000_000
137
+ ? `${(m.contextWindow / 1_000_000).toFixed(0)}M ctx`
138
+ : `${(m.contextWindow / 1_000).toFixed(0)}K ctx`;
139
+ const caps = (m.capabilities || []).filter(c => c !== "text" && c !== "tools").join(", ");
140
+ const price = m.costPer1kInput > 0 ? `$${m.costPer1kInput}/1k in` : "free";
141
+ const parts = [ctx, m.tier, caps, price].filter(Boolean);
142
+ return { value: id, label: m.model, hint: parts.join(" \u00b7 ") };
143
+ });
144
+
145
+ if (providerModels.length > 0) {
146
+ envConfig.DEFAULT_MODEL = guard(await p.select({
147
+ message: `${provider.charAt(0).toUpperCase() + provider.slice(1)} model`,
148
+ options: providerModels,
149
+ }));
150
+ } else {
151
+ // Provider not in registry — free text input
152
+ const model = guard(await p.text({
153
+ message: `${provider} model name (e.g. ${provider}:model-name)`,
154
+ validate: (v) => !v ? "Required" : undefined,
155
+ }));
156
+ envConfig.DEFAULT_MODEL = model.includes(":") ? model : `${provider}:${model}`;
157
+ }
188
158
  }
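To show what the option builder above produces, a worked example using the deepseek-chat entry added to the registry earlier in this diff:

```js
// Input registry entry (from the models config added in this diff):
//   "deepseek:deepseek-chat": { provider: "deepseek", model: "deepseek-chat", contextWindow: 128_000,
//                               costPer1kInput: 0.00027, capabilities: ["text","tools","structured-output"], tier: "cheap" }
// Resulting select option from the mapping above:
//   { value: "deepseek:deepseek-chat", label: "deepseek-chat", hint: "128K ctx · cheap · structured-output · $0.00027/1k in" }
```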
189
159
 
190
160
  p.log.success(`Provider: ${t.bold(provider)} Model: ${t.bold(envConfig.DEFAULT_MODEL)}`);