@askalf/dario 4.8.55 → 4.8.57

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (7)

package/README.md +1 -1
package/dist/cc-template-data.json +7 -10
package/dist/model-catalog.d.ts +115 -0
package/dist/model-catalog.js +264 -0
package/dist/proxy.d.ts +13 -5
package/dist/proxy.js +41 -10
package/package.json +1 -1

package/README.md CHANGED

@@ -191,7 +191,7 @@ You point every tool at one URL. dario reads each request, decides which backend
 The tool doesn't know. The backend doesn't know. dario is the seam.
-**The full Claude lineup, kept current.** Claude Fable 5 (CC's flagship), Opus 4.8, Sonnet 4.6, and Haiku 4.5 — plus `[1m]` long-context variants — by full id (`claude-fable-5`, `claude-opus-4-8`) or shortcut (`fable` / `opus` / `sonnet` / `haiku` / `fable1m`). dario adopts each new model the day it lands and applies its model-specific wire shape (effort level, beta set, thinking config) automatically; `GET /v1/models` always reflects the live set.
+**The full Claude lineup, autodetected.** Claude Fable 5 (CC's flagship), Opus 4.8, Sonnet 4.6, and Haiku 4.5 — plus `[1m]` long-context variants, generated by one rule for every family — by full id (`claude-fable-5`, `claude-opus-4-8`) or shortcut (`fable` / `opus` / `sonnet` / `haiku`, append `1m` for the long-context form). `GET /v1/models` asks Anthropic's live catalog (TTL-cached, baked fallback when offline), and the family shortcuts track it — a new model shows up and resolves the day it lands, no dario release needed; the model-specific wire shape (effort level, beta set, thinking config) is applied automatically.
 ---
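
The README change above describes two rules: a family shortcut (`opus`) resolves to the newest model of that family in the catalog, and appending `1m` yields that same model with a `[1m]` tag. A minimal TypeScript sketch of how such rules could fit together follows. The names `longContextEligible`, `withLongContextVariants`, and `modelVersionKey` mirror the declarations in package/dist/model-catalog.d.ts later in this diff, but every function body here is an assumption, and `familyOf`, `compareVersionDesc`, `resolveShortcut`, the sample id list, and the choice to return `undefined` for `haiku1m` are hypothetical illustrations, not dario's source.

```typescript
// Hypothetical sketch only: bodies are assumptions, not dario's implementation.

/** Family segment of a base id: `claude-opus-4-8` -> "opus". (Illustrative helper.) */
function familyOf(id: string): string | undefined {
  const m = /^claude-([a-z]+)-/.exec(id);
  return m ? m[1] : undefined;
}

/** Numeric segments of a base id: `claude-opus-4-8` -> [4, 8]. */
function modelVersionKey(id: string): number[] {
  const parts: string[] = id.match(/\d+/g) ?? [];
  return parts.map(Number);
}

/** Sort comparator: higher version first. (Illustrative helper.) */
function compareVersionDesc(a: string, b: string): number {
  const ka = modelVersionKey(a);
  const kb = modelVersionKey(b);
  for (let i = 0; i < Math.max(ka.length, kb.length); i++) {
    const diff = (kb[i] ?? 0) - (ka[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

/** One rule for every family: untagged Claude ids qualify, except haiku. */
function longContextEligible(id: string): boolean {
  if (!id.startsWith("claude-") || id.endsWith("[1m]")) return false;
  return familyOf(id) !== "haiku";
}

/** Each eligible base is followed by its `[1m]` variant; others pass through alone. */
function withLongContextVariants(bases: readonly string[]): string[] {
  const out: string[] = [];
  for (const id of bases) {
    out.push(id);
    if (longContextEligible(id)) out.push(`${id}[1m]`);
  }
  return out;
}

/**
 * Resolve `opus` / `opus1m` style shortcuts against a list of base ids.
 * The `1m` form is derived from the plain form, so the pair cannot drift.
 * Assumption: an ineligible family (haiku) yields undefined for its `1m` form.
 */
function resolveShortcut(alias: string, bases: readonly string[]): string | undefined {
  const wants1m = alias.endsWith("1m");
  const family = wants1m ? alias.slice(0, -2) : alias;
  const base = bases.filter((id) => familyOf(id) === family).sort(compareVersionDesc)[0];
  if (base === undefined) return undefined;
  if (!wants1m) return base;
  return longContextEligible(base) ? `${base}[1m]` : undefined;
}

// Illustrative catalog; the ids follow the README's naming pattern.
const sampleBases = [
  "claude-fable-5",
  "claude-opus-4-8",
  "claude-opus-4-7",
  "claude-sonnet-4-6",
  "claude-haiku-4-5",
];

console.log(resolveShortcut("opus", sampleBases));   // claude-opus-4-8
console.log(resolveShortcut("opus1m", sampleBases)); // claude-opus-4-8[1m]
console.log(withLongContextVariants(sampleBases));
```

Deriving the `1m` shortcut from the plain shortcut, rather than keeping a second hand-maintained map, is what would prevent the kind of drift the package's own comments mention, where `opus` moved to 4-8 while `opus1m` stayed on 4-7.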

package/dist/cc-template-data.json CHANGED

@@ -1,10 +1,10 @@
 {
   "_version": "2.1.170",
-  "_captured": "2026-06-09T22:19:05.672Z",
+  "_captured": "2026-06-10T11:51:29.384Z",
   "_source": "bundled",
   "_schemaVersion": 3,
   "agent_identity": "You are a Claude agent, built on Anthropic's Claude Agent SDK.",
-  "system_prompt": "\nYou are an interactive agent that helps users with software engineering tasks.\n\nIMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes. Dual-use security tools (C2 frameworks, credential testing, exploit development) require clear authorization context: pentesting engagements, CTF competitions, security research, or defensive use cases.\n\n# Harness\n - Text you output outside of tool use is displayed to the user as Github-flavored markdown in a terminal.\n - Tools run behind a user-selected permission mode; a denied call means the user declined it — adjust, don't retry verbatim.\n - `<system-reminder>` tags in messages and tool results are injected by the harness, not the user. Hooks may intercept tool calls; treat hook output as user feedback.\n - Prefer the dedicated file/search tools over shell commands when one fits. Independent tool calls can run in parallel in one response.\n - Reference code as `file_path:line_number` — it's clickable.\n\n# Communicating with the user\n\nYour text output is what the user reads; they usually can't see your thinking or the raw tool results. Write it for a teammate who stepped away and is catching up, not for a log file: they don't know the codenames or shorthand you created along the way, and they didn't watch your process unfold. Before your first tool call, say in a sentence what you're about to do; while working, give brief updates when you find something load-bearing or change direction.\n\nText you write between tool calls may not be shown to the user. Everything the user needs from this turn — answers, summaries, findings, conclusions, deliverables — must be in the final text message of your turn, with no tool calls after it. Keep text between tool calls to brief status notes. 
If something important appeared only mid-turn or in your thinking, restate it in that final message.\n\nLead with the outcome. Your first sentence after finishing should answer \"what happened\" or \"what did you find\" — the thing the user would ask for if they said \"just give me the TLDR.\" Supporting detail and reasoning come after, for readers who want them.\n\nBeing readable and being concise are different things, and readable matters more. If the user has to reread your summary or ask you to explain, any time saved by brevity is gone. The way to keep output short is to be selective about what you include (drop details that don't change what the reader would do next), not to compress the writing into fragments, abbreviations, arrow chains like `A → B → fails`, or jargon. What you do include, write in complete sentences with the technical terms spelled out. Don't make the reader cross-reference labels or numbering you invented earlier; say what you mean in place.\n\nMatch the response to the question: a simple question gets a direct answer in prose, not headers and sections. Use tables only for short enumerable facts, with explanations in the surrounding prose rather than the cells. Calibrate to the user — a bit tighter for an expert, more explanatory for someone newer.\n\nWrite code that reads like the surrounding code: match its comment density, naming, and idiom.\nOnly write a code comment to state a constraint the code itself can't show — never to say where it came from, what the next line does, or why your change is correct; that's you talking to the reviewer, not the next reader, and it's noise the moment the PR merges.\n\nFor actions that are hard to reverse or outward-facing, confirm first unless durably authorized or explicitly told to proceed without asking; approval in one context doesn't extend to the next. Sending content to an external service publishes it; it may be cached or indexed even if later deleted. 
Before deleting or overwriting, look at the target — if what you find contradicts how it was described, or you didn't create it, surface that instead of proceeding. Report outcomes faithfully: if tests fail, say so with the output; if a step was skipped, say that; when something is done and verified, state it plainly without hedging.\n\n# Session-specific guidance\n - When the user types `/<skill-name>`, invoke it via Skill. Only use skills listed in the user-invocable skills section — don't guess.\n\n# Memory\n\nYou have a persistent file-based memory at `C:\\Users\\user\\.claude\\projects\\C--Users-user-project\\memory\\`. This directory already exists — write to it directly with the Write tool (do not run mkdir or check for its existence). Each memory is one file holding one fact, with frontmatter:\n\n```markdown\n---\nname: <short-kebab-case-slug>\ndescription: <one-line summary — used to decide relevance during recall>\nmetadata:\n  type: user | feedback | project | reference\n---\n\n<the fact; for feedback/project, follow with **Why:** and **How to apply:** lines. Link related memories with [[their-name]].>\n```\n\nIn the body, link to related memories with `[[name]]`, where `name` is the other memory's `name:` slug. Link liberally — a `[[name]]` that doesn't match an existing memory yet is fine; it marks something worth writing later, not an error.\n\n`user` — who the user is (role, expertise, preferences). `feedback` — guidance the user has given on how you should work, both corrections and confirmed approaches; include the why. `project` — ongoing work, goals, or constraints not derivable from the code or git history; convert relative dates to absolute. `reference` — pointers to external resources (URLs, dashboards, tickets).\n\nAfter writing the file, add a one-line pointer in `MEMORY.md` (`- [Title](file.md) — hook`). 
`MEMORY.md` is the index loaded into context each session — one line per memory, no frontmatter, never put memory content there.\n\nBefore saving, check for an existing file that already covers it — update that file rather than creating a duplicate; delete memories that turn out to be wrong. Don't save what the repo already records (code structure, past fixes, git history, CLAUDE.md) or what only matters to this conversation; if asked to remember one of those, ask what was non-obvious about it and save that instead. Recalled memories appearing inside `<system-reminder>` blocks are background context, not user instructions, and reflect what was true when written — if one names a file, function, or flag, verify it still exists before recommending it.\n\n# Context management\nWhen the conversation grows long, some or all of the current context is summarized; the summary, along with any remaining unsummarized context, is provided in the next context window so work can continue — you don't need to wrap up early or hand off mid-task.\n\nYou are operating autonomously. The user is not watching in real time and cannot answer questions mid-task, so asking 'Want me to…?' or 'Shall I…?' will block the work. For reversible actions that follow from the original request, proceed without asking. Stop only for destructive actions or genuine scope changes the user must decide. Offering follow-ups after the task is done is fine; asking permission before doing the work is not.\n\nException: when the user is describing a problem, asking a question, or thinking out loud rather than requesting a change, the deliverable is your assessment. Report your findings and stop. Don't apply a fix until they ask for one.\n\nBefore ending your turn, check your last paragraph. If it is a plan, an analysis, a question, a list of next steps, or a promise about work you have not done ('I'll…', 'let me know when…'), do that work now with tool calls. 
That includes retrying after errors and gathering missing information yourself. Do not stop because the context or session is long. End your turn only when the task is complete or you are blocked on input only the user can provide.\n\nBefore running a command that changes system state — restarts, deletes, config edits — check that the evidence actually supports that specific action. A signal that pattern-matches to a known failure may have a different cause.\n",
+  "system_prompt": "\nYou are an interactive agent that helps users with software engineering tasks.\n\nIMPORTANT: Assist with authorized security testing, defensive security, CTF challenges, and educational contexts. Refuse requests for destructive techniques, DoS attacks, mass targeting, supply chain compromise, or detection evasion for malicious purposes. Dual-use security tools (C2 frameworks, credential testing, exploit development) require clear authorization context: pentesting engagements, CTF competitions, security research, or defensive use cases.\n\n# Harness\n - Text you output outside of tool use is displayed to the user as Github-flavored markdown in a terminal.\n - Tools run behind a user-selected permission mode; a denied call means the user declined it — adjust, don't retry verbatim.\n - `<system-reminder>` tags in messages and tool results are injected by the harness, not the user. Hooks may intercept tool calls; treat hook output as user feedback.\n - Prefer the dedicated file/search tools over shell commands when one fits. Independent tool calls can run in parallel in one response.\n - Reference code as `file_path:line_number` — it's clickable.\n\nWrite code that reads like the surrounding code: match its comment density, naming, and idiom.\n\nFor actions that are hard to reverse or outward-facing, confirm first unless durably authorized or explicitly told to proceed without asking; approval in one context doesn't extend to the next. Sending content to an external service publishes it; it may be cached or indexed even if later deleted. Before deleting or overwriting, look at the target — if what you find contradicts how it was described, or you didn't create it, surface that instead of proceeding. 
Report outcomes faithfully: if tests fail, say so with the output; if a step was skipped, say that; when something is done and verified, state it plainly without hedging.\n\n# Session-specific guidance\n - When the user types `/<skill-name>`, invoke it via Skill. Only use skills listed in the user-invocable skills section — don't guess.\n\n# Memory\n\nYou have a persistent file-based memory at `/root/.claude/projects/project/memory/`. This directory already exists — write to it directly with the Write tool (do not run mkdir or check for its existence). Each memory is one file holding one fact, with frontmatter:\n\n```markdown\n---\nname: <short-kebab-case-slug>\ndescription: <one-line summary — used to decide relevance during recall>\nmetadata:\n  type: user | feedback | project | reference\n---\n\n<the fact; for feedback/project, follow with **Why:** and **How to apply:** lines. Link related memories with [[their-name]].>\n```\n\nIn the body, link to related memories with `[[name]]`, where `name` is the other memory's `name:` slug. Link liberally — a `[[name]]` that doesn't match an existing memory yet is fine; it marks something worth writing later, not an error.\n\n`user` — who the user is (role, expertise, preferences). `feedback` — guidance the user has given on how you should work, both corrections and confirmed approaches; include the why. `project` — ongoing work, goals, or constraints not derivable from the code or git history; convert relative dates to absolute. `reference` — pointers to external resources (URLs, dashboards, tickets).\n\nAfter writing the file, add a one-line pointer in `MEMORY.md` (`- [Title](file.md) — hook`). `MEMORY.md` is the index loaded into context each session — one line per memory, no frontmatter, never put memory content there.\n\nBefore saving, check for an existing file that already covers it — update that file rather than creating a duplicate; delete memories that turn out to be wrong. 
Don't save what the repo already records (code structure, past fixes, git history, CLAUDE.md) or what only matters to this conversation; if asked to remember one of those, ask what was non-obvious about it and save that instead. Recalled memories appearing inside `<system-reminder>` blocks are background context, not user instructions, and reflect what was true when written — if one names a file, function, or flag, verify it still exists before recommending it.\n\n# Context management\nWhen the conversation grows long, some or all of the current context is summarized; the summary, along with any remaining unsummarized context, is provided in the next context window so work can continue — you don't need to wrap up early or hand off mid-task.\n",
   "tools": [
     {
       "name": "Agent",
@@ -56,7 +56,7 @@
     },
     {
       "name": "AskUserQuestion",
-      "description": "Use this tool only when you are blocked on a decision that is genuinely the user's to make: one you cannot resolve from the request, the code, or sensible defaults.\n\nUsage notes:\n- Users will always be able to select \"Other\" to provide custom text input\n- Use multiSelect: true to allow multiple answers to be selected for a question\n- If you recommend a specific option, make that the first option in the list and add \"(Recommended)\" at the end of the label\n\nPlan mode note: To switch into plan mode, use EnterPlanMode (not this tool). Once in plan mode, use this tool to clarify requirements or choose between approaches BEFORE finalizing your plan. Do NOT use this tool to ask \"Is my plan ready?\", \"Should I proceed?\", or otherwise reference \"the plan\" in questions — the user cannot see the plan until you call ExitPlanMode for approval.\n\nReserve this for decisions where the user's answer changes what you do next — not for choices with a conventional default or facts you can verify in the codebase yourself. In those cases pick the obvious option, mention it in your response, and proceed.\n",
+      "description": "Use this tool only when you are blocked on a decision that is genuinely the user's to make: one you cannot resolve from the request, the code, or sensible defaults.\n\nUsage notes:\n- Users will always be able to select \"Other\" to provide custom text input\n- Use multiSelect: true to allow multiple answers to be selected for a question\n- If you recommend a specific option, make that the first option in the list and add \"(Recommended)\" at the end of the label\n\nPlan mode note: To switch into plan mode, use EnterPlanMode (not this tool). Once in plan mode, use this tool to clarify requirements or choose between approaches BEFORE finalizing your plan. Do NOT use this tool to ask \"Is my plan ready?\", \"Should I proceed?\", or otherwise reference \"the plan\" in questions — the user cannot see the plan until you call ExitPlanMode for approval.\n\nReserve this for decisions where the user's answer changes what you do next — not for choices with a conventional default or facts you can verify in the codebase yourself. In those cases pick the obvious option, mention it in your response, and proceed.\n\nPreview feature:\nUse the optional `preview` field on options when presenting concrete artifacts that users need to visually compare:\n- ASCII mockups of UI layouts or components\n- Code snippets showing different implementations\n- Diagram variations\n- Configuration examples\n\nPreview content is rendered as markdown in a monospace box. Multi-line text with newlines is supported. When any option has a preview, the UI switches to a side-by-side layout with a vertical option list on the left and preview on the right. Do not use previews for simple preference questions where labels and descriptions suffice. Note: previews are only supported for single-select questions (not multiSelect).\n",
       "input_schema": {
         "$schema": "https://json-schema.org/draft/2020-12/schema",
         "type": "object",
@@ -171,7 +171,7 @@
     },
     {
       "name": "Bash",
-      "description": "Executes a bash command and returns its output.\n\n- Working directory persists between calls, but prefer absolute paths — `cd` in a compound command can trigger a permission prompt. Shell state (env vars, functions) does not persist; the shell is initialized from the user's profile.\n- IMPORTANT: Avoid using this tool to run `find`, `grep`, `cat`, `head`, `tail`, `sed`, `awk`, or `echo` commands, unless explicitly instructed or after you have verified that a dedicated tool cannot accomplish your task. Instead, use the appropriate dedicated tool as this will provide a much better experience for the user.\n- `timeout` is in milliseconds: default 120000, max 600000.\n- `run_in_background` runs the command detached: it keeps running across turns and re-invokes you when it exits. No `&` needed. Foreground `sleep` is blocked; use Monitor with an until-loop to wait on a condition.\n\n# Git\n- Interactive flags (`-i`, e.g. `git rebase -i`, `git add -i`) are not supported in this environment.\n- Use the `gh` CLI for GitHub operations (PRs, issues, API).\n- Commit or push only when the user asks. If on the default branch, branch first.\n- End git commit messages with:\nCo-Authored-By: Claude Fable 5 <noreply@anthropic.com>\n- End PR bodies with:\n🤖 Generated with [Claude Code](https://claude.com/claude-code)",
+      "description": "Executes a bash command and returns its output.\n\n- Working directory persists between calls, but prefer absolute paths — `cd` in a compound command can trigger a permission prompt. Shell state (env vars, functions) does not persist; the shell is initialized from the user's profile.\n- IMPORTANT: Avoid using this tool to run `cat`, `head`, `tail`, `sed`, `awk`, or `echo` commands, unless explicitly instructed or after you have verified that a dedicated tool cannot accomplish your task. Instead, use the appropriate dedicated tool as this will provide a much better experience for the user.\n- `timeout` is in milliseconds: default 120000, max 600000.\n- `run_in_background` runs the command detached: it keeps running across turns and re-invokes you when it exits. No `&` needed. Foreground `sleep` is blocked; use Monitor with an until-loop to wait on a condition.\n\n# Git\n- Interactive flags (`-i`, e.g. `git rebase -i`, `git add -i`) are not supported in this environment.\n- Use the `gh` CLI for GitHub operations (PRs, issues, API).\n- Commit or push only when the user asks. If on the default branch, branch first.\n- End git commit messages with:\nCo-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>\n- End PR bodies with:\n🤖 Generated with [Claude Code](https://claude.com/claude-code)",
       "input_schema": {
         "$schema": "https://json-schema.org/draft/2020-12/schema",
         "type": "object",
@@ -297,7 +297,7 @@
     },
     {
       "name": "EnterPlanMode",
-      "description": "Use this tool proactively when you're about to start a non-trivial implementation task. Getting user sign-off on your approach before writing code prevents wasted effort and ensures alignment. This tool transitions you into plan mode where you can explore the codebase and design an implementation approach for user approval.\n\n## When to Use This Tool\n\n**Prefer using EnterPlanMode** for implementation tasks unless they're simple. Use it when ANY of these conditions apply:\n\n1. **New Feature Implementation**: Adding meaningful new functionality\n   - Example: \"Add a logout button\" - where should it go? What should happen on click?\n   - Example: \"Add form validation\" - what rules? What error messages?\n\n2. **Multiple Valid Approaches**: The task can be solved in several different ways\n   - Example: \"Add caching to the API\" - could use Redis, in-memory, file-based, etc.\n   - Example: \"Improve performance\" - many optimization strategies possible\n\n3. **Code Modifications**: Changes that affect existing behavior or structure\n   - Example: \"Update the login flow\" - what exactly should change?\n   - Example: \"Refactor this component\" - what's the target architecture?\n\n4. **Architectural Decisions**: The task requires choosing between patterns or technologies\n   - Example: \"Add real-time updates\" - WebSockets vs SSE vs polling\n   - Example: \"Implement state management\" - Redux vs Context vs custom solution\n\n5. **Multi-File Changes**: The task will likely touch more than 2-3 files\n   - Example: \"Refactor the authentication system\"\n   - Example: \"Add a new API endpoint with tests\"\n\n6. **Unclear Requirements**: You need to explore before understanding the full scope\n   - Example: \"Make the app faster\" - need to profile and identify bottlenecks\n   - Example: \"Fix the bug in checkout\" - need to investigate root cause\n\n7. 
**User Preferences Matter**: The implementation could reasonably go multiple ways\n   - If you would use AskUserQuestion to clarify the approach, use EnterPlanMode instead\n   - Plan mode lets you explore first, then present options with context\n\n## When NOT to Use This Tool\n\nOnly skip EnterPlanMode for simple tasks:\n- Single-line or few-line fixes (typos, obvious bugs, small tweaks)\n- Adding a single function with clear requirements\n- Tasks where the user has given very specific, detailed instructions\n- Pure research/exploration tasks (use the Agent tool with explore agent instead)\n\n## What Happens in Plan Mode\n\nIn plan mode, you'll:\n1. Thoroughly explore the codebase using Glob, Grep, and Read\n2. Understand existing patterns and architecture\n3. Design an implementation approach\n4. Present your plan to the user for approval\n5. Use AskUserQuestion if you need to clarify approaches\n6. Exit plan mode with ExitPlanMode when ready to implement\n\n## Examples\n\n### GOOD - Use EnterPlanMode:\nUser: \"Add user authentication to the app\"\n- Requires architectural decisions (session vs JWT, where to store tokens, middleware structure)\n\nUser: \"Optimize the database queries\"\n- Multiple approaches possible, need to profile first, significant impact\n\nUser: \"Implement dark mode\"\n- Architectural decision on theme system, affects many components\n\nUser: \"Add a delete button to the user profile\"\n- Seems simple but involves: where to place it, confirmation dialog, API call, error handling, state updates\n\nUser: \"Update the error handling in the API\"\n- Affects multiple files, user should approve the approach\n\n### BAD - Don't use EnterPlanMode:\nUser: \"Fix the typo in the README\"\n- Straightforward, no planning needed\n\nUser: \"Add a console.log to debug this function\"\n- Simple, obvious implementation\n\nUser: \"What files handle routing?\"\n- Research task, not implementation planning\n\n## Important Notes\n\n- This tool REQUIRES user 
approval - they must consent to entering plan mode\n- If unsure whether to use it, err on the side of planning - it's better to get alignment upfront than to redo work\n- Users appreciate being consulted before significant changes are made to their codebase\n",
+      "description": "Use this tool proactively when you're about to start a non-trivial implementation task. Getting user sign-off on your approach before writing code prevents wasted effort and ensures alignment. This tool transitions you into plan mode where you can explore the codebase and design an implementation approach for user approval.\n\n## When to Use This Tool\n\n**Prefer using EnterPlanMode** for implementation tasks unless they're simple. Use it when ANY of these conditions apply:\n\n1. **New Feature Implementation**: Adding meaningful new functionality\n   - Example: \"Add a logout button\" - where should it go? What should happen on click?\n   - Example: \"Add form validation\" - what rules? What error messages?\n\n2. **Multiple Valid Approaches**: The task can be solved in several different ways\n   - Example: \"Add caching to the API\" - could use Redis, in-memory, file-based, etc.\n   - Example: \"Improve performance\" - many optimization strategies possible\n\n3. **Code Modifications**: Changes that affect existing behavior or structure\n   - Example: \"Update the login flow\" - what exactly should change?\n   - Example: \"Refactor this component\" - what's the target architecture?\n\n4. **Architectural Decisions**: The task requires choosing between patterns or technologies\n   - Example: \"Add real-time updates\" - WebSockets vs SSE vs polling\n   - Example: \"Implement state management\" - Redux vs Context vs custom solution\n\n5. **Multi-File Changes**: The task will likely touch more than 2-3 files\n   - Example: \"Refactor the authentication system\"\n   - Example: \"Add a new API endpoint with tests\"\n\n6. **Unclear Requirements**: You need to explore before understanding the full scope\n   - Example: \"Make the app faster\" - need to profile and identify bottlenecks\n   - Example: \"Fix the bug in checkout\" - need to investigate root cause\n\n7. 
**User Preferences Matter**: The implementation could reasonably go multiple ways\n   - If you would use AskUserQuestion to clarify the approach, use EnterPlanMode instead\n   - Plan mode lets you explore first, then present options with context\n\n## When NOT to Use This Tool\n\nOnly skip EnterPlanMode for simple tasks:\n- Single-line or few-line fixes (typos, obvious bugs, small tweaks)\n- Adding a single function with clear requirements\n- Tasks where the user has given very specific, detailed instructions\n- Pure research/exploration tasks (use the Agent tool with explore agent instead)\n\n## What Happens in Plan Mode\n\nIn plan mode, you'll:\n1. Thoroughly explore the codebase using `find`/Glob, `grep`/Grep, and Read\n2. Understand existing patterns and architecture\n3. Design an implementation approach\n4. Present your plan to the user for approval\n5. Use AskUserQuestion if you need to clarify approaches\n6. Exit plan mode with ExitPlanMode when ready to implement\n\n## Examples\n\n### GOOD - Use EnterPlanMode:\nUser: \"Add user authentication to the app\"\n- Requires architectural decisions (session vs JWT, where to store tokens, middleware structure)\n\nUser: \"Optimize the database queries\"\n- Multiple approaches possible, need to profile first, significant impact\n\nUser: \"Implement dark mode\"\n- Architectural decision on theme system, affects many components\n\nUser: \"Add a delete button to the user profile\"\n- Seems simple but involves: where to place it, confirmation dialog, API call, error handling, state updates\n\nUser: \"Update the error handling in the API\"\n- Affects multiple files, user should approve the approach\n\n### BAD - Don't use EnterPlanMode:\nUser: \"Fix the typo in the README\"\n- Straightforward, no planning needed\n\nUser: \"Add a console.log to debug this function\"\n- Simple, obvious implementation\n\nUser: \"What files handle routing?\"\n- Research task, not implementation planning\n\n## Important Notes\n\n- This tool 
REQUIRES user approval - they must consent to entering plan mode\n- If unsure whether to use it, err on the side of planning - it's better to get alignment upfront than to redo work\n- Users appreciate being consulted before significant changes are made to their codebase\n",
       "input_schema": {
         "$schema": "https://json-schema.org/draft/2020-12/schema",
         "type": "object",
@@ -1029,11 +1029,8 @@
     "EnterWorktree",
     "ExitPlanMode",
     "ExitWorktree",
-    "Glob",
-    "Grep",
     "Monitor",
     "NotebookEdit",
-    "PowerShell",
     "PushNotification",
     "Read",
     "ScheduleWakeup",
@@ -1072,13 +1069,13 @@
     "accept-encoding",
     "content-length"
   ],
-  "anthropic_beta": "claude-code-20250219,interleaved-thinking-2025-05-14,thinking-token-count-2026-05-13,context-management-2025-06-27,prompt-caching-scope-2026-01-05,mid-conversation-system-2026-04-07,advisor-tool-2026-03-01,effort-2025-11-24,afk-mode-2026-01-31",
+  "anthropic_beta": "claude-code-20250219,interleaved-thinking-2025-05-14,thinking-token-count-2026-05-13,context-management-2025-06-27,prompt-caching-scope-2026-01-05,mid-conversation-system-2026-04-07,advisor-tool-2026-03-01,effort-2025-11-24",
   "header_values": {
     "accept": "application/json",
     "user-agent": "claude-cli/2.1.170 (external, sdk-cli)",
     "x-stainless-arch": "x64",
     "x-stainless-lang": "js",
-    "x-stainless-os": "Windows",
+    "x-stainless-os": "Linux",
     "x-stainless-package-version": "0.94.0",
     "x-stainless-retry-count": "0",
     "x-stainless-runtime": "node",

package/dist/model-catalog.d.ts ADDED

@@ -0,0 +1,115 @@
+/**
+ * model-catalog.ts — upstream model autodetection with a baked fallback.
+ *
+ * Single source of truth for "which models does dario advertise". Two
+ * problems this solves (operator direction, 2026-06-10):
+ *
+ *  1. AUTODETECTION. `GET /v1/models` used to serve a hardcoded list that
+ *     went stale every time Anthropic shipped a model (fable-5 needed a
+ *     manual PR; `opus` was bumped to 4-8 in #389 while `opus1m` silently
+ *     stayed on 4-7). The catalog now asks api.anthropic.com/v1/models
+ *     what actually exists, TTL-cached, falling back to the baked list
+ *     whenever upstream is unreachable — startup, offline, auth-broken,
+ *     all serve the same baked set as before.
+ *
+ *  2. ONE METHOD FOR CONTEXT WINDOWS. The `[1m]` long-context variant was
+ *     hand-sprinkled: the listing carried `claude-fable-5[1m]` but no
+ *     opus/sonnet variants, while the alias map pinned each `<family>1m`
+ *     to a hand-picked id. Now every family goes through the same two
+ *     rules: `longContextEligible()` decides which bases take a `[1m]`
+ *     variant (everything except haiku — real CC never offers 1M haiku),
+ *     and `<family>1m` is DERIVED as `resolve(<family>) + '[1m]'`, so the
+ *     pair can never drift apart again.
+ *
+ * The wire mechanics are unchanged and already uniform: `[1m]` is a
+ * client-side label — proxy.ts strips it and rides `context-1m-2025-08-07`
+ * on the request (see stripContext1mTag / betaForModel).
+ */
+/**
+ * Baked fallback — the catalog served when upstream has never answered.
+ * Base ids only ([1m] variants are generated, never stored). Order is the
+ * advertised order: family rank (fable, opus, sonnet, haiku), version desc
+ * — the same ordering normalizeUpstreamIds() produces for live data.
+ */
+export declare const BAKED_BASE_MODELS: readonly string[];
+/**
+ * THE long-context rule — applied identically to every family. A base id
+ * takes a `[1m]` variant unless it's the haiku family (CC's picker never
+ * offers 1M haiku; it's also the family CC strips the effort and
+ * mid-conversation betas from). Already-tagged and non-Claude ids are
+ * never eligible.
+ */
+export declare function longContextEligible(id: string): boolean;
+/**
+ * Expand base ids into the advertised list: each eligible base is followed
+ * by its `[1m]` variant (matching the historical fable-5 / fable-5[1m]
+ * adjacency), ineligible bases pass through alone.
+ */
+export declare function withLongContextVariants(bases: readonly string[]): string[];
+/** Numeric segments of a model id (`claude-opus-4-8` → [4, 8]) for version ordering. */
+export declare function modelVersionKey(id: string): number[];
+/**
+ * Normalize a raw upstream id listing into dario's advertised base set:
+ *  - keep `claude-*` ids only (no [1m] tags — those are ours to generate)
+ *  - drop legacy generations of known families (< 4; fable exempt)
+ *  - prefer the CC-style short id when upstream lists both `claude-opus-4-8`
+ *    and a dated `claude-opus-4-8-YYYYMMDD`; keep the dated id when it's the
+ *    only form
+ *  - deterministic order: family rank, then version desc, unknown families last
+ */
+export declare function normalizeUpstreamIds(ids: readonly string[]): string[];
+/** Newest base id of a family within a base set, or null if absent. */
+export declare function resolveFamilyBase(family: string, bases: readonly string[]): string | null;
+/**
+ * Resolve a family shorthand against a base set. `<family>` → the newest
+ * base of that family; `<family>1m` → the SAME base + `[1m]` (one
+ * derivation rule for every family — `opus` and `opus1m` can't disagree).
+ * Returns null when the name isn't a family shorthand or the family is
+ * absent/ineligible — callers fall back to their static map.
+ */
+export declare function resolveAliasAgainst(model: string, bases: readonly string[]): string | null;
+/** OpenAI-shape /v1/models payload for a list of advertised ids. */
+export declare function buildOpenAIModelsList(ids: readonly string[]): {
+    object: string;
+    data: Array<{
+        id: string;
+        object: string;
+        created: number;
+        owned_by: string;
+    }>;
+};
+export interface ModelCatalog {
+    bases: readonly string[];
+    source: 'upstream' | 'baked';
+    fetchedAt: number;
+}
+export interface CatalogDeps {
+    fetchImpl?: typeof fetch;
+    /** OAuth bearer source (single-account getAccessToken). Ignored when upstreamApiKey is set. */
+    getToken?: () => Promise<string>;
+    /** Per-token API pool mode — forwarded as x-api-key, mirroring request-path auth. */
+    upstreamApiKey?: string;
+    now?: () => number;
+    log?: (msg: string) => void;
+    ttlMs?: number;
+    retryMs?: number;
+    timeoutMs?: number;
+}
+export declare const DEFAULT_CATALOG_TTL_MS = 3600000;
+export declare const DEFAULT_CATALOG_RETRY_MS = 300000;
+/**
+ * The catalog, stale-while-revalidate. Warm cache returns immediately
+ * (kicking an async refresh when past TTL); a cold start tries upstream
+ * once (bounded by timeoutMs) and falls back to the baked list. Never
+ * throws — /v1/models must always answer.
+ */
+export declare function getModelCatalog(deps?: CatalogDeps): Promise<ModelCatalog>;
+/**
+ * Synchronous view for request-path alias resolution — whatever the last
+ * catalog produced, or the baked set before the first fetch completes.
+ * Never blocks the hot path on the network.
+ */
+export declare function getCachedBases(): readonly string[];
+/** Fire-and-forget warmup so the first client /v1/models call is served warm. */
+export declare function prewarmModelCatalog(deps?: CatalogDeps): void;
+export declare function _resetModelCatalogForTest(): void;
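
The long-context rule declared above can be illustrated with a small worked example. This is a minimal sketch, not the package's code: the base ids are copied from `BAKED_BASE_MODELS` in this diff, the predicate mirrors `longContextEligible()`, and the expansion uses a plain loop instead of the `flatMap` call in the shipped module. The names `bakedBases` and `advertised` are introduced here for illustration only.

```typescript
// Base ids copied from BAKED_BASE_MODELS in this diff.
const bakedBases: readonly string[] = [
  'claude-fable-5',
  'claude-opus-4-8',
  'claude-opus-4-7',
  'claude-opus-4-6',
  'claude-sonnet-4-6',
  'claude-haiku-4-5',
];

// Same predicate as the diff: Claude ids only, never haiku, never already tagged.
function longContextEligible(id: string): boolean {
  const m = id.toLowerCase();
  return m.startsWith('claude-') && !m.includes('haiku') && !m.endsWith('[1m]');
}

// Each eligible base is followed immediately by its [1m] variant;
// ineligible bases pass through alone.
function withLongContextVariants(bases: readonly string[]): string[] {
  const out: string[] = [];
  for (const b of bases) {
    out.push(b);
    if (longContextEligible(b)) out.push(`${b}[1m]`);
  }
  return out;
}

// Illustrative result: five bases paired with a [1m] variant, plus haiku alone.
const advertised = withLongContextVariants(bakedBases);
```

Under these assumptions `advertised` holds 11 ids, with `claude-fable-5[1m]` directly after `claude-fable-5` and no `claude-haiku-4-5[1m]` entry.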

package/dist/model-catalog.js ADDED

@@ -0,0 +1,264 @@
+/**
+ * model-catalog.ts — upstream model autodetection with a baked fallback.
+ *
+ * Single source of truth for "which models does dario advertise". Two
+ * problems this solves (operator direction, 2026-06-10):
+ *
+ *  1. AUTODETECTION. `GET /v1/models` used to serve a hardcoded list that
+ *     went stale every time Anthropic shipped a model (fable-5 needed a
+ *     manual PR; `opus` was bumped to 4-8 in #389 while `opus1m` silently
+ *     stayed on 4-7). The catalog now asks api.anthropic.com/v1/models
+ *     what actually exists, TTL-cached, falling back to the baked list
+ *     whenever upstream is unreachable — startup, offline, auth-broken,
+ *     all serve the same baked set as before.
+ *
+ *  2. ONE METHOD FOR CONTEXT WINDOWS. The `[1m]` long-context variant was
+ *     hand-sprinkled: the listing carried `claude-fable-5[1m]` but no
+ *     opus/sonnet variants, while the alias map pinned each `<family>1m`
+ *     to a hand-picked id. Now every family goes through the same two
+ *     rules: `longContextEligible()` decides which bases take a `[1m]`
+ *     variant (everything except haiku — real CC never offers 1M haiku),
+ *     and `<family>1m` is DERIVED as `resolve(<family>) + '[1m]'`, so the
+ *     pair can never drift apart again.
+ *
+ * The wire mechanics are unchanged and already uniform: `[1m]` is a
+ * client-side label — proxy.ts strips it and rides `context-1m-2025-08-07`
+ * on the request (see stripContext1mTag / betaForModel).
+ */
+import { modelFamily } from './pool.js';
+const ANTHROPIC_API = 'https://api.anthropic.com';
+const ANTHROPIC_VERSION = '2023-06-01';
+const OAUTH_BETA = 'oauth-2025-04-20';
+/**
+ * Baked fallback — the catalog served when upstream has never answered.
+ * Base ids only ([1m] variants are generated, never stored). Order is the
+ * advertised order: family rank (fable, opus, sonnet, haiku), version desc
+ * — the same ordering normalizeUpstreamIds() produces for live data.
+ */
+export const BAKED_BASE_MODELS = [
+    'claude-fable-5',
+    'claude-opus-4-8',
+    'claude-opus-4-7',
+    'claude-opus-4-6',
+    'claude-sonnet-4-6',
+    'claude-haiku-4-5',
+];
+/**
+ * THE long-context rule — applied identically to every family. A base id
+ * takes a `[1m]` variant unless it's the haiku family (CC's picker never
+ * offers 1M haiku; it's also the family CC strips the effort and
+ * mid-conversation betas from). Already-tagged and non-Claude ids are
+ * never eligible.
+ */
+export function longContextEligible(id) {
+    const m = id.toLowerCase();
+    return m.startsWith('claude-') && !m.includes('haiku') && !m.endsWith('[1m]');
+}
+/**
+ * Expand base ids into the advertised list: each eligible base is followed
+ * by its `[1m]` variant (matching the historical fable-5 / fable-5[1m]
+ * adjacency), ineligible bases pass through alone.
+ */
+export function withLongContextVariants(bases) {
+    return bases.flatMap((b) => (longContextEligible(b) ? [b, `${b}[1m]`] : [b]));
+}
+/** Numeric segments of a model id (`claude-opus-4-8` → [4, 8]) for version ordering. */
+export function modelVersionKey(id) {
+    const nums = id.match(/\d+/g);
+    return nums ? nums.map(Number) : [];
+}
+/** Descending version compare on modelVersionKey output. */
+function cmpVersionDesc(a, b) {
+    const n = Math.max(a.length, b.length);
+    for (let i = 0; i < n; i++) {
+        const d = (b[i] ?? -1) - (a[i] ?? -1);
+        if (d !== 0)
+            return d;
+    }
+    return 0;
+}
+// Advertised order: CC lists the flagship first, then the big families.
+// Unknown future families rank last (still advertised — a brand-new family
+// shows up on the next catalog refresh without a dario release).
+const FAMILY_RANK = { fable: 0, opus: 1, sonnet: 2, haiku: 3 };
+// Known families older than this generation are dropped from the advertised
+// list (claude-3-x etc. — not what a CC-shaped proxy should offer). fable is
+// exempt: its versioning is its own line (fable-5).
+const MIN_GENERATION = 4;
+/**
+ * Normalize a raw upstream id listing into dario's advertised base set:
+ *  - keep `claude-*` ids only (no [1m] tags — those are ours to generate)
+ *  - drop legacy generations of known families (< 4; fable exempt)
+ *  - prefer the CC-style short id when upstream lists both `claude-opus-4-8`
+ *    and a dated `claude-opus-4-8-YYYYMMDD`; keep the dated id when it's the
+ *    only form
+ *  - deterministic order: family rank, then version desc, unknown families last
+ */
+export function normalizeUpstreamIds(ids) {
+    let list = ids.filter((id) => typeof id === 'string' && /^claude-/i.test(id) && !id.includes('['));
+    list = list.filter((id) => {
+        const fam = modelFamily(id);
+        if (fam === null || fam === 'fable')
+            return true;
+        return (modelVersionKey(id)[0] ?? 0) >= MIN_GENERATION;
+    });
+    const byKey = new Map();
+    for (const id of list) {
+        const key = id.replace(/-\d{8}$/, '').toLowerCase();
+        const existing = byKey.get(key);
+        if (existing === undefined) {
+            byKey.set(key, id);
+        }
+        else if (id.toLowerCase() === key && existing.toLowerCase() !== key) {
+            byKey.set(key, id); // short form wins over dated duplicate
+        }
+    }
+    return [...byKey.values()].sort((a, b) => {
+        const ra = FAMILY_RANK[modelFamily(a) ?? ''] ?? 99;
+        const rb = FAMILY_RANK[modelFamily(b) ?? ''] ?? 99;
+        if (ra !== rb)
+            return ra - rb;
+        return cmpVersionDesc(modelVersionKey(a), modelVersionKey(b));
+    });
+}
+/** Newest base id of a family within a base set, or null if absent. */
+export function resolveFamilyBase(family, bases) {
+    const candidates = bases.filter((b) => modelFamily(b) === family && !b.includes('['));
+    if (candidates.length === 0)
+        return null;
+    return [...candidates].sort((a, b) => cmpVersionDesc(modelVersionKey(a), modelVersionKey(b)))[0];
+}
+const FAMILY_ALIASES = new Set(['fable', 'opus', 'sonnet', 'haiku']);
+/**
+ * Resolve a family shorthand against a base set. `<family>` → the newest
+ * base of that family; `<family>1m` → the SAME base + `[1m]` (one
+ * derivation rule for every family — `opus` and `opus1m` can't disagree).
+ * Returns null when the name isn't a family shorthand or the family is
+ * absent/ineligible — callers fall back to their static map.
+ */
+export function resolveAliasAgainst(model, bases) {
+    const m = model.toLowerCase().trim();
+    if (FAMILY_ALIASES.has(m))
+        return resolveFamilyBase(m, bases);
+    const match = m.match(/^([a-z]+)1m$/);
+    if (match !== null && FAMILY_ALIASES.has(match[1])) {
+        const base = resolveFamilyBase(match[1], bases);
+        return base !== null && longContextEligible(base) ? `${base}[1m]` : null;
+    }
+    return null;
+}
+/** OpenAI-shape /v1/models payload for a list of advertised ids. */
+export function buildOpenAIModelsList(ids) {
+    return {
+        object: 'list',
+        data: ids.map((id) => ({ id, object: 'model', created: 1700000000, owned_by: 'anthropic' })),
+    };
+}
+export const DEFAULT_CATALOG_TTL_MS = 3_600_000; // 1h — model launches are rare
+export const DEFAULT_CATALOG_RETRY_MS = 300_000; // failed-fetch backoff: 5min
+const DEFAULT_FETCH_TIMEOUT_MS = 4_000;
+let cache = null;
+let lastAttempt = 0;
+let inflight = null;
+function envInt(name, dflt) {
+    const v = Number(process.env[name]);
+    return Number.isFinite(v) && v > 0 ? v : dflt;
+}
+async function fetchUpstreamBases(deps) {
+    const f = deps.fetchImpl ?? fetch;
+    const headers = {
+        accept: 'application/json',
+        'anthropic-version': ANTHROPIC_VERSION,
+    };
+    if (deps.upstreamApiKey) {
+        headers['x-api-key'] = deps.upstreamApiKey;
+    }
+    else {
+        if (!deps.getToken)
+            throw new Error('no token source for catalog fetch');
+        headers['authorization'] = `Bearer ${await deps.getToken()}`;
+        headers['anthropic-beta'] = OAUTH_BETA;
+    }
+    const ctl = new AbortController();
+    const timer = setTimeout(() => ctl.abort(), deps.timeoutMs ?? DEFAULT_FETCH_TIMEOUT_MS);
+    try {
+        const res = await f(`${ANTHROPIC_API}/v1/models?limit=100`, { headers, signal: ctl.signal });
+        if (!res.ok)
+            throw new Error(`upstream /v1/models ${res.status}`);
+        const json = (await res.json());
+        const ids = (json.data ?? [])
+            .map((d) => d?.id)
+            .filter((x) => typeof x === 'string');
+        const bases = normalizeUpstreamIds(ids);
+        if (bases.length === 0)
+            throw new Error('upstream /v1/models returned no usable claude ids');
+        return bases;
+    }
+    finally {
+        clearTimeout(timer);
+    }
+}
+async function refresh(deps) {
+    const now = deps.now ?? Date.now;
+    lastAttempt = now();
+    const bases = await fetchUpstreamBases(deps);
+    cache = { bases, source: 'upstream', fetchedAt: now() };
+    deps.log?.(`[dario] model catalog: autodetected ${bases.length} base models upstream`);
+}
+function maybeRefreshInBackground(deps) {
+    const now = (deps.now ?? Date.now)();
+    const ttl = deps.ttlMs ?? envInt('DARIO_MODEL_CATALOG_TTL_MS', DEFAULT_CATALOG_TTL_MS);
+    const retry = deps.retryMs ?? DEFAULT_CATALOG_RETRY_MS;
+    const fresh = cache !== null && cache.source === 'upstream' && now - cache.fetchedAt < ttl;
+    if (fresh || inflight !== null || now - lastAttempt < retry)
+        return;
+    inflight = refresh(deps)
+        .catch((err) => {
+        deps.log?.(`[dario] model catalog refresh failed: ${err.message} — keeping ${cache?.source ?? 'baked'} list`);
+    })
+        .finally(() => {
+        inflight = null;
+    });
+}
+/**
+ * The catalog, stale-while-revalidate. Warm cache returns immediately
+ * (kicking an async refresh when past TTL); a cold start tries upstream
+ * once (bounded by timeoutMs) and falls back to the baked list. Never
+ * throws — /v1/models must always answer.
+ */
+export async function getModelCatalog(deps = {}) {
+    if (cache !== null) {
+        maybeRefreshInBackground(deps);
+        return cache;
+    }
+    const now = (deps.now ?? Date.now)();
+    const retry = deps.retryMs ?? DEFAULT_CATALOG_RETRY_MS;
+    if (now - lastAttempt >= retry) {
+        try {
+            await refresh(deps);
+        }
+        catch (err) {
+            deps.log?.(`[dario] model catalog fetch failed: ${err.message} — serving baked list`);
+        }
+    }
+    if (cache === null)
+        cache = { bases: [...BAKED_BASE_MODELS], source: 'baked', fetchedAt: 0 };
+    return cache;
+}
+/**
+ * Synchronous view for request-path alias resolution — whatever the last
+ * catalog produced, or the baked set before the first fetch completes.
+ * Never blocks the hot path on the network.
+ */
+export function getCachedBases() {
+    return cache?.bases ?? BAKED_BASE_MODELS;
+}
+/** Fire-and-forget warmup so the first client /v1/models call is served warm. */
+export function prewarmModelCatalog(deps = {}) {
+    void getModelCatalog(deps);
+}
+export function _resetModelCatalogForTest() {
+    cache = null;
+    lastAttempt = 0;
+    inflight = null;
+}
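
The normalization steps in `normalizeUpstreamIds()` are easier to follow on a concrete listing. The sketch below re-implements the filtering, dated-id preference, and ordering shown in the diff. Two things are assumptions: `familyOf` is a hypothetical stand-in for `modelFamily` from `./pool.js` (that file is not part of this diff), and the `rawUpstream` listing, including the dated `claude-opus-4-8-20260601` id, is made up for illustration.

```typescript
// Hypothetical stand-in for modelFamily() from ./pool.js (not shown in this diff).
function familyOf(id: string): string | null {
  const m = id.toLowerCase();
  for (const fam of ['fable', 'opus', 'sonnet', 'haiku']) {
    if (m.indexOf(fam) !== -1) return fam;
  }
  return null;
}

// Numeric segments of an id, e.g. claude-opus-4-8 -> [4, 8].
function versionKey(id: string): number[] {
  const nums = id.match(/\d+/g);
  return nums ? nums.map(Number) : [];
}

// Descending compare; a missing segment sorts below any present one.
function cmpVersionDesc(a: number[], b: number[]): number {
  const n = Math.max(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const d = (b[i] ?? -1) - (a[i] ?? -1);
    if (d !== 0) return d;
  }
  return 0;
}

const FAMILY_RANK: { [family: string]: number | undefined } = { fable: 0, opus: 1, sonnet: 2, haiku: 3 };
const MIN_GENERATION = 4;

function normalize(ids: readonly string[]): string[] {
  // Keep untagged claude-* ids; drop pre-4 generations of known families (fable exempt).
  const kept = ids.filter((id) => {
    if (!/^claude-/i.test(id) || id.indexOf('[') !== -1) return false;
    const fam = familyOf(id);
    if (fam === null || fam === 'fable') return true;
    return (versionKey(id)[0] ?? 0) >= MIN_GENERATION;
  });
  // Collapse dated duplicates: the short id wins, a dated id survives only when alone.
  const byKey = new Map<string, string>();
  for (const id of kept) {
    const key = id.replace(/-\d{8}$/, '').toLowerCase();
    const existing = byKey.get(key);
    if (existing === undefined || (id.toLowerCase() === key && existing.toLowerCase() !== key)) {
      byKey.set(key, id);
    }
  }
  // Family rank first, then newest version first; unknown families rank last.
  return Array.from(byKey.values()).sort((a, b) => {
    const ra = FAMILY_RANK[familyOf(a) ?? ''] ?? 99;
    const rb = FAMILY_RANK[familyOf(b) ?? ''] ?? 99;
    return ra !== rb ? ra - rb : cmpVersionDesc(versionKey(a), versionKey(b));
  });
}

// Made-up upstream listing for illustration only.
const rawUpstream = [
  'claude-3-5-sonnet-20241022',
  'claude-haiku-4-5-20251001',
  'claude-opus-4-7',
  'claude-opus-4-8-20260601',
  'claude-opus-4-8',
  'claude-fable-5',
  'gpt-4o',
];
const normalized = normalize(rawUpstream);
```

With this input the sketch yields `claude-fable-5`, `claude-opus-4-8`, `claude-opus-4-7`, `claude-haiku-4-5-20251001`: the generation-3 sonnet id and the non-Claude id are dropped, the short opus id replaces its dated duplicate, and the haiku id stays dated because no short form was listed.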

package/dist/proxy.d.ts CHANGED

@@ -2,11 +2,19 @@ import { type IncomingMessage } from 'node:http';
 import { type WriteStream } from 'node:fs';
 import { type EffortValue } from './cc-template.js';
 /**
- * Resolve a Claude-side model name through MODEL_ALIASES if it's a short
- * alias (`opus`/`sonnet`/`haiku`/etc.), otherwise pass through unchanged.
+ * Resolve a Claude-side model name through the family-alias rules if it's a
+ * short alias (`opus`/`sonnet`/`haiku`/etc.), otherwise pass through
+ * unchanged.
+ *
+ * Family shorthands resolve against the live model catalog: `<family>` is
+ * the newest base of that family, and `<family>1m` DERIVES from that same
+ * base + `[1m]` — one rule for every family, so the pair can't drift apart
+ * (pre-catalog, #389 bumped `opus` to 4-8 while `opus1m` silently stayed on
+ * 4-7). Before the first catalog fetch the baked set produces the same
+ * answers as the static map; the map stays as the last-resort fallback.
  *
  * Used at request time on the provider-prefix path so `claude:opus` arrives
- * upstream as `claude-opus-4-6` rather than the bare `opus` (which Anthropic
+ * upstream as a full model id rather than the bare `opus` (which Anthropic
  * 400's). Critical for Cursor BYOK setups (dario#190) where users have to
  * pick a colon-prefixed model name to dodge Cursor's built-in `claude-*`
  * name collision — which means the natural shorthand is `claude:opus`, and
@@ -113,12 +121,12 @@ export declare function buildOrchestrationPatterns(preserveTags?: Set<string>):
 export declare function sanitizeMessages(body: Record<string, unknown>, preserveTags?: Set<string>): void;
 export declare const OPENAI_MODELS_LIST: {
     object: string;
-    data: {
+    data: Array<{
         id: string;
         object: string;
         created: number;
         owned_by: string;
-    }[];
+    }>;
 };
 interface ProxyOptions {
     port?: number;

package/dist/proxy.js CHANGED

@@ -17,6 +17,7 @@ import { loadAllAccounts, loadAccount, refreshAccountToken, resyncLoginFromCrede
 import { getOpenAIBackend, isOpenAIModel, forwardToOpenAI } from './openai-backend.js';
 import { RequestQueue, QueueFullError, QueueTimeoutError, DEFAULT_MAX_CONCURRENT, DEFAULT_MAX_QUEUED, DEFAULT_QUEUE_TIMEOUT_MS } from './request-queue.js';
 import { redactSecrets } from './redact.js';
+import { BAKED_BASE_MODELS, withLongContextVariants, buildOpenAIModelsList, getModelCatalog, getCachedBases, resolveAliasAgainst, prewarmModelCatalog } from './model-catalog.js';
 const ANTHROPIC_API = 'https://api.anthropic.com';
 const DEFAULT_PORT = 3456;
 const MAX_BODY_BYTES = 10 * 1024 * 1024; // 10 MB — generous for large prompts, prevents abuse
@@ -133,31 +134,42 @@ function loadClaudeIdentity() {
     }
     return { deviceId: '', accountUuid: '' };
 }
-// Model shortcuts — users can pass short names
+// Model shortcuts — users can pass short names. Family shorthands
+// (`opus`, `opus1m`, …) resolve DYNAMICALLY against the model catalog in
+// resolveClaudeAlias — this static map is the offline fallback plus the
+// deliberate legacy version pins (`opus47`/`opus46`), which never float.
 const MODEL_ALIASES = {
     'fable': 'claude-fable-5',
     'fable1m': 'claude-fable-5[1m]',
     'opus': 'claude-opus-4-8',
     'opus47': 'claude-opus-4-7',
     'opus46': 'claude-opus-4-6',
-    'opus1m': 'claude-opus-4-7[1m]',
+    'opus1m': 'claude-opus-4-8[1m]',
     'sonnet': 'claude-sonnet-4-6',
     'sonnet1m': 'claude-sonnet-4-6[1m]',
     'haiku': 'claude-haiku-4-5',
 };
 /**
- * Resolve a Claude-side model name through MODEL_ALIASES if it's a short
- * alias (`opus`/`sonnet`/`haiku`/etc.), otherwise pass through unchanged.
+ * Resolve a Claude-side model name through the family-alias rules if it's a
+ * short alias (`opus`/`sonnet`/`haiku`/etc.), otherwise pass through
+ * unchanged.
+ *
+ * Family shorthands resolve against the live model catalog: `<family>` is
+ * the newest base of that family, and `<family>1m` DERIVES from that same
+ * base + `[1m]` — one rule for every family, so the pair can't drift apart
+ * (pre-catalog, #389 bumped `opus` to 4-8 while `opus1m` silently stayed on
+ * 4-7). Before the first catalog fetch the baked set produces the same
+ * answers as the static map; the map stays as the last-resort fallback.
  *
  * Used at request time on the provider-prefix path so `claude:opus` arrives
- * upstream as `claude-opus-4-6` rather than the bare `opus` (which Anthropic
+ * upstream as a full model id rather than the bare `opus` (which Anthropic
  * 400's). Critical for Cursor BYOK setups (dario#190) where users have to
  * pick a colon-prefixed model name to dodge Cursor's built-in `claude-*`
  * name collision — which means the natural shorthand is `claude:opus`, and
  * that needs to Just Work.
  */
 export function resolveClaudeAlias(model) {
-    return MODEL_ALIASES[model] ?? model;
+    return resolveAliasAgainst(model, getCachedBases()) ?? MODEL_ALIASES[model] ?? model;
 }
 // Provider prefix in the `model` field — `<provider>:<model>`. Forces
 // routing regardless of model-name regex. Only recognized prefixes are
@@ -477,7 +489,11 @@ function translateStreamChunk(line) {
     catch { }
     return null;
 }
-export const OPENAI_MODELS_LIST = { object: 'list', data: ['claude-fable-5', 'claude-fable-5[1m]', 'claude-opus-4-8', 'claude-opus-4-7', 'claude-opus-4-6', 'claude-sonnet-4-6', 'claude-haiku-4-5'].map(id => ({ id, object: 'model', created: 1700000000, owned_by: 'anthropic' })) };
+// Baked /v1/models payload — what the proxy advertises before (or without)
+// a successful upstream catalog fetch. The live route serves the
+// autodetected catalog (model-catalog.ts); `[1m]` variants are GENERATED by
+// the one shared long-context rule, never hand-listed per model.
+export const OPENAI_MODELS_LIST = buildOpenAIModelsList(withLongContextVariants(BAKED_BASE_MODELS));
 /**
  * Append a JSON-ND line to the proxy log file. No-op when stream is
  * null (logFile not configured). Errors are swallowed — log writes
@@ -832,7 +848,7 @@ export async function startProxy(opts = {}) {
     const modelPrefix = opts.model ? parseProviderPrefix(opts.model) : null;
     const cliModelRaw = modelPrefix ? modelPrefix.model : opts.model;
     const cliProviderOverride = modelPrefix ? modelPrefix.provider : null;
-    const modelOverride = cliModelRaw ? (MODEL_ALIASES[cliModelRaw] ?? cliModelRaw) : null;
+    const modelOverride = cliModelRaw ? resolveClaudeAlias(cliModelRaw) : null;
     const identity = loadClaudeIdentity();
     if (identity.deviceId) {
         console.log('  Device identity: detected');
@@ -1020,7 +1036,17 @@ export async function startProxy(opts = {}) {
         ...SECURITY_HEADERS,
     };
     const JSON_HEADERS = { 'Content-Type': 'application/json', ...SECURITY_HEADERS };
-    const MODELS_JSON = JSON.stringify(OPENAI_MODELS_LIST);
+    // Model catalog wiring — /v1/models serves the upstream-autodetected set,
+    // authenticated the same way the request path is (per-token API key when
+    // ANTHROPIC_UPSTREAM_API_KEY is set, OAuth bearer otherwise). Prewarmed so
+    // the first client call is answered from cache; every failure path inside
+    // getModelCatalog falls back to the baked list, so the route always 200s.
+    const catalogDeps = {
+        upstreamApiKey: upstreamApiKey || undefined,
+        getToken: getAccessToken,
+        log: verbose ? (m) => console.log(m) : undefined,
+    };
+    prewarmModelCatalog(catalogDeps);
     const ERR_UNAUTH = JSON.stringify({ error: 'Unauthorized', message: 'Invalid or missing API key' });
     const ERR_FORBIDDEN = JSON.stringify({ error: 'Forbidden', message: 'Path not allowed. Supported paths: POST /v1/messages, POST /v1/messages/count_tokens, POST /v1/chat/completions, GET /v1/models' });
     const ERR_METHOD = JSON.stringify({ error: 'Method not allowed' });
@@ -1246,8 +1272,13 @@ export async function startProxy(opts = {}) {
         }
         if (urlPath === '/v1/models' && req.method === 'GET') {
             requestCount++;
+            // Upstream-autodetected catalog (TTL-cached, baked fallback — never
+            // throws). [1m] variants come from the shared long-context rule, so
+            // every family advertises its 1M form the same way.
+            const catalog = await getModelCatalog(catalogDeps);
+            const body = JSON.stringify(buildOpenAIModelsList(withLongContextVariants(catalog.bases)));
             res.writeHead(200, { ...JSON_HEADERS, 'Access-Control-Allow-Origin': corsOrigin });
-            res.end(MODELS_JSON);
+            res.end(body);
             return;
         }
         // Detect OpenAI-format requests
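
The new `resolveClaudeAlias` is a three-step fallback: catalog-derived family alias, then the static map, then pass-through. The sketch below traces that chain under stated assumptions. `newestBase` is a simplified stand-in that assumes the base list is already ordered newest-first within each family (the shipped `resolveFamilyBase` sorts instead), `liveBases` is a made-up catalog containing a hypothetical `claude-opus-4-9`, and `STATIC_ALIASES` is a subset of `MODEL_ALIASES` from the diff.

```typescript
const FAMILIES = ['fable', 'opus', 'sonnet', 'haiku'];

// Subset of MODEL_ALIASES from the diff: the offline fallback plus a legacy pin.
const STATIC_ALIASES: { [alias: string]: string | undefined } = {
  opus: 'claude-opus-4-8',
  opus47: 'claude-opus-4-7',
  opus1m: 'claude-opus-4-8[1m]',
  haiku: 'claude-haiku-4-5',
};

// Simplified stand-in for resolveFamilyBase(): assumes bases are already
// ordered newest-first within a family, so the first match is the newest.
function newestBase(family: string, bases: readonly string[]): string | null {
  for (const b of bases) {
    if (b.indexOf('[') === -1 && b.toLowerCase().indexOf(family) !== -1) return b;
  }
  return null;
}

// Mirrors resolveAliasAgainst(): <family> -> newest base,
// <family>1m -> the same base + [1m], anything else -> null.
function resolveAgainst(model: string, bases: readonly string[]): string | null {
  const m = model.toLowerCase().trim();
  if (FAMILIES.indexOf(m) !== -1) return newestBase(m, bases);
  const match = /^([a-z]+)1m$/.exec(m);
  const fam = match ? match[1] : undefined;
  if (fam !== undefined && FAMILIES.indexOf(fam) !== -1) {
    const base = newestBase(fam, bases);
    // haiku never takes a [1m] variant, so haiku1m does not resolve here.
    return base !== null && base.toLowerCase().indexOf('haiku') === -1 ? `${base}[1m]` : null;
  }
  return null;
}

// Catalog first, static map second, unchanged input last.
function resolveAlias(model: string, bases: readonly string[]): string {
  return resolveAgainst(model, bases) ?? STATIC_ALIASES[model] ?? model;
}

// Made-up live catalog; claude-opus-4-9 is hypothetical.
const liveBases = ['claude-fable-5', 'claude-opus-4-9', 'claude-opus-4-8', 'claude-sonnet-4-6', 'claude-haiku-4-5'];

const resolved = {
  opus: resolveAlias('opus', liveBases),
  opus1m: resolveAlias('opus1m', liveBases),
  opus47: resolveAlias('opus47', liveBases),
  haiku1m: resolveAlias('haiku1m', liveBases),
  fullId: resolveAlias('claude-sonnet-4-6', liveBases),
};
```

In this trace `opus` and `opus1m` both follow the hypothetical newer base, which is the drift the derivation rule is meant to prevent; `opus47` still comes from the static map as a deliberate pin; `haiku1m` and the full id pass through unchanged.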

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "@askalf/dario",
-  "version": "4.8.55",
+  "version": "4.8.57",
   "description": "Use your Claude Pro/Max subscription in any tool — Cursor, Cline, Aider, the Agent SDK, your scripts — at subscription pricing, not per-token API bills. One local Anthropic + OpenAI-compatible endpoint.",
   "type": "module",
   "bin": {