@goondocks/myco 0.4.1 → 0.4.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (49)
  1. package/.claude-plugin/marketplace.json +1 -1
  2. package/.claude-plugin/plugin.json +1 -1
  3. package/CONTRIBUTING.md +4 -5
  4. package/README.md +1 -1
  5. package/dist/{chunk-P7RNAYU7.js → chunk-67R6EMYD.js} +61 -9
  6. package/dist/chunk-67R6EMYD.js.map +1 -0
  7. package/dist/{chunk-XHWIIU5D.js → chunk-GFBG73P4.js} +2 -2
  8. package/dist/{chunk-IVS5MYBL.js → chunk-IYFKPSRP.js} +2 -2
  9. package/dist/{cli-IGZA3TZC.js → cli-PMOFCZQL.js} +14 -14
  10. package/dist/{detect-providers-5FU3BN5Q.js → detect-providers-IRL2TTLK.js} +2 -2
  11. package/dist/{init-M3GDZRKI.js → init-NUF5UBUJ.js} +3 -3
  12. package/dist/{main-3JSO25IZ.js → main-2XEBVUR6.js} +3 -3
  13. package/dist/{rebuild-MW4GCY6Z.js → rebuild-E6YFIRYZ.js} +3 -3
  14. package/dist/{reprocess-SWRFIIDZ.js → reprocess-7G7KQWCN.js} +4 -4
  15. package/dist/{restart-5UY2KV54.js → restart-ABW4ZK3P.js} +3 -3
  16. package/dist/{search-IYVMRZU2.js → search-MPD7SFK6.js} +3 -3
  17. package/dist/{server-FSUSHJ3Y.js → server-NZLZRITH.js} +3 -3
  18. package/dist/{setup-digest-6TK5SPS6.js → setup-digest-K732MGOJ.js} +3 -3
  19. package/dist/{setup-llm-UGZBURZJ.js → setup-llm-XCCH5LYD.js} +3 -3
  20. package/dist/src/cli.js +1 -1
  21. package/dist/src/daemon/main.js +1 -1
  22. package/dist/src/mcp/server.js +1 -1
  23. package/dist/{stats-IVIXIKTS.js → stats-6G7SN5YZ.js} +3 -3
  24. package/dist/{verify-WEGRM4W2.js → verify-JFHQH55Z.js} +3 -3
  25. package/package.json +1 -1
  26. package/skills/myco/SKILL.md +5 -1
  27. package/skills/myco/references/cli-usage.md +322 -0
  28. package/skills/myco/references/vault-status.md +224 -0
  29. package/skills/setup/SKILL.md +146 -0
  30. package/skills/setup/references/model-recommendations.md +77 -0
  31. package/commands/init.md +0 -122
  32. package/commands/setup-llm.md +0 -114
  33. package/commands/status.md +0 -130
  34. package/dist/chunk-P7RNAYU7.js.map +0 -1
  35. package/dist/{chunk-XHWIIU5D.js.map → chunk-GFBG73P4.js.map} +0 -0
  36. package/dist/{chunk-IVS5MYBL.js.map → chunk-IYFKPSRP.js.map} +0 -0
  37. package/dist/{cli-IGZA3TZC.js.map → cli-PMOFCZQL.js.map} +0 -0
  38. package/dist/{detect-providers-5FU3BN5Q.js.map → detect-providers-IRL2TTLK.js.map} +0 -0
  39. package/dist/{init-M3GDZRKI.js.map → init-NUF5UBUJ.js.map} +0 -0
  40. package/dist/{main-3JSO25IZ.js.map → main-2XEBVUR6.js.map} +0 -0
  41. package/dist/{rebuild-MW4GCY6Z.js.map → rebuild-E6YFIRYZ.js.map} +0 -0
  42. package/dist/{reprocess-SWRFIIDZ.js.map → reprocess-7G7KQWCN.js.map} +0 -0
  43. package/dist/{restart-5UY2KV54.js.map → restart-ABW4ZK3P.js.map} +0 -0
  44. package/dist/{search-IYVMRZU2.js.map → search-MPD7SFK6.js.map} +0 -0
  45. package/dist/{server-FSUSHJ3Y.js.map → server-NZLZRITH.js.map} +0 -0
  46. package/dist/{setup-digest-6TK5SPS6.js.map → setup-digest-K732MGOJ.js.map} +0 -0
  47. package/dist/{setup-llm-UGZBURZJ.js.map → setup-llm-XCCH5LYD.js.map} +0 -0
  48. package/dist/{stats-IVIXIKTS.js.map → stats-6G7SN5YZ.js.map} +0 -0
  49. package/dist/{verify-WEGRM4W2.js.map → verify-JFHQH55Z.js.map} +0 -0
package/skills/setup/references/model-recommendations.md ADDED
@@ -0,0 +1,77 @@
+ # Model Recommendations
+
+ Hardware-based guidance for choosing intelligence and embedding models during Myco setup.
+
+ ## Intelligence Model (LLM)
+
+ One model handles all intelligence tasks — hooks, extraction, summaries, and digest. Size for digestion, the most demanding task (largest context window). The same model runs at 8192 context for hooks and at the digest context window below for synthesis.
+
+ | RAM | Recommended Model | Digest Context Window |
+ |-----|-------------------|-----------------------|
+ | **64GB+** | `qwen3.5:35b` (MoE, recommended) | 65536 |
+ | **32–64GB** | `qwen3.5:27b` | 32768 |
+ | **16–32GB** | `qwen3.5:latest` (~10B) | 16384 |
+ | **8–16GB** | `qwen3.5:4b` | 8192 |
+
+ ### Why Qwen 3.5?
+
+ Qwen 3.5 models offer strong instruction-following and synthesis quality on local hardware. The MoE variant (`35b`) runs efficiently on 64GB+ systems because only a subset of parameters activate per token. Any instruction-tuned model that handles JSON output works — prefer what the user already has loaded, but recommend Qwen 3.5 for new setups.
+
+ ### Pulling Models
+
+ **Ollama:**
+ ```bash
+ ollama pull qwen3.5       # pulls latest tag (~10B)
+ ollama pull qwen3.5:4b    # 4B variant
+ ollama pull qwen3.5:27b   # 27B variant
+ ollama pull qwen3.5:35b   # 35B MoE variant
+ ```
+
+ **LM Studio:** Search for `qwen3.5` in the model browser. Download the variant matching the RAM tier above.
+
+ ## Embedding Model
+
+ Embedding models are separate from the intelligence model. Anthropic does not support embeddings — only Ollama and LM Studio provide embedding models.
+
+ Recommended embedding models:
+ - `bge-m3` — strong multilingual embeddings, good default
+ - `nomic-embed-text` — lightweight alternative
+
+ **Ollama:**
+ ```bash
+ ollama pull bge-m3
+ ollama pull nomic-embed-text
+ ```
+
+ **LM Studio:** Filter the model list for names containing `text-embedding`. If none are available, search for and download an embedding model through the model browser.
+
+ ## Inject Tier
+
+ Controls how much pre-computed context the agent receives at session start. Agents can always request a different tier on-demand via the `myco_context` MCP tool.
+
+ | RAM | Available Tiers | Default |
+ |-----|-----------------|---------|
+ | **64GB+** | 1500, 3000, 5000, 10000 | 3000 |
+ | **32–64GB** | 1500, 3000, 5000 | 3000 |
+ | **16–32GB** | 1500, 3000 | 1500 |
+ | **8–16GB** | 1500 | 1500 |
+
+ ### Tier Descriptions
+
+ - **1500** — executive briefing (fastest, lightest)
+ - **3000** — team standup (recommended for most setups)
+ - **5000** — deep onboarding
+ - **10000** — institutional knowledge (richest, most context)
+
+ ## Advanced: Separate Digestion Model
+
+ The guided setup configures one intelligence model for all tasks. Power users who want a separate, larger model specifically for digest can configure it via CLI:
+
+ ```bash
+ node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-digest \
+   --provider lm-studio \
+   --model "qwen/qwen3.5-35b-a3b" \
+   --context-window 65536
+ ```
+
+ This is not exposed in the guided setup to avoid resource exhaustion from running two large models simultaneously.
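Not part of the package — a minimal sketch of how the two RAM-tier tables in the added file could be encoded, assuming detected RAM is already known in GB. The function name and the `Recommendation` shape are illustrative, not the plugin's actual API.

```typescript
// Illustrative sketch (not from the package): encode the RAM-tier tables
// from model-recommendations.md. Maps detected RAM in GB to the recommended
// model, digest context window, available inject tiers, and default tier.
interface Recommendation {
  model: string;
  digestContextWindow: number;
  tiers: number[];
  defaultTier: number;
}

function recommendFor(ramGb: number): Recommendation {
  if (ramGb >= 64) {
    return { model: "qwen3.5:35b", digestContextWindow: 65536, tiers: [1500, 3000, 5000, 10000], defaultTier: 3000 };
  }
  if (ramGb >= 32) {
    return { model: "qwen3.5:27b", digestContextWindow: 32768, tiers: [1500, 3000, 5000], defaultTier: 3000 };
  }
  if (ramGb >= 16) {
    return { model: "qwen3.5:latest", digestContextWindow: 16384, tiers: [1500, 3000], defaultTier: 1500 };
  }
  return { model: "qwen3.5:4b", digestContextWindow: 8192, tiers: [1500], defaultTier: 1500 };
}

console.log(recommendFor(48).model); // "qwen3.5:27b"
```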
package/commands/init.md DELETED
@@ -1,122 +0,0 @@
- ---
- name: myco-init
- description: Initialize Myco in the current project — sets up vault, config, and intelligence backend
- ---
-
- # Initialize Myco
-
- Guide the user through setup using the composable CLI commands. **Do NOT create files manually — the CLI handles all vault creation, config writing, and env configuration.**
-
- **Ask each question one at a time using AskUserQuestion with selectable options.** Wait for the user's answer before proceeding to the next question. Do NOT combine multiple questions into one message.
-
- The streamlined setup asks just four questions: vault location, provider, model, and embedding model. One model handles everything — hooks, extraction, summaries, and digest — sized for the most demanding task (digestion). Advanced configuration is available via CLI commands after init.
-
- ## Step 1: Detect available providers and system capabilities
-
- Run the provider detection command and detect system RAM:
-
- ```bash
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js detect-providers
- ```
-
- Detect RAM:
- - **macOS**: `sysctl -n hw.memsize` (bytes → GB)
- - **Linux**: parse `/proc/meminfo` for `MemTotal`
-
- Parse the JSON output. This tells you which providers are running and what models are available.
-
- ## Step 2: Choose vault location
-
- **Question:** "Where would you like to store the Myco vault?"
-
- **Options:**
- - "In the project (.myco/)" — vault lives with the code, can be committed to git for team sharing
- - "Centralized (~/.myco/vaults/<project-name>/)" — vault stays outside the repo, good for public repos or personal use
- - "Custom path" — specify your own location
-
- ## Step 3: Choose provider and model
-
- **Question:** "Which LLM provider and model?"
-
- List only providers where `available` is `true`. Recommend a model sized for digest based on detected RAM:
-
- | RAM | Recommended Model | Digest Context |
- |-----|-------------------|----------------|
- | **64GB+** | `qwen3.5:35b` (MoE, recommended) | 65536 |
- | **32–64GB** | `qwen3.5:27b` | 32768 |
- | **16–32GB** | `qwen3.5:latest` (~10B) | 16384 |
- | **8–16GB** | `qwen3.5:4b` | 8192 |
-
- The same model handles hooks (at 8K context), extraction, summaries, and digest (at the larger context from the table). No separate model configuration needed.
-
- If the model isn't installed, offer to pull it:
- - **Ollama**: `ollama pull qwen3.5`
- - **LM Studio**: search for `qwen3.5` in the model browser
-
- ## Step 4: Choose embedding model
-
- **Question:** "Which embedding model?"
-
- **Options:** List only providers that support embeddings (Anthropic does not):
- - **Ollama** — list available embedding models. If none are available, offer to pull one (e.g., `bge-m3` or `nomic-embed-text`).
- - **LM Studio** — filter the model list for names containing `text-embedding`. If none are available, guide the user to search for and download an embedding model through LM Studio's model browser.
-
- If no embedding models are available on the chosen provider, help the user get one before proceeding.
-
- ## Step 5: Choose digest inject tier
-
- **Question:** "How much context should the agent receive at session start?"
-
- Based on RAM, present the recommended tiers:
-
- | RAM | Options | Default |
- |-----|---------|---------|
- | **64GB+** | 1500, 3000, 5000, 10000 | 3000 |
- | **32–64GB** | 1500, 3000, 5000 | 3000 |
- | **16–32GB** | 1500, 3000 | 1500 |
- | **8–16GB** | 1500 | 1500 |
-
- **Options:**
- - "1500 — executive briefing (fastest, lightest)"
- - "3000 — team standup (recommended)"
- - "5000 — deep onboarding"
- - "10000 — institutional knowledge (richest)"
-
- This controls what gets auto-injected at the start of every session. Agents can always request a different tier on-demand via the `myco_context` tool.
-
- ## Step 6: Run init and configure
-
- Create the vault and apply settings:
-
- ```bash
- # Create vault structure and base config
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js init \
-   --vault <chosen-path> \
-   --llm-provider <provider> \
-   --llm-model <model> \
-   --embedding-provider <embedding-provider> \
-   --embedding-model <embedding-model>
-
- # Set digest context window and inject tier based on user choices
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-digest \
-   --context-window <from-ram-table> \
-   --inject-tier <chosen-tier>
- ```
-
- ## Step 7: Verify connectivity
-
- ```bash
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js verify
- ```
-
- If verification fails, help the user troubleshoot.
-
- ## Step 8: Display summary
-
- | Setting | Value |
- |---------|-------|
- | Vault path | `<resolved path>` |
- | Provider | `<provider>` / `<model>` |
- | Embedding | `<embedding-provider>` / `<embedding-model>` |
- | Digest | enabled (context: `<context-window>`) |
- | RAM detected | `<X>` GB |
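The deleted init.md detects RAM by shelling out to `sysctl -n hw.memsize` on macOS or parsing `/proc/meminfo` on Linux. A sketch of an equivalent check from inside Node (which is where this plugin's CLI runs) — not the plugin's actual code — using the standard `os.totalmem()`, which returns total system memory in bytes on both platforms:

```typescript
// Sketch (not the plugin's actual code): cross-platform RAM detection.
// os.totalmem() covers what init.md did with sysctl / /proc/meminfo.
import * as os from "node:os";

function detectRamGb(): number {
  return Math.round(os.totalmem() / 1024 ** 3); // bytes → GB
}

console.log(`Detected RAM: ${detectRamGb()} GB`);
```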
package/commands/setup-llm.md DELETED
@@ -1,114 +0,0 @@
- ---
- name: myco-setup-llm
- description: Configure or change the intelligence backend (Ollama, LM Studio, or Anthropic)
- ---
-
- # LLM Backend Setup
-
- Guide the user through configuring their intelligence backend. This command can be run at any time to change providers or models.
-
- The streamlined setup asks just three questions: provider, model, and embedding model. One model handles everything — hooks, extraction, summaries, and digest — at different context windows per request. Advanced configuration is available via the CLI for power users.
-
- ## Prerequisites
-
- Read the existing `myco.yaml` from the vault directory to show current settings before making changes.
-
- ## Step 1: Detect available providers and system capabilities
-
- Check which providers are reachable:
-
- - **Ollama** — fetch `http://localhost:11434/api/tags`, list model names
- - **LM Studio** — fetch `http://localhost:1234/v1/models`, list model names
- - **Anthropic** — check if `ANTHROPIC_API_KEY` is set in the environment
-
- Detect system RAM for recommendations:
- - **macOS**: `sysctl -n hw.memsize` (bytes → GB)
- - **Linux**: parse `/proc/meminfo` for `MemTotal`
-
- Report which providers are available and the detected RAM.
-
- ## Step 2: Choose provider and model
-
- Ask the user to select from available providers. After picking a provider, recommend a model sized for digest (the most demanding task). The same model handles hooks and extraction at smaller context windows automatically.
-
- Recommended models by hardware tier — Qwen 3.5 is preferred for its strong instruction-following and synthesis quality:
-
- | RAM | Model | Context for Digest |
- |-----|-------|--------------------|
- | **64GB+** | `qwen3.5:35b` (MoE, recommended) | 65536 |
- | **32–64GB** | `qwen3.5:27b` | 32768 |
- | **16–32GB** | `qwen3.5:latest` (~10B) | 16384 |
- | **8–16GB** | `qwen3.5:4b` | 8192 |
-
- Any instruction-tuned model that handles JSON output works. Prefer what the user already has loaded, but recommend Qwen 3.5 if they're starting fresh.
-
- If the chosen model isn't installed, offer to pull it:
- - **Ollama**: `ollama pull qwen3.5` (pulls latest tag automatically)
- - **LM Studio**: search for `qwen3.5` in the model browser
-
- ## Step 3: Choose embedding model
-
- Ask the user to select an embedding model — **Anthropic is not an option** (it doesn't support embeddings):
-
- - **Ollama** — list available embedding models. If none are available, offer to pull one (e.g., `bge-m3` or `nomic-embed-text`).
- - **LM Studio** — filter the model list for names containing `text-embedding`. If none are available, guide the user to search for and download an embedding model through LM Studio's model browser.
-
- If no embedding models are available on the chosen provider, help the user get one before proceeding.
-
- **Important:** If the user changes the embedding model, warn them:
- > "Changing the embedding model will require a full rebuild of the vector index. Run `node dist/src/cli.js rebuild` after this change."
-
- ## Step 4: Apply settings
-
- Use the CLI commands to write settings deterministically. The context window for the main LLM stays at 8192 (hooks don't need more). The digest context window is set based on the RAM tier recommendation.
-
- ```bash
- # Set provider and model
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-llm \
-   --llm-provider <provider> \
-   --llm-model <model> \
-   --embedding-provider <embedding-provider> \
-   --embedding-model <embedding-model>
-
- # Set digest context window based on RAM tier (model inherits from main LLM)
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-digest \
-   --context-window <from-ram-table>
- ```
-
- Only pass flags the user explicitly changed — Zod defaults handle the rest.
-
- If migrating from a v1 config (has `backend: local/cloud` structure), bump `version` to `2` and rewrite the entire intelligence section. The loader auto-maps `provider: haiku` to `anthropic`.
-
- ## Step 5: Verify and restart
-
- 1. Test the LLM provider with a simple prompt
- 2. Test the embedding provider with a test embedding
- 3. Restart the daemon to pick up the new config: `node dist/src/cli.js restart`
- 4. Report success or issues found
-
- ## Advanced Configuration
-
- For power users who want fine-grained control, all settings are available via CLI:
-
- ```bash
- # Separate digest model (e.g., larger model on LM Studio)
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-digest \
-   --provider lm-studio \
-   --model "qwen/qwen3.5-35b-a3b" \
-   --context-window 65536 \
-   --gpu-kv-cache false
-
- # Custom tiers and injection
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-digest \
-   --tiers 1500,3000,5000,10000 \
-   --inject-tier 3000
-
- # Capture token budgets
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-digest \
-   --extraction-tokens 2048 \
-   --summary-tokens 1024
-
- # View current settings
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-llm --show
- node ${CLAUDE_PLUGIN_ROOT}/dist/src/cli.js setup-digest --show
- ```
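The provider probes described in the deleted setup-llm.md (Step 1) can be sketched as below. The response shapes match what the package's own bundled backends parse — Ollama's `/api/tags` returns `{ models: [{ name }] }` and LM Studio's `/v1/models` returns `{ data: [{ id }] }` — but the helper names here are illustrative, not the plugin's API:

```typescript
// Sketch of the reachability probes from Step 1 of the deleted setup-llm.md.
// Response shapes follow the parsing in the package's bundled ollama/lm-studio
// backends; function names are illustrative.
interface OllamaTags { models: Array<{ name: string }> }
interface LmStudioModels { data: Array<{ id: string }> }

function ollamaModelNames(tags: OllamaTags): string[] {
  return tags.models.map((m) => m.name);
}

function lmStudioModelIds(list: LmStudioModels): string[] {
  return list.data.map((m) => m.id);
}

// A provider counts as "available" if its list endpoint answers with 2xx.
async function probe(url: string, timeoutMs = 2000): Promise<boolean> {
  try {
    const res = await fetch(url, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok;
  } catch {
    return false; // unreachable or timed out
  }
}

// Usage: await probe("http://localhost:11434/api/tags") for Ollama,
// await probe("http://localhost:1234/v1/models") for LM Studio.
```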
package/commands/status.md DELETED
@@ -1,130 +0,0 @@
- ---
- name: myco-status
- description: Show Myco vault health, stats, and any pending issues
- ---
-
- # Myco Status
-
- Check and report the health of the Myco vault and daemon. Use the CLI (`node dist/src/cli.js stats`) for data, and supplement with direct checks where needed.
-
- ## Step 1: Resolve vault location
-
- Find the vault directory:
- - Check `MYCO_VAULT_DIR` in the environment
- - Check `.claude/settings.user.json` (or `.claude/settings.json`) under the `env` key for `MYCO_VAULT_DIR`
- - Fall back to `~/.myco/vaults/<project-name>/`
-
- If no vault is found, report: "No Myco vault configured. Run `/myco-init` to set up."
-
- ## Step 2: Config health
-
- Read `myco.yaml` from the vault:
- - Report config version (should be `2`)
- - Report LLM provider and model
- - Report embedding provider and model
- - Flag any issues (v1 config, missing fields)
-
- ## Step 3: Daemon status
-
- Check `daemon.json` in the vault for PID and port:
- - Is the daemon process running? (check if PID is alive)
- - Is it healthy? (HTTP health check on the reported port)
- - Report PID, port, uptime, active sessions
- - If not running: "Daemon not running. It will start automatically on next session."
-
- ## Step 4: Vault stats
-
- Query the FTS index for counts:
-
- | Metric | How to check |
- |--------|--------------|
- | Sessions | `index.query({ type: 'session' }).length` |
- | Spores | `index.query({ type: 'spore' }).length` |
- | Plans | `index.query({ type: 'plan' }).length` |
- | Artifacts | `index.query({ type: 'artifact' }).length` |
- | Embeddings | Vector index count |
-
- Also report spore breakdown by observation type (decision, gotcha, trade_off, etc.).
-
- ## Step 5: Digest status
-
- Check the digest system state:
-
- - **Enabled/disabled**: read `digest.enabled` from `myco.yaml`
- - **Extracts**: list which tier files exist in `vault/digest/` (extract-1500.md, etc.) with file sizes and generated timestamps
- - **Last cycle**: read last line of `vault/digest/trace.jsonl` — report cycle ID, timestamp, tiers generated, substrate count, duration
- - **Metabolism**: report configured tiers, inject tier, and context window
- - **Digest model**: if `digest.intelligence.model` is set, show it; otherwise note "inherits from main LLM"
-
- ## Step 6: Intelligence backend health
-
- Test connectivity to the configured providers:
-
- - **LLM provider**: call `isAvailable()` — report reachable or not
- - **Embedding provider**: call `isAvailable()` — report reachable or not
- - If either is unreachable, suggest running `/myco-setup-llm`
-
- ## Step 7: Pending issues
-
- Check for problems:
-
- - **Stale buffers**: any `.jsonl` files in `buffer/` older than 24h? These indicate events that were never processed (LLM was unavailable)
- - **Missing index**: does `index.db` exist? If not, suggest `node dist/src/cli.js rebuild`
- - **Missing vectors**: does `vectors.db` exist? If not, embeddings are disabled
- - **Lineage**: does `lineage.json` exist? Report link count if so
-
- ## Step 8: Recent activity
-
- Show the 3 most recent sessions with:
- - Session ID (short form)
- - Title
- - Started/ended timestamps
- - Number of spores extracted
- - Parent session (if lineage detected)
-
- ## Output format
-
- Present as a structured report:
-
- ```
- === Myco Vault ===
- Path: ~/.myco/vaults/myco/
- Config: v2 (valid)
-
- --- Intelligence ---
- LLM: ollama / gpt-oss (reachable)
- Embedding: ollama / bge-m3 (reachable)
-
- --- Daemon ---
- PID: 12345 (running)
- Port: 60942
- Sessions: 1 active
-
- --- Vault ---
- Sessions: 12
- Spores: 183 (67 decision, 34 gotcha, 32 trade_off, 20 discovery, 19 bug_fix, 1 cross-cutting)
- Plans: 0
- Artifacts: 8
- Vectors: 224
-
- --- Digest ---
- Enabled: yes
- Tiers: [1500, 3000, 5000, 10000]
- Inject: 3000 (auto-inject at session start)
- Model: gpt-oss (inherited from main LLM)
- Last cycle: dc-a1b2c3 (2 min ago, 4 tiers, 12 notes, 45s)
- Extracts: 1500 (1.1KB), 3000 (4.5KB), 5000 (6.9KB), 10000 (9.6KB)
-
- --- Lineage ---
- Links: 5 (3 clear, 1 inferred, 1 semantic_similarity)
-
- --- Recent Sessions ---
- 1. [abc123] "Auth redesign session" (2h 15m, 5 spores)
- 2. [def456] "Bug fix for CORS" (45m, 2 spores, parent: abc123)
- 3. [ghi789] "Config cleanup" (20m, 1 spore)
-
- --- Issues ---
- None found.
- ```
-
- Adapt the format to what's actually available. If sections have no data, show them with "None" rather than omitting them.
package/dist/chunk-P7RNAYU7.js.map DELETED
@@ -1 +0,0 @@
- {"version":3,"sources":["../src/intelligence/ollama.ts","../src/intelligence/lm-studio.ts"],"sourcesContent":["import type { LlmProvider, EmbeddingProvider, LlmResponse, EmbeddingResponse, LlmRequestOptions } from './llm.js';\nimport { estimateTokens, LLM_REQUEST_TIMEOUT_MS, EMBEDDING_REQUEST_TIMEOUT_MS, DAEMON_CLIENT_TIMEOUT_MS } from '../constants.js';\n\ninterface OllamaConfig {\n model?: string;\n base_url?: string;\n context_window?: number;\n max_tokens?: number;\n // Legacy fields (ignored, kept for backward compat during migration)\n embedding_model?: string;\n summary_model?: string;\n}\n\n// Ollama API endpoints\nconst ENDPOINT_GENERATE = '/api/generate';\nconst ENDPOINT_EMBED = '/api/embed';\nconst ENDPOINT_TAGS = '/api/tags';\n\nexport class OllamaBackend implements LlmProvider, EmbeddingProvider {\n static readonly DEFAULT_BASE_URL = 'http://localhost:11434';\n readonly name = 'ollama';\n private baseUrl: string;\n private model: string;\n private contextWindow: number;\n private defaultMaxTokens: number;\n\n constructor(config?: OllamaConfig) {\n this.baseUrl = config?.base_url ?? OllamaBackend.DEFAULT_BASE_URL;\n this.model = config?.model ?? config?.summary_model ?? 'llama3.2';\n this.contextWindow = config?.context_window ?? 8192;\n this.defaultMaxTokens = config?.max_tokens ?? 1024;\n }\n\n async summarize(prompt: string, opts?: LlmRequestOptions): Promise<LlmResponse> {\n const maxTokens = opts?.maxTokens ?? this.defaultMaxTokens;\n const contextLength = opts?.contextLength ?? 
this.contextWindow;\n const promptTokens = estimateTokens(prompt);\n const numCtx = Math.max(promptTokens + maxTokens, contextLength);\n\n const body: Record<string, unknown> = {\n model: this.model,\n prompt,\n stream: false,\n options: {\n num_ctx: numCtx,\n num_predict: maxTokens,\n },\n };\n\n // System prompt — sent as a separate field instead of concatenated into prompt\n if (opts?.systemPrompt) {\n body.system = opts.systemPrompt;\n }\n\n // Thinking control — false suppresses chain-of-thought for reasoning models\n if (opts?.reasoning) {\n body.think = opts.reasoning === 'off' ? false : opts.reasoning;\n }\n\n // Keep model loaded between requests (useful for digest cycles)\n if (opts?.keepAlive) {\n body.keep_alive = opts.keepAlive;\n }\n\n const response = await fetch(`${this.baseUrl}${ENDPOINT_GENERATE}`, {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify(body),\n signal: AbortSignal.timeout(opts?.timeoutMs ?? LLM_REQUEST_TIMEOUT_MS),\n });\n\n if (!response.ok) {\n const errorBody = await response.text().catch(() => '');\n throw new Error(`Ollama summarize failed: ${response.status} ${errorBody.slice(0, 500)}`);\n }\n\n const data = await response.json() as { response: string; model: string };\n return { text: data.response, model: data.model };\n }\n\n async embed(text: string): Promise<EmbeddingResponse> {\n const response = await fetch(`${this.baseUrl}${ENDPOINT_EMBED}`, {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({\n model: this.model,\n input: text,\n }),\n signal: AbortSignal.timeout(EMBEDDING_REQUEST_TIMEOUT_MS),\n });\n\n if (!response.ok) {\n throw new Error(`Ollama embed failed: ${response.status} ${response.statusText}`);\n }\n\n const data = await response.json() as { embeddings: number[][]; model: string };\n const embedding = data.embeddings[0];\n return { embedding, model: data.model, dimensions: embedding.length };\n }\n\n async isAvailable(): 
Promise<boolean> {\n try {\n const response = await fetch(`${this.baseUrl}${ENDPOINT_TAGS}`, {\n signal: AbortSignal.timeout(DAEMON_CLIENT_TIMEOUT_MS),\n });\n return response.ok;\n } catch {\n return false;\n }\n }\n\n /** List available models on this Ollama instance. */\n async listModels(timeoutMs?: number): Promise<string[]> {\n try {\n const response = await fetch(`${this.baseUrl}${ENDPOINT_TAGS}`, {\n signal: AbortSignal.timeout(timeoutMs ?? DAEMON_CLIENT_TIMEOUT_MS),\n });\n const data = await response.json() as { models: Array<{ name: string }> };\n return data.models.map((m) => m.name);\n } catch {\n return [];\n }\n }\n}\n","import type { LlmProvider, EmbeddingProvider, LlmResponse, EmbeddingResponse, LlmRequestOptions } from './llm.js';\nimport { LLM_REQUEST_TIMEOUT_MS, EMBEDDING_REQUEST_TIMEOUT_MS, DAEMON_CLIENT_TIMEOUT_MS } from '../constants.js';\n\ninterface LmStudioConfig {\n model?: string;\n base_url?: string;\n context_window?: number;\n max_tokens?: number;\n // Legacy fields\n embedding_model?: string;\n summary_model?: string;\n}\n\n// LM Studio API endpoints\nconst ENDPOINT_CHAT = '/api/v1/chat';\nconst ENDPOINT_MODELS_LOAD = '/api/v1/models/load';\nconst ENDPOINT_MODELS_UNLOAD = '/api/v1/models/unload';\nconst ENDPOINT_MODELS_LIST = '/v1/models';\nconst ENDPOINT_EMBEDDINGS = '/v1/embeddings';\n\nexport class LmStudioBackend implements LlmProvider, EmbeddingProvider {\n static readonly DEFAULT_BASE_URL = 'http://localhost:1234';\n readonly name = 'lm-studio';\n private baseUrl: string;\n private model: string;\n private loadedInstanceId: string | null = null;\n private contextWindow: number | undefined;\n private defaultMaxTokens: number;\n\n constructor(config?: LmStudioConfig) {\n this.baseUrl = config?.base_url ?? LmStudioBackend.DEFAULT_BASE_URL;\n this.model = config?.model ?? config?.summary_model ?? 'llama3.2';\n this.contextWindow = config?.context_window;\n this.defaultMaxTokens = config?.max_tokens ?? 
1024;\n }\n\n /**\n * Generate text using LM Studio's native REST API (/api/v1/chat).\n * Supports per-request context_length, reasoning control, and system_prompt.\n */\n async summarize(prompt: string, opts?: LlmRequestOptions): Promise<LlmResponse> {\n const maxTokens = opts?.maxTokens ?? this.defaultMaxTokens;\n\n const body: Record<string, unknown> = {\n model: this.loadedInstanceId ?? this.model,\n input: prompt,\n max_output_tokens: maxTokens,\n store: false,\n };\n\n // Only set context_length if we haven't pre-loaded the model\n // (pre-loaded models already have the correct context via ensureLoaded)\n if (!this.loadedInstanceId) {\n const contextLength = opts?.contextLength ?? this.contextWindow;\n if (contextLength) {\n body.context_length = contextLength;\n }\n }\n\n // System prompt — sent separately from user content\n if (opts?.systemPrompt) {\n body.system_prompt = opts.systemPrompt;\n }\n\n // Reasoning control — 'off' suppresses chain-of-thought for reasoning models\n if (opts?.reasoning) {\n body.reasoning = opts.reasoning;\n }\n\n const response = await fetch(`${this.baseUrl}${ENDPOINT_CHAT}`, {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify(body),\n signal: AbortSignal.timeout(opts?.timeoutMs ?? LLM_REQUEST_TIMEOUT_MS),\n });\n\n if (!response.ok) {\n const errorBody = await response.text().catch(() => '');\n throw new Error(`LM Studio summarize failed: ${response.status} ${errorBody.slice(0, 500)}`);\n }\n\n const data = await response.json() as {\n model_instance_id: string;\n output: Array<{ type: string; content: string }>;\n };\n const messageOutput = data.output.find((o) => o.type === 'message');\n const text = messageOutput?.content ?? 
'';\n return { text, model: data.model_instance_id };\n }\n\n /**\n * Generate embeddings using LM Studio's OpenAI-compatible endpoint.\n * (The native API doesn't have an embedding endpoint — OpenAI-compat is fine here.)\n */\n async embed(text: string): Promise<EmbeddingResponse> {\n const response = await fetch(`${this.baseUrl}${ENDPOINT_EMBEDDINGS}`, {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify({\n model: this.model,\n input: text,\n }),\n signal: AbortSignal.timeout(EMBEDDING_REQUEST_TIMEOUT_MS),\n });\n\n if (!response.ok) {\n throw new Error(`LM Studio embed failed: ${response.status}`);\n }\n\n const data = await response.json() as {\n data: Array<{ embedding: number[] }>;\n model: string;\n };\n const embedding = data.data[0].embedding;\n return { embedding, model: data.model, dimensions: embedding.length };\n }\n\n /**\n * Load the model with specific settings for digest operations.\n * Creates a dedicated instance and captures the instance_id so subsequent\n * chat requests target it directly (avoiding auto-load side effects).\n * Does not unload other instances — hooks and other providers may be\n * using the same model with different settings.\n */\n async ensureLoaded(contextLength?: number, gpuKvCache?: boolean): Promise<void> {\n\n const ctx = contextLength ?? this.contextWindow;\n const body: Record<string, unknown> = {\n model: this.model,\n flash_attention: true,\n offload_kv_cache_to_gpu: gpuKvCache ?? 
false,\n };\n if (ctx) {\n body.context_length = ctx;\n }\n\n const response = await fetch(`${this.baseUrl}${ENDPOINT_MODELS_LOAD}`, {\n method: 'POST',\n headers: { 'Content-Type': 'application/json' },\n body: JSON.stringify(body),\n signal: AbortSignal.timeout(LLM_REQUEST_TIMEOUT_MS),\n });\n\n if (!response.ok) {\n const errorBody = await response.text().catch(() => '');\n throw new Error(`LM Studio model load failed: ${response.status} ${errorBody.slice(0, 200)}`);\n }\n\n // Capture the instance ID so chat requests target this specific loaded instance\n const loadResult = await response.json() as { instance_id?: string };\n if (loadResult.instance_id) {\n this.loadedInstanceId = loadResult.instance_id;\n }\n }\n\n async isAvailable(): Promise<boolean> {\n try {\n const response = await fetch(`${this.baseUrl}${ENDPOINT_MODELS_LIST}`, {\n signal: AbortSignal.timeout(DAEMON_CLIENT_TIMEOUT_MS),\n });\n return response.ok;\n } catch {\n return false;\n }\n }\n\n /** List available models on this LM Studio instance. */\n async listModels(timeoutMs?: number): Promise<string[]> {\n try {\n const response = await fetch(`${this.baseUrl}${ENDPOINT_MODELS_LIST}`, {\n signal: AbortSignal.timeout(timeoutMs ?? 
DAEMON_CLIENT_TIMEOUT_MS),\n });\n const data = await response.json() as { data: Array<{ id: string }> };\n return data.data.map((m) => m.id);\n } catch {\n return [];\n }\n }\n}\n"],"mappings":";;;;;;;;;AAcA,IAAM,oBAAoB;AAC1B,IAAM,iBAAiB;AACvB,IAAM,gBAAgB;AAEf,IAAM,gBAAN,MAAM,eAAwD;AAAA,EACnE,OAAgB,mBAAmB;AAAA,EAC1B,OAAO;AAAA,EACR;AAAA,EACA;AAAA,EACA;AAAA,EACA;AAAA,EAER,YAAY,QAAuB;AACjC,SAAK,UAAU,QAAQ,YAAY,eAAc;AACjD,SAAK,QAAQ,QAAQ,SAAS,QAAQ,iBAAiB;AACvD,SAAK,gBAAgB,QAAQ,kBAAkB;AAC/C,SAAK,mBAAmB,QAAQ,cAAc;AAAA,EAChD;AAAA,EAEA,MAAM,UAAU,QAAgB,MAAgD;AAC9E,UAAM,YAAY,MAAM,aAAa,KAAK;AAC1C,UAAM,gBAAgB,MAAM,iBAAiB,KAAK;AAClD,UAAM,eAAe,eAAe,MAAM;AAC1C,UAAM,SAAS,KAAK,IAAI,eAAe,WAAW,aAAa;AAE/D,UAAM,OAAgC;AAAA,MACpC,OAAO,KAAK;AAAA,MACZ;AAAA,MACA,QAAQ;AAAA,MACR,SAAS;AAAA,QACP,SAAS;AAAA,QACT,aAAa;AAAA,MACf;AAAA,IACF;AAGA,QAAI,MAAM,cAAc;AACtB,WAAK,SAAS,KAAK;AAAA,IACrB;AAGA,QAAI,MAAM,WAAW;AACnB,WAAK,QAAQ,KAAK,cAAc,QAAQ,QAAQ,KAAK;AAAA,IACvD;AAGA,QAAI,MAAM,WAAW;AACnB,WAAK,aAAa,KAAK;AAAA,IACzB;AAEA,UAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,iBAAiB,IAAI;AAAA,MAClE,QAAQ;AAAA,MACR,SAAS,EAAE,gBAAgB,mBAAmB;AAAA,MAC9C,MAAM,KAAK,UAAU,IAAI;AAAA,MACzB,QAAQ,YAAY,QAAQ,MAAM,aAAa,sBAAsB;AAAA,IACvE,CAAC;AAED,QAAI,CAAC,SAAS,IAAI;AAChB,YAAM,YAAY,MAAM,SAAS,KAAK,EAAE,MAAM,MAAM,EAAE;AACtD,YAAM,IAAI,MAAM,4BAA4B,SAAS,MAAM,IAAI,UAAU,MAAM,GAAG,GAAG,CAAC,EAAE;AAAA,IAC1F;AAEA,UAAM,OAAO,MAAM,SAAS,KAAK;AACjC,WAAO,EAAE,MAAM,KAAK,UAAU,OAAO,KAAK,MAAM;AAAA,EAClD;AAAA,EAEA,MAAM,MAAM,MAA0C;AACpD,UAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,cAAc,IAAI;AAAA,MAC/D,QAAQ;AAAA,MACR,SAAS,EAAE,gBAAgB,mBAAmB;AAAA,MAC9C,MAAM,KAAK,UAAU;AAAA,QACnB,OAAO,KAAK;AAAA,QACZ,OAAO;AAAA,MACT,CAAC;AAAA,MACD,QAAQ,YAAY,QAAQ,4BAA4B;AAAA,IAC1D,CAAC;AAED,QAAI,CAAC,SAAS,IAAI;AAChB,YAAM,IAAI,MAAM,wBAAwB,SAAS,MAAM,IAAI,SAAS,UAAU,EAAE;AAAA,IAClF;AAEA,UAAM,OAAO,MAAM,SAAS,KAAK;AACjC,UAAM,YAAY,KAAK,WAAW,CAAC;AACnC,WAAO,EAAE,WAAW,OAAO,KAAK,OAAO,YAAY,UAAU,OAAO;AAAA,EACtE;AAAA,EAEA,MAAM,cAAgC;AACpC,QAAI;AACF,YAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,aAAa,IAAI;AAA
A,QAC9D,QAAQ,YAAY,QAAQ,wBAAwB;AAAA,MACtD,CAAC;AACD,aAAO,SAAS;AAAA,IAClB,QAAQ;AACN,aAAO;AAAA,IACT;AAAA,EACF;AAAA;AAAA,EAGA,MAAM,WAAW,WAAuC;AACtD,QAAI;AACF,YAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,aAAa,IAAI;AAAA,QAC9D,QAAQ,YAAY,QAAQ,aAAa,wBAAwB;AAAA,MACnE,CAAC;AACD,YAAM,OAAO,MAAM,SAAS,KAAK;AACjC,aAAO,KAAK,OAAO,IAAI,CAAC,MAAM,EAAE,IAAI;AAAA,IACtC,QAAQ;AACN,aAAO,CAAC;AAAA,IACV;AAAA,EACF;AACF;;;AC7GA,IAAM,gBAAgB;AACtB,IAAM,uBAAuB;AAE7B,IAAM,uBAAuB;AAC7B,IAAM,sBAAsB;AAErB,IAAM,kBAAN,MAAM,iBAA0D;AAAA,EACrE,OAAgB,mBAAmB;AAAA,EAC1B,OAAO;AAAA,EACR;AAAA,EACA;AAAA,EACA,mBAAkC;AAAA,EAClC;AAAA,EACA;AAAA,EAER,YAAY,QAAyB;AACnC,SAAK,UAAU,QAAQ,YAAY,iBAAgB;AACnD,SAAK,QAAQ,QAAQ,SAAS,QAAQ,iBAAiB;AACvD,SAAK,gBAAgB,QAAQ;AAC7B,SAAK,mBAAmB,QAAQ,cAAc;AAAA,EAChD;AAAA;AAAA;AAAA;AAAA;AAAA,EAMA,MAAM,UAAU,QAAgB,MAAgD;AAC9E,UAAM,YAAY,MAAM,aAAa,KAAK;AAE1C,UAAM,OAAgC;AAAA,MACpC,OAAO,KAAK,oBAAoB,KAAK;AAAA,MACrC,OAAO;AAAA,MACP,mBAAmB;AAAA,MACnB,OAAO;AAAA,IACT;AAIA,QAAI,CAAC,KAAK,kBAAkB;AAC1B,YAAM,gBAAgB,MAAM,iBAAiB,KAAK;AAClD,UAAI,eAAe;AACjB,aAAK,iBAAiB;AAAA,MACxB;AAAA,IACF;AAGA,QAAI,MAAM,cAAc;AACtB,WAAK,gBAAgB,KAAK;AAAA,IAC5B;AAGA,QAAI,MAAM,WAAW;AACnB,WAAK,YAAY,KAAK;AAAA,IACxB;AAEA,UAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,aAAa,IAAI;AAAA,MAC9D,QAAQ;AAAA,MACR,SAAS,EAAE,gBAAgB,mBAAmB;AAAA,MAC9C,MAAM,KAAK,UAAU,IAAI;AAAA,MACzB,QAAQ,YAAY,QAAQ,MAAM,aAAa,sBAAsB;AAAA,IACvE,CAAC;AAED,QAAI,CAAC,SAAS,IAAI;AAChB,YAAM,YAAY,MAAM,SAAS,KAAK,EAAE,MAAM,MAAM,EAAE;AACtD,YAAM,IAAI,MAAM,+BAA+B,SAAS,MAAM,IAAI,UAAU,MAAM,GAAG,GAAG,CAAC,EAAE;AAAA,IAC7F;AAEA,UAAM,OAAO,MAAM,SAAS,KAAK;AAIjC,UAAM,gBAAgB,KAAK,OAAO,KAAK,CAAC,MAAM,EAAE,SAAS,SAAS;AAClE,UAAM,OAAO,eAAe,WAAW;AACvC,WAAO,EAAE,MAAM,OAAO,KAAK,kBAAkB;AAAA,EAC/C;AAAA;AAAA;AAAA;AAAA;AAAA,EAMA,MAAM,MAAM,MAA0C;AACpD,UAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,mBAAmB,IAAI;AAAA,MACpE,QAAQ;AAAA,MACR,SAAS,EAAE,gBAAgB,mBAAmB;AAAA,MAC9C,MAAM,KAAK,UAAU;AAAA,QACnB,OAAO,KAAK;AAAA,QACZ,OAAO;AAAA,MACT,CAAC;AAAA,MACD,QAAQ,YAAY,QAAQ,4BAA4B;AAAA,IAC1D,CAAC;AAED,QAAI,CAAC,SAAS,IAAI;AAChB,YAAM,IAAI,MA
AM,2BAA2B,SAAS,MAAM,EAAE;AAAA,IAC9D;AAEA,UAAM,OAAO,MAAM,SAAS,KAAK;AAIjC,UAAM,YAAY,KAAK,KAAK,CAAC,EAAE;AAC/B,WAAO,EAAE,WAAW,OAAO,KAAK,OAAO,YAAY,UAAU,OAAO;AAAA,EACtE;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA;AAAA,EASA,MAAM,aAAa,eAAwB,YAAqC;AAE9E,UAAM,MAAM,iBAAiB,KAAK;AAClC,UAAM,OAAgC;AAAA,MACpC,OAAO,KAAK;AAAA,MACZ,iBAAiB;AAAA,MACjB,yBAAyB,cAAc;AAAA,IACzC;AACA,QAAI,KAAK;AACP,WAAK,iBAAiB;AAAA,IACxB;AAEA,UAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,oBAAoB,IAAI;AAAA,MACrE,QAAQ;AAAA,MACR,SAAS,EAAE,gBAAgB,mBAAmB;AAAA,MAC9C,MAAM,KAAK,UAAU,IAAI;AAAA,MACzB,QAAQ,YAAY,QAAQ,sBAAsB;AAAA,IACpD,CAAC;AAED,QAAI,CAAC,SAAS,IAAI;AAChB,YAAM,YAAY,MAAM,SAAS,KAAK,EAAE,MAAM,MAAM,EAAE;AACtD,YAAM,IAAI,MAAM,gCAAgC,SAAS,MAAM,IAAI,UAAU,MAAM,GAAG,GAAG,CAAC,EAAE;AAAA,IAC9F;AAGA,UAAM,aAAa,MAAM,SAAS,KAAK;AACvC,QAAI,WAAW,aAAa;AAC1B,WAAK,mBAAmB,WAAW;AAAA,IACrC;AAAA,EACF;AAAA,EAEA,MAAM,cAAgC;AACpC,QAAI;AACF,YAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,oBAAoB,IAAI;AAAA,QACrE,QAAQ,YAAY,QAAQ,wBAAwB;AAAA,MACtD,CAAC;AACD,aAAO,SAAS;AAAA,IAClB,QAAQ;AACN,aAAO;AAAA,IACT;AAAA,EACF;AAAA;AAAA,EAGA,MAAM,WAAW,WAAuC;AACtD,QAAI;AACF,YAAM,WAAW,MAAM,MAAM,GAAG,KAAK,OAAO,GAAG,oBAAoB,IAAI;AAAA,QACrE,QAAQ,YAAY,QAAQ,aAAa,wBAAwB;AAAA,MACnE,CAAC;AACD,YAAM,OAAO,MAAM,SAAS,KAAK;AACjC,aAAO,KAAK,KAAK,IAAI,CAAC,MAAM,EAAE,EAAE;AAAA,IAClC,QAAQ;AACN,aAAO,CAAC;AAAA,IACV;AAAA,EACF;AACF;","names":[]}