arkaos 2.21.0 → 2.22.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/VERSION +1 -1
- package/arka/SKILL.md +2 -1
- package/arka/skills/costs/SKILL.md +62 -0
- package/core/cognition/__pycache__/auto_documentor.cpython-313.pyc +0 -0
- package/core/cognition/auto_documentor.py +75 -52
- package/core/jobs/__pycache__/__init__.cpython-313.pyc +0 -0
- package/core/jobs/__pycache__/auto_doc_worker.cpython-313.pyc +0 -0
- package/core/obsidian/__pycache__/__init__.cpython-313.pyc +0 -0
- package/core/obsidian/__pycache__/cataloger.cpython-313.pyc +0 -0
- package/core/obsidian/__pycache__/relator.cpython-313.pyc +0 -0
- package/core/obsidian/__pycache__/taxonomy.cpython-313.pyc +0 -0
- package/core/runtime/__init__.py +22 -1
- package/core/runtime/__pycache__/__init__.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/base.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/claude_code.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/codex_cli.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/cursor.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/gemini_cli.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/llm_cost_telemetry.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/llm_cost_telemetry_cli.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/llm_provider.cpython-313.pyc +0 -0
- package/core/runtime/__pycache__/pricing.cpython-313.pyc +0 -0
- package/core/runtime/base.py +30 -1
- package/core/runtime/claude_code.py +68 -0
- package/core/runtime/codex_cli.py +33 -0
- package/core/runtime/cursor.py +19 -0
- package/core/runtime/gemini_cli.py +33 -0
- package/core/runtime/llm_cost_telemetry.py +306 -0
- package/core/runtime/llm_cost_telemetry_cli.py +138 -0
- package/core/runtime/llm_provider.py +382 -0
- package/core/runtime/pricing.py +85 -0
- package/core/synapse/__pycache__/__init__.cpython-313.pyc +0 -0
- package/core/synapse/__pycache__/engine.cpython-313.pyc +0 -0
- package/core/synapse/__pycache__/kb_cache.cpython-313.pyc +0 -0
- package/core/synapse/__pycache__/layers.cpython-313.pyc +0 -0
- package/core/workflow/__pycache__/flow_enforcer.cpython-313.pyc +0 -0
- package/core/workflow/__pycache__/kb_first_decider.cpython-313.pyc +0 -0
- package/core/workflow/__pycache__/marker_cache.cpython-313.pyc +0 -0
- package/core/workflow/__pycache__/research_gate.cpython-313.pyc +0 -0
- package/package.json +1 -1
- package/pyproject.toml +1 -1
package/VERSION
CHANGED
```diff
@@ -1 +1 @@
-2.21.0
+2.22.0
```
package/arka/SKILL.md
CHANGED
```diff
@@ -109,7 +109,8 @@ violation (squad-routing, arka-supremacy, spec-driven, mandatory-qa).
 
 | Command | Description |
 |---------|-------------|
-| `/arka status` | System status (version, departments, agents, active projects) |
+| `/arka status` | System status (version, departments, agents, active projects). Includes **LLM costs (24h)** section: top-line cost + cache hit rate + call count from `core.runtime.llm_cost_telemetry.summarise(period="today")`. |
+| `/arka costs [period]` | LLM cost visibility — aggregates telemetry by day/week/month/all, with top expensive sessions. See `arka/skills/costs/SKILL.md`. Shells out to `python -m core.runtime.llm_cost_telemetry_cli <period>`. |
 | `/arka standup` | Daily standup (projects, priorities, blockers, updates) |
 | `/arka monitor` | System health monitoring |
 | `/arka onboard <path>` | Onboard an existing project into ArkaOS |
```
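The status row above names a real entry point, `core.runtime.llm_cost_telemetry.summarise(period="today")`, but this diff never shows what `summarise` returns. As a rough orientation only, the wiring plausibly reduces to the sketch below; the keys accessed on the summary (`total_cost_usd`, `cache_hit_rate`, `calls`) are assumptions, not the package's actual schema.

```python
# Hypothetical sketch of the "LLM costs (24h)" line in /arka status.
# summarise() is real per the row above; the keys accessed here are assumed.
from core.runtime.llm_cost_telemetry import summarise

def status_costs_line() -> str:
    s = summarise(period="today")  # 24h window, UTC midnight to now
    return (
        f"LLM costs (24h): ${s['total_cost_usd']:.2f} | "
        f"cache hit {s['cache_hit_rate']:.0%} | {s['calls']} calls"
    )
```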
package/arka/skills/costs/SKILL.md
ADDED
```diff
@@ -0,0 +1,62 @@
+---
+name: arka-costs
+description: >
+  LLM cost visibility — aggregates `~/.arkaos/telemetry/llm-cost.jsonl` by
+  day/week/month/all, breaks down by provider/model/session, surfaces top
+  expensive sessions and cache hit rate. Visibility-only per ADR-011;
+  never imposes hard caps.
+allowed-tools: [Bash, Read]
+---
+
+# /arka costs — LLM cost visibility
+
+Aggregates runtime-agnostic LLM call telemetry written by
+`core/runtime/llm_cost_telemetry.record_cost`. Per ADR-011, token
+budgets are **informational, not restrictive** — this command only
+surfaces usage and emits soft advisories. It never blocks a call.
+
+## Usage
+
+| Command | What it shows |
+| --- | --- |
+| `/arka costs` | Today (UTC midnight → now) |
+| `/arka costs today` | Same as above |
+| `/arka costs week` | Rolling last 7 days |
+| `/arka costs month` | Rolling last 30 days |
+| `/arka costs all` | Entire history in the JSONL |
+| `/arka costs sessions` | Top 10 most expensive sessions (all time) |
+
+## Output
+
+- Total cost (USD, `n/a` when all entries are unpriced models)
+- Total tokens in / out, plus cached tokens
+- Cache hit rate (`cached / tokens_in`)
+- Breakdown by provider
+- Breakdown by model (`<unknown>` bucket for calls with no model)
+- Top 10 sessions sorted by cost
+- Advisories — a soft line per session that crossed the
+  `advisory_threshold_usd` (default $5 per session)
+
+## Implementation
+
+This skill shells out to the Python CLI:
+
+```bash
+python -m core.runtime.llm_cost_telemetry_cli <period>
+```
+
+Source:
+- `core/runtime/llm_cost_telemetry.py` — `summarise`, `list_expensive_sessions`
+- `core/runtime/llm_cost_telemetry_cli.py` — markdown renderer
+
+## Data source
+
+`~/.arkaos/telemetry/llm-cost.jsonl` (override with `ARKA_LLM_COST_PATH`).
+One JSONL line per LLM call, written by every provider adapter.
+Malformed lines are skipped and counted, never raised.
+
+## Non-negotiables
+
+1. Read-only. This skill never edits state.
+2. No hard budget caps — advisories are strings, not errors.
+3. No external dependencies; stdlib only.
```
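The data-source contract above (one JSONL record per call, default path overridable via `ARKA_LLM_COST_PATH`, malformed lines skipped and counted) can be pictured with a small stdlib-only reader. A minimal sketch under stated assumptions: the per-record field names (`cost_usd`, `tokens_in`, `cached_tokens`) are illustrative, since the record schema lives in `llm_cost_telemetry.py`, whose diff is not shown in this section.

```python
import json
import os
from pathlib import Path

# Default path and env override come from the skill doc above;
# the record field names below are assumed for illustration.
default = Path.home() / ".arkaos" / "telemetry" / "llm-cost.jsonl"
path = Path(os.environ.get("ARKA_LLM_COST_PATH", default))

total_cost, tokens_in, cached, malformed = 0.0, 0, 0, 0
for line in path.read_text().splitlines():
    try:
        rec = json.loads(line)
    except json.JSONDecodeError:
        malformed += 1  # skipped and counted, never raised
        continue
    total_cost += rec.get("cost_usd") or 0.0
    tokens_in += rec.get("tokens_in") or 0
    cached += rec.get("cached_tokens") or 0

hit = cached / tokens_in if tokens_in else 0.0  # cache hit rate per the doc
print(f"${total_cost:.2f} | cache hit rate {hit:.1%} | {malformed} malformed")
```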
Binary file: package/core/cognition/__pycache__/auto_documentor.cpython-313.pyc
package/core/cognition/auto_documentor.py
CHANGED
```diff
@@ -5,14 +5,15 @@ synthesises learnings about external sources consulted, decisions made,
 and deliverables produced, then invokes the Obsidian cataloger + relator
 (Task #4 modules) to file structured, wikilinked notes into the vault.
 
-
-
-
-
-
+The synthesis step is runtime- and model-agnostic: it delegates to the
+active `LLMProvider` (see `core.runtime.llm_provider`). This module
+NEVER picks a model — the provider / runtime / env does. When no
+provider is available or the call fails, it falls through to a
+deterministic template synthesiser that preserves every extracted fact.
 
 ADR/Plan references:
 - ~/.arkaos/plans/2026-04-20-intelligence-v2.md (Task #7 — Épico B)
+- ~/.arkaos/plans/2026-04-20-llm-agnostic.md (Task #12/#13 — LLMProvider)
 - core/obsidian/cataloger.py, core/obsidian/relator.py (Task #4)
 """
 
@@ -21,9 +22,8 @@ from __future__ import annotations
 import json
 import re
 from dataclasses import dataclass, field
-from datetime import date
 from pathlib import Path
-from typing import Iterable
+from typing import Iterable
 
 from core.obsidian import cataloger as _cataloger
 from core.obsidian import relator as _relator
@@ -32,14 +32,6 @@ from core.obsidian.writer import ObsidianWriter
 
 SAFE_SESSION_ID_RE = re.compile(r"^[A-Za-z0-9._-]{1,128}$")
 
-_ARCHITECTURAL_KEYWORDS = (
-    "architecture", "adr", "decision", "trade-off", "tradeoff",
-    "design pattern", "refactor", "migration", "schema", "bounded context",
-)
-_ANALYSIS_KEYWORDS = (
-    "analysis", "investigation", "compare", "benchmark", "evaluate",
-    "profile", "review", "audit",
-)
 _URL_RE = re.compile(r"https?://[^\s\)\]\"']+")
 _FILE_PATH_RE = re.compile(r"(?:^|[\s`'])(/[A-Za-z0-9_./\-]+\.[A-Za-z0-9]+)")
 _ROUTING_MARKER_RE = re.compile(
@@ -58,8 +50,17 @@ _EXTERNAL_RESEARCH_TOOLS = frozenset({
     "mcp__firecrawl__firecrawl_extract",
 })
 
-
-
+_AUTO_DOC_SUFFIX = "Auto-documented by ArkaOS"
+_LLM_MAX_TOKENS = 1500
+
+_SYSTEM_PROMPT = (
+    "You are ArkaOS's auto-documentor. Produce a concise knowledge note "
+    "(150-300 words) summarising the session. Structure: short intro, "
+    "then markdown sections for Key Facts, Decisions, and Sources. "
+    "Preserve every URL and file path verbatim. Use Obsidian wikilinks "
+    "([[Topic]]) for reusable concepts. No preamble, no sign-off, no "
+    "meta commentary about the model or prompt. Output only markdown."
+)
 
 
 @dataclass
@@ -264,49 +265,73 @@ def _dedupe_keep_order(items: Iterable[str]) -> list[str]:
     return out
 
 
-# ─── Model routing ─────────────────────────────────────────────────────
-
-
-def choose_model(learning: Learning) -> str:
-    """Return 'haiku' | 'sonnet' | 'opus' based on content complexity."""
-    low = learning.content.lower()
-    length = len(learning.content)
-    if any(kw in low for kw in _ARCHITECTURAL_KEYWORDS):
-        return "opus"
-    if length >= _OPUS_MIN_CHARS and len(learning.decisions) >= 3:
-        return "opus"
-    if length <= _HAIKU_MAX_CHARS and len(learning.decisions) <= 1:
-        return "haiku"
-    if any(kw in low for kw in _ANALYSIS_KEYWORDS):
-        return "sonnet"
-    if length >= _HAIKU_MAX_CHARS:
-        return "sonnet"
-    return "haiku"
-
-
 # ─── Synthesis ─────────────────────────────────────────────────────────
 
 
-def synthesize(learning: Learning, model_hint: str) -> str:
+def synthesize(learning: Learning) -> str:
     """Produce a markdown body for the learning.
 
-
-
-
+    Delegates to the active `LLMProvider` via `_call_llm`. If the
+    provider is unavailable, returns empty text, or raises, falls
+    through to a deterministic template that preserves every extracted
+    fact. No model name ever crosses this boundary.
     """
-    llm_out = _call_llm(learning, model_hint)
+    llm_out = _call_llm(learning)
     if llm_out:
         return llm_out
-    return _template_synthesize(learning, model_hint)
+    return _template_synthesize(learning)
 
 
-def _call_llm(learning: Learning, model_hint: str) -> str:
-
+def _call_llm(learning: Learning) -> str:
+    from core.runtime import get_llm_provider
+    from core.runtime.llm_provider import LLMUnavailable
 
+    try:
+        provider = get_llm_provider()
+        if not provider.is_available():
+            return ""
+        prompt = _build_synthesis_prompt(learning)
+        response = provider.complete(
+            prompt, max_tokens=_LLM_MAX_TOKENS, system=_SYSTEM_PROMPT
+        )
+        return response.text.strip()
+    except LLMUnavailable:
+        return ""
+    except Exception:  # noqa: BLE001 — LLM path must never crash the doc job
+        return ""
 
-def _template_synthesize(learning: Learning, model_hint: str) -> str:
+
+def _build_synthesis_prompt(learning: Learning) -> str:
+    lines = [f"Topic: {learning.topic}", ""]
+    if learning.content.strip():
+        lines.append("Session blob:")
+        lines.append(learning.content.strip())
+        lines.append("")
+    if learning.sources:
+        lines.append("Sources consulted:")
+        for src in learning.sources[:20]:
+            lines.append(f"- {src}")
+        lines.append("")
+    if learning.decisions:
+        lines.append("Decisions recorded:")
+        for dec in learning.decisions[:10]:
+            lines.append(f"- {dec}")
+        lines.append("")
+    if learning.metadata:
+        meta_pairs = sorted(learning.metadata.items())
+        lines.append("Metadata:")
+        for key, value in meta_pairs:
+            lines.append(f"- {key}: {value}")
+        lines.append("")
+    lines.append(
+        "Write the note now. Obey the system prompt. Output only markdown."
+    )
+    return "\n".join(lines)
+
+
+def _template_synthesize(learning: Learning) -> str:
     parts = [f"# {learning.topic}", ""]
-    parts.append(f">
+    parts.append(f"> {_AUTO_DOC_SUFFIX}.")
     parts.append("")
     if learning.content.strip():
         parts.append(learning.content.strip())
@@ -318,7 +343,7 @@ def _template_synthesize(learning: Learning, model_hint: str) -> str:
         parts.append(f"- {src}")
     parts.append("")
     if learning.decisions:
-        parts.append("##
+        parts.append("## Decisions")
         parts.append("")
         for dec in learning.decisions[:10]:
             parts.append(f"- {dec}")
@@ -358,14 +383,12 @@ def _document_one(
     writer: ObsidianWriter,
     vault_path: Path,
     session_id: str,
-) ->
-
-    body = synthesize(learning, model_hint)
+) -> Path | None:
+    body = synthesize(learning)
     meta = dict(learning.metadata)
     meta.setdefault("title", learning.topic[:80])
     meta.setdefault("session", session_id)
    meta.setdefault("auto_documented", True)
-    meta.setdefault("model_hint", model_hint)
     try:
         plan = _cataloger.plan(body, meta)
     except ValueError:
```
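The resulting control flow in `auto_documentor.py` is a two-step fallback: `synthesize` asks `_call_llm`, and any empty result (no provider, `LLMUnavailable`, or any other exception) drops to the deterministic template. A usage sketch under one stated assumption: `Learning` is a dataclass whose keyword fields match the attribute names used above (`topic`, `content`, `sources`, `decisions`, `metadata`); the real constructor may differ.

```python
# Hypothetical usage; the Learning constructor shape is assumed.
from core.cognition.auto_documentor import Learning, synthesize

learning = Learning(
    topic="Gateway rate limiting",
    content="Compared token bucket vs sliding window for the API gateway.",
    sources=["https://en.wikipedia.org/wiki/Token_bucket"],
    decisions=["Adopt token bucket; burst tolerance matters more."],
    metadata={"project": "gateway"},
)

# With no provider available (e.g. no CLI on PATH), _call_llm returns ""
# and this falls through to _template_synthesize, which keeps the URL
# and the recorded decision verbatim.
print(synthesize(learning))
```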
Binary file: package/core/jobs/__pycache__/__init__.cpython-313.pyc
Binary file: package/core/jobs/__pycache__/auto_doc_worker.cpython-313.pyc
Binary file: package/core/obsidian/__pycache__/__init__.cpython-313.pyc
Binary file: package/core/obsidian/__pycache__/cataloger.cpython-313.pyc
Binary file: package/core/obsidian/__pycache__/relator.cpython-313.pyc
Binary file: package/core/obsidian/__pycache__/taxonomy.cpython-313.pyc
package/core/runtime/__init__.py
CHANGED
```diff
@@ -2,5 +2,26 @@
 
 from core.runtime.base import RuntimeAdapter, RuntimeConfig
 from core.runtime.registry import get_adapter, detect_runtime
+from core.runtime.llm_provider import (
+    AnthropicDirectProvider,
+    LLMProvider,
+    LLMResponse,
+    LLMUnavailable,
+    StubProvider,
+    SubagentProvider,
+    get_llm_provider,
+)
 
-__all__ = [
+__all__ = [
+    "AnthropicDirectProvider",
+    "LLMProvider",
+    "LLMResponse",
+    "LLMUnavailable",
+    "RuntimeAdapter",
+    "RuntimeConfig",
+    "StubProvider",
+    "SubagentProvider",
+    "detect_runtime",
+    "get_adapter",
+    "get_llm_provider",
+]
```
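The widened export surface lets any caller stay runtime-agnostic with a single import. The guarded-call pattern below mirrors the one `auto_documentor._call_llm` uses in this same release; only the prompt and token budget are invented for the example.

```python
from core.runtime import LLMUnavailable, get_llm_provider

def safe_complete(prompt: str) -> str:
    """Return completion text, or "" when no provider can serve the call."""
    try:
        provider = get_llm_provider()
        if not provider.is_available():
            return ""
        response = provider.complete(prompt, max_tokens=500)
        return response.text.strip()
    except LLMUnavailable:
        return ""

print(safe_complete("Summarise ADR-011 in one sentence.") or "<no provider>")
```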
Binary file: package/core/runtime/__pycache__/__init__.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/base.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/claude_code.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/codex_cli.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/cursor.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/gemini_cli.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/llm_cost_telemetry.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/llm_cost_telemetry_cli.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/llm_provider.cpython-313.pyc
Binary file: package/core/runtime/__pycache__/pricing.cpython-313.pyc
package/core/runtime/base.py
CHANGED
```diff
@@ -11,7 +11,10 @@ knowing which runtime is active. Each adapter handles the translation.
 from abc import ABC, abstractmethod
 from dataclasses import dataclass, field
 from pathlib import Path
-from typing import Any
+from typing import TYPE_CHECKING, Any
+
+if TYPE_CHECKING:
+    from core.runtime.llm_provider import LLMResponse
 
 
 @dataclass
@@ -141,3 +144,29 @@ class RuntimeAdapter(ABC):
             "mcp": config.supports_mcp,
         }
         return feature_map.get(feature, False)
+
+    def headless_supported(self) -> bool:
+        """Whether this adapter can perform a headless CLI completion.
+
+        Default: False. Override in adapters that implement
+        `headless_complete`. `SubagentProvider` consults this before
+        attempting a call so the factory can fall back cleanly.
+        """
+        return False
+
+    def headless_complete(
+        self,
+        prompt: str,
+        *,
+        max_tokens: int = 2000,
+        system: str = "",
+    ) -> "LLMResponse":
+        """Run a one-shot headless completion via this runtime's CLI.
+
+        Default implementation raises NotImplementedError — adapters
+        that do not have a headless CLI mode (e.g. Cursor) inherit this
+        behaviour. Never hardcodes a model.
+        """
+        raise NotImplementedError(
+            f"{self.get_config().name} does not support headless LLM completion"
+        )
```
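Concretely, an adapter opts in by overriding both hooks; everything else inherits the refusing default. A minimal sketch: `EchoAdapter` is invented for illustration, and the adapter's other abstract methods (agent execution, search, config) are elided, so it is not instantiable as written. The `LLMResponse` keyword fields follow the construction seen in `claude_code.py` below.

```python
from core.runtime.base import RuntimeAdapter
from core.runtime.llm_provider import LLMResponse

class EchoAdapter(RuntimeAdapter):
    """Hypothetical adapter; remaining abstract methods elided for brevity."""

    def headless_supported(self) -> bool:
        return True  # opt in so SubagentProvider will route calls here

    def headless_complete(
        self, prompt: str, *, max_tokens: int = 2000, system: str = ""
    ) -> LLMResponse:
        # A real adapter shells out to its CLI here; this toy just echoes.
        return LLMResponse(
            text=prompt[:max_tokens], tokens_in=0, tokens_out=0,
            cached_tokens=0, model="",
        )
```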
package/core/runtime/claude_code.py
CHANGED
```diff
@@ -4,11 +4,18 @@ Claude Code is the primary and most capable runtime for ArkaOS.
 It supports hooks, subagents (Agent tool), MCP servers, and worktrees.
 """
 
+import json
+import shutil
+import subprocess
 from pathlib import Path
 from os.path import expanduser
+from typing import TYPE_CHECKING
 
 from core.runtime.base import RuntimeAdapter, RuntimeConfig, AgentContext, AgentResult
 
+if TYPE_CHECKING:
+    from core.runtime.llm_provider import LLMResponse
+
 
 class ClaudeCodeAdapter(RuntimeAdapter):
     """Adapter for Anthropic's Claude Code CLI."""
@@ -102,3 +109,64 @@ class ClaudeCodeAdapter(RuntimeAdapter):
     def supports_feature(self, feature: str) -> bool:
         """Claude Code supports all features."""
         return True
+
+    def headless_supported(self) -> bool:
+        return shutil.which("claude") is not None
+
+    def headless_complete(
+        self,
+        prompt: str,
+        *,
+        max_tokens: int = 2000,
+        system: str = "",
+    ) -> "LLMResponse":
+        from core.runtime.llm_provider import LLMResponse, LLMUnavailable
+
+        binary = shutil.which("claude")
+        if binary is None:
+            raise NotImplementedError(
+                "claude CLI not found on PATH — install Claude Code to enable "
+                "headless completion via this adapter."
+            )
+        cmd = [binary, "-p", prompt, "--output-format", "json"]
+        if system:
+            cmd.extend(["--append-system-prompt", system])
+        try:
+            proc = subprocess.run(
+                cmd,
+                capture_output=True,
+                text=True,
+                timeout=60,
+                check=False,
+            )
+        except subprocess.TimeoutExpired as exc:
+            raise LLMUnavailable("claude CLI timed out after 60s") from exc
+        except OSError as exc:
+            raise LLMUnavailable(f"claude CLI subprocess failed: {exc}") from exc
+
+        if proc.returncode != 0:
+            raise LLMUnavailable(
+                f"claude CLI exited {proc.returncode}: {proc.stderr.strip()[:200]}"
+            )
+        return _parse_claude_json(proc.stdout)
+
+
+def _parse_claude_json(stdout: str) -> "LLMResponse":
+    from core.runtime.llm_provider import LLMResponse
+
+    payload = json.loads(stdout) if stdout.strip() else {}
+    text = str(payload.get("result") or payload.get("response") or "")
+    usage = payload.get("usage") or {}
+    tokens_in = int(usage.get("input_tokens") or 0)
+    tokens_out = int(usage.get("output_tokens") or 0)
+    cache_read = int(usage.get("cache_read_input_tokens") or 0)
+    cache_write = int(usage.get("cache_creation_input_tokens") or 0)
+    total_input = tokens_in + cache_read + cache_write
+    model = str(payload.get("model") or "")
+    return LLMResponse(
+        text=text,
+        tokens_in=total_input,
+        tokens_out=tokens_out,
+        cached_tokens=cache_read,
+        model=model,
+    )
```
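For reference, `_parse_claude_json` reads exactly the fields shown in the diff (`result` or `response`, `usage.*`, `model`). The payload below is illustrative only; real `claude -p ... --output-format json` output carries more fields and varies by CLI version.

```python
import json

# Illustrative stdout; field values are invented for the example.
stdout = json.dumps({
    "result": "Token bucket wins for bursty traffic.",
    "model": "example-model-id",
    "usage": {
        "input_tokens": 12,
        "output_tokens": 9,
        "cache_read_input_tokens": 2048,
        "cache_creation_input_tokens": 0,
    },
})

# Per the parser above: tokens_in folds cache reads and writes into the
# total (12 + 2048 + 0 = 2060), while cached_tokens reports only the
# 2048 cache-read tokens used for the cache-hit-rate maths.
```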
package/core/runtime/codex_cli.py
CHANGED
```diff
@@ -4,11 +4,16 @@ OpenAI's Codex CLI. Supports sandboxed execution and file operations.
 More limited than Claude Code: no native hooks, no MCP servers.
 """
 
+import shutil
 from pathlib import Path
 from os.path import expanduser
+from typing import TYPE_CHECKING
 
 from core.runtime.base import RuntimeAdapter, RuntimeConfig, AgentContext, AgentResult
 
+if TYPE_CHECKING:
+    from core.runtime.llm_provider import LLMResponse
+
 
 class CodexCliAdapter(RuntimeAdapter):
     """Adapter for OpenAI's Codex CLI."""
@@ -69,3 +74,31 @@ class CodexCliAdapter(RuntimeAdapter):
 
     def search_content(self, pattern: str, path: str = ".") -> list[str]:
         raise NotImplementedError("Use Codex CLI's native content search")
+
+    def headless_supported(self) -> bool:
+        # Codex CLI headless invocation syntax is not stable as of
+        # 2026-04-20. Until verified, we surface unsupported and let
+        # SubagentProvider fall back to AnthropicDirect or stub.
+        return False
+
+    def headless_complete(
+        self,
+        prompt: str,
+        *,
+        max_tokens: int = 2000,
+        system: str = "",
+    ) -> "LLMResponse":
+        binary = shutil.which("codex")
+        if binary is None:
+            raise NotImplementedError(
+                "codex CLI not found on PATH — install Codex CLI to "
+                "enable headless completion."
+            )
+        # TODO(llm-agnostic): Verify Codex CLI headless invocation
+        # syntax (`codex exec "<prompt>"` was the working hypothesis
+        # but has not been confirmed for the current release). Until
+        # then, refuse rather than guess. Tracked in Task #12 report.
+        raise NotImplementedError(
+            "Codex CLI headless completion not yet wired — verify CLI "
+            "syntax before enabling. See core/runtime/codex_cli.py TODO."
+        )
```
package/core/runtime/cursor.py
CHANGED
```diff
@@ -6,9 +6,13 @@ Supports agent mode with tool execution.
 
 from pathlib import Path
 from os.path import expanduser
+from typing import TYPE_CHECKING
 
 from core.runtime.base import RuntimeAdapter, RuntimeConfig, AgentContext, AgentResult
 
+if TYPE_CHECKING:
+    from core.runtime.llm_provider import LLMResponse
+
 
 class CursorAdapter(RuntimeAdapter):
     """Adapter for Cursor AI IDE."""
@@ -69,3 +73,18 @@ class CursorAdapter(RuntimeAdapter):
 
     def search_content(self, pattern: str, path: str = ".") -> list[str]:
         raise NotImplementedError("Use Cursor's native content search")
+
+    def headless_supported(self) -> bool:
+        return False
+
+    def headless_complete(
+        self,
+        prompt: str,
+        *,
+        max_tokens: int = 2000,
+        system: str = "",
+    ) -> "LLMResponse":
+        raise NotImplementedError(
+            "Cursor has no headless CLI mode as of 2026-04. "
+            "Fall back to AnthropicDirectProvider or StubProvider."
+        )
```
package/core/runtime/gemini_cli.py
CHANGED
```diff
@@ -3,11 +3,16 @@
 Google's Gemini CLI. Uses GEMINI.md for instructions and activate_skill for skills.
 """
 
+import shutil
 from pathlib import Path
 from os.path import expanduser
+from typing import TYPE_CHECKING
 
 from core.runtime.base import RuntimeAdapter, RuntimeConfig, AgentContext, AgentResult
 
+if TYPE_CHECKING:
+    from core.runtime.llm_provider import LLMResponse
+
 
 class GeminiCliAdapter(RuntimeAdapter):
     """Adapter for Google's Gemini CLI."""
@@ -66,3 +71,31 @@ class GeminiCliAdapter(RuntimeAdapter):
 
     def search_content(self, pattern: str, path: str = ".") -> list[str]:
         raise NotImplementedError("Use Gemini CLI's native content search")
+
+    def headless_supported(self) -> bool:
+        # Gemini CLI headless invocation syntax is not verified for the
+        # current release. Returning False lets SubagentProvider fall
+        # back gracefully rather than shell out blindly.
+        return False
+
+    def headless_complete(
+        self,
+        prompt: str,
+        *,
+        max_tokens: int = 2000,
+        system: str = "",
+    ) -> "LLMResponse":
+        binary = shutil.which("gemini")
+        if binary is None:
+            raise NotImplementedError(
+                "gemini CLI not found on PATH — install Gemini CLI to "
+                "enable headless completion."
+            )
+        # TODO(llm-agnostic): Verify Gemini CLI's headless invocation
+        # (`gemini -p "<prompt>"` was the working hypothesis). Until
+        # confirmed for the shipped CLI version, refuse rather than
+        # guess. Tracked in Task #12 report.
+        raise NotImplementedError(
+            "Gemini CLI headless completion not yet wired — verify CLI "
+            "syntax before enabling. See core/runtime/gemini_cli.py TODO."
+        )
```