npm - arkaos - Versions diffs - 4.0.0 → 4.0.1 - Mend

arkaos 4.0.0 → 4.0.1


Files changed (13)

package/README.md +42 -30
package/VERSION +1 -1
package/arka/SKILL.md +2 -2
package/package.json +1 -1
package/pyproject.toml +1 -1
package/scripts/bench/__init__.py +5 -0
package/scripts/bench/__pycache__/__init__.cpython-313.pyc +0 -0
package/scripts/bench/__pycache__/harness.cpython-313.pyc +0 -0
package/scripts/bench/__pycache__/run.cpython-313.pyc +0 -0
package/scripts/bench/harness.py +138 -0
package/scripts/bench/run.py +136 -0
package/scripts/tools/__pycache__/docs_stats.cpython-313.pyc +0 -0
package/scripts/tools/docs_stats.py +154 -0

package/README.md CHANGED

@@ -2,13 +2,16 @@
 **The Operating System for AI Agent Teams.**
-106 agents. 17 departments. 250+ skills. Enterprise frameworks. Multi-runtime. One install.
+82 agents. 17 departments. 267 skills. Enterprise frameworks. Multi-runtime. One install.
 ```bash
 npx arkaos install
 ```
-[![npm](https://img.shields.io/npm/v/arkaos)](https://www.npmjs.com/package/arkaos) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) [![Tests](https://img.shields.io/badge/tests-3025%20passing-brightgreen)]()
+[![npm](https://img.shields.io/npm/v/arkaos)](https://www.npmjs.com/package/arkaos) [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE) [![Tests](https://img.shields.io/badge/tests-4500%2B%20passing-brightgreen)]()
+> All counts in this document are generated by `python scripts/tools/docs_stats.py`
+> and locked by a test — they cannot drift from the repository.
 ---
@@ -99,7 +102,7 @@ In plain language. No special syntax required.
 ### 2. ArkaOS routes to the right squad
-The Synapse engine (8-layer context injection in <1ms) analyzes your request and routes it to the correct department. Each department has a lead agent who orchestrates specialists.
+The Synapse engine (12-layer context injection, ~87ms cold / ~83ms warm — see [Benchmarks](wiki/11-Benchmarks.md)) analyzes your request and routes it to the correct department. Each department has a lead agent who orchestrates specialists.
 ### 3. Agents execute with enterprise frameworks
@@ -132,24 +135,27 @@ Every decision, solution, and pattern is captured. The Cognitive Layer curates i
 | Department | Prefix | Agents | What It Does |
 |-----------|--------|--------|-------------|
-| **Development** | `/dev` | 10 | Full-stack features, APIs, architecture, security, CI/CD |
+| **Development** | `/dev` | 15 | Full-stack features, APIs, architecture, security, CI/CD |
+| **Brand & Design** | `/brand` | 10 | Brand identity, UX/UI, design systems, naming |
 | **Marketing** | `/mkt` | 4 | SEO, paid ads, email campaigns, growth loops |
-| **Brand & Design** | `/brand` | 4 | Brand identity, UX/UI, design systems, naming |
-| **Finance** | `/fin` | 3 | DCF valuation, unit economics, budgets, investor prep |
-| **Strategy** | `/strat` | 3 | Market analysis, competitive intelligence, business models |
+| **Strategy** | `/strat` | 4 | Market analysis, competitive intelligence, business models |
 | **E-Commerce** | `/ecom` | 4 | Store optimization, CRO, pricing, RFM segmentation |
-| **Knowledge** | `/kb` | 3 | Research, Zettelkasten, persona building, ingestion |
-| **Operations** | `/ops` | 4 | Automation, SOPs, compliance (GDPR, ISO, SOC 2) |
-| **Project Mgmt** | `/pm` | 3 | Scrum, Shape Up, discovery, roadmaps |
-| **SaaS** | `/saas` | 4 | Idea validation, metrics, PLG strategy, scaffolding |
-| **Landing Pages** | `/landing` | 4 | Sales copy, funnels, offers, page generation |
+| **Knowledge** | `/kb` | 4 | Research, Zettelkasten, persona building, ingestion |
+| **Project Mgmt** | `/pm` | 4 | Scrum, Shape Up, discovery, roadmaps |
 | **Content** | `/content` | 4 | Viral hooks, scripts, repurposing, content calendars |
-| **Communities** | `/community` | 2 | Groups, membership, gamification, engagement |
-| **Sales** | `/sales` | 2 | Pipeline management, SPIN selling, negotiation |
-| **Leadership** | `/lead` | 2 | Team health, OKRs, culture, hiring frameworks |
-| **Organization** | `/org` | 1 | Org design, team topologies, matrix structure |
+| **Sales** | `/sales` | 4 | Pipeline management, SPIN selling, negotiation |
+| **SaaS** | `/saas` | 5 | Idea validation, metrics, PLG strategy, scaffolding |
+| **Organization** | `/org` | 5 | Org design, team topologies, matrix structure |
+| **Landing Pages** | `/landing` | 4 | Sales copy, funnels, offers, page generation |
+| **Finance** | `/fin` | 3 | DCF valuation, unit economics, budgets, investor prep |
+| **Operations** | `/ops` | 3 | Automation, SOPs, compliance (GDPR, ISO, SOC 2) |
+| **Communities** | `/community` | 3 | Groups, membership, gamification, engagement |
+| **Leadership** | `/lead` | 3 | Team health, OKRs, culture, hiring frameworks |
 | **Quality Gate** | (auto) | 3 | Mandatory review on every workflow. Veto power. |
+> 82 agents across 17 departments (81 unique; `cro-specialist` is shared by
+> E-Commerce and Landing in the matrix structure).
 ---
 ## Cognitive Layer (v2.10)
@@ -386,7 +392,7 @@ python scripts/tools/okr_cascade.py growth --json
 User Input
   │
   ▼
-Synapse v2 (8-layer context injection, <1ms, cached)
+Synapse v2 (12-layer context injection, ~87ms cold / ~83ms warm)
   │
   ▼
 Orchestrator (/do → department routing)
@@ -408,13 +414,13 @@ Output (Obsidian vault + structured deliverables)
 | System | Purpose |
 |--------|---------|
-| **Synapse v2** | 8-layer context injection (<1ms, with caching) |
+| **Synapse v2** | 12-layer context injection (~87ms cold, ~83ms warm; cacheable layers are sub-millisecond) |
 | **Workflow Engine** | YAML workflows with phases, gates, parallelization |
 | **Agent Schema** | 4-framework behavioral DNA with consistency validation |
 | **Squad Framework** | Department squads + ad-hoc project squads (matrix) |
 | **Cognitive Layer** | Memory, Dreaming, Research, Scheduler |
 | **Living Specs** | Bidirectional spec/code sync |
-| **Governance** | Constitution with 14 non-negotiable rules |
+| **Governance** | Constitution with 25 non-negotiable rules (+ 11 must, 8 should) |
 | **Multi-Runtime** | Claude Code, Codex, Gemini, Cursor adapters |
 ### Tech Stack
@@ -427,22 +433,28 @@ Output (Obsidian vault + structured deliverables)
 | Workflows | YAML |
 | Agent Definitions | YAML |
 | Knowledge | Obsidian + SQLite-VSS |
-| Tests | pytest (1,993 tests) |
+| Tests | pytest (4,500+ tests) |
 ---
 ## Documentation
-Full documentation is available on the **[GitHub Wiki](https://github.com/andreagroferreira/arka-os/wiki)**:
+Full documentation lives in two places in this repository:
+**[`wiki/`](wiki/Home.md)** — the user-facing guide (step-by-step, features, benchmarks):
+- [Home](wiki/Home.md) — the index of everything
+- [Getting Started](wiki/01-Getting-Started.md) — install and run your first command
+- [Core Concepts](wiki/02-Core-Concepts.md) — squads, agents, tiers, behavioral DNA
+- [The 13-Phase Flow](wiki/03-The-13-Phase-Flow.md) — how every request is handled
+- [Departments](wiki/04-Departments/) — one page per department
+- [Commands Reference](wiki/05-Commands-Reference.md)
+- [Cognitive Layer](wiki/06-Cognitive-Layer.md) — memory, dreaming, research
+- [Benchmarks](wiki/11-Benchmarks.md) — measured, reproducible numbers
+- [Competitive Analysis](wiki/12-Competitive-Analysis.md) and [Benefits & ROI](wiki/13-Benefits-ROI.md)
-- [Getting Started](https://github.com/andreagroferreira/arka-os/wiki/Getting-Started)
-- [Installation Guide](https://github.com/andreagroferreira/arka-os/wiki/Installation)
-- [Departments & Agents](https://github.com/andreagroferreira/arka-os/wiki/Departments)
-- [Cognitive Layer](https://github.com/andreagroferreira/arka-os/wiki/Cognitive-Layer)
-- [Ecosystem Management](https://github.com/andreagroferreira/arka-os/wiki/Ecosystems)
-- [Configuration](https://github.com/andreagroferreira/arka-os/wiki/Configuration)
-- [Creating Projects](https://github.com/andreagroferreira/arka-os/wiki/Creating-Projects)
-- [Update & Sync](https://github.com/andreagroferreira/arka-os/wiki/Update-and-Sync)
+**[`docs/`](docs/)** — the technical/contributor reference (architecture, API,
+agent schema, core engine, ADRs).
 ---
@@ -493,7 +505,7 @@ Department commands: `/dev`, `/mkt`, `/brand`, `/fin`, `/strat`, `/ecom`, `/kb`,
 ## Contributing
-See [CONTRIBUTING.md](.github/CONTRIBUTING.md). PRs welcome — all changes require passing the full test suite (3,473+ tests as of v2.46.x) and Quality Gate review (Marta CQO + Eduardo Copy + Francisca Tech, all Opus).
+See [CONTRIBUTING.md](.github/CONTRIBUTING.md). PRs welcome — all changes require passing the full test suite (4,500+ tests as of v4.0.0) and Quality Gate review (Marta CQO + Eduardo Copy + Francisca Tech, all Opus).
 ## License
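
The updated README states that its counts are generated by `docs_stats.py` and locked by a test. That test is not part of this diff, so the following is only a minimal sketch of how such a drift check might work. The names `headline_counts` and `drifted` are hypothetical, and the flat `expected` dictionary is an assumption: the real `docs_stats.py` output is nested, and the mapping from its keys to the README headline is not shown here.

```python
import re

# Matches the README headline, e.g. "82 agents. 17 departments. 267 skills."
HEADLINE_RE = re.compile(r"(\d+) agents\. (\d+) departments\. (\d+) skills\.")


def headline_counts(readme_text: str) -> dict:
    """Extract the three headline numbers from README text."""
    match = HEADLINE_RE.search(readme_text)
    if match is None:
        raise ValueError("headline counts not found in README text")
    agents, departments, skills = (int(group) for group in match.groups())
    return {"agents": agents, "departments": departments, "skills": skills}


def drifted(readme_text: str, expected: dict) -> list:
    """Return the headline keys whose README value differs from `expected`."""
    found = headline_counts(readme_text)
    return [key for key, value in found.items() if expected.get(key) != value]


# The headline line added in this diff, checked against the same numbers.
README_LINE = "82 agents. 17 departments. 267 skills. Enterprise frameworks."
print(drifted(README_LINE, {"agents": 82, "departments": 17, "skills": 267}))
```

A test built on this idea would fail whenever `drifted` returns a non-empty list, which forces the README to be regenerated rather than edited by hand.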

package/VERSION CHANGED

@@ -1 +1 @@
-4.0.0
+4.0.1

package/arka/SKILL.md CHANGED

@@ -21,10 +21,10 @@ treat them as your default source. External research supplements, it
 does not replace the vault.
 <!-- arka:kb-first-prefix end -->
-# ArkaOS v2 — Main Orchestrator
+# ArkaOS — Main Orchestrator
 > **The Operating System for AI Agent Teams**
-> 65 agents. 17 departments. 244+ skills. Multi-runtime. Dashboard. Knowledge RAG.
+> 82 agents. 17 departments. 267 skills. Multi-runtime. Dashboard. Knowledge RAG.
 ## ⛔ Mandatory 13-phase flow (NON-NEGOTIABLE)

package/package.json CHANGED

@@ -1,6 +1,6 @@
 {
   "name": "arkaos",
-  "version": "4.0.0",
+  "version": "4.0.1",
   "description": "The Operating System for AI Agent Teams",
   "type": "module",
   "bin": {

package/pyproject.toml CHANGED

@@ -1,6 +1,6 @@
 [project]
 name = "arkaos-core"
-version = "4.0.0"
+version = "4.0.1"
 description = "Core engine for ArkaOS — The Operating System for AI Agent Teams"
 readme = "README.md"
 license = {text = "MIT"}

package/scripts/bench/__init__.py ADDED

@@ -0,0 +1,5 @@
+"""ArkaOS benchmark harness.
+Reproducible, honest measurements of the core engine. No fabricated numbers:
+every value in the docs comes from running `python scripts/bench/run.py`.
+"""

package/scripts/bench/__pycache__/__init__.cpython-313.pyc ADDED

Binary file

package/scripts/bench/__pycache__/harness.cpython-313.pyc ADDED

Binary file

package/scripts/bench/__pycache__/run.cpython-313.pyc ADDED

Binary file

package/scripts/bench/harness.py ADDED

@@ -0,0 +1,138 @@
+"""ArkaOS benchmark harness -- core engine measurements.
+Three honest measurements:
+1. Synapse injection latency (engine-only, no vector store) -- cold vs warm,
+   plus per-layer compute time so the "cached layers are sub-millisecond"
+   claim can be verified against the "full engine costs N ms" reality.
+2. Subagent handoff artifact size -- measured token estimate vs the documented
+   ~379-token claim.
+3. Routing accuracy -- DepartmentLayer keyword detection over a fixed labelled
+   prompt set.
+All numbers are reproducible. Timings vary by machine; routing accuracy and
+handoff sizes are deterministic.
+"""
+from __future__ import annotations
+import statistics
+import sys
+import time
+from pathlib import Path
+from typing import Callable
+_REPO_ROOT = Path(__file__).resolve().parents[2]
+if str(_REPO_ROOT) not in sys.path:
+    sys.path.insert(0, str(_REPO_ROOT))
+def _percentiles(samples_ms: list[float]) -> dict:
+    """Summarise a list of millisecond samples."""
+    ordered = sorted(samples_ms)
+    return {
+        "runs": len(ordered),
+        "min": round(ordered[0], 3),
+        "p50": round(statistics.median(ordered), 3),
+        "mean": round(statistics.mean(ordered), 3),
+        "max": round(ordered[-1], 3),
+    }
+def _time_call(fn: Callable[[], object]) -> float:
+    """Time a single call, return elapsed milliseconds."""
+    start = time.perf_counter()
+    fn()
+    return (time.perf_counter() - start) * 1000.0
+def bench_synapse_latency(runs: int = 50) -> dict:
+    """Measure Synapse engine injection latency (cold vs warm) + per-layer ms."""
+    from core.synapse.engine import create_default_engine
+    from core.synapse.layers import PromptContext
+    engine = create_default_engine()
+    ctx = PromptContext(
+        user_input="fix the authentication bug in the login controller",
+        cwd="/tmp/project", git_branch="feat/auth", project_name="demo",
+        project_stack="laravel 11", active_agent="backend-dev",
+    )
+    cold = [_time_call(lambda: (engine.clear_cache(), engine.inject(ctx))) for _ in range(runs)]
+    engine.inject(ctx)  # warm the cache
+    warm = [_time_call(lambda: engine.inject(ctx)) for _ in range(runs)]
+    last = engine.metrics[-1] if engine.metrics else {}
+    profile = {
+        "layers_computed": last.get("layers_computed"),
+        "layers_skipped": last.get("layers_skipped"),
+        "tokens_injected": last.get("tokens_injected"),
+    }
+    return {"layer_count": engine.layer_count,
+            "cold_ms": _percentiles(cold), "warm_ms": _percentiles(warm),
+            "injection_profile": profile}
+def bench_subagent_handoff() -> dict:
+    """Measure a representative handoff artifact's token estimate."""
+    from core.runtime.subagent import HandoffArtifact
+    artifact = HandoffArtifact(
+        task_id="task-0042",
+        task_description="Implement Stripe subscription billing with idempotent webhooks",
+        agent_id="backend-dev", agent_role="Senior Backend Developer",
+        agent_disc="D:80 I:50 S:45 C:78", department="dev",
+        relevant_files=["app/Services/BillingService.php",
+                        "app/Http/Controllers/WebhookController.php",
+                        "tests/Feature/BillingTest.php"],
+        context_summary=("Laravel 11 app, Cashier installed. Customer model has "
+                         "stripe_id. Need tiered pricing with volume discounts."),
+        constraints=["SOLID + Services/Repositories", "Feature tests required",
+                     "Idempotent webhook handling"],
+        expected_output="Tested, secure billing implementation with passing suite",
+        quality_criteria=["80%+ coverage", "OWASP reviewed", "Conventional commits"],
+    )
+    return {"documented_claim": 379,
+            "measured_tokens": artifact.estimated_tokens,
+            "prompt_chars": len(artifact.to_prompt())}
+# Fixed labelled prompt set for routing accuracy. (prompt, expected_department)
+_ROUTING_SET: list[tuple[str, str]] = [
+    ("fix the authentication bug in the login controller", "dev"),
+    ("refactor the payment service and add unit tests", "dev"),
+    ("create a go-to-market plan for our new SaaS", "saas"),
+    ("design a brand identity with logo and color palette", "brand"),
+    ("write viral content hooks for our TikTok channel", "content"),
+    ("build a high-converting landing page funnel", "landing"),
+    ("audit our online store conversion rate", "ecom"),
+    ("model our Q3 budget and cash flow forecast", "finance"),
+    ("run a competitive analysis with Porter's Five Forces", "strategy"),
+    ("plan the next sprint and groom the backlog", "pm"),
+    ("set up an SEO and paid ads growth campaign", "marketing"),
+    ("automate our client onboarding with an SOP", "ops"),
+]
+def bench_routing_accuracy() -> dict:
+    """Measure DepartmentLayer keyword routing over the labelled prompt set."""
+    from core.synapse.layers import DepartmentLayer, PromptContext
+    layer = DepartmentLayer()
+    hits, details = 0, []
+    for prompt, expected in _ROUTING_SET:
+        result = layer.compute(PromptContext(user_input=prompt))
+        detected = (result.content or "").strip()
+        ok = detected == expected
+        hits += int(ok)
+        details.append({"prompt": prompt, "expected": expected,
+                        "detected": detected or "(none)", "ok": ok})
+    total = len(_ROUTING_SET)
+    return {"total": total, "correct": hits,
+            "accuracy_pct": round(100.0 * hits / total, 1), "details": details}
+def run_all(runs: int = 50) -> dict:
+    """Run every benchmark and return a combined result dict."""
+    return {
+        "synapse_latency": bench_synapse_latency(runs=runs),
+        "subagent_handoff": bench_subagent_handoff(),
+        "routing_accuracy": bench_routing_accuracy(),
+    }
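
The cold-versus-warm measurement in `bench_synapse_latency` can be tried without the ArkaOS core installed. The sketch below reuses the same timing and summary approach from `harness.py`, but `ToyEngine` is a hypothetical stand-in, not the real Synapse engine: it caches one artificial computation so the pattern of "clear the cache for each cold sample, warm once before the warm samples" is visible. The absolute timings it produces say nothing about ArkaOS itself.

```python
import statistics
import time
from typing import Callable


def summarise(samples_ms: list) -> dict:
    """Summarise millisecond samples the same way harness._percentiles does."""
    ordered = sorted(samples_ms)
    return {
        "runs": len(ordered),
        "min": round(ordered[0], 3),
        "p50": round(statistics.median(ordered), 3),
        "mean": round(statistics.mean(ordered), 3),
        "max": round(ordered[-1], 3),
    }


def time_call(fn: Callable[[], object]) -> float:
    """Time a single call and return elapsed milliseconds."""
    start = time.perf_counter()
    fn()
    return (time.perf_counter() - start) * 1000.0


class ToyEngine:
    """Hypothetical stand-in for the engine: one cacheable computation."""

    def __init__(self) -> None:
        self._cache: dict = {}

    def clear_cache(self) -> None:
        self._cache.clear()

    def inject(self, prompt: str) -> int:
        if prompt not in self._cache:
            # Artificial work that the cache lets warm calls skip.
            self._cache[prompt] = sum(i * i for i in range(20_000))
        return self._cache[prompt]


def bench(engine: ToyEngine, runs: int = 20) -> dict:
    """Collect cold samples (cache cleared each run) and warm samples."""

    def cold_call() -> None:
        engine.clear_cache()
        engine.inject("prompt")

    cold = [time_call(cold_call) for _ in range(runs)]
    engine.inject("prompt")  # warm the cache once before sampling
    warm = [time_call(lambda: engine.inject("prompt")) for _ in range(runs)]
    return {"cold_ms": summarise(cold), "warm_ms": summarise(warm)}


result = bench(ToyEngine(), runs=20)
print(result["cold_ms"]["p50"], result["warm_ms"]["p50"])
```

As in the harness, each cold sample includes the time spent clearing the cache, so the cold figure slightly overstates pure compute time.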

package/scripts/bench/run.py ADDED

@@ -0,0 +1,136 @@
+#!/usr/bin/env python3
+"""Run the ArkaOS benchmark harness and persist results.
+Writes:
+- benchmarks/results.json  -- machine-readable, consumed by the wiki
+- benchmarks/results.md     -- human-readable summary table
+Usage:
+    python scripts/bench/run.py                 # default 50 runs
+    python scripts/bench/run.py --runs 100
+    python scripts/bench/run.py --runs 30 --no-write   # print only
+"""
+from __future__ import annotations
+import argparse
+import json
+import platform
+import sys
+from pathlib import Path
+_REPO_ROOT = Path(__file__).resolve().parents[2]
+if str(_REPO_ROOT) not in sys.path:
+    sys.path.insert(0, str(_REPO_ROOT))
+from scripts.bench import harness  # noqa: E402
+_OUT_DIR = _REPO_ROOT / "benchmarks"
+def _environment() -> dict:
+    """Capture the machine environment (numbers are machine-relative)."""
+    return {
+        "python": platform.python_version(),
+        "platform": platform.platform(),
+        "machine": platform.machine(),
+    }
+def _synapse_section(sl: dict) -> list[str]:
+    """Render the Synapse latency section."""
+    prof = sl["injection_profile"]
+    return [
+        "## Synapse context injection (engine-only, no vector store)",
+        "",
+        f"- Registered layers: **{sl['layer_count']}**",
+        f"- Cold injection (cache cleared each run): "
+        f"p50 **{sl['cold_ms']['p50']} ms**, mean {sl['cold_ms']['mean']} ms, "
+        f"min {sl['cold_ms']['min']} ms, max {sl['cold_ms']['max']} ms "
+        f"({sl['cold_ms']['runs']} runs)",
+        f"- Warm injection (cached): "
+        f"p50 **{sl['warm_ms']['p50']} ms**, mean {sl['warm_ms']['mean']} ms "
+        f"({sl['warm_ms']['runs']} runs)",
+        "- The small cold/warm delta is expected: cacheable layers are a "
+        "minority of total compute, so warming the cache saves only a few ms.",
+        f"- Representative injection: {prof['layers_computed']} layers computed, "
+        f"{prof['layers_skipped']} skipped, {prof['tokens_injected']} tokens injected",
+        "",
+    ]
+def _handoff_section(ho: dict) -> list[str]:
+    """Render the subagent handoff section."""
+    return [
+        "## Subagent handoff artifact",
+        "",
+        f"- Measured (representative artifact): **{ho['measured_tokens']} word-tokens** "
+        f"({ho['prompt_chars']} chars). 'word-tokens' is a whitespace-split estimate, "
+        "not a BPE tokenizer count.",
+        f"- Previously documented claim: {ho['documented_claim']}",
+        "",
+    ]
+def _routing_section(ra: dict) -> list[str]:
+    """Render the routing accuracy section + table."""
+    out = [
+        "## Routing accuracy (DepartmentLayer keyword detection)",
+        "",
+        f"- **{ra['correct']}/{ra['total']} = {ra['accuracy_pct']}%** on a fixed "
+        "labelled prompt set",
+        "",
+        "| Prompt | Expected | Detected | OK |",
+        "|---|---|---|:--:|",
+    ]
+    for d in ra["details"]:
+        mark = "yes" if d["ok"] else "no"
+        prompt = d["prompt"] if len(d["prompt"]) <= 48 else d["prompt"][:45] + "..."
+        out.append(f"| {prompt} | {d['expected']} | {d['detected']} | {mark} |")
+    out.append("")
+    return out
+def render_markdown(results: dict, env: dict) -> str:
+    """Render a human-readable benchmark summary."""
+    header = [
+        "# ArkaOS Benchmarks",
+        "",
+        "> Generated by `python scripts/bench/run.py`. Timings are "
+        "machine-relative; routing accuracy and handoff size are deterministic.",
+        "",
+        f"**Environment:** Python {env['python']} - {env['platform']}",
+        "",
+    ]
+    return "\n".join(header
+                     + _synapse_section(results["synapse_latency"])
+                     + _handoff_section(results["subagent_handoff"])
+                     + _routing_section(results["routing_accuracy"]))
+def main() -> int:
+    """Entry point."""
+    parser = argparse.ArgumentParser(description="Run ArkaOS benchmarks")
+    parser.add_argument("--runs", type=int, default=50, help="Latency samples (default 50)")
+    parser.add_argument("--no-write", action="store_true", help="Print only, do not write files")
+    args = parser.parse_args()
+    env = _environment()
+    results = harness.run_all(runs=args.runs)
+    payload = {"environment": env, "results": results}
+    md = render_markdown(results, env)
+    if args.no_write:
+        print(json.dumps(payload, indent=2))
+        print("\n" + md)
+        return 0
+    _OUT_DIR.mkdir(exist_ok=True)
+    (_OUT_DIR / "results.json").write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
+    (_OUT_DIR / "results.md").write_text(md + "\n", encoding="utf-8")
+    print(f"Wrote {_OUT_DIR / 'results.json'} and {_OUT_DIR / 'results.md'}")
+    print("\n" + md)
+    return 0
+if __name__ == "__main__":
+    sys.exit(main())
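
The routing table in `_routing_section` shortens any prompt longer than 48 characters to its first 45 characters plus `...`. The sketch below isolates that rule so it can be checked on its own. The helper names `table_cell` and `routing_row` are hypothetical, the sample `detail` dictionary is illustrative rather than a recorded benchmark result, and the pipe escaping is an added assumption that `run.py` does not perform (a literal `|` in a prompt would otherwise split the Markdown cell).

```python
def table_cell(text: str, limit: int = 48) -> str:
    """Shorten long text as run.py does, then escape Markdown pipes."""
    shown = text if len(text) <= limit else text[: limit - 3] + "..."
    # Assumption beyond run.py: escape pipes so the cell stays intact.
    return shown.replace("|", "\\|")


def routing_row(detail: dict) -> str:
    """Render one row of the routing accuracy table."""
    mark = "yes" if detail["ok"] else "no"
    return (f"| {table_cell(detail['prompt'])} | {detail['expected']} "
            f"| {detail['detected']} | {mark} |")


# Illustrative input only; the 'detected' value is not a measured result.
sample = {
    "prompt": "fix the authentication bug in the login controller",
    "expected": "dev",
    "detected": "dev",
    "ok": True,
}
print(routing_row(sample))
```

The sample prompt is 50 characters long, so it is cut to 45 characters plus the three-character ellipsis, giving a 48-character cell.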

package/scripts/tools/__pycache__/docs_stats.cpython-313.pyc ADDED

Binary file

package/scripts/tools/docs_stats.py ADDED

@@ -0,0 +1,154 @@
+#!/usr/bin/env python3
+"""ArkaOS Docs Stats -- canonical source of truth for documentation numbers.
+Counts agents, departments, skills, ADRs, and tests directly from the
+repository so that every document (README, wiki, CLAUDE.md) consumes generated
+numbers instead of hand-typed ones. This is the antidote to documentation
+drift: no number is ever written by hand.
+Usage:
+    python docs_stats.py                 # human-readable (repo root auto-detected)
+    python docs_stats.py --json
+    python docs_stats.py --root /path/to/arka-os --json
+    python docs_stats.py --with-pytest   # also collect authoritative pytest case count
+"""
+from __future__ import annotations
+import argparse
+import json
+import re
+import subprocess
+import sys
+from pathlib import Path
+from typing import Optional
+_TEST_DEF_RE = re.compile(r"^\s*(?:async\s+)?def\s+test_\w+", re.MULTILINE)
+_COLLECTED_RE = re.compile(r"(\d+)\s+tests?\s+collected")
+def repo_root(start: Optional[Path] = None) -> Path:
+    """Find the repo root by walking up to a dir with VERSION + departments/."""
+    cur = (start or Path(__file__).resolve()).resolve()
+    candidates = [cur, *cur.parents] if cur.is_dir() else [cur.parent, *cur.parents]
+    for p in candidates:
+        if (p / "VERSION").is_file() and (p / "departments").is_dir():
+            return p
+    return Path.cwd()
+def read_version(root: Path) -> str:
+    """Read the canonical version string from the VERSION file."""
+    vf = root / "VERSION"
+    return vf.read_text(encoding="utf-8").strip() if vf.is_file() else ""
+def count_agents(root: Path) -> dict:
+    """Count agent YAML files under departments/*/agents/ (recursive, to
+    include sub-squad nesting). Returns total files + unique slugs."""
+    dep = root / "departments"
+    files = [f for d in dep.glob("*/agents") if d.is_dir()
+             for f in d.rglob("*.yaml")] if dep.is_dir() else []
+    return {"files": len(files), "unique_slugs": len({f.name for f in files})}
+def count_departments(root: Path) -> int:
+    """Count department directories under departments/."""
+    dep = root / "departments"
+    return sum(1 for d in dep.iterdir() if d.is_dir()) if dep.is_dir() else 0
+def count_skills(root: Path) -> dict:
+    """Count SKILL.md files by area. 'core' = departments + arka."""
+    def _n(rel: str) -> int:
+        base = root / rel
+        return len(list(base.rglob("SKILL.md"))) if base.is_dir() else 0
+    dept, arka, market = _n("departments"), _n("arka"), _n("marketplace")
+    return {"departments": dept, "arka": arka, "marketplace": market,
+            "core": dept + arka}
+def count_adrs(root: Path) -> int:
+    """Count Architecture Decision Records in docs/adr/."""
+    adr = root / "docs" / "adr"
+    return len(list(adr.glob("*.md"))) if adr.is_dir() else 0
+def count_test_functions(root: Path) -> int:
+    """Static count of `def test_` / `async def test_` definitions in tests/."""
+    tdir = root / "tests"
+    if not tdir.is_dir():
+        return 0
+    return sum(len(_TEST_DEF_RE.findall(f.read_text(encoding="utf-8", errors="replace")))
+               for f in tdir.rglob("test_*.py"))
+def collect_pytest_cases(root: Path) -> Optional[int]:
+    """Authoritative pytest case count via --collect-only. None on failure."""
+    try:
+        out = subprocess.run(
+            [sys.executable, "-m", "pytest", "--collect-only", "-q"],
+            cwd=root, capture_output=True, text=True, timeout=300, check=False)
+    except (OSError, subprocess.SubprocessError):
+        return None
+    for line in reversed(out.stdout.splitlines()):
+        m = _COLLECTED_RE.search(line)
+        if m:
+            return int(m.group(1))
+    return None
+def gather(root: Path, with_pytest: bool = False) -> dict:
+    """Collect all documentation stats into a JSON-serialisable dict."""
+    tests = {"functions": count_test_functions(root)}
+    if with_pytest:
+        tests["collected"] = collect_pytest_cases(root)
+    return {
+        "version": read_version(root),
+        "agents": count_agents(root),
+        "departments": count_departments(root),
+        "skills": count_skills(root),
+        "adrs": count_adrs(root),
+        "tests": tests,
+        "root": str(root),
+    }
+def format_text(stats: dict) -> str:
+    """Render a human-readable summary."""
+    a, s, t = stats["agents"], stats["skills"], stats["tests"]
+    lines = [
+        "=" * 52,
+        "ARKAOS DOCS STATS (canonical)",
+        "=" * 52,
+        f"Version:        {stats['version']}",
+        f"Departments:    {stats['departments']}",
+        f"Agents:         {a['files']} files ({a['unique_slugs']} unique slugs)",
+        f"Skills (core):  {s['core']}  (departments {s['departments']} + arka {s['arka']})",
+        f"  marketplace:  {s['marketplace']}",
+        f"ADRs:           {stats['adrs']}",
+        f"Test functions: {t['functions']}",
+    ]
+    if "collected" in t:
+        lines.append(f"Test cases:     {t['collected']} (pytest collected)")
+    lines.append("=" * 52)
+    return "\n".join(lines)
+def main() -> int:
+    """Entry point."""
+    parser = argparse.ArgumentParser(
+        description="ArkaOS docs stats -- canonical documentation counter")
+    parser.add_argument("--root", default=None, help="Repo root (default: auto-detect)")
+    parser.add_argument("--json", action="store_true", help="Output as JSON")
+    parser.add_argument("--with-pytest", action="store_true",
+                        help="Also collect authoritative pytest case count")
+    args = parser.parse_args()
+    root = Path(args.root).resolve() if args.root else repo_root()
+    stats = gather(root, with_pytest=args.with_pytest)
+    print(json.dumps(stats, indent=2) if args.json else format_text(stats))
+    return 0
+if __name__ == "__main__":
+    sys.exit(main())