@nomos-arc/arc 0.1.0
- package/.claude/settings.local.json +10 -0
- package/.nomos-config.json +5 -0
- package/CLAUDE.md +108 -0
- package/LICENSE +190 -0
- package/README.md +569 -0
- package/dist/cli.js +21120 -0
- package/docs/auth/googel_plan.yaml +1093 -0
- package/docs/auth/google_task.md +235 -0
- package/docs/auth/hardened_blueprint.yaml +1658 -0
- package/docs/auth/red_team_report.yaml +336 -0
- package/docs/auth/session_state.yaml +162 -0
- package/docs/certificate/cer_enhance_plan.md +605 -0
- package/docs/certificate/certificate_report.md +338 -0
- package/docs/dev_overview.md +419 -0
- package/docs/feature_assessment.md +156 -0
- package/docs/how_it_works.md +78 -0
- package/docs/infrastructure/map.md +867 -0
- package/docs/init/master_plan.md +3581 -0
- package/docs/init/red_team_report.md +215 -0
- package/docs/init/report_phase_1a.md +304 -0
- package/docs/integrity-gate/enhance_drift.md +703 -0
- package/docs/integrity-gate/overview.md +108 -0
- package/docs/management/manger-task.md +99 -0
- package/docs/management/scafffold.md +76 -0
- package/docs/map/ATOMIC_BLUEPRINT.md +1349 -0
- package/docs/map/RED_TEAM_REPORT.md +159 -0
- package/docs/map/map_task.md +147 -0
- package/docs/map/semantic_graph_task.md +792 -0
- package/docs/map/semantic_master_plan.md +705 -0
- package/docs/phase7/TEAM_RED.md +249 -0
- package/docs/phase7/plan.md +1682 -0
- package/docs/phase7/task.md +275 -0
- package/docs/prompts/USAGE.md +312 -0
- package/docs/prompts/architect.md +165 -0
- package/docs/prompts/executer.md +190 -0
- package/docs/prompts/hardener.md +190 -0
- package/docs/prompts/red_team.md +146 -0
- package/docs/verification/goveranance-overview.md +396 -0
- package/docs/verification/governance-overview.md +245 -0
- package/docs/verification/verification-arc-ar.md +560 -0
- package/docs/verification/verification-architecture.md +560 -0
- package/docs/very_next.md +52 -0
- package/docs/whitepaper.md +89 -0
- package/overview.md +1469 -0
- package/package.json +63 -0
- package/src/adapters/__tests__/git.test.ts +296 -0
- package/src/adapters/__tests__/stdio.test.ts +70 -0
- package/src/adapters/git.ts +226 -0
- package/src/adapters/pty.ts +159 -0
- package/src/adapters/stdio.ts +113 -0
- package/src/cli.ts +83 -0
- package/src/commands/apply.ts +47 -0
- package/src/commands/auth.ts +301 -0
- package/src/commands/certificate.ts +89 -0
- package/src/commands/discard.ts +24 -0
- package/src/commands/drift.ts +116 -0
- package/src/commands/index.ts +78 -0
- package/src/commands/init.ts +121 -0
- package/src/commands/list.ts +75 -0
- package/src/commands/map.ts +55 -0
- package/src/commands/plan.ts +30 -0
- package/src/commands/review.ts +58 -0
- package/src/commands/run.ts +63 -0
- package/src/commands/search.ts +147 -0
- package/src/commands/show.ts +63 -0
- package/src/commands/status.ts +59 -0
- package/src/core/__tests__/budget.test.ts +213 -0
- package/src/core/__tests__/certificate.test.ts +385 -0
- package/src/core/__tests__/config.test.ts +191 -0
- package/src/core/__tests__/preflight.test.ts +24 -0
- package/src/core/__tests__/prompt.test.ts +358 -0
- package/src/core/__tests__/review.test.ts +161 -0
- package/src/core/__tests__/state.test.ts +362 -0
- package/src/core/auth/__tests__/manager.test.ts +166 -0
- package/src/core/auth/__tests__/server.test.ts +220 -0
- package/src/core/auth/gcp-projects.ts +160 -0
- package/src/core/auth/manager.ts +114 -0
- package/src/core/auth/server.ts +141 -0
- package/src/core/budget.ts +119 -0
- package/src/core/certificate.ts +502 -0
- package/src/core/config.ts +212 -0
- package/src/core/errors.ts +54 -0
- package/src/core/factory.ts +49 -0
- package/src/core/graph/__tests__/builder.test.ts +272 -0
- package/src/core/graph/__tests__/contract-writer.test.ts +175 -0
- package/src/core/graph/__tests__/enricher.test.ts +299 -0
- package/src/core/graph/__tests__/parser.test.ts +200 -0
- package/src/core/graph/__tests__/pipeline.test.ts +202 -0
- package/src/core/graph/__tests__/renderer.test.ts +128 -0
- package/src/core/graph/__tests__/resolver.test.ts +185 -0
- package/src/core/graph/__tests__/scanner.test.ts +231 -0
- package/src/core/graph/__tests__/show.test.ts +134 -0
- package/src/core/graph/builder.ts +303 -0
- package/src/core/graph/constraints.ts +94 -0
- package/src/core/graph/contract-writer.ts +93 -0
- package/src/core/graph/drift/__tests__/classifier.test.ts +215 -0
- package/src/core/graph/drift/__tests__/comparator.test.ts +335 -0
- package/src/core/graph/drift/__tests__/drift.test.ts +453 -0
- package/src/core/graph/drift/__tests__/reporter.test.ts +203 -0
- package/src/core/graph/drift/classifier.ts +165 -0
- package/src/core/graph/drift/comparator.ts +205 -0
- package/src/core/graph/drift/reporter.ts +77 -0
- package/src/core/graph/enricher.ts +251 -0
- package/src/core/graph/grammar-paths.ts +30 -0
- package/src/core/graph/html-template.ts +493 -0
- package/src/core/graph/map-schema.ts +137 -0
- package/src/core/graph/parser.ts +336 -0
- package/src/core/graph/pipeline.ts +209 -0
- package/src/core/graph/renderer.ts +92 -0
- package/src/core/graph/resolver.ts +195 -0
- package/src/core/graph/scanner.ts +145 -0
- package/src/core/logger.ts +46 -0
- package/src/core/orchestrator.ts +792 -0
- package/src/core/plan-file-manager.ts +66 -0
- package/src/core/preflight.ts +64 -0
- package/src/core/prompt.ts +173 -0
- package/src/core/review.ts +95 -0
- package/src/core/state.ts +294 -0
- package/src/core/worktree-coordinator.ts +77 -0
- package/src/search/__tests__/chunk-extractor.test.ts +339 -0
- package/src/search/__tests__/embedder-auth.test.ts +124 -0
- package/src/search/__tests__/embedder.test.ts +267 -0
- package/src/search/__tests__/graph-enricher.test.ts +178 -0
- package/src/search/__tests__/indexer.test.ts +518 -0
- package/src/search/__tests__/integration.test.ts +649 -0
- package/src/search/__tests__/query-engine.test.ts +334 -0
- package/src/search/__tests__/similarity.test.ts +78 -0
- package/src/search/__tests__/vector-store.test.ts +281 -0
- package/src/search/chunk-extractor.ts +167 -0
- package/src/search/embedder.ts +209 -0
- package/src/search/graph-enricher.ts +95 -0
- package/src/search/indexer.ts +483 -0
- package/src/search/lexical-searcher.ts +190 -0
- package/src/search/query-engine.ts +225 -0
- package/src/search/vector-store.ts +311 -0
- package/src/types/index.ts +572 -0
- package/src/utils/__tests__/ansi.test.ts +54 -0
- package/src/utils/__tests__/frontmatter.test.ts +79 -0
- package/src/utils/__tests__/sanitize.test.ts +229 -0
- package/src/utils/ansi.ts +19 -0
- package/src/utils/context.ts +44 -0
- package/src/utils/frontmatter.ts +27 -0
- package/src/utils/sanitize.ts +78 -0
- package/test/e2e/lifecycle.test.ts +330 -0
- package/test/fixtures/mock-planner-hang.ts +5 -0
- package/test/fixtures/mock-planner.ts +26 -0
- package/test/fixtures/mock-reviewer-bad.ts +8 -0
- package/test/fixtures/mock-reviewer-retry.ts +34 -0
- package/test/fixtures/mock-reviewer.ts +18 -0
- package/test/fixtures/sample-project/src/circular-a.ts +6 -0
- package/test/fixtures/sample-project/src/circular-b.ts +6 -0
- package/test/fixtures/sample-project/src/config.ts +15 -0
- package/test/fixtures/sample-project/src/main.ts +19 -0
- package/test/fixtures/sample-project/src/services/product-service.ts +20 -0
- package/test/fixtures/sample-project/src/services/user-service.ts +18 -0
- package/test/fixtures/sample-project/src/types.ts +14 -0
- package/test/fixtures/sample-project/src/utils/index.ts +14 -0
- package/test/fixtures/sample-project/src/utils/validate.ts +12 -0
- package/tsconfig.json +20 -0
- package/vitest.config.ts +12 -0
@@ -0,0 +1,419 @@

# Strategic Development Overview: nomos-arc.ai - The Orchestrator

This document outlines the strategic shift of nomos-arc.ai from a standalone AI assistant to a **Global Governance & Orchestration Layer**. Our goal is to manage, audit, and secure the output of agentic AI coding tools.

> **Open Source Philosophy:** nomos-arc.ai is designed as a **free, open-source tool** that imposes zero financial burden on teams. By leveraging free-tier LLM APIs (Gemini, etc.), local models (Ollama), and open-source infrastructure (ChromaDB, Mermaid.js), the focus shifts from "paying for AI" to **"automating understanding"** — not just code generation, but code comprehension, architecture mapping, and intelligent failure analysis.

---

## 1. The Orchestrator Vision: Manager vs. Agent

Most AI tools (Cursor, Claude Code, Aider) are **"Agents"** — they are the workers in the field. nomos-arc.ai is the **"Manager" (Orchestrator)**.

### Why this matters:
* **Decoupling Execution from Control:** nomos-arc.ai opens the "Shadow Branch," injects the rules, and spawns the Agent (Claude Code).
* **Stateful Lifecycle:** Unlike ephemeral chat sessions, nomos-arc.ai manages a task's full lifecycle (Init -> Plan -> Review -> Refine -> Apply).
* **Reliability:** nomos-arc.ai ensures the agent follows the prescribed path, rather than merely hoping it does.

---

## 2. Solving the "Single-Model Bias" (Independent Verification)

The biggest flaw in current tools is that the AI builds and reviews its own work. nomos-arc.ai introduces an **Independent Auditor** system.

* **Conflict-of-Interest Model:** If Model A (e.g., Claude 3.5 Sonnet) writes the code, nomos-arc.ai mandates that Model B (e.g., GPT-4o or a local Llama 3) must perform the review.
* **The "Double Check" Advantage:** This mitigates hallucinations and ensures that architectural choices are validated by a second, unbiased intelligence.

---

## 3. Hard vs. Soft Governance: The "Enforcer" Layer

Current tools treat `CLAUDE.md` or system instructions as "suggestions," which the AI may ignore in long sessions. nomos-arc.ai turns these into **Hard Gates**.

* **Rules Injection:** nomos-arc.ai programmatically injects global engineering standards as mandatory runtime constraints.
* **Automatic Rejection:** If the Reviewer Agent detects a rule violation, the `arc apply` command is blocked. The AI must fix the issue before the code is allowed to enter the codebase.
* **Corporate Compliance:** This makes AI safe for regulated industries (banking, healthcare, etc.) where compliance is not optional.

---

## 4. Context Optimization Engine (Smarter Agents)

AI agents often fail because they lack the "Big Picture" or carry too much "Noisy Context." nomos-arc.ai fixes this by pre-processing the codebase before the agent even starts.

* **Dependency Pre-Scan:** nomos-arc.ai scans the **Import Graph** of the affected files.
* **Smart Injection:** It injects *only* the relevant documentation and type definitions into the agent's prompt.
* **Efficiency:** This reduces token waste and significantly increases the accuracy of the agent's first attempt.
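
The pre-scan can be sketched in a few lines. This is a hedged illustration, not the shipped scanner — a real implementation would resolve modules with the TypeScript compiler API or tree-sitter rather than a regex, and the helper names (`extractImports`, `relevantContext`) are invented for this example:

```typescript
// Illustrative sketch of a dependency pre-scan over an in-memory file map.
// The real tool would read files from disk and resolve imports properly.
type FileMap = Record<string, string>;

// Extract relative-import specifiers from a TS/JS source string (naive regex).
function extractImports(source: string): string[] {
  const re = /import\s+(?:[\w*{},\s]+\s+from\s+)?["'](\.{1,2}\/[^"']+)["']/g;
  const out: string[] = [];
  for (const m of source.matchAll(re)) out.push(m[1]);
  return out;
}

// Naive path resolution: "./state" from "src/core/x.ts" -> "src/core/state.ts".
function resolve(from: string, spec: string): string {
  const dir = from.split("/").slice(0, -1).join("/");
  const out: string[] = [];
  for (const p of (dir + "/" + spec).split("/")) {
    if (p === "." || p === "") continue;
    else if (p === "..") out.pop();
    else out.push(p);
  }
  return out.join("/") + ".ts";
}

// Walk the import graph from the files a task touches, returning the
// minimal set of files worth injecting into the agent's prompt.
function relevantContext(files: FileMap, entryPoints: string[]): Set<string> {
  const seen = new Set<string>();
  const stack = [...entryPoints];
  while (stack.length > 0) {
    const file = stack.pop()!;
    if (seen.has(file) || !(file in files)) continue;
    seen.add(file);
    for (const spec of extractImports(files[file])) stack.push(resolve(file, spec));
  }
  return seen;
}
```

Only the files reachable from the task's entry points are injected; everything else stays out of the context window.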

---

## 5. Enterprise Audit & "Black Box" State (Traceability)

Every session in nomos-arc.ai is a **Persistent State (JSON)**. This acts as a "Flight Recorder" for the software development lifecycle.

| Data Point | Description |
| :--- | :--- |
| **History Hash** | An immutable record of the prompt, the diff, and the review score. |
| **Rule Lineage** | Tracks exactly which version of the rules was used for a specific task. |
| **Cost Tracking** | Real-time dollar-cost tracking per Jira ticket/task. |
| **Audit Log** | Essential for legal and security compliance in enterprise environments. |
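
A single "Flight Recorder" entry could look roughly like the sketch below. The field names are illustrative guesses for this overview — the authoritative schema lives in `src/types/index.ts`:

```typescript
// Hypothetical shape of one history entry in the persistent State JSON.
interface HistoryEntry {
  phase: "plan" | "review" | "refine" | "apply";
  prompt_hash: string;   // SHA-256 of the exact prompt sent to the model
  output_hash: string;   // SHA-256 of the diff/output produced
  review_score?: number; // 0-100, present for review entries
  rules_version: string; // which rules snapshot governed this step
  cost_usd: number;      // dollar cost attributed to this step
  timestamp: string;     // ISO 8601
}

// Example entry (all values invented for illustration).
const example: HistoryEntry = {
  phase: "review",
  prompt_hash: "sha256:2c26b4...",
  output_hash: "sha256:fcde2b...",
  review_score: 87,
  rules_version: "rules@1.4.0",
  cost_usd: 0.012,
  timestamp: "2025-06-01T12:00:00Z",
};
```

Each row of the table above maps onto one or more of these fields: `output_hash` feeds the History Hash, `rules_version` the Rule Lineage, and `cost_usd` the Cost Tracking.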

---

## 6. Local Control & Hybrid Privacy (The Privacy Shield)

To solve the "Enterprise Privacy" problem, nomos-arc.ai supports a **Hybrid Model**:

1. **Planner (Cloud):** Use powerful cloud models (like Claude) to generate complex logic.
2. **Reviewer (Local):** Use local LLMs (Llama 3, Mistral) via Ollama/vLLM to review the code.
* **Result:** The "Sensitive Logic" review and standard enforcement never leave the internal network.

---

## 7. The Competitive Matrix: Agents vs. The Manager

nomos-arc.ai does not replace these tools; it **orchestrates** them to make them "Production Ready."

| Agent | Their Weakness | The nomos-arc.ai Complementary Edge |
| :--- | :--- | :--- |
| **Claude Code** | Internally biased reviews; no durable state. | **Multi-Model Audit** & Persistent Task History. |
| **Cursor** | Local and individual-focused; no team governance. | **Centralized Rules** applied to all developers. |
| **GitHub Copilot** | Vendor-locked; repo-focused. | **Platform Agnostic** (GitLab, Bitbucket, on-prem). |
| **Aider** | Chat-based (ephemeral); no "Hard Gates." | **Formal SDLC Stages** with blocking merge filters. |
| **Plandex** | Complex workspace; lacks an audit trail. | **Clean Shadow Branching** via Git Worktrees. |

---

## 8. Strategic Assessment: Strengths & Risks

An honest evaluation of nomos-arc.ai's strategic position — both the advantages and the threats.

### 8.1 Strategic Strengths

* **Solving the Trust Gap:** The #1 blocker for enterprise AI adoption is the absence of institutional oversight. "Hard Gates" transform AI from an "enthusiastic volunteer" into an "employee governed by policy." This is the exact language that CTOs and CISOs need to hear.
* **Model-Agnostic Architecture (N-Version Programming):** The "Model A builds, Model B reviews" pattern is a proven engineering practice. It mitigates single-model bias and is a defensible technical advantage.
* **The Certificate as the Real Product:** In regulated environments (Banking, Healthcare, Defense), the documentation proving process quality is as important as the code itself. The Certificate *is* the product — the CLI is just the delivery mechanism.

### 8.2 Competitive Risks (Threats to Monitor)

| Risk | Severity | Mitigation Strategy |
| :--- | :--- | :--- |
| **Latency Tax** — Adding a review layer + governance slows DX compared to raw Cursor/Claude Code. | 🟡 Medium | Offer a "Fast Mode" for non-production branches; full governance only on `main`/`release`. |
| **Absorption Risk** — Anthropic, Microsoft, or Google may build governance features natively into their tools. | 🔴 High | **Move fast on CI/CD integration.** Once nomos-arc is embedded in the pipeline, it becomes infrastructure — not a feature to absorb. |
| **Adoption Friction** — Developers may resist the extra ceremony of `arc init → plan → review → apply`. | 🟡 Medium | `arc run` already automates the full loop. Add IDE plugins (VS Code, JetBrains) for zero-friction UX. |

### 8.3 Market Opportunity Score

* **Market Opportunity:** 9/10 (high enterprise demand).
* **Technical Advantage:** 8/10 (unique state-machine approach).
* **Probability of Technical Success:** **78%**.
* **Probability of Market Dominance:** **High** (if positioned as the "Safe Portal" for AI Agents).

---

## 9. The "Certificate of AI Engineering Integrity" (Core Differentiator)

The single most powerful feature nomos-arc.ai can offer is transforming its **State JSON** into a verifiable **"Certificate of AI Engineering Integrity"** — a cryptographically signed proof that AI-generated code was engineered responsibly.

### What is it?
Every task in nomos-arc.ai already produces a State JSON containing the full history of prompts, diffs, reviews, and scores. The Certificate takes this further by packaging it into a **tamper-evident, exportable artifact** that answers three questions any CISO or CTO will ask:

| Question | How the Certificate Answers It |
| :--- | :--- |
| **"Who wrote this code?"** | Full lineage: which AI model, which version, which prompt, which developer initiated it. |
| **"Was it reviewed?"** | Independent multi-model review with a structured score, tied to a specific rules version. |
| **"Does it meet our standards?"** | The exact rules snapshot (`rules_hash`) that was enforced, with pass/fail evidence. |

### Why this is the "Killer Feature":
* **No other tool provides this.** Cursor, Claude Code, Copilot — none of them produce a verifiable record of what the AI did and whether it was validated.
* **Regulatory demand is coming.** The EU AI Act, SOC 2 for AI, and emerging FDA guidelines for AI-assisted medical software will *require* this kind of traceability. nomos-arc.ai is building the infrastructure *before* the mandate.
* **It makes the CTO say "Yes."** The #1 blocker for enterprise AI adoption is trust. The Certificate is the artifact a CTO shows the board to prove AI usage is under control.

### Technical Implementation:
* **SHA-256 Chain:** Each `HistoryEntry.output_hash` is already computed. Chain them into a Merkle-like hash so any single tampered entry invalidates the entire certificate.
* **Exportable Artifact:** `arc certificate <task-id>` generates a standalone `.json` or `.pdf` report suitable for compliance archives.
* **Signature Support:** Optional GPG/SSH signing of the certificate by the developer, creating a non-repudiation chain.
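
A minimal sketch of the SHA-256 chain idea, assuming each history entry's `output_hash` is already available (the function name is illustrative, not the shipped `certificate.ts` API):

```typescript
import { createHash } from "node:crypto";

// Fold each entry's output_hash into a running chain hash.
// Tampering with any entry, or reordering entries, changes the final value,
// which invalidates the certificate on re-verification.
function chainHash(outputHashes: string[]): string {
  // Fixed genesis value so an empty history still has a defined hash.
  let chain = createHash("sha256").update("nomos-arc:genesis").digest("hex");
  for (const h of outputHashes) {
    // Each step commits to the previous chain value AND the current entry.
    chain = createHash("sha256").update(chain + h).digest("hex");
  }
  return chain;
}
```

Verifying a certificate then reduces to recomputing the chain over the stored entries and comparing it against the (optionally GPG/SSH-signed) value embedded in the exported artifact.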

---

## 10. The Enterprise Lock-In Flywheel

nomos-arc.ai's governance model creates a **natural retention flywheel** that makes it increasingly difficult for enterprises to leave once adopted.

### The Flywheel Stages:

```
┌─────────────────────────────────────────────────────────┐
│ 1. ADOPT      → Team starts using arc for AI tasks      │
│ 2. CUSTOMIZE  → Engineering rules tailored to org       │
│ 3. ACCUMULATE → Audit trail grows per task/sprint       │
│ 4. DEPEND     → Compliance reports reference nomos-arc  │
│                 data                                    │
│ 5. EXPAND     → More teams onboard → more rules → more  │
│                 audit data → deeper dependency          │
└─────────────────────────────────────────────────────────┘
```

### Why this matters strategically:
* **Rules as IP:** An organization's `global.md` + `backend.md` + domain rules become its **codified engineering culture**. Migrating away means losing or rebuilding this institutional knowledge.
* **Audit Trail as Legal Record:** Once compliance teams reference nomos-arc certificates in audits, removing nomos-arc means breaking the compliance chain — a risk no enterprise will take.
* **Cost Tracking as Budget Infrastructure:** When finance teams build AI budgets around nomos-arc's per-task cost data, it becomes embedded in the financial workflow.

---

## 11. Product Evolution: Next-Gen Capabilities (2026 Vision)

To transform nomos-arc.ai into infrastructure that is **impossible to replace**, the following capabilities must be developed. They combine ease of implementation with maximum professional impact.

### 11.1 Hybrid Governance: Static Analysis + AI Review

Instead of relying solely on a second LLM for review, the engine must include a **built-in static analysis layer** (e.g., Semgrep, ESLint, SonarQube rules).

* **How it works:** Code must pass traditional static analysis tools *before* being sent to the AI Reviewer (Model B). This creates a **two-tier quality gate:**
  ```
  AI Output → [Gate 1: Static Analysis] → [Gate 2: AI Review] → arc apply
  ```
* **Why it matters:**
  * Reduces token consumption — known vulnerability patterns (CVEs) are caught without spending API calls.
  * Provides deterministic guarantees that AI review alone cannot (e.g., regex-based secret detection, dependency vulnerability scanning).
  * Enterprises trust tools they already know (SonarQube). Integrating them *inside* nomos-arc builds credibility.
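
The two-tier gate can be sketched as a simple composition, assuming pluggable analyzers. `runStaticAnalysis` and the AI reviewer are placeholders here for the real Semgrep/ESLint and Model B integrations:

```typescript
// Hypothetical two-tier gate: deterministic static analysis first (cheap,
// free), independent AI review second (costs tokens). Names are illustrative.
interface GateResult {
  passed: boolean;
  tier: "static" | "ai" | "none"; // which gate rejected, if any
  reasons: string[];
}

type StaticAnalyzer = (diff: string) => string[];      // returns findings
type AiReviewer = (diff: string) => { score: number }; // returns a review score

function twoTierGate(
  diff: string,
  analyze: StaticAnalyzer,
  review: AiReviewer,
  threshold = 80
): GateResult {
  // Gate 1: known bad patterns never reach (and never pay for) the AI reviewer.
  const findings = analyze(diff);
  if (findings.length > 0) {
    return { passed: false, tier: "static", reasons: findings };
  }
  // Gate 2: independent AI review (Model B).
  const { score } = review(diff);
  if (score < threshold) {
    return { passed: false, tier: "ai", reasons: [`score ${score} < ${threshold}`] };
  }
  return { passed: true, tier: "none", reasons: [] };
}
```

Only a diff that clears both gates would be eligible for `arc apply`.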

### 11.2 MCP (Model Context Protocol) Integration

As the Orchestrator, nomos-arc.ai must support Anthropic's **MCP protocol** to become the central hub for context assembly.

* **How it works:** When a developer runs `arc plan`, nomos-arc.ai automatically pulls:
  * **Task requirements** from Jira/Linear via an MCP connector.
  * **Design specifications** from Figma via an MCP connector.
  * **Database schema** from the live database via an MCP connector.
* **Why it matters:** The AI Agent receives **complete, real-world context** — not just code files — leading to dramatically better first-attempt accuracy.
* **Competitive Edge:** This positions nomos-arc.ai as the **"Context Router"** — the single point where all organizational knowledge flows into the AI Agent.

### 11.3 Enterprise RLHF: Learning from Human Corrections

Every time a developer rejects AI-generated code and manually corrects it, that correction is **training data**.

* **How it works:**
  1. Developer runs `arc plan` → AI generates code.
  2. Developer modifies the code manually before running `arc apply`.
  3. nomos-arc.ai captures the **diff between the AI output and the human correction**.
  4. The correction patterns are analyzed and **automatically added to the governance rules** to prevent recurrence.
* **Why it matters:** The product becomes **smarter with every line of code the team writes**. This creates a **technical moat** that competitors cannot replicate — because the intelligence is built from *this specific team's* coding patterns.
* **Lock-In Effect:** After six months of corrections, the rules engine contains institutional knowledge that took hundreds of developer-hours to generate. Switching to a competitor means starting from zero.

### 11.4 Cost-Aware Routing (Financial Intelligence Layer)

Not all tasks require the same model. nomos-arc.ai must add a **smart routing layer** that optimizes for cost.

* **Routing Logic:**

| Task Complexity | Routed To | Estimated Cost Savings |
| :--- | :--- | :--- |
| **Simple** (unit tests, docs, formatting) | Local LLM (Llama 3 via Ollama) | ~90% savings vs. cloud |
| **Medium** (bug fixes, refactoring) | Claude 3.5 Haiku / GPT-4o-mini | ~60% savings vs. top-tier |
| **Complex** (architecture, security-critical) | Claude Opus / Gemini 1.5 Pro | Full cost, maximum quality |

* **Why it matters:** Reduces the monthly AI bill by up to **40%**. This is the feature that wins over the **CFO** — the person who signs the enterprise contract.
* **Implementation:** Add a `complexity` field to the task frontmatter (auto-detected or manually set). The orchestrator routes the task to the appropriate model binary based on complexity and remaining budget.
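
The routing table above reduces to a small lookup plus a budget guard. The model identifiers below are illustrative stand-ins; real ones would come from `.nomos-config.json`:

```typescript
// Hypothetical cost-aware router matching the table above.
type Complexity = "simple" | "medium" | "complex";

const ROUTES: Record<Complexity, string> = {
  simple: "ollama/llama3",    // local, free: unit tests, docs, formatting
  medium: "claude-3-5-haiku", // mid-tier: bug fixes, refactoring
  complex: "claude-opus",     // top-tier: architecture, security-critical
};

// Pick a model for the task, degrading to free local inference
// when the remaining budget (USD) can no longer cover a cloud call.
function routeTask(complexity: Complexity, budgetRemaining: number): string {
  if (budgetRemaining <= 0) return ROUTES.simple;
  return ROUTES[complexity];
}
```

A design choice worth noting: falling back to the local model (rather than failing the task) keeps the pipeline flowing when a team exhausts its monthly budget, at the cost of lower output quality.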

---

## 12. The Developer Intelligence Suite (Powered by Gemini Free Tier)

nomos-arc.ai's philosophy is to automate **"understanding"** — not just **"writing."** The following capabilities use **zero-cost infrastructure** (Gemini API Free Tier, open-source tools) to give developers superpowers without adding financial burden.

### 12.1 Smart CI/CD: Intelligent Failure Analysis

The CI/CD pipeline should not just say "Build Failed" — it should say **why it failed, what caused it, and how to fix it.**

* **How it works:**
  1. When a CI/CD pipeline fails, nomos-arc captures the error logs automatically.
  2. The logs are sent to the Gemini API (Free Tier via the `google-generativeai` SDK) for **Root Cause Analysis**.
  3. Gemini produces a structured diagnosis: the root cause, the affected component, and a concrete fix suggestion.
  4. The analysis is posted as a **comment on the Pull Request** automatically.
* **Impact Analysis (Pre-Merge):** Before any merge, nomos-arc analyzes the `git diff` to identify **sensitive areas** that may be affected (e.g., database schema changes rippling into API endpoints). The developer is alerted with a structured impact report.
* **Implementation:** Packaged as a reusable **GitHub Action** (`nomos-arc/ci-analysis`) or **GitLab CI job** that any project can adopt in minutes.

```yaml
# Example: .github/workflows/nomos-ci.yml
name: nomos-arc AI Review
on: [pull_request]
jobs:
  nomos-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: nomos-arc/ci-analysis@v1
        with:
          gemini_api_key: ${{ secrets.GEMINI_API_KEY }}
          analysis_mode: 'failure+impact'
          post_comment: true
```

### 12.2 Architecture Map Graph (Project Visualization)

To truly understand a codebase, you need a **visual map** — not a file tree. nomos-arc.ai builds this automatically.

* **How it works:**
  1. **Extraction:** Use `tree-sitter` to parse the AST (Abstract Syntax Tree) of every file, extracting classes, functions, imports, and their relationships.
  2. **LLM Processing:** Send the structural data to Gemini to produce clean **Mermaid.js** diagram code, annotated with module boundaries and dependency flows.
  3. **Visualization:** Render the Mermaid code into an interactive diagram — embeddable in the project's README, a local dashboard, or the `arc dashboard` web view.

* **The Killer Use Case:** "If I change function `X`, what breaks?" The map shows the **blast radius** of any change — the modules, services, and tests that depend on `X`.

* **CLI Command:** `arc map` generates the full project architecture graph.
  ```
  $ arc map --output mermaid
  ✓ Parsed 47 files via tree-sitter
  ✓ Extracted 128 relationships
  ✓ Generated architecture diagram → tasks-management/maps/architecture.mmd
  ```

* **Example Output:**
  ```mermaid
  graph TD
      CLI["cli.ts"] --> Commands["commands/*"]
      Commands --> Orchestrator["core/orchestrator.ts"]
      Orchestrator --> StateManager["core/state.ts"]
      Orchestrator --> WorktreeCoord["core/worktree-coordinator.ts"]
      Orchestrator --> PtyAdapter["adapters/pty.ts"]
      Orchestrator --> StdioAdapter["adapters/stdio.ts"]
      WorktreeCoord --> GitAdapter["adapters/git.ts"]
      Orchestrator --> Budget["core/budget.ts"]
      Orchestrator --> Prompt["core/prompt.ts"]
      Orchestrator --> Review["core/review.ts"]
  ```

### 12.3 Semantic Code Understanding (Local RAG System)

For new developers joining a project, the #1 pain point is: **"Where does X happen in the codebase?"** nomos-arc.ai solves this with a **Local RAG (Retrieval-Augmented Generation)** system.

* **How it works:**
  1. **Chunking:** The codebase is split into semantic chunks (functions, classes, modules).
  2. **Embedding:** Each chunk is embedded using the Gemini Embeddings API (free within usage limits) and stored in a local **ChromaDB** vector database.
  3. **Querying:** The developer asks natural-language questions via `arc ask`.
  4. **Retrieval + Generation:** The system retrieves the most relevant code chunks and uses Gemini to generate a comprehensive answer with file references.

* **CLI Command:** `arc ask "<question>"`
  ```
  $ arc ask "Where is the review scoring logic implemented?"

  📍 Found in: src/core/review.ts (lines 12-45)

  The review scoring is handled by the `parseReviewOutput()` function.
  It extracts a JSON object from the reviewer's output containing:
  - `score` (0-100): Overall quality score
  - `issues[]`: Array of specific problems found
  - `suggestions[]`: Improvement recommendations

  The score is compared against `config.convergence.score_threshold`
  in the Orchestrator (src/core/orchestrator.ts:579) to determine
  if the task moves to 'approved' or 'refinement'.

  Related files:
  → src/core/orchestrator.ts (review method, line 469)
  → src/types/index.ts (ReviewResult type definition)
  ```

* **Index Command:** `arc index` rebuilds the semantic index of the project.
* **Zero Cloud Dependency for the Index:** All embedding vectors are stored locally in ChromaDB, so the index itself never leaves the developer's machine.
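
The retrieval step of the pipeline above can be sketched as cosine-similarity ranking. In the real flow the vectors would come from the Gemini Embeddings API and be persisted in ChromaDB; here they are plain in-memory arrays, and the function names are invented for this example:

```typescript
// Illustrative retrieval core of a local RAG system.
interface Chunk {
  file: string;
  vector: number[]; // embedding of the code chunk
}

// Cosine similarity between two equal-length vectors.
function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Return the top-k chunks most similar to the query embedding.
// The retrieved chunks are what gets sent to Gemini to compose the answer.
function retrieve(query: number[], chunks: Chunk[], k = 3): Chunk[] {
  return [...chunks]
    .sort((x, y) => cosine(query, y.vector) - cosine(query, x.vector))
    .slice(0, k);
}
```

ChromaDB performs essentially this ranking (with approximate-nearest-neighbor indexing for scale); the sketch only shows the math that decides which files `arc ask` cites.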
|
|
318
|
+
|
|
319
|
+
### 12.4 Pre-Commit Hook: "Shift Left" Philosophy
|
|
320
|
+
|
|
321
|
+
The most impactful governance happens **before code is pushed**, not after it fails in CI.
|
|
322
|
+
|
|
323
|
+
* **How it works:** `arc hook install` adds a Git pre-commit hook that runs locally:
|
|
324
|
+
1. **Architecture Guard:** Checks the proposed changes against the architecture map — does this commit introduce a forbidden dependency?
|
|
325
|
+
2. **Rules Check:** Validates the diff against governance rules without waiting for CI.
|
|
326
|
+
3. **Quick Score:** A lightweight local model (or Gemini Flash) provides a fast quality score.
|
|
327
|
+
4. If the score is below threshold → the commit is **blocked locally** with a clear explanation.
|
|
328
|
+
|
|
329
|
+
* **Why it matters:** Prevents bad code from ever reaching the CI pipeline, saving time and compute. Senior developers and enterprises value this because **"Prevention is better than fixing failures in the pipeline."**

---

## 13. Zero-Cost Technology Stack

nomos-arc.ai is built on a principle: **maximum intelligence, zero financial overhead.**

| Tool | Role | Cost |
| :--- | :--- | :--- |
| **Gemini API (Free Tier)** | Core AI engine — analysis, review, map generation, RAG answers. | $0 (within fair usage limits) |
| **tree-sitter** | AST parsing for architecture extraction and code understanding. | Open Source |
| **Mermaid.js** | Converts LLM output into professional engineering diagrams. | Open Source |
| **ChromaDB** | Local vector database for semantic code search (RAG). | Open Source (Local) |
| **Docker & Local Runners** | Run all processing locally to minimize cloud dependency. | Free |
| **Ollama / vLLM** | Local LLM inference for privacy-sensitive reviews. | Open Source |

---

## 14. Unified Implementation Roadmap

Combining the Trust Layer, Product Evolution, and Developer Intelligence capabilities into a single execution plan:

### Phase 1: "The Trust Layer" (Immediate — Weeks 1-4)

| Feature | Status | Impact | Owner |
| :--- | :--- | :--- | :--- |
| `arc certificate <task-id>` — Exportable compliance artifact | 🔴 Not Started | **Critical** — This is the sales pitch. | Core |
| Hash Chain Integrity — Merkle chain across history entries | 🔴 Not Started | **Critical** — Tamper-proof certificates. | Core |
| Rules Versioning — Semver for rule files with changelog | 🔴 Not Started | **High** — Required for "Rule Lineage." | Core |
| Static Analysis Gate — Semgrep/ESLint pre-review layer | 🔴 Not Started | **High** — Deterministic quality baseline. | Engine |

### Phase 2: "The Intelligence Layer" (Short-Term — Weeks 5-8)

| Feature | Status | Impact | Owner |
| :--- | :--- | :--- | :--- |
| Cost-Aware Routing — Complexity-based model selection | 🔴 Not Started | **High** — CFO-winning feature. | Engine |
| `arc dashboard` — CLI/Web summary of tasks, scores, costs | 🔴 Not Started | **High** — Team lead visibility. | UX |
| Enterprise RLHF — Learn from human correction diffs | 🔴 Not Started | **High** — Technical moat builder. | Engine |
| Team Rules Sync — Central rules repo with `arc pull-rules` | 🔴 Not Started | **High** — Multi-developer governance. | Core |

### Phase 3: "The Developer Intelligence Suite" (Medium-Term — Weeks 9-14)

| Feature | Status | Impact | Owner |
| :--- | :--- | :--- | :--- |
| `arc map` — Architecture graph via tree-sitter + Gemini + Mermaid | 🔴 Not Started | **High** — Visual codebase understanding. | Intelligence |
| `arc ask` — Local RAG with ChromaDB + Gemini Embeddings | 🔴 Not Started | **High** — Developer onboarding accelerator. | Intelligence |
| `arc index` — Semantic codebase indexing | 🔴 Not Started | **High** — Foundation for RAG queries. | Intelligence |
| Smart CI/CD Failure Analysis — Gemini-powered log diagnosis | 🔴 Not Started | **High** — Auto-fix suggestions on PR. | Intelligence |

### Phase 4: "The Pipeline Lock" (Medium-Term — Weeks 15-20)

| Feature | Status | Impact | Owner |
| :--- | :--- | :--- | :--- |
| **Pre-Commit Hook** — `arc hook install` for local governance | 🔴 Not Started | **Critical** — "Shift Left" prevention. | Integrations |
| **Git-Hook Integration** — Block commits without a nomos-arc signature | 🔴 Not Started | **Critical** — Makes nomos-arc mandatory. | Integrations |
| **CI/CD Plugin** — GitHub Actions / GitLab CI native integration | 🔴 Not Started | **Critical** — Embeds in the production pipeline. | Integrations |
| MCP Protocol Support — Jira, Figma, DB context connectors | 🔴 Not Started | **High** — Context Router positioning. | Integrations |
| Webhook Integration — Post results to Slack/GitHub PR | 🔴 Not Started | **Medium** — Enterprise workflow fit. | Integrations |

### Phase 5: "The Enterprise Package" (Long-Term — Weeks 21+)

| Feature | Status | Impact | Owner |
| :--- | :--- | :--- | :--- |
| RBAC — Role-based `arc apply` access (leads only) | 🔴 Not Started | **High** — Enterprise access control. | Platform |
| SSO/LDAP Integration — Enterprise identity management | 🔴 Not Started | **Medium** — Required for enterprise sales. | Platform |
| Compliance Report Generator — Auto-generate SOC2/ISO artifacts | 🔴 Not Started | **Critical** — Direct revenue driver. | Platform |
| Semantic Search for Rules — AI-powered rule matching at scale | 🔴 Not Started | **Medium** — Handles thousands of rule files. | Engine |

---

## 15. Conclusion: The Path Forward

nomos-arc.ai is not a coding tool — it is a **"Governance Operating System for AI-Assisted Development"** and a **"Developer Intelligence Platform"** — built entirely on open-source, zero-cost infrastructure.

nomos-arc.ai wins by being the **"Police"** in a world of **"Cowboys."** While the market focuses on speed, nomos-arc.ai provides the infrastructure that makes AI-assisted development **safe, auditable, and architecturally sound.**

The **Certificate of AI Engineering Integrity** is the north star for governance. The **Developer Intelligence Suite** (`arc map`, `arc ask`, Smart CI/CD) is the north star for adoption. Together, they make nomos-arc.ai **indispensable from both sides** — the engineer loves it, and the CTO mandates it.

### The Dual Strategy:

> **For Developers:** "nomos-arc makes me understand this codebase in 5 minutes instead of 5 days."
>
> **For Enterprises:** "nomos-arc proves that our AI-generated code meets compliance standards."

### The Strategic Priority:

> **Phase 4 (Pipeline Lock) is the endgame.** If nomos-arc.ai becomes embedded in the CI/CD pipeline — where no code reaches production without a nomos-arc signature — then the market is captured. But **Phase 3 (Developer Intelligence) is the adoption engine.** Developers adopt `arc map` and `arc ask` because they're useful. Once adopted, governance follows naturally.

The market will inevitably demand proof that AI-generated code was engineered responsibly. nomos-arc.ai is not just building a tool — it is building the **Trust Layer** that will be the prerequisite for enterprise AI adoption.

> **The company that owns the audit trail, owns the enterprise.**
>
> **The company that owns the pipeline, owns the audit trail.**
>
> **The company that owns developer adoption, owns the pipeline.**
@@ -0,0 +1,156 @@

# nomos-arc.ai Feature Assessment — Honest Reality Check

> **Assessment Date:** April 4, 2026
> **Assessed By:** Technical Architecture Review
> **Scope:** All features in `docs/dev_overview.md` (420 lines, 15 sections)

---

## Rating System

Each feature is rated on 3 axes (1-10):

| Axis | Meaning |
| :--- | :--- |
| **Feasibility** | How realistic is the technical implementation? (10 = trivial, 1 = moonshot) |
| **Impact** | How much does this move the needle for adoption/revenue? (10 = game-changer, 1 = nice-to-have) |
| **Current Progress** | How much is already built in the codebase? (10 = shipped, 1 = vaporware) |

**Seriousness Score** = `(Feasibility × 0.3) + (Impact × 0.4) + (Progress × 0.3)`
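
As a worked example, the weighting can be computed directly. This is a trivial sketch of the formula above; `seriousnessScore` is not part of the codebase:

```typescript
// Sketch of the Seriousness Score formula; not project code.
function seriousnessScore(feasibility: number, impact: number, progress: number): number {
  return feasibility * 0.3 + impact * 0.4 + progress * 0.3;
}

// Example: a feature rated (9, 9, 8) works out to 2.7 + 3.6 + 2.4 = 8.7.
console.log(seriousnessScore(9, 9, 8).toFixed(1)); // "8.7"
```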

---

## Part 1: Core Architecture (§1-§6) — Already Built ✅

These are the foundation. They're real — the code exists.

| # | Feature | Feasibility | Impact | Progress | Score | Verdict |
| :--- | :--- | :---: | :---: | :---: | :---: | :--- |
| §1 | **Orchestrator (Manager vs Agent)** | 9 | 9 | 8 | **8.7** | ✅ **Core is built.** `orchestrator.ts` (783 lines) with a full state machine. Real. |
| §2 | **Multi-Model Review (Independent Verification)** | 8 | 9 | 7 | **8.1** | ✅ **Working.** Planner → Reviewer pipeline exists. Binary configs support different models. |
| §3 | **Hard Gates (Enforcer Layer)** | 8 | 10 | 6 | **8.2** | ✅ **Partially built.** Rules injection works. Auto-rejection on score threshold works. But "physically blocked" is overstated — it's a score check, not a cryptographic gate. |
| §4 | **Context Optimization Engine** | 7 | 7 | 6 | **6.7** | ⚠️ **Basic version exists.** Import-graph scanning via `grep` + `readFirst50Lines`. Not true AST-level dependency resolution yet. |
| §5 | **Enterprise Audit / Black Box State** | 9 | 9 | 8 | **8.7** | ✅ **Strong.** State JSON with history, hashes, rules snapshots — all implemented in `state.ts`. |
| §6 | **Hybrid Privacy (Cloud + Local)** | 8 | 7 | 5 | **6.7** | ⚠️ **Architecturally supported** (binary configs for planner/reviewer). But no actual Ollama adapter has been tested. The claim is ahead of the code. |

### Part 1 Summary:
> **Overall: 7.8/10** — The foundation is genuinely strong. The Orchestrator + State Machine + Multi-Model Review is a real, working pipeline. The "overpromise" areas are Context Optimization (basic grep, not true AST) and Hybrid Privacy (no local LLM adapter exists yet).

---

## Part 2: Strategic Positioning (§7-§8) — Analysis, Not Features

| # | Feature | Feasibility | Impact | Progress | Score | Verdict |
| :--- | :--- | :---: | :---: | :---: | :---: | :--- |
| §7 | **Competitive Matrix** | N/A | 8 | N/A | **8.0** | ℹ️ **Accurate but optimistic.** The weaknesses listed for competitors are real. But the "complementary edge" claims need to be proven, not just stated. |
| §8 | **Strategic Assessment (Risks)** | N/A | 9 | N/A | **9.0** | ✅ **Honest and valuable.** The Absorption Risk callout is the most important item in the entire document. |

### ⚠️ Critical Risk Flag:
> **Absorption Risk is REAL.** Anthropic already added `CLAUDE.md` governance. GitHub Copilot added "Custom Instructions." The window for nomos-arc to establish itself as infrastructure is **12-18 months max.** Speed of execution matters more than feature count.

---

## Part 3: Certificate of Integrity (§9) — The Crown Jewel

| # | Feature | Feasibility | Impact | Progress | Score | Verdict |
| :--- | :--- | :---: | :---: | :---: | :---: | :--- |
| §9a | **Certificate Export (`arc certificate`)** | 9 | 10 | 2 | **7.3** | 🔥 **Highest-ROI feature not yet built.** The State JSON already has 90% of the data. This is ~2 weeks of work to package into an exportable artifact. MUST be Phase 1. |
| §9b | **SHA-256 Merkle Chain** | 7 | 8 | 3 | **6.2** | ⚠️ **Medium effort.** `output_hash` exists per entry. Chaining them is straightforward crypto. But true tamper-proofing requires the chain to be verified on read — not just computed on write. |
| §9c | **GPG/SSH Signature** | 6 | 6 | 0 | **4.2** | 🟡 **Nice-to-have.** Adds non-repudiation, but most enterprises won't use it immediately. Defer to Phase 3+. |
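
The chaining (and the verify-on-read requirement noted in §9b) can be sketched as follows. This is illustrative only: `output_hash` follows the document, but the entry shape and function names are hypothetical:

```typescript
import { createHash } from "node:crypto";

// Sketch of a SHA-256 hash chain over history entries, including the
// verify-on-read step §9b calls out. Hypothetical shape; only
// `output_hash` is known to exist in the real state file.
interface HistoryEntry {
  output_hash: string; // already stored per entry today
  chain_hash?: string; // links this entry to everything before it
}

function chainEntries(entries: HistoryEntry[]): HistoryEntry[] {
  let prev = "GENESIS";
  return entries.map((e) => {
    const chain_hash = createHash("sha256").update(prev + e.output_hash).digest("hex");
    prev = chain_hash;
    return { ...e, chain_hash };
  });
}

// Verify on read: recompute the chain and compare, so tampering with any
// earlier entry invalidates every later chain_hash.
function verifyChain(entries: HistoryEntry[]): boolean {
  let prev = "GENESIS";
  for (const e of entries) {
    const expected = createHash("sha256").update(prev + e.output_hash).digest("hex");
    if (e.chain_hash !== expected) return false;
    prev = expected;
  }
  return true;
}
```

Recomputing on read is what makes the chain tamper-evident: a certificate exporter would run `verifyChain` before emitting anything.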

### Part 3 Summary:
> **Certificate Export is the single highest-ROI item.** 90% of the infrastructure exists. 2 weeks of work → a tangible, demo-able product that a CTO can hold. **Do this first.**

---

## Part 4: Product Evolution (§11) — Mixed Feasibility

| # | Feature | Feasibility | Impact | Progress | Score | Verdict |
| :--- | :--- | :---: | :---: | :---: | :---: | :--- |
| §11.1 | **Hybrid Governance (Static Analysis Gate)** | 8 | 8 | 1 | **5.9** | ✅ **Highly feasible.** Shell out to `eslint`/`semgrep` before review. ~1 week. A good signal-to-noise improvement. |
| §11.2 | **MCP Protocol Integration** | 5 | 7 | 0 | **4.3** | 🔴 **Over-ambitious for now.** MCP is still evolving. Building connectors for Jira + Figma + DB means 3 separate integrations, each with auth complexity. Defer to Phase 4+. |
| §11.3 | **Enterprise RLHF (Learning from Corrections)** | 4 | 9 | 0 | **4.8** | 🔴 **Sounds amazing but is technically hard.** Auto-generating rules from diffs requires understanding *intent*, not just *difference*. An LLM can suggest rule updates, but auto-applying them is dangerous. Needs human-in-the-loop review of suggested rules. |
| §11.4 | **Cost-Aware Routing** | 7 | 7 | 2 | **5.5** | ⚠️ **Feasible but premature.** The `complexity` field is easy. But reliably auto-detecting complexity is an unsolved problem. Start with manual complexity tags, add auto-detection later. |
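
The §11.1 gate is essentially one subprocess call before the LLM review. A minimal sketch, assuming ESLint is available via `npx` in the target repo — the wiring and parameterization here are illustrative, not arc's actual implementation:

```typescript
import { spawnSync } from "node:child_process";

// Sketch of a deterministic pre-review gate: run a linter on the changed
// files and skip the (costly) LLM review when lint already fails. The
// command is parameterized so Semgrep or another tool could slot in.
function staticAnalysisGate(
  files: string[],
  cmd: string = "npx",
  toolArgs: string[] = ["eslint"],
): { passed: boolean; output: string } {
  const result = spawnSync(cmd, [...toolArgs, ...files], { encoding: "utf8" });
  return {
    passed: result.status === 0, // linters exit non-zero on findings
    output: (result.stdout ?? "") + (result.stderr ?? ""),
  };
}
```

A failing gate would short-circuit before any model call, which is where the signal-to-noise and cost savings come from.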

### Part 4 Summary:
> **Static Analysis Gate = quick win. MCP and RLHF = too early.** Cost-Aware Routing is useful, but only with manual complexity tags first. Don't over-invest in auto-magic yet.

---

## Part 5: Developer Intelligence Suite (§12) — The Growth Engine

| # | Feature | Feasibility | Impact | Progress | Score | Verdict |
| :--- | :--- | :---: | :---: | :---: | :---: | :--- |
| §12.1 | **Smart CI/CD Failure Analysis** | 8 | 7 | 0 | **5.2** | ✅ **Very feasible.** Send logs to Gemini, post a comment on the PR. A GitHub Action wrapper is ~3 days of work. A good open-source adoption driver. |
| §12.2 | **Architecture Map (`arc map`)** | 6 | 8 | 0 | **5.0** | ⚠️ **Medium difficulty.** tree-sitter integration per language is non-trivial (TypeScript ≠ Python ≠ Go). Start with TypeScript/JavaScript only. Mermaid output is easy once you have the graph. |
| §12.3 | **Semantic Code Understanding (`arc ask`)** | 5 | 8 | 0 | **4.7** | ⚠️ **Conceptually strong, technically involved.** ChromaDB + chunking + embeddings is a full RAG pipeline. It works well for medium projects (< 500 files); quality degrades on large monorepos. Expect 3-4 weeks to get it right. |
| §12.4 | **Pre-Commit Hook (`arc hook`)** | 8 | 8 | 0 | **5.6** | ✅ **High feasibility, high impact.** A lightweight pre-commit that runs a rules check + quick score is very doable. The Architecture Guard (checking against the map) requires `arc map` to be built first. Start with a rules-only hook. |

### Part 5 Summary:
> **CI/CD Failure Analysis and the Pre-Commit Hook are quick wins.** `arc map` and `arc ask` are serious engineering efforts (3-6 weeks each). Start with TypeScript-only support and a small RAG scope.

---

## Part 6: Enterprise Features (§14 Phase 5) — Future Territory

| # | Feature | Feasibility | Impact | Progress | Score | Verdict |
| :--- | :--- | :---: | :---: | :---: | :---: | :--- |
| | **RBAC** | 6 | 7 | 0 | **4.6** | 🟡 Requires a user/role system that doesn't exist yet. Medium effort. |
| | **SSO/LDAP** | 4 | 5 | 0 | **3.2** | 🔴 Heavy enterprise plumbing. Only build it when a paying enterprise customer asks for it. |
| | **Compliance Report Generator** | 7 | 9 | 0 | **5.7** | ⚠️ Depends entirely on the Certificate being built first. High impact once the data exists. |
| | **Semantic Search for Rules** | 6 | 5 | 0 | **3.8** | 🟡 Only needed once organizations have 100+ rule files. Premature optimization for now. |

---

## Overall Document Assessment

### The Good ✅
- The **core architecture is real** (Orchestrator, State Machine, Multi-Model). This isn't vaporware.
- The **Certificate** concept is genuinely unique and the most defensible competitive advantage.
- The **risk analysis** (§8) is honest and shows strategic maturity.
- The **Developer Intelligence Suite** is the right adoption strategy — developers adopt tools they love, then governance follows.

### The Concerning ⚠️
- **Feature-count inflation.** The document lists **22+ features** across 5 phases. For a solo/small team, this is a **3+ year roadmap** presented as if it were a few months of work.
- **The "weeks" timeline is unrealistic.** Phases 1-5 spanning "Weeks 1-21+" assume full-time focus and no tech debt. Realistically, Phase 1 alone is 6-8 weeks with testing and polish.
- **MCP and Enterprise RLHF are over-scoped.** Each is a standalone product. Including them in the roadmap creates a false sense of progress.
- **Gemini Free Tier dependency.** The document positions the Free Tier as a permanent solution. Google's free tiers change frequently. This is a fragile foundation for "zero cost" claims.

### The Risky 🔴
- **Absorption Risk is existential.** If Anthropic ships `claude review --rules` or GitHub adds a "Compliance Mode" to Copilot, the window closes fast. Every month of delay on CI/CD integration increases this risk.
- **Open Source + Zero Revenue.** The document doesn't address monetization. "Free forever" is a community strategy, not a business strategy. Without revenue, there's no team to build Phases 3-5.

---

## Recommended Priority Stack (Honest Version)

If I had to pick **5 features** to build in the next **90 days** to maximize viability:

| Priority | Feature | Time Estimate | Why |
| :---: | :--- | :--- | :--- |
| 🥇 | **`arc certificate`** (Export + Hash Chain) | 2-3 weeks | This IS the product. Everything else supports this. |
| 🥈 | **Static Analysis Gate** (ESLint/Semgrep pre-review) | 1 week | Quick win, massive credibility boost. |
| 🥉 | **Pre-Commit Hook** (`arc hook install`, rules-only) | 1 week | "Shift Left" is immediately useful. No LLM needed. |
| 4 | **Smart CI/CD GitHub Action** (Gemini failure analysis) | 1-2 weeks | Open-source adoption magnet. Easy to share. |
| 5 | **`arc map`** (TypeScript-only, tree-sitter + Mermaid) | 3-4 weeks | Visual wow factor. Good for demos and READMEs. |

**Total: ~8-11 weeks of focused work.**

Everything else (MCP, RLHF, RBAC, SSO, `arc ask`) should be **deferred** until these 5 are shipped and validated.

---

## Final Seriousness Verdict

| Dimension | Score | Notes |
| :--- | :---: | :--- |
| **Vision Quality** | 9.5/10 | Exceptional strategic thinking. The "Manager vs Agent" framing is perfect. |
| **Technical Foundation** | 8/10 | Real code, real architecture. Not vaporware. |
| **Feature Realism** | 5/10 | Too many features, too-optimistic timelines. Needs ruthless prioritization. |
| **Market Timing** | 8/10 | The governance gap is real and urgent. But the window is 12-18 months. |
| **Execution Risk** | 6/10 | Solo/small team vs. a massive roadmap. Ship the Certificate NOW. |

### **Overall: 7.3/10 — Strong vision, needs focus.**

> The document reads like a **Series A pitch deck** — inspiring, comprehensive, strategically sound. But the gap between "what's documented" and "what's shipped" is large. The #1 risk is not technical — it's **scope creep**. The project will succeed or fail based on whether the team ships the Certificate + CI/CD integration in the next 90 days, not on whether the roadmap has 22 features.

> **Summary:** The vision is excellent (9.5/10) and the existing code is strong (8/10), but there are far too many features and the timelines are highly optimistic (5/10). Focus on just 5 things in the first 90 days (Certificate + Static Analysis + Hook + CI/CD Action + `arc map`) and the project will be in a very strong competitive position. Try to build all 22 features, and nothing will get finished.