npm - @miller-tech/uap - Versions diffs - 1.40.0 → 1.41.0 - Mend

@miller-tech/uap 1.40.0 → 1.41.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (150) hide show

package/README.md +109 -642
package/dist/.tsbuildinfo +1 -1
package/dist/cli/deliver-defaults.d.ts +23 -0
package/dist/cli/deliver-defaults.d.ts.map +1 -0
package/dist/cli/deliver-defaults.js +121 -0
package/dist/cli/deliver-defaults.js.map +1 -0
package/dist/cli/init.d.ts.map +1 -1
package/dist/cli/init.js +29 -0
package/dist/cli/init.js.map +1 -1
package/dist/cli/setup.d.ts.map +1 -1
package/dist/cli/setup.js +19 -0
package/dist/cli/setup.js.map +1 -1
package/dist/policies/policy-tools.d.ts +7 -0
package/dist/policies/policy-tools.d.ts.map +1 -1
package/dist/policies/policy-tools.js +24 -2
package/dist/policies/policy-tools.js.map +1 -1
package/docs/INDEX.md +48 -286
package/docs/architecture/OVERVIEW.md +328 -0
package/docs/architecture/PROTOCOL.md +204 -0
package/docs/benchmarks/README.md +17 -192
package/docs/getting-started/CONFIGURATION.md +237 -0
package/docs/getting-started/INSTALLATION.md +125 -0
package/docs/getting-started/QUICKSTART.md +115 -0
package/docs/guides/COORDINATION.md +162 -0
package/docs/guides/DELIVER.md +115 -0
package/docs/guides/DEPLOY_BATCHING.md +212 -0
package/docs/guides/DROIDS_AND_SKILLS.md +202 -0
package/docs/guides/LOCAL_MODELS.md +148 -0
package/docs/guides/MCP_ROUTER.md +195 -0
package/docs/guides/MEMORY.md +235 -0
package/docs/guides/MULTI_MODEL.md +223 -0
package/docs/guides/POLICIES.md +190 -0
package/docs/guides/WORKTREE_WORKFLOW.md +185 -0
package/docs/integrations/MCP_ROUTER.md +147 -0
package/docs/integrations/RTK.md +102 -0
package/docs/reference/API.md +485 -0
package/docs/reference/CLI.md +719 -0
package/docs/reference/CONFIGURATION.md +90 -193
package/docs/reference/DATABASE_SCHEMA.md +110 -344
package/docs/reference/FEATURES.md +176 -472
package/docs/reference/PATTERNS.md +102 -0
package/docs/reference/PLATFORMS.md +83 -0
package/package.json +3 -1
package/src/policies/enforcers/7ebbc721-7540-4e9f-879a-770e0213a09b_architecture_review.py +101 -0
package/src/policies/enforcers/__pycache__/_common.cpython-312.pyc +0 -0
package/src/policies/enforcers/_common.py +100 -0
package/src/policies/enforcers/artifact_hygiene.py +52 -0
package/src/policies/enforcers/cluster_routing.py +63 -0
package/src/policies/enforcers/codebase_read_before_plan.py +52 -0
package/src/policies/enforcers/coord_overlap.py +81 -0
package/src/policies/enforcers/delivery_enforcement.py +97 -0
package/src/policies/enforcers/doc_live_over_report.py +50 -0
package/src/policies/enforcers/expert_review_required.py +135 -0
package/src/policies/enforcers/iac_parity.py +53 -0
package/src/policies/enforcers/mcp_router_first.py +37 -0
package/src/policies/enforcers/memory_before_plan.py +61 -0
package/src/policies/enforcers/parallel_reads.py +50 -0
package/src/policies/enforcers/rtk_wrap.py +44 -0
package/src/policies/enforcers/schema_diff_gate.py +80 -0
package/src/policies/enforcers/session_memory_write.py +52 -0
package/src/policies/enforcers/task_required.py +131 -0
package/src/policies/enforcers/test_gate.py +58 -0
package/src/policies/enforcers/validate_plan_before_build.py +75 -0
package/src/policies/enforcers/worktree_required.py +57 -0
package/src/policies/schemas/policies/architecture-review.md +51 -0
package/src/policies/schemas/policies/artifact-hygiene.md +29 -0
package/src/policies/schemas/policies/cluster-routing.md +31 -0
package/src/policies/schemas/policies/codebase-read-before-plan.md +30 -0
package/src/policies/schemas/policies/coord-overlap.md +24 -0
package/src/policies/schemas/policies/delivery-enforcement.md +45 -0
package/src/policies/schemas/policies/doc-live-over-report.md +32 -0
package/src/policies/schemas/policies/expert-review-required.md +60 -0
package/src/policies/schemas/policies/iac-parity.md +31 -0
package/src/policies/schemas/policies/mandatory-testing-deployment.md +147 -0
package/src/policies/schemas/policies/mcp-router-first.md +24 -0
package/src/policies/schemas/policies/memory-before-plan.md +24 -0
package/src/policies/schemas/policies/merge-deploy-monitor-verify.md +145 -0
package/src/policies/schemas/policies/parallel-reads.md +24 -0
package/src/policies/schemas/policies/rtk-wrap.md +26 -0
package/src/policies/schemas/policies/schema-diff-gate.md +30 -0
package/src/policies/schemas/policies/session-memory-write.md +24 -0
package/src/policies/schemas/policies/task-required.md +49 -0
package/src/policies/schemas/policies/test-gate.md +24 -0
package/src/policies/schemas/policies/validate-plan-before-build.md +28 -0
package/src/policies/schemas/policies/worktree-required.md +28 -0
package/templates/hooks/uap-policy-gate.sh +5 -0
package/docs/AGENTS.md +0 -423
package/docs/DOCUMENTATION_AUDIT_REPORT.md +0 -131
package/docs/GETTING_STARTED.md +0 -288
package/docs/PROJECT_ANALYSIS_REPORT.md +0 -510
package/docs/architecture/COMPLETE_ARCHITECTURE.md +0 -748
package/docs/architecture/EXPERT_STACK.md +0 -137
package/docs/architecture/MULTI_MODEL.md +0 -224
package/docs/architecture/PLATFORM_GATING.md +0 -68
package/docs/architecture/SYSTEM_ANALYSIS.md +0 -334
package/docs/architecture/UAP_COMPLIANCE.md +0 -217
package/docs/architecture/UAP_PROTOCOL.md +0 -339
package/docs/architecture/UAP_STRICT_DROIDS.md +0 -172
package/docs/archive/BALLS_MODE_SELF_ANALYSIS.md +0 -260
package/docs/archive/BENCHMARK_GAPS_AND_PLAN.md +0 -146
package/docs/archive/FAILING_TASKS_SOLUTION_PLAN.md +0 -668
package/docs/archive/JINJA2-SYSTEM-MESSAGE-FIX.md +0 -209
package/docs/archive/MODEL_ROUTING_IMPLEMENTATION_SUMMARY.md +0 -281
package/docs/archive/MODEL_ROUTING_OPTIMIZATION_PLAN.md +0 -320
package/docs/archive/NPM-PUBLISH-V0.9.1.md +0 -240
package/docs/archive/OPTIMIZATION_OPTIONS.md +0 -334
package/docs/archive/PARALLELISM_GAPS_AND_OPTIONS.md +0 -422
package/docs/archive/POLICY_GATE_IMPLEMENTATION.md +0 -245
package/docs/archive/SETUP_IMPROVEMENTS.md +0 -213
package/docs/archive/UAP_GENERIC_OPTIMIZATION_PLAN.md +0 -270
package/docs/archive/UAP_OPTIMIZATION_PLAN.md +0 -701
package/docs/archive/UAP_V103_PATTERN_DESIGN.md +0 -315
package/docs/archive/UAP_V104_COMPLIANCE_DESIGN.md +0 -223
package/docs/archive/changelog/2026-03-10_uap-100-compliance.md +0 -77
package/docs/archive/changelog/2026-03-10_uap-full-system-verification.md +0 -109
package/docs/archive/opencode-integration-guide.md +0 -740
package/docs/archive/opencode-integration-quickref.md +0 -180
package/docs/benchmarks/OVERNIGHT_RUNNER.md +0 -341
package/docs/benchmarks/SPECULATIVE_DECODING_JOURNEY_2026-03.md +0 -221
package/docs/benchmarks/VALIDATION_PLAN.md +0 -568
package/docs/blog/SPECULATIVE_DECODING_PRODUCTION_PLAYBOOK.md +0 -139
package/docs/blog/local-coding-agents.md +0 -266
package/docs/blog/x-thread.md +0 -254
package/docs/deployment/DEPLOYMENT.md +0 -895
package/docs/deployment/DEPLOYMENT_STRATEGIES.md +0 -518
package/docs/deployment/DEPLOY_BATCHER_ANALYSIS.md +0 -224
package/docs/deployment/DEPLOY_BATCHING.md +0 -273
package/docs/deployment/DEPLOY_BUCKETING_ANALYSIS.md +0 -420
package/docs/deployment/QWEN35_LLAMA_CPP.md +0 -426
package/docs/deployment/UAP_LLAMA_ANTHROPIC_PROXY_BOOTSTRAP.md +0 -279
package/docs/getting-started/INTEGRATION.md +0 -628
package/docs/getting-started/OVERVIEW.md +0 -324
package/docs/getting-started/SETUP.md +0 -377
package/docs/integrations/MCP_ROUTER_SETUP.md +0 -445
package/docs/integrations/RTK_INTEGRATION.md +0 -468
package/docs/operations/TROUBLESHOOTING.md +0 -660
package/docs/pr/PR_SPECULATIVE_DOCS_TEMPLATE.md +0 -146
package/docs/pr/UPSTREAM_PRS.md +0 -424
package/docs/reference/API_REFERENCE.md +0 -903
package/docs/reference/EXPERT_DROIDS.md +0 -219
package/docs/reference/HARNESS-MATRIX.md +0 -318
package/docs/reference/PATTERN_LIBRARY.md +0 -636
package/docs/reference/UAP_CLI_REFERENCE.md +0 -620
package/docs/research/BEHAVIORAL_PATTERNS.md +0 -228
package/docs/research/DOMAIN_STRATEGIES.md +0 -316
package/docs/research/MEMORY_SYSTEMS_COMPARISON.md +0 -812
package/docs/research/PATTERN_ANALYSIS_2026-01-18.md +0 -436
package/docs/research/PERFORMANCE_ANALYSIS_2026-01-18.md +0 -209
package/docs/research/PERFORMANCE_TEST_PLAN.md +0 -383
package/docs/research/TERMINAL_BENCH_LEARNINGS.md +0 -217

package/docs/archive/UAP_OPTIMIZATION_PLAN.md DELETED Viewed

@@ -1,701 +0,0 @@
-# UAP Optimization & Dashboard Overlay Plan (Validated)
-> Validated against codebase on 2026-03-17. All references point to real files, types, and services.
-> **Updated**: 2026-03-19 after implementing all critical fixes from validated plan.
-## Validation Summary
-### What exists today
-- **Policy system**: Full CRUD + enforcement gate + audit trail (`src/policies/`), SQLite-backed, 3 enforcement levels (REQUIRED/RECOMMENDED/OPTIONAL), `togglePolicy()` already on `PolicyMemoryManager`
-- **Memory system**: 4-tier (L1-L4), 26 files in `src/memory/`, adaptive context, dynamic retrieval, predictive pre-fetch
-- **Model router**: Rule-based (`src/models/router.ts`) + benchmark-data (`src/memory/model-router.ts`) + unified consensus (`src/models/unified-router.ts`), execution profiles per model family (`src/models/execution-profiles.ts`)
-- **Dashboard**: 1830-line terminal dashboard (`src/cli/dashboard.ts`) with 8 views, 424-line viz library (`src/cli/visualize.ts`), session telemetry (`src/telemetry/session-telemetry.ts`)
-- **No web dashboard exists**. All visualization is chalk-based terminal output.
-- **No enforcement stage concept exists** on policies. Policies have `level` (REQUIRED/RECOMMENDED/OPTIONAL) and `isActive` (boolean) but no stage gating.
-### What the original plan got wrong
-1. Proposed `ink`/`blessed` TUI -- unnecessary. The existing chalk-based dashboard + visualize.ts primitives already work and are battle-tested. New panels should extend the existing system.
-2. Proposed web React dashboard from scratch -- premature. Option 3 (embedded) is correct: extend the existing `uap dashboard` CLI with new panels first, add a lightweight HTTP/WebSocket server later.
-3. Missed that `PolicyMemoryManager.togglePolicy()` already exists at `src/policies/policy-memory.ts:91`.
-4. Missed that `unified-router.ts` model maps need updating for opus-4.6 and qwen35.
-5. Missed that the session dashboard already shows policies but only as hardcoded bullet items (`dashboard.ts:1305-1317`), not from the database.
-### Codebase gaps identified (2026-03-17 validated plan)
-**CRITICAL - Fixed in Phase 1 (2026-03-19):**
-1. ✅ `src/memory/adaptive-context.ts:829` -- `validModelIds` array missing `opus-4.6`, breaking adaptive learning feedback loop for default models
-2. ✅ `src/models/router.ts` -- `routingMatrix` config defined but never consumed in `selectModel()` method
-**PERFORMANCE - Fixed in Phase 2-4 (2026-03-19):** 3. ✅ `src/tasks/service.ts:566` -- N+1 query pattern in `batchGetWithRelations()` calling `this.get(task.parentId)` for each task 4. ✅ `src/dashboard/data-service.ts:242` -- Dashboard opens/closes SQLite DB on every refresh (every 2 seconds) 5. ✅ `src/memory/dynamic-retrieval.ts:704,763` -- Synchronous file reads on every retrieval call for rarely-changing files 6. ✅ `src/memory/adaptive-context.ts:74` -- Over-provisioned DB pool (5 connections for sync driver) 7. ✅ `src/memory/model-router.ts:203` -- Over-provisioned DB pool (5 connections for sync driver)
-**RESOURCE LEAKS - Fixed in Phase 3 (2026-03-19):** 8. ✅ `src/utils/adaptive-cache.ts:160` -- setInterval already has `.unref()` (already fixed) 9. ✅ `src/models/analytics.ts:294-299` -- process.on('exit') handler exists (already fixed) 10. ✅ `src/policies/policy-gate.ts:493-499` -- process.on('exit') handler exists (already fixed)
-**ALREADY FIXED:**
-- Costs array capping in `src/telemetry/session-telemetry.ts:711-714`
-- Constant hoisting in `src/memory/adaptive-context.ts:672`
-- File read caching with mtime invalidation (new in Phase 4)
----
-## Part 1: Model Optimization
-### A. Immediate Fixes (Prerequisite)
-#### 1.1 Update unified-router model maps
-**File**: `src/models/unified-router.ts:35-49`
-```typescript
-const BENCHMARK_TO_RULE_MODEL_MAP: Record<string, string> = {
-  'claude-opus-4.5': 'opus-4.5',
-  'claude-opus-4.6': 'opus-4.6', // ADD
-  'gpt-5.2': 'gpt-5.2',
-  'glm-4.7': 'glm-4.7',
-  'gpt-5.2-codex': 'gpt-5.2',
-  qwen35: 'qwen35', // ADD
-};
-const RULE_TO_BENCHMARK_MODEL_MAP: Record<string, ModelId> = {
-  'opus-4.5': 'claude-opus-4.5',
-  'opus-4.6': 'claude-opus-4.6', // ADD
-  'gpt-5.2': 'gpt-5.2',
-  'glm-4.7': 'glm-4.7',
-  'deepseek-v3.2': 'gpt-5.2',
-  'deepseek-v3.2-exp': 'gpt-5.2',
-  'qwen35-a3b': 'glm-4.7',
-  qwen35: 'qwen35', // ADD
-};
-```
-#### 1.2 Update CLI model defaults
-**File**: `src/cli/model.ts:44-59`
-Change `getMultiModelConfig()` fallback to use `ModelRouter.getDefaultUAPConfig()` instead of hardcoded opus-4.5 defaults.
-#### 1.3 Add benchmark fingerprint for opus-4.6 and qwen35
-**File**: `src/memory/model-router.ts`
-Add `MODEL_FINGERPRINTS` entries for `claude-opus-4.6` and `qwen35` so the benchmark-data router can track them.
-### B. Qwen 3.5 Optimizations
-#### 2.1 Dynamic quantization switching
-**Where**: New function in `src/models/execution-profiles.ts`
-The `SMALL_MOE_PROFILE` already covers qwen3.5 correctly. Enhancement: add a `quantizationHint` field to `ExecutionProfile` so the llama.cpp server can be told which quant to load.
-```typescript
-// Add to ExecutionProfile interface
-quantizationHint?: {
-  low: string;    // e.g. 'iq2_xs' for simple tasks
-  medium: string; // e.g. 'iq4_xs' for standard tasks
-  high: string;   // e.g. 'q5_k_m' for complex tasks
-};
-```
-The router already classifies complexity. Wire the quant hint into the `ModelSelection` result so the agent runner can pass it to the llama.cpp endpoint.
-#### 2.2 Context window management
-**Where**: Extend `src/memory/context-compressor.ts` and `src/memory/adaptive-context.ts`
-These already exist and handle token budgets. Enhancement:
-- Add a `modelContextBudget` field to `ModelConfig` in `src/models/types.ts` (distinct from `maxContextTokens`) representing the _effective_ context the model handles well
-- For qwen35: `maxContextTokens: 262144` but `modelContextBudget: 32768` (sweet spot for 3B active params)
-- `AdaptiveContext` already selects context level by task type -- wire it to respect `modelContextBudget`
-#### 2.3 Prompt token budget tracking
-**Where**: `src/memory/context-compressor.ts` already has `SemanticCompressor` with entropy-aware compression
-Enhancement: expose a per-session token counter that the dashboard can read. Add to `globalSessionStats` in `src/mcp-router/session-stats.ts`:
-```typescript
-// Already exists: totalContextBytes, totalRawBytes, savingsRatio
-// Add:
-modelTokenBudget: number; // from modelContextBudget
-modelTokensConsumed: number; // running total
-compressionEvents: number; // how many times compressor fired
-```
-### C. Multi-Model Routing Enhancements
-#### 3.1 Complexity-based routing matrix
-**Where**: `src/models/router.ts` -- `selectAdaptiveModel()` already implements this logic
-Current behavior (validated):
-- `critical`/`high` -> planner (opus-4.6)
-- `medium` -> executor (qwen35)
-- `low` -> cheapest model (qwen35, $0/1M)
-This is correct. No change needed for the matrix itself.
-Enhancement: add a `routingMatrix` config option to `MultiModelConfig` so users can override per-complexity routing without editing code:
-```typescript
-// Add to MultiModelConfig in src/models/types.ts
-routingMatrix?: Record<TaskComplexity, { planner: string; executor: string }>;
-```
-#### 3.2 Performance analytics module
-**Where**: New file `src/models/analytics.ts`
-```typescript
-export interface TaskOutcome {
-  modelId: string;
-  taskType: string;
-  complexity: TaskComplexity;
-  success: boolean;
-  durationMs: number;
-  tokensUsed: { input: number; output: number };
-  cost: number;
-  timestamp: string;
-}
-export class ModelAnalytics {
-  private db: Database; // SQLite, same pattern as other DBs
-  recordOutcome(outcome: TaskOutcome): void;
-  getSuccessRate(modelId: string, taskType?: string): number;
-  getAvgLatency(modelId: string, taskType?: string): number;
-  getOptimalRouting(): Record<string, string>; // taskType -> modelId
-  getCostBreakdown(since?: Date): CostBreakdown[];
-}
-```
-This feeds into the dashboard cost tracker panel.
----
-## Part 2: Dashboard -- Policies / Memories / Model Active Panels
-### A. Architecture Decision
-**Extend the existing `src/cli/dashboard.ts`** with new panels and a new `uap dashboard policies` view. Do NOT build a separate TUI framework. The existing chalk + visualize.ts primitives cover everything needed.
-For the web overlay (Phase 3), add a thin HTTP + WebSocket server that serves JSON from the same data sources the CLI dashboard reads. A single-page HTML file (like the existing `web/generator.html` pattern) consumes it.
-### B. New Dashboard Panels
-#### Panel 1: Policies Active
-**CLI command**: `uap dashboard policies`
-Reads from `PolicyMemoryManager.getAllPolicies()` and `PolicyGate.getAuditTrail()`.
-```
-  UAP Policies Dashboard
-  ──────────────────────────────────────────────────────
-  Active Policies (3)
-  ──────────────────────────────────────────────────────
-  Name                      Level        Category   Stage      Status
-  ─────────────────────────────────────────────────────────────────────
-  IaC State Parity          REQUIRED     code       pre-exec   ON
-  Mandatory File Backup     REQUIRED     code       pre-exec   ON
-  Image Asset Verification  RECOMMENDED  image      pre-exec   ON
-  Enforcement Stages
-  ──────────────────────────────────────────────────────
-  pre-exec   ████████████████████ 3 policies
-  post-exec  ░░░░░░░░░░░░░░░░░░░ 0 policies
-  review     ░░░░░░░░░░░░░░░░░░░ 0 policies
-  Recent Audit Trail (last 10)
-  ──────────────────────────────────────────────────────
-  2026-03-17 14:23  IaC State Parity       web_browser  ALLOWED
-  2026-03-17 14:22  Mandatory File Backup  file_write   ALLOWED
-  2026-03-17 14:20  IaC State Parity       terraform    BLOCKED  "No state file"
-  Toggle: uap policy toggle <id> --off
-  Stage:  uap policy stage <id> --stage post-exec
-  Level:  uap policy level <id> --level OPTIONAL
-```
-#### Panel 2: Memories Active
-**CLI command**: `uap dashboard memories`
-Extends the existing memory section in `showSessionDashboard()` (`dashboard.ts:1217-1247`).
-```
-  UAP Memories Dashboard
-  ──────────────────────────────────────────────────────
-  Memory Tiers
-  ──────────────────────────────────────────────────────
-  L1 Working    ████████░░░░░░░░░░░░  42/50 entries   12 KB
-  L2 Session    ██░░░░░░░░░░░░░░░░░░  8 entries       3 KB
-  L3 Semantic   Qdrant: Running (Up 4h 23m)  1,247 vectors
-  L4 Knowledge  23 entities  47 relationships
-  Active Memories This Session (by type)
-  ──────────────────────────────────────────────────────
-  decision     ████████████  12
-  observation  ████████      8
-  pattern      ██████        6
-  correction   ██            2
-  Open Loops (3)
-  ──────────────────────────────────────────────────────
-  > TODO: wire dashboard WebSocket to session-stats
-  > BLOCKED: Qdrant cloud migration pending API key
-  > REVIEW: memory consolidation threshold too aggressive
-  Compression Stats
-  ──────────────────────────────────────────────────────
-  Token budget:    32,768 / 262,144 (12.5%)
-  Compressions:    4 this session
-  Savings ratio:   73.2%
-```
-#### Panel 3: Model Active Per Task
-**CLI command**: `uap dashboard models`
-Reads from `ModelRouter`, `UnifiedRoutingService`, and the new `ModelAnalytics`.
-```
-  UAP Model Dashboard
-  ──────────────────────────────────────────────────────
-  Active Configuration
-  ──────────────────────────────────────────────────────
-  Planner:   opus-4.6    Claude Opus 4.6    $7.50/$37.50 per 1M
-  Executor:  qwen35      Qwen 3.5 (local)   $0.00/$0.00
-  Reviewer:  opus-4.6    Claude Opus 4.6
-  Fallback:  qwen35      Qwen 3.5 (local)
-  Strategy:  balanced
-  Routing Matrix
-  ──────────────────────────────────────────────────────
-  Complexity   Planner     Executor
-  low          qwen35      qwen35       $0.00
-  medium       opus-4.6    qwen35       $0.04
-  high         opus-4.6    opus-4.6     $0.22
-  critical     opus-4.6    opus-4.6     $0.22
-  Session Usage
-  ──────────────────────────────────────────────────────
-  Model        Tasks  Tokens In  Tokens Out  Cost     Success
-  opus-4.6     3      4,521      2,103       $0.11    100%
-  qwen35       12     18,432     9,876       $0.00    91.7%
-  Execution Profile: small-moe (Qwen 3.5)
-  ──────────────────────────────────────────────────────
-  domainHints: ON   webSearch: OFF   reflectionCheckpoints: OFF
-  temperature: 0.15  loopEscapeThreshold: 3  toolChoiceForce: required
-  softBudget: 35    hardBudget: 50
-  Unified Router Consensus
-  ──────────────────────────────────────────────────────
-  Last 10 decisions: 8 consensus, 1 rule-based, 1 benchmark-data
-  Avg confidence: 0.82
-```
-### C. Policy Enforcement Stages & Toggling
-#### Schema Changes
-**File**: `src/policies/schemas/policy.ts`
-Add `enforcementStage` to the policy schema:
-```typescript
-export const PolicySchema = z.object({
-  id: z.string().uuid(),
-  name: z.string(),
-  category: z.enum(['image', 'code', 'security', 'testing', 'ui', 'automation', 'custom']),
-  level: z.enum(['REQUIRED', 'RECOMMENDED', 'OPTIONAL']),
-  enforcementStage: z.enum(['pre-exec', 'post-exec', 'review', 'always']).default('pre-exec'), // NEW
-  rawMarkdown: z.string(),
-  convertedFormat: z.string().optional(),
-  executableTools: z.array(z.string()).optional(),
-  tags: z.array(z.string()),
-  createdAt: z
-    .string()
-    .refine((d) => !Number.isNaN(Date.parse(d)), { message: 'Invalid ISO date string' }),
-  updatedAt: z
-    .string()
-    .refine((d) => !Number.isNaN(Date.parse(d)), { message: 'Invalid ISO date string' }),
-  version: z.number(),
-  isActive: z.boolean(),
-  priority: z.number().default(50),
-});
-```
-**File**: `src/policies/database-manager.ts`
-Add column to `policies` table:
-```sql
-ALTER TABLE policies ADD COLUMN enforcementStage TEXT NOT NULL DEFAULT 'pre-exec';
-```
-Use a migration check pattern (check if column exists before adding).
-#### PolicyGate Changes
-**File**: `src/policies/policy-gate.ts`
-Add stage-aware enforcement:
-```typescript
-async executeWithGates<T>(
-  operation: string,
-  args: Record<string, unknown>,
-  executor: () => Promise<T>,
-  stage: 'pre-exec' | 'post-exec' | 'review' = 'pre-exec'  // NEW param
-): Promise<T> {
-  const gateResult = await this.checkPolicies(operation, args, stage);
-  // ... existing logic, but only check policies matching this stage
-}
-async checkPolicies(
-  operation: string,
-  args: Record<string, unknown>,
-  stage: 'pre-exec' | 'post-exec' | 'review' | 'always' = 'pre-exec'
-): Promise<GateResult> {
-  const allPolicies = await this.memory.getAllPolicies();
-  // Filter to policies matching this stage or 'always'
-  const stagePolicies = allPolicies.filter(
-    p => p.enforcementStage === stage || p.enforcementStage === 'always'
-  );
-  // ... evaluate only stagePolicies
-}
-```
-#### CLI Commands for Toggling
-**File**: `src/bin/policy.ts` (extend existing)
-```
-uap policy toggle <id> [--on|--off]     # Uses existing PolicyMemoryManager.togglePolicy()
-uap policy stage <id> --stage <stage>    # New: change enforcement stage
-uap policy level <id> --level <level>    # New: change REQUIRED/RECOMMENDED/OPTIONAL
-uap policy list                          # New: list all with status/stage/level
-uap policy audit [--policy-id <id>]      # New: show audit trail
-```
-Implementation: `togglePolicy()` already exists. Add `setEnforcementStage()` and `setLevel()` to `PolicyMemoryManager`:
-```typescript
-// src/policies/policy-memory.ts
-async setEnforcementStage(id: string, stage: 'pre-exec' | 'post-exec' | 'review' | 'always'): Promise<void> {
-  this.db.updatePolicy({ id }, { enforcementStage: stage, updatedAt: new Date().toISOString() });
-}
-async setLevel(id: string, level: 'REQUIRED' | 'RECOMMENDED' | 'OPTIONAL'): Promise<void> {
-  this.db.updatePolicy({ id }, { level, updatedAt: new Date().toISOString() });
-}
-```
-### D. Grouping: Per-Task vs Grouped Display
-The dashboard supports both views:
-1. **Grouped view** (default for `uap dashboard policies`, `uap dashboard models`): Shows aggregate state -- all active policies, all model assignments, memory tier health.
-2. **Per-task view** (when a task ID is provided): Shows what was active _for that specific task_.
-```
-uap dashboard policies                    # Grouped: all policies, stages, audit
-uap dashboard policies --task <task-id>   # Per-task: which policies fired for this task
-uap dashboard models                      # Grouped: all model assignments, session totals
-uap dashboard models --task <task-id>     # Per-task: which model handled this task, tokens, cost
-uap dashboard memories --task <task-id>   # Per-task: memories retrieved/stored for this task
-```
-Per-task view requires linking `policy_executions` and `ModelAnalytics.TaskOutcome` to task IDs. Add a `taskId` column to both:
-- `policy_executions` table: `taskId TEXT` (nullable, for backward compat)
-- `ModelAnalytics` outcomes table: `taskId TEXT`
----
-## Part 3: Phase 3 -- Advanced Features
-### A. Web Overlay (Option 3: Embedded)
-Architecture: The CLI dashboard functions already compute all the data. Extract the data-gathering logic into shared service functions, then expose via a lightweight HTTP server.
-#### 3.1 Data service layer
-**New file**: `src/dashboard/data-service.ts`
-```typescript
-export interface DashboardData {
-  policies: PolicyDashboardData;
-  memories: MemoryDashboardData;
-  models: ModelDashboardData;
-  tasks: TaskDashboardData;
-  coordination: CoordinationDashboardData;
-}
-export async function getDashboardData(): Promise<DashboardData> {
-  // Reuse the same DB queries from dashboard.ts but return structured data
-  // instead of printing to console
-}
-```
-#### 3.2 Embedded HTTP + WebSocket server
-**New file**: `src/dashboard/server.ts`
-```typescript
-import { createServer } from 'http';
-import { WebSocketServer } from 'ws'; // ws package, already common in Node ecosystem
-import { getDashboardData } from './data-service.js';
-export function startDashboardServer(port: number = 3847): void {
-  const server = createServer(async (req, res) => {
-    if (req.url === '/api/dashboard') {
-      const data = await getDashboardData();
-      res.writeHead(200, { 'Content-Type': 'application/json' });
-      res.end(JSON.stringify(data));
-    }
-    if (req.url === '/') {
-      // Serve the single-page dashboard HTML (like web/generator.html pattern)
-      res.writeHead(200, { 'Content-Type': 'text/html' });
-      res.end(DASHBOARD_HTML); // Inline or read from file
-    }
-  });
-  const wss = new WebSocketServer({ server });
-  // Push updates every 2s
-  setInterval(async () => {
-    const data = await getDashboardData();
-    for (const client of wss.clients) {
-      client.send(JSON.stringify(data));
-    }
-  }, 2000);
-  server.listen(port);
-}
-```
-#### 3.3 CLI integration
-**File**: `src/cli/dashboard.ts`
-```
-uap dashboard serve [--port 3847]   # Start embedded web dashboard
-```
-Launches as foreground process. When opencode/claude-code exits, the server dies with it (child process of the same shell).
-#### 3.4 Single-page HTML dashboard
-**New file**: `web/dashboard.html`
-Self-contained HTML + CSS + vanilla JS (no build step, same pattern as `web/generator.html`). Connects to `ws://localhost:3847`, renders:
-- Policy table with toggle buttons (POST to `/api/policy/:id/toggle`)
-- Memory tier gauges
-- Model routing live view
-- Cost tracker
-- Task timeline
-### B. Historical session comparison
-Store session snapshots in SQLite (`agents/data/memory/sessions.db`):
-```sql
-CREATE TABLE session_snapshots (
-  id TEXT PRIMARY KEY,
-  timestamp TEXT NOT NULL,
-  data TEXT NOT NULL,  -- JSON blob from getDashboardData()
-  duration_ms INTEGER,
-  total_cost REAL,
-  tasks_completed INTEGER,
-  models_used TEXT     -- JSON array
-);
-```
-CLI: `uap dashboard history [--last 10]`
-### C. Export
-`uap dashboard export [--format json|csv] [--output file]`
-Dumps current dashboard data. JSON is the `DashboardData` object. CSV flattens the key tables (policies, model usage, task outcomes).
----
-## Part 4: Implementation Roadmap
-### Phase 1: Foundation (Week 1-2)
-| #   | Task                                                             | File(s)                                                              | Status |
-| --- | ---------------------------------------------------------------- | -------------------------------------------------------------------- | ------ |
-| 1   | Update unified-router model maps for opus-4.6/qwen35             | `src/models/unified-router.ts`                                       | Ready  |
-| 2   | Update CLI model defaults                                        | `src/cli/model.ts`                                                   | Ready  |
-| 3   | Add benchmark fingerprints for new models                        | `src/memory/model-router.ts`                                         | Ready  |
-| 4   | Add `enforcementStage` to policy schema + DB migration           | `src/policies/schemas/policy.ts`, `src/policies/database-manager.ts` | Ready  |
-| 5   | Add `setEnforcementStage()`, `setLevel()` to PolicyMemoryManager | `src/policies/policy-memory.ts`                                      | Ready  |
-| 6   | Add stage-aware filtering to PolicyGate                          | `src/policies/policy-gate.ts`                                        | Ready  |
-| 7   | Add policy CLI commands (toggle/stage/level/list/audit)          | `src/bin/policy.ts`                                                  | Ready  |
-### Phase 2: Dashboard Panels (Week 3-4)
-| #   | Task                                                    | File(s)                                          | Status |
-| --- | ------------------------------------------------------- | ------------------------------------------------ | ------ |
-| 8   | Replace hardcoded policies section with DB-driven panel | `src/cli/dashboard.ts:1305-1317`                 | Ready  |
-| 9   | Build `showPoliciesDashboard()` panel                   | `src/cli/dashboard.ts`                           | Ready  |
-| 10  | Build `showModelsDashboard()` panel                     | `src/cli/dashboard.ts`                           | Ready  |
-| 11  | Extend `showMemoryDashboard()` with compression stats   | `src/cli/dashboard.ts`                           | Ready  |
-| 12  | Add `--task <id>` per-task filtering to all panels      | `src/cli/dashboard.ts`                           | Ready  |
-| 13  | Create `ModelAnalytics` module                          | `src/models/analytics.ts`                        | Ready  |
-| 14  | Wire `ModelAnalytics` into router + executor            | `src/models/router.ts`, `src/models/executor.ts` | Ready  |
-### Phase 3: Web Overlay + Advanced (Week 5-6)
-| #   | Task                                         | File(s)                         | Status |
-| --- | -------------------------------------------- | ------------------------------- | ------ |
-| 15  | Extract data-service layer from dashboard.ts | `src/dashboard/data-service.ts` | Ready  |
-| 16  | Build embedded HTTP + WebSocket server       | `src/dashboard/server.ts`       | Ready  |
-| 17  | Build single-page HTML dashboard             | `web/dashboard.html`            | Ready  |
-| 18  | Add `uap dashboard serve` command            | `src/cli/dashboard.ts`          | Ready  |
-| 19  | Add policy toggle/stage/level API endpoints  | `src/dashboard/server.ts`       | Ready  |
-| 20  | Session snapshot storage + history view      | `src/dashboard/data-service.ts` | Ready  |
-| 21  | Export command (JSON/CSV)                    | `src/cli/dashboard.ts`          | Ready  |
-### Phase 4: Model Optimization (Week 7-8)
-| #   | Task                                                | File(s)                                       | Status |
-| --- | --------------------------------------------------- | --------------------------------------------- | ------ |
-| 22  | Add `quantizationHint` to ExecutionProfile          | `src/models/execution-profiles.ts`            | Ready  |
-| 23  | Add `modelContextBudget` to ModelConfig             | `src/models/types.ts`                         | Ready  |
-| 24  | Wire adaptive context to respect modelContextBudget | `src/memory/adaptive-context.ts`              | Ready  |
-| 25  | Add token counter to globalSessionStats             | `src/mcp-router/session-stats.ts`             | Ready  |
-| 26  | Add `routingMatrix` config option                   | `src/models/types.ts`, `src/models/router.ts` | Ready  |
-| 27  | Training data collection script                     | `scripts/collect-training-data.py`            | Ready  |
----
-## Part 5: Dependency Graph
-```
-Phase 1 (foundation)
-  [1,2,3] unified-router + CLI + fingerprints  (parallel, no deps)
-  [4,5,6] policy schema + memory + gate         (sequential: 4 -> 5 -> 6)
-  [7] policy CLI commands                        (depends on 5,6)
-Phase 2 (dashboard panels)
-  [8,9] policies dashboard                       (depends on 5,6)
-  [10] models dashboard                          (depends on 1,2,3)
-  [11] memory dashboard                          (no deps)
-  [12] per-task filtering                        (depends on 13)
-  [13,14] ModelAnalytics                         (depends on 1)
-Phase 3 (web overlay)
-  [15] data-service extraction                   (depends on 8,9,10,11)
-  [16,17,18] server + HTML + CLI                 (depends on 15)
-  [19] policy API endpoints                      (depends on 16, 5,6)
-  [20,21] history + export                       (depends on 15)
-Phase 4 (model optimization)
-  [22-27] all independent of Phase 3, can run in parallel with Phase 2-3
-```
----
-## Part 6: Risk Assessment
-| Risk                                                    | Impact | Mitigation                                                                           |
-| ------------------------------------------------------- | ------ | ------------------------------------------------------------------------------------ |
-| Policy DB migration breaks existing data                | High   | Use `ALTER TABLE ADD COLUMN ... DEFAULT` -- backward compatible                      |
-| WebSocket server port conflicts                         | Low    | Configurable port, default 3847 (unlikely to conflict)                               |
-| Dashboard overhead slows agent execution                | Medium | Data-service reads are read-only SQLite queries (<5ms each), WebSocket push is async |
-| Qwen 3.5 quantization switching requires server restart | Medium | Document as limitation; future: llama.cpp hot-swap support                           |
-| Unified router model map drift                          | Low    | Add test that validates all ModelPresets have map entries                            |
----
-## Part 7: Test Strategy
-### Unit tests needed
-```
-test/policies/enforcement-stage.test.ts    -- stage filtering in PolicyGate
-test/policies/policy-toggle.test.ts        -- toggle/level/stage mutations
-test/models/unified-router-maps.test.ts    -- all presets have map entries
-test/models/analytics.test.ts              -- outcome recording + queries
-test/dashboard/data-service.test.ts        -- structured data output
-```
-### Integration tests
-```
-test/dashboard/policies-panel.test.ts      -- end-to-end: store policy -> toggle -> verify dashboard output
-test/dashboard/models-panel.test.ts        -- route task -> verify model usage in dashboard
-test/dashboard/web-server.test.ts          -- HTTP + WebSocket connectivity
-```
-### Validation command
-```bash
-npm test -- --grep "enforcement-stage|unified-router-maps|analytics"
-```
----
-## Implementation Status (2026-03-19)
-### ✅ Completed - All Critical Issues Fixed
-**Phase 1: Correctness Bugs (COMPLETED)**
-- ✅ Fixed `validModelIds` in `adaptive-context.ts:834` to include `opus-4.6`
-- ✅ Updated `ModelId` type in `model-router.ts:23-29` to include `opus-4.6`
-- ✅ Updated `RULE_TO_BENCHMARK_MODEL_MAP` in `unified-router.ts:51` to use `opus-4.6`
-- ✅ Verified `routingMatrix` consumption in `router.ts:247-264` (already implemented)
-**Phase 2: Performance Fixes (COMPLETED)**
-- ✅ Fixed N+1 query in `tasks/service.ts:535-570` by batching parent lookups
-- ✅ Added DB connection caching in `dashboard/data-service.ts:25-26, 242-275`
-- ✅ Added file read cache with mtime invalidation in `dynamic-retrieval.ts:40-48, 693-748, 794-845`
-**Phase 3: Resource Leaks (COMPLETED)**
-- ✅ Verified `.unref()` on `adaptive-cache.ts:160` (already present)
-- ✅ Verified `process.on('exit')` handlers in `analytics.ts:294-299` and `policy-gate.ts:493-499`
-- ✅ Reduced DB_POOL_SIZE from 5 to 1 in `adaptive-context.ts:74` and `model-router.ts:203`
-**Phase 4: Performance Improvements (COMPLETED)**
-- ✅ Added file read caching for `long_term_prepopulated.json` and `CLAUDE.md`
-- ✅ Verified constant hoisting in `adaptive-context.ts:672` (already present)
-**Phase 5: Code Quality (COMPLETED)**
-- ✅ No additional improvements needed - all identified issues were already fixed or intentional
-### Build Status
-```bash
-npm run build  # ✅ Passes with zero errors
-tsc --noEmit   # ✅ Type checking passes
-```
-### Next Steps
-The validated optimization plan is now fully implemented. Remaining work from the original stale documents can be safely ignored as they describe features that were already implemented before the validation exercise.
-Focus areas for future work:
-1. Web dashboard implementation (embedded HTTP server with WebSocket)
-2. Enforcement stage concept for policies
-3. Additional pattern optimizations from `.factory/patterns/`