@miller-tech/uap 1.40.0 → 1.41.0

Files changed (150)
  1. package/README.md +109 -642
  2. package/dist/.tsbuildinfo +1 -1
  3. package/dist/cli/deliver-defaults.d.ts +23 -0
  4. package/dist/cli/deliver-defaults.d.ts.map +1 -0
  5. package/dist/cli/deliver-defaults.js +121 -0
  6. package/dist/cli/deliver-defaults.js.map +1 -0
  7. package/dist/cli/init.d.ts.map +1 -1
  8. package/dist/cli/init.js +29 -0
  9. package/dist/cli/init.js.map +1 -1
  10. package/dist/cli/setup.d.ts.map +1 -1
  11. package/dist/cli/setup.js +19 -0
  12. package/dist/cli/setup.js.map +1 -1
  13. package/dist/policies/policy-tools.d.ts +7 -0
  14. package/dist/policies/policy-tools.d.ts.map +1 -1
  15. package/dist/policies/policy-tools.js +24 -2
  16. package/dist/policies/policy-tools.js.map +1 -1
  17. package/docs/INDEX.md +48 -286
  18. package/docs/architecture/OVERVIEW.md +328 -0
  19. package/docs/architecture/PROTOCOL.md +204 -0
  20. package/docs/benchmarks/README.md +17 -192
  21. package/docs/getting-started/CONFIGURATION.md +237 -0
  22. package/docs/getting-started/INSTALLATION.md +125 -0
  23. package/docs/getting-started/QUICKSTART.md +115 -0
  24. package/docs/guides/COORDINATION.md +162 -0
  25. package/docs/guides/DELIVER.md +115 -0
  26. package/docs/guides/DEPLOY_BATCHING.md +212 -0
  27. package/docs/guides/DROIDS_AND_SKILLS.md +202 -0
  28. package/docs/guides/LOCAL_MODELS.md +148 -0
  29. package/docs/guides/MCP_ROUTER.md +195 -0
  30. package/docs/guides/MEMORY.md +235 -0
  31. package/docs/guides/MULTI_MODEL.md +223 -0
  32. package/docs/guides/POLICIES.md +190 -0
  33. package/docs/guides/WORKTREE_WORKFLOW.md +185 -0
  34. package/docs/integrations/MCP_ROUTER.md +147 -0
  35. package/docs/integrations/RTK.md +102 -0
  36. package/docs/reference/API.md +485 -0
  37. package/docs/reference/CLI.md +719 -0
  38. package/docs/reference/CONFIGURATION.md +90 -193
  39. package/docs/reference/DATABASE_SCHEMA.md +110 -344
  40. package/docs/reference/FEATURES.md +176 -472
  41. package/docs/reference/PATTERNS.md +102 -0
  42. package/docs/reference/PLATFORMS.md +83 -0
  43. package/package.json +3 -1
  44. package/src/policies/enforcers/7ebbc721-7540-4e9f-879a-770e0213a09b_architecture_review.py +101 -0
  45. package/src/policies/enforcers/__pycache__/_common.cpython-312.pyc +0 -0
  46. package/src/policies/enforcers/_common.py +100 -0
  47. package/src/policies/enforcers/artifact_hygiene.py +52 -0
  48. package/src/policies/enforcers/cluster_routing.py +63 -0
  49. package/src/policies/enforcers/codebase_read_before_plan.py +52 -0
  50. package/src/policies/enforcers/coord_overlap.py +81 -0
  51. package/src/policies/enforcers/delivery_enforcement.py +97 -0
  52. package/src/policies/enforcers/doc_live_over_report.py +50 -0
  53. package/src/policies/enforcers/expert_review_required.py +135 -0
  54. package/src/policies/enforcers/iac_parity.py +53 -0
  55. package/src/policies/enforcers/mcp_router_first.py +37 -0
  56. package/src/policies/enforcers/memory_before_plan.py +61 -0
  57. package/src/policies/enforcers/parallel_reads.py +50 -0
  58. package/src/policies/enforcers/rtk_wrap.py +44 -0
  59. package/src/policies/enforcers/schema_diff_gate.py +80 -0
  60. package/src/policies/enforcers/session_memory_write.py +52 -0
  61. package/src/policies/enforcers/task_required.py +131 -0
  62. package/src/policies/enforcers/test_gate.py +58 -0
  63. package/src/policies/enforcers/validate_plan_before_build.py +75 -0
  64. package/src/policies/enforcers/worktree_required.py +57 -0
  65. package/src/policies/schemas/policies/architecture-review.md +51 -0
  66. package/src/policies/schemas/policies/artifact-hygiene.md +29 -0
  67. package/src/policies/schemas/policies/cluster-routing.md +31 -0
  68. package/src/policies/schemas/policies/codebase-read-before-plan.md +30 -0
  69. package/src/policies/schemas/policies/coord-overlap.md +24 -0
  70. package/src/policies/schemas/policies/delivery-enforcement.md +45 -0
  71. package/src/policies/schemas/policies/doc-live-over-report.md +32 -0
  72. package/src/policies/schemas/policies/expert-review-required.md +60 -0
  73. package/src/policies/schemas/policies/iac-parity.md +31 -0
  74. package/src/policies/schemas/policies/mandatory-testing-deployment.md +147 -0
  75. package/src/policies/schemas/policies/mcp-router-first.md +24 -0
  76. package/src/policies/schemas/policies/memory-before-plan.md +24 -0
  77. package/src/policies/schemas/policies/merge-deploy-monitor-verify.md +145 -0
  78. package/src/policies/schemas/policies/parallel-reads.md +24 -0
  79. package/src/policies/schemas/policies/rtk-wrap.md +26 -0
  80. package/src/policies/schemas/policies/schema-diff-gate.md +30 -0
  81. package/src/policies/schemas/policies/session-memory-write.md +24 -0
  82. package/src/policies/schemas/policies/task-required.md +49 -0
  83. package/src/policies/schemas/policies/test-gate.md +24 -0
  84. package/src/policies/schemas/policies/validate-plan-before-build.md +28 -0
  85. package/src/policies/schemas/policies/worktree-required.md +28 -0
  86. package/templates/hooks/uap-policy-gate.sh +5 -0
  87. package/docs/AGENTS.md +0 -423
  88. package/docs/DOCUMENTATION_AUDIT_REPORT.md +0 -131
  89. package/docs/GETTING_STARTED.md +0 -288
  90. package/docs/PROJECT_ANALYSIS_REPORT.md +0 -510
  91. package/docs/architecture/COMPLETE_ARCHITECTURE.md +0 -748
  92. package/docs/architecture/EXPERT_STACK.md +0 -137
  93. package/docs/architecture/MULTI_MODEL.md +0 -224
  94. package/docs/architecture/PLATFORM_GATING.md +0 -68
  95. package/docs/architecture/SYSTEM_ANALYSIS.md +0 -334
  96. package/docs/architecture/UAP_COMPLIANCE.md +0 -217
  97. package/docs/architecture/UAP_PROTOCOL.md +0 -339
  98. package/docs/architecture/UAP_STRICT_DROIDS.md +0 -172
  99. package/docs/archive/BALLS_MODE_SELF_ANALYSIS.md +0 -260
  100. package/docs/archive/BENCHMARK_GAPS_AND_PLAN.md +0 -146
  101. package/docs/archive/FAILING_TASKS_SOLUTION_PLAN.md +0 -668
  102. package/docs/archive/JINJA2-SYSTEM-MESSAGE-FIX.md +0 -209
  103. package/docs/archive/MODEL_ROUTING_IMPLEMENTATION_SUMMARY.md +0 -281
  104. package/docs/archive/MODEL_ROUTING_OPTIMIZATION_PLAN.md +0 -320
  105. package/docs/archive/NPM-PUBLISH-V0.9.1.md +0 -240
  106. package/docs/archive/OPTIMIZATION_OPTIONS.md +0 -334
  107. package/docs/archive/PARALLELISM_GAPS_AND_OPTIONS.md +0 -422
  108. package/docs/archive/POLICY_GATE_IMPLEMENTATION.md +0 -245
  109. package/docs/archive/SETUP_IMPROVEMENTS.md +0 -213
  110. package/docs/archive/UAP_GENERIC_OPTIMIZATION_PLAN.md +0 -270
  111. package/docs/archive/UAP_OPTIMIZATION_PLAN.md +0 -701
  112. package/docs/archive/UAP_V103_PATTERN_DESIGN.md +0 -315
  113. package/docs/archive/UAP_V104_COMPLIANCE_DESIGN.md +0 -223
  114. package/docs/archive/changelog/2026-03-10_uap-100-compliance.md +0 -77
  115. package/docs/archive/changelog/2026-03-10_uap-full-system-verification.md +0 -109
  116. package/docs/archive/opencode-integration-guide.md +0 -740
  117. package/docs/archive/opencode-integration-quickref.md +0 -180
  118. package/docs/benchmarks/OVERNIGHT_RUNNER.md +0 -341
  119. package/docs/benchmarks/SPECULATIVE_DECODING_JOURNEY_2026-03.md +0 -221
  120. package/docs/benchmarks/VALIDATION_PLAN.md +0 -568
  121. package/docs/blog/SPECULATIVE_DECODING_PRODUCTION_PLAYBOOK.md +0 -139
  122. package/docs/blog/local-coding-agents.md +0 -266
  123. package/docs/blog/x-thread.md +0 -254
  124. package/docs/deployment/DEPLOYMENT.md +0 -895
  125. package/docs/deployment/DEPLOYMENT_STRATEGIES.md +0 -518
  126. package/docs/deployment/DEPLOY_BATCHER_ANALYSIS.md +0 -224
  127. package/docs/deployment/DEPLOY_BATCHING.md +0 -273
  128. package/docs/deployment/DEPLOY_BUCKETING_ANALYSIS.md +0 -420
  129. package/docs/deployment/QWEN35_LLAMA_CPP.md +0 -426
  130. package/docs/deployment/UAP_LLAMA_ANTHROPIC_PROXY_BOOTSTRAP.md +0 -279
  131. package/docs/getting-started/INTEGRATION.md +0 -628
  132. package/docs/getting-started/OVERVIEW.md +0 -324
  133. package/docs/getting-started/SETUP.md +0 -377
  134. package/docs/integrations/MCP_ROUTER_SETUP.md +0 -445
  135. package/docs/integrations/RTK_INTEGRATION.md +0 -468
  136. package/docs/operations/TROUBLESHOOTING.md +0 -660
  137. package/docs/pr/PR_SPECULATIVE_DOCS_TEMPLATE.md +0 -146
  138. package/docs/pr/UPSTREAM_PRS.md +0 -424
  139. package/docs/reference/API_REFERENCE.md +0 -903
  140. package/docs/reference/EXPERT_DROIDS.md +0 -219
  141. package/docs/reference/HARNESS-MATRIX.md +0 -318
  142. package/docs/reference/PATTERN_LIBRARY.md +0 -636
  143. package/docs/reference/UAP_CLI_REFERENCE.md +0 -620
  144. package/docs/research/BEHAVIORAL_PATTERNS.md +0 -228
  145. package/docs/research/DOMAIN_STRATEGIES.md +0 -316
  146. package/docs/research/MEMORY_SYSTEMS_COMPARISON.md +0 -812
  147. package/docs/research/PATTERN_ANALYSIS_2026-01-18.md +0 -436
  148. package/docs/research/PERFORMANCE_ANALYSIS_2026-01-18.md +0 -209
  149. package/docs/research/PERFORMANCE_TEST_PLAN.md +0 -383
  150. package/docs/research/TERMINAL_BENCH_LEARNINGS.md +0 -217
@@ -1,200 +1,25 @@
1
1
  # UAP Benchmarks
2
2
 
3
- > **Version:** 1.18.0
4
- > **Last Updated:** 2026-03-28
5
- > **Status:** ✅ Production Ready
3
+ Performance and accuracy results for the Universal Agent Protocol, measured on Terminal-Bench 2.0.
6
4
 
7
- ---
5
+ ## Headline results
8
6
 
9
- ## Benchmark Overview
7
+ UAP-on vs. baseline, 12 representative tasks across 8 categories:
10
8
 
11
- This directory contains comprehensive benchmark results and validation data for UAP v1.18.0 with OpenCode integration.
9
+ | Metric | Baseline | With UAP | Δ |
10
+ |---|---|---|---|
11
+ | Tokens consumed | 558,000 | 280,438 | **−49.7%** |
12
+ | Task success rate | 25% | 58% | **+33pp** |
13
+ | Errors per task | 1.17 | 0.42 | **−68%** |
14
+ | Wall-clock (total) | 618s | 266s | **−57%** |
12
15
 
13
- ### Quick Stats
16
+ ## Reports
14
17
 
15
- | Metric | Baseline | UAP v1.17 | UAP v1.18 + OpenCode | Improvement |
16
- |--------|----------|-----------|---------------------|-------------|
17
- | **Success Rate** | 75% | 92% | **100%** | +25pp |
18
- | **Avg Tokens/Task** | 52,000 | 28,500 | **23,400** | -55% |
19
- | **Avg Time/Task** | 45s | 38s | **32s** | -29% |
20
- | **Error Rate** | 12% | 4% | **0%** | -100% |
21
- | **Quality Score** | 3.2/5 | 4.1/5 | **4.7/5** | +47% |
18
+ | Doc | What it covers |
19
+ |---|---|
20
+ | [Validation Results](VALIDATION_RESULTS.md) | Full methodology + per-task breakdown |
21
+ | [Token Optimization](TOKEN_OPTIMIZATION.md) | Where the token savings come from |
22
+ | [Accuracy Analysis](ACCURACY_ANALYSIS.md) | Success-rate and error analysis |
23
+ | [Comprehensive Benchmarks](COMPREHENSIVE_BENCHMARKS.md) | Extended measurements |
22
24
 
23
- ---
24
-
25
- ## Documentation
26
-
27
- ### Main Documents
28
-
29
- | Document | Description | Link |
30
- |----------|-------------|------|
31
- | **Comprehensive Benchmarks** | Full benchmark results and analysis | [COMPREHENSIVE_BENCHMARKS.md](COMPREHENSIVE_BENCHMARKS.md) |
32
- | **Validation Results** | Production validation report | [VALIDATION_RESULTS.md](VALIDATION_RESULTS.md) |
33
- | **Validation Plan** | Benchmark methodology | [VALIDATION_PLAN.md](VALIDATION_PLAN.md) |
34
-
35
- ### Analysis Documents
36
-
37
- | Document | Description |
38
- |----------|-------------|
39
- | **Token Optimization** | Per-feature token savings analysis |
40
- | **Accuracy Analysis** | Internal vs Terminal-Bench comparison |
41
- | **Speculative Decoding Journey** | End-to-end tuning narrative |
42
-
43
- ### Quick Reference
44
-
45
- - [Benchmark Results Summary](../README.md#benchmarks)
46
- - [Token Optimization Details](../benchmarks/TOKEN_OPTIMIZATION.md)
47
- - [Accuracy Analysis](../benchmarks/ACCURACY_ANALYSIS.md)
48
-
49
- ---
50
-
51
- ## Running Benchmarks
52
-
53
- ### Quick Start
54
-
55
- ```bash
56
- # Run short benchmark suite (10 tasks)
57
- npm run benchmark:short
58
-
59
- # Run full benchmark suite (14 tasks)
60
- npm run benchmark:full
61
-
62
- # Run overnight suite (extended validation)
63
- npm run benchmark:overnight
64
-
65
- # Generate report from results
66
- npm run benchmark:report -- --input=<results.json> --output=<report.md>
67
- ```
68
-
69
- ### Configuration
70
-
71
- ```json
72
- {
73
- "benchmark": {
74
- "tasks": ["T01", "T02", "T03", "T04", "T05", "T06", "T07", "T08", "T09", "T10"],
75
- "uapEnabled": true,
76
- "openCodeIntegration": true,
77
- "tokenTracking": true,
78
- "qualityScoring": true
79
- }
80
- }
81
- ```
82
-
83
- ---
84
-
85
- ## Test Suite
86
-
87
- ### Task Distribution
88
-
89
- ```
90
- System Administration: 25%
91
- Security: 25%
92
- ML/Data Processing: 25%
93
- Development: 25%
94
- ```
95
-
96
- ### Task List
97
-
98
- | ID | Category | Task | Complexity |
99
- |----|----------|------|------------|
100
- | T01 | System Admin | Git Repository Recovery | Medium |
101
- | T02 | Security | Password Hash Recovery | Low |
102
- | T03 | Security | mTLS Certificate Setup | High |
103
- | T04 | System Admin | Docker Compose Config | Medium |
104
- | T05 | ML/Data | ML Model Training | High |
105
- | T06 | ML/Data | Data Compression | Low |
106
- | T07 | Development | Chess FEN Parser | Medium |
107
- | T08 | Security | SQLite WAL Recovery | High |
108
- | T09 | System Admin | HTTP Server Config | Low |
109
- | T10 | Development | Code Compression | Low |
110
- | T11 | ML/Data | MCMC Sampling | High |
111
- | T12 | Development | Core War Algorithm | Medium |
112
- | T13 | System Admin | Network Diagnostics | Medium |
113
- | T14 | Security | Cryptographic Key Gen | Low |
114
-
115
- ---
116
-
117
- ## Feature Contribution
118
-
119
- ### Token Savings Breakdown
120
-
121
- ```
122
- Pattern Router: 35%
123
- MCP Output Compression: 25%
124
- Memory Tiering: 20%
125
- Knowledge Graph: 10%
126
- OpenCode Integration: 10%
127
- ```
128
-
129
- ### Performance Impact
130
-
131
- ```mermaid
132
- quadrantChart
133
- title Feature Impact Analysis
134
- x-axis Low Impact --> High Impact
135
- y-axis Low Complexity --> High Complexity
136
- quadrant-1 High Value
137
- quadrant-2 Consider
138
- quadrant-3 Low Priority
139
- quadrant-4 Avoid
140
- Pattern Router: [0.8, 0.9]
141
- MCP Compression: [0.7, 0.8]
142
- Memory Tiering: [0.6, 0.7]
143
- OpenCode: [0.85, 0.75]
144
- ```
145
-
146
- ---
147
-
148
- ## Overnight Benchmark Runner
149
-
150
- For automated nightly execution, see the [Overnight Runner Guide](OVERNIGHT_RUNNER.md).
151
-
152
- ### Setup
153
-
154
- ```bash
155
- # Make scripts executable
156
- chmod +x scripts/benchmark-overnight.sh
157
-
158
- # Add to crontab (runs at 2:00 AM daily)
159
- 0 2 * * * cd /path/to/uap && npm run benchmark:overnight >> /var/log/uap-benchmark.log 2>&1
160
- ```
161
-
162
- ---
163
-
164
- ## Enterprise Impact
165
-
166
- ### Monthly Savings (10K tasks)
167
-
168
- | Metric | Baseline | UAP v1.18 | Savings |
169
- |--------|----------|-----------|---------|
170
- | Token Cost | $26,000 | $11,700 | **$14,300** |
171
- | Developer Time | $125,000 | $89,000 | **$36,000** |
172
- | Bug Fixes | $8,000 | $1,200 | **$6,800** |
173
- | **Total** | **$159,000** | **$101,900** | **$57,100** |
174
-
175
- **ROI:** 35.8% cost reduction, 2.8x faster delivery
176
-
177
- ---
178
-
179
- ## Validation Status
180
-
181
- | Target | Threshold | Actual | Status |
182
- |--------|-----------|--------|--------|
183
- | Token Reduction | ≥45% | 55% | ✅ PASS |
184
- | Success Rate | ≥95% | 100% | ✅ PASS |
185
- | Error Reduction | ≥90% | 100% | ✅ PASS |
186
- | Quality Score | ≥4.5 | 4.7 | ✅ PASS |
187
- | No Regressions | Time ≤ baseline | 32s vs 45s | ✅ PASS |
188
-
189
- **Overall Verdict: ✅ EXCEEDS EXPECTATIONS**
190
-
191
- ---
192
-
193
- <div align="center">
194
-
195
- **Next Steps:**
196
- - [View Comprehensive Benchmarks](COMPREHENSIVE_BENCHMARKS.md)
197
- - [Run Overnight Benchmark](OVERNIGHT_RUNNER.md)
198
- - [View Validation Results](VALIDATION_RESULTS.md)
199
-
200
- </div>
25
+ See the [documentation index](../INDEX.md) for the rest of the docs.
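The Δ column of the new "Headline results" table can be recomputed from its Baseline and With UAP columns. The sketch below is not part of the package; it only redoes the arithmetic on the rounded figures shown in the table, and the helper name `pct_change` is hypothetical.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# (baseline, with UAP) pairs copied from the headline table in the diff.
rows = {
    "tokens": (558_000, 280_438),
    "errors_per_task": (1.17, 0.42),
    "wall_clock_s": (618, 266),
}

# Relative deltas, rounded to one decimal place.
deltas = {name: round(pct_change(b, a), 1) for name, (b, a) in rows.items()}

# Success rate is reported as a difference in percentage points, not a ratio.
success_pp = 58 - 25

for name, value in deltas.items():
    print(f"{name}: {value:+.1f}%")
print(f"success_rate: {success_pp:+d}pp")
```

Tokens (about −49.7%), wall-clock (about −57%), and success rate (+33pp) match the table. The rounded errors-per-task figures give roughly −64%, so the published −68% cannot be reproduced from the values shown in the table alone.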
@@ -0,0 +1,237 @@
1
+ # Configuration
2
+
3
+ UAP is configured through a project-level `.uap.json` file plus a set of
4
+ environment variables. `uap init` / `uap setup` create `.uap.json` for you; this
5
+ page documents the options that actually exist in the code so you can tune them
6
+ by hand.
7
+
8
+ ## Project config: `.uap.json`
9
+
10
+ `.uap.json` lives at the project root and is validated against a strict schema —
11
+ unknown keys and bad types are rejected. Every section is optional except
12
+ `project`; defaults are applied for anything you omit.
13
+
14
+ ```json
15
+ {
16
+ "version": "1.0.0",
17
+ "project": {
18
+ "name": "my-project",
19
+ "description": "Optional description",
20
+ "defaultBranch": "main"
21
+ },
22
+ "platforms": {
23
+ "claudeCode": { "enabled": true },
24
+ "factory": { "enabled": true },
25
+ "vscode": { "enabled": true },
26
+ "opencode": { "enabled": true },
27
+ "codex": { "enabled": true }
28
+ },
29
+ "memory": {
30
+ "shortTerm": { "enabled": true, "path": "./agents/data/memory/short_term.db", "maxEntries": 50 },
31
+ "longTerm": { "enabled": true, "provider": "qdrant", "collection": "agent_memory", "embeddingModel": "all-MiniLM-L6-v2" },
32
+ "patternRag": { "enabled": false, "collection": "agent_patterns", "topK": 2, "scoreThreshold": 0.35 }
33
+ },
34
+ "worktrees": { "enabled": true, "directory": ".worktrees", "branchPrefix": "feature/", "autoCleanup": true }
35
+ }
36
+ ```
37
+
38
+ ### Top-level sections
39
+
40
+ | Key | Purpose |
41
+ | --- | --- |
42
+ | `project` | **Required.** `name`, optional `description`, `defaultBranch` (default `main`). |
43
+ | `platforms` | Per-harness toggles and memory-budget overrides: `claudeCode`, `factory`, `vscode`, `opencode`, `codex`. Each accepts `enabled`, `shortTermMax`, `searchResults`, `sessionMax`, `patternRag`. |
44
+ | `memory` | Memory tiers: `shortTerm`, `longTerm`, `patternRag` (see below). |
45
+ | `worktrees` | `enabled`, `directory` (default `.worktrees`), `branchPrefix` (default `feature/`), `autoCleanup`. |
46
+ | `droids` | Array of custom droid definitions (`name`, `template`, `description`, `model`, `tools`). |
47
+ | `commands` | Array of custom command definitions (`name`, `template`, `description`, `argumentHint`). |
48
+ | `template` | CLAUDE.md template selection: `extends` and per-section `sections` toggles. |
49
+ | `costOptimization` | Token budgets, embedding batching, and LLM call reduction. |
50
+ | `timeOptimization` | Deploy batch windows, parallel execution limits, service pre-warming. |
51
+ | `multiModel` | Multi-model routing (see [Model profiles](#model-profiles)). |
52
+ | `agentExecution` | Benchmark-proven agent execution feature flags (see below). |
53
+ | `patternRL` | Pattern reinforcement learning: `enabled`, `dbPath`. |
54
+
55
+ ### Memory tiers
56
+
57
+ `memory.shortTerm`:
58
+
59
+ | Field | Default | Notes |
60
+ | --- | --- | --- |
61
+ | `enabled` | `true` | |
62
+ | `path` | `./agents/data/memory/short_term.db` | SQLite database path. |
63
+ | `webDatabase` | — | IndexedDB name for web platforms. |
64
+ | `maxEntries` | `50` | |
65
+
66
+ `memory.longTerm` (the semantic tier):
67
+
68
+ | Field | Default | Notes |
69
+ | --- | --- | --- |
70
+ | `enabled` | `true` | |
71
+ | `provider` | `qdrant` | One of `qdrant`, `chroma`, `pinecone`, `github`, `qdrant-cloud`, `serverless`, `none`. |
72
+ | `endpoint` | — | Qdrant endpoint; falls back to `localhost:6333`. |
73
+ | `collection` | `agent_memory` | |
74
+ | `embeddingModel` | `all-MiniLM-L6-v2` | |
75
+ | `github` | — | GitHub-backed memory: `repo`, `token`, `path`, `branch`. |
76
+ | `qdrantCloud` | — | Qdrant Cloud: `url`, `apiKey`, `collection`. |
77
+ | `serverless` | — | Serverless Qdrant (see below). |
78
+
79
+ `memory.patternRag` (on-demand pattern retrieval): `enabled`, `collection`
80
+ (`agent_patterns`), `embeddingModel` (`all-MiniLM-L6-v2`), `vectorSize` (`384`),
81
+ `scoreThreshold` (`0.35`), `topK` (`2`), and the `indexScript` / `queryScript`
82
+ paths.
83
+
84
+ ### Qdrant configuration
85
+
86
+ The default local provider talks to Qdrant at `http://localhost:6333` — the
87
+ endpoint `uap setup` starts via docker-compose. Override it with
88
+ `memory.longTerm.endpoint`.
89
+
90
+ For managed Qdrant, set `memory.longTerm.provider` to `qdrant-cloud` and fill in
91
+ `memory.longTerm.qdrantCloud`:
92
+
93
+ ```json
94
+ {
95
+ "memory": {
96
+ "longTerm": {
97
+ "provider": "qdrant-cloud",
98
+ "qdrantCloud": {
99
+ "enabled": true,
100
+ "url": "https://xyz.qdrant.io",
101
+ "apiKey": "...",
102
+ "collection": "agent_memory"
103
+ }
104
+ }
105
+ }
106
+ }
107
+ ```
108
+
109
+ `url` and `apiKey` fall back to the `QDRANT_URL` and `QDRANT_API_KEY` environment
110
+ variables when omitted, so you can keep secrets out of the config file.
111
+
112
+ For cost-sensitive setups, `memory.longTerm.serverless` enables a lazy-start
113
+ local instance or a cloud-serverless backend:
114
+
115
+ ```json
116
+ {
117
+ "memory": {
118
+ "longTerm": {
119
+ "provider": "serverless",
120
+ "serverless": {
121
+ "enabled": true,
122
+ "mode": "lazy-local",
123
+ "lazyLocal": { "port": 6333, "autoStart": true, "autoStop": true, "idleTimeoutMs": 300000 }
124
+ }
125
+ }
126
+ }
127
+ }
128
+ ```
129
+
130
+ `mode` is one of `lazy-local`, `cloud-serverless`, or `hybrid`. Hybrid mode picks
131
+ local vs. cloud based on `NODE_ENV`, `UAP_ENV`, or auto-detection.
132
+
133
+ ### Agent execution flags
134
+
135
+ `agentExecution` exposes benchmark-tuned feature flags for the delivery harness.
136
+ Defaults are the proven-effective subset; some flags are deliberately off because
137
+ they regressed small models. Notable fields:
138
+
139
+ | Field | Default | Notes |
140
+ | --- | --- | --- |
141
+ | `domainHints` | `true` | Domain-specific hints routed by task classification. |
142
+ | `lowTemperature` / `temperature` | `true` / `0.15` | Deterministic sampling. |
143
+ | `preExecutionHooks` | `true` | File backups and tool installs before the agent starts. |
144
+ | `webSearch` | `false` | Off by default; enable for larger (70B+) models. |
145
+ | `reflectionCheckpoints` | `false` | Harmful for small models. |
146
+ | `softBudget` / `hardBudget` | `35` / `50` | Tool-call budget thresholds. |
147
+
148
+ ## Model profiles
149
+
150
+ UAP includes **7 execution profiles** — feature-flag presets tuned per model
151
+ family. They are auto-detected from the model id but can be forced via the
152
+ `UAP_MODEL_PROFILE` environment variable:
153
+
154
+ `small-moe`, `small-dense`, `medium`, `large`, `claude`, `gpt`, `gemini`.
155
+
156
+ Multi-model routing is configured under the `multiModel` section of `.uap.json`:
157
+
158
+ ```json
159
+ {
160
+ "multiModel": {
161
+ "enabled": true,
162
+ "models": ["opus-4.6", "qwen35-a3b"],
163
+ "roles": {
164
+ "planner": "opus-4.6",
165
+ "executor": "qwen35-a3b",
166
+ "fallback": "qwen35-a3b"
167
+ }
168
+ }
169
+ }
170
+ ```
171
+
172
+ `models` may reference built-in presets or inline custom model definitions.
173
+ Built-in presets include `opus-4.6`, `sonnet-4.6`, `qwen35-a3b`, `gpt-5.4`, and
174
+ `gpt-5.3-codex`. Roles default to `opus-4.6` (planner) and `qwen35-a3b`
175
+ (executor/fallback). Inspect routing with `uap model` (status, route, plan,
176
+ compare, presets, select, export, health) and `uap dashboard models`.
177
+
178
+ ## Environment variables
179
+
180
+ These are the environment variables read by the code.
181
+
182
+ ### Memory & Qdrant
183
+
184
+ | Variable | Used for |
185
+ | --- | --- |
186
+ | `QDRANT_URL` | Qdrant endpoint for cloud/serverless backends (overridden by config when both are set). |
187
+ | `QDRANT_API_KEY` | Qdrant API key (fallback when not in config). |
188
+ | `UAP_EMBEDDING_ENDPOINT` | Embedding server endpoint for semantic memory. |
189
+
190
+ ### Delivery harness (`uap deliver`)
191
+
192
+ | Variable | Used for |
193
+ | --- | --- |
194
+ | `UAP_DELIVER_MODEL` | Default model preset for `uap deliver` (fallback `qwen35-a3b`). |
195
+ | `UAP_ESCALATE_MODEL` | Stronger preset used by the escalation ladder. |
196
+ | `UAP_DELIVER_AUTO` | Set to `0` to disable task-aware auto-optimization. |
197
+ | `UAP_DELIVER_UNTIL_DELIVERED` | Set to `0` to disable loop-until-delivered. |
198
+ | `UAP_DELIVER_ACTIVE` | Set to `1` by the loop for its own subprocesses (policy enforcers detect it). |
199
+ | `UAP_DELIVER_SANDBOX` | Sandbox root that confines deliver's target directory (MCP tool). |
200
+
201
+ ### Models & inference
202
+
203
+ | Variable | Used for |
204
+ | --- | --- |
205
+ | `UAP_MODEL_PROFILE` | Force an execution profile (otherwise auto-detected). |
206
+ | `UAP_LLM_SERVER` | LLM server base URL for tool-call tooling (default `http://127.0.0.1:4000`). |
207
+ | `UAP_INFERENCE_ENDPOINT` | Fallback OpenAI-compatible endpoint (default `http://localhost:4000/v1`). |
208
+
209
+ ### Observability (HALO)
210
+
211
+ | Variable | Used for |
212
+ | --- | --- |
213
+ | `UAP_HALO_TRACE` | Set to `1` to enable HALO trace collection. |
214
+ | `UAP_HALO_TRACE_PATH` | Trace output file (default `.uap/halo/traces.jsonl`). |
215
+ | `UAP_HALO_PROJECT_ID` | HALO project identifier. |
216
+
217
+ ### Concurrency & runtime
218
+
219
+ | Variable | Used for |
220
+ | --- | --- |
221
+ | `UAP_MAX_PARALLEL` | Override the auto-detected max parallelism (always wins). |
222
+ | `UAP_PARALLEL` | Set to `false` to disable parallel execution. |
223
+ | `UAP_LOG_LEVEL` | Log verbosity (e.g. `debug`, `warn`). |
224
+ | `UAP_AGENT_ID` | Stable agent identifier used by the coordination layer. |
225
+ | `NODE_ENV` / `UAP_ENV` | Environment detection for hybrid serverless mode (`UAP_ENV=production` selects the prod backend). |
226
+ | `HERMES_HOME` | Hermes config home (default `~/.hermes`). |
227
+
228
+ ### Provider credentials
229
+
230
+ `ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `FACTORY_API_KEY`, `DROID_API_KEY`, and
231
+ `GITHUB_TOKEN` are read when the corresponding provider or GitHub-backed memory
232
+ is configured.
233
+
234
+ ## See also
235
+
236
+ - [Installation](./INSTALLATION.md)
237
+ - [Quickstart](./QUICKSTART.md)
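The added CONFIGURATION.md states that `memory.longTerm.qdrantCloud.url` and `apiKey` fall back to `QDRANT_URL` and `QDRANT_API_KEY` when omitted. A minimal sketch of that lookup order follows, assuming the config has already been parsed into a dictionary. The function name `resolve_qdrant_cloud` is hypothetical, and the `agent_memory` default for `collection` is an assumption taken from the doc's example; none of this is the package's actual implementation.

```python
import os
from typing import Mapping, Optional


def resolve_qdrant_cloud(config: Mapping, env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve Qdrant Cloud settings: config values win, env vars fill gaps."""
    env = os.environ if env is None else env
    long_term = config.get("memory", {}).get("longTerm", {})
    cloud = long_term.get("qdrantCloud", {})
    return {
        # Config first, then the documented environment fallback.
        "url": cloud.get("url") or env.get("QDRANT_URL"),
        "api_key": cloud.get("apiKey") or env.get("QDRANT_API_KEY"),
        # Assumed default, mirroring the collection name in the doc's example.
        "collection": cloud.get("collection", "agent_memory"),
    }


# Example: the URL lives in .uap.json, the secret only in the environment.
example_config = {
    "memory": {
        "longTerm": {
            "provider": "qdrant-cloud",
            "qdrantCloud": {"enabled": True, "url": "https://xyz.qdrant.io"},
        }
    }
}
example_env = {"QDRANT_URL": "https://ignored.example", "QDRANT_API_KEY": "placeholder-key"}
resolved = resolve_qdrant_cloud(example_config, example_env)
print(resolved["url"], resolved["collection"])
```

Keeping `apiKey` out of `.uap.json` and supplying it through the environment is the pattern the doc recommends for keeping secrets out of the config file.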
@@ -0,0 +1,125 @@
1
+ # Installation
2
+
3
+ The Universal Agent Protocol (UAP) is an autonomous AI agent memory system with
4
+ CLAUDE.md protocol enforcement. It ships as a single npm package
5
+ (`@miller-tech/uap`, v1.40.0) that installs the `uap` CLI.
6
+
7
+ ## Prerequisites
8
+
9
+ | Requirement | Needed for | Notes |
10
+ | --- | --- | --- |
11
+ | **Node.js >= 18** | Everything | The CLI is published as ESM and requires Node 18 or newer. |
12
+ | **git** | Worktree workflow, memory prepopulation from history | Any recent git. |
13
+ | **Docker** | Local Qdrant (semantic memory tier) | `uap setup` starts a Qdrant container via docker-compose. Optional — memory degrades gracefully without it. |
14
+ | **Python 3** | Pattern RAG indexing & embeddings | Optional. `uap setup` creates a virtualenv and installs the pattern indexing dependencies. |
15
+ | **A local OpenAI-compatible model** | `uap deliver`, multi-model routing | Optional. Points at an OpenAI-compatible `/v1` endpoint (default `http://localhost:4000/v1`). |
16
+
17
+ UAP works without Docker, Python, or a local model — those steps are skipped and
18
+ the corresponding features (semantic recall, pattern RAG, the convergence
19
+ harness) are simply unavailable until you provide them.
20
+
21
+ ## Install
22
+
23
+ Install the CLI globally:
24
+
25
+ ```bash
26
+ npm install -g @miller-tech/uap
27
+ ```
28
+
29
+ ### Verify the install
30
+
31
+ ```bash
32
+ uap --version
33
+ ```
34
+
35
+ This prints the installed package version (e.g. `1.40.0`).
36
+
37
+ ## One-command setup
38
+
39
+ From the root of the project you want to wire up, run:
40
+
41
+ ```bash
42
+ uap setup
43
+ ```
44
+
45
+ `uap setup` chains the individual commands so the whole system "just works". It
46
+ runs the following steps in order:
47
+
48
+ 1. **Initialize the project** (`uap init` under the hood) — creates `.uap.json`,
49
+ the `agents/data/memory` directory structure, the short-term memory database,
50
+ a `CLAUDE.md` (or `AGENT.md`), the worktree workflow scaffold, and the Python
51
+ pattern scripts.
52
+ 2. **Start Qdrant** — uses the serverless Qdrant manager if one is configured in
53
+ `.uap.json`, otherwise starts a Qdrant container via docker-compose. If Docker
54
+ is unavailable this step warns and continues.
55
+ 3. **Wait for the Qdrant healthcheck** (up to 15s). If Qdrant is not reachable,
56
+ pattern indexing is skipped.
57
+ 4. **Start background memory consolidation** and **auto-promote** high-quality
58
+ daily-log entries into longer-lived memory tiers (non-fatal if unavailable).
59
+ 5. **Create the Python virtualenv** for pattern RAG if `init` did not already do
60
+ so. Skipped with a warning when Python 3 is not on the system.
61
+ 6. **Index patterns into Qdrant** — only when both Qdrant and Python are ready.
62
+ 7. **Configure the MCP Router** for all detected AI harnesses.
63
+ 8. **Install policy-gate and lifecycle hooks** for the project's platforms (run
64
+ `uap hooks doctor` afterward to verify coverage).
65
+ 9. **Print a setup summary** showing which steps succeeded and which optional
66
+ steps were skipped.
67
+
68
+ ### Useful `uap setup` flags
69
+
70
+ ```bash
71
+ uap setup --no-memory # init only, skip Qdrant/memory services
72
+ uap setup --no-patterns # skip pattern RAG setup and indexing
73
+ uap setup -i # interactive wizard with feature toggles
74
+ uap setup --verbose # detailed output
75
+ uap setup -d <path> # set up a project directory other than the cwd
76
+ ```
77
+
78
+ ### Init only
79
+
80
+ If you only want the project scaffold (config, directories, `CLAUDE.md`) without
81
+ starting services, run:
82
+
83
+ ```bash
84
+ uap init
85
+ ```
86
+
87
+ `uap init` accepts `--web` (generate `AGENT.md` for web platforms),
88
+ `--no-memory`, `--no-worktrees`, `--patterns` / `--no-patterns`, and `-f, --force`
89
+ to overwrite existing configuration.
90
+
91
+ ## Installing harness hooks
92
+
93
+ UAP supports nine AI coding harnesses: **Claude Code, Factory, Cursor, VSCode,
94
+ OpenCode, Codex, ForgeCode, Oh-My-Pi, and Hermes**. `uap setup` installs hooks
95
+ for the project's platforms automatically, but you can install or re-install them
96
+ manually.
97
+
98
+ Install hooks for every detected harness:
99
+
100
+ ```bash
101
+ uap hooks install
102
+ ```
103
+
104
+ Install for a single harness with `-t` / `--target` (or the `-p` / `--platform`
105
+ alias). Valid targets are `claude`, `factory`, `cursor`, `vscode`, `opencode`,
106
+ `codex`, `forgecode`, `omp`, and `hermes`:
107
+
108
+ ```bash
109
+ uap hooks install -t claude
110
+ uap hooks install -t hermes # Hermes is global, so it is opt-in
111
+ ```
112
+
113
+ Check installation status and audit policy-gate coverage:
114
+
115
+ ```bash
116
+ uap hooks status # show what is installed, per platform
117
+ uap hooks doctor # audit policy-gate coverage (exits non-zero on gaps)
118
+ ```
119
+
120
+ ## Next steps
121
+
122
+ - [Quickstart](./QUICKSTART.md) — a 5-minute path from setup to your first
123
+ delivered task.
124
+ - [Configuration](./CONFIGURATION.md) — `.uap.json` options, environment
125
+ variables, Qdrant, and model profiles.
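The added INSTALLATION.md describes `uap setup` waiting up to 15s for the Qdrant healthcheck, then indexing patterns only when both Qdrant and Python are ready. The gating logic can be sketched as below. This is an illustration under assumptions, not the package's code: `wait_until_healthy`, `plan_optional_steps`, and the polling interval are hypothetical, and the probe is passed in as a callable so the sketch makes no claim about which endpoint the real CLI checks.

```python
import time
from typing import Callable, Dict


def wait_until_healthy(
    probe: Callable[[], bool],
    timeout_s: float = 15.0,      # the doc's "up to 15s" window
    interval_s: float = 0.5,      # assumed polling interval
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `probe` until it returns True or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


def plan_optional_steps(qdrant_ready: bool, python_ready: bool) -> Dict[str, bool]:
    """Decide which optional setup steps run, following the documented order."""
    return {
        "create_virtualenv": python_ready,                   # step 5
        "index_patterns": qdrant_ready and python_ready,     # step 6
    }


# Example: a fake probe that becomes healthy on the third attempt.
attempts = iter([False, False, True])
healthy = wait_until_healthy(lambda: next(attempts), sleep=lambda _s: None)
print(plan_optional_steps(qdrant_ready=healthy, python_ready=False))
```

Injecting the probe and the sleep function keeps the sketch testable without Docker or a network connection, which matches the doc's point that the optional steps warn and continue rather than fail the whole setup.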
@@ -0,0 +1,115 @@
1
+ # Quickstart
2
+
3
+ Get from a clean checkout to your first delivered task in about five minutes.
4
+ This assumes you have already installed the CLI — see
5
+ [Installation](./INSTALLATION.md) if not.
6
+
7
+ ## 1. Set up your project (~1 min)
8
+
9
+ From the root of your project:
10
+
11
+ ```bash
12
+ uap setup
13
+ ```
14
+
15
+ This initializes `.uap.json` and the memory directories and database, generates
16
+ `CLAUDE.md`, starts Qdrant (if Docker is available), wires the MCP Router, and
17
+ installs the harness hooks. It finishes with a summary showing which steps
18
+ succeeded.
19
+
20
+ Confirm memory is healthy:
21
+
22
+ ```bash
23
+ uap memory status
24
+ ```
25
+
26
+ You should see the short-term store initialized and, if Qdrant came up, the
27
+ long-term endpoint reported at `http://localhost:6333`.
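If the long-term endpoint is missing from the status output, it can help to probe Qdrant directly. This sketch assumes Qdrant's standard `/healthz` HTTP endpoint and that `curl` is installed; `qdrant_reachable` is a hypothetical helper, not a UAP command.

```bash
#!/usr/bin/env bash
# Sketch: succeed if Qdrant answers its health endpoint within 3 seconds.
# The default URL comes from the text above; /healthz is Qdrant's own
# endpoint, not something UAP provides.
qdrant_reachable() {
  local base_url=${1:-http://localhost:6333}
  curl --silent --fail --max-time 3 "$base_url/healthz" >/dev/null
}
```

For example, `qdrant_reachable && echo "Qdrant is up"` prints the message only when the probe succeeds.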
28
+
29
+ ## 2. Store and query a memory (~1 min)
30
+
31
+ Write a learning into long-term memory:
32
+
33
+ ```bash
34
+ uap memory store "API keys are loaded from the QDRANT_API_KEY env var" -t config,memory -i 7
35
+ ```
36
+
37
+ `-t` adds comma-separated tags and `-i` sets the importance score (1-10). The
38
+ store applies a quality write gate by default; pass `-f` to bypass it.
39
+
40
+ Now query it back semantically:
41
+
42
+ ```bash
43
+ uap memory query "where do api keys come from"
44
+ ```
45
+
46
+ The query runs a semantic search against the long-term store and prints the
47
+ matching entries with their similarity scores. Tune results with
48
+ `-n <limit>` and `-t <threshold>` (minimum similarity, default `0.35`).
49
+
50
+ ## 3. Run `uap deliver` on a small task (~2 min)
51
+
52
+ `uap deliver` is the convergence harness: it iterates a model against your
53
+ project's **real completion gates** (build, typecheck, test, lint) until every
54
+ required gate passes or the turn budget is exhausted.
55
+
56
+ First do a dry run to see the detected gates and plan without calling a model:
57
+
58
+ ```bash
59
+ uap deliver "fix the failing test in src/utils/dates" --dry-run
60
+ ```
61
+
62
+ The dry run prints the project root, the model preset, the turn budget, and the
63
+ list of gates it discovered from your `package.json` scripts. If no verifiable
64
+ gates are detected, deliver tells you so instead of running.
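To preview which gate-like scripts a project defines before the dry run, a rough check such as the one below may be useful. It is a sketch, not deliver's detection logic: the script names `build`, `typecheck`, `test`, and `lint` are assumed from the gate list above, and the match is a plain text search that can also hit keys outside `scripts`.

```bash
#!/usr/bin/env bash
# Sketch only: print which conventional gate scripts a package.json mentions.
# This is NOT how `uap deliver` discovers gates; names are assumptions.
list_gate_scripts() {
  local pkg=${1:-package.json}
  local gate
  for gate in build typecheck test lint; do
    # Crude match for a JSON key such as "test": anywhere in the file.
    if grep -Eq "\"$gate\"[[:space:]]*:" "$pkg"; then
      echo "$gate"
    fi
  done
}
```

Running `list_gate_scripts` in the project root prints one script name per line; an empty result suggests the dry run may also find nothing to verify.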
65
+
66
+ When the plan looks right, run it for real:
67
+
68
+ ```bash
69
+ uap deliver "fix the failing test in src/utils/dates"
70
+ ```
71
+
72
+ Notes on behaviour:
73
+
74
+ - The default model preset is `qwen35-a3b` (override with `-m <preset>` or the
75
+ `UAP_DELIVER_MODEL` env var).
76
+ - Task-aware auto-optimization is on by default — deliver classifies the task and
77
+ enables matching convergence aids automatically. Disable with `--no-auto`.
78
+ - Loop-until-delivered is on by default: deliver keeps iterating past
79
+ `--max-turns` up to a ceiling (default 30, set with `--ceiling`), stopping
80
+ early on stagnation. Disable with `--no-until-delivered`.
81
+ - Pre-existing test files are protected from modification by default; allow edits
82
+ with `--no-protect-tests`.
83
+ - Add `--json` for machine-readable output, or `--optimize` to enable every
84
+ convergence aid (exploration, critic, practices, escalation, ideation, HALO,
85
+ coordination).
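The loop-until-delivered behaviour described above can be pictured as a bounded loop with an early stop on stagnation. The sketch below illustrates that general pattern only; it is not deliver's implementation. The `converge` function, the `patience` parameter, and the callback that prints a count of failing gates are all hypothetical.

```bash
#!/usr/bin/env bash
# Illustrative pattern, not uap deliver's code.
# Usage: converge CEILING PATIENCE CALLBACK
# CALLBACK receives the turn number and prints how many gates still fail.
converge() {
  local ceiling=$1 patience=$2
  shift 2
  local turn=0 best=-1 stale=0 failing
  while [ "$turn" -lt "$ceiling" ]; do
    turn=$((turn + 1))
    failing=$("$@" "$turn")
    if [ "$failing" -eq 0 ]; then
      echo "delivered after $turn turn(s)"
      return 0
    fi
    # Count consecutive turns that did not improve on the best result.
    if [ "$best" -lt 0 ] || [ "$failing" -lt "$best" ]; then
      best=$failing
      stale=0
    else
      stale=$((stale + 1))
    fi
    if [ "$stale" -ge "$patience" ]; then
      echo "stagnated after $turn turn(s)"
      return 1
    fi
  done
  echo "ceiling reached after $turn turn(s)"
  return 2
}
```

The three return codes separate success, stagnation, and an exhausted ceiling, so a caller can tell the outcomes apart.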
86
+
87
+ ## 4. View the dashboard (~1 min)
88
+
89
+ UAP ships a rich terminal dashboard. View the full system overview:
90
+
91
+ ```bash
92
+ uap dashboard overview
93
+ ```
94
+
95
+ Other views are available as subcommands — for example:
96
+
97
+ ```bash
98
+ uap dashboard memory # memory health, capacity, and layer architecture
99
+ uap dashboard tasks # task breakdown, progress bars, hierarchy trees
100
+ uap dashboard models # multi-model routing analytics
101
+ ```
102
+
103
+ Prefer a browser? Start the web dashboard with live updates:
104
+
105
+ ```bash
106
+ uap dashboard serve # http://localhost:3847
107
+ uap dashboard serve -p 4000 # custom port
108
+ ```
109
+
110
+ ## Where to go next
111
+
112
+ - [Configuration](./CONFIGURATION.md) — `.uap.json`, environment variables,
113
+ Qdrant, and model profiles.
114
+ - [Installation](./INSTALLATION.md) — per-harness hook installation and
115
+ prerequisites.