@miller-tech/uap 1.40.0 → 1.40.1

Files changed (92)
  1. package/README.md +109 -642
  2. package/docs/INDEX.md +48 -286
  3. package/docs/architecture/OVERVIEW.md +328 -0
  4. package/docs/architecture/PROTOCOL.md +204 -0
  5. package/docs/benchmarks/README.md +17 -192
  6. package/docs/getting-started/CONFIGURATION.md +237 -0
  7. package/docs/getting-started/INSTALLATION.md +125 -0
  8. package/docs/getting-started/QUICKSTART.md +115 -0
  9. package/docs/guides/COORDINATION.md +162 -0
  10. package/docs/guides/DELIVER.md +115 -0
  11. package/docs/guides/DEPLOY_BATCHING.md +212 -0
  12. package/docs/guides/DROIDS_AND_SKILLS.md +202 -0
  13. package/docs/guides/LOCAL_MODELS.md +148 -0
  14. package/docs/guides/MCP_ROUTER.md +195 -0
  15. package/docs/guides/MEMORY.md +235 -0
  16. package/docs/guides/MULTI_MODEL.md +223 -0
  17. package/docs/guides/POLICIES.md +190 -0
  18. package/docs/guides/WORKTREE_WORKFLOW.md +185 -0
  19. package/docs/integrations/MCP_ROUTER.md +147 -0
  20. package/docs/integrations/RTK.md +102 -0
  21. package/docs/reference/API.md +485 -0
  22. package/docs/reference/CLI.md +719 -0
  23. package/docs/reference/CONFIGURATION.md +90 -193
  24. package/docs/reference/DATABASE_SCHEMA.md +110 -344
  25. package/docs/reference/FEATURES.md +176 -472
  26. package/docs/reference/PATTERNS.md +102 -0
  27. package/docs/reference/PLATFORMS.md +83 -0
  28. package/package.json +1 -1
  29. package/docs/AGENTS.md +0 -423
  30. package/docs/DOCUMENTATION_AUDIT_REPORT.md +0 -131
  31. package/docs/GETTING_STARTED.md +0 -288
  32. package/docs/PROJECT_ANALYSIS_REPORT.md +0 -510
  33. package/docs/architecture/COMPLETE_ARCHITECTURE.md +0 -748
  34. package/docs/architecture/EXPERT_STACK.md +0 -137
  35. package/docs/architecture/MULTI_MODEL.md +0 -224
  36. package/docs/architecture/PLATFORM_GATING.md +0 -68
  37. package/docs/architecture/SYSTEM_ANALYSIS.md +0 -334
  38. package/docs/architecture/UAP_COMPLIANCE.md +0 -217
  39. package/docs/architecture/UAP_PROTOCOL.md +0 -339
  40. package/docs/architecture/UAP_STRICT_DROIDS.md +0 -172
  41. package/docs/archive/BALLS_MODE_SELF_ANALYSIS.md +0 -260
  42. package/docs/archive/BENCHMARK_GAPS_AND_PLAN.md +0 -146
  43. package/docs/archive/FAILING_TASKS_SOLUTION_PLAN.md +0 -668
  44. package/docs/archive/JINJA2-SYSTEM-MESSAGE-FIX.md +0 -209
  45. package/docs/archive/MODEL_ROUTING_IMPLEMENTATION_SUMMARY.md +0 -281
  46. package/docs/archive/MODEL_ROUTING_OPTIMIZATION_PLAN.md +0 -320
  47. package/docs/archive/NPM-PUBLISH-V0.9.1.md +0 -240
  48. package/docs/archive/OPTIMIZATION_OPTIONS.md +0 -334
  49. package/docs/archive/PARALLELISM_GAPS_AND_OPTIONS.md +0 -422
  50. package/docs/archive/POLICY_GATE_IMPLEMENTATION.md +0 -245
  51. package/docs/archive/SETUP_IMPROVEMENTS.md +0 -213
  52. package/docs/archive/UAP_GENERIC_OPTIMIZATION_PLAN.md +0 -270
  53. package/docs/archive/UAP_OPTIMIZATION_PLAN.md +0 -701
  54. package/docs/archive/UAP_V103_PATTERN_DESIGN.md +0 -315
  55. package/docs/archive/UAP_V104_COMPLIANCE_DESIGN.md +0 -223
  56. package/docs/archive/changelog/2026-03-10_uap-100-compliance.md +0 -77
  57. package/docs/archive/changelog/2026-03-10_uap-full-system-verification.md +0 -109
  58. package/docs/archive/opencode-integration-guide.md +0 -740
  59. package/docs/archive/opencode-integration-quickref.md +0 -180
  60. package/docs/benchmarks/OVERNIGHT_RUNNER.md +0 -341
  61. package/docs/benchmarks/SPECULATIVE_DECODING_JOURNEY_2026-03.md +0 -221
  62. package/docs/benchmarks/VALIDATION_PLAN.md +0 -568
  63. package/docs/blog/SPECULATIVE_DECODING_PRODUCTION_PLAYBOOK.md +0 -139
  64. package/docs/blog/local-coding-agents.md +0 -266
  65. package/docs/blog/x-thread.md +0 -254
  66. package/docs/deployment/DEPLOYMENT.md +0 -895
  67. package/docs/deployment/DEPLOYMENT_STRATEGIES.md +0 -518
  68. package/docs/deployment/DEPLOY_BATCHER_ANALYSIS.md +0 -224
  69. package/docs/deployment/DEPLOY_BATCHING.md +0 -273
  70. package/docs/deployment/DEPLOY_BUCKETING_ANALYSIS.md +0 -420
  71. package/docs/deployment/QWEN35_LLAMA_CPP.md +0 -426
  72. package/docs/deployment/UAP_LLAMA_ANTHROPIC_PROXY_BOOTSTRAP.md +0 -279
  73. package/docs/getting-started/INTEGRATION.md +0 -628
  74. package/docs/getting-started/OVERVIEW.md +0 -324
  75. package/docs/getting-started/SETUP.md +0 -377
  76. package/docs/integrations/MCP_ROUTER_SETUP.md +0 -445
  77. package/docs/integrations/RTK_INTEGRATION.md +0 -468
  78. package/docs/operations/TROUBLESHOOTING.md +0 -660
  79. package/docs/pr/PR_SPECULATIVE_DOCS_TEMPLATE.md +0 -146
  80. package/docs/pr/UPSTREAM_PRS.md +0 -424
  81. package/docs/reference/API_REFERENCE.md +0 -903
  82. package/docs/reference/EXPERT_DROIDS.md +0 -219
  83. package/docs/reference/HARNESS-MATRIX.md +0 -318
  84. package/docs/reference/PATTERN_LIBRARY.md +0 -636
  85. package/docs/reference/UAP_CLI_REFERENCE.md +0 -620
  86. package/docs/research/BEHAVIORAL_PATTERNS.md +0 -228
  87. package/docs/research/DOMAIN_STRATEGIES.md +0 -316
  88. package/docs/research/MEMORY_SYSTEMS_COMPARISON.md +0 -812
  89. package/docs/research/PATTERN_ANALYSIS_2026-01-18.md +0 -436
  90. package/docs/research/PERFORMANCE_ANALYSIS_2026-01-18.md +0 -209
  91. package/docs/research/PERFORMANCE_TEST_PLAN.md +0 -383
  92. package/docs/research/TERMINAL_BENCH_LEARNINGS.md +0 -217
package/docs/integrations/MCP_ROUTER.md
@@ -0,0 +1,147 @@
1
+ # MCP Router
2
+
3
+ `v1.40.0` · `src/mcp-router/`
4
+
5
+ The MCP Router is a hierarchical Model Context Protocol server that sits in
6
+ front of all of your downstream MCP servers and dramatically reduces the tokens
7
+ the model spends on tool definitions and tool output. It is the mechanism behind
8
+ UAP's "up to 98% savings on large tool calls."
9
+
10
+ For where it fits in the wider system, see
11
+ [../architecture/OVERVIEW.md](../architecture/OVERVIEW.md#mcp-router-srcmcp-router).
12
+
13
+ ---
14
+
15
+ ## Why it exists
16
+
17
+ A normal MCP setup exposes every tool from every server directly to the model.
18
+ With a dozen servers that is easily 150+ tool schemas at roughly 500 tokens
19
+ each — tens of thousands of tokens of context burned before the agent does any
20
+ work. On top of that, tools like file readers and shell wrappers return large
21
+ outputs that flood the context window.
22
+
23
+ The router fixes both:
24
+
25
+ 1. **Tool hiding.** It exposes just three meta-tools instead of every
26
+ downstream tool. The documented design target is ~75,000 tokens of tool
27
+ definitions collapsed to ~700 (`src/mcp-router/index.ts`,
28
+ `src/mcp-router/server.ts`).
29
+ 2. **Output compression.** Large tool results are indexed into an in-memory
30
+ SQLite **FTS5** table and only the most relevant snippets are returned
31
+ (`src/mcp-router/output-compressor.ts`).
32
+
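The arithmetic behind these numbers can be checked in a few lines. This is a back-of-the-envelope sketch that uses only the round figures quoted above (150 schemas, about 500 tokens each, about 700 tokens for the meta-tools); the function name is illustrative and not part of UAP.

```python
def tool_hiding_savings(n_tools: int, tokens_per_schema: int, meta_tool_tokens: int):
    """Return (baseline_tokens, fraction_saved) when tools sit behind meta-tools."""
    baseline = n_tools * tokens_per_schema
    saved_fraction = 1 - meta_tool_tokens / baseline
    return baseline, saved_fraction


# Round numbers from the text above; real schema sizes vary per server.
baseline, saved = tool_hiding_savings(n_tools=150, tokens_per_schema=500, meta_tool_tokens=700)
print(f"baseline: {baseline} tokens, saved: {saved:.1%}")
```

Real schema sizes vary, so treat the result as an order-of-magnitude check rather than a measurement.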
33
+ ---
34
+
35
+ ## How it works
36
+
37
+ ### The three meta-tools
38
+
39
+ Instead of N downstream tools, the model sees:
40
+
41
+ | Meta-tool | What it does |
42
+ |-----------|--------------|
43
+ | `discover_tools` | Natural-language query → matching downstream tool paths |
44
+ | `execute_tool` | Run a tool by `path` with `args` (+ optional `intent`) |
45
+ | `deliver` | Run the `uap deliver` convergence loop |
46
+
47
+ Downstream tools are loaded into an in-memory fuzzy search index at startup and
48
+ are never surfaced as definitions. The agent's flow is:
49
+
50
+ ```
51
+ discover_tools("read the auth config")
52
+ │ → [ "filesystem.read_file", ... ]
53
+
54
+ execute_tool({ path: "filesystem.read_file",
55
+ args: { path: "src/auth.ts" },
56
+ intent: "csrf token validation" })
57
+
58
+
59
+ ┌──────────── output compressor ────────────┐
60
+ │ small result → passthrough │
61
+ │ large result → FTS5 index + BM25(intent) │
62
+ │ → top snippets + searchable-vocab footer │
63
+ │ huge / no intent → head+tail truncation │
64
+ └────────────────────────────────────────────┘
65
+ ```
66
+
67
+ The `intent` string on `execute_tool` is what drives the BM25 query — provide a
68
+ focused intent to get focused snippets. The model can then issue a follow-up
69
+ `execute_tool` with a refined intent using the vocabulary footer.
70
+
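The three-way routing in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in `src/mcp-router/output-compressor.ts`: the thresholds and function names are hypothetical, a naive term-overlap score stands in for the FTS5 + BM25 ranking, and the searchable-vocab footer is omitted.

```python
import re
from typing import Optional

# Hypothetical thresholds (in characters); the router's real limits are not
# stated in this document.
SMALL_LIMIT = 2_000
HUGE_LIMIT = 200_000


def _terms(text: str) -> set:
    """Lowercased word set used for the naive overlap score."""
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def head_tail(output: str, keep: int = 500) -> str:
    """Keep the first and last `keep` characters and mark the elision."""
    if len(output) <= 2 * keep:
        return output
    omitted = len(output) - 2 * keep
    return f"{output[:keep]}\n... [{omitted} chars omitted] ...\n{output[-keep:]}"


def top_snippets(output: str, intent: str, k: int = 3, chunk_lines: int = 5) -> str:
    """Rank fixed-size line chunks by term overlap with the intent.

    Stand-in for FTS5 + BM25: same idea (score chunks against the intent,
    return the best few), much cruder scoring.
    """
    lines = output.splitlines()
    chunks = ["\n".join(lines[i:i + chunk_lines]) for i in range(0, len(lines), chunk_lines)]
    wanted = _terms(intent)
    ranked = sorted(chunks, key=lambda c: len(wanted & _terms(c)), reverse=True)
    return "\n---\n".join(ranked[:k])


def compress(output: str, intent: Optional[str] = None) -> str:
    """Route a tool result through the three paths shown in the diagram."""
    if len(output) <= SMALL_LIMIT:
        return output                      # small result: passthrough
    if not intent or len(output) > HUGE_LIMIT:
        return head_tail(output)           # huge / no intent: head+tail truncation
    return top_snippets(output, intent)    # large result: ranked snippets


# Usage: a large result with one relevant line buried at the end.
big = "\n".join(f"line {i}: filler" for i in range(400)) + "\nvalidate csrf token here\n"
print(compress(big, intent="csrf token validation").split("\n---\n")[0])
```

A focused `intent` pulls the relevant chunk to the top; without one, the sketch falls back to head+tail truncation, which matches the last row of the diagram.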
71
+ ---
72
+
73
+ ## Setup
74
+
75
+ ### One command, all harnesses
76
+
77
+ ```bash
78
+ uap mcp-setup
79
+ ```
80
+
81
+ `uap mcp-setup` (`src/cli/setup-mcp-router.ts`) configures the MCP Router as the
82
+ single MCP server across your AI harnesses. It writes a `mcpServers.router`
83
+ entry pointing at the router and migrates/backs up any existing servers
84
+ (prompts unless `--force`), then validates the result with `uap mcp-router list`.
85
+
86
+ Harnesses configured (global `~/` config paths):
87
+
88
+ | Harness | Config file |
89
+ |---------|-------------|
90
+ | Claude Code | `~/.claude/settings.json` |
91
+ | Factory.AI | `~/.factory/mcp.json` |
92
+ | VSCode | `~/.vscode/mcp.json` (skipped if absent) |
93
+ | Cursor | `~/.cursor/settings.json` |
94
+
95
+ The router entry it writes looks like:
96
+
97
+ ```json
98
+ {
99
+ "mcpServers": {
100
+ "router": {
101
+ "command": "npx",
102
+ "args": ["uap", "mcp-router", "start"]
103
+ }
104
+ }
105
+ }
106
+ ```
107
+
108
+ ### Running and inspecting the router
109
+
110
+ `uap mcp-router <action>` (`src/cli/mcp-router.ts`) drives the router directly:
111
+
112
+ ```bash
113
+ uap mcp-router start # run the stdio MCP server (what harnesses launch)
114
+ uap mcp-router list # list discovered downstream tools
115
+ uap mcp-router discover # try a natural-language tool discovery query
116
+ uap mcp-router stats # token-savings stats
117
+ ```
118
+
119
+ `uap mcp-router start` is the command harnesses invoke via the generated config;
120
+ you normally don't run it by hand.
121
+
122
+ ---
123
+
124
+ ## Verifying it works
125
+
126
+ ```bash
127
+ uap mcp-router list # should enumerate tools from your downstream servers
128
+ uap mcp-router stats # shows tool-hiding savings + per-output compression
129
+ uap hooks doctor # confirms gate/router wiring across harnesses
130
+ ```
131
+
132
+ If `list` is empty, the router found no downstream MCP configs — confirm your
133
+ harness still has its original MCP servers defined (they are migrated into the
134
+ router's view, not deleted) and re-run `uap mcp-setup`.
135
+
136
+ ---
137
+
138
+ ## Notes
139
+
140
+ - The router reads downstream MCP configs from Claude Desktop, Cursor, VSCode,
141
+ Claude Code CLI, Factory.AI, and a local `mcp.json`, expands `~`/env vars,
142
+ skips disabled servers, and refuses to reference itself.
143
+ - The 98% / 75k→700 figures are the documented design targets for tool hiding;
144
+ per-output FTS5 savings are computed live for each call and reported by
145
+ `uap mcp-router stats`.
146
+ - Pair the router with **RTK** for CLI-output savings — see
147
+ [RTK.md](RTK.md). The two are complementary (tool definitions + CLI output).
package/docs/integrations/RTK.md
@@ -0,0 +1,102 @@
1
+ # RTK — Rust Token Killer
2
+
3
+ `v1.40.0` · `src/cli/rtk.ts`
4
+
5
+ RTK (Rust Token Killer) is a fast CLI proxy that compresses and filters the
6
+ output of command-line tools — `git status`, test runs, file reads, and similar
7
+ heavy commands — to cut the tokens your agent spends echoing terminal output.
8
+ The UAP source positions it at **60–90% token savings on CLI command output**.
9
+
10
+ RTK is a separate, open-source tool (`https://github.com/rtk-ai/rtk`,
11
+ docs at `https://www.rtk-ai.app`). UAP integrates with it but does not bundle
12
+ it; `uap rtk` manages installation and wiring.
13
+
14
+ ---
15
+
16
+ ## Why RTK + the MCP Router
17
+
18
+ The two integrations target different sources of token waste and stack:
19
+
20
+ | Layer | Tool | Saves on |
21
+ |-------|------|----------|
22
+ | MCP tool definitions + tool output | [MCP Router](MCP_ROUTER.md) | ~98% of tool-definition tokens |
23
+ | Raw CLI command output | **RTK** | 60–90% of CLI-output tokens |
24
+
25
+ The UAP source describes the combination as **95%+ total token reduction**
26
+ (`src/cli/rtk.ts`).
27
+
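How the two layers combine depends on how a session's tokens split between tool definitions and CLI output. The sketch below works through one case. The 75,000-token and 98% figures are the ones quoted for the MCP Router and the 60-90% range is the one quoted for RTK; the 20,000-token CLI workload is an assumption made up for illustration, and the function name is hypothetical.

```python
def combined_reduction(layers):
    """Overall fraction saved, given (baseline_tokens, fraction_saved) per layer."""
    before = sum(tokens for tokens, _ in layers)
    after = sum(tokens * (1 - saved) for tokens, saved in layers)
    return 1 - after / before


TOOL_DEFS = (75_000, 0.98)            # figures quoted for the MCP Router
results = {}
for cli_saved in (0.60, 0.90):        # the 60-90% range quoted for RTK
    cli_output = (20_000, cli_saved)  # assumed workload, not a measurement
    results[cli_saved] = combined_reduction([TOOL_DEFS, cli_output])
    print(f"RTK at {cli_saved:.0%}: {results[cli_saved]:.1%} overall")
```

With these assumed inputs the overall reduction comes out at about 90% at the low end of the RTK range and about 96% at the high end, so the combined figure is sensitive to the workload mix.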
28
+ ---
29
+
30
+ ## How UAP integrates RTK
31
+
32
+ `uap rtk <command>` (`src/cli/rtk.ts`):
33
+
34
+ ```bash
35
+ uap rtk install # install RTK, auto-detecting the best method
36
+ uap rtk status # check install + hook wiring + recent savings
37
+ uap rtk help # usage
38
+ ```
39
+
40
+ ### `uap rtk install`
41
+
42
+ Auto-detects the best install method (Homebrew → Cargo → pre-built binary via
43
+ curl) and runs it. Override with flags:
44
+
45
+ ```bash
46
+ uap rtk install --method homebrew # force a method: homebrew | cargo | curl
47
+ uap rtk install --force # reinstall
48
+ ```
49
+
50
+ Equivalent manual installs:
51
+
52
+ ```bash
53
+ brew install rtk # Homebrew
54
+ cargo install --git https://github.com/rtk-ai/rtk # Cargo
55
+ # or download a release binary from:
56
+ # https://github.com/rtk-ai/rtk/releases
57
+ ```
58
+
59
+ After install, initialize and verify:
60
+
61
+ ```bash
62
+ rtk init --global # set up the global rewrite hook
63
+ rtk gain # show token savings analytics
64
+ ```
65
+
66
+ ### `uap rtk status`
67
+
68
+ Reports whether the `rtk` binary is installed, whether the rewrite hook
69
+ (`~/.claude/hooks/rtk-rewrite.sh`) is wired in, and recent savings from
70
+ `rtk gain`.
71
+
72
+ ---
73
+
74
+ ## How it works in practice
75
+
76
+ Once the rewrite hook is installed, heavy CLI commands are transparently routed
77
+ through RTK (e.g. `git status` is rewritten to `rtk git status`) with zero
78
+ extra tokens of overhead — the agent issues normal commands and RTK compresses
79
+ the output before it reaches the model.
80
+
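Conceptually, the rewrite step looks like the sketch below. The real hook is the shell script installed by `rtk init --global`, and its rules are not shown in this document; the function name and the command set here are hypothetical, with `git` being the only example taken from the text.

```python
import shlex

# Hypothetical allow-list: only `git` comes from the text above. The commands
# RTK actually rewrites are defined by its own hook, not by this sketch.
HEAVY_COMMANDS = {"git"}


def rewrite_for_rtk(command: str) -> str:
    """Prefix a heavy command with `rtk`, leaving everything else untouched."""
    parts = shlex.split(command)
    if not parts or parts[0] == "rtk" or parts[0] not in HEAVY_COMMANDS:
        return command
    return shlex.join(["rtk", *parts])


print(rewrite_for_rtk("git status"))   # rtk git status
print(rewrite_for_rtk("echo hello"))   # echo hello
```

The guard on `parts[0] == "rtk"` keeps an already-wrapped command from being wrapped twice.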
81
+ UAP can nudge agents to route heavy CLIs through RTK via the `rtk_wrap.py`
82
+ policy enforcer (`src/policies/enforcers/rtk_wrap.py`).
83
+
84
+ Useful RTK meta-commands:
85
+
86
+ ```bash
87
+ rtk gain # token savings analytics
88
+ rtk gain --history # command usage history with savings
89
+ rtk discover # find missed savings opportunities
90
+ rtk --version # verify the install
91
+ ```
92
+
93
+ ---
94
+
95
+ ## Combined analytics
96
+
97
+ `uap rtk` can surface unified analytics combining MCP Router and RTK savings
98
+ (`showUnifiedAnalytics` in `src/cli/rtk.ts`), so you can see total context
99
+ reduction from both layers at once.
100
+
101
+ See also: [MCP_ROUTER.md](MCP_ROUTER.md) ·
102
+ [../architecture/OVERVIEW.md](../architecture/OVERVIEW.md)