@miller-tech/uap 1.40.0 → 1.40.1
- package/README.md +109 -642
- package/docs/INDEX.md +48 -286
- package/docs/architecture/OVERVIEW.md +328 -0
- package/docs/architecture/PROTOCOL.md +204 -0
- package/docs/benchmarks/README.md +17 -192
- package/docs/getting-started/CONFIGURATION.md +237 -0
- package/docs/getting-started/INSTALLATION.md +125 -0
- package/docs/getting-started/QUICKSTART.md +115 -0
- package/docs/guides/COORDINATION.md +162 -0
- package/docs/guides/DELIVER.md +115 -0
- package/docs/guides/DEPLOY_BATCHING.md +212 -0
- package/docs/guides/DROIDS_AND_SKILLS.md +202 -0
- package/docs/guides/LOCAL_MODELS.md +148 -0
- package/docs/guides/MCP_ROUTER.md +195 -0
- package/docs/guides/MEMORY.md +235 -0
- package/docs/guides/MULTI_MODEL.md +223 -0
- package/docs/guides/POLICIES.md +190 -0
- package/docs/guides/WORKTREE_WORKFLOW.md +185 -0
- package/docs/integrations/MCP_ROUTER.md +147 -0
- package/docs/integrations/RTK.md +102 -0
- package/docs/reference/API.md +485 -0
- package/docs/reference/CLI.md +719 -0
- package/docs/reference/CONFIGURATION.md +90 -193
- package/docs/reference/DATABASE_SCHEMA.md +110 -344
- package/docs/reference/FEATURES.md +176 -472
- package/docs/reference/PATTERNS.md +102 -0
- package/docs/reference/PLATFORMS.md +83 -0
- package/package.json +1 -1
- package/docs/AGENTS.md +0 -423
- package/docs/DOCUMENTATION_AUDIT_REPORT.md +0 -131
- package/docs/GETTING_STARTED.md +0 -288
- package/docs/PROJECT_ANALYSIS_REPORT.md +0 -510
- package/docs/architecture/COMPLETE_ARCHITECTURE.md +0 -748
- package/docs/architecture/EXPERT_STACK.md +0 -137
- package/docs/architecture/MULTI_MODEL.md +0 -224
- package/docs/architecture/PLATFORM_GATING.md +0 -68
- package/docs/architecture/SYSTEM_ANALYSIS.md +0 -334
- package/docs/architecture/UAP_COMPLIANCE.md +0 -217
- package/docs/architecture/UAP_PROTOCOL.md +0 -339
- package/docs/architecture/UAP_STRICT_DROIDS.md +0 -172
- package/docs/archive/BALLS_MODE_SELF_ANALYSIS.md +0 -260
- package/docs/archive/BENCHMARK_GAPS_AND_PLAN.md +0 -146
- package/docs/archive/FAILING_TASKS_SOLUTION_PLAN.md +0 -668
- package/docs/archive/JINJA2-SYSTEM-MESSAGE-FIX.md +0 -209
- package/docs/archive/MODEL_ROUTING_IMPLEMENTATION_SUMMARY.md +0 -281
- package/docs/archive/MODEL_ROUTING_OPTIMIZATION_PLAN.md +0 -320
- package/docs/archive/NPM-PUBLISH-V0.9.1.md +0 -240
- package/docs/archive/OPTIMIZATION_OPTIONS.md +0 -334
- package/docs/archive/PARALLELISM_GAPS_AND_OPTIONS.md +0 -422
- package/docs/archive/POLICY_GATE_IMPLEMENTATION.md +0 -245
- package/docs/archive/SETUP_IMPROVEMENTS.md +0 -213
- package/docs/archive/UAP_GENERIC_OPTIMIZATION_PLAN.md +0 -270
- package/docs/archive/UAP_OPTIMIZATION_PLAN.md +0 -701
- package/docs/archive/UAP_V103_PATTERN_DESIGN.md +0 -315
- package/docs/archive/UAP_V104_COMPLIANCE_DESIGN.md +0 -223
- package/docs/archive/changelog/2026-03-10_uap-100-compliance.md +0 -77
- package/docs/archive/changelog/2026-03-10_uap-full-system-verification.md +0 -109
- package/docs/archive/opencode-integration-guide.md +0 -740
- package/docs/archive/opencode-integration-quickref.md +0 -180
- package/docs/benchmarks/OVERNIGHT_RUNNER.md +0 -341
- package/docs/benchmarks/SPECULATIVE_DECODING_JOURNEY_2026-03.md +0 -221
- package/docs/benchmarks/VALIDATION_PLAN.md +0 -568
- package/docs/blog/SPECULATIVE_DECODING_PRODUCTION_PLAYBOOK.md +0 -139
- package/docs/blog/local-coding-agents.md +0 -266
- package/docs/blog/x-thread.md +0 -254
- package/docs/deployment/DEPLOYMENT.md +0 -895
- package/docs/deployment/DEPLOYMENT_STRATEGIES.md +0 -518
- package/docs/deployment/DEPLOY_BATCHER_ANALYSIS.md +0 -224
- package/docs/deployment/DEPLOY_BATCHING.md +0 -273
- package/docs/deployment/DEPLOY_BUCKETING_ANALYSIS.md +0 -420
- package/docs/deployment/QWEN35_LLAMA_CPP.md +0 -426
- package/docs/deployment/UAP_LLAMA_ANTHROPIC_PROXY_BOOTSTRAP.md +0 -279
- package/docs/getting-started/INTEGRATION.md +0 -628
- package/docs/getting-started/OVERVIEW.md +0 -324
- package/docs/getting-started/SETUP.md +0 -377
- package/docs/integrations/MCP_ROUTER_SETUP.md +0 -445
- package/docs/integrations/RTK_INTEGRATION.md +0 -468
- package/docs/operations/TROUBLESHOOTING.md +0 -660
- package/docs/pr/PR_SPECULATIVE_DOCS_TEMPLATE.md +0 -146
- package/docs/pr/UPSTREAM_PRS.md +0 -424
- package/docs/reference/API_REFERENCE.md +0 -903
- package/docs/reference/EXPERT_DROIDS.md +0 -219
- package/docs/reference/HARNESS-MATRIX.md +0 -318
- package/docs/reference/PATTERN_LIBRARY.md +0 -636
- package/docs/reference/UAP_CLI_REFERENCE.md +0 -620
- package/docs/research/BEHAVIORAL_PATTERNS.md +0 -228
- package/docs/research/DOMAIN_STRATEGIES.md +0 -316
- package/docs/research/MEMORY_SYSTEMS_COMPARISON.md +0 -812
- package/docs/research/PATTERN_ANALYSIS_2026-01-18.md +0 -436
- package/docs/research/PERFORMANCE_ANALYSIS_2026-01-18.md +0 -209
- package/docs/research/PERFORMANCE_TEST_PLAN.md +0 -383
- package/docs/research/TERMINAL_BENCH_LEARNINGS.md +0 -217
@@ -0,0 +1,147 @@
# MCP Router

`v1.40.0` · `src/mcp-router/`

The MCP Router is a hierarchical Model Context Protocol server that sits in
front of all of your downstream MCP servers and dramatically reduces the tokens
the model spends on tool definitions and tool output. It is the mechanism behind
UAP's "up to 98% savings on large tool calls."

For where it fits in the wider system, see
[../architecture/OVERVIEW.md](../architecture/OVERVIEW.md#mcp-router-srcmcp-router).

---

## Why it exists

A normal MCP setup exposes every tool from every server directly to the model.
With a dozen servers that is easily 150+ tool schemas at roughly 500 tokens
each: tens of thousands of tokens of context burned before the agent does any
work. On top of that, tools like file readers and shell wrappers return large
outputs that flood the context window.

The router fixes both:

1. **Tool hiding.** It exposes just three meta-tools instead of every
   downstream tool. The documented design target is ~75,000 tokens of tool
   definitions collapsed to ~700 (`src/mcp-router/index.ts`,
   `src/mcp-router/server.ts`).
2. **Output compression.** Large tool results are indexed into an in-memory
   SQLite **FTS5** table and only the most relevant snippets are returned
   (`src/mcp-router/output-compressor.ts`).
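
As a rough check on the tool-hiding figures above, the arithmetic can be worked through in a few lines. This is only a back-of-the-envelope sketch: the function name is hypothetical, and the inputs are the round numbers quoted in this section (150 tools, about 500 tokens each, about 700 tokens for the router), not measurements.

```python
def tool_definition_savings(tools: int, tokens_per_tool: int, router_tokens: int) -> float:
    """Fraction of tool-definition tokens saved by exposing only the meta-tools."""
    baseline = tools * tokens_per_tool
    return 1 - router_tokens / baseline


# Round numbers quoted in the text above, not measured values.
baseline_tokens = 150 * 500
savings = tool_definition_savings(150, 500, 700)
print(f"{baseline_tokens} -> 700 tokens, {savings:.1%} saved")
# prints "75000 -> 700 tokens, 99.1% saved"
```

With these inputs the saving comes out at roughly 99%, the same order as the headline figure; real numbers depend on how many servers you run and how verbose their schemas are.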

---

## How it works

### The three meta-tools

Instead of N downstream tools, the model sees:

| Meta-tool | What it does |
|-----------|--------------|
| `discover_tools` | Natural-language query → matching downstream tool paths |
| `execute_tool` | Run a tool by `path` with `args` (+ optional `intent`) |
| `deliver` | Run the `uap deliver` convergence loop |

Downstream tools are loaded into an in-memory fuzzy search index at startup and
are never surfaced as definitions. The agent's flow is:

```
discover_tools("read the auth config")
        │  → [ "filesystem.read_file", ... ]
        ▼
execute_tool({ path: "filesystem.read_file",
               args: { path: "src/auth.ts" },
               intent: "csrf token validation" })
        │
        ▼
┌───────────── output compressor ─────────────┐
│ small result → passthrough                  │
│ large result → FTS5 index + BM25(intent)    │
│    → top snippets + searchable-vocab footer │
│ huge / no intent → head+tail truncation     │
└─────────────────────────────────────────────┘
```
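
The discovery step at the top of this flow can be illustrated with a small fuzzy matcher. This is a minimal sketch, not the router's actual index: the tool catalogue, the descriptions, and the scoring (best per-word similarity from Python's `difflib`) are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical catalogue: tool path -> description (not taken from UAP).
TOOLS = {
    "filesystem.read_file": "read the contents of a file from disk",
    "filesystem.write_file": "write text to a file on disk",
    "git.status": "show the working tree status of a git repository",
    "shell.run": "run a shell command and capture its output",
}


def _similarity(a: str, b: str) -> float:
    """Similarity of two words in [0, 1]; 1.0 means identical."""
    return SequenceMatcher(None, a, b).ratio()


def discover_tools(query: str, limit: int = 3) -> list:
    """Rank tool paths by fuzzy word overlap between the query and each tool's text."""
    query_words = query.lower().split()
    scored = []
    for path, description in TOOLS.items():
        text = path.replace(".", " ").replace("_", " ") + " " + description
        tool_words = text.lower().split()
        # For each query word, take its best fuzzy match among the tool's words.
        score = sum(max(_similarity(q, t) for t in tool_words) for q in query_words)
        scored.append((score, path))
    scored.sort(reverse=True)
    return [path for _, path in scored[:limit]]


matches = discover_tools("read the auth config")
print(matches[0])
```

Only the returned paths reach the model; the schemas of the matched tools stay inside the router.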

The `intent` string on `execute_tool` is what drives the BM25 query: provide a
focused intent to get focused snippets. The model can then issue a follow-up
`execute_tool` with a refined intent, using terms from the vocabulary footer.
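
The three compressor branches and the intent-driven BM25 query can be sketched with Python's built-in `sqlite3` module (this requires an SQLite build with FTS5 enabled). It is an illustration under stated assumptions, not the code in `src/mcp-router/output-compressor.ts`: the size thresholds, the line-per-row chunking, and the function names are hypothetical, and the vocabulary footer is omitted.

```python
import sqlite3
from typing import Optional

# Hypothetical thresholds in characters; the router's real limits are not documented here.
SMALL_LIMIT = 2_000
HUGE_LIMIT = 200_000


def head_tail(text: str, keep: int = 500) -> str:
    """Keep the first and last `keep` characters of an oversized result."""
    if len(text) <= 2 * keep:
        return text
    return text[:keep] + "\n...[truncated]...\n" + text[-keep:]


def compress(output: str, intent: Optional[str], top_k: int = 3) -> str:
    if len(output) <= SMALL_LIMIT:
        return output  # small result: passthrough
    if not intent or len(output) > HUGE_LIMIT:
        return head_tail(output)  # huge or no intent: head+tail truncation

    # Quote each intent word so FTS5 treats it as a term, then OR the terms together.
    words = [w.replace('"', "") for w in intent.split()]
    query = " OR ".join(f'"{w}"' for w in words if w)
    if not query:
        return head_tail(output)

    db = sqlite3.connect(":memory:")
    db.execute("CREATE VIRTUAL TABLE chunks USING fts5(body)")
    db.executemany(
        "INSERT INTO chunks(body) VALUES (?)",
        [(line,) for line in output.splitlines() if line.strip()],
    )
    # bm25() returns lower values for better matches, so ascending order is best-first.
    rows = db.execute(
        "SELECT body FROM chunks WHERE chunks MATCH ? ORDER BY bm25(chunks) LIMIT ?",
        (query, top_k),
    ).fetchall()
    db.close()
    return "\n".join(body for (body,) in rows) or head_tail(output)
```

A narrow intent such as `"csrf token"` returns only the lines that mention those terms, which is why a focused intent produces focused snippets.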

---

## Setup

### One command, all harnesses

```bash
uap mcp-setup
```

`uap mcp-setup` (`src/cli/setup-mcp-router.ts`) configures the MCP Router as the
single MCP server across your AI harnesses. It writes a `mcpServers.router`
entry pointing at the router, backs up and migrates any existing servers
(prompting first unless `--force` is passed), then validates the result with
`uap mcp-router list`.

Harnesses configured (global `~/` config paths):

| Harness | Config file |
|---------|-------------|
| Claude Code | `~/.claude/settings.json` |
| Factory.AI | `~/.factory/mcp.json` |
| VSCode | `~/.vscode/mcp.json` (skipped if absent) |
| Cursor | `~/.cursor/settings.json` |

The router entry it writes looks like:

```json
{
  "mcpServers": {
    "router": {
      "command": "npx",
      "args": ["uap", "mcp-router", "start"]
    }
  }
}
```
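
To make the effect on a harness config concrete, here is a minimal sketch of writing that entry by hand with a backup. It is not what `src/cli/setup-mcp-router.ts` does internally: the function name and the `.bak` suffix are hypothetical, and the migration of existing servers is not attempted.

```python
import json
import shutil
from pathlib import Path

# The entry shown above; everything else in this sketch is illustrative.
ROUTER_ENTRY = {"command": "npx", "args": ["uap", "mcp-router", "start"]}


def write_router_entry(config_path: Path) -> dict:
    """Back up config_path, then set mcpServers.router, keeping other keys intact."""
    config = {}
    if config_path.exists():
        # Hypothetical backup naming: settings.json -> settings.json.bak
        shutil.copy2(config_path, config_path.with_name(config_path.name + ".bak"))
        config = json.loads(config_path.read_text())
    config.setdefault("mcpServers", {})["router"] = ROUTER_ENTRY
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config
```

In practice `uap mcp-setup` is the supported path; a sketch like this is mainly useful for understanding what changed in a config file after the command ran.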

### Running and inspecting the router

`uap mcp-router <action>` (`src/cli/mcp-router.ts`) drives the router directly:

```bash
uap mcp-router start      # run the stdio MCP server (what harnesses launch)
uap mcp-router list       # list discovered downstream tools
uap mcp-router discover   # try a natural-language tool discovery query
uap mcp-router stats      # token-savings stats
```

`uap mcp-router start` is the command harnesses invoke via the generated config;
you normally don't run it by hand.

---

## Verifying it works

```bash
uap mcp-router list     # should enumerate tools from your downstream servers
uap mcp-router stats    # shows tool-hiding savings + per-output compression
uap hooks doctor        # confirms gate/router wiring across harnesses
```

If `list` is empty, the router found no downstream MCP configs. Confirm your
harness still has its original MCP servers defined (they are migrated into the
router's view, not deleted) and re-run `uap mcp-setup`.

---

## Notes

- The router reads downstream MCP configs from Claude Desktop, Cursor, VSCode,
  Claude Code CLI, Factory.AI, and a local `mcp.json`, expands `~`/env vars,
  skips disabled servers, and refuses to reference itself.
- The 98% / 75k→700 figures are the documented design target for tool hiding;
  per-output FTS5 savings are computed live for each call and reported by
  `uap mcp-router stats`.
- Pair the router with **RTK** for CLI-output savings; see
  [RTK.md](RTK.md). The two are complementary (tool definitions + CLI output).
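
The config-loading rules in the first note (expand `~` and environment variables, skip disabled servers, never reference the router itself) can be sketched as a filter over one harness config. This is an illustration only: the `disabled` flag name, the self-reference test, and the function names are assumptions, not the router's actual schema or logic.

```python
import os


def expand(value: str) -> str:
    """Expand a leading ~ and $VAR / ${VAR} references."""
    return os.path.expandvars(os.path.expanduser(value))


def downstream_servers(config: dict, router_name: str = "router") -> dict:
    """Filter one harness config's mcpServers down to usable downstream servers."""
    servers = {}
    for name, spec in config.get("mcpServers", {}).items():
        if spec.get("disabled"):  # assumed flag name
            continue
        args = [expand(a) for a in spec.get("args", [])]
        if name == router_name or "mcp-router" in args:
            continue  # never route to ourselves
        servers[name] = {**spec, "command": expand(spec.get("command", "")), "args": args}
    return servers
```

Applied to a config that contains the router entry, one enabled server, and one disabled server, only the enabled downstream server survives.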
@@ -0,0 +1,102 @@
# RTK — Rust Token Killer

`v1.40.0` · `src/cli/rtk.ts`

RTK (Rust Token Killer) is a fast CLI proxy that compresses and filters the
output of command-line tools (`git status`, test runs, file reads, and similar
heavy commands) to cut the tokens your agent spends echoing terminal output.
The source (`src/cli/rtk.ts`) cites **60–90% token savings on CLI command output**.

RTK is a separate, open-source tool (`https://github.com/rtk-ai/rtk`,
docs at `https://www.rtk-ai.app`). UAP integrates with it but does not bundle
it; `uap rtk` manages installation and wiring.

---

## Why RTK + the MCP Router

The two integrations target different sources of token waste and stack:

| Layer | Tool | Saves on |
|-------|------|----------|
| MCP tool definitions + tool output | [MCP Router](MCP_ROUTER.md) | ~98% of tool-definition tokens |
| Raw CLI command output | **RTK** | 60–90% of CLI-output tokens |

The source describes the combination as **95%+ total token reduction**
(`src/cli/rtk.ts`).

---

## How UAP integrates RTK

`uap rtk <command>` (`src/cli/rtk.ts`):

```bash
uap rtk install     # install RTK, auto-detecting the best method
uap rtk status      # check install + hook wiring + recent savings
uap rtk help        # usage
```

### `uap rtk install`

Auto-detects the best install method (Homebrew → Cargo → pre-built binary via
curl) and runs it. Override with flags:

```bash
uap rtk install --method homebrew   # force a method: homebrew | cargo | curl
uap rtk install --force             # reinstall
```
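
The auto-detection order described above (Homebrew, then Cargo, then a pre-built binary via curl) amounts to a simple fallback chain. The sketch below illustrates that order only; the function name is hypothetical, and checking for `brew` and `cargo` on `PATH` is an assumption about how detection could work, not a description of `src/cli/rtk.ts`.

```python
import shutil
from typing import Callable, Optional

METHODS = ("homebrew", "cargo", "curl")


def pick_install_method(
    which: Callable[[str], Optional[str]] = shutil.which,
    forced: Optional[str] = None,
) -> str:
    """Return 'homebrew', 'cargo', or 'curl' following the documented preference order."""
    if forced is not None:
        # Mirrors the idea of `--method`: an explicit choice wins over detection.
        if forced not in METHODS:
            raise ValueError(f"unknown method: {forced}")
        return forced
    if which("brew"):
        return "homebrew"
    if which("cargo"):
        return "cargo"
    return "curl"  # last resort: download a pre-built binary


print(pick_install_method())
```

Passing `which` as a parameter keeps the function easy to exercise without touching the real `PATH`.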

Equivalent manual installs:

```bash
brew install rtk                                    # Homebrew
cargo install --git https://github.com/rtk-ai/rtk   # Cargo
# or download a release binary from:
#   https://github.com/rtk-ai/rtk/releases
```

After install, initialize and verify:

```bash
rtk init --global   # set up the global rewrite hook
rtk gain            # show token savings analytics
```

### `uap rtk status`

Reports whether the `rtk` binary is installed, whether the rewrite hook
(`~/.claude/hooks/rtk-rewrite.sh`) is wired in, and recent savings from
`rtk gain`.

---

## How it works in practice

Once the rewrite hook is installed, heavy CLI commands are transparently routed
through RTK (e.g. `git status` is rewritten to `rtk git status`) with zero
extra tokens of overhead: the agent issues normal commands and RTK compresses
the output before it reaches the model.
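
The rewrite itself is conceptually a prefixing step. The sketch below shows the idea in Python purely for illustration: the real hook is a shell script installed by RTK, and the set of "heavy" commands here is a made-up example, not RTK's actual rule set.

```python
import shlex

# Hypothetical list of "heavy" commands; RTK's real rewrite rules are not shown here.
HEAVY = {"git", "cargo", "pytest", "ls", "cat"}


def rewrite(command: str) -> str:
    """Prefix heavy commands with `rtk`; leave everything else untouched."""
    try:
        parts = shlex.split(command)
    except ValueError:
        return command  # unbalanced quotes: do not try to be clever
    if not parts or parts[0] == "rtk" or parts[0] not in HEAVY:
        return command
    return "rtk " + command


print(rewrite("git status"))  # prints "rtk git status"
```

Commands that are already wrapped, or that are not on the list, pass through unchanged, so the agent never needs to know the hook exists.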

UAP can nudge agents to route heavy CLIs through RTK via the `rtk_wrap.py`
policy enforcer (`src/policies/enforcers/rtk_wrap.py`).

Useful RTK meta-commands:

```bash
rtk gain              # token savings analytics
rtk gain --history    # command usage history with savings
rtk discover          # find missed savings opportunities
rtk --version         # verify the install
```

---

## Combined analytics

`uap rtk` can surface unified analytics combining MCP Router and RTK savings
(`showUnifiedAnalytics` in `src/cli/rtk.ts`), so you can see total context
reduction from both layers at once.
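
One plausible way to combine the two layers into a single figure is to weight each layer by how many tokens it handled. The sketch below is an assumption about the calculation, not a description of `showUnifiedAnalytics`; the class, the function, and the RTK numbers are illustrative (the router numbers reuse the 75k→700 design target quoted in MCP_ROUTER.md).

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LayerStats:
    name: str
    tokens_before: int
    tokens_after: int

    @property
    def saved(self) -> int:
        return self.tokens_before - self.tokens_after


def unified_reduction(layers: List[LayerStats]) -> float:
    """Token-weighted reduction across all layers, as a fraction of the raw total."""
    before = sum(layer.tokens_before for layer in layers)
    saved = sum(layer.saved for layer in layers)
    return saved / before if before else 0.0


# Illustrative inputs only, not measurements.
layers = [
    LayerStats("mcp-router", tokens_before=75_000, tokens_after=700),
    LayerStats("rtk", tokens_before=20_000, tokens_after=4_000),
]
total = unified_reduction(layers)
print(f"{total:.1%}")  # prints "95.1%"
```

Weighting by token volume matters: a layer with a high percentage saving but little traffic contributes less to the total than its own percentage suggests.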

See also: [MCP_ROUTER.md](MCP_ROUTER.md) ·
[../architecture/OVERVIEW.md](../architecture/OVERVIEW.md)