@miller-tech/uap 1.40.0 → 1.41.0

Files changed (150)
  1. package/README.md +109 -642
  2. package/dist/.tsbuildinfo +1 -1
  3. package/dist/cli/deliver-defaults.d.ts +23 -0
  4. package/dist/cli/deliver-defaults.d.ts.map +1 -0
  5. package/dist/cli/deliver-defaults.js +121 -0
  6. package/dist/cli/deliver-defaults.js.map +1 -0
  7. package/dist/cli/init.d.ts.map +1 -1
  8. package/dist/cli/init.js +29 -0
  9. package/dist/cli/init.js.map +1 -1
  10. package/dist/cli/setup.d.ts.map +1 -1
  11. package/dist/cli/setup.js +19 -0
  12. package/dist/cli/setup.js.map +1 -1
  13. package/dist/policies/policy-tools.d.ts +7 -0
  14. package/dist/policies/policy-tools.d.ts.map +1 -1
  15. package/dist/policies/policy-tools.js +24 -2
  16. package/dist/policies/policy-tools.js.map +1 -1
  17. package/docs/INDEX.md +48 -286
  18. package/docs/architecture/OVERVIEW.md +328 -0
  19. package/docs/architecture/PROTOCOL.md +204 -0
  20. package/docs/benchmarks/README.md +17 -192
  21. package/docs/getting-started/CONFIGURATION.md +237 -0
  22. package/docs/getting-started/INSTALLATION.md +125 -0
  23. package/docs/getting-started/QUICKSTART.md +115 -0
  24. package/docs/guides/COORDINATION.md +162 -0
  25. package/docs/guides/DELIVER.md +115 -0
  26. package/docs/guides/DEPLOY_BATCHING.md +212 -0
  27. package/docs/guides/DROIDS_AND_SKILLS.md +202 -0
  28. package/docs/guides/LOCAL_MODELS.md +148 -0
  29. package/docs/guides/MCP_ROUTER.md +195 -0
  30. package/docs/guides/MEMORY.md +235 -0
  31. package/docs/guides/MULTI_MODEL.md +223 -0
  32. package/docs/guides/POLICIES.md +190 -0
  33. package/docs/guides/WORKTREE_WORKFLOW.md +185 -0
  34. package/docs/integrations/MCP_ROUTER.md +147 -0
  35. package/docs/integrations/RTK.md +102 -0
  36. package/docs/reference/API.md +485 -0
  37. package/docs/reference/CLI.md +719 -0
  38. package/docs/reference/CONFIGURATION.md +90 -193
  39. package/docs/reference/DATABASE_SCHEMA.md +110 -344
  40. package/docs/reference/FEATURES.md +176 -472
  41. package/docs/reference/PATTERNS.md +102 -0
  42. package/docs/reference/PLATFORMS.md +83 -0
  43. package/package.json +3 -1
  44. package/src/policies/enforcers/7ebbc721-7540-4e9f-879a-770e0213a09b_architecture_review.py +101 -0
  45. package/src/policies/enforcers/__pycache__/_common.cpython-312.pyc +0 -0
  46. package/src/policies/enforcers/_common.py +100 -0
  47. package/src/policies/enforcers/artifact_hygiene.py +52 -0
  48. package/src/policies/enforcers/cluster_routing.py +63 -0
  49. package/src/policies/enforcers/codebase_read_before_plan.py +52 -0
  50. package/src/policies/enforcers/coord_overlap.py +81 -0
  51. package/src/policies/enforcers/delivery_enforcement.py +97 -0
  52. package/src/policies/enforcers/doc_live_over_report.py +50 -0
  53. package/src/policies/enforcers/expert_review_required.py +135 -0
  54. package/src/policies/enforcers/iac_parity.py +53 -0
  55. package/src/policies/enforcers/mcp_router_first.py +37 -0
  56. package/src/policies/enforcers/memory_before_plan.py +61 -0
  57. package/src/policies/enforcers/parallel_reads.py +50 -0
  58. package/src/policies/enforcers/rtk_wrap.py +44 -0
  59. package/src/policies/enforcers/schema_diff_gate.py +80 -0
  60. package/src/policies/enforcers/session_memory_write.py +52 -0
  61. package/src/policies/enforcers/task_required.py +131 -0
  62. package/src/policies/enforcers/test_gate.py +58 -0
  63. package/src/policies/enforcers/validate_plan_before_build.py +75 -0
  64. package/src/policies/enforcers/worktree_required.py +57 -0
  65. package/src/policies/schemas/policies/architecture-review.md +51 -0
  66. package/src/policies/schemas/policies/artifact-hygiene.md +29 -0
  67. package/src/policies/schemas/policies/cluster-routing.md +31 -0
  68. package/src/policies/schemas/policies/codebase-read-before-plan.md +30 -0
  69. package/src/policies/schemas/policies/coord-overlap.md +24 -0
  70. package/src/policies/schemas/policies/delivery-enforcement.md +45 -0
  71. package/src/policies/schemas/policies/doc-live-over-report.md +32 -0
  72. package/src/policies/schemas/policies/expert-review-required.md +60 -0
  73. package/src/policies/schemas/policies/iac-parity.md +31 -0
  74. package/src/policies/schemas/policies/mandatory-testing-deployment.md +147 -0
  75. package/src/policies/schemas/policies/mcp-router-first.md +24 -0
  76. package/src/policies/schemas/policies/memory-before-plan.md +24 -0
  77. package/src/policies/schemas/policies/merge-deploy-monitor-verify.md +145 -0
  78. package/src/policies/schemas/policies/parallel-reads.md +24 -0
  79. package/src/policies/schemas/policies/rtk-wrap.md +26 -0
  80. package/src/policies/schemas/policies/schema-diff-gate.md +30 -0
  81. package/src/policies/schemas/policies/session-memory-write.md +24 -0
  82. package/src/policies/schemas/policies/task-required.md +49 -0
  83. package/src/policies/schemas/policies/test-gate.md +24 -0
  84. package/src/policies/schemas/policies/validate-plan-before-build.md +28 -0
  85. package/src/policies/schemas/policies/worktree-required.md +28 -0
  86. package/templates/hooks/uap-policy-gate.sh +5 -0
  87. package/docs/AGENTS.md +0 -423
  88. package/docs/DOCUMENTATION_AUDIT_REPORT.md +0 -131
  89. package/docs/GETTING_STARTED.md +0 -288
  90. package/docs/PROJECT_ANALYSIS_REPORT.md +0 -510
  91. package/docs/architecture/COMPLETE_ARCHITECTURE.md +0 -748
  92. package/docs/architecture/EXPERT_STACK.md +0 -137
  93. package/docs/architecture/MULTI_MODEL.md +0 -224
  94. package/docs/architecture/PLATFORM_GATING.md +0 -68
  95. package/docs/architecture/SYSTEM_ANALYSIS.md +0 -334
  96. package/docs/architecture/UAP_COMPLIANCE.md +0 -217
  97. package/docs/architecture/UAP_PROTOCOL.md +0 -339
  98. package/docs/architecture/UAP_STRICT_DROIDS.md +0 -172
  99. package/docs/archive/BALLS_MODE_SELF_ANALYSIS.md +0 -260
  100. package/docs/archive/BENCHMARK_GAPS_AND_PLAN.md +0 -146
  101. package/docs/archive/FAILING_TASKS_SOLUTION_PLAN.md +0 -668
  102. package/docs/archive/JINJA2-SYSTEM-MESSAGE-FIX.md +0 -209
  103. package/docs/archive/MODEL_ROUTING_IMPLEMENTATION_SUMMARY.md +0 -281
  104. package/docs/archive/MODEL_ROUTING_OPTIMIZATION_PLAN.md +0 -320
  105. package/docs/archive/NPM-PUBLISH-V0.9.1.md +0 -240
  106. package/docs/archive/OPTIMIZATION_OPTIONS.md +0 -334
  107. package/docs/archive/PARALLELISM_GAPS_AND_OPTIONS.md +0 -422
  108. package/docs/archive/POLICY_GATE_IMPLEMENTATION.md +0 -245
  109. package/docs/archive/SETUP_IMPROVEMENTS.md +0 -213
  110. package/docs/archive/UAP_GENERIC_OPTIMIZATION_PLAN.md +0 -270
  111. package/docs/archive/UAP_OPTIMIZATION_PLAN.md +0 -701
  112. package/docs/archive/UAP_V103_PATTERN_DESIGN.md +0 -315
  113. package/docs/archive/UAP_V104_COMPLIANCE_DESIGN.md +0 -223
  114. package/docs/archive/changelog/2026-03-10_uap-100-compliance.md +0 -77
  115. package/docs/archive/changelog/2026-03-10_uap-full-system-verification.md +0 -109
  116. package/docs/archive/opencode-integration-guide.md +0 -740
  117. package/docs/archive/opencode-integration-quickref.md +0 -180
  118. package/docs/benchmarks/OVERNIGHT_RUNNER.md +0 -341
  119. package/docs/benchmarks/SPECULATIVE_DECODING_JOURNEY_2026-03.md +0 -221
  120. package/docs/benchmarks/VALIDATION_PLAN.md +0 -568
  121. package/docs/blog/SPECULATIVE_DECODING_PRODUCTION_PLAYBOOK.md +0 -139
  122. package/docs/blog/local-coding-agents.md +0 -266
  123. package/docs/blog/x-thread.md +0 -254
  124. package/docs/deployment/DEPLOYMENT.md +0 -895
  125. package/docs/deployment/DEPLOYMENT_STRATEGIES.md +0 -518
  126. package/docs/deployment/DEPLOY_BATCHER_ANALYSIS.md +0 -224
  127. package/docs/deployment/DEPLOY_BATCHING.md +0 -273
  128. package/docs/deployment/DEPLOY_BUCKETING_ANALYSIS.md +0 -420
  129. package/docs/deployment/QWEN35_LLAMA_CPP.md +0 -426
  130. package/docs/deployment/UAP_LLAMA_ANTHROPIC_PROXY_BOOTSTRAP.md +0 -279
  131. package/docs/getting-started/INTEGRATION.md +0 -628
  132. package/docs/getting-started/OVERVIEW.md +0 -324
  133. package/docs/getting-started/SETUP.md +0 -377
  134. package/docs/integrations/MCP_ROUTER_SETUP.md +0 -445
  135. package/docs/integrations/RTK_INTEGRATION.md +0 -468
  136. package/docs/operations/TROUBLESHOOTING.md +0 -660
  137. package/docs/pr/PR_SPECULATIVE_DOCS_TEMPLATE.md +0 -146
  138. package/docs/pr/UPSTREAM_PRS.md +0 -424
  139. package/docs/reference/API_REFERENCE.md +0 -903
  140. package/docs/reference/EXPERT_DROIDS.md +0 -219
  141. package/docs/reference/HARNESS-MATRIX.md +0 -318
  142. package/docs/reference/PATTERN_LIBRARY.md +0 -636
  143. package/docs/reference/UAP_CLI_REFERENCE.md +0 -620
  144. package/docs/research/BEHAVIORAL_PATTERNS.md +0 -228
  145. package/docs/research/DOMAIN_STRATEGIES.md +0 -316
  146. package/docs/research/MEMORY_SYSTEMS_COMPARISON.md +0 -812
  147. package/docs/research/PATTERN_ANALYSIS_2026-01-18.md +0 -436
  148. package/docs/research/PERFORMANCE_ANALYSIS_2026-01-18.md +0 -209
  149. package/docs/research/PERFORMANCE_TEST_PLAN.md +0 -383
  150. package/docs/research/TERMINAL_BENCH_LEARNINGS.md +0 -217
@@ -0,0 +1,162 @@
1
+ # Multi-Agent Coordination
2
+
3
+ > UAP v1.40.0
4
+
5
+ When multiple agents work a codebase in parallel, the expensive failure is two
6
+ of them editing the same file at the same time and colliding at merge. UAP's
7
+ coordination layer lets agents **register**, **announce** what they intend to
8
+ work on, and **check for overlaps** before they start — so parallel work stays
9
+ conflict-free.
10
+
11
+ The coordination modules live in
12
+ [`src/coordination/`](../../src/coordination/): the shared
13
+ [`service.ts`](../../src/coordination/service.ts) backed by a SQLite store
14
+ ([`database.ts`](../../src/coordination/database.ts)), agent lifecycle and
15
+ auto-registration ([`auto-agent.ts`](../../src/coordination/auto-agent.ts)),
16
+ the deploy batcher ([`deploy-batcher.ts`](../../src/coordination/deploy-batcher.ts)),
17
+ and routing/pattern helpers. The CLI entry points are
18
+ [`src/cli/agent.ts`](../../src/cli/agent.ts) and
19
+ [`src/cli/coord.ts`](../../src/cli/coord.ts).
20
+
21
+ ## The model
22
+
23
+ - **Agents** register with a name, optional capabilities, and an optional
24
+ worktree branch, and receive an `AGENT_ID`. They send periodic heartbeats so
25
+ stale agents can be cleaned up.
26
+ - **Work announcements** declare an *intent* (`editing`, `reviewing`,
27
+ `refactoring`, `testing`, `documenting`) against a *resource* (a file path or
28
+ other identifier), optionally with affected files, a description, and an
29
+ estimate in minutes.
30
+ - **Overlap detection** compares your announced resource against active work
31
+ from other agents and returns a conflict-risk assessment plus a suggestion.
32
+ - **Messaging** lets agents broadcast to a channel or send directly to another
33
+ agent.
34
+
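The overlap check described in the model above can be sketched as follows. This is a minimal illustration, not the code in `src/coordination/`: the type names, the `detectOverlaps` function, the intermediate risk levels between `low` and `critical`, and the "two writers is high risk" heuristic are all assumptions made for the example.

```typescript
// Hypothetical shapes for illustration; not the types used by UAP itself.
type Intent = "editing" | "reviewing" | "refactoring" | "testing" | "documenting";
type Risk = "low" | "medium" | "high" | "critical"; // only low/critical are named in the docs

interface WorkAnnouncement {
  agentId: string;
  resource: string;
  intent: Intent;
  files?: string[];
}

interface Overlap {
  agentId: string;
  intent: Intent;
  risk: Risk;
  suggestion: string;
}

// Assumed heuristic: intents that modify files are the ones that can collide.
const WRITE_INTENTS: ReadonlySet<Intent> = new Set<Intent>(["editing", "refactoring"]);

// True when another agent's announcement covers the given resource.
function touches(work: WorkAnnouncement, resource: string): boolean {
  return work.resource === resource || (work.files ?? []).includes(resource);
}

// Compare one announcement against active work from other agents.
// An empty result corresponds to the CLEAR case.
function detectOverlaps(mine: WorkAnnouncement, active: WorkAnnouncement[]): Overlap[] {
  return active
    .filter((other) => other.agentId !== mine.agentId && touches(other, mine.resource))
    .map((other) => {
      const bothWrite = WRITE_INTENTS.has(mine.intent) && WRITE_INTENTS.has(other.intent);
      const risk: Risk = bothWrite ? "high" : "low";
      const suggestion = bothWrite
        ? `Coordinate with ${other.agentId} before changing ${mine.resource}`
        : `Low risk: at most one side is modifying ${mine.resource}`;
      return { agentId: other.agentId, intent: other.intent, risk, suggestion };
    });
}
```

The sketch keeps detection a pure function of the announced work, so the same check could serve both `announce` and a read-only `overlaps` query; whether UAP structures it that way is not stated in the source.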
35
+ ## The announce / overlaps workflow
36
+
37
+ The recommended flow, printed by `uap agent register` itself:
38
+
39
+ ```bash
40
+ # 1. Register (once per agent)
41
+ uap agent register --name reviewer-1 --worktree feature/042-foo
42
+ # → prints AGENT_ID=<id>
43
+
44
+ # 2. Announce what you're about to work on
45
+ uap agent announce --id <id> --resource src/server.ts --intent editing \
46
+ --description "add request logging" --files src/server.ts --minutes 20
47
+
48
+ # 3. Check overlaps before editing (anyone can run this, no ID needed)
49
+ uap agent overlaps --resource src/server.ts
50
+
51
+ # 4. When finished, release the resource
52
+ uap agent complete --id <id> --resource src/server.ts
53
+ ```
54
+
55
+ `announce` immediately reports whether the resource is **CLEAR** or has
56
+ **overlapping work**. For each overlap it lists the other agents, their intent,
57
+ their worktree branch, a conflict-risk badge (`low` → `critical`), and a
58
+ suggestion. When risks exist it may also surface collaboration suggestions,
59
+ including a recommended merge order.
60
+
61
+ `complete` notifies other agents that the resource is free, so they can safely
62
+ merge.
63
+
64
+ ## CLI reference: `uap agent`
65
+
66
+ Agent lifecycle, work coordination, and communication.
67
+
68
+ ```bash
69
+ uap agent <action> [options]
70
+ ```
71
+
72
+ | Action | Purpose | Required options |
73
+ | ------------ | ------- | ---------------- |
74
+ | `register` | Register a new agent | `--name` |
75
+ | `auto` | Auto-register an agent that heartbeats (30s) and deregisters on exit | — (`--name` optional) |
76
+ | `heartbeat` | Send a liveness heartbeat | `--id` |
77
+ | `status` | Show one agent (`--id`) or all active agents + active work | — |
78
+ | `announce` | Announce work intent on a resource | `--id`, `--resource`, `--intent` |
79
+ | `complete` | Mark work complete on a resource (notifies others) | `--id`, `--resource` |
80
+ | `overlaps` | Show overlaps for a resource, or all active work if none given | — |
81
+ | `broadcast` | Broadcast a message to a channel | `--id`, `--channel`, `--message` |
82
+ | `send` | Send a direct message to another agent | `--id`, `--to`, `--message` |
83
+ | `receive` | Read pending messages | `--id` |
84
+ | `deregister` | Remove an agent | `--id` |
85
+
86
+ Key options:
87
+
88
+ - `--name`, `-i/--id`, `--capabilities` (comma-separated), `-w/--worktree`
89
+ (branch) — registration.
90
+ - `--resource`, `--intent` (`editing|reviewing|refactoring|testing|documenting`),
91
+ `--description`, `--files` (comma-separated), `--minutes` — announcing work.
92
+ - `-c/--channel` (`broadcast|deploy|review|coordination`), `--message`,
93
+ `-t/--to`, `--priority` — messaging.
94
+
95
+ ```bash
96
+ # Inspect everything currently in flight
97
+ uap agent status
98
+
99
+ # Message another agent directly
100
+ uap agent send --id <id> --to <other-id> --message "ready to merge src/a.ts"
101
+
102
+ # Broadcast on the review channel
103
+ uap agent broadcast --id <id> --channel review --message '{"action":"need-review"}'
104
+ ```
105
+
106
+ ## CLI reference: `uap coord`
107
+
108
+ System-wide coordination status and maintenance.
109
+
110
+ ```bash
111
+ uap coord <status|flush|cleanup> [options]
112
+ ```
113
+
114
+ | Action | Purpose |
115
+ | --------- | ------- |
116
+ | `status` | Show active agents, resource claims, the deploy queue, and unread-message counts |
117
+ | `flush` | Force-execute all pending deploys (see [Deploy Batching](./DEPLOY_BATCHING.md)) |
118
+ | `cleanup` | Mark stale agents as failed and remove expired claims, old messages, and completed entries |
119
+
120
+ ```bash
121
+ uap coord status -v
122
+ uap coord cleanup
123
+ ```
124
+
125
+ ## CLI reference: `uap coordination`
126
+
127
+ Focused overlap checks and resolution.
128
+
129
+ ```bash
130
+ uap coordination <check|resolve> [options]
131
+ ```
132
+
133
+ ### `check` — detect overlapping work
134
+
135
+ ```bash
136
+ uap coordination check [--agents <ids|names>] [-r|--resource <resource>] [-v] [--json]
137
+ ```
138
+
139
+ Filters active work by agent and/or resource, then reports overlaps with their
140
+ conflict risk and suggestions. `--json` emits machine-readable output.
141
+
142
+ ### `resolve` — broadcast a resolution
143
+
144
+ ```bash
145
+ uap coordination resolve <overlapId> [--action <assign|merge|delegate>] [--json]
146
+ ```
147
+
148
+ `<overlapId>` is the resource path. The resolution (default `merge`) is
149
+ broadcast on the `coordination` channel so other agents can act on it.
150
+
151
+ ```bash
152
+ uap coordination check --resource src/server.ts
153
+ uap coordination resolve src/server.ts --action merge
154
+ ```
155
+
156
+ ## Related
157
+
158
+ - [Deploy Batching](./DEPLOY_BATCHING.md) — how coordinated commits/pushes are
159
+ batched to avoid merge conflicts.
160
+ - `uap deliver --coordinate` registers a convergence run with the coordination
161
+ layer (announce + heartbeat + overlap detection); see
162
+ [Local Models](./LOCAL_MODELS.md).
@@ -0,0 +1,115 @@
1
+ # `uap deliver` — the delivery harness
2
+
3
+ `uap deliver` drives a model through a **convergence loop that iterates against your project's real completion gates until the work is actually delivered** — build green, tests passing, lint clean — not until the model *claims* it's done.
4
+
5
+ It is UAP's answer to "the agent said it finished, but nothing compiles." Instead of a single shot, `deliver` runs an execute → verify → critique → iterate loop, feeding real gate failures back to the model and persisting until the gates pass or the run provably stalls.
6
+
7
+ ```bash
8
+ uap deliver "implement the password reset flow"
9
+ ```
10
+
11
+ ---
12
+
13
+ ## How it works
14
+
15
+ The loop lives in `src/delivery/` (15 modules). Each turn:
16
+
17
+ 1. **Explore & plan** — the model reads the relevant code and proposes a change. With best-of-N exploration enabled, several candidate approaches are generated and the most promising is taken (`explorer.ts`).
18
+ 2. **Apply** — the applier writes the proposed file changes (`applier.ts`). Pre-existing test files, gate configs, and the transitive imports of your spec files are **protected from being overwritten by default** — the model cannot "pass" by editing the tests. A runtime integrity guard hashes protected files and rejects tampering (`integrity.ts`, `spec-imports.ts`).
19
+ 3. **Verify** — the verifier ladder runs your real gates — build, typecheck, test, lint — and scores the turn (`verifier-ladder.ts`). Nothing counts as delivered until the required gates are green.
20
+ 4. **Critique & feed back** — failures are turned into structured guidance for the next turn (`critic.ts`); learned best-practice cards can be injected and recorded on success (`practice.ts`).
21
+ 5. **Iterate until delivered** — the loop continues. By default it **extends past `--max-turns` up to a ceiling**, stopping early only on genuine stagnation (no score improvement across several turns). On stagnation with `--escalate`, it widens exploration, adds a critic pass, and finally escalates to a stronger model (`escalation.ts`).
22
+
23
+ ```
24
+ ┌──────────── guidance file (optional) ───────────┐
25
+ ▼ │
26
+ explore → apply → verify (build/test/lint) → critique ──┘
27
+ ▲ │
28
+ └──── until delivered ◄──┘ (stops on green gates or stagnation)
29
+ ```
30
+
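The until-delivered control flow from step 5 can be sketched as below. This is an illustration under stated assumptions, not the loop in `src/delivery/`: the `runLoop` function, the option names, and the `stagnationTurns` threshold are hypothetical (the source only says "several turns"), and the sketch assumes stagnation can end a run at any point.

```typescript
// Hypothetical option and result shapes; the names mirror the CLI flags only loosely.
interface LoopOptions {
  maxTurns: number;        // --max-turns (documented default 5)
  ceiling: number;         // --ceiling (documented default 30)
  untilDelivered: boolean; // false when --no-until-delivered is passed
  stagnationTurns: number; // assumed: turns without improvement that count as a stall
}

interface TurnResult {
  score: number;       // verifier score for the turn
  gatesGreen: boolean; // true when every required gate passed
}

type Outcome = "delivered" | "stagnated" | "budget-exhausted";

function runLoop(
  runTurn: (turn: number) => TurnResult,
  opts: LoopOptions,
): { outcome: Outcome; turns: number } {
  // With until-delivered on, the budget extends from maxTurns up to the ceiling.
  const limit = opts.untilDelivered ? Math.max(opts.maxTurns, opts.ceiling) : opts.maxTurns;
  let best = -Infinity;
  let sinceImprovement = 0;

  for (let turn = 1; turn <= limit; turn++) {
    const result = runTurn(turn);
    if (result.gatesGreen) return { outcome: "delivered", turns: turn };

    if (result.score > best) {
      best = result.score;
      sinceImprovement = 0;
    } else {
      sinceImprovement++;
    }
    // Stop early only when the score has not improved for several turns.
    if (sinceImprovement >= opts.stagnationTurns) return { outcome: "stagnated", turns: turn };
  }
  return { outcome: "budget-exhausted", turns: limit };
}
```

Passing the turn as a callback keeps the stopping rules separate from exploration, applying, and verification, which makes the budget logic easy to reason about on its own.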
31
+ ---
32
+
33
+ ## Autonomy
34
+
35
+ `deliver` runs the **whole mission without stopping to ask between phases**. It still reports progress, and you can steer it live through a guidance channel:
36
+
37
+ ```bash
38
+ uap deliver "migrate the auth module to JWT" --guidance-file ./guidance.txt
39
+ # in another shell, append guidance at any time — the loop polls it each turn:
40
+ echo "prefer RS256 and keep the existing /login route" >> ./guidance.txt
41
+ ```
42
+
43
+ ---
44
+
45
+ ## Auto-optimization
46
+
47
+ By default every task is **classified by complexity** and the matching convergence aids turn on automatically (`auto-optimizer.ts`). You don't have to tune anything for the common case. To control it explicitly:
48
+
49
+ ```bash
50
+ uap deliver "big refactor across modules" --optimize # enable every aid
51
+ uap deliver "trivial typo fix" --no-auto # disable dynamic optimization
52
+ ```
53
+
54
+ `--optimize` enables exploration, critic, practices, escalation, ideation, HALO spans, and coordination together.
55
+
56
+ ---
57
+
58
+ ## Options
59
+
60
+ | Flag | Purpose |
61
+ |---|---|
62
+ | `--max-turns <n>` | Base budget of execute→verify iterations (default `5`); until-delivered mode can extend past it, up to `--ceiling` |
63
+ | `--no-until-delivered` | Disable loop-until-delivered (ON by default: extends past `--max-turns` to the ceiling, stopping on stagnation) |
64
+ | `--ceiling <n>` | Hard turn ceiling for until-delivered (1–50, default `30`) |
65
+ | `-m, --model <preset>` | Model preset (default `$UAP_DELIVER_MODEL` or `qwen35-a3b`) |
66
+ | `--endpoint <url>` | Override the model endpoint (OpenAI-compatible `/v1`) |
67
+ | `--escalate-model <preset>` | Stronger model for escalation (default `$UAP_ESCALATE_MODEL`) |
68
+ | `--temperature <t>` | Sampling temperature (default: execution-profile value) |
69
+ | `--gates <ids>` | Gate subset: `build,typecheck,test,lint` |
70
+ | `--candidates <n>` | Best-of-N exploration: candidates per turn (2–8) |
71
+ | `--critic` | Structured critique of failed turns |
72
+ | `--practices` / `--no-semantic` | Inject/record best-practice cards (keyword retrieval with `--no-semantic`) |
73
+ | `--escalate` | Escalation ladder on stagnation |
74
+ | `--ideate` / `--ideate-project <name>` | Divergent ideation strategy seeds |
75
+ | `--halo` | Emit HALO spans (analyze with `uap harness analyze`) |
76
+ | `--coordinate` | Register the run with the coordination layer |
77
+ | `--deploy` | On success, queue a commit of applied files into the deploy batcher |
78
+ | `--optimize` | Enable every convergence aid |
79
+ | `--no-auto` | Disable dynamic optimization |
80
+ | `--no-protect-tests` | Allow the model to modify pre-existing test files (protected by default) |
81
+ | `--guidance-file <path>` | Poll a file each turn for live operator guidance |
82
+ | `--project-root <path>` | Project whose gates define delivery (default: cwd) |
83
+ | `--dry-run` | Show detected gates and plan without calling the model |
84
+ | `--json` | Emit a JSON result |
85
+
86
+ ---
87
+
88
+ ## Local or frontier models
89
+
90
+ `deliver` speaks the OpenAI-compatible `/v1` API, so it runs against frontier models or a **local model** (e.g. Qwen on llama.cpp). The default preset `qwen35-a3b` targets a local server; point elsewhere with `--endpoint` / `--model`. See **[Local Models](LOCAL_MODELS.md)**.
91
+
92
+ ```bash
93
+ uap deliver "add a healthcheck endpoint" --model qwen35-a3b --endpoint http://127.0.0.1:8080/v1
94
+ ```
95
+
96
+ ---
97
+
98
+ ## Automatic routing & enforcement
99
+
100
+ - **MCP `deliver` meta-tool** — harnesses with the MCP router can auto-route a coding task into `uap deliver` without a shell call (see [MCP Router](../integrations/MCP_ROUTER.md)).
101
+ - **delivery-enforcement policy** — an optional policy gate that routes source edits through `deliver` rather than ad-hoc writes. It is a cooperative-agent guardrail, not a security boundary (see [Policies](POLICIES.md)).
102
+
103
+ ---
104
+
105
+ ## Dry run first
106
+
107
+ ```bash
108
+ uap deliver "add input validation to the signup form" --dry-run
109
+ ```
110
+
111
+ A dry run shows the gates UAP detected and the plan without spending a single model token. It is the fastest way to confirm `deliver` understands your project's definition of done.
112
+
113
+ ---
114
+
115
+ See also: [Architecture overview](../architecture/OVERVIEW.md) · [Policies](POLICIES.md) · [Multi-model routing](MULTI_MODEL.md)
@@ -0,0 +1,212 @@
1
+ # Deploy Batching
2
+
3
+ > UAP v1.40.0
4
+
5
+ When several agents work in parallel, they all want to commit, push, merge, and
6
+ deploy at roughly the same time. Left unmanaged, that produces two failure
7
+ modes:
8
+
9
+ - **Merge conflicts** — two agents push to the same branch within seconds of
10
+ each other and the second push is rejected (or worse, races into a conflicted
11
+ state).
12
+ - **Thundering deploys** — a burst of redundant deploys, duplicate workflow
13
+ triggers, and a noisy commit history full of one-line commits.
14
+
15
+ The deploy batcher solves this by *queueing* git/deploy actions and grouping
16
+ them inside short, per-action-type time windows. Commits to the same branch are
17
+ squashed, duplicate pushes and workflow triggers are deduplicated, and the
18
+ result is executed as a single ordered batch.
19
+
20
+ The implementation lives in
21
+ [`src/coordination/deploy-batcher.ts`](../../src/coordination/deploy-batcher.ts),
22
+ with the CLI surface in [`src/cli/deploy.ts`](../../src/cli/deploy.ts).
23
+
24
+ ## How it works
25
+
26
+ 1. An agent **queues** an action (`commit`, `push`, `merge`, `deploy`, or
27
+ `workflow`) against a target (branch, environment, or workflow name).
28
+ 2. Each action gets an `execute_after` timestamp computed from its
29
+ type-specific batch window. Until that time passes, the action stays
30
+ `pending`.
31
+ 3. If a *similar* pending action already exists for the same type + target, the
32
+ new one is **merged** into it instead of being queued separately:
33
+ - `commit` actions are squashed (messages concatenated, file lists unioned).
34
+ - `push` actions to the same branch are merged.
35
+ - `workflow` triggers are deduplicated.
36
+ 4. **Creating a batch** collects every pending action whose window has elapsed,
37
+ groups them by `actionType:target`, squashes where possible, and assigns a
38
+ batch ID.
39
+ 5. **Executing a batch** runs the actions. State-dependent actions (`commit`,
40
+ `push`, `merge`, `deploy`) run sequentially in priority order;
41
+ `workflow` triggers can run in parallel.
42
+
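Steps 1 to 4 above can be sketched as a small in-memory queue. This is a simplified illustration, not `deploy-batcher.ts`: apart from `execute_after`, every name here (`QueuedAction`, `enqueue`, `createBatch`) is hypothetical, and the sketch assumes `merge` and `deploy` actions are always queued separately, which the source does not state.

```typescript
// Hypothetical shapes; only `execute_after` is a name taken from the text above.
type ActionType = "commit" | "push" | "merge" | "deploy" | "workflow";

interface QueuedAction {
  actionType: ActionType;
  target: string;
  execute_after: number; // epoch ms after which the action is ready
  messages: string[];
  files: string[];
}

type NewAction = Omit<QueuedAction, "execute_after">;

// Types the text says are merged into a similar pending action.
const MERGEABLE: ReadonlySet<ActionType> = new Set<ActionType>(["commit", "push", "workflow"]);

// Queue an action, or fold it into a pending action with the same type and target.
function enqueue(
  pending: QueuedAction[],
  windows: Record<ActionType, number>,
  incoming: NewAction,
  now: number,
): void {
  const existing = pending.find(
    (a) => a.actionType === incoming.actionType && a.target === incoming.target,
  );
  if (existing !== undefined && MERGEABLE.has(incoming.actionType)) {
    existing.messages.push(...incoming.messages); // messages concatenated
    existing.files = Array.from(new Set([...existing.files, ...incoming.files])); // files unioned
    return;
  }
  pending.push({
    actionType: incoming.actionType,
    target: incoming.target,
    messages: [...incoming.messages],
    files: [...incoming.files],
    execute_after: now + windows[incoming.actionType],
  });
}

// Collect actions whose window has elapsed, grouped by `actionType:target`.
function createBatch(
  pending: QueuedAction[],
  now: number,
): { batch: Map<string, QueuedAction[]>; remaining: QueuedAction[] } {
  const batch = new Map<string, QueuedAction[]>();
  const remaining: QueuedAction[] = [];
  for (const action of pending) {
    if (action.execute_after > now) {
      remaining.push(action);
      continue;
    }
    const key = `${action.actionType}:${action.target}`;
    const group = batch.get(key) ?? [];
    group.push(action);
    batch.set(key, group);
  }
  return { batch, remaining };
}
```

In this sketch a merged action keeps the `execute_after` of the first queued action, so a steady stream of commits cannot postpone a batch indefinitely; whether the real batcher resets the window on merge is not covered by the source.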
43
+ Actions are executed with real tooling: `git add` / `git commit` / `git push`
44
+ (`--force-with-lease` when forced) / `git merge`, `gh workflow run`, and a
45
+ configurable deploy command. Each external command runs under a timeout
46
+ (default 300000 ms / 5 minutes) so a hung process can't block the pipeline.
47
+
48
+ ## Batch windows per action type
49
+
50
+ Each action type has its own default window. Shorter windows favor speed;
51
+ longer windows favor more batching (fewer, larger operations).
52
+
53
+ | Action type | Default window | Rationale |
54
+ | ----------- | -------------- | --------- |
55
+ | `commit` | 30000 ms (30s) | Allows squashing multiple commits |
56
+ | `push` | 5000 ms (5s) | Fast for PR creation |
57
+ | `merge` | 10000 ms (10s) | Moderate safety buffer |
58
+ | `workflow` | 5000 ms (5s) | Fast workflow triggers |
59
+ | `deploy` | 60000 ms (60s) | Safety buffer for deployments |
60
+
61
+ These defaults are defined as `DEFAULT_DYNAMIC_WINDOWS` in
62
+ `deploy-batcher.ts`. Windows below 1000 ms or above 300000 ms trigger a
63
+ validation warning.
64
+
65
+ ### Configuring windows
66
+
67
+ Windows can be set per project in `.uap.json` under `deploy.batchWindows`:
68
+
69
+ ```json
70
+ {
71
+ "deploy": {
72
+ "batchWindows": {
73
+ "commit": 60000,
74
+ "push": 3000,
75
+ "merge": 15000,
76
+ "workflow": 5000,
77
+ "deploy": 60000
78
+ }
79
+ }
80
+ }
81
+ ```
82
+
83
+ Any window not set in the file falls back to an environment variable, then to
84
+ the default:
85
+
86
+ | Window | Environment variable |
87
+ | ---------- | ----------------------------- |
88
+ | `commit` | `UAP_DEPLOY_COMMIT_WINDOW` |
89
+ | `push` | `UAP_DEPLOY_PUSH_WINDOW` |
90
+ | `merge` | `UAP_DEPLOY_MERGE_WINDOW` |
91
+ | `workflow` | `UAP_DEPLOY_WORKFLOW_WINDOW` |
92
+ | `deploy` | `UAP_DEPLOY_DEPLOY_WINDOW` |
93
+
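The fallback order described above (project file, then environment variable, then default) can be sketched as a lookup function. The defaults and variable names come from the tables in this guide; the `resolveWindow` and `windowWarning` functions, and the assumption that environment values are plain millisecond numbers, are illustrative and not taken from `deploy-batcher.ts`.

```typescript
type ActionType = "commit" | "push" | "merge" | "workflow" | "deploy";

// Default windows in ms, as listed in the table above.
const DEFAULT_WINDOWS: Record<ActionType, number> = {
  commit: 30000,
  push: 5000,
  merge: 10000,
  workflow: 5000,
  deploy: 60000,
};

// Environment variable consulted for each window.
const ENV_VARS: Record<ActionType, string> = {
  commit: "UAP_DEPLOY_COMMIT_WINDOW",
  push: "UAP_DEPLOY_PUSH_WINDOW",
  merge: "UAP_DEPLOY_MERGE_WINDOW",
  workflow: "UAP_DEPLOY_WORKFLOW_WINDOW",
  deploy: "UAP_DEPLOY_DEPLOY_WINDOW",
};

// Resolve one window: file value first, then environment, then default.
function resolveWindow(
  type: ActionType,
  fileWindows: Partial<Record<ActionType, number>>,
  env: Record<string, string | undefined>,
): number {
  const fromFile = fileWindows[type];
  if (fromFile !== undefined) return fromFile;

  const raw = env[ENV_VARS[type]];
  const fromEnv = raw === undefined ? NaN : Number(raw);
  // Assumption: unset, non-numeric, or non-positive values fall through to the default.
  if (Number.isFinite(fromEnv) && fromEnv > 0) return fromEnv;

  return DEFAULT_WINDOWS[type];
}

// Mirrors the documented validation range; returns a message, or null when in range.
function windowWarning(ms: number): string | null {
  return ms < 1000 || ms > 300000 ? `window ${ms} ms is outside 1000-300000 ms` : null;
}
```

A caller in Node.js would pass `process.env` as the `env` argument and the parsed `deploy.batchWindows` object from `.uap.json` as `fileWindows`.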
94
+ The batcher also exposes named profiles (`fast`, `safe`, `default`) at the API
95
+ level via `DeployBatcher.fromProfile(...)`.
96
+
97
+ ## Urgent mode
98
+
99
+ Urgent mode collapses every window to its minimum so a time-sensitive change
100
+ fast-tracks through the queue:
101
+
102
+ | Action type | Urgent window |
103
+ | ----------- | ------------- |
104
+ | `commit` | 2000 ms |
105
+ | `push` | 1000 ms |
106
+ | `merge` | 2000 ms |
107
+ | `workflow` | 1000 ms |
108
+ | `deploy` | 5000 ms |
109
+
110
+ Toggle it from the CLI:
111
+
112
+ ```bash
113
+ uap deploy urgent --on # enable fast windows
114
+ uap deploy urgent --off # restore default windows
115
+ ```
116
+
117
+ > Note: `uap deploy urgent` applies to the batcher instance it creates, so it
118
+ > is most useful as part of a session that immediately queues and flushes.
119
+
120
+ ## CLI reference: `uap deploy`
121
+
122
+ ```bash
123
+ uap deploy <queue|batch|execute|status|flush|config|set-config|urgent> [options]
124
+ ```
125
+
126
+ ### `queue` — add an action to the batch queue
127
+
128
+ ```bash
129
+ uap deploy queue \
130
+ --agent-id <id> \
131
+ --action-type <commit|push|merge|deploy|workflow> \
132
+ --target <branch|environment|workflow> \
133
+ [options]
134
+ ```
135
+
136
+ `--agent-id`, `--action-type`, and `--target` are required. Type-specific
137
+ options:
138
+
139
+ | Option | Applies to | Meaning |
140
+ | ------------------- | ---------- | ------- |
141
+ | `-m, --message` | `commit` | Commit message |
142
+ | `-f, --files` | `commit` | Comma-separated file list |
143
+ | `-r, --remote` | `push` | Git remote (default `origin`) |
144
+ | `--force` | `push` | Force push (`--force-with-lease`) |
145
+ | `--ref` | `workflow` | Git ref to run the workflow against |
146
+ | `--inputs` | `workflow` | Workflow inputs as JSON |
147
+ | `-p, --priority` | all | Priority 1–10 (default 5) |
148
+
149
+ ```bash
150
+ uap deploy queue --agent-id agent-123 --action-type commit --target main \
151
+ -m "feat: add batcher" -f "src/a.ts,src/b.ts"
152
+ ```
153
+
154
+ ### `batch` — create a batch from ready actions
155
+
156
+ ```bash
157
+ uap deploy batch [-v|--verbose]
158
+ ```
159
+
160
+ Collects pending actions whose window has elapsed and prints the new batch ID
161
+ plus the command to execute it.
162
+
163
+ ### `execute` — run a specific batch
164
+
165
+ ```bash
166
+ uap deploy execute --batch-id <id> [--dry-run]
167
+ ```
168
+
169
+ `--batch-id` is required. Reports executed/failed counts, duration, and any
170
+ per-action errors.
171
+
172
+ ### `status` — inspect the queue
173
+
174
+ ```bash
175
+ uap deploy status [-v|--verbose]
176
+ ```
177
+
178
+ Shows pending (unbatched) actions grouped by type, pending batches, and a
179
+ summary.
180
+
181
+ ### `flush` — batch and execute everything pending
182
+
183
+ ```bash
184
+ uap deploy flush [-v|--verbose] [--dry-run]
185
+ ```
186
+
187
+ Repeatedly creates and executes batches until the queue is empty. This is the
188
+ one-shot "do it all now" command.
189
+
190
+ ### `config` / `set-config` — view and change windows
191
+
192
+ ```bash
193
+ uap deploy config
194
+ uap deploy set-config --message '{"commit":60000,"push":3000}'
195
+ ```
196
+
197
+ `set-config` takes a JSON object of window values (ms); every value must be a
198
+ positive number. Changes apply to the current batcher instance.
199
+
200
+ ### `urgent` — toggle fast windows
201
+
202
+ ```bash
203
+ uap deploy urgent --on
204
+ uap deploy urgent --off
205
+ ```
206
+
207
+ ## Related
208
+
209
+ - `uap coord flush` is a shortcut that force-executes all pending
210
+ deploys (see [Coordination](./COORDINATION.md)).
211
+ - `uap deliver --deploy` queues a commit of applied files into the batcher on a
212
+ successful convergence run (see [Local Models](./LOCAL_MODELS.md)).
@@ -0,0 +1,202 @@
1
+ # Droids and Skills
2
+
3
+ > Applies to UAP **v1.40.0**
4
+
5
+ UAP ships two complementary extension mechanisms:
6
+
7
+ - **Droids** — markdown-defined specialist agents (a reviewer, a language
8
+ expert, an architect). Each droid is a focused persona with its own tools and
9
+ instructions.
10
+ - **Skills** — reusable workflows that any agent can load on demand (a coding
11
+ protocol, a navigation technique, a memory operation).
12
+
13
+ Droids answer *"who should do this?"*; skills answer *"how is this done?"*. A
14
+ droid can pull in skills when a domain-specific workflow applies.
15
+
16
+ ## What a droid is
17
+
18
+ A droid is a single markdown file under
19
+ [`.factory/droids/`](../../.factory/droids/) with YAML frontmatter followed by a
20
+ prompt body. The frontmatter declares the droid's identity, model, tools, and
21
+ optional coordination/skill metadata.
22
+
23
+ A minimal droid (the default scaffold from `uap droids add`):
24
+
25
+ ```markdown
26
+ ---
27
+ name: my-droid
28
+ description: Custom droid for my-droid
29
+ model: inherit
30
+ tools: ["Read", "LS", "Grep", "Glob"]
31
+ ---
32
+
33
+ You are a specialized assistant for my-droid tasks.
34
+
35
+ Describe what this droid should do and how it should respond.
36
+ ```
37
+
38
+ A real droid carries richer frontmatter — for example `security-auditor`:
39
+
40
+ ```markdown
41
+ ---
42
+ name: security-auditor
43
+ description: Proactive security analyst that reviews all code for vulnerabilities, secrets exposure, injection attacks, and security best practices.
44
+ model: inherit
45
+ coordination:
46
+ channels: ["review", "broadcast"]
47
+ claims: ["exclusive"]
48
+ batches_deploy: true
49
+ skills:
50
+ - sec-context-review
51
+ ---
52
+ # Security Auditor
53
+
54
+ ## Mission
55
+ ...
56
+ ```
57
+
58
+ Frontmatter fields used by UAP:
59
+
60
+ - `name` — **required**, unique across droids. Used to reference the droid.
61
+ - `description` — **required**, at least 5 characters. Shown in listings and
62
+ used by the expert router for capability matching.
63
+ - `model` — `inherit` (use the caller's routed model) or a specific model id.
64
+ - `tools` — the tool allowlist the droid may use.
65
+ - `coordination` — optional; declares channels, claim semantics
66
+ (`exclusive` / `shared`), and deploy batching for multi-droid runs.
67
+ - `skills` — optional; skills the droid loads for its domain.
68
+
69
+ Droids are invoked as subagents, e.g.
70
+ `Task(subagent_type: "security-auditor", prompt: "...")`.
71
+
72
+ ## The droid library
73
+
74
+ UAP ships **38 droids** in `.factory/droids/`. They cluster into a few
75
+ categories:
76
+
77
+ - **Language experts** — `python-pro`, `typescript-node-expert`,
78
+ `javascript-pro`, `go-pro`, `rust-pro`
79
+ - **Reviewers** — `code-quality-reviewer`, `code-quality-guardian`,
80
+ `security-code-reviewer`, `performance-reviewer`, `test-coverage-reviewer`,
81
+ `documentation-accuracy-reviewer`, `architect-reviewer`
82
+ - **Architects & planners** — `strategic-architect`, `tactical-architect`,
83
+ `implementation-planner`, `product-strategist`, `ideation-expert`
84
+ - **Security & compliance** — `security-auditor`, `compliance-officer`,
85
+ `dependency-auditor`
86
+ - **Quality & testing** — `qa-expert`, `test-strategist`, `test-plan-writer`,
87
+ `refactoring-specialist`, `debug-expert`
88
+ - **Performance & cost** — `performance-optimizer`, `cost-engineer`,
89
+ `harness-optimizer`, `terminal-bench-optimizer`
90
+ - **Ops & infrastructure** — `sysadmin-expert`, `observability-engineer`,
91
+ `incident-responder`, `release-manager`
92
+ - **Domain specialists** — `api-designer`, `cli-design-expert`,
93
+ `accessibility-tester`, `documentation-expert`, `ml-training-expert`
94
+
95
+ Run `uap droids list` to see the live set across all sources.
96
+
97
+ ## `uap droids` CLI
98
+
99
+ Defined in `src/cli/droids.ts` and registered in `src/bin/cli.ts`.
100
+
101
+ ```bash
102
+ uap droids list # list droids from all known locations + built-in templates
103
+ uap droids add <name> # scaffold a new droid in .factory/droids/
104
+ uap droids add <name> -t <template> # scaffold from a built-in template
105
+ uap droids import <path> # import .md droids from another directory
106
+ uap droids validate # validate frontmatter + capability-router coverage
107
+ uap droids validate -q # quiet mode: exit code only
108
+ ```
109
+
110
+ `uap droids list` scans, in order:
111
+
112
+ - `.factory/droids` (project)
113
+ - `.claude/agents` (Claude Code)
114
+ - `.opencode/agent` (OpenCode)
115
+ - `~/.factory/droids` (personal)
116
+
117
+ Built-in templates available to `uap droids add -t`: `code-reviewer`,
118
+ `security-reviewer`, `performance-reviewer`, `test-writer`.
119
+
120
+ `uap droids validate` parses every droid's frontmatter and reports errors for
121
+ missing/short descriptions, missing names, duplicate names, and invalid YAML.
122
+ It also cross-references the capability router and flags any droid that the
123
+ router expects but that is missing on disk.
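The name and description checks described above can be sketched roughly as follows. This is a sketch under stated assumptions rather than the code in `src/cli/droids.ts`: `DroidFrontmatter` and `validateDroids` are hypothetical names, the frontmatter is assumed to be already parsed, and the YAML-syntax and capability-router checks are left out.

```typescript
// Hypothetical types and function: UAP's real validator may differ.
interface DroidFrontmatter {
  name?: string;
  description?: string;
}

// Returns one message per problem: missing name, duplicate name,
// or a description that is missing or shorter than 5 characters.
function validateDroids(droids: DroidFrontmatter[]): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  droids.forEach((droid, index) => {
    const label = droid.name || `droid #${index + 1}`;
    if (!droid.name) {
      errors.push(`${label}: missing name`);
    } else if (seen.has(droid.name)) {
      errors.push(`${label}: duplicate name`);
    } else {
      seen.add(droid.name);
    }
    // Assumption: surrounding whitespace does not count toward the minimum.
    const description = (droid.description ?? "").trim();
    if (description.length < 5) {
      errors.push(`${label}: description missing or shorter than 5 characters`);
    }
  });
  return errors;
}
```

An empty result corresponds to a clean run. A quiet mode like `-q` would only need to map a non-empty list to a non-zero exit code.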
124
+
125
+ ## The expert router
126
+
127
+ `uap expert-route` recommends an ordered **chain** of droids for a task,
128
+ grouped into phases (ideate → plan → design → implement → review → release). It
129
+ is backed by the `ExpertOrchestrator` (`src/coordination/expert-orchestrator.ts`).
130
+
131
+ ```bash
132
+ uap expert-route "add OAuth2 login with JWT sessions"
133
+ uap expert-route "refactor the payment module" --files src/payments/*.ts
134
+ uap expert-route "harden the upload endpoint" --json
135
+ ```
136
+
137
+ Output shows the matched capabilities, a confidence score, and for each step:
138
+ the phase, the droid, whether it runs in parallel, a rationale, and a historical
139
+ success rate (when available). `--files` scopes routing by the affected paths;
140
+ `--json` emits machine-readable output (also used automatically when stdout is
141
+ not a TTY).
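A consumer of the `--json` output might model each step and sort steps by the phase sequence listed above. The field names below (`phase`, `droid`, `parallel`, `rationale`, `successRate`) are assumptions chosen to mirror the prose, not UAP's actual JSON schema, and `orderByPhase` is a hypothetical helper.

```typescript
// Hypothetical shape: field names are assumptions, not UAP's real schema.
const PHASES = ["ideate", "plan", "design", "implement", "review", "release"] as const;
type Phase = (typeof PHASES)[number];

interface RouteStep {
  phase: Phase;
  droid: string;
  parallel: boolean;
  rationale: string;
  successRate?: number; // only present when history is available
}

// Orders steps by the documented phase sequence, keeping the input order
// for steps that share a phase.
function orderByPhase(steps: RouteStep[]): RouteStep[] {
  return steps
    .map((step, index) => ({ step, index }))
    .sort(
      (a, b) =>
        PHASES.indexOf(a.step.phase) - PHASES.indexOf(b.step.phase) ||
        a.index - b.index,
    )
    .map(({ step }) => step);
}
```

Carrying the original index through the sort keeps the ordering within a phase explicit instead of relying on the engine's sort stability.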
142
+
143
+ ## Skills
144
+
145
+ A skill is a reusable workflow. Skills live in directories under
146
+ [`.factory/skills/`](../../.factory/skills/), each containing a `SKILL.md` file
147
+ with frontmatter (`name`, `version`, `compatibility`) and the workflow body.
148
+
149
+ UAP ships **32 skills** in `.factory/skills/`, including:
150
+
151
+ - **Coordination & workflow** — `uap-coordination`, `uap-tasks`,
152
+ `uap-worktree`, `worktree-workflow`, `parallel-expert-review`, `batch-review`
153
+ - **Memory & context** — `uap-memory`, `memory-management`,
154
+ `scripts-preload-memory`, `session-context-preservation-droid`
155
+ - **Navigation & analysis** — `codebase-navigator`, `git-forensics`,
156
+ `uap-patterns`, `compression`
157
+ - **Engineering** — `typescript-node-expert`, `polyglot`, `cli-design-expert`,
158
+ `llama-cpp-worker`, `infra-worker`, `service-config`
159
+ - **Iteration & benchmarking** — `near-miss`, `near-miss-iteration`,
160
+ `adversarial`, `terminal-bench`, `terminal-bench-strategies`
161
+ - **Hooks** — `hooks-session-start`, `hooks-pre-compact`, `scripts-tool-router`
162
+
163
+ ### `uap skill` CLI
164
+
165
+ Defined in `src/cli/skill.ts`.
166
+
167
+ ```bash
168
+ uap skill list # list available skills (with source tag)
169
+ uap skill list -c <category> # filter by path/category substring
170
+ uap skill list --json # machine-readable listing
171
+ uap skill load <name> # print a skill's full content for the session
172
+ uap skill load <name> -c <cat> # scope discovery by category
173
+ ```
174
+
175
+ Skills are discovered from three roots, in order: `skills/` (project),
176
+ `.factory/skills/`, and `.claude/skills/`. A directory with a `SKILL.md` is
177
+ treated as a skill, as is any top-level `.md` file in those roots. `load`
178
+ matches names case-insensitively.
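The discovery and matching rules above can be sketched over an in-memory listing. This is a simplified illustration, not the code in `src/cli/skill.ts`: `SkillRoot`, `skillNameOf`, and `resolveSkill` are hypothetical names, only directories directly under a root are considered, and letting the first root win on a name clash is an assumption.

```typescript
// Hypothetical in-memory model of the discovery roots; no filesystem access.
interface SkillRoot {
  root: string; // e.g. "skills", ".factory/skills", ".claude/skills"
  entries: string[]; // paths relative to the root, e.g. "uap-memory/SKILL.md"
}

// Derives a skill name from an entry, or undefined if the entry is not a skill.
function skillNameOf(entry: string): string | undefined {
  const parts = entry.split("/");
  // A directory containing SKILL.md: the directory name is the skill name.
  if (parts.length === 2 && parts[1] === "SKILL.md") return parts[0];
  // A top-level .md file: the file name without the extension.
  if (parts.length === 1 && entry.endsWith(".md")) return entry.slice(0, -3);
  return undefined;
}

// Walks the roots in order and returns the path of the first skill whose
// name matches case-insensitively.
function resolveSkill(roots: SkillRoot[], wanted: string): string | undefined {
  const target = wanted.toLowerCase();
  for (const { root, entries } of roots) {
    for (const entry of entries) {
      const name = skillNameOf(entry);
      if (name !== undefined && name.toLowerCase() === target) {
        return `${root}/${entry}`;
      }
    }
  }
  return undefined;
}
```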
179
+
180
+ ## Adding a custom droid
181
+
182
+ ```bash
183
+ # 1. Scaffold (optionally from a template)
184
+ uap droids add my-reviewer -t code-reviewer
185
+
186
+ # 2. Edit .factory/droids/my-reviewer.md
187
+ # - set a clear, >= 5-char description (used by the expert router)
188
+ # - adjust the tools allowlist
189
+ # - write the prompt body / mission
190
+
191
+ # 3. Validate before relying on it
192
+ uap droids validate
193
+ ```
194
+
195
+ To bring droids in from another project or platform, drop the `.md` files in a
196
+ folder and run `uap droids import <path>` (existing files are skipped, not
197
+ overwritten).
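The skip-rather-than-overwrite behaviour can be pictured as a planning step that runs before any file is copied. `planImport` is a hypothetical helper for illustration, not UAP's implementation, and it assumes files are compared by file name only.

```typescript
// Hypothetical helper: splits candidate files into those to copy and those
// to skip because a droid with the same file name already exists.
function planImport(
  sourceFiles: string[],
  existing: string[],
): { copy: string[]; skip: string[] } {
  const present = new Set(existing);
  const copy: string[] = [];
  const skip: string[] = [];
  for (const file of sourceFiles) {
    if (!file.endsWith(".md")) continue; // only .md droids are imported
    (present.has(file) ? skip : copy).push(file);
  }
  return { copy, skip };
}
```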
198
+
199
+ ## See also
200
+
201
+ - [Multi-Model Routing](./MULTI_MODEL.md) — the models that droids and skills
202
+ run on, and how tasks are routed to them.