@miller-tech/uap 1.30.0 → 1.31.0
- package/README.md +119 -2
- package/dist/.tsbuildinfo +1 -1
- package/dist/bin/cli.js +6 -0
- package/dist/bin/cli.js.map +1 -1
- package/dist/cli/deliver.d.ts +12 -0
- package/dist/cli/deliver.d.ts.map +1 -1
- package/dist/cli/deliver.js +144 -9
- package/dist/cli/deliver.js.map +1 -1
- package/dist/coordination/deploy-batcher.d.ts.map +1 -1
- package/dist/coordination/deploy-batcher.js +9 -3
- package/dist/coordination/deploy-batcher.js.map +1 -1
- package/dist/delivery/applier.d.ts.map +1 -1
- package/dist/delivery/applier.js +4 -0
- package/dist/delivery/applier.js.map +1 -1
- package/dist/delivery/convergence-loop.d.ts +7 -0
- package/dist/delivery/convergence-loop.d.ts.map +1 -1
- package/dist/delivery/convergence-loop.js +42 -0
- package/dist/delivery/convergence-loop.js.map +1 -1
- package/dist/delivery/halo-trace.d.ts +29 -0
- package/dist/delivery/halo-trace.d.ts.map +1 -0
- package/dist/delivery/halo-trace.js +88 -0
- package/dist/delivery/halo-trace.js.map +1 -0
- package/dist/delivery/ideation.d.ts +36 -0
- package/dist/delivery/ideation.d.ts.map +1 -0
- package/dist/delivery/ideation.js +109 -0
- package/dist/delivery/ideation.js.map +1 -0
- package/dist/delivery/index.d.ts +4 -1
- package/dist/delivery/index.d.ts.map +1 -1
- package/dist/delivery/index.js +4 -1
- package/dist/delivery/index.js.map +1 -1
- package/dist/delivery/run-coordinator.d.ts +48 -0
- package/dist/delivery/run-coordinator.d.ts.map +1 -0
- package/dist/delivery/run-coordinator.js +132 -0
- package/dist/delivery/run-coordinator.js.map +1 -0
- package/package.json +1 -1
package/README.md
CHANGED
@@ -15,6 +15,18 @@
 
 ## Recent Updates
 
+**New:** Delivery Harness (`uap deliver`) — a convergence loop that drives an
+underlying model through execute → apply → verify → feedback against the
+project's real completion gates until delivery is achieved. Best-of-N
+exploration, a structured critic, semantically-recalled best-practice cards,
+and a stagnation-driven escalation ladder turn weaker/local models into
+reliable closers. See [Delivery Harness](#delivery-harness).
+
+```bash
+uap deliver "add a parseDuration(str) helper returning seconds" \
+  --candidates 3 --critic --practices --escalate
+```
+
 **New:** Expert-stack extensions — forward-design droids (strategic/tactical
 architect, implementation-planner), activated `experts.<name>` MCP tools, HALO
 trace-based harness optimization, open-collider divergent ideation, and a real
@@ -55,6 +67,7 @@ uap setup -p all
 - [Browser Automation](#browser-automation)
 - [MCP Router](#mcp-router)
 - [Multi-Model Architecture](#multi-model-architecture)
+- [Delivery Harness](#delivery-harness)
 - [Pattern System](#pattern-system)
 - [Droids and Skills](#droids--skills)
 - [Task Management](#task-management)
@@ -78,6 +91,7 @@ uap setup -p all
 | Browser | 1 module | Stealth web automation via CloakBrowser (Playwright drop-in) |
 | MCP Router | 11 modules | 2-tool meta-router + expert-consultation registry (98% token savings) |
 | Models | 10 modules | Multi-model routing, planning, execution, validation, 13 model profiles |
+| Delivery Harness | 11 modules | `uap deliver`: convergence loop, best-of-N explorer, critic, practice recall, escalation, ideation seeds, HALO tracing, coordination + deploy queueing |
 | Patterns | 23 patterns | Battle-tested workflows from Terminal-Bench 2.0 |
 | Droids | 30 experts | Full SDLC expert stack: strategy, design, build, review, release, ops ([reference](docs/reference/EXPERT_DROIDS.md)) |
 | Expert Orchestrator | 1 module | Adaptive droid-chain selection across plan→design→implement→review→release |
@@ -329,6 +343,108 @@ Each profile supports: `dynamic_temperature` (decay per retry), `tool_call_batch
 
 ---
 
+## Delivery Harness
+
+`uap deliver` forces an underlying model — including weaker or local models —
+to reach a **verified** outcome. Instead of trusting a single generation, it
+loops: the model emits whole files, the harness writes them, runs the
+project's real completion gates, and feeds the failures back until every gate
+passes or the turn budget is exhausted. "Done" is defined by the gates, not by
+the model's say-so.
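
Reviewer sketch: the loop described in the added paragraph above could look roughly like the following. This is a minimal illustration, not the package's implementation. Every name here (`Seams`, `deliver`, `GateResult`) is hypothetical, the model call is synchronous only to keep the example short, and the default of 5 turns is taken from the `--max-turns` row of the README's flag table.

```typescript
// Illustrative only: these type and function names are hypothetical,
// not the exports of @miller-tech/uap.
type FileMap = Record<string, string>; // path -> whole-file contents
interface GateResult { gate: string; ok: boolean; output: string }

interface Seams {
  execute(prompt: string): FileMap; // model call (synchronous here for brevity)
  apply(files: FileMap): void;      // write the emitted files into the tree
  verify(): GateResult[];           // run the build/typecheck/test/lint gates
}

interface Outcome { delivered: boolean; turns: number; failures: GateResult[] }

function deliver(instruction: string, seams: Seams, maxTurns = 5): Outcome {
  let prompt = instruction;
  let failures: GateResult[] = [];
  for (let turn = 1; turn <= maxTurns; turn++) {
    seams.apply(seams.execute(prompt));
    failures = seams.verify().filter((g) => !g.ok);
    // "Done" is decided by the gates alone, never by the model's own claim.
    if (failures.length === 0) return { delivered: true, turns: turn, failures };
    // Feed the failing gate output back into the next prompt.
    const feedback = failures.map((g) => `[${g.gate}] ${g.output}`).join("\n");
    prompt = `${instruction}\n\nThe previous attempt failed these gates:\n${feedback}`;
  }
  return { delivered: false, turns: maxTurns, failures };
}
```

Passing the three steps in as an object keeps the loop testable with fakes, which is one plausible reading of the "pluggable seams" wording in the components table.
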
+
+### Pipeline
+
+```
+                                   ┌─────────────────────────── loop until gates pass ───────────────────────────┐
+                                   │                                                                             │
+instruction → build prompt               → execute → apply files → verify (gates)            → feedback ─────────┘
+              (+ practices) (+ critique)   model     to tree       build/typecheck/test/lint
+                                              │                                │
+                                    best-of-N candidates      pass → done ✓   fail → critic + escalate
+```
+
+1. **Convergence loop** — execute → apply → verify → feedback against real gates. A baseline check short-circuits when the tree is already green (no model call, no false success).
+2. **Best-of-N explorer** (`--candidates N`) — generates N candidates per turn under distinct strategy seeds, evaluates each on the same tree via apply→verify→rollback, and commits the winner; a model judge breaks ties.
+3. **Structured critic** (`--critic`) — turns a failed turn's gate output into a numbered, file-scoped repair plan via a gate-specific analyst persona.
+4. **Best-practice recall** (`--practices`) — injects provenance-safe practice cards learned from past successful deliveries, retrieved by semantic similarity (nomic-768 embeddings, keyword fallback).
+5. **Escalation ladder** (`--escalate`) — on stagnation, climbs cheap→expensive: widen exploration → enable the critic → switch to a stronger model.
+6. **Divergent ideation** (`--ideate`, `--ideate-project <name>`) — replaces the static strategy seeds with task-specific, deliberately diverse seeds: generated by a bisociation-style model call, or taken from an open-collider project's curated ideas (`uap ideate`). Implies best-of-N exploration.
+7. **HALO tracing** (`--halo`) — emits one AGENT span per run and one CHAIN span per turn (scores, strategies, failed gates) so `uap harness analyze` can mine systemic failure modes across runs.
+8. **Coordination** (`--coordinate`) — registers the run with the multi-agent coordination layer (`uap agent`): announces work on the project, warns about overlapping agents, heartbeats every turn, completes/deregisters on exit.
+9. **Deploy batching** (`--deploy`) — on success, queues a commit of the applied files into the deploy batcher; execute with `uap deploy flush`.
+10. **`--optimize`** — one switch for every convergence aid: 4 candidates/turn + critic + practices + escalation + ideation + HALO + coordination (deploy stays explicit).
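
Reviewer sketch: item 5 in the list above describes a ladder that climbs from cheap to expensive interventions. One way to picture it is shown below. The README does not define "stagnation" or the widened candidate count, so both are assumptions here: stagnation is taken to mean "no score improvement over the last `window` turns", and widening is taken to mean 4 candidates (the number `--optimize` uses). `RunState`, `isStagnant`, and `escalate` are hypothetical names.

```typescript
// Hypothetical run-state; not the package's actual types.
interface RunState { candidates: number; critic: boolean; model: string }

// Assumption: higher score is better (for example, number of passing gates),
// and a run is stagnant when the last `window` turns did not beat earlier turns.
function isStagnant(scores: number[], window = 2): boolean {
  if (scores.length <= window) return false;
  const recent = scores.slice(-window);
  const bestBefore = Math.max(...scores.slice(0, -window));
  return recent.every((s) => s <= bestBefore);
}

// Returns the next rung of the ladder, or null when nothing is left to try.
function escalate(state: RunState, strongerModel?: string): RunState | null {
  if (state.candidates < 4) return { ...state, candidates: 4 }; // widen exploration
  if (!state.critic) return { ...state, critic: true };         // enable the critic
  if (strongerModel !== undefined && state.model !== strongerModel) {
    return { ...state, model: strongerModel };                  // switch model last
  }
  return null;
}
```

Ordering the rungs this way means the expensive model is only used after the cheaper options have been tried, which matches the "cheap→expensive" wording.
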
+
+### Components (11 modules)
+
+| Component         | File                                  | Purpose                                                           |
+| ----------------- | ------------------------------------- | ----------------------------------------------------------------- |
+| Convergence Loop  | `src/delivery/convergence-loop.ts`    | Turn loop with pluggable seams + mutable run-state for escalation |
+| Verifier Ladder   | `src/delivery/verifier-ladder.ts`     | Build/typecheck/test/lint gates with fail-fast and diagnostics    |
+| Applier           | `src/delivery/applier.ts`             | Writes ` ```file:path ` blocks; path-safe, rollback-capable       |
+| Explorer          | `src/delivery/explorer.ts`            | Best-of-N candidates with strategy seeds + rollback evaluation    |
+| Judge             | `src/delivery/judge.ts`               | Model tie-break among equally-scored candidates                   |
+| Critic            | `src/delivery/critic.ts`              | Gate-persona repair plans from failed turns                       |
+| Practice Store    | `src/delivery/practice.ts`            | Provenance-safe best-practice cards with semantic recall          |
+| Escalation        | `src/delivery/escalation.ts`          | Stagnation-driven ladder returning loop directives                |
+| Ideation Seeder   | `src/delivery/ideation.ts`            | Divergent strategy seeds (generated or from curated ideas)        |
+| HALO Tracer       | `src/delivery/halo-trace.ts`          | Run/turn spans for `uap harness analyze`                          |
+| Run Coordinator   | `src/delivery/run-coordinator.ts`     | `uap agent` registration/heartbeat + `uap deploy` commit queueing |
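
Reviewer sketch: the Explorer and Judge rows in the table above describe evaluating each candidate on the same tree, rolling back, and breaking ties with a judge. The sketch below illustrates that idea against an in-memory tree. It is an assumption-laden illustration, not the code in `explorer.ts` or `judge.ts`: `Tree`, `Candidate`, and `pickBest` are hypothetical names, the score function stands in for the gate run, and the `judge` callback stands in for the model judge.

```typescript
// Hypothetical shapes for illustration only.
type Tree = Record<string, string>; // path -> file contents
interface Candidate { strategy: string; files: Tree }

function pickBest(
  tree: Tree,
  candidates: Candidate[],
  score: (tree: Tree) => number,           // stand-in for running the gates
  judge: (tied: Candidate[]) => Candidate, // stand-in for the model judge
): Candidate {
  if (candidates.length === 0) throw new Error("no candidates to evaluate");
  const scored = candidates.map((c) => {
    const snapshot: Tree = { ...tree };                // remember the tree
    Object.assign(tree, c.files);                      // apply
    const s = score(tree);                             // verify
    for (const k of Object.keys(tree)) delete tree[k]; // rollback, including
    Object.assign(tree, snapshot);                     // files the candidate added
    return { c, s };
  });
  const best = Math.max(...scored.map((x) => x.s));
  const tied = scored.filter((x) => x.s === best).map((x) => x.c);
  const winner = tied.length === 1 ? tied[0] : judge(tied);
  Object.assign(tree, winner.files);                   // commit only the winner
  return winner;
}
```

Because every candidate starts from the same snapshot, their scores are comparable, and files written by a losing candidate never leak into the committed tree.
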
+
+The model is reached through an OpenAI-compatible client
+(`src/models/openai-compat-client.ts`) — the local inference gateway,
+llama.cpp, vLLM, Ollama, or any `/v1/chat/completions` endpoint.
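
Reviewer sketch: for readers unfamiliar with what "OpenAI-compatible" implies, the request and response shapes of a `/v1/chat/completions` call are roughly as follows. This is not the contents of `openai-compat-client.ts`. `buildChatRequest` and `extractText` are hypothetical helpers, and the sketch assumes the endpoint value already ends in `/v1`, as the `--endpoint` flag description suggests.

```typescript
// Hypothetical helpers; only the JSON shapes follow the chat-completions convention.
interface ChatRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

function buildChatRequest(endpoint: string, model: string, prompt: string): ChatRequest {
  const base = endpoint.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages: [{ role: "user", content: prompt }] }),
    },
  };
}

// Pull the assistant text out of a chat-completions response body.
function extractText(
  response: { choices?: { message?: { content?: string | null } }[] },
): string {
  const text = response.choices?.[0]?.message?.content;
  if (typeof text !== "string") throw new Error("no completion text in response");
  return text;
}
```

On a runtime with a global `fetch`, the result could be sent with `fetch(req.url, req.init)` and the parsed JSON passed to `extractText`. Because every listed backend accepts this shape, switching between them should only require a different endpoint URL and model name.
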
+
+### Usage
+
+```bash
+# Single-shot loop against the current project's gates
+uap deliver "implement src/slugify.js exporting slugify(str)"
+
+# Full quality stack: 3 candidates/turn, critic, learned practices, escalation
+uap deliver "add retry-with-backoff to the HTTP client" \
+  --candidates 3 --critic --practices --escalate --escalate-model opus-4.6
+
+# Preview detected gates and plan without calling the model
+uap deliver "..." --dry-run
+
+# Scope to a subset of gates, cap turns, target another project
+uap deliver "..." --gates build,test --max-turns 8 --project-root ../service
+
+# Everything on: exploration, critic, practices, escalation, ideation, HALO, coordination
+uap deliver "refactor the cache layer to LRU with TTL" --optimize
+
+# Divergent ideation seeds + queue a commit into the deploy batcher on success
+uap deliver "..." --ideate --candidates 4 --deploy
+```
+
+### Key flags
+
+| Flag                       | Effect                                                                 |
+| -------------------------- | ---------------------------------------------------------------------- |
+| `-m, --model <preset>`     | Model preset (default `$UAP_DELIVER_MODEL` or `qwen35-a3b`)            |
+| `--max-turns <n>`          | Maximum execute→verify iterations (default 5)                          |
+| `--gates <ids>`            | Gate subset: `build,typecheck,test,lint`                               |
+| `--candidates <n>`         | Best-of-N exploration (2–8) per turn                                   |
+| `--critic`                 | Structured repair plans on failed turns                                |
+| `--practices`              | Inject and record best-practice cards                                  |
+| `--no-semantic`            | Use keyword (not embedding) practice recall                            |
+| `--escalate`               | Escalation ladder on stagnation                                        |
+| `--escalate-model <preset>`| Stronger model for the final escalation tier                           |
+| `--ideate`                 | Divergent ideation: task-specific strategy seeds (implies exploration) |
+| `--ideate-project <name>`  | Seed exploration from `projects/<name>` curated ideas (`uap ideate`)   |
+| `--halo`                   | Emit HALO spans; analyze with `uap harness analyze`                    |
+| `--coordinate`             | Register with `uap agent`: announce, heartbeat, overlap detection      |
+| `--deploy`                 | On success, queue a commit into the deploy batcher (`uap deploy`)      |
+| `--optimize`               | Enable every convergence aid (deploy excluded)                         |
+| `--endpoint <url>`         | Override the model endpoint (OpenAI-compatible `/v1`)                  |
+| `--dry-run` / `--json`     | Show the plan only / emit machine-readable result                      |
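
Reviewer sketch: the flag table above implies some interaction rules (`--optimize` turns on every aid except deploy and uses 4 candidates, `--ideate` implies exploration, `--candidates` accepts 2 to 8, `--max-turns` defaults to 5). A rough model of how those could resolve into one plan follows. The names are hypothetical, and three behaviors are assumptions the README does not state: an explicit `--candidates` overrides the value implied by `--optimize`, bare `--ideate` implies 2 candidates, and out-of-range values are rejected rather than clamped.

```typescript
// Hypothetical option shapes; the real CLI parser may differ.
interface DeliverFlags {
  optimize?: boolean; candidates?: number; critic?: boolean; practices?: boolean;
  escalate?: boolean; ideate?: boolean; halo?: boolean; coordinate?: boolean;
  deploy?: boolean; maxTurns?: number;
}

interface ResolvedPlan {
  candidates: number; // 1 means a single generation per turn (no exploration)
  critic: boolean; practices: boolean; escalate: boolean; ideate: boolean;
  halo: boolean; coordinate: boolean; deploy: boolean; maxTurns: number;
}

function resolveFlags(f: DeliverFlags): ResolvedPlan {
  const all = f.optimize === true;
  const ideate = all || f.ideate === true;
  // Assumptions: explicit --candidates wins; --optimize implies 4; bare --ideate implies 2.
  const candidates = f.candidates ?? (all ? 4 : ideate ? 2 : 1);
  if (f.candidates !== undefined &&
      (!Number.isInteger(candidates) || candidates < 2 || candidates > 8)) {
    throw new RangeError("--candidates must be an integer from 2 to 8");
  }
  return {
    candidates,
    critic: all || f.critic === true,
    practices: all || f.practices === true,
    escalate: all || f.escalate === true,
    ideate,
    halo: all || f.halo === true,
    coordinate: all || f.coordinate === true,
    deploy: f.deploy === true, // never implied by --optimize
    maxTurns: f.maxTurns ?? 5,
  };
}
```
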
+
+Model output is never executed — only written as files and checked by the
+gates. The applier refuses writes to executed config (`package.json`,
+lockfiles), `.git`/hooks/CI paths, and symlinks that escape the project root.
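
Reviewer sketch: the refusal rules in the paragraph above can be approximated with a purely lexical check like the one below. This is an illustration under stated assumptions, not the logic in `applier.ts`. `isWritable` is a hypothetical name, and the concrete deny lists (lockfile names, `.github/workflows/`, `.husky/`) are guesses at what "lockfiles" and "hooks/CI paths" cover. Symlink escapes cannot be detected lexically; a real check would have to resolve the target on disk (for example with Node's `fs.realpathSync`) and compare it to the project root.

```typescript
// Assumed deny lists; the package's real lists are not shown in this diff.
const DENIED_NAMES = ["package.json", "package-lock.json", "yarn.lock", "pnpm-lock.yaml"];
const DENIED_PREFIXES = [".git/", ".github/workflows/", ".husky/"];

// Lexical check on a project-relative path. Does NOT resolve symlinks.
function isWritable(relPath: string): boolean {
  const unified = relPath.replace(/\\/g, "/");
  // Reject absolute paths (POSIX root or Windows drive letter).
  if (unified.charAt(0) === "/" || /^[A-Za-z]:/.test(unified)) return false;
  const parts: string[] = [];
  for (const seg of unified.split("/")) {
    if (seg === "" || seg === ".") continue;
    if (seg === "..") {
      if (parts.length === 0) return false; // would climb out of the project root
      parts.pop();
      continue;
    }
    parts.push(seg);
  }
  if (parts.length === 0) return false;
  const normalized = parts.join("/");
  const base = parts[parts.length - 1];
  if (DENIED_NAMES.indexOf(base) !== -1) return false;
  return !DENIED_PREFIXES.some((p) => (normalized + "/").indexOf(p) === 0);
}
```

Normalizing before matching matters: a path such as `src/../package.json` only reveals its real target after the `..` segment is collapsed.
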
+
+---
+
 ## Pattern System (23 Patterns)
 
 Battle-tested patterns from Terminal-Bench 2.0, stored in `.factory/patterns/`.
@@ -478,7 +594,7 @@ pre-tool-use mechanism (claude, vscode, cursor, factory, opencode, omp, hermes).
 
 ## CLI Reference
 
-###
+### 29 Top-Level Commands
 
 | Command                   | Description                                  |
 | ------------------------- | -------------------------------------------- |
@@ -498,6 +614,7 @@ pre-tool-use mechanism (claude, vscode, cursor, factory, opencode, omp, hermes).
 | `uap task <action>`       | Task management (15 subcommands)             |
 | `uap droids <action>`     | Droid management (3 subcommands)             |
 | `uap expert-route <task>` | Recommend an expert droid chain for a task   |
+| `uap deliver <task>`      | Convergence loop: iterate a model against real gates until delivery |
 | `uap harness <action>`    | HALO trace analysis (analyze, status)        |
 | `uap ideate <action>`     | Open-collider ideation (setup, run, ideas)   |
 | `uap model <action>`      | Multi-model management (8 subcommands)       |
@@ -511,7 +628,7 @@ pre-tool-use mechanism (claude, vscode, cursor, factory, opencode, omp, hermes).
 | `uap sync`                | Sync configuration between platforms         |
 | `uap uap-omp <action>`    | Oh-My-Pi integration (7 subcommands)         |
 
-**Total:
+**Total: 118 commands and subcommands.**
 
 ### Additional Binaries
 