@miller-tech/uap 1.22.0 → 1.26.1
- package/README.md +65 -21
- package/dist/.tsbuildinfo +1 -1
- package/dist/benchmarks/token-throughput.d.ts +53 -53
- package/dist/bin/cli.js +88 -5
- package/dist/bin/cli.js.map +1 -1
- package/dist/bin/llama-server-optimize.js +0 -0
- package/dist/bin/policy.js +0 -0
- package/dist/cli/agent.js +1 -1
- package/dist/cli/agent.js.map +1 -1
- package/dist/cli/droids.d.ts +21 -1
- package/dist/cli/droids.d.ts.map +1 -1
- package/dist/cli/droids.js +142 -0
- package/dist/cli/droids.js.map +1 -1
- package/dist/cli/expert-route.d.ts +11 -0
- package/dist/cli/expert-route.d.ts.map +1 -0
- package/dist/cli/expert-route.js +67 -0
- package/dist/cli/expert-route.js.map +1 -0
- package/dist/cli/harness.d.ts +24 -0
- package/dist/cli/harness.d.ts.map +1 -0
- package/dist/cli/harness.js +84 -0
- package/dist/cli/harness.js.map +1 -0
- package/dist/cli/hooks.d.ts +13 -2
- package/dist/cli/hooks.d.ts.map +1 -1
- package/dist/cli/hooks.js +333 -3
- package/dist/cli/hooks.js.map +1 -1
- package/dist/cli/ideate.d.ts +18 -0
- package/dist/cli/ideate.d.ts.map +1 -0
- package/dist/cli/ideate.js +148 -0
- package/dist/cli/ideate.js.map +1 -0
- package/dist/cli/patterns.js +55 -0
- package/dist/cli/patterns.js.map +1 -1
- package/dist/cli/setup.d.ts.map +1 -1
- package/dist/cli/setup.js +14 -1
- package/dist/cli/setup.js.map +1 -1
- package/dist/coordination/capability-router.d.ts +1 -1
- package/dist/coordination/capability-router.d.ts.map +1 -1
- package/dist/coordination/capability-router.js +132 -0
- package/dist/coordination/capability-router.js.map +1 -1
- package/dist/coordination/expert-orchestrator.d.ts +66 -0
- package/dist/coordination/expert-orchestrator.d.ts.map +1 -0
- package/dist/coordination/expert-orchestrator.js +150 -0
- package/dist/coordination/expert-orchestrator.js.map +1 -0
- package/dist/coordination/service.d.ts +8 -1
- package/dist/coordination/service.d.ts.map +1 -1
- package/dist/coordination/service.js +18 -4
- package/dist/coordination/service.js.map +1 -1
- package/dist/mcp-router/experts/registry.d.ts +54 -0
- package/dist/mcp-router/experts/registry.d.ts.map +1 -0
- package/dist/mcp-router/experts/registry.js +143 -0
- package/dist/mcp-router/experts/registry.js.map +1 -0
- package/dist/mcp-router/index.d.ts +2 -0
- package/dist/mcp-router/index.d.ts.map +1 -1
- package/dist/mcp-router/index.js +1 -0
- package/dist/mcp-router/index.js.map +1 -1
- package/dist/mcp-router/server.d.ts.map +1 -1
- package/dist/mcp-router/server.js +16 -0
- package/dist/mcp-router/server.js.map +1 -1
- package/dist/mcp-router/tools/execute.d.ts.map +1 -1
- package/dist/mcp-router/tools/execute.js +40 -0
- package/dist/mcp-router/tools/execute.js.map +1 -1
- package/dist/models/planner.d.ts +7 -1
- package/dist/models/planner.d.ts.map +1 -1
- package/dist/models/planner.js +61 -0
- package/dist/models/planner.js.map +1 -1
- package/dist/models/types.d.ts +14 -12
- package/dist/models/types.d.ts.map +1 -1
- package/dist/models/types.js.map +1 -1
- package/dist/observability/halo-exporter.d.ts +86 -0
- package/dist/observability/halo-exporter.d.ts.map +1 -0
- package/dist/observability/halo-exporter.js +139 -0
- package/dist/observability/halo-exporter.js.map +1 -0
- package/dist/telemetry/session-telemetry.d.ts.map +1 -1
- package/dist/telemetry/session-telemetry.js +7 -0
- package/dist/telemetry/session-telemetry.js.map +1 -1
- package/dist/types/config.d.ts +170 -170
- package/docs/architecture/EXPERT_STACK.md +137 -0
- package/docs/architecture/PLATFORM_GATING.md +68 -0
- package/docs/reference/EXPERT_DROIDS.md +219 -0
- package/package.json +1 -1
- package/templates/hooks/pre-tool-use-edit-write.sh +29 -8
- package/templates/hooks/uap-policy-gate-hermes.sh +42 -0
- package/tools/agents/scripts/anthropic_proxy.py +166 -30
- package/tools/agents/tests/test_attractor_detection.py +213 -0
- package/dist/utils/baseline-metrics.d.ts +0 -21
- package/dist/utils/baseline-metrics.d.ts.map +0 -1
- package/dist/utils/baseline-metrics.js +0 -111
- package/dist/utils/baseline-metrics.js.map +0 -1
- package/tools/agents/__pycache__/claude_local_agent.cpython-313.pyc +0 -0
- package/tools/agents/__pycache__/opencode_uap_agent.cpython-313.pyc +0 -0
- package/tools/agents/scripts/__pycache__/anthropic_proxy.cpython-313.pyc +0 -0
- package/tools/agents/tests/__pycache__/test_anthropic_proxy_streaming.cpython-313-pytest-9.0.2.pyc +0 -0
|
@@ -0,0 +1,137 @@
|
|
|
1
|
+
# Expert Stack: Forward-Design, HALO & Open-Collider
|
|
2
|
+
|
|
3
|
+
This document covers the expert-system extensions added on top of the v1.23.0
|
|
4
|
+
droid stack: forward-design experts, the activated experts-as-MCP-tools surface,
|
|
5
|
+
HALO trace-based harness optimization, open-collider divergent ideation, and the
|
|
6
|
+
expert-review hard gate.
|
|
7
|
+
|
|
8
|
+
> Scope note: the base 33-droid roster, `ExpertOrchestrator`, `expert-route`
|
|
9
|
+
> CLI, and `parallel-expert-review` skill already shipped in v1.23.0. This layer
|
|
10
|
+
> closes real gaps in that stack and integrates two external tools.
|
|
11
|
+
|
|
12
|
+
---
|
|
13
|
+
|
|
14
|
+
## 1. Forward-design droids
|
|
15
|
+
|
|
16
|
+
The pre-existing roster was review-heavy — the orchestrator's `plan`/`design`
|
|
17
|
+
phases produced no up-front design. Three forward-design experts fill that gap:
|
|
18
|
+
|
|
19
|
+
| Droid | Phase | Role |
|
|
20
|
+
|---|---|---|
|
|
21
|
+
| `strategic-architect` | plan | North-star architecture, technology selection (OSS-first), multi-quarter evolution, one-way-door decisions. Forward-design counterpart to `architect-reviewer`. |
|
|
22
|
+
| `tactical-architect` | design | Concrete component/module boundaries, interfaces, data shapes, pattern selection, refactor strategy. |
|
|
23
|
+
| `implementation-planner` | design | Executable work breakdown: ordered steps, file-level plan (reuse-first), test plan, risk/rollback. Feeds the `validate-plan-before-build` gate. |
|
|
24
|
+
|
|
25
|
+
Wiring: `src/coordination/expert-orchestrator.ts` — `PHASE_ROSTER.plan` gains
|
|
26
|
+
`strategic-architect`; `PHASE_ROSTER.design` gains `tactical-architect` and
|
|
27
|
+
`implementation-planner`; `isRelevantForCapability` maps them to the
|
|
28
|
+
`architecture`/`api-design` capabilities so they appear only on relevant tasks.
|
|
29
|
+
|
|
30
|
+
```bash
|
|
31
|
+
uap expert-route "Design a new billing subsystem" --files src/types/billing.ts --json
|
|
32
|
+
# → plan: strategic-architect … design: tactical-architect, implementation-planner …
|
|
33
|
+
```
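The wiring described above amounts to a roster lookup followed by a capability filter. The sketch below is illustrative only: the roster contents are limited to the three forward-design droids, and the `RELEVANT_CAPABILITIES` map and `droidsForPhase` helper are assumptions made for this example rather than the contents of `expert-orchestrator.ts`.

```typescript
type Phase = "plan" | "design";

// Hypothetical roster: only the three forward-design droids are shown.
const PHASE_ROSTER: Record<Phase, string[]> = {
  plan: ["strategic-architect"],
  design: ["tactical-architect", "implementation-planner"],
};

// Assumed mapping of each droid to the capabilities it is relevant for.
const RELEVANT_CAPABILITIES: Record<string, string[]> = {
  "strategic-architect": ["architecture", "api-design"],
  "tactical-architect": ["architecture", "api-design"],
  "implementation-planner": ["architecture", "api-design"],
};

// A droid is relevant when it shares at least one capability with the task.
function isRelevantForCapability(droid: string, matched: string[]): boolean {
  const wanted = RELEVANT_CAPABILITIES[droid] ?? [];
  return wanted.some((capability) => matched.indexOf(capability) !== -1);
}

// Returns the droids of one phase that apply to the matched capabilities.
function droidsForPhase(phase: Phase, matched: string[]): string[] {
  return PHASE_ROSTER[phase].filter((droid) => isRelevantForCapability(droid, matched));
}
```

Filtering per phase keeps the roster declarative: adding a droid means adding one roster entry and one capability mapping, with no change to the selection logic.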
|
|
34
|
+
|
|
35
|
+
---
|
|
36
|
+
|
|
37
|
+
## 2. Experts as MCP tools (activated)
|
|
38
|
+
|
|
39
|
+
`src/mcp-router/experts/registry.ts` could already convert droids to virtual
|
|
40
|
+
`experts.<name>` tools (`loadExpertTools`) but was never wired in. Now:
|
|
41
|
+
|
|
42
|
+
- `McpRouter.loadTools()` (`src/mcp-router/server.ts`) calls `loadExpertTools(cwd)`
|
|
43
|
+
and adds the experts to the fuzzy search index.
|
|
44
|
+
- `handleExecuteTool` (`src/mcp-router/tools/execute.ts`) intercepts
|
|
45
|
+
`experts.<droid>` paths and dispatches an in-process `consultExpert()` — it
|
|
46
|
+
loads the droid's instructions and returns them wrapped as a prompt (mirroring
|
|
47
|
+
`uap_droid_invoke`), instead of routing to an external MCP server.
|
|
48
|
+
|
|
49
|
+
Result: `discover_tools "architecture review"` surfaces the right expert and
|
|
50
|
+
`execute_tool experts.architect-reviewer` returns a consultation — all within
|
|
51
|
+
the 2-tool token-saving router shape.
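As a rough illustration of the interception step, the sketch below recognizes the `experts.` prefix and answers such paths in process. `parseExpertPath`, `routeTool`, and the instruction lookup are hypothetical names for this example; the real `handleExecuteTool` and `consultExpert()` signatures are not shown in this document.

```typescript
// Result of routing one tool path.
type Routed =
  | { kind: "expert"; droid: string; prompt: string }
  | { kind: "external"; path: string };

// Returns the droid name for "experts.<droid>" paths, otherwise null.
function parseExpertPath(path: string): string | null {
  const prefix = "experts.";
  if (path.indexOf(prefix) !== 0) return null;
  const droid = path.slice(prefix.length);
  return droid.length > 0 ? droid : null;
}

// Assumed behaviour: expert paths are answered in process by wrapping the
// droid's instructions as a prompt. In this sketch, non-expert paths and
// unknown droids fall through to the external route.
function routeTool(path: string, instructions: Map<string, string>, task: string): Routed {
  const droid = parseExpertPath(path);
  const body = droid === null ? undefined : instructions.get(droid);
  if (droid === null || body === undefined) {
    return { kind: "external", path };
  }
  return { kind: "expert", droid, prompt: `${body}\n\nTask: ${task}` };
}
```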
|
|
52
|
+
|
|
53
|
+
---
|
|
54
|
+
|
|
55
|
+
## 3. HALO — trace-based harness optimization
|
|
56
|
+
|
|
57
|
+
[HALO](https://github.com/context-labs/HALO) analyzes large volumes of execution
|
|
58
|
+
traces to find *systemic* harness/prompt failure modes (not one-off errors). UAP
|
|
59
|
+
integrates it as an exporter, a droid, and a CLI.
|
|
60
|
+
|
|
61
|
+
**Exporter** (`src/observability/halo-exporter.ts`) — opt-in, zero-overhead when
|
|
62
|
+
off. Emits one JSONL span per agent/LLM/tool call in HALO's OTLP/OpenInference
|
|
63
|
+
shape: OTLP identity, `resource.attributes."service.name"`, and the four
|
|
64
|
+
`inference.*` attributes (`project_id`, `observation_kind`, `export.schema_version`,
|
|
65
|
+
`openinference.span.kind`), with nanosecond-precision timestamps.
|
|
66
|
+
|
|
67
|
+
Tap points: `execute.ts:handleExecuteTool` (TOOL spans) and
|
|
68
|
+
`session-telemetry.ts` `agentComplete`/`agentError` (AGENT spans).
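To make the span format concrete, here is a minimal sketch of building one JSONL line with nanosecond timestamps. The `HaloSpan` field layout, the `"uap"` service name, and the example IDs are assumptions for illustration; only `service.name`, `openinference.span.kind`, and the TOOL span kind come from the description above, and the real exporter may structure its output differently.

```typescript
// Hypothetical span shape; the real exporter's field layout may differ.
interface HaloSpan {
  trace_id: string;
  span_id: string;
  name: string;
  start_time_unix_nano: string;
  end_time_unix_nano: string;
  resource: { attributes: { "service.name": string } };
  attributes: Record<string, string>;
}

// Converts non-negative epoch milliseconds to a nanosecond string.
// Appending six zeros multiplies by 10^6 without leaving the safe-integer range.
function msToNanos(epochMs: number): string {
  const whole = Math.trunc(epochMs);
  return whole === 0 ? "0" : `${whole}000000`;
}

// Serializes one span as a single JSONL line (JSON.stringify emits no raw newlines).
function toJsonl(span: HaloSpan): string {
  return JSON.stringify(span);
}

const exampleLine = toJsonl({
  trace_id: "0af7651916cd43dd8448eb211c80319c", // example value
  span_id: "b7ad6b7169203331", // example value
  name: "execute_tool",
  start_time_unix_nano: msToNanos(1700000000000),
  end_time_unix_nano: msToNanos(1700000000250),
  resource: { attributes: { "service.name": "uap" } },
  attributes: { "openinference.span.kind": "TOOL" },
});
```

Timestamps are kept as strings because nanosecond epoch values exceed what a JavaScript `number` can represent exactly.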
|
|
69
|
+
|
|
70
|
+
```bash
|
|
71
|
+
export UAP_HALO_TRACE=1 # enable collection
|
|
72
|
+
export UAP_HALO_TRACE_PATH=.uap/halo/traces.jsonl
|
|
73
|
+
# … run your workflow …
|
|
74
|
+
uap harness status # enabled? path? span count?
|
|
75
|
+
uap harness analyze -p "systemic failure modes?" # wraps `halo <file> -p ...`
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
**Prerequisite:** `pip install halo-engine` (Python ≥3.10) + an OpenAI-compatible
|
|
79
|
+
endpoint. Each analysis run incurs LLM cost. The `harness-optimizer` droid runs
|
|
80
|
+
the loop: diagnose → **verify each claim against the repo** → route fixes →
|
|
81
|
+
re-measure. Hard rule: *ask HALO about the trace data; never ask it to write code.*
|
|
82
|
+
|
|
83
|
+
---
|
|
84
|
+
|
|
85
|
+
## 4. Open-Collider — divergent ideation
|
|
86
|
+
|
|
87
|
+
[open-collider](https://github.com/CL-ML/open-collider) escapes LLM "hivemind"
|
|
88
|
+
clustering by colliding structurally distant knowledge domains (Koestler
|
|
89
|
+
bisociation), then curating non-trivial ideas. Skill mode is free.
|
|
90
|
+
|
|
91
|
+
- `ideation-expert` droid drives the brief → domains → collide → curate flow.
|
|
92
|
+
- `uap ideate setup <name>` scaffolds the `projects/<name>/` file contract
|
|
93
|
+
(`brief_validated.json`, `input_bank.yaml`, `prompts/`, `texts/`).
|
|
94
|
+
- `uap ideate run <name>` drives the brainstorm; `uap ideate ideas <name>` reads
|
|
95
|
+
the newest `curated_ideas.json`.
|
|
96
|
+
- Orchestrator opt-in: `new ExpertOrchestrator({ includeIdeation: true })`
|
|
97
|
+
prepends an `ideate` phase feeding the plan-phase product/strategy droids.
|
|
98
|
+
`readCuratedIdeas()` (`src/cli/ideate.ts`) is the consumable artifact.
|
|
99
|
+
|
|
100
|
+
Use it only when the solution space is wide; skip for convergent tasks.
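Reading "the newest `curated_ideas.json`" can be sketched as a selection over candidate files. The `RunFile` shape, the use of modification time as the ordering, and the directory layout in the usage note are assumptions for this example; how `readCuratedIdeas()` actually orders runs is not stated here.

```typescript
// Hypothetical directory entry: a path and its modification time in milliseconds.
interface RunFile {
  path: string;
  mtimeMs: number;
}

// Picks the most recently modified curated_ideas.json, or null if none exists.
function newestCuratedIdeas(files: RunFile[]): RunFile | null {
  let newest: RunFile | null = null;
  for (const file of files) {
    const base = file.path.split("/").pop();
    if (base !== "curated_ideas.json") continue;
    if (newest === null || file.mtimeMs > newest.mtimeMs) newest = file;
  }
  return newest;
}
```

Given entries such as `projects/demo/runs/a/curated_ideas.json` and `projects/demo/runs/b/curated_ideas.json` (a made-up layout), the helper returns whichever was modified last and ignores unrelated files like `input_bank.yaml`.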
|
|
101
|
+
|
|
102
|
+
---
|
|
103
|
+
|
|
104
|
+
## 5. Expert-review hard gate
|
|
105
|
+
|
|
106
|
+
The `parallel-expert-review` skill claimed "REQUIRED by policy" but nothing
|
|
107
|
+
enforced it. Two policy artifacts close that gap:
|
|
108
|
+
|
|
109
|
+
- `expert-review-required` (`src/policies/schemas/policies/expert-review-required.md`
|
|
110
|
+
+ `src/policies/enforcers/expert_review_required.py`): blocks ship actions
|
|
111
|
+
(`git commit`/`push`, `gh pr create`, merge/pr-ready/signoff) unless
|
|
112
|
+
`.uap/reviews/<branch-slug>.json` exists and covers `HEAD` (stale → block).
|
|
113
|
+
Fails open on a detached HEAD or outside a git repo; override with `UAP_NO_REVIEW=1`.
|
|
114
|
+
- `architecture-review` (`…/policies/architecture-review.md`): the missing
|
|
115
|
+
backing doc for the previously-orphan `architecture_review.py` enforcer
|
|
116
|
+
(ADR-or-waiver on architecturally significant diffs).
|
|
117
|
+
|
|
118
|
+
The review flow writes the artifact on consolidation:
|
|
119
|
+
`{ "head": "<sha>", "verdict": "approve", "reviewers": [...] }`. Install with:
|
|
120
|
+
|
|
121
|
+
```bash
|
|
122
|
+
uap policy install expert-review-required # attaches the enforcer to the hook
|
|
123
|
+
```
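The gate's decision can be summarized as a small pure function. This is a sketch under stated assumptions, not the shipped enforcer (which is a Python script): the `GateInput` shape and the order of the checks are hypothetical, and the `verdict` field is not inspected because the description above only requires that the artifact exist and cover `HEAD`.

```typescript
// Artifact shape quoted in this section.
interface ReviewArtifact {
  head: string;
  verdict: string;
  reviewers: string[];
}

// Hypothetical inputs gathered before a ship action.
interface GateInput {
  branch: string | null; // null models a detached HEAD or a non-git directory
  headSha: string;
  artifact: ReviewArtifact | null; // null when the artifact file is missing
  noReviewEnv?: string; // value of UAP_NO_REVIEW, if set
}

interface Decision {
  allow: boolean;
  why: string;
}

// Assumed check order: override, then fail-open, then artifact presence and freshness.
function decideShip(input: GateInput): Decision {
  if (input.noReviewEnv === "1") return { allow: true, why: "override" };
  if (input.branch === null) return { allow: true, why: "fail-open: no branch" };
  if (input.artifact === null) return { allow: false, why: "no review artifact" };
  if (input.artifact.head !== input.headSha) return { allow: false, why: "stale review" };
  return { allow: true, why: "review covers HEAD" };
}
```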
|
|
124
|
+
|
|
125
|
+
---
|
|
126
|
+
|
|
127
|
+
## File map
|
|
128
|
+
|
|
129
|
+
| Concern | Path |
|
|
130
|
+
|---|---|
|
|
131
|
+
| Forward-design droids | `.factory/droids/{strategic-architect,tactical-architect,implementation-planner}.md` |
|
|
132
|
+
| Orchestrator wiring | `src/coordination/expert-orchestrator.ts` |
|
|
133
|
+
| Experts-MCP dispatch | `src/mcp-router/experts/registry.ts`, `server.ts`, `tools/execute.ts` |
|
|
134
|
+
| HALO exporter | `src/observability/halo-exporter.ts` |
|
|
135
|
+
| HALO droid + CLI | `.factory/droids/harness-optimizer.md`, `src/cli/harness.ts` |
|
|
136
|
+
| Ideation droid + CLI | `.factory/droids/ideation-expert.md`, `src/cli/ideate.ts` |
|
|
137
|
+
| Review gate | `src/policies/{schemas/policies/expert-review-required.md,enforcers/expert_review_required.py}` |
|
|
@@ -0,0 +1,68 @@
|
|
|
1
|
+
# Platform Gating
|
|
2
|
+
|
|
3
|
+
How UAP's policy gate (the DB-driven enforcement that blocks tool calls via
|
|
4
|
+
`policies.db` + `.policy-tools/*.py`) is applied across each supported agent
|
|
5
|
+
harness, and where harness limits make it advisory.
|
|
6
|
+
|
|
7
|
+
## Install & validate
|
|
8
|
+
|
|
9
|
+
```bash
|
|
10
|
+
uap hooks install # all project platforms (Hermes is global → opt-in)
|
|
11
|
+
uap hooks install -t hermes # Hermes (writes global ~/.hermes/config.yaml)
|
|
12
|
+
uap hooks doctor # audit coverage; exits non-zero on gaps
|
|
13
|
+
uap setup # now also installs hooks (Step 7)
|
|
14
|
+
```
|
|
15
|
+
|
|
16
|
+
The gate script `templates/hooks/uap-policy-gate.sh` is copied into each
|
|
17
|
+
platform's hooks dir and registered on that platform's pre-tool event. It reads
|
|
18
|
+
the tool payload on stdin, runs the active enforcers, and blocks with `exit 2`
|
|
19
|
+
(Claude convention). Hermes uses a wrapper (`uap-policy-gate-hermes.sh`) that
|
|
20
|
+
translates `exit 2` into a stdout `{"decision":"block"}` JSON.
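The translation the Hermes wrapper performs can be sketched as follows. The function names are hypothetical and the shipped wrapper is a shell script; the block shape, the empty object for allow, and the fallback reason text mirror `uap-policy-gate-hermes.sh` as shown in this diff. Unlike the script, this sketch trims surrounding whitespace from the reason.

```typescript
// Decision object printed on stdout; an empty object means "allow".
interface HermesDecision {
  decision?: "block";
  reason?: string;
}

// Maps the shared gate's exit code and stderr to the Hermes contract.
// Only exit code 2 blocks; every other code (including crashes) allows.
function toHermesDecision(exitCode: number, stderr: string): HermesDecision {
  if (exitCode !== 2) return {};
  const reason = stderr.replace(/\r/g, "").replace(/\n/g, " ").trim();
  return { decision: "block", reason: reason.length > 0 ? reason : "blocked by UAP policy" };
}

// JSON.stringify handles quoting and escaping of the reason text.
function renderHermesOutput(exitCode: number, stderr: string): string {
  return JSON.stringify(toHermesDecision(exitCode, stderr));
}
```

Because the caller always prints valid JSON and exits 0, a Hermes-style fail-open hook runner cannot mistake a genuine block for a hook failure.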
|
|
21
|
+
|
|
22
|
+
## Coverage matrix
|
|
23
|
+
|
|
24
|
+
| Platform | Tier | Pre-tool mechanism | Config |
|
|
25
|
+
|---|---|---|---|
|
|
26
|
+
| claude | ✅ gated | `PreToolUse` hooks (Edit/Write/MultiEdit, Bash, Task/Agent/…) | `.claude/settings.local.json` |
|
|
27
|
+
| vscode | ✅ gated | same (Claude format) | `.claude/settings.local.json` |
|
|
28
|
+
| cursor | ✅ gated | `preToolUse` array | `.cursor/hooks.json` |
|
|
29
|
+
| factory | ✅ gated | `PreToolUse` hooks | `.factory/settings.local.json` |
|
|
30
|
+
| opencode | ✅ gated | `tool.execute.before` plugin hook (throws to abort) | `.opencode/plugin/uap-session-hooks.ts` |
|
|
31
|
+
| omp | ✅ gated | `preToolUsePolicyGate` hook | `.uap/omp/settings.json` |
|
|
32
|
+
| hermes | ✅ gated | `pre_tool_call` shell hook (stdout block JSON) | `~/.hermes/config.yaml` (global) |
|
|
33
|
+
| codex | ⚠️ MCP-gated | no native pre-tool hook event | `.codex/config.toml` `[mcp_servers.uap]` |
|
|
34
|
+
| forgecode | ⚠️ advisory | plugin injects policy context; no block path | `.forge/forgecode.plugin.sh` |
|
|
35
|
+
|
|
36
|
+
## Harness limits (why two platforms are not hard-gated)
|
|
37
|
+
|
|
38
|
+
- **Codex** has no pre-tool-use *hook event*, so it can't auto-run the gate
|
|
39
|
+
before every tool. Gating is **hard** for tools routed through the UAP MCP
|
|
40
|
+
server (`execute_tool` runs the PolicyGate) and **advisory** for codex-native
|
|
41
|
+
edit/bash (run `bash .codex/hooks/uap-policy-gate.sh` per AGENTS.md). `hooks
|
|
42
|
+
doctor` reports codex as MCP-gated.
|
|
43
|
+
- **ForgeCode**'s plugin surfaces session/compaction lifecycle and injects the
|
|
44
|
+
active-policy list as context, but exposes no pre-tool interception point that
|
|
45
|
+
can *block*. Reported as advisory.
|
|
46
|
+
|
|
47
|
+
## Hermes specifics
|
|
48
|
+
|
|
49
|
+
- Config is **global** (`$HERMES_HOME` or `~/.hermes/config.yaml`), so it is
|
|
50
|
+
excluded from the default `uap hooks install` loop and installed explicitly
|
|
51
|
+
with `-t hermes`. `hooks doctor` treats an absent `~/.hermes` as optional, and
|
|
52
|
+
a present-but-unwired install as a real gap.
|
|
53
|
+
- Hermes hooks are **fail-open** (a crashing/exit-non-zero/bad-JSON hook lets the
|
|
54
|
+
tool proceed). The UAP Hermes gate therefore always exits 0 and always emits a
|
|
55
|
+
valid decision JSON, so genuine blocks are enforced.
|
|
56
|
+
- Hermes prompts once to approve each hook command (stored in
|
|
57
|
+
`~/.hermes/shell-hooks-allowlist.json`); approve the UAP gate, or set
|
|
58
|
+
`hooks_auto_accept: true`.
|
|
59
|
+
- Hermes has no per-file persona registry, so UAP droids are surfaced via a
|
|
60
|
+
skills bridge (`~/.hermes/uap-skills/uap-experts/SKILL.md`) that routes to
|
|
61
|
+
`uap expert-route` and the MCP `experts.<name>` tools.
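The doctor's treatment of optional versus required platforms can be illustrated with a small classifier. Everything here is a sketch: `PlatformState`, `auditPlatformState`, and `doctorExitCode` are hypothetical names, and treating a missing config on a non-optional platform as a gap is an assumption; the real `auditPlatform` and `hooksDoctor` in `src/cli/hooks.ts` may behave differently.

```typescript
// Hypothetical snapshot of one platform's install state.
interface PlatformState {
  name: string;
  configPresent: boolean; // the platform's config file or directory exists
  gateWired: boolean; // the gate is registered on the pre-tool event
  optionalWhenAbsent: boolean; // true for globally configured platforms such as Hermes
}

type Finding = "ok" | "skipped" | "gap";

// Absent optional platforms are skipped; present-but-unwired ones are gaps.
function auditPlatformState(p: PlatformState): Finding {
  if (!p.configPresent) return p.optionalWhenAbsent ? "skipped" : "gap";
  return p.gateWired ? "ok" : "gap";
}

// Exit code for the audit: non-zero when any platform reports a gap.
function doctorExitCode(platforms: PlatformState[]): number {
  return platforms.some((p) => auditPlatformState(p) === "gap") ? 1 : 0;
}
```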
|
|
62
|
+
|
|
63
|
+
## Key files
|
|
64
|
+
|
|
65
|
+
- Installer + doctor: `src/cli/hooks.ts` (`copyHookScripts`, `installHermesHooks`, `auditPlatform`, `hooksDoctor`, `ALL_TARGETS`).
|
|
66
|
+
- Gate scripts: `templates/hooks/uap-policy-gate.sh`, `templates/hooks/uap-policy-gate-hermes.sh`.
|
|
67
|
+
- MCP-router gate (codex path): `src/mcp-router/tools/execute.ts:handleExecuteTool`.
|
|
68
|
+
- Setup wiring: `src/cli/setup.ts`.
|
|
@@ -0,0 +1,219 @@
|
|
|
1
|
+
# Expert Droids Reference
|
|
2
|
+
|
|
3
|
+
UAP ships with a 38-droid expert stack covering the full SDLC — ideation,
|
|
4
|
+
strategy, design, build, review, release, and operations. Droids are markdown
|
|
5
|
+
files under `.factory/droids/` discoverable by the capability router, the MCP
|
|
6
|
+
router's `expert-consultation` category (virtual `experts.<name>` tools), and
|
|
7
|
+
the `ExpertOrchestrator`. The forward-design / HALO / ideation extensions are
|
|
8
|
+
documented in [docs/architecture/EXPERT_STACK.md](../architecture/EXPERT_STACK.md).
|
|
9
|
+
|
|
10
|
+
## Quick Look
|
|
11
|
+
|
|
12
|
+
```bash
|
|
13
|
+
uap droids list # see what's installed
|
|
14
|
+
uap droids validate # CI-grade integrity check
|
|
15
|
+
uap expert-route "<task description>" # ask for the recommended chain
|
|
16
|
+
```
|
|
17
|
+
|
|
18
|
+
---
|
|
19
|
+
|
|
20
|
+
## Base Roster (33 droids)
|
|
21
|
+
|
|
22
|
+
> The 5 forward-design / HALO / ideation droids that bring the total to 38 are
|
|
23
|
+
> listed under [Forward-Design, HALO & Ideation Extensions](#forward-design-halo--ideation-extensions) below.
|
|
24
|
+
|
|
25
|
+
### Strategy (3)
|
|
26
|
+
|
|
27
|
+
| Droid | Role |
|
|
28
|
+
|---|---|
|
|
29
|
+
| `product-strategist` | Acceptance criteria, hidden constraints, PRDs |
|
|
30
|
+
| `architect-reviewer` | System design validation, ADRs, blast radius |
|
|
31
|
+
| `api-designer` | Contract design, versioning, schema diff authority |
|
|
32
|
+
|
|
33
|
+
### Build (8)
|
|
34
|
+
|
|
35
|
+
| Droid | Role |
|
|
36
|
+
|---|---|
|
|
37
|
+
| `typescript-node-expert` | Strict TS, ESM, Node 18+ async patterns |
|
|
38
|
+
| `javascript-pro` | Modern ES2023+, browser + Node |
|
|
39
|
+
| `python-pro` | Type-safe Python 3.11+, async, packaging |
|
|
40
|
+
| `rust-pro` | Ownership, lifetimes, async/await with Tokio |
|
|
41
|
+
| `go-pro` | Idiomatic concurrency, errgroup, context |
|
|
42
|
+
| `cli-design-expert` | Verb/noun discipline, exit codes, stdout/stderr split |
|
|
43
|
+
| `debug-expert` | Root cause analysis, dependency conflicts |
|
|
44
|
+
| `refactoring-specialist` | Behavior-preserving transformations |
|
|
45
|
+
|
|
46
|
+
### Quality (4)
|
|
47
|
+
|
|
48
|
+
| Droid | Role |
|
|
49
|
+
|---|---|
|
|
50
|
+
| `code-quality-guardian` | Full-file structural review, smells, metrics |
|
|
51
|
+
| `code-quality-reviewer` | Per-diff quality with citations |
|
|
52
|
+
| `security-auditor` | Full OWASP audit, secret detection, sec-context |
|
|
53
|
+
| `security-code-reviewer` | Per-diff security regressions, CWE-cited |
|
|
54
|
+
|
|
55
|
+
### Performance & Cost (3)
|
|
56
|
+
|
|
57
|
+
| Droid | Role |
|
|
58
|
+
|---|---|
|
|
59
|
+
| `performance-optimizer` | Full perf analysis, bottleneck identification |
|
|
60
|
+
| `performance-reviewer` | Per-diff regressions, N+1, allocation hotspots |
|
|
61
|
+
| `cost-engineer` | Cloud spend modeling, FinOps, observability cost |
|
|
62
|
+
|
|
63
|
+
### Testing & QA (4)
|
|
64
|
+
|
|
65
|
+
| Droid | Role |
|
|
66
|
+
|---|---|
|
|
67
|
+
| `test-strategist` | Pyramid design, coverage targets, technique selection |
|
|
68
|
+
| `test-plan-writer` | Authors test plans + scaffolds from acceptance criteria |
|
|
69
|
+
| `test-coverage-reviewer` | Verifies new behavior is exercised |
|
|
70
|
+
| `qa-expert` | Flaky test triage, regression bisect, release sign-off |
|
|
71
|
+
|
|
72
|
+
### Documentation (2)
|
|
73
|
+
|
|
74
|
+
| Droid | Role |
|
|
75
|
+
|---|---|
|
|
76
|
+
| `documentation-expert` | Authors comprehensive docs, READMEs, API docs |
|
|
77
|
+
| `documentation-accuracy-reviewer` | Per-diff doc drift, broken examples |
|
|
78
|
+
|
|
79
|
+
### Operations (5)
|
|
80
|
+
|
|
81
|
+
| Droid | Role |
|
|
82
|
+
|---|---|
|
|
83
|
+
| `release-manager` | Semver decisions, CHANGELOG, deploy batch priority |
|
|
84
|
+
| `compliance-officer` | Policy authoring, regulatory mapping, waivers |
|
|
85
|
+
| `incident-responder` | War-room coordination, postmortems, runbooks |
|
|
86
|
+
| `observability-engineer` | Logs/metrics/traces, SLOs, cardinality |
|
|
87
|
+
| `dependency-auditor` | Supply chain, slopsquatting, CVE workflow |
|
|
88
|
+
|
|
89
|
+
### Specialty (4)
|
|
90
|
+
|
|
91
|
+
| Droid | Role |
|
|
92
|
+
|---|---|
|
|
93
|
+
| `ml-training-expert` | Model training, dataset processing, MTEB |
|
|
94
|
+
| `sysadmin-expert` | Linux kernel, QEMU, networking, systemd |
|
|
95
|
+
| `terminal-bench-optimizer` | Meta-orchestrator for benchmark tasks |
|
|
96
|
+
| `accessibility-tester` | WCAG 2.2 AA, keyboard nav, screen readers |
|
|
97
|
+
|
|
98
|
+
---
|
|
99
|
+
|
|
100
|
+
## Routing
|
|
101
|
+
|
|
102
|
+
The `ExpertOrchestrator` composes droid chains across five phases:
|
|
103
|
+
|
|
104
|
+
```
|
|
105
|
+
plan → design → implement → review → release
|
|
106
|
+
```
|
|
107
|
+
|
|
108
|
+
Each phase pulls from a roster but only includes droids relevant to the
|
|
109
|
+
matched capabilities. Review-phase droids run in parallel.
|
|
110
|
+
|
|
111
|
+
```bash
|
|
112
|
+
# CLI surface
|
|
113
|
+
uap expert-route "Add rate limiting to /api/login" --files src/auth/login.ts
|
|
114
|
+
|
|
115
|
+
# JSON output (machine-readable, scriptable)
|
|
116
|
+
uap expert-route "..." --json
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
Programmatic surface:
|
|
120
|
+
|
|
121
|
+
```typescript
|
|
122
|
+
import { ExpertOrchestrator } from '@miller-tech/uap/coordination/expert-orchestrator';
|
|
123
|
+
|
|
124
|
+
const orch = new ExpertOrchestrator({
|
|
125
|
+
successRateFor: (droid) => adaptivePatterns.successRate(droid),
|
|
126
|
+
});
|
|
127
|
+
|
|
128
|
+
const plan = orch.plan(task, affectedFiles);
|
|
129
|
+
for (const step of plan.steps) {
|
|
130
|
+
console.log(`${step.phase}: ${step.droid} (${step.parallel ? 'parallel' : 'sequential'})`);
|
|
131
|
+
}
|
|
132
|
+
```
|
|
133
|
+
|
|
134
|
+
---
|
|
135
|
+
|
|
136
|
+
## MCP Router Surface
|
|
137
|
+
|
|
138
|
+
Every droid is exposed as a virtual MCP tool under the `experts` server,
|
|
139
|
+
discoverable via the standard `discover_tools` interface. This preserves
|
|
140
|
+
the 98% token-savings shape (just 2 meta tools exposed to the LLM).
|
|
141
|
+
|
|
142
|
+
```typescript
|
|
143
|
+
import { loadExpertTools } from '@miller-tech/uap/mcp-router';
|
|
144
|
+
|
|
145
|
+
const tools = loadExpertTools(process.cwd());
|
|
146
|
+
// tools[].serverName === 'experts'
|
|
147
|
+
// tools[].name === '<droid-name>'
|
|
148
|
+
|
|
149
|
+
// In the MCP router's tool index:
|
|
150
|
+
index.addTools(tools);
|
|
151
|
+
const matches = index.search('security review for auth code');
|
|
152
|
+
// → [{ path: 'experts.security-auditor', ... }]
|
|
153
|
+
```
|
|
154
|
+
|
|
155
|
+
---
|
|
156
|
+
|
|
157
|
+
## Authoring a New Droid
|
|
158
|
+
|
|
159
|
+
1. Create `.factory/droids/<name>.md` with frontmatter:
|
|
160
|
+
```yaml
|
|
161
|
+
---
|
|
162
|
+
name: <droid-name>
|
|
163
|
+
description: <one-line summary, ≥5 chars>
|
|
164
|
+
model: inherit
|
|
165
|
+
coordination:
|
|
166
|
+
channels: ["review"]
|
|
167
|
+
claims: ["shared"]
|
|
168
|
+
batches_deploy: true
|
|
169
|
+
---
|
|
170
|
+
```
|
|
171
|
+
2. Body sections: Mission, MANDATORY Pre-Checks, PROACTIVE ACTIVATION,
|
|
172
|
+
protocol/checklist, output shape, coordination notes.
|
|
173
|
+
3. If routable: add an entry to `DEFAULT_CAPABILITY_MAPPINGS` in
|
|
174
|
+
`src/coordination/capability-router.ts`.
|
|
175
|
+
4. Run `uap droids validate` to confirm integrity.
|
|
176
|
+
5. The CI gate (the `droids validate` step) blocks the merge if validation fails.
|
|
177
|
+
|
|
178
|
+
See [`.factory/droids/code-quality-guardian.md`](../../.factory/droids/code-quality-guardian.md)
|
|
179
|
+
for a canonical example.
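A frontmatter check along the lines of step 1 might look like the sketch below. It is not the logic of `uap droids validate`: the `DroidFrontmatter` typing and the rule that `name` must equal the file stem are assumptions for illustration, while the five-character minimum for `description` comes from the template above.

```typescript
// Parsed frontmatter fields used in this sketch (hypothetical typing).
interface DroidFrontmatter {
  name?: string;
  description?: string;
  model?: string;
}

// Returns a list of problems; an empty list means the frontmatter passes.
function validateFrontmatter(fm: DroidFrontmatter, fileStem: string): string[] {
  const problems: string[] = [];
  if (!fm.name) {
    problems.push("missing name");
  } else if (fm.name !== fileStem) {
    // Assumed rule: the name matches the markdown file's stem.
    problems.push("name does not match file name");
  }
  if (!fm.description || fm.description.trim().length < 5) {
    problems.push("description must be at least 5 characters");
  }
  return problems;
}
```

Returning every problem at once, rather than failing on the first, suits a CI step where authors want the full list in a single run.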
|
|
180
|
+
|
|
181
|
+
---
|
|
182
|
+
|
|
183
|
+
## Forward-Design, HALO & Ideation Extensions
|
|
184
|
+
|
|
185
|
+
Added on top of the base roster (see
|
|
186
|
+
[docs/architecture/EXPERT_STACK.md](../architecture/EXPERT_STACK.md)):
|
|
187
|
+
|
|
188
|
+
| Droid | Phase | Role |
|
|
189
|
+
|---|---|---|
|
|
190
|
+
| `strategic-architect` | plan | North-star architecture, technology selection, multi-quarter evolution (forward-design counterpart to `architect-reviewer`) |
|
|
191
|
+
| `tactical-architect` | design | Concrete component boundaries, interfaces, data shapes, pattern selection |
|
|
192
|
+
| `implementation-planner` | design | Executable work breakdown: steps, file plan, test plan, rollback |
|
|
193
|
+
| `ideation-expert` | ideate | open-collider divergent ideation (bisociation) feeding plan/design |
|
|
194
|
+
| `harness-optimizer` | review | HALO loop — diagnoses systemic harness failures from execution traces |
|
|
195
|
+
|
|
196
|
+
## Policy Hooks
|
|
197
|
+
|
|
198
|
+
| Policy | Level | Droid authority |
|
|
199
|
+
|---|---|---|
|
|
200
|
+
| `architecture-review` / `architecture-review-required` | REQUIRED | `architect-reviewer` |
|
|
201
|
+
| `expert-review-required` | REQUIRED | parallel-expert-review reviewers |
|
|
202
|
+
| `acceptance-criteria-defined` | RECOMMENDED | `product-strategist` |
|
|
203
|
+
| `observability-required` | RECOMMENDED | `observability-engineer` |
|
|
204
|
+
|
|
205
|
+
Architecture-review policy is enforced by
|
|
206
|
+
`src/policies/enforcers/<uuid>_architecture_review.py` — blocks PR-ready
|
|
207
|
+
operations on qualifying diffs unless an ADR or active waiver is present (backed
|
|
208
|
+
by `architecture-review.md`). The `expert-review-required` policy
|
|
209
|
+
(`expert_review_required.py`) blocks ship actions until a parallel review
|
|
210
|
+
artifact `.uap/reviews/<branch-slug>.json` covers HEAD.
|
|
211
|
+
|
|
212
|
+
---
|
|
213
|
+
|
|
214
|
+
## Related
|
|
215
|
+
|
|
216
|
+
- [PARALLEL REVIEW PROTOCOL skill](../../.factory/skills/parallel-expert-review/SKILL.md)
|
|
217
|
+
- [Capability Router](../../src/coordination/capability-router.ts)
|
|
218
|
+
- [Expert Orchestrator](../../src/coordination/expert-orchestrator.ts)
|
|
219
|
+
- [MCP Router expert registry](../../src/mcp-router/experts/registry.ts)
|
package/package.json
CHANGED
|
@@ -3,11 +3,17 @@
|
|
|
3
3
|
# Event: PreToolUse (matcher: Edit|Write)
|
|
4
4
|
# Exit 2 = BLOCK the edit/write. Exit 0 = allow.
|
|
5
5
|
# Enforces: worktree-file-guard, worktree-enforcement policies.
|
|
6
|
+
#
|
|
7
|
+
# Scope: ONLY enforces inside the project repo (resolved via `git rev-parse
|
|
8
|
+
# --show-toplevel` from PWD). Files outside the repo (e.g. ~/.claude/projects/
|
|
9
|
+
# memory area, /tmp scratch, system files) are always allowed — the worktree
|
|
10
|
+
# policy governs repo-tracked work only. See: _hook-fix-2026-04-29.
|
|
6
11
|
set -euo pipefail
|
|
7
12
|
|
|
8
13
|
# --- Loop Protection: track frequency of blocking events ---
|
|
9
14
|
HOOK_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
|
10
15
|
if [ -f "${HOOK_DIR}/loop-protection.sh" ]; then
|
|
16
|
+
# shellcheck disable=SC1091
|
|
11
17
|
source "${HOOK_DIR}/loop-protection.sh"
|
|
12
18
|
fi
|
|
13
19
|
|
|
@@ -22,7 +28,25 @@ if [ -z "$FILE_PATH" ]; then
|
|
|
22
28
|
exit 0
|
|
23
29
|
fi
|
|
24
30
|
|
|
25
|
-
#
|
|
31
|
+
# Resolve to canonical absolute path. -m allows the file to not exist yet
|
|
32
|
+
# (common for Write to a new file).
|
|
33
|
+
ABS_PATH="$(realpath -m "$FILE_PATH" 2>/dev/null || printf '%s' "$FILE_PATH")"
|
|
34
|
+
|
|
35
|
+
# Resolve repo root from current working directory. If cwd is not inside a
|
|
36
|
+
# git repo at all, allow — there's no worktree policy to enforce.
|
|
37
|
+
REPO_ROOT="$(git rev-parse --show-toplevel 2>/dev/null || true)"
|
|
38
|
+
if [ -z "$REPO_ROOT" ]; then
|
|
39
|
+
exit 0
|
|
40
|
+
fi
|
|
41
|
+
|
|
42
|
+
# Scope guard (the regression fix): if the file being edited is outside the
|
|
43
|
+
# repo root, always allow. The worktree policy only applies to in-repo files.
|
|
44
|
+
case "$ABS_PATH" in
|
|
45
|
+
"$REPO_ROOT"/*) : ;; # inside repo — continue to in-repo checks
|
|
46
|
+
*) exit 0 ;; # outside repo — allow
|
|
47
|
+
esac
|
|
48
|
+
|
|
49
|
+
# Exempt paths — runtime data, not source code (substring match, intentional)
|
|
26
50
|
EXEMPT_PATTERNS=(
|
|
27
51
|
"agents/data/"
|
|
28
52
|
"node_modules/"
|
|
@@ -30,23 +54,20 @@ EXEMPT_PATTERNS=(
|
|
|
30
54
|
".uap/"
|
|
31
55
|
".git/"
|
|
32
56
|
"dist/"
|
|
33
|
-
"/tmp/"
|
|
34
|
-
"/dev/"
|
|
35
57
|
)
|
|
36
58
|
|
|
37
59
|
for pattern in "${EXEMPT_PATTERNS[@]}"; do
|
|
38
|
-
if echo "$
|
|
60
|
+
if echo "$ABS_PATH" | grep -q "$pattern"; then
|
|
39
61
|
exit 0
|
|
40
62
|
fi
|
|
41
63
|
done
|
|
42
64
|
|
|
43
|
-
# Allow if path is inside a worktree
|
|
44
|
-
if echo "$
|
|
65
|
+
# Allow if path is inside a worktree (substring match, handles nested worktrees)
|
|
66
|
+
if echo "$ABS_PATH" | grep -q '\.worktrees/'; then
|
|
45
67
|
exit 0
|
|
46
68
|
fi
|
|
47
69
|
|
|
48
|
-
# BLOCK: path is
|
|
49
|
-
# Record the block event for loop detection
|
|
70
|
+
# BLOCK: path is inside the repo, not in a worktree, not exempt.
|
|
50
71
|
if type lp_record_invocation &>/dev/null; then
|
|
51
72
|
lp_record_invocation "pre-tool-edit-block"
|
|
52
73
|
fi
|
|
@@ -0,0 +1,42 @@
|
|
|
1
|
+
#!/usr/bin/env bash
|
|
2
|
+
# UAP policy gate — Hermes Agent (NousResearch) variant.
|
|
3
|
+
#
|
|
4
|
+
# Hermes `pre_tool_call` hooks decide via a JSON object on STDOUT, and are
|
|
5
|
+
# FAIL-OPEN: a non-zero exit, timeout, or malformed JSON only logs a warning and
|
|
6
|
+
# lets the tool proceed. The shared uap-policy-gate.sh instead signals a block
|
|
7
|
+
# with `exit 2` (Claude Code convention), which Hermes would ignore.
|
|
8
|
+
#
|
|
9
|
+
# This wrapper runs the shared gate and TRANSLATES its verdict into the Hermes
|
|
10
|
+
# contract: it always exits 0 and always prints valid JSON —
|
|
11
|
+
# {"decision":"block","reason":"…"} when the gate blocks (exit 2)
|
|
12
|
+
# {} otherwise (allow)
|
|
13
|
+
# so a real block is reliably enforced rather than silently failing open.
|
|
14
|
+
#
|
|
15
|
+
# Hermes stdin payload: {"hook_event_name":"pre_tool_call","tool_name":...,
|
|
16
|
+
# "tool_input":{...},"session_id":...,"cwd":...}. uap-policy-gate.sh reads the
|
|
17
|
+
# same tool_name/tool_input fields, so the payload is passed through unchanged.
|
|
18
|
+
|
|
19
|
+
set -o pipefail
|
|
20
|
+
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
|
21
|
+
GATE="$SCRIPT_DIR/uap-policy-gate.sh"
|
|
22
|
+
|
|
23
|
+
PAYLOAD="$(cat)"
|
|
24
|
+
|
|
25
|
+
# No shared gate present → allow (fail-open, consistent with non-UAP repos).
|
|
26
|
+
if [ ! -f "$GATE" ]; then
|
|
27
|
+
echo '{}'
|
|
28
|
+
exit 0
|
|
29
|
+
fi
|
|
30
|
+
|
|
31
|
+
STDERR_FILE="$(mktemp 2>/dev/null || echo /tmp/uap-hermes-gate.$$)"
|
|
32
|
+
printf '%s' "$PAYLOAD" | bash "$GATE" >/dev/null 2>"$STDERR_FILE"
|
|
33
|
+
CODE=$?
|
|
34
|
+
REASON="$(tr -d '\r' < "$STDERR_FILE" | tr '\n' ' ' | sed 's/\\/\\\\/g; s/"/\\"/g')"
|
|
35
|
+
rm -f "$STDERR_FILE" 2>/dev/null || true
|
|
36
|
+
|
|
37
|
+
if [ "$CODE" -eq 2 ]; then
|
|
38
|
+
printf '{"decision":"block","reason":"%s"}\n' "${REASON:-blocked by UAP policy}"
|
|
39
|
+
else
|
|
40
|
+
echo '{}'
|
|
41
|
+
fi
|
|
42
|
+
exit 0
|