@miller-tech/uap 1.20.32 → 1.20.34
- package/config/model-profiles/qwen35.json +6 -5
- package/dist/.tsbuildinfo +1 -1
- package/dist/bin/cli.js +6 -1
- package/dist/bin/cli.js.map +1 -1
- package/dist/cli/hooks.js +30 -7
- package/dist/cli/hooks.js.map +1 -1
- package/dist/cli/policy.d.ts.map +1 -1
- package/dist/cli/policy.js +26 -0
- package/dist/cli/policy.js.map +1 -1
- package/dist/dashboard/data-seeder.d.ts.map +1 -1
- package/dist/dashboard/data-seeder.js +72 -3
- package/dist/dashboard/data-seeder.js.map +1 -1
- package/dist/dashboard/data-service.js +1 -1
- package/dist/dashboard/data-service.js.map +1 -1
- package/dist/dashboard/server.js +1 -1
- package/dist/dashboard/server.js.map +1 -1
- package/dist/index.d.ts +15 -1
- package/dist/index.d.ts.map +1 -1
- package/dist/index.js +14 -0
- package/dist/index.js.map +1 -1
- package/dist/types/index.d.ts +20 -0
- package/dist/types/index.d.ts.map +1 -1
- package/dist/types/index.js +20 -0
- package/dist/types/index.js.map +1 -1
- package/docs/AGENTS.md +423 -0
- package/docs/AGENTS.md</path>CLAUDE.md</path>/home/cogtek/dev/miller-tech/universal-agent-protocol/docs/INDEX.md</path>/home/cogtek/dev/miller-tech/universal-agent-protocol/docs/reference/API_REFERENCE.md</path>/home/cogtek/dev/miller-tech/universal-agent-protocol/docs/reference/UAP_CLI_REFERENCE.md</path>src/index.ts</path>/src/cli/worktree.ts</path>/src/coordination/deploy-batcher.ts</path>/src/policies/policy-gate.ts</path>/src/memory/model-router.ts</path>/src/memory/embeddings.ts</path>/src/models/types.ts</path>/src/types/coordination.ts</path>/src/utils/logger.ts</path>/src/utils/config-loader.ts</path>/src/utils/performance-monitor.ts</path>/src/utils/concurrency.ts</path>/src/utils/concurrency-pool.ts</path>/src/utils/string-similarity.ts</path>/src/utils/rate-limiter.ts</path>/src/utils/system-resources.ts</path>/src/utils/adaptive-cache.ts</path>/src/utils/lazy-imports.ts</path>/src/utils/merge-claude-md.ts</path>/src/utils/stopwords.ts</path>/src/utils/config-loader.ts</path>/src/utils/performance-monitor.ts</path>/src/utils/concurrency.ts</path>/src/utils/concurrency-pool.ts</path>/src/utils/string-similarity.ts</path>/src/utils/rate-limiter.ts</path>/src/utils/system-resources.ts</path>/src/utils/adaptive-cache.ts</path>/src/utils/lazy-imports.ts</path>/src/utils/merge-claude-md.ts</path>/src/utils/stopwords.ts</path> +433 -0
- package/docs/DOCUMENTATION_AUDIT_REPORT.md +131 -0
- package/docs/GETTING_STARTED.md +288 -0
- package/docs/INDEX.md +272 -42
- package/docs/PROJECT_ANALYSIS_REPORT.md +510 -0
- package/docs/architecture/SYSTEM_ANALYSIS.md +220 -1003
- package/docs/blog/local-coding-agents.md +266 -0
- package/docs/blog/x-thread.md +254 -0
- package/docs/deployment/DEPLOY_BATCHER_ANALYSIS.md +15 -647
- package/docs/getting-started/OVERVIEW.md +10 -30
- package/docs/getting-started/SETUP.md +183 -9
- package/docs/pr/UPSTREAM_PRS.md +424 -0
- package/docs/reference/CONFIGURATION.md +208 -0
- package/docs/reference/DATABASE_SCHEMA.md +344 -0
- package/docs/reference/PATTERN_LIBRARY.md +636 -0
- package/package.json +1 -1
- package/templates/hooks/uap-policy-gate.sh +36 -0
- package/tools/agents/claude_local_agent.py +92 -0
- package/tools/agents/opencode_uap_agent.py +3 -0
- package/tools/agents/scripts/anthropic_proxy.py +654 -20
- package/tools/agents/uap_agent.py +1 -1
@@ -0,0 +1,424 @@
# UAP Upstream PR Plan

Five PRs covering the session stickiness bug, loop protection hardening, per-request speculative decoding control, an OpenAI-compatible endpoint, and the policy engine.

## Dependency graph

```
PR 1 (session fingerprinting)       ── CRITICAL ──► enables PR 2 and PR 5 (and PR 4 via PR 2)
PR 2 (loop protection)              ── depends on PR 1
PR 3 (spec decoding control)        ── independent
PR 4 (OpenAI /v1/chat/completions)  ── depends on PR 2 (via guardrails)
PR 5 (policy engine)                ── depends on PR 1 + PR 2
```

---

## PR 1: `proxy: stable session fingerprinting`

**Scope:** Critical bug fix
**Files:** `tools/agents/scripts/anthropic_proxy.py`
**Risk:** Low (pure fix, no new surface area)
**Priority:** Highest (every stateful guardrail depends on this)

### Problem

Session fingerprints were hashed from `remote | model | system | first_user_content`. Two sources of volatility fed into that hash:

1. **`tool_use_id`** values in `tool_result` blocks: random UUIDs regenerated per turn. `_content_fingerprint` included `f"result:{block.get('tool_use_id', '')}"` in the hash.
2. **`system` prompt**: clients inject volatile context (timestamps, cwd, session markers) into system prompts.

Result: **every single request got a different session ID** → every request spawned a fresh `SessionMonitor` → every stateful guardrail (cycle detection, forced_budget, review_cycles, finalize_hard_stop, unproductive_exhaustion_streak) was effectively stateless per-request.

This silently broke every loop protection mechanism built on top of the session monitor.

### Diagnostic evidence

After adding session ID logging:

```
sess=fp:9c8f26a802f9f4739f18 msgs=79
sess=fp:b801857a9e49e21a6599 msgs=81
sess=fp:aeef638954a390ef7aec msgs=83
sess=fp:16f908db2e478f31cb91 msgs=85
```

Every request got a new session ID. `session_count: 35` after 35 requests on what should have been one session.

### Fix

1. `_content_fingerprint` uses a stable content excerpt (`result:<first 64 chars>`) instead of `tool_use_id`
2. `resolve_session_id` hashes only the first user message's **text content** and excludes the `system` prompt entirely

```python
import hashlib


def resolve_session_id(request: Request, anthropic_body: dict) -> str:
    # ... header-based lookup unchanged (`remote` and `model` are assumed
    # to be set in this elided part) ...

    first_user = ""
    for msg in anthropic_body.get("messages", []):
        if msg.get("role") == "user":
            content = msg.get("content", "")
            if isinstance(content, str):
                first_user = content[:512]
            elif isinstance(content, list):
                text_parts = [
                    b.get("text", "") for b in content
                    if isinstance(b, dict) and b.get("type") == "text"
                ]
                first_user = "\n".join(text_parts)[:512]
            break  # only the first user message is fingerprinted

    # Deliberately exclude `system` from the fingerprint: clients inject
    # volatile context (timestamps, cwd, session markers).
    digest = hashlib.sha256(
        f"{remote}|{model}|{first_user}".encode("utf-8", errors="ignore")
    ).hexdigest()[:20]
    return f"fp:{digest}"
```
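
The first change in the Fix list, fingerprinting a `tool_result` block by a content excerpt rather than by its `tool_use_id`, might look roughly like the sketch below. Only the `result:<first 64 chars>` format comes from this plan; the function name `content_fingerprint_sketch`, the handling of text blocks, and the way the parts are joined are assumptions made for illustration, not the proxy's actual code.

```python
import hashlib
import json


def content_fingerprint_sketch(content) -> str:
    """Hash message content using only fields that stay stable across turns."""
    if isinstance(content, str):
        parts = [f"text:{content[:64]}"]
    else:
        parts = []
        for block in content:
            if not isinstance(block, dict):
                continue
            if block.get("type") == "text":
                parts.append(f"text:{block.get('text', '')[:64]}")
            elif block.get("type") == "tool_result":
                # Use a stable excerpt of the result body, never the random tool_use_id.
                body = block.get("content", "")
                if not isinstance(body, str):
                    body = json.dumps(body, sort_keys=True)
                parts.append(f"result:{body[:64]}")
    return hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()[:20]


# Same tool output, different (hypothetical) tool_use_id values on each turn.
turn_a = [{"type": "tool_result", "tool_use_id": "toolu_aaa", "content": "exit code 0"}]
turn_b = [{"type": "tool_result", "tool_use_id": "toolu_bbb", "content": "exit code 0"}]
same = content_fingerprint_sketch(turn_a) == content_fingerprint_sketch(turn_b)
```

With this shape, `same` is true because the regenerated IDs never reach the hash, which is the property the first unit test in this PR is meant to pin down.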

### Impact

- Before: 1 request per session
- After: 170+ requests on the same session (verified with Claude Code + OpenCode + Forge clients)
- All downstream guardrails started working with no changes to them

### Add session ID logging

The REQ line now includes `sess=` for diagnosis:

```
REQ: client=remote:127.0.0.1 sess=fp:aa5169796b2c39c2a4a4 rate_60s=1 ...
```

### Tests

- [ ] Unit test: same message with changing tool_use_ids → stable fingerprint
- [ ] Unit test: same message with changing system timestamps → stable fingerprint
- [ ] Integration test: 3 sequential requests on same conversation → same session_id

---

## PR 2: `proxy: loop protection hardening`

**Scope:** Medium (new counters + threshold gates)
**Files:** `anthropic_proxy.py`
**Depends on:** PR 1 (counters only work with sticky sessions)

### Additions

1. **`tool_state_unproductive_exhaustion_streak`**
   - Tracks consecutive `forced_budget_exhausted` events where NEITHER cycling NOR stagnation was detected
   - After `PROXY_UNPRODUCTIVE_EXHAUSTION_LIMIT` (default 4), forces finalize
   - Catches "distinct-but-unproductive tool spam" that defeats per-tool cycle detection

2. **`finalize_hard_stop_count`** (monotonic session-level)
   - NOT reset by `fresh_user_text` / `inactive_loop` paths
   - Incremented in BOTH:
     - `_inject_synthetic_continuation` (synthetic continuation path)
     - `state_choice == "finalize"` handler (tool-stripping path)
   - When `>= PROXY_FINALIZE_SESSION_HARD_CAP` (default 6), synthetic continuation injection is blocked and the natural `end_turn` passes through, so the client terminates the loop cleanly

3. **`finalize_fired` flag in `_completion_blockers()`**
   - When `finalize_hard_stop_count > 0`, suppresses the `text_only_after_tool_results` blocker
   - Prevents the state machine from re-entering the active loop after a finalize wraps up the work
   - Without it, the state machine ping-ponged indefinitely: `finalize → review → cycle_detected → finalize → review → ...`

### New env vars

```
PROXY_UNPRODUCTIVE_EXHAUSTION_LIMIT=4     # new
PROXY_FINALIZE_SESSION_HARD_CAP=6         # new
```
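
How the two counters could interact with these limits is sketched below. The env var names and their defaults come from this plan; the class `LoopCounters`, its method names, and the choice to restart the streak whenever cycling or stagnation is detected are hypothetical and only illustrate the described behaviour.

```python
import os
from dataclasses import dataclass


@dataclass
class LoopCounters:
    exhaustion_limit: int = 4
    finalize_hard_cap: int = 6
    unproductive_exhaustion_streak: int = 0
    finalize_hard_stop_count: int = 0  # monotonic for the whole session

    @classmethod
    def from_env(cls, env=os.environ) -> "LoopCounters":
        """Read both limits from the environment, falling back to the defaults."""
        return cls(
            exhaustion_limit=int(env.get("PROXY_UNPRODUCTIVE_EXHAUSTION_LIMIT", "4")),
            finalize_hard_cap=int(env.get("PROXY_FINALIZE_SESSION_HARD_CAP", "6")),
        )

    def on_budget_exhausted(self, cycling: bool, stagnation: bool) -> bool:
        """Return True when the streak says the session should be forced to finalize."""
        if cycling or stagnation:
            # Another detector already explains this exhaustion, so the streak restarts.
            self.unproductive_exhaustion_streak = 0
            return False
        self.unproductive_exhaustion_streak += 1
        return self.unproductive_exhaustion_streak >= self.exhaustion_limit

    def on_finalize(self) -> None:
        # Called from both finalize paths; never reset within a session.
        self.finalize_hard_stop_count += 1

    def may_inject_continuation(self) -> bool:
        return self.finalize_hard_stop_count < self.finalize_hard_cap


counters = LoopCounters()
forced = [counters.on_budget_exhausted(cycling=False, stagnation=False) for _ in range(4)]
```

Here `forced` only becomes true on the fourth consecutive unproductive exhaustion. Keeping `finalize_hard_stop_count` free of any reset path is what lets the hard cap end a session that keeps re-entering the loop.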

### Tuned thresholds (mostly tighter defaults)

```
PROXY_LOOP_REPEAT_THRESHOLD=4              # was 10
PROXY_FORCED_THRESHOLD=12                  # was 18
PROXY_NO_PROGRESS_THRESHOLD=3              # was 5
PROXY_TOOL_STATE_STAGNATION_THRESHOLD=4    # was 8
PROXY_TOOL_STATE_FINALIZE_THRESHOLD=8      # was 18
PROXY_TOOL_STATE_REVIEW_CYCLE_LIMIT=5      # was 3 (relaxed after tuning)
PROXY_TOOL_NARROWING_EXPAND_ON_LOOP=off    # was on
PROXY_TOOL_NARROWING_KEEP=8                # was 12
```

### Verification

A real session that previously looped indefinitely now terminates cleanly:
```
TOOL STATE MACHINE: 4 consecutive unproductive budget exhaustions — forcing finalize
TOOL STATE MACHINE: phase review -> finalize reason=unproductive_exhaustion
FINALIZE CONTINUATION: session hard cap reached (6/6) — not injecting, allowing termination
```

The client received a clean `end_turn` and started a fresh task.

### Tests

- [ ] Simulated loop: distinct tool calls with no context growth → triggers unproductive exhaustion
- [ ] Simulated loop: same tool repeated → triggers per-tool cycle detection (existing)
- [ ] Finalize → synthetic continuation → reset → new active loop → hard cap at 6 → natural termination

---

## PR 3: `proxy: per-request speculative decoding control`

**Scope:** Small, focused
**Files:** `anthropic_proxy.py`, README
**Risk:** Low

### Feature

New env var `PROXY_DISABLE_SPEC_ON_TOOL_TURNS` (default off). When on, the proxy sets `openai_body["speculative.n_max"] = 0` on tool-turn requests, telling llama.cpp to skip the draft/spec path for that request only.

### Why

Some models (observed: early Qwen3.5-35B-A3B Q4_K_M) produce garbled tool-call output under speculative decoding due to rejected-draft state leakage. Disabling spec on tool turns while keeping it on for plain chat keeps tool calls intact on unstable models without giving up the speedup on chat turns. Stable models can leave this off and benefit from spec on every turn.

### Applied in two places

1. Main handler (end of `_build_openai_request`)
2. Tool starvation breaker early-return path (so the flag is respected on both code paths)

```python
if PROXY_DISABLE_SPEC_ON_TOOL_TURNS:
    openai_body["speculative.n_max"] = 0
    logger.info("Spec decoding disabled for tool turn (PROXY_DISABLE_SPEC_ON_TOOL_TURNS=on)")
```
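
Because the same gate has to run on two code paths, it could be factored into one small helper, sketched below. The `speculative.n_max` key and the flag semantics come from this plan; the helper name `apply_spec_control` and the `is_tool_turn` argument are hypothetical, and how the proxy actually decides that a request is a tool turn is not shown here.

```python
def apply_spec_control(openai_body: dict, is_tool_turn: bool, disable_on_tool_turns: bool) -> dict:
    """Return a copy of the body with speculative decoding disabled when the gate applies."""
    body = dict(openai_body)
    if disable_on_tool_turns and is_tool_turn:
        # Per-request override read by llama.cpp; 0 skips the draft path.
        body["speculative.n_max"] = 0
    return body


base = {"model": "local", "messages": []}
tool_turn = apply_spec_control(base, is_tool_turn=True, disable_on_tool_turns=True)
chat_turn = apply_spec_control(base, is_tool_turn=False, disable_on_tool_turns=True)
flag_off = apply_spec_control(base, is_tool_turn=True, disable_on_tool_turns=False)
```

Only `tool_turn` carries the override; the other two bodies are forwarded without any speculative field, which corresponds to the three test cases listed for this PR.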

### Relies on llama.cpp upstream support

llama.cpp already supports per-request `speculative.n_max` in `server-task.cpp`:
```cpp
params.speculative.n_max = json_value(data, "speculative.n_max", defaults.speculative.n_max);
```

Setting it to 0 gates the entire draft path (`if (n_draft_max > 0)` in `server-context.cpp`).

### Tests

- [ ] Tool-turn request with flag on → `speculative.n_max=0` in forwarded body
- [ ] Non-tool request with flag on → no speculative field added
- [ ] Flag off → no speculative field added regardless

---

## PR 4: `proxy: fully guarded OpenAI /v1/chat/completions endpoint`

**Scope:** Medium (new endpoint with full bidirectional conversion)
**Files:** `anthropic_proxy.py`
**Depends on:** PR 2 (reuses the guardrail pipeline)

### Motivation

Clients like **OpenCode**, **Forge**, **Cline**, and many LangChain-based agents expect OpenAI's `/v1/chat/completions` shape. The proxy previously exposed only `/v1/messages` (Anthropic shape), so these clients either:
1. Bypassed the proxy and talked directly to llama.cpp (no guardrails), OR
2. Couldn't use the proxy at all

### Approach

Add a `/v1/chat/completions` handler that:
1. Receives the OpenAI-format request
2. Converts it to Anthropic format (`openai_to_anthropic_request`)
3. Invokes the existing `messages()` handler via a synthetic `Request` with the Anthropic body
4. Converts the Anthropic response back to OpenAI format (`anthropic_to_openai_response`)
5. Returns it to the client

**All guardrails from the `/v1/messages` path apply automatically** — loop detection, tool narrowing, cycle breaking, malformed tool retry, context pruning, profile overrides, activation replay (llama.cpp side).
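
Step 2 of the approach is a plain format conversion. The sketch below shows a heavily reduced version of the request direction: it covers only system-prompt extraction and plain text turns, and assumes every `content` value is a string. The function name `openai_to_anthropic_sketch` and the fallback `max_tokens` of 1024 are assumptions; the real helper is described as also handling tool calls, tool responses, tools, tool_choice, and sampling params.

```python
def openai_to_anthropic_sketch(openai_body: dict) -> dict:
    """Reduced conversion: system prompt extraction plus plain text turns only."""
    system_parts = []
    messages = []
    for msg in openai_body.get("messages", []):
        role = msg.get("role")
        content = msg.get("content") or ""
        if role == "system":
            # Anthropic carries the system prompt as a top-level field, not a message.
            system_parts.append(content)
        elif role in ("user", "assistant"):
            messages.append({"role": role, "content": content})
        # Tool calls and tool responses are intentionally left out of this sketch.
    anthropic_body = {
        "model": openai_body.get("model", ""),
        "messages": messages,
        # The Anthropic Messages format requires max_tokens; 1024 is an arbitrary fallback.
        "max_tokens": openai_body.get("max_tokens", 1024),
    }
    if system_parts:
        anthropic_body["system"] = "\n\n".join(system_parts)
    return anthropic_body


converted = openai_to_anthropic_sketch({
    "model": "local",
    "messages": [
        {"role": "system", "content": "Be brief."},
        {"role": "user", "content": "List the open PRs."},
    ],
})
```

In `converted`, the system message has moved to the top-level `system` field and only the user turn remains in `messages`, which is the "system prompt extraction from messages" case in the test list for this PR.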

### Streaming

Client stream requests are processed internally as non-stream through the Anthropic pipeline, then re-streamed as OpenAI SSE chunks:

```
data: {"id":"msg_...","delta":{"role":"assistant"},...}
data: {"id":"msg_...","delta":{"content":"..."},...}
data: {"id":"msg_...","delta":{"tool_calls":[...]},...}
data: {"id":"msg_...","delta":{},"finish_reason":"tool_calls"}
data: [DONE]
```

This sacrifices token-by-token streaming granularity in exchange for keeping all guardrails. The difference is invisible to most clients.
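
A minimal sketch of the re-streaming step follows, assuming the inner pipeline has already returned a complete Anthropic message. The chunk layout (`chat.completion.chunk`, `choices[0].delta`, `finish_reason`) follows OpenAI's streaming format; the function name `restream_as_openai_sse` is hypothetical, and tool-call chunks are omitted to keep the example short.

```python
import json

# Anthropic stop_reason -> OpenAI finish_reason
FINISH_REASONS = {
    "end_turn": "stop",
    "stop_sequence": "stop",
    "max_tokens": "length",
    "tool_use": "tool_calls",
}


def restream_as_openai_sse(anthropic_msg: dict):
    """Yield OpenAI-style SSE lines for an already-complete Anthropic message."""
    msg_id = anthropic_msg.get("id", "")

    def chunk(delta, finish_reason=None):
        payload = {
            "id": msg_id,
            "object": "chat.completion.chunk",
            "choices": [{"index": 0, "delta": delta, "finish_reason": finish_reason}],
        }
        return f"data: {json.dumps(payload)}\n\n"

    yield chunk({"role": "assistant"})
    text = "".join(
        block.get("text", "")
        for block in anthropic_msg.get("content", [])
        if block.get("type") == "text"
    )
    if text:
        # The whole text goes out as one chunk: no token-by-token granularity.
        yield chunk({"content": text})
    yield chunk({}, FINISH_REASONS.get(anthropic_msg.get("stop_reason"), "stop"))
    yield "data: [DONE]\n\n"


lines = list(restream_as_openai_sse({
    "id": "msg_1",
    "content": [{"type": "text", "text": "hi"}],
    "stop_reason": "end_turn",
}))
```

The client still sees a role chunk, a content chunk, a finish chunk, and the `[DONE]` sentinel in order, so stream consumers that only reassemble deltas behave as before.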

### Helper functions added

- **`openai_to_anthropic_request(openai_body)`**: full conversion (system prompt, messages, tool_calls, tool_responses, tools, tool_choice, sampling params)
- **`anthropic_to_openai_response(anthropic_resp)`**: content blocks → message, tool_use → tool_calls, stop_reason → finish_reason, usage mapping
- **`_parse_anthropic_sse_to_message(raw)`**: SSE fallback parser used if the inner pipeline returns a stream despite `stream=False`

### Verification

Tested against OpenCode, Forge, and synthetic curl requests:
- Plain chat: clean text response
- Tool use: proper `tool_calls` with JSON arguments
- Streaming: proper SSE chunks with finish_reason
- All guardrails active (verified via log `CHAT (guarded)` marker)

### Tests

- [ ] Round-trip: OpenAI request → Anthropic → OpenAI with matching content
- [ ] Tool call conversion (both directions)
- [ ] System prompt extraction from messages
- [ ] Streaming endpoint emits valid SSE sequence
- [ ] Profile overrides apply to chat/completions path

---

## PR 5: `proxy: policy engine with worktree + CI/CD enforcement`

**Scope:** Large (new module + hook points)
**Files:** `policies/engine.py`, `policies/rules/*.py`, `anthropic_proxy.py` (hook points), tests
**Depends on:** PR 1 (session continuity), PR 2 (guardrail infrastructure)
**Risk:** Medium (new subsystem)

### Motivation

You can tell a local coding agent to use a git worktree. You can write it in CLAUDE.md, put it in the system prompt, make it the first rule. Local 27–35B models **still commit directly to main**.

Policy-as-prompt is not an enforcement mechanism for local coding agents; it is a suggestion. The only reliable way to enforce workflow requirements is to make them non-bypassable at the proxy layer.

### What it enforces

- **Worktree routing**: `Edit`, `Write`, `Bash` tool inputs get rewritten to reference the active worktree path. Operations targeting the main working tree are rejected.
- **Completion gates**: `end_turn` is blocked unless tests ran, memory was queried, and parallel reviewers were invoked.
- **Pre-commit discipline**: commit tool calls are blocked until code-reviewer + security-auditor + architect-reviewer were invoked.
- **CI/CD deploy bucketing**: each agent session has a deploy bucket tied to its worktree. Concurrent agents don't collide at the pipeline layer.
- **Per-profile rule sets**: `build` / `plan` / `memory` / `autoaccept` each get a different policy set.
- **Session start protocol**: mandatory bootstrap checks (memory query, session context load)
- **Auditable trail**: every policy decision is logged with rule ID, context, and outcome

### Architecture

```
client → proxy → [guardrails] → [policy engine] → [tool rewriter] → llama.cpp
                                       ↓
                                   audit log
```

Every tool call goes through a policy check chain before being forwarded to llama.cpp. Rules can allow, rewrite, or block.

### Rule DSL

```python
from uap.policies import policy, block, allow, MUTATING_TOOLS

# `rewrite_to_worktree` is assumed to be a helper that ships alongside the rules.


@policy("worktree.enforce", profile=["build", "autoaccept"])
def enforce_worktree(request, session):
    if request.tool_name in MUTATING_TOOLS:
        if not session.worktree_active:
            return block("worktree_not_in_use",
                         hint="Create a worktree first with `git worktree add`")
        # Bash inputs carry a `command` rather than a `path`, so guard the key.
        if "path" in request.tool_input:
            request.tool_input["path"] = rewrite_to_worktree(
                request.tool_input["path"], session.worktree
            )
    return allow()


@policy("commit.parallel_review", profile="build")
def enforce_parallel_review(request, session):
    if request.tool_name == "Bash" and "git commit" in request.tool_input.get("command", ""):
        if not session.review_completed_this_turn:
            return block("parallel_review_required",
                         hint="Invoke code-reviewer + security-auditor + architect-reviewer in parallel before committing")
    return allow()


@policy("completion.gates", profile="build")
def enforce_completion_gates(request, session):
    if request.is_end_turn:
        blockers = []
        if not session.tests_ran:
            blockers.append("tests_not_run")
        if not session.memory_queried:
            blockers.append("memory_not_queried")
        if blockers:
            return block(f"completion_gates_failed: {','.join(blockers)}")
    return allow()
```
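
One way the `@policy` decorator and the check chain could fit together is sketched below. Everything in this block is hypothetical (the registry `_RULES`, the `Decision` type, the `evaluate` function, and the first-block-wins ordering); only the names `policy`, `block`, `allow` and the rule ID `tools.read_only` are taken from this plan. Rewrites are modelled the same way as in the DSL above, as in-place edits to the request by a rule that then returns `allow()`.

```python
from dataclasses import dataclass
from types import SimpleNamespace

_RULES = []  # (rule_id, profiles, function), in registration order


@dataclass
class Decision:
    action: str  # "allow" or "block"
    reason: str = ""
    hint: str = ""


def allow() -> Decision:
    return Decision("allow")


def block(reason: str, hint: str = "") -> Decision:
    return Decision("block", reason, hint)


def policy(rule_id: str, profile):
    """Register a rule for one profile name or a list of them."""
    profiles = [profile] if isinstance(profile, str) else list(profile)

    def decorator(fn):
        _RULES.append((rule_id, profiles, fn))
        return fn

    return decorator


def evaluate(request, session, profile: str):
    """Run the rules registered for a profile in order; the first block wins."""
    for rule_id, profiles, fn in _RULES:
        if profile not in profiles:
            continue
        decision = fn(request, session)
        if decision.action == "block":
            return rule_id, decision
    return None, allow()


@policy("tools.read_only", profile="plan")
def read_only(request, session):
    if request.tool_name in {"Edit", "Write", "Bash"}:
        return block("read_only_profile")
    return allow()


write_request = SimpleNamespace(tool_name="Write", tool_input={})
rule_id, decision = evaluate(write_request, SimpleNamespace(), "plan")
```

Under the `plan` profile the `Write` call is blocked by `tools.read_only`, while the same request under a profile with no matching rules passes through untouched.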

### Integration with existing `_completion_blockers()`

Policy blockers extend the existing completion contract:

```python
def _completion_blockers(anthropic_body, has_tool_results, phase="", finalize_fired=False):
    blockers = []
    # ... existing checks ...

    # NEW: policy-level blockers
    # (`policy_engine` and `session` are assumed to be in scope here)
    policy_blockers = policy_engine.evaluate_completion(anthropic_body, session)
    blockers.extend(policy_blockers)

    return blockers
```

### Per-profile rule sets

```python
# policies/profiles.py
BUILD_PROFILE_RULES = [
    "worktree.enforce",
    "commit.parallel_review",
    "commit.message_format",
    "commit.no_secrets",
    "completion.gates",
    "session.bootstrap",
]

PLAN_PROFILE_RULES = [
    "tools.read_only",  # blocks write/edit/bash tools
    "session.bootstrap",
]

MEMORY_PROFILE_RULES = [
    "tools.memory_only",  # only memory read/write tools allowed
]

AUTOACCEPT_PROFILE_RULES = [
    "worktree.enforce",   # same worktree rule
    "commit.no_secrets",  # security still enforced
    # no parallel review required (autoaccept is explicit trade-off)
]
```

### Audit trail

Every policy decision is logged with session, rule ID, tool name, decision, and decision-specific details (the rewritten paths or the blocker reason):

```
POLICY: sess=fp:aa51... rule=worktree.enforce tool=Edit decision=rewrite old_path=/home/cogtek/dev/main/app.py new_path=/home/cogtek/dev/.worktrees/feat-x/app.py
POLICY: sess=fp:aa51... rule=commit.parallel_review tool=Bash decision=block reason=parallel_review_required
```
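
A line in this format could be produced by a small formatter like the one below. The field names and their order are read off the two sample lines above; the function name `format_policy_log` and the keyword-argument interface are hypothetical.

```python
def format_policy_log(session_id: str, rule_id: str, tool: str, decision: str, **details) -> str:
    """Build one `POLICY:` line as space-separated key=value pairs."""
    fields = {"sess": session_id, "rule": rule_id, "tool": tool, "decision": decision, **details}
    return "POLICY: " + " ".join(f"{key}={value}" for key, value in fields.items())


line = format_policy_log(
    "fp:aa51...", "commit.parallel_review", "Bash", "block",
    reason="parallel_review_required",
)
```

With these arguments, `line` reproduces the second sample entry above. Values containing spaces would need quoting before a log parser could split the pairs reliably, which this sketch does not attempt.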

### Tests

- [ ] Unit tests for each rule in isolation
- [ ] Integration: build profile session → attempt commit without review → blocked → invoke review → commit succeeds
- [ ] Integration: plan profile session → attempt Write → blocked
- [ ] Multi-agent: two sessions with different worktrees → no collision
- [ ] Audit log format validation

### Migration path

- The PR introduces the policy engine as **opt-in** per profile (the default profile has no policies, so it is fully backward-compatible)
- Users can enable rules one at a time via profile env vars
- Existing CLAUDE.md prose instructions can reference policies for context, but policies are now enforced independently of prose

---

## Submission order

1. **PR 1 (session fingerprinting)**: critical bug fix, low risk, unblocks PR 2, PR 4, and PR 5
2. **PR 2 (loop protection hardening)**: depends on PR 1; reviewers can verify that PR 1's fix makes these counters functional
3. **PR 3 (spec decoding control)**: independent, small, easy to review
4. **PR 4 (OpenAI endpoint)**: depends on PR 2 (reuses guardrails), adds major new functionality
5. **PR 5 (policy engine)**: depends on PR 1 + PR 2, new subsystem, needs the most review

## Pre-submission checklist (all PRs)

- [ ] Unit tests added
- [ ] Integration tests with real llama.cpp upstream
- [ ] README / docs updated
- [ ] Env var reference updated
- [ ] No breaking changes to existing endpoints (or clearly flagged)
- [ ] Config migration notes for existing deployments
- [ ] Diff against current production (`anthropic-proxy.env.*` profiles)
@@ -0,0 +1,208 @@
# UAP Configuration Reference

Complete configuration schema and environment variables for Universal Agent Protocol.

## .uap.json Project Configuration

### Root Schema

```json
{
  "version": "1.0.0",
  "project": {
    "name": "string (required)",
    "defaultBranch": "string (optional, default: main)"
  },
  "memory": {
    "shortTerm": {
      "enabled": "boolean (default: true)",
      "path": "string (default: ./agents/data/memory/short_term.db)",
      "maxEntries": "integer (default: 50)"
    },
    "longTerm": {
      "enabled": "boolean (default: true)",
      "provider": "string (qdrant | github | local)",
      "endpoint": "string (for Qdrant cloud)",
      "apiKey": "string (for Qdrant cloud)"
    }
  },
  "multiModel": {
    "enabled": "boolean (default: true)",
    "models": "string[] (required)",
    "roles": {
      "planner": "string (model ID)",
      "executor": "string (model ID)",
      "fallback": "string (model ID)"
    },
    "routingStrategy": "string (cost-optimized | performance-first | balanced)"
  },
  "worktrees": {
    "enabled": "boolean (default: true)",
    "directory": "string (default: .worktrees)"
  },
  "policies": {
    "enabled": "boolean (default: true)",
    "auditTrail": "boolean (default: true)"
  },
  "hooks": {
    "sessionStart": "boolean (default: true)",
    "preCompact": "boolean (default: true)"
  }
}
```
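
A presence check for the required fields could be sketched as follows. The three required paths are the ones this reference marks as required (`version`, `project.name`, `multiModel.models`); the function name `missing_required` is hypothetical, and this is an illustration of the schema rather than a description of how `uap compliance check` works internally.

```python
import json

REQUIRED_PATHS = ["version", "project.name", "multiModel.models"]


def missing_required(config: dict) -> list:
    """Return the dotted paths of required fields that are absent."""
    missing = []
    for path in REQUIRED_PATHS:
        node = config
        for key in path.split("."):
            if not isinstance(node, dict) or key not in node:
                missing.append(path)
                break
            node = node[key]
    return missing


config = json.loads('{"version": "1.0.0", "project": {"name": "my-project"}}')
problems = missing_required(config)
```

For this input, `problems` contains only `multiModel.models`, since the other two required paths are present.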

### Validation Rules

| Field                       | Type     | Required | Default                            | Description                |
| --------------------------- | -------- | -------- | ---------------------------------- | -------------------------- |
| version                     | string   | Yes      | -                                  | Schema version (1.0.0)     |
| project.name                | string   | Yes      | -                                  | Project identifier         |
| project.defaultBranch       | string   | No       | main                               | Git default branch         |
| memory.shortTerm.enabled    | boolean  | No       | true                               | Enable short-term memory   |
| memory.shortTerm.path       | string   | No       | ./agents/data/memory/short_term.db | SQLite path                |
| memory.shortTerm.maxEntries | integer  | No       | 50                                 | Max working memory entries |
| memory.longTerm.provider    | string   | No       | qdrant                             | Backend provider           |
| multiModel.models           | string[] | Yes      | -                                  | Available model IDs        |
| multiModel.routingStrategy  | string   | No       | balanced                           | Routing strategy           |

## Environment Variables

### Memory Configuration

| Variable                      | Type   | Default                            | Description               |
| ----------------------------- | ------ | ---------------------------------- | ------------------------- |
| UAP_MEMORY_SHORT_TERM_PATH    | string | ./agents/data/memory/short_term.db | Short-term memory DB path |
| UAP_MEMORY_LONG_TERM_PROVIDER | string | qdrant                             | Long-term memory backend  |
| UAP_QDRANT_ENDPOINT           | string | -                                  | Qdrant cloud endpoint     |
| UAP_QDRANT_API_KEY            | string | -                                  | Qdrant API key            |

### Multi-Model Configuration

| Variable             | Type   | Default  | Description            |
| -------------------- | ------ | -------- | ---------------------- |
| UAP_MODEL_PLANNER    | string | opus-4.6 | Default planner model  |
| UAP_MODEL_EXECUTOR   | string | glm-4.7  | Default executor model |
| UAP_MODEL_FALLBACK   | string | opus-4.5 | Fallback on failure    |
| UAP_ROUTING_STRATEGY | string | balanced | Routing strategy       |

### Worktree Configuration

| Variable             | Type    | Default    | Description             |
| -------------------- | ------- | ---------- | ----------------------- |
| UAP_WORKTREE_DIR     | string  | .worktrees | Worktree directory path |
| UAP_WORKTREE_ENABLED | boolean | true       | Enable worktree system  |

### Policy Configuration

| Variable             | Type    | Default | Description               |
| -------------------- | ------- | ------- | ------------------------- |
| UAP_POLICIES_ENABLED | boolean | true    | Enable policy enforcement |
| UAP_AUDIT_TRAIL      | boolean | true    | Enable audit logging      |

### Debug & Logging

| Variable              | Type    | Default | Description                          |
| --------------------- | ------- | ------- | ------------------------------------ |
| UAP_VERBOSE           | boolean | false   | Enable verbose logging               |
| UAP_LOG_LEVEL         | string  | info    | Log level (debug, info, warn, error) |
| UAP_TELEMETRY_ENABLED | boolean | true    | Enable telemetry collection          |

## Platform-Specific Configurations

### Claude Code Integration

```json
{
  "hooks": {
    "claude": {
      "sessionStart": "templates/hooks/session-start.sh",
      "preCompact": "templates/hooks/pre-compact.sh"
    }
  }
}
```

### Factory.AI Integration

```json
{
  "hooks": {
    "factory": {
      "sessionStart": "templates/hooks/session-start.sh",
      "preCompact": "templates/hooks/pre-compact.sh"
    }
  }
}
```

### OpenCode Integration

```json
{
  "hooks": {
    "opencode": {
      "sessionStart": "templates/hooks/session-start.sh",
      "preCompact": "templates/hooks/pre-compact.sh"
    }
  }
}
```

## Example Configurations

### Minimal Configuration

```json
{
  "version": "1.0.0",
  "project": { "name": "my-project" },
  "memory": { "shortTerm": { "enabled": true } },
  "multiModel": {
    "enabled": true,
    "models": ["opus-4.6", "qwen35"],
    "roles": { "planner": "opus-4.6", "executor": "qwen35" }
  }
}
```

### Production Configuration

```json
{
  "version": "1.0.0",
  "project": { "name": "production-app", "defaultBranch": "main" },
  "memory": {
    "shortTerm": { "enabled": true, "maxEntries": 50 },
    "longTerm": { "enabled": true, "provider": "qdrant", "endpoint": "https://qdrant.example.com" }
  },
  "multiModel": {
    "enabled": true,
    "models": ["opus-4.6", "sonnet-4.6", "qwen35"],
    "roles": { "planner": "opus-4.6", "executor": "qwen35", "fallback": "sonnet-4.6" },
    "routingStrategy": "cost-optimized"
  },
  "worktrees": { "enabled": true, "directory": ".worktrees" },
  "policies": { "enabled": true, "auditTrail": true }
}
```

## Configuration Validation

Run validation:

```bash
uap compliance check
```

This verifies:

- Memory database paths exist or can be created
- Model IDs are valid
- Worktree directory is accessible
- Policy enforcement is properly configured

## See Also

- [Getting Started](../../docs/getting-started/SETUP.md)
- [Multi-Model Architecture](../../docs/reference/FEATURES.md#multi-model-architecture)
- [Memory System](../../docs/reference/FEATURES.md#memory-system)