@rubytech/create-maxy 1.0.655 → 1.0.657
- package/package.json +1 -1
- package/payload/platform/plugins/admin/PLUGIN.md +2 -1
- package/payload/platform/plugins/admin/mcp/dist/index.js +56 -0
- package/payload/platform/plugins/admin/mcp/dist/index.js.map +1 -1
- package/payload/platform/plugins/admin/skills/deck-pages/SKILL.md +418 -0
- package/payload/platform/plugins/cloudflare/scripts/list-cf-domains.ts +97 -32
- package/payload/platform/plugins/docs/references/adherence.md +98 -0
- package/payload/platform/plugins/docs/references/cloudflare.md +1 -1
- package/payload/platform/plugins/docs/references/platform.md +1 -1
- package/payload/platform/plugins/docs/references/troubleshooting.md +14 -0
- package/payload/platform/templates/agents/admin/IDENTITY.md +2 -0
- package/payload/server/package.json +2 -1
- package/payload/server/public/assets/{admin-CVZaji3A.js → admin-C6FCOBJJ.js} +9 -9
- package/payload/server/public/assets/{data-DgI19qYm.js → data-OhPCCGxF.js} +1 -1
- package/payload/server/public/assets/{file-J1JpJF4E.js → file-vGZzzcEL.js} +1 -1
- package/payload/server/public/assets/graph-arM1qUve.js +49 -0
- package/payload/server/public/assets/{house-Dche6_m0.js → house-CStAEh5N.js} +1 -1
- package/payload/server/public/assets/jsx-runtime-CE3bWIbP.css +1 -0
- package/payload/server/public/assets/{public-LhnMTdDE.js → public-CA8hdxVS.js} +1 -1
- package/payload/server/public/assets/{share-2-6hJtFYgM.js → share-2-BgiUCVf3.js} +1 -1
- package/payload/server/public/assets/{useVoiceRecorder-PUde6itK.js → useVoiceRecorder-B343k-mr.js} +1 -1
- package/payload/server/public/assets/x-CNdlr5ao.js +1 -0
- package/payload/server/public/data.html +6 -6
- package/payload/server/public/graph.html +6 -6
- package/payload/server/public/index.html +7 -7
- package/payload/server/public/public.html +4 -4
- package/payload/server/server.js +1432 -752
- package/payload/server/public/assets/graph-CFwxUVS0.js +0 -49
- package/payload/server/public/assets/jsx-runtime-C7zbe_Pq.css +0 -1
- package/payload/server/public/assets/x-DmqRGGHj.js +0 -1
- package/payload/server/public/assets/{jsx-runtime-BE1CBORz.js → jsx-runtime-ZaEwDls0.js} +0 -0
|
@@ -0,0 +1,98 @@
|
|
|
1
|
+
# Adherence Fidelity
|
|
2
|
+
|
|
3
|
+
User-facing reference for the attention-weighted correction ledger that makes agent adherence compound. Canonical platform documentation lives at [`.docs/agents.md`](../../../../.docs/agents.md) § Adherence Fidelity — this reference mirrors the same behaviour for operators reading plugin docs.
|
|
4
|
+
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
## What it solves
|
|
8
|
+
|
|
9
|
+
The agent's prerogatives (PRECISE, CONCISE, EVIDENCE-BASED) are prose at the top of every system prompt. Without a compounding mechanism, a rule corrected 50 times has the same attention weight as a rule corrected once. Adherence Fidelity adds a per-agent ledger whose rendered summary is inserted into the system prompt every turn, so the agent sees its own recidivism with counts, samples, and recency.
|
|
10
|
+
|
|
11
|
+
## How an operator sees it
|
|
12
|
+
|
|
13
|
+
**In chat:** ask the agent *"what is my adherence score?"* or *"what are my top rule violations?"* The admin agent answers via the `adherence-read` tool, which reads the ledger file on disk — the number is authoritative.
|
|
14
|
+
|
|
15
|
+
**Via API:** `GET /api/admin/adherence?agent=admin` returns the ledger JSON plus `constraints` (whether capability routing is active for the next turn). Adding `block=1` to the query string also returns the optional `rendered` block.
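A request URL for this endpoint could be assembled as in the sketch below. The helper name `buildAdherenceUrl` and the base URL are hypothetical; only the path and the `agent` and `block` query parameters come from the text above.

```typescript
// Hypothetical helper (not part of the package): builds the adherence
// endpoint URL. The base URL is an assumption and differs per installation.
function buildAdherenceUrl(baseUrl: string, agent: string, withBlock = false): string {
  const query = [`agent=${encodeURIComponent(agent)}`];
  if (withBlock) {
    query.push("block=1"); // also request the optional `rendered` block
  }
  return `${baseUrl}/api/admin/adherence?${query.join("&")}`;
}

// Example: ledger plus rendered block for the admin agent.
const ledgerUrl = buildAdherenceUrl("http://localhost:3000", "admin", true);
console.log(ledgerUrl);
```

Both parameters share one query string, so the second one is joined with `&` rather than a second `?`.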
|
|
16
|
+
|
|
17
|
+
**On the filesystem:** `{accountDir}/agents/{agentName}/adherence-ledger.json` is the source of truth. `jq` queries work directly:
|
|
18
|
+
|
|
19
|
+
```bash
jq '{score, top: (.rules | sort_by(-.count) | .[0])}' \
  ~/.maxy/<accountId>/agents/admin/adherence-ledger.json
```
|
|
23
|
+
|
|
24
|
+
## Score
|
|
25
|
+
|
|
26
|
+
```
score = 100 × (1 − rules_violating_in_rolling_7d / n_rules)
```
|
|
29
|
+
|
|
30
|
+
An agent with three rules, none of which has a violation in the last 7 days, scores 100%. If one of the three has a violation in that window, the score is 67%; if all three do, it is 0%.
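The formula can be sketched in TypeScript as below. This is an illustration, not the package's code: the `ScoredRule` type and `adherenceScore` name are hypothetical, and the behaviour for an agent with no rules is an assumption, since the text does not cover that case.

```typescript
// Minimal shape for this sketch; only `rolling_7d` is needed for the score.
interface ScoredRule {
  rule_id: string;
  rolling_7d: number; // violations recorded in the rolling 7-day window
}

// score = 100 × (1 − rules_violating_in_rolling_7d / n_rules)
// Assumption: an agent with no rules scores 100 (not specified in the docs).
function adherenceScore(rules: ScoredRule[]): number {
  if (rules.length === 0) return 100;
  const violating = rules.filter((r) => r.rolling_7d > 0).length;
  return 100 * (1 - violating / rules.length);
}

const sampleRules: ScoredRule[] = [
  { rule_id: "PRECISE", rolling_7d: 2 },
  { rule_id: "CONCISE", rolling_7d: 0 },
  { rule_id: "EVIDENCE-BASED", rolling_7d: 0 },
];
console.log(Math.round(adherenceScore(sampleRules))); // 67
```

The score depends only on how many rules have any violation in the window, not on how many violations each rule has.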
|
|
31
|
+
|
|
32
|
+
## Top offenders
|
|
33
|
+
|
|
34
|
+
The rendered ledger block bolds the top-3 rules by `count` (all-time, not just the rolling window) and quotes their most recent `last_sample`. Rules 4 through 10 appear as one-liners. Zero-recidivism rules (`count = 0` and `rolling_7d = 0`) are omitted entirely.
|
|
35
|
+
|
|
36
|
+
Sort order: `count DESC, last_violated_at DESC`.
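The ordering and tiering described above might look like the following sketch. The names `RankedRule` and `rankOffenders` are hypothetical, and the real renderer may differ; the filter, the sort keys, and the top-3 and 4-through-10 split are taken from the text.

```typescript
interface RankedRule {
  rule_id: string;
  count: number;            // all-time violation count
  rolling_7d: number;       // violations in the rolling 7-day window
  last_violated_at: string; // ISO 8601 timestamp
}

// Hypothetical helper: drops zero-recidivism rules, sorts by
// count DESC then last_violated_at DESC, and splits into the two tiers.
function rankOffenders(rules: RankedRule[]): { top: RankedRule[]; rest: RankedRule[] } {
  const ranked = rules
    .filter((r) => !(r.count === 0 && r.rolling_7d === 0))
    .sort(
      (a, b) =>
        b.count - a.count ||
        Date.parse(b.last_violated_at) - Date.parse(a.last_violated_at),
    );
  return { top: ranked.slice(0, 3), rest: ranked.slice(3, 10) };
}

const ranking = rankOffenders([
  { rule_id: "B", count: 7, rolling_7d: 1, last_violated_at: "2026-04-19T10:12:00Z" },
  { rule_id: "D", count: 0, rolling_7d: 0, last_violated_at: "2026-01-01T00:00:00Z" },
  { rule_id: "A", count: 7, rolling_7d: 5, last_violated_at: "2026-04-21T09:30:12Z" },
  { rule_id: "E", count: 1, rolling_7d: 0, last_violated_at: "2026-03-01T00:00:00Z" },
  { rule_id: "C", count: 3, rolling_7d: 0, last_violated_at: "2026-04-01T00:00:00Z" },
]);
console.log(ranking.top.map((r) => r.rule_id));  // [ 'A', 'B', 'C' ]
console.log(ranking.rest.map((r) => r.rule_id)); // [ 'E' ]
```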
|
|
37
|
+
|
|
38
|
+
## Capability routing at threshold
|
|
39
|
+
|
|
40
|
+
When any rule's `rolling_7d` reaches `5`, the next turn's spawn is clamped:
|
|
41
|
+
|
|
42
|
+
- `--max-turns` drops to `5`, so the agent has fewer turns to sprawl.
- Non-core tools drop from the allowed set (currently the specialist roles).
- The offending rule renders with a `BLOCKING:` prefix in the ledger block so it dominates the prompt.
|
|
45
|
+
|
|
46
|
+
The constraint is computed once per turn at the top of `invokeAgent` and frozen for that spawn — one-turn granularity. The next turn reads the updated ledger and re-evaluates, so a single clean turn begins to lift the constraint as `rolling_7d` decays.
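The threshold check can be sketched as follows. This is a hedged illustration and not the package's `computeConstraints`: the function name `computeConstraintsSketch`, the `TurnConstraints` shape, and the use of `null` for "no clamp" are assumptions. The threshold of `5` and the clamped `--max-turns` of `5` come from the text.

```typescript
interface LedgerRule {
  rule_id: string;
  rolling_7d: number;
}

// Hypothetical result shape for this sketch.
interface TurnConstraints {
  active: boolean;
  maxTurns: number | null; // null means "leave the default untouched"
  blockingRules: string[]; // rules rendered with the BLOCKING: prefix
}

const ROLLING_THRESHOLD = 5; // from the docs: rolling_7d reaching 5 triggers routing
const CLAMPED_MAX_TURNS = 5; // from the docs: --max-turns drops to 5

function computeConstraintsSketch(rules: LedgerRule[]): TurnConstraints {
  const blockingRules = rules
    .filter((r) => r.rolling_7d >= ROLLING_THRESHOLD)
    .map((r) => r.rule_id);
  const active = blockingRules.length > 0;
  return { active, maxTurns: active ? CLAMPED_MAX_TURNS : null, blockingRules };
}

const constrained = computeConstraintsSketch([
  { rule_id: "PRECISE", rolling_7d: 5 },
  { rule_id: "CONCISE", rolling_7d: 4 },
]);
console.log(constrained);
```

Because the result is a pure function of the ledger, recomputing it at the start of each turn gives the one-turn granularity described above.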
|
|
47
|
+
|
|
48
|
+
## Data flow per turn
|
|
49
|
+
|
|
50
|
+
```
┌─ Pre-turn ─────────────────────────────────────┐
│ loadAdherenceLedger(accountDir, accountId)     │
│ renderAdherenceLedger(ledger, blockingRules)   │
│   → inject at <!-- ADHERENCE-LEDGER-INSERT --> │
│ computeConstraints(ledger)                     │
│   → clamp max-turns, drop tools                │
└────────────────────────────────────────────────┘
                        │
                        ▼
                 Assistant stream
                        │
                        ▼
┌─ Post-turn ────────────────────────────────────┐
│ criticAndRecord(responseText) (Haiku)          │
│   → verdict=pass      → recordPass()           │
│   → verdict=violation → recordViolation()      │
│ Fire-and-forget. Non-blocking.                 │
└────────────────────────────────────────────────┘
```
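The pre-turn injection step can be pictured as a marker substitution, sketched below. The helper `injectLedgerBlock` is hypothetical. Whether the real code replaces the marker or inserts next to it is not stated in these docs, so replacement is an assumption here, as is leaving the prompt unchanged when the marker is missing.

```typescript
const LEDGER_MARKER = "<!-- ADHERENCE-LEDGER-INSERT -->";

// Hypothetical helper: swaps the marker for the rendered ledger block.
// Assumption: a template without the marker is returned unchanged.
function injectLedgerBlock(promptTemplate: string, renderedBlock: string): string {
  if (!promptTemplate.includes(LEDGER_MARKER)) return promptTemplate;
  // A replacer function keeps `$` sequences in the block literal.
  return promptTemplate.replace(LEDGER_MARKER, () => renderedBlock);
}

const template = `Prerogatives...\n\n${LEDGER_MARKER}\n\n---`;
const prompt = injectLedgerBlock(template, "**BLOCKING: PRECISE** (count 7)");
console.log(prompt);
```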
|
|
70
|
+
|
|
71
|
+
## What the ledger file looks like
|
|
72
|
+
|
|
73
|
+
```json
{
  "agent_id": "admin",
  "account_id": "abc123...",
  "rules": [
    {
      "rule_id": "PRECISE",
      "canonical_text": "Use exact names...",
      "rule_family": "prerogative",
      "count": 7,
      "violations": ["2026-04-19T10:12:00Z", "..."],
      "last_violated_at": "2026-04-21T09:30:12Z",
      "last_sample": "Something that paraphrased tool output…",
      "current_streak": 2,
      "rolling_7d": 5
    }
  ],
  "updated_at": "2026-04-21T09:30:13Z"
}
```
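For scripts that read the ledger, the shape above could be typed roughly as follows. The interface names are hypothetical and the field types are inferred from the sample only. The helper `countRecentViolations` rests on a further assumption that is not confirmed by these docs: that `rolling_7d` corresponds to the number of `violations` timestamps falling within the last 7 days.

```typescript
// Types inferred from the sample ledger; nullability is not documented.
interface AdherenceRule {
  rule_id: string;
  canonical_text: string;
  rule_family: string;
  count: number;
  violations: string[]; // ISO 8601 timestamps
  last_violated_at: string;
  last_sample: string;
  current_streak: number;
  rolling_7d: number;
}

interface AdherenceLedger {
  agent_id: string;
  account_id: string;
  rules: AdherenceRule[];
  updated_at: string;
}

const SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000;

// Assumption: counts timestamps inside the 7 days ending at `now`.
// Unparseable entries are skipped.
function countRecentViolations(timestamps: string[], now: Date): number {
  const end = now.getTime();
  const cutoff = end - SEVEN_DAYS_MS;
  return timestamps.filter((ts) => {
    const t = Date.parse(ts);
    return !Number.isNaN(t) && t >= cutoff && t <= end;
  }).length;
}

const recent = countRecentViolations(
  ["2026-04-10T00:00:00Z", "2026-04-19T10:12:00Z", "not-a-date"],
  new Date("2026-04-21T09:30:13Z"),
);
console.log(recent); // 1
```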
|
|
93
|
+
|
|
94
|
+
## Limits and deferrals
|
|
95
|
+
|
|
96
|
+
v1 covers the admin agent only. Specialist subagents (`personal-assistant`, `project-manager`, `research-assistant`, `content-producer`) do not receive their own ledger injection yet — their `.md` templates load via `--plugin-dir` and have no TS-side assembly site. Follow-up task filed.
|
|
97
|
+
|
|
98
|
+
No cross-agent rule inheritance, no user-visible correction-ack signal, no blocking-critic retry loop in v1 — each is a separate follow-up task. See [`.docs/agents.md`](../../../../.docs/agents.md) § Adherence Fidelity for the full deferral list with task numbers.
|
|
@@ -8,7 +8,7 @@ Each installation has its own Cloudflare account. Sign-in is OAuth in the device
|
|
|
8
8
|
|------|--------|
|
|
9
9
|
| **Product identity** (Maxy vs Real Agent) | `brand.json` (`productName`, `configDir`) — known at install. |
|
|
10
10
|
| **Cloudflare account identity** | `cert.pem` from OAuth. One account per brand per device. |
|
|
11
|
-
| **Domain scope** (which zones the operator can route) | Live Cloudflare dashboard at form-render time via `list-cf-domains.sh`, not `brand.json`. Brand identity has no authority over which domains the operator's CF account holds. |
|
|
11
|
+
| **Domain scope** (which zones the operator can route) | Live Cloudflare dashboard at form-render time via `list-cf-domains.sh`, not `brand.json`. Brand identity has no authority over which domains the operator's CF account holds. When the scrape returns an unexpected count (e.g. 1 on a two-zone account), the operator can triage the cause without re-running by reading two artifacts: the stream log's per-poll `phase=dom-scrape-poll n=<k> count=<n> domains=[…]` trajectory, and the on-disk HTML dump at `~/{configDir}/logs/list-cf-domains-<ts>-count<n>-<mode>-pid<pid>.html` (Task 608: written on every scrape outcome, not just empty ones). |
|
|
12
12
|
| **Local tunnel state** | `~/{configDir}/cloudflared/` — `cert.pem`, `<UUID>.json`, `config.yml`, `tunnel.state`, `alias-domains.json`. |
|
|
13
13
|
|
|
14
14
|
There is no token-based auth for the operator-owned path (Mode A). To switch Cloudflare accounts, run `reset-tunnel.sh` (which deletes the cert and every tunnel on the current account), then run `setup-tunnel.sh` again — `cloudflared tunnel login` inside the setup script will pick a fresh account when you sign in.
|
|
@@ -68,7 +68,7 @@ The admin UI includes a live terminal surface that opens a real shell on your Pi
|
|
|
68
68
|
|
|
69
69
|
The tmux session outlives admin-server restarts — running an upgrade inside this terminal means you see the live shell output continuously, even through the admin server's own restart mid-upgrade. Closing the browser tab does not kill the running work; re-opening the Software Update window reattaches to the same session during an active upgrade and scrollback shows everything that happened in the meantime. Password-protected `sudo` prompts appear natively inside the terminal, and the password you type never leaves the Pi — the admin-server proxy is a raw byte pipe that never inspects frame payloads.
|
|
70
70
|
|
|
71
|
-
The Software Update window mounts the terminal lazily: the WebSocket
|
|
71
|
+
The Software Update window mounts the terminal lazily: neither the terminal, its WebSocket, nor its black-backgrounded container render until you click Upgrade. Pre-click, the window shows a small "Ready to upgrade — click Upgrade to begin." line, no network traffic flows, and the upgrade UI is the lifecycle indicator — not the terminal. After you click, the window adds a status row above the terminal ("Upgrading to v… · elapsed: Ns · Downloading installer…" flipping to "Running installer…" on the first byte of installer output) so the 5–30 second npx cold-start window is never silent. The upgrade command is dispatched the moment the WebSocket opens — you won't see "terminal not ready" warnings on a healthy device. If the admin server cannot reach `ttyd`, the window renders an inline "Admin terminal not available" message with the exact re-install command and a Try again button. The scrollback-across-reopen behaviour above still applies during an active upgrade (a sessionStorage flag remembers that an upgrade is in flight so reopening the window re-mounts the terminal and reattaches; the elapsed counter keeps ticking from the original start time).
|
|
72
72
|
|
|
73
73
|
## AI Content Provenance
|
|
74
74
|
|
|
@@ -125,6 +125,20 @@ npx -y @rubytech/create-maxy@latest
|
|
|
125
125
|
|
|
126
126
|
Then return to the upgrade window and click **Try again**. The window re-probes `/api/health` and, once ttyd is listening, the terminal area mounts as normal. If the problem persists, check the boot log for `[ttyd] upstream NOT reachable on 127.0.0.1:7681` and follow the `maxy-ttyd` restart steps above.
|
|
127
127
|
|
|
128
|
+
## Upgrade spinner turns but terminal stays blank
|
|
129
|
+
|
|
130
|
+
**Symptom:** You clicked **Upgrade**, the progress row is showing with an elapsed counter ticking, but the terminal area below stays empty for more than about a minute with no output.
|
|
131
|
+
|
|
132
|
+
**What it means:** The upgrade command was dispatched successfully (`onReady` fired), but `ttyd` is not relaying any bytes back from the installer — the `npx` process may have crashed before it printed anything, or `ttyd` itself has lost its PTY.
|
|
133
|
+
|
|
134
|
+
**Fix:** SSH to the device and check the ttyd unit:
|
|
135
|
+
|
|
136
|
+
```bash
sudo systemctl --user status maxy-ttyd
```
|
|
139
|
+
|
|
140
|
+
If it's not running, restart it with `sudo systemctl --user restart maxy-ttyd`. Then close and reopen the Software Update window. If `ttyd` is healthy and the spinner keeps turning with no output, the installer process itself has died — re-run `npx -y @rubytech/create-maxy@latest` from an SSH shell directly.
|
|
141
|
+
|
|
128
142
|
## Orphan Account Directory Archived to `.trash/`
|
|
129
143
|
|
|
130
144
|
**What happened:** During upgrade, the installer detected multiple account directories under `~/maxy/data/accounts/` and identified one as live (its `admins` list matches the device's `users.json`). Non-matching siblings are archived — not deleted — under `~/maxy/data/accounts/.trash/<uuid>-<ISO8601-ts>/`.
|
|
@@ -16,6 +16,8 @@ Three rules govern every turn. They are load-bearing — when they conflict with
|
|
|
16
16
|
|
|
17
17
|
A landfill graph defeats EVIDENCE-BASED: search returns noise, the agent re-writes the noise, the noise compounds. Compress on write; filter on read.
|
|
18
18
|
|
|
19
|
+
<!-- ADHERENCE-LEDGER-INSERT -->
|
|
20
|
+
|
|
19
21
|
---
|
|
20
22
|
|
|
21
23
|
## Intent Gate — First Principle
|