npm - instar - Versions diffs - 1.3.583 → 1.3.584 - Mend

instar 1.3.583 → 1.3.584


Files changed (33)

package/dist/config/ConfigDefaults.d.ts.map +1 -1
package/dist/config/ConfigDefaults.js +13 -0
package/dist/config/ConfigDefaults.js.map +1 -1
package/dist/core/PlaywrightProfileRegistry.d.ts +269 -0
package/dist/core/PlaywrightProfileRegistry.d.ts.map +1 -0
package/dist/core/PlaywrightProfileRegistry.js +640 -0
package/dist/core/PlaywrightProfileRegistry.js.map +1 -0
package/dist/core/PostUpdateMigrator.d.ts +21 -0
package/dist/core/PostUpdateMigrator.d.ts.map +1 -1
package/dist/core/PostUpdateMigrator.js +195 -0
package/dist/core/PostUpdateMigrator.js.map +1 -1
package/dist/core/devGatedFeatures.d.ts.map +1 -1
package/dist/core/devGatedFeatures.js +6 -0
package/dist/core/devGatedFeatures.js.map +1 -1
package/dist/core/types.d.ts +13 -0
package/dist/core/types.d.ts.map +1 -1
package/dist/core/types.js.map +1 -1
package/dist/scaffold/templates.d.ts.map +1 -1
package/dist/scaffold/templates.js +8 -1
package/dist/scaffold/templates.js.map +1 -1
package/dist/server/CapabilityIndex.d.ts.map +1 -1
package/dist/server/CapabilityIndex.js +1 -0
package/dist/server/CapabilityIndex.js.map +1 -1
package/dist/server/routes.d.ts +8 -0
package/dist/server/routes.d.ts.map +1 -1
package/dist/server/routes.js +341 -0
package/dist/server/routes.js.map +1 -1
package/package.json +1 -1
package/src/data/builtin-manifest.json +63 -63
package/src/scaffold/templates.ts +9 -1
package/upgrades/1.3.584.md +84 -0
package/upgrades/side-effects/playwright-profile-registry.md +140 -0
package/upgrades/1.3.583.md +0 -51

package/upgrades/1.3.584.md ADDED

@@ -0,0 +1,84 @@
+# Upgrade Guide — vNEXT
+<!-- assembled-by: assemble-next-md -->
+<!-- bump: minor -->
+## What Changed
+**A durable per-agent registry mapping each Playwright browser profile to the accounts it is logged into — plus boot-time awareness of what browser access the agent actually has.** Until now the agent self-unblocked by driving a real browser (Playwright MCP) logged into real accounts, but there was no authoritative record of *which profile holds which account*. That knowledge lived only as ~21 scattered, partly-contradictory `operationalFacts` — which led the agent to ask the operator to act (or grind a credential treadmill) instead of resolving the right profile itself.
+The new `PlaywrightProfileRegistry` (`src/core/PlaywrightProfileRegistry.ts`) is the missing data + awareness + selection + activation layer:
+- A durable, machine-local state file `state/playwright-profiles.json` mapping each **profile** (a physical browser user-data-dir on THIS machine) to the **accounts** it is responsible for — by vault-secret NAME only, **never values**.
+- A compact boot-awareness pointer injected at session start (`GET /playwright-profiles/session-context`), so the agent knows from message one what browser access it has and **as whom** — operator-owned accounts flagged loud (Know Your Principal), login state rendered as last-asserted staleness (advisory, never a guarantee).
+- Routes to list / create / assign / resolve / activate profiles. `resolve` picks the owning profile for a `(service, identity)` and forces disambiguation rather than silently picking a privileged account; `activate` rewrites the MCP config and restarts the session onto the chosen profile.
+**Safety posture (the honesty disciplines that keep this from re-creating the scattered-facts problem):** no secret VALUE is ever stored, returned, injected, or resolved (names only). Every write is audited (`logs/playwright-profiles.jsonl`). A corrupt registry file fails CLOSED for writes (never auto-overwritten) and OPEN for the boot block (injects nothing). Caller-supplied profile dirs are path-jailed to the agent home. The seed is metadata-only — it never touches `.mcp.json` / `.claude/settings.json`, so an update can never regress another agent's shared browser login.
+**Rollout:** the whole feature is **dev-gated** (`playwrightRegistry.enabled` omitted → live on a development agent, **dark on the fleet** — routes 503, the boot block injects nothing). The only destructive op, `activate` (config rewrite + session restart), additionally ships `dryRun: true` — it LOGS the intended rewrite/refresh and performs NEITHER until a deliberate `dryRun: false`. Existing agents pick it up via full migration parity (state seed, session-start hook, CLAUDE.md awareness section, config default + strip-false migration).
+A short timer drift that recurs while load sits in the **1.0–1.5/core band** slipped past both
+existing guards: the load guard fires only above 1.5/core, and the consecutive burst floor resets
+whenever on-time ticks fall between drifts. Its ~2-minute cadence also outlasted the 60s cooldown.
+So each isolated drift emitted a **false `wake`**, firing the full wake-recovery cascade (tunnel
+restart, Slack reconnect, mesh-lease churn, topic failover) — the source of a class of multi-machine
+UX failures: a reply that's lost the conversation thread, messages that get no reply, and "remote
+typing is disabled" (the 2026-06-15 incident, measured at ~1.13/core).
+The detector now adds a **recurring-drift guard**: a short drift within `recentDriftWindowMs`
+(default 5 min) of a prior short drift, while load is oversubscribed (`> recentDriftLoadFloor`,
+default 1.0/core), is treated as recurring CPU starvation and suppressed. This generalizes the burst
+floor from *consecutive* ticks to *recent* ticks, and the load gate confines it to the
+oversubscribed band the hard guard leaves open.
+## What to Tell Your User
+- ⚗️ **Experimental, development-agent only.** On the fleet this ships dark — the routes 503 and nothing is injected at session start, so a normal agent sees no change. On a development agent it runs live, but the only state-changing operation (switching the browser onto a profile) is held in dry-run by default, so it only LOGS what it would do until that is deliberately turned off.
+- **What it gives a dev agent:** instead of asking you to drive the browser or produce a credential, the agent can now look up which browser profile is logged into a given account, pick the right one, and (when activated) restart its session onto that profile. It tracks which accounts are **yours** vs the agent's own, so it won't act as you in a browser unless explicitly authorized — and login state is treated as last-asserted, so it re-verifies in-browser before any privileged action.
+- **At-rest honesty:** the registry file is plaintext machine-local. It lists account identities + vault key NAMES — so filesystem access to the machine reveals the agent's access *map*, never the credentials themselves (same posture as the self-knowledge tree and the relationships store).
+Side-effects review: upgrades/side-effects/playwright-profile-registry.md
+- **Fewer spurious reconnects on a busy laptop**: "When my machine got busy I used to mistake the
+  slowdown for the computer going to sleep, which kicked off a disruptive recovery — dropping the
+  conversation thread, going quiet, or disabling typing. I now recognize that pattern and stay calm,
+  so those multi-machine glitches should largely stop."
+- **Real sleeps still handled**: "If the machine genuinely sleeps, I still notice and recover
+  properly — nothing changes there."
+## Summary of New Capabilities
+| Capability | How to Use |
+|-----------|-----------|
+| See which browser profile holds which account (full detail; vault NAMES only) | `GET /playwright-profiles` |
+| Compact boot-awareness pointer (also auto-injected at session start) | `GET /playwright-profiles/session-context` |
+| Create a custom profile | `POST /playwright-profiles` `{ id, description?, userDataDir? }` |
+| Assign an account to a profile (owner agent\|operator; vault NAMES only) | `POST /playwright-profiles/:id/accounts` `{ service, identity, owner, vaultRefs[], loginMethod?, note? }` |
+| Pick the right profile for a task | `GET /playwright-profiles/resolve?service=&identity=` (ambiguous service-only → `{ ambiguous: true, candidates }`) |
+| Switch the browser onto a profile (config rewrite + session restart) | `POST /playwright-profiles/:id/activate` (ships `dryRun: true` — logs the intended switch until a deliberate `dryRun: false`; reversible by activating `default`) |
+| Capability | How to Use |
+|-----------|-----------|
+| Suppress false "wake" events from CPU starvation on a loaded host | automatic |
+| Tune or disable the new guard | `monitoring.sleepWake.recentDriftWindowMs` / `.recentDriftLoadFloor` (set window to 0 to disable) |
+## Evidence
+- `PlaywrightProfileRegistry` seeds exactly one `default` profile via the shared `resolvePlaywrightMcpConfig()` resolver (records the real `--user-data-dir` if the canonical config carries one, else `null` = the built-in default — never `.playwright-mcp`, which is the MCP output-dir, not the browser profile).
+- New `DEV_GATED_FEATURES` entry `playwrightRegistry` (configPath `playwrightRegistry.enabled`) — picked up automatically by the dual-side wiring test (`tests/unit/devGatedFeatures-wiring.test.ts`): the entry resolves LIVE under a dev-agent config and DARK under a fleet config.
+- `ConfigDefaults` adds `playwrightRegistry: { dryRun: true }` and OMITS `enabled` (the dev-gate convention, mirroring `credentialRepointing` / `topicProfiles`).
+- Migration parity in `PostUpdateMigrator`: the `/playwright-profiles/session-context` session-start fetch+inject block is modeled byte-for-byte on the existing `/self-knowledge/session-context` block (`curl -sf --max-time 4 --connect-timeout 1`, `python3` parse of `.block`, fail-open on 503/404/empty); a `migrateClaudeMd` content-sniff appends the awareness section; the `playwright-profiles-seed-v1` marker migration seeds the default profile metadata-only (idempotent, marks done either way); a `migrateConfigPlaywrightRegistryDevGate` strip-false migration mirrors the credential-repointing strip so a stale default-shaped `enabled: false` resolves the gate live.
+- The CLAUDE.md awareness section is authored ONCE (`PLAYWRIGHT_PROFILE_REGISTRY_CLAUDEMD_SECTION`) and shared by `generateClaudeMd` (new installs) and `migrateClaudeMd` (existing agents) so the two can never drift.
+Reproduction (live, 2026-06-15): on a host measured at loadavg ~18 on 16 cores (~1.13/core — above
+1.0 but below the 1.5 hard guard), `server.log` showed `[SleepWakeDetector] Wake detected after
+~33s/~21s sleep` recurring roughly every 2 minutes while the host was actively in use (not sleeping),
+each triggering the wake-recovery cascade. The drifts were isolated (on-time ticks between them reset
+the consecutive counter) and ~2 min apart (outlasting the 60s cooldown), so neither existing guard
+caught them.
+After the fix (verified by 45/45 sleep-wake unit tests across 5 files, both sides of the boundary): a
+recurring short drift in the 1.0–1.5 band is suppressed (no `wake` emitted, recorded as
+`cpu-starvation`); a genuinely isolated short drift, any drift on a light/idle host (ratio ≤ 1.0),
+and every long (real) sleep still emit; `recentDriftWindowMs: 0` restores byte-identical prior
+behavior. tsc clean.
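
Reviewer note: the recurring-drift guard described in this upgrade guide can be sketched roughly as below. This is a minimal illustration, not the package's `SleepWakeDetector` code. Only `recentDriftWindowMs` and `recentDriftLoadFloor` (with their stated defaults of 5 min and 1.0/core) come from the guide. The class name, the method name, the inclusive window comparison, and the choice to record every short drift regardless of load are assumptions made for the example.

```typescript
// Hypothetical sketch of the recurring-drift guard. Only the two config keys
// and their defaults are taken from the upgrade guide; everything else is assumed.
interface DriftGuardConfig {
  recentDriftWindowMs: number;  // 0 disables the guard (per the guide)
  recentDriftLoadFloor: number; // per-core load ratio; guard applies strictly above it
}

const DEFAULT_DRIFT_GUARD: DriftGuardConfig = {
  recentDriftWindowMs: 5 * 60 * 1000,
  recentDriftLoadFloor: 1.0,
};

class RecurringDriftGuard {
  private cfg: DriftGuardConfig;
  private lastShortDriftAt: number | null = null;

  constructor(cfg: DriftGuardConfig = DEFAULT_DRIFT_GUARD) {
    this.cfg = cfg;
  }

  // Call once per observed SHORT drift. Returns true when the drift should be
  // suppressed as recurring CPU starvation instead of emitted as a wake.
  shouldSuppress(nowMs: number, loadPerCore: number): boolean {
    const previous = this.lastShortDriftAt;
    this.lastShortDriftAt = nowMs; // assumption: every short drift is remembered

    if (this.cfg.recentDriftWindowMs <= 0 || previous === null) {
      return false; // guard disabled, or a genuinely isolated first drift
    }
    const recent = nowMs - previous <= this.cfg.recentDriftWindowMs;
    const oversubscribed = loadPerCore > this.cfg.recentDriftLoadFloor;
    return recent && oversubscribed;
  }
}

// The reproduction host: loadavg ~18 on 16 cores, drifts about 2 minutes apart.
const guard = new RecurringDriftGuard();
const loadPerCore = 18 / 16;
const firstDrift = guard.shouldSuppress(0, loadPerCore);
const secondDrift = guard.shouldSuppress(2 * 60 * 1000, loadPerCore);
```

Under these assumptions the first drift is still emitted and the second one, two minutes later on the oversubscribed host, is suppressed. Long (real) sleeps would bypass this check entirely, since the guide applies the guard to short drifts only.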

package/upgrades/side-effects/playwright-profile-registry.md ADDED

@@ -0,0 +1,140 @@
+# Side-Effects Review — Playwright Profile Registry + Account-Access Awareness
+Spec: `docs/specs/playwright-profile-registry.md` (converged 2 iterations, approved).
+Tier: **2** (new feature: new store, 8 routes, migration parity, session-start hook,
+config + dev-gate, agent awareness; risk floor raised by new-capability + identity-touch
++ fleet-rollout signals — Tier 2 matches).
+## Phase 1 — Principle check (signal vs authority)
+Does this change involve a decision point that gates information flow / blocks actions /
+constrains agent behavior? **No blocking authority.** The feature is a
+data + awareness + selection + tool layer:
+- The boot block is an explicitly-ADVISORY signal (`<playwright-profiles>` envelope,
+  "background signal, not authority — verify before acting"); login state is
+  `lastAsserted`, never asserted as fact (D11).
+- `activate` is a TOOL the agent invokes; it does NOT bypass any external-operation /
+  coherence gate — switching the browser profile is not authorization to act as that
+  identity (D12 + the activate clause).
+- The only gates are the dev-agent dark gate + the `activate` `dryRun` — both ROLLOUT
+  controls, not behavioral authority.
+No brittle check holds blocking authority. Compliant with `docs/signal-vs-authority.md`.
+## Phase 2 — Build location
+Fresh worktree `.worktrees/playwright-profile-registry`, branch
+`echo/playwright-profile-registry` off `JKHeadley/main` @461ceec0e (package.json
+v1.3.579). `git remote -v`: JKHeadley = canonical. Identity set to
+`Instar Agent (echo)` / `echo@instar.local`.
+## The 8 questions
+### 1. Over-block (what legitimate inputs does this reject that it shouldn't?)
+- `userDataDir` jail (D9) rejects any path outside the agent home, `-`-prefixed, or
+  NUL-bearing. A legitimate profile dir is always inside the agent home (the worktree
+  convention / sandbox-stable home), so this rejects nothing legitimate. A user who
+  genuinely wanted a profile outside the home would be rejected — intentional (the jail
+  is the security boundary; out-of-home profiles are exactly the cross-agent-theft /
+  sandbox-revocation hazard).
+- Ref-validation fails CLOSED when the vault is unreadable (D17): a legitimate assign is
+  rejected (409) while the vault is decrypt-failed/absent. Intentional — better to
+  refuse than record an unvalidated ref. The vault being unreadable is itself an
+  incident the operator should resolve first.
+### 2. Under-block (what failure modes does this still miss?)
+- The registry cannot observe a dead browser cookie — `lastAsserted: true` can be stale.
+  MITIGATED by D11 (rendered staleness age + advisory framing + "verify before
+  privileged action"), not eliminated. This is by design: the registry is a signal, not
+  a liveness oracle. The agent must re-verify in-browser.
+- `owner: agent|operator` is a self-asserted label, not a verified principal (D12 note):
+  a poisoned/mistaken write could mislabel. MITIGATED by the audit log (attributable),
+  the advisory framing, and the real act-as defense being the un-bypassed
+  external-op/coherence gate — the label is a hint, never an authorization.
+### 3. Level-of-abstraction fit
+Correct layer. It mirrors the proven `BootSelfKnowledge` boot-block pattern + the
+`CommitmentTracker` CAS pattern + the `credentialRepointing` dryRun convention. The
+login keystrokes are deliberately NOT here (D8 — a non-deterministic interactive action
+belongs in the agent, not a deterministic route). A smarter gate does not already own
+this; it feeds awareness, it does not duplicate one.
+### 4. Signal vs authority compliance
+Compliant (see Phase 1). The boot block is advisory; `activate` does not gate behavior;
+the dev-gate + dryRun are rollout controls. No brittle blocking authority added.
+### 5. Interactions (shadowing / double-fire / races)
+- `activate`'s session refresh + the MCP-health auto-refresh (`mcp-health-autorefresh.sh`)
+  could both target playwright. MITIGATED: the auto-refresh has a hard once-per-(session,
+  failed-set) loop-guard (verified at PostUpdateMigrator.ts:8442+); `activate`'s
+  already-active fast path (no write/no refresh when the target dir is already set) +
+  the per-session activate cooldown/window breaker prevent a restart storm (D19).
+- Concurrent writes to `state/playwright-profiles.json`: single-writer CAS `mutate()`
+  (D14) — no lost update (NOT bare `writeConfigAtomic`).
+- The shared `resolvePlaywrightMcpConfig()` is the SINGLE source-of-truth for both seed
+  and activate (F2) — the two paths cannot drift on "where the playwright arg lives."
+- The boot fetch is added adjacent to the self-knowledge fetch in `getSessionStartHook()`
+  — same fail-open shape; it cannot block boot (D22).
+### 6. External surfaces
+- New HTTP routes (`/playwright-profiles/*`) — Bearer-authed, whole-feature dev-gated
+  (503 on fleet). Visible to the operating agent only.
+- A new always-injected boot block — kept COMPACT (≤800 bytes, pointer-not-payload, D21)
+  to respect the boot-bloat lesson (L1); full detail behind the route.
+- The plaintext `state/playwright-profiles.json` lists account IDENTITIES + vault key
+  NAMES (never values) — an at-rest access MAP. Documented honestly in the
+  agent-awareness section (same posture as `SelfKnowledgeTree`/operationalFacts).
+- `activate` (only when `dryRun:false`) mutates the playwright MCP config file +
+  restarts the session — agent-initiated, audited, reversible.
+### 7. Multi-machine posture
+**Machine-local BY DESIGN** (D6). A browser profile's logged-in session lives in cookies
+on one machine's disk and cannot be moved by copying metadata. The state file, routes,
+and boot block describe only the machine serving the request; the boot block reads LOCAL
+state even after a topic transfer. No replication, no proxied-on-read, no generated URLs
+crossing a machine boundary, no user-facing notices needing one-voice gating. Registry
+state does not strand on topic transfer (it correctly does not travel). The cross-machine
+"which machine holds profile X" read is tracked as a follow-up
+(<!-- tracked: CMT-1554-pwprofile-crossmachine-holder-view -->), not silently assumed.
+### 8. Rollback cost
+Cheap. Dark on the fleet by construction (dev-gated → all routes 503, session-start
+injects nothing, state file inert). On a dev agent: `playwrightRegistry.enabled: false`.
+`activate`'s config edit (only when `dryRun:false`) is reversed by activating `default`
+(restores the no-arg built-in profile) or a one-line manual revert. No data migration,
+no destructive state. The seeded default profile + `dryRun:true` config default are
+additive. Back-out = flip the flag (no hot-fix release needed for the dark default).
+## Verification
+- `npx tsc --noEmit` → clean (exit 0).
+- New tests: unit 47 + integration 14 + e2e 3 = 64, all green; `devGatedFeatures-wiring`
+  82 green (picks up the new entry); ratchet/capability suites (no-silent-fallbacks,
+  no-silent-llm-fallback, CapabilityIndex, capability-registry-generator,
+  lint-dev-agent-dark-gate, PostUpdateMigrator-guardsCapabilitySection) all green.
+- `node scripts/lint-dev-agent-dark-gate.js` → clean. `node scripts/lint-guard-manifest.js`
+  → clean (request-driven feature, no manifest entry needed).
+## Phase 5 — Second-pass review (independent)
+**Concur with the review** — the implementation matches the artifact's claims and is
+sound. Independently verified against the code (file:line):
+1. activate (routes.ts:16933-17002): write+refresh gated behind `dryRun` default-true;
+   already-active fast path skips both; per-session 30s cooldown + 5/5min breaker
+   (:16758-16775) on the real-switch path only; rewrites only `mcpServers.playwright.args`
+   + schedules a refresh, makes no authorization claim → cannot grant act-as. No storm.
+2. No secret values: `listVaultNames` → `secretKeyPaths` (names only); audit log + boot
+   block never carry values/refs-values. Invariant holds end-to-end.
+3. Signal vs authority: boot block advisory; operator accounts marked "act-as only when
+   authorized"; staleness rendered; no code consumes the block as authority.
+4. Fail-closed/open: assign fails closed when vault names null; corrupt file → CRUD
+   throws, never overwrites; block fails open; boot hook `curl -sf --max-time 4`,
+   non-2xx/empty injects nothing.
+5. CAS: genuine single-writer `mutate()` (statSig before/after + retry).
+6. dev-gate: all 8 routes 503 on fleet; flag read fresh per request; `enabled` omitted;
+   strip-false migration + DEV_GATED_FEATURES entry present.
+7. sanitize: every rendered boot field through `sanitizeForBlock` (escapes `<`/`>`,
+   strips control chars); envelope breakout impossible.
+8. New risks: none material (reads call one-time idempotent ensureSeeded — no storm;
+   audit-log re-sanitize advisory documented, not yet a live surface).
+No hole in the activate restart path or the no-value invariant.
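
Reviewer note: the `activate` safeguards listed in the second-pass review (dry-run by default, an already-active fast path, a per-session 30s cooldown, and a 5-per-5-minute breaker on the real-switch path) could be combined along the lines below. This is a hedged sketch, not the code at the cited `routes.ts` lines. The type names, the `decide` method, the order in which the checks run, and the reading of "5/5min" as at most five real switches per sliding five-minute window are all assumptions.

```typescript
// Hypothetical sketch of the activate gating. Only the 30s cooldown, the
// 5-per-5-minute breaker, dryRun, and the already-active fast path come from
// the review; all identifiers here are invented for illustration.
type ActivateDecision =
  | "already-active" // target dir already set: no write, no refresh
  | "dry-run"        // log the intended rewrite/refresh, perform neither
  | "cooldown"       // a real switch happened too recently
  | "breaker-open"   // too many real switches in the window
  | "switch";        // perform the config rewrite and schedule the refresh

interface ActivateLimits {
  cooldownMs: number;
  breakerMax: number;
  breakerWindowMs: number;
}

const DEFAULT_LIMITS: ActivateLimits = {
  cooldownMs: 30 * 1000,
  breakerMax: 5,
  breakerWindowMs: 5 * 60 * 1000,
};

interface ActivateRequest {
  dryRun: boolean;
  currentDir: string | null; // null stands for the built-in default profile
  targetDir: string | null;
  nowMs: number;
}

// One instance per session (assumption), so limits are tracked per session.
class ActivateGate {
  private limits: ActivateLimits;
  private switchTimes: number[] = [];

  constructor(limits: ActivateLimits = DEFAULT_LIMITS) {
    this.limits = limits;
  }

  decide(req: ActivateRequest): ActivateDecision {
    if (req.currentDir === req.targetDir) return "already-active";
    if (req.dryRun) return "dry-run";

    // Only real switches count toward the cooldown and the breaker.
    const { cooldownMs, breakerMax, breakerWindowMs } = this.limits;
    this.switchTimes = this.switchTimes.filter((t) => req.nowMs - t < breakerWindowMs);
    const last = this.switchTimes[this.switchTimes.length - 1];
    if (last !== undefined && req.nowMs - last < cooldownMs) return "cooldown";
    if (this.switchTimes.length >= breakerMax) return "breaker-open";

    this.switchTimes.push(req.nowMs);
    return "switch";
  }
}

const gate = new ActivateGate();
const target = "/home/agent/profiles/work"; // made-up path for the example
const dry = gate.decide({ dryRun: true, currentDir: null, targetDir: target, nowMs: 0 });
const real = gate.decide({ dryRun: false, currentDir: null, targetDir: target, nowMs: 0 });
const tooSoon = gate.decide({ dryRun: false, currentDir: target, targetDir: null, nowMs: 10 * 1000 });
```

In this sketch a dry-run request never consumes cooldown or breaker budget, which matches the review's statement that both limits sit on the real-switch path only.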

package/upgrades/1.3.583.md DELETED

@@ -1,51 +0,0 @@
-# Upgrade Guide — vNEXT
-<!-- assembled-by: assemble-next-md -->
-<!-- bump: patch -->
-## What Changed
-A short timer drift that recurs while load sits in the **1.0–1.5/core band** slipped past both
-existing guards: the load guard fires only above 1.5/core, and the consecutive burst floor resets
-whenever on-time ticks fall between drifts. Its ~2-minute cadence also outlasted the 60s cooldown.
-So each isolated drift emitted a **false `wake`**, firing the full wake-recovery cascade (tunnel
-restart, Slack reconnect, mesh-lease churn, topic failover) — the source of a class of multi-machine
-UX failures: a reply that's lost the conversation thread, messages that get no reply, and "remote
-typing is disabled" (the 2026-06-15 incident, measured at ~1.13/core).
-The detector now adds a **recurring-drift guard**: a short drift within `recentDriftWindowMs`
-(default 5 min) of a prior short drift, while load is oversubscribed (`> recentDriftLoadFloor`,
-default 1.0/core), is treated as recurring CPU starvation and suppressed. This generalizes the burst
-floor from *consecutive* ticks to *recent* ticks, and the load gate confines it to the
-oversubscribed band the hard guard leaves open.
-## What to Tell Your User
-- **Fewer spurious reconnects on a busy laptop**: "When my machine got busy I used to mistake the
-  slowdown for the computer going to sleep, which kicked off a disruptive recovery — dropping the
-  conversation thread, going quiet, or disabling typing. I now recognize that pattern and stay calm,
-  so those multi-machine glitches should largely stop."
-- **Real sleeps still handled**: "If the machine genuinely sleeps, I still notice and recover
-  properly — nothing changes there."
-## Summary of New Capabilities
-| Capability | How to Use |
-|-----------|-----------|
-| Suppress false "wake" events from CPU starvation on a loaded host | automatic |
-| Tune or disable the new guard | `monitoring.sleepWake.recentDriftWindowMs` / `.recentDriftLoadFloor` (set window to 0 to disable) |
-## Evidence
-Reproduction (live, 2026-06-15): on a host measured at loadavg ~18 on 16 cores (~1.13/core — above
-1.0 but below the 1.5 hard guard), `server.log` showed `[SleepWakeDetector] Wake detected after
-~33s/~21s sleep` recurring roughly every 2 minutes while the host was actively in use (not sleeping),
-each triggering the wake-recovery cascade. The drifts were isolated (on-time ticks between them reset
-the consecutive counter) and ~2 min apart (outlasting the 60s cooldown), so neither existing guard
-caught them.
-After the fix (verified by 45/45 sleep-wake unit tests across 5 files, both sides of the boundary): a
-recurring short drift in the 1.0–1.5 band is suppressed (no `wake` emitted, recorded as
-`cpu-starvation`); a genuinely isolated short drift, any drift on a light/idle host (ratio ≤ 1.0),
-and every long (real) sleep still emit; `recentDriftWindowMs: 0` restores byte-identical prior
-behavior. tsc clean.