instar 1.3.582 → 1.3.583
- package/dist/commands/server.d.ts.map +1 -1
- package/dist/commands/server.js +2 -0
- package/dist/commands/server.js.map +1 -1
- package/dist/core/SleepWakeDetector.d.ts +39 -0
- package/dist/core/SleepWakeDetector.d.ts.map +1 -1
- package/dist/core/SleepWakeDetector.js +52 -0
- package/dist/core/SleepWakeDetector.js.map +1 -1
- package/dist/core/types.d.ts +10 -0
- package/dist/core/types.d.ts.map +1 -1
- package/dist/core/types.js.map +1 -1
- package/package.json +1 -1
- package/src/data/builtin-manifest.json +2 -2
- package/upgrades/1.3.583.md +51 -0
- package/upgrades/side-effects/sleepwake-recurring-drift-guard.md +46 -0
@@ -1,8 +1,8 @@
 {
   "$schema": "./builtin-manifest.schema.json",
   "schemaVersion": 1,
-  "generatedAt": "2026-06-
-  "instarVersion": "1.3.
+  "generatedAt": "2026-06-15T23:23:58.124Z",
+  "instarVersion": "1.3.583",
   "entryCount": 201,
   "entries": {
     "hook:session-start": {
@@ -0,0 +1,51 @@
+# Upgrade Guide — vNEXT
+
+<!-- assembled-by: assemble-next-md -->
+<!-- bump: patch -->
+
+## What Changed
+
+A short timer drift that recurs while load sits in the **1.0–1.5/core band** slipped past both
+existing guards: the load guard fires only above 1.5/core, and the consecutive burst floor resets
+whenever on-time ticks fall between drifts. Its ~2-minute cadence also outlasted the 60s cooldown.
+So each isolated drift emitted a **false `wake`**, firing the full wake-recovery cascade (tunnel
+restart, Slack reconnect, mesh-lease churn, topic failover) — the source of a class of multi-machine
+UX failures: a reply that's lost the conversation thread, messages that get no reply, and "remote
+typing is disabled" (the 2026-06-15 incident, measured at ~1.13/core).
+
+The detector now adds a **recurring-drift guard**: a short drift within `recentDriftWindowMs`
+(default 5 min) of a prior short drift, while load is oversubscribed (`> recentDriftLoadFloor`,
+default 1.0/core), is treated as recurring CPU starvation and suppressed. This generalizes the burst
+floor from *consecutive* ticks to *recent* ticks, and the load gate confines it to the
+oversubscribed band the hard guard leaves open.
+
+## What to Tell Your User
+
+- **Fewer spurious reconnects on a busy laptop**: "When my machine got busy I used to mistake the
+slowdown for the computer going to sleep, which kicked off a disruptive recovery — dropping the
+conversation thread, going quiet, or disabling typing. I now recognize that pattern and stay calm,
+so those multi-machine glitches should largely stop."
+- **Real sleeps still handled**: "If the machine genuinely sleeps, I still notice and recover
+properly — nothing changes there."
+
+## Summary of New Capabilities
+
+| Capability | How to Use |
+|-----------|-----------|
+| Suppress false "wake" events from CPU starvation on a loaded host | automatic |
+| Tune or disable the new guard | `monitoring.sleepWake.recentDriftWindowMs` / `.recentDriftLoadFloor` (set window to 0 to disable) |
+
+## Evidence
+
+Reproduction (live, 2026-06-15): on a host measured at loadavg ~18 on 16 cores (~1.13/core — above
+1.0 but below the 1.5 hard guard), `server.log` showed `[SleepWakeDetector] Wake detected after
+~33s/~21s sleep` recurring roughly every 2 minutes while the host was actively in use (not sleeping),
+each triggering the wake-recovery cascade. The drifts were isolated (on-time ticks between them reset
+the consecutive counter) and ~2 min apart (outlasting the 60s cooldown), so neither existing guard
+caught them.
+
+After the fix (verified by 45/45 sleep-wake unit tests across 5 files, both sides of the boundary): a
+recurring short drift in the 1.0–1.5 band is suppressed (no `wake` emitted, recorded as
+`cpu-starvation`); a genuinely isolated short drift, any drift on a light/idle host (ratio ≤ 1.0),
+and every long (real) sleep still emit; `recentDriftWindowMs: 0` restores byte-identical prior
+behavior. tsc clean.
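The guard ordering described in the upgrade notes above (hard load guard, then the recurring-drift guard, then the cooldown) can be sketched as a pure decision function. This is a minimal sketch under stated assumptions, not the shipped `SleepWakeDetector` code. The names `classifyDrift`, `GuardConfig`, `GuardState` and `Decision`, the `'cooldown'` reason string, the 120-second `longSleepFloorSeconds` value, and the exact points at which state is updated are all hypothetical. Only the two new knobs and their defaults, the 1.5/core hard guard, the 60s cooldown and the `cpu-starvation` reason come from the notes. The consecutive burst floor is omitted for brevity.

```typescript
// Hypothetical shapes for illustration; not the package's real types.
interface GuardConfig {
  maxLoadRatio: number;          // hard load guard (notes: fires above 1.5/core)
  recentDriftWindowMs: number;   // notes: default 300000; 0 disables the guard
  recentDriftLoadFloor: number;  // notes: default 1.0/core
  longSleepFloorSeconds: number; // ASSUMED value; the notes give no default
  cooldownMs: number;            // notes: 60s cooldown
}

interface GuardState {
  lastShortDriftAtMs: number | null;
  lastWakeAtMs: number | null;
}

type Decision =
  | { emit: true }
  | { emit: false; reason: 'cpu-starvation' | 'cooldown' };

const defaults: GuardConfig = {
  maxLoadRatio: 1.5,
  recentDriftWindowMs: 300000,
  recentDriftLoadFloor: 1.0,
  longSleepFloorSeconds: 120,
  cooldownMs: 60000,
};

function classifyDrift(
  driftSeconds: number,
  loadRatio: number,
  nowMs: number,
  state: GuardState,
  cfg: GuardConfig = defaults,
): Decision {
  // Long (real) sleeps always emit; placing this check first is an assumption.
  if (driftSeconds >= cfg.longSleepFloorSeconds) {
    state.lastWakeAtMs = nowMs;
    return { emit: true };
  }

  // Existing hard load guard: heavy oversubscription is treated as starvation.
  if (loadRatio > cfg.maxLoadRatio) {
    return { emit: false, reason: 'cpu-starvation' };
  }

  // Recurring-drift guard: a short drift soon after a prior short drift,
  // while load sits above the floor, is suppressed.
  const prior = state.lastShortDriftAtMs;
  state.lastShortDriftAtMs = nowMs;
  const recurring =
    cfg.recentDriftWindowMs > 0 &&
    prior !== null &&
    nowMs - prior <= cfg.recentDriftWindowMs;
  if (recurring && loadRatio > cfg.recentDriftLoadFloor) {
    return { emit: false, reason: 'cpu-starvation' };
  }

  // Existing cooldown, measured here from the last emitted wake (assumption).
  if (state.lastWakeAtMs !== null && nowMs - state.lastWakeAtMs < cfg.cooldownMs) {
    return { emit: false, reason: 'cooldown' };
  }
  state.lastWakeAtMs = nowMs;
  return { emit: true };
}
```

With the incident's figures (load average 18 on 16 cores, a ratio of 1.125), a first 33s drift would emit in this sketch and a second 21s drift two minutes later would be suppressed, while the same pair at a ratio of 1.0 or below would both emit.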
@@ -0,0 +1,46 @@
+# Side-effects — SleepWakeDetector recurring-drift guard (gap #3 / CMT-1563)
+
+## What changed (3 files)
+
+- `src/core/SleepWakeDetector.ts` — new config `recentDriftWindowMs` (default 300000) +
+`recentDriftLoadFloor` (default 1.0); new state `lastShortDriftAtMs`; a new suppression branch
+in `start()` (after the load guard, before the cooldown) that suppresses a SHORT drift recurring
+within the window while `loadRatio > recentDriftLoadFloor`. Reuses the existing `cpu-starvation`
+suppression reason — the stats/telemetry type is unchanged.
+- `src/core/types.ts` — `config.monitoring.sleepWake` gains the two optional knobs (mirrors the
+existing `maxLoadRatio` plumbing; no ConfigDefaults change, no migration).
+- `src/commands/server.ts` — the production `new SleepWakeDetector({...})` boot site forwards the
+two new knobs from `config.monitoring.sleepWake`.
+
+## Behavioral side-effects
+
+- **On a moderately-loaded host (loadRatio in the 1.0–1.5 band):** a short timer drift that recurs
+within 5 min of a prior short drift no longer emits a `wake` — so it no longer triggers the
+wake-recovery cascade (tunnel restart / Slack reconnect / mesh-lease churn / topic failover). This
+is the fix for the 2026-06-15 multi-machine UX cascade.
+- **No change** on a light/idle host (ratio ≤ 1.0): repeated short drifts still emit (the existing
+"genuinely-isolated drifts both emit" behavior is preserved — verified by the unchanged tests).
+- **No change** for long sleeps (≥ `longSleepFloorSeconds`): always emitted, recovery preserved.
+- **No change** for an isolated short drift (no prior drift in the window): still emits.
+- The wake-reaper's cumulative-sleep accounting is unaffected — suppressed drifts were already
+excluded from `wakeHistory`, and this branch suppresses the same way.
+
+## Risk + rollback
+
+- HIGH-risk surface (session-lifecycle / recovery trigger). Fail-safe direction: the branch only
+ADDS suppression to a SHORT drift on an OVERSUBSCRIBED host; it can never suppress a real long
+sleep or change light-host behavior.
+- Rollback lever: `config.monitoring.sleepWake.recentDriftWindowMs: 0` disables the guard with no
+logic redeploy (restores exactly today's behavior).
+
+## Tests
+
+- `tests/unit/sleep-wake-starvation-guard.test.ts` — new `describe('recurring-drift guard for the
+moderate-load band')` with 5 cases (band-suppress, light-host-emit, isolated-emit, disable-lever,
+long-sleep-exempt). Full sleep-wake unit suite: 39/39 green. tsc clean on the touched files.
+
+## Migration parity
+
+The fix ships in the class default, so every agent gets it on update (the boot site reads optional
+config but the default is in the constructor). No `.claude`/hook/skill/CLAUDE.md template change is
+required — this is an internal monitoring guard, not an agent-facing capability or route.
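The rollback lever and the "default is in the constructor" note above both depend on how optional knobs are merged with defaults: an explicit `0` has to survive the merge for the guard to be disabled. A minimal sketch of one way this might look, assuming nullish coalescing. `SleepWakeKnobs` and `resolveSleepWakeOptions` are hypothetical names, and the real constructor may resolve its options differently.

```typescript
// Hypothetical option shape; the two field names and defaults come from the notes.
interface SleepWakeKnobs {
  recentDriftWindowMs?: number;
  recentDriftLoadFloor?: number;
}

function resolveSleepWakeOptions(knobs: SleepWakeKnobs = {}): Required<SleepWakeKnobs> {
  return {
    // `??` rather than `||`: an explicit 0 is kept, so the guard can be switched off.
    recentDriftWindowMs: knobs.recentDriftWindowMs ?? 300000,
    recentDriftLoadFloor: knobs.recentDriftLoadFloor ?? 1.0,
  };
}

// No config at all: the defaults apply, so the guard is active after the update.
const byDefault = resolveSleepWakeOptions();

// Rollback lever: window 0 is preserved rather than replaced by the default.
const rolledBack = resolveSleepWakeOptions({ recentDriftWindowMs: 0 });
```

Had the merge used `||`, a configured `0` would be falsy and silently fall back to 300000, which would defeat the lever in this sketch.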