instar 0.28.0 → 0.28.2

Files changed (41)
  1. package/README.md +1 -1
  2. package/dist/commands/init.js +1 -1
  3. package/dist/commands/init.js.map +1 -1
  4. package/dist/commands/server.d.ts.map +1 -1
  5. package/dist/commands/server.js +42 -3
  6. package/dist/commands/server.js.map +1 -1
  7. package/dist/core/types.d.ts +1 -1
  8. package/dist/core/types.d.ts.map +1 -1
  9. package/dist/core/types.js.map +1 -1
  10. package/dist/lifeline/ServerSupervisor.d.ts.map +1 -1
  11. package/dist/lifeline/ServerSupervisor.js +31 -0
  12. package/dist/lifeline/ServerSupervisor.js.map +1 -1
  13. package/dist/lifeline/TelegramLifeline.d.ts +8 -0
  14. package/dist/lifeline/TelegramLifeline.d.ts.map +1 -1
  15. package/dist/lifeline/TelegramLifeline.js +115 -6
  16. package/dist/lifeline/TelegramLifeline.js.map +1 -1
  17. package/dist/messaging/SpawnRequestManager.d.ts +16 -0
  18. package/dist/messaging/SpawnRequestManager.d.ts.map +1 -1
  19. package/dist/messaging/SpawnRequestManager.js +63 -5
  20. package/dist/messaging/SpawnRequestManager.js.map +1 -1
  21. package/dist/scheduler/JobScheduler.d.ts.map +1 -1
  22. package/dist/scheduler/JobScheduler.js +8 -4
  23. package/dist/scheduler/JobScheduler.js.map +1 -1
  24. package/dist/server/AgentServer.d.ts +5 -0
  25. package/dist/server/AgentServer.d.ts.map +1 -1
  26. package/dist/server/AgentServer.js +1 -0
  27. package/dist/server/AgentServer.js.map +1 -1
  28. package/dist/server/routes.d.ts +7 -0
  29. package/dist/server/routes.d.ts.map +1 -1
  30. package/dist/server/routes.js +85 -15
  31. package/dist/server/routes.js.map +1 -1
  32. package/dist/threadline/AgentTrustManager.d.ts +2 -0
  33. package/dist/threadline/AgentTrustManager.d.ts.map +1 -1
  34. package/dist/threadline/AgentTrustManager.js +5 -3
  35. package/dist/threadline/AgentTrustManager.js.map +1 -1
  36. package/dist/threadline/mcp-stdio-entry.js +3 -2
  37. package/dist/threadline/mcp-stdio-entry.js.map +1 -1
  38. package/package.json +1 -1
  39. package/src/data/builtin-manifest.json +75 -75
  40. package/upgrades/0.28.1.md +24 -0
  41. package/upgrades/0.28.2.md +29 -0
@@ -0,0 +1,24 @@
1
+ # Upgrade Guide — v0.28.1
2
+
3
+ <!-- bump: patch -->
4
+
5
+ ## What Changed
6
+
7
+ **Gate Failure Diagnostics** — When a job's gate command fails, the scheduler now captures and logs the exit code, stderr output, and the gate command itself in the event metadata. Previously, gate failures were silently swallowed with only "gate check returned nothing to do" — making it impossible to diagnose why jobs were being skipped.
8
+
9
+ Additionally, gate skips are now recorded in the SkipLedger (previously only quota, paused, claimed, and machine-scope skips were tracked). The new `gate` skip reason appears in skip reports and the auto-tune system.
10
+
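The gate diagnostics described above could be captured roughly as follows. This is a minimal sketch, not the scheduler's actual code: the `job_gate_skip` event name and the `exitCode` and `stderr` fields come from the guide, while `GateSkipEvent`, `toGateSkipEvent`, `checkGate`, and the `gateCommand` field name are hypothetical.

```typescript
import { spawnSync } from "node:child_process";

// Hypothetical event shape; the real instar event schema may differ.
interface GateSkipEvent {
  type: "job_gate_skip";
  jobId: string;
  metadata: { gateCommand: string; exitCode: number | null; stderr: string };
}

// Pure helper: build a skip event from a gate result, or null when the gate passed.
export function toGateSkipEvent(
  jobId: string,
  gateCommand: string,
  exitCode: number | null,
  stderr: string,
): GateSkipEvent | null {
  if (exitCode === 0) return null; // gate passed, so the job should run
  return {
    type: "job_gate_skip",
    jobId,
    metadata: { gateCommand, exitCode, stderr: stderr.trim() },
  };
}

// Runs the gate through the shell and keeps its diagnostics instead of discarding them.
export function checkGate(jobId: string, gateCommand: string): GateSkipEvent | null {
  const result = spawnSync(gateCommand, { shell: true, encoding: "utf8" });
  // status is null when the process was killed by a signal.
  return toGateSkipEvent(jobId, gateCommand, result.status, result.stderr ?? "");
}
```

A caller would log the returned event and record a `gate` skip whenever `checkGate` returns a non-null value.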
11
+ **Memory Export Gate Auth Fix** — The memory-export job's gate command referenced an undefined `$AUTH` shell variable. The gate now resolves the auth token inline from the agent's config file, matching the pattern used by the job's execute script.
12
+
13
+ ## What to Tell Your User
14
+
15
+ - **Better job skip visibility**: "If your scheduled jobs are being skipped and you are not sure why, the skip events now include the actual error from the gate check. You can see these in the event log or skip ledger."
16
+
17
+ - **Memory export reliability**: "The memory export job had a gate that would always fail silently due to a missing authentication token. This is now fixed — memory exports should run on schedule."
18
+
19
+ ## Summary of New Capabilities
20
+
21
+ | Capability | How to Use |
22
+ |-----------|-----------|
23
+ | Gate skip diagnostics | Automatic — check event log for `job_gate_skip` events with `exitCode` and `stderr` metadata |
24
+ | Gate skip in SkipLedger | `GET /jobs/report` now includes gate skips in skip counts |
@@ -0,0 +1,29 @@
1
+ # Upgrade Guide — v0.28.2
2
+
3
+ <!-- bump: patch -->
4
+
5
+ ## What Changed
6
+
7
+ **Lifeline Shutdown Crash-Proofing** — The shutdown handler in both the lifeline and server now wraps `unregisterAgent`, `stopHeartbeat`, and other cleanup steps in try/catch blocks. Previously, an ELOCKED error from the agent registry during SIGTERM would crash the lifeline with an unhandled exception. This confused launchd's KeepAlive restart logic and could leave the agent in a "spawn scheduled" limbo state indefinitely. The lifeline also now installs global `uncaughtException` and `unhandledRejection` handlers that specifically catch ELOCKED errors, so registry lock contention should no longer crash the lifeline.
8
+
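A minimal sketch of the two shutdown safeguards described above, assuming Node.js. The names `runCleanup`, `isLockError`, and `installLockSafetyNet` are hypothetical. Exiting on non-ELOCKED errors is also an assumption, since the guide only says how ELOCKED is handled.

```typescript
type CleanupStep = { name: string; run: () => void | Promise<void> };

// True for errors carrying code "ELOCKED", as lock-contention errors are assumed to do.
export function isLockError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "ELOCKED"
  );
}

// Runs every cleanup step even if earlier ones throw; returns the names that failed.
export async function runCleanup(steps: CleanupStep[]): Promise<string[]> {
  const failed: string[] = [];
  for (const step of steps) {
    try {
      await step.run();
    } catch (err) {
      failed.push(step.name);
      console.error(`[shutdown] ${step.name} failed:`, err);
    }
  }
  return failed;
}

// Global safety net: log ELOCKED and keep running; other errors still end the process.
export function installLockSafetyNet(): void {
  const handle = (err: unknown): void => {
    if (isLockError(err)) {
      console.error("[lifeline] ignoring ELOCKED:", err);
      return;
    }
    console.error("[lifeline] fatal:", err);
    process.exit(1); // assumption: non-lock errors keep their crash behavior
  };
  process.on("uncaughtException", handle);
  process.on("unhandledRejection", handle);
}
```

Wrapping each step separately means a failing `unregisterAgent` cannot prevent `stopHeartbeat` or later steps from running.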
9
+ **409 Conflict Resolution** — When a stale Telegram long-poll connection causes persistent 409 Conflict errors ("another bot instance is polling"), the lifeline now actively tries to reclaim exclusive polling after every 20 consecutive 409 errors by calling `deleteWebhook` followed by a zero-timeout `getUpdates`. Previously, the lifeline would back off to 60 seconds and never recover, rendering the agent permanently unreachable until a manual restart.
10
+
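The reclaim logic above can be sketched as a counter plus a two-call sequence. This is an illustration only: `deleteWebhook` and `getUpdates` (with its `timeout` parameter) are Telegram Bot API methods, while `PollingApi`, `ConflictTracker`, and `reclaimPolling` are hypothetical names, and the actual HTTP calls are left to the caller.

```typescript
// Minimal slice of the Telegram Bot API used here; callers supply the real HTTP calls.
interface PollingApi {
  deleteWebhook(): Promise<void>;
  getUpdates(params: { timeout: number }): Promise<unknown>;
}

const RECLAIM_EVERY = 20; // from the guide: every 20 consecutive 409 errors

export class ConflictTracker {
  private consecutive409 = 0;

  // Records one poll outcome; returns true when a reclaim attempt is due.
  record(httpStatus: number): boolean {
    if (httpStatus !== 409) {
      this.consecutive409 = 0; // any non-409 result breaks the streak
      return false;
    }
    this.consecutive409 += 1;
    return this.consecutive409 % RECLAIM_EVERY === 0;
  }
}

// Tries to take polling back: clear any webhook, then issue a zero-timeout getUpdates.
export async function reclaimPolling(api: PollingApi): Promise<boolean> {
  try {
    await api.deleteWebhook();
    await api.getUpdates({ timeout: 0 });
    return true;
  } catch {
    return false; // still contended; the caller keeps backing off and tries again later
  }
}
```

Using a modulo check rather than resetting the counter means a long streak triggers a reclaim at 20, 40, 60, and so on, which matches "every 20" rather than "once".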
11
+ **Settings JSON Self-Healing** — Both the lifeline startup and the ServerSupervisor preflight checks now validate `.claude/settings.json` for parseable JSON. If unresolved git merge conflict markers are detected, the file is auto-repaired by stripping the conflict markers (with a backup saved). This was the root cause of a complete agent outage where every Claude Code session crashed on startup due to invalid JSON, while the lifeline and server showed no errors — the agent appeared alive but never responded to messages.
12
+
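One way the settings repair above could work is sketched below. The guide says only that conflict markers are stripped. Keeping the first ("ours") side of each conflict and dropping the other is an assumption made here so the result can parse, and `stripConflictMarkers` and `repairSettingsJson` are hypothetical names. File I/O and the backup copy are omitted.

```typescript
// Removes git conflict markers, keeping the first side of each conflict (assumption).
export function stripConflictMarkers(text: string): string {
  const kept: string[] = [];
  let state: "normal" | "ours" | "theirs" = "normal";
  for (const line of text.split("\n")) {
    if (line.startsWith("<<<<<<<")) { state = "ours"; continue; }
    if (state === "ours" && line.startsWith("=======")) { state = "theirs"; continue; }
    if (state === "theirs" && line.startsWith(">>>>>>>")) { state = "normal"; continue; }
    if (state !== "theirs") kept.push(line); // drop the second side entirely
  }
  return kept.join("\n");
}

// Returns the text unchanged if it parses, otherwise a repaired copy.
// Throws if the text is still not valid JSON after stripping markers.
export function repairSettingsJson(text: string): { json: string; repaired: boolean } {
  try {
    JSON.parse(text);
    return { json: text, repaired: false };
  } catch {
    // fall through to the repair attempt
  }
  const candidate = stripConflictMarkers(text);
  JSON.parse(candidate); // validation: throws when the repair did not help
  return { json: candidate, repaired: true };
}
```

A caller would write a backup of the original file before saving `json` whenever `repaired` is true.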
13
+ ## What to Tell Your User
14
+
15
+ - **More resilient agent recovery**: "Your agent is now much harder to take down. Even if something goes wrong during a restart, the lifeline will survive and keep your agent reachable. Previously, lock file contention during shutdown could permanently strand the agent."
16
+
17
+ - **Automatic Telegram recovery**: "If your agent's Telegram connection gets into a conflict state — for example after a machine sleep or a crash — it will now automatically recover instead of requiring a manual restart."
18
+
19
+ - **Config corruption protection**: "If a git sync causes a merge conflict in your agent's settings file, the system will now detect and auto-repair it. Previously, this could silently prevent all sessions from starting while everything else appeared healthy."
20
+
21
+ ## Summary of New Capabilities
22
+
23
+ | Capability | How to Use |
24
+ |-----------|-----------|
25
+ | Shutdown crash-proofing | Automatic — ELOCKED errors during shutdown are caught and logged |
26
+ | 409 conflict resolution | Automatic — reclaim attempt every 20 consecutive 409 errors |
27
+ | Settings JSON validation | Automatic — preflight check on every server/lifeline start |
28
+ | Global ELOCKED safety net | Automatic — uncaught ELOCKED errors logged but don't crash lifeline |
29
+