@checkstack/slo-backend 0.4.6 → 0.6.0

package/CHANGELOG.md CHANGED
@@ -1,5 +1,220 @@
1
1
  # @checkstack/slo-backend
2
2
 
3
+ ## 0.6.0
4
+
5
+ ### Minor Changes
6
+
7
+ - b995afb: Make `slo` a plugin-backed, COMPUTED reactive entity via the Model-B entity state machine + rewire its cross-plugin consumers.
8
+
9
+ SLO defines a `slo` entity `{ objectiveId, systemId, target, budgetRemainingPercent, currentStreak, bestStreak }` keyed by `objectiveId`. There is no framework `entity_state` row: its current state is assembled on demand by a `read` accessor (`createSloEntityRead` / `computeSloEntityState`). `currentStreak` / `bestStreak` / `systemId` / `target` come from the authoritative `slo_streaks` + `slo_objectives` tables, and `budgetRemainingPercent` (plus `target`) is COMPUTED on the fly via the SLO engine's `computeStatus` (downtime aggregation over the objective's rolling window). The daily snapshot job's streak-persist write drives through the fail-soft `writeSloEntity` (`handle.mutate({ id: objectiveId, apply })`): `apply` persists the streak to `slo_streaks` (its own write) and returns the freshly-computed view; the framework snapshots `prev` via the computed `read` BEFORE the write, appends the transition log, and emits `ENTITY_CHANGED`.
10
+
11
+ Compute-on-read (not materialize): the budget is a pure function of the objective's append-only downtime history, so storing a second copy would duplicate the engine's source of truth and risk drift. The `read` recomputes from the same tables the SLO API already reads; it is only exercised on the prev-snapshot of the once-daily streak job and on reactive scope/wake resolution, so the recompute cost is negligible. The append-only `slo_downtime_events` + `slo_daily_snapshots` tables are declared non-reactive (bookkeeping); the live budget/streak is the entity. Operators author budget/streak thresholds as reactive `numeric_state` conditions over `state.slo.<objectiveId>.budgetRemainingPercent` / `currentStreak`.
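To make the compute-on-read argument concrete, here is a minimal sketch of deriving a remaining-budget percentage purely from downtime intervals inside a rolling window. It is not the package's `computeStatus`, whose implementation does not appear in this diff. The `DowntimeInterval` shape, the name `budgetRemainingPercentSketch`, the assumption of non-overlapping intervals, and the clamping at zero are all illustrative assumptions.

```typescript
// Hypothetical shape: one closed downtime interval in epoch milliseconds.
interface DowntimeInterval {
  startMs: number;
  endMs: number;
}

// Sketch only: remaining error budget as a pure function of downtime history.
// Assumes non-overlapping intervals; the real engine may aggregate differently.
function budgetRemainingPercentSketch(args: {
  targetPercent: number; // e.g. 99.9
  windowStartMs: number;
  windowEndMs: number;
  downtime: DowntimeInterval[];
}): number {
  const { targetPercent, windowStartMs, windowEndMs, downtime } = args;
  const windowMs = windowEndMs - windowStartMs;
  // Total downtime the target tolerates inside the window.
  const allowedMs = windowMs * (1 - targetPercent / 100);

  // Count only the part of each interval that falls inside the window.
  let downMs = 0;
  for (const d of downtime) {
    const start = Math.max(d.startMs, windowStartMs);
    const end = Math.min(d.endMs, windowEndMs);
    if (end > start) downMs += end - start;
  }

  // A 100% target leaves no budget: any downtime exhausts it.
  if (allowedMs <= 0) return downMs > 0 ? 0 : 100;
  // Clamp at 0 so an overspent budget reads as exhausted, not negative.
  return Math.max(0, (1 - downMs / allowedMs) * 100);
}
```

Because the result depends only on the target, the window, and the stored history, recomputing it on every `read` cannot drift from the engine's view, which is the property the entry relies on.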
12
+
13
+ The healthcheck + catalog consumers switched from `onHook(<hook>)` to `onEntityChanged({ kind })`, all keeping `work-queue` delivery (each handler performs side-effecting writes that must run once per cluster):
14
+
15
+ - `slo-system-down` / `slo-upstream-down`: react to `health` changes filtered to a degraded transition (`classifyHealthChange().degraded`).
16
+ - `slo-system-up`: reacts to `health` changes filtered to a recovered transition (`classifyHealthChange().recovered`).
17
+ - `slo-system-cleanup`: reacts to `catalog-system` tombstones (`change.next === null`).
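The filtering described in the list above can be sketched as follows. `classifyHealthChange` lives in `@checkstack/healthcheck-backend` and its implementation is not part of this diff, so the status set, the `EntityChangeSketch` shape (with `prev` and `next`), and the exact transition predicate below are assumptions for illustration only. The tombstone check `next === null` is the one detail taken directly from the entry.

```typescript
// Assumed status set, for illustration only.
type HealthStatus = "healthy" | "degraded" | "unhealthy";

// Assumed change shape: previous and next state of one entity.
interface EntityChangeSketch<T> {
  id: string;
  prev: T | null; // null when the entity did not exist before
  next: T | null; // null marks a tombstone (entity removed)
}

interface HealthStateSketch {
  status: HealthStatus;
}

// Sketch of a transition classifier: "degraded" means leaving healthy,
// "recovered" means returning to healthy. The real predicate may differ.
function classifyTransitionSketch(
  change: EntityChangeSketch<HealthStateSketch>,
): { degraded: boolean; recovered: boolean } {
  const prev = change.prev?.status;
  const next = change.next?.status;
  // A removed entity is neither a degradation nor a recovery.
  if (next === undefined) return { degraded: false, recovered: false };
  const prevUnhealthy = prev !== undefined && prev !== "healthy";
  const nextUnhealthy = next !== "healthy";
  return {
    degraded: !prevUnhealthy && nextUnhealthy,
    recovered: prevUnhealthy && !nextUnhealthy,
  };
}

// Tombstone predicate as stated in the changelog: `change.next === null`.
function isTombstone<T>(change: EntityChangeSketch<T>): boolean {
  return change.next === null;
}
```

Under this sketch, every handler receives every change of its kind and returns early when its predicate is false, which matches the `if (!degraded) return;` and `if (!recovered) return;` guards visible in the `index.ts` hunks.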
18
+
19
+ BREAKING CHANGES:
20
+
21
+ - The `slo.budget.warning` / `slo.budget.critical` / `slo.budget.exhausted` and `slo.streak.broken` automation triggers are removed. These thresholds were never emitted by the engine (the underlying hooks were inert) and are replaced by reactive `numeric_state` conditions over the `slo` entity (`budgetRemainingPercent < 20`, `currentStreak == 0`, etc.). Re-author any automations that referenced these trigger ids as `numeric_state` / `state` conditions. The `slo.achievement.unlocked` and `slo.weekly.digest` triggers are KEPT.
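As a rough illustration of what replaces the removed triggers, the sketch below evaluates a numeric threshold against the `slo` entity state. The state fields are the ones listed in this changelog entry. The `NumericConditionSketch` shape and the `matchesSketch` function are hypothetical: the actual `numeric_state` configuration format is not shown in this diff.

```typescript
// Entity state fields as listed in the changelog entry.
interface SloEntityStateSketch {
  objectiveId: string;
  systemId: string;
  target: number;
  budgetRemainingPercent: number;
  currentStreak: number;
  bestStreak: number;
}

// Hypothetical condition shape; NOT the platform's real `numeric_state` config.
interface NumericConditionSketch {
  field: "target" | "budgetRemainingPercent" | "currentStreak" | "bestStreak";
  op: "<" | "<=" | "==" | ">=" | ">";
  value: number;
}

// Sketch: compare one numeric field of the entity state against a threshold.
function matchesSketch(
  state: SloEntityStateSketch,
  cond: NumericConditionSketch,
): boolean {
  const actual = state[cond.field];
  switch (cond.op) {
    case "<":
      return actual < cond.value;
    case "<=":
      return actual <= cond.value;
    case "==":
      return actual === cond.value;
    case ">=":
      return actual >= cond.value;
    case ">":
      return actual > cond.value;
  }
}

// The two example thresholds mentioned in the entry.
const budgetLow: NumericConditionSketch = { field: "budgetRemainingPercent", op: "<", value: 20 };
const streakBroken: NumericConditionSketch = { field: "currentStreak", op: "==", value: 0 };
```

The point of the design is that thresholds become data chosen by the operator rather than fixed event ids baked into the plugin.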
22
+
23
+ - b995afb: Remove the dead `slo.budget.warning` / `slo.budget.critical` / `slo.budget.exhausted` / `slo.streak.broken` hook descriptors from `sloHooks`.
24
+
25
+ These four `createHook` descriptors had no emitter and no trigger registration left: per the reactive automation engine (§9.2) the SLO budget IS the reactive entity, and the old threshold/streak triggers became `numeric_state` / `state` conditions over `state.slo.<objectiveId>.budgetRemainingPercent` + `currentStreak`. Nothing in the repo emitted or subscribed to the four hooks, so they were unreachable surface. `sloAchievementUnlocked` and `sloWeeklyDigest` are unaffected and stay.
26
+
27
+ BREAKING CHANGES:
28
+
29
+ - Removed `sloHooks.sloBudgetWarning`, `sloHooks.sloBudgetCritical`, `sloHooks.sloBudgetExhausted`, and `sloHooks.sloStreakBroken`. Author SLO budget / streak threshold automations as reactive `numeric_state` / `state` conditions over the `slo` entity state instead.
30
+
31
+ ### Patch Changes
32
+
33
+ - b995afb: Extract a shared `withEntityWrite` / `withEntityRemove` guard for PLUGIN-BACKED (Model B) reactive entities and refactor the per-domain copies onto it.
34
+
35
+ Every plugin-backed domain (incident, catalog, dependency, maintenance, slo, satellite) reimplemented the same "no handle wired → run the plugin write directly; handle wired → route through `handle.mutate` / `handle.remove`" guard, varying only in the id-key name. `@checkstack/automation-backend` now exports `withEntityWrite` / `withEntityRemove` (from the entity barrel) and each domain's thin, well-named wrappers (`writeIncidentEntity`, `writeMaintenanceEntity`, satellite's `mirror`, …) delegate to it, so the branch lives in exactly one place. Behavior is unchanged.
36
+
37
+ `writeHealthEntity` (healthcheck-backend) is intentionally NOT migrated onto the helper — it is genuinely bespoke (closure-captured durable state, distinct rethrow-vs-fail-soft branches, a per-system serializer, and it returns the computed state). SLO keeps its fail-soft `onError` wrapper around the shared guard.
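The guard described in this entry can be sketched in a few lines. This is not the code exported by `@checkstack/automation-backend`. The `EntityHandleSketch` interface, the argument names, and the assumption that `mutate` resolves to the value returned by `apply` are illustrative. Only the `handle.mutate({ id, apply })` call shape and the "no handle: write directly" branch come from the changelog text.

```typescript
// Assumed minimal handle surface; only `mutate({ id, apply })` is named in the changelog.
interface EntityHandleSketch<T> {
  mutate(args: { id: string; apply: () => Promise<T> }): Promise<T>;
}

// Sketch of the shared guard: without a handle, run the plugin write directly;
// with a handle, route the same write through `handle.mutate`.
async function withEntityWriteSketch<T>(args: {
  handle: EntityHandleSketch<T> | undefined;
  id: string;
  write: () => Promise<T>;
}): Promise<T> {
  const { handle, id, write } = args;
  if (!handle) return write();
  return handle.mutate({ id, apply: write });
}

// Sketch of a fail-soft wrapper in the spirit of SLO's `onError`: a failure is
// reported to the callback and swallowed instead of reaching the caller.
async function failSoftWriteSketch<T>(args: {
  handle: EntityHandleSketch<T> | undefined;
  id: string;
  write: () => Promise<T>;
  onError: (error: unknown) => void;
}): Promise<T | undefined> {
  try {
    return await withEntityWriteSketch(args);
  } catch (error) {
    args.onError(error);
    return undefined;
  }
}
```

Keeping the branch in one helper means each domain wrapper only supplies its id and its write, which is the deduplication the entry describes.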
38
+
39
+ - Updated dependencies [270ef29]
40
+ - Updated dependencies [b995afb]
41
+ - Updated dependencies [b995afb]
42
+ - Updated dependencies [b995afb]
43
+ - Updated dependencies [270ef29]
44
+ - Updated dependencies [270ef29]
45
+ - Updated dependencies [270ef29]
46
+ - Updated dependencies [270ef29]
47
+ - Updated dependencies [270ef29]
48
+ - Updated dependencies [270ef29]
49
+ - Updated dependencies [270ef29]
50
+ - Updated dependencies [270ef29]
51
+ - Updated dependencies [270ef29]
52
+ - Updated dependencies [b995afb]
53
+ - Updated dependencies [b995afb]
54
+ - Updated dependencies [b995afb]
55
+ - Updated dependencies [b995afb]
56
+ - Updated dependencies [270ef29]
57
+ - Updated dependencies [b995afb]
58
+ - Updated dependencies [270ef29]
59
+ - Updated dependencies [b995afb]
60
+ - Updated dependencies [b995afb]
61
+ - Updated dependencies [270ef29]
62
+ - Updated dependencies [b995afb]
63
+ - Updated dependencies [b995afb]
64
+ - Updated dependencies [270ef29]
65
+ - Updated dependencies [b995afb]
66
+ - Updated dependencies [b995afb]
67
+ - Updated dependencies [b995afb]
68
+ - Updated dependencies [b995afb]
69
+ - Updated dependencies [b995afb]
70
+ - Updated dependencies [b995afb]
71
+ - Updated dependencies [b995afb]
72
+ - Updated dependencies [270ef29]
73
+ - Updated dependencies [270ef29]
74
+ - Updated dependencies [270ef29]
75
+ - Updated dependencies [270ef29]
76
+ - Updated dependencies [270ef29]
77
+ - Updated dependencies [270ef29]
78
+ - Updated dependencies [270ef29]
79
+ - Updated dependencies [270ef29]
80
+ - Updated dependencies [b995afb]
81
+ - Updated dependencies [b995afb]
82
+ - @checkstack/backend-api@0.19.0
83
+ - @checkstack/automation-backend@0.3.0
84
+ - @checkstack/gitops-common@0.5.0
85
+ - @checkstack/gitops-backend@0.4.0
86
+ - @checkstack/healthcheck-backend@1.4.0
87
+ - @checkstack/healthcheck-common@1.4.0
88
+ - @checkstack/catalog-backend@1.3.0
89
+ - @checkstack/cache-api@0.3.7
90
+ - @checkstack/command-backend@0.1.32
91
+ - @checkstack/queue-api@0.3.7
92
+ - @checkstack/cache-utils@0.2.12
93
+
94
+ ## 0.5.0
95
+
96
+ ### Minor Changes
97
+
98
+ - 41c77f4: feat(automation): type enum-able trigger/artifact fields as enums for editor value autocompletion
99
+
100
+ The automation editor's staged completion offers concrete values after a
101
+ comparator (`{{ trigger.payload.severity == "high" }}`) only when the
102
+ field's JSON Schema carries an `enum`. Several trigger payload + artifact
103
+ schemas declared closed-set fields as loose `z.string()`, so no values
104
+ were suggested. Tightened them to the canonical enums that already
105
+ existed in each plugin's `-common` package (and matched the hook payload
106
+ types in lockstep so the trigger's `payloadSchema` and `hook` keep the
107
+ same `TPayload`):
108
+
109
+ - **incident** — trigger payloads: `severity` → `IncidentSeverityEnum`,
110
+ `status` / `statusChange` → `IncidentStatusEnum`.
111
+ - **healthcheck** — trigger payloads: `previousStatus` / `newStatus` /
112
+ `status` → `HealthCheckStatusSchema` (across systemDegraded,
113
+ systemHealthy, systemHealthChanged, checkFailed; plus checkCompleted's
114
+ hook type).
115
+ - **dependency** — trigger + artifact: `impactType` → `ImpactTypeSchema`;
116
+ impactPropagated `previousState` / `newState` → `DerivedStateSchema`.
117
+ Also deduped the inline `impactTypeSchema` action-config enum to reuse
118
+ the canonical `ImpactTypeSchema`.
119
+ - **maintenance** — trigger + artifact: `status` →
120
+ `MaintenanceStatusEnum`; deduped the inline `maintenanceStatusEnum`
121
+ (used by `add_update.statusChange`) to the canonical one.
122
+ - **slo** — `achievement.unlocked` trigger + hook: `achievement` →
123
+ `AchievementTypeSchema`.
124
+
125
+ Runtime behaviour is unchanged — these fields always carried valid enum
126
+ values (the underlying records are enum-constrained); only the schema
127
+ types were loose. The hook payload generics are now precise too, which
128
+ caught one stale test fixture asserting an invalid `impactType: "soft"`.
129
+
130
+ Fields that look enum-ish but are genuinely free-form were intentionally
131
+ left as `z.string()`: satellite `region` (user-entered), Jira issue
132
+ `status` (per-instance workflow name), notification `strategyQualifiedId`
133
+ / `errorMessage`, healthcheck collector `result`, and script
134
+ `stdout` / `stderr`.
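The editor behaviour this entry relies on (values are offered only when the field's JSON Schema carries an `enum`) can be sketched without any schema library. The `StringFieldSchema` type and `suggestValuesSketch` function are hypothetical, and the severity values other than `"high"` are placeholders rather than the real members of `IncidentSeverityEnum`.

```typescript
// Minimal JSON Schema subset, enough to model a string field.
interface StringFieldSchema {
  type: "string";
  enum?: readonly string[];
}

// Sketch: concrete value suggestions exist only when the schema lists an enum.
function suggestValuesSketch(
  field: StringFieldSchema,
  typedPrefix: string,
): string[] {
  if (!field.enum) return []; // loose string: nothing to offer
  return field.enum.filter((value) => value.startsWith(typedPrefix));
}

// Before the change: a loose string field, so no suggestions.
const looseSeverity: StringFieldSchema = { type: "string" };

// After the change: a closed set. These members are illustrative only.
const enumSeverity: StringFieldSchema = {
  type: "string",
  enum: ["low", "medium", "high"],
};
```

This also shows why genuinely free-form fields were left loose: adding an `enum` to them would make the editor suggest a closed set that does not exist.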
135
+
136
+ ### Patch Changes
137
+
138
+ - 41c77f4: feat(automation): one-time migration of webhook subscriptions + remove legacy integration backend
139
+
140
+ **BREAKING CHANGES** (platform is in BETA — no major bump):
141
+
142
+ - `IntegrationProvider` no longer carries `config` (subscription
143
+ config) or `deliver`. The interface now models a connection provider
144
+ only: connection schema + `getConnectionOptions` + `testConnection`.
145
+ - The legacy subscription / delivery-log / event endpoints
146
+ (`listSubscriptions`, `createSubscription`, `getDeliveryLogs`,
147
+ `listEventTypes`, …) are removed from `integrationContract`.
148
+ - `delivery-coordinator`, `hook-subscriber`, `event-registry`, and the
149
+ `integrationEventExtensionPoint` are deleted. Plugins that
150
+ previously called `integrationEvents.registerEvent(...)` now
151
+ register their hooks as automation triggers via
152
+ `automationTriggerExtensionPoint.registerTrigger(...)`.
153
+ - Frontend pages `IntegrationsPage` and `DeliveryLogsPage` are gone;
154
+ the integration plugin's only remaining UI is connection
155
+ management. Subscription management lives under `/automation/...`.
156
+ - `webhook_subscriptions` and `delivery_logs` tables stay in the
157
+ database for one release as a safety net (no code reads or writes
158
+ them), and will be dropped in a follow-up migration.
159
+
160
+ **New**:
161
+
162
+ - `jira.create_issue`, `teams.post_message`, `webex.post_message`,
163
+ `webhook.send`, `integration-script.run_shell`, and
164
+ `integration-script.run_script` actions registered against the
165
+ Automation Platform with matching `*.message`, `*.delivery`,
166
+ `shell.result`, and `script.result` artifact types. The script
167
+ plugin exposes **two** actions — `run_shell` runs bash via the
168
+ shared `ShellScriptRunner` (Monaco `shell` editor), `run_script`
169
+ runs an ESM module in a Bun subprocess via `EsmScriptRunner`
170
+ (Monaco `typescript` editor + `defineIntegration` helper) — to
171
+ preserve the legacy provider split. `jira.create_issue` keeps the
172
+ dynamic field-mapping dropdown (driven by
173
+ `JIRA_RESOLVERS.FIELD_OPTIONS`).
174
+ - One-time data migration runs on boot in
175
+ `automation-backend.afterPluginsReady`. It reads
176
+ `webhook_subscriptions` via a new service RPC
177
+ `IntegrationApi.listLegacySubscriptions`, translates each row into
178
+ a single-trigger / single-action automation (marked with
179
+ `managed_by = "migrated-subscription:<id>"`), and is idempotent
180
+ across restarts.
181
+ - Failed translations are recorded in a new
182
+ `automation_migration_failures` table and surfaced via
183
+ `AutomationApi.listMigrationFailures` /
184
+ `acknowledgeMigrationFailure` so admins can review and re-create
185
+ failed entries by hand.
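The idempotency and failure-recording behaviour described in the list above can be sketched as follows. Only the `migrated-subscription:<id>` marker format comes from the changelog. The function name, the row and result shapes, and the in-memory `Set` standing in for the database lookup are assumptions, and the real migration in `automation-backend` is not shown in this diff.

```typescript
// Hypothetical result shapes for the sketch.
interface CreatedAutomationSketch {
  managedBy: string;
  automation: string;
}
interface MigrationFailureSketch {
  id: string;
  reason: string;
}

// Sketch: translate each legacy row once, keyed by its `managed_by` marker.
// `existingMarkers` stands in for "automations already present in the database".
function migrateSubscriptionsSketch<Row extends { id: string }>(args: {
  rows: Row[];
  existingMarkers: Set<string>;
  translate: (row: Row) => string; // may throw for rows it cannot translate
}): { created: CreatedAutomationSketch[]; failures: MigrationFailureSketch[] } {
  const created: CreatedAutomationSketch[] = [];
  const failures: MigrationFailureSketch[] = [];
  for (const row of args.rows) {
    const managedBy = `migrated-subscription:${row.id}`;
    // Already migrated on an earlier boot: skip, which makes reruns no-ops.
    if (args.existingMarkers.has(managedBy)) continue;
    try {
      created.push({ managedBy, automation: args.translate(row) });
      args.existingMarkers.add(managedBy);
    } catch (error) {
      // Record the failure for manual review instead of aborting the run.
      const reason = error instanceof Error ? error.message : String(error);
      failures.push({ id: row.id, reason });
    }
  }
  return { created, failures };
}
```

In this sketch a failed row is not marked, so it would be retried on the next boot. Whether the real migration retries or relies solely on the `automation_migration_failures` table is not stated in the changelog.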
186
+
187
+ - Updated dependencies [e2d6f25]
188
+ - Updated dependencies [41c77f4]
189
+ - Updated dependencies [41c77f4]
190
+ - Updated dependencies [e1a2077]
191
+ - Updated dependencies [41c77f4]
192
+ - Updated dependencies [41c77f4]
193
+ - Updated dependencies [41c77f4]
194
+ - Updated dependencies [41c77f4]
195
+ - Updated dependencies [41c77f4]
196
+ - Updated dependencies [41c77f4]
197
+ - Updated dependencies [41c77f4]
198
+ - Updated dependencies [6d52276]
199
+ - Updated dependencies [6d52276]
200
+ - Updated dependencies [35bc682]
201
+ - @checkstack/automation-backend@0.2.0
202
+ - @checkstack/healthcheck-backend@1.3.0
203
+ - @checkstack/catalog-backend@1.2.0
204
+ - @checkstack/common@0.12.0
205
+ - @checkstack/backend-api@0.18.0
206
+ - @checkstack/healthcheck-common@1.3.0
207
+ - @checkstack/catalog-common@2.2.3
208
+ - @checkstack/dependency-common@1.1.3
209
+ - @checkstack/slo-common@0.4.2
210
+ - @checkstack/command-backend@0.1.31
211
+ - @checkstack/gitops-backend@0.3.7
212
+ - @checkstack/gitops-common@0.4.2
213
+ - @checkstack/signal-common@0.2.5
214
+ - @checkstack/cache-api@0.3.6
215
+ - @checkstack/queue-api@0.3.6
216
+ - @checkstack/cache-utils@0.2.11
217
+
3
218
  ## 0.4.6
4
219
 
5
220
  ### Patch Changes
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@checkstack/slo-backend",
3
- "version": "0.4.6",
3
+ "version": "0.6.0",
4
4
  "license": "Elastic-2.0",
5
5
  "type": "module",
6
6
  "main": "src/index.ts",
@@ -14,31 +14,30 @@
14
14
  "lint:code": "eslint . --max-warnings 0"
15
15
  },
16
16
  "dependencies": {
17
- "@checkstack/backend-api": "0.17.0",
18
- "@checkstack/cache-api": "0.3.4",
19
- "@checkstack/cache-utils": "0.2.9",
20
- "@checkstack/slo-common": "0.4.1",
21
- "@checkstack/healthcheck-common": "1.1.2",
22
- "@checkstack/healthcheck-backend": "1.1.4",
23
- "@checkstack/dependency-common": "1.1.2",
24
- "@checkstack/catalog-common": "2.2.2",
25
- "@checkstack/catalog-backend": "1.1.5",
26
- "@checkstack/command-backend": "0.1.29",
27
- "@checkstack/signal-common": "0.2.4",
28
- "@checkstack/integration-backend": "0.1.29",
29
- "@checkstack/integration-common": "0.5.0",
30
- "@checkstack/gitops-backend": "0.3.5",
31
- "@checkstack/gitops-common": "0.4.1",
32
- "@checkstack/common": "0.11.0",
33
- "@checkstack/queue-api": "0.3.4",
17
+ "@checkstack/backend-api": "0.18.0",
18
+ "@checkstack/cache-api": "0.3.6",
19
+ "@checkstack/cache-utils": "0.2.11",
20
+ "@checkstack/slo-common": "0.4.2",
21
+ "@checkstack/healthcheck-common": "1.3.0",
22
+ "@checkstack/healthcheck-backend": "1.3.0",
23
+ "@checkstack/dependency-common": "1.1.3",
24
+ "@checkstack/catalog-common": "2.2.3",
25
+ "@checkstack/catalog-backend": "1.2.0",
26
+ "@checkstack/command-backend": "0.1.31",
27
+ "@checkstack/signal-common": "0.2.5",
28
+ "@checkstack/automation-backend": "0.2.0",
29
+ "@checkstack/gitops-backend": "0.3.7",
30
+ "@checkstack/gitops-common": "0.4.2",
31
+ "@checkstack/common": "0.12.0",
32
+ "@checkstack/queue-api": "0.3.6",
34
33
  "drizzle-orm": "^0.45.0",
35
34
  "zod": "^4.2.1",
36
35
  "@orpc/server": "^1.13.2"
37
36
  },
38
37
  "devDependencies": {
39
38
  "@checkstack/drizzle-helper": "0.0.5",
40
- "@checkstack/scripts": "0.3.3",
41
- "@checkstack/test-utils-backend": "0.1.29",
39
+ "@checkstack/scripts": "0.3.4",
40
+ "@checkstack/test-utils-backend": "0.1.31",
42
41
  "@checkstack/tsconfig": "0.0.7",
43
42
  "@types/bun": "^1.0.0",
44
43
  "drizzle-kit": "^0.31.10",
package/src/hooks.ts CHANGED
@@ -1,4 +1,5 @@
1
1
  import { createHook } from "@checkstack/backend-api";
2
+ import type { AchievementType } from "@checkstack/slo-common";
2
3
 
3
4
  /**
4
5
  * SLO hooks for cross-plugin communication.
@@ -6,51 +7,12 @@ import { createHook } from "@checkstack/backend-api";
6
7
  * Registered as integration events so they flow through configured notification channels.
7
8
  */
8
9
  export const sloHooks = {
9
- /**
10
- * Emitted when an SLO's error budget consumption exceeds the warning threshold.
11
- */
12
- sloBudgetWarning: createHook<{
13
- systemId: string;
14
- objectiveId: string;
15
- target: number;
16
- budgetRemainingPercent: number;
17
- }>("slo.budget.warning"),
18
-
19
- /**
20
- * Emitted when an SLO's error budget consumption exceeds the critical threshold.
21
- */
22
- sloBudgetCritical: createHook<{
23
- systemId: string;
24
- objectiveId: string;
25
- target: number;
26
- budgetRemainingPercent: number;
27
- }>("slo.budget.critical"),
28
-
29
- /**
30
- * Emitted when an SLO's error budget is fully exhausted.
31
- */
32
- sloBudgetExhausted: createHook<{
33
- systemId: string;
34
- objectiveId: string;
35
- target: number;
36
- }>("slo.budget.exhausted"),
37
-
38
- /**
39
- * Emitted when a reliability streak is broken.
40
- */
41
- sloStreakBroken: createHook<{
42
- systemId: string;
43
- objectiveId: string;
44
- streak: number;
45
- bestStreak: number;
46
- }>("slo.streak.broken"),
47
-
48
10
  /**
49
11
  * Emitted when a system unlocks a new reliability achievement.
50
12
  */
51
13
  sloAchievementUnlocked: createHook<{
52
14
  systemId: string;
53
- achievement: string;
15
+ achievement: AchievementType;
54
16
  }>("slo.achievement.unlocked"),
55
17
 
56
18
  /**
package/src/index.ts CHANGED
@@ -7,20 +7,37 @@ import {
7
7
  pluginMetadata,
8
8
  sloContract,
9
9
  sloRoutes,
10
+ AchievementTypeSchema,
10
11
  } from "@checkstack/slo-common";
11
12
  import { createBackendPlugin, coreServices } from "@checkstack/backend-api";
12
- import { integrationEventExtensionPoint } from "@checkstack/integration-backend";
13
+ import {
14
+ automationTriggerExtensionPoint,
15
+ entityExtensionPoint,
16
+ type EntityHandle,
17
+ } from "@checkstack/automation-backend";
13
18
  import { SloService } from "./service";
14
19
  import { SloEngine } from "./slo-engine";
15
20
  import { createRouter } from "./router";
16
21
  import { createSloCache } from "./cache";
17
22
  import { DependencyApi } from "@checkstack/dependency-common";
18
23
  import { HealthCheckApi } from "@checkstack/healthcheck-common";
19
- import { catalogHooks } from "@checkstack/catalog-backend";
20
- import { healthCheckHooks } from "@checkstack/healthcheck-backend";
24
+ import {
25
+ CATALOG_SYSTEM_ENTITY_KIND,
26
+ } from "@checkstack/catalog-backend";
27
+ import {
28
+ HEALTH_ENTITY_KIND,
29
+ classifyHealthChange,
30
+ } from "@checkstack/healthcheck-backend";
21
31
  import { registerSearchProvider } from "@checkstack/command-backend";
22
32
  import { resolveRoute } from "@checkstack/common";
23
33
  import { sloHooks } from "./hooks";
34
+ import {
35
+ SLO_ENTITY_KIND,
36
+ SloEntityStateSchema,
37
+ createSloEntityRead,
38
+ deriveSloTriggerEvents,
39
+ type SloEntityState,
40
+ } from "./slo-entity";
24
41
  import { setupDailySnapshotJob } from "./streak-calculator";
25
42
  import { setupWeeklyDigestJob } from "./weekly-digest";
26
43
  import { evaluateAchievements } from "./achievement-evaluator";
@@ -31,36 +48,17 @@ import { registerSloGitOpsKinds } from "./slo-gitops-kinds";
31
48
  // Integration Event Payload Schemas
32
49
  // =============================================================================
33
50
 
34
- const sloBudgetWarningPayloadSchema = z.object({
35
- systemId: z.string(),
36
- objectiveId: z.string(),
37
- target: z.number(),
38
- budgetRemainingPercent: z.number(),
39
- });
40
-
41
- const sloBudgetCriticalPayloadSchema = z.object({
42
- systemId: z.string(),
43
- objectiveId: z.string(),
44
- target: z.number(),
45
- budgetRemainingPercent: z.number(),
46
- });
47
-
48
- const sloBudgetExhaustedPayloadSchema = z.object({
49
- systemId: z.string(),
50
- objectiveId: z.string(),
51
- target: z.number(),
52
- });
53
-
54
- const sloStreakBrokenPayloadSchema = z.object({
55
- systemId: z.string(),
56
- objectiveId: z.string(),
57
- streak: z.number(),
58
- bestStreak: z.number(),
59
- });
51
+ // NOTE: The `budget.warning` / `.critical` / `.exhausted` and
52
+ // `streak.broken` trigger payload schemas were removed (§9.2). Those four
53
+ // thresholds are now authored as reactive `numeric_state` conditions over
54
+ // the `slo` entity's `budgetRemainingPercent` / `currentStreak`, not as
55
+ // pre-baked event triggers. The hooks they fronted were never emitted by
56
+ // the engine (inert), so removing the trigger registrations is behavior-
57
+ // preserving.
60
58
 
61
59
  const sloAchievementUnlockedPayloadSchema = z.object({
62
60
  systemId: z.string(),
63
- achievement: z.string(),
61
+ achievement: AchievementTypeSchema,
64
62
  });
65
63
 
66
64
  const sloWeeklyDigestPayloadSchema = z.object({
@@ -88,82 +86,100 @@ const sloWeeklyDigestPayloadSchema = z.object({
88
86
  // Plugin Definition
89
87
  // =============================================================================
90
88
 
89
+ // Reactive `slo` entity handle (§10.7). Defined in register() via the
90
+ // entity extension point; mutated from the daily snapshot job onward.
91
+ let sloEntity: EntityHandle<SloEntityState> | undefined;
92
+
93
+ // The SLO service + engine are created in afterPluginsReady (they need the
94
+ // resolved database + RPC clients), but the PLUGIN-BACKED + COMPUTED entity
95
+ // `read` accessor must be supplied at `defineEntity` time in register(). These
96
+ // holders bridge the two: the `read` closure resolves them lazily, and
97
+ // afterPluginsReady sets them before any mutation runs (the daily job — the
98
+ // only mutation site — runs from afterPluginsReady onward).
99
+ let sloEntityServiceRef: SloService | undefined;
100
+ let sloEntityEngineRef: SloEngine | undefined;
101
+
91
102
  export default createBackendPlugin({
92
103
  metadata: pluginMetadata,
93
104
  register(env) {
94
105
  env.registerAccessRules(sloAccessRules);
95
106
 
96
- // Register hooks as integration events
97
- const integrationEvents = env.getExtensionPoint(
98
- integrationEventExtensionPoint,
107
+ // Register hooks as automation triggers
108
+ const automationTriggers = env.getExtensionPoint(
109
+ automationTriggerExtensionPoint,
99
110
  );
100
111
 
101
- integrationEvents.registerEvent(
102
- {
103
- hook: sloHooks.sloBudgetWarning,
104
- displayName: "SLO Budget Warning",
105
- description:
106
- "Fired when an SLO error budget consumption exceeds the warning threshold",
107
- category: "SLO",
108
- payloadSchema: sloBudgetWarningPayloadSchema,
112
+ // ─── Reactive `slo` entity (§10.7, §9.2) ───────────────────────────
113
+ // The SLO budget IS the entity. The former `budget.warning/.critical/
114
+ // .exhausted` + `streak.broken` triggers are removed — those thresholds
115
+ // are now authored as `numeric_state` conditions over
116
+ // `state.slo.<objectiveId>.budgetRemainingPercent` / `currentStreak`.
117
+ // The deriver fires no legacy events; it exists so `slo` is a known
118
+ // reactive kind (scope + wake resolution).
119
+ //
120
+ // PLUGIN-BACKED + COMPUTED (Model B): there is NO framework `entity_state`
121
+ // row. `read` assembles each objective's view by reading `slo_streaks` +
122
+ // `slo_objectives` and COMPUTING `budgetRemainingPercent` via the engine
123
+ // (see `createSloEntityRead`). No `indexes` — those only apply to
124
+ // store-backed kinds, and a plugin-backed kind keeps its state in its own
125
+ // tables. The `read` closure resolves the service + engine set by
126
+ // afterPluginsReady (the daily job is the only mutation site).
127
+ const entityPoint = env.getExtensionPoint(entityExtensionPoint);
128
+ sloEntity = entityPoint.defineEntity<SloEntityState>({
129
+ kind: SLO_ENTITY_KIND,
130
+ state: SloEntityStateSchema,
131
+ read: (ids) => {
132
+ const service = sloEntityServiceRef;
133
+ const engine = sloEntityEngineRef;
134
+ if (!service || !engine) {
135
+ throw new Error(
136
+ "slo entity read before init: service/engine not yet resolved",
137
+ );
138
+ }
139
+ return createSloEntityRead({ service, engine })(ids);
109
140
  },
110
- pluginMetadata,
111
- );
112
-
113
- integrationEvents.registerEvent(
114
- {
115
- hook: sloHooks.sloBudgetCritical,
116
- displayName: "SLO Budget Critical",
117
- description:
118
- "Fired when an SLO error budget consumption exceeds the critical threshold",
119
- category: "SLO",
120
- payloadSchema: sloBudgetCriticalPayloadSchema,
121
- },
122
- pluginMetadata,
123
- );
124
-
125
- integrationEvents.registerEvent(
126
- {
127
- hook: sloHooks.sloBudgetExhausted,
128
- displayName: "SLO Budget Exhausted",
129
- description: "Fired when an SLO error budget is fully consumed",
130
- category: "SLO",
131
- payloadSchema: sloBudgetExhaustedPayloadSchema,
132
- },
133
- pluginMetadata,
134
- );
135
-
136
- integrationEvents.registerEvent(
137
- {
138
- hook: sloHooks.sloStreakBroken,
139
- displayName: "SLO Streak Broken",
140
- description: "Fired when a reliability streak is broken",
141
- category: "SLO",
142
- payloadSchema: sloStreakBrokenPayloadSchema,
143
- },
144
- pluginMetadata,
145
- );
141
+ });
142
+ entityPoint.registerChangeDeriver({
143
+ kind: SLO_ENTITY_KIND,
144
+ derive: deriveSloTriggerEvents,
145
+ });
146
+ // Event-sourced history is NOT the live entity (§5): downtime events +
147
+ // daily snapshots are append-only records, the budget/streak is the
148
+ // reactive entity.
149
+ entityPoint.declareNonReactiveState({
150
+ table: "slo_downtime_events",
151
+ reason: "bookkeeping",
152
+ note: "Append-only downtime history. The live budget/streak is the `slo` entity.",
153
+ });
154
+ entityPoint.declareNonReactiveState({
155
+ table: "slo_daily_snapshots",
156
+ reason: "bookkeeping",
157
+ note: "Append-only daily trend snapshots. The live budget/streak is the `slo` entity.",
158
+ });
146
159
 
147
- integrationEvents.registerEvent(
160
+ automationTriggers.registerTrigger(
148
161
  {
149
- hook: sloHooks.sloAchievementUnlocked,
162
+ id: "achievement.unlocked",
150
163
  displayName: "SLO Achievement Unlocked",
151
164
  description:
152
165
  "Fired when a system unlocks a new reliability achievement",
153
166
  category: "SLO",
154
167
  payloadSchema: sloAchievementUnlockedPayloadSchema,
168
+ hook: sloHooks.sloAchievementUnlocked,
169
+ contextKey: (p) => p.systemId,
155
170
  },
156
171
  pluginMetadata,
157
172
  );
158
173
 
159
- integrationEvents.registerEvent(
174
+ automationTriggers.registerTrigger(
160
175
  {
161
- hook: sloHooks.sloWeeklyDigest,
176
+ id: "weekly.digest",
162
177
  displayName: "SLO Weekly Digest",
163
178
  description:
164
179
  "Weekly summary of SLO performance across all systems (Monday 09:00 UTC)",
165
180
  category: "SLO",
166
181
  payloadSchema: sloWeeklyDigestPayloadSchema,
182
+ hook: sloHooks.sloWeeklyDigest,
167
183
  },
168
184
  pluginMetadata,
169
185
  );
@@ -171,6 +187,8 @@ export default createBackendPlugin({
171
187
  // Shared references across init/afterPluginsReady (maintenance-backend pattern)
172
188
  let sharedEngine: SloEngine;
173
189
  let gitopsService: SloService | undefined;
190
+ // Reactive `slo` entity handle (§10.7), defined just above in register().
191
+ const onEntityChanged = entityPoint.onEntityChanged;
174
192
 
175
193
  // ─── GitOps Entity Kind Registration ─────────────────────────────
176
194
  const kindRegistry = env.getExtensionPoint(entityKindExtensionPoint);
@@ -252,7 +270,6 @@ export default createBackendPlugin({
252
270
  afterPluginsReady: async ({
253
271
  database,
254
272
  logger,
255
- onHook,
256
273
  emitHook,
257
274
  rpcClient,
258
275
  signalService,
@@ -265,6 +282,12 @@ export default createBackendPlugin({
265
282
  signalService,
266
283
  logger,
267
284
  });
285
+ // Publish the service + engine for the PLUGIN-BACKED + COMPUTED entity
286
+ // `read` accessor (defined in register()). The daily snapshot job — the
287
+ // only `slo` mutation site — runs from here onward, so the refs are set
288
+ // before any `read`/`mutate` can fire.
289
+ sloEntityServiceRef = service;
290
+ sloEntityEngineRef = engine;
268
291
 
269
292
  const dependencyClient = rpcClient.forPlugin(DependencyApi);
270
293
  const healthCheckClient = rpcClient.forPlugin(HealthCheckApi);
@@ -333,41 +356,52 @@ export default createBackendPlugin({
333
356
  }
334
357
  };
335
358
 
359
+ // Cross-plugin consumers now react to the reactive `health` /
360
+ // `catalog-system` ENTITY changes via `onEntityChanged` instead of
361
+ // the (being-removed) directional hooks (§10.7). `classifyHealthChange`
362
+ // reproduces the exact degraded/recovered transition predicate the
363
+ // old `systemDegraded` / `systemHealthy` hooks fired on. Each
364
+ // consumer keeps `work-queue` delivery with its original
365
+ // `workerGroup`: these are side-effecting writes (open/close downtime,
366
+ // achievements, cleanup) that must run exactly once per cluster — not
367
+ // per-instance — so broadcast would double-apply them.
368
+
336
369
  // =====================================================================
337
370
  // Perspective 1: System goes DOWN — open downtime events
338
371
  // =====================================================================
339
- onHook(
340
- healthCheckHooks.systemDegraded,
341
- async (payload) => {
372
+ onEntityChanged({
373
+ kind: HEALTH_ENTITY_KIND,
374
+ handler: async (change) => {
375
+ const { systemId, degraded, previousStatus, newStatus } =
376
+ classifyHealthChange(change);
377
+ if (!degraded) return;
342
378
  logger.debug(
343
- `SLO: System ${payload.systemId} degraded (${payload.previousStatus} → ${payload.newStatus})`,
379
+ `SLO: System ${systemId} degraded (${previousStatus} → ${newStatus})`,
344
380
  );
345
381
  await engine.handleSystemDown({
346
- systemId: payload.systemId,
382
+ systemId,
347
383
  getUpstreamHealthStatus,
348
384
  });
349
385
  },
350
- { mode: "work-queue", workerGroup: "slo-system-down" },
351
- );
386
+ delivery: { mode: "work-queue", workerGroup: "slo-system-down" },
387
+ });
352
388
 
353
389
  // =====================================================================
354
390
  // Perspective 1: System goes UP — close downtime events
355
391
  // =====================================================================
356
- onHook(
357
- healthCheckHooks.systemHealthy,
358
- async (payload) => {
359
- logger.debug(`SLO: System ${payload.systemId} recovered`);
360
- await engine.handleSystemUp({
361
- systemId: payload.systemId,
362
- });
392
+ onEntityChanged({
393
+ kind: HEALTH_ENTITY_KIND,
394
+ handler: async (change) => {
395
+ const { systemId, recovered } = classifyHealthChange(change);
396
+ if (!recovered) return;
397
+ logger.debug(`SLO: System ${systemId} recovered`);
398
+ await engine.handleSystemUp({ systemId });
363
399
 
364
400
  // Also handle Perspective 2 (as upstream)
365
- const downstreamIds = await getDownstreamSystemIds(
366
- payload.systemId,
367
- );
401
+ const downstreamIds = await getDownstreamSystemIds(systemId);
368
402
  if (downstreamIds.length > 0) {
369
403
  await engine.handleUpstreamUp({
370
- upstreamSystemId: payload.systemId,
404
+ upstreamSystemId: systemId,
371
405
  downstreamSystemIds: downstreamIds,
372
406
  getUpstreamHealthStatus,
373
407
  });
@@ -375,54 +409,53 @@ export default createBackendPlugin({
375
409
 
376
410
  // Evaluate achievements on recovery (rapid_recovery, clean_sheet, etc.)
377
411
  await evaluateAchievements({
378
- systemId: payload.systemId,
412
+ systemId,
379
413
  service,
380
414
  engine,
381
415
  logger,
382
416
  });
383
417
  },
384
- { mode: "work-queue", workerGroup: "slo-system-up" },
385
- );
418
+ delivery: { mode: "work-queue", workerGroup: "slo-system-up" },
419
+ });
386
420
 
387
421
  // =====================================================================
388
422
  // Perspective 2: Upstream degraded — split downstream "self" events
389
- // We re-use the systemDegraded hook, checking downstream systems
423
+ // We re-use the degraded transition, checking downstream systems
390
424
  // =====================================================================
391
- onHook(
392
- healthCheckHooks.systemDegraded,
393
- async (payload) => {
394
- const downstreamIds = await getDownstreamSystemIds(
395
- payload.systemId,
396
- );
425
+ onEntityChanged({
426
+ kind: HEALTH_ENTITY_KIND,
427
+ handler: async (change) => {
428
+ const { systemId, degraded } = classifyHealthChange(change);
429
+ if (!degraded) return;
430
+ const downstreamIds = await getDownstreamSystemIds(systemId);
397
431
  if (downstreamIds.length > 0) {
398
432
  await engine.handleUpstreamDown({
399
- upstreamSystemId: payload.systemId,
400
- upstreamSystemName: payload.systemName ?? payload.systemId,
433
+ upstreamSystemId: systemId,
434
+ upstreamSystemName: systemId,
401
435
  downstreamSystemIds: downstreamIds,
402
436
  });
403
437
  }
404
438
  },
405
- { mode: "work-queue", workerGroup: "slo-upstream-down" },
406
- );
439
+ delivery: { mode: "work-queue", workerGroup: "slo-upstream-down" },
440
+ });
407
441
 
408
442
  // =====================================================================
409
- // Subscribe to catalog system deletion for cleanup
443
+ // Subscribe to catalog system deletion (tombstone) for cleanup
410
444
  // =====================================================================
411
- onHook(
412
- catalogHooks.systemDeleted,
413
- async (payload) => {
445
+ onEntityChanged({
446
+ kind: CATALOG_SYSTEM_ENTITY_KIND,
447
+ handler: async (change) => {
448
+ // Only react to a tombstone (delete), not create/update.
449
+ if (change.next !== null) return;
450
+ const systemId = change.id;
414
451
  logger.debug(
415
- `Cleaning up SLO data for deleted system: ${payload.systemId}`,
452
+ `Cleaning up SLO data for deleted system: ${systemId}`,
416
453
  );
417
- await service.deleteObjectivesForSystem({
418
- systemId: payload.systemId,
419
- });
420
- await service.deleteAchievementsForSystem({
421
- systemId: payload.systemId,
422
- });
454
+ await service.deleteObjectivesForSystem({ systemId });
455
+ await service.deleteAchievementsForSystem({ systemId });
423
456
  },
424
- { mode: "work-queue", workerGroup: "slo-system-cleanup" },
425
- );
457
+ delivery: { mode: "work-queue", workerGroup: "slo-system-cleanup" },
458
+ });
426
459
 
427
460
  // =====================================================================
428
461
  // Daily snapshot + streak calculation cron job
@@ -432,6 +465,7 @@ export default createBackendPlugin({
432
465
  engine,
433
466
  logger,
434
467
  queueManager,
468
+ getSloEntity: () => sloEntity,
435
469
  });
436
470
 
437
471
  // =====================================================================
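The catalog-deletion handler in the hunk above treats a change whose `next` is `null` as a tombstone and ignores everything else. A minimal sketch of that classification follows, assuming a simplified `{ id, prev, next }` change shape. The real `ENTITY_CHANGED` payload type is not shown in this diff, and `classifyEntityChange` is a hypothetical helper, not part of the package.

```typescript
// Hypothetical, simplified change shape; the real type exported by
// @checkstack/automation-backend is not visible in this diff.
interface EntityChange<T> {
  id: string;
  prev: T | null;
  next: T | null;
}

type ChangeKind = "created" | "updated" | "deleted" | "noop";

// Classify a change by which side of the transition is missing.
function classifyEntityChange<T>(change: EntityChange<T>): ChangeKind {
  if (change.prev === null && change.next === null) return "noop";
  if (change.next === null) return "deleted"; // tombstone
  if (change.prev === null) return "created";
  return "updated";
}
```

A cleanup subscriber would then return early unless the result is `"deleted"`, which is the same guard as `if (change.next !== null) return;` in the handler.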
@@ -0,0 +1,255 @@
1
+ import { describe, it, expect } from "bun:test";
2
+ import type { EntityHandle } from "@checkstack/automation-backend";
3
+
4
+ import {
5
+ SLO_ENTITY_KIND,
6
+ SloEntityStateSchema,
7
+ computeSloEntityState,
8
+ createSloEntityRead,
9
+ deriveSloTriggerEvents,
10
+ writeSloEntity,
11
+ type SloEntityState,
12
+ } from "./slo-entity";
13
+ import type { SloService } from "./service";
14
+ import type { SloEngine } from "./slo-engine";
15
+
16
+ describe("deriveSloTriggerEvents", () => {
17
+ it("fires no legacy trigger events (thresholds are numeric_state conditions, §9.2)", () => {
18
+ expect(
19
+ deriveSloTriggerEvents({
20
+ kind: SLO_ENTITY_KIND,
21
+ id: "obj-1",
22
+ prev: null,
23
+ next: {
24
+ objectiveId: "obj-1",
25
+ systemId: "sys-1",
26
+ target: 99.9,
27
+ budgetRemainingPercent: 10,
28
+ currentStreak: 0,
29
+ bestStreak: 5,
30
+ },
31
+ delta: {},
32
+ changedFields: [],
33
+ actor: { type: "system", id: "system" },
34
+ occurredAt: new Date().toISOString(),
35
+ }),
36
+ ).toEqual([]);
37
+ });
38
+ });
39
+
40
+ describe("SloEntityStateSchema", () => {
41
+ it("parses the reactive subset", () => {
42
+ const parsed = SloEntityStateSchema.parse({
43
+ objectiveId: "o",
44
+ systemId: "s",
45
+ target: 99.5,
46
+ budgetRemainingPercent: 42,
47
+ currentStreak: 3,
48
+ bestStreak: 9,
49
+ });
50
+ expect(parsed.budgetRemainingPercent).toBe(42);
51
+ });
52
+ });
53
+
54
+ // ─── Fakes ──────────────────────────────────────────────────────────────
55
+
56
+ function makeService(over: {
57
+ objective?: { id: string; systemId: string; target: number } | undefined;
58
+ streak?: { currentStreak: number; bestStreak: number } | undefined;
59
+ }): SloService {
60
+ return {
61
+ async getObjective() {
62
+ return over.objective;
63
+ },
64
+ async getStreak() {
65
+ return over.streak;
66
+ },
67
+ } as unknown as SloService;
68
+ }
69
+
70
+ function makeEngine(budgetRemainingPercent: number): SloEngine {
71
+ return {
72
+ async computeStatus() {
73
+ return { errorBudgetRemainingPercent: budgetRemainingPercent };
74
+ },
75
+ } as unknown as SloEngine;
76
+ }
77
+
78
+ describe("computeSloEntityState", () => {
79
+ it("assembles the view by reading streak/objective + COMPUTING budget", async () => {
80
+ const service = makeService({
81
+ objective: { id: "obj-1", systemId: "sys-1", target: 99.9 },
82
+ streak: { currentStreak: 4, bestStreak: 12 },
83
+ });
84
+ const engine = makeEngine(20);
85
+ const state = await computeSloEntityState({
86
+ service,
87
+ engine,
88
+ objectiveId: "obj-1",
89
+ });
90
+ expect(state).toEqual({
91
+ objectiveId: "obj-1",
92
+ systemId: "sys-1",
93
+ target: 99.9,
94
+ budgetRemainingPercent: 20,
95
+ currentStreak: 4,
96
+ bestStreak: 12,
97
+ });
98
+ });
99
+
100
+ it("defaults missing streak counters to 0", async () => {
101
+ const service = makeService({
102
+ objective: { id: "obj-2", systemId: "sys-2", target: 99 },
103
+ streak: undefined,
104
+ });
105
+ const state = await computeSloEntityState({
106
+ service,
107
+ engine: makeEngine(100),
108
+ objectiveId: "obj-2",
109
+ });
110
+ expect(state?.currentStreak).toBe(0);
111
+ expect(state?.bestStreak).toBe(0);
112
+ });
113
+
114
+ it("returns undefined when the objective no longer exists", async () => {
115
+ const service = makeService({ objective: undefined });
116
+ const state = await computeSloEntityState({
117
+ service,
118
+ engine: makeEngine(50),
119
+ objectiveId: "gone",
120
+ });
121
+ expect(state).toBeUndefined();
122
+ });
123
+ });
124
+
125
+ describe("createSloEntityRead", () => {
126
+ it("computes the view per id and omits missing objectives", async () => {
127
+ const service = {
128
+ async getObjective({ id }: { id: string }) {
129
+ if (id === "obj-1") return { id, systemId: "sys-1", target: 99.9 };
130
+ return undefined;
131
+ },
132
+ async getStreak() {
133
+ return { currentStreak: 2, bestStreak: 7 };
134
+ },
135
+ } as unknown as SloService;
136
+ const read = createSloEntityRead({ service, engine: makeEngine(33) });
137
+ const out = await read(["obj-1", "missing"]);
138
+ expect(Object.keys(out)).toEqual(["obj-1"]);
139
+ expect(out["obj-1"]).toEqual({
140
+ objectiveId: "obj-1",
141
+ systemId: "sys-1",
142
+ target: 99.9,
143
+ budgetRemainingPercent: 33,
144
+ currentStreak: 2,
145
+ bestStreak: 7,
146
+ });
147
+ });
148
+
149
+ it("returns {} for an empty id list without touching the service", async () => {
150
+ let called = false;
151
+ const service = {
152
+ async getObjective() {
153
+ called = true;
154
+ return undefined;
155
+ },
156
+ } as unknown as SloService;
157
+ const read = createSloEntityRead({ service, engine: makeEngine(0) });
158
+ expect(await read([])).toEqual({});
159
+ expect(called).toBe(false);
160
+ });
161
+ });
162
+
163
+ describe("writeSloEntity", () => {
164
+ it("drives the streak write through handle.mutate keyed by objectiveId", async () => {
165
+ const calls: Array<{ id: string; next: SloEntityState }> = [];
166
+ const handle = {
167
+ kind: SLO_ENTITY_KIND,
168
+ async mutate(input: {
169
+ id: string;
170
+ apply: () => Promise<SloEntityState>;
171
+ }) {
172
+ const next = await input.apply();
173
+ calls.push({ id: input.id, next });
174
+ return next;
175
+ },
176
+ } as unknown as EntityHandle<SloEntityState>;
177
+
178
+ let applied = false;
179
+ await writeSloEntity({
180
+ handle,
181
+ objectiveId: "obj-7",
182
+ apply: async () => {
183
+ applied = true;
184
+ return {
185
+ objectiveId: "obj-7",
186
+ systemId: "sys-7",
187
+ target: 99.9,
188
+ budgetRemainingPercent: 20,
189
+ currentStreak: 4,
190
+ bestStreak: 12,
191
+ };
192
+ },
193
+ });
194
+ expect(applied).toBe(true);
195
+ expect(calls).toEqual([
196
+ {
197
+ id: "obj-7",
198
+ next: {
199
+ objectiveId: "obj-7",
200
+ systemId: "sys-7",
201
+ target: 99.9,
202
+ budgetRemainingPercent: 20,
203
+ currentStreak: 4,
204
+ bestStreak: 12,
205
+ },
206
+ },
207
+ ]);
208
+ });
209
+
210
+ it("still runs the streak write when no handle is wired", async () => {
211
+ let applied = false;
212
+ await writeSloEntity({
213
+ handle: undefined,
214
+ objectiveId: "x",
215
+ apply: async () => {
216
+ applied = true;
217
+ return {
218
+ objectiveId: "x",
219
+ systemId: "x",
220
+ target: 1,
221
+ budgetRemainingPercent: 1,
222
+ currentStreak: 0,
223
+ bestStreak: 0,
224
+ };
225
+ },
226
+ });
227
+ expect(applied).toBe(true);
228
+ });
229
+
230
+ it("routes entity-layer errors to onError (fail-soft) without rethrowing", async () => {
231
+ let captured: unknown;
232
+ const handle = {
233
+ kind: SLO_ENTITY_KIND,
234
+ async mutate() {
235
+ throw new Error("nope");
236
+ },
237
+ } as unknown as EntityHandle<SloEntityState>;
238
+ await writeSloEntity({
239
+ handle,
240
+ objectiveId: "x",
241
+ apply: async () => ({
242
+ objectiveId: "x",
243
+ systemId: "x",
244
+ target: 1,
245
+ budgetRemainingPercent: 1,
246
+ currentStreak: 0,
247
+ bestStreak: 0,
248
+ }),
249
+ onError: (e) => {
250
+ captured = e;
251
+ },
252
+ });
253
+ expect((captured as Error).message).toBe("nope");
254
+ });
255
+ });
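The last three tests pin down the contract of `writeSloEntity`: `apply` always runs, and a failure in the entity layer is reported rather than rethrown. A minimal self-contained sketch of that fail-soft pattern follows. `MutateHandle` and `writeFailSoft` are hypothetical stand-ins for the real `EntityHandle` and `withEntityWrite`, whose full signatures are not shown in this diff.

```typescript
// Hypothetical stand-in for the framework handle; only `mutate` is modelled.
interface MutateHandle<T> {
  mutate(input: { id: string; apply: () => Promise<T> }): Promise<T>;
}

async function writeFailSoft<T>(args: {
  handle: MutateHandle<T> | undefined;
  id: string;
  apply: () => Promise<T>;
  onError?: (error: unknown) => void;
}): Promise<void> {
  const { handle, id, apply, onError } = args;
  if (!handle) {
    // No entity layer wired: the underlying write must still happen.
    await apply();
    return;
  }
  try {
    await handle.mutate({ id, apply });
  } catch (error) {
    // Report instead of rethrowing so the calling job keeps going.
    onError?.(error);
  }
}
```

In this sketch anything thrown inside `mutate` reaches `onError`, so a caller that needs to tell a failed write from a failed emit would have to track that inside `apply` itself.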
@@ -0,0 +1,162 @@
1
+ /**
2
+ * The reactive `slo` entity (reactive automation engine §10.7, §9.2).
3
+ *
4
+ * Model B PLUGIN-BACKED + COMPUTED entity. There is NO framework
5
+ * `entity_state` row for an SLO. The current state is assembled on demand by
6
+ * the `read` accessor from two sources:
7
+ *
8
+ * - `slo_streaks` + `slo_objectives` (authoritative tables) supply
9
+ * `currentStreak` / `bestStreak` / `systemId` / `target`, and
10
+ * - the SLO engine COMPUTES `budgetRemainingPercent` (and re-surfaces
11
+ * `target`) on the fly via `computeStatus` (downtime aggregation over the
12
+ * objective's window).
13
+ *
14
+ * The streak-persist site (the daily snapshot job) drives its write through
15
+ * `handle.mutate({ id: objectiveId, apply })`: `apply` persists the streak to
16
+ * `slo_streaks` (the plugin's own write) and returns the freshly-computed
17
+ * view. The framework snapshots `prev` via `read` BEFORE the write, appends
18
+ * the transition log, and emits `ENTITY_CHANGED`.
19
+ *
20
+ * Per §9.2 the SLO budget IS the entity, and the four removed threshold hooks
21
+ * (`budget.warning/critical/exhausted`, `streak.broken`) become derived
22
+ * `numeric_state` / `state` conditions over
23
+ * `state.slo.<objectiveId>.budgetRemainingPercent` + `currentStreak`. The
24
+ * change deriver therefore emits NO legacy trigger events — operators author
25
+ * thresholds as reactive conditions, not pre-baked event triggers.
26
+ */
27
+ import { z } from "zod";
28
+ import type {
29
+ EntityChangeDeriver,
30
+ EntityHandle,
31
+ EntityMutationOpts,
32
+ EntityRead,
33
+ } from "@checkstack/automation-backend";
34
+ import { withEntityWrite } from "@checkstack/automation-backend";
35
+
36
+ import type { SloService } from "./service";
37
+ import type { SloEngine } from "./slo-engine";
38
+
39
+ export const SLO_ENTITY_KIND = "slo";
40
+
41
+ export const SloEntityStateSchema = z.object({
42
+ objectiveId: z.string(),
43
+ systemId: z.string(),
44
+ target: z.number(),
45
+ budgetRemainingPercent: z.number(),
46
+ currentStreak: z.number().int().nonnegative(),
47
+ bestStreak: z.number().int().nonnegative(),
48
+ });
49
+
50
+ export type SloEntityState = z.infer<typeof SloEntityStateSchema>;
51
+
52
+ /**
53
+ * SLO change → trigger events. Intentionally empty: the threshold/streak
54
+ * hooks were removed (§9.2) and replaced by `numeric_state` / `state`
55
+ * conditions over the entity state, so a change fires no legacy event. The
56
+ * deriver is still registered so that `slo` stays a known reactive kind: its
57
+ * state can be resolved into automation scope for those conditions, and a
58
+ * change wakes suspended `wait_until`s whose condition reads `state.slo.*`.
59
+ */
60
+ export const deriveSloTriggerEvents: EntityChangeDeriver = () => [];
61
+
62
+ /**
63
+ * Compute the reactive `slo` view for a single objective: read the objective
64
+ * config + streak, compute the error-budget remaining via the engine, and
65
+ * assemble the `{ objectiveId, systemId, target, budgetRemainingPercent,
66
+ * currentStreak, bestStreak }` subset. Returns `undefined` when the objective
67
+ * no longer exists (missing ids are omitted from the batched `read`).
68
+ *
69
+ * Compute-on-read (not materialized): the budget is a pure function of the
70
+ * objective's append-only downtime history over its rolling window. Storing a
71
+ * second copy would duplicate the engine's source of truth and risk drift; a
72
+ * read recomputes from the same tables the API already reads. See the change
73
+ * doc for the cost assessment.
74
+ */
75
+ export async function computeSloEntityState(args: {
76
+ service: SloService;
77
+ engine: SloEngine;
78
+ objectiveId: string;
79
+ }): Promise<SloEntityState | undefined> {
80
+ const { service, engine, objectiveId } = args;
81
+ const objective = await service.getObjective({ id: objectiveId });
82
+ if (!objective) return undefined;
83
+
84
+ const [status, streak] = await Promise.all([
85
+ engine.computeStatus({ objective }),
86
+ service.getStreak({ objectiveId }),
87
+ ]);
88
+
89
+ return {
90
+ objectiveId,
91
+ systemId: objective.systemId,
92
+ target: objective.target,
93
+ budgetRemainingPercent: status.errorBudgetRemainingPercent,
94
+ currentStreak: streak?.currentStreak ?? 0,
95
+ bestStreak: streak?.bestStreak ?? 0,
96
+ };
97
+ }
98
+
99
+ /**
100
+ * Build the PLUGIN-BACKED + COMPUTED `read` accessor for the `slo` entity.
101
+ * For each objective id, assembles the view via {@link computeSloEntityState}
102
+ * (missing objectives omitted). This is the single source of truth that
103
+ * `handle.mutate` snapshots `prev` from and `get`/`getMany`/scope enrichment
104
+ * route through — no framework `entity_state` storage.
105
+ */
106
+ export function createSloEntityRead(deps: {
107
+ service: SloService;
108
+ engine: SloEngine;
109
+ }): EntityRead<SloEntityState> {
110
+ const { service, engine } = deps;
111
+ return async (ids) => {
112
+ if (ids.length === 0) return {};
113
+ const out: Record<string, SloEntityState> = {};
114
+ await Promise.all(
115
+ ids.map(async (objectiveId) => {
116
+ const state = await computeSloEntityState({
117
+ service,
118
+ engine,
119
+ objectiveId,
120
+ });
121
+ if (state) out[objectiveId] = state;
122
+ }),
123
+ );
124
+ return out;
125
+ };
126
+ }
127
+
128
+ /**
129
+ * Drive the streak-persist write through `handle.mutate` (§10.7). `apply`
130
+ * performs the REAL `slo_streaks` write (the plugin's own db/tx) and returns
131
+ * the freshly-computed `slo` view (budget recomputed + post-write streak).
132
+ * The framework snapshots `prev` via `read` BEFORE the write, appends the
133
+ * transition log, and emits `ENTITY_CHANGED`. No-op (no emit) when the
134
+ * recomputed view is structurally equal to `prev`.
135
+ *
136
+ * When no handle is available (tests / before wiring), the write still runs
137
+ * — the entity reactivity is layered on top, never required for the streak
138
+ * write to succeed. Errors from the entity layer are routed to `onError` so a
139
+ * mirror/transition failure never breaks the daily job.
140
+ */
141
+ export async function writeSloEntity(args: {
142
+ handle: EntityHandle<SloEntityState> | undefined;
143
+ objectiveId: string;
144
+ opts?: EntityMutationOpts;
145
+ apply: () => Promise<SloEntityState>;
146
+ onError?: (error: unknown) => void;
147
+ }): Promise<void> {
148
+ const { handle, objectiveId, opts, apply, onError } = args;
149
+ if (!handle) {
150
+ await apply();
151
+ return;
152
+ }
153
+ // A wired handle routes through the shared guard. The daily-job caller needs
154
+ // an entity-layer (mirror/transition) failure to be fail-soft so that it never
155
+ // breaks the streak persist; errors are therefore routed to `onError` rather
156
+ // than rethrown (bespoke SLO behavior that the shared guard does not encode).
157
+ try {
158
+ await withEntityWrite({ handle, id: objectiveId, opts, apply });
159
+ } catch (error) {
160
+ onError?.(error);
161
+ }
162
+ }
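The doc comment on `writeSloEntity` notes that no `ENTITY_CHANGED` is emitted when the recomputed view is structurally equal to `prev`. How the framework performs that comparison is not shown in this diff. The sketch below is one plausible way to do it for the flat `SloEntityState` shape, using hypothetical `changedFields` and `isNoOp` helpers; the interface is copied from the schema above so the block stands alone.

```typescript
// Mirrors SloEntityStateSchema so the example is self-contained.
interface SloEntityState {
  objectiveId: string;
  systemId: string;
  target: number;
  budgetRemainingPercent: number;
  currentStreak: number;
  bestStreak: number;
}

// Hypothetical helper: lists the keys whose values differ between two views.
// A null `prev` (first observation) reports every key as changed.
function changedFields(
  prev: SloEntityState | null,
  next: SloEntityState,
): Array<keyof SloEntityState> {
  const keys = Object.keys(next) as Array<keyof SloEntityState>;
  if (prev === null) return keys;
  return keys.filter((key) => prev[key] !== next[key]);
}

// An empty list means the write was a no-op from the entity's point of view.
function isNoOp(prev: SloEntityState | null, next: SloEntityState): boolean {
  return changedFields(prev, next).length === 0;
}
```

A shallow comparison is enough here only because every field is a primitive; a nested state would need a deep equality check instead.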
@@ -2,6 +2,12 @@ import type { SloService } from "./service";
2
2
  import type { SloEngine } from "./slo-engine";
3
3
  import type { Logger } from "@checkstack/backend-api";
4
4
  import type { QueueManager } from "@checkstack/queue-api";
5
+ import type { EntityHandle } from "@checkstack/automation-backend";
6
+ import {
7
+ computeSloEntityState,
8
+ writeSloEntity,
9
+ type SloEntityState,
10
+ } from "./slo-entity";
5
11
 
6
12
  const SNAPSHOT_QUEUE = "slo-daily-snapshots";
7
13
  const SNAPSHOT_JOB_ID = "slo-daily-snapshot-run";
@@ -12,6 +18,8 @@ interface StreakCalculatorDeps {
12
18
  engine: SloEngine;
13
19
  logger: Logger;
14
20
  queueManager: QueueManager;
21
+ /** Resolver for the reactive `slo` entity (§10.7). Undefined in tests. */
22
+ getSloEntity?: () => EntityHandle<SloEntityState> | undefined;
15
23
  }
16
24
 
17
25
  /**
@@ -20,7 +28,7 @@ interface StreakCalculatorDeps {
20
28
  * and updating streak counters for all active objectives.
21
29
  */
22
30
  export async function setupDailySnapshotJob(deps: StreakCalculatorDeps) {
23
- const { queueManager, logger, service, engine } = deps;
31
+ const { queueManager, logger, service, engine, getSloEntity } = deps;
24
32
 
25
33
  const queue = queueManager.getQueue<{ trigger: "scheduled" }>(SNAPSHOT_QUEUE);
26
34
 
@@ -28,7 +36,7 @@ export async function setupDailySnapshotJob(deps: StreakCalculatorDeps) {
28
36
  await queue.consume(
29
37
  async () => {
30
38
  logger.info("Starting daily SLO snapshot job");
31
- await runDailySnapshotJob({ service, engine, logger });
39
+ await runDailySnapshotJob({ service, engine, logger, getSloEntity });
32
40
  logger.info("Completed daily SLO snapshot job");
33
41
  },
34
42
  { consumerGroup: WORKER_GROUP, maxRetries: 0 },
@@ -57,8 +65,9 @@ export async function runDailySnapshotJob(deps: {
57
65
  service: SloService;
58
66
  engine: SloEngine;
59
67
  logger: Logger;
68
+ getSloEntity?: () => EntityHandle<SloEntityState> | undefined;
60
69
  }) {
61
- const { service, engine, logger } = deps;
70
+ const { service, engine, logger, getSloEntity } = deps;
62
71
 
63
72
  const objectives = await service.listObjectives();
64
73
  const today = new Date();
@@ -79,24 +88,64 @@ export async function runDailySnapshotJob(deps: {
79
88
  availabilityPercent: status.currentAvailability ?? 100,
80
89
  budgetConsumedMinutes: status.errorBudgetConsumedMinutes,
81
90
  budgetRemainingPercent: status.errorBudgetRemainingPercent,
82
-
91
+
83
92
  burnRate: status.burnRate ?? null,
84
93
  streakDays: streak?.currentStreak ?? 0,
85
94
  },
86
95
  });
87
96
 
88
- // 2. Update streak: if currently meeting target, increment; else reset
89
- if (!status.isBreaching && !status.hasOpenDowntime) {
90
- await service.incrementStreak({ objectiveId: objective.id });
91
- } else if (status.isBreaching) {
92
- const currentStreak = streak?.currentStreak ?? 0;
93
- if (currentStreak > 0) {
94
- await service.resetStreak({ objectiveId: objective.id });
95
- logger.info(
96
- `SLO ${objective.id}: Streak broken at ${currentStreak} days`,
97
- );
98
- }
99
- }
97
+ // 2. Update streak (if currently meeting target, increment; else reset)
98
+ // AND surface the recomputed `slo` entity, driven through
99
+ // `handle.mutate` (§10.7). The REAL `slo_streaks` write runs INSIDE
100
+ // `apply` (the plugin's own write) so the framework snapshots `prev`
101
+ // via the COMPUTED `read` BEFORE the streak flips, then emits
102
+ // `ENTITY_CHANGED`. Operators author budget/streak thresholds as
103
+ // `numeric_state` conditions over this state (§9.2). The change is a
104
+ // no-op (no emit) when neither budget nor streak moved.
105
+ await writeSloEntity({
106
+ handle: getSloEntity?.(),
107
+ objectiveId: objective.id,
108
+ apply: async () => {
109
+ if (!status.isBreaching && !status.hasOpenDowntime) {
110
+ await service.incrementStreak({ objectiveId: objective.id });
111
+ } else if (status.isBreaching) {
112
+ const currentStreak = streak?.currentStreak ?? 0;
113
+ if (currentStreak > 0) {
114
+ await service.resetStreak({ objectiveId: objective.id });
115
+ logger.info(
116
+ `SLO ${objective.id}: Streak broken at ${currentStreak} days`,
117
+ );
118
+ }
119
+ }
120
+ // Re-assemble the computed view from the POST-write tables so the
121
+ // emitted `next` reflects the updated streak + recomputed budget.
122
+ const next = await computeSloEntityState({
123
+ service,
124
+ engine,
125
+ objectiveId: objective.id,
126
+ });
127
+ if (next) return next;
128
+ // The objective vanished mid-cycle (raced delete). Fall back to a
129
+ // view from the in-hand objective + post-write streak so `apply`
130
+ // still returns a valid state and the mutate is a no-op.
131
+ const freshStreak = await service.getStreak({
132
+ objectiveId: objective.id,
133
+ });
134
+ return {
135
+ objectiveId: objective.id,
136
+ systemId: objective.systemId,
137
+ target: objective.target,
138
+ budgetRemainingPercent: status.errorBudgetRemainingPercent,
139
+ currentStreak: freshStreak?.currentStreak ?? 0,
140
+ bestStreak: freshStreak?.bestStreak ?? 0,
141
+ };
142
+ },
143
+ onError: (error) =>
144
+ logger.warn(
145
+ `Failed to surface slo entity for objective ${objective.id}`,
146
+ { error },
147
+ ),
148
+ });
100
149
  } catch (error) {
101
150
  logger.error(
102
151
  `Failed to process daily snapshot for objective ${objective.id}`,
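The streak branch inside `apply` in the hunk above has three outcomes: increment when the objective is healthy, reset when it is breaching with a non-zero streak, and otherwise leave the counter alone. A sketch of that decision as a pure function follows. `decideStreakAction` and `StreakAction` are hypothetical names introduced for illustration, and the input fields are the ones the job reads from `status` and `streak`.

```typescript
type StreakAction = "increment" | "reset" | "keep";

// Hypothetical pure function mirroring the branch inside `apply`.
function decideStreakAction(input: {
  isBreaching: boolean;
  hasOpenDowntime: boolean;
  currentStreak: number;
}): StreakAction {
  const { isBreaching, hasOpenDowntime, currentStreak } = input;
  // Meeting the target with no ongoing downtime extends the streak.
  if (!isBreaching && !hasOpenDowntime) return "increment";
  // A breach only needs a write when there is a streak to break.
  if (isBreaching && currentStreak > 0) return "reset";
  // Open downtime without a breach, or a breach at streak 0: no write.
  return "keep";
}
```

Separating the decision from the `service` calls in this way would let the three cases be unit-tested without a fake `SloService`.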
package/tsconfig.json CHANGED
@@ -4,6 +4,9 @@
4
4
  "src"
5
5
  ],
6
6
  "references": [
7
+ {
8
+ "path": "../automation-backend"
9
+ },
7
10
  {
8
11
  "path": "../backend-api"
9
12
  },
@@ -43,12 +46,6 @@
43
46
  {
44
47
  "path": "../healthcheck-common"
45
48
  },
46
- {
47
- "path": "../integration-backend"
48
- },
49
- {
50
- "path": "../integration-common"
51
- },
52
49
  {
53
50
  "path": "../queue-api"
54
51
  },