@checkstack/slo-backend 0.2.16 → 0.3.1

package/CHANGELOG.md CHANGED
@@ -1,5 +1,124 @@
  # @checkstack/slo-backend
  
+ ## 0.3.1
+
+ ### Patch Changes
+
+ - Updated dependencies [208ad71]
+   - @checkstack/signal-common@0.2.0
+   - @checkstack/dependency-common@0.3.0
+   - @checkstack/healthcheck-common@0.13.0
+   - @checkstack/integration-common@0.3.0
+   - @checkstack/slo-common@0.3.0
+   - @checkstack/backend-api@0.13.1
+   - @checkstack/healthcheck-backend@0.18.1
+   - @checkstack/integration-backend@0.1.21
+   - @checkstack/catalog-common@1.5.3
+   - @checkstack/catalog-backend@0.7.1
+   - @checkstack/cache-api@0.2.1
+   - @checkstack/command-backend@0.1.21
+   - @checkstack/queue-api@0.2.15
+   - @checkstack/cache-utils@0.2.1
+
+ ## 0.3.0
+
+ ### Minor Changes
+
+ - 8d1ef12: ## Per-entity caching with single-flight + safe invalidation across the dashboard hot paths
+
+   ### `@checkstack/cache-api`
+
+   - **Breaking** for backend implementors: `CacheProvider` now requires `deleteByPrefix(prefix: string): Promise<number>` for family-level invalidation. The in-memory provider implements it; downstream providers (Redis, etc.) must add it before upgrading.
+   - `createScopedCache` forwards `deleteByPrefix` and keeps prefixes scoped to the calling plugin.
+
+   ### `@checkstack/cache-utils` (new package)
+
+   High-level read-through caching helpers built on `CacheProvider`:
+
+   - `createCachedScope({ cacheManager, pluginId })` returns a scope with `wrap`, `wrapMany`, `invalidate`, and `invalidatePrefix`.
+   - **Single-flight**: concurrent cache misses for the same key share one loader.
+   - **Per-entity bulk caching** via `wrapMany` so list/bulk RPCs cache by id rather than by the full input shape — overlapping callers share entries and invalidation stays exact.
+   - **Race-safe invalidation** via per-key epoch counters: a loader started before a mutation cannot repopulate the cache with stale data after the mutation invalidates it. The mutation invariant is `db.write → cache.invalidate (await) → signals.emit`.
+   - Cache failures fall through to the loader so a cache outage cannot break reads.
+
+   ### `@checkstack/backend`
+
+   - The internal null `CacheProvider` (used when no cache backend is configured) now implements the new `deleteByPrefix` method as a no-op. Patch bump only — no behavior change for existing callers.
+
+   ### `@checkstack/healthcheck-backend`
+
+   - `getSystemHealthStatus` and `getBulkSystemHealthStatus` now read through a per-system cache (`healthcheck:status:<systemId>`), eliminating N database queries per dashboard refresh for unchanged systems.
+   - Mutation paths (configuration CRUD, system associations, satellite ingest, queue-driven check runs, system/satellite removal hooks) invalidate affected keys before broadcasting their signals so frontend refetches always observe fresh data.
+
+   ### `@checkstack/incident-backend`
+
+   - `listIncidents`, `getIncident`, `getIncidentsForSystem`, and `getBulkIncidentsForSystems` now read through a scoped cache:
+     - per-incident at `incident:<id>`
+     - per-system at `system:<systemId>`
+     - per-filter-shape at `list:<stable-stringify(filters)>` for the few list shapes the dashboard polls
+   - Mutations (`createIncident`, `updateIncident`, `addUpdate`, `resolveIncident`, `deleteIncident`) invalidate the incident, every affected system, and every cached list before broadcasting `INCIDENT_UPDATED`.
+   - The catalog `systemDeleted` cleanup hook drops that system's cached entries.
+
+   ### `@checkstack/maintenance-backend`
+
+   - `listMaintenances`, `getMaintenance`, `getMaintenancesForSystem`, and `getBulkMaintenancesForSystems` use the same per-entity / per-system / per-filter-shape pattern as incidents.
+   - Mutations (`createMaintenance`, `updateMaintenance`, `addUpdate`, `closeMaintenance`, `deleteMaintenance`) invalidate before broadcasting `MAINTENANCE_UPDATED`.
+
+   ### `@checkstack/catalog-backend`
+
+   - Topology reads (`getEntities`, `getSystems`, `getSystem`, `getGroups`, `getSystemGroupIds`) cache under the `entity:` family (25s TTL).
+   - Views (`getViews`) and per-system contacts (`getSystemContacts`) cache in their own families.
+   - System / group / membership mutations drop the entire `entity:` family (every reader joins the same tables); view and contact mutations drop only their respective scopes.
+
+   ### `@checkstack/slo-backend`
+
+   - `listObjectives`, `getObjective`, `getObjectivesForSystem`, and `getBulkObjectivesForSystems` cache results including the expensive `engine.computeStatus` output.
+   - Per-entity caching for the bulk handler so dashboards with overlapping system sets share entries.
+   - Mutations (`createObjective`, `updateObjective`, `deleteObjective`) invalidate before broadcasting `SLO_STATUS_CHANGED`.
+
+   ### `@checkstack/anomaly-backend`
+
+   - New `router-cache.ts` adds a cache scope distinct from the existing detector baseline cache, keyed by stable filter hash.
+   - `getAnomalies` and `getAnomalyBaselines` cache through this scope (15s TTL).
+   - The detector invalidates the router cache before broadcasting `ANOMALY_STATE_CHANGED` on every state transition (suspicious/anomaly/recovered).
+   - Config mutations also invalidate.
+
+   ### `@checkstack/notification-backend`
+
+   - `getUnreadCount`, `getNotifications`, and `getSubscriptions` cache per-user.
+   - `markAsRead`, `deleteNotification`, `notifyUsers`, and `notifyGroups` invalidate every affected user's cache before sending realtime signals to that user.
+   - `subscribe` and `unsubscribe` invalidate the user's subscription cache.
+
+   ### `@checkstack/announcement-backend`
+
+   - `getActiveAnnouncements` caches per-user (or anonymous) and per-`includeDismissed` flag (45s TTL — admin-driven, slowly changing).
+   - `listAllAnnouncements` caches under a single key.
+   - `dismissAnnouncement` only drops that user's cache; `createAnnouncement`, `updateAnnouncement`, `deleteAnnouncement` drop every user's cache before broadcasting `ANNOUNCEMENT_UPDATED`.
+   - The auth `userDeleted` cleanup hook drops that user's cached entries.
+
+ ### Patch Changes
+
+ - Updated dependencies [8d1ef12]
+ - Updated dependencies [8d1ef12]
+ - Updated dependencies [8d1ef12]
+ - Updated dependencies [8d1ef12]
+ - Updated dependencies [8d1ef12]
+   - @checkstack/healthcheck-common@0.12.0
+   - @checkstack/healthcheck-backend@0.18.0
+   - @checkstack/common@0.7.0
+   - @checkstack/cache-api@0.2.0
+   - @checkstack/cache-utils@0.2.0
+   - @checkstack/catalog-backend@0.7.0
+   - @checkstack/backend-api@0.13.0
+   - @checkstack/catalog-common@1.5.2
+   - @checkstack/command-backend@0.1.20
+   - @checkstack/dependency-common@0.2.3
+   - @checkstack/integration-backend@0.1.20
+   - @checkstack/integration-common@0.2.9
+   - @checkstack/signal-common@0.1.10
+   - @checkstack/slo-common@0.2.2
+   - @checkstack/queue-api@0.2.14
+
  ## 0.2.16
  
  ### Patch Changes
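
The 0.3.0 changelog entry describes single-flight loading and epoch-based, race-safe invalidation in `@checkstack/cache-utils` without showing how the two interact. The following is a minimal in-memory sketch of that idea, not the package's implementation: `EpochCache`, its fields, and its method signatures are hypothetical names chosen for illustration, and TTLs and error fallthrough are left out.

```typescript
// Hypothetical sketch only; this is NOT the @checkstack/cache-utils code.
type Loader<T> = () => Promise<T>;

class EpochCache {
  private values = new Map<string, unknown>();
  private inflight = new Map<string, Promise<unknown>>();
  private epochs = new Map<string, number>();

  async wrap<T>(key: string, loader: Loader<T>): Promise<T> {
    if (this.values.has(key)) return this.values.get(key) as T;

    // Single-flight: join a load that is already running for this key.
    const pending = this.inflight.get(key);
    if (pending !== undefined) return pending as Promise<T>;

    // Remember the epoch at load start so a later invalidation is detectable.
    const startEpoch = this.epochs.get(key) ?? 0;
    const promise: Promise<T> = loader()
      .then((value) => {
        // Store the result only if no invalidation happened while loading.
        if ((this.epochs.get(key) ?? 0) === startEpoch) {
          this.values.set(key, value);
        }
        return value;
      })
      .finally(() => {
        // Clear the in-flight slot unless a newer load has replaced it.
        if (this.inflight.get(key) === promise) this.inflight.delete(key);
      });
    this.inflight.set(key, promise);
    return promise;
  }

  async invalidate(key: string): Promise<void> {
    // Bumping the epoch makes any load started earlier unable to repopulate.
    this.epochs.set(key, (this.epochs.get(key) ?? 0) + 1);
    this.values.delete(key);
    this.inflight.delete(key); // later readers start a fresh load
  }
}
```

In this sketch a caller that started loading before the invalidation still receives its own (possibly stale) result, but that result is never written back, so the next reader reloads from the source.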
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
    "name": "@checkstack/slo-backend",
-   "version": "0.2.16",
+   "version": "0.3.1",
    "type": "module",
    "main": "src/index.ts",
    "checkstack": {
@@ -13,19 +13,21 @@
      "lint:code": "eslint . --max-warnings 0"
    },
    "dependencies": {
-     "@checkstack/backend-api": "0.12.0",
-     "@checkstack/slo-common": "0.2.0",
-     "@checkstack/healthcheck-common": "0.11.0",
-     "@checkstack/healthcheck-backend": "0.16.5",
-     "@checkstack/dependency-common": "0.2.1",
-     "@checkstack/catalog-common": "1.4.1",
-     "@checkstack/catalog-backend": "0.5.4",
-     "@checkstack/command-backend": "0.1.19",
-     "@checkstack/signal-common": "0.1.9",
-     "@checkstack/integration-backend": "0.1.19",
-     "@checkstack/integration-common": "0.2.8",
-     "@checkstack/common": "0.6.5",
-     "@checkstack/queue-api": "0.2.13",
+     "@checkstack/backend-api": "0.13.0",
+     "@checkstack/cache-api": "0.2.0",
+     "@checkstack/cache-utils": "0.2.0",
+     "@checkstack/slo-common": "0.2.2",
+     "@checkstack/healthcheck-common": "0.12.0",
+     "@checkstack/healthcheck-backend": "0.18.0",
+     "@checkstack/dependency-common": "0.2.3",
+     "@checkstack/catalog-common": "1.5.2",
+     "@checkstack/catalog-backend": "0.7.0",
+     "@checkstack/command-backend": "0.1.20",
+     "@checkstack/signal-common": "0.1.10",
+     "@checkstack/integration-backend": "0.1.20",
+     "@checkstack/integration-common": "0.2.9",
+     "@checkstack/common": "0.7.0",
+     "@checkstack/queue-api": "0.2.14",
      "drizzle-orm": "^0.45.0",
      "zod": "^4.2.1",
      "@orpc/server": "^1.13.2"
@@ -33,7 +35,7 @@
    "devDependencies": {
      "@checkstack/drizzle-helper": "0.0.4",
      "@checkstack/scripts": "0.1.2",
-     "@checkstack/test-utils-backend": "0.1.19",
+     "@checkstack/test-utils-backend": "0.1.20",
      "@checkstack/tsconfig": "0.0.5",
      "@types/bun": "^1.0.0",
      "typescript": "^5.0.0"
package/src/cache.ts ADDED
@@ -0,0 +1,74 @@
+ import type { CacheManager } from "@checkstack/cache-api";
+ import {
+   createCachedScope,
+   type CachedScope,
+ } from "@checkstack/cache-utils";
+ import type { Logger } from "@checkstack/backend-api";
+
+ /**
+  * 15s. Each `engine.computeStatus` call runs a SLO calculation over a
+  * rolling window so this cache is more about CPU than DB. Downtime events
+  * generated by the SLO engine can change status outside any tRPC mutation,
+  * so the TTL doubles as a freshness ceiling for that path.
+  */
+ const SLO_TTL_MS = 15_000;
+
+ const LIST_KEY = "list:objectives";
+ const OBJECTIVE_PREFIX = "objective:";
+ const SYSTEM_PREFIX = "system:";
+
+ const objectiveKey = (id: string): string => `${OBJECTIVE_PREFIX}${id}`;
+ const systemKey = (systemId: string): string => `${SYSTEM_PREFIX}${systemId}`;
+
+ export interface SloCache {
+   wrapList: <T>(loader: () => Promise<T>) => Promise<T>;
+   wrapObjective: <T>(id: string, loader: () => Promise<T>) => Promise<T>;
+   wrapSystem: <T>(
+     systemId: string,
+     loader: () => Promise<T>,
+   ) => Promise<T>;
+
+   /**
+    * Drop all cached entries affected by a mutation that touches one
+    * objective on one system. Always awaits before signal broadcast.
+    */
+   invalidateForMutation: (props: {
+     objectiveId: string;
+     systemId: string;
+   }) => Promise<void>;
+
+   scope: CachedScope;
+ }
+
+ export function createSloCache({
+   cacheManager,
+   logger,
+ }: {
+   cacheManager: CacheManager;
+   logger: Logger;
+ }): SloCache {
+   const scope = createCachedScope({
+     cacheManager,
+     pluginId: "slo",
+     defaultTtlMs: SLO_TTL_MS,
+     onError: (op: string, error: unknown) => {
+       logger.warn(`slo cache ${op} failed: ${String(error)}`);
+     },
+   });
+
+   return {
+     wrapList: (loader) => scope.wrap(LIST_KEY, loader),
+     wrapObjective: (id, loader) => scope.wrap(objectiveKey(id), loader),
+     wrapSystem: (systemId, loader) => scope.wrap(systemKey(systemId), loader),
+
+     invalidateForMutation: async ({ objectiveId, systemId }) => {
+       await Promise.all([
+         scope.invalidate(LIST_KEY),
+         scope.invalidate(objectiveKey(objectiveId)),
+         scope.invalidate(systemKey(systemId)),
+       ]);
+     },
+
+     scope,
+   };
+ }
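
The comment in `cache.ts` says the 15s TTL doubles as a freshness ceiling, and `invalidateForMutation` drops three keys at once. A small stand-in can make both behaviours observable without the real packages. This is a sketch under assumptions: `MiniScope` and `createMiniScope` are hypothetical and model only the `wrap` and `invalidate` calls that `cache.ts` makes; the key layout is copied from the file above; the injectable `now` clock is an illustration device, not something the source shows.

```typescript
// Hypothetical stand-in for the scope returned by createCachedScope; only
// the two methods that cache.ts calls are modelled here.
interface MiniScope {
  wrap<T>(key: string, loader: () => Promise<T>): Promise<T>;
  invalidate(key: string): Promise<void>;
}

// `now` is injectable so expiry can be exercised without real timers.
function createMiniScope(ttlMs: number, now: () => number): MiniScope {
  const entries = new Map<string, { value: unknown; expiresAt: number }>();
  return {
    async wrap<T>(key: string, loader: () => Promise<T>): Promise<T> {
      const hit = entries.get(key);
      // Serve the cached value only while it is younger than the TTL.
      if (hit !== undefined && hit.expiresAt > now()) return hit.value as T;
      const value = await loader();
      entries.set(key, { value, expiresAt: now() + ttlMs });
      return value;
    },
    async invalidate(key: string): Promise<void> {
      entries.delete(key);
    },
  };
}

// Key layout copied from cache.ts.
const LIST_KEY = "list:objectives";
const objectiveKey = (id: string): string => `objective:${id}`;
const systemKey = (systemId: string): string => `system:${systemId}`;

// Same fan-out as invalidateForMutation in cache.ts: list, objective, system.
async function invalidateForMutation(
  scope: MiniScope,
  props: { objectiveId: string; systemId: string },
): Promise<void> {
  await Promise.all([
    scope.invalidate(LIST_KEY),
    scope.invalidate(objectiveKey(props.objectiveId)),
    scope.invalidate(systemKey(props.systemId)),
  ]);
}
```

With a 15,000 ms TTL, a status change that happens outside any mutation becomes visible once the entry expires, while a mutation that calls `invalidateForMutation` makes the next read reload immediately.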
package/src/index.ts CHANGED
@@ -13,6 +13,7 @@ import { integrationEventExtensionPoint } from "@checkstack/integration-backend"
  import { SloService } from "./service";
  import { SloEngine } from "./slo-engine";
  import { createRouter } from "./router";
+ import { createSloCache } from "./cache";
  import { DependencyApi } from "@checkstack/dependency-common";
  import { HealthCheckApi } from "@checkstack/healthcheck-common";
  import { catalogHooks } from "@checkstack/catalog-backend";
@@ -176,8 +177,16 @@ export default createBackendPlugin({
          signalService: coreServices.signalService,
          rpcClient: coreServices.rpcClient,
          queueManager: coreServices.queueManager,
+         cacheManager: coreServices.cacheManager,
        },
-       init: async ({ logger, database, rpc, signalService, rpcClient }) => {
+       init: async ({
+         logger,
+         database,
+         rpc,
+         signalService,
+         rpcClient,
+         cacheManager,
+       }) => {
          logger.debug("🔧 Initializing SLO Backend...");
  
          const service = new SloService(database as SafeDatabase<typeof schema>);
@@ -190,7 +199,14 @@ export default createBackendPlugin({
          // Store for afterPluginsReady
          sharedEngine = engine;
  
-         const router = createRouter({ service, engine, signalService, rpcClient });
+         const cache = createSloCache({ cacheManager, logger });
+         const router = createRouter({
+           service,
+           engine,
+           signalService,
+           rpcClient,
+           cache,
+         });
          rpc.registerRouter(router, sloContract);
  
          // Register command palette entries
package/src/router.ts CHANGED
@@ -12,17 +12,20 @@ import type { SignalService } from "@checkstack/signal-common";
  import { CatalogApi } from "@checkstack/catalog-common";
  import type { SloService } from "./service";
  import type { SloEngine } from "./slo-engine";
+ import type { SloCache } from "./cache";
  
  export function createRouter({
    service,
    engine,
    signalService,
    rpcClient,
+   cache,
  }: {
    service: SloService;
    engine: SloEngine;
    signalService: SignalService;
    rpcClient: RpcClient;
+   cache: SloCache;
  }) {
    const os = implement(sloContract)
      .$context<RpcContext>()
@@ -33,43 +36,49 @@ export function createRouter({
      // OBJECTIVES
      // =========================================================================
  
-     listObjectives: os.listObjectives.handler(async () => {
-       const objectives = await service.listObjectives();
-       const results = await Promise.all(
-         objectives.map(async (objective) => ({
-           objective,
-           status: await engine.computeStatus({ objective }),
-         })),
-       );
-       return { objectives: results };
-     }),
-
-     getObjective: os.getObjective.handler(async ({ input }) => {
-       const objective = await service.getObjective({ id: input.id });
-       if (!objective) {
-         // eslint-disable-next-line unicorn/no-null -- oRPC contract requires null for missing values
-         return null;
-       }
-       const status = await engine.computeStatus({ objective });
-       return { objective, status };
-     }),
-
-     getObjectivesForSystem: os.getObjectivesForSystem.handler(
-       async ({ input }) => {
-         const objectives = await service.getObjectivesForSystem({
-           systemId: input.systemId,
-         });
-         return Promise.all(
+     listObjectives: os.listObjectives.handler(async () =>
+       cache.wrapList(async () => {
+         const objectives = await service.listObjectives();
+         const results = await Promise.all(
            objectives.map(async (objective) => ({
              objective,
              status: await engine.computeStatus({ objective }),
            })),
          );
-       },
+         return { objectives: results };
+       }),
+     ),
+
+     getObjective: os.getObjective.handler(async ({ input }) =>
+       cache.wrapObjective(input.id, async () => {
+         const objective = await service.getObjective({ id: input.id });
+         if (!objective) {
+           // eslint-disable-next-line unicorn/no-null -- oRPC contract requires null for missing values
+           return null;
+         }
+         const status = await engine.computeStatus({ objective });
+         return { objective, status };
+       }),
+     ),
+
+     getObjectivesForSystem: os.getObjectivesForSystem.handler(
+       async ({ input }) =>
+         cache.wrapSystem(input.systemId, async () => {
+           const objectives = await service.getObjectivesForSystem({
+             systemId: input.systemId,
+           });
+           return Promise.all(
+             objectives.map(async (objective) => ({
+               objective,
+               status: await engine.computeStatus({ objective }),
+             })),
+           );
+         }),
      ),
  
      getBulkObjectivesForSystems: os.getBulkObjectivesForSystems.handler(
        async ({ input }) => {
+         // Per-entity caching: see ./cache.ts for the invalidation contract.
          const systems: Record<
            string,
            Array<{
@@ -80,15 +89,17 @@ export function createRouter({
  
          await Promise.all(
            input.systemIds.map(async (systemId) => {
-             const objectives = await service.getObjectivesForSystem({
-               systemId,
+             systems[systemId] = await cache.wrapSystem(systemId, async () => {
+               const objectives = await service.getObjectivesForSystem({
+                 systemId,
+               });
+               return Promise.all(
+                 objectives.map(async (objective) => ({
+                   objective,
+                   status: await engine.computeStatus({ objective }),
+                 })),
+               );
              });
-             systems[systemId] = await Promise.all(
-               objectives.map(async (objective) => ({
-                 objective,
-                 status: await engine.computeStatus({ objective }),
-               })),
-             );
            }),
          );
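
The bulk handler above caches one entry per system id rather than one entry per requested id list, which is why dashboards with overlapping system sets can share work. The sketch below illustrates that effect with hypothetical helpers: `createWrap` is a bare read-through cache with no TTL or invalidation, and `getBulk` and `loadOne` are illustrative names, not functions from the package.

```typescript
type Wrap = <T>(key: string, loader: () => Promise<T>) => Promise<T>;

// Hypothetical minimal read-through cache without TTL or invalidation.
function createWrap(): Wrap {
  const entries = new Map<string, unknown>();
  return async <T>(key: string, loader: () => Promise<T>): Promise<T> => {
    if (entries.has(key)) return entries.get(key) as T;
    const value = await loader();
    entries.set(key, value);
    return value;
  };
}

// Bulk read that caches one entry per system id instead of one per id list.
async function getBulk<T>(
  wrap: Wrap,
  systemIds: string[],
  loadOne: (systemId: string) => Promise<T>,
): Promise<Record<string, T>> {
  const systems: Record<string, T> = {};
  await Promise.all(
    systemIds.map(async (systemId) => {
      // Each id resolves through its own key, so overlapping calls share it.
      systems[systemId] = await wrap(`system:${systemId}`, () =>
        loadOne(systemId),
      );
    }),
  );
  return systems;
}
```

Requesting `["a", "b"]` and then `["b", "c"]` through the same `wrap` runs the loader three times in total, because `b` is served from the entry the first call stored. Keying on the whole input array would have produced two unrelated entries and four loads.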
  
@@ -105,6 +116,11 @@ export function createRouter({
        await engine.reconcileObjective({ objective });
  
        const status = await engine.computeStatus({ objective });
+       // Mutation invariant: db.write → cache.invalidate (await) → signals.emit.
+       await cache.invalidateForMutation({
+         objectiveId: objective.id,
+         systemId: objective.systemId,
+       });
        await signalService.broadcast(SLO_STATUS_CHANGED, {
          systemId: objective.systemId,
          objectiveId: objective.id,
@@ -128,6 +144,10 @@ export function createRouter({
        await engine.reconcileObjective({ objective });
  
        const status = await engine.computeStatus({ objective });
+       await cache.invalidateForMutation({
+         objectiveId: objective.id,
+         systemId: objective.systemId,
+       });
        await signalService.broadcast(SLO_STATUS_CHANGED, {
          systemId: objective.systemId,
          objectiveId: objective.id,
@@ -139,7 +159,16 @@ export function createRouter({
      }),
  
      deleteObjective: os.deleteObjective.handler(async ({ input }) => {
+       // Look up the objective first so we can target the per-system cache
+       // entry; otherwise we'd have no way to know which system to drop.
+       const existing = await service.getObjective({ id: input.id });
        const success = await service.deleteObjective({ id: input.id });
+       if (success && existing) {
+         await cache.invalidateForMutation({
+           objectiveId: existing.id,
+           systemId: existing.systemId,
+         });
+       }
        return { success };
      }),
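
The mutation handlers above follow the invariant `db.write → cache.invalidate (await) → signals.emit`. The ordering matters because a client that refetches in response to the signal must not be served the pre-mutation cache entry. The sketch below models that with in-memory stand-ins; `db`, `cached`, `listeners`, `readThrough`, `emit`, and `updateTarget` are all hypothetical names for illustration and do not correspond to the package's services.

```typescript
type Listener = () => Promise<void>;

// Hypothetical in-memory stand-ins for the database, cache, and signal bus.
const db = new Map<string, number>();
const cached = new Map<string, number>();
const listeners: Listener[] = [];

// Read-through: serve from cache when possible, otherwise load and store.
async function readThrough(id: string): Promise<number | undefined> {
  const hit = cached.get(id);
  if (hit !== undefined) return hit;
  const value = db.get(id);
  if (value !== undefined) cached.set(id, value);
  return value;
}

async function invalidate(id: string): Promise<void> {
  cached.delete(id);
}

// Notify every subscriber; each one may refetch immediately.
async function emit(): Promise<void> {
  for (const listener of listeners) await listener();
}

// Mutation in the order used by the router: write, await invalidation, emit.
async function updateTarget(id: string, value: number): Promise<void> {
  db.set(id, value);
  await invalidate(id);
  await emit();
}
```

If `emit` ran before the awaited `invalidate`, a listener calling `readThrough` could still hit the old cached value. With the order shown, a refetch triggered by the signal always goes back to the updated store.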