@gravito/monitor 3.0.1 → 3.1.0

package/README.md CHANGED
@@ -1,14 +1,15 @@
1
- # @gravito/monitor
1
+ # @gravito/monitor 🛰️
2
2
 
3
- Lightweight observability module for Gravito - Health Checks, Metrics, and Tracing.
3
+ Lightweight observability module for Gravito - Health Checks, Metrics, and Tracing. Built on top of the **Galaxy Architecture**, this Orbit provides essential infrastructure for monitoring your planet's health.
4
4
 
5
- ## Features
5
+ ## 🚀 Features
6
6
 
7
- - 🏥 **Health Checks** - Kubernetes-ready `/health`, `/ready`, `/live` endpoints
8
- - 📊 **Metrics** - Prometheus-compatible `/metrics` endpoint
9
- - 🔍 **Tracing** - OpenTelemetry OTLP support (via external Collector)
7
+ - 🏥 **Health Checks** - Kubernetes-ready `/health`, `/ready`, `/live` endpoints with custom check support.
8
+ - 📊 **Metrics** - Prometheus-compatible `/metrics` endpoint with built-in Node.js runtime and HTTP metrics.
9
+ - 🔍 **Tracing** - OpenTelemetry OTLP support for distributed tracing across services.
10
+ - 🛡️ **Kubernetes Native** - Seamless integration with probe configurations and Prometheus ServiceMonitors.
10
11
 
11
- ## Installation
12
+ ## 📦 Installation
12
13
 
13
14
  ```bash
14
15
  bun add @gravito/monitor
@@ -20,7 +21,9 @@ For OpenTelemetry tracing (optional):
20
21
  bun add @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-http
21
22
  ```
22
23
 
23
- ## Quick Start
24
+ ## 🌌 Quick Start
25
+
26
+ Enable observability by adding the `MonitorOrbit` to your `PlanetCore`.
24
27
 
25
28
  ```typescript
26
29
  import { PlanetCore } from '@gravito/core'
@@ -32,274 +35,136 @@ core.orbit(new MonitorOrbit({
32
35
  health: {
33
36
  enabled: true,
34
37
  path: '/health',
35
- readyPath: '/ready',
36
- livePath: '/live',
37
38
  },
38
39
  metrics: {
39
40
  enabled: true,
40
- path: '/metrics',
41
41
  prefix: 'myapp_',
42
42
  },
43
43
  tracing: {
44
- enabled: true,
45
- serviceName: 'my-gravito-app',
46
- endpoint: 'http://localhost:4318/v1/traces',
44
+ enabled: process.env.NODE_ENV === 'production',
45
+ serviceName: 'order-service',
47
46
  },
48
47
  }))
49
48
 
50
49
  await core.liftoff()
51
50
  ```
52
51
 
53
- ## Health Checks
52
+ ## 🏥 Health Checks
54
53
 
55
- ### Default Endpoints
54
+ The health system provides three distinct probes following Kubernetes best practices:
56
55
 
57
- | Endpoint | Description |
58
- |----------|-------------|
59
- | `GET /health` | Full health check with all registered checks |
60
- | `GET /ready` | Kubernetes readiness probe |
61
- | `GET /live` | Kubernetes liveness probe |
56
+ - **Liveness (`/live`)**: Indicates if the process is running.
57
+ - **Readiness (`/ready`)**: Indicates if the app is ready to serve traffic (waits for all checks to pass).
58
+ - **Health (`/health`)**: Full aggregated report of all registered checks.
62
59
 
63
60
  ### Registering Custom Checks
64
61
 
62
+ You can register custom health checks via the `monitor` service.
63
+
65
64
  ```typescript
66
65
  const monitor = core.services.get('monitor')
67
66
 
68
- // Register a database check
67
+ // Simple check
69
68
  monitor.health.register('database', async () => {
70
- const isConnected = await db.ping()
71
- return isConnected
72
- ? { status: 'healthy' }
73
- : { status: 'unhealthy', message: 'Database disconnected' }
74
- })
75
-
76
- // Register a Redis check
77
- monitor.health.register('redis', async () => {
78
- const result = await redis.ping()
79
- return { status: result === 'PONG' ? 'healthy' : 'unhealthy' }
69
+ const isOk = await db.ping()
70
+ return isOk ? { status: 'healthy' } : { status: 'unhealthy', message: 'DB down' }
80
71
  })
81
- ```
82
-
83
- ### Built-in Check Factories
84
-
85
- ```typescript
86
- import {
87
- createDatabaseCheck,
88
- createRedisCheck,
89
- createMemoryCheck,
90
- createHttpCheck
91
- } from '@gravito/monitor'
92
72
 
93
- // Database check
94
- monitor.health.register('db', createDatabaseCheck(() => db.isConnected()))
95
-
96
- // Memory check (warns at 90% heap usage)
97
- monitor.health.register('memory', createMemoryCheck({ maxHeapUsedPercent: 90 }))
98
-
99
- // External service check
100
- monitor.health.register('api', createHttpCheck('https://api.example.com/health'))
101
- ```
102
-
103
- ### Health Response Format
104
-
105
- ```json
106
- {
107
- "status": "healthy",
108
- "timestamp": "2024-12-25T12:00:00Z",
109
- "uptime": 3600,
110
- "checks": {
111
- "database": { "status": "healthy", "latency": 5 },
112
- "redis": { "status": "healthy", "latency": 2 },
113
- "memory": {
114
- "status": "healthy",
115
- "details": { "heapUsedPercent": "45.2" }
116
- }
73
+ // Detailed check
74
+ monitor.health.register('disk_space', () => {
75
+ const usage = getDiskUsage()
76
+ return {
77
+ status: usage < 90 ? 'healthy' : 'degraded',
78
+ details: { usage: `${usage}%` }
117
79
  }
118
- }
80
+ })
119
81
  ```
120
82
 
121
- ## Metrics
122
-
123
- ### Prometheus Endpoint
83
+ ## 📊 Metrics
124
84
 
125
- The `/metrics` endpoint exposes metrics in Prometheus text format:
85
+ Metrics are exposed in Prometheus text format at `/metrics`.
126
86
 
127
- ```
128
- # HELP myapp_http_requests_total Total HTTP requests
129
- # TYPE myapp_http_requests_total counter
130
- myapp_http_requests_total{method="GET",path="/api/users",status="200"} 150
131
-
132
- # HELP myapp_http_request_duration_seconds HTTP request duration
133
- # TYPE myapp_http_request_duration_seconds histogram
134
- myapp_http_request_duration_seconds_bucket{le="0.01"} 50
135
- myapp_http_request_duration_seconds_bucket{le="0.1"} 120
136
- myapp_http_request_duration_seconds_sum 12.5
137
- myapp_http_request_duration_seconds_count 150
138
- ```
87
+ ### Built-in Metrics
88
+ - **Runtime**: Heap usage, uptime, active handles.
89
+ - **HTTP**: Request total (`http_requests_total`), duration histogram (`http_request_duration_seconds`).
139
90
 
140
91
  ### Custom Metrics
141
92
 
142
93
  ```typescript
143
94
  const monitor = core.services.get('monitor')
144
95
 
145
- // Counter
146
- const requestCounter = monitor.metrics.counter({
147
- name: 'api_requests_total',
148
- help: 'Total API requests',
149
- labels: ['endpoint', 'status'],
96
+ // 1. Counter (Monotonically increasing)
97
+ const orders = monitor.metrics.counter({
98
+ name: 'orders_total',
99
+ help: 'Total orders processed',
100
+ labels: ['status']
150
101
  })
151
- requestCounter.inc({ endpoint: '/users', status: '200' })
102
+ orders.inc({ status: 'completed' })
152
103
 
153
- // Gauge
154
- const activeConnections = monitor.metrics.gauge({
155
- name: 'active_connections',
156
- help: 'Current active connections',
104
+ // 2. Gauge (Can go up and down)
105
+ const activeUsers = monitor.metrics.gauge({
106
+ name: 'active_users',
107
+ help: 'Current active users'
157
108
  })
158
- activeConnections.set(42)
159
- activeConnections.inc()
160
- activeConnections.dec()
161
-
162
- // Histogram
163
- const responseTime = monitor.metrics.histogram({
164
- name: 'response_time_seconds',
165
- help: 'Response time in seconds',
166
- labels: ['endpoint'],
167
- buckets: [0.01, 0.05, 0.1, 0.5, 1],
168
- })
169
- responseTime.observe(0.125, { endpoint: '/users' })
109
+ activeUsers.set(42)
170
110
 
171
- // Timer helper
172
- const stopTimer = responseTime.startTimer({ endpoint: '/users' })
173
- // ... do work ...
174
- stopTimer() // Records duration automatically
111
+ // 3. Histogram (Value distribution)
112
+ const processTime = monitor.metrics.histogram({
113
+ name: 'order_processing_seconds',
114
+ help: 'Time to process orders',
115
+ buckets: [0.1, 0.5, 1, 2, 5]
116
+ })
117
+ const stop = processTime.startTimer()
118
+ // ... logic ...
119
+ stop()
175
120
  ```
176
121
 
177
- ## Tracing
178
-
179
- ### OpenTelemetry Integration
122
+ ## 🔍 Tracing
180
123
 
181
- @gravito/monitor uses the **OTLP (OpenTelemetry Protocol)** standard. To send traces to different backends:
182
-
183
- | Backend | Method |
184
- |---------|--------|
185
- | **Jaeger** | OTLP Collector → Jaeger |
186
- | **Zipkin** | OTLP Collector → Zipkin |
187
- | **AWS X-Ray** | AWS ADOT Collector |
188
- | **Google Cloud Trace** | GCP OTLP Collector |
189
- | **Datadog** | Datadog Agent (OTLP) |
124
+ Distributed tracing is powered by OpenTelemetry (OTLP). It automatically propagates trace context via W3C `traceparent` headers.
190
125
 
191
126
  ### Configuration
192
-
193
127
  ```typescript
194
- new MonitorOrbit({
195
- tracing: {
196
- enabled: true,
197
- serviceName: 'my-app',
198
- serviceVersion: '1.0.0',
199
- endpoint: 'http://localhost:4318/v1/traces', // OTLP HTTP
200
- sampleRate: 1.0, // 100% sampling
201
- resourceAttributes: {
202
- 'deployment.environment': 'production',
203
- },
204
- },
205
- })
128
+ tracing: {
129
+ enabled: true,
130
+ serviceName: 'gateway',
131
+ endpoint: 'http://otel-collector:4318/v1/traces',
132
+ sampleRate: 0.1, // Sample 10% of requests
133
+ resourceAttributes: {
134
+ env: 'production'
135
+ }
136
+ }
206
137
  ```
207
138
 
208
139
  ### Manual Spans
209
-
210
140
  ```typescript
211
141
  const tracer = core.services.get('tracing')
212
142
 
213
- // Start a span
214
- const span = tracer.startSpan('process-order', {
215
- attributes: { 'order.id': '12345' },
216
- })
217
-
143
+ const span = tracer.startSpan('compute_heavy_logic')
218
144
  try {
219
- // Do work...
220
- tracer.addEvent(span, 'payment-processed')
221
- tracer.setAttribute(span, 'order.total', 99.99)
145
+ // ... work ...
146
+ tracer.setAttribute(span, 'items_count', 100)
222
147
  tracer.endSpan(span, 'ok')
223
- } catch (error) {
148
+ } catch (e) {
224
149
  tracer.endSpan(span, 'error')
225
- throw error
226
150
  }
227
151
  ```
228
152
 
229
- ### Trace Context Propagation
230
-
231
- The tracing middleware automatically:
232
- - Extracts `traceparent` header from incoming requests
233
- - Injects trace context into outgoing requests
234
- - Records HTTP method, path, status code
235
-
236
- ## Kubernetes Integration
237
-
238
- ### Deployment Example
239
-
240
- ```yaml
241
- apiVersion: apps/v1
242
- kind: Deployment
243
- metadata:
244
- name: my-gravito-app
245
- spec:
246
- template:
247
- spec:
248
- containers:
249
- - name: app
250
- image: my-app:latest
251
- ports:
252
- - containerPort: 3000
253
- livenessProbe:
254
- httpGet:
255
- path: /live
256
- port: 3000
257
- initialDelaySeconds: 5
258
- periodSeconds: 10
259
- readinessProbe:
260
- httpGet:
261
- path: /ready
262
- port: 3000
263
- initialDelaySeconds: 5
264
- periodSeconds: 5
265
- ```
266
-
267
- ### ServiceMonitor for Prometheus
268
-
269
- ```yaml
270
- apiVersion: monitoring.coreos.com/v1
271
- kind: ServiceMonitor
272
- metadata:
273
- name: my-gravito-app
274
- spec:
275
- selector:
276
- matchLabels:
277
- app: my-gravito-app
278
- endpoints:
279
- - port: http
280
- path: /metrics
281
- interval: 15s
282
- ```
283
-
284
- ## Configuration Reference
153
+ ## ⚙️ Configuration Reference
285
154
 
286
155
  | Option | Type | Default | Description |
287
156
  |--------|------|---------|-------------|
288
- | `health.enabled` | boolean | `true` | Enable health endpoints |
289
- | `health.path` | string | `/health` | Full health check path |
290
- | `health.readyPath` | string | `/ready` | Readiness probe path |
291
- | `health.livePath` | string | `/live` | Liveness probe path |
292
- | `health.timeout` | number | `5000` | Check timeout (ms) |
293
- | `health.cacheTtl` | number | `0` | Cache duration (ms) |
294
- | `metrics.enabled` | boolean | `true` | Enable metrics endpoint |
295
- | `metrics.path` | string | `/metrics` | Metrics endpoint path |
296
- | `metrics.prefix` | string | `gravito_` | Metric name prefix |
297
- | `metrics.defaultMetrics` | boolean | `true` | Collect default metrics |
298
- | `tracing.enabled` | boolean | `false` | Enable tracing |
299
- | `tracing.serviceName` | string | `gravito-app` | Service name |
300
- | `tracing.endpoint` | string | `http://localhost:4318/v1/traces` | OTLP endpoint |
301
- | `tracing.sampleRate` | number | `1.0` | Sample rate (0.0-1.0) |
302
-
303
- ## License
304
-
305
- MIT
157
+ | `health.enabled` | `boolean` | `true` | Enable health endpoints |
158
+ | `health.path` | `string` | `/health` | Path for aggregated health check |
159
+ | `health.timeout` | `number` | `5000` | Timeout for checks in ms |
160
+ | `health.cacheTtl` | `number` | `0` | Cache results in ms (0 = disabled) |
161
+ | `metrics.enabled` | `boolean` | `true` | Enable Prometheus endpoint |
162
+ | `metrics.prefix` | `string` | `gravito_` | Metric name prefix |
163
+ | `metrics.defaultMetrics` | `boolean` | `true` | Collect Node.js runtime metrics |
164
+ | `tracing.enabled` | `boolean` | `false` | Enable OpenTelemetry tracing |
165
+ | `tracing.endpoint` | `string` | `http://localhost:4318/v1/traces` | OTLP Collector URL |
166
+ | `tracing.sampleRate` | `number` | `1.0` | Probability sampling (0.0 - 1.0) |
167
+
168
+ ## 📄 License
169
+
170
+ MIT © Carl Lee
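Note on the Tracing section of the updated README: it states that trace context is propagated via W3C `traceparent` headers. As a rough illustration of what that header carries, here is a minimal parser for the version `00` layout (`00-<trace-id>-<parent-id>-<flags>`). `parseTraceparent` and `TraceContext` are hypothetical names for this sketch and are not part of `@gravito/monitor`; the package's own extraction logic is not shown in this diff.

```typescript
// Hypothetical shape for the parsed header; not a type exported by the package.
interface TraceContext {
  traceId: string;   // 32 lowercase hex characters
  parentId: string;  // 16 lowercase hex characters
  sampled: boolean;  // lowest bit of the trace-flags byte
}

// Parses a version-00 W3C traceparent header; returns null when it is malformed.
function parseTraceparent(header: string): TraceContext | null {
  const match = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header.trim());
  if (!match) return null;
  const [, traceId, parentId, flags] = match;
  // All-zero trace or parent ids are invalid per the Trace Context specification.
  if (/^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```

A sampled flag of `01` means the upstream caller recorded the trace, which is the decision a `sampleRate` setting would normally feed into.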
@@ -0,0 +1,170 @@
1
+ # @gravito/monitor 🛰️
2
+
3
+ Lightweight observability module for Gravito, covering Health Checks, Metrics, and Tracing. Designed around the **Galaxy Architecture**, this Orbit provides the essential monitoring infrastructure for your planet.
4
+
5
+ ## 🚀 Core Features
6
+
7
+ - 🏥 **Health Checks** - Kubernetes-standard `/health`, `/ready`, `/live` endpoints, with support for custom checks.
8
+ - 📊 **Metrics** - Prometheus-compatible `/metrics` endpoint with built-in Node.js runtime and HTTP monitoring.
9
+ - 🔍 **Tracing** - OpenTelemetry OTLP support for distributed tracing across services.
10
+ - 🛡️ **Cloud-Native Integration** - Fits Kubernetes probe configuration and Prometheus ServiceMonitor.
11
+
12
+ ## 📦 Installation
13
+
14
+ ```bash
15
+ bun add @gravito/monitor
16
+ ```
17
+
18
+ To use OpenTelemetry tracing (optional):
19
+
20
+ ```bash
21
+ bun add @opentelemetry/sdk-node @opentelemetry/exporter-trace-otlp-http
22
+ ```
23
+
24
+ ## 🌌 Quick Start
25
+
26
+ Add `MonitorOrbit` to your `PlanetCore` to enable it.
27
+
28
+ ```typescript
29
+ import { PlanetCore } from '@gravito/core'
30
+ import { MonitorOrbit } from '@gravito/monitor'
31
+
32
+ const core = new PlanetCore()
33
+
34
+ core.orbit(new MonitorOrbit({
35
+ health: {
36
+ enabled: true,
37
+ path: '/health',
38
+ },
39
+ metrics: {
40
+ enabled: true,
41
+ prefix: 'myapp_',
42
+ },
43
+ tracing: {
44
+ enabled: process.env.NODE_ENV === 'production',
45
+ serviceName: 'order-service',
46
+ },
47
+ }))
48
+
49
+ await core.liftoff()
50
+ ```
51
+
52
+ ## 🏥 Health Checks
53
+
54
+ The health check system provides three endpoints that follow Kubernetes best practices:
55
+
56
+ - **Liveness (`/live`)**: Indicates whether the process is running normally.
57
+ - **Readiness (`/ready`)**: Indicates whether the app is ready to receive traffic (all registered checks must pass).
58
+ - **Health (`/health`)**: Full health report with the details of every registered check.
59
+
60
+ ### Registering Custom Checks
61
+
62
+ You can register custom check logic through the `monitor` service.
63
+
64
+ ```typescript
65
+ const monitor = core.services.get('monitor')
66
+
67
+ // Simple check
68
+ monitor.health.register('database', async () => {
69
+ const isOk = await db.ping()
70
+ return isOk ? { status: 'healthy' } : { status: 'unhealthy', message: 'Database connection lost' }
71
+ })
72
+
73
+ // Detailed report
74
+ monitor.health.register('disk_space', () => {
75
+ const usage = getDiskUsage()
76
+ return {
77
+ status: usage < 90 ? 'healthy' : 'degraded',
78
+ details: { usage: `${usage}%` }
79
+ }
80
+ })
81
+ ```
82
+
83
+ ## 📊 Metrics
84
+
85
+ Metrics are exposed in Prometheus text format at the `/metrics` endpoint.
86
+
87
+ ### Built-in Metrics
88
+ - **Runtime**: Heap usage, uptime, active handles.
89
+ - **HTTP**: Total requests (`http_requests_total`), request duration distribution (`http_request_duration_seconds`).
90
+
91
+ ### Custom Metrics
92
+
93
+ ```typescript
94
+ const monitor = core.services.get('monitor')
95
+
96
+ // 1. Counter (monotonically increasing)
97
+ const orders = monitor.metrics.counter({
98
+ name: 'orders_total',
99
+ help: 'Total orders processed',
100
+ labels: ['status']
101
+ })
102
+ orders.inc({ status: 'completed' })
103
+
104
+ // 2. Gauge (can go up and down)
105
+ const activeUsers = monitor.metrics.gauge({
106
+ name: 'active_users',
107
+ help: 'Current number of active users'
108
+ })
109
+ activeUsers.set(42)
110
+
111
+ // 3. Histogram (value distribution)
112
+ const processTime = monitor.metrics.histogram({
113
+ name: 'order_processing_seconds',
114
+ help: 'Order processing time',
115
+ buckets: [0.1, 0.5, 1, 2, 5]
116
+ })
117
+ const stop = processTime.startTimer()
118
+ // ... run logic ...
119
+ stop()
120
+ ```
121
+
122
+ ## 🔍 Tracing
123
+
124
+ Tracing is powered by OpenTelemetry (OTLP) and automatically propagates trace context via the W3C `traceparent` header.
125
+
126
+ ### Configuration Example
127
+ ```typescript
128
+ tracing: {
129
+ enabled: true,
130
+ serviceName: 'gateway',
131
+ endpoint: 'http://otel-collector:4318/v1/traces',
132
+ sampleRate: 0.1, // Sample 10% of requests
133
+ resourceAttributes: {
134
+ env: 'production'
135
+ }
136
+ }
137
+ ```
138
+
139
+ ### Creating Spans Manually
140
+ ```typescript
141
+ const tracer = core.services.get('tracing')
142
+
143
+ const span = tracer.startSpan('compute_heavy_logic')
144
+ try {
145
+ // ... run complex computation ...
146
+ tracer.setAttribute(span, 'items_count', 100)
147
+ tracer.endSpan(span, 'ok')
148
+ } catch (e) {
149
+ tracer.endSpan(span, 'error')
150
+ }
151
+ ```
152
+
153
+ ## ⚙️ Configuration Reference
154
+
155
+ | Option | Type | Default | Description |
156
+ |--------|------|---------|-------------|
157
+ | `health.enabled` | `boolean` | `true` | Enable health check endpoints |
158
+ | `health.path` | `string` | `/health` | Path of the full health report |
159
+ | `health.timeout` | `number` | `5000` | Check timeout (ms) |
160
+ | `health.cacheTtl` | `number` | `0` | Result cache duration (ms, 0 = no caching) |
161
+ | `metrics.enabled` | `boolean` | `true` | Enable the Prometheus metrics endpoint |
162
+ | `metrics.prefix` | `string` | `gravito_` | Metric name prefix |
163
+ | `metrics.defaultMetrics` | `boolean` | `true` | Collect Node.js runtime metrics |
164
+ | `tracing.enabled` | `boolean` | `false` | Enable OpenTelemetry tracing |
165
+ | `tracing.endpoint` | `string` | `http://localhost:4318/v1/traces` | OTLP Collector URL |
166
+ | `tracing.sampleRate` | `number` | `1.0` | Sample rate (0.0 - 1.0) |
167
+
168
+ ## 📄 License
169
+
170
+ MIT © Carl Lee
package/dist/index.cjs CHANGED
@@ -65,8 +65,15 @@ var HealthController = class {
65
65
  */
66
66
  async health(c) {
67
67
  const report = await this.registry.check();
68
+ const cacheStats = this.registry.getCacheStats();
68
69
  const status = report.status === "healthy" ? 200 : report.status === "degraded" ? 200 : 503;
69
- return c.json(report, status);
70
+ return c.json(
71
+ {
72
+ ...report,
73
+ cache: cacheStats
74
+ },
75
+ status
76
+ );
70
77
  }
71
78
  /**
72
79
  * GET /ready - Kubernetes readiness probe
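Note on the `health` handler change: the hunk keeps the existing status-to-HTTP mapping (`healthy` and `degraded` answer 200, anything else 503) and now merges cache statistics into the JSON body under a `cache` key. A minimal sketch of that behaviour, assuming a simplified `HealthReport` shape; `httpStatusFor` and `buildHealthResponse` are hypothetical names used only for illustration, since the package calls `c.json(...)` directly.

```typescript
type HealthStatus = "healthy" | "degraded" | "unhealthy";

interface CacheStats {
  hits: number;
  misses: number;
  hitRate: number;
}

// Simplified stand-in for the package's HealthReport (assumption).
interface HealthReport {
  status: HealthStatus;
  checks: Record<string, unknown>;
}

// Mirrors the ternary in the diff: only a non-healthy, non-degraded report yields 503.
function httpStatusFor(status: HealthStatus): 200 | 503 {
  return status === "healthy" || status === "degraded" ? 200 : 503;
}

// Builds the response body and status the same way the updated handler does.
function buildHealthResponse(report: HealthReport, cache: CacheStats) {
  return { body: { ...report, cache }, status: httpStatusFor(report.status) };
}
```

Clients that parse `/health` strictly may need to tolerate the extra `cache` field after upgrading to 3.1.0.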
@@ -101,6 +108,8 @@ var HealthRegistry = class {
101
108
  cacheExpiry = 0;
102
109
  timeout;
103
110
  cacheTtl;
111
+ cacheHits = 0;
112
+ cacheMisses = 0;
104
113
  constructor(config = {}) {
105
114
  this.timeout = config.timeout ?? DEFAULTS.timeout;
106
115
  this.cacheTtl = config.cacheTtl ?? DEFAULTS.cacheTtl;
@@ -158,8 +167,10 @@ var HealthRegistry = class {
158
167
  */
159
168
  async check() {
160
169
  if (this.cacheTtl > 0 && this.cachedReport && Date.now() < this.cacheExpiry) {
170
+ this.cacheHits++;
161
171
  return this.cachedReport;
162
172
  }
173
+ this.cacheMisses++;
163
174
  const results = await Promise.all(
164
175
  Array.from(this.checks.entries()).map(([name, check]) => this.executeCheck(name, check))
165
176
  );
@@ -209,6 +220,19 @@ var HealthRegistry = class {
209
220
  }
210
221
  return { status: "healthy" };
211
222
  }
223
+ /**
224
+ * Get cache statistics
225
+ *
226
+ * Useful for monitoring cache effectiveness and tuning cacheTtl
227
+ */
228
+ getCacheStats() {
229
+ const total = this.cacheHits + this.cacheMisses;
230
+ return {
231
+ hits: this.cacheHits,
232
+ misses: this.cacheMisses,
233
+ hitRate: total > 0 ? this.cacheHits / total : 0
234
+ };
235
+ }
212
236
  };
213
237
 
214
238
  // src/health/index.ts
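Note on the cache counters: `check()` now increments `cacheHits` when a still-valid cached report is returned and `cacheMisses` otherwise, and `getCacheStats()` derives `hitRate` from the two. The sketch below reproduces that counting in isolation. It is synchronous for brevity, takes an injectable clock for testing, and assumes the cached value and its expiry are refreshed on every miss; that refresh is not visible in this hunk. `CountingTtlCache` is a hypothetical name.

```typescript
interface CacheStats {
  hits: number;
  misses: number;
  hitRate: number;
}

class CountingTtlCache<T> {
  private cached: T | undefined = undefined;
  private expiry = 0;
  private hits = 0;
  private misses = 0;
  private ttlMs: number;
  private compute: () => T;
  private now: () => number;

  constructor(ttlMs: number, compute: () => T, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.compute = compute;
    this.now = now;
  }

  get(): T {
    // Same guard as the diff: caching is active only when the TTL is positive.
    if (this.ttlMs > 0 && this.cached !== undefined && this.now() < this.expiry) {
      this.hits++;
      return this.cached;
    }
    this.misses++;
    const value = this.compute();
    this.cached = value;
    this.expiry = this.now() + this.ttlMs; // assumed refresh step
    return value;
  }

  stats(): CacheStats {
    const total = this.hits + this.misses;
    return { hits: this.hits, misses: this.misses, hitRate: total > 0 ? this.hits / total : 0 };
  }
}
```

With the default `cacheTtl` of `0` the guard never passes, so every call counts as a miss and `hitRate` stays at 0.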
@@ -342,6 +366,7 @@ var MetricsController = class {
342
366
  * GET /metrics - Prometheus metrics endpoint
343
367
  */
344
368
  async metrics(_c) {
369
+ this.updateHealthCacheMetrics();
345
370
  const prometheusFormat = this.registry.toPrometheus();
346
371
  return new Response(prometheusFormat, {
347
372
  status: 200,
@@ -350,6 +375,19 @@ var MetricsController = class {
350
375
  }
351
376
  });
352
377
  }
378
+ /**
379
+ * Update health cache metrics
380
+ *
381
+ * Read the latest cache statistics from HealthRegistry and update the gauges
382
+ */
383
+ updateHealthCacheMetrics() {
384
+ const healthMetrics = this.registry._healthCacheMetrics;
385
+ if (!healthMetrics) return;
386
+ const stats = healthMetrics.registry.getCacheStats();
387
+ healthMetrics.hits.set(stats.hits);
388
+ healthMetrics.misses.set(stats.misses);
389
+ healthMetrics.hitRate.set(stats.hitRate);
390
+ }
353
391
  };
354
392
 
355
393
  // src/metrics/MetricsRegistry.ts
@@ -705,7 +743,7 @@ function createHttpMetricsMiddleware(registry) {
705
743
  });
706
744
  return async (c, next) => {
707
745
  const method = c.req.method;
708
- const path = normalizePath(c.req.path);
746
+ const path = c.req.routePattern ?? normalizePath(c.req.path);
709
747
  const start = performance.now();
710
748
  await next();
711
749
  const duration = (performance.now() - start) / 1e3;
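Note on the `path` label change: the middleware now prefers `c.req.routePattern` and falls back to `normalizePath(c.req.path)` only when no pattern is available. Using the route template as the label keeps the number of distinct label values bounded by the number of routes. A minimal sketch of the selection logic follows; `normalizePathSketch` is a hypothetical stand-in, because the package's own `normalizePath` is not shown in this diff and may behave differently.

```typescript
// Minimal request shape for this sketch (assumption).
interface RequestLike {
  path: string;
  routePattern?: string;
}

const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical fallback: collapse numeric and UUID segments into ":id".
function normalizePathSketch(path: string): string {
  return path
    .split("/")
    .map((segment) => (/^\d+$/.test(segment) || UUID_RE.test(segment) ? ":id" : segment))
    .join("/");
}

// Same precedence as the diff: route pattern first, normalized raw path second.
function metricPathLabel(req: RequestLike): string {
  return req.routePattern ?? normalizePathSketch(req.path);
}
```

Dashboards or alerts that match on the old normalized `path` values may need their label matchers checked after the upgrade.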
@@ -1002,9 +1040,37 @@ var MonitorOrbit = class {
1002
1040
  const metricsController = new MetricsController(this.metricsRegistry);
1003
1041
  router.get(metricsPath, (c) => metricsController.metrics(c));
1004
1042
  console.log(`[Monitor] Metrics endpoint: ${metricsPath}`);
1043
+ if (healthEnabled && this.healthRegistry) {
1044
+ this.registerHealthCacheMetrics(this.metricsRegistry, this.healthRegistry);
1045
+ }
1005
1046
  }
1006
1047
  console.log("[Monitor] Observability services initialized");
1007
1048
  }
1049
+ /**
1050
+ * Register health cache metrics
1051
+ *
1052
+ * Create metrics to track health check cache performance
1053
+ */
1054
+ registerHealthCacheMetrics(metricsRegistry, healthRegistry) {
1055
+ const cacheHitsGauge = metricsRegistry.gauge({
1056
+ name: "health_cache_hits_total",
1057
+ help: "Total number of health check cache hits"
1058
+ });
1059
+ const cacheMissesGauge = metricsRegistry.gauge({
1060
+ name: "health_cache_misses_total",
1061
+ help: "Total number of health check cache misses"
1062
+ });
1063
+ const cacheHitRateGauge = metricsRegistry.gauge({
1064
+ name: "health_cache_hit_rate",
1065
+ help: "Health check cache hit rate (0.0 to 1.0)"
1066
+ });
1067
+ metricsRegistry._healthCacheMetrics = {
1068
+ hits: cacheHitsGauge,
1069
+ misses: cacheMissesGauge,
1070
+ hitRate: cacheHitRateGauge,
1071
+ registry: healthRegistry
1072
+ };
1073
+ }
1008
1074
  /**
1009
1075
  * Shutdown hook
1010
1076
  */
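Note on the health cache gauges: the three gauges are registered once in `install()`, and `metrics()` calls `updateHealthCacheMetrics()` to copy the latest `getCacheStats()` values into them before rendering. The sketch below shows that refresh-on-scrape pattern in isolation. `Gauge` here is a minimal stand-in rather than the package's gauge class, `renderOnScrape` is a hypothetical name, and how `metrics.prefix` is applied to these names is not visible in this diff.

```typescript
interface CacheStats {
  hits: number;
  misses: number;
  hitRate: number;
}

// Minimal stand-in gauge: holds one value and renders one exposition line.
class Gauge {
  readonly name: string;
  private value = 0;

  constructor(name: string) {
    this.name = name;
  }

  set(value: number): void {
    this.value = value;
  }

  render(): string {
    return `${this.name} ${this.value}`;
  }
}

interface HealthCacheGauges {
  hits: Gauge;
  misses: Gauge;
  hitRate: Gauge;
}

// Pulls fresh stats at scrape time, updates the gauges, then renders them.
function renderOnScrape(source: { getCacheStats(): CacheStats }, gauges: HealthCacheGauges): string {
  const stats = source.getCacheStats();
  gauges.hits.set(stats.hits);
  gauges.misses.set(stats.misses);
  gauges.hitRate.set(stats.hitRate);
  return [gauges.hits, gauges.misses, gauges.hitRate].map((g) => g.render()).join("\n");
}
```

Because the values are copied only when `/metrics` is requested, the gauges reflect the state at the last scrape, not a continuously updated counter.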
package/dist/index.d.cts CHANGED
@@ -113,6 +113,14 @@ interface HealthReport {
113
113
  name: string;
114
114
  }>;
115
115
  }
116
+ /**
117
+ * Cache statistics
118
+ */
119
+ interface CacheStats {
120
+ hits: number;
121
+ misses: number;
122
+ hitRate: number;
123
+ }
116
124
  /**
117
125
  * HealthRegistry manages all health checks
118
126
  */
@@ -123,6 +131,8 @@ declare class HealthRegistry {
123
131
  private cacheExpiry;
124
132
  private timeout;
125
133
  private cacheTtl;
134
+ private cacheHits;
135
+ private cacheMisses;
126
136
  constructor(config?: HealthConfig);
127
137
  /**
128
138
  * Register a health check
@@ -158,6 +168,12 @@ declare class HealthRegistry {
158
168
  status: 'healthy' | 'unhealthy';
159
169
  reason?: string;
160
170
  }>;
171
+ /**
172
+ * Get cache statistics
173
+ *
174
+ * Useful for monitoring cache effectiveness and tuning cacheTtl
175
+ */
176
+ getCacheStats(): CacheStats;
161
177
  }
162
178
 
163
179
  /**
@@ -373,6 +389,12 @@ declare class MetricsController {
373
389
  * GET /metrics - Prometheus metrics endpoint
374
390
  */
375
391
  metrics(_c: GravitoContext): Promise<Response>;
392
+ /**
393
+ * Update health cache metrics
394
+ *
395
+ * Read the latest cache statistics from HealthRegistry and update the gauges
396
+ */
397
+ private updateHealthCacheMetrics;
376
398
  }
377
399
 
378
400
  /**
@@ -519,6 +541,12 @@ declare class MonitorOrbit implements GravitoOrbit {
519
541
  * Install the orbit (required by GravitoOrbit interface)
520
542
  */
521
543
  install(core: PlanetCore): Promise<void>;
544
+ /**
545
+ * ่จปๅ†Š health cache metrics
546
+ *
547
+ * ๅปบ็ซ‹ metrics ไพ†่ฟฝ่นค health check cache ็š„ๆ•ˆ่ƒฝ
548
+ */
549
+ private registerHealthCacheMetrics;
522
550
  /**
523
551
  * Shutdown hook
524
552
  */
package/dist/index.d.ts CHANGED
@@ -113,6 +113,14 @@ interface HealthReport {
113
113
  name: string;
114
114
  }>;
115
115
  }
116
+ /**
117
+ * Cache statistics
118
+ */
119
+ interface CacheStats {
120
+ hits: number;
121
+ misses: number;
122
+ hitRate: number;
123
+ }
116
124
  /**
117
125
  * HealthRegistry manages all health checks
118
126
  */
@@ -123,6 +131,8 @@ declare class HealthRegistry {
123
131
  private cacheExpiry;
124
132
  private timeout;
125
133
  private cacheTtl;
134
+ private cacheHits;
135
+ private cacheMisses;
126
136
  constructor(config?: HealthConfig);
127
137
  /**
128
138
  * Register a health check
@@ -158,6 +168,12 @@ declare class HealthRegistry {
158
168
  status: 'healthy' | 'unhealthy';
159
169
  reason?: string;
160
170
  }>;
171
+ /**
172
+ * Get cache statistics
173
+ *
174
+ * Useful for monitoring cache effectiveness and tuning cacheTtl
175
+ */
176
+ getCacheStats(): CacheStats;
161
177
  }
162
178
 
163
179
  /**
@@ -373,6 +389,12 @@ declare class MetricsController {
373
389
  * GET /metrics - Prometheus metrics endpoint
374
390
  */
375
391
  metrics(_c: GravitoContext): Promise<Response>;
392
+ /**
393
+ * Update health cache metrics
394
+ *
395
+ * Read the latest cache statistics from HealthRegistry and update the gauges
396
+ */
397
+ private updateHealthCacheMetrics;
376
398
  }
377
399
 
378
400
  /**
@@ -519,6 +541,12 @@ declare class MonitorOrbit implements GravitoOrbit {
519
541
  * Install the orbit (required by GravitoOrbit interface)
520
542
  */
521
543
  install(core: PlanetCore): Promise<void>;
544
+ /**
545
+ * ่จปๅ†Š health cache metrics
546
+ *
547
+ * ๅปบ็ซ‹ metrics ไพ†่ฟฝ่นค health check cache ็š„ๆ•ˆ่ƒฝ
548
+ */
549
+ private registerHealthCacheMetrics;
522
550
  /**
523
551
  * Shutdown hook
524
552
  */
package/dist/index.js CHANGED
@@ -13,8 +13,15 @@ var HealthController = class {
13
13
  */
14
14
  async health(c) {
15
15
  const report = await this.registry.check();
16
+ const cacheStats = this.registry.getCacheStats();
16
17
  const status = report.status === "healthy" ? 200 : report.status === "degraded" ? 200 : 503;
17
- return c.json(report, status);
18
+ return c.json(
19
+ {
20
+ ...report,
21
+ cache: cacheStats
22
+ },
23
+ status
24
+ );
18
25
  }
19
26
  /**
20
27
  * GET /ready - Kubernetes readiness probe
@@ -49,6 +56,8 @@ var HealthRegistry = class {
49
56
  cacheExpiry = 0;
50
57
  timeout;
51
58
  cacheTtl;
59
+ cacheHits = 0;
60
+ cacheMisses = 0;
52
61
  constructor(config = {}) {
53
62
  this.timeout = config.timeout ?? DEFAULTS.timeout;
54
63
  this.cacheTtl = config.cacheTtl ?? DEFAULTS.cacheTtl;
@@ -106,8 +115,10 @@ var HealthRegistry = class {
106
115
  */
107
116
  async check() {
108
117
  if (this.cacheTtl > 0 && this.cachedReport && Date.now() < this.cacheExpiry) {
118
+ this.cacheHits++;
109
119
  return this.cachedReport;
110
120
  }
121
+ this.cacheMisses++;
111
122
  const results = await Promise.all(
112
123
  Array.from(this.checks.entries()).map(([name, check]) => this.executeCheck(name, check))
113
124
  );
@@ -157,6 +168,19 @@ var HealthRegistry = class {
157
168
  }
158
169
  return { status: "healthy" };
159
170
  }
171
+ /**
172
+ * Get cache statistics
173
+ *
174
+ * Useful for monitoring cache effectiveness and tuning cacheTtl
175
+ */
176
+ getCacheStats() {
177
+ const total = this.cacheHits + this.cacheMisses;
178
+ return {
179
+ hits: this.cacheHits,
180
+ misses: this.cacheMisses,
181
+ hitRate: total > 0 ? this.cacheHits / total : 0
182
+ };
183
+ }
160
184
  };
161
185
 
162
186
  // src/health/index.ts
@@ -290,6 +314,7 @@ var MetricsController = class {
290
314
  * GET /metrics - Prometheus metrics endpoint
291
315
  */
292
316
  async metrics(_c) {
317
+ this.updateHealthCacheMetrics();
293
318
  const prometheusFormat = this.registry.toPrometheus();
294
319
  return new Response(prometheusFormat, {
295
320
  status: 200,
@@ -298,6 +323,19 @@ var MetricsController = class {
298
323
  }
299
324
  });
300
325
  }
326
+ /**
327
+ * Update health cache metrics
328
+ *
329
+ * Read the latest cache statistics from HealthRegistry and update the gauges
330
+ */
331
+ updateHealthCacheMetrics() {
332
+ const healthMetrics = this.registry._healthCacheMetrics;
333
+ if (!healthMetrics) return;
334
+ const stats = healthMetrics.registry.getCacheStats();
335
+ healthMetrics.hits.set(stats.hits);
336
+ healthMetrics.misses.set(stats.misses);
337
+ healthMetrics.hitRate.set(stats.hitRate);
338
+ }
301
339
  };
302
340
 
303
341
  // src/metrics/MetricsRegistry.ts
@@ -653,7 +691,7 @@ function createHttpMetricsMiddleware(registry) {
653
691
  });
654
692
  return async (c, next) => {
655
693
  const method = c.req.method;
656
- const path = normalizePath(c.req.path);
694
+ const path = c.req.routePattern ?? normalizePath(c.req.path);
657
695
  const start = performance.now();
658
696
  await next();
659
697
  const duration = (performance.now() - start) / 1e3;
@@ -950,9 +988,37 @@ var MonitorOrbit = class {
950
988
  const metricsController = new MetricsController(this.metricsRegistry);
951
989
  router.get(metricsPath, (c) => metricsController.metrics(c));
952
990
  console.log(`[Monitor] Metrics endpoint: ${metricsPath}`);
991
+ if (healthEnabled && this.healthRegistry) {
992
+ this.registerHealthCacheMetrics(this.metricsRegistry, this.healthRegistry);
993
+ }
953
994
  }
954
995
  console.log("[Monitor] Observability services initialized");
955
996
  }
997
+ /**
998
+ * Register health cache metrics
999
+ *
1000
+ * Create metrics to track health check cache performance
1001
+ */
1002
+ registerHealthCacheMetrics(metricsRegistry, healthRegistry) {
1003
+ const cacheHitsGauge = metricsRegistry.gauge({
1004
+ name: "health_cache_hits_total",
1005
+ help: "Total number of health check cache hits"
1006
+ });
1007
+ const cacheMissesGauge = metricsRegistry.gauge({
1008
+ name: "health_cache_misses_total",
1009
+ help: "Total number of health check cache misses"
1010
+ });
1011
+ const cacheHitRateGauge = metricsRegistry.gauge({
1012
+ name: "health_cache_hit_rate",
1013
+ help: "Health check cache hit rate (0.0 to 1.0)"
1014
+ });
1015
+ metricsRegistry._healthCacheMetrics = {
1016
+ hits: cacheHitsGauge,
1017
+ misses: cacheMissesGauge,
1018
+ hitRate: cacheHitRateGauge,
1019
+ registry: healthRegistry
1020
+ };
1021
+ }
956
1022
  /**
957
1023
  * Shutdown hook
958
1024
  */
package/package.json CHANGED
@@ -1,6 +1,6 @@
1
1
  {
2
2
  "name": "@gravito/monitor",
3
- "version": "3.0.1",
3
+ "version": "3.1.0",
4
4
  "description": "Observability module for Gravito - Health checks, Metrics, and Tracing",
5
5
  "type": "module",
6
6
  "main": "./dist/index.cjs",
@@ -25,9 +25,11 @@
25
25
  "scripts": {
26
26
  "build": "bun run build.ts",
27
27
  "typecheck": "bun tsc -p tsconfig.json --noEmit --skipLibCheck",
28
- "test": "bun test",
29
- "test:coverage": "bun test --coverage --coverage-threshold=80",
30
- "test:ci": "bun test --coverage --coverage-threshold=80"
28
+ "test": "bun test --timeout=10000",
29
+ "test:coverage": "bun test --timeout=10000 --coverage --coverage-reporter=lcov --coverage-dir coverage && bun run --bun scripts/check-coverage.ts",
30
+ "test:ci": "bun test --timeout=10000 --coverage --coverage-reporter=lcov --coverage-dir coverage && bun run --bun scripts/check-coverage.ts",
31
+ "test:unit": "bun test tests/ --timeout=10000",
32
+ "test:integration": "test $(find tests -name '*.integration.test.ts' 2>/dev/null | wc -l) -gt 0 && find tests -name '*.integration.test.ts' -print0 | xargs -0 bun test --timeout=10000 || echo 'No integration tests found'"
31
33
  },
32
34
  "peerDependencies": {
33
35
  "@gravito/core": "workspace:*",