loki-mode 5.42.2 → 5.46.0
- package/README.md +4 -3
- package/SKILL.md +2 -2
- package/VERSION +1 -1
- package/autonomy/app-runner.sh +684 -0
- package/autonomy/checklist-verify.py +368 -0
- package/autonomy/completion-council.sh +49 -0
- package/autonomy/loki +83 -0
- package/autonomy/playwright-verify.sh +350 -0
- package/autonomy/prd-analyzer.py +457 -0
- package/autonomy/prd-checklist.sh +223 -0
- package/autonomy/run.sh +164 -4
- package/completions/loki.bash +6 -1
- package/dashboard/__init__.py +1 -1
- package/dashboard/server.py +134 -1
- package/dashboard/static/index.html +804 -265
- package/docs/INSTALLATION.md +1 -1
- package/docs/audit-logging.md +600 -0
- package/docs/authentication.md +374 -0
- package/docs/authorization.md +455 -0
- package/docs/git-workflow.md +446 -0
- package/docs/metrics.md +527 -0
- package/docs/network-security.md +275 -0
- package/docs/openclaw-integration.md +572 -0
- package/docs/siem-integration.md +579 -0
- package/learning/__init__.py +1 -1
- package/mcp/__init__.py +1 -1
- package/memory/__init__.py +2 -0
- package/package.json +2 -1
package/docs/metrics.md
ADDED
@@ -0,0 +1,527 @@
# Metrics Guide

Prometheus and OpenMetrics monitoring for Loki Mode (v5.38.0).

## Overview

Loki Mode exposes a `/metrics` endpoint that returns production-ready metrics in Prometheus/OpenMetrics text format. This enables integration with:

- Prometheus
- Grafana
- Datadog
- New Relic
- Elastic APM
- Any OpenMetrics-compatible monitoring system

## Quick Start

```bash
# Enable metrics endpoint
export LOKI_METRICS_ENABLED=true

# Start Loki Mode
loki start ./prd.md

# View metrics
curl http://localhost:57374/metrics

# Or use CLI
loki metrics
```

## Metrics Endpoint

```
GET http://localhost:57374/metrics
Content-Type: text/plain; version=0.0.4
```

Returns metrics in the Prometheus text exposition format (content type `text/plain; version=0.0.4`), which Prometheus and most OpenMetrics-compatible scrapers accept. No authentication is required by default; configure reverse proxy authentication for production.

## Available Metrics

### Session Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `loki_session_status` | gauge | Current session status: 0=stopped, 1=running, 2=paused |
| `loki_iteration_current` | gauge | Current iteration number |
| `loki_iteration_max` | gauge | Maximum configured iterations (from LOKI_MAX_ITERATIONS) |
| `loki_uptime_seconds` | gauge | Seconds since session started |

### Task Metrics

| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `loki_tasks_total` | gauge | `status` | Number of tasks by status: pending, in_progress, completed, failed |

### Agent Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `loki_agents_active` | gauge | Number of currently active agents |
| `loki_agents_total` | gauge | Total number of registered agents |

### Cost Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `loki_cost_usd` | gauge | Estimated total session cost in USD |

### Event Metrics

| Metric | Type | Description |
|--------|------|-------------|
| `loki_events_total` | counter | Total number of events recorded in events.jsonl |

## Data Sources

Metrics are derived from `.loki/` flat files:

| File | Metrics |
|------|---------|
| `dashboard-state.json` | session_status, iteration_current, iteration_max, tasks_total, agents_active |
| `loki.pid` | session_status (PID alive check fallback), uptime_seconds |
| `state/agents.json` | agents_total |
| `metrics/efficiency/*.json` | cost_usd |
| `events.jsonl` | events_total (line count) |
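
To make the "line count" and "PID alive check" entries concrete, here is a minimal sketch of how such values could be derived. It is not taken from `dashboard/server.py`: the function names `count_events` and `pid_alive` are hypothetical, each event is assumed to occupy exactly one line, and the PID probe is POSIX-only.

```python
import os
from pathlib import Path


def count_events(events_path):
    """Return the number of lines in a JSONL file, or 0 if the file is missing."""
    path = Path(events_path)
    if not path.exists():
        return 0
    with path.open("r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX signal-0 probe)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence, nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True
```

A missing `events.jsonl` maps to a count of zero here, which is one plausible reading of why a fresh session can report zero values.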

## CLI Usage

```bash
# Fetch all metrics
loki metrics

# Filter specific metric
loki metrics | grep loki_cost_usd

# Watch metrics in real-time
watch -n 5 loki metrics

# Custom dashboard host/port
loki metrics --host 192.168.1.100 --port 8080
```

## Prometheus Configuration

### Basic Scrape Config

Add to `prometheus.yml`:

```yaml
scrape_configs:
  - job_name: 'loki-mode'
    scrape_interval: 15s
    static_configs:
      - targets: ['localhost:57374']
        labels:
          environment: 'production'
          project: 'my-app'
```

### With TLS/HTTPS

```yaml
scrape_configs:
  - job_name: 'loki-mode'
    scheme: https
    tls_config:
      insecure_skip_verify: true  # For self-signed certs
    static_configs:
      - targets: ['localhost:57374']
```

### With Authentication (via reverse proxy)

```yaml
scrape_configs:
  - job_name: 'loki-mode'
    scheme: https
    bearer_token: 'loki_xxx...'
    static_configs:
      - targets: ['dashboard.example.com:443']
```

### Service Discovery (Kubernetes)

```yaml
scrape_configs:
  - job_name: 'loki-mode'
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - loki
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app]
        action: keep
        regex: loki-mode
      - source_labels: [__meta_kubernetes_pod_ip]
        target_label: __address__
        replacement: $1:57374
```
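
The last relabel rule rewrites the scrape address from the pod IP. Prometheus applies the default regex `(.*)`, anchored to the whole value, and substitutes `$1` with the first capture group. The sketch below mimics that one rule in Python to show the resulting target; `relabel_address` is a hypothetical helper, not part of Prometheus or Loki Mode.

```python
import re
from typing import Optional


def relabel_address(pod_ip: str, regex: str = "(.*)", port: int = 57374) -> Optional[str]:
    """Mimic a `replace` relabel rule with replacement `$1:<port>`.

    Returns None when the regex does not match the whole value, in which
    case Prometheus would leave the target label unchanged.
    """
    match = re.fullmatch(regex, pod_ip)  # Prometheus anchors relabel regexes
    if match is None:
        return None
    return match.expand(r"\g<1>:%d" % port)


print(relabel_address("10.0.0.7"))
```

For a pod IP of `10.0.0.7` the rewritten `__address__` is `10.0.0.7:57374`, the dashboard port used throughout this guide.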

## Grafana Integration

### Add Prometheus Data Source

1. Navigate to Configuration > Data Sources
2. Click "Add data source"
3. Select "Prometheus"
4. URL: `http://prometheus-server:9090`
5. Save & Test

### Create Dashboard

Import the Loki Mode dashboard template or create custom panels:

#### Panel 1: Session Status

- **Type:** Stat
- **Query:** `loki_session_status`
- **Value Mappings:**
  - 0 = Stopped (Red)
  - 1 = Running (Green)
  - 2 = Paused (Yellow)

#### Panel 2: Iteration Progress

- **Type:** Gauge
- **Query:** `loki_iteration_current / loki_iteration_max * 100`
- **Unit:** Percent (0-100)
- **Thresholds:** 0-50 (yellow), 50-100 (green)

#### Panel 3: Task Distribution

- **Type:** Pie chart
- **Query:** `loki_tasks_total`
- **Legend:** `{{status}}`

#### Panel 4: Agent Activity

- **Type:** Time series
- **Query:** `loki_agents_active`
- **Legend:** Active Agents

#### Panel 5: Cost Tracking

- **Type:** Stat
- **Query:** `loki_cost_usd`
- **Unit:** Currency (USD)
- **Decimals:** 2

#### Panel 6: Event Rate

- **Type:** Graph
- **Query:** `rate(loki_events_total[5m])`
- **Legend:** Events per second

#### Panel 7: Uptime

- **Type:** Stat
- **Query:** `loki_uptime_seconds`
- **Unit:** Duration (seconds)

### Example PromQL Queries

```promql
# Session is running
loki_session_status == 1

# Iteration progress percentage
loki_iteration_current / loki_iteration_max * 100

# Total pending + in-progress tasks
# (adding the two selectors directly returns nothing because their
# status labels differ, so sum over a regex match instead)
sum(loki_tasks_total{status=~"pending|in_progress"})

# Cost per hour (loki_cost_usd is a gauge, so use delta() rather than rate())
delta(loki_cost_usd[1h])

# Event rate (events per minute)
rate(loki_events_total[5m]) * 60

# Tasks completed over the last 10 minutes (gauge, so delta() rather than rate())
delta(loki_tasks_total{status="completed"}[10m])

# Failed task ratio (aggregate both sides so the label sets match)
sum(loki_tasks_total{status="failed"}) / sum(loki_tasks_total)
```
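
The arithmetic behind the progress and failure-ratio queries can be checked by hand. The sketch below reproduces it in Python on made-up sample values; the helper names are hypothetical, and unlike PromQL (which yields NaN or Inf on division by zero) these helpers return 0.0 when the denominator is zero.

```python
def iteration_progress(current, maximum):
    """Percentage of iterations completed: current / max * 100."""
    if maximum <= 0:
        return 0.0
    return current / maximum * 100


def failed_ratio(tasks_by_status):
    """Share of failed tasks across all statuses."""
    total = sum(tasks_by_status.values())
    if total == 0:
        return 0.0
    return tasks_by_status.get("failed", 0) / total


# Illustrative values only, not output from a real session
tasks = {"pending": 3, "in_progress": 1, "completed": 5, "failed": 1}
print(iteration_progress(25, 100))
print(failed_ratio(tasks))
```

With one failed task out of ten, the ratio sits exactly at the 0.1 threshold, so a strict "greater than 0.1" comparison would not yet fire.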

## Datadog Integration

### Configure OpenMetrics Check

Create `/etc/datadog-agent/conf.d/openmetrics.d/loki_mode.yaml`:

```yaml
instances:
  - prometheus_url: http://localhost:57374/metrics
    namespace: loki
    metrics:
      - loki_session_status
      - loki_iteration_current
      - loki_iteration_max
      - loki_tasks_total
      - loki_agents_active
      - loki_agents_total
      - loki_cost_usd
      - loki_events_total
      - loki_uptime_seconds
    tags:
      - environment:production
      - service:loki-mode
```

Restart Datadog Agent:

```bash
sudo systemctl restart datadog-agent
```

### Datadog Dashboards

View metrics in Datadog:
- Navigate to Dashboards > New Dashboard
- Add widgets with queries like `loki.loki_session_status` and `loki.loki_cost_usd` (the check prefixes each listed metric name with the `loki` namespace)
- Set up monitors for cost thresholds and session failures

## Alerting

### Prometheus Alert Rules

Create `loki_alerts.yml`:

```yaml
groups:
  - name: loki-mode
    interval: 30s
    rules:
      - alert: LokiSessionDown
        expr: loki_session_status == 0
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Loki Mode session is not running"
          description: "Session has been stopped for more than 5 minutes"

      - alert: LokiBudgetWarning
        expr: loki_cost_usd > 4.00
        labels:
          severity: warning
        annotations:
          summary: "Loki Mode cost approaching budget limit"
          description: "Current cost: ${{ $value }}"

      - alert: LokiBudgetCritical
        expr: loki_cost_usd > 4.50
        labels:
          severity: critical
        annotations:
          summary: "Loki Mode cost exceeds budget"
          description: "Current cost: ${{ $value }}, budget: $5.00"

      - alert: LokiStagnation
        expr: changes(loki_iteration_current[30m]) == 0 and loki_session_status == 1
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Loki Mode iteration not progressing"
          description: "No iteration progress in 30 minutes"

      - alert: LokiHighFailureRate
        # Aggregate both sides so the label sets match; dividing the labeled
        # series by an unlabeled sum() would return no result.
        expr: sum(loki_tasks_total{status="failed"}) / sum(loki_tasks_total) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "High task failure rate"
          description: "{{ $value | humanizePercentage }} of tasks are failing"

      - alert: LokiTooManyAgents
        expr: loki_agents_active > 50
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Too many active agents"
          description: "{{ $value }} agents active, may indicate runaway spawning"
```
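
The `LokiStagnation` rule relies on `changes()`, which counts how often a series changed value inside the range window. The sketch below imitates that logic on a plain list of samples to show when the condition holds. The helper names are hypothetical, and the rule's `for: 10m` hold period is not modeled.

```python
def changes(samples):
    """Count how many times consecutive sample values differ."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur != prev)


def is_stagnant(iteration_samples, session_status):
    """True when the session is running (status 1) but the iteration never moved."""
    return session_status == 1 and changes(iteration_samples) == 0


# Illustrative samples of loki_iteration_current over one window
print(is_stagnant([7, 7, 7, 7], session_status=1))  # running, no progress
print(is_stagnant([7, 7, 8, 8], session_status=1))  # running, progressed once
print(is_stagnant([7, 7, 7, 7], session_status=0))  # stopped, so not flagged
```

Tying the condition to `loki_session_status == 1` keeps a deliberately stopped session from being reported as stalled.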

### Grafana Alerts

Configure alerts in Grafana panels:

1. Edit panel
2. Navigate to Alert tab
3. Create alert rule:
   - **Condition:** `WHEN last() OF query(A, 5m, now) IS ABOVE 4.5`
   - **Evaluate:** Every 1m for 5m
   - **Send to:** Slack, PagerDuty, Email

## Environment Variables

| Variable | Default | Description |
|----------|---------|-------------|
| `LOKI_METRICS_ENABLED` | `false` | Enable `/metrics` endpoint |
| `LOKI_METRICS_PORT` | `57374` | Port for metrics endpoint (same as dashboard) |
| `LOKI_METRICS_PATH` | `/metrics` | Endpoint path |
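
As an illustration of the defaults in the table, the sketch below resolves the three variables from an environment mapping. How the dashboard actually parses them is not shown in this guide; treating only the string `true` (case-insensitive) as enabled is an assumption, and `load_metrics_config` is a hypothetical name.

```python
import os


def load_metrics_config(env=None):
    """Resolve metrics settings, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return {
        # Assumption: only "true" (any case) enables the endpoint
        "enabled": env.get("LOKI_METRICS_ENABLED", "false").strip().lower() == "true",
        "port": int(env.get("LOKI_METRICS_PORT", "57374")),
        "path": env.get("LOKI_METRICS_PATH", "/metrics"),
    }


print(load_metrics_config({}))
print(load_metrics_config({"LOKI_METRICS_ENABLED": "true", "LOKI_METRICS_PORT": "8080"}))
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to test without touching the real environment.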

## Best Practices

### Production Deployment

1. Enable metrics in production:
   ```bash
   export LOKI_METRICS_ENABLED=true
   ```

2. Secure endpoint with reverse proxy authentication
3. Set up Prometheus scraping with appropriate interval (15-30s)
4. Create Grafana dashboards for visualization
5. Configure alerts for budget, stagnation, and failures
6. Monitor metrics retention and storage

### Performance

- Metrics endpoint is lightweight (reads flat files, no DB queries)
- Scrape interval of 15-30 seconds recommended
- Metrics are cached for 2 seconds to avoid excessive file reads
- No impact on Loki Mode execution performance
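
The 2-second cache mentioned above can be pictured as a small time-to-live wrapper around the file-reading step. The following is a sketch of that idea only, not the dashboard's actual code; `TTLCache` and its `loader` argument are hypothetical names.

```python
import time


class TTLCache:
    """Cache the result of `loader` and reuse it until `ttl_seconds` have passed."""

    def __init__(self, loader, ttl_seconds=2.0, clock=time.monotonic):
        self._loader = loader
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so tests can control time
        self._value = None
        self._expires_at = None

    def get(self):
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            # First call or entry expired: reload and restart the timer
            self._value = self._loader()
            self._expires_at = now + self._ttl
        return self._value
```

With a 15-second scrape interval every scrape reloads the files, so a cache like this mainly absorbs bursts such as several dashboards or `watch` loops polling at once.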

### Monitoring

- Track `loki_cost_usd` to prevent budget overruns
- Alert on `loki_session_status == 0` for unexpected stops
- Monitor `loki_tasks_total{status="failed"}` for quality issues
- Watch `loki_agents_active` for agent spawning issues
- Track `loki_iteration_current` for progress

## Troubleshooting

### Metrics Endpoint Returns Empty

```bash
# Check LOKI_METRICS_ENABLED is set
echo $LOKI_METRICS_ENABLED

# Verify LOKI_DIR is set (required for dashboard)
echo $LOKI_DIR

# Check dashboard-state.json exists and is updating
ls -la .loki/dashboard-state.json
watch -n 2 cat .loki/dashboard-state.json

# Check dashboard is running
loki dashboard status
curl http://localhost:57374/health
```

### Metrics Show Zero Values

```bash
# Ensure a Loki session is running
loki status

# Check dashboard-state.json is being updated (every 2 seconds)
stat .loki/dashboard-state.json

# Verify metrics files exist
ls -la .loki/metrics/efficiency/

# Check events.jsonl exists
ls -la .loki/events.jsonl
```

### Connection Refused

```bash
# Verify dashboard is running on expected port
curl http://localhost:57374/health

# Check if another process is using port 57374
lsof -ti:57374

# Restart dashboard
loki dashboard stop
loki dashboard start
```

### Prometheus Cannot Scrape

```bash
# Test endpoint manually
curl http://localhost:57374/metrics

# Check Prometheus targets page
open http://prometheus-server:9090/targets

# Verify network connectivity from Prometheus to Loki dashboard
# (firewall, security groups, etc.)

# Check Prometheus logs
kubectl logs -f prometheus-server-xyz
```

## Examples

### Cost Budget Monitoring

```bash
# Set up budget alert
cat > /tmp/budget_check.sh <<'EOF'
#!/bin/bash
# Match only the sample line, not any "# HELP" / "# TYPE" comment lines
COST=$(curl -s http://localhost:57374/metrics | awk '$1 == "loki_cost_usd" {print $2}')
# Fall back to 0 when the endpoint is unreachable so bc gets a valid expression
if (( $(echo "${COST:-0} > 4.5" | bc -l) )); then
  echo "CRITICAL: Cost $COST exceeds budget!"
  loki stop
fi
EOF
chmod +x /tmp/budget_check.sh

# Run every 5 minutes
crontab -e
# Add: */5 * * * * /tmp/budget_check.sh
```

### Custom Metrics Export

```python
import json

import requests

def get_loki_metrics():
    """Fetch /metrics and map each sample (name plus labels) to its value."""
    response = requests.get("http://localhost:57374/metrics", timeout=5)
    response.raise_for_status()
    metrics = {}
    for line in response.text.splitlines():
        # Sample lines start with the metric name; comment lines start with "#"
        if line.startswith("loki_"):
            parts = line.split()
            metric_name = parts[0]
            metric_value = float(parts[1]) if len(parts) > 1 else 0
            metrics[metric_name] = metric_value
    return metrics

metrics = get_loki_metrics()
print(json.dumps(metrics, indent=2))
```
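
The keys produced by `get_loki_metrics()` still carry their label sets, for example `loki_tasks_total{status="failed"}`. If the name and labels are needed separately, a sample line can be split with a regular expression. The sketch below is a simplified parser, not a full implementation of the exposition format: `parse_sample` is a hypothetical helper, and label values containing `}` are not handled.

```python
import re

# metric name, optional {labels}, then the value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line):
    """Split one sample line into (name, labels, value); None for non-sample lines."""
    match = SAMPLE_RE.match(line)
    if match is None:
        return None
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


print(parse_sample('loki_tasks_total{status="failed"} 2'))
print(parse_sample("# TYPE loki_cost_usd gauge"))
```

Comment lines such as `# TYPE ...` do not start with a valid metric name, so they fall through to `None` and can be filtered out by the caller.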

### Slack Notification on High Cost

```bash
# Add to Prometheus Alertmanager config
# (the heredoc delimiter is quoted so the shell leaves the template braces untouched)
cat >> /etc/alertmanager/alertmanager.yml <<'EOF'
receivers:
  - name: slack
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/YOUR/WEBHOOK/URL'
        channel: '#loki-alerts'
        text: 'Loki Mode: {{ .CommonAnnotations.description }}'
EOF
```

## See Also

- [Audit Logging](audit-logging.md) - Track agent actions
- [Dashboard Guide](dashboard-guide.md) - Web dashboard
- [Enterprise Features](../wiki/Enterprise-Features.md) - Complete enterprise guide
- [Prometheus Metrics](../wiki/Prometheus-Metrics.md) - Detailed wiki documentation