@aigentsphere/openclaw-otel-observability 0.2.1

Files changed (43)
  1. package/.github/workflows/ci.yml +52 -0
  2. package/.github/workflows/docs.yml +25 -0
  3. package/LICENSE +15 -0
  4. package/README.md +300 -0
  5. package/collector/README.md +186 -0
  6. package/collector/otel-collector-config.yaml +230 -0
  7. package/docker-compose.yaml +32 -0
  8. package/docs/architecture.md +319 -0
  9. package/docs/backends/dynatrace.md +168 -0
  10. package/docs/backends/generic-otlp.md +166 -0
  11. package/docs/backends/grafana.md +167 -0
  12. package/docs/backends/index.md +49 -0
  13. package/docs/backends/otel-collector.md +210 -0
  14. package/docs/configuration.md +276 -0
  15. package/docs/development.md +198 -0
  16. package/docs/getting-started.md +295 -0
  17. package/docs/index.md +139 -0
  18. package/docs/limitations.md +95 -0
  19. package/docs/security/detection.md +274 -0
  20. package/docs/security/tetragon.md +454 -0
  21. package/docs/telemetry/metrics.md +283 -0
  22. package/docs/telemetry/tokens.md +188 -0
  23. package/docs/telemetry/traces.md +165 -0
  24. package/dynatrace/security-slo-dql.md +263 -0
  25. package/index.ts +191 -0
  26. package/instrumentation/preload.mjs +59 -0
  27. package/mkdocs.yml +90 -0
  28. package/openclaw.plugin.json +99 -0
  29. package/package.json +49 -0
  30. package/src/config.ts +72 -0
  31. package/src/diagnostics.ts +214 -0
  32. package/src/hooks.ts +575 -0
  33. package/src/openllmetry.ts +27 -0
  34. package/src/security.ts +396 -0
  35. package/src/telemetry.ts +282 -0
  36. package/tetragon-policies/01-process-exec.yaml +20 -0
  37. package/tetragon-policies/02-sensitive-files.yaml +86 -0
  38. package/tetragon-policies/04-privilege-escalation.yaml +25 -0
  39. package/tetragon-policies/05-dangerous-commands.yaml +97 -0
  40. package/tetragon-policies/06-kernel-modules.yaml +27 -0
  41. package/tetragon-policies/07-prompt-injection-shell.yaml +73 -0
  42. package/tetragon-policies/README.md +143 -0
  43. package/tsconfig.json +17 -0
@@ -0,0 +1,454 @@
1
+ # Kernel-Level Security with Tetragon
2
+
3
+ Tetragon provides **eBPF-based security observability** at the Linux kernel level. While the OpenClaw plugin captures application-level telemetry, Tetragon sees what actually happens at the system call level — file access, process execution, network connections, and privilege changes.
4
+
5
+ This is **defense in depth**: the application layer shows what the agent *intended* to do; the kernel layer shows what it *actually* did.
6
+
7
+ ## Why Tetragon for OpenClaw?
8
+
9
+ AI agents can execute commands, read files, and make network connections. Even with application-level monitoring, a compromised or manipulated agent could:
10
+
11
+ - Access sensitive files (`.env`, SSH keys, credentials)
12
+ - Execute dangerous commands (`rm -rf`, `curl | sh`)
13
+ - Attempt privilege escalation
14
+ - Make unexpected network connections
15
+
16
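+ Tetragon observes all of this at the kernel level, which makes the record **tamper-resistant**: an agent running as an unprivileged user cannot suppress or rewrite these events the way it could with application-level logs.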
17
+
18
+ ## Installation
19
+
20
+ ### Prerequisites
21
+
22
+ - Linux kernel 5.4+ (BTF support required)
23
+ - Root access for installation
24
+ - systemd (for service management)
25
+
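The prerequisites can be checked before installing. The following is a minimal sketch, assuming that BTF type information is exposed at `/sys/kernel/btf/vmlinux` when the kernel provides it; the helper names `kernel_at_least` and `btf_available` are illustrative and not part of Tetragon or this package.

```python
import os
import platform
import re


def kernel_at_least(release: str, major: int, minor: int) -> bool:
    """Return True if a `uname -r` style string is at least major.minor."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        return False
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


def btf_available(path: str = "/sys/kernel/btf/vmlinux") -> bool:
    """Return True if the kernel exposes BTF at the conventional sysfs path."""
    return os.path.exists(path)


if __name__ == "__main__":
    release = platform.release()
    print(f"kernel {release}: 5.4+ = {kernel_at_least(release, 5, 4)}")
    print(f"BTF available: {btf_available()}")
```

If either check reports `False`, Tetragon is unlikely to start; see the Troubleshooting section.
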
26
+ ### Install Tetragon
27
+
+ ```bash
+ # Download the pinned v1.6.0 release (change the version in both the tag and the filename to upgrade)
+ curl -LO https://github.com/cilium/tetragon/releases/download/v1.6.0/tetragon-v1.6.0-amd64.tar.gz
+
+ # Extract
+ tar -xzf tetragon-v1.6.0-amd64.tar.gz
+ cd tetragon-v1.6.0-amd64
+
+ # Install
+ sudo ./install.sh
+
+ # Verify installation
+ tetra version
+ ```
42
+
43
+ ### Create OpenClaw Policy Directory
44
+
45
+ ```bash
46
+ sudo mkdir -p /etc/tetragon/tetragon.tp.d/openclaw
47
+ ```
48
+
49
+ ## TracingPolicies for OpenClaw
50
+
51
+ Create the following policy files in `/etc/tetragon/tetragon.tp.d/openclaw/`:
52
+
53
+ ### 1. Process Execution Monitoring
54
+
55
+ Captures every command executed by Node.js (OpenClaw).
56
+
+ ```yaml
+ # /etc/tetragon/tetragon.tp.d/openclaw/01-process-exec.yaml
+ apiVersion: cilium.io/v1alpha1
+ kind: TracingPolicy
+ metadata:
+   name: openclaw-process-exec
+ spec:
+   kprobes:
+     - call: "sys_execve"
+       syscall: true
+       args:
+         - index: 0
+           type: "string"
+         - index: 1
+           type: "string"
+       selectors:
+         - matchBinaries:
+             - operator: "In"
+               values:
+                 - "/usr/bin/node"
+                 - "/usr/local/bin/node"
+           matchActions:
+             - action: Post
+ ```
81
+
82
+ ### 2. Sensitive File Access Detection
83
+
84
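+ Alerts when OpenClaw accesses sensitive files. `Prefix` compares against the start of the path, so the relative entries in the list (`.ssh/`, `.aws/`, `.env`, and so on) only match a path that literally begins with them. To cover files under a home directory, list absolute prefixes such as `/home/<user>/.ssh/`.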
85
+
+ ```yaml
+ # /etc/tetragon/tetragon.tp.d/openclaw/02-sensitive-files.yaml
+ apiVersion: cilium.io/v1alpha1
+ kind: TracingPolicy
+ metadata:
+   name: openclaw-sensitive-files
+ spec:
+   kprobes:
+     - call: "security_file_open"
+       syscall: false
+       args:
+         - index: 0
+           type: "file"
+       selectors:
+         - matchBinaries:
+             - operator: "In"
+               values:
+                 - "/usr/bin/node"
+                 - "/usr/local/bin/node"
+           matchArgs:
+             - index: 0
+               operator: "Prefix"
+               values:
+                 - "/etc/shadow"
+                 - "/etc/passwd"
+                 - "/etc/sudoers"
+                 - "/root/"
+                 - ".ssh/"
+                 - ".aws/"
+                 - ".kube/"
+                 - ".config/gcloud/"
+                 - ".openclaw/"
+                 - ".env"
+           matchActions:
+             - action: Post
+ ```
122
+
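To see which paths the `matchArgs` list above would cover, the sketch below models `Prefix` as a plain string-prefix comparison on the absolute path. That model is an assumption about the operator's behaviour, not Tetragon's implementation, and `prefix_match` is a hypothetical helper.

```python
# Values copied from the matchArgs list in 02-sensitive-files.yaml.
PREFIXES = [
    "/etc/shadow", "/etc/passwd", "/etc/sudoers", "/root/",
    ".ssh/", ".aws/", ".kube/", ".config/gcloud/", ".openclaw/", ".env",
]


def prefix_match(path, prefixes=PREFIXES):
    """Return the first listed value that `path` starts with, or None."""
    for prefix in prefixes:
        if path.startswith(prefix):
            return prefix
    return None


if __name__ == "__main__":
    for candidate in ("/etc/shadow", "/root/.bashrc", "/home/user/.ssh/id_rsa"):
        print(candidate, "->", prefix_match(candidate))
```

Under this model the absolute entries match as expected, while `/home/user/.ssh/id_rsa` matches nothing because no listed value is a prefix of it.
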
123
+ ### 3. Privilege Escalation Detection
124
+
125
+ Catches attempts to change user/group ID.
126
+
+ ```yaml
+ # /etc/tetragon/tetragon.tp.d/openclaw/04-privilege-escalation.yaml
+ apiVersion: cilium.io/v1alpha1
+ kind: TracingPolicy
+ metadata:
+   name: openclaw-privilege-escalation
+ spec:
+   kprobes:
+     - call: "sys_setuid"
+       syscall: true
+       args:
+         - index: 0
+           type: "int"
+       selectors:
+         - matchBinaries:
+             - operator: "In"
+               values:
+                 - "/usr/bin/node"
+                 - "/usr/local/bin/node"
+           matchActions:
+             - action: Post
+     - call: "sys_setgid"
+       syscall: true
+       args:
+         - index: 0
+           type: "int"
+       selectors:
+         - matchBinaries:
+             - operator: "In"
+               values:
+                 - "/usr/bin/node"
+                 - "/usr/local/bin/node"
+           matchActions:
+             - action: Post
+ ```
162
+
163
+ ### 4. Dangerous Command Detection
164
+
165
+ Flags potentially dangerous binaries.
166
+
+ ```yaml
+ # /etc/tetragon/tetragon.tp.d/openclaw/05-dangerous-commands.yaml
+ apiVersion: cilium.io/v1alpha1
+ kind: TracingPolicy
+ metadata:
+   name: openclaw-dangerous-commands
+ spec:
+   kprobes:
+     - call: "sys_execve"
+       syscall: true
+       args:
+         - index: 0
+           type: "string"
+         - index: 1
+           type: "string"
+       selectors:
+         - matchArgs:
+             - index: 0
+               operator: "Postfix"
+               values:
+                 - "/rm"
+                 - "/dd"
+                 - "/nc"
+                 - "/netcat"
+                 - "/ncat"
+                 - "/curl"
+                 - "/wget"
+                 - "/chmod"
+                 - "/chown"
+           matchActions:
+             - action: Post
+ ```
199
+
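The leading slash in each `Postfix` value matters. The sketch below models `Postfix` as a plain string-suffix comparison on the first `execve` argument; this is an assumption about the operator, and `postfix_match` is a hypothetical helper.

```python
# Values copied from the matchArgs list in 05-dangerous-commands.yaml.
SUFFIXES = ["/rm", "/dd", "/nc", "/netcat", "/ncat", "/curl", "/wget", "/chmod", "/chown"]


def postfix_match(binary_path, suffixes=SUFFIXES):
    """Return the first listed value that `binary_path` ends with, or None."""
    for suffix in suffixes:
        if binary_path.endswith(suffix):
            return suffix
    return None


if __name__ == "__main__":
    for candidate in ("/usr/bin/rm", "/usr/bin/perform", "/bin/sh"):
        print(candidate, "->", postfix_match(candidate))
```

Under this model `/usr/bin/rm` is flagged while `/usr/bin/perform` is not, because the slash anchors the match to the whole file name. Because the policy has no `matchBinaries` selector, it applies to every process on the host, not only Node.js.
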
200
+ ### 5. Kernel Module Loading
201
+
202
+ Detects attempts to load kernel modules (critical security event).
203
+
+ ```yaml
+ # /etc/tetragon/tetragon.tp.d/openclaw/06-kernel-modules.yaml
+ apiVersion: cilium.io/v1alpha1
+ kind: TracingPolicy
+ metadata:
+   name: openclaw-kernel-modules
+ spec:
+   kprobes:
+     # init_module(module_image, len, param_values): the string is argument 2
+     - call: "__x64_sys_init_module"
+       syscall: true
+       args:
+         - index: 2
+           type: "string"
+       selectors:
+         - matchActions:
+             - action: Post
+     # finit_module(fd, param_values, flags): the string is argument 1
+     - call: "__x64_sys_finit_module"
+       syscall: true
+       args:
+         - index: 1
+           type: "string"
+       selectors:
+         - matchActions:
+             - action: Post
+ ```
229
+
230
+ ## Configure Tetragon Export
231
+
232
+ Configure Tetragon to export events to a log file that the OTel Collector can read:
233
+
234
+ ```bash
235
+ # Set export file
236
+ echo "/var/log/tetragon/tetragon.log" | sudo tee /etc/tetragon/tetragon.conf.d/export-filename
237
+
238
+ # Set file permissions (readable by collector)
239
+ echo "644" | sudo tee /etc/tetragon/tetragon.conf.d/export-file-perm
240
+
241
+ # Rotate at 50MB
242
+ echo "50" | sudo tee /etc/tetragon/tetragon.conf.d/export-file-max-size-mb
243
+
244
+ # Keep 3 backup files
245
+ echo "3" | sudo tee /etc/tetragon/tetragon.conf.d/export-file-max-backups
246
+ ```
247
+
248
+ ## Start Tetragon
249
+
+ ```bash
+ # Enable and start
+ sudo systemctl enable tetragon
+ sudo systemctl start tetragon
+
+ # Verify the service is running
+ sudo systemctl status tetragon
+
+ # Verify policies loaded
+ sudo tetra tracingpolicy list
+
+ # Watch events in real-time
+ sudo tetra getevents -o compact
+ ```
261
+
262
+ ## OTel Collector Integration
263
+
264
+ Add the Tetragon filelog receiver to your collector configuration:
265
+
+ ```yaml
+ receivers:
+   # ... existing receivers ...
+
+   filelog/tetragon:
+     include:
+       - /var/log/tetragon/tetragon.log
+     start_at: end
+     operators:
+       - type: json_parser
+         parse_from: body
+         timestamp:
+           parse_from: attributes.time
+           layout: '%Y-%m-%dT%H:%M:%S.%LZ'
+
+ processors:
+   # ... existing processors ...
+
+   # Transform Tetragon events
+   transform/tetragon:
+     error_mode: ignore
+     log_statements:
+       - context: log
+         statements:
+           # Identify event type
+           - set(attributes["tetragon.type"], "kprobe") where attributes["process_kprobe"] != nil
+           - set(attributes["tetragon.type"], "exec") where attributes["process_exec"] != nil
+           - set(attributes["tetragon.type"], "exit") where attributes["process_exit"] != nil
+
+           # Extract policy name
+           - set(attributes["tetragon.policy"], attributes["process_kprobe"]["policy_name"]) where attributes["process_kprobe"]["policy_name"] != nil
+
+           # Extract process info
+           - set(attributes["process.binary"], attributes["process_kprobe"]["process"]["binary"]) where attributes["process_kprobe"]["process"]["binary"] != nil
+           - set(attributes["process.pid"], attributes["process_kprobe"]["process"]["pid"]) where attributes["process_kprobe"]["process"]["pid"] != nil
+
+           # Extract function name
+           - set(attributes["tetragon.function"], attributes["process_kprobe"]["function_name"]) where attributes["process_kprobe"]["function_name"] != nil
+
+           # Assign security risk levels
+           - set(attributes["security.risk"], "critical") where attributes["tetragon.policy"] == "openclaw-privilege-escalation"
+           - set(attributes["security.risk"], "critical") where attributes["tetragon.policy"] == "openclaw-kernel-modules"
+           - set(attributes["security.risk"], "high") where attributes["tetragon.policy"] == "openclaw-sensitive-files"
+           - set(attributes["security.risk"], "high") where attributes["tetragon.policy"] == "openclaw-dangerous-commands"
+           - set(attributes["security.risk"], "low") where attributes["tetragon.policy"] == "openclaw-process-exec"
+
+   # Add service metadata
+   resource/tetragon:
+     attributes:
+       - key: service.name
+         value: "openclaw-security"
+         action: upsert
+       - key: tetragon.version
+         value: "1.6.0"
+         action: upsert
+
+ service:
+   pipelines:
+     # ... existing pipelines ...
+
+     logs/tetragon:
+       receivers: [filelog/tetragon]
+       processors: [transform/tetragon, resource/tetragon, batch]
+       exporters: [otlphttp/dynatrace] # or your exporter
+ ```
331
+
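The `transform/tetragon` statements can be tried out offline before deploying the collector. The sketch below mirrors them on a parsed event dictionary. It is an illustrative re-implementation, not collector code, and `enrich` is a hypothetical helper.

```python
# Policy-to-risk mapping taken from the transform/tetragon statements.
RISK_BY_POLICY = {
    "openclaw-privilege-escalation": "critical",
    "openclaw-kernel-modules": "critical",
    "openclaw-sensitive-files": "high",
    "openclaw-dangerous-commands": "high",
    "openclaw-process-exec": "low",
}


def enrich(event: dict) -> dict:
    """Return the extra attributes the transform statements would set."""
    attrs = {}
    # Identify event type (a later key overrides an earlier one, as in the config).
    for key, kind in (("process_kprobe", "kprobe"), ("process_exec", "exec"), ("process_exit", "exit")):
        if event.get(key) is not None:
            attrs["tetragon.type"] = kind
    kprobe = event.get("process_kprobe") or {}
    process = kprobe.get("process") or {}
    # Copy each field only when it is present, like the `!= nil` guards.
    for target, value in (
        ("tetragon.policy", kprobe.get("policy_name")),
        ("process.binary", process.get("binary")),
        ("process.pid", process.get("pid")),
        ("tetragon.function", kprobe.get("function_name")),
    ):
        if value is not None:
            attrs[target] = value
    risk = RISK_BY_POLICY.get(attrs.get("tetragon.policy"))
    if risk is not None:
        attrs["security.risk"] = risk
    return attrs


if __name__ == "__main__":
    sample = {"process_kprobe": {"process": {"binary": "/usr/bin/node", "pid": 58900},
                                 "function_name": "security_file_open",
                                 "policy_name": "openclaw-sensitive-files"}}
    print(enrich(sample))
```

Events from a policy that is not in the mapping receive no `security.risk` attribute, which matches the behaviour of the `where` clauses.
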
332
+ ### Collector Permissions
333
+
334
+ The OTel Collector needs read access to the Tetragon log file:
335
+
336
+ ```bash
337
+ # Make log readable
338
+ sudo chmod 644 /var/log/tetragon/tetragon.log
339
+
340
+ # Or add collector user to appropriate group
341
+ sudo usermod -a -G adm otelcol-contrib
342
+ ```
343
+
344
+ ## Event Examples
345
+
346
+ ### Process Execution Event
347
+
+ ```json
+ {
+   "process_kprobe": {
+     "process": {
+       "binary": "/usr/bin/node",
+       "pid": 58856,
+       "uid": 1000,
+       "cwd": "/home/user"
+     },
+     "parent": {
+       "binary": "/usr/bin/node",
+       "pid": 56271
+     },
+     "function_name": "__x64_sys_execve",
+     "args": [
+       {"string_arg": "/bin/sh"},
+       {"string_arg": "-c ls -la"}
+     ],
+     "policy_name": "openclaw-process-exec"
+   },
+   "time": "2026-02-04T15:13:03.638Z"
+ }
+ ```
371
+
372
+ ### Sensitive File Access Event
373
+
+ ```json
+ {
+   "process_kprobe": {
+     "process": {
+       "binary": "/usr/bin/node",
+       "pid": 58900
+     },
+     "function_name": "security_file_open",
+     "args": [
+       {"file_arg": {"path": "/home/user/.ssh/id_rsa"}}
+     ],
+     "policy_name": "openclaw-sensitive-files"
+   },
+   "time": "2026-02-04T15:14:22.123Z"
+ }
+ ```
390
+
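Events like the ones above are written one JSON object per line, so they are easy to inspect outside the collector. The sketch below parses a single line and pulls out the fields used elsewhere in this guide. It assumes the millisecond timestamp format shown in the examples (timestamps with more than six fractional digits would need different handling), and `parse_event` is a hypothetical helper.

```python
import json
from datetime import datetime, timezone

# The sensitive-file example from above, collapsed onto one line.
LINE = (
    '{"process_kprobe": {"process": {"binary": "/usr/bin/node", "pid": 58900}, '
    '"function_name": "security_file_open", '
    '"args": [{"file_arg": {"path": "/home/user/.ssh/id_rsa"}}], '
    '"policy_name": "openclaw-sensitive-files"}, '
    '"time": "2026-02-04T15:14:22.123Z"}'
)


def parse_event(line: str) -> dict:
    """Extract time, policy, binary, and argument values from one event line."""
    event = json.loads(line)
    kprobe = event.get("process_kprobe", {})
    args = []
    for arg in kprobe.get("args", []):
        if "file_arg" in arg:
            args.append(arg["file_arg"].get("path"))
        elif "string_arg" in arg:
            args.append(arg["string_arg"])
    when = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%S.%fZ")
    return {
        "time": when.replace(tzinfo=timezone.utc),
        "policy": kprobe.get("policy_name"),
        "binary": kprobe.get("process", {}).get("binary"),
        "args": args,
    }


if __name__ == "__main__":
    print(parse_event(LINE))
```
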
391
+ ## Alerting on Security Events
392
+
393
+ In your observability backend (Dynatrace, Grafana, etc.), create alerts for:
394
+
395
+ | Alert | Condition | Severity |
396
+ |-------|-----------|----------|
397
+ | Privilege Escalation | `tetragon.policy == "openclaw-privilege-escalation"` | Critical |
398
+ | Kernel Module Load | `tetragon.policy == "openclaw-kernel-modules"` | Critical |
399
+ | Sensitive File Access | `tetragon.policy == "openclaw-sensitive-files"` | High |
400
+ | Dangerous Command | `tetragon.policy == "openclaw-dangerous-commands"` | High |
401
+ | Unusual Process Exec | `tetragon.policy == "openclaw-process-exec"` AND off-hours | Medium |
402
+
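The alert table can be prototyped as a small function before translating it into a backend's query language. This is a minimal sketch: "off-hours" is not defined in this guide, so the 08:00 to 18:00 UTC working window is an assumption, and `alert_severity` and `is_off_hours` are hypothetical helpers.

```python
from datetime import datetime, timezone

# Policies that alert unconditionally, per the table above.
ALWAYS = {
    "openclaw-privilege-escalation": "Critical",
    "openclaw-kernel-modules": "Critical",
    "openclaw-sensitive-files": "High",
    "openclaw-dangerous-commands": "High",
}


def is_off_hours(when: datetime, start_hour: int = 8, end_hour: int = 18) -> bool:
    """Assumed definition: anything outside start_hour..end_hour (UTC) is off-hours."""
    return not (start_hour <= when.astimezone(timezone.utc).hour < end_hour)


def alert_severity(policy: str, when: datetime):
    """Return the alert severity for an event, or None if no alert applies."""
    if policy in ALWAYS:
        return ALWAYS[policy]
    if policy == "openclaw-process-exec" and is_off_hours(when):
        return "Medium"
    return None


if __name__ == "__main__":
    night = datetime(2026, 2, 4, 3, 0, tzinfo=timezone.utc)
    print(alert_severity("openclaw-process-exec", night))
```
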
403
+ ## Complete Observability Stack
404
+
405
+ With Tetragon integrated, you have three layers of visibility:
406
+
407
+ | Layer | Source | What It Shows |
408
+ |-------|--------|---------------|
409
+ | **Application** | OpenClaw Plugin | Tool calls, tokens, request flow |
410
+ | **Gateway** | diagnostics-otel | Session health, queues, costs |
411
+ | **Kernel** | Tetragon | System calls, file access, network |
412
+
413
+ This provides defense in depth — even if application-level telemetry is manipulated, kernel-level events reveal the truth.
414
+
415
+ ## Troubleshooting
416
+
417
+ ### Tetragon not starting
418
+
419
+ ```bash
420
+ # Check logs
421
+ sudo journalctl -u tetragon -n 50
422
+
423
+ # Common issues:
424
+ # - Kernel too old (need 5.4+)
425
+ # - BTF not available
426
+ # - Policy YAML syntax error
427
+ ```
428
+
429
+ ### Events not appearing in collector
430
+
431
+ ```bash
432
+ # Check Tetragon is writing events
433
+ sudo tail -f /var/log/tetragon/tetragon.log
434
+
435
+ # Check file permissions
436
+ ls -la /var/log/tetragon/tetragon.log
437
+
438
+ # Check collector logs
439
+ sudo journalctl -u otelcol-contrib | grep tetragon
440
+ ```
441
+
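The permission check can also be scripted. The sketch below tests whether a file is readable by users other than its owner and group, which is what mode `644` grants; `world_readable` is a hypothetical helper, and the log path is the one configured earlier in this guide.

```python
import os
import stat


def world_readable(path: str) -> bool:
    """Return True if the 'other' read bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IROTH)


if __name__ == "__main__":
    log_path = "/var/log/tetragon/tetragon.log"
    if os.path.exists(log_path):
        print(f"{log_path} world-readable: {world_readable(log_path)}")
    else:
        print(f"{log_path} does not exist; is Tetragon exporting events?")
```

If the collector reads the file through group membership rather than the world-readable bit, check `stat.S_IRGRP` and the file's group instead.
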
442
+ ### High kernel overhead
443
+
444
+ If Tetragon causes performance issues, reduce policy scope:
445
+
446
+ - Use `matchBinaries` to limit to Node.js only
447
+ - Remove high-volume policies (like process-exec) if not needed
448
+ - Add a `rateLimit` to `Post` actions so that repeated events are throttled
449
+
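Before removing a policy, it helps to know which one produces the most events. The sketch below counts events per policy from exported log lines, assuming the one-JSON-object-per-line format shown in the event examples; `count_by_policy` is a hypothetical helper.

```python
import json
from collections import Counter


def count_by_policy(lines) -> Counter:
    """Count events per policy_name; skip blank or unparsable lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        kprobe = event.get("process_kprobe") or {}
        counts[kprobe.get("policy_name", "<no policy>")] += 1
    return counts


if __name__ == "__main__":
    sample = [
        '{"process_kprobe": {"policy_name": "openclaw-process-exec"}}',
        '{"process_kprobe": {"policy_name": "openclaw-process-exec"}}',
        '{"process_kprobe": {"policy_name": "openclaw-sensitive-files"}}',
        '{"process_exit": {}}',
    ]
    for policy, total in count_by_policy(sample).most_common():
        print(f"{total:6d}  {policy}")
```

To run it against real data, pass an open file handle, for example `count_by_policy(open("/var/log/tetragon/tetragon.log"))`.
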
450
+ ## Resources
451
+
452
+ - [Tetragon Documentation](https://tetragon.io/docs/)
453
+ - [TracingPolicy Reference](https://tetragon.io/docs/concepts/tracing-policy/)
454
+ - [Cilium/Tetragon GitHub](https://github.com/cilium/tetragon)