container-superposition 0.1.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (327)
  1. package/README.md +843 -0
  2. package/dist/scripts/init.d.ts +3 -0
  3. package/dist/scripts/init.d.ts.map +1 -0
  4. package/dist/scripts/init.js +1190 -0
  5. package/dist/scripts/init.js.map +1 -0
  6. package/dist/scripts/migrate-to-manifests.d.ts +12 -0
  7. package/dist/scripts/migrate-to-manifests.d.ts.map +1 -0
  8. package/dist/scripts/migrate-to-manifests.js +230 -0
  9. package/dist/scripts/migrate-to-manifests.js.map +1 -0
  10. package/dist/tool/questionnaire/composer.d.ts +6 -0
  11. package/dist/tool/questionnaire/composer.d.ts.map +1 -0
  12. package/dist/tool/questionnaire/composer.js +1232 -0
  13. package/dist/tool/questionnaire/composer.js.map +1 -0
  14. package/dist/tool/readme/markdown-parser.d.ts +30 -0
  15. package/dist/tool/readme/markdown-parser.d.ts.map +1 -0
  16. package/dist/tool/readme/markdown-parser.js +139 -0
  17. package/dist/tool/readme/markdown-parser.js.map +1 -0
  18. package/dist/tool/readme/readme-generator.d.ts +9 -0
  19. package/dist/tool/readme/readme-generator.d.ts.map +1 -0
  20. package/dist/tool/readme/readme-generator.js +422 -0
  21. package/dist/tool/readme/readme-generator.js.map +1 -0
  22. package/dist/tool/schema/custom-loader.d.ts +17 -0
  23. package/dist/tool/schema/custom-loader.d.ts.map +1 -0
  24. package/dist/tool/schema/custom-loader.js +149 -0
  25. package/dist/tool/schema/custom-loader.js.map +1 -0
  26. package/dist/tool/schema/overlay-loader.d.ts +47 -0
  27. package/dist/tool/schema/overlay-loader.d.ts.map +1 -0
  28. package/dist/tool/schema/overlay-loader.js +252 -0
  29. package/dist/tool/schema/overlay-loader.js.map +1 -0
  30. package/dist/tool/schema/types.d.ts +212 -0
  31. package/dist/tool/schema/types.d.ts.map +1 -0
  32. package/dist/tool/schema/types.js +5 -0
  33. package/dist/tool/schema/types.js.map +1 -0
  34. package/docs/README.md +308 -0
  35. package/docs/architecture.md +233 -0
  36. package/docs/creating-overlays.md +549 -0
  37. package/docs/custom-patches.md +540 -0
  38. package/docs/dependencies.md +279 -0
  39. package/docs/examples/custom-patches-example.md +85 -0
  40. package/docs/examples.md +576 -0
  41. package/docs/messaging-comparison.md +265 -0
  42. package/docs/messaging-quick-start.md +385 -0
  43. package/docs/observability-workflow.md +537 -0
  44. package/docs/overlay-manifest-refactoring.md +214 -0
  45. package/docs/overlay-metadata-archive.md +54 -0
  46. package/docs/overlays.md +523 -0
  47. package/docs/presets-architecture.md +498 -0
  48. package/docs/presets.md +366 -0
  49. package/docs/publishing.md +476 -0
  50. package/docs/quick-reference.md +326 -0
  51. package/docs/ux.md +170 -0
  52. package/features/README.md +85 -0
  53. package/features/cross-distro-packages/README.md +146 -0
  54. package/features/cross-distro-packages/devcontainer-feature.json +20 -0
  55. package/features/cross-distro-packages/install.sh +58 -0
  56. package/features/local-secrets-manager/devcontainer-feature.json +18 -0
  57. package/features/local-secrets-manager/install.sh +127 -0
  58. package/features/project-scaffolder/devcontainer-feature.json +24 -0
  59. package/features/project-scaffolder/install.sh +100 -0
  60. package/features/team-conventions/devcontainer-feature.json +24 -0
  61. package/features/team-conventions/install.sh +93 -0
  62. package/overlays/.registry/README.md +14 -0
  63. package/overlays/.registry/base-images.yml +26 -0
  64. package/overlays/.registry/base-templates.yml +7 -0
  65. package/overlays/README.md +155 -0
  66. package/overlays/alertmanager/.env.example +5 -0
  67. package/overlays/alertmanager/README.md +465 -0
  68. package/overlays/alertmanager/alert-rules.yml +56 -0
  69. package/overlays/alertmanager/alertmanager.yml +42 -0
  70. package/overlays/alertmanager/devcontainer.patch.json +12 -0
  71. package/overlays/alertmanager/docker-compose.yml +20 -0
  72. package/overlays/alertmanager/overlay.yml +17 -0
  73. package/overlays/alertmanager/setup.sh +53 -0
  74. package/overlays/alertmanager/verify.sh +31 -0
  75. package/overlays/aws-cli/README.md +473 -0
  76. package/overlays/aws-cli/devcontainer.patch.json +13 -0
  77. package/overlays/aws-cli/overlay.yml +13 -0
  78. package/overlays/azure-cli/README.md +551 -0
  79. package/overlays/azure-cli/devcontainer.patch.json +8 -0
  80. package/overlays/azure-cli/overlay.yml +13 -0
  81. package/overlays/bun/README.md +312 -0
  82. package/overlays/bun/devcontainer.patch.json +41 -0
  83. package/overlays/bun/overlay.yml +16 -0
  84. package/overlays/bun/setup.sh +79 -0
  85. package/overlays/bun/verify.sh +30 -0
  86. package/overlays/codex/README.md +128 -0
  87. package/overlays/codex/devcontainer.patch.json +3 -0
  88. package/overlays/codex/overlay.yml +14 -0
  89. package/overlays/codex/setup.sh +24 -0
  90. package/overlays/codex/verify.sh +30 -0
  91. package/overlays/commitlint/README.md +333 -0
  92. package/overlays/commitlint/devcontainer.patch.json +8 -0
  93. package/overlays/commitlint/overlay.yml +16 -0
  94. package/overlays/commitlint/setup.sh +234 -0
  95. package/overlays/direnv/README.md +504 -0
  96. package/overlays/direnv/devcontainer.patch.json +6 -0
  97. package/overlays/direnv/overlay.yml +13 -0
  98. package/overlays/direnv/setup.sh +139 -0
  99. package/overlays/docker-in-docker/README.md +534 -0
  100. package/overlays/docker-in-docker/devcontainer.patch.json +10 -0
  101. package/overlays/docker-in-docker/overlay.yml +13 -0
  102. package/overlays/docker-sock/README.md +256 -0
  103. package/overlays/docker-sock/devcontainer.patch.json +9 -0
  104. package/overlays/docker-sock/docker-compose.yml +8 -0
  105. package/overlays/docker-sock/overlay.yml +13 -0
  106. package/overlays/dotnet/README.md +147 -0
  107. package/overlays/dotnet/devcontainer.patch.json +51 -0
  108. package/overlays/dotnet/global-tools.txt +24 -0
  109. package/overlays/dotnet/overlay.yml +13 -0
  110. package/overlays/dotnet/setup.sh +51 -0
  111. package/overlays/dotnet/verify.sh +26 -0
  112. package/overlays/gcloud/README.md +269 -0
  113. package/overlays/gcloud/devcontainer.patch.json +14 -0
  114. package/overlays/gcloud/overlay.yml +14 -0
  115. package/overlays/gcloud/verify.sh +52 -0
  116. package/overlays/git-helpers/README.md +168 -0
  117. package/overlays/git-helpers/devcontainer.patch.json +33 -0
  118. package/overlays/git-helpers/overlay.yml +15 -0
  119. package/overlays/git-helpers/setup.sh +91 -0
  120. package/overlays/go/README.md +293 -0
  121. package/overlays/go/devcontainer.patch.json +43 -0
  122. package/overlays/go/overlay.yml +15 -0
  123. package/overlays/go/setup.sh +33 -0
  124. package/overlays/go/verify.sh +40 -0
  125. package/overlays/grafana/.env.example +9 -0
  126. package/overlays/grafana/README.md +462 -0
  127. package/overlays/grafana/dashboard-provider.yml +11 -0
  128. package/overlays/grafana/dashboards/observability-overview.json +263 -0
  129. package/overlays/grafana/devcontainer.patch.json +12 -0
  130. package/overlays/grafana/docker-compose.yml +27 -0
  131. package/overlays/grafana/grafana-datasources.yml +57 -0
  132. package/overlays/grafana/overlay.yml +21 -0
  133. package/overlays/grafana/verify.sh +34 -0
  134. package/overlays/jaeger/.env.example +7 -0
  135. package/overlays/jaeger/README.md +867 -0
  136. package/overlays/jaeger/devcontainer.patch.json +12 -0
  137. package/overlays/jaeger/docker-compose.yml +17 -0
  138. package/overlays/jaeger/overlay.yml +19 -0
  139. package/overlays/java/README.md +267 -0
  140. package/overlays/java/devcontainer.patch.json +44 -0
  141. package/overlays/java/overlay.yml +16 -0
  142. package/overlays/java/setup.sh +41 -0
  143. package/overlays/java/verify.sh +42 -0
  144. package/overlays/just/README.md +443 -0
  145. package/overlays/just/devcontainer.patch.json +3 -0
  146. package/overlays/just/overlay.yml +13 -0
  147. package/overlays/just/setup.sh +182 -0
  148. package/overlays/kubectl-helm/README.md +660 -0
  149. package/overlays/kubectl-helm/devcontainer.patch.json +10 -0
  150. package/overlays/kubectl-helm/overlay.yml +13 -0
  151. package/overlays/loki/.env.example +5 -0
  152. package/overlays/loki/README.md +1156 -0
  153. package/overlays/loki/devcontainer.patch.json +12 -0
  154. package/overlays/loki/docker-compose.yml +18 -0
  155. package/overlays/loki/loki-config.yaml +45 -0
  156. package/overlays/loki/overlay.yml +17 -0
  157. package/overlays/minio/.env.example +9 -0
  158. package/overlays/minio/README.md +639 -0
  159. package/overlays/minio/devcontainer.patch.json +30 -0
  160. package/overlays/minio/docker-compose.yml +28 -0
  161. package/overlays/minio/overlay.yml +18 -0
  162. package/overlays/minio/setup.sh +61 -0
  163. package/overlays/minio/verify.sh +64 -0
  164. package/overlays/mkdocs/README.md +309 -0
  165. package/overlays/mkdocs/devcontainer.patch.json +24 -0
  166. package/overlays/mkdocs/overlay.yml +15 -0
  167. package/overlays/modern-cli-tools/README.md +556 -0
  168. package/overlays/modern-cli-tools/devcontainer.patch.json +3 -0
  169. package/overlays/modern-cli-tools/overlay.yml +13 -0
  170. package/overlays/modern-cli-tools/setup.sh +153 -0
  171. package/overlays/mongodb/.env.example +9 -0
  172. package/overlays/mongodb/README.md +481 -0
  173. package/overlays/mongodb/devcontainer.patch.json +32 -0
  174. package/overlays/mongodb/docker-compose.yml +44 -0
  175. package/overlays/mongodb/overlay.yml +17 -0
  176. package/overlays/mongodb/verify.sh +48 -0
  177. package/overlays/mysql/.env.example +11 -0
  178. package/overlays/mysql/README.md +542 -0
  179. package/overlays/mysql/devcontainer.patch.json +34 -0
  180. package/overlays/mysql/docker-compose.yml +55 -0
  181. package/overlays/mysql/overlay.yml +16 -0
  182. package/overlays/mysql/verify.sh +48 -0
  183. package/overlays/nats/.env.example +5 -0
  184. package/overlays/nats/README.md +762 -0
  185. package/overlays/nats/devcontainer.patch.json +24 -0
  186. package/overlays/nats/docker-compose.yml +31 -0
  187. package/overlays/nats/overlay.yml +18 -0
  188. package/overlays/nats/verify.sh +50 -0
  189. package/overlays/ngrok/README.md +503 -0
  190. package/overlays/ngrok/devcontainer.patch.json +3 -0
  191. package/overlays/ngrok/overlay.yml +14 -0
  192. package/overlays/ngrok/setup.sh +125 -0
  193. package/overlays/nodejs/README.md +192 -0
  194. package/overlays/nodejs/devcontainer.patch.json +49 -0
  195. package/overlays/nodejs/global-packages.txt +16 -0
  196. package/overlays/nodejs/overlay.yml +14 -0
  197. package/overlays/nodejs/setup.sh +46 -0
  198. package/overlays/nodejs/verify.sh +32 -0
  199. package/overlays/otel-collector/.env.example +9 -0
  200. package/overlays/otel-collector/README.md +1257 -0
  201. package/overlays/otel-collector/devcontainer.patch.json +28 -0
  202. package/overlays/otel-collector/docker-compose.yml +22 -0
  203. package/overlays/otel-collector/otel-collector-config.yaml +68 -0
  204. package/overlays/otel-collector/overlay.yml +21 -0
  205. package/overlays/otel-collector/setup.sh +49 -0
  206. package/overlays/otel-demo-nodejs/.env.example +2 -0
  207. package/overlays/otel-demo-nodejs/Dockerfile-otel-demo-nodejs +17 -0
  208. package/overlays/otel-demo-nodejs/README.md +409 -0
  209. package/overlays/otel-demo-nodejs/devcontainer.patch.json +12 -0
  210. package/overlays/otel-demo-nodejs/docker-compose.yml +19 -0
  211. package/overlays/otel-demo-nodejs/overlay.yml +23 -0
  212. package/overlays/otel-demo-nodejs/package-otel-demo-nodejs.json +20 -0
  213. package/overlays/otel-demo-nodejs/server-otel-demo-nodejs.js +259 -0
  214. package/overlays/otel-demo-nodejs/tracing-otel-demo-nodejs.js +57 -0
  215. package/overlays/otel-demo-nodejs/verify.sh +31 -0
  216. package/overlays/otel-demo-python/.env.example +2 -0
  217. package/overlays/otel-demo-python/Dockerfile-otel-demo-python +16 -0
  218. package/overlays/otel-demo-python/README.md +82 -0
  219. package/overlays/otel-demo-python/app-otel-demo-python.py +208 -0
  220. package/overlays/otel-demo-python/devcontainer.patch.json +12 -0
  221. package/overlays/otel-demo-python/docker-compose.yml +19 -0
  222. package/overlays/otel-demo-python/overlay.yml +23 -0
  223. package/overlays/otel-demo-python/requirements-otel-demo-python.txt +4 -0
  224. package/overlays/otel-demo-python/verify.sh +31 -0
  225. package/overlays/playwright/README.md +629 -0
  226. package/overlays/playwright/devcontainer.patch.json +9 -0
  227. package/overlays/playwright/overlay.yml +13 -0
  228. package/overlays/postgres/.env.example +6 -0
  229. package/overlays/postgres/README.md +602 -0
  230. package/overlays/postgres/devcontainer.patch.json +21 -0
  231. package/overlays/postgres/docker-compose.yml +22 -0
  232. package/overlays/postgres/overlay.yml +15 -0
  233. package/overlays/postgres/verify.sh +45 -0
  234. package/overlays/powershell/README.md +314 -0
  235. package/overlays/powershell/devcontainer.patch.json +22 -0
  236. package/overlays/powershell/overlay.yml +13 -0
  237. package/overlays/powershell/setup.sh +29 -0
  238. package/overlays/powershell/verify.sh +38 -0
  239. package/overlays/pre-commit/README.md +263 -0
  240. package/overlays/pre-commit/devcontainer.patch.json +9 -0
  241. package/overlays/pre-commit/overlay.yml +16 -0
  242. package/overlays/pre-commit/setup.sh +129 -0
  243. package/overlays/presets/docs-site.yml +118 -0
  244. package/overlays/presets/fullstack.yml +181 -0
  245. package/overlays/presets/microservice.yml +118 -0
  246. package/overlays/presets/web-api.yml +109 -0
  247. package/overlays/prometheus/.env.example +5 -0
  248. package/overlays/prometheus/README.md +1246 -0
  249. package/overlays/prometheus/devcontainer.patch.json +12 -0
  250. package/overlays/prometheus/docker-compose.yml +22 -0
  251. package/overlays/prometheus/overlay.yml +17 -0
  252. package/overlays/prometheus/prometheus.yml +12 -0
  253. package/overlays/prometheus/verify.sh +34 -0
  254. package/overlays/promtail/.env.example +2 -0
  255. package/overlays/promtail/README.md +357 -0
  256. package/overlays/promtail/devcontainer.patch.json +5 -0
  257. package/overlays/promtail/docker-compose.yml +16 -0
  258. package/overlays/promtail/overlay.yml +17 -0
  259. package/overlays/promtail/promtail-config.yaml +60 -0
  260. package/overlays/promtail/verify.sh +31 -0
  261. package/overlays/pulumi/README.md +472 -0
  262. package/overlays/pulumi/devcontainer.patch.json +13 -0
  263. package/overlays/pulumi/overlay.yml +14 -0
  264. package/overlays/pulumi/verify.sh +31 -0
  265. package/overlays/python/README.md +919 -0
  266. package/overlays/python/devcontainer.patch.json +41 -0
  267. package/overlays/python/overlay.yml +12 -0
  268. package/overlays/python/requirements-overlay.txt +13 -0
  269. package/overlays/python/setup.sh +47 -0
  270. package/overlays/python/verify.sh +32 -0
  271. package/overlays/rabbitmq/.env.example +7 -0
  272. package/overlays/rabbitmq/README.md +680 -0
  273. package/overlays/rabbitmq/devcontainer.patch.json +28 -0
  274. package/overlays/rabbitmq/docker-compose.yml +30 -0
  275. package/overlays/rabbitmq/overlay.yml +18 -0
  276. package/overlays/rabbitmq/verify.sh +41 -0
  277. package/overlays/redis/.env.example +4 -0
  278. package/overlays/redis/README.md +776 -0
  279. package/overlays/redis/devcontainer.patch.json +21 -0
  280. package/overlays/redis/docker-compose.yml +21 -0
  281. package/overlays/redis/overlay.yml +15 -0
  282. package/overlays/redis/verify.sh +41 -0
  283. package/overlays/redpanda/.env.example +10 -0
  284. package/overlays/redpanda/README.md +703 -0
  285. package/overlays/redpanda/devcontainer.patch.json +37 -0
  286. package/overlays/redpanda/docker-compose.yml +67 -0
  287. package/overlays/redpanda/overlay.yml +21 -0
  288. package/overlays/redpanda/verify.sh +48 -0
  289. package/overlays/rust/README.md +299 -0
  290. package/overlays/rust/devcontainer.patch.json +39 -0
  291. package/overlays/rust/overlay.yml +15 -0
  292. package/overlays/rust/setup.sh +36 -0
  293. package/overlays/rust/verify.sh +51 -0
  294. package/overlays/sqlite/README.md +584 -0
  295. package/overlays/sqlite/devcontainer.patch.json +14 -0
  296. package/overlays/sqlite/overlay.yml +15 -0
  297. package/overlays/sqlite/setup.sh +27 -0
  298. package/overlays/sqlite/verify.sh +43 -0
  299. package/overlays/sqlserver/.env.example +6 -0
  300. package/overlays/sqlserver/README.md +592 -0
  301. package/overlays/sqlserver/devcontainer.patch.json +22 -0
  302. package/overlays/sqlserver/docker-compose.yml +32 -0
  303. package/overlays/sqlserver/overlay.yml +17 -0
  304. package/overlays/sqlserver/verify.sh +30 -0
  305. package/overlays/tempo/.env.example +5 -0
  306. package/overlays/tempo/README.md +273 -0
  307. package/overlays/tempo/devcontainer.patch.json +12 -0
  308. package/overlays/tempo/docker-compose.yml +20 -0
  309. package/overlays/tempo/overlay.yml +20 -0
  310. package/overlays/tempo/tempo-config.yaml +32 -0
  311. package/overlays/tempo/verify.sh +31 -0
  312. package/overlays/terraform/README.md +389 -0
  313. package/overlays/terraform/devcontainer.patch.json +15 -0
  314. package/overlays/terraform/overlay.yml +14 -0
  315. package/overlays/terraform/verify.sh +63 -0
  316. package/package.json +74 -0
  317. package/templates/README.md +285 -0
  318. package/templates/compose/.devcontainer/devcontainer.json +46 -0
  319. package/templates/compose/.devcontainer/docker-compose.yml +12 -0
  320. package/templates/compose/README.md +20 -0
  321. package/templates/plain/.devcontainer/devcontainer.json +35 -0
  322. package/templates/plain/README.md +21 -0
  323. package/tool/README.md +281 -0
  324. package/tool/schema/base-images.schema.json +43 -0
  325. package/tool/schema/base-templates.schema.json +34 -0
  326. package/tool/schema/config.schema.json +71 -0
  327. package/tool/schema/overlay-manifest.schema.json +86 -0
@@ -0,0 +1,1246 @@
+ # Prometheus Overlay
+
+ Time-series database and monitoring system for collecting and querying metrics from your applications and infrastructure.
+
+ ## Features
+
+ - **Prometheus server** - Latest version with TSDB storage
+ - **PromQL query language** - Powerful metrics querying and aggregation
+ - **Persistent storage** - Data survives container restarts
+ - **Web UI** - Built-in query interface and graph visualization
+ - **Service discovery** - Auto-discovery of scrape targets
+ - **Alerting support** - Rule evaluation and alert generation
+ - **Multi-dimensional data** - Flexible label-based data model
+
+ ## How It Works
+
+ Prometheus is a pull-based monitoring system that periodically scrapes metrics from configured endpoints. It stores time-series data and provides a powerful query language (PromQL) for analysis and alerting.
+
+ **Architecture:**
+
+ ```mermaid
+ graph TD
+     A[Your Application<br/>Exposes /metrics endpoint<br/>Returns metrics in text format] -->|HTTP pull every 15s| B[Prometheus Server<br/>Scrapes metrics<br/>Stores in TSDB<br/>Evaluates rules<br/>Serves UI http://...:9090]
+ ```
+
+ **Metric Types:**
+
+ - **Counter** - Monotonically increasing value (requests, errors)
+ - **Gauge** - Value that can go up or down (memory, temperature)
+ - **Histogram** - Distribution of values in configurable buckets (request durations)
+ - **Summary** - Similar to histogram, with quantiles calculated on the client side
+
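To make the metric types above more concrete, here is a minimal sketch of what a `/metrics` response might look like in the Prometheus text exposition format. The helper `render_metric` and the sample values are hypothetical and only illustrate the format; it does not escape special characters in label values, and a real application would use a client library instead.

```python
def render_metric(name, mtype, help_text, samples):
    """Render one metric family in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Hypothetical helper,
    not part of any Prometheus client library.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {mtype}"]
    for labels, value in samples:
        if labels:
            # Labels are rendered as key="value" pairs inside braces.
            label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
            lines.append(f"{name}{{{label_str}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# A counter only ever goes up; a gauge reports the current level.
counter_text = render_metric(
    "http_requests_total", "counter", "Total HTTP requests",
    [({"method": "GET", "status": "200"}, 1027)],
)
gauge_text = render_metric(
    "active_connections", "gauge", "Number of active connections",
    [({}, 3)],
)
print(counter_text + gauge_text, end="")
```

Each scrape returns the current value of every series; Prometheus attaches the timestamp and stores the sample.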
+ ## Configuration
+
+ ### Ports
+
+ - `9090` - Prometheus web UI and HTTP API
+
+ ### Environment Variables
+
+ The overlay includes a `.env.example` file. Copy it to `.env` and customize:
+
+ ```bash
+ cd .devcontainer
+ cp .env.example .env
+ ```
+
+ **Available variables:**
+
+ ```bash
+ # Prometheus version
+ PROMETHEUS_VERSION=latest
+
+ # Prometheus port (default 9090)
+ PROMETHEUS_PORT=9090
+ ```
+
+ ### Prometheus Configuration File
+
+ Prometheus is configured via `prometheus.yml` with default scrape targets:
+
+ ```yaml
+ global:
+   scrape_interval: 15s
+   evaluation_interval: 15s
+
+ scrape_configs:
+   - job_name: 'prometheus'
+     static_configs:
+       - targets: ['localhost:9090']
+
+   - job_name: 'otel-collector'
+     static_configs:
+       - targets: ['otel-collector:8889']
+ ```
+
+ ### Adding Custom Scrape Targets
+
+ Edit `prometheus.yml` in your project's `.devcontainer` directory:
+
+ ```yaml
+ scrape_configs:
+   # ... existing configs ...
+
+   - job_name: 'my-app'
+     static_configs:
+       - targets: ['my-app:8080']
+     scrape_interval: 5s
+     scrape_timeout: 5s
+     metrics_path: '/metrics'
+     scheme: http
+ ```
+
+ ### Service Discovery
+
+ **File-based service discovery:**
+
+ ```yaml
+ scrape_configs:
+   - job_name: 'services'
+     file_sd_configs:
+       - files:
+           - '/etc/prometheus/targets/*.json'
+         refresh_interval: 30s
+ ```
+
+ **targets.json:**
+
+ ```json
+ [
+   {
+     "targets": ["service1:8080", "service2:8080"],
+     "labels": {
+       "env": "development",
+       "team": "backend"
+     }
+   }
+ ]
+ ```
+
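A targets file like the one above could also be generated by a script. The following is a minimal sketch, assuming a hypothetical helper `write_targets`; it writes to a temporary file first and then renames it, so Prometheus should never read a half-written file. The demo writes into a throwaway directory rather than `/etc/prometheus/targets/`.

```python
import json
import os
import tempfile


def write_targets(path, targets, labels):
    """Atomically write one file_sd target group to `path` (hypothetical helper)."""
    groups = [{"targets": list(targets), "labels": dict(labels)}]
    directory = os.path.dirname(os.path.abspath(path))
    # The ".tmp" suffix keeps the temp file out of a "*.json" glob.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        json.dump(groups, handle, indent=2)
    # os.replace swaps the file in a single step on the same filesystem.
    os.replace(tmp_path, path)
    return groups


# Demo in a temporary directory so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as demo_dir:
    demo_path = os.path.join(demo_dir, "targets.json")
    write_targets(
        demo_path,
        ["service1:8080", "service2:8080"],
        {"env": "development", "team": "backend"},
    )
    with open(demo_path) as handle:
        loaded = json.load(handle)

print(loaded[0]["targets"])
```

With `refresh_interval: 30s`, changes to the file are picked up without restarting Prometheus.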
+ **Docker service discovery:**
+
+ ```yaml
+ scrape_configs:
+   - job_name: 'docker'
+     docker_sd_configs:
+       - host: unix:///var/run/docker.sock
+     relabel_configs:
+       - source_labels: [__meta_docker_container_name]
+         target_label: container
+ ```
+
+ ### Port Configuration
+
+ Ports can be changed via `--port-offset`:
+
+ ```bash
+ # Offset all ports by 100
+ container-superposition --port-offset 100
+
+ # Prometheus will be on 9190 instead of 9090
+ ```
+
+ ## Accessing Prometheus UI
+
+ Open your browser to:
+
+ ```
+ http://localhost:9090
+ ```
+
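The same port also serves the HTTP API, so queries can be scripted. Below is a minimal sketch using only the Python standard library against the `/api/v1/query` instant-query endpoint. The function names are hypothetical, and the canned `sample` response stands in for a live server so the parsing can be tried offline.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"{base_url}/api/v1/query?" + urllib.parse.urlencode({"query": promql})


def parse_vector(payload):
    """Turn an instant-vector response into (labels, float value) pairs."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error')}")
    return [
        (item["metric"], float(item["value"][1]))
        for item in payload["data"]["result"]
    ]


def instant_query(base_url, promql, timeout=5):
    """Run an instant query against a live Prometheus server."""
    url = build_query_url(base_url, promql)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_vector(json.load(resp))


# Canned response in the shape the API returns for an instant vector.
sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"__name__": "up", "job": "prometheus"},
             "value": [1700000000.0, "1"]},
        ],
    },
}
print(build_query_url("http://localhost:9090", "up"))
print(parse_vector(sample))
```

Against a running overlay, `instant_query("http://localhost:9090", "up")` would return one entry per scrape target. Sample values arrive as strings, which is why the sketch converts them to floats.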
+ ### UI Features
+
+ **1. Expression Browser**
+
+ - Execute PromQL queries
+ - View instant values
+ - Graph time series
+ - Export to CSV/JSON
+
+ **2. Targets**
+
+ - View all scrape targets
+ - Check target health (up/down)
+ - See last scrape time
+ - View scrape errors
+
+ **3. Rules**
+
+ - View recording rules
+ - View alerting rules
+ - Check rule evaluation
+
+ **4. Alerts**
+
+ - See active alerts
+ - View alert history
+ - Check alert conditions
+
+ **5. Service Discovery**
+
+ - View discovered targets
+ - Check target labels
+ - Inspect metadata
+
+ ## PromQL Query Examples
+
+ ### Basic Queries
+
+ **Current value of a metric:**
+
+ ```promql
+ http_requests_total
+ ```
+
+ **Filter by labels:**
+
+ ```promql
+ http_requests_total{method="GET", status="200"}
+ ```
+
+ **Regular expression matching:**
+
+ ```promql
+ http_requests_total{status=~"2.."} # 2xx status codes
+ http_requests_total{path!~"/health|/metrics"} # Exclude paths
+ ```
+
+ ### Rate and Increase
+
+ **Request rate (per-second average over the last 5 minutes):**
+
+ ```promql
+ rate(http_requests_total[5m])
+ ```
+
+ **Request rate by status code:**
+
+ ```promql
+ sum by (status) (rate(http_requests_total[5m]))
+ ```
+
+ **Total requests in the last hour:**
+
+ ```promql
+ increase(http_requests_total[1h])
+ ```
+
+ **Requests per minute:**
+
+ ```promql
+ rate(http_requests_total[1m]) * 60
+ ```
+
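To see what `rate()` does with counter samples, here is a simplified sketch. The function `simple_rate` is hypothetical: it handles counter resets in the same spirit as Prometheus, but it skips the extrapolation to the window boundaries that the real function performs, so results can differ slightly from what the server returns.

```python
def simple_rate(samples):
    """Per-second increase of a counter over `samples`, a list of (timestamp, value).

    Simplified model of rate(): handles counter resets, no extrapolation.
    """
    if len(samples) < 2:
        return None  # at least two samples are needed in the window
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        if value < previous:
            # Counter reset (for example a process restart): count from zero again.
            increase += value
        else:
            increase += value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


# Scrapes every 15s; the counter restarts between t=30 and t=45.
window = [(0, 100), (15, 130), (30, 160), (45, 30), (60, 60)]
print(simple_rate(window))  # 2.0
```

The reset handling is why `rate()` and `increase()` should be applied to raw counters before any aggregation such as `sum`.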
+ ### Aggregation
+
+ **Total requests across all instances:**
+
+ ```promql
+ sum(rate(http_requests_total[5m]))
+ ```
+
+ **Average by service:**
+
+ ```promql
+ avg by (service) (http_request_duration_seconds)
+ ```
+
+ **Maximum value:**
+
+ ```promql
+ max(process_resident_memory_bytes)
+ ```
+
+ **Count number of instances:**
+
+ ```promql
+ count(up{job="my-app"})
+ ```
+
+ **Group by multiple labels:**
+
+ ```promql
+ sum by (service, method) (rate(http_requests_total[5m]))
+ ```
+
+ ### Percentiles and Histograms
+
+ **95th percentile latency:**
+
+ ```promql
+ histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))
+ ```
+
+ **99th percentile by endpoint:**
+
+ ```promql
+ histogram_quantile(0.99,
+   sum by (le, endpoint) (rate(http_request_duration_seconds_bucket[5m]))
+ )
+ ```
+
+ **Average request duration:**
+
+ ```promql
+ rate(http_request_duration_seconds_sum[5m]) /
+ rate(http_request_duration_seconds_count[5m])
+ ```
+
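The percentile queries above estimate a quantile from cumulative bucket counts. The following is a rough sketch of that estimation for classic histograms, assuming observations are spread evenly inside each bucket. The function is a hypothetical re-implementation for illustration, not Prometheus code, and the bucket counts are made up.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate the q-quantile from cumulative buckets.

    `buckets` is a list of (upper_bound, cumulative_count) sorted by bound
    and ending with the +Inf bucket, like the `le` series of a histogram.
    """
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank and count > lower_count:
            if math.isinf(upper_bound):
                # Quantile falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            # Linear interpolation inside the bucket that contains the rank.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return lower_bound


# 100 observations: 50 under 0.1s, 90 under 0.5s, all under 1s.
buckets = [(0.1, 50), (0.5, 90), (1.0, 100), (math.inf, 100)]
p95 = histogram_quantile(0.95, buckets)
print(p95)
```

Because the result is interpolated, its accuracy depends on how well the bucket boundaries match the real latency distribution.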
+ ### Error Rate
+
+ **Error rate (5xx responses):**
+
+ ```promql
+ rate(http_requests_total{status=~"5.."}[5m])
+ ```
+
+ **Error ratio:**
+
+ ```promql
+ sum(rate(http_requests_total{status=~"5.."}[5m])) /
+ sum(rate(http_requests_total[5m]))
+ ```
+
+ **Percentage of errors:**
+
+ ```promql
+ 100 * (
+   sum(rate(http_requests_total{status=~"5.."}[5m])) /
+   sum(rate(http_requests_total[5m]))
+ )
+ ```
+
+ ### Prediction and Trending
+
+ **Predict value in 1 hour:**
+
+ ```promql
+ predict_linear(disk_usage_bytes[1h], 3600)
+ ```
+
+ **Per-second rate of change of a gauge:**
+
+ ```promql
+ deriv(cpu_usage_percent[5m])
+ ```
+
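`predict_linear` fits a straight line through the samples in the range and extends it into the future. Here is a minimal sketch of that idea using ordinary least squares. As a simplification, it predicts relative to the last sample rather than the query evaluation time, and the sample data is invented.

```python
def predict_linear(samples, seconds_ahead):
    """Least-squares prediction from `samples`, a list of (timestamp, value).

    Returns the fitted value `seconds_ahead` seconds after the last sample.
    """
    n = len(samples)
    if n < 2:
        return None  # a line needs at least two points
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    variance = sum((t - mean_t) ** 2 for t, _ in samples)
    slope = covariance / variance
    intercept = mean_v - slope * mean_t
    return intercept + slope * (samples[-1][0] + seconds_ahead)


# Disk usage growing by 1 byte per second over the last hour.
usage = [(0, 1000.0), (1800, 2800.0), (3600, 4600.0)]
predicted = predict_linear(usage, 3600)
print(predicted)
```

A linear fit works best on gauges with a steady trend, such as disk usage; bursty series give unreliable predictions.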
+ ### Mathematical Operations
+
+ **Memory usage percentage (node_exporter metrics):**
+
+ ```promql
+ 100 * (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes))
+ ```
+
+ **Rate difference:**
+
+ ```promql
+ rate(http_requests_total{status="200"}[5m]) -
+ rate(http_requests_total{status="200"}[5m] offset 1h)
+ ```
+
+ **Compare with offset:**
+
+ ```promql
+ http_requests_total - http_requests_total offset 1h
+ ```
+
+ ## Application Integration
+
+ ### Node.js (prom-client)
+
+ Install dependency:
+
+ ```bash
+ npm install prom-client
+ ```
+
+ **Basic setup:**
+
+ ```javascript
+ const express = require('express');
+ const promClient = require('prom-client');
+
+ const app = express();
+
+ // Create registry
+ const register = new promClient.Registry();
+
+ // Collect default metrics (CPU, memory, etc.)
+ promClient.collectDefaultMetrics({
+     register,
+     prefix: 'myapp_',
+ });
+
+ // Create custom counter
+ const httpRequestsTotal = new promClient.Counter({
+     name: 'http_requests_total',
+     help: 'Total number of HTTP requests',
+     labelNames: ['method', 'route', 'status_code'],
+     registers: [register],
+ });
+
+ // Create histogram for latency
+ const httpRequestDuration = new promClient.Histogram({
+     name: 'http_request_duration_seconds',
+     help: 'Duration of HTTP requests in seconds',
+     labelNames: ['method', 'route', 'status_code'],
+     buckets: [0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1, 5],
+     registers: [register],
+ });
+
+ // Create gauge for active connections
+ const activeConnections = new promClient.Gauge({
+     name: 'active_connections',
+     help: 'Number of active connections',
+     registers: [register],
+ });
+
+ // Middleware to track metrics
+ app.use((req, res, next) => {
+     const start = Date.now();
+     activeConnections.inc();
+
+     res.on('finish', () => {
+         const duration = (Date.now() - start) / 1000;
+
+         httpRequestsTotal.inc({
+             method: req.method,
+             route: req.route?.path || req.path,
+             status_code: res.statusCode,
+         });
+
+         httpRequestDuration.observe(
+             {
+                 method: req.method,
+                 route: req.route?.path || req.path,
+                 status_code: res.statusCode,
+             },
+             duration
+         );
+
+         activeConnections.dec();
+     });
+
+     next();
+ });
+
+ // Expose metrics endpoint
+ app.get('/metrics', async (req, res) => {
+     res.set('Content-Type', register.contentType);
+     res.end(await register.metrics());
+ });
+
+ app.listen(3000);
+ ```
+
+ ### Python (prometheus_client)
+
+ Install dependency:
+
+ ```bash
+ pip install prometheus-client
+ ```
+
+ **Flask application:**
+
+ ```python
+ from flask import Flask, Response, request
+ from prometheus_client import Counter, Histogram, Gauge, generate_latest, REGISTRY
+ from prometheus_client import make_wsgi_app
+ from werkzeug.middleware.dispatcher import DispatcherMiddleware
+ import time
+
+ app = Flask(__name__)
+
+ # Create metrics
+ REQUEST_COUNT = Counter(
+     'http_requests_total',
+     'Total HTTP requests',
+     ['method', 'endpoint', 'status']
+ )
+
+ REQUEST_DURATION = Histogram(
+     'http_request_duration_seconds',
+     'HTTP request duration',
+     ['method', 'endpoint'],
+     buckets=[0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0]
+ )
+
+ ACTIVE_REQUESTS = Gauge(
+     'active_requests',
+     'Number of active requests'
+ )
+
+ # Middleware to track metrics
+ @app.before_request
+ def before_request():
+     request._start_time = time.time()
+     ACTIVE_REQUESTS.inc()
+
+ @app.after_request
+ def after_request(response):
+     duration = time.time() - request._start_time
+
+     REQUEST_COUNT.labels(
+         method=request.method,
+         endpoint=request.endpoint or request.path,
+         status=response.status_code
+     ).inc()
+
+     REQUEST_DURATION.labels(
+         method=request.method,
+         endpoint=request.endpoint or request.path
+     ).observe(duration)
+
+     ACTIVE_REQUESTS.dec()
+
+     return response
+
+ # Expose metrics endpoint
+ @app.route('/metrics')
+ def metrics():
+     return Response(generate_latest(REGISTRY), mimetype='text/plain')
+
+ # Alternative: mount the WSGI metrics app (use instead of the route above)
+ app.wsgi_app = DispatcherMiddleware(app.wsgi_app, {
+     '/metrics': make_wsgi_app()
+ })
+
+ if __name__ == '__main__':
+     app.run(host='0.0.0.0', port=8080)
+ ```
+
515
+ ### .NET (prometheus-net)
516
+
517
+ Install package:
518
+
519
+ ```bash
520
+ dotnet add package prometheus-net.AspNetCore
521
+ ```
522
+
523
+ **ASP.NET Core configuration:**
524
+
525
+ ```csharp
526
+ using Prometheus;
527
+
528
+ var builder = WebApplication.CreateBuilder(args);
529
+
530
+ var app = builder.Build();
531
+
532
+ // Create custom metrics
533
+ var requestCounter = Metrics.CreateCounter(
534
+ "http_requests_total",
535
+ "Total HTTP requests",
536
+ new CounterConfiguration
537
+ {
538
+ LabelNames = new[] { "method", "endpoint", "status" }
539
+ }
540
+ );
541
+
542
+ var requestDuration = Metrics.CreateHistogram(
543
+ "http_request_duration_seconds",
544
+ "HTTP request duration in seconds",
545
+ new HistogramConfiguration
546
+ {
547
+ LabelNames = new[] { "method", "endpoint" },
548
+ Buckets = new[] { 0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0 }
549
+ }
550
+ );
551
+
552
+ var activeRequests = Metrics.CreateGauge(
553
+ "active_requests",
554
+ "Number of active requests"
555
+ );
556
+
557
+ // Use HTTP metrics middleware
558
+ app.UseHttpMetrics(options =>
559
+ {
560
+ options.AddCustomLabel("host", context => context.Request.Host.Host);
561
+ });
562
+
563
+ // Expose /metrics endpoint
564
+ app.UseMetricServer(); // Uses /metrics by default
565
+
566
+ // Or specify custom path
567
+ // app.MapMetrics("/custom-metrics");
568
+
569
+ // Custom middleware for additional metrics
570
+ app.Use(async (context, next) =>
+ {
+     activeRequests.Inc();
+     try
+     {
+         // Time the rest of the pipeline
+         using (requestDuration
+             .WithLabels(context.Request.Method, context.Request.Path)
+             .NewTimer())
+         {
+             await next();
+         }
+
+         requestCounter
+             .WithLabels(
+                 context.Request.Method,
+                 context.Request.Path,
+                 context.Response.StatusCode.ToString()
+             )
+             .Inc();
+     }
+     finally
+     {
+         // Decrement even if a later middleware throws
+         activeRequests.Dec();
+     }
+ });
591
+
592
+ app.Run();
593
+ ```
594
+
595
+ ### Go (prometheus/client_golang)
596
+
597
+ Install package:
598
+
599
+ ```bash
600
+ go get github.com/prometheus/client_golang/prometheus
601
+ go get github.com/prometheus/client_golang/prometheus/promhttp
602
+ ```
603
+
604
+ **HTTP server with metrics:**
605
+
606
+ ```go
607
+ package main
+
+ import (
+ 	"log"
+ 	"net/http"
+ 	"strconv"
+ 	"time"
+
+ 	"github.com/prometheus/client_golang/prometheus"
+ 	"github.com/prometheus/client_golang/prometheus/promhttp"
+ )
+
+ var (
+ 	httpRequestsTotal = prometheus.NewCounterVec(
+ 		prometheus.CounterOpts{
+ 			Name: "http_requests_total",
+ 			Help: "Total number of HTTP requests",
+ 		},
+ 		[]string{"method", "endpoint", "status"},
+ 	)
+
+ 	httpRequestDuration = prometheus.NewHistogramVec(
+ 		prometheus.HistogramOpts{
+ 			Name:    "http_request_duration_seconds",
+ 			Help:    "HTTP request duration in seconds",
+ 			Buckets: []float64{0.001, 0.005, 0.01, 0.05, 0.1, 0.5, 1.0, 5.0},
+ 		},
+ 		[]string{"method", "endpoint"},
+ 	)
+
+ 	activeRequests = prometheus.NewGauge(
+ 		prometheus.GaugeOpts{
+ 			Name: "active_requests",
+ 			Help: "Number of active requests",
+ 		},
+ 	)
+ )
+
+ func init() {
+ 	prometheus.MustRegister(httpRequestsTotal)
+ 	prometheus.MustRegister(httpRequestDuration)
+ 	prometheus.MustRegister(activeRequests)
+ }
+
+ func metricsMiddleware(next http.HandlerFunc) http.HandlerFunc {
+ 	return func(w http.ResponseWriter, r *http.Request) {
+ 		start := time.Now()
+ 		activeRequests.Inc()
+ 		defer activeRequests.Dec()
+
+ 		// Wrap response writer to capture status code
+ 		wrapped := &responseWriter{ResponseWriter: w, statusCode: http.StatusOK}
+
+ 		next(wrapped, r)
+
+ 		duration := time.Since(start).Seconds()
+
+ 		// Record the numeric status ("200", "500") so that matchers
+ 		// such as status=~"5.." work. r.URL.Path is fine for fixed
+ 		// routes; prefer a route template for parameterized paths.
+ 		httpRequestsTotal.WithLabelValues(
+ 			r.Method,
+ 			r.URL.Path,
+ 			strconv.Itoa(wrapped.statusCode),
+ 		).Inc()
+
+ 		httpRequestDuration.WithLabelValues(
+ 			r.Method,
+ 			r.URL.Path,
+ 		).Observe(duration)
+ 	}
+ }
+
+ // responseWriter remembers the status code written by the handler.
+ type responseWriter struct {
+ 	http.ResponseWriter
+ 	statusCode int
+ }
+
+ func (rw *responseWriter) WriteHeader(code int) {
+ 	rw.statusCode = code
+ 	rw.ResponseWriter.WriteHeader(code)
+ }
+
+ func main() {
+ 	http.Handle("/metrics", promhttp.Handler())
+ 	http.HandleFunc("/api/hello", metricsMiddleware(helloHandler))
+
+ 	// ListenAndServe only returns on failure
+ 	log.Fatal(http.ListenAndServe(":8080", nil))
+ }
+
+ func helloHandler(w http.ResponseWriter, r *http.Request) {
+ 	w.Write([]byte("Hello, World!"))
+ }
695
+ ```
696
+
697
+ ## Recording Rules
698
+
699
+ Recording rules pre-compute expensive queries and store results as new time series.
700
+
701
+ **prometheus.yml:**
702
+
703
+ ```yaml
704
+ rule_files:
705
+ - '/etc/prometheus/rules/*.yml'
706
+ ```
707
+
708
+ **rules/app_rules.yml:**
709
+
710
+ ```yaml
711
+ groups:
712
+ - name: app_rules
713
+ interval: 30s
714
+ rules:
715
+ # Pre-calculate request rate
716
+ - record: job:http_requests:rate5m
717
+ expr: sum by (job) (rate(http_requests_total[5m]))
718
+
719
+ # Pre-calculate error rate
720
+ - record: job:http_errors:rate5m
721
+ expr: |
722
+ sum by (job) (rate(http_requests_total{status=~"5.."}[5m]))
723
+
724
+ # Pre-calculate error ratio
725
+ - record: job:http_errors:ratio5m
726
+ expr: |
727
+ job:http_errors:rate5m / job:http_requests:rate5m
728
+
729
+ # Pre-calculate 95th percentile latency
730
+ - record: job:http_request_duration:p95
731
+ expr: |
732
+ histogram_quantile(0.95,
733
+ sum by (job, le) (rate(http_request_duration_seconds_bucket[5m]))
734
+ )
735
+ ```
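
To make the `rate(...[5m])` expressions in these rules more concrete, the following is a simplified Python sketch of what a rate over counter samples computes. It is an illustration only: `simple_rate` and the sample data are hypothetical, and real Prometheus additionally extrapolates to the edges of the time window.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # (timestamp in seconds, counter value)


def simple_rate(samples: List[Sample]) -> float:
    """Average per-second increase across the samples, tolerating counter resets."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    previous = samples[0][1]
    for _, value in samples[1:]:
        # A drop means the process restarted and the counter began again at zero,
        # so everything counted since the restart is new increase.
        increase += value if value < previous else value - previous
        previous = value
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed


if __name__ == "__main__":
    # Hypothetical scrape results; the counter resets between t=15 and t=30.
    scraped = [(0, 100), (15, 130), (30, 10), (45, 40)]
    print(f"{simple_rate(scraped):.3f} requests/second")
```

Because the reset is absorbed inside the rate calculation, a recording rule built on `rate()` stays meaningful across application restarts.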
736
+
737
+ ## Alerting Rules
738
+
739
+ Alerting rules fire when a PromQL expression stays true for the rule's `for` duration.
740
+
741
+ **rules/alerts.yml:**
742
+
743
+ ```yaml
744
+ groups:
745
+ - name: app_alerts
746
+ interval: 15s
747
+ rules:
748
+ # Alert when error rate is high
749
+ - alert: HighErrorRate
750
+ expr: |
751
+ (
752
+ sum(rate(http_requests_total{status=~"5.."}[5m])) /
753
+ sum(rate(http_requests_total[5m]))
754
+ ) > 0.05
755
+ for: 5m
756
+ labels:
757
+ severity: warning
758
+ annotations:
759
+ summary: 'High error rate detected'
760
+ description: 'Error rate is {{ $value | humanizePercentage }} (threshold: 5%)'
761
+
762
+ # Alert when latency is high
763
+ - alert: HighLatency
764
+ expr: |
765
+ histogram_quantile(0.95,
766
+ sum by (le) (rate(http_request_duration_seconds_bucket[5m]))
767
+ ) > 1.0
768
+ for: 10m
769
+ labels:
770
+ severity: warning
771
+ annotations:
772
+ summary: 'High latency detected'
773
+ description: '95th percentile latency is {{ $value }}s'
774
+
775
+ # Alert when service is down
776
+ - alert: ServiceDown
777
+ expr: up{job="my-app"} == 0
778
+ for: 1m
779
+ labels:
780
+ severity: critical
781
+ annotations:
782
+ summary: 'Service {{ $labels.job }} is down'
783
+ description: '{{ $labels.instance }} has been down for more than 1 minute'
784
+
785
+ # Alert when disk space is low
786
+ - alert: DiskSpaceLow
787
+ expr: |
788
+ (
789
+ node_filesystem_avail_bytes{mountpoint="/"} /
790
+ node_filesystem_size_bytes{mountpoint="/"}
791
+ ) < 0.1
792
+ for: 5m
793
+ labels:
794
+ severity: warning
795
+ annotations:
796
+ summary: 'Disk space is running low'
797
+ description: 'Only {{ $value | humanizePercentage }} disk space remaining'
798
+ ```
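
The `for:` clause in these rules means an alert is first *pending* and only becomes *firing* once the expression has been true on every evaluation for that long. The Python sketch below simulates that behaviour under simplifying assumptions; `alert_states` is a hypothetical helper, not part of Prometheus.

```python
from typing import Iterable, List


def alert_states(condition_results: Iterable[bool],
                 eval_interval_s: int,
                 for_s: int) -> List[str]:
    """Return the alert state after each rule evaluation."""
    states = []
    active_since = None  # evaluation time at which the condition became true
    for i, is_true in enumerate(condition_results):
        now = i * eval_interval_s
        if not is_true:
            # Any false evaluation resets the alert completely.
            active_since = None
            states.append("inactive")
            continue
        if active_since is None:
            active_since = now
        held_for = now - active_since
        states.append("firing" if held_for >= for_s else "pending")
    return states


if __name__ == "__main__":
    # Evaluated every 15s with `for: 30s`.
    print(alert_states([False, True, True, True, False, True], 15, 30))
```

A single false evaluation resets the timer, which is why a longer `for:` value suppresses alerts on short spikes.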
799
+
800
+ ## Grafana Integration
801
+
802
+ ### Adding Prometheus Data Source
803
+
804
+ 1. Open Grafana: http://localhost:3000
805
+ 2. Go to **Configuration** → **Data Sources** (**Connections** → **Data sources** in newer Grafana versions)
806
+ 3. Click **Add data source**
807
+ 4. Select **Prometheus**
808
+ 5. Set URL: `http://prometheus:9090`
809
+ 6. Click **Save & Test**
810
+
811
+ ### Example Dashboard JSON
812
+
813
+ ```json
814
+ {
815
+ "dashboard": {
816
+ "title": "Application Metrics",
817
+ "panels": [
818
+ {
819
+ "title": "Request Rate",
820
+ "targets": [
821
+ {
822
+ "expr": "rate(http_requests_total[5m])",
823
+ "legendFormat": "{{method}} {{status}}"
824
+ }
825
+ ],
826
+ "type": "graph"
827
+ },
828
+ {
829
+ "title": "Error Rate",
830
+ "targets": [
831
+ {
832
+ "expr": "sum(rate(http_requests_total{status=~\"5..\"}[5m]))",
833
+ "legendFormat": "Errors"
834
+ }
835
+ ],
836
+ "type": "graph"
837
+ },
838
+ {
839
+ "title": "Latency (95th percentile)",
840
+ "targets": [
841
+ {
842
+ "expr": "histogram_quantile(0.95, sum by (le) (rate(http_request_duration_seconds_bucket[5m])))",
843
+ "legendFormat": "p95"
844
+ }
845
+ ],
846
+ "type": "graph"
847
+ }
848
+ ]
849
+ }
850
+ }
851
+ ```
852
+
853
+ ## Best Practices
854
+
855
+ ### Metric Naming Conventions
856
+
857
+ **✅ Good names:**
858
+
859
+ - `http_requests_total` - Counter with `_total` suffix
860
+ - `http_request_duration_seconds` - Histogram with unit
861
+ - `active_connections` - Gauge, no suffix
862
+ - `process_cpu_seconds_total` - Counter with unit and `_total`
863
+
864
+ **❌ Bad names:**
865
+
866
+ - `requests` - Too generic, no type/unit
867
+ - `http_request_duration` - Missing unit
868
+ - `activeConnections` - Use snake_case, not camelCase
869
+ - `http_requests_count` - Use `_total`, not `_count`
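
The conventions above can be checked mechanically. The following Python sketch covers only the snake_case and `_total` rules (it does not try to validate units); `check_metric_name` is a hypothetical helper, and Prometheus itself accepts a wider character set than this convention allows.

```python
import re
from typing import List

# Lowercase words separated by single underscores.
_SNAKE_CASE = re.compile(r"^[a-z][a-z0-9]*(_[a-z0-9]+)*$")


def check_metric_name(name: str, metric_type: str) -> List[str]:
    """Return a list of convention violations (empty when the name looks fine)."""
    problems = []
    if not _SNAKE_CASE.match(name):
        problems.append("use snake_case")
    if metric_type == "counter" and not name.endswith("_total"):
        problems.append("counter names should end with _total")
    if metric_type != "counter" and name.endswith("_total"):
        problems.append("only counters should end with _total")
    return problems


if __name__ == "__main__":
    for name, kind in [("http_requests_total", "counter"),
                       ("activeConnections", "gauge"),
                       ("http_requests_count", "counter")]:
        print(name, check_metric_name(name, kind))
```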
870
+
871
+ ### Label Best Practices
872
+
873
+ **✅ Good labels:**
874
+
875
+ ```promql
876
+ http_requests_total{method="GET", status="200", service="api"}
877
+ # Low cardinality, meaningful dimensions
878
+ ```
879
+
880
+ **❌ Bad labels:**
881
+
882
+ ```promql
883
+ http_requests_total{user_id="12345", timestamp="1234567890"}
884
+ # High cardinality, creates millions of time series
885
+ ```
886
+
887
+ **Keep label cardinality low:**
888
+
889
+ - ✅ `status="200"` - ~5-10 values
890
+ - ✅ `method="GET"` - ~10 values
891
+ - ✅ `service="api"` - ~10-100 values
892
+ - ❌ `user_id="12345"` - Millions of values
893
+ - ❌ `request_id="abc-123"` - Unique per request
894
+ - ❌ `timestamp="..."` - Always unique
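
To see why a single high-cardinality label is so costly, note that the worst-case number of time series for one metric is the product of the distinct values of each label. A minimal Python sketch, using the rough value counts from the list above as assumed inputs (`max_series` is a hypothetical helper):

```python
from math import prod
from typing import Dict


def max_series(label_cardinalities: Dict[str, int]) -> int:
    """Upper bound on time series for one metric: product of label value counts."""
    return prod(label_cardinalities.values())


if __name__ == "__main__":
    low = {"method": 10, "status": 10, "service": 100}
    high = dict(low, user_id=1_000_000)
    print(f"without user_id: {max_series(low):,} series")
    print(f"with user_id:    {max_series(high):,} series")
```

In practice not every combination occurs, so this is an upper bound rather than a prediction.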
895
+
896
+ ### Metric Types Usage
897
+
898
+ **Counter** - Values that only increase (and reset to zero on restart):
899
+
900
+ ```python
901
+ # ✅ Correct
902
+ requests_total.inc()
903
+ errors_total.inc()
904
+ bytes_sent_total.inc(size)
905
+
906
+ # ❌ Wrong
907
+ active_connections.inc() # Use Gauge instead
908
+ ```
909
+
910
+ **Gauge** - Values that go up and down:
911
+
912
+ ```python
913
+ # ✅ Correct
914
+ temperature.set(23.5)
915
+ active_connections.inc()
916
+ active_connections.dec()
917
+ memory_usage.set(1024)
918
+
919
+ # ❌ Wrong
920
+ requests_total.set(100) # Use Counter instead
921
+ ```
922
+
923
+ **Histogram** - Distribution of values:
924
+
925
+ ```python
926
+ # ✅ Correct
927
+ request_duration.observe(0.234)
928
+ response_size.observe(1024)
929
+
930
+ # Define appropriate buckets
931
+ Histogram('http_request_duration_seconds', 'HTTP request duration in seconds',
932
+ buckets=[0.001, 0.01, 0.1, 1.0, 10.0])
933
+ ```
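
Bucket choice matters because quantiles are estimated from cumulative bucket counts rather than from raw observations. The Python sketch below shows the idea: observations are counted into cumulative `le` buckets, and a quantile is estimated by linear interpolation inside the bucket that contains the target rank, which is the approach `histogram_quantile` is documented to use. The helper names are hypothetical, and the sketch assumes at least one finite bucket bound and non-negative observations.

```python
import bisect
import math
from typing import List, Sequence


def cumulative_counts(observations: Sequence[float],
                      upper_bounds: Sequence[float]) -> List[int]:
    """Cumulative count per `le` bucket, with a final +Inf bucket appended."""
    bounds = list(upper_bounds) + [math.inf]
    counts = [0] * len(bounds)
    for value in observations:
        # First bucket whose upper bound is >= value.
        counts[bisect.bisect_left(bounds, value)] += 1
    for i in range(1, len(counts)):
        counts[i] += counts[i - 1]
    return counts


def estimate_quantile(q: float, upper_bounds: Sequence[float],
                      counts: Sequence[int]) -> float:
    """Estimate a quantile (0 < q <= 1) from cumulative bucket counts."""
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    bounds = list(upper_bounds) + [math.inf]
    total = counts[-1]
    if total == 0:
        return math.nan
    rank = q * total
    idx = next(i for i, c in enumerate(counts) if c >= rank)
    if idx == len(bounds) - 1:
        # Rank falls in the +Inf bucket: report the highest finite bound.
        return bounds[-2]
    lower = bounds[idx - 1] if idx > 0 else 0.0
    below = counts[idx - 1] if idx > 0 else 0
    in_bucket = counts[idx] - below
    return lower + (bounds[idx] - lower) * (rank - below) / in_bucket


if __name__ == "__main__":
    bounds = [0.1, 0.5, 1.0]
    durations = [0.05, 0.2, 0.3, 0.4, 0.7, 0.9, 0.95, 2.0]
    counts = cumulative_counts(durations, bounds)
    print(counts, estimate_quantile(0.75, bounds, counts))
```

The estimate can never be more precise than the bucket it falls into, so buckets should be densest around the latencies that matter for alerting.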
934
+
935
+ ### Query Optimization
936
+
937
+ **Use recording rules for expensive queries:**
938
+
939
+ ```yaml
940
+ # Instead of running this complex query repeatedly:
941
+ histogram_quantile(0.95,
942
+ sum by (service, le) (rate(http_request_duration_seconds_bucket[5m]))
943
+ )
944
+
945
+ # Pre-calculate with recording rule:
946
+ - record: service:http_request_duration:p95
947
+ expr: |
948
+ histogram_quantile(0.95,
949
+ sum by (service, le) (rate(http_request_duration_seconds_bucket[5m]))
950
+ )
951
+ ```
952
+
953
+ **Filter early, aggregate late:**
954
+
955
+ ```promql
956
+ # ✅ Good - Filter first
957
+ sum(rate(http_requests_total{service="api", status="200"}[5m]))
958
+
959
+ # ❌ Bad - Aggregates first; sum() has already dropped the service label, so the filter cannot match
960
+ sum(rate(http_requests_total[5m])) and {service="api"}
961
+ ```
962
+
963
+ ## Performance Tuning
964
+
965
+ ### Storage Optimization
966
+
967
+ **Adjust retention:**
968
+
969
+ ```yaml
970
+ # docker-compose.yml
971
+ services:
972
+ prometheus:
973
+ command:
974
+ - '--storage.tsdb.retention.time=30d' # Keep 30 days
975
+ - '--storage.tsdb.retention.size=10GB' # Max 10GB
976
+ ```
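
When choosing retention values, a rough disk estimate helps. The Prometheus storage documentation gives the rule of thumb `needed_disk_space = retention_time_seconds * ingested_samples_per_second * bytes_per_sample`, with roughly 1-2 bytes per sample on average. A small Python sketch of that formula (`estimate_disk_bytes` is a hypothetical helper, and the ingestion rate is an assumed example value):

```python
SECONDS_PER_DAY = 86_400


def estimate_disk_bytes(retention_days: float,
                        samples_per_second: float,
                        bytes_per_sample: float = 2.0) -> float:
    """Rule-of-thumb TSDB size: retention * ingestion rate * bytes per sample."""
    return retention_days * SECONDS_PER_DAY * samples_per_second * bytes_per_sample


if __name__ == "__main__":
    # Assumed example: 30 days retention at 10,000 samples/second.
    size = estimate_disk_bytes(30, 10_000)
    print(f"{size / 1e9:.2f} GB")
```

The ingestion rate can be read from `rate(prometheus_tsdb_head_samples_appended_total[5m])` on a running instance.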
977
+
978
+ **Pin the TSDB block duration (equal min and max values disable local compaction; use with care):**
979
+
980
+ ```yaml
981
+ command:
982
+ - '--storage.tsdb.min-block-duration=2h'
983
+ - '--storage.tsdb.max-block-duration=2h'
984
+ ```
985
+
986
+ ### Scrape Configuration
987
+
988
+ **Adjust scrape intervals:**
989
+
990
+ ```yaml
991
+ global:
992
+ scrape_interval: 15s # Default
993
+ scrape_timeout: 10s
994
+ evaluation_interval: 15s # How often to evaluate rules
995
+
996
+ scrape_configs:
997
+ - job_name: 'high-frequency'
998
+ scrape_interval: 5s # Override for specific job
999
+ static_configs:
1000
+ - targets: ['app:8080']
1001
+
1002
+ - job_name: 'low-frequency'
1003
+ scrape_interval: 60s # Less frequent scraping
1004
+ static_configs:
1005
+ - targets: ['batch:8080']
1006
+ ```
1007
+
1008
+ ### Memory Management
1009
+
1010
+ **Limit memory usage in docker-compose.yml:**
1011
+
1012
+ ```yaml
1013
+ services:
1014
+ prometheus:
1015
+ mem_limit: 2g
1016
+ environment:
1017
+ - GOGC=50 # More aggressive garbage collection
1018
+ ```
1019
+
1020
+ **Monitor Prometheus itself:**
1021
+
1022
+ ```promql
1023
+ # TSDB size
1024
+ prometheus_tsdb_storage_blocks_bytes
1025
+
1026
+ # Number of time series
1027
+ prometheus_tsdb_head_series
1028
+
1029
+ # Ingestion rate
1030
+ rate(prometheus_tsdb_head_samples_appended_total[5m])
1031
+ ```
1032
+
1033
+ ## Remote Write/Read
1034
+
1035
+ ### Remote Write to Long-term Storage
1036
+
1037
+ **prometheus.yml:**
1038
+
1039
+ ```yaml
1040
+ remote_write:
1041
+ - url: 'http://remote-storage:9201/write'
1042
+ queue_config:
1043
+ capacity: 10000
1044
+ max_shards: 50
1045
+ max_samples_per_send: 5000
1046
+ write_relabel_configs:
1047
+ - source_labels: [__name__]
1048
+ regex: 'expensive_.*'
1049
+ action: drop
1050
+ ```
1051
+
1052
+ ### Remote Read
1053
+
1054
+ ```yaml
1055
+ remote_read:
1056
+ - url: 'http://remote-storage:9201/read'
1057
+ read_recent: true
1058
+ ```
1059
+
1060
+ ## Troubleshooting
1061
+
1062
+ ### Target not being scraped
1063
+
1064
+ **Check targets page:**
1065
+
1066
+ ```
1067
+ http://localhost:9090/targets
1068
+ ```
1069
+
1070
+ **Verify target is reachable:**
1071
+
1072
+ ```bash
1073
+ # From Prometheus container
1074
+ docker-compose exec prometheus wget -O- http://my-app:8080/metrics
1075
+
1076
+ # From dev container
1077
+ curl http://my-app:8080/metrics
1078
+ ```
1079
+
1080
+ **Check logs:**
1081
+
1082
+ ```bash
1083
+ docker-compose logs prometheus | grep -i error
1084
+ ```
1085
+
1086
+ **Common issues:**
1087
+
1088
+ - Wrong hostname (use Docker service name, not localhost)
1089
+ - Wrong port
1090
+ - Metrics endpoint not implemented
1091
+ - Network not shared (check `networks:` in docker-compose)
1092
+
1093
+ ### Metrics not appearing
1094
+
1095
+ **Verify metric format:**
1096
+
1097
+ ```bash
1098
+ curl http://localhost:8080/metrics
1099
+ ```
1100
+
1101
+ **Should return:**
1102
+
1103
+ ```
1104
+ # HELP http_requests_total Total HTTP requests
1105
+ # TYPE http_requests_total counter
1106
+ http_requests_total{method="GET",status="200"} 1234
1107
+ ```
1108
+
1109
+ **Check:**
1110
+
1111
+ - Correct Content-Type: `text/plain; version=0.0.4`
1112
+ - Metric names follow conventions (snake_case)
1113
+ - Counter names end with `_total`
1114
+ - Include unit in name (`_seconds`, `_bytes`)
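
When debugging an endpoint by hand, it can help to parse a sample line and inspect its parts. The Python sketch below handles the simple case shown above (metric name, optional labels, value). It is a hypothetical helper rather than a full parser: it ignores optional timestamps and does not unescape label values.

```python
import re
from typing import Dict, Tuple

_SAMPLE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_sample(line: str) -> Tuple[str, Dict[str, str], float]:
    """Split one exposition-format sample line into (name, labels, value)."""
    match = _SAMPLE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    name, raw_labels, value = match.groups()
    labels = dict(_LABEL.findall(raw_labels or ""))
    return name, labels, float(value)


if __name__ == "__main__":
    print(parse_sample('http_requests_total{method="GET",status="200"} 1234'))
```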
1115
+
1116
+ ### High memory usage
1117
+
1118
+ **Check time series count:**
1119
+
1120
+ ```promql
1121
+ prometheus_tsdb_head_series
1122
+ ```
1123
+
1124
+ **If too high (>1 million):**
1125
+
1126
+ 1. Reduce label cardinality
1127
+ 2. Drop unnecessary metrics with relabel configs
1128
+ 3. Decrease retention time
1129
+ 4. Use recording rules
1130
+
1131
+ **Drop metrics:**
1132
+
1133
+ ```yaml
1134
+ scrape_configs:
1135
+ - job_name: 'my-app'
1136
+ static_configs:
1137
+ - targets: ['my-app:8080']
1138
+ metric_relabel_configs:
1139
+ - source_labels: [__name__]
1140
+ regex: 'go_.*|process_.*' # Drop Go runtime and process metrics
1141
+ action: drop
1142
+ ```
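
Relabeling regexes are anchored at both ends, so `go_.*` matches `go_goroutines` but not a name that merely contains `go_`. The Python sketch below previews which names a `drop` rule would remove, using `re.fullmatch` to mimic the anchoring. `drop_matching` and the metric names are hypothetical, and Prometheus uses RE2 syntax, which differs from Python's `re` in some advanced features.

```python
import re
from typing import Iterable, List


def drop_matching(metric_names: Iterable[str], regex: str) -> List[str]:
    """Return the names that survive a `drop` rule with the given regex."""
    pattern = re.compile(regex)
    # fullmatch mimics the implicit ^...$ anchoring of relabel regexes.
    return [name for name in metric_names if pattern.fullmatch(name) is None]


if __name__ == "__main__":
    names = ["go_goroutines", "process_cpu_seconds_total",
             "http_requests_total", "cargo_items_total"]
    print(drop_matching(names, r"go_.*|process_.*"))
```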
1143
+
1144
+ ### Slow queries
1145
+
1146
+ **Check TSDB status (series and label cardinality):**
1147
+
1148
+ ```
1149
+ http://localhost:9090/tsdb-status
1150
+ ```
1151
+
1152
+ **Optimize queries:**
1153
+
1154
+ ```promql
1155
+ # ❌ Invalid - PromQL rejects label matchers placed after the function call
+ rate(http_requests_total[5m]){status="200"}
+
+ # ✅ Valid and fast - Matchers inside the selector filter series before rate()
+ rate(http_requests_total{status="200"}[5m])
1160
+ ```
1161
+
1162
+ **Use recording rules for repeated queries:**
1163
+
1164
+ ```yaml
1165
+ - record: job:http_requests:rate5m
1166
+ expr: sum by (job) (rate(http_requests_total[5m]))
1167
+ ```
1168
+
1169
+ ### Data not persisting
1170
+
1171
+ **Check volume mount:**
1172
+
1173
+ ```bash
1174
+ docker volume ls
1175
+ docker volume inspect <prometheus_volume>
1176
+ ```
1177
+
1178
+ **Verify in docker-compose.yml:**
1179
+
1180
+ ```yaml
1181
+ services:
1182
+ prometheus:
1183
+ volumes:
1184
+ - prometheus_data:/prometheus
1185
+
1186
+ volumes:
1187
+ prometheus_data:
1188
+ ```
1189
+
1190
+ ## Use Cases
1191
+
1192
+ ### Application Monitoring
1193
+
1194
+ - Track request rates and latencies
1195
+ - Monitor error rates
1196
+ - Measure business metrics (orders, sign-ups)
1197
+ - Track API usage
1198
+
1199
+ ### Infrastructure Monitoring
1200
+
1201
+ - CPU, memory, disk usage
1202
+ - Network traffic
1203
+ - Container metrics
1204
+ - Database performance
1205
+
1206
+ ### SLI/SLO Tracking
1207
+
1208
+ - Service availability
1209
+ - Request latency percentiles
1210
+ - Error budget consumption
1211
+ - SLA compliance
1212
+
1213
+ ### Capacity Planning
1214
+
1215
+ - Resource utilization trends
1216
+ - Growth predictions
1217
+ - Scaling triggers
1218
+ - Cost optimization
1219
+
1220
+ ## Related Overlays
1221
+
1222
+ - **grafana** - Visualization and dashboards for Prometheus metrics
1223
+ - **otel-collector** - Collect metrics via OpenTelemetry and export to Prometheus
1224
+ - **jaeger** - Correlation between metrics and traces
1225
+ - **loki** - Correlation between metrics and logs
1226
+ - **nodejs/python/dotnet/go** - Application frameworks with Prometheus clients
1227
+
1228
+ ## Additional Resources
1229
+
1230
+ - [Prometheus Documentation](https://prometheus.io/docs/)
1231
+ - [PromQL Basics](https://prometheus.io/docs/prometheus/latest/querying/basics/)
1232
+ - [Best Practices](https://prometheus.io/docs/practices/)
1233
+ - [Metric Types](https://prometheus.io/docs/concepts/metric_types/)
1234
+ - [Recording Rules](https://prometheus.io/docs/prometheus/latest/configuration/recording_rules/)
1235
+ - [Alerting Rules](https://prometheus.io/docs/prometheus/latest/configuration/alerting_rules/)
1236
+ - [Awesome Prometheus](https://github.com/roaldnefs/awesome-prometheus)
1237
+
1238
+ ## Notes
1239
+
1240
+ - This overlay **requires the compose stack** (uses docker-compose)
1241
+ - Prometheus runs on port **9090** (configurable with port-offset)
1242
+ - Data persists in Docker volume `prometheus_data`
1243
+ - Default retention is **15 days** (configurable via command args)
1244
+ - Use hostname **`prometheus`** from other containers
1245
+ - Use **`localhost`** from host machine
1246
+ - Scrape interval defaults to **15 seconds**