claude-autopm 2.7.0 → 2.8.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (281) hide show
  1. package/README.md +307 -56
  2. package/autopm/.claude/.env +158 -0
  3. package/autopm/.claude/settings.local.json +9 -0
  4. package/bin/autopm.js +11 -2
  5. package/bin/commands/epic.js +23 -3
  6. package/bin/commands/plugin.js +395 -0
  7. package/bin/commands/team.js +184 -10
  8. package/install/install.js +223 -4
  9. package/lib/cli/commands/issue.js +360 -20
  10. package/lib/plugins/PluginManager.js +1328 -0
  11. package/lib/plugins/PluginManager.old.js +400 -0
  12. package/lib/providers/AzureDevOpsProvider.js +575 -0
  13. package/lib/providers/GitHubProvider.js +475 -0
  14. package/lib/services/EpicService.js +1092 -3
  15. package/lib/services/IssueService.js +991 -0
  16. package/package.json +9 -1
  17. package/scripts/publish-plugins.sh +166 -0
  18. package/autopm/.claude/agents/cloud/README.md +0 -55
  19. package/autopm/.claude/agents/cloud/aws-cloud-architect.md +0 -521
  20. package/autopm/.claude/agents/cloud/azure-cloud-architect.md +0 -436
  21. package/autopm/.claude/agents/cloud/gcp-cloud-architect.md +0 -385
  22. package/autopm/.claude/agents/cloud/gcp-cloud-functions-engineer.md +0 -306
  23. package/autopm/.claude/agents/cloud/gemini-api-expert.md +0 -880
  24. package/autopm/.claude/agents/cloud/kubernetes-orchestrator.md +0 -566
  25. package/autopm/.claude/agents/cloud/openai-python-expert.md +0 -1087
  26. package/autopm/.claude/agents/cloud/terraform-infrastructure-expert.md +0 -454
  27. package/autopm/.claude/agents/core/agent-manager.md +0 -296
  28. package/autopm/.claude/agents/core/code-analyzer.md +0 -131
  29. package/autopm/.claude/agents/core/file-analyzer.md +0 -162
  30. package/autopm/.claude/agents/core/test-runner.md +0 -200
  31. package/autopm/.claude/agents/data/airflow-orchestration-expert.md +0 -52
  32. package/autopm/.claude/agents/data/kedro-pipeline-expert.md +0 -50
  33. package/autopm/.claude/agents/data/langgraph-workflow-expert.md +0 -520
  34. package/autopm/.claude/agents/databases/README.md +0 -50
  35. package/autopm/.claude/agents/databases/bigquery-expert.md +0 -392
  36. package/autopm/.claude/agents/databases/cosmosdb-expert.md +0 -368
  37. package/autopm/.claude/agents/databases/mongodb-expert.md +0 -398
  38. package/autopm/.claude/agents/databases/postgresql-expert.md +0 -321
  39. package/autopm/.claude/agents/databases/redis-expert.md +0 -52
  40. package/autopm/.claude/agents/devops/README.md +0 -52
  41. package/autopm/.claude/agents/devops/azure-devops-specialist.md +0 -308
  42. package/autopm/.claude/agents/devops/docker-containerization-expert.md +0 -298
  43. package/autopm/.claude/agents/devops/github-operations-specialist.md +0 -335
  44. package/autopm/.claude/agents/devops/mcp-context-manager.md +0 -319
  45. package/autopm/.claude/agents/devops/observability-engineer.md +0 -574
  46. package/autopm/.claude/agents/devops/ssh-operations-expert.md +0 -1093
  47. package/autopm/.claude/agents/devops/traefik-proxy-expert.md +0 -444
  48. package/autopm/.claude/agents/frameworks/README.md +0 -64
  49. package/autopm/.claude/agents/frameworks/e2e-test-engineer.md +0 -360
  50. package/autopm/.claude/agents/frameworks/nats-messaging-expert.md +0 -254
  51. package/autopm/.claude/agents/frameworks/react-frontend-engineer.md +0 -217
  52. package/autopm/.claude/agents/frameworks/react-ui-expert.md +0 -226
  53. package/autopm/.claude/agents/frameworks/tailwindcss-expert.md +0 -770
  54. package/autopm/.claude/agents/frameworks/ux-design-expert.md +0 -244
  55. package/autopm/.claude/agents/integration/message-queue-engineer.md +0 -794
  56. package/autopm/.claude/agents/languages/README.md +0 -50
  57. package/autopm/.claude/agents/languages/bash-scripting-expert.md +0 -541
  58. package/autopm/.claude/agents/languages/javascript-frontend-engineer.md +0 -197
  59. package/autopm/.claude/agents/languages/nodejs-backend-engineer.md +0 -226
  60. package/autopm/.claude/agents/languages/python-backend-engineer.md +0 -214
  61. package/autopm/.claude/agents/languages/python-backend-expert.md +0 -289
  62. package/autopm/.claude/agents/testing/frontend-testing-engineer.md +0 -395
  63. package/autopm/.claude/commands/ai/langgraph-workflow.md +0 -65
  64. package/autopm/.claude/commands/ai/openai-chat.md +0 -65
  65. package/autopm/.claude/commands/azure/COMMANDS.md +0 -107
  66. package/autopm/.claude/commands/azure/COMMAND_MAPPING.md +0 -252
  67. package/autopm/.claude/commands/azure/INTEGRATION_FIX.md +0 -103
  68. package/autopm/.claude/commands/azure/README.md +0 -246
  69. package/autopm/.claude/commands/azure/active-work.md +0 -198
  70. package/autopm/.claude/commands/azure/aliases.md +0 -143
  71. package/autopm/.claude/commands/azure/blocked-items.md +0 -287
  72. package/autopm/.claude/commands/azure/clean.md +0 -93
  73. package/autopm/.claude/commands/azure/docs-query.md +0 -48
  74. package/autopm/.claude/commands/azure/feature-decompose.md +0 -380
  75. package/autopm/.claude/commands/azure/feature-list.md +0 -61
  76. package/autopm/.claude/commands/azure/feature-new.md +0 -115
  77. package/autopm/.claude/commands/azure/feature-show.md +0 -205
  78. package/autopm/.claude/commands/azure/feature-start.md +0 -130
  79. package/autopm/.claude/commands/azure/fix-integration-example.md +0 -93
  80. package/autopm/.claude/commands/azure/help.md +0 -150
  81. package/autopm/.claude/commands/azure/import-us.md +0 -269
  82. package/autopm/.claude/commands/azure/init.md +0 -211
  83. package/autopm/.claude/commands/azure/next-task.md +0 -262
  84. package/autopm/.claude/commands/azure/search.md +0 -160
  85. package/autopm/.claude/commands/azure/sprint-status.md +0 -235
  86. package/autopm/.claude/commands/azure/standup.md +0 -260
  87. package/autopm/.claude/commands/azure/sync-all.md +0 -99
  88. package/autopm/.claude/commands/azure/task-analyze.md +0 -186
  89. package/autopm/.claude/commands/azure/task-close.md +0 -329
  90. package/autopm/.claude/commands/azure/task-edit.md +0 -145
  91. package/autopm/.claude/commands/azure/task-list.md +0 -263
  92. package/autopm/.claude/commands/azure/task-new.md +0 -84
  93. package/autopm/.claude/commands/azure/task-reopen.md +0 -79
  94. package/autopm/.claude/commands/azure/task-show.md +0 -126
  95. package/autopm/.claude/commands/azure/task-start.md +0 -301
  96. package/autopm/.claude/commands/azure/task-status.md +0 -65
  97. package/autopm/.claude/commands/azure/task-sync.md +0 -67
  98. package/autopm/.claude/commands/azure/us-edit.md +0 -164
  99. package/autopm/.claude/commands/azure/us-list.md +0 -202
  100. package/autopm/.claude/commands/azure/us-new.md +0 -265
  101. package/autopm/.claude/commands/azure/us-parse.md +0 -253
  102. package/autopm/.claude/commands/azure/us-show.md +0 -188
  103. package/autopm/.claude/commands/azure/us-status.md +0 -320
  104. package/autopm/.claude/commands/azure/validate.md +0 -86
  105. package/autopm/.claude/commands/azure/work-item-sync.md +0 -47
  106. package/autopm/.claude/commands/cloud/infra-deploy.md +0 -38
  107. package/autopm/.claude/commands/github/workflow-create.md +0 -42
  108. package/autopm/.claude/commands/infrastructure/ssh-security.md +0 -65
  109. package/autopm/.claude/commands/infrastructure/traefik-setup.md +0 -65
  110. package/autopm/.claude/commands/kubernetes/deploy.md +0 -37
  111. package/autopm/.claude/commands/playwright/test-scaffold.md +0 -38
  112. package/autopm/.claude/commands/pm/blocked.md +0 -28
  113. package/autopm/.claude/commands/pm/clean.md +0 -119
  114. package/autopm/.claude/commands/pm/context-create.md +0 -136
  115. package/autopm/.claude/commands/pm/context-prime.md +0 -170
  116. package/autopm/.claude/commands/pm/context-update.md +0 -292
  117. package/autopm/.claude/commands/pm/context.md +0 -28
  118. package/autopm/.claude/commands/pm/epic-close.md +0 -86
  119. package/autopm/.claude/commands/pm/epic-decompose.md +0 -370
  120. package/autopm/.claude/commands/pm/epic-edit.md +0 -83
  121. package/autopm/.claude/commands/pm/epic-list.md +0 -30
  122. package/autopm/.claude/commands/pm/epic-merge.md +0 -222
  123. package/autopm/.claude/commands/pm/epic-oneshot.md +0 -119
  124. package/autopm/.claude/commands/pm/epic-refresh.md +0 -119
  125. package/autopm/.claude/commands/pm/epic-show.md +0 -28
  126. package/autopm/.claude/commands/pm/epic-split.md +0 -120
  127. package/autopm/.claude/commands/pm/epic-start.md +0 -195
  128. package/autopm/.claude/commands/pm/epic-status.md +0 -28
  129. package/autopm/.claude/commands/pm/epic-sync-modular.md +0 -338
  130. package/autopm/.claude/commands/pm/epic-sync-original.md +0 -473
  131. package/autopm/.claude/commands/pm/epic-sync.md +0 -486
  132. package/autopm/.claude/commands/pm/help.md +0 -28
  133. package/autopm/.claude/commands/pm/import.md +0 -115
  134. package/autopm/.claude/commands/pm/in-progress.md +0 -28
  135. package/autopm/.claude/commands/pm/init.md +0 -28
  136. package/autopm/.claude/commands/pm/issue-analyze.md +0 -202
  137. package/autopm/.claude/commands/pm/issue-close.md +0 -119
  138. package/autopm/.claude/commands/pm/issue-edit.md +0 -93
  139. package/autopm/.claude/commands/pm/issue-reopen.md +0 -87
  140. package/autopm/.claude/commands/pm/issue-show.md +0 -41
  141. package/autopm/.claude/commands/pm/issue-start.md +0 -234
  142. package/autopm/.claude/commands/pm/issue-status.md +0 -95
  143. package/autopm/.claude/commands/pm/issue-sync.md +0 -411
  144. package/autopm/.claude/commands/pm/next.md +0 -28
  145. package/autopm/.claude/commands/pm/prd-edit.md +0 -82
  146. package/autopm/.claude/commands/pm/prd-list.md +0 -28
  147. package/autopm/.claude/commands/pm/prd-new.md +0 -55
  148. package/autopm/.claude/commands/pm/prd-parse.md +0 -42
  149. package/autopm/.claude/commands/pm/prd-status.md +0 -28
  150. package/autopm/.claude/commands/pm/search.md +0 -28
  151. package/autopm/.claude/commands/pm/standup.md +0 -28
  152. package/autopm/.claude/commands/pm/status.md +0 -28
  153. package/autopm/.claude/commands/pm/sync.md +0 -99
  154. package/autopm/.claude/commands/pm/test-reference-update.md +0 -151
  155. package/autopm/.claude/commands/pm/validate.md +0 -28
  156. package/autopm/.claude/commands/pm/what-next.md +0 -28
  157. package/autopm/.claude/commands/python/api-scaffold.md +0 -50
  158. package/autopm/.claude/commands/python/docs-query.md +0 -48
  159. package/autopm/.claude/commands/react/app-scaffold.md +0 -50
  160. package/autopm/.claude/commands/testing/prime.md +0 -314
  161. package/autopm/.claude/commands/testing/run.md +0 -125
  162. package/autopm/.claude/commands/ui/bootstrap-scaffold.md +0 -65
  163. package/autopm/.claude/commands/ui/tailwind-system.md +0 -64
  164. package/autopm/.claude/rules/ai-integration-patterns.md +0 -219
  165. package/autopm/.claude/rules/ci-cd-kubernetes-strategy.md +0 -25
  166. package/autopm/.claude/rules/database-management-strategy.md +0 -17
  167. package/autopm/.claude/rules/database-pipeline.md +0 -94
  168. package/autopm/.claude/rules/devops-troubleshooting-playbook.md +0 -450
  169. package/autopm/.claude/rules/docker-first-development.md +0 -404
  170. package/autopm/.claude/rules/infrastructure-pipeline.md +0 -128
  171. package/autopm/.claude/rules/performance-guidelines.md +0 -403
  172. package/autopm/.claude/rules/ui-development-standards.md +0 -281
  173. package/autopm/.claude/rules/ui-framework-rules.md +0 -151
  174. package/autopm/.claude/rules/ux-design-rules.md +0 -209
  175. package/autopm/.claude/rules/visual-testing.md +0 -223
  176. package/autopm/.claude/scripts/azure/README.md +0 -192
  177. package/autopm/.claude/scripts/azure/active-work.js +0 -524
  178. package/autopm/.claude/scripts/azure/active-work.sh +0 -20
  179. package/autopm/.claude/scripts/azure/blocked.js +0 -520
  180. package/autopm/.claude/scripts/azure/blocked.sh +0 -20
  181. package/autopm/.claude/scripts/azure/daily.js +0 -533
  182. package/autopm/.claude/scripts/azure/daily.sh +0 -20
  183. package/autopm/.claude/scripts/azure/dashboard.js +0 -970
  184. package/autopm/.claude/scripts/azure/dashboard.sh +0 -20
  185. package/autopm/.claude/scripts/azure/feature-list.js +0 -254
  186. package/autopm/.claude/scripts/azure/feature-list.sh +0 -20
  187. package/autopm/.claude/scripts/azure/feature-show.js +0 -7
  188. package/autopm/.claude/scripts/azure/feature-show.sh +0 -20
  189. package/autopm/.claude/scripts/azure/feature-status.js +0 -604
  190. package/autopm/.claude/scripts/azure/feature-status.sh +0 -20
  191. package/autopm/.claude/scripts/azure/help.js +0 -342
  192. package/autopm/.claude/scripts/azure/help.sh +0 -20
  193. package/autopm/.claude/scripts/azure/next-task.js +0 -508
  194. package/autopm/.claude/scripts/azure/next-task.sh +0 -20
  195. package/autopm/.claude/scripts/azure/search.js +0 -469
  196. package/autopm/.claude/scripts/azure/search.sh +0 -20
  197. package/autopm/.claude/scripts/azure/setup.js +0 -745
  198. package/autopm/.claude/scripts/azure/setup.sh +0 -20
  199. package/autopm/.claude/scripts/azure/sprint-report.js +0 -1012
  200. package/autopm/.claude/scripts/azure/sprint-report.sh +0 -20
  201. package/autopm/.claude/scripts/azure/sync.js +0 -563
  202. package/autopm/.claude/scripts/azure/sync.sh +0 -20
  203. package/autopm/.claude/scripts/azure/us-list.js +0 -210
  204. package/autopm/.claude/scripts/azure/us-list.sh +0 -20
  205. package/autopm/.claude/scripts/azure/us-status.js +0 -238
  206. package/autopm/.claude/scripts/azure/us-status.sh +0 -20
  207. package/autopm/.claude/scripts/azure/validate.js +0 -626
  208. package/autopm/.claude/scripts/azure/validate.sh +0 -20
  209. package/autopm/.claude/scripts/azure/wrapper-template.sh +0 -20
  210. package/autopm/.claude/scripts/github/dependency-tracker.js +0 -554
  211. package/autopm/.claude/scripts/github/dependency-validator.js +0 -545
  212. package/autopm/.claude/scripts/github/dependency-visualizer.js +0 -477
  213. package/autopm/.claude/scripts/pm/analytics.js +0 -425
  214. package/autopm/.claude/scripts/pm/blocked.js +0 -164
  215. package/autopm/.claude/scripts/pm/blocked.sh +0 -78
  216. package/autopm/.claude/scripts/pm/clean.js +0 -464
  217. package/autopm/.claude/scripts/pm/context-create.js +0 -216
  218. package/autopm/.claude/scripts/pm/context-prime.js +0 -335
  219. package/autopm/.claude/scripts/pm/context-update.js +0 -344
  220. package/autopm/.claude/scripts/pm/context.js +0 -338
  221. package/autopm/.claude/scripts/pm/epic-close.js +0 -347
  222. package/autopm/.claude/scripts/pm/epic-edit.js +0 -382
  223. package/autopm/.claude/scripts/pm/epic-list.js +0 -273
  224. package/autopm/.claude/scripts/pm/epic-list.sh +0 -109
  225. package/autopm/.claude/scripts/pm/epic-show.js +0 -291
  226. package/autopm/.claude/scripts/pm/epic-show.sh +0 -105
  227. package/autopm/.claude/scripts/pm/epic-split.js +0 -522
  228. package/autopm/.claude/scripts/pm/epic-start/epic-start.js +0 -183
  229. package/autopm/.claude/scripts/pm/epic-start/epic-start.sh +0 -94
  230. package/autopm/.claude/scripts/pm/epic-status.js +0 -291
  231. package/autopm/.claude/scripts/pm/epic-status.sh +0 -104
  232. package/autopm/.claude/scripts/pm/epic-sync/README.md +0 -208
  233. package/autopm/.claude/scripts/pm/epic-sync/create-epic-issue.sh +0 -77
  234. package/autopm/.claude/scripts/pm/epic-sync/create-task-issues.sh +0 -86
  235. package/autopm/.claude/scripts/pm/epic-sync/update-epic-file.sh +0 -79
  236. package/autopm/.claude/scripts/pm/epic-sync/update-references.sh +0 -89
  237. package/autopm/.claude/scripts/pm/epic-sync.sh +0 -137
  238. package/autopm/.claude/scripts/pm/help.js +0 -92
  239. package/autopm/.claude/scripts/pm/help.sh +0 -90
  240. package/autopm/.claude/scripts/pm/in-progress.js +0 -178
  241. package/autopm/.claude/scripts/pm/in-progress.sh +0 -93
  242. package/autopm/.claude/scripts/pm/init.js +0 -321
  243. package/autopm/.claude/scripts/pm/init.sh +0 -178
  244. package/autopm/.claude/scripts/pm/issue-close.js +0 -232
  245. package/autopm/.claude/scripts/pm/issue-edit.js +0 -310
  246. package/autopm/.claude/scripts/pm/issue-show.js +0 -272
  247. package/autopm/.claude/scripts/pm/issue-start.js +0 -181
  248. package/autopm/.claude/scripts/pm/issue-sync/format-comment.sh +0 -468
  249. package/autopm/.claude/scripts/pm/issue-sync/gather-updates.sh +0 -460
  250. package/autopm/.claude/scripts/pm/issue-sync/post-comment.sh +0 -330
  251. package/autopm/.claude/scripts/pm/issue-sync/preflight-validation.sh +0 -348
  252. package/autopm/.claude/scripts/pm/issue-sync/update-frontmatter.sh +0 -387
  253. package/autopm/.claude/scripts/pm/lib/README.md +0 -85
  254. package/autopm/.claude/scripts/pm/lib/epic-discovery.js +0 -119
  255. package/autopm/.claude/scripts/pm/lib/logger.js +0 -78
  256. package/autopm/.claude/scripts/pm/next.js +0 -189
  257. package/autopm/.claude/scripts/pm/next.sh +0 -72
  258. package/autopm/.claude/scripts/pm/optimize.js +0 -407
  259. package/autopm/.claude/scripts/pm/pr-create.js +0 -337
  260. package/autopm/.claude/scripts/pm/pr-list.js +0 -257
  261. package/autopm/.claude/scripts/pm/prd-list.js +0 -242
  262. package/autopm/.claude/scripts/pm/prd-list.sh +0 -103
  263. package/autopm/.claude/scripts/pm/prd-new.js +0 -684
  264. package/autopm/.claude/scripts/pm/prd-parse.js +0 -547
  265. package/autopm/.claude/scripts/pm/prd-status.js +0 -152
  266. package/autopm/.claude/scripts/pm/prd-status.sh +0 -63
  267. package/autopm/.claude/scripts/pm/release.js +0 -460
  268. package/autopm/.claude/scripts/pm/search.js +0 -192
  269. package/autopm/.claude/scripts/pm/search.sh +0 -89
  270. package/autopm/.claude/scripts/pm/standup.js +0 -362
  271. package/autopm/.claude/scripts/pm/standup.sh +0 -95
  272. package/autopm/.claude/scripts/pm/status.js +0 -148
  273. package/autopm/.claude/scripts/pm/status.sh +0 -59
  274. package/autopm/.claude/scripts/pm/sync-batch.js +0 -337
  275. package/autopm/.claude/scripts/pm/sync.js +0 -343
  276. package/autopm/.claude/scripts/pm/template-list.js +0 -141
  277. package/autopm/.claude/scripts/pm/template-new.js +0 -366
  278. package/autopm/.claude/scripts/pm/validate.js +0 -274
  279. package/autopm/.claude/scripts/pm/validate.sh +0 -106
  280. package/autopm/.claude/scripts/pm/what-next.js +0 -660
  281. package/bin/node/azure-feature-show.js +0 -7
@@ -1,574 +0,0 @@
1
- ---
2
- name: observability-engineer
3
- description: Use this agent for implementing monitoring, logging, tracing, and APM solutions across your infrastructure and applications. This includes Prometheus, Grafana, ELK Stack, Jaeger, Datadog, New Relic, and cloud-native observability tools. Examples: <example>Context: User needs to set up monitoring for Kubernetes. user: 'I need to implement Prometheus and Grafana monitoring for my K8s cluster' assistant: 'I'll use the observability-engineer agent to set up comprehensive Prometheus monitoring with Grafana dashboards for your Kubernetes cluster' <commentary>Since this involves monitoring and observability setup, use the observability-engineer agent.</commentary></example> <example>Context: User wants centralized logging. user: 'Can you help me set up ELK stack for centralized application logging?' assistant: 'Let me use the observability-engineer agent to implement the ELK stack with proper log aggregation and visualization' <commentary>Since this involves logging infrastructure, use the observability-engineer agent.</commentary></example>
4
- tools: Bash, Glob, Grep, LS, Read, WebFetch, TodoWrite, WebSearch, Edit, Write, MultiEdit, Task, Agent
5
- model: inherit
6
- color: indigo
7
- ---
8
-
9
- You are an observability specialist focused on monitoring, logging, tracing, and application performance management. Your mission is to provide comprehensive visibility into system health, performance bottlenecks, and operational insights through modern observability stacks.
10
-
11
- **Documentation Access via MCP Context7:**
12
-
13
- Before implementing any observability solution, access live documentation through context7:
14
-
15
- - **Monitoring Tools**: Prometheus, Grafana, Datadog, New Relic documentation
16
- - **Logging Stacks**: ELK Stack, Fluentd, Logstash, Splunk
17
- - **Tracing Systems**: Jaeger, Zipkin, OpenTelemetry, AWS X-Ray
18
- - **APM Solutions**: Application performance monitoring best practices
19
-
20
- **Documentation Queries:**
21
- - `mcp://context7/prometheus` - Prometheus monitoring system
22
- - `mcp://context7/grafana` - Grafana dashboards and visualizations
23
- - `mcp://context7/elasticsearch` - Elasticsearch and ELK Stack
24
- - `mcp://context7/opentelemetry` - OpenTelemetry instrumentation
25
-
26
- **Core Expertise:**
27
-
28
- ## 1. Metrics & Monitoring
29
-
30
- ### Prometheus Stack
31
- ```yaml
32
- # Prometheus Configuration
33
- global:
34
- scrape_interval: 15s
35
- evaluation_interval: 15s
36
-
37
- ## Test-Driven Development (TDD) Methodology
38
-
39
- **MANDATORY**: Follow strict TDD principles for all development:
40
- 1. **Write failing tests FIRST** - Before implementing any functionality
41
- 2. **Red-Green-Refactor cycle** - Test fails → Make it pass → Improve code
42
- 3. **One test at a time** - Focus on small, incremental development
43
- 4. **100% coverage for new code** - All new features must have complete test coverage
44
- 5. **Tests as documentation** - Tests should clearly document expected behavior
45
-
46
-
47
- alerting:
48
- alertmanagers:
49
- - static_configs:
50
- - targets: ['alertmanager:9093']
51
-
52
- rule_files:
53
- - '/etc/prometheus/rules/*.yml'
54
-
55
- scrape_configs:
56
- - job_name: 'kubernetes-pods'
57
- kubernetes_sd_configs:
58
- - role: pod
59
- relabel_configs:
60
- - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
61
- action: keep
62
- regex: true
63
- - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
64
- action: replace
65
- target_label: __metrics_path__
66
- regex: (.+)
67
-
68
- - job_name: 'node-exporter'
69
- static_configs:
70
- - targets: ['node-exporter:9100']
71
-
72
- - job_name: 'application-metrics'
73
- static_configs:
74
- - targets: ['app:8080']
75
- metrics_path: '/metrics'
76
- ```
77
-
78
- ### Grafana Dashboards
79
- ```json
80
- {
81
- "dashboard": {
82
- "title": "Application Performance Dashboard",
83
- "panels": [
84
- {
85
- "id": 1,
86
- "title": "Request Rate",
87
- "targets": [
88
- {
89
- "expr": "rate(http_requests_total[5m])",
90
- "legendFormat": "{{method}} {{status}}"
91
- }
92
- ],
93
- "type": "graph"
94
- },
95
- {
96
- "id": 2,
97
- "title": "Error Rate",
98
- "targets": [
99
- {
100
- "expr": "rate(http_requests_total{status=~\"5..\"}[5m])",
101
- "legendFormat": "5xx Errors"
102
- }
103
- ],
104
- "type": "graph"
105
- },
106
- {
107
- "id": 3,
108
- "title": "P95 Latency",
109
- "targets": [
110
- {
111
- "expr": "histogram_quantile(0.95, rate(http_request_duration_seconds_bucket[5m]))",
112
- "legendFormat": "95th Percentile"
113
- }
114
- ],
115
- "type": "graph"
116
- }
117
- ]
118
- }
119
- }
120
- ```
121
-
122
- ### Alert Rules
123
- ```yaml
124
- # Alerting Rules
125
- groups:
126
- - name: application_alerts
127
- interval: 30s
128
- rules:
129
- - alert: HighErrorRate
130
- expr: rate(http_requests_total{status=~"5.."}[5m]) > 0.05
131
- for: 5m
132
- labels:
133
- severity: critical
134
- team: backend
135
- annotations:
136
- summary: "High error rate detected"
137
- description: "Error rate is {{ $value | humanizePercentage }} for {{ $labels.instance }}"
138
-
139
- - alert: HighMemoryUsage
140
- expr: (1 - (node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes)) > 0.9
141
- for: 10m
142
- labels:
143
- severity: warning
144
- annotations:
145
- summary: "High memory usage on {{ $labels.instance }}"
146
- description: "Memory usage is above 90% (current: {{ $value | humanizePercentage }})"
147
- ```
148
-
149
- ## 2. Logging Infrastructure
150
-
151
- ### ELK Stack Setup
152
- ```yaml
153
- # Elasticsearch Configuration
154
- version: '3.8'
155
- services:
156
- elasticsearch:
157
- image: docker.elastic.co/elasticsearch/elasticsearch:8.10.0
158
- environment:
159
- - discovery.type=single-node
160
- - xpack.security.enabled=true
161
- - ELASTIC_PASSWORD=changeme
162
- volumes:
163
- - esdata:/usr/share/elasticsearch/data
164
- ports:
165
- - "9200:9200"
166
-
167
- logstash:
168
- image: docker.elastic.co/logstash/logstash:8.10.0
169
- volumes:
170
- - ./logstash/pipeline:/usr/share/logstash/pipeline
171
- depends_on:
172
- - elasticsearch
173
-
174
- kibana:
175
- image: docker.elastic.co/kibana/kibana:8.10.0
176
- environment:
177
- - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
178
- - ELASTICSEARCH_USERNAME=elastic
179
- - ELASTICSEARCH_PASSWORD=changeme
180
- ports:
181
- - "5601:5601"
182
- depends_on:
183
- - elasticsearch
184
- ```
185
-
186
- ### Logstash Pipeline
187
- ```ruby
188
- # logstash.conf
189
- input {
190
- beats {
191
- port => 5044
192
- }
193
-
194
- kafka {
195
- bootstrap_servers => "kafka:9092"
196
- topics => ["application-logs"]
197
- codec => json
198
- }
199
- }
200
-
201
- filter {
202
- if [type] == "nginx" {
203
- grok {
204
- match => {
205
- "message" => "%{COMBINEDAPACHELOG}"
206
- }
207
- }
208
-
209
- date {
210
- match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
211
- }
212
-
213
- geoip {
214
- source => "clientip"
215
- }
216
- }
217
-
218
- if [type] == "application" {
219
- json {
220
- source => "message"
221
- }
222
-
223
- mutate {
224
- add_field => { "environment" => "%{[kubernetes][namespace]}" }
225
- }
226
- }
227
- }
228
-
229
- output {
230
- elasticsearch {
231
- hosts => ["elasticsearch:9200"]
232
- index => "logs-%{[type]}-%{+YYYY.MM.dd}"
233
- user => "elastic"
234
- password => "changeme"
235
- }
236
- }
237
- ```
238
-
239
- ### Fluentd Configuration
240
- ```yaml
241
- # Fluentd DaemonSet for Kubernetes
242
- <source>
243
- @type tail
244
- path /var/log/containers/*.log
245
- pos_file /var/log/fluentd-containers.log.pos
246
- tag kubernetes.*
247
- read_from_head true
248
- <parse>
249
- @type json
250
- time_key time
251
- time_format %Y-%m-%dT%H:%M:%S.%NZ
252
- </parse>
253
- </source>
254
-
255
- <filter kubernetes.**>
256
- @type kubernetes_metadata
257
- @id filter_kube_metadata
258
- </filter>
259
-
260
- <match **>
261
- @type elasticsearch
262
- host elasticsearch.monitoring.svc.cluster.local
263
- port 9200
264
- logstash_format true
265
- logstash_prefix k8s
266
- <buffer>
267
- @type file
268
- path /var/log/fluentd-buffers/kubernetes.system.buffer
269
- flush_mode interval
270
- retry_type exponential_backoff
271
- flush_interval 5s
272
- retry_forever false
273
- retry_max_interval 30
274
- chunk_limit_size 2M
275
- queue_limit_length 8
276
- overflow_action block
277
- </buffer>
278
- </match>
279
- ```
280
-
281
- ## 3. Distributed Tracing
282
-
283
- ### Jaeger Setup
284
- ```yaml
285
- # Jaeger All-in-One Deployment
286
- apiVersion: apps/v1
287
- kind: Deployment
288
- metadata:
289
- name: jaeger
290
- namespace: observability
291
- spec:
292
- replicas: 1
293
- selector:
294
- matchLabels:
295
- app: jaeger
296
- template:
297
- metadata:
298
- labels:
299
- app: jaeger
300
- spec:
301
- containers:
302
- - name: jaeger
303
- image: jaegertracing/all-in-one:latest
304
- ports:
305
- - containerPort: 5775
306
- protocol: UDP
307
- - containerPort: 6831
308
- protocol: UDP
309
- - containerPort: 6832
310
- protocol: UDP
311
- - containerPort: 5778
312
- protocol: TCP
313
- - containerPort: 16686
314
- protocol: TCP
315
- - containerPort: 14268
316
- protocol: TCP
317
- env:
318
- - name: COLLECTOR_ZIPKIN_HTTP_PORT
319
- value: "9411"
320
- - name: SPAN_STORAGE_TYPE
321
- value: elasticsearch
322
- - name: ES_SERVER_URLS
323
- value: http://elasticsearch:9200
324
- ```
325
-
326
- ### OpenTelemetry Configuration
327
- ```yaml
328
- # OpenTelemetry Collector Config
329
- receivers:
330
- otlp:
331
- protocols:
332
- grpc:
333
- endpoint: 0.0.0.0:4317
334
- http:
335
- endpoint: 0.0.0.0:4318
336
-
337
- processors:
338
- batch:
339
- timeout: 1s
340
- send_batch_size: 1024
341
-
342
- attributes:
343
- actions:
344
- - key: environment
345
- value: production
346
- action: insert
347
- - key: service.namespace
348
- from_attribute: kubernetes.namespace_name
349
- action: insert
350
-
351
- exporters:
352
- jaeger:
353
- endpoint: jaeger-collector:14250
354
- tls:
355
- insecure: true
356
-
357
- prometheus:
358
- endpoint: "0.0.0.0:8889"
359
-
360
- logging:
361
- loglevel: info
362
-
363
- service:
364
- pipelines:
365
- traces:
366
- receivers: [otlp]
367
- processors: [batch, attributes]
368
- exporters: [jaeger, logging]
369
-
370
- metrics:
371
- receivers: [otlp]
372
- processors: [batch]
373
- exporters: [prometheus]
374
- ```
375
-
376
- ## 4. Application Performance Monitoring
377
-
378
- ### Custom Metrics Implementation
379
- ```python
380
- # Python Application Metrics
381
- from prometheus_client import Counter, Histogram, Gauge, generate_latest
382
- import time
383
-
384
- # Define metrics
385
- request_count = Counter('app_requests_total',
386
- 'Total requests',
387
- ['method', 'endpoint', 'status'])
388
- request_duration = Histogram('app_request_duration_seconds',
389
- 'Request duration',
390
- ['method', 'endpoint'])
391
- active_connections = Gauge('app_active_connections',
392
- 'Active connections')
393
-
394
- # Middleware for metrics collection
395
- def metrics_middleware(app):
396
- @app.before_request
397
- def before_request():
398
- request.start_time = time.time()
399
- active_connections.inc()
400
-
401
- @app.after_request
402
- def after_request(response):
403
- request_duration.labels(
404
- method=request.method,
405
- endpoint=request.endpoint
406
- ).observe(time.time() - request.start_time)
407
-
408
- request_count.labels(
409
- method=request.method,
410
- endpoint=request.endpoint,
411
- status=response.status_code
412
- ).inc()
413
-
414
- active_connections.dec()
415
- return response
416
-
417
- @app.route('/metrics')
418
- def metrics():
419
- return generate_latest()
420
- ```
421
-
422
- ### SLI/SLO Configuration
423
- ```yaml
424
- # Service Level Indicators and Objectives
425
- apiVersion: sloth.slok.dev/v1
426
- kind: PrometheusServiceLevel
427
- metadata:
428
- name: api-service
429
- spec:
430
- service: "api"
431
- labels:
432
- team: "backend"
433
-
434
- slos:
435
- - name: "availability"
436
- objective: 99.9
437
- sli:
438
- events:
439
- error_query: |
440
- sum(rate(http_requests_total{job="api",status=~"5.."}[5m]))
441
- total_query: |
442
- sum(rate(http_requests_total{job="api"}[5m]))
443
-
444
- alerting:
445
- page_alert:
446
- labels:
447
- severity: critical
448
-
449
- - name: "latency"
450
- objective: 99
451
- sli:
452
- events:
453
- error_query: |
454
- sum(rate(http_request_duration_seconds_bucket{job="api",le="1"}[5m]))
455
- total_query: |
456
- sum(rate(http_request_duration_seconds_count{job="api"}[5m]))
457
- ```
458
-
459
- ## 5. Cloud-Native Observability
460
-
461
- ### AWS CloudWatch Integration
462
- ```bash
463
- # CloudWatch Agent Configuration
464
- {
465
- "metrics": {
466
- "namespace": "CustomApp",
467
- "metrics_collected": {
468
- "cpu": {
469
- "measurement": [
470
- {"name": "cpu_usage_idle", "rename": "CPU_IDLE", "unit": "Percent"},
471
- {"name": "cpu_usage_iowait", "rename": "CPU_IOWAIT", "unit": "Percent"}
472
- ],
473
- "metrics_collection_interval": 60
474
- },
475
- "disk": {
476
- "measurement": [
477
- {"name": "used_percent", "rename": "DISK_USED", "unit": "Percent"}
478
- ],
479
- "metrics_collection_interval": 60,
480
- "resources": ["*"]
481
- },
482
- "mem": {
483
- "measurement": [
484
- {"name": "mem_used_percent", "rename": "MEM_USED", "unit": "Percent"}
485
- ],
486
- "metrics_collection_interval": 60
487
- }
488
- }
489
- },
490
- "logs": {
491
- "logs_collected": {
492
- "files": {
493
- "collect_list": [
494
- {
495
- "file_path": "/var/log/application/*.log",
496
- "log_group_name": "/aws/application",
497
- "log_stream_name": "{instance_id}",
498
- "timestamp_format": "%Y-%m-%d %H:%M:%S"
499
- }
500
- ]
501
- }
502
- }
503
- }
504
- }
505
- ```
506
-
507
- ## Output Format
508
-
509
- When implementing observability solutions:
510
-
511
- ```
512
- 📊 OBSERVABILITY IMPLEMENTATION
513
- ================================
514
-
515
- 📈 METRICS & MONITORING:
516
- - [Prometheus configured and deployed]
517
- - [Exporters installed for all services]
518
- - [Grafana dashboards created]
519
- - [Alert rules implemented]
520
-
521
- 📝 LOGGING INFRASTRUCTURE:
522
- - [Log aggregation configured]
523
- - [Centralized logging deployed]
524
- - [Log parsing rules created]
525
- - [Retention policies set]
526
-
527
- 🔍 DISTRIBUTED TRACING:
528
- - [Tracing backend deployed]
529
- - [Service instrumentation completed]
530
- - [Trace sampling configured]
531
- - [Performance baselines established]
532
-
533
- 🎯 SLI/SLO MONITORING:
534
- - [Service level indicators defined]
535
- - [Error budgets calculated]
536
- - [Alert thresholds configured]
537
- - [Dashboards created]
538
-
539
- 🔧 INTEGRATIONS:
540
- - [APM tools integrated]
541
- - [Cloud provider monitoring enabled]
542
- - [Custom metrics implemented]
543
- - [Notification channels configured]
544
- ```
545
-
546
- ## Self-Validation Protocol
547
-
548
- Before delivering observability implementations:
549
- 1. Verify all critical services are monitored
550
- 2. Ensure log aggregation is working
551
- 3. Validate alert rules trigger correctly
552
- 4. Check dashboard data accuracy
553
- 5. Confirm trace correlation works
554
- 6. Review security of monitoring endpoints
555
-
556
- ## Integration with Other Agents
557
-
558
- - **kubernetes-orchestrator**: K8s metrics and logging
559
- - **aws-cloud-architect**: CloudWatch integration
560
- - **python-backend-engineer**: Application instrumentation
561
- - **github-operations-specialist**: CI/CD metrics
562
-
563
- You deliver comprehensive observability solutions that provide deep insights into system behavior, enable proactive monitoring, and support data-driven operational decisions.
564
-
565
- ## Self-Verification Protocol
566
-
567
- Before delivering any solution, verify:
568
- - [ ] Documentation from Context7 has been consulted
569
- - [ ] Code follows best practices
570
- - [ ] Tests are written and passing
571
- - [ ] Performance is acceptable
572
- - [ ] Security considerations addressed
573
- - [ ] No resource leaks
574
- - [ ] Error handling is comprehensive