@jetrabbits/agentic 0.0.1

Files changed (440)
  1. package/AGENTS.md +143 -0
  2. package/README.md +154 -0
  3. package/agentic +1615 -0
  4. package/areas/devops/ci-cd/AGENTS.md +48 -0
  5. package/areas/devops/ci-cd/PROMPTS.md +7 -0
  6. package/areas/devops/ci-cd/prompts/onboard-repo.md +97 -0
  7. package/areas/devops/ci-cd/prompts/pipeline-debug.md +103 -0
  8. package/areas/devops/ci-cd/prompts/release-pipeline.md +115 -0
  9. package/areas/devops/ci-cd/rules/pipeline-standards.md +33 -0
  10. package/areas/devops/ci-cd/rules/quality-gates.md +24 -0
  11. package/areas/devops/ci-cd/rules/supply-chain-security.md +34 -0
  12. package/areas/devops/ci-cd/skills/artifact-management/SKILL.md +157 -0
  13. package/areas/devops/ci-cd/skills/build-optimization/SKILL.md +168 -0
  14. package/areas/devops/ci-cd/skills/github-actions-patterns/SKILL.md +190 -0
  15. package/areas/devops/ci-cd/skills/gitlab-ci-patterns/SKILL.md +169 -0
  16. package/areas/devops/ci-cd/skills/pipeline-security/SKILL.md +161 -0
  17. package/areas/devops/ci-cd/workflows/onboard-repo.md +73 -0
  18. package/areas/devops/ci-cd/workflows/pipeline-debug.md +66 -0
  19. package/areas/devops/ci-cd/workflows/release-pipeline.md +115 -0
  20. package/areas/devops/database-ops/AGENTS.md +47 -0
  21. package/areas/devops/database-ops/prompts/backup-verify.md +83 -0
  22. package/areas/devops/database-ops/prompts/db-incident.md +127 -0
  23. package/areas/devops/database-ops/rules/access-control.md +20 -0
  24. package/areas/devops/database-ops/rules/backup-policy.md +33 -0
  25. package/areas/devops/database-ops/rules/migration-runbook.md +32 -0
  26. package/areas/devops/database-ops/skills/backup-restore/SKILL.md +226 -0
  27. package/areas/devops/database-ops/skills/db-performance/SKILL.md +205 -0
  28. package/areas/devops/database-ops/skills/migration-safety/SKILL.md +155 -0
  29. package/areas/devops/database-ops/skills/postgres-operations/SKILL.md +156 -0
  30. package/areas/devops/database-ops/skills/redis-operations/SKILL.md +174 -0
  31. package/areas/devops/database-ops/workflows/backup-verify.md +107 -0
  32. package/areas/devops/database-ops/workflows/db-incident.md +86 -0
  33. package/areas/devops/devsecops/AGENTS.md +47 -0
  34. package/areas/devops/devsecops/prompts/policy-onboard.md +79 -0
  35. package/areas/devops/devsecops/prompts/security-scan-pipeline.md +131 -0
  36. package/areas/devops/devsecops/rules/container-security.md +22 -0
  37. package/areas/devops/devsecops/rules/policy-as-code.md +37 -0
  38. package/areas/devops/devsecops/rules/shift-left-policy.md +26 -0
  39. package/areas/devops/devsecops/skills/container-hardening/SKILL.md +146 -0
  40. package/areas/devops/devsecops/skills/opa-policies/SKILL.md +188 -0
  41. package/areas/devops/devsecops/skills/sbom-supply-chain/SKILL.md +165 -0
  42. package/areas/devops/devsecops/skills/secret-detection/SKILL.md +190 -0
  43. package/areas/devops/devsecops/skills/sigstore-signing/SKILL.md +184 -0
  44. package/areas/devops/devsecops/workflows/policy-onboard.md +104 -0
  45. package/areas/devops/devsecops/workflows/security-scan-pipeline.md +155 -0
  46. package/areas/devops/infrastructure/AGENTS.md +50 -0
  47. package/areas/devops/infrastructure/prompts/destroy-environment.md +81 -0
  48. package/areas/devops/infrastructure/prompts/drift-remediation.md +71 -0
  49. package/areas/devops/infrastructure/prompts/module-development.md +69 -0
  50. package/areas/devops/infrastructure/prompts/provision-environment.md +121 -0
  51. package/areas/devops/infrastructure/rules/iac-standards.md +80 -0
  52. package/areas/devops/infrastructure/rules/immutability.md +28 -0
  53. package/areas/devops/infrastructure/rules/secret-hygiene.md +53 -0
  54. package/areas/devops/infrastructure/rules/state-management.md +47 -0
  55. package/areas/devops/infrastructure/skills/ansible-playbooks/SKILL.md +174 -0
  56. package/areas/devops/infrastructure/skills/cost-optimization/SKILL.md +177 -0
  57. package/areas/devops/infrastructure/skills/drift-detection/SKILL.md +178 -0
  58. package/areas/devops/infrastructure/skills/state-management/SKILL.md +159 -0
  59. package/areas/devops/infrastructure/skills/terraform-modules/SKILL.md +169 -0
  60. package/areas/devops/infrastructure/workflows/destroy-environment.md +96 -0
  61. package/areas/devops/infrastructure/workflows/drift-remediation.md +66 -0
  62. package/areas/devops/infrastructure/workflows/module-development.md +101 -0
  63. package/areas/devops/infrastructure/workflows/provision-environment.md +96 -0
  64. package/areas/devops/kubernetes/AGENTS.md +57 -0
  65. package/areas/devops/kubernetes/PROMPTS.md +9 -0
  66. package/areas/devops/kubernetes/prompts/cluster-bootstrap.md +67 -0
  67. package/areas/devops/kubernetes/prompts/debug-workload.md +91 -0
  68. package/areas/devops/kubernetes/prompts/onboard-service.md +101 -0
  69. package/areas/devops/kubernetes/prompts/upgrade-cluster.md +63 -0
  70. package/areas/devops/kubernetes/rules/cluster-standards.md +51 -0
  71. package/areas/devops/kubernetes/rules/resource-governance.md +80 -0
  72. package/areas/devops/kubernetes/rules/upgrade-policy.md +52 -0
  73. package/areas/devops/kubernetes/rules/workload-security.md +64 -0
  74. package/areas/devops/kubernetes/skills/cluster-operations/SKILL.md +136 -0
  75. package/areas/devops/kubernetes/skills/helm-charts/SKILL.md +152 -0
  76. package/areas/devops/kubernetes/skills/network-policies/SKILL.md +169 -0
  77. package/areas/devops/kubernetes/skills/pod-troubleshooting/SKILL.md +129 -0
  78. package/areas/devops/kubernetes/skills/rbac-design/SKILL.md +148 -0
  79. package/areas/devops/kubernetes/skills/resource-tuning/SKILL.md +156 -0
  80. package/areas/devops/kubernetes/workflows/cluster-bootstrap.md +194 -0
  81. package/areas/devops/kubernetes/workflows/debug-workload.md +108 -0
  82. package/areas/devops/kubernetes/workflows/onboard-service.md +124 -0
  83. package/areas/devops/kubernetes/workflows/upgrade-cluster.md +165 -0
  84. package/areas/devops/networking/AGENTS.md +47 -0
  85. package/areas/devops/networking/prompts/onboard-ingress.md +119 -0
  86. package/areas/devops/networking/prompts/service-mesh-onboard.md +77 -0
  87. package/areas/devops/networking/rules/ingress-standards.md +17 -0
  88. package/areas/devops/networking/rules/network-segmentation.md +24 -0
  89. package/areas/devops/networking/rules/tls-policy.md +32 -0
  90. package/areas/devops/networking/skills/dns-management/SKILL.md +169 -0
  91. package/areas/devops/networking/skills/ingress-patterns/SKILL.md +165 -0
  92. package/areas/devops/networking/skills/service-mesh/SKILL.md +206 -0
  93. package/areas/devops/networking/skills/tls-termination/SKILL.md +198 -0
  94. package/areas/devops/networking/skills/vpc-design/SKILL.md +132 -0
  95. package/areas/devops/networking/workflows/onboard-ingress.md +64 -0
  96. package/areas/devops/networking/workflows/service-mesh-onboard.md +122 -0
  97. package/areas/devops/observability/AGENTS.md +48 -0
  98. package/areas/devops/observability/prompts/alert-investigation.md +117 -0
  99. package/areas/devops/observability/prompts/observability-stack-setup.md +99 -0
  100. package/areas/devops/observability/prompts/onboard-service-monitoring.md +79 -0
  101. package/areas/devops/observability/rules/alerting-standards.md +36 -0
  102. package/areas/devops/observability/rules/data-retention.md +19 -0
  103. package/areas/devops/observability/rules/golden-signals.md +28 -0
  104. package/areas/devops/observability/skills/distributed-tracing/SKILL.md +149 -0
  105. package/areas/devops/observability/skills/grafana-dashboards/SKILL.md +201 -0
  106. package/areas/devops/observability/skills/log-aggregation/SKILL.md +159 -0
  107. package/areas/devops/observability/skills/prometheus-alertmanager/SKILL.md +188 -0
  108. package/areas/devops/observability/skills/slo-implementation/SKILL.md +189 -0
  109. package/areas/devops/observability/workflows/alert-investigation.md +98 -0
  110. package/areas/devops/observability/workflows/observability-stack-setup.md +156 -0
  111. package/areas/devops/observability/workflows/onboard-service-monitoring.md +83 -0
  112. package/areas/devops/sre/AGENTS.md +48 -0
  113. package/areas/devops/sre/prompts/incident-response.md +129 -0
  114. package/areas/devops/sre/prompts/postmortem.md +101 -0
  115. package/areas/devops/sre/prompts/slo-review.md +125 -0
  116. package/areas/devops/sre/rules/error-budget-policy.md +25 -0
  117. package/areas/devops/sre/rules/on-call-standards.md +25 -0
  118. package/areas/devops/sre/rules/slo-policy.md +31 -0
  119. package/areas/devops/sre/skills/capacity-planning/SKILL.md +162 -0
  120. package/areas/devops/sre/skills/chaos-engineering/SKILL.md +186 -0
  121. package/areas/devops/sre/skills/incident-command/SKILL.md +119 -0
  122. package/areas/devops/sre/skills/postmortem-analysis/SKILL.md +104 -0
  123. package/areas/devops/sre/skills/slo-sli-design/SKILL.md +145 -0
  124. package/areas/devops/sre/workflows/incident-response.md +66 -0
  125. package/areas/devops/sre/workflows/postmortem.md +90 -0
  126. package/areas/devops/sre/workflows/slo-review.md +95 -0
  127. package/areas/software/backend/AGENTS.md +59 -0
  128. package/areas/software/backend/PROMPTS.md +50 -0
  129. package/areas/software/backend/README.md +48 -0
  130. package/areas/software/backend/prompts/add-migration.md +93 -0
  131. package/areas/software/backend/prompts/create-endpoint.md +97 -0
  132. package/areas/software/backend/prompts/debug-issue.md +87 -0
  133. package/areas/software/backend/prompts/develop-epic.md +83 -0
  134. package/areas/software/backend/prompts/develop-feature.md +91 -0
  135. package/areas/software/backend/prompts/refactor-module.md +79 -0
  136. package/areas/software/backend/prompts/test-feature.md +89 -0
  137. package/areas/software/backend/rules/architecture.md +20 -0
  138. package/areas/software/backend/rules/data_access.md +20 -0
  139. package/areas/software/backend/rules/security.md +20 -0
  140. package/areas/software/backend/rules/testing.md +19 -0
  141. package/areas/software/backend/skills/api-design/SKILL.md +170 -0
  142. package/areas/software/backend/skills/async-processing/SKILL.md +152 -0
  143. package/areas/software/backend/skills/database-modeling/SKILL.md +173 -0
  144. package/areas/software/backend/skills/observability/SKILL.md +162 -0
  145. package/areas/software/backend/skills/troubleshooting/SKILL.md +139 -0
  146. package/areas/software/backend/workflows/add-migration.md +79 -0
  147. package/areas/software/backend/workflows/create-endpoint.md +89 -0
  148. package/areas/software/backend/workflows/debug-issue.md +77 -0
  149. package/areas/software/backend/workflows/develop-epic.md +78 -0
  150. package/areas/software/backend/workflows/develop-feature.md +98 -0
  151. package/areas/software/backend/workflows/refactor-module.md +73 -0
  152. package/areas/software/backend/workflows/test-feature.md +67 -0
  153. package/areas/software/data-engineering/AGENTS.md +59 -0
  154. package/areas/software/data-engineering/PROMPTS.md +32 -0
  155. package/areas/software/data-engineering/prompts/backfill-data.md +107 -0
  156. package/areas/software/data-engineering/prompts/data-quality-incident.md +109 -0
  157. package/areas/software/data-engineering/prompts/lineage-trace.md +121 -0
  158. package/areas/software/data-engineering/prompts/new-model.md +117 -0
  159. package/areas/software/data-engineering/prompts/schema-migration.md +111 -0
  160. package/areas/software/data-engineering/rules/data-governance.md +11 -0
  161. package/areas/software/data-engineering/rules/pii-handling.md +19 -0
  162. package/areas/software/data-engineering/rules/pipeline-integrity.md +11 -0
  163. package/areas/software/data-engineering/rules/schema-management.md +21 -0
  164. package/areas/software/data-engineering/skills/data-modeling/SKILL.md +49 -0
  165. package/areas/software/data-engineering/skills/dbt-patterns/SKILL.md +43 -0
  166. package/areas/software/data-engineering/skills/lineage-governance/SKILL.md +38 -0
  167. package/areas/software/data-engineering/skills/orchestration/SKILL.md +35 -0
  168. package/areas/software/data-engineering/skills/quality-checks/SKILL.md +50 -0
  169. package/areas/software/data-engineering/skills/sql-optimization/SKILL.md +47 -0
  170. package/areas/software/data-engineering/skills/streaming-patterns/SKILL.md +48 -0
  171. package/areas/software/data-engineering/workflows/backfill-data.md +59 -0
  172. package/areas/software/data-engineering/workflows/data-quality-incident.md +64 -0
  173. package/areas/software/data-engineering/workflows/lineage-trace.md +56 -0
  174. package/areas/software/data-engineering/workflows/new-model.md +71 -0
  175. package/areas/software/data-engineering/workflows/schema-migration.md +67 -0
  176. package/areas/software/frontend/AGENTS.md +60 -0
  177. package/areas/software/frontend/PROMPTS.md +32 -0
  178. package/areas/software/frontend/prompts/a11y-fix.md +75 -0
  179. package/areas/software/frontend/prompts/bundle-analyze.md +75 -0
  180. package/areas/software/frontend/prompts/release-prep.md +83 -0
  181. package/areas/software/frontend/prompts/scaffold-component.md +69 -0
  182. package/areas/software/frontend/prompts/visual-regression.md +73 -0
  183. package/areas/software/frontend/rules/accessibility.md +16 -0
  184. package/areas/software/frontend/rules/architecture.md +29 -0
  185. package/areas/software/frontend/rules/performance.md +23 -0
  186. package/areas/software/frontend/rules/quality.md +12 -0
  187. package/areas/software/frontend/skills/a11y-audit/SKILL.md +61 -0
  188. package/areas/software/frontend/skills/api-integration/SKILL.md +58 -0
  189. package/areas/software/frontend/skills/component-design/SKILL.md +171 -0
  190. package/areas/software/frontend/skills/css-architecture/SKILL.md +146 -0
  191. package/areas/software/frontend/skills/error-handling/SKILL.md +55 -0
  192. package/areas/software/frontend/skills/performance-tuning/SKILL.md +58 -0
  193. package/areas/software/frontend/skills/state-management/SKILL.md +54 -0
  194. package/areas/software/frontend/skills/testing-patterns/SKILL.md +69 -0
  195. package/areas/software/frontend/workflows/a11y-fix.md +63 -0
  196. package/areas/software/frontend/workflows/bundle-analyze.md +56 -0
  197. package/areas/software/frontend/workflows/release-prep.md +66 -0
  198. package/areas/software/frontend/workflows/scaffold-component.md +67 -0
  199. package/areas/software/frontend/workflows/visual-regression.md +65 -0
  200. package/areas/software/full-stack/AGENTS.md +72 -0
  201. package/areas/software/full-stack/PROMPTS.md +66 -0
  202. package/areas/software/full-stack/prompts/backend-project-full-cycle.md +141 -0
  203. package/areas/software/full-stack/prompts/debug-issue.md +115 -0
  204. package/areas/software/full-stack/prompts/develop-feature.md +119 -0
  205. package/areas/software/full-stack/prompts/feature-implementation-flow.md +137 -0
  206. package/areas/software/full-stack/prompts/testing-ci-pipeline.md +119 -0
  207. package/areas/software/full-stack/rules/api-design-guide.md +24 -0
  208. package/areas/software/full-stack/rules/async-concurrency-guide.md +21 -0
  209. package/areas/software/full-stack/rules/backend-architecture-rule.md +41 -0
  210. package/areas/software/full-stack/rules/background-jobs-guide.md +20 -0
  211. package/areas/software/full-stack/rules/code-quality-guide.md +22 -0
  212. package/areas/software/full-stack/rules/database-access-guide.md +24 -0
  213. package/areas/software/full-stack/rules/database-migrations-guide.md +24 -0
  214. package/areas/software/full-stack/rules/domain-models-guide.md +28 -0
  215. package/areas/software/full-stack/rules/e2e-test-guide.md +18 -0
  216. package/areas/software/full-stack/rules/env-settings-guide.md +34 -0
  217. package/areas/software/full-stack/rules/error-handling-guide.md +20 -0
  218. package/areas/software/full-stack/rules/logging-observability-guide.md +22 -0
  219. package/areas/software/full-stack/rules/project-guide.md +34 -0
  220. package/areas/software/full-stack/rules/python-venv-guide.md +23 -0
  221. package/areas/software/full-stack/rules/security-guide.md +22 -0
  222. package/areas/software/full-stack/rules/svt-test-guide.md +17 -0
  223. package/areas/software/full-stack/rules/testing-ci-guide.md +25 -0
  224. package/areas/software/full-stack/skills/api-design-principles/SKILL.md +125 -0
  225. package/areas/software/full-stack/skills/api-design-principles/assets/api-design-checklist.md +155 -0
  226. package/areas/software/full-stack/skills/api-design-principles/assets/rest-api-template.py +182 -0
  227. package/areas/software/full-stack/skills/api-design-principles/references/graphql-schema-design.md +583 -0
  228. package/areas/software/full-stack/skills/api-design-principles/references/rest-best-practices.md +408 -0
  229. package/areas/software/full-stack/skills/api-design-principles/resources/implementation-playbook.md +513 -0
  230. package/areas/software/full-stack/skills/api-patterns/SKILL.md +81 -0
  231. package/areas/software/full-stack/skills/api-patterns/api-style.md +42 -0
  232. package/areas/software/full-stack/skills/api-patterns/auth.md +24 -0
  233. package/areas/software/full-stack/skills/api-patterns/documentation.md +26 -0
  234. package/areas/software/full-stack/skills/api-patterns/graphql.md +41 -0
  235. package/areas/software/full-stack/skills/api-patterns/rate-limiting.md +31 -0
  236. package/areas/software/full-stack/skills/api-patterns/response.md +37 -0
  237. package/areas/software/full-stack/skills/api-patterns/rest.md +40 -0
  238. package/areas/software/full-stack/skills/api-patterns/scripts/api_validator.py +211 -0
  239. package/areas/software/full-stack/skills/api-patterns/security-testing.md +122 -0
  240. package/areas/software/full-stack/skills/api-patterns/trpc.md +41 -0
  241. package/areas/software/full-stack/skills/api-patterns/versioning.md +22 -0
  242. package/areas/software/full-stack/skills/app-builder/SKILL.md +135 -0
  243. package/areas/software/full-stack/skills/app-builder/agent-coordination.md +71 -0
  244. package/areas/software/full-stack/skills/app-builder/feature-building.md +53 -0
  245. package/areas/software/full-stack/skills/app-builder/project-detection.md +34 -0
  246. package/areas/software/full-stack/skills/app-builder/scaffolding.md +118 -0
  247. package/areas/software/full-stack/skills/app-builder/tech-stack.md +40 -0
  248. package/areas/software/full-stack/skills/app-builder/templates/SKILL.md +39 -0
  249. package/areas/software/full-stack/skills/app-builder/templates/astro-static/TEMPLATE.md +76 -0
  250. package/areas/software/full-stack/skills/app-builder/templates/chrome-extension/TEMPLATE.md +92 -0
  251. package/areas/software/full-stack/skills/app-builder/templates/cli-tool/TEMPLATE.md +88 -0
  252. package/areas/software/full-stack/skills/app-builder/templates/electron-desktop/TEMPLATE.md +88 -0
  253. package/areas/software/full-stack/skills/app-builder/templates/express-api/TEMPLATE.md +83 -0
  254. package/areas/software/full-stack/skills/app-builder/templates/flutter-app/TEMPLATE.md +90 -0
  255. package/areas/software/full-stack/skills/app-builder/templates/monorepo-turborepo/TEMPLATE.md +90 -0
  256. package/areas/software/full-stack/skills/app-builder/templates/nextjs-fullstack/TEMPLATE.md +82 -0
  257. package/areas/software/full-stack/skills/app-builder/templates/nextjs-saas/TEMPLATE.md +100 -0
  258. package/areas/software/full-stack/skills/app-builder/templates/nextjs-static/TEMPLATE.md +106 -0
  259. package/areas/software/full-stack/skills/app-builder/templates/nuxt-app/TEMPLATE.md +101 -0
  260. package/areas/software/full-stack/skills/app-builder/templates/python-fastapi/TEMPLATE.md +83 -0
  261. package/areas/software/full-stack/skills/app-builder/templates/react-native-app/TEMPLATE.md +93 -0
  262. package/areas/software/full-stack/skills/backend-developer/SKILL.md +58 -0
  263. package/areas/software/full-stack/skills/bash-pro/SKILL.md +310 -0
  264. package/areas/software/full-stack/skills/blackbox-test/SKILL.md +84 -0
  265. package/areas/software/full-stack/skills/prompt-project-planner/SKILL.md +130 -0
  266. package/areas/software/full-stack/skills/prompt-project-planner/output.schema.md +68 -0
  267. package/areas/software/full-stack/skills/prompt-project-planner/questions.md +80 -0
  268. package/areas/software/full-stack/skills/python-pro/SKILL.md +158 -0
  269. package/areas/software/full-stack/skills/skill-creator/LICENSE.txt +202 -0
  270. package/areas/software/full-stack/skills/skill-creator/SKILL.md +356 -0
  271. package/areas/software/full-stack/skills/skill-creator/references/output-patterns.md +82 -0
  272. package/areas/software/full-stack/skills/skill-creator/references/workflows.md +28 -0
  273. package/areas/software/full-stack/skills/skill-creator/scripts/init_skill.py +303 -0
  274. package/areas/software/full-stack/skills/skill-creator/scripts/package_skill.py +110 -0
  275. package/areas/software/full-stack/skills/skill-creator/scripts/quick_validate.py +95 -0
  276. package/areas/software/full-stack/workflows/backend-project-full-cycle.md +132 -0
  277. package/areas/software/full-stack/workflows/debug-issue.md +70 -0
  278. package/areas/software/full-stack/workflows/develop-feature.md +85 -0
  279. package/areas/software/full-stack/workflows/feature-implementation-flow.md +78 -0
  280. package/areas/software/full-stack/workflows/testing-ci-pipeline.md +65 -0
  281. package/areas/software/general/AGENTS.md +68 -0
  282. package/areas/software/general/prompts/code-review-workflow.md +87 -0
  283. package/areas/software/general/prompts/development-cycle-workflow.md +83 -0
  284. package/areas/software/general/prompts/project-setup-workflow.md +93 -0
  285. package/areas/software/general/rules/code-style-guide.md +31 -0
  286. package/areas/software/general/rules/docker-compose-guide.md +27 -0
  287. package/areas/software/general/rules/git-workflow-guide.md +27 -0
  288. package/areas/software/general/rules/github-workflow-guide.md +27 -0
  289. package/areas/software/general/rules/gitlab-ci-guide.md +27 -0
  290. package/areas/software/general/rules/lint-format-guide.md +29 -0
  291. package/areas/software/general/rules/makefile-guide.md +34 -0
  292. package/areas/software/general/rules/readme-sync-guide.md +40 -0
  293. package/areas/software/general/rules/sdlc-methodology-guide.md +27 -0
  294. package/areas/software/general/rules/sdlc-role-responsibilities.md +108 -0
  295. package/areas/software/general/skills/general-dev-tools/SKILL.md +324 -0
  296. package/areas/software/general/workflows/code-review-workflow.md +84 -0
  297. package/areas/software/general/workflows/development-cycle-workflow.md +85 -0
  298. package/areas/software/general/workflows/project-setup-workflow.md +94 -0
  299. package/areas/software/mlops/AGENTS.md +57 -0
  300. package/areas/software/mlops/PROMPTS.md +32 -0
  301. package/areas/software/mlops/prompts/champion-challenger.md +87 -0
  302. package/areas/software/mlops/prompts/deploy-endpoint.md +91 -0
  303. package/areas/software/mlops/prompts/evaluate-model.md +87 -0
  304. package/areas/software/mlops/prompts/model-incident.md +87 -0
  305. package/areas/software/mlops/prompts/train-experiment.md +83 -0
  306. package/areas/software/mlops/rules/data-integrity.md +9 -0
  307. package/areas/software/mlops/rules/model-governance.md +9 -0
  308. package/areas/software/mlops/rules/production-safety.md +9 -0
  309. package/areas/software/mlops/rules/reproducibility.md +9 -0
  310. package/areas/software/mlops/skills/experiment-tracking/SKILL.md +29 -0
  311. package/areas/software/mlops/skills/feature-engineering/SKILL.md +44 -0
  312. package/areas/software/mlops/skills/inference-serving/SKILL.md +35 -0
  313. package/areas/software/mlops/skills/model-evaluation/SKILL.md +40 -0
  314. package/areas/software/mlops/skills/model-monitoring/SKILL.md +32 -0
  315. package/areas/software/mlops/workflows/champion-challenger.md +65 -0
  316. package/areas/software/mlops/workflows/deploy-endpoint.md +70 -0
  317. package/areas/software/mlops/workflows/evaluate-model.md +63 -0
  318. package/areas/software/mlops/workflows/model-incident.md +64 -0
  319. package/areas/software/mlops/workflows/train-experiment.md +56 -0
  320. package/areas/software/mobile/AGENTS.md +58 -0
  321. package/areas/software/mobile/PROMPTS.md +32 -0
  322. package/areas/software/mobile/prompts/crash-triage.md +63 -0
  323. package/areas/software/mobile/prompts/device-testing.md +83 -0
  324. package/areas/software/mobile/prompts/ota-update.md +75 -0
  325. package/areas/software/mobile/prompts/release-build.md +67 -0
  326. package/areas/software/mobile/prompts/store-submission.md +79 -0
  327. package/areas/software/mobile/rules/offline-first.md +10 -0
  328. package/areas/software/mobile/rules/performance-budget.md +20 -0
  329. package/areas/software/mobile/rules/platform-compliance.md +17 -0
  330. package/areas/software/mobile/rules/security-mobile.md +9 -0
  331. package/areas/software/mobile/skills/app-store-prep/SKILL.md +27 -0
  332. package/areas/software/mobile/skills/mobile-testing/SKILL.md +36 -0
  333. package/areas/software/mobile/skills/native-modules/SKILL.md +38 -0
  334. package/areas/software/mobile/skills/navigation-patterns/SKILL.md +49 -0
  335. package/areas/software/mobile/skills/push-notifications/SKILL.md +40 -0
  336. package/areas/software/mobile/skills/state-sync/SKILL.md +48 -0
  337. package/areas/software/mobile/workflows/crash-triage.md +63 -0
  338. package/areas/software/mobile/workflows/device-testing.md +54 -0
  339. package/areas/software/mobile/workflows/ota-update.md +54 -0
  340. package/areas/software/mobile/workflows/release-build.md +67 -0
  341. package/areas/software/mobile/workflows/store-submission.md +63 -0
  342. package/areas/software/platform/AGENTS.md +67 -0
  343. package/areas/software/platform/PROMPTS.md +32 -0
  344. package/areas/software/platform/prompts/cost-audit.md +117 -0
  345. package/areas/software/platform/prompts/deploy-production.md +109 -0
  346. package/areas/software/platform/prompts/drift-check.md +107 -0
  347. package/areas/software/platform/prompts/incident-response.md +121 -0
  348. package/areas/software/platform/prompts/provision-env.md +113 -0
  349. package/areas/software/platform/rules/cost-governance.md +11 -0
  350. package/areas/software/platform/rules/immutability.md +17 -0
  351. package/areas/software/platform/rules/reliability.md +19 -0
  352. package/areas/software/platform/rules/security-posture.md +12 -0
  353. package/areas/software/platform/skills/ci-cd-pipelines/SKILL.md +58 -0
  354. package/areas/software/platform/skills/incident-response/SKILL.md +41 -0
  355. package/areas/software/platform/skills/k8s-manifests/SKILL.md +56 -0
  356. package/areas/software/platform/skills/networking/SKILL.md +44 -0
  357. package/areas/software/platform/skills/observability-setup/SKILL.md +49 -0
  358. package/areas/software/platform/skills/secrets-management/SKILL.md +43 -0
  359. package/areas/software/platform/skills/terraform-patterns/SKILL.md +75 -0
  360. package/areas/software/platform/workflows/cost-audit.md +61 -0
  361. package/areas/software/platform/workflows/deploy-production.md +67 -0
  362. package/areas/software/platform/workflows/drift-check.md +61 -0
  363. package/areas/software/platform/workflows/incident-response.md +69 -0
  364. package/areas/software/platform/workflows/provision-env.md +77 -0
  365. package/areas/software/qa/AGENTS.md +58 -0
  366. package/areas/software/qa/PROMPTS.md +32 -0
  367. package/areas/software/qa/prompts/flakiness-investigation.md +61 -0
  368. package/areas/software/qa/prompts/performance-audit.md +65 -0
  369. package/areas/software/qa/prompts/regression-suite.md +61 -0
  370. package/areas/software/qa/prompts/smoke-test.md +65 -0
  371. package/areas/software/qa/prompts/test-coverage-report.md +61 -0
  372. package/areas/software/qa/rules/flakiness-policy.md +12 -0
  373. package/areas/software/qa/rules/quality-gates.md +28 -0
  374. package/areas/software/qa/rules/test-data.md +9 -0
  375. package/areas/software/qa/rules/test-strategy.md +11 -0
  376. package/areas/software/qa/skills/accessibility-testing/SKILL.md +139 -0
  377. package/areas/software/qa/skills/api-testing/SKILL.md +140 -0
  378. package/areas/software/qa/skills/e2e-patterns/SKILL.md +152 -0
  379. package/areas/software/qa/skills/performance-testing/SKILL.md +177 -0
  380. package/areas/software/qa/skills/test-data-management/SKILL.md +161 -0
  381. package/areas/software/qa/skills/test-pyramid/SKILL.md +127 -0
  382. package/areas/software/qa/workflows/flakiness-investigation.md +63 -0
  383. package/areas/software/qa/workflows/performance-audit.md +59 -0
  384. package/areas/software/qa/workflows/regression-suite.md +59 -0
  385. package/areas/software/qa/workflows/smoke-test.md +64 -0
  386. package/areas/software/qa/workflows/test-coverage-report.md +57 -0
  387. package/areas/software/security/AGENTS.md +58 -0
  388. package/areas/software/security/PROMPTS.md +32 -0
  389. package/areas/software/security/prompts/compliance-report.md +113 -0
  390. package/areas/software/security/prompts/pen-test-sim.md +113 -0
  391. package/areas/software/security/prompts/secret-rotation.md +115 -0
  392. package/areas/software/security/prompts/security-scan.md +91 -0
  393. package/areas/software/security/prompts/threat-model-review.md +105 -0
  394. package/areas/software/security/rules/compliance-baseline.md +23 -0
  395. package/areas/software/security/rules/dependency-policy.md +12 -0
  396. package/areas/software/security/rules/secrets-policy.md +22 -0
  397. package/areas/software/security/rules/secure-coding.md +22 -0
  398. package/areas/software/security/skills/auth-patterns/SKILL.md +42 -0
  399. package/areas/software/security/skills/crypto-standards/SKILL.md +42 -0
  400. package/areas/software/security/skills/dependency-audit/SKILL.md +29 -0
  401. package/areas/software/security/skills/sast-dast-interpretation/SKILL.md +33 -0
  402. package/areas/software/security/skills/security-headers/SKILL.md +29 -0
  403. package/areas/software/security/skills/threat-modeling/SKILL.md +36 -0
  404. package/areas/software/security/workflows/compliance-report.md +57 -0
  405. package/areas/software/security/workflows/pen-test-sim.md +63 -0
  406. package/areas/software/security/workflows/secret-rotation.md +67 -0
  407. package/areas/software/security/workflows/security-scan.md +64 -0
  408. package/areas/software/security/workflows/threat-model-review.md +62 -0
  409. package/areas/template/AGENTS-area.tmpl.md +61 -0
  410. package/areas/template/AGENTS.tmpl.md +67 -0
  411. package/areas/template/GUIDE.md +102 -0
  412. package/areas/template/PROMPTS.tmpl.md +29 -0
  413. package/areas/template/README.md +57 -0
  414. package/areas/template/README.tmpl.md +51 -0
  415. package/areas/template/prompt.tmpl.md +101 -0
  416. package/areas/template/rule.tmpl.md +71 -0
  417. package/areas/template/skill.tmpl.md +108 -0
  418. package/areas/template/workflow.tmpl.md +104 -0
  419. package/bin/agentic.js +24 -0
  420. package/extensions/antigravity/GEMINI.md +10 -0
  421. package/extensions/claude/CLAUDE.md +10 -0
  422. package/extensions/codex/AGENTS.override.md +93 -0
  423. package/extensions/gemini/GEMINI.md +10 -0
  424. package/extensions/opencode/agents/designer.md +65 -0
  425. package/extensions/opencode/agents/developer.md +63 -0
  426. package/extensions/opencode/agents/devops-engineer.md +69 -0
  427. package/extensions/opencode/agents/pm.md +61 -0
  428. package/extensions/opencode/agents/product-owner.md +76 -0
  429. package/extensions/opencode/agents/qa.md +66 -0
  430. package/extensions/opencode/agents/team-lead.md +67 -0
  431. package/extensions/opencode/commands/feature.md +75 -0
  432. package/extensions/opencode/opencode.json +93 -0
  433. package/extensions/opencode/plugins/model-checker.json +14 -0
  434. package/extensions/opencode/plugins/model-checker.ts +279 -0
  435. package/extensions/opencode/plugins/sound-notification.ts +13 -0
  436. package/extensions/opencode/plugins/telegram-notification.ts +86 -0
  437. package/extensions/opencode/skills/code_review_expert/SKILL.md +144 -0
  438. package/extensions/opencode/skills/design_expert/SKILL.md +42 -0
  439. package/extensions/opencode/skills/qa_expert/SKILL.md +116 -0
  440. package/package.json +19 -0
@@ -0,0 +1,159 @@
1
+ ---
2
+ name: log-aggregation
3
+ type: skill
4
+ description: Set up Loki or ELK log aggregation for K8s workloads — structured logging, log routing, and log-based alerting.
5
+ related-rules:
6
+ - golden-signals.md
7
+ - data-retention.md
8
+ allowed-tools: Read, Write, Edit, Bash
9
+ ---
10
+
11
+ # Skill: Log Aggregation
12
+
13
+ > **Expertise:** Loki (Grafana stack), Promtail/Fluent Bit, structured JSON logging, log-based alerting, ELK basics.
14
+
15
+ ## When to load
16
+
17
+ When setting up log collection, writing log queries, debugging missing logs, or adding log-based alerts.
18
+
19
+ ## Loki Stack (K8s — recommended)
20
+
21
+ ```yaml
22
+ # Promtail DaemonSet auto-discovers K8s pod logs
23
+ # Install via helm:
24
+ helm upgrade --install loki grafana/loki-stack \
25
+ -n monitoring \
26
+ -f loki-values.yaml
27
+
28
+ # loki-values.yaml
29
+ loki:
30
+ auth_enabled: false
31
+ limits_config:
32
+ retention_period: 720h # 30 days
33
+ ingestion_rate_mb: 16
34
+ max_streams_per_user: 10000
35
+ storage_config:
36
+ boltdb_shipper:
37
+ active_index_directory: /data/loki/boltdb-index
38
+ filesystem:
39
+ directory: /data/loki/chunks
40
+
41
+ promtail:
42
+ config:
43
+ clients:
44
+ - url: http://loki:3100/loki/api/v1/push
45
+ scrape_configs:
46
+ - job_name: kubernetes-pods
47
+ kubernetes_sd_configs:
48
+ - role: pod
49
+ pipeline_stages:
50
+ - docker: {} # parse Docker JSON log format
51
+ - json: # extract fields from app JSON logs
52
+ expressions:
53
+ level: level
54
+ trace_id: trace_id
55
+ service: service
56
+ - labels:
57
+ level:
58
+ service:
59
+ ```
60
+
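The `json` and `labels` stages in the Promtail pipeline above can be pictured with a small Python sketch. This is only an illustration of the idea, not Promtail's implementation; the function name `extract_labels` and the sample log line are assumptions made for this example.

```python
import json


def extract_labels(line, label_keys=("level", "service")):
    """Parse a JSON log line and keep only the selected keys as labels."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return {}  # non-JSON lines yield no extracted labels
    if not isinstance(record, dict):
        return {}
    return {key: str(record[key]) for key in label_keys if key in record}


# Hypothetical application log line
sample = '{"level": "error", "service": "order-service", "trace_id": "abc123", "msg": "boom"}'
labels = extract_labels(sample)
print(labels)
```

As in the config above, `trace_id` is extracted from the line but not promoted to a label, which keeps the number of streams bounded.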
61
+ ## LogQL Queries
62
+
63
+ ```logql
64
+ # All error logs from a service (the time range, e.g. last 5 min, comes from the query window)
65
+ {namespace="production", app="order-service"} |= "ERROR"
66
+
67
+ # Parse JSON and filter by field
68
+ {namespace="production"} | json | level="error" | trace_id != ""
69
+
70
+ # Count errors per service (for alerting)
71
+ sum by (service) (
72
+ count_over_time({namespace="production"} | json | level="error" [5m])
73
+ )
74
+
75
+ # Log rate (to detect log explosion)
76
+ sum(rate({namespace="production"}[5m])) by (app)
77
+
78
+ # Find slow requests from logs
79
+ {app="api-gateway"} | json | response_time_ms > 500
80
+ ```
81
+
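To run one of the queries above outside Grafana, a request URL for Loki's `query_range` HTTP endpoint could be assembled roughly as follows. This is a minimal sketch: the helper name `build_query_range_url` and the fixed timestamps are assumptions for illustration, and the base URL is taken from the Promtail client config earlier in this file.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_query_range_url(base_url, logql, start_ns, end_ns, limit=100):
    """Build a /loki/api/v1/query_range URL with nanosecond epoch bounds."""
    params = {
        "query": logql,
        "start": str(start_ns),
        "end": str(end_ns),
        "limit": str(limit),
    }
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{urlencode(params)}"


QUERY = '{namespace="production", app="order-service"} |= "ERROR"'
END_NS = 1_700_000_300 * 10**9          # arbitrary fixed end time for the example
START_NS = END_NS - 5 * 60 * 10**9      # five minutes earlier

url = build_query_range_url("http://loki:3100", QUERY, START_NS, END_NS)
decoded = parse_qs(urlparse(url).query)  # round-trip to check the encoding
print(url)
```

The five-minute range lives in `start` and `end`, not in the LogQL selector itself.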
82
+ ## Structured Logging Standards
83
+
84
+ ```python
85
+ # Python — structlog
86
+ import structlog
87
+
88
+ log = structlog.get_logger()
89
+
90
+ # Always include: service, version, trace_id, span_id, level
91
+ log.info("order.created",
92
+ order_id="ord-123",
93
+ user_id="usr-456", # OK in log; NOT in metrics labels
94
+ amount_cents=4999,
95
+ # trace_id injected automatically by a project-specific processor (here called TraceContextFilter)
96
+ )
97
+
98
+ # Output (JSON):
99
+ # {"event": "order.created", "level": "info", "order_id": "ord-123",
100
+ # "trace_id": "abc123def456", "span_id": "789xyz", "timestamp": "..."}
101
+ ```
102
+
103
+ ```go
104
+ // Go — slog (stdlib, Go 1.21+)
105
+ logger := slog.New(slog.NewJSONHandler(os.Stdout, &slog.HandlerOptions{
106
+ Level: slog.LevelInfo,
107
+ }))
108
+ slog.SetDefault(logger)
109
+
110
+ slog.Info("order.created",
111
+ "order_id", "ord-123",
112
+ "amount_cents", 4999,
113
+ "trace_id", traceID, // inject from context
114
+ )
115
+ ```
116
+
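Where structlog is not available, the same field set (service, version, trace_id, span_id, level) could be produced with the standard library alone. The sketch below is an assumption-laden illustration: the `JsonFormatter` class, the `fields` key passed through `extra`, and the service name and version are all hypothetical.

```python
import io
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object with the standard fields."""

    def __init__(self, service, version):
        super().__init__()
        self.service = service
        self.version = version

    def format(self, record):
        payload = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),
            "event": record.getMessage(),
            "service": self.service,
            "version": self.version,
            "trace_id": getattr(record, "trace_id", ""),
            "span_id": getattr(record, "span_id", ""),
        }
        payload.update(getattr(record, "fields", {}))  # event-specific fields
        return json.dumps(payload)


stream = io.StringIO()  # stands in for stdout so the output can be inspected
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter(service="order-service", version="1.0.0"))
logger = logging.getLogger("structured-demo")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info(
    "order.created",
    extra={
        "trace_id": "abc123def456",
        "span_id": "789xyz",
        "fields": {"order_id": "ord-123", "amount_cents": 4999},
    },
)
entry = json.loads(stream.getvalue())
print(entry)
```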
117
+ ## Log-Based Alerting (Loki ruler)
118
+
119
+ ```yaml
120
+ # loki-rules.yaml
121
+ groups:
122
+ - name: application.logs
123
+ rules:
124
+ - alert: HighErrorLogRate
125
+ expr: |
126
+ sum(rate({namespace="production"} | json | level="error" [5m])) by (app)
127
+ > 10
128
+ for: 2m
129
+ labels:
130
+ severity: warning
131
+ annotations:
132
+ summary: "Error log rate > 10/s — {{ $labels.app }}"
133
+ runbook_url: "https://runbooks.internal/high-error-logs"
134
+ ```
135
+
136
+ ## Fluent Bit (alternative — lower resource usage)
137
+
138
+ ```yaml
139
+ # fluent-bit-config.yaml (K8s ConfigMap)
140
+ [INPUT]
141
+ Name tail
142
+ Path /var/log/containers/*.log
143
+ Parser docker
144
+ Refresh_Interval 5
+ Tag kube.*
145
+
146
+ [FILTER]
147
+ Name kubernetes
148
+ Match kube.*
149
+ Kube_Tag_Prefix kube.var.log.containers.
150
+ Merge_Log On
151
+ Keep_Log Off
152
+
153
+ [OUTPUT]
154
+ Name loki
155
+ Match kube.*
156
+ Host loki
157
+ Port 3100
158
+ Labels job=fluent-bit, namespace=$kubernetes['namespace_name']
159
+ ```
@@ -0,0 +1,188 @@
1
+ ---
2
+ name: prometheus-alertmanager
3
+ type: skill
4
+ description: Write production-quality Prometheus alert rules, recording rules, and Alertmanager routing configs.
5
+ related-rules:
6
+ - golden-signals.md
7
+ - alerting-standards.md
8
+ allowed-tools: Read, Write, Edit, Bash
9
+ ---
10
+
11
+ # Skill: Prometheus & Alertmanager
12
+
13
+ > **Expertise:** PromQL, alert rules, recording rules, Alertmanager routing, inhibition, silences.
14
+
15
+ ## When to load
16
+
17
+ When writing alert rules, debugging PromQL, configuring Alertmanager routing, or investigating a firing alert.
18
+
19
+ ## Golden Signal Alert Rules
20
+
21
+ ```yaml
22
+ # alerts/service-golden-signals.yaml
23
+ groups:
24
+ - name: service.golden-signals
25
+ rules:
26
+
27
+ # ── Errors ────────────────────────────────────────
28
+ - alert: HighErrorRate
29
+ expr: |
30
+ (
31
+ sum(rate(http_requests_total{status=~"5.."}[5m])) by (namespace, service)
32
+ /
33
+ sum(rate(http_requests_total[5m])) by (namespace, service)
34
+ ) > 0.01
35
+ for: 2m
36
+ labels:
37
+ severity: critical
38
+ annotations:
39
+ summary: "Error rate > 1% — {{ $labels.service }} in {{ $labels.namespace }}"
40
+ description: "Current error rate: {{ $value | humanizePercentage }}"
41
+ runbook_url: "https://runbooks.internal/high-error-rate"
42
+
43
+ # ── Latency ───────────────────────────────────────
44
+ - alert: HighP99Latency
45
+ expr: |
46
+ histogram_quantile(0.99,
47
+ sum(rate(http_request_duration_seconds_bucket[5m])) by (namespace, service, le)
48
+ ) > 1.0
49
+ for: 5m
50
+ labels:
51
+ severity: warning
52
+ annotations:
53
+ summary: "p99 latency > 1s — {{ $labels.service }}"
54
+ description: "p99: {{ $value | humanizeDuration }}"
55
+ runbook_url: "https://runbooks.internal/high-latency"
56
+
57
+ # ── Saturation ────────────────────────────────────
58
+ - alert: PodMemoryPressure
59
+ expr: |
60
+ (
61
+ container_memory_working_set_bytes{container!=""}
62
+ /
63
+ (container_spec_memory_limit_bytes{container!=""} > 0) # skip containers without a limit (reported as 0)
64
+ ) > 0.85
65
+ for: 5m
66
+ labels:
67
+ severity: warning
68
+ annotations:
69
+ summary: "Memory > 85% limit — {{ $labels.container }} in {{ $labels.namespace }}"
70
+ runbook_url: "https://runbooks.internal/memory-pressure"
71
+
72
+ # ── Traffic Drop ──────────────────────────────────
73
+ - alert: TrafficDrop
74
+ expr: |
75
+ (
76
+ sum(rate(http_requests_total[5m])) by (service)
77
+ /
78
+ sum(rate(http_requests_total[1h] offset 5m)) by (service)
79
+ ) < 0.5
80
+ for: 5m
81
+ labels:
82
+ severity: warning
83
+ annotations:
84
+ summary: "Traffic dropped > 50% vs trailing 1h average — {{ $labels.service }}"
85
+ runbook_url: "https://runbooks.internal/traffic-drop"
86
+ ```
87
+
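The arithmetic behind the `HighErrorRate` expression can be checked by hand with a simplified rate calculation. This sketch ignores the extrapolation Prometheus applies at window edges, and the sample values are invented for illustration.

```python
def simple_rate(samples):
    """Per-second increase of a counter from (timestamp_s, value) samples, oldest first.

    A drop between two samples is treated as a counter reset.
    """
    if len(samples) < 2:
        return None
    increase = 0.0
    for (_, prev), (_, cur) in zip(samples, samples[1:]):
        increase += cur - prev if cur >= prev else cur
    elapsed = samples[-1][0] - samples[0][0]
    return increase / elapsed if elapsed > 0 else None


# Hypothetical 5-minute window, one scrape per minute
errors = [(0, 100), (60, 103), (120, 106), (180, 109), (240, 112), (300, 115)]
total = [(0, 10_000), (60, 10_200), (120, 10_400), (180, 10_600), (240, 10_800), (300, 11_000)]

error_ratio = simple_rate(errors) / simple_rate(total)
fires = error_ratio > 0.01  # same threshold as the alert rule
print(error_ratio, fires)
```

With these numbers the condition holds, but the alert would only fire after it has stayed true for the `for: 2m` period.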
88
+ ## Recording Rules (pre-aggregate expensive queries)
89
+
90
+ ```yaml
91
+ groups:
92
+ - name: service.recording
93
+ interval: 1m
94
+ rules:
95
+ # Pre-compute error rate (used in dashboards — no re-computation)
96
+ - record: job:http_requests:error_rate5m
97
+ expr: |
98
+ sum(rate(http_requests_total{status=~"5.."}[5m])) by (job, namespace)
99
+ /
100
+ sum(rate(http_requests_total[5m])) by (job, namespace)
101
+
102
+ # Pre-compute p99 (expensive histogram_quantile)
103
+ - record: job:http_request_duration_seconds:p99_5m
104
+ expr: |
105
+ histogram_quantile(0.99,
106
+ sum(rate(http_request_duration_seconds_bucket[5m])) by (job, le)
107
+ )
108
+ ```
109
+
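To see what the recorded p99 represents, here is a rough Python sketch of how a quantile is estimated from cumulative histogram buckets by linear interpolation. It follows the documented behaviour of `histogram_quantile` in outline only; the bucket values are invented and edge cases are simplified.

```python
import math


def histogram_quantile(q, buckets):
    """Estimate quantile q from (upper_bound, cumulative_count) buckets.

    The bucket list must include the +Inf bucket (math.inf).
    """
    if not 0 <= q <= 1:
        raise ValueError("q must be within [0, 1]")
    buckets = sorted(buckets)
    if not buckets or buckets[-1][0] != math.inf:
        raise ValueError("buckets must include the +Inf bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    index = next(i for i, (_, count) in enumerate(buckets) if count >= rank)
    upper, count = buckets[index]
    if upper == math.inf:
        # Rank falls in the +Inf bucket: report the highest finite bound
        return buckets[-2][0] if len(buckets) > 1 else math.nan
    lower, below = (0.0, 0.0) if index == 0 else buckets[index - 1]
    in_bucket = count - below
    if in_bucket == 0:
        return lower
    return lower + (upper - lower) * (rank - below) / in_bucket


# Hypothetical request-duration buckets (seconds)
buckets = [(0.1, 50), (0.5, 90), (1.0, 99), (math.inf, 100)]
p50 = histogram_quantile(0.5, buckets)
p95 = histogram_quantile(0.95, buckets)
print(p50, p95)
```

The estimate can never be more precise than the bucket boundaries, which is why bucket layout matters for latency alerts.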
110
+ ## PromQL Patterns
111
+
112
+ ```promql
113
+ # Rate of requests (always use rate() on counters, not irate() for alerting)
114
+ rate(http_requests_total[5m])
115
+
116
+ # Error ratio
117
+ sum(rate(http_requests_total{status=~"5.."}[5m])) by (service)
118
+ / sum(rate(http_requests_total[5m])) by (service)
119
+
120
+ # Memory utilisation (working set vs limit)
121
+ container_memory_working_set_bytes / (container_spec_memory_limit_bytes > 0)
122
+
123
+ # CPU throttling ratio (> 25% = limit too low)
124
+ sum(rate(container_cpu_cfs_throttled_periods_total[5m])) by (pod)
125
+ / sum(rate(container_cpu_cfs_periods_total[5m])) by (pod)
126
+
127
+ # Absent metric (detect missing scrape targets)
128
+ absent(up{job="my-service"} == 1)
129
+ ```
130
+
131
+ ## Alertmanager Config
132
+
133
+ ```yaml
134
+ # alertmanager.yml
135
+ global:
136
+ resolve_timeout: 5m
137
+ slack_api_url: https://hooks.slack.com/...
138
+
139
+ route:
140
+ receiver: slack-warning
141
+ group_by: [alertname, namespace, service]
142
+ group_wait: 30s
143
+ group_interval: 5m
144
+ repeat_interval: 4h
145
+ routes:
146
+ - matchers: [severity="critical"]
147
+ receiver: pagerduty
148
+ group_wait: 0s # page immediately
149
+ - matchers: [alertname="Watchdog"]
150
+ receiver: deadman-snitch # heartbeat alert; this receiver must also be defined under receivers: or the config fails to load
151
+
152
+ inhibit_rules:
153
+ # If a service is down (critical), suppress its latency/error warnings
154
+ - source_matchers: [severity="critical", alertname="ServiceDown"]
155
+ target_matchers: [severity="warning"]
156
+ equal: [namespace, service]
157
+
158
+ receivers:
159
+ - name: pagerduty
160
+ pagerduty_configs:
161
+ - routing_key: $PD_ROUTING_KEY # placeholder: Alertmanager does not expand env vars, substitute at deploy time
162
+ description: '{{ range .Alerts }}{{ .Annotations.summary }}{{ end }}'
163
+
164
+ - name: slack-warning
165
+ slack_configs:
166
+ - channel: '#alerts-warning'
167
+ title: '{{ .GroupLabels.alertname }}'
168
+ text: '{{ range .Alerts }}{{ .Annotations.description }}{{ end }}'
169
+ ```
170
+
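The routing tree above is evaluated top-down, and the first matching child route wins. A simplified Python sketch of that selection, assuming equality matchers only and no `continue: true`, might look like this; the `pick_receiver` helper and the dict layout are illustrative, not Alertmanager's data model.

```python
def pick_receiver(labels, route):
    """Return the receiver for an alert: first matching child, else the route's own."""
    for child in route.get("routes", []):
        if all(labels.get(key) == value for key, value in child["match"].items()):
            return pick_receiver(labels, child)
    return route["receiver"]


# Mirrors the route block above
ROUTE = {
    "receiver": "slack-warning",
    "routes": [
        {"match": {"severity": "critical"}, "receiver": "pagerduty"},
        {"match": {"alertname": "Watchdog"}, "receiver": "deadman-snitch"},
    ],
}

critical = pick_receiver({"alertname": "HighErrorRate", "severity": "critical"}, ROUTE)
heartbeat = pick_receiver({"alertname": "Watchdog", "severity": "none"}, ROUTE)
fallback = pick_receiver({"alertname": "HighP99Latency", "severity": "warning"}, ROUTE)
print(critical, heartbeat, fallback)
```

One consequence of the ordering: a `Watchdog` alert labelled `severity="critical"` would match the first route and page, so the heartbeat route may need to come first in such setups.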
171
+ ## Debugging Alerts
172
+
173
+ ```bash
174
+ # Check currently firing alerts
175
+ kubectl port-forward svc/alertmanager 9093:9093 -n monitoring
176
+ # Open http://localhost:9093
177
+
178
+ # Evaluate a PromQL expression (check why alert fired/didn't fire)
179
+ kubectl port-forward svc/prometheus 9090:9090 -n monitoring
180
+ # Open http://localhost:9090/graph
181
+
182
+ # Check alert rule evaluation
183
+ curl http://localhost:9090/api/v1/rules | jq '.data.groups[].rules[] | select(.name=="HighErrorRate")'
184
+
185
+ # Silence a noisy alert during maintenance
186
+ amtool silence add alertname="HighErrorRate" namespace="staging" \
187
+ --duration=2h --comment="Scheduled maintenance window"
188
+ ```
@@ -0,0 +1,189 @@
1
+ ---
2
+ name: slo-implementation
3
+ type: skill
4
+ description: Implement SLOs end-to-end in Prometheus — recording rules, burn rate alerts, error budget dashboards, and Sloth/pyrra integration.
5
+ related-rules:
6
+ - golden-signals.md
7
+ - alerting-standards.md
8
+ allowed-tools: Read, Write, Edit, Bash
9
+ ---
10
+
11
+ # Skill: SLO Implementation
12
+
13
+ > **Expertise:** Prometheus recording rules for SLOs, multi-window burn rate alerts, Sloth code generation, error budget Grafana panels.
14
+
15
+ ## When to load
16
+
17
+ When implementing SLOs for a service in Prometheus, setting up burn rate alerts, or creating error budget dashboards.
18
+
19
+ ## Full SLO Stack (single service)
20
+
21
+ ### Step 1: Define the SLI Recording Rules
22
+
23
+ ```yaml
24
+ # prometheus-rules/slo-checkout-service.yaml
25
+ groups:
26
+ - name: slo:checkout-service:recording
27
+ interval: 30s
28
+ rules:
29
+ # Good requests: 2xx only (availability SLI); the commented bucket query below is the latency SLI (< 500ms)
30
+ - record: slo:http_requests_good:rate5m
31
+ labels: { service: checkout-service }
32
+ expr: |
33
+ sum(rate(http_requests_total{
34
+ service="checkout-service",
35
+ status=~"2.."
36
+ }[5m]))
37
+ # For latency SLI, intersect with bucket:
38
+ # sum(rate(http_request_duration_seconds_bucket{
39
+ # service="checkout-service", le="0.5"}[5m]))
40
+
41
+ - record: slo:http_requests_total:rate5m
42
+ labels: { service: checkout-service }
43
+ expr: |
44
+ sum(rate(http_requests_total{service="checkout-service"}[5m]))
45
+
46
+ # SLI ratio (5m window)
47
+ - record: slo:http_availability:ratio_rate5m
48
+ labels: { service: checkout-service }
49
+ expr: |
50
+ slo:http_requests_good:rate5m{service="checkout-service"}
51
+ / slo:http_requests_total:rate5m{service="checkout-service"}
52
+
53
+ # Pre-compute multiple windows for burn rate alerts
54
+ - record: slo:http_availability:ratio_rate30m
55
+ labels: { service: checkout-service }
56
+ expr: |
57
+ sum(rate(http_requests_total{service="checkout-service",status=~"2.."}[30m]))
58
+ / sum(rate(http_requests_total{service="checkout-service"}[30m]))
59
+
60
+ - record: slo:http_availability:ratio_rate1h
61
+ labels: { service: checkout-service }
62
+ expr: |
63
+ sum(rate(http_requests_total{service="checkout-service",status=~"2.."}[1h]))
64
+ / sum(rate(http_requests_total{service="checkout-service"}[1h]))
65
+
66
+ - record: slo:http_availability:ratio_rate6h
67
+ labels: { service: checkout-service }
68
+ expr: |
69
+ sum(rate(http_requests_total{service="checkout-service",status=~"2.."}[6h]))
70
+ / sum(rate(http_requests_total{service="checkout-service"}[6h]))
71
+
72
+ - record: slo:http_availability:ratio_rate1d
73
+ labels: { service: checkout-service }
74
+ expr: |
75
+ sum(rate(http_requests_total{service="checkout-service",status=~"2.."}[1d]))
76
+ / sum(rate(http_requests_total{service="checkout-service"}[1d]))
77
+
78
+ - record: slo:http_availability:ratio_rate28d
79
+ labels: { service: checkout-service }
80
+ expr: |
81
+ avg_over_time(slo:http_availability:ratio_rate5m{service="checkout-service"}[28d])
82
+ # avg_over_time stays correct at any evaluation interval (this group records every 30s, not every 5m)
83
+ ```
84
+
85
+ ### Step 2: Multi-Window Burn Rate Alerts
86
+
87
+ ```yaml
88
+ - name: slo:checkout-service:alerts
89
+ rules:
90
+ # ── Fast burn (1h + 5m windows, 14.4× rate) ──────────────────
91
+ # Consumes ~2% of the budget in 1h (14.4× is exact for a 30d window; 13.44× for 28d) → page immediately
92
+ - alert: CheckoutSLOFastBurn
93
+ expr: |
94
+ (slo:http_availability:ratio_rate1h{service="checkout-service"} < (1 - 14.4 * 0.005))
95
+ and
96
+ (slo:http_availability:ratio_rate5m{service="checkout-service"} < (1 - 14.4 * 0.005))
97
+ for: 2m
98
+ labels:
99
+ severity: critical
100
+ service: checkout-service
101
+ slo: availability-99.5
102
+ annotations:
103
+ summary: "Checkout SLO fast burn — error rate > 14.4× baseline"
104
+ description: "1h availability: {{ $value | humanizePercentage }}. Budget burning rapidly."
105
+ runbook_url: "https://runbooks.internal/checkout-slo-fast-burn"
106
+
107
+ # ── Slow burn (6h + 30m windows, 6× rate) ────────────────────
108
+ # Consumes ~5% of the budget in 6h (6× is exact for a 30d window; 5.6× for 28d) → ticket, fix in business hours
109
+ - alert: CheckoutSLOSlowBurn
110
+ expr: |
111
+ (slo:http_availability:ratio_rate6h{service="checkout-service"} < (1 - 6 * 0.005))
112
+ and
113
+ (slo:http_availability:ratio_rate30m{service="checkout-service"} < (1 - 6 * 0.005))
114
+ for: 15m
115
+ labels:
116
+ severity: warning
117
+ service: checkout-service
118
+ slo: availability-99.5
119
+ annotations:
120
+ summary: "Checkout SLO slow burn — error rate > 6× baseline"
121
+ runbook_url: "https://runbooks.internal/checkout-slo-slow-burn"
122
+
123
+ # ── Budget exhaustion warning ─────────────────────────────────
124
+ - alert: CheckoutSLOBudgetLow
125
+ expr: |
126
+ slo:http_availability:ratio_rate28d{service="checkout-service"}
127
+ < (1 - 0.005 * 0.75) # < 25% budget remaining
128
+ for: 1h
129
+ labels:
130
+ severity: warning
131
+ service: checkout-service
132
+ annotations:
133
+ summary: "Checkout error budget < 25% remaining in the 28d window"
134
+ runbook_url: "https://runbooks.internal/checkout-error-budget"
135
+ ```
136
+
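The burn-rate factors and availability thresholds used in the alerts above follow from simple arithmetic, sketched here in Python. The function names are assumptions made for this example; the 99.5% objective and the 2%/1h and 5%/6h budgets come from the rules above.

```python
def burn_rate(budget_fraction, alert_window_hours, slo_window_days):
    """Burn rate that spends budget_fraction of the error budget within the alert window."""
    return budget_fraction * slo_window_days * 24 / alert_window_hours


def availability_threshold(rate, objective):
    """Availability below which the given burn rate is exceeded."""
    return 1 - rate * (1 - objective)


OBJECTIVE = 0.995  # 99.5% availability, error budget 0.005

fast_30d = burn_rate(0.02, 1, 30)   # 2% of budget in 1h, 30-day window
fast_28d = burn_rate(0.02, 1, 28)   # same budget share, 28-day window
slow_30d = burn_rate(0.05, 6, 30)   # 5% of budget in 6h, 30-day window

fast_threshold = availability_threshold(14.4, OBJECTIVE)
slow_threshold = availability_threshold(6, OBJECTIVE)
print(round(fast_30d, 2), round(fast_28d, 2), round(slow_30d, 2))
print(round(fast_threshold, 4), round(slow_threshold, 4))
```

So `1 - 14.4 * 0.005` in the fast-burn rule corresponds to an availability threshold of 92.8%, and `1 - 6 * 0.005` to 97%.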
137
+ ### Step 3: Sloth (generate from YAML spec)
138
+
139
+ ```yaml
140
+ # slo/checkout-service.yaml
141
+ version: "prometheus/v1"
142
+ service: checkout-service
143
+ labels: { team: backend, tier: "1" }
144
+ slos:
145
+ - name: requests-availability
146
+ objective: 99.5
147
+ description: "99.5% of checkout requests succeed"
148
+ sli:
149
+ events:
150
+ error_query: |
151
+ sum(rate(http_requests_total{
152
+ service="checkout-service",
153
+ status=~"5.."}[{{.window}}]))
154
+ total_query: |
155
+ sum(rate(http_requests_total{
156
+ service="checkout-service"}[{{.window}}]))
157
+ alerting:
158
+ name: CheckoutServiceAvailability
159
+ page_alert:
160
+ labels: { severity: critical }
161
+ annotations:
162
+ runbook_url: https://runbooks.internal/checkout-availability
163
+ ticket_alert:
164
+ labels: { severity: warning }
165
+ ```
166
+
167
+ ```bash
168
+ # Generate Prometheus rules + alerts from Sloth spec
169
+ sloth generate -i slo/checkout-service.yaml -o rules/slo-checkout-generated.yaml
170
+ # Produces: recording rules for all windows + multi-window burn rate alerts
171
+ ```
172
+
173
+ ### Step 4: Error Budget Dashboard (Grafana)
174
+
175
+ ```promql
176
+ # Current error budget remaining (percent of 28d budget)
177
+ (
178
+ avg_over_time(slo:http_availability:ratio_rate5m{service="checkout-service"}[28d])
179
+ # avg_over_time avoids hard-coding a sample count that depends on the rule interval
180
+ - (1 - 0.005)
181
+ )
182
+ / 0.005 * 100
183
+
184
+ # Remaining budget scaled to hours of the 28d window (fraction remaining × 672h); this is not a burn-rate projection
185
+ (
186
+ (slo:http_availability:ratio_rate28d{service="checkout-service"} - (1 - 0.005))
187
+ / 0.005
188
+ ) * 28 * 24
189
+ ```
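The dashboard expressions above reduce to a small calculation, shown here as a Python sketch for sanity-checking panel values. The function names and the sample availability figures are assumptions for illustration; the 99.5% objective and 28-day window come from the rules in this file.

```python
def budget_remaining_fraction(availability, objective):
    """Share of the error budget still unspent (1.0 = untouched, 0.0 = exhausted)."""
    return (availability - objective) / (1 - objective)


def full_outage_minutes_left(availability, objective, window_days=28):
    """Minutes of total outage the remaining budget could still absorb."""
    total_budget_minutes = (1 - objective) * window_days * 24 * 60
    return budget_remaining_fraction(availability, objective) * total_budget_minutes


OBJECTIVE = 0.995

healthy = budget_remaining_fraction(0.998, OBJECTIVE)     # hypothetical 28d availability
at_alert = budget_remaining_fraction(0.99625, OBJECTIVE)  # CheckoutSLOBudgetLow threshold
minutes_left = full_outage_minutes_left(0.998, OBJECTIVE)
print(round(healthy * 100, 1), round(at_alert * 100, 1), round(minutes_left, 2))
```

A negative result means the budget for the window is already overspent.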
@@ -0,0 +1,98 @@
1
+ ---
2
+ name: alert-investigation
3
+ type: workflow
4
+ trigger: /alert-investigation
5
+ description: Structured alert investigation — classify, correlate metrics/logs/traces, identify root cause, mitigate, and improve alert quality.
6
+ inputs:
7
+ - alert_name
8
+ - alert_labels
9
+ - firing_since
10
+ outputs:
11
+ - root_cause_summary
12
+ - mitigation_applied_or_deferred
13
+ - alert_quality_notes
14
+ roles:
15
+ - devops-engineer
16
+ - developer
17
+ execution:
18
+ initiator: developer
19
+ related-rules:
20
+ - golden-signals.md
21
+ - alerting-standards.md
22
+ uses-skills:
23
+ - prometheus-alertmanager
24
+ - grafana-dashboards
25
+ - log-aggregation
26
+ - distributed-tracing
27
+ quality-gates:
28
+ - root cause identified before alert is silenced
29
+ - action item created for any alert that fired without a valid runbook step
30
+ ---
31
+
32
+ ## Steps
33
+
34
+ ### 1. Acknowledge & Classify — `@devops-engineer`
35
+ - Open Grafana: navigate to service dashboard for the affected service
36
+ - Check: is this a real user-impact alert or a false positive?
37
+ - Real: error rate / latency / saturation affecting users
38
+ - False: alert threshold too sensitive for normal traffic patterns
39
+ - Check: when did the alert start? Correlate with recent deploys or cron jobs
40
+ - **Done when:** alert classified (real/false-positive) and current status known
41
+
42
+ ### 2. Correlate Signals — `@devops-engineer`
43
+
44
+ **Metrics (Prometheus):**
45
+ ```promql
46
+ # Error rate breakdown by endpoint
47
+ sum(rate(http_requests_total{service="$svc", status=~"5.."}[5m])) by (path)
48
+ / sum(rate(http_requests_total{service="$svc"}[5m])) by (path)
49
+
50
+ # Latency distribution shift
51
+ histogram_quantile(0.99, sum(rate(http_request_duration_seconds_bucket{service="$svc"}[5m])) by (le))
52
+
53
+ # Recent pod restarts
54
+ increase(kube_pod_container_status_restarts_total{namespace="$ns"}[30m])
55
+ ```
56
+
57
+ **Logs (Loki):**
58
+ ```logql
59
+ {namespace="$ns", app="$svc"} | json | level="error"
60
+ | line_format "{{.message}} trace={{.trace_id}}"
61
+ ```
62
+
63
+ **Traces (Tempo):**
64
+ - Search by trace_id from logs → view full request trace
65
+ - Filter with TraceQL, e.g. `{ duration > 1s && status = error }`, to find slow/failing requests
66
+
67
+ ### 3. Identify Root Cause — `@devops-engineer` + `@developer`
68
+
69
+ Decision tree:
70
+ ```
71
+ Error rate spike?
72
+ → Recent deploy? → Check image diff, config changes → Rollback candidate
73
+ → No deploy? → Check upstream dependency health, DB connections, external API
74
+
75
+ Latency spike?
76
+ → CPU throttling? → Check container_cpu_cfs_throttled_periods_total
77
+ → Memory pressure? → Check working set vs limits
78
+ → Downstream slow? → Trace to identify bottleneck service
79
+ → DB slow? → Check pg_stat_statements, lock waits
80
+
81
+ Saturation?
82
+ → CPU: scale out or increase limits
83
+ → Memory: right-size or find leak
84
+ → Connections: check PgBouncer, connection leak
85
+ ```
86
+
87
+ ### 4. Mitigate — `@devops-engineer`
88
+ - Apply fix (rollback, scale, restart, config change)
89
+ - Watch: is the alert resolving? (it clears once the expression stops matching, usually after the rate window, e.g. 5m, has passed; `for:` only delays firing)
90
+ - If not resolving: escalate to P1
91
+
92
+ ### 5. Post-Investigation Notes — `@devops-engineer`
93
+ - Was the runbook adequate? (could a junior follow it to resolution?)
94
+ - Is the alert threshold correct? (too sensitive = toil; too loose = misses real issues)
95
+ - Create ticket if: runbook needs update, threshold needs tuning, or root cause needs a code fix
96
+
97
+ ## Exit
98
+ Alert resolved or escalated + root cause noted + runbook quality assessed = investigation complete.