blockmine 1.24.0 → 1.27.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (476) hide show
  1. package/CHANGELOG.md +76 -1
  2. package/README.en.md +427 -0
  3. package/README.md +40 -0
  4. package/backend/package.json +2 -2
  5. package/backend/prisma/migrations/20260328173000_add_plugin_source_ref/migration.sql +2 -0
  6. package/backend/prisma/migrations/migration_lock.toml +2 -2
  7. package/backend/prisma/schema.prisma +2 -0
  8. package/backend/src/ai/plugin-assistant-system-prompt.md +664 -5
  9. package/backend/src/api/routes/apiKeys.js +8 -0
  10. package/backend/src/api/routes/bots.js +271 -9
  11. package/backend/src/api/routes/eventGraphs.js +151 -1
  12. package/backend/src/api/routes/health.js +38 -0
  13. package/backend/src/api/routes/nodeRegistry.js +63 -0
  14. package/backend/src/api/routes/plugins.js +254 -29
  15. package/backend/src/api/routes/servers.js +14 -2
  16. package/backend/src/container.js +11 -8
  17. package/backend/src/core/BotCommandLoader.js +161 -0
  18. package/backend/src/core/BotConnection.js +125 -0
  19. package/backend/src/core/BotEventHandlers.js +234 -0
  20. package/backend/src/core/BotIPCHandler.js +445 -0
  21. package/backend/src/core/BotManager.js +15 -7
  22. package/backend/src/core/BotProcess.js +169 -140
  23. package/backend/src/core/EventGraphManager.js +7 -3
  24. package/backend/src/core/GraphDebugHandler.js +229 -0
  25. package/backend/src/core/GraphDebugIPC.js +117 -0
  26. package/backend/src/core/GraphExecutionEngine.js +545 -978
  27. package/backend/src/core/GraphTraversal.js +80 -0
  28. package/backend/src/core/GraphValidation.js +73 -0
  29. package/backend/src/core/NodeDefinition.js +138 -0
  30. package/backend/src/core/NodeRegistry.js +153 -141
  31. package/backend/src/core/PluginLoader.js +83 -3
  32. package/backend/src/core/PluginManager.js +346 -35
  33. package/backend/src/core/RewindSignal.js +9 -0
  34. package/backend/src/core/config/ConfigValidator.js +72 -0
  35. package/backend/src/core/config/FeatureFlags.js +52 -0
  36. package/backend/src/core/config/__tests__/ConfigValidator.test.js +232 -0
  37. package/backend/src/core/domain/entities/Bot.js +39 -0
  38. package/backend/src/core/domain/entities/Command.js +41 -0
  39. package/backend/src/core/domain/entities/EventGraph.js +39 -0
  40. package/backend/src/core/domain/entities/Plugin.js +45 -0
  41. package/backend/src/core/domain/entities/User.js +40 -0
  42. package/backend/src/core/domain/services/DependencyResolver.js +168 -0
  43. package/backend/src/core/domain/services/GraphValidator.js +117 -0
  44. package/backend/src/core/domain/services/PermissionChecker.js +34 -0
  45. package/backend/src/core/domain/services/__tests__/DependencyResolver.test.js +126 -0
  46. package/backend/src/core/domain/valueObjects/BotConfig.js +27 -0
  47. package/backend/src/core/domain/valueObjects/DependencyGraph.js +86 -0
  48. package/backend/src/core/domain/valueObjects/PluginManifest.js +36 -0
  49. package/backend/src/core/errors/BaseError.js +29 -0
  50. package/backend/src/core/errors/ErrorHandler.js +81 -0
  51. package/backend/src/core/errors/__tests__/ErrorHandler.test.js +188 -0
  52. package/backend/src/core/errors/index.js +68 -0
  53. package/backend/src/core/infrastructure/BatchingUtility.js +66 -0
  54. package/backend/src/core/infrastructure/CircuitBreaker.js +103 -0
  55. package/backend/src/core/infrastructure/ConnectionPool.js +81 -0
  56. package/backend/src/core/infrastructure/RateLimiter.js +64 -0
  57. package/backend/src/core/infrastructure/__tests__/BatchingUtility.test.js +86 -0
  58. package/backend/src/core/infrastructure/__tests__/CircuitBreaker.test.js +156 -0
  59. package/backend/src/core/infrastructure/__tests__/ConnectionPool.test.js +146 -0
  60. package/backend/src/core/infrastructure/__tests__/RateLimiter.test.js +171 -0
  61. package/backend/src/core/ipc/botApiFactory.js +72 -0
  62. package/backend/src/core/ipc/ipcMessageTypes.js +115 -0
  63. package/backend/src/core/logging/AuditLogger.js +61 -0
  64. package/backend/src/core/logging/StructuredLogger.js +80 -0
  65. package/backend/src/core/logging/__tests__/StructuredLogger.test.js +213 -0
  66. package/backend/src/core/logging/index.js +7 -0
  67. package/backend/src/core/metrics/MetricsCollector.js +104 -0
  68. package/backend/src/core/metrics/__tests__/MetricsCollector.test.js +131 -0
  69. package/backend/src/core/node-registries/actionsNodes.js +191 -0
  70. package/backend/src/core/node-registries/arraysNodes.js +152 -0
  71. package/backend/src/core/node-registries/botNodes.js +48 -0
  72. package/backend/src/core/node-registries/containerNodes.js +141 -0
  73. package/backend/src/core/node-registries/dataNodes.js +284 -0
  74. package/backend/src/core/node-registries/debugNodes.js +23 -0
  75. package/backend/src/core/node-registries/eventsNodes.js +223 -0
  76. package/backend/src/core/node-registries/flowNodes.js +151 -0
  77. package/backend/src/core/node-registries/furnaceNodes.js +123 -0
  78. package/backend/src/core/node-registries/index.js +108 -0
  79. package/backend/src/core/node-registries/inventory.js +102 -106
  80. package/backend/src/core/node-registries/logicNodes.js +54 -0
  81. package/backend/src/core/node-registries/mathNodes.js +38 -0
  82. package/backend/src/core/node-registries/navigationNodes.js +109 -0
  83. package/backend/src/core/node-registries/objectsNodes.js +90 -0
  84. package/backend/src/core/node-registries/stringsNodes.js +165 -0
  85. package/backend/src/core/node-registries/timeNodes.js +105 -0
  86. package/backend/src/core/node-registries/typeNodes.js +22 -0
  87. package/backend/src/core/node-registries/usersNodes.js +126 -0
  88. package/backend/src/core/nodes/arrays/shuffle.js +14 -0
  89. package/backend/src/core/nodes/bot/get_name.js +8 -0
  90. package/backend/src/core/nodes/bot/stop_bot.js +5 -0
  91. package/backend/src/core/nodes/container/open.js +101 -111
  92. package/backend/src/core/nodes/data/store_read.js +26 -0
  93. package/backend/src/core/nodes/data/store_write.js +23 -0
  94. package/backend/src/core/nodes/event/call_event.js +31 -0
  95. package/backend/src/core/nodes/event/custom_event.js +8 -0
  96. package/backend/src/core/nodes/flow/timer.js +35 -0
  97. package/backend/src/core/nodes/inventory/drop.js +73 -65
  98. package/backend/src/core/nodes/inventory/equip.js +54 -45
  99. package/backend/src/core/nodes/inventory/select_slot.js +48 -46
  100. package/backend/src/core/nodes/navigation/follow.js +54 -51
  101. package/backend/src/core/nodes/navigation/go_to.js +41 -53
  102. package/backend/src/core/nodes/navigation/go_to_entity.js +65 -69
  103. package/backend/src/core/nodes/navigation/go_to_player.js +65 -70
  104. package/backend/src/core/nodes/navigation/stop.js +17 -26
  105. package/backend/src/core/nodes/users/add_to_group.js +24 -0
  106. package/backend/src/core/nodes/users/check_permission.js +26 -0
  107. package/backend/src/core/nodes/users/remove_from_group.js +24 -0
  108. package/backend/src/core/services/BotIPCMessageRouter.js +337 -0
  109. package/backend/src/core/services/BotLifecycleService.js +43 -450
  110. package/backend/src/core/services/CacheManager.js +83 -23
  111. package/backend/src/core/services/CrashRestartManager.js +42 -0
  112. package/backend/src/core/services/DebugSessionManager.js +114 -12
  113. package/backend/src/core/services/EventGraphService.js +69 -0
  114. package/backend/src/core/services/MinecraftBotManager.js +9 -1
  115. package/backend/src/core/services/PluginManagementService.js +84 -0
  116. package/backend/src/core/services/TestModeContext.js +65 -0
  117. package/backend/src/core/services/__tests__/CacheManager.test.js +168 -0
  118. package/backend/src/core/services.js +1 -11
  119. package/backend/src/core/validation/InputValidator.js +167 -0
  120. package/backend/src/core/validation/__tests__/InputValidator.test.js +296 -0
  121. package/backend/src/real-time/botApi/index.js +1 -1
  122. package/backend/src/real-time/socketHandler.js +26 -0
  123. package/backend/src/server.js +21 -6
  124. package/frontend/dist/assets/browser-ponyfill-D8y0Ty7C.js +2 -0
  125. package/frontend/dist/assets/index-CFJLS0dk.css +32 -0
  126. package/frontend/dist/assets/index-D91UGNMG.js +11260 -0
  127. package/frontend/dist/flags/en.svg +32 -0
  128. package/frontend/dist/flags/ru.svg +5 -0
  129. package/frontend/dist/index.html +2 -2
  130. package/frontend/dist/locales/en/admin.json +100 -0
  131. package/frontend/dist/locales/en/api-keys.json +58 -0
  132. package/frontend/dist/locales/en/bots.json +113 -0
  133. package/frontend/dist/locales/en/common.json +53 -0
  134. package/frontend/dist/locales/en/configuration.json +22 -0
  135. package/frontend/dist/locales/en/console.json +10 -0
  136. package/frontend/dist/locales/en/dashboard.json +85 -0
  137. package/frontend/dist/locales/en/dialogs.json +70 -0
  138. package/frontend/dist/locales/en/event-graphs.json +50 -0
  139. package/frontend/dist/locales/en/graph-store.json +70 -0
  140. package/frontend/dist/locales/en/login.json +36 -0
  141. package/frontend/dist/locales/en/management.json +192 -0
  142. package/frontend/dist/locales/en/minecraft-viewer.json +27 -0
  143. package/frontend/dist/locales/en/nodes.json +1132 -0
  144. package/frontend/dist/locales/en/permissions.json +50 -0
  145. package/frontend/dist/locales/en/plugin-detail.json +69 -0
  146. package/frontend/dist/locales/en/plugins.json +329 -0
  147. package/frontend/dist/locales/en/proxies.json +81 -0
  148. package/frontend/dist/locales/en/servers.json +39 -0
  149. package/frontend/dist/locales/en/setup.json +19 -0
  150. package/frontend/dist/locales/en/sidebar.json +195 -0
  151. package/frontend/dist/locales/en/tasks.json +62 -0
  152. package/frontend/dist/locales/en/visual-editor.json +418 -0
  153. package/frontend/dist/locales/en/websocket.json +86 -0
  154. package/frontend/dist/locales/ru/admin.json +100 -0
  155. package/frontend/dist/locales/ru/api-keys.json +58 -0
  156. package/frontend/dist/locales/ru/bots.json +113 -0
  157. package/frontend/dist/locales/ru/common.json +49 -0
  158. package/frontend/dist/locales/ru/configuration.json +22 -0
  159. package/frontend/dist/locales/ru/console.json +10 -0
  160. package/frontend/dist/locales/ru/dashboard.json +85 -0
  161. package/frontend/dist/locales/ru/dialogs.json +70 -0
  162. package/frontend/dist/locales/ru/event-graphs.json +50 -0
  163. package/frontend/dist/locales/ru/graph-store.json +70 -0
  164. package/frontend/dist/locales/ru/login.json +36 -0
  165. package/frontend/dist/locales/ru/management.json +192 -0
  166. package/frontend/dist/locales/ru/minecraft-viewer.json +30 -0
  167. package/frontend/dist/locales/ru/nodes.json +1131 -0
  168. package/frontend/dist/locales/ru/permissions.json +50 -0
  169. package/frontend/dist/locales/ru/plugin-detail.json +49 -0
  170. package/frontend/dist/locales/ru/plugins.json +209 -0
  171. package/frontend/dist/locales/ru/proxies.json +81 -0
  172. package/frontend/dist/locales/ru/servers.json +39 -0
  173. package/frontend/dist/locales/ru/setup.json +19 -0
  174. package/frontend/dist/locales/ru/sidebar.json +195 -0
  175. package/frontend/dist/locales/ru/tasks.json +62 -0
  176. package/frontend/dist/locales/ru/visual-editor.json +420 -0
  177. package/frontend/dist/locales/ru/websocket.json +86 -0
  178. package/frontend/dist/monacoeditorwork/css.worker.bundle.js +7 -7
  179. package/frontend/dist/monacoeditorwork/html.worker.bundle.js +7 -7
  180. package/frontend/dist/monacoeditorwork/json.worker.bundle.js +7 -7
  181. package/frontend/dist/monacoeditorwork/ts.worker.bundle.js +3 -3
  182. package/frontend/package.json +6 -0
  183. package/nul +12 -0
  184. package/package.json +3 -3
  185. package/screen/3dviewer.png +0 -0
  186. package/screen/console.png +0 -0
  187. package/screen/dashboard.png +0 -0
  188. package/screen/graph_collabe.png +0 -0
  189. package/screen/graph_live_debug.png +0 -0
  190. package/screen/language_selector.png +0 -0
  191. package/screen/management_command.png +0 -0
  192. package/screen/node_debug_trace.png +0 -0
  193. package/screen/plugin_/320/276/320/261/320/267/320/276/321/200.png +0 -0
  194. package/screen/websocket.png +0 -0
  195. package/screen//320/275/320/260/321/201/321/202/321/200/320/276/320/271/320/272/320/270_/320/276/321/202/320/264/320/265/320/273/321/214/320/275/321/213/321/205_/320/272/320/276/320/274/320/260/320/275/320/264_/320/272/320/260/320/266/320/264/321/203_/320/272/320/276/320/274/320/260/320/275/320/273/320/264/321/203_/320/274/320/276/320/266/320/275/320/276_/320/275/320/260/321/201/321/202/321/200/320/260/320/270/320/262/320/260/321/202/321/214.png +0 -0
  196. package/screen//320/277/320/273/320/260/320/275/320/270/321/200/320/276/320/262/321/211/320/270/320/272_/320/274/320/276/320/266/320/275/320/276_/320/267/320/260/320/264/320/260/320/262/320/260/321/202/321/214_/320/264/320/265/320/271/321/201/321/202/320/262/320/270/321/217_/320/277/320/276_/320/262/321/200/320/265/320/274/320/265/320/275/320/270.png +0 -0
  197. package/.claude/agents/README.md +0 -469
  198. package/.claude/agents/auth-route-debugger.md +0 -118
  199. package/.claude/agents/auth-route-tester.md +0 -93
  200. package/.claude/agents/auto-error-resolver.md +0 -97
  201. package/.claude/agents/build-optimizer.md +0 -236
  202. package/.claude/agents/code-architect.md +0 -34
  203. package/.claude/agents/code-architecture-reviewer.md +0 -83
  204. package/.claude/agents/code-explorer.md +0 -51
  205. package/.claude/agents/code-refactor-master.md +0 -94
  206. package/.claude/agents/code-reviewer.md +0 -46
  207. package/.claude/agents/cost-optimizer.md +0 -134
  208. package/.claude/agents/deployment-orchestrator.md +0 -113
  209. package/.claude/agents/documentation-architect.md +0 -82
  210. package/.claude/agents/frontend-error-fixer.md +0 -77
  211. package/.claude/agents/iac-code-generator.md +0 -71
  212. package/.claude/agents/incident-responder.md +0 -346
  213. package/.claude/agents/infrastructure-architect.md +0 -31
  214. package/.claude/agents/kubernetes-specialist.md +0 -56
  215. package/.claude/agents/migration-planner.md +0 -181
  216. package/.claude/agents/network-architect.md +0 -196
  217. package/.claude/agents/plan-reviewer.md +0 -52
  218. package/.claude/agents/refactor-planner.md +0 -63
  219. package/.claude/agents/security-scanner.md +0 -102
  220. package/.claude/agents/web-research-specialist.md +0 -78
  221. package/.claude/commands/cost-analysis.md +0 -315
  222. package/.claude/commands/dev-docs-update.md +0 -55
  223. package/.claude/commands/dev-docs.md +0 -51
  224. package/.claude/commands/feature-dev.md +0 -125
  225. package/.claude/commands/incident-debug.md +0 -247
  226. package/.claude/commands/infra-plan.md +0 -81
  227. package/.claude/commands/migration-plan.md +0 -478
  228. package/.claude/commands/route-research-for-testing.md +0 -37
  229. package/.claude/commands/security-review.md +0 -66
  230. package/.claude/hooks/CONFIG.md +0 -448
  231. package/.claude/hooks/README.md +0 -163
  232. package/.claude/hooks/SKILL_ACTIVATION_COMPLETE.md +0 -226
  233. package/.claude/hooks/WINDOWS_HOOKS_README.md +0 -151
  234. package/.claude/hooks/add-skill-activation-banners.ts +0 -132
  235. package/.claude/hooks/comprehensive-skill-test.ts +0 -1315
  236. package/.claude/hooks/error-handling-reminder.sh +0 -12
  237. package/.claude/hooks/error-handling-reminder.ts +0 -222
  238. package/.claude/hooks/k8s-manifest-validator.sh +0 -56
  239. package/.claude/hooks/package-lock.json +0 -556
  240. package/.claude/hooks/package.json +0 -16
  241. package/.claude/hooks/post-tool-use-tracker.ps1 +0 -174
  242. package/.claude/hooks/post-tool-use-tracker.sh +0 -183
  243. package/.claude/hooks/security-policy-check.sh +0 -247
  244. package/.claude/hooks/skill-activation-prompt.ps1 +0 -10
  245. package/.claude/hooks/skill-activation-prompt.sh +0 -10
  246. package/.claude/hooks/skill-activation-prompt.ts +0 -141
  247. package/.claude/hooks/stop-build-check-enhanced.sh +0 -130
  248. package/.claude/hooks/terraform-validator.sh +0 -53
  249. package/.claude/hooks/test-input.json +0 -7
  250. package/.claude/hooks/test-skill-activation.ts +0 -427
  251. package/.claude/hooks/trigger-build-resolver.sh +0 -79
  252. package/.claude/hooks/tsc-check.sh +0 -173
  253. package/.claude/hooks/tsconfig.json +0 -19
  254. package/.claude/settings.json +0 -59
  255. package/.claude/settings.local.json +0 -67
  256. package/.claude/skills/README.md +0 -507
  257. package/.claude/skills/api-engineering/SKILL.md +0 -63
  258. package/.claude/skills/api-engineering/resources/api-versioning.md +0 -88
  259. package/.claude/skills/api-engineering/resources/graphql-patterns.md +0 -106
  260. package/.claude/skills/api-engineering/resources/rate-limiting.md +0 -118
  261. package/.claude/skills/api-engineering/resources/rest-api-design.md +0 -105
  262. package/.claude/skills/backend-dev-guidelines/SKILL.md +0 -306
  263. package/.claude/skills/backend-dev-guidelines/resources/architecture-overview.md +0 -451
  264. package/.claude/skills/backend-dev-guidelines/resources/async-and-errors.md +0 -307
  265. package/.claude/skills/backend-dev-guidelines/resources/complete-examples.md +0 -638
  266. package/.claude/skills/backend-dev-guidelines/resources/configuration.md +0 -275
  267. package/.claude/skills/backend-dev-guidelines/resources/database-patterns.md +0 -224
  268. package/.claude/skills/backend-dev-guidelines/resources/middleware-guide.md +0 -213
  269. package/.claude/skills/backend-dev-guidelines/resources/routing-and-controllers.md +0 -756
  270. package/.claude/skills/backend-dev-guidelines/resources/sentry-and-monitoring.md +0 -336
  271. package/.claude/skills/backend-dev-guidelines/resources/services-and-repositories.md +0 -789
  272. package/.claude/skills/backend-dev-guidelines/resources/testing-guide.md +0 -235
  273. package/.claude/skills/backend-dev-guidelines/resources/validation-patterns.md +0 -754
  274. package/.claude/skills/budget-and-cost-management/SKILL.md +0 -850
  275. package/.claude/skills/build-engineering/SKILL.md +0 -431
  276. package/.claude/skills/build-engineering/resources/artifact-repositories.md +0 -72
  277. package/.claude/skills/build-engineering/resources/build-caching.md +0 -96
  278. package/.claude/skills/build-engineering/resources/build-pipelines.md +0 -105
  279. package/.claude/skills/build-engineering/resources/build-security.md +0 -95
  280. package/.claude/skills/build-engineering/resources/build-systems.md +0 -389
  281. package/.claude/skills/build-engineering/resources/compilation-optimization.md +0 -201
  282. package/.claude/skills/build-engineering/resources/dependency-management.md +0 -73
  283. package/.claude/skills/build-engineering/resources/monorepo-builds.md +0 -110
  284. package/.claude/skills/build-engineering/resources/performance-optimization.md +0 -113
  285. package/.claude/skills/build-engineering/resources/reproducible-builds.md +0 -82
  286. package/.claude/skills/cloud-engineering/SKILL.md +0 -675
  287. package/.claude/skills/cloud-engineering/resources/aws-patterns.md +0 -742
  288. package/.claude/skills/cloud-engineering/resources/azure-patterns.md +0 -714
  289. package/.claude/skills/cloud-engineering/resources/cleared-cloud-environments.md +0 -987
  290. package/.claude/skills/cloud-engineering/resources/cloud-cost-optimization.md +0 -757
  291. package/.claude/skills/cloud-engineering/resources/cloud-networking.md +0 -1058
  292. package/.claude/skills/cloud-engineering/resources/cloud-security-tools.md +0 -1530
  293. package/.claude/skills/cloud-engineering/resources/cloud-security.md +0 -990
  294. package/.claude/skills/cloud-engineering/resources/gcp-patterns.md +0 -758
  295. package/.claude/skills/cloud-engineering/resources/migration-strategies.md +0 -820
  296. package/.claude/skills/cloud-engineering/resources/multi-cloud-strategies.md +0 -670
  297. package/.claude/skills/cloud-engineering/resources/oci-patterns.md +0 -1198
  298. package/.claude/skills/cloud-engineering/resources/serverless-patterns.md +0 -795
  299. package/.claude/skills/cloud-engineering/resources/well-architected-frameworks.md +0 -966
  300. package/.claude/skills/cybersecurity/SKILL.md +0 -409
  301. package/.claude/skills/cybersecurity/resources/security-architecture.md +0 -266
  302. package/.claude/skills/database-engineering/SKILL.md +0 -61
  303. package/.claude/skills/database-engineering/resources/backup-and-recovery.md +0 -72
  304. package/.claude/skills/database-engineering/resources/database-replication.md +0 -63
  305. package/.claude/skills/database-engineering/resources/postgresql-fundamentals.md +0 -70
  306. package/.claude/skills/database-engineering/resources/query-optimization.md +0 -68
  307. package/.claude/skills/devsecops/SKILL.md +0 -374
  308. package/.claude/skills/devsecops/resources/ci-cd-security.md +0 -204
  309. package/.claude/skills/devsecops/resources/compliance-automation.md +0 -530
  310. package/.claude/skills/devsecops/resources/compliance-frameworks.md +0 -2322
  311. package/.claude/skills/devsecops/resources/container-security.md +0 -915
  312. package/.claude/skills/devsecops/resources/cspm-integration.md +0 -1440
  313. package/.claude/skills/devsecops/resources/policy-enforcement.md +0 -619
  314. package/.claude/skills/devsecops/resources/secrets-management.md +0 -755
  315. package/.claude/skills/devsecops/resources/security-monitoring.md +0 -146
  316. package/.claude/skills/devsecops/resources/security-scanning.md +0 -887
  317. package/.claude/skills/devsecops/resources/security-testing.md +0 -203
  318. package/.claude/skills/devsecops/resources/supply-chain-security.md +0 -518
  319. package/.claude/skills/devsecops/resources/vulnerability-management.md +0 -481
  320. package/.claude/skills/devsecops/resources/zero-trust-architecture.md +0 -177
  321. package/.claude/skills/documentation-as-code/SKILL.md +0 -323
  322. package/.claude/skills/documentation-as-code/resources/api-documentation.md +0 -90
  323. package/.claude/skills/documentation-as-code/resources/changelog-management.md +0 -79
  324. package/.claude/skills/documentation-as-code/resources/diagram-generation.md +0 -44
  325. package/.claude/skills/documentation-as-code/resources/docs-as-code-workflow.md +0 -99
  326. package/.claude/skills/documentation-as-code/resources/documentation-automation.md +0 -68
  327. package/.claude/skills/documentation-as-code/resources/documentation-sites.md +0 -79
  328. package/.claude/skills/documentation-as-code/resources/markdown-best-practices.md +0 -162
  329. package/.claude/skills/documentation-as-code/resources/openapi-specification.md +0 -77
  330. package/.claude/skills/documentation-as-code/resources/readme-engineering.md +0 -60
  331. package/.claude/skills/documentation-as-code/resources/technical-writing-guide.md +0 -202
  332. package/.claude/skills/engineering-management/SKILL.md +0 -356
  333. package/.claude/skills/engineering-management/resources/career-ladders.md +0 -609
  334. package/.claude/skills/engineering-management/resources/hiring-and-assessment.md +0 -555
  335. package/.claude/skills/engineering-management/resources/one-on-one-guides.md +0 -609
  336. package/.claude/skills/engineering-management/resources/resource-planning.md +0 -557
  337. package/.claude/skills/engineering-management/resources/team-organization-patterns.md +0 -491
  338. package/.claude/skills/engineering-management/resources/technical-interviews.md +0 -474
  339. package/.claude/skills/engineering-operations-management/SKILL.md +0 -817
  340. package/.claude/skills/error-tracking/SKILL.md +0 -379
  341. package/.claude/skills/frontend-design/SKILL.md +0 -42
  342. package/.claude/skills/frontend-dev-guidelines/SKILL.md +0 -403
  343. package/.claude/skills/frontend-dev-guidelines/resources/common-patterns.md +0 -331
  344. package/.claude/skills/frontend-dev-guidelines/resources/complete-examples.md +0 -872
  345. package/.claude/skills/frontend-dev-guidelines/resources/component-patterns.md +0 -502
  346. package/.claude/skills/frontend-dev-guidelines/resources/data-fetching.md +0 -767
  347. package/.claude/skills/frontend-dev-guidelines/resources/file-organization.md +0 -502
  348. package/.claude/skills/frontend-dev-guidelines/resources/loading-and-error-states.md +0 -501
  349. package/.claude/skills/frontend-dev-guidelines/resources/performance.md +0 -406
  350. package/.claude/skills/frontend-dev-guidelines/resources/routing-guide.md +0 -364
  351. package/.claude/skills/frontend-dev-guidelines/resources/styling-guide.md +0 -428
  352. package/.claude/skills/frontend-dev-guidelines/resources/typescript-standards.md +0 -418
  353. package/.claude/skills/general-it-engineering/SKILL.md +0 -393
  354. package/.claude/skills/general-it-engineering/resources/asset-management.md +0 -712
  355. package/.claude/skills/general-it-engineering/resources/automation-orchestration.md +0 -817
  356. package/.claude/skills/general-it-engineering/resources/business-continuity.md +0 -786
  357. package/.claude/skills/general-it-engineering/resources/change-management.md +0 -715
  358. package/.claude/skills/general-it-engineering/resources/enterprise-monitoring.md +0 -729
  359. package/.claude/skills/general-it-engineering/resources/help-desk-operations.md +0 -738
  360. package/.claude/skills/general-it-engineering/resources/incident-service-management.md +0 -834
  361. package/.claude/skills/general-it-engineering/resources/it-governance.md +0 -753
  362. package/.claude/skills/general-it-engineering/resources/itil-framework.md +0 -503
  363. package/.claude/skills/general-it-engineering/resources/service-management.md +0 -669
  364. package/.claude/skills/infrastructure-architecture/SKILL.md +0 -328
  365. package/.claude/skills/infrastructure-architecture/resources/architecture-decision-records.md +0 -505
  366. package/.claude/skills/infrastructure-architecture/resources/architecture-patterns.md +0 -528
  367. package/.claude/skills/infrastructure-architecture/resources/capacity-planning.md +0 -453
  368. package/.claude/skills/infrastructure-architecture/resources/cleared-environment-architecture.md +0 -773
  369. package/.claude/skills/infrastructure-architecture/resources/cost-architecture.md +0 -499
  370. package/.claude/skills/infrastructure-architecture/resources/data-architecture.md +0 -501
  371. package/.claude/skills/infrastructure-architecture/resources/disaster-recovery.md +0 -535
  372. package/.claude/skills/infrastructure-architecture/resources/migration-architecture.md +0 -512
  373. package/.claude/skills/infrastructure-architecture/resources/multi-region-design.md +0 -608
  374. package/.claude/skills/infrastructure-architecture/resources/reference-architectures.md +0 -562
  375. package/.claude/skills/infrastructure-architecture/resources/security-architecture.md +0 -538
  376. package/.claude/skills/infrastructure-architecture/resources/system-design-principles.md +0 -489
  377. package/.claude/skills/infrastructure-architecture/resources/workload-classification.md +0 -1000
  378. package/.claude/skills/infrastructure-strategy/SKILL.md +0 -924
  379. package/.claude/skills/network-engineering/SKILL.md +0 -385
  380. package/.claude/skills/network-engineering/resources/dns-management.md +0 -738
  381. package/.claude/skills/network-engineering/resources/load-balancing.md +0 -820
  382. package/.claude/skills/network-engineering/resources/network-architecture.md +0 -546
  383. package/.claude/skills/network-engineering/resources/network-security.md +0 -921
  384. package/.claude/skills/network-engineering/resources/network-troubleshooting.md +0 -749
  385. package/.claude/skills/network-engineering/resources/routing-switching.md +0 -373
  386. package/.claude/skills/network-engineering/resources/sdn-networking.md +0 -695
  387. package/.claude/skills/network-engineering/resources/service-mesh-networking.md +0 -777
  388. package/.claude/skills/network-engineering/resources/tcp-ip-protocols.md +0 -444
  389. package/.claude/skills/network-engineering/resources/vpn-connectivity.md +0 -672
  390. package/.claude/skills/node-development/SKILL.md +0 -317
  391. package/.claude/skills/observability-engineering/SKILL.md +0 -101
  392. package/.claude/skills/observability-engineering/resources/apm-tools.md +0 -97
  393. package/.claude/skills/observability-engineering/resources/correlation-strategies.md +0 -87
  394. package/.claude/skills/observability-engineering/resources/distributed-tracing.md +0 -98
  395. package/.claude/skills/observability-engineering/resources/logs-aggregation.md +0 -118
  396. package/.claude/skills/observability-engineering/resources/observability-cost-optimization.md +0 -141
  397. package/.claude/skills/observability-engineering/resources/opentelemetry.md +0 -110
  398. package/.claude/skills/platform-engineering/SKILL.md +0 -555
  399. package/.claude/skills/platform-engineering/resources/architecture-overview.md +0 -600
  400. package/.claude/skills/platform-engineering/resources/container-orchestration.md +0 -916
  401. package/.claude/skills/platform-engineering/resources/cost-optimization.md +0 -634
  402. package/.claude/skills/platform-engineering/resources/developer-platforms.md +0 -670
  403. package/.claude/skills/platform-engineering/resources/gitops-automation.md +0 -650
  404. package/.claude/skills/platform-engineering/resources/infrastructure-as-code.md +0 -778
  405. package/.claude/skills/platform-engineering/resources/infrastructure-standards.md +0 -708
  406. package/.claude/skills/platform-engineering/resources/multi-tenancy.md +0 -602
  407. package/.claude/skills/platform-engineering/resources/platform-security.md +0 -711
  408. package/.claude/skills/platform-engineering/resources/resource-management.md +0 -592
  409. package/.claude/skills/platform-engineering/resources/service-mesh.md +0 -628
  410. package/.claude/skills/release-engineering/SKILL.md +0 -393
  411. package/.claude/skills/release-engineering/resources/artifact-management.md +0 -108
  412. package/.claude/skills/release-engineering/resources/build-optimization.md +0 -84
  413. package/.claude/skills/release-engineering/resources/ci-cd-pipelines.md +0 -411
  414. package/.claude/skills/release-engineering/resources/deployment-strategies.md +0 -197
  415. package/.claude/skills/release-engineering/resources/pipeline-security.md +0 -62
  416. package/.claude/skills/release-engineering/resources/progressive-delivery.md +0 -83
  417. package/.claude/skills/release-engineering/resources/release-automation.md +0 -68
  418. package/.claude/skills/release-engineering/resources/release-orchestration.md +0 -77
  419. package/.claude/skills/release-engineering/resources/rollback-strategies.md +0 -66
  420. package/.claude/skills/release-engineering/resources/versioning-strategies.md +0 -59
  421. package/.claude/skills/route-tester/SKILL.md +0 -392
  422. package/.claude/skills/skill-developer/ADVANCED.md +0 -197
  423. package/.claude/skills/skill-developer/HOOK_MECHANISMS.md +0 -306
  424. package/.claude/skills/skill-developer/PATTERNS_LIBRARY.md +0 -152
  425. package/.claude/skills/skill-developer/SKILL.md +0 -430
  426. package/.claude/skills/skill-developer/SKILL_RULES_REFERENCE.md +0 -315
  427. package/.claude/skills/skill-developer/TRIGGER_TYPES.md +0 -305
  428. package/.claude/skills/skill-developer/TROUBLESHOOTING.md +0 -514
  429. package/.claude/skills/skill-rules.json +0 -2989
  430. package/.claude/skills/sre/SKILL.md +0 -464
  431. package/.claude/skills/sre/resources/alerting-best-practices.md +0 -282
  432. package/.claude/skills/sre/resources/capacity-planning.md +0 -226
  433. package/.claude/skills/sre/resources/chaos-engineering.md +0 -193
  434. package/.claude/skills/sre/resources/disaster-recovery.md +0 -232
  435. package/.claude/skills/sre/resources/incident-management.md +0 -436
  436. package/.claude/skills/sre/resources/observability-stack.md +0 -240
  437. package/.claude/skills/sre/resources/on-call-runbooks.md +0 -167
  438. package/.claude/skills/sre/resources/performance-optimization.md +0 -108
  439. package/.claude/skills/sre/resources/reliability-patterns.md +0 -183
  440. package/.claude/skills/sre/resources/slo-sli-sla.md +0 -464
  441. package/.claude/skills/sre/resources/toil-reduction.md +0 -145
  442. package/.claude/skills/systems-engineering/SKILL.md +0 -648
  443. package/.claude/skills/systems-engineering/resources/automation-patterns.md +0 -771
  444. package/.claude/skills/systems-engineering/resources/configuration-management.md +0 -998
  445. package/.claude/skills/systems-engineering/resources/linux-administration.md +0 -672
  446. package/.claude/skills/systems-engineering/resources/networking-fundamentals.md +0 -982
  447. package/.claude/skills/systems-engineering/resources/performance-tuning.md +0 -871
  448. package/.claude/skills/systems-engineering/resources/powershell-scripting.md +0 -482
  449. package/.claude/skills/systems-engineering/resources/security-hardening.md +0 -739
  450. package/.claude/skills/systems-engineering/resources/shell-scripting.md +0 -915
  451. package/.claude/skills/systems-engineering/resources/storage-management.md +0 -628
  452. package/.claude/skills/systems-engineering/resources/system-monitoring.md +0 -787
  453. package/.claude/skills/systems-engineering/resources/troubleshooting-guide.md +0 -753
  454. package/.claude/skills/systems-engineering/resources/windows-administration.md +0 -738
  455. package/.claude/skills/technical-leadership/SKILL.md +0 -728
  456. package/backend/docs/SECRETS_DOCUMENTATION.md +0 -327
  457. package/backend/package-lock.json +0 -6801
  458. package/backend/src/core/node-registries/actions.js +0 -202
  459. package/backend/src/core/node-registries/arrays.js +0 -155
  460. package/backend/src/core/node-registries/bot.js +0 -23
  461. package/backend/src/core/node-registries/container.js +0 -162
  462. package/backend/src/core/node-registries/data.js +0 -290
  463. package/backend/src/core/node-registries/debug.js +0 -26
  464. package/backend/src/core/node-registries/events.js +0 -201
  465. package/backend/src/core/node-registries/flow.js +0 -139
  466. package/backend/src/core/node-registries/furnace.js +0 -143
  467. package/backend/src/core/node-registries/logic.js +0 -62
  468. package/backend/src/core/node-registries/math.js +0 -42
  469. package/backend/src/core/node-registries/navigation.js +0 -111
  470. package/backend/src/core/node-registries/objects.js +0 -98
  471. package/backend/src/core/node-registries/strings.js +0 -187
  472. package/backend/src/core/node-registries/time.js +0 -113
  473. package/backend/src/core/node-registries/type.js +0 -25
  474. package/backend/src/core/node-registries/users.js +0 -79
  475. package/frontend/dist/assets/index-BC-NbKXi.css +0 -32
  476. package/frontend/dist/assets/index-DqJXZMHY.js +0 -11266
@@ -1,464 +0,0 @@
1
- # SRE - Site Reliability Engineering
2
-
3
- Comprehensive guide for implementing SRE practices: SLI/SLO/SLA definitions, error budgets, incident management, on-call best practices, chaos engineering, observability, and reliability patterns for production systems.
4
-
5
- ## Purpose
6
-
7
- Enable teams to build and operate reliable, scalable systems by applying software engineering principles to infrastructure and operations challenges.
8
-
9
- ## When to Use This Skill
10
-
11
- Automatically activates when working on:
12
- - Defining SLIs, SLOs, and SLAs
13
- - Managing error budgets
14
- - Incident response and management
15
- - On-call rotation and procedures
16
- - Observability and monitoring
17
- - Chaos engineering and resilience testing
18
- - Capacity planning and performance
19
- - Disaster recovery planning
20
-
21
- ## Quick Start Checklist
22
-
23
- When implementing SRE practices:
24
-
25
- - [ ] Define Service Level Indicators (SLIs)
26
- - [ ] Set Service Level Objectives (SLOs) with stakeholders
27
- - [ ] Implement error budget tracking
28
- - [ ] Set up comprehensive monitoring and alerting
29
- - [ ] Create incident response runbooks
30
- - [ ] Establish on-call rotation
31
- - [ ] Implement automated remediation
32
- - [ ] Conduct post-incident reviews
33
- - [ ] Perform chaos engineering experiments
34
- - [ ] Document capacity planning process
35
- - [ ] Set up disaster recovery procedures
36
-
37
- ## Core Concepts
38
-
39
- ### SLI / SLO / SLA Hierarchy
40
-
41
- ```
42
- SLA (Service Level Agreement)
43
- ↓ Commits to customers
44
-
45
- SLO (Service Level Objective)
46
- ↓ Internal target (stricter than SLA)
47
-
48
- SLI (Service Level Indicator)
49
- ↓ Measurement that shows if you're meeting SLO
50
-
51
- Metrics (actual measurements)
52
- ```
53
-
54
- **Example:**
55
- ```
56
- SLA: 99.9% uptime guaranteed (3 customers)
57
- SLO: 99.95% uptime target (internal)
58
- SLI: Percentage of successful HTTP requests
59
- (2xx, 3xx responses / total requests)
60
- ```
61
-
62
- ### Error Budgets
63
-
64
- ```
65
- Error Budget = 1 - SLO
66
-
67
- Example:
68
- SLO: 99.9% → Error Budget: 0.1%
69
-
70
- Monthly calculations (30 days):
71
- - Total time: 43,200 minutes
72
- - Allowed downtime: 43.2 minutes
73
-
74
- If error budget is:
75
- - Available → Can take risks, deploy features
76
- - Depleted → Focus on reliability, freeze features
77
- ```
78
-
79
- ### The Four Golden Signals
80
-
81
- ```yaml
82
- 1. Latency: How long requests take
83
- - Good: p50, p95, p99 latencies
84
- - Bad: Average latency (hides outliers)
85
-
86
- 2. Traffic: Demand on your system
87
- - Requests per second
88
- - Transactions per second
89
- - Bandwidth usage
90
-
91
- 3. Errors: Rate of failed requests
92
- - HTTP 5xx errors
93
- - Failed operations
94
- - Wrong results
95
-
96
- 4. Saturation: How "full" your service is
97
- - CPU usage
98
- - Memory usage
99
- - Disk I/O
100
- - Queue depth
101
- ```
102
-
103
- ## Common Patterns
104
-
105
- ### Pattern 1: SLO Definition
106
-
107
- ```yaml
108
- # slos/api-service.yaml
109
- apiVersion: v1
110
- kind: SLO
111
- metadata:
112
- name: api-service-availability
113
- service: api-service
114
- spec:
115
- description: "API service availability for customer requests"
116
-
117
- # SLI definition
118
- sli:
119
- name: availability
120
- description: "Percentage of successful requests"
121
- query: |
122
- sum(rate(http_requests_total{job="api-service",code=~"2.."}[5m]))
123
- /
124
- sum(rate(http_requests_total{job="api-service"}[5m]))
125
-
126
- # SLO target
127
- objectives:
128
- - target: 0.999 # 99.9%
129
- window: 30d # Rolling 30 days
130
-
131
- # Alert thresholds
132
- alerting:
133
- - burn_rate: 14.4 # Alert if burning budget 14.4x faster
134
- short_window: 1h
135
- long_window: 5m
136
- severity: critical
137
-
138
- - burn_rate: 6
139
- short_window: 6h
140
- long_window: 30m
141
- severity: warning
142
- ```
143
-
144
- **Latency SLO:**
145
- ```yaml
146
- apiVersion: v1
147
- kind: SLO
148
- metadata:
149
- name: api-service-latency
150
- spec:
151
- sli:
152
- name: latency
153
- description: "95th percentile latency under 200ms"
154
- query: |
155
- histogram_quantile(0.95,
156
- sum(rate(http_request_duration_seconds_bucket{job="api-service"}[5m])) by (le)
157
- )
158
-
159
- objectives:
160
- - target: 0.200 # 200ms
161
- window: 7d
162
- ```
163
-
164
- ### Pattern 2: Incident Response
165
-
166
- **Incident Severity Levels:**
167
- ```yaml
168
- Severity 1 (Critical):
169
- - Complete service outage
170
- - Security breach
171
- - Data loss
172
- Response: Immediate, all hands
173
- Escalation: 15 minutes
174
-
175
- Severity 2 (High):
176
- - Partial outage
177
- - Degraded performance
178
- - Feature broken
179
- Response: Within 30 minutes
180
- Escalation: 1 hour
181
-
182
- Severity 3 (Medium):
183
- - Minor issues
184
- - Non-critical features affected
185
- Response: Within 4 hours
186
- Escalation: Next business day
187
-
188
- Severity 4 (Low):
189
- - Cosmetic issues
190
- - Questions
191
- Response: Best effort
192
- ```
193
-
194
- **Incident Runbook Template:**
195
- ```markdown
196
- # Incident: API Service High Latency
197
-
198
- ## Symptoms
199
- - API p95 latency > 500ms
200
- - Users reporting slow responses
201
- - Alert: "api-service-latency-high" firing
202
-
203
- ## Impact
204
- - Degraded user experience
205
- - Potential timeout errors
206
-
207
- ## Investigation Steps
208
-
209
- 1. Check service status:
210
- \`\`\`bash
211
- kubectl get pods -n production -l app=api-service
212
- kubectl top pods -n production -l app=api-service
213
- \`\`\`
214
-
215
- 2. Check recent deployments:
216
- \`\`\`bash
217
- kubectl rollout history deployment/api-service -n production
218
- \`\`\`
219
-
220
- 3. Check database performance:
221
- - Query: `rate(pg_stat_database_tup_fetched[5m])`
222
- - Check slow query log
223
-
224
- 4. Check downstream dependencies:
225
- - External API status
226
- - Cache hit rate
227
-
228
- ## Mitigation
229
-
230
- ### Quick Fixes
231
- - Scale up: `kubectl scale deployment api-service --replicas=10`
232
- - Rollback: `kubectl rollout undo deployment/api-service`
233
-
234
- ### Root Cause Fixes
235
- - Optimize slow queries
236
- - Add caching layer
237
- - Increase resource limits
238
-
239
- ## Communication
240
-
241
- - Slack: #incidents channel
242
- - Status page: Update customer-facing status
243
- - Stakeholders: Notify via PagerDuty
244
-
245
- ## Post-Incident
246
-
247
- - Create post-incident review
248
- - Update runbook with learnings
249
- - File tickets for improvements
250
- ```
251
-
252
- ### Pattern 3: Chaos Engineering
253
-
254
- ```yaml
255
- # chaos-experiments/pod-failure.yaml
256
- apiVersion: chaos-mesh.org/v1alpha1
257
- kind: PodChaos
258
- metadata:
259
- name: api-service-pod-failure
260
- namespace: chaos-testing
261
- spec:
262
- action: pod-failure
263
- mode: one
264
- selector:
265
- namespaces:
266
- - production
267
- labelSelectors:
268
- app: api-service
269
- duration: "30s"
270
- scheduler:
271
- cron: "@every 2h" # Run every 2 hours
272
-
273
- ---
274
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
275
- 🎯 SKILL ACTIVATED: sre
276
- ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
277
-
278
- # Network chaos
279
- apiVersion: chaos-mesh.org/v1alpha1
280
- kind: NetworkChaos
281
- metadata:
282
- name: api-network-delay
283
- spec:
284
- action: delay
285
- mode: one
286
- selector:
287
- namespaces:
288
- - production
289
- labelSelectors:
290
- app: api-service
291
- delay:
292
- latency: "100ms"
293
- correlation: "100"
294
- jitter: "0ms"
295
- duration: "5m"
296
- ```
297
-
298
- ## Resource Files
299
-
300
- For detailed guidance on specific topics, see:
301
-
302
- ### Core SRE Practices
303
- - **[slo-sli-sla.md](resources/slo-sli-sla.md)** - Service level definitions, SLO implementation, error budgets, SLA management
304
- - **[incident-management.md](resources/incident-management.md)** - Incident response procedures, runbooks, post-mortems, severity levels
305
- - **[toil-reduction.md](resources/toil-reduction.md)** - Identifying toil, automation strategies, measuring toil reduction
306
-
307
- ### Observability
308
- - **[observability-stack.md](resources/observability-stack.md)** - Metrics, logs, traces, Prometheus, Grafana, Jaeger, distributed tracing
309
- - **[alerting-best-practices.md](resources/alerting-best-practices.md)** - Alert design, notification strategies, on-call alerting, alert fatigue
310
- - **[performance-optimization.md](resources/performance-optimization.md)** - Profiling, performance tuning, optimization techniques, bottleneck analysis
311
-
312
- ### Operations
313
- - **[on-call-runbooks.md](resources/on-call-runbooks.md)** - On-call rotations, escalation procedures, runbook templates, burnout prevention
314
- - **[capacity-planning.md](resources/capacity-planning.md)** - Load testing, capacity modeling, growth forecasting, resource planning
315
- - **[disaster-recovery.md](resources/disaster-recovery.md)** - Backup strategies, RPO/RTO, DR testing, failover procedures, BC planning
316
-
317
- ### Reliability Patterns
318
- - **[reliability-patterns.md](resources/reliability-patterns.md)** - Circuit breakers, retries, timeouts, bulkheads, rate limiting, graceful degradation
319
- - **[chaos-engineering.md](resources/chaos-engineering.md)** - Fault injection, resilience testing, game days, chaos experiments
320
-
321
- ## Best Practices
322
-
323
- ### 1. Start with User-Focused SLIs
324
-
325
- Measure what matters to users, not infrastructure metrics.
326
-
327
- ```yaml
328
- # Good SLIs
329
- - Request success rate
330
- - Request latency
331
- - Data freshness
332
-
333
- # Poor SLIs
334
- - CPU usage
335
- - Memory usage
336
- - Pod count
337
- ```
338
-
339
- ### 2. Set Realistic SLOs
340
-
341
- ```
342
- Don't aim for 100% - it's:
343
- - Impossible to achieve
344
- - Extremely expensive
345
- - Prevents risk-taking
346
-
347
- Typical SLOs:
348
- - User-facing: 99.9% - 99.99%
349
- - Internal services: 99% - 99.9%
350
- - Batch jobs: 95% - 99%
351
- ```
352
-
353
- ### 3. Embrace Error Budgets
354
-
355
- ```
356
- Error budget available:
357
- ✅ Deploy new features
358
- ✅ Experiment
359
- ✅ Take calculated risks
360
-
361
- Error budget depleted:
362
- ❌ Feature freeze
363
- ✅ Focus on reliability
364
- ✅ Fix technical debt
365
- ```
366
-
367
- ### 4. Automate Toil
368
-
369
- ```
370
- Toil = Manual, repetitive, automatable work
371
-
372
- Track toil:
373
- - Measure time spent on toil
374
- - Target: <50% of SRE time on toil
375
- - Automate high-frequency tasks
376
- ```
377
-
378
- ### 5. Blameless Post-Mortems
379
-
380
- ```markdown
381
- # Post-Incident Review Template
382
-
383
- ## Incident Summary
384
- - Date/Time
385
- - Duration
386
- - Severity
387
- - Services affected
388
-
389
- ## Impact
390
- - Users affected
391
- - Revenue impact
392
- - SLO impact
393
-
394
- ## Root Cause
395
- Technical explanation (not who)
396
-
397
- ## Timeline
398
- Detailed chronology
399
-
400
- ## Resolution
401
- What fixed it
402
-
403
- ## Action Items
404
- - [ ] Short-term fixes
405
- - [ ] Long-term improvements
406
- - [ ] Process changes
407
-
408
- ## Lessons Learned
409
- What went well
410
- What could be improved
411
- ```
412
-
413
- ### 6. Gradual Rollouts
414
-
415
- ```
416
- Deployment stages:
417
- 1. Canary (1% traffic) - 15 mins
418
- 2. Small (10% traffic) - 30 mins
419
- 3. Medium (50% traffic) - 1 hour
420
- 4. Full (100% traffic)
421
-
422
- Automated rollback on:
423
- - Error rate increase >1%
424
- - Latency increase >20%
425
- - Failed health checks
426
- ```
427
-
428
- ## Anti-Patterns to Avoid
429
-
430
- ❌ 100% availability target (unrealistic, expensive)
431
- ❌ Ignoring error budgets (defeats the purpose)
432
- ❌ Alert fatigue (too many noisy alerts)
433
- ❌ Manual toil (should be automated)
434
- ❌ Blame culture (prevents learning)
435
- ❌ No runbooks (tribal knowledge)
436
- ❌ Reactive only (should be proactive)
437
- ❌ Monitoring without alerting (data without action)
438
- ❌ No capacity planning (surprise outages)
439
- ❌ Skipping post-mortems (missing learning opportunities)
440
-
441
- ## Integration Points
442
-
443
- This skill integrates with:
444
- - **platform-engineering**: Infrastructure reliability, Kubernetes operations
445
- - **devsecops**: Security incident response, vulnerability management
446
- - **release-engineering**: Safe deployments, rollback strategies
447
- - **cloud-engineering**: Cloud monitoring, disaster recovery
448
- - **systems-engineering**: Performance tuning, system monitoring
449
-
450
- ## Triggers and Activation
451
-
452
- This skill activates when you:
453
- - Define or modify SLOs/SLIs
454
- - Respond to incidents
455
- - Set up monitoring and alerting
456
- - Plan capacity or performance optimization
457
- - Conduct post-incident reviews
458
- - Implement chaos engineering
459
-
460
- ---
461
-
462
- **Focus:** Reliability through engineering discipline
463
- **Key Principle:** Measure everything, improve continuously
464
- **Maintained by:** SRE team based on Google SRE practices and industry standards