blockmine 1.24.0 → 1.27.0

Files changed (476)
  1. package/CHANGELOG.md +76 -1
  2. package/README.en.md +427 -0
  3. package/README.md +40 -0
  4. package/backend/package.json +2 -2
  5. package/backend/prisma/migrations/20260328173000_add_plugin_source_ref/migration.sql +2 -0
  6. package/backend/prisma/migrations/migration_lock.toml +2 -2
  7. package/backend/prisma/schema.prisma +2 -0
  8. package/backend/src/ai/plugin-assistant-system-prompt.md +664 -5
  9. package/backend/src/api/routes/apiKeys.js +8 -0
  10. package/backend/src/api/routes/bots.js +271 -9
  11. package/backend/src/api/routes/eventGraphs.js +151 -1
  12. package/backend/src/api/routes/health.js +38 -0
  13. package/backend/src/api/routes/nodeRegistry.js +63 -0
  14. package/backend/src/api/routes/plugins.js +254 -29
  15. package/backend/src/api/routes/servers.js +14 -2
  16. package/backend/src/container.js +11 -8
  17. package/backend/src/core/BotCommandLoader.js +161 -0
  18. package/backend/src/core/BotConnection.js +125 -0
  19. package/backend/src/core/BotEventHandlers.js +234 -0
  20. package/backend/src/core/BotIPCHandler.js +445 -0
  21. package/backend/src/core/BotManager.js +15 -7
  22. package/backend/src/core/BotProcess.js +169 -140
  23. package/backend/src/core/EventGraphManager.js +7 -3
  24. package/backend/src/core/GraphDebugHandler.js +229 -0
  25. package/backend/src/core/GraphDebugIPC.js +117 -0
  26. package/backend/src/core/GraphExecutionEngine.js +545 -978
  27. package/backend/src/core/GraphTraversal.js +80 -0
  28. package/backend/src/core/GraphValidation.js +73 -0
  29. package/backend/src/core/NodeDefinition.js +138 -0
  30. package/backend/src/core/NodeRegistry.js +153 -141
  31. package/backend/src/core/PluginLoader.js +83 -3
  32. package/backend/src/core/PluginManager.js +346 -35
  33. package/backend/src/core/RewindSignal.js +9 -0
  34. package/backend/src/core/config/ConfigValidator.js +72 -0
  35. package/backend/src/core/config/FeatureFlags.js +52 -0
  36. package/backend/src/core/config/__tests__/ConfigValidator.test.js +232 -0
  37. package/backend/src/core/domain/entities/Bot.js +39 -0
  38. package/backend/src/core/domain/entities/Command.js +41 -0
  39. package/backend/src/core/domain/entities/EventGraph.js +39 -0
  40. package/backend/src/core/domain/entities/Plugin.js +45 -0
  41. package/backend/src/core/domain/entities/User.js +40 -0
  42. package/backend/src/core/domain/services/DependencyResolver.js +168 -0
  43. package/backend/src/core/domain/services/GraphValidator.js +117 -0
  44. package/backend/src/core/domain/services/PermissionChecker.js +34 -0
  45. package/backend/src/core/domain/services/__tests__/DependencyResolver.test.js +126 -0
  46. package/backend/src/core/domain/valueObjects/BotConfig.js +27 -0
  47. package/backend/src/core/domain/valueObjects/DependencyGraph.js +86 -0
  48. package/backend/src/core/domain/valueObjects/PluginManifest.js +36 -0
  49. package/backend/src/core/errors/BaseError.js +29 -0
  50. package/backend/src/core/errors/ErrorHandler.js +81 -0
  51. package/backend/src/core/errors/__tests__/ErrorHandler.test.js +188 -0
  52. package/backend/src/core/errors/index.js +68 -0
  53. package/backend/src/core/infrastructure/BatchingUtility.js +66 -0
  54. package/backend/src/core/infrastructure/CircuitBreaker.js +103 -0
  55. package/backend/src/core/infrastructure/ConnectionPool.js +81 -0
  56. package/backend/src/core/infrastructure/RateLimiter.js +64 -0
  57. package/backend/src/core/infrastructure/__tests__/BatchingUtility.test.js +86 -0
  58. package/backend/src/core/infrastructure/__tests__/CircuitBreaker.test.js +156 -0
  59. package/backend/src/core/infrastructure/__tests__/ConnectionPool.test.js +146 -0
  60. package/backend/src/core/infrastructure/__tests__/RateLimiter.test.js +171 -0
  61. package/backend/src/core/ipc/botApiFactory.js +72 -0
  62. package/backend/src/core/ipc/ipcMessageTypes.js +115 -0
  63. package/backend/src/core/logging/AuditLogger.js +61 -0
  64. package/backend/src/core/logging/StructuredLogger.js +80 -0
  65. package/backend/src/core/logging/__tests__/StructuredLogger.test.js +213 -0
  66. package/backend/src/core/logging/index.js +7 -0
  67. package/backend/src/core/metrics/MetricsCollector.js +104 -0
  68. package/backend/src/core/metrics/__tests__/MetricsCollector.test.js +131 -0
  69. package/backend/src/core/node-registries/actionsNodes.js +191 -0
  70. package/backend/src/core/node-registries/arraysNodes.js +152 -0
  71. package/backend/src/core/node-registries/botNodes.js +48 -0
  72. package/backend/src/core/node-registries/containerNodes.js +141 -0
  73. package/backend/src/core/node-registries/dataNodes.js +284 -0
  74. package/backend/src/core/node-registries/debugNodes.js +23 -0
  75. package/backend/src/core/node-registries/eventsNodes.js +223 -0
  76. package/backend/src/core/node-registries/flowNodes.js +151 -0
  77. package/backend/src/core/node-registries/furnaceNodes.js +123 -0
  78. package/backend/src/core/node-registries/index.js +108 -0
  79. package/backend/src/core/node-registries/inventory.js +102 -106
  80. package/backend/src/core/node-registries/logicNodes.js +54 -0
  81. package/backend/src/core/node-registries/mathNodes.js +38 -0
  82. package/backend/src/core/node-registries/navigationNodes.js +109 -0
  83. package/backend/src/core/node-registries/objectsNodes.js +90 -0
  84. package/backend/src/core/node-registries/stringsNodes.js +165 -0
  85. package/backend/src/core/node-registries/timeNodes.js +105 -0
  86. package/backend/src/core/node-registries/typeNodes.js +22 -0
  87. package/backend/src/core/node-registries/usersNodes.js +126 -0
  88. package/backend/src/core/nodes/arrays/shuffle.js +14 -0
  89. package/backend/src/core/nodes/bot/get_name.js +8 -0
  90. package/backend/src/core/nodes/bot/stop_bot.js +5 -0
  91. package/backend/src/core/nodes/container/open.js +101 -111
  92. package/backend/src/core/nodes/data/store_read.js +26 -0
  93. package/backend/src/core/nodes/data/store_write.js +23 -0
  94. package/backend/src/core/nodes/event/call_event.js +31 -0
  95. package/backend/src/core/nodes/event/custom_event.js +8 -0
  96. package/backend/src/core/nodes/flow/timer.js +35 -0
  97. package/backend/src/core/nodes/inventory/drop.js +73 -65
  98. package/backend/src/core/nodes/inventory/equip.js +54 -45
  99. package/backend/src/core/nodes/inventory/select_slot.js +48 -46
  100. package/backend/src/core/nodes/navigation/follow.js +54 -51
  101. package/backend/src/core/nodes/navigation/go_to.js +41 -53
  102. package/backend/src/core/nodes/navigation/go_to_entity.js +65 -69
  103. package/backend/src/core/nodes/navigation/go_to_player.js +65 -70
  104. package/backend/src/core/nodes/navigation/stop.js +17 -26
  105. package/backend/src/core/nodes/users/add_to_group.js +24 -0
  106. package/backend/src/core/nodes/users/check_permission.js +26 -0
  107. package/backend/src/core/nodes/users/remove_from_group.js +24 -0
  108. package/backend/src/core/services/BotIPCMessageRouter.js +337 -0
  109. package/backend/src/core/services/BotLifecycleService.js +43 -450
  110. package/backend/src/core/services/CacheManager.js +83 -23
  111. package/backend/src/core/services/CrashRestartManager.js +42 -0
  112. package/backend/src/core/services/DebugSessionManager.js +114 -12
  113. package/backend/src/core/services/EventGraphService.js +69 -0
  114. package/backend/src/core/services/MinecraftBotManager.js +9 -1
  115. package/backend/src/core/services/PluginManagementService.js +84 -0
  116. package/backend/src/core/services/TestModeContext.js +65 -0
  117. package/backend/src/core/services/__tests__/CacheManager.test.js +168 -0
  118. package/backend/src/core/services.js +1 -11
  119. package/backend/src/core/validation/InputValidator.js +167 -0
  120. package/backend/src/core/validation/__tests__/InputValidator.test.js +296 -0
  121. package/backend/src/real-time/botApi/index.js +1 -1
  122. package/backend/src/real-time/socketHandler.js +26 -0
  123. package/backend/src/server.js +21 -6
  124. package/frontend/dist/assets/browser-ponyfill-D8y0Ty7C.js +2 -0
  125. package/frontend/dist/assets/index-CFJLS0dk.css +32 -0
  126. package/frontend/dist/assets/index-D91UGNMG.js +11260 -0
  127. package/frontend/dist/flags/en.svg +32 -0
  128. package/frontend/dist/flags/ru.svg +5 -0
  129. package/frontend/dist/index.html +2 -2
  130. package/frontend/dist/locales/en/admin.json +100 -0
  131. package/frontend/dist/locales/en/api-keys.json +58 -0
  132. package/frontend/dist/locales/en/bots.json +113 -0
  133. package/frontend/dist/locales/en/common.json +53 -0
  134. package/frontend/dist/locales/en/configuration.json +22 -0
  135. package/frontend/dist/locales/en/console.json +10 -0
  136. package/frontend/dist/locales/en/dashboard.json +85 -0
  137. package/frontend/dist/locales/en/dialogs.json +70 -0
  138. package/frontend/dist/locales/en/event-graphs.json +50 -0
  139. package/frontend/dist/locales/en/graph-store.json +70 -0
  140. package/frontend/dist/locales/en/login.json +36 -0
  141. package/frontend/dist/locales/en/management.json +192 -0
  142. package/frontend/dist/locales/en/minecraft-viewer.json +27 -0
  143. package/frontend/dist/locales/en/nodes.json +1132 -0
  144. package/frontend/dist/locales/en/permissions.json +50 -0
  145. package/frontend/dist/locales/en/plugin-detail.json +69 -0
  146. package/frontend/dist/locales/en/plugins.json +329 -0
  147. package/frontend/dist/locales/en/proxies.json +81 -0
  148. package/frontend/dist/locales/en/servers.json +39 -0
  149. package/frontend/dist/locales/en/setup.json +19 -0
  150. package/frontend/dist/locales/en/sidebar.json +195 -0
  151. package/frontend/dist/locales/en/tasks.json +62 -0
  152. package/frontend/dist/locales/en/visual-editor.json +418 -0
  153. package/frontend/dist/locales/en/websocket.json +86 -0
  154. package/frontend/dist/locales/ru/admin.json +100 -0
  155. package/frontend/dist/locales/ru/api-keys.json +58 -0
  156. package/frontend/dist/locales/ru/bots.json +113 -0
  157. package/frontend/dist/locales/ru/common.json +49 -0
  158. package/frontend/dist/locales/ru/configuration.json +22 -0
  159. package/frontend/dist/locales/ru/console.json +10 -0
  160. package/frontend/dist/locales/ru/dashboard.json +85 -0
  161. package/frontend/dist/locales/ru/dialogs.json +70 -0
  162. package/frontend/dist/locales/ru/event-graphs.json +50 -0
  163. package/frontend/dist/locales/ru/graph-store.json +70 -0
  164. package/frontend/dist/locales/ru/login.json +36 -0
  165. package/frontend/dist/locales/ru/management.json +192 -0
  166. package/frontend/dist/locales/ru/minecraft-viewer.json +30 -0
  167. package/frontend/dist/locales/ru/nodes.json +1131 -0
  168. package/frontend/dist/locales/ru/permissions.json +50 -0
  169. package/frontend/dist/locales/ru/plugin-detail.json +49 -0
  170. package/frontend/dist/locales/ru/plugins.json +209 -0
  171. package/frontend/dist/locales/ru/proxies.json +81 -0
  172. package/frontend/dist/locales/ru/servers.json +39 -0
  173. package/frontend/dist/locales/ru/setup.json +19 -0
  174. package/frontend/dist/locales/ru/sidebar.json +195 -0
  175. package/frontend/dist/locales/ru/tasks.json +62 -0
  176. package/frontend/dist/locales/ru/visual-editor.json +420 -0
  177. package/frontend/dist/locales/ru/websocket.json +86 -0
  178. package/frontend/dist/monacoeditorwork/css.worker.bundle.js +7 -7
  179. package/frontend/dist/monacoeditorwork/html.worker.bundle.js +7 -7
  180. package/frontend/dist/monacoeditorwork/json.worker.bundle.js +7 -7
  181. package/frontend/dist/monacoeditorwork/ts.worker.bundle.js +3 -3
  182. package/frontend/package.json +6 -0
  183. package/nul +12 -0
  184. package/package.json +3 -3
  185. package/screen/3dviewer.png +0 -0
  186. package/screen/console.png +0 -0
  187. package/screen/dashboard.png +0 -0
  188. package/screen/graph_collabe.png +0 -0
  189. package/screen/graph_live_debug.png +0 -0
  190. package/screen/language_selector.png +0 -0
  191. package/screen/management_command.png +0 -0
  192. package/screen/node_debug_trace.png +0 -0
  193. package/screen/plugin_обзор.png +0 -0
  194. package/screen/websocket.png +0 -0
  195. package/screen/настройки_отдельных_команд_кажду_команлду_можно_настраивать.png +0 -0
  196. package/screen/планировщик_можно_задавать_действия_по_времени.png +0 -0
  197. package/.claude/agents/README.md +0 -469
  198. package/.claude/agents/auth-route-debugger.md +0 -118
  199. package/.claude/agents/auth-route-tester.md +0 -93
  200. package/.claude/agents/auto-error-resolver.md +0 -97
  201. package/.claude/agents/build-optimizer.md +0 -236
  202. package/.claude/agents/code-architect.md +0 -34
  203. package/.claude/agents/code-architecture-reviewer.md +0 -83
  204. package/.claude/agents/code-explorer.md +0 -51
  205. package/.claude/agents/code-refactor-master.md +0 -94
  206. package/.claude/agents/code-reviewer.md +0 -46
  207. package/.claude/agents/cost-optimizer.md +0 -134
  208. package/.claude/agents/deployment-orchestrator.md +0 -113
  209. package/.claude/agents/documentation-architect.md +0 -82
  210. package/.claude/agents/frontend-error-fixer.md +0 -77
  211. package/.claude/agents/iac-code-generator.md +0 -71
  212. package/.claude/agents/incident-responder.md +0 -346
  213. package/.claude/agents/infrastructure-architect.md +0 -31
  214. package/.claude/agents/kubernetes-specialist.md +0 -56
  215. package/.claude/agents/migration-planner.md +0 -181
  216. package/.claude/agents/network-architect.md +0 -196
  217. package/.claude/agents/plan-reviewer.md +0 -52
  218. package/.claude/agents/refactor-planner.md +0 -63
  219. package/.claude/agents/security-scanner.md +0 -102
  220. package/.claude/agents/web-research-specialist.md +0 -78
  221. package/.claude/commands/cost-analysis.md +0 -315
  222. package/.claude/commands/dev-docs-update.md +0 -55
  223. package/.claude/commands/dev-docs.md +0 -51
  224. package/.claude/commands/feature-dev.md +0 -125
  225. package/.claude/commands/incident-debug.md +0 -247
  226. package/.claude/commands/infra-plan.md +0 -81
  227. package/.claude/commands/migration-plan.md +0 -478
  228. package/.claude/commands/route-research-for-testing.md +0 -37
  229. package/.claude/commands/security-review.md +0 -66
  230. package/.claude/hooks/CONFIG.md +0 -448
  231. package/.claude/hooks/README.md +0 -163
  232. package/.claude/hooks/SKILL_ACTIVATION_COMPLETE.md +0 -226
  233. package/.claude/hooks/WINDOWS_HOOKS_README.md +0 -151
  234. package/.claude/hooks/add-skill-activation-banners.ts +0 -132
  235. package/.claude/hooks/comprehensive-skill-test.ts +0 -1315
  236. package/.claude/hooks/error-handling-reminder.sh +0 -12
  237. package/.claude/hooks/error-handling-reminder.ts +0 -222
  238. package/.claude/hooks/k8s-manifest-validator.sh +0 -56
  239. package/.claude/hooks/package-lock.json +0 -556
  240. package/.claude/hooks/package.json +0 -16
  241. package/.claude/hooks/post-tool-use-tracker.ps1 +0 -174
  242. package/.claude/hooks/post-tool-use-tracker.sh +0 -183
  243. package/.claude/hooks/security-policy-check.sh +0 -247
  244. package/.claude/hooks/skill-activation-prompt.ps1 +0 -10
  245. package/.claude/hooks/skill-activation-prompt.sh +0 -10
  246. package/.claude/hooks/skill-activation-prompt.ts +0 -141
  247. package/.claude/hooks/stop-build-check-enhanced.sh +0 -130
  248. package/.claude/hooks/terraform-validator.sh +0 -53
  249. package/.claude/hooks/test-input.json +0 -7
  250. package/.claude/hooks/test-skill-activation.ts +0 -427
  251. package/.claude/hooks/trigger-build-resolver.sh +0 -79
  252. package/.claude/hooks/tsc-check.sh +0 -173
  253. package/.claude/hooks/tsconfig.json +0 -19
  254. package/.claude/settings.json +0 -59
  255. package/.claude/settings.local.json +0 -67
  256. package/.claude/skills/README.md +0 -507
  257. package/.claude/skills/api-engineering/SKILL.md +0 -63
  258. package/.claude/skills/api-engineering/resources/api-versioning.md +0 -88
  259. package/.claude/skills/api-engineering/resources/graphql-patterns.md +0 -106
  260. package/.claude/skills/api-engineering/resources/rate-limiting.md +0 -118
  261. package/.claude/skills/api-engineering/resources/rest-api-design.md +0 -105
  262. package/.claude/skills/backend-dev-guidelines/SKILL.md +0 -306
  263. package/.claude/skills/backend-dev-guidelines/resources/architecture-overview.md +0 -451
  264. package/.claude/skills/backend-dev-guidelines/resources/async-and-errors.md +0 -307
  265. package/.claude/skills/backend-dev-guidelines/resources/complete-examples.md +0 -638
  266. package/.claude/skills/backend-dev-guidelines/resources/configuration.md +0 -275
  267. package/.claude/skills/backend-dev-guidelines/resources/database-patterns.md +0 -224
  268. package/.claude/skills/backend-dev-guidelines/resources/middleware-guide.md +0 -213
  269. package/.claude/skills/backend-dev-guidelines/resources/routing-and-controllers.md +0 -756
  270. package/.claude/skills/backend-dev-guidelines/resources/sentry-and-monitoring.md +0 -336
  271. package/.claude/skills/backend-dev-guidelines/resources/services-and-repositories.md +0 -789
  272. package/.claude/skills/backend-dev-guidelines/resources/testing-guide.md +0 -235
  273. package/.claude/skills/backend-dev-guidelines/resources/validation-patterns.md +0 -754
  274. package/.claude/skills/budget-and-cost-management/SKILL.md +0 -850
  275. package/.claude/skills/build-engineering/SKILL.md +0 -431
  276. package/.claude/skills/build-engineering/resources/artifact-repositories.md +0 -72
  277. package/.claude/skills/build-engineering/resources/build-caching.md +0 -96
  278. package/.claude/skills/build-engineering/resources/build-pipelines.md +0 -105
  279. package/.claude/skills/build-engineering/resources/build-security.md +0 -95
  280. package/.claude/skills/build-engineering/resources/build-systems.md +0 -389
  281. package/.claude/skills/build-engineering/resources/compilation-optimization.md +0 -201
  282. package/.claude/skills/build-engineering/resources/dependency-management.md +0 -73
  283. package/.claude/skills/build-engineering/resources/monorepo-builds.md +0 -110
  284. package/.claude/skills/build-engineering/resources/performance-optimization.md +0 -113
  285. package/.claude/skills/build-engineering/resources/reproducible-builds.md +0 -82
  286. package/.claude/skills/cloud-engineering/SKILL.md +0 -675
  287. package/.claude/skills/cloud-engineering/resources/aws-patterns.md +0 -742
  288. package/.claude/skills/cloud-engineering/resources/azure-patterns.md +0 -714
  289. package/.claude/skills/cloud-engineering/resources/cleared-cloud-environments.md +0 -987
  290. package/.claude/skills/cloud-engineering/resources/cloud-cost-optimization.md +0 -757
  291. package/.claude/skills/cloud-engineering/resources/cloud-networking.md +0 -1058
  292. package/.claude/skills/cloud-engineering/resources/cloud-security-tools.md +0 -1530
  293. package/.claude/skills/cloud-engineering/resources/cloud-security.md +0 -990
  294. package/.claude/skills/cloud-engineering/resources/gcp-patterns.md +0 -758
  295. package/.claude/skills/cloud-engineering/resources/migration-strategies.md +0 -820
  296. package/.claude/skills/cloud-engineering/resources/multi-cloud-strategies.md +0 -670
  297. package/.claude/skills/cloud-engineering/resources/oci-patterns.md +0 -1198
  298. package/.claude/skills/cloud-engineering/resources/serverless-patterns.md +0 -795
  299. package/.claude/skills/cloud-engineering/resources/well-architected-frameworks.md +0 -966
  300. package/.claude/skills/cybersecurity/SKILL.md +0 -409
  301. package/.claude/skills/cybersecurity/resources/security-architecture.md +0 -266
  302. package/.claude/skills/database-engineering/SKILL.md +0 -61
  303. package/.claude/skills/database-engineering/resources/backup-and-recovery.md +0 -72
  304. package/.claude/skills/database-engineering/resources/database-replication.md +0 -63
  305. package/.claude/skills/database-engineering/resources/postgresql-fundamentals.md +0 -70
  306. package/.claude/skills/database-engineering/resources/query-optimization.md +0 -68
  307. package/.claude/skills/devsecops/SKILL.md +0 -374
  308. package/.claude/skills/devsecops/resources/ci-cd-security.md +0 -204
  309. package/.claude/skills/devsecops/resources/compliance-automation.md +0 -530
  310. package/.claude/skills/devsecops/resources/compliance-frameworks.md +0 -2322
  311. package/.claude/skills/devsecops/resources/container-security.md +0 -915
  312. package/.claude/skills/devsecops/resources/cspm-integration.md +0 -1440
  313. package/.claude/skills/devsecops/resources/policy-enforcement.md +0 -619
  314. package/.claude/skills/devsecops/resources/secrets-management.md +0 -755
  315. package/.claude/skills/devsecops/resources/security-monitoring.md +0 -146
  316. package/.claude/skills/devsecops/resources/security-scanning.md +0 -887
  317. package/.claude/skills/devsecops/resources/security-testing.md +0 -203
  318. package/.claude/skills/devsecops/resources/supply-chain-security.md +0 -518
  319. package/.claude/skills/devsecops/resources/vulnerability-management.md +0 -481
  320. package/.claude/skills/devsecops/resources/zero-trust-architecture.md +0 -177
  321. package/.claude/skills/documentation-as-code/SKILL.md +0 -323
  322. package/.claude/skills/documentation-as-code/resources/api-documentation.md +0 -90
  323. package/.claude/skills/documentation-as-code/resources/changelog-management.md +0 -79
  324. package/.claude/skills/documentation-as-code/resources/diagram-generation.md +0 -44
  325. package/.claude/skills/documentation-as-code/resources/docs-as-code-workflow.md +0 -99
  326. package/.claude/skills/documentation-as-code/resources/documentation-automation.md +0 -68
  327. package/.claude/skills/documentation-as-code/resources/documentation-sites.md +0 -79
  328. package/.claude/skills/documentation-as-code/resources/markdown-best-practices.md +0 -162
  329. package/.claude/skills/documentation-as-code/resources/openapi-specification.md +0 -77
  330. package/.claude/skills/documentation-as-code/resources/readme-engineering.md +0 -60
  331. package/.claude/skills/documentation-as-code/resources/technical-writing-guide.md +0 -202
  332. package/.claude/skills/engineering-management/SKILL.md +0 -356
  333. package/.claude/skills/engineering-management/resources/career-ladders.md +0 -609
  334. package/.claude/skills/engineering-management/resources/hiring-and-assessment.md +0 -555
  335. package/.claude/skills/engineering-management/resources/one-on-one-guides.md +0 -609
  336. package/.claude/skills/engineering-management/resources/resource-planning.md +0 -557
  337. package/.claude/skills/engineering-management/resources/team-organization-patterns.md +0 -491
  338. package/.claude/skills/engineering-management/resources/technical-interviews.md +0 -474
  339. package/.claude/skills/engineering-operations-management/SKILL.md +0 -817
  340. package/.claude/skills/error-tracking/SKILL.md +0 -379
  341. package/.claude/skills/frontend-design/SKILL.md +0 -42
  342. package/.claude/skills/frontend-dev-guidelines/SKILL.md +0 -403
  343. package/.claude/skills/frontend-dev-guidelines/resources/common-patterns.md +0 -331
  344. package/.claude/skills/frontend-dev-guidelines/resources/complete-examples.md +0 -872
  345. package/.claude/skills/frontend-dev-guidelines/resources/component-patterns.md +0 -502
  346. package/.claude/skills/frontend-dev-guidelines/resources/data-fetching.md +0 -767
  347. package/.claude/skills/frontend-dev-guidelines/resources/file-organization.md +0 -502
  348. package/.claude/skills/frontend-dev-guidelines/resources/loading-and-error-states.md +0 -501
  349. package/.claude/skills/frontend-dev-guidelines/resources/performance.md +0 -406
  350. package/.claude/skills/frontend-dev-guidelines/resources/routing-guide.md +0 -364
  351. package/.claude/skills/frontend-dev-guidelines/resources/styling-guide.md +0 -428
  352. package/.claude/skills/frontend-dev-guidelines/resources/typescript-standards.md +0 -418
  353. package/.claude/skills/general-it-engineering/SKILL.md +0 -393
  354. package/.claude/skills/general-it-engineering/resources/asset-management.md +0 -712
  355. package/.claude/skills/general-it-engineering/resources/automation-orchestration.md +0 -817
  356. package/.claude/skills/general-it-engineering/resources/business-continuity.md +0 -786
  357. package/.claude/skills/general-it-engineering/resources/change-management.md +0 -715
  358. package/.claude/skills/general-it-engineering/resources/enterprise-monitoring.md +0 -729
  359. package/.claude/skills/general-it-engineering/resources/help-desk-operations.md +0 -738
  360. package/.claude/skills/general-it-engineering/resources/incident-service-management.md +0 -834
  361. package/.claude/skills/general-it-engineering/resources/it-governance.md +0 -753
  362. package/.claude/skills/general-it-engineering/resources/itil-framework.md +0 -503
  363. package/.claude/skills/general-it-engineering/resources/service-management.md +0 -669
  364. package/.claude/skills/infrastructure-architecture/SKILL.md +0 -328
  365. package/.claude/skills/infrastructure-architecture/resources/architecture-decision-records.md +0 -505
  366. package/.claude/skills/infrastructure-architecture/resources/architecture-patterns.md +0 -528
  367. package/.claude/skills/infrastructure-architecture/resources/capacity-planning.md +0 -453
  368. package/.claude/skills/infrastructure-architecture/resources/cleared-environment-architecture.md +0 -773
  369. package/.claude/skills/infrastructure-architecture/resources/cost-architecture.md +0 -499
  370. package/.claude/skills/infrastructure-architecture/resources/data-architecture.md +0 -501
  371. package/.claude/skills/infrastructure-architecture/resources/disaster-recovery.md +0 -535
  372. package/.claude/skills/infrastructure-architecture/resources/migration-architecture.md +0 -512
  373. package/.claude/skills/infrastructure-architecture/resources/multi-region-design.md +0 -608
  374. package/.claude/skills/infrastructure-architecture/resources/reference-architectures.md +0 -562
  375. package/.claude/skills/infrastructure-architecture/resources/security-architecture.md +0 -538
  376. package/.claude/skills/infrastructure-architecture/resources/system-design-principles.md +0 -489
  377. package/.claude/skills/infrastructure-architecture/resources/workload-classification.md +0 -1000
  378. package/.claude/skills/infrastructure-strategy/SKILL.md +0 -924
  379. package/.claude/skills/network-engineering/SKILL.md +0 -385
  380. package/.claude/skills/network-engineering/resources/dns-management.md +0 -738
  381. package/.claude/skills/network-engineering/resources/load-balancing.md +0 -820
  382. package/.claude/skills/network-engineering/resources/network-architecture.md +0 -546
  383. package/.claude/skills/network-engineering/resources/network-security.md +0 -921
  384. package/.claude/skills/network-engineering/resources/network-troubleshooting.md +0 -749
  385. package/.claude/skills/network-engineering/resources/routing-switching.md +0 -373
  386. package/.claude/skills/network-engineering/resources/sdn-networking.md +0 -695
  387. package/.claude/skills/network-engineering/resources/service-mesh-networking.md +0 -777
  388. package/.claude/skills/network-engineering/resources/tcp-ip-protocols.md +0 -444
  389. package/.claude/skills/network-engineering/resources/vpn-connectivity.md +0 -672
  390. package/.claude/skills/node-development/SKILL.md +0 -317
  391. package/.claude/skills/observability-engineering/SKILL.md +0 -101
  392. package/.claude/skills/observability-engineering/resources/apm-tools.md +0 -97
  393. package/.claude/skills/observability-engineering/resources/correlation-strategies.md +0 -87
  394. package/.claude/skills/observability-engineering/resources/distributed-tracing.md +0 -98
  395. package/.claude/skills/observability-engineering/resources/logs-aggregation.md +0 -118
  396. package/.claude/skills/observability-engineering/resources/observability-cost-optimization.md +0 -141
  397. package/.claude/skills/observability-engineering/resources/opentelemetry.md +0 -110
  398. package/.claude/skills/platform-engineering/SKILL.md +0 -555
  399. package/.claude/skills/platform-engineering/resources/architecture-overview.md +0 -600
  400. package/.claude/skills/platform-engineering/resources/container-orchestration.md +0 -916
  401. package/.claude/skills/platform-engineering/resources/cost-optimization.md +0 -634
  402. package/.claude/skills/platform-engineering/resources/developer-platforms.md +0 -670
  403. package/.claude/skills/platform-engineering/resources/gitops-automation.md +0 -650
  404. package/.claude/skills/platform-engineering/resources/infrastructure-as-code.md +0 -778
  405. package/.claude/skills/platform-engineering/resources/infrastructure-standards.md +0 -708
  406. package/.claude/skills/platform-engineering/resources/multi-tenancy.md +0 -602
  407. package/.claude/skills/platform-engineering/resources/platform-security.md +0 -711
  408. package/.claude/skills/platform-engineering/resources/resource-management.md +0 -592
  409. package/.claude/skills/platform-engineering/resources/service-mesh.md +0 -628
  410. package/.claude/skills/release-engineering/SKILL.md +0 -393
  411. package/.claude/skills/release-engineering/resources/artifact-management.md +0 -108
  412. package/.claude/skills/release-engineering/resources/build-optimization.md +0 -84
  413. package/.claude/skills/release-engineering/resources/ci-cd-pipelines.md +0 -411
  414. package/.claude/skills/release-engineering/resources/deployment-strategies.md +0 -197
  415. package/.claude/skills/release-engineering/resources/pipeline-security.md +0 -62
  416. package/.claude/skills/release-engineering/resources/progressive-delivery.md +0 -83
  417. package/.claude/skills/release-engineering/resources/release-automation.md +0 -68
  418. package/.claude/skills/release-engineering/resources/release-orchestration.md +0 -77
  419. package/.claude/skills/release-engineering/resources/rollback-strategies.md +0 -66
  420. package/.claude/skills/release-engineering/resources/versioning-strategies.md +0 -59
  421. package/.claude/skills/route-tester/SKILL.md +0 -392
  422. package/.claude/skills/skill-developer/ADVANCED.md +0 -197
  423. package/.claude/skills/skill-developer/HOOK_MECHANISMS.md +0 -306
  424. package/.claude/skills/skill-developer/PATTERNS_LIBRARY.md +0 -152
  425. package/.claude/skills/skill-developer/SKILL.md +0 -430
  426. package/.claude/skills/skill-developer/SKILL_RULES_REFERENCE.md +0 -315
  427. package/.claude/skills/skill-developer/TRIGGER_TYPES.md +0 -305
  428. package/.claude/skills/skill-developer/TROUBLESHOOTING.md +0 -514
  429. package/.claude/skills/skill-rules.json +0 -2989
  430. package/.claude/skills/sre/SKILL.md +0 -464
  431. package/.claude/skills/sre/resources/alerting-best-practices.md +0 -282
  432. package/.claude/skills/sre/resources/capacity-planning.md +0 -226
  433. package/.claude/skills/sre/resources/chaos-engineering.md +0 -193
  434. package/.claude/skills/sre/resources/disaster-recovery.md +0 -232
  435. package/.claude/skills/sre/resources/incident-management.md +0 -436
  436. package/.claude/skills/sre/resources/observability-stack.md +0 -240
  437. package/.claude/skills/sre/resources/on-call-runbooks.md +0 -167
  438. package/.claude/skills/sre/resources/performance-optimization.md +0 -108
  439. package/.claude/skills/sre/resources/reliability-patterns.md +0 -183
  440. package/.claude/skills/sre/resources/slo-sli-sla.md +0 -464
  441. package/.claude/skills/sre/resources/toil-reduction.md +0 -145
  442. package/.claude/skills/systems-engineering/SKILL.md +0 -648
  443. package/.claude/skills/systems-engineering/resources/automation-patterns.md +0 -771
  444. package/.claude/skills/systems-engineering/resources/configuration-management.md +0 -998
  445. package/.claude/skills/systems-engineering/resources/linux-administration.md +0 -672
  446. package/.claude/skills/systems-engineering/resources/networking-fundamentals.md +0 -982
  447. package/.claude/skills/systems-engineering/resources/performance-tuning.md +0 -871
  448. package/.claude/skills/systems-engineering/resources/powershell-scripting.md +0 -482
  449. package/.claude/skills/systems-engineering/resources/security-hardening.md +0 -739
  450. package/.claude/skills/systems-engineering/resources/shell-scripting.md +0 -915
  451. package/.claude/skills/systems-engineering/resources/storage-management.md +0 -628
  452. package/.claude/skills/systems-engineering/resources/system-monitoring.md +0 -787
  453. package/.claude/skills/systems-engineering/resources/troubleshooting-guide.md +0 -753
  454. package/.claude/skills/systems-engineering/resources/windows-administration.md +0 -738
  455. package/.claude/skills/technical-leadership/SKILL.md +0 -728
  456. package/backend/docs/SECRETS_DOCUMENTATION.md +0 -327
  457. package/backend/package-lock.json +0 -6801
  458. package/backend/src/core/node-registries/actions.js +0 -202
  459. package/backend/src/core/node-registries/arrays.js +0 -155
  460. package/backend/src/core/node-registries/bot.js +0 -23
  461. package/backend/src/core/node-registries/container.js +0 -162
  462. package/backend/src/core/node-registries/data.js +0 -290
  463. package/backend/src/core/node-registries/debug.js +0 -26
  464. package/backend/src/core/node-registries/events.js +0 -201
  465. package/backend/src/core/node-registries/flow.js +0 -139
  466. package/backend/src/core/node-registries/furnace.js +0 -143
  467. package/backend/src/core/node-registries/logic.js +0 -62
  468. package/backend/src/core/node-registries/math.js +0 -42
  469. package/backend/src/core/node-registries/navigation.js +0 -111
  470. package/backend/src/core/node-registries/objects.js +0 -98
  471. package/backend/src/core/node-registries/strings.js +0 -187
  472. package/backend/src/core/node-registries/time.js +0 -113
  473. package/backend/src/core/node-registries/type.js +0 -25
  474. package/backend/src/core/node-registries/users.js +0 -79
  475. package/frontend/dist/assets/index-BC-NbKXi.css +0 -32
  476. package/frontend/dist/assets/index-DqJXZMHY.js +0 -11266
@@ -1,729 +0,0 @@
1
- # Enterprise Monitoring
2
-
3
- Enterprise monitoring tools, dashboards, capacity management, performance metrics, and proactive monitoring strategies.
4
-
5
- ## Table of Contents
6
-
7
- - [Monitoring Overview](#monitoring-overview)
8
- - [Monitoring Tools](#monitoring-tools)
9
- - [Monitoring Metrics](#monitoring-metrics)
10
- - [Dashboards](#dashboards)
11
- - [Alerting](#alerting)
12
- - [Capacity Management](#capacity-management)
13
- - [Best Practices](#best-practices)
14
-
15
- ## Monitoring Overview
16
-
17
- ### Purpose
18
-
19
- Enterprise monitoring provides:
20
- - Real-time visibility into IT infrastructure
21
- - Proactive issue detection
22
- - Performance optimization
23
- - Capacity planning
24
- - Service level compliance
25
- - Root cause analysis
26
-
27
- ### Monitoring Layers
28
-
29
- ```
30
- ┌─────────────────────────────────────────┐
31
- │ Business Monitoring │
32
- │ - Transaction success rate │
33
- │ - Revenue per minute │
34
- │ - Customer experience │
35
- └──────────────┬──────────────────────────┘
36
-
37
- ┌─────────────────────────────────────────┐
38
- │ Application Monitoring (APM) │
39
- │ - Response times │
40
- │ - Error rates │
41
- │ - Database query performance │
42
- └──────────────┬──────────────────────────┘
43
-
44
- ┌─────────────────────────────────────────┐
45
- │ Infrastructure Monitoring │
46
- │ - Server CPU/memory │
47
- │ - Network bandwidth │
48
- │ - Storage capacity │
49
- └──────────────┬──────────────────────────┘
50
-
51
- ┌─────────────────────────────────────────┐
52
- │ Network Monitoring │
53
- │ - Link availability │
54
- │ - Latency │
55
- │ - Packet loss │
56
- └─────────────────────────────────────────┘
57
- ```
58
-
59
- ## Monitoring Tools
60
-
61
- ### Enterprise Monitoring Stack
62
-
63
- **Infrastructure Monitoring:**
64
- ```yaml
65
- Tools:
66
- - Nagios/Icinga: Traditional monitoring
67
- - Zabbix: Enterprise monitoring
68
- - PRTG: Network monitoring
69
- - SolarWinds: Comprehensive suite
70
-
71
- Capabilities:
72
- - Server monitoring (CPU, memory, disk)
73
- - Network device monitoring
74
- - Service checks (HTTP, SMTP, etc.)
75
- - SNMP monitoring
76
- - Alerting
77
- ```
78
-
79
- **Application Performance Monitoring (APM):**
80
- ```yaml
81
- Tools:
82
- - New Relic: Full-stack observability
83
- - Dynatrace: AI-powered APM
84
- - AppDynamics: Application intelligence
85
- - Datadog: Cloud-scale monitoring
86
-
87
- Capabilities:
88
- - Application performance
89
- - Transaction tracing
90
- - Code-level diagnostics
91
- - User experience monitoring
92
- - Error tracking
93
- ```
94
-
95
- **Log Management:**
96
- ```yaml
97
- Tools:
98
- - Splunk: Enterprise log analysis
99
- - ELK Stack: Open-source (Elasticsearch, Logstash, Kibana)
100
- - Graylog: Log management
101
- - Sumo Logic: Cloud-native logs
102
-
103
- Capabilities:
104
- - Centralized logging
105
- - Log aggregation
106
- - Search and analysis
107
- - Correlation
108
- - Compliance
109
- ```
110
-
111
- **Cloud Monitoring:**
112
- ```yaml
113
- AWS:
114
- - CloudWatch: Metrics and logs
115
- - X-Ray: Distributed tracing
116
- - CloudTrail: Audit logs
117
-
118
- Azure:
119
- - Azure Monitor: Unified monitoring
120
- - Application Insights: APM
121
- - Log Analytics: Log management
122
-
123
- GCP:
124
- - Cloud Monitoring: Metrics
125
- - Cloud Logging: Logs
126
- - Cloud Trace: Distributed tracing
127
- ```
128
-
129
- ## Monitoring Metrics
130
-
131
- ### Infrastructure Metrics
132
-
133
- **Server Metrics:**
134
- ```yaml
135
- CPU:
136
- - CPU utilization (%)
137
- - Load average (1m, 5m, 15m)
138
- - Context switches
139
- - CPU steal time (virtual)
140
-
141
- Thresholds:
142
- Warning: >70%
143
- Critical: >90%
144
-
145
- Memory:
146
- - Memory utilization (%)
147
- - Swap usage
148
- - Memory available
149
- - Page faults
150
-
151
- Thresholds:
152
- Warning: >80%
153
- Critical: >95%
154
-
155
- Disk:
156
- - Disk utilization (%)
157
- - Disk I/O (read/write IOPS)
158
- - Disk latency
159
- - Disk queue depth
160
-
161
- Thresholds:
162
- Utilization Warning: >80%
163
- Utilization Critical: >90%
164
- Latency Warning: >20ms
165
- Latency Critical: >50ms
166
-
167
- Network:
168
- - Bandwidth utilization (%)
169
- - Packets in/out
170
- - Errors
171
- - Dropped packets
172
-
173
- Thresholds:
174
- Bandwidth Warning: >70%
175
- Bandwidth Critical: >90%
176
- ```
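The warning and critical thresholds listed above can be applied with a small lookup. The following is a minimal Python sketch, not part of any monitoring tool: `THRESHOLDS` and `classify` are hypothetical names, and it assumes a reading exactly at a threshold does not exceed it.

```python
# Hypothetical helper: classify one reading against the warning/critical
# thresholds listed above. Metric names are illustrative assumptions.
THRESHOLDS = {
    "cpu_percent": (70, 90),
    "memory_percent": (80, 95),
    "disk_percent": (80, 90),
    "disk_latency_ms": (20, 50),
    "bandwidth_percent": (70, 90),
}


def classify(metric: str, value: float) -> str:
    """Return 'ok', 'warning', or 'critical' for a single reading."""
    warning, critical = THRESHOLDS[metric]
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return "ok"
```

For example, `classify("cpu_percent", 75)` returns `"warning"`, because 75 is above the 70% warning level but not above the 90% critical level.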
177
-
178
- ### Application Metrics
179
-
180
- ```yaml
181
- Availability:
182
- - Uptime (%)
183
- - Error rate (%)
184
- - Success rate (%)
185
-
186
- Targets:
187
- Uptime: 99.9% (SLA)
188
- Error Rate: <1%
189
-
190
- Performance:
191
- - Response time (p50, p95, p99)
192
- - Transactions per second (TPS)
193
- - Throughput
194
- - Apdex score
195
-
196
- Targets:
197
- Response Time p95: <500ms
198
- Response Time p99: <1000ms
199
- TPS: >1000
200
-
201
- Resource Usage:
202
- - Connection pool usage
203
- - Thread pool usage
204
- - Cache hit rate
205
- - Queue depth
206
-
207
- Targets:
208
- Connection Pool: <80%
209
- Cache Hit Rate: >90%
210
-
211
- Database:
212
- - Query response time
213
- - Slow queries
214
- - Connection count
215
- - Deadlocks
216
-
217
- Targets:
218
- Query Time p95: <100ms
219
- Slow Queries: <10/hour
220
- ```
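The response-time percentiles and the Apdex score named above can be computed from raw samples. This is a minimal Python sketch under two assumptions: percentiles use the nearest-rank method (APM tools may interpolate instead), and Apdex uses the standard formula with a tolerating band of up to four times the target. The function names are illustrative.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def apdex(samples_ms, target_ms):
    """Apdex = (satisfied + tolerating / 2) / total."""
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    satisfied = sum(1 for s in samples_ms if s <= target_ms)
    tolerating = sum(1 for s in samples_ms if target_ms < s <= 4 * target_ms)
    return (satisfied + tolerating / 2) / len(samples_ms)
```

With a 500 ms target, the samples `[100, 200, 600, 1500, 2500]` give two satisfied, two tolerating, and one frustrated request, so `apdex(...)` returns 0.6.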
221
-
222
- ### Business Metrics
223
-
224
- ```yaml
225
- E-Commerce Example:
226
-
227
- Revenue Metrics:
228
- - Orders per minute
229
- - Revenue per minute
230
- - Cart abandonment rate
231
- - Conversion rate
232
-
233
- User Experience:
234
- - Page load time
235
- - Time to first byte
236
- - Search results time
237
- - Checkout time
238
-
239
- Operational:
240
- - Inventory accuracy
241
- - Order fulfillment time
242
- - Customer support tickets
243
- - Failed payments
244
- ```
245
-
246
- ## Dashboards
247
-
248
- ### Executive Dashboard
249
-
250
- ```yaml
251
- Executive IT Dashboard:
252
-
253
- Service Health:
254
- ┌──────────────────────────────────────┐
255
- │ Service Status │
256
- ├──────────────────────────────────────┤
257
- │ Email: ✅ Operational │
258
- │ Customer Portal: ✅ Operational │
259
- │ VPN: ✅ Operational │
260
- │ File Shares: ⚠️ Degraded │
261
- │ ERP System: ✅ Operational │
262
- └──────────────────────────────────────┘
263
-
264
- SLA Compliance (This Month):
265
- ┌──────────────────────────────────────┐
266
- │ Overall SLA: 99.7% ✅ (Target: 99.5%)│
267
- │ │
268
- │ Email: 99.95% ✅ │
269
- │ Portal: 99.80% ✅ │
270
- │ VPN: 99.50% ✅ │
271
- │ File Shares: 99.40% ⚠️ │
272
- └──────────────────────────────────────┘
273
-
274
- Incidents:
275
- ┌──────────────────────────────────────┐
276
- │ Open: 15 (▼ 25% vs last month) │
277
- │ P1: 0 │
278
- │ P2: 2 │
279
- │ P3: 8 │
280
- │ P4: 5 │
281
- │ │
282
- │ MTTR: 2.5 hours ✅ (Target: 4 hours) │
283
- └──────────────────────────────────────┘
284
-
285
- Costs:
286
- ┌──────────────────────────────────────┐
287
- │ Cloud Spend: $145,000 │
288
- │ Trend: ▼ 5% vs budget ✅ │
289
- │ Top Costs: │
290
- │ 1. Compute: $65,000 (45%) │
291
- │ 2. Storage: $35,000 (24%) │
292
- │ 3. Network: $25,000 (17%) │
293
- └──────────────────────────────────────┘
294
- ```
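The SLA compliance figures in the dashboard above are availability percentages. This minimal Python sketch shows how such a figure, and the downtime a target allows, could be derived. The function names and the 30-day month (43,200 minutes) are assumptions for illustration.

```python
def availability_percent(total_minutes: float, downtime_minutes: float) -> float:
    """Share of the period during which the service was up, as a percentage."""
    return 100 * (total_minutes - downtime_minutes) / total_minutes


def downtime_budget_minutes(target_percent: float, total_minutes: float) -> float:
    """Downtime allowed in the period while still meeting the target."""
    return total_minutes * (100 - target_percent) / 100


MINUTES_PER_30_DAYS = 30 * 24 * 60  # 43,200 minutes
```

Under these assumptions, a 99.5% target over 30 days allows 216 minutes of downtime.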
295
-
296
- ### Operations Dashboard
297
-
298
- ```yaml
299
- NOC (Network Operations Center) Dashboard:
300
-
301
- Infrastructure Overview:
302
- ┌──────────────────────────────────────┐
303
- │ Servers: 245 ✅ / 3 ⚠️ / 0 ❌ │
304
- │ Network: 45 ✅ / 1 ⚠️ / 0 ❌ │
305
- │ Storage: 15 ✅ / 0 ⚠️ / 0 ❌ │
306
- │ Applications: 32 ✅ / 1 ⚠️ / 0 ❌ │
307
- └──────────────────────────────────────┘
308
-
309
- Active Alerts:
310
- ┌──────────────────────────────────────┐
311
- │ Critical: 0 │
312
- │ Warning: 5 │
313
- │ │
314
- │ 1. File Server Disk 85% (Warning) │
315
- │ 2. Web01 CPU 75% (Warning) │
316
- │ 3. Network Link Latency 25ms (Warn) │
317
- │ 4. Database Slow Queries (Warning) │
318
- │ 5. Backup Job Delayed (Warning) │
319
- └──────────────────────────────────────┘
320
-
321
- Performance:
322
- ┌──────────────────────────────────────┐
323
- │ Application Response Time (p95) │
324
- │ ████████████░░░░░░░░ 485ms ✅ │
325
- │ Target: <500ms │
326
- │ │
327
- │ Network Latency (avg) │
328
- │ ████░░░░░░░░░░░░░░░░ 18ms ✅ │
329
- │ Target: <50ms │
330
- │ │
331
- │ Database Query Time (p95) │
332
- │ ██████░░░░░░░░░░░░░░ 85ms ✅ │
333
- │ Target: <100ms │
334
- └──────────────────────────────────────┘
335
- ```
336
-
337
- ### Application Dashboard
338
-
339
- ```yaml
340
- Application Performance Dashboard:
341
-
342
- Customer Portal:
343
-
344
- Response Time Trend (24 hours):
345
- ┌──────────────────────────────────────┐
346
- │ p50 ▁▂▃▂▁▂▃▂▁▂▃▂▁▂▃▂▁▂▃▂▁▂▃▂ 250ms│
347
- │ p95 ▃▅▆▅▃▅▆▅▃▅▆▅▃▅▆▅▃▅▆▅▃▅▆▅ 480ms│
348
- │ p99 ▆▇█▇▆▇█▇▆▇█▇▆▇█▇▆▇█▇▆▇█▇ 920ms│
349
- │ │
350
- │ 00:00 06:00 12:00 18:00 │
351
- └──────────────────────────────────────┘
352
-
353
- Error Rate (24 hours):
354
- ┌──────────────────────────────────────┐
355
- │ 2% █ │
356
- │ 1% █ ▆ ▃ │
357
- │ 0% ▅▃▂▁▂▃▂▁▂▃▂▁▂▃▂▁▂▃▂▁▂▃▂▁▂ │
358
- │ │
359
- │ Current: 0.3% ✅ (Target: <1%) │
360
- └──────────────────────────────────────┘
361
-
362
- Top Endpoints:
363
- ┌──────────────────────────────────────┐
364
- │ Endpoint | Requests | p95 │
365
- ├──────────────────────────────────────┤
366
- │ /api/orders | 15,000 | 320ms │
367
- │ /api/products | 12,500 | 280ms │
368
- │ /api/customers | 8,000 | 450ms │
369
- │ /api/search | 6,000 | 650ms │
370
- │ /api/checkout | 3,500 | 890ms │
371
- └──────────────────────────────────────┘
372
-
373
- Database Queries:
374
- ┌──────────────────────────────────────┐
375
- │ Slow Queries (>1s): 12 ⚠️ │
376
- │ │
377
- │ Top Slow Queries: │
378
- │ 1. SELECT * FROM orders... (2.5s) │
379
- │ 2. JOIN customers... (1.8s) │
380
- │ 3. UPDATE inventory... (1.2s) │
381
- └──────────────────────────────────────┘
382
- ```
383
-
384
- ## Alerting
385
-
386
- ### Alert Levels
387
-
388
- ```yaml
389
- Alert Severity Levels:
390
-
391
- Critical:
392
- Description: Service down, immediate action required
393
- Examples:
394
- - Production database down
395
- - Website unreachable
396
- - Data loss detected
397
- Response: Page on-call, all hands on deck
398
- SLA: Response within 15 minutes
399
-
400
- Warning:
401
- Description: Threshold exceeded, may impact service
402
- Examples:
403
- - Disk 85% full
404
- - CPU 80% for 10 minutes
405
- - Backup job delayed
406
- Response: Investigate within 1 hour
407
- SLA: Acknowledge within 30 minutes
408
-
409
- Info:
410
- Description: Informational, no action required
411
- Examples:
412
- - Backup completed successfully
413
- - Deployment finished
414
- - Certificate renewed
415
- Response: Review during business hours
416
- SLA: No SLA
417
- ```
418
-
419
- ### Alert Rules
420
-
421
- ```yaml
422
- Server CPU Alert:
423
-
424
- Metric: cpu.utilization
425
- Condition: Average > 80% for 10 minutes
426
- Severity: Warning
427
-
428
- Actions:
429
- - Send email to ops-team@company.com
430
- - Create Slack notification in #ops-alerts
431
- - Create ServiceNow ticket
432
-
433
- Escalation:
434
- If CPU > 90% for 10 minutes:
435
- - Upgrade to Critical
436
- - Page on-call engineer
437
- - Notify manager
438
-
439
- Auto-remediation:
440
- If CPU > 95% for 5 minutes:
441
- - Scale up (add server instance)
442
- - Restart stuck processes (if configured)
443
- ```
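The CPU rule above, with its escalation and auto-remediation tiers, can be evaluated over a window of samples. This is a minimal Python sketch that assumes one sample per minute and reads every condition as a window average (the rule only says "Average" for the warning tier). The function name and action strings are hypothetical placeholders, not calls into a real alerting system.

```python
def evaluate_cpu_rule(samples):
    """Evaluate the CPU rule on per-minute utilization samples (oldest first).

    Returns (severity, actions); severity is None when no tier fires.
    """

    def window_average_exceeds(threshold, minutes):
        window = samples[-minutes:]
        return len(window) == minutes and sum(window) / minutes > threshold

    severity = None
    actions = []
    if window_average_exceeds(80, 10):
        severity = "warning"
        actions += ["email ops-team", "slack #ops-alerts", "servicenow ticket"]
    if window_average_exceeds(90, 10):
        severity = "critical"
        actions += ["page on-call", "notify manager"]
    if window_average_exceeds(95, 5):
        actions.append("auto-remediate")
    return severity, actions
```

Ten minutes at 85% yield a warning with the three notification actions, while ten minutes at 96% escalate to critical and also request auto-remediation.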
444
-
445
- ### Alert Best Practices
446
-
447
- ```yaml
448
- Alert Design:
449
-
450
- 1. Actionable:
451
- ❌ Bad: "Server CPU high"
452
- ✅ Good: "Web01 CPU >90% for 15min. Check runaway processes or scale up."
453
-
454
- 2. Contextual:
455
- Include:
456
- - Current value
457
- - Threshold
458
- - Duration
459
- - Impact
460
- - Runbook link
461
-
462
- 3. Threshold Tuning:
463
- - Start conservative (avoid alert fatigue)
464
- - Adjust based on normal patterns
465
- - Different thresholds for different times
466
- - Use anomaly detection
467
-
468
- 4. Alert Routing:
469
- - Route to responsible team
470
- - Escalate if not acknowledged
471
- - Different channels per severity
472
- - On-call rotation
473
-
474
- 5. Alert Deduplication:
475
- - Group related alerts
476
- - Suppress dependent alerts
477
- - Cooldown periods
478
- - Flapping detection
479
- ```
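The cooldown idea from the deduplication item above can be sketched as a small in-memory filter. This is an illustrative Python example only: the class name, the alert key format, and the choice to pass timestamps in explicitly are assumptions, and it does not cover grouping, dependency suppression, or flapping detection.

```python
class AlertDeduplicator:
    """Suppress repeats of the same alert key inside a cooldown window."""

    def __init__(self, cooldown_seconds: float):
        self.cooldown_seconds = cooldown_seconds
        self._last_sent = {}  # alert key -> timestamp of last notification

    def should_send(self, key: str, now: float) -> bool:
        """Return True if the alert should be delivered at time `now` (seconds)."""
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown_seconds:
            return False  # still cooling down, drop the duplicate
        self._last_sent[key] = now
        return True
```

Passing `now` as an argument keeps the sketch deterministic. A real deployment would more likely read a monotonic clock.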
480
-
481
- ## Capacity Management
482
-
483
- ### Capacity Planning Process
484
-
485
- ```yaml
486
- Capacity Planning Cycle:
487
-
488
- 1. Monitor Current Usage (Ongoing):
489
- - Track resource utilization
490
- - Identify trends
491
- - Collect metrics
492
-
493
- 2. Forecast Future Demand (Quarterly):
494
- - Business growth projections
495
- - Seasonal variations
496
- - New initiatives
497
- - Historical trends
498
-
499
- 3. Analyze Capacity (Quarterly):
500
- - Current vs forecasted demand
501
- - Time to resource exhaustion
502
- - Bottlenecks
503
- - Optimization opportunities
504
-
505
- 4. Plan Capacity Changes (Quarterly):
506
- - Procurement requirements
507
- - Budget approval
508
- - Implementation timeline
509
- - Risk mitigation
510
-
511
- 5. Implement Changes (As needed):
512
- - Procure resources
513
- - Deploy infrastructure
514
- - Validate capacity
515
- - Document changes
516
-
517
- 6. Review and Optimize (Monthly):
518
- - Actual vs plan
519
- - Cost efficiency
520
- - Performance impact
521
- - Lessons learned
522
- ```
523
-
524
- ### Capacity Metrics
525
-
526
- ```yaml
527
- Server Capacity:
528
-
529
- Current State:
530
- Total Servers: 250
531
- Average CPU: 45%
532
- Average Memory: 60%
533
- Average Disk: 55%
534
-
535
- Trend (6 months):
536
- CPU: ▲ 5% increase
537
- Memory: ▲ 8% increase
538
- Disk: ▲ 12% increase
539
-
540
- Forecast (Next 6 months):
541
- Expected CPU: 55% (10% headroom)
542
- Expected Memory: 75% (adequate)
543
- Expected Disk: 75% (adequate)
544
-
545
- Action Required:
546
- - None for CPU/Memory
547
- - Monitor disk growth
548
- - Plan storage expansion in Q2
549
-
550
- Storage Capacity:
551
-
552
- Current: 500 TB used / 750 TB total (67%)
553
- Growth Rate: 15 TB/month
554
- Forecast: 590 TB in 6 months (79%)
555
- Threshold: 80% (warning)
556
-
557
- Action:
558
- - Procure additional 250 TB
559
- - Timeline: Q2 2025
560
- - Budget: $50,000
561
-
562
- Network Capacity:
563
-
564
- Current: 1 Gbps links
565
- Peak Usage: 650 Mbps (65%)
566
- Growth: 5 percentage points of link capacity per quarter
567
- Forecast: 850 Mbps in 12 months (85%)
568
-
569
- Action:
570
- - Upgrade to 10 Gbps in Q3
571
- - Cost: $25,000
572
- - Provides 10x headroom
573
- ```
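The storage forecast above is a linear projection, which can be reproduced in a few lines. This is a minimal Python sketch that assumes constant monthly growth. The function names are illustrative.

```python
def forecast_usage(current: float, growth_per_month: float, months: float) -> float:
    """Projected usage after `months`, assuming constant linear growth."""
    return current + growth_per_month * months


def months_until_threshold(current, total, growth_per_month, threshold_fraction):
    """Months until usage reaches threshold_fraction of total capacity."""
    limit = total * threshold_fraction
    if current >= limit:
        return 0.0
    if growth_per_month <= 0:
        return float("inf")
    return (limit - current) / growth_per_month
```

With the figures above, `forecast_usage(500, 15, 6)` gives 590 TB, which is about 79% of 750 TB. The 80% warning threshold (600 TB) would be reached after roughly 6.7 months.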
574
-
575
- ### Capacity Reporting
576
-
577
- ```yaml
578
- Monthly Capacity Report:
579
-
580
- Executive Summary:
581
- - All systems within capacity targets
582
- - Storage requiring expansion in 6 months
583
- - Network upgrade planned Q3
584
- - No immediate concerns
585
-
586
- Current Utilization:
587
- Compute: 45% (Low ✅)
588
- Memory: 60% (Moderate ✅)
589
- Storage: 67% (Moderate ✅)
590
- Network: 65% (Moderate ✅)
591
-
592
- Trends:
593
- - Steady 5% quarterly compute growth
594
- - Storage growth accelerating (cleanup needed)
595
- - Network stable
596
-
597
- Forecasts:
598
- Next 6 Months:
599
- - Compute: Adequate capacity
600
- - Storage: Approaching limit (action required)
601
- - Network: Adequate capacity
602
-
603
- Next 12 Months:
604
- - Compute: Adequate capacity
605
- - Storage: Expansion required
606
- - Network: Upgrade recommended
607
-
608
- Actions:
609
- - Storage procurement initiated
610
- - Network upgrade planning started
611
- - Cost: $75,000 (approved)
612
- ```
613
-
614
- ## Best Practices
615
-
616
- ### 1. Monitoring Coverage
617
-
618
- ```yaml
619
- Ensure Comprehensive Coverage:
620
-
621
- Infrastructure:
622
- - All production servers
623
- - Network devices
624
- - Storage systems
625
- - Virtualization platforms
626
-
627
- Applications:
628
- - All critical applications
629
- - Key transactions
630
- - Dependencies
631
- - APIs
632
-
633
- Business:
634
- - Revenue metrics
635
- - User experience
636
- - SLA compliance
637
- - Customer satisfaction
638
- ```
639
-
640
- ### 2. Baseline Establishment
641
-
642
- ```yaml
643
- Establish Performance Baselines:
644
-
645
- Process:
646
- 1. Collect metrics for 30 days
647
- 2. Analyze patterns (daily, weekly)
648
- 3. Calculate normal ranges
649
- 4. Set thresholds above baseline
650
- 5. Review quarterly
651
-
652
- Example:
653
- Metric: Application response time
654
- Baseline (p95): 450ms
655
- Warning: 600ms (133% of baseline)
656
- Critical: 900ms (200% of baseline)
657
- ```
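The baseline process above can be sketched as code: take the collected samples, compute the p95 baseline, and place thresholds at fixed multiples of it. This minimal Python example assumes a nearest-rank p95 and uses the 133% and 200% factors from the example. The function name and return shape are hypothetical.

```python
import math


def baseline_thresholds(samples_ms, warning_factor=4 / 3, critical_factor=2.0):
    """Derive warning/critical thresholds from the p95 of collected samples."""
    if not samples_ms:
        raise ValueError("samples_ms must not be empty")
    ordered = sorted(samples_ms)
    rank = max(1, math.ceil(95 * len(ordered) / 100))  # nearest-rank p95
    baseline = ordered[rank - 1]
    return {
        "baseline_p95": baseline,
        "warning": round(baseline * warning_factor),
        "critical": round(baseline * critical_factor),
    }
```

A sample set whose p95 is 450 ms produces a 600 ms warning threshold and a 900 ms critical threshold, matching the example above.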
658
-
659
- ### 3. Alert Fatigue Prevention
660
-
661
- ```yaml
662
- Avoid Alert Fatigue:
663
-
664
- Strategies:
665
- - Tune thresholds (reduce false positives)
666
- - Use intelligent alerting (anomaly detection)
667
- - Implement alert aggregation
668
- - Regular alert review and cleanup
669
- - Auto-remediation where possible
670
-
671
- Metrics:
672
- - Alert volume: <100/day
673
- - Alert-to-incident ratio: >50%
674
- - False positive rate: <10%
675
- - Time to acknowledge: <5 minutes
676
- ```
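Two of the alert-quality metrics above can be tracked from a list of reviewed alerts. This is a minimal Python sketch that assumes each alert record carries two boolean fields, `led_to_incident` and `false_positive`, both hypothetical names. It reads "alert-to-incident ratio" as the share of alerts that corresponded to a real incident.

```python
def alert_quality(alerts):
    """Summarize reviewed alerts as fractions between 0 and 1."""
    total = len(alerts)
    if total == 0:
        return {"alert_to_incident_ratio": 0.0, "false_positive_rate": 0.0}
    incidents = sum(1 for a in alerts if a["led_to_incident"])
    false_positives = sum(1 for a in alerts if a["false_positive"])
    return {
        "alert_to_incident_ratio": incidents / total,
        "false_positive_rate": false_positives / total,
    }
```

Comparing the returned fractions with the targets above (more than 0.5 and less than 0.1, respectively) shows whether the thresholds need further tuning.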
677
-
678
- ### 4. Correlation and Root Cause
679
-
680
- ```yaml
681
- Use Correlation for RCA:
682
-
683
- Approach:
684
- - Correlate metrics across layers
685
- - Identify cascading failures
686
- - Trace requests end-to-end
687
- - Link logs to metrics
688
- - Use dependency mapping
689
-
690
- Example:
691
- Symptom: Application slow
692
- Correlation:
693
- - Application response time ↑
694
- - Database query time ↑
695
- - Database disk I/O ↑
696
- - Storage latency ↑
697
- Root Cause: Storage array degraded disk
698
- ```
699
-
700
- ### 5. Continuous Improvement
701
-
702
- ```yaml
703
- Monitoring Improvement Process:
704
-
705
- Monthly:
706
- - Review alert effectiveness
707
- - Tune thresholds
708
- - Add missing metrics
709
- - Update dashboards
710
-
711
- Quarterly:
712
- - Capacity planning review
713
- - Tool evaluation
714
- - Process optimization
715
- - Team training
716
-
717
- Annually:
718
- - Technology refresh
719
- - Tool consolidation
720
- - Architecture review
721
- - Strategy planning
722
- ```
723
-
724
- ---
725
-
726
- **Related Resources:**
727
- - [incident-service-management.md](incident-service-management.md) - Incident response
728
- - [business-continuity.md](business-continuity.md) - DR monitoring
729
- - [automation-orchestration.md](automation-orchestration.md) - Automated remediation