blockmine 1.21.0 → 1.22.0
- package/.claude/agents/README.md +469 -0
- package/.claude/agents/auth-route-debugger.md +118 -0
- package/.claude/agents/auth-route-tester.md +93 -0
- package/.claude/agents/auto-error-resolver.md +97 -0
- package/.claude/agents/build-optimizer.md +236 -0
- package/.claude/agents/code-architecture-reviewer.md +83 -0
- package/.claude/agents/code-refactor-master.md +94 -0
- package/.claude/agents/cost-optimizer.md +134 -0
- package/.claude/agents/deployment-orchestrator.md +113 -0
- package/.claude/agents/documentation-architect.md +82 -0
- package/.claude/agents/frontend-error-fixer.md +77 -0
- package/.claude/agents/iac-code-generator.md +71 -0
- package/.claude/agents/incident-responder.md +346 -0
- package/.claude/agents/infrastructure-architect.md +31 -0
- package/.claude/agents/kubernetes-specialist.md +56 -0
- package/.claude/agents/migration-planner.md +181 -0
- package/.claude/agents/network-architect.md +196 -0
- package/.claude/agents/plan-reviewer.md +52 -0
- package/.claude/agents/refactor-planner.md +63 -0
- package/.claude/agents/security-scanner.md +102 -0
- package/.claude/agents/web-research-specialist.md +78 -0
- package/.claude/commands/cost-analysis.md +315 -0
- package/.claude/commands/dev-docs-update.md +55 -0
- package/.claude/commands/dev-docs.md +51 -0
- package/.claude/commands/incident-debug.md +247 -0
- package/.claude/commands/infra-plan.md +81 -0
- package/.claude/commands/migration-plan.md +478 -0
- package/.claude/commands/route-research-for-testing.md +37 -0
- package/.claude/commands/security-review.md +66 -0
- package/.claude/hooks/CONFIG.md +448 -0
- package/.claude/hooks/README.md +163 -0
- package/.claude/hooks/SKILL_ACTIVATION_COMPLETE.md +226 -0
- package/.claude/hooks/WINDOWS_HOOKS_README.md +151 -0
- package/.claude/hooks/add-skill-activation-banners.ts +132 -0
- package/.claude/hooks/comprehensive-skill-test.ts +1315 -0
- package/.claude/hooks/error-handling-reminder.sh +12 -0
- package/.claude/hooks/error-handling-reminder.ts +222 -0
- package/.claude/hooks/k8s-manifest-validator.sh +56 -0
- package/.claude/hooks/package-lock.json +556 -0
- package/.claude/hooks/package.json +16 -0
- package/.claude/hooks/post-tool-use-tracker.ps1 +174 -0
- package/.claude/hooks/post-tool-use-tracker.sh +183 -0
- package/.claude/hooks/security-policy-check.sh +247 -0
- package/.claude/hooks/skill-activation-prompt.ps1 +10 -0
- package/.claude/hooks/skill-activation-prompt.sh +10 -0
- package/.claude/hooks/skill-activation-prompt.ts +141 -0
- package/.claude/hooks/stop-build-check-enhanced.sh +130 -0
- package/.claude/hooks/terraform-validator.sh +53 -0
- package/.claude/hooks/test-input.json +7 -0
- package/.claude/hooks/test-skill-activation.ts +427 -0
- package/.claude/hooks/trigger-build-resolver.sh +79 -0
- package/.claude/hooks/tsc-check.sh +173 -0
- package/.claude/hooks/tsconfig.json +19 -0
- package/.claude/settings.json +55 -0
- package/.claude/settings.local.json +27 -14
- package/.claude/skills/README.md +507 -0
- package/.claude/skills/api-engineering/SKILL.md +63 -0
- package/.claude/skills/api-engineering/resources/api-versioning.md +88 -0
- package/.claude/skills/api-engineering/resources/graphql-patterns.md +106 -0
- package/.claude/skills/api-engineering/resources/rate-limiting.md +118 -0
- package/.claude/skills/api-engineering/resources/rest-api-design.md +105 -0
- package/.claude/skills/backend-dev-guidelines/SKILL.md +306 -0
- package/.claude/skills/backend-dev-guidelines/resources/architecture-overview.md +451 -0
- package/.claude/skills/backend-dev-guidelines/resources/async-and-errors.md +307 -0
- package/.claude/skills/backend-dev-guidelines/resources/complete-examples.md +638 -0
- package/.claude/skills/backend-dev-guidelines/resources/configuration.md +275 -0
- package/.claude/skills/backend-dev-guidelines/resources/database-patterns.md +224 -0
- package/.claude/skills/backend-dev-guidelines/resources/middleware-guide.md +213 -0
- package/.claude/skills/backend-dev-guidelines/resources/routing-and-controllers.md +756 -0
- package/.claude/skills/backend-dev-guidelines/resources/sentry-and-monitoring.md +336 -0
- package/.claude/skills/backend-dev-guidelines/resources/services-and-repositories.md +789 -0
- package/.claude/skills/backend-dev-guidelines/resources/testing-guide.md +235 -0
- package/.claude/skills/backend-dev-guidelines/resources/validation-patterns.md +754 -0
- package/.claude/skills/budget-and-cost-management/SKILL.md +850 -0
- package/.claude/skills/build-engineering/SKILL.md +431 -0
- package/.claude/skills/build-engineering/resources/artifact-repositories.md +72 -0
- package/.claude/skills/build-engineering/resources/build-caching.md +96 -0
- package/.claude/skills/build-engineering/resources/build-pipelines.md +105 -0
- package/.claude/skills/build-engineering/resources/build-security.md +95 -0
- package/.claude/skills/build-engineering/resources/build-systems.md +389 -0
- package/.claude/skills/build-engineering/resources/compilation-optimization.md +201 -0
- package/.claude/skills/build-engineering/resources/dependency-management.md +73 -0
- package/.claude/skills/build-engineering/resources/monorepo-builds.md +110 -0
- package/.claude/skills/build-engineering/resources/performance-optimization.md +113 -0
- package/.claude/skills/build-engineering/resources/reproducible-builds.md +82 -0
- package/.claude/skills/cloud-engineering/SKILL.md +675 -0
- package/.claude/skills/cloud-engineering/resources/aws-patterns.md +742 -0
- package/.claude/skills/cloud-engineering/resources/azure-patterns.md +714 -0
- package/.claude/skills/cloud-engineering/resources/cleared-cloud-environments.md +987 -0
- package/.claude/skills/cloud-engineering/resources/cloud-cost-optimization.md +757 -0
- package/.claude/skills/cloud-engineering/resources/cloud-networking.md +1058 -0
- package/.claude/skills/cloud-engineering/resources/cloud-security-tools.md +1530 -0
- package/.claude/skills/cloud-engineering/resources/cloud-security.md +990 -0
- package/.claude/skills/cloud-engineering/resources/gcp-patterns.md +758 -0
- package/.claude/skills/cloud-engineering/resources/migration-strategies.md +820 -0
- package/.claude/skills/cloud-engineering/resources/multi-cloud-strategies.md +670 -0
- package/.claude/skills/cloud-engineering/resources/oci-patterns.md +1198 -0
- package/.claude/skills/cloud-engineering/resources/serverless-patterns.md +795 -0
- package/.claude/skills/cloud-engineering/resources/well-architected-frameworks.md +966 -0
- package/.claude/skills/cybersecurity/SKILL.md +409 -0
- package/.claude/skills/cybersecurity/resources/security-architecture.md +266 -0
- package/.claude/skills/database-engineering/SKILL.md +61 -0
- package/.claude/skills/database-engineering/resources/backup-and-recovery.md +72 -0
- package/.claude/skills/database-engineering/resources/database-replication.md +63 -0
- package/.claude/skills/database-engineering/resources/postgresql-fundamentals.md +70 -0
- package/.claude/skills/database-engineering/resources/query-optimization.md +68 -0
- package/.claude/skills/devsecops/SKILL.md +374 -0
- package/.claude/skills/devsecops/resources/ci-cd-security.md +204 -0
- package/.claude/skills/devsecops/resources/compliance-automation.md +530 -0
- package/.claude/skills/devsecops/resources/compliance-frameworks.md +2322 -0
- package/.claude/skills/devsecops/resources/container-security.md +915 -0
- package/.claude/skills/devsecops/resources/cspm-integration.md +1440 -0
- package/.claude/skills/devsecops/resources/policy-enforcement.md +619 -0
- package/.claude/skills/devsecops/resources/secrets-management.md +755 -0
- package/.claude/skills/devsecops/resources/security-monitoring.md +146 -0
- package/.claude/skills/devsecops/resources/security-scanning.md +887 -0
- package/.claude/skills/devsecops/resources/security-testing.md +203 -0
- package/.claude/skills/devsecops/resources/supply-chain-security.md +518 -0
- package/.claude/skills/devsecops/resources/vulnerability-management.md +481 -0
- package/.claude/skills/devsecops/resources/zero-trust-architecture.md +177 -0
- package/.claude/skills/documentation-as-code/SKILL.md +323 -0
- package/.claude/skills/documentation-as-code/resources/api-documentation.md +90 -0
- package/.claude/skills/documentation-as-code/resources/changelog-management.md +79 -0
- package/.claude/skills/documentation-as-code/resources/diagram-generation.md +44 -0
- package/.claude/skills/documentation-as-code/resources/docs-as-code-workflow.md +99 -0
- package/.claude/skills/documentation-as-code/resources/documentation-automation.md +68 -0
- package/.claude/skills/documentation-as-code/resources/documentation-sites.md +79 -0
- package/.claude/skills/documentation-as-code/resources/markdown-best-practices.md +162 -0
- package/.claude/skills/documentation-as-code/resources/openapi-specification.md +77 -0
- package/.claude/skills/documentation-as-code/resources/readme-engineering.md +60 -0
- package/.claude/skills/documentation-as-code/resources/technical-writing-guide.md +202 -0
- package/.claude/skills/engineering-management/SKILL.md +356 -0
- package/.claude/skills/engineering-management/resources/career-ladders.md +609 -0
- package/.claude/skills/engineering-management/resources/hiring-and-assessment.md +555 -0
- package/.claude/skills/engineering-management/resources/one-on-one-guides.md +609 -0
- package/.claude/skills/engineering-management/resources/resource-planning.md +557 -0
- package/.claude/skills/engineering-management/resources/team-organization-patterns.md +491 -0
- package/.claude/skills/engineering-management/resources/technical-interviews.md +474 -0
- package/.claude/skills/engineering-operations-management/SKILL.md +817 -0
- package/.claude/skills/error-tracking/SKILL.md +379 -0
- package/.claude/skills/frontend-dev-guidelines/SKILL.md +403 -0
- package/.claude/skills/frontend-dev-guidelines/resources/common-patterns.md +331 -0
- package/.claude/skills/frontend-dev-guidelines/resources/complete-examples.md +872 -0
- package/.claude/skills/frontend-dev-guidelines/resources/component-patterns.md +502 -0
- package/.claude/skills/frontend-dev-guidelines/resources/data-fetching.md +767 -0
- package/.claude/skills/frontend-dev-guidelines/resources/file-organization.md +502 -0
- package/.claude/skills/frontend-dev-guidelines/resources/loading-and-error-states.md +501 -0
- package/.claude/skills/frontend-dev-guidelines/resources/performance.md +406 -0
- package/.claude/skills/frontend-dev-guidelines/resources/routing-guide.md +364 -0
- package/.claude/skills/frontend-dev-guidelines/resources/styling-guide.md +428 -0
- package/.claude/skills/frontend-dev-guidelines/resources/typescript-standards.md +418 -0
- package/.claude/skills/general-it-engineering/SKILL.md +393 -0
- package/.claude/skills/general-it-engineering/resources/asset-management.md +712 -0
- package/.claude/skills/general-it-engineering/resources/automation-orchestration.md +817 -0
- package/.claude/skills/general-it-engineering/resources/business-continuity.md +786 -0
- package/.claude/skills/general-it-engineering/resources/change-management.md +715 -0
- package/.claude/skills/general-it-engineering/resources/enterprise-monitoring.md +729 -0
- package/.claude/skills/general-it-engineering/resources/help-desk-operations.md +738 -0
- package/.claude/skills/general-it-engineering/resources/incident-service-management.md +834 -0
- package/.claude/skills/general-it-engineering/resources/it-governance.md +753 -0
- package/.claude/skills/general-it-engineering/resources/itil-framework.md +503 -0
- package/.claude/skills/general-it-engineering/resources/service-management.md +669 -0
- package/.claude/skills/infrastructure-architecture/SKILL.md +328 -0
- package/.claude/skills/infrastructure-architecture/resources/architecture-decision-records.md +505 -0
- package/.claude/skills/infrastructure-architecture/resources/architecture-patterns.md +528 -0
- package/.claude/skills/infrastructure-architecture/resources/capacity-planning.md +453 -0
- package/.claude/skills/infrastructure-architecture/resources/cleared-environment-architecture.md +773 -0
- package/.claude/skills/infrastructure-architecture/resources/cost-architecture.md +499 -0
- package/.claude/skills/infrastructure-architecture/resources/data-architecture.md +501 -0
- package/.claude/skills/infrastructure-architecture/resources/disaster-recovery.md +535 -0
- package/.claude/skills/infrastructure-architecture/resources/migration-architecture.md +512 -0
- package/.claude/skills/infrastructure-architecture/resources/multi-region-design.md +608 -0
- package/.claude/skills/infrastructure-architecture/resources/reference-architectures.md +562 -0
- package/.claude/skills/infrastructure-architecture/resources/security-architecture.md +538 -0
- package/.claude/skills/infrastructure-architecture/resources/system-design-principles.md +489 -0
- package/.claude/skills/infrastructure-architecture/resources/workload-classification.md +1000 -0
- package/.claude/skills/infrastructure-strategy/SKILL.md +924 -0
- package/.claude/skills/network-engineering/SKILL.md +385 -0
- package/.claude/skills/network-engineering/resources/dns-management.md +738 -0
- package/.claude/skills/network-engineering/resources/load-balancing.md +820 -0
- package/.claude/skills/network-engineering/resources/network-architecture.md +546 -0
- package/.claude/skills/network-engineering/resources/network-security.md +921 -0
- package/.claude/skills/network-engineering/resources/network-troubleshooting.md +749 -0
- package/.claude/skills/network-engineering/resources/routing-switching.md +373 -0
- package/.claude/skills/network-engineering/resources/sdn-networking.md +695 -0
- package/.claude/skills/network-engineering/resources/service-mesh-networking.md +777 -0
- package/.claude/skills/network-engineering/resources/tcp-ip-protocols.md +444 -0
- package/.claude/skills/network-engineering/resources/vpn-connectivity.md +672 -0
- package/.claude/skills/observability-engineering/SKILL.md +101 -0
- package/.claude/skills/observability-engineering/resources/apm-tools.md +97 -0
- package/.claude/skills/observability-engineering/resources/correlation-strategies.md +87 -0
- package/.claude/skills/observability-engineering/resources/distributed-tracing.md +98 -0
- package/.claude/skills/observability-engineering/resources/logs-aggregation.md +118 -0
- package/.claude/skills/observability-engineering/resources/observability-cost-optimization.md +141 -0
- package/.claude/skills/observability-engineering/resources/opentelemetry.md +110 -0
- package/.claude/skills/platform-engineering/SKILL.md +555 -0
- package/.claude/skills/platform-engineering/resources/architecture-overview.md +600 -0
- package/.claude/skills/platform-engineering/resources/container-orchestration.md +916 -0
- package/.claude/skills/platform-engineering/resources/cost-optimization.md +634 -0
- package/.claude/skills/platform-engineering/resources/developer-platforms.md +670 -0
- package/.claude/skills/platform-engineering/resources/gitops-automation.md +650 -0
- package/.claude/skills/platform-engineering/resources/infrastructure-as-code.md +778 -0
- package/.claude/skills/platform-engineering/resources/infrastructure-standards.md +708 -0
- package/.claude/skills/platform-engineering/resources/multi-tenancy.md +602 -0
- package/.claude/skills/platform-engineering/resources/platform-security.md +711 -0
- package/.claude/skills/platform-engineering/resources/resource-management.md +592 -0
- package/.claude/skills/platform-engineering/resources/service-mesh.md +628 -0
- package/.claude/skills/release-engineering/SKILL.md +393 -0
- package/.claude/skills/release-engineering/resources/artifact-management.md +108 -0
- package/.claude/skills/release-engineering/resources/build-optimization.md +84 -0
- package/.claude/skills/release-engineering/resources/ci-cd-pipelines.md +411 -0
- package/.claude/skills/release-engineering/resources/deployment-strategies.md +197 -0
- package/.claude/skills/release-engineering/resources/pipeline-security.md +62 -0
- package/.claude/skills/release-engineering/resources/progressive-delivery.md +83 -0
- package/.claude/skills/release-engineering/resources/release-automation.md +68 -0
- package/.claude/skills/release-engineering/resources/release-orchestration.md +77 -0
- package/.claude/skills/release-engineering/resources/rollback-strategies.md +66 -0
- package/.claude/skills/release-engineering/resources/versioning-strategies.md +59 -0
- package/.claude/skills/route-tester/SKILL.md +392 -0
- package/.claude/skills/skill-developer/ADVANCED.md +197 -0
- package/.claude/skills/skill-developer/HOOK_MECHANISMS.md +306 -0
- package/.claude/skills/skill-developer/PATTERNS_LIBRARY.md +152 -0
- package/.claude/skills/skill-developer/SKILL.md +430 -0
- package/.claude/skills/skill-developer/SKILL_RULES_REFERENCE.md +315 -0
- package/.claude/skills/skill-developer/TRIGGER_TYPES.md +305 -0
- package/.claude/skills/skill-developer/TROUBLESHOOTING.md +514 -0
- package/.claude/skills/skill-rules.json +2940 -0
- package/.claude/skills/sre/SKILL.md +464 -0
- package/.claude/skills/sre/resources/alerting-best-practices.md +282 -0
- package/.claude/skills/sre/resources/capacity-planning.md +226 -0
- package/.claude/skills/sre/resources/chaos-engineering.md +193 -0
- package/.claude/skills/sre/resources/disaster-recovery.md +232 -0
- package/.claude/skills/sre/resources/incident-management.md +436 -0
- package/.claude/skills/sre/resources/observability-stack.md +240 -0
- package/.claude/skills/sre/resources/on-call-runbooks.md +167 -0
- package/.claude/skills/sre/resources/performance-optimization.md +108 -0
- package/.claude/skills/sre/resources/reliability-patterns.md +183 -0
- package/.claude/skills/sre/resources/slo-sli-sla.md +464 -0
- package/.claude/skills/sre/resources/toil-reduction.md +145 -0
- package/.claude/skills/systems-engineering/SKILL.md +648 -0
- package/.claude/skills/systems-engineering/resources/automation-patterns.md +771 -0
- package/.claude/skills/systems-engineering/resources/configuration-management.md +998 -0
- package/.claude/skills/systems-engineering/resources/linux-administration.md +672 -0
- package/.claude/skills/systems-engineering/resources/networking-fundamentals.md +982 -0
- package/.claude/skills/systems-engineering/resources/performance-tuning.md +871 -0
- package/.claude/skills/systems-engineering/resources/powershell-scripting.md +482 -0
- package/.claude/skills/systems-engineering/resources/security-hardening.md +739 -0
- package/.claude/skills/systems-engineering/resources/shell-scripting.md +915 -0
- package/.claude/skills/systems-engineering/resources/storage-management.md +628 -0
- package/.claude/skills/systems-engineering/resources/system-monitoring.md +787 -0
- package/.claude/skills/systems-engineering/resources/troubleshooting-guide.md +753 -0
- package/.claude/skills/systems-engineering/resources/windows-administration.md +738 -0
- package/.claude/skills/technical-leadership/SKILL.md +728 -0
- package/CHANGELOG.md +90 -54
- package/README.md +94 -0
- package/backend/docs/SECRETS_DOCUMENTATION.md +327 -0
- package/backend/jest.config.js +59 -0
- package/backend/package-lock.json +6129 -0
- package/backend/package.json +16 -4
- package/backend/prisma/migrations/20251026104609_add_websocket_api/migration.sql +33 -0
- package/backend/prisma/schema.prisma +33 -0
- package/backend/src/__tests__/core/DependencyService.test.js +336 -0
- package/backend/src/__tests__/core/UserService.test.js +875 -0
- package/backend/src/__tests__/repositories/BaseRepository.test.js +146 -0
- package/backend/src/__tests__/repositories/BotRepository.test.js +118 -0
- package/backend/src/__tests__/repositories/CommandRepository.test.js +132 -0
- package/backend/src/__tests__/repositories/EventGraphRepository.test.js +93 -0
- package/backend/src/__tests__/repositories/GroupRepository.test.js +155 -0
- package/backend/src/__tests__/repositories/PermissionRepository.test.js +130 -0
- package/backend/src/__tests__/repositories/PluginRepository.test.js +107 -0
- package/backend/src/__tests__/repositories/ServerRepository.test.js +80 -0
- package/backend/src/__tests__/repositories/UserRepository.test.js +128 -0
- package/backend/src/__tests__/secretsFilter.test.js +425 -0
- package/backend/src/__tests__/services/BotLifecycleService.test.js +411 -0
- package/backend/src/__tests__/services/BotProcessManager.test.js +285 -0
- package/backend/src/__tests__/services/CacheManager.test.js +125 -0
- package/backend/src/__tests__/services/CommandExecutionService.test.js +460 -0
- package/backend/src/__tests__/services/ResourceMonitorService.test.js +207 -0
- package/backend/src/__tests__/services/TelemetryService.test.js +291 -0
- package/backend/src/__tests__/setup.js +25 -0
- package/backend/src/api/routes/apiKeys.js +181 -0
- package/backend/src/api/routes/bots.js +49 -7
- package/backend/src/api/routes/plugins.js +2 -1
- package/backend/src/api/routes/system.js +174 -0
- package/backend/src/container.js +82 -0
- package/backend/src/core/BotManager.js +142 -871
- package/backend/src/core/BotManager.old.js +1093 -0
- package/backend/src/core/BotProcess.js +1092 -858
- package/backend/src/core/EventGraphManager.js +280 -198
- package/backend/src/core/GraphExecutionEngine.js +321 -325
- package/backend/src/core/MessageQueue.js +27 -6
- package/backend/src/core/NodeRegistry.js +37 -1134
- package/backend/src/core/PluginManager.js +62 -12
- package/backend/src/core/PrismaService.js +32 -0
- package/backend/src/core/UserService.js +3 -3
- package/backend/src/core/__tests__/PrismaService.test.js +24 -0
- package/backend/src/core/commands/README.md +305 -0
- package/backend/src/core/commands/dev.js +13 -7
- package/backend/src/core/commands/ping.js +10 -4
- package/backend/src/core/commands/whois.js +63 -0
- package/backend/src/core/config/validation.js +27 -0
- package/backend/src/core/constants/graphTypes.js +21 -0
- package/backend/src/core/node-registries/actions.js +132 -0
- package/backend/src/core/node-registries/arrays.js +137 -0
- package/backend/src/core/node-registries/bot.js +23 -0
- package/backend/src/core/node-registries/data.js +290 -0
- package/backend/src/core/node-registries/debug.js +26 -0
- package/backend/src/core/node-registries/events.js +187 -0
- package/backend/src/core/node-registries/flow.js +139 -0
- package/backend/src/core/node-registries/logic.js +45 -0
- package/backend/src/core/node-registries/math.js +42 -0
- package/backend/src/core/node-registries/objects.js +98 -0
- package/backend/src/core/node-registries/strings.js +153 -0
- package/backend/src/core/node-registries/time.js +113 -0
- package/backend/src/core/node-registries/users.js +79 -0
- package/backend/src/core/nodes/{action_bot_look_at.js → actions/bot_look_at.js} +36 -36
- package/backend/src/core/nodes/{action_bot_set_variable.js → actions/bot_set_variable.js} +32 -32
- package/backend/src/core/nodes/{action_send_log.js → actions/send_log.js} +28 -23
- package/backend/src/core/nodes/{action_send_message.js → actions/send_message.js} +32 -32
- package/backend/src/core/nodes/actions/send_websocket_response.js +33 -0
- package/backend/src/core/nodes/arrays/get_next.js +35 -0
- package/backend/src/core/nodes/{data_cast.js → data/cast.js} +8 -0
- package/backend/src/core/nodes/data/datetime_literal.js +27 -0
- package/backend/src/core/nodes/data/entity_info.js +69 -0
- package/backend/src/core/nodes/data/get_nearby_entities.js +32 -0
- package/backend/src/core/nodes/data/get_nearby_players.js +64 -0
- package/backend/src/core/nodes/{data_get_user_field.js → data/get_user_field.js} +1 -1
- package/backend/src/core/nodes/data/type_check.js +53 -0
- package/backend/src/core/nodes/{debug_log.js → debug/log.js} +16 -16
- package/backend/src/core/nodes/{flow_branch.js → flow/branch.js} +15 -15
- package/backend/src/core/nodes/{flow_break.js → flow/break.js} +14 -14
- package/backend/src/core/nodes/flow/delay.js +43 -0
- package/backend/src/core/nodes/{flow_for_each.js → flow/for_each.js} +39 -39
- package/backend/src/core/nodes/{flow_sequence.js → flow/sequence.js} +16 -16
- package/backend/src/core/nodes/{flow_switch.js → flow/switch.js} +47 -47
- package/backend/src/core/nodes/{flow_while.js → flow/while.js} +1 -1
- package/backend/src/core/nodes/logic/__tests__/compare.test.js +83 -0
- package/backend/src/core/nodes/math/__tests__/operation.test.js +65 -0
- package/backend/src/core/nodes/strings/__tests__/concat.test.js +89 -0
- package/backend/src/core/nodes/time/__tests__/now.test.js +24 -0
- package/backend/src/core/nodes/time/add.js +33 -0
- package/backend/src/core/nodes/time/compare.js +35 -0
- package/backend/src/core/nodes/time/diff.js +29 -0
- package/backend/src/core/nodes/time/format.js +32 -0
- package/backend/src/core/nodes/time/now.js +18 -0
- package/backend/src/core/nodes/{user_check_blacklist.js → users/check_blacklist.js} +37 -37
- package/backend/src/core/nodes/{user_get_groups.js → users/get_groups.js} +36 -36
- package/backend/src/core/nodes/{user_get_permissions.js → users/get_permissions.js} +36 -36
- package/backend/src/core/nodes/{user_set_blacklist.js → users/set_blacklist.js} +37 -37
- package/backend/src/core/services/BotLifecycleService.js +596 -0
- package/backend/src/core/services/BotProcessManager.js +163 -0
- package/backend/src/core/services/CacheManager.js +111 -0
- package/backend/src/core/services/CommandExecutionService.js +351 -0
- package/backend/src/core/services/ResourceMonitorService.js +90 -0
- package/backend/src/core/services/TelemetryService.js +124 -0
- package/backend/src/core/services/ValidationService.js +132 -0
- package/backend/src/core/services/__tests__/ValidationService.test.js +148 -0
- package/backend/src/core/services.js +20 -5
- package/backend/src/core/system/CommandContext.js +84 -0
- package/backend/src/core/system/Transport.js +78 -0
- package/backend/src/core/utils/__tests__/jsonParser.test.js +44 -0
- package/backend/src/core/utils/jsonParser.js +18 -0
- package/backend/src/core/utils/secretsFilter.js +262 -0
- package/backend/src/core/utils/variableParser.js +89 -0
- package/backend/src/core/validation/__tests__/nodeSchemas.test.js +175 -0
- package/backend/src/core/validation/nodeSchemas.js +112 -0
- package/backend/src/lib/prisma.js +2 -4
- package/backend/src/real-time/botApi/handlers/commandHandlers.js +28 -0
- package/backend/src/real-time/botApi/handlers/graphHandlers.js +99 -0
- package/backend/src/real-time/botApi/handlers/graphWebSocketHandlers.js +147 -0
- package/backend/src/real-time/botApi/handlers/index.js +43 -0
- package/backend/src/real-time/botApi/handlers/messageHandlers.js +66 -0
- package/backend/src/real-time/botApi/handlers/statusHandlers.js +17 -0
- package/backend/src/real-time/botApi/handlers/userHandlers.js +141 -0
- package/backend/src/real-time/botApi/index.js +40 -0
- package/backend/src/real-time/botApi/middleware.js +79 -0
- package/backend/src/real-time/botApi/utils.js +54 -0
- package/backend/src/real-time/socketHandler.js +6 -2
- package/backend/src/repositories/BaseRepository.js +43 -0
- package/backend/src/repositories/BotRepository.js +42 -0
- package/backend/src/repositories/CommandRepository.js +53 -0
- package/backend/src/repositories/EventGraphRepository.js +40 -0
- package/backend/src/repositories/GroupRepository.js +69 -0
- package/backend/src/repositories/PermissionRepository.js +48 -0
- package/backend/src/repositories/PluginRepository.js +42 -0
- package/backend/src/repositories/ServerRepository.js +27 -0
- package/backend/src/repositories/UserRepository.js +48 -0
- package/backend/src/server.js +3 -0
- package/backend/src/test-refactor.js +85 -0
- package/frontend/dist/assets/index-CfTo92bP.css +1 -0
- package/frontend/dist/assets/index-CiFD5X9Z.js +8344 -0
- package/frontend/dist/index.html +2 -2
- package/frontend/package.json +0 -5
- package/package.json +2 -1
- package/frontend/dist/assets/index-B9GedHEa.js +0 -8352
- package/frontend/dist/assets/index-zLiy9MDx.css +0 -1
- package/nul +0 -0
- /package/backend/src/core/nodes/{action_http_request.js → actions/http_request.js} +0 -0
- /package/backend/src/core/nodes/{array_add_element.js → arrays/add_element.js} +0 -0
- /package/backend/src/core/nodes/{array_contains.js → arrays/contains.js} +0 -0
- /package/backend/src/core/nodes/{array_find_index.js → arrays/find_index.js} +0 -0
- /package/backend/src/core/nodes/{array_get_by_index.js → arrays/get_by_index.js} +0 -0
- /package/backend/src/core/nodes/{array_get_random_element.js → arrays/get_random_element.js} +0 -0
- /package/backend/src/core/nodes/{array_remove_by_index.js → arrays/remove_by_index.js} +0 -0
- /package/backend/src/core/nodes/{bot_get_position.js → bot/get_position.js} +0 -0
- /package/backend/src/core/nodes/{data_array_literal.js → data/array_literal.js} +0 -0
- /package/backend/src/core/nodes/{data_boolean_literal.js → data/boolean_literal.js} +0 -0
- /package/backend/src/core/nodes/{data_get_argument.js → data/get_argument.js} +0 -0
- /package/backend/src/core/nodes/{data_get_bot_look.js → data/get_bot_look.js} +0 -0
- /package/backend/src/core/nodes/{data_get_entity_field.js → data/get_entity_field.js} +0 -0
- /package/backend/src/core/nodes/{data_get_server_players.js → data/get_server_players.js} +0 -0
- /package/backend/src/core/nodes/{data_get_variable.js → data/get_variable.js} +0 -0
- /package/backend/src/core/nodes/{data_length.js → data/length.js} +0 -0
- /package/backend/src/core/nodes/{data_make_object.js → data/make_object.js} +0 -0
- /package/backend/src/core/nodes/{data_number_literal.js → data/number_literal.js} +0 -0
- /package/backend/src/core/nodes/{data_string_literal.js → data/string_literal.js} +0 -0
- /package/backend/src/core/nodes/{logic_compare.js → logic/compare.js} +0 -0
- /package/backend/src/core/nodes/{logic_operation.js → logic/operation.js} +0 -0
- /package/backend/src/core/nodes/{math_operation.js → math/operation.js} +0 -0
- /package/backend/src/core/nodes/{math_random_number.js → math/random_number.js} +0 -0
- /package/backend/src/core/nodes/{object_create.js → objects/create.js} +0 -0
- /package/backend/src/core/nodes/{object_delete.js → objects/delete.js} +0 -0
- /package/backend/src/core/nodes/{object_get.js → objects/get.js} +0 -0
- /package/backend/src/core/nodes/{object_has_key.js → objects/has_key.js} +0 -0
- /package/backend/src/core/nodes/{object_set.js → objects/set.js} +0 -0
- /package/backend/src/core/nodes/{string_concat.js → strings/concat.js} +0 -0
- /package/backend/src/core/nodes/{string_contains.js → strings/contains.js} +0 -0
- /package/backend/src/core/nodes/{string_ends_with.js → strings/ends_with.js} +0 -0
- /package/backend/src/core/nodes/{string_equals.js → strings/equals.js} +0 -0
- /package/backend/src/core/nodes/{string_length.js → strings/length.js} +0 -0
- /package/backend/src/core/nodes/{string_matches.js → strings/matches.js} +0 -0
- /package/backend/src/core/nodes/{string_split.js → strings/split.js} +0 -0
- /package/backend/src/core/nodes/{string_starts_with.js → strings/starts_with.js} +0 -0
@@ -0,0 +1,535 @@
+# Disaster Recovery
+
+Comprehensive guide to disaster recovery planning, RTO/RPO requirements, backup strategies, and failover procedures.
+
+## Core Concepts
+
+### RTO (Recovery Time Objective)
+
+**Definition:** Maximum acceptable time to restore service after disaster.
+
+**Examples:**
+- RTO = 1 hour: Service must be restored within 1 hour
+- RTO = 4 hours: Can tolerate 4-hour outage
+- RTO = 24 hours: Next business day acceptable
+
+**Cost Impact:**
+- Lower RTO = Higher cost (hot standby, automation)
+- Higher RTO = Lower cost (cold backup, manual restore)
|
|
19
|
+
|
|
20
|
+
### RPO (Recovery Point Objective)
|
|
21
|
+
|
|
22
|
+
**Definition:** Maximum acceptable data loss measured in time.
|
|
23
|
+
|
|
24
|
+
**Examples:**
|
|
25
|
+
- RPO = 0: Zero data loss (synchronous replication)
|
|
26
|
+
- RPO = 15 minutes: Can lose up to 15 minutes of data
|
|
27
|
+
- RPO = 24 hours: Daily backups acceptable
|
|
28
|
+
|
|
29
|
+
**Cost Impact:**
|
|
30
|
+
- Lower RPO = Higher cost (frequent backups, replication)
|
|
31
|
+
- Higher RPO = Lower cost (infrequent backups)
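
To make the two objectives concrete, here is a minimal Python sketch that checks one incident against an RTO and an RPO target. The function name `meets_objectives` and the example timestamps are hypothetical and only illustrate the definitions above.

```python
from datetime import datetime, timedelta


def meets_objectives(last_good_copy, failure_at, restored_at, rto, rpo):
    """Return (rto_met, rpo_met) for a single incident.

    Downtime runs from the failure to the restoration and is compared
    with the RTO. Data loss is the gap between the last recoverable copy
    and the failure, and is compared with the RPO.
    """
    downtime = restored_at - failure_at
    data_loss = failure_at - last_good_copy
    return downtime <= rto, data_loss <= rpo


# Hypothetical incident: nightly backup at 02:00, failure at 14:00,
# service restored at 17:30, with targets RTO = 4 hours and RPO = 1 hour.
result = meets_objectives(
    last_good_copy=datetime(2024, 1, 15, 2, 0),
    failure_at=datetime(2024, 1, 15, 14, 0),
    restored_at=datetime(2024, 1, 15, 17, 30),
    rto=timedelta(hours=4),
    rpo=timedelta(hours=1),
)
print(result)  # (True, False)
```

The 3.5-hour outage fits the RTO, but 12 hours of writes since the last backup exceed the RPO, which is the usual argument for replication or more frequent backups.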
|
|
32
|
+
|
|
33
|
+
## DR Tiers
|
|
34
|
+
|
|
35
|
+
### Tier 0: No DR (Baseline)
|
|
36
|
+
|
|
37
|
+
**RTO:** Days to weeks
|
|
38
|
+
**RPO:** 24+ hours
|
|
39
|
+
**Cost:** 0% additional
|
|
40
|
+
**Method:** Rebuild from scratch
|
|
41
|
+
|
|
42
|
+
**Use Case:** Non-critical dev/test environments
|
|
43
|
+
|
|
44
|
+
---
|
|
45
|
+
|
|
46
|
+
### Tier 1: Backup & Restore
|
|
47
|
+
|
|
48
|
+
**RTO:** 12-24 hours
|
|
49
|
+
**RPO:** 24 hours
|
|
50
|
+
**Cost:** ~10% additional
|
|
51
|
+
**Method:** Daily backups to S3/Glacier
|
|
52
|
+
|
|
53
|
+
```terraform
# S3 backup bucket with lifecycle
resource "aws_s3_bucket" "backups" {
  bucket = "prod-database-backups"
}

resource "aws_s3_bucket_lifecycle_configuration" "backups" {
  bucket = aws_s3_bucket.backups.id

  rule {
    id     = "archive-old-backups"
    status = "Enabled"

    # Empty filter: apply the rule to every object in the bucket
    filter {}

    transition {
      days          = 30
      storage_class = "GLACIER"
    }

    expiration {
      days = 90
    }
  }
}

# Daily backup schedule (EventBridge rule; the target that runs the backup is attached separately)
resource "aws_cloudwatch_event_rule" "daily_backup" {
  name                = "daily-backup"
  schedule_expression = "cron(0 2 * * ? *)" # 2 AM UTC daily
}
```
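
As a rough illustration of what the lifecycle rule above means for a given backup, the following Python sketch classifies an object by its age. The function name `backup_state` is hypothetical, and the model is simplified: S3 applies lifecycle actions asynchronously, so real transitions can lag the configured day counts.

```python
def backup_state(age_days, transition_days=30, expiration_days=90):
    """Classify a backup object by age under a transition + expiration rule."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days >= expiration_days:
        return "EXPIRED"   # deleted by the expiration action
    if age_days >= transition_days:
        return "GLACIER"   # moved to archival storage
    return "STANDARD"      # still in the default storage class


for age in (0, 29, 30, 89, 90):
    print(age, backup_state(age))
```

With the defaults taken from the rule (30 and 90 days), a backup is restorable from standard storage for its first 30 days, from Glacier for the next 60, and is gone after 90.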
|
|
83
|
+
|
|
84
|
+
**Pros:** ✅ Low cost, ✅ Simple
|
|
85
|
+
**Cons:** ❌ Slow recovery, ❌ High data loss
|
|
86
|
+
|
|
87
|
+
---
|
|
88
|
+
|
|
89
|
+
### Tier 2: Pilot Light
|
|
90
|
+
|
|
91
|
+
**RTO:** 1-4 hours
|
|
92
|
+
**RPO:** 1 hour
|
|
93
|
+
**Cost:** ~20-30% additional
|
|
94
|
+
**Method:** Minimal secondary environment + continuous backup
|
|
95
|
+
|
|
96
|
+
```
Primary:              Secondary (Minimal):
┌────────────┐        ┌────────────┐
│Full Stack  │   →    │Core DB Only│
│- Web       │  Data  │(Replicated)│
│- API       │  Repl  │            │
│- Database  │        │            │
└────────────┘        └────────────┘
     100%                 5-10%
   Running               Running

During DR:
Secondary spins up full stack (1-4 hours)
```
|
|
110
|
+
|
|
111
|
+
**Implementation:**
|
|
112
|
+
```terraform
# Terraform: Secondary region (pilot light)
module "pilot_light_region" {
  source = "./modules/region"
  region = "eu-west-1"

  # Only run database replica
  enable_database = true
  enable_compute  = false # Spin up during DR

  database_config = {
    instance_class = "db.r5.large"
    replica_source = module.primary_region.database_arn
  }
}

# DR automation
# (fragment: role, runtime, handler, and deployment package are omitted)
resource "aws_lambda_function" "activate_dr" {
  function_name = "activate-disaster-recovery"

  # Spins up compute when triggered
  environment {
    variables = {
      SECONDARY_REGION = "eu-west-1"
      SCALE_UP_COMPUTE = "true"
    }
  }
}
```
|
|
141
|
+
|
|
142
|
+
**Pros:** ✅ Faster than backup/restore, ✅ Lower cost than warm standby
|
|
143
|
+
**Cons:** ❌ Manual intervention, ❌ 1-4 hour RTO
|
|
144
|
+
|
|
145
|
+
---
|
|
146
|
+
|
|
147
|
+
### Tier 3: Warm Standby
|
|
148
|
+
|
|
149
|
+
**RTO:** 5-30 minutes
|
|
150
|
+
**RPO:** 5-15 minutes
|
|
151
|
+
**Cost:** ~50-70% additional
|
|
152
|
+
**Method:** Scaled-down secondary environment always running
|
|
153
|
+
|
|
154
|
+
```
Primary:              Secondary (Scaled Down):
┌────────────┐        ┌────────────┐
│Full Stack  │   →    │Full Stack  │
│- 10 servers│  Data  │- 2 servers │
│- Large DB  │  Repl  │- Small DB  │
└────────────┘        └────────────┘
     100%                 20-30%
   Running               Running

During DR:
Secondary scales up to full capacity (5-30 min)
```
|
|
167
|
+
|
|
168
|
+
**Implementation:**
|
|
169
|
+
```terraform
# Auto-scaling group in secondary region
# (fragment: launch template and subnets are omitted)
resource "aws_autoscaling_group" "secondary" {
  name   = "api-secondary"
  region = "eu-west-1"

  # Normal: 2 instances (20% capacity)
  min_size         = 2
  desired_capacity = 2

  # DR: Scale to 10 instances (100% capacity)
  max_size = 10
}

# DR trigger increases desired capacity
resource "aws_autoscaling_policy" "dr_scale_up" {
  name                   = "dr-scale-up"
  autoscaling_group_name = aws_autoscaling_group.secondary.name

  # Scale to 10 when DR activated.
  # ExactCapacity sets the desired capacity to this value,
  # so it must be 10 (not the difference of 8).
  scaling_adjustment = 10
  adjustment_type    = "ExactCapacity"
}
```
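
The adjustment types of a simple scaling policy interpret `scaling_adjustment` differently: `ExactCapacity` sets the desired capacity to the given value, while `ChangeInCapacity` adds it to the current capacity. The Python sketch below models just that rule, under the assumption that the result is clamped to the group's min and max size. The function `apply_adjustment` is a hypothetical illustration, not part of any AWS SDK, and `PercentChangeInCapacity` is left out.

```python
def apply_adjustment(current, adjustment, adjustment_type, min_size, max_size):
    """Return the desired capacity after a simple scaling adjustment."""
    if adjustment_type == "ExactCapacity":
        target = adjustment              # absolute value
    elif adjustment_type == "ChangeInCapacity":
        target = current + adjustment    # relative change
    else:
        raise ValueError(f"unsupported adjustment type: {adjustment_type}")
    # The desired capacity stays within the group's size bounds.
    return max(min_size, min(max_size, target))


# Warm standby group: 2 instances running, bounds 2..10.
print(apply_adjustment(2, 8, "ExactCapacity", 2, 10))      # 8
print(apply_adjustment(2, 8, "ChangeInCapacity", 2, 10))   # 10
print(apply_adjustment(2, 10, "ExactCapacity", 2, 10))     # 10
```

To reach 10 instances from 2, the policy therefore needs either `ExactCapacity` with a value of 10 or `ChangeInCapacity` with a value of 8.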
|
|
193
|
+
|
|
194
|
+
**Pros:** ✅ Fast recovery, ✅ Can test regularly
|
|
195
|
+
**Cons:** ❌ Higher cost, ❌ Still requires scaling
|
|
196
|
+
|
|
197
|
+
---
|
|
198
|
+
|
|
199
|
+
### Tier 4: Hot Standby (Active-Active)
|
|
200
|
+
|
|
201
|
+
**RTO:** < 5 minutes (often seconds)
|
|
202
|
+
**RPO:** Near-zero
|
|
203
|
+
**Cost:** ~100-150% additional
|
|
204
|
+
**Method:** Full capacity in multiple regions
|
|
205
|
+
|
|
206
|
+
```
Primary & Secondary (Both Full):
┌────────────┐        ┌────────────┐
│Full Stack  │   ←→   │Full Stack  │
│- 10 servers│  Data  │- 10 servers│
│- Large DB  │  Sync  │- Large DB  │
└────────────┘        └────────────┘
     50%                   50%
   Traffic               Traffic
```
|
|
216
|
+
|
|
217
|
+
**Implementation:**
|
|
218
|
+
```terraform
# DynamoDB Global Tables (automatic multi-region)
# (fragment: hash_key and attribute definitions are omitted)
resource "aws_dynamodb_table" "users" {
  name         = "users"
  billing_mode = "PAY_PER_REQUEST"

  # Each replica region must differ from the region the table itself is created in
  replica {
    region_name = "us-east-1"
  }

  replica {
    region_name = "eu-west-1"
  }

  # Streams are required for global table replication;
  # concurrent writes are reconciled last-writer-wins
  stream_enabled   = true
  stream_view_type = "NEW_AND_OLD_IMAGES"
}

# Route 53 health check failover
resource "aws_route53_health_check" "us_east" {
  fqdn = "api.us-east-1.example.com"
  type = "HTTPS"
}

# (fragment: zone_id and the alias or records target are omitted)
resource "aws_route53_record" "api" {
  name = "api.example.com"
  type = "A"

  # Latency-based routing with failover
  set_identifier  = "us-east-1"
  health_check_id = aws_route53_health_check.us_east.id

  latency_routing_policy {
    region = "us-east-1"
  }
}
```
|
|
256
|
+
|
|
257
|
+
**Pros:** ✅ Fastest recovery, ✅ No user impact, ✅ Global performance
|
|
258
|
+
**Cons:** ❌ Highest cost, ❌ Complexity, ❌ Data consistency challenges
|
|
259
|
+
|
|
260
|
+
---
|
|
261
|
+
|
|
262
|
+
## Backup Strategies
|
|
263
|
+
|
|
264
|
+
### Database Backups
|
|
265
|
+
|
|
266
|
+
```terraform
# PostgreSQL automated backups
# (fragment: engine, instance class, storage, and credentials are omitted)
resource "aws_db_instance" "primary" {
  identifier = "prod-db"

  # Automated backups; a retention period above 0 also enables point-in-time restore
  backup_retention_period = 7             # Keep 7 days
  backup_window           = "03:00-04:00" # 3-4 AM UTC

  # Export PostgreSQL logs to CloudWatch Logs
  enabled_cloudwatch_logs_exports = ["postgresql"]
}
```

```bash
# Manual snapshot before major changes
aws rds create-db-snapshot \
  --db-instance-identifier prod-db \
  --db-snapshot-identifier pre-migration-2024-01-15
```
|
|
284
|
+
|
|
285
|
+
### Application State Backups
|
|
286
|
+
|
|
287
|
+
```yaml
# Velero: Kubernetes backup
apiVersion: velero.io/v1
kind: Schedule
metadata:
  name: daily-backup
spec:
  schedule: "0 2 * * *"  # 2 AM daily
  template:
    includedNamespaces:
      - production
    storageLocation: aws-s3-backups
    volumeSnapshotLocations:
      - aws-ebs-snapshots
    ttl: 720h  # 30 days retention
```
|
|
303
|
+
|
|
304
|
+
### 3-2-1 Backup Rule
|
|
305
|
+
|
|
306
|
+
**Best Practice:**
|
|
307
|
+
- **3** copies of data
|
|
308
|
+
- **2** different media types
|
|
309
|
+
- **1** off-site copy
|
|
310
|
+
|
|
311
|
+
**Example:**
|
|
312
|
+
```
|
|
313
|
+
1. Production database (primary)
|
|
314
|
+
2. Local replica (secondary)
|
|
315
|
+
3. S3 backup (off-site)
|
|
316
|
+
|
|
317
|
+
Media types:
|
|
318
|
+
- EBS volumes (database)
|
|
319
|
+
- S3 (object storage)
|
|
320
|
+
```
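
The rule is simple enough to check mechanically. The Python sketch below validates a backup inventory against 3-2-1; the `Copy` dataclass and the inventory entries are hypothetical and mirror the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    name: str
    media: str      # storage medium, e.g. "ebs" or "s3"
    offsite: bool   # True if stored away from the primary site


def satisfies_3_2_1(copies):
    """Check for 3 copies, 2 distinct media types, and 1 off-site copy."""
    return (
        len(copies) >= 3
        and len({c.media for c in copies}) >= 2
        and any(c.offsite for c in copies)
    )


inventory = [
    Copy("production database", "ebs", offsite=False),
    Copy("local replica", "ebs", offsite=False),
    Copy("s3 backup", "s3", offsite=True),
]
print(satisfies_3_2_1(inventory))      # True
print(satisfies_3_2_1(inventory[:2]))  # False
```

Dropping the S3 backup fails all three conditions at once: only two copies remain, both on EBS, and neither is off-site.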
|
|
321
|
+
|
|
322
|
+
---
|
|
323
|
+
|
|
324
|
+
## Testing DR Plans
|
|
325
|
+
|
|
326
|
+
### Monthly DR Drills
|
|
327
|
+
|
|
328
|
+
```markdown
|
|
329
|
+
# DR Drill Checklist
|
|
330
|
+
|
|
331
|
+
## Preparation (1 week before)
|
|
332
|
+
- [ ] Schedule 2-hour window
|
|
333
|
+
- [ ] Notify stakeholders
|
|
334
|
+
- [ ] Prepare runbook
|
|
335
|
+
- [ ] Set up monitoring
|
|
336
|
+
|
|
337
|
+
## Execution (2 hours)
|
|
338
|
+
- [ ] T+0: Simulate primary region failure
|
|
339
|
+
- [ ] T+5: Detect failure (monitoring alerts)
|
|
340
|
+
- [ ] T+10: Initiate DR procedure
|
|
341
|
+
- [ ] T+30: Secondary region serving traffic
|
|
342
|
+
- [ ] T+60: Verify functionality
|
|
343
|
+
- [ ] T+90: Measure RTO/RPO achieved
|
|
344
|
+
- [ ] T+120: Restore to primary
|
|
345
|
+
|
|
346
|
+
## Post-Drill (1 week after)
|
|
347
|
+
- [ ] Document actual RTO/RPO
|
|
348
|
+
- [ ] Identify issues encountered
|
|
349
|
+
- [ ] Update runbooks
|
|
350
|
+
- [ ] Action items assigned
|
|
351
|
+
```
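
For the "Measure RTO/RPO achieved" step, a drill log can be reduced to two numbers. This Python sketch assumes milestones are recorded as minutes after T+0. The milestone keys and the function `drill_metrics` are hypothetical names for illustration.

```python
def drill_metrics(milestones):
    """Derive detection and recovery times from drill milestones.

    `milestones` maps a milestone name to minutes elapsed since T+0.
    """
    failure = milestones["failure"]
    return {
        "time_to_detect_min": milestones["detected"] - failure,
        "achieved_rto_min": milestones["serving_traffic"] - failure,
    }


# Timeline taken from the planned checklist above.
planned = {"failure": 0, "detected": 5, "dr_initiated": 10, "serving_traffic": 30}
metrics = drill_metrics(planned)
print(metrics)  # {'time_to_detect_min': 5, 'achieved_rto_min': 30}
```

Recording the same milestones during each real drill makes it easy to compare the achieved RTO with the planned 30 minutes over time.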
|
|
352
|
+
|
|
353
|
+
### Chaos Engineering
|
|
354
|
+
|
|
355
|
+
```yaml
# Simulate region failure with Chaos Mesh
apiVersion: chaos-mesh.org/v1alpha1
kind: NetworkChaos
metadata:
  name: region-us-east-1-failure
spec:
  action: partition
  mode: all
  selector:
    namespaces:
      - production
    labelSelectors:
      'region': 'us-east-1'
  duration: '30m'

  # Isolate entire region
  direction: both
```
|
|
374
|
+
|
|
375
|
+
---
|
|
376
|
+
|
|
377
|
+
## DR Runbooks
|
|
378
|
+
|
|
379
|
+
### Example: Database Failover
|
|
380
|
+
|
|
381
|
+
````markdown
# Runbook: Promote Read Replica to Primary

## Prerequisites
- Read replica lag < 5 minutes
- Secondary region healthy
- Stakeholders notified

## Steps

### 1. Stop writes to primary (5 min)
```bash
# Remove public access to the primary.
# Note: this does not make the database read-only; also stop writers
# at the application layer before promoting the replica.
aws rds modify-db-instance \
  --db-instance-identifier prod-db \
  --no-publicly-accessible

# Verify no active connections
psql -h prod-db -c "SELECT count(*) FROM pg_stat_activity WHERE state = 'active';"
```

### 2. Promote replica (10 min)
```bash
# Promote replica to standalone
aws rds promote-read-replica \
  --db-instance-identifier prod-db-replica-eu

# Wait for promotion
aws rds wait db-instance-available \
  --db-instance-identifier prod-db-replica-eu
```

### 3. Update application config (5 min)
```bash
# Update connection string
kubectl set env deployment/api \
  DATABASE_URL=postgresql://prod-db-replica-eu.amazonaws.com/myapp

# Rolling restart
kubectl rollout restart deployment/api
```

### 4. Update DNS (15 min)
```bash
# Update Route 53
aws route53 change-resource-record-sets \
  --hosted-zone-id Z123456 \
  --change-batch file://dns-update.json

# Verify propagation
dig api.example.com  # Should resolve to new region
```

### 5. Verify (15 min)
- Test critical user flows
- Check error rates in Datadog
- Monitor database connections
- Verify replication lag

## Rollback
If issues arise:
```bash
# Revert DNS
aws route53 change-resource-record-sets ...

# Scale down secondary
kubectl scale deployment/api --replicas=2
```

## Total Time: ~50 minutes
````
|
|
452
|
+
|
|
453
|
+
---
|
|
454
|
+
|
|
455
|
+
## Cost Analysis
|
|
456
|
+
|
|
457
|
+
### DR Cost Comparison
|
|
458
|
+
|
|
459
|
+
| Tier | RTO | RPO | Additional Cost | Use Case |
|------|-----|-----|-----------------|----------|
| Backup & Restore | 12-24h | 24h | +10% | Non-critical |
| Pilot Light | 1-4h | 1h | +20-30% | Standard |
| Warm Standby | 5-30m | 5-15m | +50-70% | Business-critical |
| Hot Standby | <5m | Near-zero | +100-150% | Mission-critical |
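
One way to read the table is as a lookup: pick the cheapest tier whose worst-case RTO and RPO still meet the requirement. The Python sketch below encodes the upper bounds from the table. Treating "near-zero" RPO as 1 minute is an assumption made only so the value can be compared numerically, and `cheapest_tier` is a hypothetical helper.

```python
# (name, worst-case RTO in minutes, worst-case RPO in minutes, max additional cost %)
# Ordered from cheapest to most expensive.
TIERS = [
    ("Backup & Restore", 24 * 60, 24 * 60, 10),
    ("Pilot Light", 4 * 60, 60, 30),
    ("Warm Standby", 30, 15, 70),
    ("Hot Standby", 5, 1, 150),  # "near-zero" RPO modelled as 1 minute (assumption)
]


def cheapest_tier(required_rto_min, required_rpo_min):
    """Return the first (cheapest) tier meeting both objectives, or None."""
    for name, rto, rpo, _cost in TIERS:
        if rto <= required_rto_min and rpo <= required_rpo_min:
            return name
    return None


print(cheapest_tier(240, 60))  # Pilot Light
print(cheapest_tier(60, 60))   # Warm Standby
print(cheapest_tier(1, 0))     # None
```

A `None` result means no listed tier satisfies the requirement as stated, which usually signals that the objective itself needs revisiting.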
|
|
465
|
+
|
|
466
|
+
### Example Cost Breakdown (1000 RPS application)
|
|
467
|
+
|
|
468
|
+
**Single Region:** $5,000/month
|
|
469
|
+
```
|
|
470
|
+
- Compute: $2,000
|
|
471
|
+
- Database: $1,500
|
|
472
|
+
- Load balancer: $500
|
|
473
|
+
- Data transfer: $500
|
|
474
|
+
- Other: $500
|
|
475
|
+
```
|
|
476
|
+
|
|
477
|
+
**With Warm Standby DR:**
|
|
478
|
+
```
|
|
479
|
+
Primary Region: $5,000 (100%)
|
|
480
|
+
Secondary Region: $2,500 (50% - scaled down)
|
|
481
|
+
Data Replication: $500 (10%)
|
|
482
|
+
------------------------------------
|
|
483
|
+
Total: $8,000 (+60%)
|
|
484
|
+
```
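
The arithmetic behind this breakdown can be reproduced in a few lines of Python. The function `warm_standby_cost` is a hypothetical helper; the percentages are the ones used in the example above (secondary at 50% and replication at 10% of the primary region's cost).

```python
def warm_standby_cost(primary, secondary_pct, replication_pct):
    """Return (total monthly cost, additional cost in percent of primary)."""
    secondary = primary * secondary_pct / 100
    replication = primary * replication_pct / 100
    total = primary + secondary + replication
    overhead_pct = (total - primary) * 100 / primary
    return total, overhead_pct


total, overhead = warm_standby_cost(5000, 50, 10)
print(total, overhead)  # 8000.0 60.0
```

Changing `secondary_pct` shows how sensitive the total is to the size of the standby environment: at 20% the overhead drops to 30%.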
|
|
485
|
+
|
|
486
|
+
---
|
|
487
|
+
|
|
488
|
+
## Compliance Requirements
|
|
489
|
+
|
|
490
|
+
### Industry Standards
|
|
491
|
+
|
|
492
|
+
**Healthcare (HIPAA):**
|
|
493
|
+
- Require documented DR plan
|
|
494
|
+
- Regular testing (HIPAA sets no fixed interval; quarterly is a common target)
|
|
495
|
+
- Encrypted backups
|
|
496
|
+
- Audit logs
|
|
497
|
+
|
|
498
|
+
**Finance (PCI-DSS):**
|
|
499
|
+
- RTO < 4 hours for critical systems (a common internal target; PCI DSS does not prescribe a specific RTO)
|
|
500
|
+
- RPO < 1 hour (a common internal target rather than a PCI DSS mandate)
|
|
501
|
+
- Annual DR tests documented
|
|
502
|
+
|
|
503
|
+
**General (SOC 2):**
|
|
504
|
+
- DR plan documented
|
|
505
|
+
- Annual testing
|
|
506
|
+
- Incident response procedures
|
|
507
|
+
|
|
508
|
+
---
|
|
509
|
+
|
|
510
|
+
## Best Practices
|
|
511
|
+
|
|
512
|
+
✅ **Document everything** - Runbooks, dependencies, contacts
|
|
513
|
+
✅ **Test regularly** - Monthly drills, annual GameDays
|
|
514
|
+
✅ **Automate failover** - Reduce human error
|
|
515
|
+
✅ **Monitor replication lag** - Alert on delays
|
|
516
|
+
✅ **Encrypt backups** - At rest and in transit
|
|
517
|
+
✅ **Version backups** - Keep multiple restore points
|
|
518
|
+
✅ **Test restores** - Verify backups actually work
|
|
519
|
+
✅ **Update runbooks** - After every drill or incident
|
|
520
|
+
|
|
521
|
+
## Anti-Patterns
|
|
522
|
+
|
|
523
|
+
❌ **Untested DR plan** - "Hope is not a strategy"
|
|
524
|
+
❌ **No automation** - Manual failover too slow/error-prone
|
|
525
|
+
❌ **Single backup** - No versioning or redundancy
|
|
526
|
+
❌ **Forgetting DNS** - TTL too high for failover
|
|
527
|
+
❌ **No monitoring** - Can't detect failures quickly
|
|
528
|
+
❌ **Ignoring costs** - DR can double infrastructure spend
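
To see why a high DNS TTL undermines failover, consider a rough upper bound on how long clients keep hitting the failed region: the time for health checks to declare failure plus the time cached records take to expire. The Python sketch below uses that simplified model. The function name and the example values (30-second checks, 3 consecutive failures) are illustrative assumptions, not measured figures.

```python
def dns_failover_seconds(check_interval_s, failure_threshold, ttl_s):
    """Rough worst-case delay before clients resolve to the healthy region.

    Detection takes `failure_threshold` consecutive failed checks, after
    which resolvers may still serve the old record for up to one full TTL.
    """
    detection = check_interval_s * failure_threshold
    return detection + ttl_s


print(dns_failover_seconds(30, 3, 60))    # 150
print(dns_failover_seconds(30, 3, 3600))  # 3690
```

With a 60-second TTL the model gives 2.5 minutes, within a 5-minute RTO. With a one-hour TTL it gives over an hour, regardless of how fast the secondary region is ready.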
|
|
529
|
+
|
|
530
|
+
---
|
|
531
|
+
|
|
532
|
+
**Related Resources:**
|
|
533
|
+
- multi-region-design.md - Active-active and active-passive architectures
|
|
534
|
+
- capacity-planning.md - Sizing DR infrastructure
|
|
535
|
+
- cost-architecture.md - DR cost optimization
|