@wazir-dev/cli 1.0.0
This diff shows the contents of publicly available package versions released to a supported registry. It is provided for informational purposes only and reflects the changes between package versions as they appear in their public registries.
- package/AGENTS.md +111 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +101 -0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/assets/composition-engine.mmd +34 -0
- package/assets/demo-script.sh +17 -0
- package/assets/logo-dark.svg +14 -0
- package/assets/logo.svg +14 -0
- package/assets/pipeline.mmd +39 -0
- package/assets/record-demo.sh +51 -0
- package/docs/README.md +51 -0
- package/docs/adapters/context-mode.md +60 -0
- package/docs/concepts/architecture.md +87 -0
- package/docs/concepts/artifact-model.md +60 -0
- package/docs/concepts/composition-engine.md +36 -0
- package/docs/concepts/indexing-and-recall.md +160 -0
- package/docs/concepts/observability.md +41 -0
- package/docs/concepts/roles-and-workflows.md +59 -0
- package/docs/concepts/terminology-policy.md +27 -0
- package/docs/getting-started/01-installation.md +78 -0
- package/docs/getting-started/02-first-run.md +102 -0
- package/docs/getting-started/03-adding-to-project.md +15 -0
- package/docs/getting-started/04-host-setup.md +15 -0
- package/docs/guides/ci-integration.md +15 -0
- package/docs/guides/creating-skills.md +15 -0
- package/docs/guides/expertise-module-authoring.md +15 -0
- package/docs/guides/hook-development.md +15 -0
- package/docs/guides/memory-and-learnings.md +34 -0
- package/docs/guides/multi-host-export.md +15 -0
- package/docs/guides/troubleshooting.md +101 -0
- package/docs/guides/writing-custom-roles.md +15 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
- package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
- package/docs/readmes/INDEX.md +99 -0
- package/docs/readmes/features/expertise/README.md +171 -0
- package/docs/readmes/features/exports/README.md +222 -0
- package/docs/readmes/features/hooks/README.md +103 -0
- package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
- package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
- package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
- package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
- package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
- package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
- package/docs/readmes/features/hooks/session-start.md +119 -0
- package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
- package/docs/readmes/features/roles/README.md +157 -0
- package/docs/readmes/features/roles/clarifier.md +152 -0
- package/docs/readmes/features/roles/content-author.md +190 -0
- package/docs/readmes/features/roles/designer.md +193 -0
- package/docs/readmes/features/roles/executor.md +184 -0
- package/docs/readmes/features/roles/learner.md +210 -0
- package/docs/readmes/features/roles/planner.md +182 -0
- package/docs/readmes/features/roles/researcher.md +164 -0
- package/docs/readmes/features/roles/reviewer.md +184 -0
- package/docs/readmes/features/roles/specifier.md +162 -0
- package/docs/readmes/features/roles/verifier.md +215 -0
- package/docs/readmes/features/schemas/README.md +178 -0
- package/docs/readmes/features/skills/README.md +63 -0
- package/docs/readmes/features/skills/brainstorming.md +96 -0
- package/docs/readmes/features/skills/debugging.md +148 -0
- package/docs/readmes/features/skills/design.md +120 -0
- package/docs/readmes/features/skills/prepare-next.md +109 -0
- package/docs/readmes/features/skills/run-audit.md +159 -0
- package/docs/readmes/features/skills/scan-project.md +109 -0
- package/docs/readmes/features/skills/self-audit.md +176 -0
- package/docs/readmes/features/skills/tdd.md +137 -0
- package/docs/readmes/features/skills/using-skills.md +92 -0
- package/docs/readmes/features/skills/verification.md +120 -0
- package/docs/readmes/features/skills/writing-plans.md +104 -0
- package/docs/readmes/features/tooling/README.md +320 -0
- package/docs/readmes/features/workflows/README.md +186 -0
- package/docs/readmes/features/workflows/author.md +181 -0
- package/docs/readmes/features/workflows/clarify.md +154 -0
- package/docs/readmes/features/workflows/design-review.md +171 -0
- package/docs/readmes/features/workflows/design.md +169 -0
- package/docs/readmes/features/workflows/discover.md +162 -0
- package/docs/readmes/features/workflows/execute.md +173 -0
- package/docs/readmes/features/workflows/learn.md +167 -0
- package/docs/readmes/features/workflows/plan-review.md +165 -0
- package/docs/readmes/features/workflows/plan.md +170 -0
- package/docs/readmes/features/workflows/prepare-next.md +167 -0
- package/docs/readmes/features/workflows/review.md +169 -0
- package/docs/readmes/features/workflows/run-audit.md +191 -0
- package/docs/readmes/features/workflows/spec-challenge.md +159 -0
- package/docs/readmes/features/workflows/specify.md +160 -0
- package/docs/readmes/features/workflows/verify.md +177 -0
- package/docs/readmes/packages/README.md +50 -0
- package/docs/readmes/packages/ajv.md +117 -0
- package/docs/readmes/packages/context-mode.md +118 -0
- package/docs/readmes/packages/gray-matter.md +116 -0
- package/docs/readmes/packages/node-test.md +137 -0
- package/docs/readmes/packages/yaml.md +112 -0
- package/docs/reference/configuration-reference.md +159 -0
- package/docs/reference/expertise-index.md +52 -0
- package/docs/reference/git-flow.md +43 -0
- package/docs/reference/hooks.md +87 -0
- package/docs/reference/host-exports.md +50 -0
- package/docs/reference/launch-checklist.md +172 -0
- package/docs/reference/marketplace-listings.md +76 -0
- package/docs/reference/release-process.md +34 -0
- package/docs/reference/roles-reference.md +77 -0
- package/docs/reference/skills.md +33 -0
- package/docs/reference/templates.md +29 -0
- package/docs/reference/tooling-cli.md +94 -0
- package/docs/truth-claims.yaml +222 -0
- package/expertise/PROGRESS.md +63 -0
- package/expertise/README.md +18 -0
- package/expertise/antipatterns/PROGRESS.md +56 -0
- package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
- package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
- package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
- package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
- package/expertise/antipatterns/backend/index.md +24 -0
- package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
- package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
- package/expertise/antipatterns/code/async-antipatterns.md +622 -0
- package/expertise/antipatterns/code/code-smells.md +1186 -0
- package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
- package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
- package/expertise/antipatterns/code/index.md +27 -0
- package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
- package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
- package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
- package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
- package/expertise/antipatterns/design/dark-patterns.md +1121 -0
- package/expertise/antipatterns/design/index.md +22 -0
- package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
- package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
- package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
- package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
- package/expertise/antipatterns/frontend/index.md +23 -0
- package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
- package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
- package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
- package/expertise/antipatterns/index.md +31 -0
- package/expertise/antipatterns/performance/index.md +20 -0
- package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
- package/expertise/antipatterns/performance/premature-optimization.md +623 -0
- package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
- package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
- package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
- package/expertise/antipatterns/process/index.md +23 -0
- package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
- package/expertise/antipatterns/security/index.md +20 -0
- package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
- package/expertise/antipatterns/security/security-theater.md +843 -0
- package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
- package/expertise/architecture/PROGRESS.md +70 -0
- package/expertise/architecture/data/caching-architecture.md +671 -0
- package/expertise/architecture/data/data-consistency.md +574 -0
- package/expertise/architecture/data/data-modeling.md +536 -0
- package/expertise/architecture/data/event-streams-and-queues.md +634 -0
- package/expertise/architecture/data/index.md +25 -0
- package/expertise/architecture/data/search-architecture.md +663 -0
- package/expertise/architecture/data/sql-vs-nosql.md +708 -0
- package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
- package/expertise/architecture/decisions/build-vs-buy.md +616 -0
- package/expertise/architecture/decisions/index.md +23 -0
- package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
- package/expertise/architecture/decisions/technology-selection.md +616 -0
- package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
- package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
- package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
- package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
- package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
- package/expertise/architecture/distributed/index.md +25 -0
- package/expertise/architecture/distributed/saga-pattern.md +797 -0
- package/expertise/architecture/foundations/architectural-thinking.md +460 -0
- package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
- package/expertise/architecture/foundations/design-principles-solid.md +649 -0
- package/expertise/architecture/foundations/domain-driven-design.md +719 -0
- package/expertise/architecture/foundations/index.md +25 -0
- package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
- package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
- package/expertise/architecture/index.md +34 -0
- package/expertise/architecture/integration/api-design-graphql.md +638 -0
- package/expertise/architecture/integration/api-design-grpc.md +804 -0
- package/expertise/architecture/integration/api-design-rest.md +892 -0
- package/expertise/architecture/integration/index.md +25 -0
- package/expertise/architecture/integration/third-party-integration.md +795 -0
- package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
- package/expertise/architecture/integration/websockets-realtime.md +791 -0
- package/expertise/architecture/mobile-architecture/index.md +22 -0
- package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
- package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
- package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
- package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
- package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
- package/expertise/architecture/patterns/event-driven.md +797 -0
- package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
- package/expertise/architecture/patterns/index.md +27 -0
- package/expertise/architecture/patterns/layered-architecture.md +736 -0
- package/expertise/architecture/patterns/microservices.md +753 -0
- package/expertise/architecture/patterns/modular-monolith.md +692 -0
- package/expertise/architecture/patterns/monolith.md +626 -0
- package/expertise/architecture/patterns/plugin-architecture.md +735 -0
- package/expertise/architecture/patterns/serverless.md +780 -0
- package/expertise/architecture/scaling/database-scaling.md +615 -0
- package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
- package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
- package/expertise/architecture/scaling/index.md +24 -0
- package/expertise/architecture/scaling/multi-tenancy.md +800 -0
- package/expertise/architecture/scaling/stateless-design.md +787 -0
- package/expertise/backend/embedded-firmware.md +625 -0
- package/expertise/backend/go.md +853 -0
- package/expertise/backend/index.md +24 -0
- package/expertise/backend/java-spring.md +448 -0
- package/expertise/backend/node-typescript.md +625 -0
- package/expertise/backend/python-fastapi.md +724 -0
- package/expertise/backend/rust.md +458 -0
- package/expertise/backend/solidity.md +711 -0
- package/expertise/composition-map.yaml +443 -0
- package/expertise/content/foundations/content-modeling.md +395 -0
- package/expertise/content/foundations/editorial-standards.md +449 -0
- package/expertise/content/foundations/index.md +24 -0
- package/expertise/content/foundations/microcopy.md +455 -0
- package/expertise/content/foundations/terminology-governance.md +509 -0
- package/expertise/content/index.md +34 -0
- package/expertise/content/patterns/accessibility-copy.md +518 -0
- package/expertise/content/patterns/index.md +24 -0
- package/expertise/content/patterns/notification-content.md +433 -0
- package/expertise/content/patterns/sample-content.md +486 -0
- package/expertise/content/patterns/state-copy.md +439 -0
- package/expertise/design/PROGRESS.md +58 -0
- package/expertise/design/disciplines/dark-mode-theming.md +577 -0
- package/expertise/design/disciplines/design-systems.md +595 -0
- package/expertise/design/disciplines/index.md +25 -0
- package/expertise/design/disciplines/information-architecture.md +800 -0
- package/expertise/design/disciplines/interaction-design.md +788 -0
- package/expertise/design/disciplines/responsive-design.md +552 -0
- package/expertise/design/disciplines/usability-testing.md +516 -0
- package/expertise/design/disciplines/user-research.md +792 -0
- package/expertise/design/foundations/accessibility-design.md +796 -0
- package/expertise/design/foundations/color-theory.md +797 -0
- package/expertise/design/foundations/iconography.md +795 -0
- package/expertise/design/foundations/index.md +26 -0
- package/expertise/design/foundations/motion-and-animation.md +653 -0
- package/expertise/design/foundations/rtl-design.md +585 -0
- package/expertise/design/foundations/spacing-and-layout.md +607 -0
- package/expertise/design/foundations/typography.md +800 -0
- package/expertise/design/foundations/visual-hierarchy.md +761 -0
- package/expertise/design/index.md +32 -0
- package/expertise/design/patterns/authentication-flows.md +474 -0
- package/expertise/design/patterns/content-consumption.md +789 -0
- package/expertise/design/patterns/data-display.md +618 -0
- package/expertise/design/patterns/e-commerce.md +1494 -0
- package/expertise/design/patterns/feedback-and-states.md +642 -0
- package/expertise/design/patterns/forms-and-input.md +819 -0
- package/expertise/design/patterns/gamification.md +801 -0
- package/expertise/design/patterns/index.md +31 -0
- package/expertise/design/patterns/microinteractions.md +449 -0
- package/expertise/design/patterns/navigation.md +800 -0
- package/expertise/design/patterns/notifications.md +705 -0
- package/expertise/design/patterns/onboarding.md +700 -0
- package/expertise/design/patterns/search-and-filter.md +601 -0
- package/expertise/design/patterns/settings-and-preferences.md +768 -0
- package/expertise/design/patterns/social-and-community.md +748 -0
- package/expertise/design/platforms/desktop-native.md +612 -0
- package/expertise/design/platforms/index.md +25 -0
- package/expertise/design/platforms/mobile-android.md +825 -0
- package/expertise/design/platforms/mobile-cross-platform.md +983 -0
- package/expertise/design/platforms/mobile-ios.md +699 -0
- package/expertise/design/platforms/tablet.md +794 -0
- package/expertise/design/platforms/web-dashboard.md +790 -0
- package/expertise/design/platforms/web-responsive.md +550 -0
- package/expertise/design/psychology/behavioral-nudges.md +449 -0
- package/expertise/design/psychology/cognitive-load.md +1191 -0
- package/expertise/design/psychology/error-psychology.md +778 -0
- package/expertise/design/psychology/index.md +22 -0
- package/expertise/design/psychology/persuasive-design.md +736 -0
- package/expertise/design/psychology/user-mental-models.md +623 -0
- package/expertise/design/tooling/open-pencil.md +266 -0
- package/expertise/frontend/angular.md +1073 -0
- package/expertise/frontend/desktop-electron.md +546 -0
- package/expertise/frontend/flutter.md +782 -0
- package/expertise/frontend/index.md +27 -0
- package/expertise/frontend/native-android.md +409 -0
- package/expertise/frontend/native-ios.md +490 -0
- package/expertise/frontend/react-native.md +1160 -0
- package/expertise/frontend/react.md +808 -0
- package/expertise/frontend/vue.md +1089 -0
- package/expertise/humanize/domain-rules-code.md +79 -0
- package/expertise/humanize/domain-rules-content.md +67 -0
- package/expertise/humanize/domain-rules-technical-docs.md +56 -0
- package/expertise/humanize/index.md +35 -0
- package/expertise/humanize/self-audit-checklist.md +87 -0
- package/expertise/humanize/sentence-patterns.md +218 -0
- package/expertise/humanize/vocabulary-blacklist.md +105 -0
- package/expertise/i18n/PROGRESS.md +65 -0
- package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
- package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
- package/expertise/i18n/advanced/complex-scripts.md +30 -0
- package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
- package/expertise/i18n/advanced/testing-i18n.md +28 -0
- package/expertise/i18n/content/content-adaptation.md +23 -0
- package/expertise/i18n/content/locale-specific-formatting.md +23 -0
- package/expertise/i18n/content/machine-translation-integration.md +28 -0
- package/expertise/i18n/content/translation-management.md +29 -0
- package/expertise/i18n/foundations/date-time-calendars.md +67 -0
- package/expertise/i18n/foundations/i18n-architecture.md +272 -0
- package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
- package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
- package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
- package/expertise/i18n/foundations/string-externalization.md +236 -0
- package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
- package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
- package/expertise/i18n/index.md +38 -0
- package/expertise/i18n/platform/backend-i18n.md +31 -0
- package/expertise/i18n/platform/flutter-i18n.md +148 -0
- package/expertise/i18n/platform/native-android-i18n.md +36 -0
- package/expertise/i18n/platform/native-ios-i18n.md +36 -0
- package/expertise/i18n/platform/react-i18n.md +103 -0
- package/expertise/i18n/platform/web-css-i18n.md +81 -0
- package/expertise/i18n/rtl/arabic-specific.md +175 -0
- package/expertise/i18n/rtl/hebrew-specific.md +149 -0
- package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
- package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
- package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
- package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
- package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
- package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
- package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
- package/expertise/i18n/rtl/rtl-typography.md +160 -0
- package/expertise/index.md +113 -0
- package/expertise/index.yaml +216 -0
- package/expertise/infrastructure/cloud-aws.md +597 -0
- package/expertise/infrastructure/cloud-gcp.md +599 -0
- package/expertise/infrastructure/cybersecurity.md +816 -0
- package/expertise/infrastructure/database-mongodb.md +447 -0
- package/expertise/infrastructure/database-postgres.md +400 -0
- package/expertise/infrastructure/devops-cicd.md +787 -0
- package/expertise/infrastructure/index.md +27 -0
- package/expertise/performance/PROGRESS.md +50 -0
- package/expertise/performance/backend/api-latency.md +1204 -0
- package/expertise/performance/backend/background-jobs.md +506 -0
- package/expertise/performance/backend/connection-pooling.md +1209 -0
- package/expertise/performance/backend/database-query-optimization.md +515 -0
- package/expertise/performance/backend/index.md +23 -0
- package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
- package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
- package/expertise/performance/foundations/caching-strategies.md +489 -0
- package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
- package/expertise/performance/foundations/index.md +24 -0
- package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
- package/expertise/performance/foundations/memory-management.md +964 -0
- package/expertise/performance/foundations/performance-budgets.md +1314 -0
- package/expertise/performance/index.md +31 -0
- package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
- package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
- package/expertise/performance/infrastructure/index.md +22 -0
- package/expertise/performance/infrastructure/load-balancing.md +1081 -0
- package/expertise/performance/infrastructure/observability.md +1079 -0
- package/expertise/performance/mobile/index.md +23 -0
- package/expertise/performance/mobile/mobile-animations.md +544 -0
- package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
- package/expertise/performance/mobile/mobile-network.md +452 -0
- package/expertise/performance/mobile/mobile-rendering.md +599 -0
- package/expertise/performance/mobile/mobile-startup-time.md +505 -0
- package/expertise/performance/platform-specific/flutter-performance.md +647 -0
- package/expertise/performance/platform-specific/index.md +22 -0
- package/expertise/performance/platform-specific/node-performance.md +1307 -0
- package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
- package/expertise/performance/platform-specific/react-performance.md +1403 -0
- package/expertise/performance/web/bundle-optimization.md +1239 -0
- package/expertise/performance/web/image-and-media.md +636 -0
- package/expertise/performance/web/index.md +24 -0
- package/expertise/performance/web/network-optimization.md +1133 -0
- package/expertise/performance/web/rendering-performance.md +1098 -0
- package/expertise/performance/web/ssr-and-hydration.md +918 -0
- package/expertise/performance/web/web-vitals.md +1374 -0
- package/expertise/quality/accessibility.md +985 -0
- package/expertise/quality/evidence-based-verification.md +499 -0
- package/expertise/quality/index.md +24 -0
- package/expertise/quality/ml-model-audit.md +614 -0
- package/expertise/quality/performance.md +600 -0
- package/expertise/quality/testing-api.md +891 -0
- package/expertise/quality/testing-mobile.md +496 -0
- package/expertise/quality/testing-web.md +849 -0
- package/expertise/security/PROGRESS.md +54 -0
- package/expertise/security/agentic-identity.md +540 -0
- package/expertise/security/compliance-frameworks.md +601 -0
- package/expertise/security/data/data-encryption.md +364 -0
- package/expertise/security/data/data-privacy-gdpr.md +692 -0
- package/expertise/security/data/database-security.md +1171 -0
- package/expertise/security/data/index.md +22 -0
- package/expertise/security/data/pii-handling.md +531 -0
- package/expertise/security/foundations/authentication.md +1041 -0
- package/expertise/security/foundations/authorization.md +603 -0
- package/expertise/security/foundations/cryptography.md +1001 -0
- package/expertise/security/foundations/index.md +25 -0
- package/expertise/security/foundations/owasp-top-10.md +1354 -0
- package/expertise/security/foundations/secrets-management.md +1217 -0
- package/expertise/security/foundations/secure-sdlc.md +700 -0
- package/expertise/security/foundations/supply-chain-security.md +698 -0
- package/expertise/security/index.md +31 -0
- package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
- package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
- package/expertise/security/infrastructure/container-security.md +721 -0
- package/expertise/security/infrastructure/incident-response.md +1295 -0
- package/expertise/security/infrastructure/index.md +24 -0
- package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
- package/expertise/security/infrastructure/network-security.md +1337 -0
- package/expertise/security/mobile/index.md +23 -0
- package/expertise/security/mobile/mobile-android-security.md +1218 -0
- package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
- package/expertise/security/mobile/mobile-data-storage.md +1265 -0
- package/expertise/security/mobile/mobile-ios-security.md +1401 -0
- package/expertise/security/mobile/mobile-network-security.md +1520 -0
- package/expertise/security/smart-contract-security.md +594 -0
- package/expertise/security/testing/index.md +22 -0
- package/expertise/security/testing/penetration-testing.md +1258 -0
- package/expertise/security/testing/security-code-review.md +1765 -0
- package/expertise/security/testing/threat-modeling.md +1074 -0
- package/expertise/security/testing/vulnerability-scanning.md +1062 -0
- package/expertise/security/web/api-security.md +586 -0
- package/expertise/security/web/cors-and-headers.md +433 -0
- package/expertise/security/web/csrf.md +562 -0
- package/expertise/security/web/file-upload.md +1477 -0
- package/expertise/security/web/index.md +25 -0
- package/expertise/security/web/injection.md +1375 -0
- package/expertise/security/web/session-management.md +1101 -0
- package/expertise/security/web/xss.md +1158 -0
- package/exports/README.md +17 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
- package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
- package/exports/hosts/claude/.claude/agents/designer.md +55 -0
- package/exports/hosts/claude/.claude/agents/executor.md +55 -0
- package/exports/hosts/claude/.claude/agents/learner.md +51 -0
- package/exports/hosts/claude/.claude/agents/planner.md +53 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
- package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
- package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
- package/exports/hosts/claude/.claude/commands/author.md +42 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
- package/exports/hosts/claude/.claude/commands/design.md +44 -0
- package/exports/hosts/claude/.claude/commands/discover.md +37 -0
- package/exports/hosts/claude/.claude/commands/execute.md +48 -0
- package/exports/hosts/claude/.claude/commands/learn.md +38 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
- package/exports/hosts/claude/.claude/commands/plan.md +39 -0
- package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
- package/exports/hosts/claude/.claude/commands/review.md +40 -0
- package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
- package/exports/hosts/claude/.claude/commands/specify.md +38 -0
- package/exports/hosts/claude/.claude/commands/verify.md +37 -0
- package/exports/hosts/claude/.claude/settings.json +34 -0
- package/exports/hosts/claude/CLAUDE.md +19 -0
- package/exports/hosts/claude/export.manifest.json +38 -0
- package/exports/hosts/claude/host-package.json +67 -0
- package/exports/hosts/codex/AGENTS.md +19 -0
- package/exports/hosts/codex/export.manifest.json +38 -0
- package/exports/hosts/codex/host-package.json +41 -0
- package/exports/hosts/cursor/.cursor/hooks.json +16 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
- package/exports/hosts/cursor/export.manifest.json +38 -0
- package/exports/hosts/cursor/host-package.json +42 -0
- package/exports/hosts/gemini/GEMINI.md +19 -0
- package/exports/hosts/gemini/export.manifest.json +38 -0
- package/exports/hosts/gemini/host-package.json +41 -0
- package/hooks/README.md +18 -0
- package/hooks/definitions/loop_cap_guard.yaml +21 -0
- package/hooks/definitions/post_tool_capture.yaml +24 -0
- package/hooks/definitions/pre_compact_summary.yaml +19 -0
- package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
- package/hooks/definitions/protected_path_write_guard.yaml +19 -0
- package/hooks/definitions/session_start.yaml +19 -0
- package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
- package/hooks/loop-cap-guard +17 -0
- package/hooks/post-tool-lint +36 -0
- package/hooks/protected-path-write-guard +17 -0
- package/hooks/session-start +41 -0
- package/llms-full.txt +2355 -0
- package/llms.txt +43 -0
- package/package.json +79 -0
- package/roles/README.md +20 -0
- package/roles/clarifier.md +42 -0
- package/roles/content-author.md +63 -0
- package/roles/designer.md +55 -0
- package/roles/executor.md +55 -0
- package/roles/learner.md +51 -0
- package/roles/planner.md +53 -0
- package/roles/researcher.md +43 -0
- package/roles/reviewer.md +54 -0
- package/roles/specifier.md +47 -0
- package/roles/verifier.md +71 -0
- package/schemas/README.md +24 -0
- package/schemas/accepted-learning.schema.json +20 -0
- package/schemas/author-artifact.schema.json +156 -0
- package/schemas/clarification.schema.json +19 -0
- package/schemas/design-artifact.schema.json +80 -0
- package/schemas/docs-claim.schema.json +18 -0
- package/schemas/export-manifest.schema.json +20 -0
- package/schemas/hook.schema.json +67 -0
- package/schemas/host-export-package.schema.json +18 -0
- package/schemas/implementation-plan.schema.json +19 -0
- package/schemas/proposed-learning.schema.json +19 -0
- package/schemas/research.schema.json +18 -0
- package/schemas/review.schema.json +29 -0
- package/schemas/run-manifest.schema.json +18 -0
- package/schemas/spec-challenge.schema.json +18 -0
- package/schemas/spec.schema.json +20 -0
- package/schemas/usage.schema.json +102 -0
- package/schemas/verification-proof.schema.json +29 -0
- package/schemas/wazir-manifest.schema.json +173 -0
- package/skills/README.md +40 -0
- package/skills/brainstorming/SKILL.md +77 -0
- package/skills/debugging/SKILL.md +50 -0
- package/skills/design/SKILL.md +61 -0
- package/skills/dispatching-parallel-agents/SKILL.md +128 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +169 -0
- package/skills/humanize/SKILL.md +123 -0
- package/skills/init-pipeline/SKILL.md +124 -0
- package/skills/prepare-next/SKILL.md +20 -0
- package/skills/receiving-code-review/SKILL.md +123 -0
- package/skills/requesting-code-review/SKILL.md +105 -0
- package/skills/requesting-code-review/code-reviewer.md +108 -0
- package/skills/run-audit/SKILL.md +197 -0
- package/skills/scan-project/SKILL.md +41 -0
- package/skills/self-audit/SKILL.md +153 -0
- package/skills/subagent-driven-development/SKILL.md +154 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
- package/skills/subagent-driven-development/implementer-prompt.md +102 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/skills/tdd/SKILL.md +23 -0
- package/skills/using-git-worktrees/SKILL.md +163 -0
- package/skills/using-skills/SKILL.md +95 -0
- package/skills/verification/SKILL.md +22 -0
- package/skills/wazir/SKILL.md +463 -0
- package/skills/writing-plans/SKILL.md +30 -0
- package/skills/writing-skills/SKILL.md +157 -0
- package/skills/writing-skills/anthropic-best-practices.md +122 -0
- package/skills/writing-skills/persuasion-principles.md +50 -0
- package/templates/README.md +20 -0
- package/templates/artifacts/README.md +10 -0
- package/templates/artifacts/accepted-learning.md +19 -0
- package/templates/artifacts/accepted-learning.template.json +12 -0
- package/templates/artifacts/author.md +74 -0
- package/templates/artifacts/author.template.json +19 -0
- package/templates/artifacts/clarification.md +21 -0
- package/templates/artifacts/clarification.template.json +12 -0
- package/templates/artifacts/execute-notes.md +19 -0
- package/templates/artifacts/implementation-plan.md +21 -0
- package/templates/artifacts/implementation-plan.template.json +11 -0
- package/templates/artifacts/learning-proposal.md +19 -0
- package/templates/artifacts/next-run-handoff.md +21 -0
- package/templates/artifacts/plan-review.md +19 -0
- package/templates/artifacts/proposed-learning.template.json +12 -0
- package/templates/artifacts/research.md +21 -0
- package/templates/artifacts/research.template.json +12 -0
- package/templates/artifacts/review-findings.md +19 -0
- package/templates/artifacts/review.template.json +11 -0
- package/templates/artifacts/run-manifest.template.json +8 -0
- package/templates/artifacts/spec-challenge.md +19 -0
- package/templates/artifacts/spec-challenge.template.json +11 -0
- package/templates/artifacts/spec.md +21 -0
- package/templates/artifacts/spec.template.json +12 -0
- package/templates/artifacts/verification-proof.md +19 -0
- package/templates/artifacts/verification-proof.template.json +11 -0
- package/templates/examples/accepted-learning.example.json +14 -0
- package/templates/examples/author.example.json +152 -0
- package/templates/examples/clarification.example.json +15 -0
- package/templates/examples/docs-claim.example.json +8 -0
- package/templates/examples/export-manifest.example.json +7 -0
- package/templates/examples/host-export-package.example.json +11 -0
- package/templates/examples/implementation-plan.example.json +17 -0
- package/templates/examples/proposed-learning.example.json +13 -0
- package/templates/examples/research.example.json +15 -0
- package/templates/examples/research.example.md +6 -0
- package/templates/examples/review.example.json +17 -0
- package/templates/examples/run-manifest.example.json +9 -0
- package/templates/examples/spec-challenge.example.json +14 -0
- package/templates/examples/spec.example.json +21 -0
- package/templates/examples/verification-proof.example.json +21 -0
- package/templates/examples/wazir-manifest.example.yaml +65 -0
- package/templates/task-definition-schema.md +99 -0
- package/tooling/README.md +20 -0
- package/tooling/src/adapters/context-mode.js +50 -0
- package/tooling/src/capture/command.js +376 -0
- package/tooling/src/capture/store.js +99 -0
- package/tooling/src/capture/usage.js +270 -0
- package/tooling/src/checks/branches.js +50 -0
- package/tooling/src/checks/brand-truth.js +110 -0
- package/tooling/src/checks/changelog.js +231 -0
- package/tooling/src/checks/command-registry.js +36 -0
- package/tooling/src/checks/commits.js +102 -0
- package/tooling/src/checks/docs-drift.js +103 -0
- package/tooling/src/checks/docs-truth.js +201 -0
- package/tooling/src/checks/runtime-surface.js +156 -0
- package/tooling/src/cli.js +116 -0
- package/tooling/src/command-options.js +56 -0
- package/tooling/src/commands/validate.js +320 -0
- package/tooling/src/doctor/command.js +91 -0
- package/tooling/src/export/command.js +77 -0
- package/tooling/src/export/compiler.js +498 -0
- package/tooling/src/guards/loop-cap-guard.js +52 -0
- package/tooling/src/guards/protected-path-write-guard.js +67 -0
- package/tooling/src/index/command.js +152 -0
- package/tooling/src/index/storage.js +1061 -0
- package/tooling/src/index/summarizers.js +261 -0
- package/tooling/src/loaders.js +18 -0
- package/tooling/src/project-root.js +22 -0
- package/tooling/src/recall/command.js +225 -0
- package/tooling/src/schema-validator.js +30 -0
- package/tooling/src/state-root.js +40 -0
- package/tooling/src/status/command.js +71 -0
- package/wazir.manifest.yaml +135 -0
- package/workflows/README.md +19 -0
- package/workflows/author.md +42 -0
- package/workflows/clarify.md +38 -0
- package/workflows/design-review.md +46 -0
- package/workflows/design.md +44 -0
- package/workflows/discover.md +37 -0
- package/workflows/execute.md +48 -0
- package/workflows/learn.md +38 -0
- package/workflows/plan-review.md +42 -0
- package/workflows/plan.md +39 -0
- package/workflows/prepare-next.md +37 -0
- package/workflows/review.md +40 -0
- package/workflows/run-audit.md +41 -0
- package/workflows/spec-challenge.md +41 -0
- package/workflows/specify.md +38 -0
- package/workflows/verify.md +37 -0
@@ -0,0 +1,920 @@

# Deployment Anti-Patterns

> Deployment is the most dangerous phase of the software lifecycle. Code that passes every test can still destroy a company in minutes if deployed carelessly. The history of software is littered with billion-dollar lessons: Knight Capital lost $440 million in 45 minutes from a botched deploy, GitLab deleted its own production database and discovered all five backup systems were broken, and CrowdStrike bricked 8.5 million Windows machines with an untested update to its kernel-level security sensor. These are not edge cases -- they are the predictable consequences of deployment anti-patterns that persist across the industry.

> **Domain:** Process
> **Anti-patterns covered:** 20
> **Highest severity:** Critical

## Anti-Patterns

### AP-01: No Rollback Plan

**Also known as:** One-Way Deploy, Burn the Ships, Forward-Only Release
**Frequency:** Very Common
**Severity:** Critical
**Detection difficulty:** Easy

**What it looks like:**

The team deploys with no documented or tested procedure to revert to the previous version. Rollback is a theoretical exercise rather than a rehearsed capability.

```bash
#!/bin/bash
# deploy.sh -- no rollback logic anywhere
docker pull myapp:latest
docker stop myapp
docker run -d myapp:latest
echo "Deploy complete"
# What happens when :latest is broken? Nobody knows.
```

**Why teams do it:**

Rolling forward feels faster. Teams assume the new version will work because it passed tests. Writing and maintaining rollback procedures doubles the deployment engineering effort, and "we've never needed it" becomes a self-reinforcing excuse -- until the day they do.

**What goes wrong:**

Knight Capital Group, August 1, 2012. A deployment of new trading software to their SMARS order routing system failed silently on one of eight servers. That server retained old code with a defunct "Power Peg" function. When the market opened, the old code executed millions of erroneous trades. With no rollback procedure, engineers scrambled for 45 minutes while the system purchased $7 billion in unintended stock. Knight Capital lost $440 million -- three times its annual earnings. The stock dropped 75% in two days, and the company was acquired within months.

**The fix:**

Every deployment must have a documented, tested rollback procedure. Treat rollback as a first-class deployment artifact.

```yaml
# deploy.yaml -- rollback-aware
deploy:
  steps:
    - name: snapshot_current
      run: docker tag myapp:current myapp:rollback-$(date +%s)
    - name: deploy_new
      run: docker pull myapp:${{ VERSION }} && docker run -d myapp:${{ VERSION }}
    - name: health_check
      run: ./scripts/health-check.sh --timeout 60
    - name: auto_rollback
      if: failure()
      # SNAPSHOT_TS must hold the timestamp recorded by snapshot_current
      run: docker run -d myapp:rollback-${{ SNAPSHOT_TS }}
```
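
The same deploy, health-check, rollback sequence can be written as plain control flow. This is a minimal Python sketch, not the package's implementation: `deploy_with_rollback` and its three callables are hypothetical names standing in for whatever performs each step in a given pipeline.

```python
from typing import Callable


def deploy_with_rollback(
    deploy: Callable[[], None],
    health_check: Callable[[], bool],
    rollback: Callable[[], None],
) -> bool:
    """Run a deploy and revert automatically if it fails or is unhealthy.

    Returns True when the new version is kept, False when it was rolled back.
    """
    try:
        deploy()
        healthy = health_check()
    except Exception:
        # A crash during deploy or health check counts as a failed release.
        healthy = False
    if not healthy:
        rollback()
    return healthy


# Example wiring with stand-in steps that only record what happened.
events: list[str] = []
kept = deploy_with_rollback(
    deploy=lambda: events.append("deploy"),
    health_check=lambda: False,  # simulate a failing health check
    rollback=lambda: events.append("rollback"),
)
```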

**Detection rule:**

Flag deployment pipelines that lack a rollback step or `on_failure` handler. Flag releases without a tagged previous-version artifact.

---

### AP-02: Friday Deployments

**Also known as:** YOLO Friday, End-of-Week Push, Weekend Roulette
**Frequency:** Very Common
**Severity:** High
**Detection difficulty:** Easy

**What it looks like:**

Teams push significant changes to production on Friday afternoons, leaving minimal staff available if problems emerge over the weekend.

**Why teams do it:**

Sprint deadlines align with Friday. Product managers want features "shipped this week." Developers feel pressure to close tickets before the weekend. The deploy itself seems routine.

**What goes wrong:**

A small config change deployed Friday at 5 PM can cascade into a weekend-long outage with skeleton staff. The AWS S3 outage (February 28, 2017) showed how much damage one small change can do: a single mistyped command removed more servers than intended, taking down a significant portion of the internet for four hours and costing S&P 500 companies an estimated $150 million. Incidents discovered after-hours can take 2-3x longer to resolve due to reduced staffing and fatigue-impaired decision-making.

**The fix:**

Establish deployment windows: deploy Monday through Thursday before 2 PM local time. If your CI/CD is mature enough that every deploy is small and instantly reversible, Friday deploys become safe -- but earn that trust with data, not optimism.

```bash
# Deploy window gate in CI
DAY=$(date +%u)   # 1=Monday, 5=Friday, 7=Sunday
HOUR=$(date +%H)
if [ "$DAY" -ge 5 ] || [ "$HOUR" -ge 14 ]; then
  echo "::error::Deploys blocked outside window (Mon-Thu before 2 PM)"
  exit 1
fi
```
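
For pipelines that are not shell-based, the same window can be checked from a timestamp. The sketch below assumes the Monday-through-Thursday, before-2-PM window described above and assumes `now` is already in local time; `in_deploy_window` and `CUTOFF_HOUR` are illustrative names.

```python
from datetime import datetime

CUTOFF_HOUR = 14  # 2 PM local time, per the window described above


def in_deploy_window(now: datetime) -> bool:
    """Return True only Monday through Thursday before the cutoff hour."""
    is_mon_to_thu = now.isoweekday() <= 4  # isoweekday(): 1=Monday ... 7=Sunday
    return is_mon_to_thu and now.hour < CUTOFF_HOUR
```

Taking the timestamp as an argument, rather than reading the clock inside the function, keeps the gate easy to test with fixed dates.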

**Detection rule:**

Flag deployments triggered on Friday after 2 PM or on weekends. Track deployment day-of-week distribution and alert if Friday exceeds 25% of total deploys.

---

### AP-03: Big Bang Deployment

**Also known as:** All-at-Once Release, Flag Day, Forklift Upgrade
**Frequency:** Common
**Severity:** Critical
**Detection difficulty:** Moderate

**What it looks like:**

Months of accumulated changes are deployed in a single massive release. The diff is thousands of lines across hundreds of files.

**Why teams do it:**

Long release cycles accumulate changes. Testing "everything together" feels thorough. Some teams simply lack deployment automation.

**What goes wrong:**

TSB Bank, April 2018. TSB attempted a "big bang" migration from Lloyds' legacy platform to Sabadell's Proteo4UK system over a single weekend. Nearly 2 million customers were locked out; some could see other people's accounts. TSB did not return to normal until December 2018 -- eight months later. The bank lost 330 million pounds, 80,000 customers, and its CEO, and regulators fined it 48.65 million pounds.

**The fix:**

Deploy small, deploy often. Break large changes into independently deployable increments behind feature flags. Target deployment frequency of at least weekly, ideally daily.

**Detection rule:**

Flag releases containing more than 50 commits or more than 2 weeks of accumulated changes. Track release size (lines changed, files touched) and alert on outliers.
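
The two thresholds in this rule are simple enough to check mechanically. The following is a hedged sketch: the `Release` record and `release_warnings` function are hypothetical, and how the commit count and oldest-commit timestamp are obtained (for example from the version control history) is left to the surrounding tooling.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

MAX_COMMITS = 50
MAX_AGE = timedelta(weeks=2)


@dataclass
class Release:
    commit_count: int
    oldest_commit_at: datetime  # timestamp of the oldest unreleased commit
    released_at: datetime


def release_warnings(release: Release) -> list[str]:
    """Return one warning per threshold the release exceeds."""
    warnings = []
    if release.commit_count > MAX_COMMITS:
        warnings.append(f"{release.commit_count} commits exceeds {MAX_COMMITS}")
    age = release.released_at - release.oldest_commit_at
    if age > MAX_AGE:
        warnings.append(f"{age.days} days of accumulated changes exceeds {MAX_AGE.days}")
    return warnings
```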

---

### AP-04: Manual Deployment Steps

**Also known as:** Artisanal Deploys, Human Pipeline, Click-Ops
**Frequency:** Common
**Severity:** High
**Detection difficulty:** Easy

**What it looks like:**

Deployment requires a human to SSH into servers, run commands in sequence, edit config files by hand, and verify by eyeballing logs.

```text
DEPLOYMENT RUNBOOK (manual):
1. SSH to prod-web-01
2. cd /opt/app && git pull origin main
3. Edit config.yaml -- change DB_HOST to new address
4. pip install -r requirements.txt && python manage.py migrate
5. sudo systemctl restart app
6. Repeat steps 1-5 for prod-web-02 through prod-web-08
7. Check https://app.example.com and "make sure it works"
```

**Why teams do it:**

The team started with one server and manual deploys worked fine. Automation requires upfront investment. "We know the steps" becomes knowledge locked in one person's head.

**What goes wrong:**

GitLab, January 31, 2017. During manual troubleshooting of replication lag, a sysadmin ran `rm -rf /var/opt/gitlab/postgresql/data/` on the wrong server -- deleting 300 GB of live production data instead of the replica. By the time it was cancelled, only 4.5 GB remained. Recovery took 18 hours. GitLab lost 6 hours of data including 5,000 projects and 700 user accounts. None of their five backup/replication strategies were functioning.

**The fix:**

Automate everything. No human should type commands on production servers during deployment.

```yaml
# Fully automated -- no SSH, no manual steps
# (REGISTRY is assumed to be defined in the workflow's env)
on:
  push:
    branches: [main]
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: npm ci && npm test
      # Build and push under the same registry-qualified tag
      - run: docker build -t $REGISTRY/app:${{ github.sha }} . && docker push $REGISTRY/app:${{ github.sha }}
      - uses: azure/k8s-deploy@v4
        with:
          images: ${{ env.REGISTRY }}/app:${{ github.sha }}
          strategy: canary
```

**Detection rule:**

Flag any deployment process that includes SSH commands, manual file edits, or human-executed shell scripts. Audit deployment logs for interactive session indicators.
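
A first-pass audit of written runbooks can be automated with a text scan. This is a rough sketch under stated assumptions: the two patterns and the name `find_manual_steps` are illustrative, and keyword matching will miss manual steps that are phrased differently, so the output is a prompt for human review rather than a verdict.

```python
import re

# Illustrative patterns only; a real audit would tune these to its own tooling.
MANUAL_STEP_PATTERNS = {
    "ssh session": re.compile(r"\bssh\b", re.IGNORECASE),
    "manual file edit": re.compile(r"\b(edit|vi|vim|nano)\b", re.IGNORECASE),
}


def find_manual_steps(runbook: str) -> list[tuple[int, str]]:
    """Return (line_number, reason) for each runbook line that looks manual."""
    findings = []
    for number, line in enumerate(runbook.splitlines(), start=1):
        for reason, pattern in MANUAL_STEP_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, reason))
    return findings
```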

---

### AP-05: No Feature Flags

**Also known as:** Deploy-to-Release Coupling, All-or-Nothing Features, Binary Releases
**Frequency:** Common
**Severity:** High
**Detection difficulty:** Moderate

**What it looks like:**

Every deployment is also a release. Code ships to all users with no ability to toggle features, ramp gradually, or disable a broken feature without a full redeploy.

**Why teams do it:**

Feature flags add complexity: management, cleanup, combinatorial testing. Small teams feel the overhead outweighs the benefit.

**What goes wrong:**

Facebook, October 4, 2021. During routine backbone maintenance, a command to assess network capacity accidentally withdrew all BGP routes, disconnecting every Facebook data center from the internet. Facebook, WhatsApp, Instagram, and Messenger went down for six hours. Engineers could not access internal tools because those tools depended on the same infrastructure. Technicians had to physically travel to data centers. A feature flag on the maintenance command -- or an independent circuit breaker -- could have prevented the cascade.

**The fix:**

Decouple deployment from release. Ship code behind flags; activate features independently of deployments.

```python
# Feature flag controlling gradual rollout
# (feature_flags is the application's flag client; the checkout functions are app code)
def checkout(user, cart):
    if feature_flags.is_enabled("new_checkout_flow", user_id=user.id):
        return new_checkout(cart)
    else:
        return legacy_checkout(cart)
```
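
How `is_enabled` decides per user is what makes a gradual ramp possible. One common approach, sketched here as an assumption rather than as any particular flag library's behaviour, is to hash the flag name and user ID into a stable bucket and compare it to a rollout percentage. The `FeatureFlags` class and `set_rollout` method are hypothetical.

```python
import hashlib


class FeatureFlags:
    """In-memory flag store with deterministic percentage rollout."""

    def __init__(self) -> None:
        self._rollout: dict[str, int] = {}  # flag name -> percent of users enabled

    def set_rollout(self, flag: str, percent: int) -> None:
        if not 0 <= percent <= 100:
            raise ValueError("percent must be between 0 and 100")
        self._rollout[flag] = percent

    def is_enabled(self, flag: str, user_id: int) -> bool:
        percent = self._rollout.get(flag, 0)  # unknown flags default to off
        # Hash flag + user so each user lands in a stable bucket from 0 to 99.
        digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
        bucket = int(digest, 16) % 100
        return bucket < percent
```

Because the bucket depends only on the flag name and user ID, raising the percentage adds users without removing anyone already enabled, and setting it to 0 turns the feature off without a redeploy.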

**Detection rule:**

Flag any user-facing feature that ships without a feature flag. Track the ratio of deploys to feature flag changes -- if they are 1:1, flags are not being used for gradual rollout.

---

### AP-06: Deploying Untested Code

**Also known as:** Ship and Pray, YOLO Deploy, Test in Production
**Frequency:** Common
**Severity:** Critical
**Detection difficulty:** Moderate

**What it looks like:**

Code is deployed without passing through the full automated test suite. Tests are skipped "just this once" because the change is "trivial" or the deadline is urgent.

**Why teams do it:**

Slow CI pipelines create pressure to bypass. "It's just a config change." Hotfixes feel too urgent for a full test cycle. Management pressure overrides engineering discipline.

**What goes wrong:**

CrowdStrike, July 19, 2024. A "rapid response content update" to the Falcon Sensor kernel-level security software contained a faulty configuration file that caused Windows machines to crash on boot. Because it was classified as a "content update," it bypassed standard testing. Approximately 8.5 million systems were bricked worldwide -- the largest IT outage in history. Airlines grounded flights, hospitals postponed surgeries, banks went offline. Fortune 500 losses: an estimated $5.4 billion. Delta Air Lines filed a $500 million lawsuit alleging CrowdStrike "deployed untested software updates."

**The fix:**

No code reaches production without passing the full automated test suite. No exceptions.

```yaml
# CI pipeline -- tests are mandatory, not advisory
deploy:
  needs: [unit-tests, integration-tests, e2e-tests, security-scan]
  if: |
    needs.unit-tests.result == 'success' &&
    needs.integration-tests.result == 'success' &&
    needs.e2e-tests.result == 'success' &&
    needs.security-scan.result == 'success'
```
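
The important property of this gate is that a skipped or missing check blocks the deploy just as a failed one does. A small Python sketch of that rule follows; the check names are taken from the pipeline above, while `blocking_checks`, `can_deploy`, and the shape of the `results` mapping are assumptions made for illustration.

```python
REQUIRED_CHECKS = ("unit-tests", "integration-tests", "e2e-tests", "security-scan")


def blocking_checks(results: dict[str, str]) -> list[str]:
    """Return the required checks that did not finish with 'success'.

    A check missing from `results` (skipped or never run) also blocks.
    """
    return [name for name in REQUIRED_CHECKS if results.get(name) != "success"]


def can_deploy(results: dict[str, str]) -> bool:
    """Allow a deploy only when every required check succeeded."""
    return not blocking_checks(results)
```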

**Detection rule:**

Flag any deployment that does not have a completed test run as a prerequisite. Alert on manual pipeline overrides that skip test stages.

---

### AP-07: No Staging Environment

**Also known as:** Dev-to-Prod Pipeline, Missing Middle, Cowboy Pipeline
**Frequency:** Common
**Severity:** High
**Detection difficulty:** Easy

**What it looks like:**

Code moves directly from a developer's machine or CI build to production with no intermediate environment mirroring production's configuration and data volume.

**Why teams do it:**

Staging costs money and maintenance effort. "Our tests are good enough." Startups under cash pressure cut infrastructure corners first.

**What goes wrong:**

Without staging, the first time code meets production-like conditions is in production. The Fastly CDN outage (June 8, 2021) was caused by a software update deployed in May containing a bug triggerable only under specific conditions. It sat dormant until a customer's config change triggered it, taking down Amazon, BBC, CNN, Shopify, and UK/US government websites. A staging environment with production-level configuration complexity would have surfaced the trigger before it reached the global network.

**The fix:**

Maintain at least one pre-production environment that mirrors production in architecture, configuration, and data volume (anonymized). Deploy to staging first, run smoke tests, soak for a defined period, then promote to production.
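
The deploy, smoke-test, soak, promote order can be sketched as a single function. Everything named here is hypothetical: `promote_via_staging`, the `deploy` and `smoke_test` callables, and the choice to re-run the smoke test after the soak period are assumptions for illustration, not a prescribed implementation.

```python
import time
from typing import Callable


def promote_via_staging(
    deploy: Callable[[str], None],
    smoke_test: Callable[[str], bool],
    soak_seconds: float = 0.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Deploy to staging, smoke test, soak, re-check, then promote to production.

    Returns True if the release reached production, False if staging stopped it.
    """
    deploy("staging")
    if not smoke_test("staging"):
        return False
    sleep(soak_seconds)  # soak period: give slow failures time to surface
    if not smoke_test("staging"):
        return False
    deploy("production")
    return True
```

Passing `sleep` as a parameter lets tests skip the real wait while production code keeps the default.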

**Detection rule:**

Flag CI/CD pipelines that deploy directly to production without a staging step. Audit infrastructure-as-code for production resources without corresponding staging equivalents.

---

### AP-08: Config Drift Across Environments

**Also known as:** Snowflake Environments, Works on My Machine, Env Parity Violations
**Frequency:** Very Common
**Severity:** High
**Detection difficulty:** Hard

**What it looks like:**

Environments differ in software versions, database schemas, or infrastructure topology. Dev uses SQLite while production uses PostgreSQL. Staging runs Python 3.11 while production runs 3.9.

```text
# dev.env                       # prod.env
DATABASE_URL=sqlite:///db       DATABASE_URL=postgres://prod-host/db
CACHE_DRIVER=memory             CACHE_DRIVER=redis
PYTHON_VERSION=3.11             PYTHON_VERSION=3.9   # different!
```

**Why teams do it:**

Local development prioritizes convenience over fidelity. Drift accumulates gradually and invisibly across teams.

**What goes wrong:**

Config drift means "works in staging" tells you nothing about production. GitLab's backup failure that compounded the 2017 database deletion is a textbook example: backups silently failed because pg_dump 9.2 was running against PostgreSQL 9.6. The version mismatch went undetected because nobody verified tool compatibility in production. Teams routinely discover that staging "green" means nothing when production has different connection pool sizes, timeout values, or TLS configurations.

**The fix:**

Use infrastructure-as-code to define environments from shared templates. Parameterize only values that must differ (hostnames, credentials) and lock everything else.

```hcl
# Shared Terraform module -- environments differ only in variables
module "app_env" {
  source           = "./modules/app"
  environment      = var.env_name      # "staging" or "production"
  instance_type    = var.instance_type # parameterized
  postgres_version = "16"              # locked across all envs
  redis_version    = "7.2"             # locked across all envs
}
```
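
The "parameterize only what must differ" rule can also be verified after the fact by comparing the effective configuration of two environments. In this sketch the allow-list keys and the function name `parity_violations` are assumptions; a real check would load the two mappings from wherever each environment's settings actually live.

```python
# Keys that are allowed to differ between environments (assumed names).
ALLOWED_TO_DIFFER = {"DATABASE_HOST", "DATABASE_PASSWORD"}


def parity_violations(staging: dict[str, str], production: dict[str, str]) -> dict[str, tuple]:
    """Return {key: (staging_value, production_value)} for every unexpected difference."""
    violations = {}
    for key in sorted(staging.keys() | production.keys()):
        if key in ALLOWED_TO_DIFFER:
            continue
        s_value, p_value = staging.get(key), production.get(key)
        if s_value != p_value:  # a key missing on one side shows up as None
            violations[key] = (s_value, p_value)
    return violations
```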

**Detection rule:**

Diff environment configuration files weekly. Flag version mismatches for databases, runtimes, and key dependencies between staging and production. Automate parity checks in CI.

---

### AP-09: No Post-Deploy Monitoring

**Also known as:** Fire and Forget, Deploy and Walk Away, Blind Deploy
**Frequency:** Common
**Severity:** Critical
**Detection difficulty:** Moderate

**What it looks like:**

The team deploys and immediately moves on. No one watches error rates, latency, or business metrics for the first 30-60 minutes after deploy.

**Why teams do it:**

Deploys happen frequently and "usually work." Monitoring dashboards exist but nobody watches them post-deploy. Alert thresholds are too coarse to catch gradual degradations.

**What goes wrong:**

Knight Capital's 45-minute catastrophe is the canonical example. After deploying to 7 of 8 servers, the team moved on. No one watched for anomalous trading patterns. The eighth server, running old code, generated millions of erroneous orders. By the time anyone noticed, $440 million was gone. Silent failures -- memory leaks, slow query regressions, elevated error rates -- can compound for hours before a customer complaint triggers investigation.

**The fix:**

Implement mandatory post-deploy observation windows with automated anomaly detection.

```yaml
# Post-deploy monitoring gate
post_deploy:
  duration: 30m
  checks:
    - error_rate < baseline * 1.1
    - p99_latency < baseline * 1.2
    - cpu_usage < 80%
    - business_metric.orders_per_minute > baseline * 0.9
  on_violation: auto_rollback
```
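
To make the four checks concrete, here is a sketch of how they might be evaluated against a pre-deploy baseline. The `Metrics` record, its field names and units, and the `violations` function are hypothetical; the thresholds are the ones listed in the gate above.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    error_rate: float
    p99_latency_ms: float
    cpu_usage_percent: float
    orders_per_minute: float


def violations(current: Metrics, baseline: Metrics) -> list[str]:
    """Return the names of the checks that the current metrics violate."""
    checks = {
        "error_rate": current.error_rate < baseline.error_rate * 1.1,
        "p99_latency": current.p99_latency_ms < baseline.p99_latency_ms * 1.2,
        "cpu_usage": current.cpu_usage_percent < 80,
        "orders_per_minute": current.orders_per_minute > baseline.orders_per_minute * 0.9,
    }
    return [name for name, passed in checks.items() if not passed]
```

A non-empty result during the observation window would be the signal to trigger the rollback path.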

**Detection rule:**

Flag deployments where no one accessed monitoring dashboards within 30 minutes post-deploy. Track time-to-detection for post-deploy incidents -- if it exceeds 15 minutes, monitoring is insufficient.

---

### AP-10: Coupled Database Migrations

**Also known as:** Big Bang Migration, Schema-Code Lock-Step, Migration Roulette
**Frequency:** Common
**Severity:** Critical
**Detection difficulty:** Moderate

**What it looks like:**

Database schema changes are deployed simultaneously with application code changes in a single atomic release. The migration and the code that depends on it ship together, making rollback of either impossible without rolling back both.

```sql
-- Migration runs during deploy, before new code starts
ALTER TABLE users DROP COLUMN legacy_role;
ALTER TABLE users ADD COLUMN role_id INTEGER REFERENCES roles(id);
-- If new code fails, old code crashes because legacy_role is gone
```

**Why teams do it:**

It feels logical to ship the schema change with the code that uses it. Separate deployments require handling both schemas temporarily. Migration frameworks encourage this coupling by default.

**What goes wrong:**

When migration and code are coupled, rollback becomes impossible: old code expects the old schema, but the migration has already altered it. The TSB Bank disaster exemplifies this -- the entire data migration was coupled with the application cutover, and when the application failed, there was no way to safely revert the data layer.

**The fix:**

Use the expand-contract pattern: (1) expand -- add new columns, (2) deploy code that writes to both old and new columns, and backfill existing rows, (3) contract -- remove old columns after the new code has been stable for days.

```sql
-- Phase 1: EXPAND (add new, keep old)
ALTER TABLE users ADD COLUMN role_id INTEGER REFERENCES roles(id);
-- Phase 2: CODE deploys, writes both columns, reads new with fallback
-- Phase 3: CONTRACT (after stable for days)
ALTER TABLE users DROP COLUMN legacy_role;
```
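
Phase 2 is the part that lives in application code, so a small runnable sketch may help. It uses an in-memory SQLite database purely for illustration; the `ROLE_IDS` mapping and the `set_role` / `get_role` helpers are hypothetical, and the foreign key to `roles` is omitted to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Schema after Phase 1 (expand): both the old and the new column exist.
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, legacy_role TEXT, role_id INTEGER)")

ROLE_IDS = {"admin": 1, "member": 2}  # hypothetical mapping from old role names to new ids
ROLE_NAMES = {role_id: name for name, role_id in ROLE_IDS.items()}


def set_role(user_id: int, role: str) -> None:
    """Phase 2 write path: keep both columns in sync so either code version can read."""
    conn.execute(
        "UPDATE users SET legacy_role = ?, role_id = ? WHERE id = ?",
        (role, ROLE_IDS[role], user_id),
    )


def get_role(user_id: int) -> str:
    """Phase 2 read path: prefer the new column, fall back to the old one."""
    legacy_role, role_id = conn.execute(
        "SELECT legacy_role, role_id FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return ROLE_NAMES[role_id] if role_id is not None else legacy_role


# Row written before the expand phase: only the old column is populated.
conn.execute("INSERT INTO users (id, legacy_role) VALUES (1, 'admin')")
```

Because every write keeps `legacy_role` current, rolling the application back to the old code during this phase does not lose data.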

**Detection rule:**

Flag pull requests that contain both migration files and application code changes. Flag migrations that use DROP COLUMN or destructive ALTER without a corresponding expand phase.

---

### AP-11: No Blue-Green or Canary Strategy

**Also known as:** All-at-Once Cutover, Replace-and-Pray, In-Place Upgrade
**Frequency:** Common
**Severity:** High
**Detection difficulty:** Easy

**What it looks like:**

Every deployment replaces the running application in-place for all users simultaneously. No traffic routing to a subset, no parallel old-version environment, no instant switch.

**Why teams do it:**

Blue-green requires double the infrastructure. Canary needs traffic routing capabilities. "Our app is stateless, we just restart."

**What goes wrong:**

Cloudflare, November 18, 2025. A routine ClickHouse database permission change caused the Bot Management system to receive malformed configuration files that doubled in size, overwhelming the global network. Because the faulty configuration was pushed to the entire fleet simultaneously, the blast radius was the entire Cloudflare network. A canary deployment to 1% of edge nodes would have detected the anomaly before global impact.

**The fix:**

Implement progressive delivery: canary a new version to a small percentage of traffic, observe metrics, then gradually increase.

```yaml
# Kubernetes canary with Argo Rollouts
apiVersion: argoproj.io/v1alpha1
kind: Rollout
spec:
  strategy:
    canary:
      steps:
        - setWeight: 5
        - pause: { duration: 5m }
        - analysis:
            templates: [{ templateName: success-rate }]
        - setWeight: 25
        - pause: { duration: 10m }
        - setWeight: 50
        - pause: { duration: 10m }
        - setWeight: 100
```
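
Stripped of the Kubernetes specifics, progressive delivery is a loop that widens traffic only while the new version stays healthy. The sketch below is a simplification and not how Argo Rollouts is implemented: `run_canary`, `set_weight`, and `is_healthy` are hypothetical names, the pauses between steps are omitted, and health is checked at every step rather than once after the first.

```python
from typing import Callable

CANARY_WEIGHTS = (5, 25, 50, 100)  # percent of traffic, mirroring the steps above


def run_canary(
    set_weight: Callable[[int], None],
    is_healthy: Callable[[int], bool],
) -> bool:
    """Shift traffic step by step; send everything back to stable on the first bad reading.

    Returns True if the new version reached 100% of traffic.
    """
    for weight in CANARY_WEIGHTS:
        set_weight(weight)
        if not is_healthy(weight):
            set_weight(0)  # abort: route all traffic back to the stable version
            return False
    return True
```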
|
|
469
|
+
|
|
470
|
+
**Detection rule:**
|
|
471
|
+
|
|
472
|
+
Flag deployment configurations that route 100% of traffic to the new version immediately. Flag infrastructure that lacks a traffic-splitting mechanism.
|
|
473
|
+
|
|
474
|
+
---
|
|
475
|
+
|
|
476
|
+
### AP-12: Hardcoded Environment Values
|
|
477
|
+
|
|
478
|
+
**Also known as:** Baked-In Config, Magic Strings, Env-in-Code
|
|
479
|
+
**Frequency:** Very Common
|
|
480
|
+
**Severity:** High
|
|
481
|
+
**Detection difficulty:** Easy
|
|
482
|
+
|
|
483
|
+
**What it looks like:**
|
|
484
|
+
|
|
485
|
+
Connection strings, API keys, hostnames, and environment-specific values are embedded directly in application code or committed to version control.
|
|
486
|
+
|
|
487
|
+
```python
|
|
488
|
+
# hardcoded.py
|
|
489
|
+
DATABASE_URL = "postgres://admin:s3cret@prod-db.internal:5432/myapp"
|
|
490
|
+
API_KEY = "sk-live-abc123def456"
|
|
491
|
+
REDIS_HOST = "10.0.1.42"
|
|
492
|
+
FEATURE_LIMIT = 1000 # different in staging (100) -- but hardcoded here
|
|
493
|
+
```
|
|
494
|
+
|
|
495
|
+
**Why teams do it:**
|
|
496
|
+
|
|
497
|
+
It works immediately. No environment variables, no secret management infrastructure. The prototype shipped with hardcoded values and nobody replaced them.
|
|
498
|
+
|
|
499
|
+
**What goes wrong:**
|
|
500
|
+
|
|
501
|
+
Hardcoded credentials are the root cause of countless breaches -- in 2023, researchers found over 100,000 valid API keys exposed in public GitHub repositories. Beyond security, hardcoded values make the same artifact behave differently across environments. The SolarWinds attack (discovered December 2020) exploited the build pipeline partly because credentials and configuration were insufficiently separated from code, enabling attackers to inject malicious code that was signed and distributed to 18,000 customers.
|
|
502
|
+
|
|
503
|
+
**The fix:**
|
|
504
|
+
|
|
505
|
+
Externalize all environment-specific values. Use environment variables, secret managers, and config services.
|
|
506
|
+
|
|
507
|
+
```python
|
|
508
|
+
import os
|
|
509
|
+
|
|
510
|
+
DATABASE_URL = os.environ["DATABASE_URL"] # injected at runtime
|
|
511
|
+
API_KEY = os.environ["API_KEY"] # from secret manager
|
|
512
|
+
REDIS_HOST = os.environ.get("REDIS_HOST", "localhost") # safe default
|
|
513
|
+
FEATURE_LIMIT = int(os.environ.get("FEATURE_LIMIT", "100"))
|
|
514
|
+
```
|
|
515
|
+
|
|
516
|
+
**Detection rule:**
|
|
517
|
+
|
|
518
|
+
Scan code for patterns matching connection strings, API key formats (for example `sk_live_`, `AKIA`, `ghp_`), IP addresses, and hardcoded port numbers. Use tools like `gitleaks`, `trufflehog`, or `detect-secrets` in CI.
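
To make the pattern-matching idea concrete, here is a minimal sketch of a line-based scanner. The four regular expressions are illustrative assumptions covering only a handful of formats; dedicated scanners ship far larger rule sets plus entropy checks, so this is not a substitute for them.

```python
import re

# Illustrative patterns only (assumed for this sketch).
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_token": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
    "url_with_password": re.compile(r"\b[a-z][a-z0-9+.-]*://[^\s:/@]+:[^\s@/]+@"),
    "ipv4_address": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
}


def scan_text(text):
    """Return (line_number, rule_name) pairs for every suspicious line."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


sample = '''DATABASE_URL = "postgres://admin:s3cret@prod-db.internal:5432/myapp"
REDIS_HOST = "10.0.1.42"
DATABASE_URL = os.environ["DATABASE_URL"]
'''
print(scan_text(sample))
# [(1, 'url_with_password'), (2, 'ipv4_address')]
```

The third sample line, which reads the value from the environment, produces no finding, which is the behavior the fix above aims for.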
|
|
519
|
+
|
|
520
|
+
---
|
|
521
|
+
|
|
522
|
+
### AP-13: No Backup Before Migrations
|
|
523
|
+
|
|
524
|
+
**Also known as:** Naked Migration, Leap of Faith, YOLO Schema Change
|
|
525
|
+
**Frequency:** Common
|
|
526
|
+
**Severity:** Critical
|
|
527
|
+
**Detection difficulty:** Easy
|
|
528
|
+
|
|
529
|
+
**What it looks like:**
|
|
530
|
+
|
|
531
|
+
Database migrations run against production without first taking a backup or snapshot. The team assumes the migration will succeed.
|
|
532
|
+
|
|
533
|
+
**Why teams do it:**
|
|
534
|
+
|
|
535
|
+
Backups are slow on large databases. "The migration is simple, just adding a column." Nobody wants to add 30 minutes to the deploy window for a backup.
|
|
536
|
+
|
|
537
|
+
**What goes wrong:**
|
|
538
|
+
|
|
539
|
+
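GitLab's 2017 database disaster is the definitive cautionary tale, even though the trigger was a manual cleanup rather than a migration. When `rm -rf` deleted roughly 300 GB of production data, the team discovered that regular PostgreSQL backups had been silently failing (pg_dump 9.2 against PostgreSQL 9.6). LVM snapshots ran only once every 24 hours, Azure disk snapshots were not enabled for the database servers, and none of the five backup and replication mechanisms worked reliably. Recovery relied on a snapshot that happened to have been taken by hand about six hours earlier. A verified, recent backup would have prevented the 18-hour recovery and the permanent loss of 6 hours of data.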
|
|
540
|
+
|
|
541
|
+
**The fix:**
|
|
542
|
+
|
|
543
|
+
Take and verify a backup immediately before every migration. Make backup verification part of the deployment pipeline, not a separate ops concern.
|
|
544
|
+
|
|
545
|
+
```bash
#!/bin/bash
# Pre-migration backup with verification
set -euo pipefail  # abort if pg_dump, gzip, or the restore fails

BACKUP="pre_migration_$(date +%Y%m%d_%H%M%S).sql.gz"
pg_dump "$DATABASE_URL" | gzip > "$BACKUP"

# Verify: restore to a scratch test database and check integrity
gunzip -c "$BACKUP" | psql -v ON_ERROR_STOP=1 "$TEST_DATABASE_URL"
ROWS=$(psql -At -c "SELECT count(*) FROM users" "$TEST_DATABASE_URL")
if [ "$ROWS" -lt 1 ]; then
  echo "BACKUP VERIFICATION FAILED" >&2
  exit 1
fi
echo "Backup verified ($ROWS users). Proceeding with migration."
```
|
|
557
|
+
|
|
558
|
+
**Detection rule:**
|
|
559
|
+
|
|
560
|
+
Flag migration scripts that do not call a backup step. Alert if the most recent verified backup is older than the deployment window.
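
The freshness alert can be sketched as a single comparison. This example assumes "older than the deployment window" means the backup was verified more than a fixed interval before the window opens; the one-hour default and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta, timezone


def backup_is_fresh(verified_at, window_start, max_age=timedelta(hours=1)):
    """True when the backup was verified at most max_age before the window opens.

    A verification timestamp later than window_start is rejected too,
    since it would mean the backup was checked after the deploy began.
    """
    age = window_start - verified_at
    return timedelta(0) <= age <= max_age


window = datetime(2026, 3, 8, 14, 0, tzinfo=timezone.utc)
recent = datetime(2026, 3, 8, 13, 40, tzinfo=timezone.utc)
yesterday = datetime(2026, 3, 7, 13, 40, tzinfo=timezone.utc)
print(backup_is_fresh(recent, window))     # True
print(backup_is_fresh(yesterday, window))  # False
```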
|
|
561
|
+
|
|
562
|
+
---
|
|
563
|
+
|
|
564
|
+
### AP-14: Deployment Order Ignored
|
|
565
|
+
|
|
566
|
+
**Also known as:** Out-of-Order Deploy, Dependency Blindness, Race to Production
|
|
567
|
+
**Frequency:** Occasional
|
|
568
|
+
**Severity:** High
|
|
569
|
+
**Detection difficulty:** Hard
|
|
570
|
+
|
|
571
|
+
**What it looks like:**
|
|
572
|
+
|
|
573
|
+
Services are deployed in arbitrary order without considering inter-service dependencies. A consumer is deployed before its provider, or a breaking API change ships before consumers are updated.
|
|
574
|
+
|
|
575
|
+
**Why teams do it:**
|
|
576
|
+
|
|
577
|
+
Each team owns its own pipeline and deploys independently. Nobody maintains a dependency graph. "Microservices are supposed to be independent."
|
|
578
|
+
|
|
579
|
+
**What goes wrong:**
|
|
580
|
+
|
|
581
|
+
The AWS DynamoDB outage (October 2025) demonstrated cascading failures: an automation error triggered a chain reaction propagating to thousands of applications -- consumer apps, smart devices, banking systems -- because upstream changes were not coordinated with dependent services. Deploying a provider with a breaking API change before consumers are updated causes immediate 5xx errors cascading through the call graph.
|
|
582
|
+
|
|
583
|
+
**The fix:**
|
|
584
|
+
|
|
585
|
+
Maintain a service dependency graph. Deploy providers before consumers. Use API versioning so breaking changes coexist with old versions during transition.
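
"Deploy providers before consumers" is a topological sort of the dependency graph. A minimal sketch using Python's standard `graphlib` module follows; the service names and the `DEPENDS_ON` mapping are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library since Python 3.9

# Hypothetical services: each key consumes the services in its set.
DEPENDS_ON = {
    "checkout-api": {"payments", "inventory"},
    "payments": {"ledger"},
    "inventory": set(),
    "ledger": set(),
}


def deploy_order(depends_on):
    """Return services ordered so every provider precedes its consumers.

    Raises graphlib.CycleError when the graph contains a circular
    dependency, which is itself worth flagging before any deploy.
    """
    return list(TopologicalSorter(depends_on).static_order())


order = deploy_order(DEPENDS_ON)
print(order)  # providers such as "ledger" appear before "checkout-api"
```

Several valid orders usually exist, so the exact list may vary; the guarantee is only that no consumer appears before a service it depends on.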
|
|
586
|
+
|
|
587
|
+
**Detection rule:**
|
|
588
|
+
|
|
589
|
+
Flag API-breaking changes (removed endpoints, changed contracts) without a consumer deployment plan. Track deployment order against the dependency graph.
|
|
590
|
+
|
|
591
|
+
---
|
|
592
|
+
|
|
593
|
+
### AP-15: SSH to Production
|
|
594
|
+
|
|
595
|
+
**Also known as:** Cowboy Ops, Direct Server Access, Admin Shell, Hotfix-by-Hand
|
|
596
|
+
**Frequency:** Common
|
|
597
|
+
**Severity:** Critical
|
|
598
|
+
**Detection difficulty:** Moderate
|
|
599
|
+
|
|
600
|
+
**What it looks like:**
|
|
601
|
+
|
|
602
|
+
Engineers SSH directly into production servers to diagnose issues, apply hotfixes, edit configs, or run database queries by hand.
|
|
603
|
+
|
|
604
|
+
```bash
|
|
605
|
+
ssh admin@prod-server-01
|
|
606
|
+
vim /opt/app/config.yaml # manual edit, no review
|
|
607
|
+
sudo systemctl restart app
|
|
608
|
+
psql -h prod-db -c "UPDATE users SET role='admin' WHERE id=42;"
|
|
609
|
+
```
|
|
610
|
+
|
|
611
|
+
**Why teams do it:**
|
|
612
|
+
|
|
613
|
+
It is the fastest path from "problem identified" to "problem fixed." Incident pressure demands speed. The CI/CD pipeline feels too slow during an outage.
|
|
614
|
+
|
|
615
|
+
**What goes wrong:**
|
|
616
|
+
|
|
617
|
+
GitLab's 2017 database deletion happened because an engineer, connected over SSH, ran a destructive command on what they believed was the replica but was in fact the primary production database server. Direct production access bypasses every safety control: code review, testing, audit logging, and rollback capability. Changes made over SSH are invisible to version control, unreproducible, and often forgotten. Compliance frameworks such as SOC 2, PCI DSS, and HIPAA expect production access to be restricted, logged, and reviewed, so unaudited interactive sessions need compensating controls.
|
|
618
|
+
|
|
619
|
+
**The fix:**
|
|
620
|
+
|
|
621
|
+
Eliminate routine direct production access. All changes flow through version-controlled, reviewed, automated pipelines. For incident response, use runbook automation and read-only observability tools, and reserve interactive access for an audited break-glass procedure.
|
|
622
|
+
|
|
623
|
+
```yaml
# Controlled break-glass procedure instead of SSH
break_glass:
  requires: [approval_from_on_call_lead, justification, session_recording]
  limits: { time: 60m, audit_log: immutable }
  post_access: [review_all_changes, commit_to_repo]
```
|
|
630
|
+
|
|
631
|
+
**Detection rule:**
|
|
632
|
+
|
|
633
|
+
Alert on any SSH connection to production servers. Flag production firewall rules that allow SSH from developer workstations. Monitor for interactive shell sessions on production hosts.
|
|
634
|
+
|
|
635
|
+
---
|
|
636
|
+
|
|
637
|
+
### AP-16: No Deployment Runbooks
|
|
638
|
+
|
|
639
|
+
**Also known as:** Tribal Knowledge Deploys, Oral Tradition Ops, Hero-Dependent Releases
|
|
640
|
+
**Frequency:** Common
|
|
641
|
+
**Severity:** High
|
|
642
|
+
**Detection difficulty:** Easy
|
|
643
|
+
|
|
644
|
+
**What it looks like:**
|
|
645
|
+
|
|
646
|
+
The deployment procedure exists only in the heads of one or two senior engineers. When they are unavailable, nobody knows how to deploy or what to do when something goes wrong.
|
|
647
|
+
|
|
648
|
+
**Why teams do it:**
|
|
649
|
+
|
|
650
|
+
Documentation is boring. The process "isn't that complicated." Key engineers are always available (until they're not).
|
|
651
|
+
|
|
652
|
+
**What goes wrong:**
|
|
653
|
+
|
|
654
|
+
When the sole deployment expert is unavailable, the team either cannot deploy or deploys incorrectly. Facebook's 2021 outage recovery was slowed because engineers needing physical data center access had to be dispatched and authenticated in person -- processes not well-documented for a scenario where all remote tools were simultaneously down. Without runbooks, incident response devolves into panicked improvisation.
|
|
655
|
+
|
|
656
|
+
**The fix:**
|
|
657
|
+
|
|
658
|
+
Write runbooks for every deployment procedure and common failure mode. Store them alongside the code. Test them by having someone other than the author follow them.
|
|
659
|
+
|
|
660
|
+
**Detection rule:**
|
|
661
|
+
|
|
662
|
+
Flag services with no `RUNBOOK.md` or linked runbook. Track bus factor -- if only one person has deployed a service in 90 days, the runbook is insufficient.
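
The bus-factor check can be sketched from deployment records alone. The example assumes each record is a hypothetical `(service, deployer, date)` tuple pulled from whatever deploy log exists; the names and dates are invented.

```python
from collections import defaultdict
from datetime import date, timedelta


def single_deployer_services(deploys, today, window_days=90):
    """Return services deployed by exactly one person within the window."""
    cutoff = today - timedelta(days=window_days)
    deployers = defaultdict(set)
    for service, person, day in deploys:
        if day >= cutoff:
            deployers[service].add(person)
    return sorted(s for s, people in deployers.items() if len(people) == 1)


deploys = [
    ("checkout-api", "jane", date(2026, 3, 1)),
    ("checkout-api", "omar", date(2026, 2, 10)),
    ("search", "jane", date(2026, 1, 15)),
    ("search", "jane", date(2026, 2, 20)),
    ("search", "lee", date(2025, 6, 1)),  # outside the 90-day window
]
print(single_deployer_services(deploys, today=date(2026, 3, 8)))
# ['search']
```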
|
|
663
|
+
|
|
664
|
+
---
|
|
665
|
+
|
|
666
|
+
### AP-17: Shared Deploy Credentials
|
|
667
|
+
|
|
668
|
+
**Also known as:** Communal Keys, Shared Service Account, Everyone Is Root
|
|
669
|
+
**Frequency:** Common
|
|
670
|
+
**Severity:** Critical
|
|
671
|
+
**Detection difficulty:** Moderate
|
|
672
|
+
|
|
673
|
+
**What it looks like:**
|
|
674
|
+
|
|
675
|
+
The entire team uses the same credentials to deploy: a shared SSH key, a shared CI/CD service account, or a single API token passed around via Slack.
|
|
676
|
+
|
|
677
|
+
```text
|
|
678
|
+
# Shared credentials in team wiki (real examples found in audits)
|
|
679
|
+
SSH key: /shared/deploy_key (same key on all machines)
|
|
680
|
+
AWS Access Key: AKIAEXAMPLE123456
|
|
681
|
+
Docker Hub: deploy-bot / P@ssw0rd2024
|
|
682
|
+
```
|
|
683
|
+
|
|
684
|
+
**Why teams do it:**
|
|
685
|
+
|
|
686
|
+
Individual credentials are more work. Rotating is a chore. The team is small and trusts each other. "We'll fix it when we scale."
|
|
687
|
+
|
|
688
|
+
**What goes wrong:**
|
|
689
|
+
|
|
690
|
+
With shared credentials, there is no audit trail -- when something breaks, you cannot determine who deployed what. When someone leaves the company, you must rotate every shared credential or accept that an ex-employee retains production access. The SolarWinds attack succeeded partly because attackers could operate within the build pipeline using compromised credentials, and the lack of individual attribution made intrusion harder to detect.
|
|
691
|
+
|
|
692
|
+
**The fix:**
|
|
693
|
+
|
|
694
|
+
Issue individual, scoped credentials with short-lived tokens. Use OIDC federation for CI/CD.
|
|
695
|
+
|
|
696
|
+
```yaml
# GitHub Actions with OIDC -- no long-lived credentials
permissions:
  id-token: write  # lets the job request an OIDC token
jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/deploy-${{ github.actor }}
          aws-region: us-east-1  # placeholder region
```
|
|
707
|
+
|
|
708
|
+
**Detection rule:**
|
|
709
|
+
|
|
710
|
+
Flag pipelines using long-lived credentials or static API keys. Alert on credentials not rotated in 90 days. Audit for shared SSH keys across machines.
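
The shared-key audit reduces to finding fingerprints that show up on more than one machine. A minimal sketch, assuming a hypothetical inventory that maps each machine to the fingerprints of the private deploy keys found on it:

```python
from collections import defaultdict


def shared_key_fingerprints(key_fingerprints_by_machine):
    """Return {fingerprint: sorted machines} for keys found on several machines."""
    machines_by_key = defaultdict(set)
    for machine, fingerprints in key_fingerprints_by_machine.items():
        for fingerprint in fingerprints:
            machines_by_key[fingerprint].add(machine)
    return {
        fingerprint: sorted(machines)
        for fingerprint, machines in machines_by_key.items()
        if len(machines) > 1
    }


inventory = {
    "laptop-alice": {"SHA256:deploykey", "SHA256:alice"},
    "laptop-bob": {"SHA256:deploykey", "SHA256:bob"},
}
print(shared_key_fingerprints(inventory))
# {'SHA256:deploykey': ['laptop-alice', 'laptop-bob']}
```

How the inventory is collected is left open; the fingerprint strings above are placeholders, not real key fingerprints.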
|
|
711
|
+
|
|
712
|
+
---
|
|
713
|
+
|
|
714
|
+
### AP-18: No Audit Trail
|
|
715
|
+
|
|
716
|
+
**Also known as:** Ghost Deploys, Untracked Changes, Who Did What?
|
|
717
|
+
**Frequency:** Common
|
|
718
|
+
**Severity:** High
|
|
719
|
+
**Detection difficulty:** Moderate
|
|
720
|
+
|
|
721
|
+
**What it looks like:**
|
|
722
|
+
|
|
723
|
+
Deployments happen without recording who triggered them, what version was deployed, or whether the deploy succeeded. When an incident occurs, the team cannot answer "what changed recently?"
|
|
724
|
+
|
|
725
|
+
**Why teams do it:**
|
|
726
|
+
|
|
727
|
+
Logging feels like overhead. The CI/CD tool has some logs but they are not centralized or searchable. Manual deploys bypass logging entirely.
|
|
728
|
+
|
|
729
|
+
**What goes wrong:**
|
|
730
|
+
|
|
731
|
+
Without an audit trail, incident response starts with "does anyone know if anything was deployed recently?" The Knight Capital investigation revealed that the partial deployment (7 of 8 servers) was not logged in a way that made the anomaly visible. If an audit trail had shown "8 expected, 7 completed," the problem could have been caught before market open. SOC 2 Type II, PCI DSS, and HIPAA audits all expect evidence of who changed production and when; without a deployment audit trail, that evidence cannot be produced.
|
|
732
|
+
|
|
733
|
+
**The fix:**
|
|
734
|
+
|
|
735
|
+
Log every deployment to a centralized, immutable audit system with structured metadata.
|
|
736
|
+
|
|
737
|
+
```json
|
|
738
|
+
{
|
|
739
|
+
"event": "deployment",
|
|
740
|
+
"timestamp": "2026-03-08T14:23:00Z",
|
|
741
|
+
"deployer": "jane.doe@company.com",
|
|
742
|
+
"service": "checkout-api",
|
|
743
|
+
"version": "v2.14.3",
|
|
744
|
+
"commit": "abc123f",
|
|
745
|
+
"environment": "production",
|
|
746
|
+
"status": "success",
|
|
747
|
+
"rollback_version": "v2.14.2",
|
|
748
|
+
"change_ticket": "JIRA-4521"
|
|
749
|
+
}
|
|
750
|
+
```
|
|
751
|
+
|
|
752
|
+
**Detection rule:**
|
|
753
|
+
|
|
754
|
+
Flag deployments without a structured audit log entry. Alert on version jumps (e.g., v2.14.2 to v2.14.5 suggests unlogged deploys).
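
The version-jump alert can be sketched as a comparison of consecutive log entries. This assumes every patch release is expected to reach production, so a skipped patch number hints at a deploy that was never logged; teams that routinely skip versions would need a different signal.

```python
def parse_version(tag):
    """Turn 'v2.14.3' into (2, 14, 3)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def find_version_jumps(logged_tags):
    """Return (previous, current) pairs whose patch number skips ahead."""
    jumps = []
    for previous, current in zip(logged_tags, logged_tags[1:]):
        p, c = parse_version(previous), parse_version(current)
        same_series = p[:2] == c[:2]  # same major.minor
        if same_series and c[2] - p[2] > 1:
            jumps.append((previous, current))
    return jumps


print(find_version_jumps(["v2.14.1", "v2.14.2", "v2.14.5"]))
# [('v2.14.2', 'v2.14.5')]
```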
|
|
755
|
+
|
|
756
|
+
---
|
|
757
|
+
|
|
758
|
+
### AP-19: Untested Rollbacks
|
|
759
|
+
|
|
760
|
+
**Also known as:** Rollback Theater, Paper Rollback, Theoretical Revert
|
|
761
|
+
**Frequency:** Very Common
|
|
762
|
+
**Severity:** Critical
|
|
763
|
+
**Detection difficulty:** Hard
|
|
764
|
+
|
|
765
|
+
**What it looks like:**
|
|
766
|
+
|
|
767
|
+
The deployment pipeline includes a rollback step, but it has never been executed. No one has verified it against real conditions: data migrations, cache invalidation, session state, downstream dependencies.
|
|
768
|
+
|
|
769
|
+
**Why teams do it:**
|
|
770
|
+
|
|
771
|
+
Testing rollbacks requires deliberate failure injection, which is scary and time-consuming. "We have a rollback plan" satisfies the process checkbox without actually testing it.
|
|
772
|
+
|
|
773
|
+
**What goes wrong:**
|
|
774
|
+
|
|
775
|
+
None of GitLab's five backup and replication mechanisms worked reliably when a real disaster struck. Untested recovery mechanisms are not recovery mechanisms. A rollback script that works against a clean database may fail post-migration. A rollback for stateless services may corrupt state in stateful ones. A 2023 Gartner study found that 75% of organizations that had not tested their disaster recovery plans failed to meet recovery time objectives during actual incidents.
|
|
776
|
+
|
|
777
|
+
**The fix:**
|
|
778
|
+
|
|
779
|
+
Regularly test rollbacks in production-like environments. Include rollback testing in your deployment pipeline validation.
|
|
780
|
+
|
|
781
|
+
```yaml
# Monthly rollback drill
rollback_drill:
  schedule: "0 10 1 * *"  # 10:00 on the first of every month
  steps:
    - deploy: app:$NEXT_VERSION
    - verify: health_check
    - rollback: app:$CURRENT_VERSION
    - verify: [health_check, data_integrity_check]
    - report: "post to #engineering with results"  # quoted: an unquoted " #" starts a YAML comment
```
|
|
792
|
+
|
|
793
|
+
**Detection rule:**
|
|
794
|
+
|
|
795
|
+
Track last rollback execution per service. Flag services where rollback has never been tested or was last tested more than 90 days ago.
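
A minimal sketch of this rule, assuming a hypothetical mapping from service name to the date of its last rollback test, with `None` for services that have never been tested:

```python
from datetime import date, timedelta


def stale_rollback_tests(last_tested, today, max_age_days=90):
    """Return services whose rollback was never tested or tested too long ago."""
    limit = today - timedelta(days=max_age_days)
    return sorted(
        service
        for service, tested in last_tested.items()
        if tested is None or tested < limit
    )


last_tested = {
    "checkout-api": date(2026, 2, 1),
    "search": date(2025, 10, 1),
    "billing": None,  # rollback has never been exercised
}
print(stale_rollback_tests(last_tested, today=date(2026, 3, 8)))
# ['billing', 'search']
```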
|
|
796
|
+
|
|
797
|
+
---
|
|
798
|
+
|
|
799
|
+
### AP-20: No Security Scanning in CI/CD
|
|
800
|
+
|
|
801
|
+
**Also known as:** Unguarded Pipeline, Security Afterthought, Shift-Right Security
|
|
802
|
+
**Frequency:** Common
|
|
803
|
+
**Severity:** Critical
|
|
804
|
+
**Detection difficulty:** Easy
|
|
805
|
+
|
|
806
|
+
**What it looks like:**
|
|
807
|
+
|
|
808
|
+
The CI/CD pipeline builds, tests, and deploys code without any automated security checks: no dependency scanning, no SAST, no container image scanning, no secrets detection.
|
|
809
|
+
|
|
810
|
+
```yaml
|
|
811
|
+
# Pipeline with zero security gates
|
|
812
|
+
pipeline: [lint, unit_test, build, deploy]
|
|
813
|
+
# no SAST, no DAST, no dependency audit, no secrets scan
|
|
814
|
+
```
|
|
815
|
+
|
|
816
|
+
**Why teams do it:**
|
|
817
|
+
|
|
818
|
+
Security scanning slows down the pipeline. Security is "someone else's job." Security tools generate too many false positives, so they were disabled.
|
|
819
|
+
|
|
820
|
+
**What goes wrong:**
|
|
821
|
+
|
|
822
|
+
The SolarWinds supply chain attack (2020) demonstrated catastrophic consequences. Attackers infiltrated SolarWinds' CI/CD system and injected the "SUNBURST" backdoor into the Orion build process. Trojanized updates were digitally signed and distributed to over 18,000 customers, including US government agencies and Fortune 500 companies. Automated integrity verification and security scanning in the pipeline could have detected the unauthorized modifications before signing and shipping.
|
|
823
|
+
|
|
824
|
+
**The fix:**
|
|
825
|
+
|
|
826
|
+
Integrate security scanning at every stage of the pipeline. Fail the build on critical vulnerabilities.
|
|
827
|
+
|
|
828
|
+
```yaml
# Security-hardened pipeline
stages:
  - secrets_scan: gitleaks detect --source . --verbose
  - dependency_audit: npm audit --audit-level=critical
  - sast: semgrep --config=auto --error
  - container_scan: trivy image myapp:$VERSION --severity CRITICAL,HIGH --exit-code 1
  - build_and_test: npm ci && npm test
  - deploy:     # only runs if all above pass
      needs: [secrets_scan, dependency_audit, sast, container_scan, build_and_test]
```
|
|
839
|
+
|
|
840
|
+
**Detection rule:**
|
|
841
|
+
|
|
842
|
+
Flag CI/CD pipelines without at least one security scanning step. Audit for disabled or "allow-failure" security jobs. Track deploys that bypass security gates.
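
The first two checks can be sketched against a parsed pipeline definition. The job dictionaries and the `allow_failure` key are assumptions for illustration (the key name varies between CI systems), and matching on tool names in the command is a crude heuristic.

```python
# Tool names taken from the examples in this section; extend as needed.
SECURITY_TOOLS = ("gitleaks", "trufflehog", "detect-secrets", "semgrep", "trivy", "npm audit")


def audit_pipeline(jobs):
    """Return human-readable findings for a list of parsed pipeline jobs."""
    findings = []
    security_jobs = [
        job for job in jobs
        if any(tool in job["command"] for tool in SECURITY_TOOLS)
    ]
    if not security_jobs:
        findings.append("no security scanning step")
    for job in security_jobs:
        if job.get("allow_failure"):
            findings.append(f"security job '{job['name']}' is allowed to fail")
    return findings


unguarded = [
    {"name": "unit_test", "command": "npm test"},
    {"name": "deploy", "command": "./deploy.sh"},
]
muted = [
    {"name": "sast", "command": "semgrep --config=auto --error", "allow_failure": True},
    {"name": "deploy", "command": "./deploy.sh"},
]
print(audit_pipeline(unguarded))  # ['no security scanning step']
print(audit_pipeline(muted))      # ["security job 'sast' is allowed to fail"]
```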
|
|
843
|
+
|
|
844
|
+
---
|
|
845
|
+
|
|
846
|
+
## Root Cause Analysis
|
|
847
|
+
|
|
848
|
+
| Root Cause | Contributing Anti-Patterns | Systemic Fix |
|---|---|---|
| **Speed over safety culture** | AP-01, AP-06, AP-09, AP-15 | Automated rollback, canary deploys, fast pipelines with all checks |
| **Insufficient automation** | AP-04, AP-11, AP-17, AP-18 | CI/CD as product infrastructure; zero manual steps in deploy path |
| **Missing environments & parity** | AP-07, AP-08, AP-12 | Infrastructure-as-code with shared modules; parity checks in CI |
| **Deploy-release coupling** | AP-03, AP-05, AP-10 | Decouple deploy from release; expand-contract migrations; feature flags |
| **No resilience testing** | AP-19, AP-13, AP-01 | DR drills; chaos engineering; rollback testing in pipeline |
| **Organizational silos** | AP-14, AP-16, AP-02 | Cross-team coordination; dependency graph; deploy window policies |
| **Security as afterthought** | AP-20, AP-17, AP-12, AP-15 | Shift-left security; OIDC federation; zero-trust access; security gates |
| **Hero culture** | AP-04, AP-15, AP-16, AP-17 | Eliminate single points of human dependency; automate and document |
| **Budget constraints** | AP-07, AP-11, AP-08 | Quantify outage cost vs infrastructure cost; risk-adjusted ROI |
| **Incremental neglect** | AP-18, AP-19, AP-13, AP-08 | Scheduled audits; automated compliance checks; decay detection |
|
|
860
|
+
|
|
861
|
+
## Self-Check Questions
|
|
862
|
+
|
|
863
|
+
1. **Rollback readiness:** If your production deploy fails right now, can you revert within 5 minutes? Have you tested this in the last 90 days?
|
|
864
|
+
|
|
865
|
+
2. **Deploy timing:** What percentage of deploys happen on Fridays or after 4 PM? If above 20%, why?
|
|
866
|
+
|
|
867
|
+
3. **Release size:** How many commits in your average release? If more than 20, can you break into smaller, more frequent deploys?
|
|
868
|
+
|
|
869
|
+
4. **Automation coverage:** How many manual steps exist in your deployment? Can a new team member deploy on their first day using the pipeline alone?
|
|
870
|
+
|
|
871
|
+
5. **Feature flags:** Can you disable a feature in production without deploying new code?
|
|
872
|
+
|
|
873
|
+
6. **Environment parity:** When did you last diff staging and production configurations? Are runtime and dependency versions identical?
|
|
874
|
+
|
|
875
|
+
7. **Post-deploy observation:** After your last deploy, how long did someone actively monitor metrics? Is this defined in process or dependent on individual discipline?
|
|
876
|
+
|
|
877
|
+
8. **Migration safety:** Does your pipeline take a verified backup before running migrations? Have you tested restoring from it?
|
|
878
|
+
|
|
879
|
+
9. **Blast radius:** If your next deploy has a critical bug, what percentage of users are affected? Is it 100%?
|
|
880
|
+
|
|
881
|
+
10. **Credential hygiene:** Can you identify exactly who deployed last Tuesday? Are deploy credentials individual or shared?
|
|
882
|
+
|
|
883
|
+
11. **Security gates:** Does your pipeline include dependency scanning, secrets detection, and SAST? Can these be bypassed without approval?
|
|
884
|
+
|
|
885
|
+
12. **Runbook coverage:** If your primary deploy engineer is unreachable, can someone else deploy using written documentation alone?
|
|
886
|
+
|
|
887
|
+
13. **Rollback testing:** When did you last deliberately roll back a production deployment to verify the procedure works?
|
|
888
|
+
|
|
889
|
+
14. **Dependency awareness:** Do you know which services depend on yours? If you deploy a breaking change, will dependents fail gracefully or crash?
|
|
890
|
+
|
|
891
|
+
15. **Audit completeness:** Can you produce a complete deployment history for the last 12 months -- who, what, when, and whether it succeeded?
|
|
892
|
+
|
|
893
|
+
## Code Smell Quick Reference
|
|
894
|
+
|
|
895
|
+
| Anti-Pattern | AKA | Severity | Frequency | Key Signal | First Action |
|---|---|---|---|---|---|
| No Rollback Plan | Burn the Ships | Critical | Very Common | No rollback step in pipeline | Add rollback automation + test it |
| Friday Deploys | YOLO Friday | High | Very Common | >20% of deploys on Friday PM | Enforce deploy windows |
| Big Bang Deploy | Flag Day | Critical | Common | 50+ commits per release | Break into incremental releases |
| Manual Steps | Click-Ops | High | Common | SSH in deploy process | Automate the entire pipeline |
| No Feature Flags | Binary Release | High | Common | Deploy = release for all users | Add flag framework + gradual rollout |
| Untested Code | Ship and Pray | Critical | Common | Skipped test stages | Make test suite a hard gate |
| No Staging | Cowboy Pipeline | High | Common | CI deploys direct to prod | Create staging environment |
| Config Drift | Snowflake Envs | High | Very Common | Version mismatches across envs | Infrastructure-as-code parity |
| No Post-Deploy Monitoring | Fire and Forget | Critical | Common | No observation window defined | Add automated post-deploy checks |
| Coupled DB Migrations | Migration Roulette | Critical | Common | Migration + code in same PR | Expand-contract pattern |
| No Canary/Blue-Green | Replace and Pray | High | Common | 100% traffic to new version | Implement progressive delivery |
| Hardcoded Env Values | Baked-In Config | High | Very Common | Credentials in source code | Externalize config + secrets manager |
| No Pre-Migration Backup | YOLO Schema Change | Critical | Common | No backup step before migrate | Verified backup in pipeline |
| Deploy Order Ignored | Dependency Blindness | High | Occasional | Independent service deploys | Maintain dependency graph |
| SSH to Production | Cowboy Ops | Critical | Common | Interactive prod sessions | Zero-trust access + break-glass |
| No Runbooks | Tribal Knowledge | High | Common | One person can deploy | Write + test runbooks |
| Shared Credentials | Everyone Is Root | Critical | Common | Shared SSH keys / tokens | Individual OIDC credentials |
| No Audit Trail | Ghost Deploys | High | Common | Cannot answer "who deployed?" | Structured deploy logging |
| Untested Rollbacks | Rollback Theater | Critical | Very Common | Rollback script never executed | Monthly rollback drills |
| No Security Scanning | Unguarded Pipeline | Critical | Common | No SAST/SCA in CI/CD | Add security gates to pipeline |
|
|
917
|
+
|
|
918
|
+
---
|
|
919
|
+
|
|
920
|
+
*Researched: 2026-03-08 | Sources: Knight Capital SEC filing and post-mortem (henricodolfing.ch, SEC); GitLab.com database incident post-mortem (about.gitlab.com/blog); CrowdStrike July 2024 outage analysis (Wikipedia, CNN, Fortune, CybersecurityDive); TSB Bank migration disaster (ComputerWeekly, henricodolfing.ch); Facebook/Meta October 2021 outage post-mortem (engineering.fb.com, Cloudflare blog); AWS S3 February 2017 outage summary (aws.amazon.com/message/41926); Cloudflare November 2025 outage post-mortem (blog.cloudflare.com); Fastly CDN June 2021 outage; SolarWinds supply chain attack analysis (Fortinet, CyberArk, CISA); AWS DynamoDB October 2025 outage; charity.wtf Friday deploys analysis; enov8.com deployment management; alpacked.io DevOps anti-patterns*
|