@wazir-dev/cli 1.0.0
This diff shows the contents of publicly released package versions as they appear in their public registries. It is provided for informational purposes only.
- package/AGENTS.md +111 -0
- package/CHANGELOG.md +14 -0
- package/CONTRIBUTING.md +101 -0
- package/LICENSE +21 -0
- package/README.md +314 -0
- package/assets/composition-engine.mmd +34 -0
- package/assets/demo-script.sh +17 -0
- package/assets/logo-dark.svg +14 -0
- package/assets/logo.svg +14 -0
- package/assets/pipeline.mmd +39 -0
- package/assets/record-demo.sh +51 -0
- package/docs/README.md +51 -0
- package/docs/adapters/context-mode.md +60 -0
- package/docs/concepts/architecture.md +87 -0
- package/docs/concepts/artifact-model.md +60 -0
- package/docs/concepts/composition-engine.md +36 -0
- package/docs/concepts/indexing-and-recall.md +160 -0
- package/docs/concepts/observability.md +41 -0
- package/docs/concepts/roles-and-workflows.md +59 -0
- package/docs/concepts/terminology-policy.md +27 -0
- package/docs/getting-started/01-installation.md +78 -0
- package/docs/getting-started/02-first-run.md +102 -0
- package/docs/getting-started/03-adding-to-project.md +15 -0
- package/docs/getting-started/04-host-setup.md +15 -0
- package/docs/guides/ci-integration.md +15 -0
- package/docs/guides/creating-skills.md +15 -0
- package/docs/guides/expertise-module-authoring.md +15 -0
- package/docs/guides/hook-development.md +15 -0
- package/docs/guides/memory-and-learnings.md +34 -0
- package/docs/guides/multi-host-export.md +15 -0
- package/docs/guides/troubleshooting.md +101 -0
- package/docs/guides/writing-custom-roles.md +15 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-design.md +592 -0
- package/docs/plans/2026-03-15-cli-pipeline-integration-plan.md +598 -0
- package/docs/plans/2026-03-15-docs-enforcement-plan.md +238 -0
- package/docs/readmes/INDEX.md +99 -0
- package/docs/readmes/features/expertise/README.md +171 -0
- package/docs/readmes/features/exports/README.md +222 -0
- package/docs/readmes/features/hooks/README.md +103 -0
- package/docs/readmes/features/hooks/loop-cap-guard.md +133 -0
- package/docs/readmes/features/hooks/post-tool-capture.md +121 -0
- package/docs/readmes/features/hooks/post-tool-lint.md +130 -0
- package/docs/readmes/features/hooks/pre-compact-summary.md +122 -0
- package/docs/readmes/features/hooks/pre-tool-capture-route.md +100 -0
- package/docs/readmes/features/hooks/protected-path-write-guard.md +128 -0
- package/docs/readmes/features/hooks/session-start.md +119 -0
- package/docs/readmes/features/hooks/stop-handoff-harvest.md +125 -0
- package/docs/readmes/features/roles/README.md +157 -0
- package/docs/readmes/features/roles/clarifier.md +152 -0
- package/docs/readmes/features/roles/content-author.md +190 -0
- package/docs/readmes/features/roles/designer.md +193 -0
- package/docs/readmes/features/roles/executor.md +184 -0
- package/docs/readmes/features/roles/learner.md +210 -0
- package/docs/readmes/features/roles/planner.md +182 -0
- package/docs/readmes/features/roles/researcher.md +164 -0
- package/docs/readmes/features/roles/reviewer.md +184 -0
- package/docs/readmes/features/roles/specifier.md +162 -0
- package/docs/readmes/features/roles/verifier.md +215 -0
- package/docs/readmes/features/schemas/README.md +178 -0
- package/docs/readmes/features/skills/README.md +63 -0
- package/docs/readmes/features/skills/brainstorming.md +96 -0
- package/docs/readmes/features/skills/debugging.md +148 -0
- package/docs/readmes/features/skills/design.md +120 -0
- package/docs/readmes/features/skills/prepare-next.md +109 -0
- package/docs/readmes/features/skills/run-audit.md +159 -0
- package/docs/readmes/features/skills/scan-project.md +109 -0
- package/docs/readmes/features/skills/self-audit.md +176 -0
- package/docs/readmes/features/skills/tdd.md +137 -0
- package/docs/readmes/features/skills/using-skills.md +92 -0
- package/docs/readmes/features/skills/verification.md +120 -0
- package/docs/readmes/features/skills/writing-plans.md +104 -0
- package/docs/readmes/features/tooling/README.md +320 -0
- package/docs/readmes/features/workflows/README.md +186 -0
- package/docs/readmes/features/workflows/author.md +181 -0
- package/docs/readmes/features/workflows/clarify.md +154 -0
- package/docs/readmes/features/workflows/design-review.md +171 -0
- package/docs/readmes/features/workflows/design.md +169 -0
- package/docs/readmes/features/workflows/discover.md +162 -0
- package/docs/readmes/features/workflows/execute.md +173 -0
- package/docs/readmes/features/workflows/learn.md +167 -0
- package/docs/readmes/features/workflows/plan-review.md +165 -0
- package/docs/readmes/features/workflows/plan.md +170 -0
- package/docs/readmes/features/workflows/prepare-next.md +167 -0
- package/docs/readmes/features/workflows/review.md +169 -0
- package/docs/readmes/features/workflows/run-audit.md +191 -0
- package/docs/readmes/features/workflows/spec-challenge.md +159 -0
- package/docs/readmes/features/workflows/specify.md +160 -0
- package/docs/readmes/features/workflows/verify.md +177 -0
- package/docs/readmes/packages/README.md +50 -0
- package/docs/readmes/packages/ajv.md +117 -0
- package/docs/readmes/packages/context-mode.md +118 -0
- package/docs/readmes/packages/gray-matter.md +116 -0
- package/docs/readmes/packages/node-test.md +137 -0
- package/docs/readmes/packages/yaml.md +112 -0
- package/docs/reference/configuration-reference.md +159 -0
- package/docs/reference/expertise-index.md +52 -0
- package/docs/reference/git-flow.md +43 -0
- package/docs/reference/hooks.md +87 -0
- package/docs/reference/host-exports.md +50 -0
- package/docs/reference/launch-checklist.md +172 -0
- package/docs/reference/marketplace-listings.md +76 -0
- package/docs/reference/release-process.md +34 -0
- package/docs/reference/roles-reference.md +77 -0
- package/docs/reference/skills.md +33 -0
- package/docs/reference/templates.md +29 -0
- package/docs/reference/tooling-cli.md +94 -0
- package/docs/truth-claims.yaml +222 -0
- package/expertise/PROGRESS.md +63 -0
- package/expertise/README.md +18 -0
- package/expertise/antipatterns/PROGRESS.md +56 -0
- package/expertise/antipatterns/backend/api-design-antipatterns.md +1271 -0
- package/expertise/antipatterns/backend/auth-antipatterns.md +1195 -0
- package/expertise/antipatterns/backend/caching-antipatterns.md +622 -0
- package/expertise/antipatterns/backend/database-antipatterns.md +1038 -0
- package/expertise/antipatterns/backend/index.md +24 -0
- package/expertise/antipatterns/backend/microservices-antipatterns.md +850 -0
- package/expertise/antipatterns/code/architecture-antipatterns.md +919 -0
- package/expertise/antipatterns/code/async-antipatterns.md +622 -0
- package/expertise/antipatterns/code/code-smells.md +1186 -0
- package/expertise/antipatterns/code/dependency-antipatterns.md +1209 -0
- package/expertise/antipatterns/code/error-handling-antipatterns.md +1360 -0
- package/expertise/antipatterns/code/index.md +27 -0
- package/expertise/antipatterns/code/naming-and-abstraction.md +1118 -0
- package/expertise/antipatterns/code/state-management-antipatterns.md +1076 -0
- package/expertise/antipatterns/code/testing-antipatterns.md +1053 -0
- package/expertise/antipatterns/design/accessibility-antipatterns.md +1136 -0
- package/expertise/antipatterns/design/dark-patterns.md +1121 -0
- package/expertise/antipatterns/design/index.md +22 -0
- package/expertise/antipatterns/design/ui-antipatterns.md +1202 -0
- package/expertise/antipatterns/design/ux-antipatterns.md +680 -0
- package/expertise/antipatterns/frontend/css-layout-antipatterns.md +691 -0
- package/expertise/antipatterns/frontend/flutter-antipatterns.md +1827 -0
- package/expertise/antipatterns/frontend/index.md +23 -0
- package/expertise/antipatterns/frontend/mobile-antipatterns.md +573 -0
- package/expertise/antipatterns/frontend/react-antipatterns.md +1128 -0
- package/expertise/antipatterns/frontend/spa-antipatterns.md +1235 -0
- package/expertise/antipatterns/index.md +31 -0
- package/expertise/antipatterns/performance/index.md +20 -0
- package/expertise/antipatterns/performance/performance-antipatterns.md +1013 -0
- package/expertise/antipatterns/performance/premature-optimization.md +623 -0
- package/expertise/antipatterns/performance/scaling-antipatterns.md +785 -0
- package/expertise/antipatterns/process/ai-coding-antipatterns.md +853 -0
- package/expertise/antipatterns/process/code-review-antipatterns.md +656 -0
- package/expertise/antipatterns/process/deployment-antipatterns.md +920 -0
- package/expertise/antipatterns/process/index.md +23 -0
- package/expertise/antipatterns/process/technical-debt-antipatterns.md +647 -0
- package/expertise/antipatterns/security/index.md +20 -0
- package/expertise/antipatterns/security/secrets-antipatterns.md +849 -0
- package/expertise/antipatterns/security/security-theater.md +843 -0
- package/expertise/antipatterns/security/vulnerability-patterns.md +801 -0
- package/expertise/architecture/PROGRESS.md +70 -0
- package/expertise/architecture/data/caching-architecture.md +671 -0
- package/expertise/architecture/data/data-consistency.md +574 -0
- package/expertise/architecture/data/data-modeling.md +536 -0
- package/expertise/architecture/data/event-streams-and-queues.md +634 -0
- package/expertise/architecture/data/index.md +25 -0
- package/expertise/architecture/data/search-architecture.md +663 -0
- package/expertise/architecture/data/sql-vs-nosql.md +708 -0
- package/expertise/architecture/decisions/architecture-decision-records.md +640 -0
- package/expertise/architecture/decisions/build-vs-buy.md +616 -0
- package/expertise/architecture/decisions/index.md +23 -0
- package/expertise/architecture/decisions/monolith-to-microservices.md +790 -0
- package/expertise/architecture/decisions/technology-selection.md +616 -0
- package/expertise/architecture/distributed/cap-theorem-and-tradeoffs.md +800 -0
- package/expertise/architecture/distributed/circuit-breaker-bulkhead.md +741 -0
- package/expertise/architecture/distributed/consensus-and-coordination.md +796 -0
- package/expertise/architecture/distributed/distributed-systems-fundamentals.md +564 -0
- package/expertise/architecture/distributed/idempotency-and-retry.md +796 -0
- package/expertise/architecture/distributed/index.md +25 -0
- package/expertise/architecture/distributed/saga-pattern.md +797 -0
- package/expertise/architecture/foundations/architectural-thinking.md +460 -0
- package/expertise/architecture/foundations/coupling-and-cohesion.md +770 -0
- package/expertise/architecture/foundations/design-principles-solid.md +649 -0
- package/expertise/architecture/foundations/domain-driven-design.md +719 -0
- package/expertise/architecture/foundations/index.md +25 -0
- package/expertise/architecture/foundations/separation-of-concerns.md +472 -0
- package/expertise/architecture/foundations/twelve-factor-app.md +797 -0
- package/expertise/architecture/index.md +34 -0
- package/expertise/architecture/integration/api-design-graphql.md +638 -0
- package/expertise/architecture/integration/api-design-grpc.md +804 -0
- package/expertise/architecture/integration/api-design-rest.md +892 -0
- package/expertise/architecture/integration/index.md +25 -0
- package/expertise/architecture/integration/third-party-integration.md +795 -0
- package/expertise/architecture/integration/webhooks-and-callbacks.md +1152 -0
- package/expertise/architecture/integration/websockets-realtime.md +791 -0
- package/expertise/architecture/mobile-architecture/index.md +22 -0
- package/expertise/architecture/mobile-architecture/mobile-app-architecture.md +780 -0
- package/expertise/architecture/mobile-architecture/mobile-backend-for-frontend.md +670 -0
- package/expertise/architecture/mobile-architecture/offline-first.md +719 -0
- package/expertise/architecture/mobile-architecture/push-and-sync.md +782 -0
- package/expertise/architecture/patterns/cqrs-event-sourcing.md +717 -0
- package/expertise/architecture/patterns/event-driven.md +797 -0
- package/expertise/architecture/patterns/hexagonal-clean-architecture.md +870 -0
- package/expertise/architecture/patterns/index.md +27 -0
- package/expertise/architecture/patterns/layered-architecture.md +736 -0
- package/expertise/architecture/patterns/microservices.md +753 -0
- package/expertise/architecture/patterns/modular-monolith.md +692 -0
- package/expertise/architecture/patterns/monolith.md +626 -0
- package/expertise/architecture/patterns/plugin-architecture.md +735 -0
- package/expertise/architecture/patterns/serverless.md +780 -0
- package/expertise/architecture/scaling/database-scaling.md +615 -0
- package/expertise/architecture/scaling/feature-flags-and-rollouts.md +757 -0
- package/expertise/architecture/scaling/horizontal-vs-vertical.md +606 -0
- package/expertise/architecture/scaling/index.md +24 -0
- package/expertise/architecture/scaling/multi-tenancy.md +800 -0
- package/expertise/architecture/scaling/stateless-design.md +787 -0
- package/expertise/backend/embedded-firmware.md +625 -0
- package/expertise/backend/go.md +853 -0
- package/expertise/backend/index.md +24 -0
- package/expertise/backend/java-spring.md +448 -0
- package/expertise/backend/node-typescript.md +625 -0
- package/expertise/backend/python-fastapi.md +724 -0
- package/expertise/backend/rust.md +458 -0
- package/expertise/backend/solidity.md +711 -0
- package/expertise/composition-map.yaml +443 -0
- package/expertise/content/foundations/content-modeling.md +395 -0
- package/expertise/content/foundations/editorial-standards.md +449 -0
- package/expertise/content/foundations/index.md +24 -0
- package/expertise/content/foundations/microcopy.md +455 -0
- package/expertise/content/foundations/terminology-governance.md +509 -0
- package/expertise/content/index.md +34 -0
- package/expertise/content/patterns/accessibility-copy.md +518 -0
- package/expertise/content/patterns/index.md +24 -0
- package/expertise/content/patterns/notification-content.md +433 -0
- package/expertise/content/patterns/sample-content.md +486 -0
- package/expertise/content/patterns/state-copy.md +439 -0
- package/expertise/design/PROGRESS.md +58 -0
- package/expertise/design/disciplines/dark-mode-theming.md +577 -0
- package/expertise/design/disciplines/design-systems.md +595 -0
- package/expertise/design/disciplines/index.md +25 -0
- package/expertise/design/disciplines/information-architecture.md +800 -0
- package/expertise/design/disciplines/interaction-design.md +788 -0
- package/expertise/design/disciplines/responsive-design.md +552 -0
- package/expertise/design/disciplines/usability-testing.md +516 -0
- package/expertise/design/disciplines/user-research.md +792 -0
- package/expertise/design/foundations/accessibility-design.md +796 -0
- package/expertise/design/foundations/color-theory.md +797 -0
- package/expertise/design/foundations/iconography.md +795 -0
- package/expertise/design/foundations/index.md +26 -0
- package/expertise/design/foundations/motion-and-animation.md +653 -0
- package/expertise/design/foundations/rtl-design.md +585 -0
- package/expertise/design/foundations/spacing-and-layout.md +607 -0
- package/expertise/design/foundations/typography.md +800 -0
- package/expertise/design/foundations/visual-hierarchy.md +761 -0
- package/expertise/design/index.md +32 -0
- package/expertise/design/patterns/authentication-flows.md +474 -0
- package/expertise/design/patterns/content-consumption.md +789 -0
- package/expertise/design/patterns/data-display.md +618 -0
- package/expertise/design/patterns/e-commerce.md +1494 -0
- package/expertise/design/patterns/feedback-and-states.md +642 -0
- package/expertise/design/patterns/forms-and-input.md +819 -0
- package/expertise/design/patterns/gamification.md +801 -0
- package/expertise/design/patterns/index.md +31 -0
- package/expertise/design/patterns/microinteractions.md +449 -0
- package/expertise/design/patterns/navigation.md +800 -0
- package/expertise/design/patterns/notifications.md +705 -0
- package/expertise/design/patterns/onboarding.md +700 -0
- package/expertise/design/patterns/search-and-filter.md +601 -0
- package/expertise/design/patterns/settings-and-preferences.md +768 -0
- package/expertise/design/patterns/social-and-community.md +748 -0
- package/expertise/design/platforms/desktop-native.md +612 -0
- package/expertise/design/platforms/index.md +25 -0
- package/expertise/design/platforms/mobile-android.md +825 -0
- package/expertise/design/platforms/mobile-cross-platform.md +983 -0
- package/expertise/design/platforms/mobile-ios.md +699 -0
- package/expertise/design/platforms/tablet.md +794 -0
- package/expertise/design/platforms/web-dashboard.md +790 -0
- package/expertise/design/platforms/web-responsive.md +550 -0
- package/expertise/design/psychology/behavioral-nudges.md +449 -0
- package/expertise/design/psychology/cognitive-load.md +1191 -0
- package/expertise/design/psychology/error-psychology.md +778 -0
- package/expertise/design/psychology/index.md +22 -0
- package/expertise/design/psychology/persuasive-design.md +736 -0
- package/expertise/design/psychology/user-mental-models.md +623 -0
- package/expertise/design/tooling/open-pencil.md +266 -0
- package/expertise/frontend/angular.md +1073 -0
- package/expertise/frontend/desktop-electron.md +546 -0
- package/expertise/frontend/flutter.md +782 -0
- package/expertise/frontend/index.md +27 -0
- package/expertise/frontend/native-android.md +409 -0
- package/expertise/frontend/native-ios.md +490 -0
- package/expertise/frontend/react-native.md +1160 -0
- package/expertise/frontend/react.md +808 -0
- package/expertise/frontend/vue.md +1089 -0
- package/expertise/humanize/domain-rules-code.md +79 -0
- package/expertise/humanize/domain-rules-content.md +67 -0
- package/expertise/humanize/domain-rules-technical-docs.md +56 -0
- package/expertise/humanize/index.md +35 -0
- package/expertise/humanize/self-audit-checklist.md +87 -0
- package/expertise/humanize/sentence-patterns.md +218 -0
- package/expertise/humanize/vocabulary-blacklist.md +105 -0
- package/expertise/i18n/PROGRESS.md +65 -0
- package/expertise/i18n/advanced/accessibility-and-i18n.md +28 -0
- package/expertise/i18n/advanced/bidirectional-text-algorithm.md +38 -0
- package/expertise/i18n/advanced/complex-scripts.md +30 -0
- package/expertise/i18n/advanced/performance-and-i18n.md +27 -0
- package/expertise/i18n/advanced/testing-i18n.md +28 -0
- package/expertise/i18n/content/content-adaptation.md +23 -0
- package/expertise/i18n/content/locale-specific-formatting.md +23 -0
- package/expertise/i18n/content/machine-translation-integration.md +28 -0
- package/expertise/i18n/content/translation-management.md +29 -0
- package/expertise/i18n/foundations/date-time-calendars.md +67 -0
- package/expertise/i18n/foundations/i18n-architecture.md +272 -0
- package/expertise/i18n/foundations/locale-and-language-tags.md +79 -0
- package/expertise/i18n/foundations/numbers-currency-units.md +61 -0
- package/expertise/i18n/foundations/pluralization-and-gender.md +109 -0
- package/expertise/i18n/foundations/string-externalization.md +236 -0
- package/expertise/i18n/foundations/text-direction-bidi.md +241 -0
- package/expertise/i18n/foundations/unicode-and-encoding.md +86 -0
- package/expertise/i18n/index.md +38 -0
- package/expertise/i18n/platform/backend-i18n.md +31 -0
- package/expertise/i18n/platform/flutter-i18n.md +148 -0
- package/expertise/i18n/platform/native-android-i18n.md +36 -0
- package/expertise/i18n/platform/native-ios-i18n.md +36 -0
- package/expertise/i18n/platform/react-i18n.md +103 -0
- package/expertise/i18n/platform/web-css-i18n.md +81 -0
- package/expertise/i18n/rtl/arabic-specific.md +175 -0
- package/expertise/i18n/rtl/hebrew-specific.md +149 -0
- package/expertise/i18n/rtl/rtl-animations-and-transitions.md +111 -0
- package/expertise/i18n/rtl/rtl-forms-and-input.md +161 -0
- package/expertise/i18n/rtl/rtl-fundamentals.md +211 -0
- package/expertise/i18n/rtl/rtl-icons-and-images.md +181 -0
- package/expertise/i18n/rtl/rtl-layout-mirroring.md +252 -0
- package/expertise/i18n/rtl/rtl-navigation-and-gestures.md +107 -0
- package/expertise/i18n/rtl/rtl-testing-and-qa.md +147 -0
- package/expertise/i18n/rtl/rtl-typography.md +160 -0
- package/expertise/index.md +113 -0
- package/expertise/index.yaml +216 -0
- package/expertise/infrastructure/cloud-aws.md +597 -0
- package/expertise/infrastructure/cloud-gcp.md +599 -0
- package/expertise/infrastructure/cybersecurity.md +816 -0
- package/expertise/infrastructure/database-mongodb.md +447 -0
- package/expertise/infrastructure/database-postgres.md +400 -0
- package/expertise/infrastructure/devops-cicd.md +787 -0
- package/expertise/infrastructure/index.md +27 -0
- package/expertise/performance/PROGRESS.md +50 -0
- package/expertise/performance/backend/api-latency.md +1204 -0
- package/expertise/performance/backend/background-jobs.md +506 -0
- package/expertise/performance/backend/connection-pooling.md +1209 -0
- package/expertise/performance/backend/database-query-optimization.md +515 -0
- package/expertise/performance/backend/index.md +23 -0
- package/expertise/performance/backend/rate-limiting-and-throttling.md +971 -0
- package/expertise/performance/foundations/algorithmic-complexity.md +954 -0
- package/expertise/performance/foundations/caching-strategies.md +489 -0
- package/expertise/performance/foundations/concurrency-and-parallelism.md +847 -0
- package/expertise/performance/foundations/index.md +24 -0
- package/expertise/performance/foundations/measuring-and-profiling.md +440 -0
- package/expertise/performance/foundations/memory-management.md +964 -0
- package/expertise/performance/foundations/performance-budgets.md +1314 -0
- package/expertise/performance/index.md +31 -0
- package/expertise/performance/infrastructure/auto-scaling.md +1059 -0
- package/expertise/performance/infrastructure/cdn-and-edge.md +1081 -0
- package/expertise/performance/infrastructure/index.md +22 -0
- package/expertise/performance/infrastructure/load-balancing.md +1081 -0
- package/expertise/performance/infrastructure/observability.md +1079 -0
- package/expertise/performance/mobile/index.md +23 -0
- package/expertise/performance/mobile/mobile-animations.md +544 -0
- package/expertise/performance/mobile/mobile-memory-battery.md +416 -0
- package/expertise/performance/mobile/mobile-network.md +452 -0
- package/expertise/performance/mobile/mobile-rendering.md +599 -0
- package/expertise/performance/mobile/mobile-startup-time.md +505 -0
- package/expertise/performance/platform-specific/flutter-performance.md +647 -0
- package/expertise/performance/platform-specific/index.md +22 -0
- package/expertise/performance/platform-specific/node-performance.md +1307 -0
- package/expertise/performance/platform-specific/postgres-performance.md +1366 -0
- package/expertise/performance/platform-specific/react-performance.md +1403 -0
- package/expertise/performance/web/bundle-optimization.md +1239 -0
- package/expertise/performance/web/image-and-media.md +636 -0
- package/expertise/performance/web/index.md +24 -0
- package/expertise/performance/web/network-optimization.md +1133 -0
- package/expertise/performance/web/rendering-performance.md +1098 -0
- package/expertise/performance/web/ssr-and-hydration.md +918 -0
- package/expertise/performance/web/web-vitals.md +1374 -0
- package/expertise/quality/accessibility.md +985 -0
- package/expertise/quality/evidence-based-verification.md +499 -0
- package/expertise/quality/index.md +24 -0
- package/expertise/quality/ml-model-audit.md +614 -0
- package/expertise/quality/performance.md +600 -0
- package/expertise/quality/testing-api.md +891 -0
- package/expertise/quality/testing-mobile.md +496 -0
- package/expertise/quality/testing-web.md +849 -0
- package/expertise/security/PROGRESS.md +54 -0
- package/expertise/security/agentic-identity.md +540 -0
- package/expertise/security/compliance-frameworks.md +601 -0
- package/expertise/security/data/data-encryption.md +364 -0
- package/expertise/security/data/data-privacy-gdpr.md +692 -0
- package/expertise/security/data/database-security.md +1171 -0
- package/expertise/security/data/index.md +22 -0
- package/expertise/security/data/pii-handling.md +531 -0
- package/expertise/security/foundations/authentication.md +1041 -0
- package/expertise/security/foundations/authorization.md +603 -0
- package/expertise/security/foundations/cryptography.md +1001 -0
- package/expertise/security/foundations/index.md +25 -0
- package/expertise/security/foundations/owasp-top-10.md +1354 -0
- package/expertise/security/foundations/secrets-management.md +1217 -0
- package/expertise/security/foundations/secure-sdlc.md +700 -0
- package/expertise/security/foundations/supply-chain-security.md +698 -0
- package/expertise/security/index.md +31 -0
- package/expertise/security/infrastructure/cloud-security-aws.md +1296 -0
- package/expertise/security/infrastructure/cloud-security-gcp.md +1376 -0
- package/expertise/security/infrastructure/container-security.md +721 -0
- package/expertise/security/infrastructure/incident-response.md +1295 -0
- package/expertise/security/infrastructure/index.md +24 -0
- package/expertise/security/infrastructure/logging-and-monitoring.md +1618 -0
- package/expertise/security/infrastructure/network-security.md +1337 -0
- package/expertise/security/mobile/index.md +23 -0
- package/expertise/security/mobile/mobile-android-security.md +1218 -0
- package/expertise/security/mobile/mobile-binary-protection.md +1229 -0
- package/expertise/security/mobile/mobile-data-storage.md +1265 -0
- package/expertise/security/mobile/mobile-ios-security.md +1401 -0
- package/expertise/security/mobile/mobile-network-security.md +1520 -0
- package/expertise/security/smart-contract-security.md +594 -0
- package/expertise/security/testing/index.md +22 -0
- package/expertise/security/testing/penetration-testing.md +1258 -0
- package/expertise/security/testing/security-code-review.md +1765 -0
- package/expertise/security/testing/threat-modeling.md +1074 -0
- package/expertise/security/testing/vulnerability-scanning.md +1062 -0
- package/expertise/security/web/api-security.md +586 -0
- package/expertise/security/web/cors-and-headers.md +433 -0
- package/expertise/security/web/csrf.md +562 -0
- package/expertise/security/web/file-upload.md +1477 -0
- package/expertise/security/web/index.md +25 -0
- package/expertise/security/web/injection.md +1375 -0
- package/expertise/security/web/session-management.md +1101 -0
- package/expertise/security/web/xss.md +1158 -0
- package/exports/README.md +17 -0
- package/exports/hosts/claude/.claude/agents/clarifier.md +42 -0
- package/exports/hosts/claude/.claude/agents/content-author.md +63 -0
- package/exports/hosts/claude/.claude/agents/designer.md +55 -0
- package/exports/hosts/claude/.claude/agents/executor.md +55 -0
- package/exports/hosts/claude/.claude/agents/learner.md +51 -0
- package/exports/hosts/claude/.claude/agents/planner.md +53 -0
- package/exports/hosts/claude/.claude/agents/researcher.md +43 -0
- package/exports/hosts/claude/.claude/agents/reviewer.md +54 -0
- package/exports/hosts/claude/.claude/agents/specifier.md +47 -0
- package/exports/hosts/claude/.claude/agents/verifier.md +71 -0
- package/exports/hosts/claude/.claude/commands/author.md +42 -0
- package/exports/hosts/claude/.claude/commands/clarify.md +38 -0
- package/exports/hosts/claude/.claude/commands/design-review.md +46 -0
- package/exports/hosts/claude/.claude/commands/design.md +44 -0
- package/exports/hosts/claude/.claude/commands/discover.md +37 -0
- package/exports/hosts/claude/.claude/commands/execute.md +48 -0
- package/exports/hosts/claude/.claude/commands/learn.md +38 -0
- package/exports/hosts/claude/.claude/commands/plan-review.md +42 -0
- package/exports/hosts/claude/.claude/commands/plan.md +39 -0
- package/exports/hosts/claude/.claude/commands/prepare-next.md +37 -0
- package/exports/hosts/claude/.claude/commands/review.md +40 -0
- package/exports/hosts/claude/.claude/commands/run-audit.md +41 -0
- package/exports/hosts/claude/.claude/commands/spec-challenge.md +41 -0
- package/exports/hosts/claude/.claude/commands/specify.md +38 -0
- package/exports/hosts/claude/.claude/commands/verify.md +37 -0
- package/exports/hosts/claude/.claude/settings.json +34 -0
- package/exports/hosts/claude/CLAUDE.md +19 -0
- package/exports/hosts/claude/export.manifest.json +38 -0
- package/exports/hosts/claude/host-package.json +67 -0
- package/exports/hosts/codex/AGENTS.md +19 -0
- package/exports/hosts/codex/export.manifest.json +38 -0
- package/exports/hosts/codex/host-package.json +41 -0
- package/exports/hosts/cursor/.cursor/hooks.json +16 -0
- package/exports/hosts/cursor/.cursor/rules/wazir-core.mdc +19 -0
- package/exports/hosts/cursor/export.manifest.json +38 -0
- package/exports/hosts/cursor/host-package.json +42 -0
- package/exports/hosts/gemini/GEMINI.md +19 -0
- package/exports/hosts/gemini/export.manifest.json +38 -0
- package/exports/hosts/gemini/host-package.json +41 -0
- package/hooks/README.md +18 -0
- package/hooks/definitions/loop_cap_guard.yaml +21 -0
- package/hooks/definitions/post_tool_capture.yaml +24 -0
- package/hooks/definitions/pre_compact_summary.yaml +19 -0
- package/hooks/definitions/pre_tool_capture_route.yaml +19 -0
- package/hooks/definitions/protected_path_write_guard.yaml +19 -0
- package/hooks/definitions/session_start.yaml +19 -0
- package/hooks/definitions/stop_handoff_harvest.yaml +20 -0
- package/hooks/loop-cap-guard +17 -0
- package/hooks/post-tool-lint +36 -0
- package/hooks/protected-path-write-guard +17 -0
- package/hooks/session-start +41 -0
- package/llms-full.txt +2355 -0
- package/llms.txt +43 -0
- package/package.json +79 -0
- package/roles/README.md +20 -0
- package/roles/clarifier.md +42 -0
- package/roles/content-author.md +63 -0
- package/roles/designer.md +55 -0
- package/roles/executor.md +55 -0
- package/roles/learner.md +51 -0
- package/roles/planner.md +53 -0
- package/roles/researcher.md +43 -0
- package/roles/reviewer.md +54 -0
- package/roles/specifier.md +47 -0
- package/roles/verifier.md +71 -0
- package/schemas/README.md +24 -0
- package/schemas/accepted-learning.schema.json +20 -0
- package/schemas/author-artifact.schema.json +156 -0
- package/schemas/clarification.schema.json +19 -0
- package/schemas/design-artifact.schema.json +80 -0
- package/schemas/docs-claim.schema.json +18 -0
- package/schemas/export-manifest.schema.json +20 -0
- package/schemas/hook.schema.json +67 -0
- package/schemas/host-export-package.schema.json +18 -0
- package/schemas/implementation-plan.schema.json +19 -0
- package/schemas/proposed-learning.schema.json +19 -0
- package/schemas/research.schema.json +18 -0
- package/schemas/review.schema.json +29 -0
- package/schemas/run-manifest.schema.json +18 -0
- package/schemas/spec-challenge.schema.json +18 -0
- package/schemas/spec.schema.json +20 -0
- package/schemas/usage.schema.json +102 -0
- package/schemas/verification-proof.schema.json +29 -0
- package/schemas/wazir-manifest.schema.json +173 -0
- package/skills/README.md +40 -0
- package/skills/brainstorming/SKILL.md +77 -0
- package/skills/debugging/SKILL.md +50 -0
- package/skills/design/SKILL.md +61 -0
- package/skills/dispatching-parallel-agents/SKILL.md +128 -0
- package/skills/executing-plans/SKILL.md +70 -0
- package/skills/finishing-a-development-branch/SKILL.md +169 -0
- package/skills/humanize/SKILL.md +123 -0
- package/skills/init-pipeline/SKILL.md +124 -0
- package/skills/prepare-next/SKILL.md +20 -0
- package/skills/receiving-code-review/SKILL.md +123 -0
- package/skills/requesting-code-review/SKILL.md +105 -0
- package/skills/requesting-code-review/code-reviewer.md +108 -0
- package/skills/run-audit/SKILL.md +197 -0
- package/skills/scan-project/SKILL.md +41 -0
- package/skills/self-audit/SKILL.md +153 -0
- package/skills/subagent-driven-development/SKILL.md +154 -0
- package/skills/subagent-driven-development/code-quality-reviewer-prompt.md +26 -0
- package/skills/subagent-driven-development/implementer-prompt.md +102 -0
- package/skills/subagent-driven-development/spec-reviewer-prompt.md +61 -0
- package/skills/tdd/SKILL.md +23 -0
- package/skills/using-git-worktrees/SKILL.md +163 -0
- package/skills/using-skills/SKILL.md +95 -0
- package/skills/verification/SKILL.md +22 -0
- package/skills/wazir/SKILL.md +463 -0
- package/skills/writing-plans/SKILL.md +30 -0
- package/skills/writing-skills/SKILL.md +157 -0
- package/skills/writing-skills/anthropic-best-practices.md +122 -0
- package/skills/writing-skills/persuasion-principles.md +50 -0
- package/templates/README.md +20 -0
- package/templates/artifacts/README.md +10 -0
- package/templates/artifacts/accepted-learning.md +19 -0
- package/templates/artifacts/accepted-learning.template.json +12 -0
- package/templates/artifacts/author.md +74 -0
- package/templates/artifacts/author.template.json +19 -0
- package/templates/artifacts/clarification.md +21 -0
- package/templates/artifacts/clarification.template.json +12 -0
- package/templates/artifacts/execute-notes.md +19 -0
- package/templates/artifacts/implementation-plan.md +21 -0
- package/templates/artifacts/implementation-plan.template.json +11 -0
- package/templates/artifacts/learning-proposal.md +19 -0
- package/templates/artifacts/next-run-handoff.md +21 -0
- package/templates/artifacts/plan-review.md +19 -0
- package/templates/artifacts/proposed-learning.template.json +12 -0
- package/templates/artifacts/research.md +21 -0
- package/templates/artifacts/research.template.json +12 -0
- package/templates/artifacts/review-findings.md +19 -0
- package/templates/artifacts/review.template.json +11 -0
- package/templates/artifacts/run-manifest.template.json +8 -0
- package/templates/artifacts/spec-challenge.md +19 -0
- package/templates/artifacts/spec-challenge.template.json +11 -0
- package/templates/artifacts/spec.md +21 -0
- package/templates/artifacts/spec.template.json +12 -0
- package/templates/artifacts/verification-proof.md +19 -0
- package/templates/artifacts/verification-proof.template.json +11 -0
- package/templates/examples/accepted-learning.example.json +14 -0
- package/templates/examples/author.example.json +152 -0
- package/templates/examples/clarification.example.json +15 -0
- package/templates/examples/docs-claim.example.json +8 -0
- package/templates/examples/export-manifest.example.json +7 -0
- package/templates/examples/host-export-package.example.json +11 -0
- package/templates/examples/implementation-plan.example.json +17 -0
- package/templates/examples/proposed-learning.example.json +13 -0
- package/templates/examples/research.example.json +15 -0
- package/templates/examples/research.example.md +6 -0
- package/templates/examples/review.example.json +17 -0
- package/templates/examples/run-manifest.example.json +9 -0
- package/templates/examples/spec-challenge.example.json +14 -0
- package/templates/examples/spec.example.json +21 -0
- package/templates/examples/verification-proof.example.json +21 -0
- package/templates/examples/wazir-manifest.example.yaml +65 -0
- package/templates/task-definition-schema.md +99 -0
- package/tooling/README.md +20 -0
- package/tooling/src/adapters/context-mode.js +50 -0
- package/tooling/src/capture/command.js +376 -0
- package/tooling/src/capture/store.js +99 -0
- package/tooling/src/capture/usage.js +270 -0
- package/tooling/src/checks/branches.js +50 -0
- package/tooling/src/checks/brand-truth.js +110 -0
- package/tooling/src/checks/changelog.js +231 -0
- package/tooling/src/checks/command-registry.js +36 -0
- package/tooling/src/checks/commits.js +102 -0
- package/tooling/src/checks/docs-drift.js +103 -0
- package/tooling/src/checks/docs-truth.js +201 -0
- package/tooling/src/checks/runtime-surface.js +156 -0
- package/tooling/src/cli.js +116 -0
- package/tooling/src/command-options.js +56 -0
- package/tooling/src/commands/validate.js +320 -0
- package/tooling/src/doctor/command.js +91 -0
- package/tooling/src/export/command.js +77 -0
- package/tooling/src/export/compiler.js +498 -0
- package/tooling/src/guards/loop-cap-guard.js +52 -0
- package/tooling/src/guards/protected-path-write-guard.js +67 -0
- package/tooling/src/index/command.js +152 -0
- package/tooling/src/index/storage.js +1061 -0
- package/tooling/src/index/summarizers.js +261 -0
- package/tooling/src/loaders.js +18 -0
- package/tooling/src/project-root.js +22 -0
- package/tooling/src/recall/command.js +225 -0
- package/tooling/src/schema-validator.js +30 -0
- package/tooling/src/state-root.js +40 -0
- package/tooling/src/status/command.js +71 -0
- package/wazir.manifest.yaml +135 -0
- package/workflows/README.md +19 -0
- package/workflows/author.md +42 -0
- package/workflows/clarify.md +38 -0
- package/workflows/design-review.md +46 -0
- package/workflows/design.md +44 -0
- package/workflows/discover.md +37 -0
- package/workflows/execute.md +48 -0
- package/workflows/learn.md +38 -0
- package/workflows/plan-review.md +42 -0
- package/workflows/plan.md +39 -0
- package/workflows/prepare-next.md +37 -0
- package/workflows/review.md +40 -0
- package/workflows/run-audit.md +41 -0
- package/workflows/spec-challenge.md +41 -0
- package/workflows/specify.md +38 -0
- package/workflows/verify.md +37 -0
|
@@ -0,0 +1,787 @@
# DevOps & CI/CD -- Expertise Module

> A DevOps/CI-CD specialist designs, builds, and maintains the automated pipelines, infrastructure,
> and operational practices that enable teams to deliver software reliably, securely, and at speed.
> Scope spans source control workflows through production observability, including IaC, container
> orchestration, deployment strategies, security scanning, and incident response.

---

## Core Patterns & Conventions

### CI/CD Pipeline Design

**Canonical Stage Progression:**

```
Source -> Build -> Unit Test -> SAST/Lint -> Integration Test -> Artifact Publish
  -> Deploy Staging -> E2E/Smoke -> Security Scan -> Deploy Production -> Post-Deploy Verify
```

**Key Principles:**

- **Pipeline as Code**: Store pipeline definitions in version control alongside application code.
- **Fail fast**: Place cheap checks (linting, unit tests, SAST) early; expensive checks later.
- **Parallel execution**: Run independent jobs (static analysis, unit tests, security scans) concurrently.
- **Quality gates**: Block promotion if coverage drops, vulnerabilities are found, or thresholds breach.
- **Immutable artifacts**: Build once, promote the same binary through environments.
- **Environment parity**: Staging must mirror production (same base images, resource limits).

**Environment Promotion:** `dev -> staging -> canary (5%) -> production (full rollout)` -- each promotion requires a gate (automated tests, approval, or metric-based analysis).
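
The promotion chain above can be sketched as a small gate check. This is a minimal illustration, not a real pipeline API: `ENVIRONMENTS`, `GateResult`, and `next_environment` are hypothetical names, and the rule that a promotion with no gate results is blocked is an assumption drawn from "each promotion requires a gate".

```python
from dataclasses import dataclass

# Hypothetical environment order, taken from the promotion chain above.
ENVIRONMENTS = ["dev", "staging", "canary", "production"]


@dataclass
class GateResult:
    name: str      # e.g. "automated-tests", "approval", "metric-analysis"
    passed: bool


def next_environment(current: str, gates: list[GateResult]) -> str:
    """Return the environment to promote to, or stay put if a gate blocks it."""
    idx = ENVIRONMENTS.index(current)
    if idx == len(ENVIRONMENTS) - 1:
        return current  # already at full production rollout
    if not gates or not all(g.passed for g in gates):
        return current  # no gate evidence, or a failed gate: do not promote
    return ENVIRONMENTS[idx + 1]
```

A pipeline would call this once per stage with the results it collected, so a single failed check keeps the artifact where it is.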

### Infrastructure as Code (IaC)

**Tool Landscape (early 2026):**

| Tool | Language | Multi-Cloud | License | State Management |
|------|----------|-------------|---------|------------------|
| Terraform 1.10+ | HCL | Yes | BSL 1.1 (source-available) | Remote state (S3, TFC) |
| OpenTofu 1.9+ | HCL | Yes | MPL 2.0 (open source) | Same as Terraform |
| Pulumi 3.x | Python/TS/Go/C# | Yes | Apache 2.0 | Pulumi Cloud or self-managed |
| AWS CDK 2.x | TS/Python/Java/C# | AWS only | Apache 2.0 | CloudFormation |

**Critical note**: Terraform has shipped under BSL 1.1 since August 2023 (releases after 1.5.x), so it is source-available rather than open source. OpenTofu (CNCF sandbox) is the open-source fork and a near drop-in replacement. CDKTF was deprecated in December 2025.

**Best Practices:** Pin provider/module versions explicitly. Store state remotely with locking. Separate state per environment and per component. Run `plan`/`preview` in CI before any apply. Tag all resources with `team`, `env`, `cost-center`, `managed-by`.
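
The tagging rule lends itself to an automated check in CI. The sketch below assumes the resources and their tags have already been extracted into a plain dictionary (for example from a plan export); that input shape and the `missing_tags` helper are hypothetical, not part of any IaC tool.

```python
# Tag keys required by the best-practice list above.
REQUIRED_TAGS = {"team", "env", "cost-center", "managed-by"}


def missing_tags(resources: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Map each resource address to the required tag keys it lacks."""
    report = {}
    for address, tags in resources.items():
        missing = sorted(REQUIRED_TAGS - set(tags))
        if missing:
            report[address] = missing
    return report
```

A CI job could fail the `plan` stage whenever the returned report is non-empty.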

### Container Orchestration

**Kubernetes** remains the standard for production orchestration. Key practices: use namespaces for isolation; define resource requests AND limits; use `PodDisruptionBudget` for availability during drains; prefer `Deployment` (stateless) or `StatefulSet` (stateful); use HPA with custom metrics; deploy via GitOps.

**Docker Compose** suits local development. **ECS Fargate** is a valid simpler alternative for AWS-only workloads not needing K8s ecosystem tooling.

### GitOps Workflows

GitOps treats Git as the single source of truth. An agent inside the cluster continuously reconciles state with Git.

- **ArgoCD**: Rich web UI, multi-tenancy, RBAC, SSO. Stronger ecosystem and enterprise backing. Recommended for most new projects. CNCF graduated.
- **Flux**: Kubernetes-native (CRDs), modular, CLI-driven, lightweight. CNCF graduated. Weaveworks shut down in 2024; ArgoCD has stronger momentum.

**Best Practices:** Separate app code repos from GitOps config repos. Use Kustomize overlays or Helm values per environment. Enable drift detection. Require PRs for all changes. Use sealed secrets or external secret operators.

### Configuration Management

- **Ansible** (agentless, SSH-based): Best for VM provisioning and OS configuration.
- **Chef/Puppet**: Legacy environments only. Prefer Ansible for new projects.

### Branching Strategies

| Strategy | CI/CD Implications | Best For |
|----------|-------------------|----------|
| **Trunk-based** | CI on every commit to main; short-lived branches (<1 day) | Continuous deployment |
| **GitHub Flow** | CI on PR branches; CD triggers on merge to main | Most SaaS teams |
| **GitFlow** | CI on feature/develop/release branches; complex release trains | Versioned/scheduled releases |

Trunk-based development with feature flags is the recommended default for continuous deployment.

### Artifact Management

Use a dedicated registry (GHCR, ECR, Artifact Registry, Artifactory). Tag images with Git SHA, not `latest`. Implement retention policies. Sign artifacts with Sigstore/cosign.

### Secret Management

| Tool | Best For |
|------|----------|
| **HashiCorp Vault** | Dynamic secrets, PKI, multi-cloud |
| **AWS Secrets Manager** | AWS-native workloads, automatic rotation |
| **SOPS** (CNCF sandbox, originally Mozilla) | Encrypting secrets in Git (KMS, age, or PGP backends) |
| **External Secrets Operator** | Syncing cloud secrets into K8s Secrets |

Never store secrets in code or CI/CD logs. Rotate static credentials every 90 days maximum. Use short-lived, dynamically generated credentials wherever possible (Vault dynamic secrets, IAM Roles for Service Accounts, GCP Workload Identity).
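
The 90-day ceiling is easy to audit mechanically. The sketch below assumes an inventory that maps each static credential name to its creation date; the inventory and the `overdue_credentials` helper are hypothetical, and a real audit would pull these dates from the secret store.

```python
from datetime import date, timedelta

MAX_AGE = timedelta(days=90)  # rotation ceiling stated above


def overdue_credentials(created: dict[str, date], today: date) -> list[str]:
    """Return names of static credentials older than MAX_AGE, oldest first."""
    stale = [(made, name) for name, made in created.items() if today - made > MAX_AGE]
    return [name for made, name in sorted(stale)]
```

Running this on a schedule and alerting on a non-empty result turns the rotation policy into something enforceable rather than aspirational.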

---

## Anti-Patterns & Pitfalls

### 1. Creating a Separate "DevOps Team"
**Why**: Creates another silo, contradicting DevOps's goal of shared responsibility. Teams should own their own pipelines and infrastructure.

### 2. Tool-First, Culture-Last
**Why**: Adopting K8s/Docker/Jenkins without changing collaboration patterns delivers zero value. Tools amplify culture; they do not replace it.

### 3. Manual Deployments to Production
**Why**: Human error, unrepeatable, unauditable. Every deployment must be automated through the pipeline.

### 4. No Rollback Strategy
**Why**: A failed deployment without a tested rollback path becomes an incident. Always have one-click rollback.

### 5. Snowflake Servers / Configuration Drift
**Why**: Manually configured servers diverge and cannot be reproduced. IaC + immutable infrastructure eliminates this.

### 6. Secrets in Source Control
**Why**: Secrets persist in Git history even after deletion. Bots actively scan repos for leaked credentials.

### 7. Monolithic Pipelines (No Parallelism)
**Why**: Sequential execution turns 5-minute pipelines into 45-minute pipelines. Developers batch changes, reducing feedback speed.

### 8. Skipping Staging
**Why**: Without production-parity staging, bugs from network policies, resource limits, and DNS surface only in production.

### 9. Over-Automation Without Process Understanding
**Why**: Automating a broken process makes it break faster. Optimize the process first, then automate.

### 10. Ignoring Pipeline as Code Versioning
**Why**: Editing pipelines via web UI means no audit trail, no code review, no rollback capability.

### 11. Alert Fatigue
**Why**: Hundreds of noisy alerts train teams to ignore all alerts. Every alert must be actionable.

### 12. "Lift and Shift" to Kubernetes
**Why**: Moving monoliths into containers without architectural changes adds complexity without benefits.

### 13. Hardcoded Environment Configuration
**Why**: Config baked into images requires rebuilding per environment, breaking immutable artifact principles.

### 14. No Observability Until Incidents
**Why**: Monitoring from day one is essential. Without baseline metrics, you cannot compare during incidents.

### 15. Premature Microservices
**Why**: Adds network complexity and operational overhead. Start with a structured monolith; extract when scale demands it.

---

## Testing Strategy

### Pipeline Testing

- **Linting**: Dockerfiles (`hadolint`), Helm (`helm lint`), Terraform (`tflint`), YAML (`yamllint`).
- **Unit tests**: Coverage thresholds (e.g., 80% min). Fail pipeline on coverage regression.
- **Integration tests**: Containerized dependencies via Docker Compose or Testcontainers.
- **E2E tests**: Deployed staging environment. Playwright/Cypress. Test critical paths only.

### Infrastructure Testing

| Tool | Purpose |
|------|---------|
| **Terratest** | Integration tests for Terraform modules (Go) |
| **Checkov** | Static analysis for IaC security misconfigurations |
| **tfsec / Trivy IaC** | Security scanning for Terraform, CloudFormation, K8s manifests |
| **OPA/Conftest** | Policy testing against structured data (JSON/YAML/HCL) |

### Deployment Testing

- **Canary analysis**: 5-10% traffic to new version; Argo Rollouts or Flagger with Prometheus metrics (error rate, p99 latency) for auto-promote/rollback.
- **Blue/green validation**: Smoke tests against green before switching the load balancer.
- **Smoke tests**: HTTP checks on `/health`, `/readiness`, key API routes post-deployment.
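
The auto-promote/rollback decision in canary analysis reduces to comparing canary metrics against limits. The following is a minimal sketch of that comparison only; the `Metrics` type, the 1% error-rate ceiling, and the 1.2x latency ratio are illustrative assumptions, not defaults of Argo Rollouts or Flagger.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    error_rate: float       # fraction of failed requests, 0.0-1.0
    p99_latency_ms: float


def canary_verdict(baseline: Metrics, canary: Metrics,
                   max_error_rate: float = 0.01,
                   max_latency_ratio: float = 1.2) -> str:
    """Return "promote" or "rollback" from canary vs. baseline metrics."""
    if canary.error_rate > max_error_rate:
        return "rollback"   # absolute error-rate ceiling breached
    if canary.p99_latency_ms > baseline.p99_latency_ms * max_latency_ratio:
        return "rollback"   # canary is markedly slower than the stable version
    return "promote"
```

Checking latency relative to the baseline rather than against a fixed number keeps the gate meaningful when overall traffic conditions shift during the rollout.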

### Chaos Engineering

- **Tools**: LitmusChaos (CNCF, K8s-native), Gremlin (commercial), Chaos Mesh (CNCF incubating).
- Start small (kill a pod, add latency). Define steady state first. Limit blast radius.
- Integrate chaos experiments into staging release pipelines for resilience validation.

---

## Performance Considerations

### Pipeline Speed Optimization

- **Dependency caching**: Cache `node_modules`, `.m2/repository`, pip wheels between runs. GHA: `actions/cache@v4`. GitLab: `cache:` directive.
- **Docker layer caching**: BuildKit with `--cache-from`/`--cache-to` for registry caching. GHA: `cache-from: type=gha`.
- **Remote build caching**: Gradle remote cache, Bazel remote execution, Nx Cloud, Turborepo.
- **Parallelism**: Split test suites (`jest --shard`, `pytest-split`). Fan-out/fan-in pattern. Matrix builds for multi-platform testing.
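
To make the fan-out idea concrete, here is one way a test suite could be split across parallel jobs by recorded duration. It is a generic greedy (longest-first) sketch with a hypothetical `split_into_shards` helper, not a description of how `jest --shard` or `pytest-split` work internally.

```python
import heapq


def split_into_shards(durations: dict[str, float], shard_count: int) -> list[list[str]]:
    """Assign each test file, longest first, to the currently least-loaded shard."""
    heap = [(0.0, i) for i in range(shard_count)]  # (total seconds, shard index)
    shards: list[list[str]] = [[] for _ in range(shard_count)]
    # Sort by duration descending, then by name, so the split is deterministic.
    for name, seconds in sorted(durations.items(), key=lambda kv: (-kv[1], kv[0])):
        total, idx = heapq.heappop(heap)
        shards[idx].append(name)
        heapq.heappush(heap, (total + seconds, idx))
    return shards
```

Each CI job then runs only its own shard, and a final fan-in job aggregates the results.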

### Docker Layer Caching

```dockerfile
# BAD: Invalidates cache on any file change
WORKDIR /app
COPY . /app
RUN npm install

# GOOD: Dependency manifest first, then source
WORKDIR /app
COPY package.json package-lock.json /app/
# --omit=dev replaces the deprecated --production flag
RUN npm ci --omit=dev
COPY . /app
```

Order instructions least-to-most frequently changed. Use `.dockerignore`. Use cache mounts: `RUN --mount=type=cache,target=/root/.npm npm ci`. Separate build stage from runtime via multi-stage builds.

### Monorepo CI Optimization

- **Affected detection**: Nx (`nx affected`), Turborepo, Bazel, or `git diff` to build only what changed.
- **Task graph**: Nx/Bazel model inter-package dependencies for correct order + max parallelism.
- **Impact**: 60-80% CI time reduction with selective execution + remote caching.

---

## Security Considerations

### Supply Chain Security

**SBOM**: Generate in SPDX 3.0 or CycloneDX for every release. Tools: Syft, Trivy, `cdxgen`. Mandated by U.S. EO 14028 for federal suppliers; CISA updated minimum elements in 2025.

**SLSA 1.0** (Build track, levels L0-L3):
- **L1**: Provenance exists showing how the artifact was built. **L2**: Hosted build + signed provenance (achievable in weeks with GHA OIDC + Sigstore). **L3**: Hardened platform, non-forgeable provenance. The **L4** of the earlier v0.1 draft (two-person review, hermetic builds) is not part of v1.0.

**Sigstore**: Keyless signing via `cosign` with OIDC identity. Sign: `cosign sign --yes <image>@<digest>`. Verify: `cosign verify --certificate-identity=... --certificate-oidc-issuer=... <image>`. Rekor transparency log for tamper-proof audit.

### Container Scanning

| Tool | Type | Strengths |
|------|------|-----------|
| **Trivy** | OSS, Apache 2.0 | All-in-one: containers, IaC, secrets, SBOM, licenses |
| **Snyk Container** | Commercial ($25/dev/mo) | Actionable remediation, auto-fix PRs |
| **Grype** | OSS | Fast, pairs with Syft for SBOM-based scanning |

Scan in CI (block on HIGH/CRITICAL), in registries (admission control), and at runtime. Trivy recommended for cost-sensitive teams.

### Secret Scanning and Rotation

Enable GitHub secret scanning + push protection. Use `gitleaks` or `trufflehog` as pre-commit hooks. Rotate automatically (AWS Secrets Manager + Lambda). Use OIDC-based auth in CI/CD to eliminate static credentials (GHA OIDC with AWS/GCP/Azure).

### RBAC for CI/CD

Least privilege for service accounts and runner tokens. Short-lived credentials scoped to repos/environments. GHA: environment protection rules, required reviewers, deployment branches. Separate build (read-only) from deploy (write) permissions.

### Compliance as Code

- **OPA**: CNCF graduated. Rego-based policies. Steeper learning curve. Best for cross-cutting concerns (API auth, Terraform plan validation, SOC2 mapping).
- **Kyverno**: CNCF incubating. YAML-based, K8s-native. Lower learning curve. Built-in mutation. Best for K8s policies (pod security, image registry restrictions).

---

## Integration Patterns

### CI/CD Platform Patterns

**GitHub Actions:** Reusable workflows as templates. Pin actions to commit SHAs in production. `secrets.inherit` for passing secrets. Limit: 50 workflows, 10 nested reusable per run (Feb 2026). `concurrency` groups to cancel redundant runs.

**GitLab CI:** `include:` for shared templates. `rules:changes:` for path-based triggering. `needs:` for DAG-based parallel execution.

**Jenkins:** Shared libraries for reuse. Declarative pipelines over scripted. K8s plugin for ephemeral agents. Market share declining in favor of GHA and GitLab CI.

### Multi-Cloud Deployment

Use multi-cloud IaC (Terraform/OpenTofu, Pulumi). Abstract cloud details behind modules. Single CI/CD platform deploying to multiple clouds. Consistent tagging, monitoring, security across clouds.

### Database Migration in CI/CD

- **Flyway**: Simple, sequential SQL migrations. Lightweight.
- **Liquibase**: Advanced governance, rollback, drift detection.
- **Atlas**: Modern, HCL-based.

Run migrations after build, before app deployment. Store scripts in VCS (`db/migrations/`). Design forward-compatible migrations. Separate migration credentials (elevated) from app credentials.

### Feature Flags

- **LaunchDarkly**: Enterprise, FedRAMP/SOC2. 25% of Fortune 500.
- **Unleash**: Open-source, self-hostable.

Decouple deployment from release. Set expiration dates on temporary flags. Never reuse flag names (linked to 32% of production incidents). Audit and remove stale flags regularly.
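
Expiration dates only help if something checks them. The sketch below shows one possible audit over a flag inventory; the `Flag` type and `stale_flags` helper are hypothetical and do not correspond to the LaunchDarkly or Unleash APIs.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Flag:
    name: str
    expires: Optional[date] = None  # None marks a permanent (operational) flag


def stale_flags(flags: list[Flag], today: date) -> list[str]:
    """Names of temporary flags whose expiration date has passed."""
    return sorted(f.name for f in flags if f.expires is not None and f.expires < today)
```

A scheduled job could open a cleanup ticket for every name this returns, which keeps the regular audit from depending on memory.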

---

## DevOps & Deployment

### Deployment Strategies

| Strategy | Downtime | Risk | Resource Cost | Best For |
|----------|----------|------|---------------|----------|
| **Rolling** | Zero | Medium | 1x + surge | Stateless apps, K8s default |
| **Blue/Green** | Zero | Low | 2x | Zero-downtime with instant rollback |
| **Canary** | Zero | Very Low | 1x + small % | High-traffic, gradual validation |
| **A/B Testing** | Zero | Low | 1x + split | Feature validation with user segments |
| **Recreate** | Yes | High | 1x | Dev/test, stateful legacy |

**Tooling**: Argo Rollouts (canary/blue-green with Prometheus analysis), Flagger (Istio/Linkerd/Traefik traffic shifting).

### Rollback Patterns

- **K8s**: `kubectl rollout undo deployment/<name>`.
- **GitOps**: Revert commit in config repo; ArgoCD/Flux reconciles automatically.
- **Blue/green**: Switch LB back to blue environment.
- **Database**: Backward-compatible migrations (expand-and-contract pattern).
- **Feature flags**: Disable the flag instantly without redeployment.

### Observability Stack

**The "LGTM" Stack (2026):** Loki (logs) + Grafana 11.x (dashboards) + Tempo/Jaeger (traces) + Mimir or Prometheus 3.x (metrics).

**OpenTelemetry** is the unified instrumentation standard (48.5% adoption, 2025 survey). Vendor-neutral SDKs for metrics, traces, logs. Prometheus 3.x supports OTLP ingestion natively.

**Key practices:** RED metrics (Rate/Errors/Duration) for services; USE metrics (Utilization/Saturation/Errors) for infrastructure. Alert on symptoms (error rate, latency), not causes (CPU). Define SLOs and alert on error budget consumption.
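
Alerting on error budget consumption is usually expressed as a burn rate: the observed error rate divided by the error rate the SLO allows. The sketch below is illustrative; the function names are hypothetical, and the 14.4 threshold (about 2% of a 30-day budget spent in one hour) is a commonly cited starting point assumed here rather than taken from this document.

```python
def burn_rate(errors: int, requests: int, slo: float) -> float:
    """How fast the error budget is being spent; 1.0 means exactly on budget."""
    if requests == 0:
        return 0.0
    budget = 1.0 - slo  # allowed error fraction, e.g. 0.001 for a 99.9% SLO
    return (errors / requests) / budget


def should_page(short_window: float, long_window: float, threshold: float = 14.4) -> bool:
    """Page only when both windows burn fast, which filters out brief spikes."""
    return short_window >= threshold and long_window >= threshold
```

Requiring agreement between a short and a long window is what makes this a symptom-based alert that stays quiet during momentary blips.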

### Incident Response Automation

- **PagerDuty / Opsgenie / incident.io**: Alert routing with escalation policies.
- **Runbook automation**: Pre-defined diagnostic/remediation workflows (PagerDuty Runbook Automation, Rundeck).
- **ChatOps**: Slack/Teams integration for status updates, escalation, timeline generation.
- **Post-incident**: Blameless retrospectives. Document timeline and action items.

---

## Decision Trees

### Decision Tree 1: Which IaC Tool?

```
START: Multi-cloud needed?
  +-- NO (AWS only) --> Want CloudFormation safety nets?
  |     +-- YES --> AWS CDK 2.x
  |     +-- NO --> Terraform/OpenTofu or Pulumi
  +-- YES --> Prefer declarative (HCL) or imperative (Python/TS/Go)?
        +-- Declarative --> Need open-source license?
        |     +-- YES --> OpenTofu 1.9+ (MPL 2.0, CNCF)
        |     +-- NO --> Terraform 1.10+ (BSL 1.1, IBM/HashiCorp)
        +-- Imperative --> Pulumi 3.x (native testing, IDE support, ~30% faster onboarding)
```

### Decision Tree 2: Which Deployment Strategy?

```
START: Can the app tolerate brief partial unavailability?
  +-- YES --> Low-risk change?
  |     +-- YES --> Rolling (K8s default, simplest)
  |     +-- NO --> Blue/Green (instant rollback, 2x resources)
  +-- NO --> Have metric-based automation (Prometheus, etc.)?
        +-- YES --> Canary with Argo Rollouts/Flagger (auto-promote/rollback)
        +-- NO --> Blue/Green with manual verification
Need user-segment targeting? --> A/B with feature flags
```
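
Decision Tree 2 can also be written as a small function, which makes it usable as a pipeline default. This is one reading of the tree, with hypothetical names; treating the user-segment question as an override checked first is an assumption, since the tree lists it as a separate line.

```python
def choose_strategy(tolerates_partial_unavailability: bool,
                    low_risk_change: bool,
                    has_metric_automation: bool,
                    needs_user_segments: bool = False) -> str:
    """Return the deployment strategy suggested by Decision Tree 2."""
    if needs_user_segments:
        return "a/b"  # assumed override: segment targeting implies A/B with flags
    if tolerates_partial_unavailability:
        return "rolling" if low_risk_change else "blue-green"
    # No tolerance for partial unavailability: prefer automated canary analysis.
    return "canary" if has_metric_automation else "blue-green-manual"
```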
|
|
341
|
+
|
|
342
|
+
### Decision Tree 3: Kubernetes vs. Simpler Alternatives?
|
|
343
|
+
|
|
344
|
+
```
|
|
345
|
+
START: How many services?
|
|
346
|
+
+-- 1-3 --> Need auto-scaling/self-healing/multi-region?
|
|
347
|
+
| +-- NO --> ECS Fargate / Cloud Run / App Runner
|
|
348
|
+
| +-- YES --> Consider managed K8s (evaluate overhead)
|
|
349
|
+
+-- 4-10 --> Have platform engineering capacity?
|
|
350
|
+
| +-- YES --> Managed K8s (EKS/GKE/AKS)
|
|
351
|
+
| +-- NO --> ECS Fargate / Cloud Run
|
|
352
|
+
+-- 10+ --> Managed K8s + platform team/IDP + ArgoCD GitOps
|
|
353
|
+
```
|
|
354
|
+
|
|
355
|
+
---
|
|
356
|
+
|
|
357
|
+
## Code Examples
|
|
358
|
+
|
|
359
|
+
### Example 1: GitHub Actions CI/CD Pipeline
|
|
360
|
+
|
|
361
|
+
```yaml
|
|
362
|
+
name: CI/CD Pipeline
|
|
363
|
+
on:
|
|
364
|
+
push: { branches: [main] }
|
|
365
|
+
pull_request: { branches: [main] }
|
|
366
|
+
concurrency:
|
|
367
|
+
group: ${{ github.workflow }}-${{ github.ref }}
|
|
368
|
+
cancel-in-progress: true
|
|
369
|
+
permissions:
|
|
370
|
+
contents: read
|
|
371
|
+
id-token: write
|
|
372
|
+
packages: write
|
|
373
|
+
|
|
374
|
+
jobs:
|
|
375
|
+
lint-and-test:
|
|
376
|
+
runs-on: ubuntu-24.04
|
|
377
|
+
steps:
|
|
378
|
+
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683 # v4.2.2
|
|
379
|
+
- uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4.4.0
|
|
380
|
+
with: { node-version: 22, cache: npm }
|
|
381
|
+
- run: npm ci && npm run lint && npm test -- --coverage
|
|
382
|
+
|
|
383
|
+
security-scan:
|
|
384
|
+
runs-on: ubuntu-24.04
|
|
385
|
+
steps:
|
|
386
|
+
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
|
|
387
|
+
- uses: aquasecurity/trivy-action@18f2510ee396bbf400402947e7f3b01b8e110956 # v0.29.0
|
|
388
|
+
with: { scan-type: fs, severity: "CRITICAL,HIGH", exit-code: 1 }
|
|
389
|
+
|
|
390
|
+
build-and-push:
|
|
391
|
+
needs: [lint-and-test, security-scan]
|
|
392
|
+
if: github.ref == 'refs/heads/main'
|
|
393
|
+
runs-on: ubuntu-24.04
|
|
394
|
+
steps:
|
|
395
|
+
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
|
|
396
|
+
- uses: docker/setup-buildx-action@b5ca514318bd6ebac0fb2aedd5d36ec1b5c232a2
|
|
397
|
+
- uses: docker/login-action@74a5d142397b4f367a81961eba4e8cd7edddf772
|
|
398
|
+
with: { registry: ghcr.io, username: "${{ github.actor }}", password: "${{ secrets.GITHUB_TOKEN }}" }
|
|
399
|
+
- uses: docker/build-push-action@14487ce63c7a62a4a324b0bfb37086795e31c6c1
|
|
400
|
+
with:
|
|
401
|
+
push: true
|
|
402
|
+
tags: ghcr.io/${{ github.repository }}:${{ github.sha }}
|
|
403
|
+
cache-from: type=gha
|
|
404
|
+
cache-to: type=gha,mode=max
|
|
405
|
+
```
|
|
406
|
+
|
|
407
|
+
### Example 2: Production Dockerfile (Multi-Stage)
|
|
408
|
+
|
|
409
|
+
```dockerfile
|
|
410
|
+
# syntax=docker/dockerfile:1.9
|
|
411
|
+
FROM node:22-alpine AS builder
|
|
412
|
+
WORKDIR /app
|
|
413
|
+
COPY package.json package-lock.json ./
|
|
414
|
+
RUN --mount=type=cache,target=/root/.npm npm ci
|
|
415
|
+
COPY tsconfig.json ./
|
|
416
|
+
COPY src/ ./src/
|
|
417
|
+
RUN npm run build && npm prune --production
|
|
418
|
+
|
|
419
|
+
FROM node:22-alpine AS runtime
|
|
420
|
+
RUN apk add --no-cache tini && adduser -u 1001 -D appuser
|
|
421
|
+
WORKDIR /app
|
|
422
|
+
COPY --from=builder --chown=appuser /app/dist ./dist
|
|
423
|
+
COPY --from=builder --chown=appuser /app/node_modules ./node_modules
|
|
424
|
+
COPY --from=builder --chown=appuser /app/package.json ./
|
|
425
|
+
USER appuser
|
|
426
|
+
EXPOSE 3000
|
|
427
|
+
HEALTHCHECK --interval=30s --timeout=3s CMD wget -qO- http://localhost:3000/health || exit 1
|
|
428
|
+
ENTRYPOINT ["tini", "--"]
|
|
429
|
+
CMD ["node", "dist/index.js"]
|
|
430
|
+
```
|
|
431
|
+
|
|
432
|
+
### Example 3: Terraform ECS Fargate Service (condensed)
|
|
433
|
+
|
|
434
|
+
```hcl
|
|
435
|
+
# modules/ecs-service/main.tf
|
|
436
|
+
terraform {
|
|
437
|
+
required_version = ">= 1.9.0"
|
|
438
|
+
required_providers {
|
|
439
|
+
aws = { source = "hashicorp/aws", version = "~> 5.80" }
|
|
440
|
+
}
|
|
441
|
+
}
|
|
442
|
+
|
|
443
|
+
resource "aws_ecs_service" "this" {
|
|
444
|
+
name = var.service_name
|
|
445
|
+
cluster = var.cluster_arn
|
|
446
|
+
task_definition = aws_ecs_task_definition.this.arn
|
|
447
|
+
desired_count = var.desired_count
|
|
448
|
+
launch_type = "FARGATE"
|
|
449
|
+
|
|
450
|
+
network_configuration {
|
|
451
|
+
subnets = var.subnet_ids
|
|
452
|
+
security_groups = [aws_security_group.service.id]
|
|
453
|
+
assign_public_ip = false
|
|
454
|
+
}
|
|
455
|
+
deployment_circuit_breaker {
|
|
456
|
+
enable = true
|
|
457
|
+
rollback = true # Auto-rollback on deployment failure
|
|
458
|
+
}
|
|
459
|
+
tags = merge(var.tags, { ManagedBy = "terraform" })
|
|
460
|
+
}
|
|
461
|
+
```

### Example 4: ArgoCD Application + Argo Rollouts Canary

```yaml
# ArgoCD Application
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-service
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/org/gitops-config.git
    targetRevision: main
    path: services/my-service/overlays/production
  destination:
    server: https://kubernetes.default.svc
    namespace: my-service
  syncPolicy:
    automated: { prune: true, selfHeal: true }
    syncOptions: [CreateNamespace=true, ServerSideApply=true]
---
# Argo Rollouts Canary with Prometheus analysis
# (condensed: pod template and canary/stable Services omitted)
apiVersion: argoproj.io/v1alpha1
kind: Rollout
metadata:
  name: my-service
spec:
  replicas: 10
  selector:
    matchLabels: { app: my-service }
  strategy:
    canary:
      steps:
        - setWeight: 5
        - pause: { duration: 2m }
        - analysis:
            templates: [{ templateName: success-rate }]
        - setWeight: 25
        - pause: { duration: 5m }
        - setWeight: 100
      trafficRouting:
        istio:
          virtualService: { name: my-service-vsvc }
---
apiVersion: argoproj.io/v1alpha1
kind: AnalysisTemplate
metadata:
  name: success-rate
spec:
  metrics:
    - name: success-rate
      interval: 60s
      count: 5 # bound the run so the inline analysis step can complete
      successCondition: result[0] >= 0.99
      failureLimit: 3
      provider:
        prometheus:
          address: http://prometheus.monitoring:9090
          query: |
            sum(rate(http_requests_total{service="my-service",status=~"2.."}[2m]))
            / sum(rate(http_requests_total{service="my-service"}[2m]))
```

---

## Deployment Strategies

Production deployment requires deliberate strategy selection based on risk tolerance, resource
budget, and rollback requirements. The patterns below move from simplest (blue-green) through
progressive delivery (canary) to the infrastructure and observability layers that support them.

### Blue-Green Deployment

Two identical environments run simultaneously: **blue** (current production) and **green** (new
release candidate). Traffic is routed entirely to one environment at a time. The deployment
sequence is: deploy to green, verify health, switch traffic, keep blue as instant rollback.

**Advantages:** Zero downtime, instant rollback (switch back to blue), full production-parity
testing before user exposure. **Trade-off:** Requires 2x infrastructure during the transition
window.

```yaml
# .github/workflows/blue-green-deploy.yml
name: Blue-Green Deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - uses: actions/checkout@v4

      - name: Deploy to green environment
        # NOTE: ECS task definition revisions are numeric. Replace the
        # placeholder below with the family:revision (or ARN) registered
        # for this commit.
        run: |
          aws ecs update-service \
            --cluster prod \
            --service app-green \
            --task-definition app:${{ github.sha }} \
            --desired-count 3

      - name: Wait for green to stabilize
        run: |
          aws ecs wait services-stable \
            --cluster prod \
            --services app-green

      - name: Health check green
        run: |
          for i in $(seq 1 30); do
            # No -f here, and `|| true` on connection errors: a failed attempt
            # must not abort the step under `bash -e` before the retry runs.
            STATUS=$(curl -s -o /dev/null -w "%{http_code}" https://green.app.example.com/health || true)
            if [ "$STATUS" = "200" ]; then
              echo "Health check passed on attempt $i"
              exit 0
            fi
            echo "Attempt $i failed (status: $STATUS), retrying in 10s..."
            sleep 10
          done
          echo "Health check failed after 30 attempts"
          exit 1

      - name: Switch traffic to green
        run: |
          aws elbv2 modify-listener \
            --listener-arn ${{ secrets.ALB_LISTENER_ARN }} \
            --default-actions Type=forward,TargetGroupArn=${{ secrets.GREEN_TG_ARN }}

      - name: Verify traffic switch
        run: |
          sleep 30
          curl -sf https://app.example.com/health | jq .version
```

**Rollback procedure:** If post-switch monitoring detects anomalies, revert the listener to
point back at the blue target group. No redeployment required -- blue is still running the
previous known-good version.
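
The blue-green sequence and its rollback can be sketched as plain control flow. This is a minimal illustration, not part of the workflow above: `blue_green_release` and its four callables are hypothetical stand-ins for the deploy, health-check, listener-switch, and post-switch monitoring steps, and the sleep between health-check attempts is left out for brevity.

```python
from typing import Callable


def blue_green_release(
    deploy_green: Callable[[], None],
    green_healthy: Callable[[], bool],
    switch_traffic: Callable[[str], None],
    post_switch_ok: Callable[[], bool],
    attempts: int = 30,
) -> str:
    """Run the blue-green sequence and return the colour serving traffic."""
    deploy_green()
    # Poll green before exposing it to users; blue keeps serving meanwhile.
    if not any(green_healthy() for _ in range(attempts)):
        return "blue"
    switch_traffic("green")
    # Rollback is a second traffic switch, not a redeployment.
    if not post_switch_ok():
        switch_traffic("blue")
        return "blue"
    return "green"
```

Because blue is never touched during the release, every failure path ends with blue serving traffic.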

### Canary Deployment

Canary releases route a small percentage of production traffic to the new version while the
majority continues hitting the stable release. Traffic is shifted incrementally as confidence
grows: **5% -> 25% -> 100%**. At each stage, automated analysis compares error rates, latency
percentiles, and business metrics between the canary and the baseline.

**When to use canary over blue-green:**
- High-traffic services where even brief full-cutover risk is unacceptable
- When metric-based automated promotion/rollback is available (Argo Rollouts, Flagger)
- When you need gradual user exposure to catch long-tail issues

**Traffic progression example:**

```
Step 1: 5% canary -- 2 min pause -- run AnalysisTemplate (success rate >= 99%)
Step 2: 25% canary -- 5 min pause -- run AnalysisTemplate
Step 3: 100% canary -- promotion complete
```

If any analysis step fails, traffic is automatically routed back to the stable version. The
canary pods are scaled down and the rollout is marked as degraded. See Example 4 (Argo Rollouts)
in the Code Examples section above for a condensed manifest covering the canary steps and the
analysis template.

**Key metrics to monitor during canary analysis:**
- HTTP error rate (5xx / total requests) -- threshold: < 1%
- P95 and P99 latency -- threshold: within 10% of baseline
- Pod restart count -- threshold: 0 restarts during analysis window
- Business metrics (conversion rate, checkout success) when applicable
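
The thresholds above can be combined into a single promotion gate. The sketch below is illustrative only: `Snapshot` and `canary_passes` are hypothetical names, "within 10% of baseline" is read as "no more than 10% slower than baseline", and business metrics are left out because their thresholds are service-specific.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical metrics sample for one analysis window."""
    requests: int
    errors_5xx: int
    p95_ms: float
    p99_ms: float
    restarts: int = 0


def canary_passes(canary: Snapshot, baseline: Snapshot) -> bool:
    """Apply the listed thresholds to one analysis window."""
    if canary.requests == 0:
        return False  # no traffic means no evidence; do not promote
    error_rate = canary.errors_5xx / canary.requests
    return (
        error_rate < 0.01                              # 5xx rate below 1%
        and canary.p95_ms <= baseline.p95_ms * 1.10    # P95 within 10%
        and canary.p99_ms <= baseline.p99_ms * 1.10    # P99 within 10%
        and canary.restarts == 0                       # no pod restarts
    )
```

A failing gate at any step corresponds to the automatic rollback described above; a passing gate lets the rollout move to the next traffic weight.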

### Infrastructure as Code (Terraform)

Auto Scaling Groups with target tracking policies provide elastic capacity that responds to
real-time demand. The configuration below demonstrates a rolling instance refresh strategy
that maintains 75% healthy capacity during deployments -- ensuring zero downtime while
replacing instances with updated launch templates.

```hcl
# Auto Scaling Group with target tracking
resource "aws_autoscaling_group" "app" {
  name                      = "${var.project}-${var.environment}-asg"
  min_size                  = var.min_instances
  max_size                  = var.max_instances
  desired_capacity          = var.desired_instances
  health_check_type         = "ELB"
  health_check_grace_period = 300
  target_group_arns         = [aws_lb_target_group.app.arn]
  vpc_zone_identifier       = var.private_subnet_ids

  launch_template {
    id = aws_launch_template.app.id
    # Pin to latest_version rather than "$Latest": a refresh does not start
    # when the version is the literal "$Latest".
    version = aws_launch_template.app.latest_version
  }

  instance_refresh {
    strategy = "Rolling"
    preferences {
      min_healthy_percentage = 75
    }
  }

  tag {
    key                 = "Environment"
    value               = var.environment
    propagate_at_launch = true
  }
}

# CPU-based auto scaling
resource "aws_autoscaling_policy" "cpu_target" {
  name                   = "${var.project}-cpu-tracking"
  autoscaling_group_name = aws_autoscaling_group.app.name
  policy_type            = "TargetTrackingScaling"

  target_tracking_configuration {
    predefined_metric_specification {
      predefined_metric_type = "ASGAverageCPUUtilization"
    }
    target_value     = 60.0
    disable_scale_in = false
  }
}
```

**Scaling considerations:**
- Set `health_check_grace_period` long enough for the application to fully start (including
  warm-up, cache priming, connection pool initialization).
- Use `mixed_instances_policy` with multiple instance types for cost optimization and
  availability across AZs.
- Pair CPU-based scaling with request-count scaling (`ALBRequestCountPerTarget`) for
  web-facing services -- CPU alone misses I/O-bound bottlenecks.
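
To get a feel for what a 75% minimum healthy percentage means for a rolling refresh, the arithmetic can be sketched as follows. `refresh_plan` is a hypothetical helper, and the rounding (round the healthy floor up, always allow at least one replacement) is an assumption made for illustration; the rounding AWS actually applies may differ.

```python
import math


def refresh_plan(desired: int, min_healthy_pct: int = 75) -> tuple[int, int]:
    """Return (instances that must stay healthy, instances replaceable at once)."""
    must_stay = math.ceil(desired * min_healthy_pct / 100)
    # Assumption: allow at least one replacement so the refresh can progress.
    batch = max(1, desired - must_stay)
    return must_stay, batch


for size in (4, 8, 10):
    healthy, batch = refresh_plan(size)
    print(f"{size} instances: keep {healthy} healthy, replace {batch} at a time")
```

Smaller groups refresh one instance at a time under this rule, so the wall-clock duration of a refresh depends heavily on `health_check_grace_period`.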

### Monitoring & Alerting (Prometheus)

SLO-based alerting focuses on what matters to users: error rates and latency. The rules below
alert on fixed SLO thresholds sustained over a `for` window; multi-window burn-rate alerts
extend the same idea to catch both sudden spikes and slow degradation. Every alert should
carry a `runbook` annotation linking to the remediation procedure -- alerts without runbooks
become noise.

```yaml
# prometheus-alerts.yml
groups:
  - name: slo-alerts
    rules:
      - alert: HighErrorRate
        expr: |
          sum(rate(http_requests_total{status=~"5.."}[5m]))
          / sum(rate(http_requests_total[5m])) > 0.01
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Error rate exceeds 1% SLO for 5 minutes"
          runbook: "https://wiki.example.com/runbooks/high-error-rate"

      - alert: HighP95Latency
        expr: |
          histogram_quantile(0.95,
            sum(rate(http_request_duration_seconds_bucket[5m])) by (le)
          ) > 0.5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "P95 latency exceeds 500ms SLO"

      - alert: PodCrashLooping
        expr: |
          increase(kube_pod_container_status_restarts_total[1h]) > 3
        for: 10m
        labels:
          severity: critical
        annotations:
          summary: "Pod {{ $labels.pod }} restarting frequently"
```

**Alert severity guidelines:**
- **critical**: Pages on-call immediately. Error budget burning at 14.4x+ rate. Examples:
  sustained 5xx spike, data loss risk, complete service unavailability.
- **warning**: Notifies team channel. Error budget burning at 6x+ rate. Examples: elevated
  latency, increased restart count, approaching resource limits.
- **info**: Dashboard-only. No notification. Examples: deployment started, scaling event,
  certificate renewal approaching.
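
The 14.4x and 6x burn rates are easier to reason about as arithmetic. The sketch below assumes a 30-day SLO period and pairs the rates with 1-hour and 6-hour windows; those pairings are a common convention, not something stated above, and the function names are hypothetical.

```python
def budget_consumed(burn_rate: float, window_hours: float, slo_period_days: int = 30) -> float:
    """Fraction of the error budget spent if `burn_rate` holds for `window_hours`."""
    return burn_rate * window_hours / (slo_period_days * 24)


def error_rate_threshold(burn_rate: float, slo: float) -> float:
    """Error ratio at which the budget burns at `burn_rate` times the sustainable pace."""
    return burn_rate * (1 - slo)


# Critical: 14.4x sustained for 1 hour spends about 2% of a 30-day budget.
print(f"{budget_consumed(14.4, 1):.2%}")
# Warning: 6x sustained for 6 hours spends about 5% of a 30-day budget.
print(f"{budget_consumed(6, 6):.2%}")
# With a 99% SLO, a 14.4x burn corresponds to a 14.4% error ratio.
print(f"{error_rate_threshold(14.4, 0.99):.1%}")
```

Under these assumptions, a burn-rate alert expression would compare the measured error ratio against `error_rate_threshold(...)` over the chosen window instead of against a fixed 1% line.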

### Zero-Downtime Database Migrations

Database schema changes are among the most common sources of deployment-related outages. The
**expand-contract pattern** ensures backwards compatibility throughout the migration lifecycle
by never removing or renaming something the running application depends on.

**The expand-contract sequence:**

1. **Expand (additive change):** Add the new column, table, or index. The existing application
   ignores these additions -- no code change needed yet. Run this migration independently,
   well before the application deployment.

2. **Deploy new application code:** The updated application writes to both old and new
   columns/tables. It reads from the new structure but falls back to the old if the new
   data is not yet populated.

3. **Backfill:** Migrate existing data from old structure to new. Use batched updates with
   throttling to avoid locking tables or overwhelming replication. Verify row counts match.

4. **Add constraints:** Once backfill is complete and verified, add NOT NULL constraints,
   foreign keys, or unique indexes on the new structure.

5. **Deploy cleanup code:** Remove the fallback reads and dual-writes from the application.
   The application now uses only the new structure.

6. **Contract (remove old structure):** Drop the old column, table, or index. This is safe
   because no running code references it.
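
Steps 1 and 3 of the sequence above can be sketched with an in-memory SQLite database. This is a toy illustration under stated assumptions: the `users` table, the `fullname` and `display_name` columns, and `backfill_in_batches` are hypothetical, the batch size is tiny for demonstration, and the throttling pause between batches is omitted.

```python
import sqlite3


def backfill_in_batches(conn: sqlite3.Connection, batch_size: int = 2) -> int:
    """Copy users.fullname into users.display_name in small committed batches."""
    total = 0
    while True:
        cur = conn.execute(
            "UPDATE users SET display_name = fullname "
            "WHERE id IN (SELECT id FROM users WHERE display_name IS NULL LIMIT ?)",
            (batch_size,),
        )
        conn.commit()  # short transactions keep locks brief
        if cur.rowcount == 0:
            return total
        total += cur.rowcount


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, fullname TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (fullname) VALUES (?)",
    [("Ada",), ("Grace",), ("Linus",), ("Margaret",), ("Ken",)],
)

# Step 1 (expand): an additive, nullable column that old code can ignore.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")

# Step 3 (backfill): batched copy, then verify that nothing was missed.
migrated = backfill_in_batches(conn)
remaining = conn.execute(
    "SELECT COUNT(*) FROM users WHERE display_name IS NULL"
).fetchone()[0]
print(f"migrated={migrated} remaining={remaining}")
```

Only once `remaining` is zero would step 4 (adding the `NOT NULL` constraint) be safe to run.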

**Hard rules for production migrations:**
- Never run `ALTER TABLE ... DROP COLUMN` in the same deployment that stops writing to it.
- Never add a `NOT NULL` column without a `DEFAULT` in the same migration.
- Never rename a column -- add the new one, backfill, drop the old one.
- Always test migrations against a production-sized dataset. A migration that takes 2ms on
  dev can lock a 500M-row table for 30 minutes.
- Use online DDL tools (`pt-online-schema-change`, `gh-ost`, `pg_repack`) for large tables
  to avoid locking.
- Separate migration deployment from application deployment -- run migrations first, verify,
  then deploy application code.

---

*Researched: 2026-03-07 | Sources: [Kellton CI/CD Best Practices](https://www.kellton.com/kellton-tech-blog/continuous-integration-deployment-best-practices-2025), [TekRecruiter CI/CD 2026](https://www.tekrecruiter.com/post/top-10-ci-cd-pipeline-best-practices-for-engineering-leaders-in-2026), [GitLab CI/CD Best Practices](https://about.gitlab.com/blog/how-to-keep-up-with-ci-cd-best-practices/), [Naviteq IaC Comparison](https://www.naviteq.io/blog/choosing-the-right-infrastructure-as-code-tools-a-ctos-guide-to-terraform-pulumi-cdk-and-more/), [dasroot IaC 2026](https://dasroot.net/posts/2026/01/infrastructure-as-code-terraform-opentofu-pulumi-comparison-2026/), [sanj.dev IaC Decision Framework](https://sanj.dev/post/terraform-pulumi-aws-cdk-2025-decision-framework), [CNCF GitOps 2025](https://www.cncf.io/blog/2025/06/09/gitops-in-2025-from-old-school-updates-to-the-modern-way/), [Spacelift Flux vs ArgoCD](https://spacelift.io/blog/flux-vs-argo-cd), [Alpacked Anti-Patterns](https://alpacked.io/blog/devops-anti-patterns/), [IsDown Antipatterns](https://isdown.app/blog/devops-antipatterns), [Faith Forge Labs Supply Chain](https://faithforgelabs.com/blog_supplychain_security_2025.php), [SLSA Framework](https://slsa.dev/blog/2025/07/slsa-e2e), [Aikido Snyk vs Trivy](https://www.aikido.dev/blog/snyk-vs-trivy), [Trivy.dev](https://trivy.dev/), [Nirmata Kyverno vs OPA](https://nirmata.com/2025/02/07/kubernetes-policy-comparison-kyverno-vs-opa-gatekeeper/), [GitHub Reusable Workflows](https://docs.github.com/en/actions/how-tos/reuse-automations/reuse-workflows), [GHA Feb 2026 Updates](https://github.blog/changelog/2026-02-05-github-actions-early-february-2026-updates/), [Docker Build Cache](https://docs.docker.com/build/cache/optimize/), [Netdata Docker Caching](https://www.netdata.cloud/academy/docker-layer-caching/), [DZone Monorepo CI/CD](https://dzone.com/articles/ci-cd-at-scale-smarter-pipelines-for-monorepos), [Groundcover K8s 
Strategies](https://www.groundcover.com/blog/kubernetes-deployment-strategies), [Akuity Argo Rollouts](https://akuity.io/blog/automating-blue-green-and-canary-deployments-with-argo-rollouts), [Bytebase Flyway vs Liquibase](https://www.bytebase.com/blog/flyway-vs-liquibase/), [LaunchDarkly Feature Flags](https://launchdarkly.com/blog/what-are-feature-flags/), [Grafana OTel](https://grafana.com/blog/2023/12/18/opentelemetry-best-practices-a-users-guide-to-getting-started-with-opentelemetry/), [PagerDuty Runbook Automation](https://www.pagerduty.com/platform/automation/runbook/), [Steadybit Chaos Tools](https://steadybit.com/blog/top-chaos-engineering-tools-worth-knowing-about-2025-guide/)*