@synergyerp/backend-standards 1.0.0 → 1.1.0

@@ -0,0 +1,1888 @@
1
+ # CI/CD Standardization & DevOps Guide
2
+
3
+ > **Purpose**: Company-wide standards for CI/CD pipelines, containerization, deployment, infrastructure, and DevOps practices.
4
+ > **Target Audience**: DevOps engineers, Tech Leads, QA engineers.
5
+ > **Date**: June 2026
6
+ > **Status**: v1.0 -- Standard
7
+
8
+ ---
9
+
10
+ ## Table of Contents
11
+
12
+ 1. [Core Principles](#1-core-principles)
13
+ 2. [Git Branching Strategy](#2-git-branching-strategy)
14
+ 3. [Pipeline Architecture Overview](#3-pipeline-architecture-overview)
15
+ 4. [CI Pipeline Standards](#4-ci-pipeline-standards)
16
+ 5. [CD Pipeline Standards](#5-cd-pipeline-standards)
17
+ 6. [Containerization Standards](#6-containerization-standards)
18
+ 7. [Environment Strategy](#7-environment-strategy)
19
+ 8. [Infrastructure as Code (IaC)](#8-infrastructure-as-code-iac)
20
+ 9. [Secret Management](#9-secret-management)
21
+ 10. [Database Migration Strategy](#10-database-migration-strategy)
22
+ 11. [Monitoring & Alerting](#11-monitoring--alerting)
23
+ 12. [Security Scanning](#12-security-scanning)
24
+ 13. [Rollback Strategy](#13-rollback-strategy)
25
+ 14. [Incident Response](#14-incident-response)
26
+ 15. [Tool-Specific Pipeline Templates](#15-tool-specific-pipeline-templates)
27
+ 16. [Appendix: Quick References](#16-appendix-quick-references)
28
+ 17. [Implementation Validation & Enforcement Mechanisms](#17-implementation-validation--enforcement-mechanisms)
29
+ 18. [L1: Editor-Level Validation (VS Code)](#18-l1-editor-level-validation-vs-code)
30
+ 19. [L2: Git Hook Validation (Husky + lint-staged)](#19-l2-git-hook-validation-husky--lint-staged)
31
+ 20. [L3: CI Pipeline Validation (GitHub Actions)](#20-l3-ci-pipeline-validation-github-actions)
32
+ 21. [Validation Rules Summary](#21-validation-rules-summary)
33
+
34
+ ---
35
+
36
+ ## 1. Core Principles
37
+
38
+ ### 1.1 The Ten Commandments of CI/CD
39
+
40
+ 1. **Fail Fast, Fail Loud**: Pipelines must fail immediately on any error. Never set `continue-on-error: true` (GitHub Actions) or `continueOnError: true` (Azure Pipelines) on lint, test, or build steps.
41
+ 2. **Reproducible Builds**: Lockfiles are sacred. **pnpm-lock.yaml** is the preferred standard and must always be committed (npm/yarn lockfiles are acceptable for legacy projects).
42
+ 3. **Immutable Artifacts**: Every build produces a versioned artifact (Docker image) that is never modified after creation.
43
+ 4. **Shift Left**: Security scanning, linting, and testing happen as early as possible in the pipeline.
44
+ 5. **No Secrets In Code**: All secrets come from a secrets manager (Azure Key Vault, AWS Secrets Manager, GitHub Secrets, etc.).
45
+ 6. **Manual Gate To Production**: Automated deployment stops at staging. Production requires human approval.
46
+ 7. **Zero-Downtime Deployments**: All production deployments must use rolling updates or blue/green strategy.
47
+ 8. **Everything As Code**: Pipeline definitions, infrastructure, and configuration are all version-controlled.
48
+ 9. **Observability**: Every deployment emits metrics, logs, and traces. Alerts are configured before the first production deploy.
49
+ 10. **Documented Rollback**: Every service must have a documented and tested rollback procedure.
50
+
51
+ ### 1.2 Separation From Frontend Standards
52
+
53
+ This document **only** covers CI/CD, DevOps, and deployment concerns. For frontend code quality, testing, linting, and component standards, refer to the Frontend Standards document.
54
+
55
+ ---
56
+
57
+ ## 2. Git Branching Strategy
58
+
59
+ ### 2.1 Branch Model
60
+
61
+ ```
62
+ main       --- Production-ready code (protected)
63
+ develop    --- Integration branch (protected)
64
+ feature/*  --- New features (branch from develop)
65
+ bugfix/*   --- Bug fixes (branch from develop)
66
+ hotfix/*   --- Critical fixes (branch from main)
67
+ release/*  --- Release candidates (branch from develop)
68
+ ```
69
+
70
+ ### 2.2 Branch Protection Rules
71
+
72
+ | Branch | Required Reviews | Status Checks | Direct Push |
73
+ |--------|-----------------|---------------|-------------|
74
+ | main | 2 approvals | CI must pass, coverage >=80% lines, >=75% branches | Blocked |
75
+ | develop | 1 approval | CI must pass | Blocked |
76
+ | release/* | 1 approval | CI must pass | Allowed for admins |
77
+
78
+ ### 2.3 GitHub Branch Protection Setup
79
+
80
+ To configure branch protection in GitHub:
81
+
82
+ 1. Go to **Settings > Branches > Add branch protection rule**
83
+ 2. Apply to `main`, `develop`, `release/*`
84
+ 3. Enable:
85
+    - [x] Require pull request reviews (2 for main, 1 for develop)
86
+    - [x] Dismiss stale pull request approvals when new commits are pushed
87
+    - [x] Require status checks before merging (CI pipeline)
88
+    - [x] Require branches to be up to date
89
+    - [x] Require conversation resolution
90
+    - [x] Do not allow bypassing the above settings
91
+
92
+ > For Azure DevOps, use **Branch Policies** under Repos > Branches to achieve the same protections.
93
+
94
+ ### 2.4 Trigger Mapping
95
+
96
+ | Event | Branch | Pipeline |
97
+ |-------|--------|----------|
98
+ | PR opened/synced | feature/* -> develop | CI (lint, test, build) |
99
+ | PR merged to develop | develop | CI + CD (deploy to Dev) |
100
+ | PR opened/synced | develop -> main | CI (lint, test, build) |
101
+ | PR merged to main | main | CI + CD (deploy to Staging) |
102
+ | Tag pushed | v* | CI + CD (deploy to Production) |
103
+ | Push to hotfix/* | hotfix/* -> main | CI + CD (hotfix deploy) |
104
+
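The trigger table above can be read as a small decision function. The following is a minimal sketch, not part of any pipeline tooling: `planPipeline`, `PipelineEvent`, and `DeployTarget` are hypothetical names, and it assumes a merged PR arrives as a `push` event on the target branch.

```typescript
type PipelineEvent = "pull_request" | "push" | "tag";
type DeployTarget = "dev" | "staging" | "production" | "hotfix" | null;

interface PipelinePlan {
  runCi: boolean;
  deployTarget: DeployTarget; // null means CI only, no deployment
}

// Maps a Git event and ref name to the pipeline stages from the trigger table.
function planPipeline(event: PipelineEvent, ref: string): PipelinePlan {
  // Tags matching v* deploy to Production (still subject to manual approval).
  if (event === "tag" && /^v/.test(ref)) {
    return { runCi: true, deployTarget: "production" };
  }
  if (event === "push") {
    if (ref === "develop") return { runCi: true, deployTarget: "dev" };
    if (ref === "main") return { runCi: true, deployTarget: "staging" };
    if (/^hotfix\//.test(ref)) return { runCi: true, deployTarget: "hotfix" };
  }
  // Pull requests and any other push run CI (lint, test, build) only.
  return { runCi: true, deployTarget: null };
}
```

For example, `planPipeline("push", "develop")` selects the Dev deployment, while a pull request from `feature/*` into `develop` runs CI without deploying.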
105
+ ---
106
+
107
+ ## 3. Pipeline Architecture Overview
108
+
109
+ ### 3.1 High-Level Flow
110
+
111
+ ```
112
+                    DEVELOPER PUSHES CODE
113
+                              |
114
+                              v
115
+ +---------------------------------------------------------+
116
+ |                       CI PIPELINE                       |
117
+ |                                                         |
118
+ |       Lint -> Type Check -> Test -> Scan -> Build       |
119
+ +---------------------------------------------------------+
120
+                              |
121
+                 (if merged to develop/main)
122
+                              |
123
+                              v
124
+ +---------------------------------------------------------+
125
+ |                       CD PIPELINE                       |
126
+ |                                                         |
127
+ | Build Docker -> Push Registry -> Deploy -> Smoke Tests  |
128
+ +---------------------------------------------------------+
129
+ ```
130
+
131
+ ### 3.2 Pipeline Separation
132
+
133
+ | Pipeline | Trigger | Duration Target | Includes |
134
+ |----------|---------|-----------------|----------|
135
+ | CI | Every PR + push | < 10 minutes | Lint, types, tests, security, build |
136
+ | CD (Dev) | Merge to develop | < 5 minutes | Docker build, deploy Dev, smoke tests |
137
+ | CD (Staging) | Merge to main | < 10 minutes | Docker build, deploy Staging, E2E tests |
138
+ | CD (Production) | Tag v* + approval | < 15 minutes | Docker build, deploy Prod, health checks |
139
+
140
+ ---
141
+
142
+ ## 4. CI Pipeline Standards
143
+
144
+ ### 4.1 Mandatory Steps (In Order)
145
+
146
+ ```
147
+ 1. Checkout source code
148
+ 2. Setup Node.js (version from .nvmrc or engines in package.json)
149
+ 3. Cache dependencies (keyed on lockfile hash - pnpm-lock.yaml preferred)
150
+ 4. Install dependencies (--frozen-lockfile). Note: While developers may use any manager locally, CI/CD is optimized for **pnpm**.
151
+ 5. Lint (must pass, zero warnings threshold)
152
+ 6. TypeScript type check (must pass)
153
+ 7. Unit/Integration tests with coverage (>=80% lines, 75% branches, 80% functions, 80% statements)
154
+ 8. Security audit (npm audit / pnpm audit / Snyk)
155
+ 9. Build application
156
+ 10. Upload test results + coverage reports
157
+ ```
158
+
159
+ ### 4.2 Caching Strategy
160
+
161
+ ```yaml
162
+ - name: Cache pnpm dependencies
163
+   uses: actions/cache@v4
164
+   with:
165
+     key: pnpm-${{ runner.os }}-${{ hashFiles('pnpm-lock.yaml') }}
166
+     path: |
167
+       ~/.pnpm-store
168
+       node_modules
169
+     restore-keys: |
170
+       pnpm-${{ runner.os }}-
171
+ ```
172
+
173
+ ### 4.3 Non-Negotiables
174
+
175
+ | Rule | Enforcement |
176
+ |------|-------------|
177
+ | Pipeline fails on lint errors | Exit code 1 |
178
+ | Pipeline fails on type errors | Exit code 1 |
179
+ | Pipeline fails on test failures | Exit code 1 |
180
+ | Pipeline fails if coverage < 80% lines, 75% branches, 80% functions, 80% statements | Jest/Vitest coverage threshold |
181
+ | Pipeline fails on high/critical audit findings | npm audit --audit-level=high |
182
+ | Lockfile must be frozen | --frozen-lockfile flag |
183
+ | No console.log / debugger | ESLint rules `no-console: error` (only `console.warn` and `console.error` allowed) and `no-debugger: error` |
184
+ | **Pipeline fails on modularization violations** | **pnpm check-modularization exit code 1** |
185
+ | **Pipeline fails on barrel file / cross-domain violations** | **ESLint boundaries + import/no-internal-modules** |
186
+ | **Commit messages must follow conventional commits** | **commitlint in CI (anti-bypass for --no-verify)** |
187
+
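To illustrate what the conventional-commits rule checks, here is a minimal sketch of a header validator. It is an approximation for explanation only, not commitlint's implementation; `isConventionalCommit` is a hypothetical helper, and the type list is assumed to match the common conventional-commits preset.

```typescript
// Header shape: type(optional-scope)!: subject
// The allowed types below are an assumption based on the common preset.
const CONVENTIONAL_HEADER =
  /^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([a-z0-9-]+\))?!?: \S.*$/;

// Returns true when the first line of the commit message is a valid header.
function isConventionalCommit(message: string): boolean {
  const header = message.split("\n")[0];
  return CONVENTIONAL_HEADER.test(header);
}
```

Running the same check in CI as in the local hook is what closes the `--no-verify` bypass: a commit that skipped the hook still fails the pipeline.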
188
+ ### 4.4 CI Pipeline Concurrency & Timeouts
189
+
190
+ All CI workflows must include concurrency groups to cancel redundant runs and hard timeouts to prevent hung jobs:
191
+
192
+ ```yaml
193
+ concurrency:
194
+   group: ${{ github.workflow }}-${{ github.ref }}
195
+   cancel-in-progress: true
196
+
197
+ jobs:
198
+   quality:
199
+     timeout-minutes: 15
200
+     # ...
201
+   docker:
202
+     timeout-minutes: 20
203
+ ```
204
+
205
+ ---
206
+
207
+ ## 5. CD Pipeline Standards
208
+
209
+ ### 5.1 Deployment Stages
210
+
211
+ ```
212
+ DEVELOP BRANCH: Build & Push -> Deploy Dev -> Smoke Tests
213
+ MAIN BRANCH: Build & Push -> Deploy Staging -> E2E Tests -> [Manual Approval] -> Deploy Prod -> Health Checks
214
+ ```
215
+
216
+ ### 5.2 Image Tagging Strategy
217
+
218
+ ```yaml
219
+ # GitHub Actions
220
+ tags: |
221
+   ghcr.io/${{ github.repository }}:${{ github.run_id }}
222
+   ghcr.io/${{ github.repository }}:${{ github.sha }}
223
+   ghcr.io/${{ github.repository }}:latest
224
+
225
+ # Azure Pipelines
226
+ tags: |
227
+   $(dockerRegistry)/$(imageRepository):$(Build.BuildId)
228
+   $(dockerRegistry)/$(imageRepository):$(Build.SourceVersion)
229
+   $(dockerRegistry)/$(imageRepository):latest
230
+ ```
231
+
232
+ ### 5.3 Deployment Strategy by Environment
233
+
234
+ | Environment | Strategy | Max Downtime | Notes |
235
+ |-------------|----------|--------------|-------|
236
+ | Dev | Recreate | 1-2 minutes | Fast iteration, accept downtime |
237
+ | Staging | Rolling update | Zero | Match production behavior |
238
+ | Production | Rolling update or Blue/Green | Zero | Canary if high risk |
239
+
240
+ ### 5.4 Health Check Endpoint Requirement
241
+
242
+ Every service must expose a /health endpoint:
243
+ ```json
244
+ {
245
+   "status": "ok",
246
+   "timestamp": "2026-06-17T12:00:00Z",
247
+   "version": "1.2.3",
248
+   "checks": {
249
+     "database": { "status": "ok", "latency_ms": 5 },
250
+     "redis": { "status": "ok", "latency_ms": 2 }
251
+   }
252
+ }
253
+ ```
254
+
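A minimal sketch of how a service might assemble this payload is shown below. The `degraded` and `down` statuses, the 503 mapping, and the helper names `buildHealthReport` and `httpStatusFor` are assumptions for illustration; the standard above only specifies the response shape for the healthy case.

```typescript
type CheckStatus = "ok" | "degraded" | "down";

interface DependencyCheck {
  status: CheckStatus;
  latency_ms: number;
}

interface HealthReport {
  status: CheckStatus;
  timestamp: string;
  version: string;
  checks: Record<string, DependencyCheck>;
}

// Aggregates per-dependency results: the worst individual status wins.
function buildHealthReport(
  version: string,
  checks: Record<string, DependencyCheck>,
  now: Date = new Date()
): HealthReport {
  const statuses = Object.keys(checks).map((name) => checks[name].status);
  let status: CheckStatus = "ok";
  if (statuses.some((s) => s === "down")) {
    status = "down";
  } else if (statuses.some((s) => s === "degraded")) {
    status = "degraded";
  }
  return { status, timestamp: now.toISOString(), version, checks };
}

// Assumed mapping: only a hard failure makes the endpoint return 503,
// so probes and smoke tests can rely on the HTTP status alone.
function httpStatusFor(report: HealthReport): number {
  return report.status === "down" ? 503 : 200;
}
```

A route handler would run the dependency checks, pass the results to `buildHealthReport`, and respond with the code from `httpStatusFor` and the report as the JSON body.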
255
+ ### 5.6 Feature Flag Deployment Integration
256
+
257
+ Feature flags enable dark launches, A/B testing, canary releases, and instant kill-switches. All new features that touch critical paths must be deployed behind a feature flag.
258
+
259
+ | Rule | Requirement |
260
+ |------|-------------|
261
+ | Flag provider | LaunchDarkly, Unleash, or Flagsmith (OSS) |
262
+ | Flag naming | `kebab-case` domain prefix: `employee-new-dashboard`, `billing-v2-checkout` |
263
+ | Flag lifecycle | Remove flag code within 2 sprints of 100% rollout |
264
+ | CI integration | Flag creation as part of feature PR (provider API) |
265
+ | Kill switch | Critical feature flags serve as instant rollback mechanism |
266
+ | No nested flags | Flags must not depend on other flags |
267
+
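The naming rule can be checked mechanically, for instance in a lint step on the feature PR. This is a minimal sketch under stated assumptions: `isValidFlagName` is a hypothetical helper, and the list of known domains is supplied by the caller rather than defined by this standard.

```typescript
// kebab-case with at least two segments: <domain>-<feature...>
const FLAG_NAME = /^[a-z][a-z0-9]*(-[a-z0-9]+)+$/;

// Valid when the name is kebab-case and its first segment is a known domain.
function isValidFlagName(name: string, knownDomains: string[]): boolean {
  if (!FLAG_NAME.test(name)) {
    return false;
  }
  const domain = name.split("-")[0];
  return knownDomains.indexOf(domain) !== -1;
}
```

With `["employee", "billing"]` as the domain list, both examples from the table pass, while `dashboard` (no domain prefix) and `Billing-V2` (not kebab-case) are rejected.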
268
+ ---
269
+
270
+ ## 6. Containerization Standards
271
+
272
+ ### 6.1 Dockerfile Template
273
+
274
+ ```dockerfile
275
+ FROM node:20-alpine AS base
276
+ RUN corepack enable && corepack prepare pnpm@9 --activate
277
+ WORKDIR /app
278
+
279
+ FROM base AS deps
280
+ COPY package.json pnpm-lock.yaml ./
281
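+ # NOTE: a wildcard COPY flattens all matches into the destination directory, so in a
+ # monorepo each workspace needs its own COPY line (or use `pnpm fetch` with only the lockfile).
+ COPY apps/*/package.json ./apps/
281
+ COPY packages/*/package.json ./packages/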
283
+ RUN pnpm install --frozen-lockfile
284
+
285
+ FROM base AS builder
286
+ COPY --from=deps /app/node_modules ./node_modules
287
+ COPY . .
288
+ RUN pnpm build
289
+
290
+ FROM node:20-alpine AS runner
291
+ WORKDIR /app
292
+ ENV NODE_ENV=production
293
+ ENV PORT=3000
294
+ RUN addgroup --system --gid 1001 appgroup && \
295
+ adduser --system --uid 1001 appuser
296
+ COPY --from=builder --chown=appuser:appgroup /app/dist ./dist
297
+ COPY --from=builder --chown=appuser:appgroup /app/package.json ./
298
+ COPY --from=builder --chown=appuser:appgroup /app/node_modules ./node_modules
299
+ USER appuser
300
+ EXPOSE 3000
301
+ HEALTHCHECK --interval=30s --timeout=5s --start-period=40s --retries=3 \
302
+ CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
303
+ CMD ["node", "dist/server.js"]
304
+ ```
305
+
306
+ ### 6.2 Dockerfile Rules
307
+
308
+ | Rule | Reason |
309
+ |------|--------|
310
+ | Always use -alpine base images | Smaller attack surface, smaller size |
311
+ | Multi-stage builds | Separate build deps from runtime |
312
+ | Non-root user | Security best practice |
313
+ | Pin specific Node.js version | Reproducibility |
314
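+ | Include HEALTHCHECK | Container health status for Docker and Docker Compose; Kubernetes ignores it, so also define readiness/liveness probes in the Deployment |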
315
+ | Copy only dist, not source | Smaller images, no source code in prod |
316
+ | Use --chown for file ownership | Match non-root user |
317
+
318
+ ### 6.3 .dockerignore Template
319
+
320
+ ```
321
+ **/node_modules
322
+ npm-debug.log
323
+ pnpm-debug.log
324
+ dist
325
+ .next
326
+ .turbo
327
+ build
328
+ out
329
+ coverage
330
+ .env
331
+ .env.*
332
+ !.env.example
333
+ .git
334
+ .gitignore
335
+ .gitattributes
336
+ .idea
337
+ .vscode
338
+ *.swp
339
+ .github
340
+ .azure-pipelines
341
+ .husky
342
+ *.md
343
+ docs
344
+ .DS_Store
345
+ Thumbs.db
346
+ logs
347
+ *.log
348
+ Dockerfile
349
+ docker-compose*.yml
350
+ .dockerignore
351
+ ```
352
+
353
+ ---
354
+
355
+ ## 7. Environment Strategy
356
+
357
+ ### 7.1 Environment Matrix
358
+
359
+ | Environment | Purpose | Deploy Trigger | Replicas | Backup |
360
+ |-------------|---------|---------------|----------|--------|
361
+ | Dev | Developer testing, feature validation | Merge to develop | 1 | No |
362
+ | Staging | QA, UAT, integration testing | Merge to main | 2 | No |
363
+ | Production | Live user-facing | Tag + manual approval | 3+ (auto-scale) | Yes |
364
+
365
+ ### 7.2 Configuration Files
366
+
367
+ | File | Location | Git-Committed? | Contains Secrets? |
368
+ |------|----------|----------------|-------------------|
369
+ | .env.example | Project root | Yes | No (placeholders) |
370
+ | .env | Project root | No (.gitignored) | Yes |
371
+ | ConfigMap | k8s/base/configmap.yaml | Yes | No |
372
+ | Secrets | Secret Manager | No | Yes |
373
+
374
+ ---
375
+
376
+ ## 8. Infrastructure as Code (IaC)
377
+
378
+ ### 8.1 Kubernetes Manifest Structure
379
+
380
+ ```
381
+ k8s/
382
+   base/
383
+     namespace.yaml
384
+     deployment.yaml
385
+     service.yaml
386
+     ingress.yaml
387
+     configmap.yaml
388
+     hpa.yaml
389
+     pdb.yaml
390
+     kustomization.yaml
391
+   overlays/
392
+     dev/
393
+       kustomization.yaml
394
+     staging/
395
+       kustomization.yaml
396
+     prod/
397
+       kustomization.yaml
398
+       hpa-patch.yaml
399
+ ```
400
+
401
+ ### 8.2 Required Kubernetes Manifests
402
+
403
+ | Manifest | Purpose | Required? |
404
+ |----------|---------|-----------|
405
+ | Deployment | Pod deployment with rolling update | Yes |
406
+ | Service | Internal service discovery | Yes |
407
+ | Ingress | External HTTP/S routing | Yes |
408
+ | ConfigMap | Non-sensitive configuration | Yes |
409
+ | Secret | Sensitive configuration | Yes |
410
+ | HorizontalPodAutoscaler | Auto-scaling | For production |
411
+ | PodDisruptionBudget | Min available during updates | For production |
412
+ | NetworkPolicy | Pod-to-pod traffic restrictions | Recommended |
413
+ | Namespace | Environment isolation | Yes |
414
+
415
+ ### 8.3 Kustomize Base Template
416
+
417
+ ```yaml
418
+ # k8s/base/kustomization.yaml
419
+ apiVersion: kustomize.config.k8s.io/v1beta1
420
+ kind: Kustomization
421
+
422
+ resources:
423
+   - namespace.yaml
424
+   - deployment.yaml
425
+   - service.yaml
426
+   - ingress.yaml
427
+   - configmap.yaml
428
+   - hpa.yaml
429
+
430
+ commonLabels:
431
+   app: my-project
432
+   managed-by: kustomize
433
+ ```
434
+
435
+ ```yaml
436
+ # k8s/overlays/prod/kustomization.yaml
437
+ apiVersion: kustomize.config.k8s.io/v1beta1
438
+ kind: Kustomization
439
+
440
+ resources:
441
+   - ../../base
442
+
443
+ patches:
444
+   - target:
445
+       kind: Deployment
446
+       name: app
447
+     patch: |-
448
+       - op: replace
449
+         path: /spec/replicas
450
+         value: 5
451
+ ```
452
+
453
+ ### 8.4 Pod Anti-Affinity (Required for Production)
454
+
455
+ ```yaml
456
+ spec:
457
+   template:
458
+     spec:
459
+       affinity:
460
+         podAntiAffinity:
461
+           preferredDuringSchedulingIgnoredDuringExecution:
462
+             - weight: 100
463
+               podAffinityTerm:
464
+                 labelSelector:
465
+                   matchLabels:
466
+                     app: my-project
467
+                 topologyKey: kubernetes.io/hostname
468
+ ```
469
+
470
+ ### 8.5 Resource Quotas (Required for Production)
471
+
472
+ ```yaml
473
+ apiVersion: v1
474
+ kind: ResourceQuota
475
+ metadata:
476
+   name: app-quota
477
+   namespace: production
478
+ spec:
479
+   hard:
480
+     requests.cpu: "10"
481
+     requests.memory: 20Gi
482
+     limits.cpu: "20"
483
+     limits.memory: 40Gi
484
+ ```
485
+
486
+ ### 8.6 SSL/TLS Certificate Management
487
+
488
+ ```yaml
489
+ # cert-manager ClusterIssuer (Let's Encrypt)
490
+ apiVersion: cert-manager.io/v1
491
+ kind: ClusterIssuer
492
+ metadata:
493
+   name: letsencrypt-prod
494
+ spec:
495
+   acme:
496
+     server: https://acme-v02.api.letsencrypt.org/directory
497
+     email: devops@company.com
498
+     privateKeySecretRef:
499
+       name: letsencrypt-prod
500
+     solvers:
501
+       - http01:
502
+           ingress:
503
+             class: nginx
504
+ ```
505
+
506
+ | Rule | Requirement |
507
+ |------|-------------|
508
+ | Issuer | Let's Encrypt via cert-manager (production). Staging issuer for dev/staging. |
509
+ | Renewal | Auto-renew at 30 days before expiry |
510
+ | Monitoring | Alert at < 30 days to expiry |
511
+ | Minimum TLS | TLS 1.2+ only; TLS 1.0/1.1 disabled |
512
+
513
+ ---
514
+
515
+ ## 9. Secret Management
516
+
517
+ ### 9.1 GitHub Secrets Setup
518
+
519
+ GitHub Secrets is the **preferred** method for storing sensitive values in CI/CD pipelines.
520
+
521
+ **Where to configure**: Repo **Settings > Secrets and variables > Actions**
522
+
523
+ **How to use in workflows**:
524
+ ```yaml
525
+ # Reference in pipeline
526
+ - run: echo "${{ secrets.DOCKER_PASSWORD }}" | docker login --username "${{ secrets.DOCKER_USERNAME }}" --password-stdin
527
+
528
+ # Or pass as environment variable
529
+ - run: pnpm build
530
+   env:
531
+     API_KEY: ${{ secrets.API_KEY }}
532
+     DATABASE_URL: ${{ secrets.DATABASE_URL }}
533
+ ```
534
+
535
+ **Recommended GitHub Secrets per project**:
536
+
537
+ | Secret Name | Example Value | Purpose |
538
+ |-------------|--------------|---------|
539
+ | `DOCKER_REGISTRY` | `ghcr.io/myorg` | Container registry URL |
540
+ | `DOCKER_USERNAME` | `myorg-bot` | Registry login user |
541
+ | `DOCKER_PASSWORD` | `ghp_abc...` | Registry token (GitHub PAT) |
542
+ | `KUBE_CONFIG_DEV` | `apiVersion: v1...` | Dev K8s kubeconfig (base64) |
543
+ | `KUBE_CONFIG_STAGING` | `apiVersion: v1...` | Staging K8s kubeconfig |
544
+ | `KUBE_CONFIG_PROD` | `apiVersion: v1...` | Production K8s kubeconfig |
545
+ | `SLACK_WEBHOOK` | `https://hooks.slack.com/...` | Deployment notifications |
546
+ | `SENTRY_DSN` | `https://xxx@sentry.io/xxx` | Error tracking |
547
+ | `SONAR_TOKEN` | `sqp_...` | SonarQube authentication |
548
+
549
+ ### 9.2 GitHub Environments for Deployment Gates
550
+
551
+ GitHub **Environments** provide manual approval gates for production deployments.
552
+
553
+ **Setup**:
554
+ 1. Go to **Settings > Environments**
555
+ 2. Create `production` environment
556
+ 3. Add **Required reviewers** (2-3 senior devs/leads)
557
+ 4. Restrict **Deployment branches and tags** to `main` and, because production deploys are tag-triggered (section 2.4), `v*` tags
558
+
559
+ **Workflow integration**:
560
+ ```yaml
561
+ jobs:
561
+   deploy-production:
562
+     runs-on: ubuntu-latest
563
+     environment: production # <-- This triggers the approval gate
564
+     steps:
565
+       - run: ./deploy.sh
567
+ ```
568
+
569
+ > The pipeline will **pause** automatically until the required reviewers approve, then proceed.
570
+
571
+ ### 9.3 Alternative: Azure Key Vault
572
+
573
+ If the organization uses Azure DevOps, use Azure Key Vault via variable groups:
574
+
575
+ 1. Create a Key Vault in Azure
576
+ 2. Add secrets to the vault
577
+ 3. Link the vault as a Variable Group in Azure DevOps Library
578
+ 4. Reference in pipeline: `$(secret-name)`
579
+
580
+ ### 9.4 Secret Sources By Environment
581
+
582
+ | Environment | Secret Source | Implementation |
583
+ |-------------|--------------|----------------|
584
+ | Local dev | `.env` file (gitignored) | dotenv / VITE_* |
585
+ | CI Pipeline | **GitHub Secrets** (preferred) or Azure Variable Groups | Injected at runtime |
586
+ | Dev K8s | Kubernetes Secrets (from CI) | Base64 encoded |
587
+ | Staging K8s | K8s Secrets + External Secrets Operator | Synced from the cloud secrets manager (values stored in GitHub Secrets cannot be read back) |
588
+ | Production K8s | External Secrets Operator + Cloud Vault | Never stored in etcd |
589
+
590
+ ### 9.5 GitHub Container Registry (GHCR) Setup
591
+
592
+ GitHub Container Registry is the **preferred** container registry for GitHub-hosted projects.
593
+
594
+ **Enable GHCR**:
595
+ 1. Go to **Settings > Packages** and ensure "GitHub Container Registry" is enabled
596
+ 2. Create a **Personal Access Token (PAT)**:
597
+    - Go to **Settings > Developer settings > Personal access tokens > Tokens (classic)**; GitHub Packages does not accept fine-grained tokens
598
+    - Scopes required: `read:packages`, `write:packages`, `delete:packages`
599
+
600
+ **Login in CI**:
601
+ ```yaml
602
+ - name: Log in to GHCR
603
+   uses: docker/login-action@v3
604
+   with:
605
+     registry: ghcr.io
606
+     username: ${{ github.actor }}
607
+     password: ${{ secrets.GITHUB_TOKEN }} # Auto-generated, no setup needed
608
+ ```
609
+
610
+ **Image naming convention**:
611
+ ```
612
+ ghcr.io/<organization>/<project>:<tag>
613
+
614
+ Examples:
615
+ ghcr.io/mycompany/frontend:abc123
616
+ ghcr.io/mycompany/backend:latest
617
+ ghcr.io/mycompany/web:v1.2.3
618
+ ```
619
+
620
+ > The `GITHUB_TOKEN` secret is **auto-generated** by GitHub for each workflow run. No manual setup required.
621
+
622
+ ### 9.6 Never Commit These
623
+
624
+ ```
625
+ # NEVER COMMIT THESE
626
+ *.pem
627
+ *.key
628
+ *.cert
629
+ .env
630
+ .env.local
631
+ .env.production
632
+ service-account.json
633
+ credentials.json
634
+ secrets.yml
635
+ **/secret/**
636
+ ```
637
+
638
+ ---
639
+
640
+ ## 10. Database Migration Strategy
641
+
642
+ ### 10.1 Migration Pipeline Integration
643
+
644
+ ```yaml
645
+ steps:
645
+   - task: Kubernetes@1
646
+     displayName: "Run database migrations"
647
+     inputs:
648
+       command: "exec"
649
+       arguments: "deployment/app -- pnpm db:migrate"
651
+ ```
652
+
653
+ ### 10.2 Migration Rules
654
+
655
+ | Rule | Reason |
656
+ |------|--------|
657
+ | Migrations run AFTER new pods are deployed | Old pods can still serve traffic |
658
+ | Migrations must be backward compatible | Rollback must be possible |
659
+ | Never auto-run migrations in production | Require manual trigger |
660
+ | Always run migrations in Dev and Staging first | Catch issues early |
661
+ | Test files must not exist in migration directories | TypeORM CLI fails on test files |
662
+
663
+ ### 10.3 Migration Directory Convention
664
+
665
+ ```
666
+ apps/backend/src/database/
667
+   migrations/   # ONLY migration files (NO test files)
668
+   seeds/        # ONLY seed files (NO test files)
669
+   factories/    # Test factories only
670
+ ```
671
+
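The directory convention lends itself to an automated check, for example against the list of files changed in a PR. The sketch below is an assumption-laden illustration: `findMisplacedTestFiles` is a hypothetical helper, and test files are assumed to use the `.spec.*` or `.test.*` suffix.

```typescript
// Directories that must contain only migrations or seeds.
const GUARDED_DIRS = /(^|\/)database\/(migrations|seeds)\//;
// Assumed test-file naming: *.spec.ts, *.test.ts, and their js/mjs/cjs variants.
const TEST_FILE = /\.(spec|test)\.[cm]?[jt]s$/;

// Returns every path that is a test file inside a guarded directory.
function findMisplacedTestFiles(paths: string[]): string[] {
  return paths.filter((p) => GUARDED_DIRS.test(p) && TEST_FILE.test(p));
}
```

A CI step could fail the build whenever the returned list is non-empty, which catches the problem before the TypeORM CLI trips over it.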
672
+ ### 10.4 Database Backup Strategy
673
+
674
+ | Rule | Requirement |
675
+ |------|-------------|
676
+ | Backup tool | Velero (K8s) or pg_dump (bare metal) |
677
+ | Frequency | Production: daily full + continuous WAL archiving. Staging: daily. |
678
+ | Retention | 30 daily, 12 monthly, 7 yearly |
679
+ | Encryption | At rest (AES-256) + in transit (TLS) |
680
+ | Restore testing | Quarterly restore drill required; results documented |
681
+ | Offsite | Backups replicated to secondary region within 24 hours |
682
+
683
+ ---
684
+
685
+ ## 11. Monitoring & Alerting
686
+
687
+ ### 11.1 Required Metrics & Alerts
688
+
689
+ | Metric | Source | Alert Threshold |
690
+ |--------|--------|----------------|
691
+ | CPU Usage | K8s Metrics Server | >80% for 5 minutes |
692
+ | Memory Usage | K8s Metrics Server | >85% for 5 minutes |
693
+ | HTTP 5xx Rate | Application metrics | >1% for 5 minutes |
694
+ | HTTP 4xx Rate | Application metrics | >5% for 5 minutes |
695
+ | Request Latency (p95) | Application metrics | >2000ms for 5 minutes |
696
+ | Disk Usage | Node exporter | >85% |
697
+ | Pod Restarts | K8s | >3 in 10 minutes |
698
+ | Certificate Expiry | Cert manager | <30 days |
699
+
700
+ ### 11.2 Alert Notification Channels
701
+
702
+ | Channel | Severity | Use Case |
703
+ |---------|----------|----------|
704
+ | Slack | All | General alerts |
705
+ | PagerDuty/OpsGenie | Critical | Production outages |
706
+ | Email | Warning | Non-urgent notifications |
707
+ | SMS | Critical | Escalation (if the PagerDuty alert is not acknowledged) |
708
+
709
+ ---
710
+
711
+ ## 12. Security Scanning
712
+
713
+ ### 12.1 Mandatory Scans
714
+
715
+ | Scan | Tool | When | Fail On |
716
+ |------|------|------|---------|
717
+ | Dependency audit | npm audit / pnpm audit | Every CI run | High/Critical |
718
+ | SAST (Static Analysis) | ESLint + SonarQube | Every CI run | Blocker issues |
719
+ | Container scan | Trivy / Snyk | Every Docker build | High/Critical CVEs |
720
+ | Secret detection | GitLeaks / TruffleHog | Every PR | Any secret found |
721
+ | License compliance | FOSSA / License Checker | Weekly | Non-approved licenses |
722
+
723
+ ### 12.2 CI Security Steps
724
+
725
+ ```yaml
726
+ - run: pnpm audit --audit-level=high
727
+ - uses: sonarsource/sonarcloud-github-action@master
728
+ - run: trivy image --exit-code 1 --severity CRITICAL,HIGH myapp:latest
729
+ - uses: gitleaks/gitleaks-action@v2
730
+ ```
731
+
732
+ ### 12.3 Vulnerability Handling SLA
733
+
734
+ | Severity | Response Time | Fix Deadline |
735
+ |----------|--------------|--------------|
736
+ | Critical | < 1 hour | < 24 hours |
737
+ | High | < 4 hours | < 7 days |
738
+ | Medium | < 1 business day | < 30 days |
739
+ | Low | < 1 week | Next release |
740
+
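The fix deadlines in the SLA table translate directly into date arithmetic. The following is a minimal sketch; `fixDeadline` and `Severity` are hypothetical names, and "Next release" is modelled as `null` because it has no fixed date.

```typescript
type Severity = "critical" | "high" | "medium" | "low";

const HOUR_MS = 60 * 60 * 1000;
const DAY_MS = 24 * HOUR_MS;

// Fix windows from the SLA table; null means "next release" (no fixed date).
const FIX_WINDOW_MS: Record<Severity, number | null> = {
  critical: 24 * HOUR_MS,
  high: 7 * DAY_MS,
  medium: 30 * DAY_MS,
  low: null,
};

// Returns the latest acceptable fix time for a finding detected at `detectedAt`.
function fixDeadline(severity: Severity, detectedAt: Date): Date | null {
  const windowMs = FIX_WINDOW_MS[severity];
  return windowMs === null ? null : new Date(detectedAt.getTime() + windowMs);
}
```

A scanner integration could attach the computed deadline to the ticket it opens, so overdue findings are visible without manual tracking.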
741
+ ---
742
+
743
+ ## 13. Rollback Strategy
744
+
745
+ ### 13.1 Rollback Triggers
746
+
747
+ - Error rate spike > 5% after deployment
748
+ - p95 latency increase > 100% after deployment
749
+ - Critical security vulnerability discovered
750
+ - Database migration failure
751
+ - Manual decision by on-call engineer
752
+
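The two metric-based triggers can be evaluated automatically after each deployment. This is a minimal sketch under stated assumptions: `rollbackReasons` and `DeploymentMetrics` are hypothetical names, the error rate is a fraction between 0 and 1, and "error rate spike > 5%" is read as the post-deploy error rate exceeding 5%.

```typescript
interface DeploymentMetrics {
  errorRate: number;    // fraction of failed requests, 0..1
  p95LatencyMs: number; // 95th percentile latency in milliseconds
}

// Returns the list of triggered rollback conditions; empty means keep the release.
function rollbackReasons(
  baseline: DeploymentMetrics,
  current: DeploymentMetrics
): string[] {
  const reasons: string[] = [];
  if (current.errorRate > 0.05) {
    reasons.push("error rate above 5%");
  }
  // A 100% increase means the latency has more than doubled.
  if (baseline.p95LatencyMs > 0) {
    const increase =
      (current.p95LatencyMs - baseline.p95LatencyMs) / baseline.p95LatencyMs;
    if (increase > 1.0) {
      reasons.push("p95 latency increased by more than 100%");
    }
  }
  return reasons;
}
```

The remaining triggers (security findings, migration failures, on-call judgment) are not derivable from these two metrics and still need their own signals.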
753
+ ### 13.2 Rollback Procedures
754
+
755
+ | Deployment Type | Rollback Method | Time To Recover |
756
+ |----------------|----------------|-----------------|
757
+ | Kubernetes | kubectl rollout undo deployment/app | < 2 minutes |
758
+ | Docker Compose | docker compose down && docker compose -f docker-compose.prev.yml up | < 5 minutes |
759
+ | Blue/Green | Switch traffic back to previous slot | < 1 minute |
760
+ | DB migration | Run migration:revert (must be tested) | < 5 minutes |
761
+
762
+ ### 13.3 Kubernetes Rollback Commands
763
+
764
+ ```bash
765
+ # Check rollout history
766
+ kubectl rollout history deployment/app
767
+
768
+ # Rollback to previous version
769
+ kubectl rollout undo deployment/app
770
+
771
+ # Rollback to specific revision
772
+ kubectl rollout undo deployment/app --to-revision=3
773
+
774
+ # Verify rollback
775
+ kubectl rollout status deployment/app
776
+
777
+ # Check pod health
778
+ kubectl get pods -l app=my-project
779
+ ```
780
+
781
+ ### 13.4 Database Rollback
782
+
783
+ ```bash
784
+ pnpm db:revert # Reverts the last migration
785
+ pnpm db:status # Verify current state
786
+ ```
787
+
788
+ ---
789
+
790
+ ## 14. Incident Response
791
+
792
+ ### 14.1 Incident Severity Levels
793
+
794
+ | Level | Description | Response | Example |
795
+ |-------|-------------|----------|---------|
796
+ | SEV1 | Complete service outage | Immediate, all hands | App is down |
797
+ | SEV2 | Major feature degradation | < 15 minutes | Login broken |
798
+ | SEV3 | Minor feature issue | < 1 hour | UI cosmetic bug |
799
+ | SEV4 | Non-urgent issue | Next business day | Missing documentation |
800
+
801
+ ### 14.2 Incident Response Flow
802
+
803
+ ```
804
+ INCIDENT DETECTED (Alert or User Report)
805
+ |
806
+ v
807
+ Acknowledge (PagerDuty/Slack within 5 minutes)
808
+ |
809
+ v
810
+ Triage (Determine severity, assign owner)
811
+ |
812
+ v
813
+ Mitigate (Rollback / feature flag / hotfix)
814
+ |
815
+ v
816
+ Communicate (Update status page, stakeholders)
817
+ |
818
+ v
819
+ Post-Mortem / RCA (Within 48 hours, documented in ADR)
820
+ ```
821
+
822
+ ### 14.3 Post-Mortem Template
823
+
824
+ ```markdown
825
+ ## Incident Post-Mortem
826
+
827
+ ### Summary
828
+ [What happened, when, impact]
829
+
830
+ ### Timeline
831
+ - [Time] Issue detected
832
+ - [Time] Investigation started
833
+ - [Time] Root cause identified
834
+ - [Time] Mitigation applied
835
+ - [Time] Service recovered
836
+
837
+ ### Root Cause
838
+ [Technical explanation]
839
+
840
+ ### Impact
841
+ - Users affected: [number]
842
+ - Downtime: [duration]
843
+ - Revenue impact: [amount]
844
+
845
+ ### Action Items
846
+ - [ ] Fix [permanent fix]
847
+ - [ ] Add monitoring for [gap]
848
+ - [ ] Update runbook for [procedure]
849
+ - [ ] Schedule post-mortem review
850
+
851
+ ### Prevention
852
+ [How to prevent recurrence]
853
+ ```
854
+
855
+ ### 14.4 Service-Level Objectives (SLO)
856
+
857
+ | Severity | Acknowledgement | Mitigation | Resolution | Uptime Target |
858
+ |----------|----------------|------------|------------|---------------|
859
+ | SEV1 | < 5 min | < 15 min | < 1 hour | 99.9% monthly |
860
+ | SEV2 | < 15 min | < 1 hour | < 4 hours | -- |
861
+ | SEV3 | < 1 hour | < 4 hours | < 24 hours | -- |
862
+ | SEV4 | < 4 hours | -- | Next business day | -- |
863
+
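The 99.9% monthly uptime target implies a concrete downtime budget. The sketch below works it out; `allowedDowntimeMinutes` is a hypothetical helper and a 30-day month is assumed.

```typescript
// Downtime budget in minutes for a given uptime target over a period of days.
function allowedDowntimeMinutes(
  uptimeTargetPercent: number,
  periodDays: number
): number {
  const totalMinutes = periodDays * 24 * 60;
  return (totalMinutes * (100 - uptimeTargetPercent)) / 100;
}

// 30 days = 43,200 minutes; 0.1% of that is roughly 43.2 minutes.
const monthlyBudget = allowedDowntimeMinutes(99.9, 30);
```

One consequence worth noting: a single SEV1 that takes the full 1 hour allowed for resolution already exceeds the roughly 43-minute monthly budget, so meeting the uptime target depends on mitigating well inside the resolution limit.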
864
+ ### 14.5 Disaster Recovery Standards
865
+
866
+ | Metric | Target |
867
+ |--------|--------|
868
+ | RTO (Recovery Time Objective) | < 4 hours |
869
+ | RPO (Recovery Point Objective) | < 1 hour |
870
+ | Multi-AZ | Production must span at least 2 availability zones |
871
+ | Failover | Documented runbook tested quarterly |
872
+ | Cross-region replication | Database replication to secondary region (async, < 5 min lag) |
873
+
874
+ ---
875
+
876
+ ## 15. Tool-Specific Pipeline Templates
877
+
878
+ ### 15.1 Platform Decision: GitHub Actions (Recommended) vs Azure Pipelines
879
+
880
+ | Factor | GitHub Actions (Recommended) | Azure Pipelines |
881
+ |--------|------------------------------|-----------------|
882
+ | **Repo hosting** | GitHub | Azure Repos / GitHub |
883
+ | **Setup complexity** | Minimal (built-in to GitHub) | Moderate (requires Azure DevOps org) |
884
+ | **Secret management** | GitHub Secrets (built-in) | Azure Key Vault + Variable Groups |
885
+ | **Container registry** | GHCR (built-in, free) | Azure Container Registry (paid) |
886
+ | **Approval gates** | GitHub Environments | Azure DevOps Deployment Gates |
887
+ | **Cost** | Free for public / 2000 min/mo private | Free for 5 users, paid tiers |
888
+ | **Kubernetes deploy** | `kubectl` or helm actions | KubernetesManifest task |
889
+ | **Community** | Largest ecosystem of actions | Smaller, mostly enterprise |
890
+
891
+ > **Recommendation**: Use **GitHub Actions** for all new projects. If the company standardizes on Azure DevOps, Azure Pipelines is a capable alternative. Both templates are provided below.
892
+
893
+ ### 15.2 GitHub Actions (Full CI) -- Primary Option
894
+
895
+ ```yaml
+ # .github/workflows/ci.yml
+ name: CI Pipeline
+
+ on:
+   push:
+     branches: [main, develop]
+   pull_request:
+     branches: [main, develop]
+
+ env:
+   NODE_VERSION: "20"
+   PNPM_VERSION: "9"
+
+ jobs:
+   quality:
+     name: Quality Checks
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+
+       # pnpm must be installed before setup-node can cache its store
+       - uses: pnpm/action-setup@v4
+         with:
+           version: ${{ env.PNPM_VERSION }}
+
+       - uses: actions/setup-node@v4
+         with:
+           node-version: ${{ env.NODE_VERSION }}
+           cache: "pnpm" # caches the pnpm store, keyed on pnpm-lock.yaml
+
+       - run: pnpm install --frozen-lockfile
+
+       - run: pnpm lint
+       - run: pnpm check-types
+       - run: pnpm test:ci
+
+       - run: pnpm audit --audit-level=high
+
+       - run: pnpm build
+
+       - uses: codecov/codecov-action@v4
+         with:
+           files: ./coverage/cobertura-coverage.xml
+
+   docker:
+     name: Build & Push Docker
+     needs: quality
+     if: github.event_name == 'push'
+     runs-on: ubuntu-latest
+     permissions:
+       contents: read
+       packages: write # lets GITHUB_TOKEN push to GHCR
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Log in to GitHub Container Registry
+         uses: docker/login-action@v3
+         with:
+           registry: ghcr.io
+           username: ${{ github.actor }}
+           password: ${{ secrets.GITHUB_TOKEN }}
+
+       # Image names must be lowercase; adjust if the repository name has uppercase characters
+       - name: Build and push Docker image
+         uses: docker/build-push-action@v5
+         with:
+           context: .
+           push: true
+           tags: |
+             ghcr.io/${{ github.repository }}:${{ github.sha }}
+             ghcr.io/${{ github.repository }}:${{ github.ref_name }}
+             ghcr.io/${{ github.repository }}:latest
+ ```
+
968
+ ### 15.3 GitHub Actions (CD) -- Primary Option
969
+
970
+ ```yaml
+ # .github/workflows/cd.yml
+ name: CD Pipeline
+
+ on:
+   push:
+     branches: [develop, main]
+     tags: ["v*"] # needed so the tag-gated production job can trigger
+
+ env:
+   GHCR_IMAGE: ghcr.io/${{ github.repository }}
+
+ # Assumes the CI workflow has already pushed the image tagged with this commit SHA.
+ jobs:
+   deploy-dev:
+     name: Deploy to Dev
+     if: github.ref == 'refs/heads/develop'
+     runs-on: ubuntu-latest
+     environment: dev
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Set up kubectl
+         uses: azure/setup-kubectl@v4
+         with:
+           version: "latest"
+
+       - name: Configure K8s context
+         run: |
+           mkdir -p $HOME/.kube
+           echo "${{ secrets.KUBE_CONFIG_DEV }}" | base64 --decode > $HOME/.kube/config
+           chmod 600 $HOME/.kube/config
+
+       - name: Deploy to K8s
+         run: |
+           kubectl set image deployment/app \
+             app=${{ env.GHCR_IMAGE }}:${{ github.sha }} \
+             --namespace=dev
+           kubectl rollout status deployment/app --namespace=dev
+
+       - name: Run database migrations
+         run: |
+           kubectl exec deployment/app --namespace=dev -- pnpm db:migrate
+
+       - name: Verify deployment
+         run: |
+           kubectl get pods --namespace=dev -l app=app
+           kubectl get ingress --namespace=dev
+
+   deploy-staging:
+     name: Deploy to Staging
+     if: github.ref == 'refs/heads/main'
+     runs-on: ubuntu-latest
+     environment: staging
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Configure K8s context
+         run: |
+           mkdir -p $HOME/.kube
+           echo "${{ secrets.KUBE_CONFIG_STAGING }}" | base64 --decode > $HOME/.kube/config
+           chmod 600 $HOME/.kube/config
+
+       - name: Deploy to K8s
+         run: |
+           kubectl set image deployment/app \
+             app=${{ env.GHCR_IMAGE }}:${{ github.sha }} \
+             --namespace=staging
+           kubectl rollout status deployment/app --namespace=staging
+
+       - name: Run migrations
+         run: |
+           kubectl exec deployment/app --namespace=staging -- pnpm db:migrate
+
+       - name: Run E2E smoke tests
+         run: |
+           kubectl exec deployment/app --namespace=staging -- pnpm test:e2e
+
+   deploy-production:
+     name: Deploy to Production
+     # No `needs: [deploy-staging]`: staging is skipped on tag pushes, which would skip this job too.
+     # The tagged commit is expected to have passed staging on `main` already.
+     if: startsWith(github.ref, 'refs/tags/v')
+     runs-on: ubuntu-latest
+     environment: production # <-- Manual approval gate via GitHub Environments
+     steps:
+       - uses: actions/checkout@v4
+
+       - name: Configure K8s context
+         run: |
+           mkdir -p $HOME/.kube
+           echo "${{ secrets.KUBE_CONFIG_PROD }}" | base64 --decode > $HOME/.kube/config
+           chmod 600 $HOME/.kube/config
+
+       - name: Deploy to K8s (rolling update, zero-downtime)
+         run: |
+           kubectl set image deployment/app \
+             app=${{ env.GHCR_IMAGE }}:${{ github.sha }} \
+             --namespace=production
+           kubectl rollout status deployment/app --namespace=production
+
+       - name: Run migrations
+         run: |
+           kubectl exec deployment/app --namespace=production -- pnpm db:migrate
+
+       - name: Health check
+         run: |
+           kubectl get pods --namespace=production -l app=app
+           kubectl describe ingress --namespace=production
+
+       - name: Notify Slack on success
+         if: success()
+         run: |
+           curl -X POST -H "Content-type: application/json" \
+             --data '{"text":"Deployment to Production successful: ${{ github.ref_name }}"}' \
+             "${{ secrets.SLACK_WEBHOOK }}"
+
+       - name: Notify Slack on failure
+         if: failure()
+         run: |
+           curl -X POST -H "Content-type: application/json" \
+             --data '{"text":":x: Deployment to Production FAILED: ${{ github.ref_name }}"}' \
+             "${{ secrets.SLACK_WEBHOOK }}"
+ ```
1089
+
1090
+ ### 15.4 Azure Pipelines (Alternative Option)
1091
+
1092
+ Use this if the organization standardizes on Azure DevOps instead of GitHub.
1093
+
1094
+ ```yaml
+ # azure-pipelines.yml
+ trigger:
+   branches:
+     include:
+       - main
+       - develop
+       - release/*
+   paths:
+     exclude:
+       - docs/*
+       - "*.md"
+
+ pr:
+   branches:
+     include:
+       - main
+       - develop
+
+ variables:
+   - group: common-vars # assumed to define PNPM_STORE_PATH
+   - name: nodeVersion
+     value: "20.x"
+   - name: pnpmVersion
+     value: "9"
+   - name: dockerRegistry
+     value: "your-acr.azurecr.io" # Change to your ACR
+   - name: imageRepository
+     value: "myapp"
+
+ stages:
+   - stage: Build
+     displayName: "Build and Test"
+     jobs:
+       - job: BuildAndTest
+         steps:
+           - task: NodeTool@0
+             inputs:
+               versionSpec: $(nodeVersion)
+           - script: corepack enable && corepack prepare pnpm@$(pnpmVersion) --activate
+           - task: Cache@2
+             inputs:
+               key: 'pnpm | "$(Agent.OS)" | pnpm-lock.yaml'
+               path: $(PNPM_STORE_PATH)
+           - script: pnpm install --frozen-lockfile
+           - script: pnpm lint
+           - script: pnpm check-types
+           - script: pnpm test:ci
+           - task: PublishTestResults@2
+             inputs:
+               testResultsFiles: "**/junit.xml"
+           - task: PublishCodeCoverageResults@1
+             inputs:
+               codeCoverageTool: "Cobertura"
+               summaryFileLocation: "**/cobertura-coverage.xml"
+           - script: pnpm build
+           - task: PublishBuildArtifacts@1
+             inputs:
+               pathToPublish: "$(Build.ArtifactStagingDirectory)"
+               artifactName: "build"
+           # This template does not build or push the container image. The deploy stages
+           # assume $(dockerRegistry)/$(imageRepository):$(Build.BuildId) already exists.
+
+   - stage: DeployDev
+     displayName: "Deploy to Dev"
+     dependsOn: Build
+     # succeeded() keeps a failed Build from deploying; a custom condition replaces the default
+     condition: and(succeeded(), eq(variables['Build.SourceBranchName'], 'develop'))
+     jobs:
+       - deployment: DeployDev
+         environment: "dev"
+         strategy:
+           runOnce:
+             deploy:
+               steps:
+                 - task: KubernetesManifest@0
+                   inputs:
+                     action: "deploy"
+                     kubernetesServiceConnection: "k8s-dev"
+                     namespace: "app-dev"
+                     manifests: "k8s/overlays/dev/*.yaml"
+                     containers: "$(dockerRegistry)/$(imageRepository):$(Build.BuildId)"
+                 - script: |
+                     kubectl exec deployment/app -n app-dev -- pnpm db:migrate
+
+   - stage: DeployProduction
+     displayName: "Deploy to Production"
+     # Depends on Build, not DeployDev: DeployDev is skipped on main, which would skip this stage too
+     dependsOn: Build
+     condition: and(succeeded(), eq(variables['Build.SourceBranchName'], 'main'))
+     jobs:
+       - deployment: DeployProduction
+         environment: "production"
+         strategy:
+           runOnce:
+             deploy:
+               steps:
+                 - task: KubernetesManifest@0
+                   inputs:
+                     action: "deploy"
+                     kubernetesServiceConnection: "k8s-prod"
+                     namespace: "app-prod"
+                     manifests: "k8s/overlays/prod/*.yaml"
+                     containers: "$(dockerRegistry)/$(imageRepository):$(Build.BuildId)"
+                 - script: |
+                     kubectl exec deployment/app -n app-prod -- pnpm db:migrate
+ ```
1197
+
1198
+ ### 15.5 GitLab CI (Quick Reference)
1199
+
1200
+ ```yaml
+ # .gitlab-ci.yml
+ stages:
+   - lint
+   - test
+   - build
+   - docker
+   - deploy
+
+ image: node:20-alpine
+
+ cache:
+   key: ${CI_COMMIT_REF_SLUG}
+   paths:
+     - node_modules/
+     - .pnpm-store/ # GitLab only caches paths inside the project directory
+
+ before_script:
+   - corepack enable && corepack prepare pnpm@9 --activate
+   - pnpm config set store-dir .pnpm-store
+   - pnpm install --frozen-lockfile
+
+ lint:
+   stage: lint
+   script:
+     - pnpm lint
+     - pnpm check-types
+
+ test:
+   stage: test
+   script:
+     - pnpm test:ci
+   artifacts:
+     reports:
+       coverage_report:
+         coverage_format: cobertura
+         path: coverage/cobertura-coverage.xml
+
+ build:
+   stage: build
+   script:
+     - pnpm build
+   artifacts:
+     paths:
+       - dist/
+
+ docker:
+   stage: docker
+   image: docker:24.0.5 # the Node image has no Docker CLI
+   services:
+     - docker:24.0.5-dind
+   before_script: # overrides the global pnpm setup, which this job does not need
+     - echo "$CI_REGISTRY_PASSWORD" | docker login -u "$CI_REGISTRY_USER" --password-stdin "$CI_REGISTRY"
+   script:
+     - docker build -t $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA .
+     - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
+   only:
+     - main
+     - develop
+
+ deploy:
+   stage: deploy
+   before_script: [] # skip the global pnpm setup
+   script:
+     # Assumes the job image provides kubectl and the runner has cluster credentials
+     - kubectl set image deployment/app app=$CI_REGISTRY_IMAGE:$CI_COMMIT_SHA
+     - kubectl rollout status deployment/app
+   environment:
+     name: $CI_COMMIT_REF_SLUG # CI_ENVIRONMENT_NAME is derived from this value, so it cannot be used here
+   only:
+     - main
+     - develop
+ ```
1265
+
1266
+ ---
1267
+
1268
+ ## 16. Appendix: Quick References
1269
+
1270
+ ### 16.1 Common Commands
1271
+
1272
+ ```bash
1273
+ # Docker
1274
+ docker build -t app:latest .
1275
+ docker compose up -d
1276
+ docker compose down
1277
+ docker system prune -f
1278
+
1279
+ # Kubernetes
1280
+ kubectl get pods -n app
1281
+ kubectl logs -f deployment/app -n app
1282
+ kubectl rollout status deployment/app -n app
1283
+ kubectl rollout undo deployment/app -n app
1284
+ kubectl set image deployment/app app=app:newtag -n app
+ kubectl exec deployment/app -n app -- pnpm db:migrate
1286
+ kubectl get hpa -n app
1287
+ kubectl describe pod app-xxx -n app
1288
+
1289
+ # Kustomize
1290
+ kubectl kustomize k8s/overlays/prod/
1291
+ kubectl apply -k k8s/overlays/prod/
1292
+
1293
+ # Debug
1294
+ kubectl describe deployment/app
1295
+ kubectl get events --sort-by='.lastTimestamp'
1296
+ ```
1297
+
1298
+ ### 16.2 Required DevOps Files Checklist
1299
+
1300
+ | File | Purpose | Required For |
1301
+ |------|---------|-------------|
1302
+ | Dockerfile | Multi-stage container build | Every project |
1303
+ | .dockerignore | Exclude files from build | Every project |
1304
+ | docker-compose.yml | Local dev stack | Every project |
1305
+ | docker-compose.dev.yml | Dev with hot-reload | Multi-service projects |
1306
+ | k8s/base/deployment.yaml | K8s deployment config | K8s-deployed projects |
1307
+ | k8s/base/service.yaml | K8s service config | K8s-deployed projects |
1308
+ | k8s/base/ingress.yaml | K8s ingress config | K8s-deployed projects |
1309
+ | k8s/base/kustomization.yaml | Kustomize entry point | K8s-deployed projects |
1310
+ | k8s/overlays/dev/kustomization.yaml | Dev overlay | K8s-deployed projects |
1311
+ | k8s/overlays/prod/kustomization.yaml | Prod overlay | K8s-deployed projects |
1312
+ | .github/workflows/ci.yml | CI pipeline | GitHub repos |
1313
+ | .github/workflows/cd.yml | CD pipeline | GitHub repos |
1314
+ | azure-pipelines.yml | CI/CD pipeline | Azure DevOps |
1315
+ | monitoring/prometheus.yml | Metrics config | If using Prometheus |
1316
+ | scripts/health-check.sh | Health check script | Every project |
1317
+ | .nvmrc | Node version pinning | Every project |
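
The checklist above can be turned into a mechanical check. The following is a minimal sketch, assuming the caller supplies the list of files present in the repository. `ProjectTraits` and `missingDevOpsFiles` are hypothetical names, and the conditional rows (`docker-compose.dev.yml`, `monitoring/prometheus.yml`) are left out for brevity.

```typescript
// Traits that decide which rows of the checklist apply (illustrative names).
interface ProjectTraits {
  kubernetes: boolean;
  githubRepo: boolean;
  azureDevOps: boolean;
}

const EVERY_PROJECT = [
  "Dockerfile", ".dockerignore", "docker-compose.yml",
  "scripts/health-check.sh", ".nvmrc",
];
const K8S_PROJECT = [
  "k8s/base/deployment.yaml", "k8s/base/service.yaml",
  "k8s/base/ingress.yaml", "k8s/base/kustomization.yaml",
  "k8s/overlays/dev/kustomization.yaml", "k8s/overlays/prod/kustomization.yaml",
];
const GITHUB_REPO = [".github/workflows/ci.yml", ".github/workflows/cd.yml"];
const AZURE_DEVOPS = ["azure-pipelines.yml"];

// Returns the required files that are absent, in checklist order.
function missingDevOpsFiles(existing: string[], traits: ProjectTraits): string[] {
  const required = EVERY_PROJECT.concat(
    traits.kubernetes ? K8S_PROJECT : [],
    traits.githubRepo ? GITHUB_REPO : [],
    traits.azureDevOps ? AZURE_DEVOPS : [],
  );
  const present = new Set(existing);
  return required.filter((file) => !present.has(file));
}

const missing = missingDevOpsFiles(
  ["Dockerfile", ".dockerignore", ".nvmrc", ".github/workflows/ci.yml"],
  { kubernetes: false, githubRepo: true, azureDevOps: false },
);
console.log(missing);
```

A check of this kind could run as part of an onboarding review, with the file list taken from the repository tree.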
1318
+
1319
+ ### 16.3 Environment Variable Naming
1320
+
1321
+ | Scope | Prefix | Example |
1322
+ |-------|--------|---------|
1323
+ | Build-time (Vite) | VITE_ | VITE_API_URL |
1324
+ | Build-time (Next.js) | NEXT_PUBLIC_ | NEXT_PUBLIC_API_URL |
1325
+ | Runtime (Node) | No prefix | DATABASE_URL |
1326
+ | Docker | Service-specific | DB_HOST, REDIS_URL |
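
The prefixes matter for more than naming: Vite and Next.js embed `VITE_` and `NEXT_PUBLIC_` variables into the client bundle, so anything carrying those prefixes is effectively public. A minimal sketch of a guard built on that fact follows. The `SECRET_HINT` pattern is a hypothetical heuristic, not a rule from this guide, and would need tuning per project.

```typescript
type Exposure = "client-bundle" | "server-only";

// Prefixes from the naming table that mark build-time, client-visible variables.
const PUBLIC_PREFIXES = ["VITE_", "NEXT_PUBLIC_"];

function exposureOf(name: string): Exposure {
  return PUBLIC_PREFIXES.some((prefix) => name.startsWith(prefix))
    ? "client-bundle"
    : "server-only";
}

// Hypothetical heuristic: names that look like credentials.
const SECRET_HINT = /(SECRET|TOKEN|PASSWORD|PRIVATE_KEY)/;

// Flags variables that would ship to the browser but look like secrets.
function findLeakyNames(names: string[]): string[] {
  return names.filter(
    (name) => exposureOf(name) === "client-bundle" && SECRET_HINT.test(name),
  );
}

const flagged = findLeakyNames([
  "VITE_API_URL",
  "NEXT_PUBLIC_API_TOKEN",
  "DATABASE_URL",
  "DB_PASSWORD",
]);
console.log(flagged);
```

Here `DB_PASSWORD` is not flagged because it has no public prefix and stays server-side, while `NEXT_PUBLIC_API_TOKEN` is flagged because the prefix would expose it to every visitor.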
1327
+
1328
+ ### 16.4 Key CI/CD Terms
1329
+
1330
+ | Term | Definition |
1331
+ |------|-----------|
1332
+ | CI | Continuous Integration -- automatically build and test every change |
1333
+ | CD | Continuous Delivery/Deployment -- automatically deploy to environments |
1334
+ | Artifact | Build output (Docker image, compiled files) |
1335
+ | Lockfile | File that locks dependency versions (pnpm-lock.yaml) |
1336
+ | Blue/Green | Deployment strategy with two identical environments |
1337
+ | Canary | Gradual rollout to a subset of users |
1338
+ | Rolling Update | Gradual replacement of pods (Kubernetes default) |
1339
+ | Readiness Probe | Checks if pod is ready to serve traffic |
1340
+ | Liveness Probe | Checks if pod is alive (restarts if failed) |
1341
+ | HPA | Horizontal Pod Autoscaler (auto-scale replicas) |
1342
+ | PDB | PodDisruptionBudget (min pods during maintenance) |
1343
+ | Kustomize | Kubernetes configuration customization tool |
1344
+ | Helm | Kubernetes package manager |
1345
+
1346
+ ---
1347
+
1348
+ > **This is a living document.** It should be reviewed and updated quarterly by the DevOps lead and architecture team. All pipeline changes must follow these standards. Deviations require an Architecture Decision Record (ADR).
1349
+
1350
+ ---
1351
+
1352
+ ## 17. Implementation Validation & Enforcement Mechanisms
1353
+
1354
+ ### 17.1 Validation Architecture Overview
1355
+
1356
+ ```
+ +------------------+     +------------------+     +----------------------+     +--------------+
+ |    DEVELOPER     |     |    GIT/HOOKS     |     |     CI PIPELINE      |     |  PRODUCTION  |
+ |    WORKSPACE     |     |     (HUSKY)      |     |   (GITHUB ACTIONS)   |     |  DEPLOYMENT  |
+ +------------------+     +------------------+     +----------------------+     +--------------+
+          |                        |                          |                        |
+          v                        v                          v                        v
+ +------------------+     +------------------+     +----------------------+     +--------------+
+ | L1: Editor       |     | L2: Pre-commit   |     | L3: PR Validation    |     | L4: Post-    |
+ | (VS Code /       |     | (lint-staged +   |     | (lint, types, tests, |     | deploy Smoke |
+ | ESLint/Prettier) |     | commitlint)      |     | coverage, security)  |     | tests        |
+ +------------------+     +------------------+     +----------------------+     +--------------+
+ ```
1369
+
1370
+ **Four enforcement layers protect every rule in this document**:
1371
+
1372
+ | Layer | Tool | Trigger | Purpose |
1373
+ |-------|------|---------|---------|
1374
+ | L1 Editor | VS Code + ESLint extension | Every save | Instant feedback while coding |
1375
+ | L2 Pre-commit | Husky + lint-staged | Every commit | Block bad code before git history |
1376
+ | L3 CI Pipeline | GitHub Actions | Every PR | Enforce before merge |
1377
+ | L4 Post-deploy | Smoke tests + monitoring | Every deploy | Catch runtime issues |
1378
+
1379
+ ---
1380
+
1381
+ ## 18. L1: Editor-Level Validation (VS Code)
1382
+
1383
+ ### 18.1 Required VS Code Extensions
1384
+
1385
+ Create `.vscode/extensions.json` at the project root:
1386
+
1387
+ ```json
+ {
+   "recommendations": [
+     "dbaeumer.vscode-eslint",
+     "esbenp.prettier-vscode",
+     "usernamehw.commitlint",
+     "christian-kohler.npm-intellisense",
+     "christian-kohler.path-intellisense"
+   ]
+ }
+ ```
1398
+
1399
+ ### 18.2 VS Code Settings (Required Per Project)
1400
+
1401
+ Create `.vscode/settings.json`:
1402
+
1403
+ ```jsonc
+ {
+   // === FORMATTING ===
+   "editor.formatOnSave": true,
+   "editor.defaultFormatter": "esbenp.prettier-vscode",
+   "editor.codeActionsOnSave": {
+     "source.fixAll.eslint": "explicit",
+     "source.organizeImports": "never"
+   },
+
+   // === TYPESCRIPT ===
+   "typescript.preferences.importModuleSpecifier": "non-relative",
+   "typescript.preferences.quoteStyle": "single",
+   "typescript.tsdk": "node_modules/typescript/lib",
+
+   // === FILE SCOPES ===
+   "files.eol": "\n",
+   "files.trimTrailingWhitespace": true,
+   "files.insertFinalNewline": true,
+   "files.autoSave": "onFocusChange",
+
+   // === EDITOR ===
+   "editor.tabSize": 2,
+   "editor.insertSpaces": true,
+   "editor.detectIndentation": false,
+   "editor.renderWhitespace": "boundary",
+   "editor.rulers": [100],
+   "editor.wordWrapColumn": 100,
+
+   // === SEARCH & REPLACE ===
+   "search.useIgnoreFilesByDefault": true,
+   "search.exclude": {
+     "**/node_modules": true,
+     "**/dist": true,
+     "**/coverage": true
+   }
+ }
+ ```
1441
+
1442
+ ### 18.3 Editor Validation Checklist
1443
+
1444
+ - [ ] ESLint extension installed and active
+ - [ ] Prettier extension installed and active
+ - [ ] `.vscode/settings.json` committed to repo (shared across team)
+ - [ ] Format-on-save enabled
+ - [ ] ESLint auto-fix on save enabled
+ - [ ] Workspace TypeScript version selected (matches `typescript.tsdk`)
1450
+
1451
+ ---
1452
+
1453
+ ## 19. L2: Git Hook Validation (Husky + lint-staged)
1454
+
1455
+ ### 19.1 Complete Husky Setup
1456
+
1457
+ ```bash
+ # 1. Install (commitlint is needed by the commit-msg hook)
+ pnpm add -D husky lint-staged @commitlint/cli @commitlint/config-conventional
+
+ # 2. Enable (Husky v9: creates .husky/ and adds "prepare": "husky" to package.json)
+ pnpm exec husky init
+
+ # 3. Create hooks (plain shell files; `husky add` is deprecated as of Husky v9)
+ echo 'pnpm exec lint-staged' > .husky/pre-commit
+ echo 'pnpm exec commitlint --edit "$1"' > .husky/commit-msg
+ echo 'pnpm lint && pnpm check-modularization && pnpm check-types && pnpm test:ci' > .husky/pre-push
+
+ # 4. Verify
+ pnpm prepare
+ ```
1472
+
1473
+ The `check-modularization` script validates domain-folder structure, barrel file compliance, and module isolation rules defined in the Frontend and Backend Standards documents.
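
To make the intent of that script concrete, here is a minimal sketch of two of the checks it describes: flat files in a layer root and domain folders without a barrel file. This is not the contents of `scripts/check-modularization.mjs`, which is not shown in this guide. The function and type names are hypothetical, paths are assumed to be relative to the source root, and whether a root-level `index.ts` should be exempt is left as a project decision.

```typescript
// Layer roots that must contain only domain subfolders (from the standards checklist).
const LAYER_ROOTS = ["services", "hooks", "store", "components", "lib"];

interface Violation {
  path: string;
  rule: "flat-file" | "missing-barrel";
}

function checkModularization(files: string[]): Violation[] {
  const violations: Violation[] = [];
  // Maps "layer/domain" to whether an index.ts barrel was seen for it.
  const domains = new Map<string, boolean>();

  for (const file of files) {
    const parts = file.split("/");
    if (parts.length < 2 || !LAYER_ROOTS.includes(parts[0])) continue;

    if (parts.length === 2) {
      // e.g. services/api.ts sits directly in a layer root
      violations.push({ path: file, rule: "flat-file" });
      continue;
    }

    const domain = `${parts[0]}/${parts[1]}`;
    const isBarrel = parts.length === 3 && parts[2] === "index.ts";
    domains.set(domain, (domains.get(domain) ?? false) || isBarrel);
  }

  domains.forEach((hasBarrel, domain) => {
    if (!hasBarrel) violations.push({ path: domain, rule: "missing-barrel" });
  });
  return violations;
}

const found = checkModularization([
  "services/billing/index.ts",
  "services/billing/invoice.ts",
  "hooks/auth/useSession.ts",
  "services/api.ts",
  "utils/date.ts",
]);
console.log(found);
```

A real script would walk the file system and exit non-zero when the list is non-empty, which is what lets the pre-push hook and CI block on it. Cross-domain import checks are a separate concern and are not covered by this sketch.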
1474
+
1475
+ ### 19.1.1 Anti-Bypass Protection (`--no-verify` Enforcement)
1476
+
1477
+ The `git commit --no-verify` (or `-n`) flag skips local pre-commit and commit-msg hooks. To prevent this from being used to circumvent standards:
1478
+
1479
+ **1. CI Duplicates All Local Checks** — Every check that runs in pre-commit/pre-push also runs in CI (see Section 20). Using `--no-verify` only delays validation; it never bypasses it.
1480
+
1481
+ **2. Commit Message Validation in CI**:
1482
+
1483
+ ```yaml
1484
+ # .github/workflows/ci.yml
1485
+ - name: Validate Commit Messages (anti-bypass for --no-verify / commit-msg)
1486
+ run: pnpm exec commitlint --from origin/${{ github.base_ref }} --to HEAD
1487
+ ```
1488
+
1489
+ **3. Branch Protection Blocks Non-Compliant PRs** — Even if hooks are skipped locally, CI must pass before merge. The PR cannot merge with failing lint, modularization, type, or test checks.
1490
+
1491
+ **`--no-verify` Policy**:
1492
+
1493
+ | Bypass Attempt | Result |
1494
+ |----------------|--------|
1495
+ | `git commit --no-verify` | Commit lands locally. CI runs all skipped checks on push. PR blocked if failing. |
1496
+ | `git push --no-verify` | Push succeeds. CI runs the same checks as the pre-push hook (lint + modularization + types + tests). PR blocked if failing. |
+ | Removing `.husky/` directory | Visible in the PR diff and caught in code review, since `.husky/` is committed. CI still runs every check. |
+ | Disabling husky in `package.json` | Visible in the PR diff. Local hooks stop running, but CI still runs every check, so nothing is bypassed. |
1499
+
1500
+ > **With branch protection enabled, no commit reaches `main` or `develop` without passing every standard check, regardless of local hook bypass attempts.**
1501
+
1502
+ ### 19.2 Package.json Scripts Template
1503
+
1504
+ ```json
+ {
+   "scripts": {
+     "prepare": "husky",
+     "lint": "eslint --ext .ts,.tsx --max-warnings 0 .",
+     "lint:fix": "eslint --ext .ts,.tsx . --fix",
+     "format": "prettier --write \"src/**/*.{ts,tsx,json,css,md,yml,yaml}\"",
+     "format:check": "prettier --check \"src/**/*.{ts,tsx,json,css,md,yml,yaml}\"",
+     "check-types": "tsc --noEmit",
+     "check-modularization": "node scripts/check-modularization.mjs",
+     "audit": "pnpm audit --audit-level=high",
+     "test:ci": "vitest run --coverage",
+     "test:ui": "vitest",
+     "validate": "pnpm lint && pnpm check-modularization && pnpm check-types && pnpm test:ci"
+   },
+   "lint-staged": {
+     "src/**/*.{ts,tsx}": ["eslint --fix --max-warnings 0", "prettier --write"],
+     "src/**/*.{css,scss}": ["prettier --write"],
+     "*.{json,md,yaml,yml}": ["prettier --write"]
+   }
+ }
+ ```
1526
+
1527
+ ### 19.3 Hook Execution Flow
1528
+
1529
+ ```
+ git commit
+     |
+     v
+ +------------------+
+ | Pre-commit hook  |
+ | (lint-staged)    |  -- Runs ESLint + Prettier on STAGED files only
+ +------------------+  -- Blocks commit if errors found
+     |
+     v (passes)
+ +------------------+
+ | Commit-msg hook  |
+ | (commitlint)     |  -- Validates message against conventional commits
+ +------------------+  -- Blocks commit if invalid
+     |
+     v
+ Commit succeeds
+
+
+ git push
+     |
+     v
+ +------------------+
+ | Pre-push hook    |
+ | (full checks)    |  -- Lint + Modularization + Types + Tests
+ +------------------+  -- Blocks push if fails
+     |
+     v
+ Push succeeds
+ ```
1559
+
1560
+ > **Note**: If `--no-verify` is used to skip hooks, all checks still run in CI (Section 20). The PR will be blocked.
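
For readers unfamiliar with what the commit-msg hook accepts, the sketch below approximates the header check. It is a simplified stand-in, not commitlint's implementation: the type list and the 100-character limit follow the defaults of `@commitlint/config-conventional`, while the scope pattern is an assumption made for illustration.

```typescript
// Types allowed by @commitlint/config-conventional by default.
const TYPES = [
  "build", "chore", "ci", "docs", "feat", "fix",
  "perf", "refactor", "revert", "style", "test",
];

// type, optional (scope), optional "!" for breaking changes, then ": subject".
// The scope character set is an assumption for this sketch.
const HEADER = new RegExp(`^(${TYPES.join("|")})(\\([a-z0-9-]+\\))?!?: .+$`);

function isConventionalHeader(message: string): boolean {
  const header = message.split("\n")[0];
  return header.length <= 100 && HEADER.test(header);
}

console.log(isConventionalHeader("feat(auth): add token refresh")); // accepted
console.log(isConventionalHeader("Fixed stuff"));                   // rejected
```

The real tool applies more rules (subject case, trailing period, body and footer formatting), so this should be read as an aid to understanding, not as a replacement for commitlint.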
1561
+
1562
+ ---
1563
+
1564
+ ## 20. L3: CI Pipeline Validation (GitHub Actions)
1565
+
1566
+ ### 20.1 Required CI Workflow
1567
+
1568
+ Every project must have `.github/workflows/ci.yml`. A partial or missing workflow is a blocking issue in onboarding review.
1569
+
1570
+ ### 20.2 Mandatory CI Checks
1571
+
1572
+ | Step | Command | Fails On | Time Limit |
1573
+ |------|---------|----------|------------|
1574
+ | 1. Lint | `pnpm lint` | Any ESLint error (including boundaries + `no-internal-modules`) | 2 min |
1575
+ | 2. Modularization | `pnpm check-modularization` | Flat files at root, missing barrel files, cross-domain imports | 30s |
1576
+ | 3. Commit Messages | `pnpm exec commitlint --from origin/${{ github.base_ref }} --to HEAD` | Non-conventional commits (anti-bypass) | 10s |
1577
+ | 4. Type Check | `pnpm check-types` | Any TS type error | 1 min |
1578
+ | 5. Format Check | `pnpm format:check` | Any formatting difference | 30s |
1579
+ | 6. Unit Tests | `pnpm test:ci` | Test failure OR coverage < 80% lines, 75% branches, 80% functions, 80% statements | 5 min |
1580
+ | 7. Security Audit | `pnpm audit --audit-level=high` | High/Critical CVEs | 1 min |
1581
+ | 8. Build | `pnpm build` | Build failure | 3 min |
1582
+
1583
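+ **Total target CI duration**: < 10 minutes. The per-step limits above are ceilings, not expected durations; added together they come to roughly 13 minutes.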
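
The relationship between the per-step limits and the overall target can be checked with a few lines of arithmetic. This sketch copies the limits from the table in 20.2; the variable names are illustrative.

```typescript
// Per-step time limits from the Mandatory CI Checks table, in seconds.
const stepLimits = [
  { step: "Lint", seconds: 120 },
  { step: "Modularization", seconds: 30 },
  { step: "Commit Messages", seconds: 10 },
  { step: "Type Check", seconds: 60 },
  { step: "Format Check", seconds: 30 },
  { step: "Unit Tests", seconds: 300 },
  { step: "Security Audit", seconds: 60 },
  { step: "Build", seconds: 180 },
];

const TARGET_SECONDS = 10 * 60;

// Worst case: every step runs right up to its limit, one after another.
const worstCaseSeconds = stepLimits.reduce((sum, s) => sum + s.seconds, 0);
const overTargetSeconds = worstCaseSeconds - TARGET_SECONDS;

console.log(`Worst case: ${Math.floor(worstCaseSeconds / 60)} min ${worstCaseSeconds % 60} s`);
console.log(`Over target by: ${overTargetSeconds} s`);
```

The worst case is 13 min 10 s, so a pipeline that hits every limit misses the 10-minute target by about three minutes. Meeting the target therefore depends on steps finishing well inside their limits, or on splitting independent steps into parallel jobs.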
1584
+
1585
+ ### 20.3 Complete CI Workflow
1586
+
1587
+ ```yaml
+ name: CI Pipeline
+ on:
+   push:
+     branches: [main, develop]
+   pull_request:
+     branches: [main, develop]
+
+ env:
+   NODE_VERSION: "20"
+   PNPM_VERSION: "9"
+
+ jobs:
+   quality:
+     name: Quality Checks
+     runs-on: ubuntu-latest
+     steps:
+       - uses: actions/checkout@v4
+         with:
+           fetch-depth: 0 # Required for commitlint --from/--to
+
+       # pnpm must be installed before setup-node, or `cache: "pnpm"` cannot find it
+       - uses: pnpm/action-setup@v4
+         with:
+           version: ${{ env.PNPM_VERSION }}
+       - uses: actions/setup-node@v4
+         with:
+           node-version: ${{ env.NODE_VERSION }}
+           cache: "pnpm"
+
+       - name: Install dependencies
+         run: pnpm install --frozen-lockfile
+
+       - name: Lint (ESLint + boundaries + no-internal-modules)
+         run: pnpm lint
+
+       - name: Check Modularization (domain structure + barrels)
+         run: pnpm check-modularization
+
+       - name: Validate Commit Messages (anti-bypass for --no-verify)
+         if: github.event_name == 'pull_request'
+         run: pnpm exec commitlint --from origin/${{ github.base_ref }} --to HEAD
+
+       - name: Type Check
+         run: pnpm check-types
+
+       - name: Format Check
+         run: pnpm format:check
+
+       - name: Test with Coverage
+         run: pnpm test:ci
+
+       - name: Security Audit
+         run: pnpm audit --audit-level=high
+
+       - name: Build
+         run: pnpm build
+
+       - uses: codecov/codecov-action@v4
+         if: always()
+         with:
+           files: ./coverage/cobertura-coverage.xml
+           fail_ci_if_error: false
+ ```
1649
+
1650
+ ### 20.4 CD Workflow
1651
+
1652
+ ```yaml
+ name: CD Pipeline
+ on:
+   push:
+     branches: [develop, main]
+     tags: ["v*"] # needed so the tag-gated production job can trigger
+
+ jobs:
+   deploy-dev:
+     if: github.ref == 'refs/heads/develop'
+     runs-on: ubuntu-latest
+     environment: dev
+     steps:
+       - uses: actions/checkout@v4
+       - uses: azure/setup-kubectl@v4
+       - name: Configure kubeconfig
+         run: |
+           mkdir -p $HOME/.kube
+           echo "${{ secrets.KUBE_CONFIG_DEV }}" | base64 --decode > $HOME/.kube/config
+       - name: Deploy
+         run: |
+           kubectl set image deployment/app \
+             app=ghcr.io/${{ github.repository }}:${{ github.sha }} \
+             --namespace=dev
+           kubectl rollout status deployment/app --namespace=dev
+       - name: Migrate
+         run: kubectl exec deployment/app --namespace=dev -- pnpm db:migrate
+
+   deploy-staging:
+     # No `needs: [deploy-dev]`: that job is skipped on main, which would skip staging too
+     if: github.ref == 'refs/heads/main'
+     runs-on: ubuntu-latest
+     environment: staging
+     steps:
+       - uses: actions/checkout@v4
+       - name: Configure kubeconfig
+         run: |
+           mkdir -p $HOME/.kube
+           echo "${{ secrets.KUBE_CONFIG_STAGING }}" | base64 --decode > $HOME/.kube/config
+       - name: Deploy
+         run: |
+           kubectl set image deployment/app \
+             app=ghcr.io/${{ github.repository }}:${{ github.sha }} \
+             --namespace=staging
+           kubectl rollout status deployment/app --namespace=staging
+       - name: Migrate
+         run: kubectl exec deployment/app --namespace=staging -- pnpm db:migrate
+       - name: Smoke Tests
+         run: kubectl exec deployment/app --namespace=staging -- pnpm test:e2e
+
+   deploy-production:
+     # Runs on version tags only; the tagged commit is expected to have passed staging on main
+     if: startsWith(github.ref, 'refs/tags/v')
+     runs-on: ubuntu-latest
+     environment: production
+     steps:
+       - uses: actions/checkout@v4
+       - name: Configure kubeconfig
+         run: |
+           mkdir -p $HOME/.kube
+           echo "${{ secrets.KUBE_CONFIG_PROD }}" | base64 --decode > $HOME/.kube/config
+       - name: Deploy (rolling update)
+         run: |
+           kubectl set image deployment/app \
+             app=ghcr.io/${{ github.repository }}:${{ github.sha }} \
+             --namespace=production
+           kubectl rollout status deployment/app --namespace=production
+       - name: Migrate
+         run: kubectl exec deployment/app --namespace=production -- pnpm db:migrate
+       - name: Health Check
+         run: kubectl get pods,ingress --namespace=production
+ ```
1723
+
1724
+ ---
1725
+
1726
+ ## 21. Validation Rules Summary
1727
+
1728
+ ### 21.1 Rules That Must Be Enforced (Every Project)
1729
+
1730
+ | Rule | Enforcement | Tool | Severity |
1731
+ |------|-------------|------|----------|
1732
+ | No ESLint errors | Exit code != 0 | ESLint | BLOCK |
1733
+ | No TypeScript errors | Exit code != 0 | TypeScript | BLOCK |
1734
+ | No formatting errors | Exit code != 0 | Prettier | BLOCK |
1735
+ | Tests pass | Exit code != 0 | Vitest / Jest | BLOCK |
1736
+ | Coverage >= 80% lines, 75% branches, 80% functions, 80% statements | Threshold fail | Vitest coverage | BLOCK |
1737
+ | No high/critical CVEs | Exit code != 0 | pnpm audit | BLOCK |
1738
+ | Conventional commits | Message reject | Commitlint | BLOCK |
1739
+ | **Domain modularization** | **Exit code != 0** | **check-modularization + ESLint boundaries** | **BLOCK** |
1740
+ | **Barrel files present** | **Exit code != 0** | **check-modularization** | **BLOCK** |
1741
+ | **No deep imports bypassing barrels** | **ESLint error** | **import/no-internal-modules** | **BLOCK** |
1742
+ | Lockfile committed | CI install fails | `pnpm install --frozen-lockfile` | BLOCK |
1743
+ | No secrets in code | CI scan | GitLeaks / ESLint | BLOCK |
1744
+ | Naming (camelCase, etc) | CI lint error | ESLint | BLOCK |
1745
+ | No unused vars | CI lint error | ESLint | BLOCK |
1746
+ | Single-letter identifiers banned | CI lint error | ESLint id-length | BLOCK |
1747
+ | **Anti-bypass (--no-verify)** | **CI runs all checks regardless** | **GitHub Actions** | **BLOCK** |
1748
+ | **PR template compliance** | **PR must follow .github/PULL_REQUEST_TEMPLATE.md** | **Code review + CI checklist gates** | **BLOCK** |
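
The coverage rule in the table uses a different threshold for branches than for the other metrics, which is easy to get wrong when configuring it. The sketch below shows the comparison as a plain function. In practice Vitest can enforce the same numbers itself through its `coverage.thresholds` option, so this is only an illustration of the rule; the names are hypothetical.

```typescript
interface CoverageSummary {
  lines: number;
  branches: number;
  functions: number;
  statements: number;
}

// Minimum percentages from the rules table.
const THRESHOLDS: CoverageSummary = {
  lines: 80,
  branches: 75,
  functions: 80,
  statements: 80,
};

// Returns the names of the metrics that fall below their threshold.
function failingMetrics(
  actual: CoverageSummary,
  required: CoverageSummary = THRESHOLDS,
): string[] {
  const metrics: (keyof CoverageSummary)[] = ["lines", "branches", "functions", "statements"];
  return metrics.filter((metric) => actual[metric] < required[metric]);
}

const failing = failingMetrics({ lines: 85, branches: 74.9, functions: 80, statements: 92 });
console.log(failing.length === 0 ? "coverage OK" : `below threshold: ${failing.join(", ")}`);
```

A value exactly at the threshold passes, matching the `>=` wording in the table.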
1749
+
1750
+ ### 21.2 New Project Validation Checklist
1751
+
1752
+ When a new project is created, the following must be verified before it is allowed to merge its first PR:
1753
+
1754
+ - [ ] `.vscode/settings.json` exists and is committed
1755
+ - [ ] `.vscode/extensions.json` exists and lists required extensions
1756
+ - [ ] `pnpm lint` exits with 0 and reports 0 errors (including boundaries + `no-internal-modules`)
1757
+ - [ ] `pnpm check-modularization` exits with 0 (all domain folders have barrel files, no flat files at roots)
1758
+ - [ ] `pnpm check-types` exits with 0
1759
+ - [ ] `pnpm format:check` exits with 0
1760
+ - [ ] `pnpm test:ci` passes and coverage meets the thresholds (80% lines, 75% branches, 80% functions, 80% statements)
1761
+ - [ ] `pnpm audit --audit-level=high` reports nothing
1762
+ - [ ] `.husky/pre-commit` exists and runs without error
1763
+ - [ ] `.husky/commit-msg` rejects bad commit messages
1764
+ - [ ] `.husky/pre-push` runs lint + modularization + types + tests
1765
+ - [ ] `.github/workflows/ci.yml` exists and all steps pass (including modularization + commit message validation)
1766
+ - [ ] `.github/workflows/cd.yml` exists with dev/staging/prod jobs
1767
+ - [ ] Coverage threshold configured in `vitest.config.ts` or `jest.config.ts`
1768
+ - [ ] `commitlint.config.js` exists and is enforced
1769
+ - [ ] Branch protection rules configured on GitHub
1770
+ - [ ] `.github/PULL_REQUEST_TEMPLATE.md` exists (copied from standards package)
1771
+ - [ ] Every domain subdirectory (`services/<domain>/`, `hooks/<domain>/`, etc.) has an `index.ts` barrel file
1772
+ - [ ] No flat/orphaned domain files at `services/`, `hooks/`, `store/`, `components/`, `lib/` roots
1773
+
1774
+ ### 21.3 Non-Compliant Project Remediation
1775
+
1776
+ If an existing project is found not meeting these standards:
1777
+
1778
+ 1. **Week 1**: Add missing config files (`.vscode/`, `.husky/`, `.github/workflows/`)
1779
+ 2. **Week 2**: Run `pnpm lint:fix` and `pnpm format` on entire codebase
1780
+ 3. **Week 3**: Add missing tests until coverage >= 80% (start with critical paths)
1781
+ 4. **Week 4**: Enable branch protection, require PR reviews, merge only compliant code
1782
+
1783
+ ---
1784
+ ## Appendix B. CI/CD Cost & Pricing Reference (June 2026)
1785
+
1786
+ > Use this to estimate platform costs when proposing a new project or evaluating vendors. Prices are mid-2026 commercial estimates and vary by vendor, region, and contract terms.
1787
+
1788
+ ### Partner Benefits & Azure Cost Relief
1789
+
1790
+ As a **Microsoft Partner**, your organization may qualify for:
1791
+
1792
+ - **Microsoft Action Pack / Solutions Partner** benefits: included Azure credits, free/dev CI-CD minutes, and co-selling/support access.
1793
+ - **Azure Hybrid Benefit / Reserved Instances**: reduces VM and managed-service compute cost by **20%–40%**.
1794
+ - **Visual Studio Enterprise / GitHub Enterprise** bundled licensing: avoids duplicate license spend and typically lowers total cost of ownership.
1795
+ - **Microsoft for Startups / ISV Success**: early-stage credits and advisory hours to reduce ramp-up cost.
1796
+ - **FastTrack / Partner Support**: faster migrations and production-readiness reviews reduce delivery risk and project overruns.
1797
+
1798
+ Use the tables below as standard list-price baselines when comparing against partner-inclusive pricing for planned workloads.
1799
+
1800
+ ### B.1 Estimated Annual Costs by Project Scale
1801
+
1802
+ | Project Scale | Runners / Minutes | Container Registry | Secrets / Vault | Monitoring | Total Est. Annual |
1803
+ |---|---|---|---|---|---|
1804
+ | Startup / MVP | ~5,000 min/mo | GHCR free tier | GitHub Secrets included | Free OSS | ~USD 500 – USD 3,000 |
1805
+ | Growth / Series A | ~25,000 min/mo | GHCR + ACR | Key Vault / Doppler | Sentry / Grafana Cloud | ~USD 5,000 – USD 25,000 |
1806
+ | Scaleup / Enterprise | ~100,000+ min/mo | ACR/ECR/Artifact Registry + cache | Key Vault / Doppler | Datadog / New Relic | ~USD 30,000 – USD 120,000+ |
1807
+ | Regulated / Fintech | ~50,000 min/mo | Private registry + signing | Dedicated HSM / Azure KV | ELK / Splunk + APM | ~USD 50,000 – USD 200,000+ |
1808
+
1809
+ ### B.2 Tooling & License Costs (Annual)
1810
+
1811
+ | Tool / Service | Purpose | Typical Tier | Est. Cost | Notes |
1812
+ |---|---|---|---|---|
1813
+ | GitHub Actions (private) | CI minutes beyond free tier | 2,000 min/mo free; then ~USD 0.008/min (Linux runners) | USD 2,400 – USD 10,000/yr | Cache and matrix strategies dramatically affect cost. |
1814
+ | Azure DevOps | Pipelines, Artifacts, Boards | Basic: free; Parallel jobs: ~USD 40/job/mo | USD 0 – USD 8,000/yr | Parallel jobs are the main cost driver. |
1815
+ | GitLab.com | CI/CD minutes and storage | Free tier 400 min/mo; Premium ~USD 29/user/mo | USD 0 – USD 10,000/yr | Self-managed runners shift cost to infra. |
1816
+ | SonarCloud / SonarQube | Static analysis, quality gate | Developer: ~USD 25/dev/mo; Enterprise annually | USD 2,000 – USD 15,000/yr | SonarCloud SaaS vs self-hosted on same infra. |
1817
+ | Snyk / Trivy / Dependabot | Dependency & container scanning | Snyk Team: ~USD 19/dev/mo; Trivy: free | USD 0 – USD 5,000/yr | Trivy + Dependabot cover most needs at zero marginal cost. |
1818
+ | Datadog / New Relic / Azure Monitor | APM, logs, traces, RUM | Host/ingestion based | USD 1,500 – USD 10,000+/yr | Datadog is easy to adopt; Grafana Cloud is cheaper. |
1819
+ | Grafana Cloud | Metrics + dashboards | Free tier; Pro: ~USD 8–29/user/mo | USD 100 – USD 3,000/yr | Good value for metrics-heavy setups. |
1820
+ | Slack / Teams Notifications | Deployment notifications | Included in workspace | Usually free | Webhook costs are typically negligible. |
1821
+ | HashiCorp Vault / Azure Key Vault | Secret management | Small tier / per-secret pricing | USD 200 – USD 1,500/yr | Avoids expensive breach remediation. |
1822
+ | Terraform Cloud / Bicep / Pulumi | IaC collaboration + state | Free tier; Standard ~USD 0.011/run | USD 100 – USD 1,000/yr | Workspace isolation and RBAC are worth the cost. |
1823
+
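The GitHub Actions row can be sanity-checked with simple arithmetic. The sketch below is a rough estimate only: it assumes the ~USD 0.008/min rate and the 2,000 free minutes per month from the table, a single runner type, and no storage or data-transfer charges. The function and field names are illustrative, not part of any vendor API.

```typescript
// Rough CI-minute cost estimate. Names are hypothetical; rates come from the B.2 table.
interface CiCostEstimate {
  billableMinutesPerMonth: number;
  monthlyUsd: number;
  annualUsd: number;
}

function estimateCiCost(
  minutesPerMonth: number,
  freeMinutesPerMonth = 2_000, // assumed monthly free allowance
  usdPerMinute = 0.008,        // assumed flat per-minute rate
): CiCostEstimate {
  // Only minutes above the free allowance are billed.
  const billable = Math.max(0, minutesPerMonth - freeMinutesPerMonth);
  // Round to whole cents to avoid floating-point noise in the result.
  const toCents = (usd: number): number => Math.round(usd * 100) / 100;
  const monthlyUsd = toCents(billable * usdPerMinute);
  return {
    billableMinutesPerMonth: billable,
    monthlyUsd,
    annualUsd: toCents(monthlyUsd * 12),
  };
}

// Growth-scale project from B.1: ~25,000 CI minutes per month.
const growth = estimateCiCost(25_000);
console.log(growth);
```

Under these assumptions, 25,000 min/mo leaves 23,000 billable minutes, or about USD 184/mo and USD 2,208/yr. The table's lower bound of USD 2,400/yr corresponds to billing all 25,000 minutes with no free allowance.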
1824
+ ### B.3 Infrastructure Costs (Monthly and Annual)
1825
+
1826
+ | Environment | Typical Specs | Est. Monthly | Annual Est. |
1827
+ |---|---|---|---|
1828
+ | Dev (1 replica, no HA) | 1 vCPU / 2 GB RAM | USD 30 – USD 120 | USD 360 – USD 1,440 |
1829
+ | Staging (2 replicas) | 2 vCPU / 4 GB RAM each | USD 100 – USD 300 | USD 1,200 – USD 3,600 |
1830
+ | Production (3+ replicas, autoscale) | 2 vCPU / 4 GB RAM each + LB | USD 250 – USD 900 | USD 3,000 – USD 10,800 |
1831
+ | Database (managed Postgres) | 2 vCPU / 8 GB + storage | USD 120 – USD 500 | USD 1,440 – USD 6,000 |
1832
+ | Redis / Valkey (managed) | 1 vCPU / 2 GB | USD 15 – USD 50 | USD 180 – USD 600 |
1833
+ | Object Storage | 1 TB + egress | USD 20 – USD 80 | USD 240 – USD 960 |
1834
+ | Load Balancer / Ingress | Basic LB | USD 15 – USD 30 | USD 180 – USD 360 |
1835
+ | DNS, certs, monitoring | Managed | Often < USD 50 (included in hosting or minimal extra) | Often < USD 600 |
1836
+
1837
+ > These are mid-2026 market reference prices. Reserved/commitment terms and savings plans can reduce compute costs by 20%–40%.
1838
+
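To turn the B.3 rows into a single budget figure, the monthly ranges can be summed and annualized. The sketch below copies the fixed-range rows from the table (the DNS/certs row is left out because it has no fixed range). The `discount` parameter is a simplifying assumption: it applies one flat rate to every row, whereas reserved or commitment discounts normally apply to compute only. All names are illustrative.

```typescript
// Monthly cost ranges in USD, copied from the B.3 table (fixed-range rows only).
type Range = { low: number; high: number };

const monthlyRanges: Record<string, Range> = {
  dev: { low: 30, high: 120 },
  staging: { low: 100, high: 300 },
  production: { low: 250, high: 900 },
  database: { low: 120, high: 500 },
  redis: { low: 15, high: 50 },
  objectStorage: { low: 20, high: 80 },
  loadBalancer: { low: 15, high: 30 },
};

// Sum all rows, annualize, and optionally apply a flat discount (0.2 = 20%).
function annualTotal(ranges: Record<string, Range>, discount = 0): Range {
  const rows = Object.values(ranges);
  const sum = (pick: (r: Range) => number): number =>
    rows.reduce((acc, r) => acc + pick(r), 0);
  const scale = 12 * (1 - discount);
  return {
    low: Math.round(sum((r) => r.low) * scale),
    high: Math.round(sum((r) => r.high) * scale),
  };
}

const listPrice = annualTotal(monthlyRanges);
const withCommitment = annualTotal(monthlyRanges, 0.2);
console.log({ listPrice, withCommitment });
```

At list price the rows add up to USD 550 – USD 1,980 per month, or roughly USD 6,600 – USD 23,760 per year, before any partner credits or commitment discounts.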
1839
+ ---
1840
+
1841
+ ### B.4 Open-Source vs Commercial Alternatives (2026)
1842
+
1843
+ | Need | Commercial Option | OSS Alternative | Est. Savings | Trade-offs / Notes |
1844
+ |---|---|---|---|---|
1845
+ | CI/CD Platform | GitHub Actions / Azure DevOps / GitLab.com | **Woodpecker**, **Gitea Actions**, **Jenkins**, **Drone CI** | Up to ~USD 10,000+/yr on high volume | OSS needs self-hosted runners/infra. Woodpecker/Gitea give closest DX to GitHub Actions without cloud minute bills. |
1846
+ | Static Analysis | SonarCloud / SonarQube Enterprise | **SonarQube Community Edition**, **ESLint + TypeScript strict + Coverage**, GitHub **CodeQL** | ~USD 2,000 – USD 15,000/yr | CodeQL code scanning is free for public repositories; private repositories require a paid GitHub Advanced Security license. SonarQube CE is fully capable for most teams. |
1847
+ | Dependency Scanning | Snyk | **Trivy**, **OWASP Dependency-Check**, GitHub Dependabot, **Renovate** | ~USD 0 – USD 5,000/yr | Trivy + Dependabot cover most needs at zero license cost. Renovate is best when you want automated PR updates too. |
1848
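+ | Container Registry | Azure Container Registry / AWS ECR | **GHCR** (GitHub Container Registry), **Harbor** | ~USD 100 – USD 1,000/yr | GHCR is free for public images, and pulls from GitHub Actions incur no data-transfer charge; private image storage counts against the plan's Packages quota. Harbor is best for air-gapped / on-prem / high egress cost scenarios. |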
1849
+ | Secret Management | HashiCorp Vault Enterprise / Azure Key Vault premium | **HashiCorp Vault Community Edition**, **SOPS + age/GPG**, **Doppler CLI (free tier)** | ~USD 200 – USD 1,500/yr | Vault Community Edition is mature, though it has been BUSL-licensed rather than open source since 2023. SOPS is ideal when encrypted secrets are stored in Git. |
1850
+ | Monitoring/APM | Datadog / New Relic / Azure Monitor | **Prometheus + Grafana**, **SigNoz**, **Uptime Kuma**, **Grafana Cloud Free/Pro** | ~USD 1,500 – USD 10,000+/yr | Grafana Cloud Pro is still far cheaper than Datadog. Self-hosted Prometheus saves the most but needs operator skill. |
1851
+ | Log Aggregation | Splunk / Datadog Logs / ELK Cloud | **Loki + Promtail**, **OpenSearch**, **Vector + ClickHouse** | ~USD 1,000 – USD 8,000/yr | Loki is the easiest OSS drop-in for Kubernetes logging. OpenSearch is heavier but offers more capable search. |
1852
+ | IaC Platform | Terraform Cloud / Pulumi Business | **Terraform CLI + local backend/S3**, **OpenTofu** | ~USD 100 – USD 1,000/yr | OpenTofu is the open-source fork of Terraform and is now widely adopted. State in S3 with locking (DynamoDB or the newer native S3 lockfile) remains one of the cheapest backends. |
1853
+ | Deployment Gates | Azure DevOps Gates / Cloud Native features | **Open Policy Agent / Gatekeeper**, **Kyverno**, GitHub Environments + required reviewers | Often free | Policy-as-code replaces many manual gate features at no license cost. |
1854
+
1855
+ ### B.5 OSS Selection Cheat Sheet
1856
+
1857
+ | Solution | Setup Effort | Ongoing Maintenance | When It Wins | When to Buy Commercial |
1858
+ |---|---|---|---|---|
1859
+ | Woodpecker / Gitea Actions | Medium | Medium | Heavy CI usage, private repo cost control | When you want zero runner maintenance and an out-of-the-box hosted UX |
1860
+ | Prometheus + Grafana | Medium | Medium | Metrics-heavy infra with in-house SRE knowledge | When support SLAs and fully hosted dashboards are mandatory |
1861
+ | SonarQube CE / CodeQL | Low | Low | Security + quality gates with minimal licensing | When you need enterprise support contracts or custom plugins |
1862
+ | Trivy + Dependabot | Low | Low | Dependency and container scanning | When you want integrated vendor support and policy enforcement |
1863
+ | Harbor | Medium | Medium | Air-gapped/on-prem, large images, compliance | When managed cloud registries are permitted and you do not want to operate registry infrastructure |
1864
+ | OpenTofu / Terraform OSS | Medium | Medium | IaC state and collaboration | When workspaces, RBAC, and managed runs justify SaaS pricing |
1865
+ | Sentry Self-Hosted / GlitchTip | Medium | Medium | High error volume, sensitive logs, privacy requirements | When incident response support and uptime guarantees matter more than license cost |
1866
+
1867
+ ### B.6 Sources & Methodology
1868
+
1869
+ - Pricing is based on **public list prices** from vendor pricing pages:
1870
+ - GitHub Actions: https://docs.github.com/en/billing/actions/usage-limits-for-billed-users
1871
+ - Azure pricing: https://azure.microsoft.com/en-us/pricing/
1872
+ - Azure DevOps pricing: https://azure.microsoft.com/en-us/pricing/details/devops/azure-devops-services/
1873
+ - GitLab pricing: https://about.gitlab.com/pricing/
1874
+ - Datadog: https://www.datadoghq.com/pricing/
1875
+ - Sentry: https://sentry.io/pricing/
1876
+ - Grafana Cloud: https://grafana.com/products/grafana-cloud/pricing/
1877
+ - SonarCloud: https://www.sonarsource.com/pricing/
1878
+ - Snyk: https://snyk.io/pricing/
1879
+ - Trivy: https://github.com/aquasecurity/trivy (open source)
1880
+ - Harbor: https://goharbor.io/ (open source)
1881
+ - Microsoft Partner benefits sourced from:
1882
+ - Microsoft Partner Network: https://partner.microsoft.com/
1883
+ - Microsoft for Startups / ISV Success: https://startups.microsoft.com/
1884
+ - Azure Hybrid Benefit: https://azure.microsoft.com/en-us/pricing/benefits/azure-hybrid-benefit/
1885
+ - Visual Studio subscriptions: https://visualstudio.microsoft.com/subscriptions/
1886
+ - FastTrack: https://azure.microsoft.com/en-us/programs/azure-fasttrack/
1887
+ Actual entitlement depends on partner tier, agreement, and region; treat these as **items to explore** with your account team.
1888
+ - Infrastructure estimates assume managed cloud VM/service pricing with no reserved/commitment discounts by default; see cloud provider calculators for exact region-specific pricing.