@synergyerp/backend-standards 1.1.1 → 1.1.2
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/BACKEND_STANDARDS.md +218 -181
- package/CICD_STANDARDS.md +355 -341
- package/package.json +1 -1
package/CICD_STANDARDS.md
CHANGED
@@ -69,11 +69,11 @@ main --- Production-ready code (protected)
 
 ### 2.2 Branch Protection Rules
 
-| Branch
-
-| main
-| develop
-| release
+| Branch | Required Reviews | Status Checks | Direct Push |
+| ---------- | ---------------- | -------------------------------------------------- | ------------------ |
+| main | 2 approvals | CI must pass, coverage >=80% lines, >=75% branches | Blocked |
+| develop | 1 approval | CI must pass | Blocked |
+| release/\* | 1 approval | CI must pass | Allowed for admins |
 
 ### 2.3 GitHub Branch Protection Setup
 
@@ -93,14 +93,14 @@ To configure branch protection in GitHub:
 
 ### 2.4 Trigger Mapping
 
-| Event
-
-| PR opened/synced
-| PR merged to develop | develop
-| PR opened/synced
-| PR merged to main
-| Tag pushed
-| Push to hotfix
+| Event | Branch | Pipeline |
+| -------------------- | --------------------- | ------------------------------ |
+| PR opened/synced | feature/\* -> develop | CI (lint, test, build) |
+| PR merged to develop | develop | CI + CD (deploy to Dev) |
+| PR opened/synced | develop -> main | CI (lint, test, build) |
+| PR merged to main | main | CI + CD (deploy to Staging) |
+| Tag pushed | v\* | CI + CD (deploy to Production) |
+| Push to hotfix/\* | hotfix/\* -> main | CI + CD (hotfix deploy) |
 
 ---
 
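The trigger mapping in section 2.4 can be read as a small lookup from event and branch to pipeline. The following is a minimal TypeScript sketch of that reading, not part of the package: the `CiEvent` and `Trigger` types and the `resolvePipeline` function are hypothetical names, and only the event/branch/pipeline pairs are taken from the table.

```typescript
// Hypothetical types for illustration; only the table rows come from the document.
type CiEvent = "pr_opened_or_synced" | "pr_merged" | "tag_pushed" | "push";

interface Trigger {
  event: CiEvent;
  /** Source branch for PR events; the pushed ref for tag and push events. */
  source: string;
  /** Target branch; only meaningful for PR events. */
  target?: string;
}

/** Returns the pipeline named in the 2.4 table, or null when no row matches. */
function resolvePipeline(t: Trigger): string | null {
  if (t.event === "pr_opened_or_synced") {
    if (t.source.startsWith("feature/") && t.target === "develop") {
      return "CI (lint, test, build)";
    }
    if (t.source === "develop" && t.target === "main") {
      return "CI (lint, test, build)";
    }
  }
  if (t.event === "pr_merged") {
    if (t.target === "develop") return "CI + CD (deploy to Dev)";
    if (t.target === "main") return "CI + CD (deploy to Staging)";
  }
  if (t.event === "tag_pushed" && t.source.startsWith("v")) {
    return "CI + CD (deploy to Production)";
  }
  if (t.event === "push" && t.source.startsWith("hotfix/")) {
    return "CI + CD (hotfix deploy)";
  }
  return null;
}
```

Returning `null` for unmatched combinations is a design choice of this sketch; the table itself does not say what should happen for events it does not list.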
@@ -130,12 +130,12 @@ DEVELOPER PUSHES CODE
 
 ### 3.2 Pipeline Separation
 
-| Pipeline
-
-| CI
-| CD (Dev)
-| CD (Staging)
-| CD (Production) | Tag v
+| Pipeline | Trigger | Duration Target | Includes |
+| --------------- | ------------------ | --------------- | ---------------------------------------- |
+| CI | Every PR + push | < 10 minutes | Lint, types, tests, security, build |
+| CD (Dev) | Merge to develop | < 5 minutes | Docker build, deploy Dev, smoke tests |
+| CD (Staging) | Merge to main | < 10 minutes | Docker build, deploy Staging, E2E tests |
+| CD (Production) | Tag v\* + approval | < 15 minutes | Docker build, deploy Prod, health checks |
 
 ---
 
@@ -172,18 +172,18 @@ DEVELOPER PUSHES CODE
 
 ### 4.3 Non-Negotiables
 
-| Rule
-
-| Pipeline fails on lint errors
-| Pipeline fails on type errors
-| Pipeline fails on test failures
-| Pipeline fails if coverage < 80% lines, 75% branches, 80% functions, 80% statements | Jest/Vitest coverage threshold
-| Pipeline fails on high/critical audit findings
-| Lockfile must be frozen
-| No console.log / debugger
-| **Pipeline fails on modularization violations**
-| **Pipeline fails on barrel file / cross-domain violations**
-| **Commit messages must follow conventional commits**
+| Rule | Enforcement |
+| ----------------------------------------------------------------------------------- | -------------------------------------------------- |
+| Pipeline fails on lint errors | Exit code 1 |
+| Pipeline fails on type errors | Exit code 1 |
+| Pipeline fails on test failures | Exit code 1 |
+| Pipeline fails if coverage < 80% lines, 75% branches, 80% functions, 80% statements | Jest/Vitest coverage threshold |
+| Pipeline fails on high/critical audit findings | npm audit --audit-level=high |
+| Lockfile must be frozen | --frozen-lockfile flag |
+| No console.log / debugger | ESLint rule no-console: error (warn/error allowed) |
+| **Pipeline fails on modularization violations** | **pnpm check-modularization exit code 1** |
+| **Pipeline fails on barrel file / cross-domain violations** | **ESLint boundaries + import/no-internal-modules** |
+| **Commit messages must follow conventional commits** | **commitlint in CI (anti-bypass for --no-verify)** |
 
 ### 4.4 CI Pipeline Concurrency & Timeouts
 
@@ -222,7 +222,7 @@ tags: |
 ghcr.io/${{ github.repository }}:${{ github.sha }}
 ghcr.io/${{ github.repository }}:latest
 
-# Azure Pipelines
+# Azure Pipelines
 tags: |
 $(dockerRegistry)/$(imageRepository):$(Build.BuildId)
 $(dockerRegistry)/$(imageRepository):$(Build.SourceVersion)
@@ -231,15 +231,16 @@ tags: |
 
 ### 5.3 Deployment Strategy by Environment
 
-| Environment | Strategy
-
-| Dev
-| Staging
-| Production
+| Environment | Strategy | Max Downtime | Notes |
+| ----------- | ---------------------------- | ------------ | ------------------------------- |
+| Dev | Recreate | 1-2 minutes | Fast iteration, accept downtime |
+| Staging | Rolling update | Zero | Match production behavior |
+| Production | Rolling update or Blue/Green | Zero | Canary if high risk |
 
 ### 5.4 Health Check Endpoint Requirement
 
 Every service must expose a /health endpoint:
+
 ```json
 {
   "status": "ok",
@@ -256,14 +257,14 @@ Every service must expose a /health endpoint:
 
 Feature flags enable dark launches, A/B testing, canary releases, and instant kill-switches. All new features that touch critical paths must be deployed behind a feature flag.
 
-| Rule
-
-| Flag provider
-| Flag naming
-| Flag lifecycle
-| CI integration
-| Kill switch
-| No nested flags | Flags must not depend on other flags
+| Rule | Requirement |
+| --------------- | --------------------------------------------------------------------------- |
+| Flag provider | LaunchDarkly, Unleash, or Flagsmith (OSS) |
+| Flag naming | `kebab-case` domain prefix: `employee-new-dashboard`, `billing-v2-checkout` |
+| Flag lifecycle | Remove flag code within 2 sprints of 100% rollout |
+| CI integration | Flag creation as part of feature PR (provider API) |
+| Kill switch | Critical feature flags serve as instant rollback mechanism |
+| No nested flags | Flags must not depend on other flags |
 
 ---
 
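The flag naming rule (`kebab-case` with a domain prefix) lends itself to an automated check. Below is a minimal TypeScript sketch of one possible validator; `isValidFlagName` and the idea of passing an explicit list of known domains are assumptions made for illustration, not something the standard prescribes.

```typescript
/**
 * Hypothetical validator for the flag naming rule: lowercase kebab-case,
 * at least two segments, and a first segment that is a known domain.
 */
function isValidFlagName(flagName: string, knownDomains: readonly string[]): boolean {
  // A lowercase first segment followed by one or more "-segment" parts.
  const kebabCase = /^[a-z][a-z0-9]*(-[a-z0-9]+)+$/;
  if (!kebabCase.test(flagName)) return false;

  const domainPrefix = flagName.split("-")[0];
  return knownDomains.includes(domainPrefix);
}

// Example domain list, assumed for this sketch only.
const exampleDomains = ["employee", "billing"];
console.log(isValidFlagName("employee-new-dashboard", exampleDomains));
console.log(isValidFlagName("Billing_V2", exampleDomains));
```

A check like this could run in CI alongside flag creation, though how flags are registered depends on the chosen provider.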
@@ -305,15 +306,15 @@ CMD ["node", "dist/server.js"]
 
 ### 6.2 Dockerfile Rules
 
-| Rule
-
-| Always use -alpine base images | Smaller attack surface, smaller size
-| Multi-stage builds
-| Non-root user
-| Pin specific Node.js version
-| Include HEALTHCHECK
-| Copy only dist, not source
-| Use --chown for file ownership | Match non-root user
+| Rule | Reason |
+| ------------------------------ | -------------------------------------- |
+| Always use -alpine base images | Smaller attack surface, smaller size |
+| Multi-stage builds | Separate build deps from runtime |
+| Non-root user | Security best practice |
+| Pin specific Node.js version | Reproducibility |
+| Include HEALTHCHECK | K8s readiness/liveness probes |
+| Copy only dist, not source | Smaller images, no source code in prod |
+| Use --chown for file ownership | Match non-root user |
 
 ### 6.3 .dockerignore Template
 
@@ -356,20 +357,20 @@ docker-compose*.yml
 
 ### 7.1 Environment Matrix
 
-| Environment | Purpose
-
-| Dev
-| Staging
-| Production
+| Environment | Purpose | Deploy Trigger | Replicas | Backup |
+| ----------- | ------------------------------------- | --------------------- | --------------- | ------ |
+| Dev | Developer testing, feature validation | Merge to develop | 1 | No |
+| Staging | QA, UAT, integration testing | Merge to main | 2 | No |
+| Production | Live user-facing | Tag + manual approval | 3+ (auto-scale) | Yes |
 
 ### 7.2 Configuration Files
 
-| File
-
-| .env.example | Project root
-| .env
-| ConfigMap
-| Secrets
+| File | Location | Git-Committed? | Contains Secrets? |
+| ------------ | ----------------------- | ---------------- | ----------------- |
+| .env.example | Project root | Yes | No (placeholders) |
+| .env | Project root | No (.gitignored) | Yes |
+| ConfigMap | k8s/base/configmap.yaml | Yes | No |
+| Secrets | Secret Manager | No | Yes |
 
 ---
 
@@ -400,17 +401,17 @@ k8s/
 
 ### 8.2 Required Kubernetes Manifests
 
-| Manifest
-
-| Deployment
-| Service
-| Ingress
-| ConfigMap
-| Secret
-| HorizontalPodAutoscaler | Auto-scaling
-| PodDisruptionBudget
-| NetworkPolicy
-| Namespace
+| Manifest | Purpose | Required? |
+| ----------------------- | ---------------------------------- | -------------- |
+| Deployment | Pod deployment with rolling update | Yes |
+| Service | Internal service discovery | Yes |
+| Ingress | External HTTP/S routing | Yes |
+| ConfigMap | Non-sensitive configuration | Yes |
+| Secret | Sensitive configuration | Yes |
+| HorizontalPodAutoscaler | Auto-scaling | For production |
+| PodDisruptionBudget | Min available during updates | For production |
+| NetworkPolicy | Pod-to-pod traffic restrictions | Recommended |
+| Namespace | Environment isolation | Yes |
 
 ### 8.3 Kustomize Base Template
 
@@ -477,9 +478,9 @@ metadata:
   namespace: production
 spec:
   hard:
-    requests.cpu:
+    requests.cpu: '10'
     requests.memory: 20Gi
-    limits.cpu:
+    limits.cpu: '20'
     limits.memory: 40Gi
 ```
 
@@ -503,12 +504,12 @@ spec:
 class: nginx
 ```
 
-| Rule
-
-| Issuer
-| Renewal
-| Monitoring
-| Minimum TLS | TLS 1.2+ only; TLS 1.0/1.1 disabled
+| Rule | Requirement |
+| ----------- | ---------------------------------------------------------------------------- |
+| Issuer | Let's Encrypt via cert-manager (production). Staging issuer for dev/staging. |
+| Renewal | Auto-renew at 30 days before expiry |
+| Monitoring | Alert at < 30 days to expiry |
+| Minimum TLS | TLS 1.2+ only; TLS 1.0/1.1 disabled |
 
 ---
 
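The monitoring rule ("Alert at < 30 days to expiry") amounts to a date comparison. As a minimal TypeScript sketch, assuming the certificate's `notAfter` timestamp has already been obtained by some other means, the check might look like this; `daysUntilExpiry` and `shouldAlertOnExpiry` are hypothetical helper names, not part of cert-manager or the package.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

/** Days (possibly fractional, possibly negative) between `now` and the expiry instant. */
function daysUntilExpiry(notAfter: Date, now: Date): number {
  return (notAfter.getTime() - now.getTime()) / MS_PER_DAY;
}

/** True when fewer than `thresholdDays` remain, matching the "< 30 days" rule by default. */
function shouldAlertOnExpiry(notAfter: Date, now: Date, thresholdDays: number = 30): boolean {
  return daysUntilExpiry(notAfter, now) < thresholdDays;
}
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.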
@@ -521,6 +522,7 @@ GitHub Secrets is the **preferred** method for storing sensitive values in CI/CD
 **Where to configure**: Repo **Settings > Secrets and variables > Actions**
 
 **How to use in workflows**:
+
 ```yaml
 # Reference in pipeline
 - run: echo "${{ secrets.DOCKER_PASSWORD }}" | docker login --username "${{ secrets.DOCKER_USERNAME }}" --password-stdin
@@ -534,34 +536,36 @@ GitHub Secrets is the **preferred** method for storing sensitive values in CI/CD
 
 **Recommended GitHub Secrets per project**:
 
-| Secret Name
-
-| `DOCKER_REGISTRY`
-| `DOCKER_USERNAME`
-| `DOCKER_PASSWORD`
-| `KUBE_CONFIG_DEV`
-| `KUBE_CONFIG_STAGING` | `apiVersion: v1...`
-| `KUBE_CONFIG_PROD`
-| `SLACK_WEBHOOK`
-| `SENTRY_DSN`
-| `SONAR_TOKEN`
+| Secret Name | Example Value | Purpose |
+| --------------------- | ----------------------------- | --------------------------- |
+| `DOCKER_REGISTRY` | `ghcr.io/myorg` | Container registry URL |
+| `DOCKER_USERNAME` | `myorg-bot` | Registry login user |
+| `DOCKER_PASSWORD` | `ghp_abc...` | Registry token (GitHub PAT) |
+| `KUBE_CONFIG_DEV` | `apiVersion: v1...` | Dev K8s kubeconfig (base64) |
+| `KUBE_CONFIG_STAGING` | `apiVersion: v1...` | Staging K8s kubeconfig |
+| `KUBE_CONFIG_PROD` | `apiVersion: v1...` | Production K8s kubeconfig |
+| `SLACK_WEBHOOK` | `https://hooks.slack.com/...` | Deployment notifications |
+| `SENTRY_DSN` | `https://xxx@sentry.io/xxx` | Error tracking |
+| `SONAR_TOKEN` | `sqp_...` | SonarQube authentication |
 
 ### 9.2 GitHub Environments for Deployment Gates
 
 GitHub **Environments** provide manual approval gates for production deployments.
 
 **Setup**:
+
 1. Go to **Settings > Environments**
 2. Create `production` environment
 3. Add **Required reviewers** (2-3 senior devs/leads)
 4. Add **Deployment branches** (`main` only)
 
 **Workflow integration**:
+
 ```yaml
 jobs:
   deploy-production:
     runs-on: ubuntu-latest
-    environment: production
+    environment: production # <-- This triggers the approval gate
     steps:
       - run: ./deploy.sh
 ```
@@ -579,35 +583,38 @@ If the organization uses Azure DevOps, use Azure Key Vault via variable groups:
 
 ### 9.4 Secret Sources By Environment
 
-| Environment
-
-| Local dev
-| CI Pipeline
-| Dev K8s
-| Staging K8s
-| Production K8s | External Secrets Operator + Cloud Vault
+| Environment | Secret Source | Implementation |
+| -------------- | ------------------------------------------------------- | ------------------------ |
+| Local dev | `.env` file (gitignored) | dotenv / VITE\_\* |
+| CI Pipeline | **GitHub Secrets** (preferred) or Azure Variable Groups | Injected at runtime |
+| Dev K8s | Kubernetes Secrets (from CI) | Base64 encoded |
+| Staging K8s | K8s Secrets + External Secrets Operator | Sync from GitHub Secrets |
+| Production K8s | External Secrets Operator + Cloud Vault | Never stored in etcd |
 
 ### 9.5 GitHub Container Registry (GHCR) Setup
 
 GitHub Container Registry is the **preferred** container registry for GitHub-hosted projects.
 
 **Enable GHCR**:
+
 1. Go to **Settings > Packages** and ensure "GitHub Container Registry" is enabled
 2. Create a **Personal Access Token (PAT)**:
    - Go to **Settings > Developer settings > Personal access tokens > Fine-grained tokens**
    - Permissions required: `read:packages`, `write:packages`, `delete:packages`
 
 **Login in CI**:
+
 ```yaml
 - name: Log in to GHCR
   uses: docker/login-action@v3
   with:
     registry: ghcr.io
     username: ${{ github.actor }}
-    password: ${{ secrets.GITHUB_TOKEN }}
+    password: ${{ secrets.GITHUB_TOKEN }} # Auto-generated, no setup needed
 ```
 
 **Image naming convention**:
+
 ```
 ghcr.io/<organization>/<project>:<tag>
 
@@ -644,21 +651,21 @@ secrets.yml
 ```yaml
 steps:
   - task: Kubernetes@1
-    displayName:
+    displayName: 'Run database migrations'
     inputs:
-      command:
-      arguments:
+      command: 'exec'
+      arguments: 'deployment/app -- pnpm db:migrate'
 ```
 
 ### 10.2 Migration Rules
 
-| Rule
-
-| Migrations run AFTER new pods are deployed
-| Migrations must be backward compatible
-| Never auto-run migrations in production
-| Always run migrations in Dev and Staging first
-| Test files must not exist in migration directories | TypeORM CLI fails on test files
+| Rule | Reason |
+| -------------------------------------------------- | -------------------------------- |
+| Migrations run AFTER new pods are deployed | Old pods can still serve traffic |
+| Migrations must be backward compatible | Rollback must be possible |
+| Never auto-run migrations in production | Require manual trigger |
+| Always run migrations in Dev and Staging first | Catch issues early |
+| Test files must not exist in migration directories | TypeORM CLI fails on test files |
 
 ### 10.3 Migration Directory Convention
 
@@ -671,14 +678,14 @@ apps/backend/src/database/
 
 ### 10.4 Database Backup Strategy
 
-| Rule
-
-| Backup tool
-| Frequency
-| Retention
-| Encryption
-| Restore testing | Quarterly restore drill required; results documented
-| Offsite
+| Rule | Requirement |
+| --------------- | ------------------------------------------------------------------ |
+| Backup tool | Velero (K8s) or pg_dump (bare metal) |
+| Frequency | Production: daily full + continuous WAL archiving. Staging: daily. |
+| Retention | 30 daily, 12 monthly, 7 yearly |
+| Encryption | At rest (AES-256) + in transit (TLS) |
+| Restore testing | Quarterly restore drill required; results documented |
+| Offsite | Backups replicated to secondary region within 24 hours |
 
 ---
 
@@ -686,25 +693,25 @@ apps/backend/src/database/
 
 ### 11.1 Required Metrics & Alerts
 
-| Metric
-
-| CPU Usage
-| Memory Usage
-| HTTP 5xx Rate
-| HTTP 4xx Rate
+| Metric | Source | Alert Threshold |
+| --------------------- | ------------------- | --------------------- |
+| CPU Usage | K8s Metrics Server | >80% for 5 minutes |
+| Memory Usage | K8s Metrics Server | >85% for 5 minutes |
+| HTTP 5xx Rate | Application metrics | >1% for 5 minutes |
+| HTTP 4xx Rate | Application metrics | >5% for 5 minutes |
 | Request Latency (p95) | Application metrics | >2000ms for 5 minutes |
-| Disk Usage
-| Pod Restarts
-| Certificate Expiry
+| Disk Usage | Node exporter | >85% |
+| Pod Restarts | K8s | >3 in 10 minutes |
+| Certificate Expiry | Cert manager | <30 days |
 
 ### 11.2 Alert Notification Channels
 
-| Channel
-
-| Slack
-| PagerDuty/OpsGenie | Critical | Production outages
-| Email
-| SMS
+| Channel | Severity | Use Case |
+| ------------------ | -------- | ---------------------------------- |
+| Slack | All | General alerts |
+| PagerDuty/OpsGenie | Critical | Production outages |
+| Email | Warning | Non-urgent notifications |
+| SMS | Critical | Escalation (if PagerDuty not ackd) |
 
 ---
 
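The channel table in section 11.2 can be expressed as a routing function from alert severity to notification channels. The TypeScript sketch below is one hedged reading of it: the `AlertSeverity` type, the `"info"` level, and the `pagerAcknowledged` parameter are assumptions introduced for illustration, since the table only names "All", "Critical", and "Warning".

```typescript
// "info" is an assumed extra level standing in for anything below "warning".
type AlertSeverity = "critical" | "warning" | "info";

/**
 * Returns the channels an alert should reach, following the 11.2 table:
 * Slack gets everything, Email gets warnings, PagerDuty/OpsGenie gets
 * critical alerts, and SMS is used only when the page was not acknowledged.
 */
function channelsFor(severity: AlertSeverity, pagerAcknowledged: boolean = true): string[] {
  const channels: string[] = ["Slack"];
  if (severity === "warning") {
    channels.push("Email");
  }
  if (severity === "critical") {
    channels.push("PagerDuty/OpsGenie");
    if (!pagerAcknowledged) {
      channels.push("SMS");
    }
  }
  return channels;
}
```

How long to wait before treating a page as unacknowledged is not stated in the table, so the sketch leaves that decision to the caller.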
@@ -712,13 +719,13 @@ apps/backend/src/database/
 
 ### 12.1 Mandatory Scans
 
-| Scan
-
-| Dependency audit
-| SAST (Static Analysis) | ESLint + SonarQube
-| Container scan
-| Secret detection
-| License compliance
+| Scan | Tool | When | Fail On |
+| ---------------------- | ----------------------- | ------------------ | --------------------- |
+| Dependency audit | npm audit / pnpm audit | Every CI run | High/Critical |
+| SAST (Static Analysis) | ESLint + SonarQube | Every CI run | Blocker issues |
+| Container scan | Trivy / Snyk | Every Docker build | High/Critical CVEs |
+| Secret detection | GitLeaks / TruffleHog | Every PR | Any secret found |
+| License compliance | FOSSA / License Checker | Weekly | Non-approved licenses |
 
 ### 12.2 CI Security Steps
 
@@ -731,12 +738,12 @@ apps/backend/src/database/
 
 ### 12.3 Vulnerability Handling SLA
 
-| Severity | Response Time
-
-| Critical | < 1 hour
-| High
-| Medium
-| Low
+| Severity | Response Time | Fix Deadline |
+| -------- | ---------------- | ------------ |
+| Critical | < 1 hour | < 24 hours |
+| High | < 4 hours | < 7 days |
+| Medium | < 1 business day | < 30 days |
+| Low | < 1 week | Next release |
 
 ---
 
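The fix deadlines in the 12.3 SLA table can be turned into concrete timestamps for tracking. The following TypeScript sketch assumes calendar time (24-hour days) and treats "Next release" as having no fixed date; `VulnSeverity`, `FIX_DEADLINE_HOURS`, and `fixDeadline` are hypothetical names for illustration only.

```typescript
type VulnSeverity = "critical" | "high" | "medium" | "low";

// Fix deadlines from the 12.3 table, in hours; null means "Next release".
const FIX_DEADLINE_HOURS: Record<VulnSeverity, number | null> = {
  critical: 24,
  high: 7 * 24,
  medium: 30 * 24,
  low: null,
};

const MS_PER_HOUR = 60 * 60 * 1000;

/** Latest acceptable fix time for a finding, or null when tied to the next release. */
function fixDeadline(severity: VulnSeverity, reportedAt: Date): Date | null {
  const hours = FIX_DEADLINE_HOURS[severity];
  if (hours === null) return null;
  return new Date(reportedAt.getTime() + hours * MS_PER_HOUR);
}
```

Response times are left out of the sketch because "< 1 business day" depends on a working calendar that the table does not define.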
@@ -752,12 +759,12 @@ apps/backend/src/database/
 
 ### 13.2 Rollback Procedures
 
-| Deployment Type | Rollback Method
-
-| Kubernetes
-| Docker Compose
-| Blue/Green
-| DB migration
+| Deployment Type | Rollback Method | Time To Recover |
+| --------------- | ------------------------------------------------------------------- | --------------- |
+| Kubernetes | kubectl rollout undo deployment/app | < 2 minutes |
+| Docker Compose | docker compose down && docker compose -f docker-compose.prev.yml up | < 5 minutes |
+| Blue/Green | Switch traffic back to previous slot | < 1 minute |
+| DB migration | Run migration:revert (must be tested) | < 5 minutes |
 
 ### 13.3 Kubernetes Rollback Commands
 
@@ -791,12 +798,12 @@ pnpm db:status # Verify current state
 
 ### 14.1 Incident Severity Levels
 
-| Level | Description
-
-| SEV1
-| SEV2
-| SEV3
-| SEV4
+| Level | Description | Response | Example |
+| ----- | ------------------------- | -------------------- | --------------------- |
+| SEV1 | Complete service outage | Immediate, all hands | App is down |
+| SEV2 | Major feature degradation | < 15 minutes | Login broken |
+| SEV3 | Minor feature issue | < 1 hour | UI cosmetic bug |
+| SEV4 | Non-urgent issue | Next business day | Missing documentation |
 
 ### 14.2 Incident Response Flow
 
@@ -825,9 +832,11 @@ Post-Mortem / RCA (Within 48 hours, documented in ADR)
 ## Incident Post-Mortem
 
 ### Summary
+
 [What happened, when, impact]
 
 ### Timeline
+
 - [Time] Issue detected
 - [Time] Investigation started
 - [Time] Root cause identified
@@ -835,41 +844,45 @@ Post-Mortem / RCA (Within 48 hours, documented in ADR)
 - [Time] Service recovered
 
 ### Root Cause
+
 [Technical explanation]
 
 ### Impact
+
 - Users affected: [number]
 - Downtime: [duration]
 - Revenue impact: [amount]
 
 ### Action Items
+
 - [ ] Fix [permanent fix]
 - [ ] Add monitoring for [gap]
 - [ ] Update runbook for [procedure]
 - [ ] Schedule post-mortem review
 
 ### Prevention
+
 [How to prevent recurrence]
 ```
 
 ### 14.4 Service-Level Objectives (SLO)
 
-| Severity | Acknowledgement | Mitigation | Resolution
-
-| SEV1
-| SEV2
-| SEV3
-| SEV4
+| Severity | Acknowledgement | Mitigation | Resolution | Uptime Target |
+| -------- | --------------- | ---------- | ----------------- | ------------- |
+| SEV1 | < 5 min | < 15 min | < 1 hour | 99.9% monthly |
+| SEV2 | < 15 min | < 1 hour | < 4 hours | -- |
+| SEV3 | < 1 hour | < 4 hours | < 24 hours | -- |
+| SEV4 | < 4 hours | -- | Next business day | -- |
 
 ### 14.5 Disaster Recovery Standards
 
-| Metric
-
-| RTO (Recovery Time Objective)
-| RPO (Recovery Point Objective) | < 1 hour
-| Multi-region
-| Failover
-| Cross-region replication
+| Metric | Target |
+| ------------------------------ | ------------------------------------------------------------- |
+| RTO (Recovery Time Objective) | < 4 hours |
+| RPO (Recovery Point Objective) | < 1 hour |
+| Multi-region | Production must span at least 2 availability zones |
+| Failover | Documented runbook tested quarterly |
+| Cross-region replication | Database replication to secondary region (async, < 5 min lag) |
 
 ---
 
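The "99.9% monthly" uptime target in section 14.4 implies a downtime allowance, often called an error budget. As a worked example, assuming a 30-day month: 30 × 24 × 60 = 43,200 minutes, and 0.1% of that is about 43.2 minutes. The TypeScript sketch below performs the same arithmetic; `errorBudgetMinutes` is a hypothetical helper, and the 30-day period is an assumption, since the table does not define the length of a month.

```typescript
/**
 * Allowed downtime in minutes for a given uptime target over a period.
 * `uptimeTarget` is a fraction (0.999 for 99.9%); `periodDays` is assumed
 * by the caller, for example 30 for a nominal month.
 */
function errorBudgetMinutes(uptimeTarget: number, periodDays: number): number {
  if (uptimeTarget <= 0 || uptimeTarget > 1) {
    throw new RangeError("uptimeTarget must be in the range (0, 1]");
  }
  const totalMinutes = periodDays * 24 * 60;
  return totalMinutes * (1 - uptimeTarget);
}

// Roughly 43.2 minutes for 99.9% over 30 days (subject to floating-point rounding).
console.log(errorBudgetMinutes(0.999, 30).toFixed(1));
```

Set against the SEV1 resolution target of under 1 hour, a single worst-case SEV1 could consume more than the whole monthly allowance, which may be worth keeping in mind when reviewing these targets.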
@@ -877,22 +890,22 @@ Post-Mortem / RCA (Within 48 hours, documented in ADR)
 
 ### 15.1 Platform Decision: GitHub Actions (Recommended) vs Azure Pipelines
 
-| Factor
-
-| **Repo hosting**
-| **Setup complexity**
-| **Secret management**
-| **Container registry** | GHCR (built-in, free)
-| **Approval gates**
-| **Cost**
-| **Kubernetes deploy**
-| **Community**
+| Factor | GitHub Actions (Recommended) | Azure Pipelines |
+| ---------------------- | ------------------------------------- | ------------------------------------ |
+| **Repo hosting** | GitHub | Azure Repos / GitHub |
+| **Setup complexity** | Minimal (built-in to GitHub) | Moderate (requires Azure DevOps org) |
+| **Secret management** | GitHub Secrets (built-in) | Azure Key Vault + Variable Groups |
+| **Container registry** | GHCR (built-in, free) | Azure Container Registry (paid) |
+| **Approval gates** | GitHub Environments | Azure DevOps Deployment Gates |
+| **Cost** | Free for public / 2000 min/mo private | Free for 5 users, paid tiers |
+| **Kubernetes deploy** | `kubectl` or helm actions | KubernetesManifest task |
+| **Community** | Largest ecosystem of actions | Smaller, mostly enterprise |
 
 > **Recommendation**: Use **GitHub Actions** for all new projects. If the company standardizes on Azure DevOps, Azure Pipelines is a capable alternative. Both templates are provided below.
 
 ### 15.2 GitHub Actions (Full CI) -- Primary Option
 
-
+````yaml
 # .github/workflows/ci.yml
 name: CI Pipeline
 
@@ -1085,7 +1098,7 @@ jobs:
 curl -X POST -H "Content-type: application/json" \
 --data '{"text":":x: Deployment to Production FAILED: ${{ github.ref_name }}"}' \
 ${{ secrets.SLACK_WEBHOOK }}
-
+````
 
 ### 15.4 Azure Pipelines (Alternative Option)
 
@@ -1102,7 +1115,7 @@ trigger:
   paths:
     exclude:
       - docs/*
-      -
+      - '*.md'
 
 pr:
   branches:
@@ -1113,17 +1126,17 @@ pr:
 variables:
   - group: common-vars
   - name: nodeVersion
-    value:
+    value: '20.x'
   - name: pnpmVersion
-    value:
+    value: '9'
   - name: dockerRegistry
-    value:
+    value: 'your-acr.azurecr.io' # Change to your ACR
   - name: imageRepository
-    value:
+    value: 'myapp'
 
 stages:
   - stage: Build
-    displayName:
+    displayName: 'Build and Test'
     jobs:
       - job: BuildAndTest
         steps:
@@ -1141,56 +1154,56 @@ stages:
           - script: pnpm test:ci
           - task: PublishTestResults@2
             inputs:
-              testResultsFiles:
+              testResultsFiles: '**/junit.xml'
           - task: PublishCodeCoverageResults@1
             inputs:
-              codeCoverageTool:
-              summaryFileLocation:
+              codeCoverageTool: 'Cobertura'
+              summaryFileLocation: '**/cobertura-coverage.xml'
           - script: pnpm build
           - task: PublishBuildArtifacts@1
             inputs:
-              pathToPublish:
-              artifactName:
+              pathToPublish: '$(Build.ArtifactStagingDirectory)'
+              artifactName: 'build'
 
   - stage: DeployDev
-    displayName:
+    displayName: 'Deploy to Dev'
     dependsOn: Build
     condition: eq(variables['Build.SourceBranchName'], 'develop')
     jobs:
       - deployment: DeployDev
-        environment:
+        environment: 'dev'
         strategy:
           runOnce:
             deploy:
               steps:
                 - task: KubernetesManifest@0
                   inputs:
-                    action:
-                    kubernetesServiceConnection:
-                    namespace:
-                    manifests:
-                    containers:
+                    action: 'deploy'
+                    kubernetesServiceConnection: 'k8s-dev'
+                    namespace: 'app-dev'
+                    manifests: 'k8s/overlays/dev/*.yaml'
+                    containers: '$(dockerRegistry)/app:$(Build.BuildId)'
                 - script: |
                     kubectl exec deployment/app -n app-dev -- pnpm db:migrate
 
   - stage: DeployProduction
-    displayName:
+    displayName: 'Deploy to Production'
     dependsOn: DeployDev
     condition: and(succeeded(), eq(variables['Build.SourceBranchName'], 'main'))
     jobs:
       - deployment: DeployProduction
-        environment:
+        environment: 'production'
         strategy:
           runOnce:
             deploy:
               steps:
                 - task: KubernetesManifest@0
                   inputs:
-                    action:
-                    kubernetesServiceConnection:
-                    namespace:
-                    manifests:
-                    containers:
+                    action: 'deploy'
+                    kubernetesServiceConnection: 'k8s-prod'
+                    namespace: 'app-prod'
+                    manifests: 'k8s/overlays/prod/*.yaml'
+                    containers: '$(dockerRegistry)/app:$(Build.BuildId)'
                 - script: |
                     kubectl exec deployment/app -n app-prod -- pnpm db:migrate
 ```
@@ -1297,51 +1310,51 @@ kubectl get events --sort-by='.lastTimestamp'
 
 ### 16.2 Required DevOps Files Checklist
 
-| File
-
-| Dockerfile
-| .dockerignore
-| docker-compose.yml
-| docker-compose.dev.yml
-| k8s/base/deployment.yaml
-| k8s/base/service.yaml
-| k8s/base/ingress.yaml
-| k8s/base/kustomization.yaml
-| k8s/overlays/dev/kustomization.yaml
-| k8s/overlays/prod/kustomization.yaml | Prod overlay
-| .github/workflows/ci.yml
-| .github/workflows/cd.yml
-| azure-pipelines.yml
-| monitoring/prometheus.yml
-| scripts/health-check.sh
-| .nvmrc
+| File                                 | Purpose                     | Required For           |
+| ------------------------------------ | --------------------------- | ---------------------- |
+| Dockerfile                           | Multi-stage container build | Every project          |
+| .dockerignore                        | Exclude files from build    | Every project          |
+| docker-compose.yml                   | Local dev stack             | Every project          |
+| docker-compose.dev.yml               | Dev with hot-reload         | Multi-service projects |
+| k8s/base/deployment.yaml             | K8s deployment config       | K8s-deployed projects  |
+| k8s/base/service.yaml                | K8s service config          | K8s-deployed projects  |
+| k8s/base/ingress.yaml                | K8s ingress config          | K8s-deployed projects  |
+| k8s/base/kustomization.yaml          | Kustomize entry point       | K8s-deployed projects  |
+| k8s/overlays/dev/kustomization.yaml  | Dev overlay                 | K8s-deployed projects  |
+| k8s/overlays/prod/kustomization.yaml | Prod overlay                | K8s-deployed projects  |
+| .github/workflows/ci.yml             | CI pipeline                 | GitHub repos           |
+| .github/workflows/cd.yml             | CD pipeline                 | GitHub repos           |
+| azure-pipelines.yml                  | CI/CD pipeline              | Azure DevOps           |
+| monitoring/prometheus.yml            | Metrics config              | If using Prometheus    |
+| scripts/health-check.sh              | Health check script         | Every project          |
+| .nvmrc                               | Node version pinning        | Every project          |
 
 ### 16.3 Environment Variable Naming
 
-| Scope
-
-| Build-time (Vite)
-| Build-time (Next.js) |
-| Runtime (Node)
-| Docker
+| Scope                | Prefix           | Example             |
+| -------------------- | ---------------- | ------------------- |
+| Build-time (Vite)    | VITE\_           | VITE_API_URL        |
+| Build-time (Next.js) | NEXT_PUBLIC\_    | NEXT_PUBLIC_API_URL |
+| Runtime (Node)       | No prefix        | DATABASE_URL        |
+| Docker               | Service-specific | DB_HOST, REDIS_URL  |
 
 ### 16.4 Key CI/CD Terms
 
-| Term
-
-| CI
-| CD
-| Artifact
-| Lockfile
-| Blue/Green
-| Canary
-| Rolling Update
-| Readiness Probe | Checks if pod is ready to serve traffic
-| Liveness Probe
-| HPA
-| PDB
-| Kustomize
-| Helm
+| Term            | Definition                                                             |
+| --------------- | ---------------------------------------------------------------------- |
+| CI              | Continuous Integration -- automatically build and test every change    |
+| CD              | Continuous Delivery/Deployment -- automatically deploy to environments |
+| Artifact        | Build output (Docker image, compiled files)                            |
+| Lockfile        | File that locks dependency versions (pnpm-lock.yaml)                   |
+| Blue/Green      | Deployment strategy with two identical environments                    |
+| Canary          | Gradual rollout to a subset of users                                   |
+| Rolling Update  | Gradual replacement of pods (Kubernetes default)                       |
+| Readiness Probe | Checks if pod is ready to serve traffic                                |
+| Liveness Probe  | Checks if pod is alive (restarts if failed)                            |
+| HPA             | Horizontal Pod Autoscaler (auto-scale replicas)                        |
+| PDB             | PodDisruptionBudget (min pods during maintenance)                      |
+| Kustomize       | Kubernetes configuration customization tool                            |
+| Helm            | Kubernetes package manager                                             |
 
 ---
 
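The prefix convention in the 16.3 table can be checked mechanically. The sketch below is illustrative only: `classifyEnvVar` and `looksLikeExposedSecret` are hypothetical helper names, not part of the package, and the secret-name pattern is an assumption. It relies on the fact that Vite and Next.js expose `VITE_` and `NEXT_PUBLIC_` variables to client-side code, so a secret-looking name under either prefix is worth flagging.

```typescript
// Hypothetical helper: map a variable name to the scope implied by its prefix.
type EnvScope = "vite-build" | "next-build" | "runtime";

function classifyEnvVar(name: string): EnvScope {
  if (name.startsWith("VITE_")) return "vite-build";
  if (name.startsWith("NEXT_PUBLIC_")) return "next-build";
  return "runtime"; // no prefix: read by the Node process at runtime
}

// Assumed heuristic: build-time prefixed variables end up in client bundles,
// so names that look like credentials should not carry those prefixes.
function looksLikeExposedSecret(name: string): boolean {
  const clientExposed = classifyEnvVar(name) !== "runtime";
  return clientExposed && /(SECRET|TOKEN|PASSWORD|PRIVATE_KEY)/.test(name);
}

for (const name of ["VITE_API_URL", "NEXT_PUBLIC_API_URL", "DATABASE_URL"]) {
  console.log(`${name} -> ${classifyEnvVar(name)}`);
}
```

A check like this could run in a lint step over `.env.example`, though the source does not prescribe one.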
@@ -1369,12 +1382,12 @@ kubectl get events --sort-by='.lastTimestamp'
 
 **Four enforcement layers protect every rule in this document**:
 
-| Layer
-
-| L1 Editor
-| L2 Pre-commit
-| L3 CI Pipeline
-| L4 Post-deploy
+| Layer          | Tool                       | Trigger      | Purpose                           |
+| -------------- | -------------------------- | ------------ | --------------------------------- |
+| L1 Editor      | VS Code + ESLint extension | Every save   | Instant feedback while coding     |
+| L2 Pre-commit  | Husky + lint-staged        | Every commit | Block bad code before git history |
+| L3 CI Pipeline | GitHub Actions             | Every PR     | Enforce before merge              |
+| L4 Post-deploy | Smoke tests + monitoring   | Every deploy | Catch runtime issues              |
 
 ---
 
@@ -1490,12 +1503,12 @@ The `git commit --no-verify` (or `-n`) flag skips local pre-commit and commit-ms
 
 **`--no-verify` Policy**:
 
-| Bypass Attempt
-
-| `git commit --no-verify`
-| `git push --no-verify`
-| Removing `.husky/` directory
-| Disabling husky in `package.json` | CI `pnpm install` re-installs husky; `pnpm prepare` re-creates hooks.
+| Bypass Attempt                    | Result                                                                                                       |
+| --------------------------------- | ------------------------------------------------------------------------------------------------------------ |
+| `git commit --no-verify`          | Commit lands locally. CI runs all skipped checks on push. PR blocked if failing.                             |
+| `git push --no-verify`            | Push succeeds. CI pre-push checks (lint + modularization + types + tests) run anyway. PR blocked if failing. |
+| Removing `.husky/` directory      | Caught in code review -- `.husky/` is committed. CI lint detects missing config.                             |
+| Disabling husky in `package.json` | CI `pnpm install` re-installs husky; `pnpm prepare` re-creates hooks.                                        |
 
 > **No commit reaches `main` or `develop` without passing every standard check, regardless of local hook bypass attempts.**
 
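The policy above works because CI repeats the commit-message check that a local `--no-verify` skips. As a rough illustration of what that server-side check has to decide, here is a simplified header validator. It is a sketch, not commitlint's implementation: the regular expression is a reduced approximation of the Conventional Commits header, the type list is assumed to match the common `config-conventional` defaults, and `isConventionalHeader` and `findNonConventional` are hypothetical names.

```typescript
// Assumed type list; a real project takes this from its commitlint config.
const CONVENTIONAL_HEADER =
  /^(build|chore|ci|docs|feat|fix|perf|refactor|revert|style|test)(\([a-z0-9._-]+\))?!?: \S.*$/;

// True when the first line of a commit message has the form "type(scope): subject".
function isConventionalHeader(header: string): boolean {
  return CONVENTIONAL_HEADER.test(header);
}

// Returns the headers that CI would reject, in their original order.
function findNonConventional(headers: string[]): string[] {
  return headers.filter((header) => !isConventionalHeader(header));
}

const rejected = findNonConventional([
  "feat(api): add health endpoint",
  "fixed stuff",
  "fix!: drop legacy route",
]);
console.log(rejected);
```

Because the check runs over the whole PR range in CI, a commit created with `--no-verify` is still evaluated before merge.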
@@ -1569,16 +1582,16 @@ Every project must have `.github/workflows/ci.yml`. A partial or missing workflo
 
 ### 20.2 Mandatory CI Checks
 
-| Step
-
-| 1. Lint
-| 2. Modularization
-| 3. Commit Messages | `pnpm exec commitlint --from origin/${{ github.base_ref }} --to HEAD` | Non-conventional commits (anti-bypass)
-| 4. Type Check
-| 5. Format Check
-| 6. Unit Tests
-| 7. Security Audit
-| 8. Build
+| Step               | Command                                                               | Fails On                                                                          | Time Limit |
+| ------------------ | --------------------------------------------------------------------- | --------------------------------------------------------------------------------- | ---------- |
+| 1. Lint            | `pnpm lint`                                                           | Any ESLint error (including boundaries + `no-internal-modules`)                   | 2 min      |
+| 2. Modularization  | `pnpm check-modularization`                                           | Flat files at root, missing barrel files, cross-domain imports                    | 30s        |
+| 3. Commit Messages | `pnpm exec commitlint --from origin/${{ github.base_ref }} --to HEAD` | Non-conventional commits (anti-bypass)                                            | 10s        |
+| 4. Type Check      | `pnpm check-types`                                                    | Any TS type error                                                                 | 1 min      |
+| 5. Format Check    | `pnpm format:check`                                                   | Any formatting difference                                                         | 30s        |
+| 6. Unit Tests      | `pnpm test:ci`                                                        | Test failure OR coverage < 80% lines, 75% branches, 80% functions, 80% statements | 5 min      |
+| 7. Security Audit  | `pnpm audit --audit-level=high`                                       | High/Critical CVEs                                                                | 1 min      |
+| 8. Build           | `pnpm build`                                                          | Build failure                                                                     | 3 min      |
 
 **Total target CI duration**: < 10 minutes
 
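The per-step limits in the 20.2 table and the overall target can be compared with a little arithmetic. The sketch below sums the limits as if every step ran sequentially and hit its ceiling; `parseLimitSeconds` and `worstCaseSequentialSeconds` are hypothetical names, and the sequential assumption is mine, not the source's.

```typescript
// Per-step time limits copied from the 20.2 table.
const stepLimits: Record<string, string> = {
  Lint: "2 min",
  Modularization: "30s",
  "Commit Messages": "10s",
  "Type Check": "1 min",
  "Format Check": "30s",
  "Unit Tests": "5 min",
  "Security Audit": "1 min",
  Build: "3 min",
};

// Converts "2 min" or "30s" to seconds; throws on any other format.
function parseLimitSeconds(limit: string): number {
  const match = /^(\d+)\s*(min|s)$/.exec(limit.trim());
  if (match === null) throw new Error(`Unrecognized limit: ${limit}`);
  const amount = Number(match[1]);
  return match[2] === "min" ? amount * 60 : amount;
}

// Worst case if steps run one after another and each uses its full limit.
function worstCaseSequentialSeconds(limits: Record<string, string>): number {
  return Object.values(limits).reduce(
    (total, limit) => total + parseLimitSeconds(limit),
    0,
  );
}

const TARGET_SECONDS = 10 * 60;
const worstCase = worstCaseSequentialSeconds(stepLimits);
console.log(`worst case ${worstCase}s vs target ${TARGET_SECONDS}s`);
```

The ceilings add up to 790 seconds (13 min 10 s), so the sub-10-minute target presumably relies on steps finishing well under their limits or on some steps running in parallel jobs.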
@@ -1727,25 +1740,25 @@ jobs:
 
 ### 21.1 Rules That Must Be Enforced (Every Project)
 
-| Rule
-
-| No ESLint errors
-| No TypeScript errors
-| No formatting errors
-| Tests pass
-| Coverage >= 80% lines, 75% branches, 80% functions, 80% statements
-| No high/critical CVEs
-| Conventional commits
-| **Domain modularization**
-| **Barrel files present**
-| **No deep imports bypassing barrels**
-| Lockfile committed
-| No secrets in code
-| Naming (camelCase, etc)
-| No unused vars
-| Single letters banned
-| **Anti-bypass (--no-verify)**
-| **PR template compliance**
+| Rule                                                               | Enforcement                                         | Tool                                         | Severity  |
+| ------------------------------------------------------------------ | --------------------------------------------------- | -------------------------------------------- | --------- |
+| No ESLint errors                                                   | Exit code != 0                                      | ESLint                                       | BLOCK     |
+| No TypeScript errors                                               | Exit code != 0                                      | TypeScript                                   | BLOCK     |
+| No formatting errors                                               | Exit code != 0                                      | Prettier                                     | BLOCK     |
+| Tests pass                                                         | Exit code != 0                                      | Vitest / Jest                                | BLOCK     |
+| Coverage >= 80% lines, 75% branches, 80% functions, 80% statements | Threshold fail                                      | Vitest coverage                              | BLOCK     |
+| No high/critical CVEs                                              | Exit code != 0                                      | pnpm audit                                   | BLOCK     |
+| Conventional commits                                               | Message reject                                      | Commitlint                                   | BLOCK     |
+| **Domain modularization**                                          | **Exit code != 0**                                  | **check-modularization + ESLint boundaries** | **BLOCK** |
+| **Barrel files present**                                           | **Exit code != 0**                                  | **check-modularization**                     | **BLOCK** |
+| **No deep imports bypassing barrels**                              | **ESLint error**                                    | **import/no-internal-modules**               | **BLOCK** |
+| Lockfile committed                                                 | CI check                                            | husky + git                                  | BLOCK     |
+| No secrets in code                                                 | CI scan                                             | GitLeaks / ESLint                            | BLOCK     |
+| Naming (camelCase, etc)                                            | CI lint error                                       | ESLint                                       | BLOCK     |
+| No unused vars                                                     | CI lint error                                       | ESLint                                       | BLOCK     |
+| Single letters banned                                              | CI lint error                                       | ESLint id-length                             | BLOCK     |
+| **Anti-bypass (--no-verify)**                                      | **CI runs all checks regardless**                   | **GitHub Actions**                           | **BLOCK** |
+| **PR template compliance**                                         | **PR must follow .github/PULL_REQUEST_TEMPLATE.md** | **Code review + CI checklist gates**         | **BLOCK** |
 
 ### 21.2 New Project Validation Checklist
 
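For the coverage rule in the 21.1 table, the four thresholds are independent: a run fails if any single metric falls below its floor. The sketch below spells out that comparison. It is an illustration under assumptions, since in practice the test runner's own coverage threshold setting performs this check; `failingMetrics` and the `CoverageSummary` shape are hypothetical.

```typescript
type CoverageMetric = "lines" | "branches" | "functions" | "statements";
type CoverageSummary = Record<CoverageMetric, number>;

// Minimum percentages copied from the coverage rule in 21.1.
const THRESHOLDS: CoverageSummary = {
  lines: 80,
  branches: 75,
  functions: 80,
  statements: 80,
};

// Returns every metric whose measured percentage is below its threshold.
function failingMetrics(summary: CoverageSummary): CoverageMetric[] {
  const metrics = Object.keys(THRESHOLDS) as CoverageMetric[];
  return metrics.filter((metric) => summary[metric] < THRESHOLDS[metric]);
}

const failures = failingMetrics({
  lines: 85,
  branches: 74.9,
  functions: 80,
  statements: 91,
});
console.log(failures.length === 0 ? "coverage ok" : `below threshold: ${failures.join(", ")}`);
```

A value exactly at the threshold passes, matching the `>=` wording of the rule.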
@@ -1781,6 +1794,7 @@ If an existing project is found not meeting these standards:
 4. **Week 4**: Enable branch protection, require PR reviews, merge only compliant code
 
 ---
+
 ## Appendix B. CI/CD Cost & Pricing Reference (June 2026)
 
 > Use this to estimate platform costs when proposing a new project or evaluating vendors. Prices are mid-2026 commercial estimates and vary by vendor, region, and contract terms.
@@ -1799,40 +1813,40 @@ Use this table to compare partner-inclusive pricing vs. standard list pricing wh
 
 ### B.1 Estimated Annual Costs by Project Scale
 
-| Project Scale
-
-| Startup / MVP
-| Growth / Series A
-| Scaleup / Enterprise | ~100,000+ min/mo
-| Regulated / Fintech
+| Project Scale        | Runners / Minutes | Container Registry         | Secrets / Vault          | Monitoring             | Total Est. Annual          |
+| -------------------- | ----------------- | -------------------------- | ------------------------ | ---------------------- | -------------------------- |
+| Startup / MVP        | ~5,000 min/mo     | GHCR free tier             | GitHub Secrets included  | Free OSS               | ~USD 500 – USD 3,000       |
+| Growth / Series A    | ~25,000 min/mo    | GHCR + ACR                 | Key Vault / Doppler      | Sentry / Grafana Cloud | ~USD 5,000 – USD 25,000    |
+| Scaleup / Enterprise | ~100,000+ min/mo  | ACR/ECR/GCR + cache        | Key Vault / Doppler      | Datadog / New Relic    | ~USD 30,000 – USD 120,000+ |
+| Regulated / Fintech  | ~50,000 min/mo    | Private registry + signing | Dedicated HSM / Azure KV | ELK / Splunk + APM     | ~USD 50,000 – USD 200,000+ |
 
 ### B.2 Tooling & License Costs (Annual)
 
-| Tool / Service
-
-| GitHub Actions (private)
-| Azure DevOps
-| GitLab.com
-| SonarCloud / SonarQube
-| Snyk / Trivy / Dependabot
-| Datadog / New Relic / Azure Monitor | APM, logs, traces, RUM
-| Grafana Cloud
-| Slack / Teams Notifications
-| HashiCorp Vault / Azure Key Vault
-| Terraform Cloud / Bicep / Pulumi
+| Tool / Service                      | Purpose                         | Typical Tier                                   | Est. Cost                  | Notes                                                      |
+| ----------------------------------- | ------------------------------- | ---------------------------------------------- | -------------------------- | ---------------------------------------------------------- |
+| GitHub Actions (private)            | CI minutes beyond free tier     | 2,000 min free; then ~USD 0.008/min            | USD 2,400 – USD 10,000/yr  | Cache and matrix strategies dramatically affect cost.      |
+| Azure DevOps                        | Pipelines, Artifacts, Boards    | Basic: free; Parallel jobs: ~USD 40/job/mo     | USD 0 – USD 8,000/yr       | Parallel jobs are the main cost driver.                    |
+| GitLab.com                          | CI/CD minutes and storage       | Free tier 400 min/mo; Premium ~USD 19/user/mo  | USD 0 – USD 10,000/yr      | Self-managed runners shift cost to infra.                  |
+| SonarCloud / SonarQube              | Static analysis, quality gate   | Developer: ~USD 25/dev/mo; Enterprise annually | USD 2,000 – USD 15,000/yr  | SonarCloud SaaS vs self-hosted on same infra.              |
+| Snyk / Trivy / Dependabot           | Dependency & container scanning | Snyk Team: ~USD 19/dev/mo; Trivy: free         | USD 0 – USD 5,000/yr       | Trivy + Dependabot cover most needs at zero marginal cost. |
+| Datadog / New Relic / Azure Monitor | APM, logs, traces, RUM          | Host/ingestion based                           | USD 1,500 – USD 10,000+/yr | Datadog is easy to adopt; Grafana Cloud is cheaper.        |
+| Grafana Cloud                       | Metrics + dashboards            | Free tier; Pro: ~USD 8–29/user/mo              | USD 100 – USD 3,000/yr     | Good value for metrics-heavy setups.                       |
+| Slack / Teams Notifications         | Deployment notifications        | Included in workspace                          | Usually free               | Webhook costs are typically negligible.                    |
+| HashiCorp Vault / Azure Key Vault   | Secret management               | Small tier / per-secret pricing                | USD 200 – USD 1,500/yr     | Avoids expensive breach remediation.                       |
+| Terraform Cloud / Bicep / Pulumi    | IaC collaboration + state       | Free tier; Standard ~USD 0.011/run             | USD 100 – USD 1,000/yr     | Workspace isolation and RBAC are worth the cost.           |
 
 ### B.3 Infrastructure Costs (Monthly)
 
-| Environment
-
-| Dev (1 replica, no HA)
-| Staging (2 replicas)
-| Production (3+ replicas, autoscale) | 2 vCPU / 4 GB RAM + LB
-| Database (managed Postgres)
-| Redis / Valkey (managed)
-| Object Storage
-| Load Balancer / Ingress
-| DNS, certs, monitoring
+| Environment                         | Typical Specs           | Est. Monthly                         | Annual Est.            |
+| ----------------------------------- | ----------------------- | ------------------------------------ | ---------------------- |
+| Dev (1 replica, no HA)              | 1 vCPU / 2 GB RAM       | USD 30 – USD 120                     | USD 360 – USD 1,440    |
+| Staging (2 replicas)                | 2 vCPU / 4 GB RAM each  | USD 100 – USD 300                    | USD 1,200 – USD 3,600  |
+| Production (3+ replicas, autoscale) | 2 vCPU / 4 GB RAM + LB  | USD 250 – USD 900                    | USD 3,000 – USD 10,800 |
+| Database (managed Postgres)         | 2 vCPU / 8 GB + storage | USD 120 – USD 500                    | USD 1,440 – USD 6,000  |
+| Redis / Valkey (managed)            | 1 vCPU / 2 GB           | USD 15 – USD 50                      | USD 180 – USD 600      |
+| Object Storage                      | 1 TB + egress           | USD 20 – USD 80                      | USD 240 – USD 960      |
+| Load Balancer / Ingress             | Basic LB                | USD 15 – USD 30                      | USD 180 – USD 360      |
+| DNS, certs, monitoring              | Managed                 | Included in hosting or minimal extra | Often < USD 50/mo      |
 
 > These are mid-2026 market reference prices. Reserved/commitment terms and savings plans can reduce compute costs by 20%–40%.
 
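The GitHub Actions row in B.2 and the monthly-to-annual columns in B.3 follow simple arithmetic that is easy to reproduce when sizing a new project. The sketch below assumes a flat per-minute rate after the free allowance, using the figures quoted in B.2; real billing depends on runner type and plan, and `monthlyOverageUsd` and `annualize` are hypothetical names. Prices are held in thousandths of a dollar to avoid floating-point drift.

```typescript
// Assumed figures, taken from the GitHub Actions row in B.2.
const FREE_MINUTES_PER_MONTH = 2000;
const PRICE_MILLS_PER_MINUTE = 8; // USD 0.008, kept as thousandths of a dollar

// Cost in USD of the minutes used beyond the free allowance in one month.
function monthlyOverageUsd(minutesPerMonth: number): number {
  const billableMinutes = Math.max(0, minutesPerMonth - FREE_MINUTES_PER_MONTH);
  return (billableMinutes * PRICE_MILLS_PER_MINUTE) / 1000;
}

// Same conversion the B.3 table uses: annual estimate = monthly estimate x 12.
function annualize(monthlyUsd: number): number {
  return monthlyUsd * 12;
}

for (const minutes of [5000, 25000, 100000]) {
  console.log(`${minutes} min/mo -> USD ${annualize(monthlyOverageUsd(minutes))}/yr`);
}
```

Under these assumptions the Growth and Scaleup volumes from B.1 (25,000 and 100,000 min/mo) come to USD 2,208 and USD 9,408 per year, close to the USD 2,400 to USD 10,000 range quoted in B.2.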
@@ -1840,29 +1854,29 @@ Use this table to compare partner-inclusive pricing vs. standard list pricing wh
 
 ### B.4 Open-Source vs Commercial Alternatives (2026)
 
-| Need
-
-| CI/CD Platform
-| Static Analysis
-| Dependency Scanning | Snyk
-| Container Registry
-| Secret Management
-| Monitoring/APM
-| Log Aggregation
-| IaC Platform
-| Deployment Gates
+| Need                | Commercial Option                                    | OSS Alternative                                                                               | Est. Savings                         | Trade-offs / Notes                                                                                                     |
+| ------------------- | ---------------------------------------------------- | --------------------------------------------------------------------------------------------- | ------------------------------------ | ---------------------------------------------------------------------------------------------------------------------- |
+| CI/CD Platform      | GitHub Actions / Azure DevOps / GitLab.com           | **Woodpecker**, **Gitea Actions**, **Jenkins**, **Drone CI**                                  | Up to ~USD 10,000+/yr on high volume | OSS needs self-hosted runners/infra. Woodpecker/Gitea give closest DX to GitHub Actions without cloud minute bills.    |
+| Static Analysis     | SonarCloud / SonarQube Enterprise                    | **SonarQube Community Edition**, **ESLint + TypeScript strict + Coverage**, GitHub **CodeQL** | ~USD 2,000 – USD 15,000/yr           | CodeQL is free for public and private repos on GitHub. SonarQube CE is fully capable for most teams.                   |
+| Dependency Scanning | Snyk                                                 | **Trivy**, **OWASP Dependency-Check**, GitHub Dependabot, **Renovate**                        | ~USD 0 – USD 5,000/yr                | Trivy + Dependabot cover most needs at zero license cost. Renovate is best when you want automated PR updates too.     |
+| Container Registry  | Azure Container Registry / AWS ECR                   | **GHCR** (GitHub Container Registry), **Harbor**                                              | ~USD 100 – USD 1,000/yr              | GHCR is free with GitHub Actions. Harbor is best for air-gapped / on-prem / high egress cost scenarios.                |
+| Secret Management   | HashiCorp Vault Enterprise / Azure Key Vault premium | **HashiCorp Vault OSS**, **SOPS + age/GPG**, **Doppler CLI (free tier)**                      | ~USD 200 – USD 1,500/yr              | Vault OSS is mature. SOPS is ideal when secrets are already in Git via encryption.                                     |
+| Monitoring/APM      | Datadog / New Relic / Azure Monitor                  | **Prometheus + Grafana**, **SigNoz**, **Uptime Kuma**, **Grafana Cloud Free/Pro**             | ~USD 1,500 – USD 10,000+/yr          | Grafana Cloud Pro is still far cheaper than Datadog. Self-hosted Prometheus saves the most but needs operator skill.   |
+| Log Aggregation     | Splunk / Datadog Logs / ELK Cloud                    | **Loki + Promtail**, **OpenSearch**, **Vector + ClickHouse**                                  | ~USD 1,000 – USD 8,000/yr            | Loki is the easiest OSS drop-in for Kubernetes logging. OpenSearch is heavier but more capable search.                 |
+| IaC Platform        | Terraform Cloud / Pulumi Business                    | **Terraform OSS + local backend/S3**, **OpenTofu**                                            | ~USD 100 – USD 1,000/yr              | OpenTofu is the open-source fork and is now widely adopted. State in S3 + DynamoDB lock is still the cheapest backend. |
+| Deployment Gates    | Azure DevOps Gates / Cloud Native features           | **OpenPolicyAgent / Gatekeeper**, **Kyverno**, GitHub Environments + required reviewers       | Often free                           | Policy-as-code replaces many manual gate features at no license cost.                                                  |
 
 ### B.5 OSS Selection Cheat Sheet
 
-| Solution
-
-| Woodpecker / Gitea Actions
-| Prometheus + Grafana
-| SonarQube CE / CodeQL
-| Trivy + Dependabot
-| Harbor
-| OpenTofu / Terraform OSS
-| Sentry Self-Hosted / GlitchTip | Medium
+| Solution                       | Setup Effort | Ongoing Maintenance | When It Wins                                            | When to Buy Commercial                                                             |
+| ------------------------------ | ------------ | ------------------- | ------------------------------------------------------- | ---------------------------------------------------------------------------------- |
+| Woodpecker / Gitea Actions     | Medium       | Medium              | Heavy CI usage, private repo cost control               | When you want zero per-minute billing and out-of-box hosted UX                     |
+| Prometheus + Grafana           | Medium       | Medium              | Metrics-heavy infra with in-house SRE knowledge         | When support SLAs and fully hosted dashboards are mandatory                        |
+| SonarQube CE / CodeQL          | Low          | Low                 | Security + quality gates with minimal licensing         | When you need enterprise support contracts or custom plugins                       |
+| Trivy + Dependabot             | Low          | Low                 | Dependency and container scanning                       | When you want integrated vendor support and policy enforcement                     |
+| Harbor                         | Medium       | Medium              | Air-gapped/on-prem, large images, compliance            | When cloud registries are blocked by policy                                        |
+| OpenTofu / Terraform OSS       | Medium       | Medium              | IaC state and collaboration                             | When workspaces, RBAC, and managed runs justify SaaS pricing                       |
+| Sentry Self-Hosted / GlitchTip | Medium       | Medium              | High error volume, sensitive logs, privacy requirements | When incident response support and uptime guarantees matter more than license cost |
 
 ### B.6 Sources & Methodology
 
@@ -1884,5 +1898,5 @@ Use this table to compare partner-inclusive pricing vs. standard list pricing wh
 - Azure Hybrid Benefit: https://azure.microsoft.com/en-us/pricing/benefits/azure-hybrid-benefit/
 - Visual Studio subscriptions: https://visualstudio.microsoft.com/subscriptions/
 - FastTrack: https://azure.microsoft.com/en-us/programs/azure-fasttrack/
-
+Actual entitlement depends on partner tier, agreement, and region; treat these as **items to explore** with your account team.
 - Infrastructure estimates assume managed cloud VM/service pricing with no reserved/commitment discounts by default; see cloud provider calculators for exact region-specific pricing.