codex-genesis-harness 0.1.4 → 0.1.6
- package/.codebase/ARCHITECTURE_REVIEW_COMPLETE.md +216 -216
- package/.codebase/CURRENT_STATE.md +9 -7
- package/.codebase/FILE_NAMING_CLARIFICATION.md +161 -161
- package/.codebase/HARNESS_COMPLETENESS_AUDIT.md +613 -613
- package/.codebase/IMPLEMENTATION_COMPLETE.md +429 -429
- package/.codebase/IMPLEMENTATION_HANDOFF.md +351 -351
- package/.codebase/IMPROVEMENTS_SUMMARY.md +419 -419
- package/.codebase/PHASE3_SKILLS_NAMING_COMPLETE.md +292 -292
- package/.codebase/PHASE_DEPENDENCY_MAP.md +486 -486
- package/.codebase/QUICK_START_SPEC_IMPACT.md +456 -456
- package/.codebase/README.md +139 -139
- package/.codebase/RECOVERY_POINTS.md +438 -438
- package/.codebase/state.json +37 -0
- package/.codex/skills/genesis-api-sync/SKILL.md +354 -354
- package/.codex/skills/genesis-api-sync/checklists/api-sync-checklist.md +101 -101
- package/.codex/skills/genesis-api-sync/templates/api-change-template.md +257 -257
- package/.codex/skills/genesis-debug-guide/SKILL.md +479 -479
- package/.codex/skills/genesis-debug-guide/checklists/flaky-test-investigation.md +339 -339
- package/.codex/skills/genesis-debug-guide/checklists/production-bug-debug.md +210 -210
- package/.codex/skills/genesis-debug-guide/checklists/test-failure-debug.md +158 -158
- package/.codex/skills/genesis-debug-guide/observability/debug-commands.md +365 -365
- package/.codex/skills/genesis-debug-guide/playbooks/unit-test-failures.md +289 -289
- package/.codex/skills/genesis-debug-guide/templates/debug-investigation-log.md +288 -288
- package/.codex/skills/genesis-docs-automation/SKILL.md +1003 -1003
- package/.codex/skills/genesis-docs-automation/checklists/docs-validation.md +359 -359
- package/.codex/skills/genesis-docs-automation/checklists/spec-alignment.md +312 -312
- package/.codex/skills/genesis-docs-automation/observability/docs-tracking.md +382 -382
- package/.codex/skills/genesis-docs-automation/playbooks/auto-update-flow.md +851 -851
- package/.codex/skills/genesis-docs-automation/playbooks/changelog-generation.md +491 -491
- package/.codex/skills/genesis-docs-automation/templates/changelog-entry-template.md +187 -187
- package/.codex/skills/genesis-docs-automation/templates/handoff-template.md +297 -297
- package/.codex/skills/genesis-harness/SKILL.md +1427 -1418
- package/.codex/skills/genesis-harness/agents/openai.yaml +7 -7
- package/.codex/skills/genesis-harness/checklists/bug-fix-qa.md +169 -169
- package/.codex/skills/genesis-harness/checklists/new-feature-qa.md +157 -157
- package/.codex/skills/genesis-harness/checklists/refactor-qa.md +216 -216
- package/.codex/skills/genesis-harness/checklists/requirements-validation.md +211 -211
- package/.codex/skills/genesis-harness/references/planning-schema.md +35 -35
- package/.codex/skills/genesis-harness/references/quality-rubric.md +21 -21
- package/.codex/skills/genesis-harness/references/research-rubric.md +41 -41
- package/.codex/skills/genesis-harness/references/workflows.md +33 -33
- package/.codex/skills/genesis-harness/resources/agents-template.md +27 -27
- package/.codex/skills/genesis-harness/resources/api-docs-template.md +32 -32
- package/.codex/skills/genesis-harness/resources/architecture-template.md +30 -30
- package/.codex/skills/genesis-harness/resources/audit-template.md +26 -26
- package/.codex/skills/genesis-harness/resources/bug-template.md +34 -34
- package/.codex/skills/genesis-harness/resources/change-impact-matrix-template.md +204 -204
- package/.codex/skills/genesis-harness/resources/check-template.md +21 -21
- package/.codex/skills/genesis-harness/resources/conventions-template.md +42 -42
- package/.codex/skills/genesis-harness/resources/decision-template.md +33 -33
- package/.codex/skills/genesis-harness/resources/design-template.md +26 -26
- package/.codex/skills/genesis-harness/resources/escalation-template.md +21 -21
- package/.codex/skills/genesis-harness/resources/feature-template.md +49 -49
- package/.codex/skills/genesis-harness/resources/foundation-phase-template.md +131 -131
- package/.codex/skills/genesis-harness/resources/integrations-template.md +32 -32
- package/.codex/skills/genesis-harness/resources/journeys-template.md +13 -13
- package/.codex/skills/genesis-harness/resources/lessons-learned-template.md +12 -12
- package/.codex/skills/genesis-harness/resources/observability-template.md +34 -34
- package/.codex/skills/genesis-harness/resources/phase-00-foundation-template.md +76 -76
- package/.codex/skills/genesis-harness/resources/phase-template.md +34 -34
- package/.codex/skills/genesis-harness/resources/pitfalls-template.md +22 -22
- package/.codex/skills/genesis-harness/resources/planning-tree-template.md +39 -39
- package/.codex/skills/genesis-harness/resources/post-implementation-guide.md +347 -347
- package/.codex/skills/genesis-harness/resources/project-template.md +38 -38
- package/.codex/skills/genesis-harness/resources/quality-score-template.md +11 -11
- package/.codex/skills/genesis-harness/resources/requirements-template.md +26 -26
- package/.codex/skills/genesis-harness/resources/research-template.md +26 -26
- package/.codex/skills/genesis-harness/resources/review-template.md +22 -22
- package/.codex/skills/genesis-harness/resources/spec-changelog-template.md +6 -6
- package/.codex/skills/genesis-harness/resources/stack-template.md +33 -33
- package/.codex/skills/genesis-harness/resources/verification-template.md +26 -26
- package/.codex/skills/genesis-harness/scripts/check-architecture-boundaries.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/check-docs-sync.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/check-no-debug-logs.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/check-required-planning-files.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/check-spec-changelog.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/check-task-tracking.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/compact-context.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/create-adr.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/create-bug.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/create-feature.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/detect-stack.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/init-planning.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/list-changed-files.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/offload-log.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/run-verification.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/run-verify-loop.sh +0 -0
- package/.codex/skills/genesis-harness/scripts/update-state.sh +0 -0
- package/.codex/skills/genesis-mvp-planning/SKILL.md +114 -0
- package/.codex/skills/genesis-mvp-planning/agents/openai.yaml +6 -0
- package/.codex/skills/genesis-mvp-planning/checklists/mvp-readiness.md +18 -0
- package/.codex/skills/genesis-mvp-planning/examples/5-phase-roadmap-example.md +43 -0
- package/.codex/skills/genesis-mvp-planning/templates/phase-1-core.md +17 -0
- package/.codex/skills/genesis-mvp-planning/templates/phase-2-auth.md +17 -0
- package/.codex/skills/genesis-mvp-planning/templates/phase-3-features.md +17 -0
- package/.codex/skills/genesis-mvp-planning/templates/phase-4-integrations.md +17 -0
- package/.codex/skills/genesis-mvp-planning/templates/phase-5-readiness.md +17 -0
- package/.codex/skills/genesis-new-design/agents/openai.yaml +3 -3
- package/.codex/skills/genesis-observability-automation/checklists/.gitkeep +0 -0
- package/.codex/skills/genesis-observability-automation/observability/.gitkeep +0 -0
- package/.codex/skills/genesis-observability-automation/playbooks/.gitkeep +0 -0
- package/.codex/skills/genesis-observability-automation/templates/.gitkeep +0 -0
- package/.codex/skills/genesis-release-orchestration/SKILL.md +653 -653
- package/.codex/skills/genesis-release-orchestration/checklists/post-deployment-verification.md +274 -274
- package/.codex/skills/genesis-release-orchestration/checklists/pre-release-validation.md +220 -220
- package/.codex/skills/genesis-release-orchestration/observability/release-tracking.md +253 -253
- package/.codex/skills/genesis-release-orchestration/playbooks/canary-deployment-orchestration.md +472 -472
- package/.codex/skills/genesis-release-orchestration/playbooks/semantic-versioning-automation.md +494 -494
- package/.codex/skills/genesis-release-orchestration/templates/deployment-strategy-template.md +303 -303
- package/.codex/skills/genesis-release-orchestration/templates/release-runbook-template.md +420 -420
- package/.codex/skills/genesis-research-first/SKILL.md +237 -237
- package/.codex/skills/genesis-research-first/templates/.gitkeep +0 -0
- package/.codex/skills/genesis-spec-propagation/SKILL.md +534 -534
- package/.codex/skills/genesis-spec-propagation/checklists/phase-update-verification.md +384 -384
- package/.codex/skills/genesis-spec-propagation/checklists/spec-change-detection.md +257 -257
- package/.codex/skills/genesis-spec-propagation/observability/propagation-tracking.md +373 -373
- package/.codex/skills/genesis-spec-propagation/playbooks/breaking-change-propagation.md +692 -692
- package/.codex/skills/genesis-spec-propagation/playbooks/feature-change-propagation.md +434 -434
- package/.codex/skills/genesis-spec-propagation/templates/migration-guide-template.md +407 -407
- package/.codex/skills/genesis-state-machine/SKILL.md +34 -0
- package/.codex/skills/genesis-upgrade-design/agents/openai.yaml +3 -3
- package/.codex/skills/spec-impact-engine/SKILL.md +504 -504
- package/.codex/skills/spec-impact-engine/detect-spec-changes.sh +0 -0
- package/.codex-plugin/plugin.json +24 -24
- package/CHANGELOG.md +42 -0
- package/LICENSE +22 -22
- package/README.EN.md +784 -719
- package/README.VI.md +776 -712
- package/README.md +113 -253
- package/VERSION +2 -2
- package/bin/genesis-harness.js +90 -87
- package/package.json +68 -43
- package/scripts/README.md +342 -342
- package/scripts/compact-context.sh +0 -0
- package/scripts/contract_integrity_gate.js +83 -0
- package/scripts/detect-changes.sh +0 -0
- package/scripts/healing_telemetry.js +118 -0
- package/scripts/install.sh +4 -1
- package/scripts/offload-log.sh +0 -0
- package/scripts/prompt_sentinel.js +84 -0
- package/scripts/run-evals.sh +1 -0
- package/scripts/run-verify-loop.sh +11 -0
- package/scripts/spec_visual_sync.js +157 -0
- package/scripts/test_generator.js +142 -0
- package/scripts/transition_state.sh +67 -0
- package/scripts/uninstall.sh +1 -0
- package/scripts/validation_gates.sh +85 -0
- package/scripts/verify.sh +5 -0
- package/tests/unit/contract_integrity_gate.test.js +74 -0
- package/tests/unit/healing_telemetry.test.js +58 -0
- package/tests/unit/prompt_sentinel.test.js +50 -0
- package/tests/unit/spec_visual_sync.test.js +77 -0
- package/tests/unit/test_generator.test.js +62 -0
@@ -1,420 +1,420 @@
-# Release Runbook Template
-
-**Environment**: [Development|Staging|Production]
-**Release Version**: v[X.Y.Z]
-**Release Date**: [YYYY-MM-DD HH:MM UTC]
-**Release Manager**: [Name]
-**Approval Chain**: [List approvers]
-
----
-
-## Pre-Deployment Phase (30-60 minutes)
-
-### 1. Pre-Deployment Verification (10 min)
-
-**Checklist**:
-- [ ] All tests passing (80%+ coverage)
-- [ ] Version correctly updated in VERSION file
-- [ ] CHANGELOG.md updated with release notes
-- [ ] Breaking changes documented (if applicable)
-- [ ] Migration guides created (if breaking changes)
-- [ ] Database migrations tested in staging
-- [ ] Configuration validated for this environment
-- [ ] Rollback plan tested and documented
-- [ ] Team on-call and ready
-- [ ] Status page updated (if applicable)
-
-**Command to verify**:
-```bash
-# Check tests
-npm run test -- --coverage
-
-# Verify version
-cat VERSION # Should show v[X.Y.Z]
-
-# Verify changelog
-cat CHANGELOG.md | head -20 # Should have new version entry
-
-# Verify migrations (if applicable)
-ls -la db/migrations/ # Check latest migration script
-```
-
-### 2. Database Migrations (if applicable, 10-20 min)
-
-**For Development/Staging**:
-```bash
-# Run migrations
-./db/migrate-up.sh
-
-# Verify: Check table structure
-psql -d database_dev -c "\d users"
-
-# Verify: Check data integrity
-psql -d database_dev -c "SELECT COUNT(*) FROM users"
-
-# Save migration timestamp
-echo "Migration completed at $(date)" >> MIGRATION_LOG.md
-```
-
-**For Production** (if breaking changes):
-- [ ] Backup database created: `db_backup_v[X.Y.Z]_$(date).sql`
-- [ ] Backup tested: Can restore from backup
-- [ ] Estimated migration time: [X] minutes
-- [ ] Data integrity verified post-migration
-- [ ] Rollback migration script tested
-
-### 3. Configuration Validation (5-10 min)
-
-**Check environment-specific config**:
-```bash
-# Verify environment variables loaded
-env | grep -E "DATABASE_URL|API_KEY|FEATURE_FLAG"
-
-# Verify secrets available
-grep -r "SECRET_KEY" config/ | grep -v "# PLACEHOLDER"
-
-# Verify no hardcoded values
-grep -r "hardcoded_value\|TODO_REPLACE\|CHANGE_ME" src/
-
-# Validate config syntax
-npm run config:validate
-```
-
-**Config checklist**:
-- [ ] Database URL correct for environment
-- [ ] API keys/tokens available
-- [ ] Feature flags set correctly
-- [ ] Logging level appropriate (DEBUG in dev, INFO in prod)
-- [ ] TLS certificates valid
-- [ ] Domain names correct
-
-### 4. Pre-Deployment Approval (5 min)
-
-**Get sign-off from**:
-- [ ] Release Manager: Confirmed ready
-- [ ] Tech Lead: All checks pass
-- [ ] Ops Lead: Infrastructure ready
-- [ ] Product (if breaking changes): Consumer communication sent
-
-**Approval timestamp**: [HH:MM UTC]
-
----
-
-## Deployment Phase (20-60 minutes)
-
-### For Development Environment
-
-```bash
-# 1. Build Docker image
-docker build -t myapp:v[X.Y.Z] .
-
-# 2. Tag image
-docker tag myapp:v[X.Y.Z] myapp:latest
-
-# 3. Stop old container
-docker stop myapp-container || true
-
-# 4. Remove old container
-docker rm myapp-container || true
-
-# 5. Run new container
-docker run -d \
-  --name myapp-container \
-  --env-file .env.dev \
-  -p 3000:3000 \
-  myapp:v[X.Y.Z]
-
-# 6. Wait for startup (verify health endpoint)
-sleep 5
-curl -f http://localhost:3000/health || exit 1
-
-echo "✅ Development deployment complete: v[X.Y.Z]"
-```
-
-### For Staging Environment (via Kubernetes)
-
-```bash
-# 1. Build and push image to registry
-docker build -t registry.example.com/myapp:v[X.Y.Z] .
-docker push registry.example.com/myapp:v[X.Y.Z]
-
-# 2. Update deployment
-kubectl set image deployment/myapp \
-  myapp=registry.example.com/myapp:v[X.Y.Z] \
-  -n staging
-
-# 3. Wait for rollout
-kubectl rollout status deployment/myapp -n staging --timeout=5m
-
-# 4. Verify pods running
-kubectl get pods -n staging -l app=myapp
-
-# 5. Port forward for testing
-kubectl port-forward svc/myapp 3000:3000 -n staging
-
-echo "✅ Staging deployment complete: v[X.Y.Z]"
-```
-
-### For Production Environment (Blue-Green or Canary)
-
-**Blue-Green**:
-```bash
-# 1. Build and push to registry
-docker build -t registry.example.com/myapp:v[X.Y.Z] .
-docker push registry.example.com/myapp:v[X.Y.Z]
-
-# 2. Deploy to "green" environment (parallel to prod)
-kubectl apply -f deployment-green-v[X.Y.Z].yaml -n production
-
-# 3. Wait for health checks
-kubectl rollout status deployment/myapp-green -n production --timeout=5m
-
-# 4. Verify green environment healthy
-curl -f https://green.myapp.example.com/health || exit 1
-
-# 5. Switch traffic: blue (old) → green (new)
-# This is typically done via load balancer or DNS switch
-aws elbv2 modify-rule --rule-arn <arn> \
-  --actions Type=forward,TargetGroups="[{TargetGroupArn=<green-tg>,Weight=100}]"
-
-# 6. Monitor: Keep blue running for instant rollback (1-2 hours)
-echo "✅ Production deployment complete: Traffic switched to green (v[X.Y.Z])"
-echo "⏱️ Blue environment available for 2 hours for instant rollback"
-```
-
-**Canary** (for breaking changes):
-```bash
-# 1-4: Same as blue-green deployment steps
-
-# 5. Route small % of traffic to new version (canary)
-aws elbv2 modify-rule --rule-arn <arn> \
-  --actions Type=forward,TargetGroups="[{TargetGroupArn=<v3-tg>,Weight=1},{TargetGroupArn=<v2-tg>,Weight=99}]"
-
-# 6. Monitor Stage 1 (1 hour at 1% traffic)
-# ... (see canary-deployment-orchestration.md for full stages)
-```
-
----
-
-## Post-Deployment Verification (15-30 minutes)
-
-### 1. Health Check Verification (5 min)
-
-```bash
-# Check liveness probe
-curl -f http://[app-url]/health
-# Expected: 200 OK with version info
-
-# Check readiness probe
-curl -f http://[app-url]/ready
-# Expected: 200 OK with dependency status
-
-# Check metrics endpoint
-curl -f http://[app-url]/metrics | head -20
-# Expected: Prometheus metrics output
-```
-
-**Verification checklist**:
-- [ ] Liveness endpoint: 200 OK
-- [ ] Readiness endpoint: 200 OK
-- [ ] Version matches v[X.Y.Z]
-- [ ] Database connected: Yes
-- [ ] Cache connected: Yes (if applicable)
-- [ ] External services: Available
-
-### 2. Smoke Test Scenarios (5-10 min)
-
-**Critical Workflow #1: Authentication**
-```bash
-# Test login
-curl -X POST http://[app-url]/api/login \
-  -H "Content-Type: application/json" \
-  -d '{"email":"test@example.com","password":"password"}'
-
-# Expected response:
-# { "token": "...", "user": { "id": 1, "email": "test@example.com" } }
-```
-
-**Critical Workflow #2: Create & Read Data**
-```bash
-# Create user
-curl -X POST http://[app-url]/api/users \
-  -H "Authorization: Bearer [token]" \
-  -H "Content-Type: application/json" \
-  -d '{"name":"John","email":"john@example.com"}'
-
-# Read user (verify format is correct for this version)
-curl http://[app-url]/api/users/1 \
-  -H "Authorization: Bearer [token]"
-
-# Expected for v3.0.0: { "data": { "id": 1, "name": "John" } }
-# NOT: { "user": { "id": 1, "name": "John" } }
-```
-
-**Critical Workflow #3: Business Logic**
-```bash
-# Test key business workflow (e.g., payment processing)
-curl -X POST http://[app-url]/api/payments \
-  -H "Authorization: Bearer [token]" \
-  -H "Content-Type: application/json" \
-  -d '{"amount":99.99,"currency":"USD"}'
-
-# Expected: 200 OK with payment ID
-```
-
-**Smoke Test Checklist**:
-- [ ] Login works
-- [ ] Create resource works
-- [ ] Read resource works (new format if applicable)
-- [ ] Update resource works
-- [ ] Delete resource works
-- [ ] Business critical endpoint works
-- [ ] Error handling works (test 404, 400, 500 scenarios)
-
-### 3. Database Integrity Check (5 min)
-
-```bash
-# Verify database accessible
-psql -d [database] -c "SELECT 1"
-
-# Check recent data
-psql -d [database] -c "SELECT COUNT(*) FROM users"
-
-# Check for any errors in migration
-psql -d [database] -c "SELECT * FROM schema_migrations ORDER BY version DESC LIMIT 5"
-
-# Verify no data corruption
-psql -d [database] -c "SELECT * FROM users LIMIT 1" | head -10
-```
-
-**Database checklist**:
-- [ ] Database connection successful
-- [ ] Tables present
-- [ ] Data accessible
-- [ ] No schema errors
-- [ ] Data integrity checks pass
-
-### 4. Performance Baseline (5 min)
-
-```bash
-# Get baseline metrics
-curl http://[app-url]/metrics | grep http_request_duration_seconds | head -10
-
-# Expected: Requests processing in <200ms (P95)
-```
-
-**Performance checklist**:
-- [ ] Response time: <200ms P95
-- [ ] Error rate: <0.1%
-- [ ] CPU usage: <70%
-- [ ] Memory usage: <80%
-- [ ] Cache hit rate: >80% (if applicable)
-
----
-
-## Rollback Phase (If Needed - < 5 minutes)
-
-### For Development/Staging
-
-```bash
-# Get previous version
-PREVIOUS_VERSION=$(git tag | sort -V | tail -2 | head -1)
-
-# Stop current version
-docker stop myapp-container
-
-# Run previous version
-docker run -d \
-  --name myapp-container \
-  --env-file .env.dev \
-  -p 3000:3000 \
-  myapp:${PREVIOUS_VERSION}
-
-# Verify rollback successful
-sleep 5
-curl -f http://localhost:3000/health || exit 1
-
-echo "✅ Rollback to ${PREVIOUS_VERSION} complete"
-```
-
-### For Production
-
-```bash
-# Revert to previous version immediately
-# If Blue-Green: Switch traffic back to blue (original)
-aws elbv2 modify-rule --rule-arn <arn> \
-  --actions Type=forward,TargetGroups="[{TargetGroupArn=<blue-tg>,Weight=100}]"
-
-# If Canary: Immediately route 100% back to previous version
-aws elbv2 modify-rule --rule-arn <arn> \
-  --actions Type=forward,TargetGroups="[{TargetGroupArn=<v2-tg>,Weight=100}]"
-
-# Verify rollback
-curl -f http://[prod-url]/health
-
-# Notify stakeholders
-# Send: Slack message, Status page update, Email to on-call
-
-echo "✅ Rollback to v[PREVIOUS] complete"
-echo "🔴 Incident: v[X.Y.Z] deployment ROLLED BACK"
-echo "📋 Investigation: See INCIDENT.md"
-```
-
----
-
-## Post-Deployment Monitoring (24 hours)
-
-**First 1 hour (Critical)**:
-- [ ] Error rate: Maintain <0.1%
-- [ ] Latency: Within 10% of baseline
-- [ ] Throughput: Expected level
-- [ ] Health checks: Continuous passing
-- [ ] Consumer feedback: No critical issues
-
-**24-hour window**:
-- [ ] Error rate: Maintain <0.1%
-- [ ] Latency: Stable, no spikes
-- [ ] All business workflows: Operational
-- [ ] Consumer migrations: On track (if breaking changes)
-- [ ] On-call team: Standing down after 24 hours
-
-**Checklist**:
-- [ ] 1-hour post-deployment: All green
-- [ ] 4-hour checkpoint: All green
-- [ ] 24-hour checkpoint: Ready to remove rollback capability
-
----
-
-## Incident Log
-
-**If any issues occur, document**:
-
-```
-Time: [HH:MM UTC]
-Severity: [LOW|MEDIUM|HIGH|CRITICAL]
-Description: [What happened]
-Impact: [How many users affected]
-Root cause: [Why did it happen]
-Action taken: [What was done]
-Resolution: [How was it fixed]
-Prevention: [How to prevent next time]
-```
-
----
-
-## Sign-Off & Completion
-
-**Deployment Complete**:
-- Version deployed: v[X.Y.Z]
-- Environment: [Development|Staging|Production]
-- Date/Time: [YYYY-MM-DD HH:MM UTC]
-- Deployed by: [Name]
-- Approvals: [List all approvers]
-- All health checks: ✅ PASS
-- Smoke tests: ✅ PASS
-- Issues: [None / List any]
-- Status: ✅ **READY FOR NEXT STAGE**
-
----
-
-**RUNBOOK COMPLETE**
|
1
|
+
# Release Runbook Template
|
|
2
|
+
|
|
3
|
+
**Environment**: [Development|Staging|Production]
|
|
4
|
+
**Release Version**: v[X.Y.Z]
|
|
5
|
+
**Release Date**: [YYYY-MM-DD HH:MM UTC]
|
|
6
|
+
**Release Manager**: [Name]
|
|
7
|
+
**Approval Chain**: [List approvers]
|
|
8
|
+
|
|
9
|
+
---
|
|
10
|
+
|
|
11
|
+
## Pre-Deployment Phase (30-60 minutes)
|
|
12
|
+
|
|
13
|
+
### 1. Pre-Deployment Verification (10 min)
|
|
14
|
+
|
|
15
|
+
**Checklist**:
|
|
16
|
+
- [ ] All tests passing (80%+ coverage)
|
|
17
|
+
- [ ] Version correctly updated in VERSION file
|
|
18
|
+
- [ ] CHANGELOG.md updated with release notes
|
|
19
|
+
- [ ] Breaking changes documented (if applicable)
|
|
20
|
+
- [ ] Migration guides created (if breaking changes)
|
|
21
|
+
- [ ] Database migrations tested in staging
|
|
22
|
+
- [ ] Configuration validated for this environment
|
|
23
|
+
- [ ] Rollback plan tested and documented
|
|
24
|
+
- [ ] Team on-call and ready
|
|
25
|
+
- [ ] Status page updated (if applicable)
|
|
26
|
+
|
|
27
|
+
**Command to verify**:
|
|
28
|
+
```bash
|
|
29
|
+
# Check tests
|
|
30
|
+
npm run test -- --coverage
|
|
31
|
+
|
|
32
|
+
# Verify version
|
|
33
|
+
cat VERSION # Should show v[X.Y.Z]
|
|
34
|
+
|
|
35
|
+
# Verify changelog
|
|
36
|
+
cat CHANGELOG.md | head -20 # Should have new version entry
|
|
37
|
+
|
|
38
|
+
# Verify migrations (if applicable)
|
|
39
|
+
ls -la db/migrations/ # Check latest migration script
|
|
40
|
+
```
|
|
41
|
+
|
|
42
|
+
### 2. Database Migrations (if applicable, 10-20 min)
|
|
43
|
+
|
|
44
|
+
**For Development/Staging**:
|
|
45
|
+
```bash
|
|
46
|
+
# Run migrations
|
|
47
|
+
./db/migrate-up.sh
|
|
48
|
+
|
|
49
|
+
# Verify: Check table structure
|
|
50
|
+
psql -d database_dev -c "\d users"
|
|
51
|
+
|
|
52
|
+
# Verify: Check data integrity
|
|
53
|
+
psql -d database_dev -c "SELECT COUNT(*) FROM users"
|
|
54
|
+
|
|
55
|
+
# Save migration timestamp
|
|
56
|
+
echo "Migration completed at $(date)" >> MIGRATION_LOG.md
|
|
57
|
+
```
|
|
58
|
+
|
|
59
|
+
**For Production** (if breaking changes):
|
|
60
|
+
- [ ] Backup database created: `db_backup_v[X.Y.Z]_$(date).sql`
|
|
61
|
+
- [ ] Backup tested: Can restore from backup
|
|
62
|
+
- [ ] Estimated migration time: [X] minutes
|
|
63
|
+
- [ ] Data integrity verified post-migration
|
|
64
|
+
- [ ] Rollback migration script tested
|
|
65
|
+
|
|
66
|
+
### 3. Configuration Validation (5-10 min)
|
|
67
|
+
|
|
68
|
+
**Check environment-specific config**:
|
|
69
|
+
```bash
|
|
70
|
+
# Verify environment variables loaded
|
|
71
|
+
env | grep -E "DATABASE_URL|API_KEY|FEATURE_FLAG"
|
|
72
|
+
|
|
73
|
+
# Verify secrets available
|
|
74
|
+
grep -r "SECRET_KEY" config/ | grep -v "# PLACEHOLDER"
|
|
75
|
+
|
|
76
|
+
# Verify no hardcoded values
|
|
77
|
+
grep -r "hardcoded_value\|TODO_REPLACE\|CHANGE_ME" src/
|
|
78
|
+
|
|
79
|
+
# Validate config syntax
|
|
80
|
+
npm run config:validate
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
**Config checklist**:
|
|
84
|
+
- [ ] Database URL correct for environment
|
|
85
|
+
- [ ] API keys/tokens available
|
|
86
|
+
- [ ] Feature flags set correctly
|
|
87
|
+
- [ ] Logging level appropriate (DEBUG in dev, INFO in prod)
|
|
88
|
+
- [ ] TLS certificates valid
|
|
89
|
+
- [ ] Domain names correct
|
|
90
|
+
|
|
91
|
+
### 4. Pre-Deployment Approval (5 min)
|
|
92
|
+
|
|
93
|
+
**Get sign-off from**:
|
|
94
|
+
- [ ] Release Manager: Confirmed ready
|
|
95
|
+
- [ ] Tech Lead: All checks pass
|
|
96
|
+
- [ ] Ops Lead: Infrastructure ready
|
|
97
|
+
- [ ] Product (if breaking changes): Consumer communication sent
|
|
98
|
+
|
|
99
|
+
**Approval timestamp**: [HH:MM UTC]
|
|
100
|
+
|
|
101
|
+
---
|
|
102
|
+
|
|
103
|
+
## Deployment Phase (20-60 minutes)
|
|
104
|
+
|
|
105
|
+
### For Development Environment
|
|
106
|
+
|
|
107
|
+
```bash
|
|
108
|
+
# 1. Build Docker image
|
|
109
|
+
docker build -t myapp:v[X.Y.Z] .
|
|
110
|
+
|
|
111
|
+
# 2. Tag image
|
|
112
|
+
docker tag myapp:v[X.Y.Z] myapp:latest
|
|
113
|
+
|
|
114
|
+
# 3. Stop old container
|
|
115
|
+
docker stop myapp-container || true
|
|
116
|
+
|
|
117
|
+
# 4. Remove old container
|
|
118
|
+
docker rm myapp-container || true
|
|
119
|
+
|
|
120
|
+
# 5. Run new container
|
|
121
|
+
docker run -d \
|
|
122
|
+
--name myapp-container \
|
|
123
|
+
--env-file .env.dev \
|
|
124
|
+
-p 3000:3000 \
|
|
125
|
+
myapp:v[X.Y.Z]
|
|
126
|
+
|
|
127
|
+
# 6. Wait for startup (verify health endpoint)
|
|
128
|
+
sleep 5
|
|
129
|
+
curl -f http://localhost:3000/health || exit 1
|
|
130
|
+
|
|
131
|
+
echo "✅ Development deployment complete: v[X.Y.Z]"
|
|
132
|
+
```

### For Staging Environment (via Kubernetes)

```bash
# 1. Build and push image to registry
docker build -t registry.example.com/myapp:v[X.Y.Z] .
docker push registry.example.com/myapp:v[X.Y.Z]

# 2. Update deployment
kubectl set image deployment/myapp \
  myapp=registry.example.com/myapp:v[X.Y.Z] \
  -n staging

# 3. Wait for rollout
kubectl rollout status deployment/myapp -n staging --timeout=5m

# 4. Verify pods running
kubectl get pods -n staging -l app=myapp

echo "✅ Staging deployment complete: v[X.Y.Z]"

# 5. Port forward for testing (runs in the foreground; press Ctrl+C to stop)
kubectl port-forward svc/myapp 3000:3000 -n staging
```
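
`kubectl get pods` shows that pods are running, but not which image they run. A small sketch of an explicit check follows, assuming the deployment has a single container. The helper name `deployed_image_matches` and the `KUBECTL` override are illustrative.

```bash
# Compare the image a deployment is configured with against the expected tag.
# deployed_image_matches and KUBECTL are hypothetical names; KUBECTL allows dry runs.
deployed_image_matches() {
  local expected="$1" namespace="${2:-staging}" deployment="${3:-myapp}"
  local actual
  actual=$(${KUBECTL:-kubectl} get deployment "$deployment" -n "$namespace" \
    -o jsonpath='{.spec.template.spec.containers[0].image}') || return 2
  if [ "$actual" = "$expected" ]; then
    echo "image OK: $actual"
  else
    echo "image mismatch: expected $expected, got $actual" >&2
    return 1
  fi
}
```

It could be called after the rollout completes, for example `deployed_image_matches registry.example.com/myapp:v[X.Y.Z] staging`.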

### For Production Environment (Blue-Green or Canary)

**Blue-Green**:
```bash
# 1. Build and push to registry
docker build -t registry.example.com/myapp:v[X.Y.Z] .
docker push registry.example.com/myapp:v[X.Y.Z]

# 2. Deploy to "green" environment (parallel to prod)
kubectl apply -f deployment-green-v[X.Y.Z].yaml -n production

# 3. Wait for health checks
kubectl rollout status deployment/myapp-green -n production --timeout=5m

# 4. Verify green environment healthy
curl -f https://green.myapp.example.com/health || exit 1

# 5. Switch traffic: blue (old) → green (new)
# This is typically done via load balancer or DNS switch.
# Placeholders are quoted so the shell does not treat < and > as redirections.
aws elbv2 modify-rule --rule-arn "<arn>" \
  --actions '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"<green-tg>","Weight":100}]}}]'

# 6. Monitor: Keep blue running for instant rollback (1-2 hours)
echo "✅ Production deployment complete: Traffic switched to green (v[X.Y.Z])"
echo "⏱️ Blue environment available for up to 2 hours for instant rollback"
```

**Canary** (for breaking changes):
```bash
# 1-4: Same as blue-green deployment steps

# 5. Route small % of traffic to new version (canary)
# Placeholders are quoted so the shell does not treat < and > as redirections.
aws elbv2 modify-rule --rule-arn "<arn>" \
  --actions '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"<v3-tg>","Weight":1},{"TargetGroupArn":"<v2-tg>","Weight":99}]}}]'

# 6. Monitor Stage 1 (1 hour at 1% traffic)
# ... (see canary-deployment-orchestration.md for full stages)
```
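
Each canary stage changes only the two weights, so the action payload can be generated rather than hand-edited. The sketch below assumes the rule is updated through the `ForwardConfig` form of a forward action. The helper name `canary_actions` is illustrative, and it keeps the two weights summing to 100 so that they read as percentages.

```bash
# Build the JSON for a weighted forward action between two target groups.
# canary_actions is a hypothetical helper; the target group ARNs are placeholders.
canary_actions() {
  local stable_tg="$1" canary_tg="$2" canary_weight="$3"
  if [ "$canary_weight" -lt 0 ] || [ "$canary_weight" -gt 100 ]; then
    echo "canary weight must be between 0 and 100" >&2
    return 1
  fi
  local stable_weight=$((100 - canary_weight))
  printf '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"%s","Weight":%d},{"TargetGroupArn":"%s","Weight":%d}]}}]\n' \
    "$canary_tg" "$canary_weight" "$stable_tg" "$stable_weight"
}

# Example: 1% to the canary, 99% to the stable version
# aws elbv2 modify-rule --rule-arn "<arn>" --actions "$(canary_actions "<v2-tg>" "<v3-tg>" 1)"
```

Rejecting weights outside 0 to 100 guards against a typo sending an unintended share of traffic to the new version.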

---

## Post-Deployment Verification (15-30 minutes)

### 1. Health Check Verification (5 min)

```bash
# Check liveness probe
curl -f http://[app-url]/health
# Expected: 200 OK with version info

# Check readiness probe
curl -f http://[app-url]/ready
# Expected: 200 OK with dependency status

# Check metrics endpoint
curl -f http://[app-url]/metrics | head -20
# Expected: Prometheus metrics output
```

**Verification checklist**:
- [ ] Liveness endpoint: 200 OK
- [ ] Readiness endpoint: 200 OK
- [ ] Version matches v[X.Y.Z]
- [ ] Database connected: Yes
- [ ] Cache connected: Yes (if applicable)
- [ ] External services: Available
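
The "Version matches" item can be checked mechanically instead of by eye. The sketch below assumes the health endpoint returns JSON containing a `"version"` string field. That response shape is a guess, and the helper names `health_version` and `check_version` are illustrative.

```bash
# Extract a "version" field from a JSON health response and compare it.
# The payload shape ({"version":"..."}) is an assumption; adjust to the real response.
health_version() {
  # Reads the response body on stdin and prints the first version string found.
  sed -n 's/.*"version"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

check_version() {
  local expected="$1" body="$2" actual
  actual=$(printf '%s' "$body" | health_version)
  [ "$actual" = "$expected" ]
}
```

For example, `check_version "v[X.Y.Z]" "$(curl -fsS http://[app-url]/health)" || echo "version mismatch"`.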

### 2. Smoke Test Scenarios (5-10 min)

**Critical Workflow #1: Authentication**
```bash
# Test login
curl -X POST http://[app-url]/api/login \
  -H "Content-Type: application/json" \
  -d '{"email":"test@example.com","password":"password"}'

# Expected response:
# { "token": "...", "user": { "id": 1, "email": "test@example.com" } }
```

**Critical Workflow #2: Create & Read Data**
```bash
# Create user
curl -X POST http://[app-url]/api/users \
  -H "Authorization: Bearer [token]" \
  -H "Content-Type: application/json" \
  -d '{"name":"John","email":"john@example.com"}'

# Read the user back using the id returned by the create call
# (verify format is correct for this version)
curl http://[app-url]/api/users/[user-id] \
  -H "Authorization: Bearer [token]"

# Expected for v3.0.0: { "data": { "id": [user-id], "name": "John" } }
# NOT: { "user": { "id": [user-id], "name": "John" } }
```

**Critical Workflow #3: Business Logic**
```bash
# Test key business workflow (e.g., payment processing)
curl -X POST http://[app-url]/api/payments \
  -H "Authorization: Bearer [token]" \
  -H "Content-Type: application/json" \
  -d '{"amount":99.99,"currency":"USD"}'

# Expected: 200 OK with payment ID
```

**Smoke Test Checklist**:
- [ ] Login works
- [ ] Create resource works
- [ ] Read resource works (new format if applicable)
- [ ] Update resource works
- [ ] Delete resource works
- [ ] Business critical endpoint works
- [ ] Error handling works (test 404, 400, 500 scenarios)
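
The error-handling item is easier to tick off when each request asserts its status code. Below is a minimal sketch. The helper names `http_status` and `expect_status` and the `STATUS_CMD` override are illustrative, and any endpoint paths passed in remain placeholders.

```bash
# Assert that a request returns the expected HTTP status code.
# http_status, expect_status and STATUS_CMD are hypothetical names for illustration.
http_status() {
  # Prints only the numeric status code for the given curl arguments.
  curl -s -o /dev/null -w '%{http_code}' "$@"
}

expect_status() {
  local expected="$1"
  shift
  local actual
  actual=$(${STATUS_CMD:-http_status} "$@")
  if [ "$actual" = "$expected" ]; then
    echo "PASS ($actual): $*"
  else
    echo "FAIL (expected $expected, got $actual): $*" >&2
    return 1
  fi
}
```

A missing resource could then be checked with something like `expect_status 404 "http://[app-url]/api/users/[missing-id]" -H "Authorization: Bearer [token]"`.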

### 3. Database Integrity Check (5 min)

```bash
# Verify database accessible
psql -d [database] -c "SELECT 1"

# Check row count
psql -d [database] -c "SELECT COUNT(*) FROM users"

# Confirm the latest migrations were applied
psql -d [database] -c "SELECT * FROM schema_migrations ORDER BY version DESC LIMIT 5"

# Spot-check a sample row (basic sanity check, not a full integrity test)
psql -d [database] -c "SELECT * FROM users LIMIT 1" | head -10
```

**Database checklist**:
- [ ] Database connection successful
- [ ] Tables present
- [ ] Data accessible
- [ ] No schema errors
- [ ] Data integrity checks pass
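
The migration check can be turned into a pass/fail comparison against the version the release is expected to bring. This is a sketch only: it assumes `schema_migrations` has a `version` column as in the query above. The expected version value, the helper names and the `PSQL` override are all illustrative.

```bash
# Print the most recent migration version recorded in schema_migrations.
# latest_migration, migration_applied and PSQL are hypothetical names; PSQL allows dry runs.
latest_migration() {
  local database="$1"
  # -t prints tuples only, -A disables column alignment, so the output is the bare value.
  ${PSQL:-psql} -d "$database" -tA \
    -c "SELECT version FROM schema_migrations ORDER BY version DESC LIMIT 1"
}

migration_applied() {
  local database="$1" expected="$2" actual
  actual=$(latest_migration "$database") || return 2
  [ "$actual" = "$expected" ]
}
```

For example, `migration_applied [database] [expected-version] || echo "latest migration missing"`.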

### 4. Performance Baseline (5 min)

```bash
# Get baseline metrics
curl http://[app-url]/metrics | grep http_request_duration_seconds | head -10

# Expected: Requests processing in <200ms (P95)
```

**Performance checklist**:
- [ ] Response time: <200ms P95
- [ ] Error rate: <0.1%
- [ ] CPU usage: <70%
- [ ] Memory usage: <80%
- [ ] Cache hit rate: >80% (if applicable)
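
When no dashboard is at hand, a rough client-side P95 can be computed from a handful of timed requests. The sketch below uses the nearest-rank method. The helper name `p95` is illustrative, and client-side timings include network latency, so they only approximate the server-side histogram.

```bash
# Print the 95th percentile (nearest-rank method) of numbers read from stdin, one per line.
# p95 is a hypothetical helper name.
p95() {
  sort -n | awk '
    { values[NR] = $1 }
    END {
      if (NR == 0) exit 1
      # Integer ceiling of 0.95 * NR, avoiding floating-point rounding.
      rank = int((NR * 95 + 99) / 100)
      print values[rank]
    }'
}

# Example: time 20 requests (in seconds) and report the P95
# for i in $(seq 1 20); do curl -s -o /dev/null -w '%{time_total}\n' "http://[app-url]/health"; done | p95
```

A result under `0.200` corresponds to the `<200ms P95` target in the checklist.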

---

## Rollback Phase (If Needed - < 5 minutes)

### For Development/Staging

```bash
# Get previous version
PREVIOUS_VERSION=$(git tag | sort -V | tail -2 | head -1)

# Stop and remove the current container (frees the container name for reuse)
docker stop myapp-container || true
docker rm myapp-container || true

# Run previous version
docker run -d \
  --name myapp-container \
  --env-file .env.dev \
  -p 3000:3000 \
  myapp:${PREVIOUS_VERSION}

# Verify rollback successful
sleep 5
curl -f http://localhost:3000/health || exit 1

echo "✅ Rollback to ${PREVIOUS_VERSION} complete"
```
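
Taking the second-newest tag assumes the version being rolled back is the newest tag in the repository. A sketch of a lookup relative to a named version follows. The helper name `previous_version` is illustrative, and `sort -V` (version sort) is assumed to be available, as it is in GNU coreutils.

```bash
# Print the version tag that immediately precedes the given one in version order.
# Reads candidate tags from stdin, so it can be fed from `git tag` or a fixed list.
# previous_version is a hypothetical helper name.
previous_version() {
  local current="$1"
  sort -V | awk -v current="$current" '
    # Concatenating "" forces a string comparison, so "1.10" and "1.1" stay distinct.
    ($0 "") == (current "") { found = 1; exit }
    { prev = $0 }
    END {
      if (!found || prev == "") exit 1
      print prev
    }'
}
```

For example, `PREVIOUS_VERSION=$(git tag | previous_version v[X.Y.Z])` fails with a non-zero status when the tag is unknown or has no predecessor.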

### For Production

```bash
# Revert to previous version immediately.
# Run only the command that matches the deployment strategy used.
# Placeholders are quoted so the shell does not treat < and > as redirections.

# If Blue-Green: Switch traffic back to blue (original)
aws elbv2 modify-rule --rule-arn "<arn>" \
  --actions '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"<blue-tg>","Weight":100}]}}]'

# If Canary: Immediately route 100% back to previous version
aws elbv2 modify-rule --rule-arn "<arn>" \
  --actions '[{"Type":"forward","ForwardConfig":{"TargetGroups":[{"TargetGroupArn":"<v2-tg>","Weight":100}]}}]'

# Verify rollback
curl -f http://[prod-url]/health

# Notify stakeholders
# Send: Slack message, Status page update, Email to on-call

echo "✅ Rollback to v[PREVIOUS] complete"
echo "🔴 Incident: v[X.Y.Z] deployment ROLLED BACK"
echo "📋 Investigation: See INCIDENT.md"
```

---

## Post-Deployment Monitoring (24 hours)

**First hour (Critical)**:
- [ ] Error rate: Maintain <0.1%
- [ ] Latency: Within 10% of baseline
- [ ] Throughput: Expected level
- [ ] Health checks: Continuously passing
- [ ] Consumer feedback: No critical issues

**24-hour window**:
- [ ] Error rate: Maintain <0.1%
- [ ] Latency: Stable, no spikes
- [ ] All business workflows: Operational
- [ ] Consumer migrations: On track (if breaking changes)
- [ ] On-call team: Standing down after 24 hours

**Checklist**:
- [ ] 1-hour post-deployment: All green
- [ ] 4-hour checkpoint: All green
- [ ] 24-hour checkpoint: Ready to remove rollback capability
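
The error-rate and latency thresholds above can be evaluated at each checkpoint with a small gate. The sketch below reads "within 10% of baseline" as "no more than 10% above baseline", which is an interpretation. The helper name `within_thresholds` is illustrative, and the input values are expected to come from whatever monitoring system is in use.

```bash
# Decide whether monitoring values stay within the runbook's thresholds.
# Arguments: error rate (%), current P95 latency, baseline P95 latency (same unit).
# within_thresholds is a hypothetical helper name.
within_thresholds() {
  awk -v err="$1" -v latency="$2" -v baseline="$3" 'BEGIN {
    if (err >= 0.1) { print "error rate too high: " err "%"; exit 1 }
    if (latency > baseline * 1.10) { print "latency regression: " latency " vs baseline " baseline; exit 1 }
    print "within thresholds"
  }'
}
```

For example, `within_thresholds 0.05 180 170` passes, while an error rate of `0.5` or a P95 of `200` against a baseline of `170` returns a non-zero status that can trigger the rollback steps.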

---

## Incident Log

**If any issues occur, document**:

```
Time: [HH:MM UTC]
Severity: [LOW|MEDIUM|HIGH|CRITICAL]
Description: [What happened]
Impact: [How many users affected]
Root cause: [Why did it happen]
Action taken: [What was done]
Resolution: [How was it fixed]
Prevention: [How to prevent next time]
```
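
During an incident it helps to have the entry stamped and started automatically. The following sketch prints the template above with the time, severity and description filled in and leaves the remaining fields as placeholders. The helper name `incident_entry` is illustrative.

```bash
# Print a pre-filled incident log entry; unknown fields keep their placeholders.
# incident_entry is a hypothetical helper name.
incident_entry() {
  local severity="$1" description="$2"
  case "$severity" in
    LOW|MEDIUM|HIGH|CRITICAL) ;;
    *) echo "unknown severity: $severity" >&2; return 1 ;;
  esac
  cat <<EOF
Time: $(date -u +%H:%M) UTC
Severity: $severity
Description: $description
Impact: [How many users affected]
Root cause: [Why did it happen]
Action taken: [What was done]
Resolution: [How was it fixed]
Prevention: [How to prevent next time]
EOF
}
```

For example, `incident_entry HIGH "Elevated 5xx after deploy" >> INCIDENT.md` appends a new entry to the investigation file.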

---

## Sign-Off & Completion

**Deployment Complete**:
- Version deployed: v[X.Y.Z]
- Environment: [Development|Staging|Production]
- Date/Time: [YYYY-MM-DD HH:MM UTC]
- Deployed by: [Name]
- Approvals: [List all approvers]
- All health checks: ✅ PASS
- Smoke tests: ✅ PASS
- Issues: [None / List any]
- Status: ✅ **READY FOR NEXT STAGE**

---

**RUNBOOK COMPLETE**