codex-genesis-harness 0.1.5 → 0.1.6

Files changed (151)
  1. package/.codebase/ARCHITECTURE_REVIEW_COMPLETE.md +216 -216
  2. package/.codebase/CURRENT_STATE.md +7 -2
  3. package/.codebase/FILE_NAMING_CLARIFICATION.md +161 -161
  4. package/.codebase/HARNESS_COMPLETENESS_AUDIT.md +613 -613
  5. package/.codebase/IMPLEMENTATION_COMPLETE.md +429 -429
  6. package/.codebase/IMPLEMENTATION_HANDOFF.md +351 -351
  7. package/.codebase/IMPROVEMENTS_SUMMARY.md +419 -419
  8. package/.codebase/PHASE3_SKILLS_NAMING_COMPLETE.md +292 -292
  9. package/.codebase/PHASE_DEPENDENCY_MAP.md +486 -486
  10. package/.codebase/QUICK_START_SPEC_IMPACT.md +456 -456
  11. package/.codebase/README.md +139 -139
  12. package/.codebase/RECOVERY_POINTS.md +438 -438
  13. package/.codex/skills/genesis-api-sync/SKILL.md +354 -354
  14. package/.codex/skills/genesis-api-sync/checklists/api-sync-checklist.md +101 -101
  15. package/.codex/skills/genesis-api-sync/templates/api-change-template.md +257 -257
  16. package/.codex/skills/genesis-debug-guide/SKILL.md +479 -479
  17. package/.codex/skills/genesis-debug-guide/checklists/flaky-test-investigation.md +339 -339
  18. package/.codex/skills/genesis-debug-guide/checklists/production-bug-debug.md +210 -210
  19. package/.codex/skills/genesis-debug-guide/checklists/test-failure-debug.md +158 -158
  20. package/.codex/skills/genesis-debug-guide/observability/debug-commands.md +365 -365
  21. package/.codex/skills/genesis-debug-guide/playbooks/unit-test-failures.md +289 -289
  22. package/.codex/skills/genesis-debug-guide/templates/debug-investigation-log.md +288 -288
  23. package/.codex/skills/genesis-docs-automation/SKILL.md +1003 -1003
  24. package/.codex/skills/genesis-docs-automation/checklists/docs-validation.md +359 -359
  25. package/.codex/skills/genesis-docs-automation/checklists/spec-alignment.md +312 -312
  26. package/.codex/skills/genesis-docs-automation/observability/docs-tracking.md +382 -382
  27. package/.codex/skills/genesis-docs-automation/playbooks/auto-update-flow.md +851 -851
  28. package/.codex/skills/genesis-docs-automation/playbooks/changelog-generation.md +491 -491
  29. package/.codex/skills/genesis-docs-automation/templates/changelog-entry-template.md +187 -187
  30. package/.codex/skills/genesis-docs-automation/templates/handoff-template.md +297 -297
  31. package/.codex/skills/genesis-harness/SKILL.md +1427 -1427
  32. package/.codex/skills/genesis-harness/agents/openai.yaml +7 -7
  33. package/.codex/skills/genesis-harness/checklists/bug-fix-qa.md +169 -169
  34. package/.codex/skills/genesis-harness/checklists/new-feature-qa.md +157 -157
  35. package/.codex/skills/genesis-harness/checklists/refactor-qa.md +216 -216
  36. package/.codex/skills/genesis-harness/checklists/requirements-validation.md +211 -211
  37. package/.codex/skills/genesis-harness/references/planning-schema.md +35 -35
  38. package/.codex/skills/genesis-harness/references/quality-rubric.md +21 -21
  39. package/.codex/skills/genesis-harness/references/research-rubric.md +41 -41
  40. package/.codex/skills/genesis-harness/references/workflows.md +33 -33
  41. package/.codex/skills/genesis-harness/resources/agents-template.md +27 -27
  42. package/.codex/skills/genesis-harness/resources/api-docs-template.md +32 -32
  43. package/.codex/skills/genesis-harness/resources/architecture-template.md +30 -30
  44. package/.codex/skills/genesis-harness/resources/audit-template.md +26 -26
  45. package/.codex/skills/genesis-harness/resources/bug-template.md +34 -34
  46. package/.codex/skills/genesis-harness/resources/change-impact-matrix-template.md +204 -204
  47. package/.codex/skills/genesis-harness/resources/check-template.md +21 -21
  48. package/.codex/skills/genesis-harness/resources/conventions-template.md +42 -42
  49. package/.codex/skills/genesis-harness/resources/decision-template.md +33 -33
  50. package/.codex/skills/genesis-harness/resources/design-template.md +26 -26
  51. package/.codex/skills/genesis-harness/resources/escalation-template.md +21 -21
  52. package/.codex/skills/genesis-harness/resources/feature-template.md +49 -49
  53. package/.codex/skills/genesis-harness/resources/foundation-phase-template.md +131 -131
  54. package/.codex/skills/genesis-harness/resources/integrations-template.md +32 -32
  55. package/.codex/skills/genesis-harness/resources/journeys-template.md +13 -13
  56. package/.codex/skills/genesis-harness/resources/lessons-learned-template.md +12 -12
  57. package/.codex/skills/genesis-harness/resources/observability-template.md +34 -34
  58. package/.codex/skills/genesis-harness/resources/phase-00-foundation-template.md +76 -76
  59. package/.codex/skills/genesis-harness/resources/phase-template.md +34 -34
  60. package/.codex/skills/genesis-harness/resources/pitfalls-template.md +22 -22
  61. package/.codex/skills/genesis-harness/resources/planning-tree-template.md +39 -39
  62. package/.codex/skills/genesis-harness/resources/post-implementation-guide.md +347 -347
  63. package/.codex/skills/genesis-harness/resources/project-template.md +38 -38
  64. package/.codex/skills/genesis-harness/resources/quality-score-template.md +11 -11
  65. package/.codex/skills/genesis-harness/resources/requirements-template.md +26 -26
  66. package/.codex/skills/genesis-harness/resources/research-template.md +26 -26
  67. package/.codex/skills/genesis-harness/resources/review-template.md +22 -22
  68. package/.codex/skills/genesis-harness/resources/spec-changelog-template.md +6 -6
  69. package/.codex/skills/genesis-harness/resources/stack-template.md +33 -33
  70. package/.codex/skills/genesis-harness/resources/verification-template.md +26 -26
  71. package/.codex/skills/genesis-harness/scripts/check-architecture-boundaries.sh +0 -0
  72. package/.codex/skills/genesis-harness/scripts/check-docs-sync.sh +0 -0
  73. package/.codex/skills/genesis-harness/scripts/check-no-debug-logs.sh +0 -0
  74. package/.codex/skills/genesis-harness/scripts/check-required-planning-files.sh +0 -0
  75. package/.codex/skills/genesis-harness/scripts/check-spec-changelog.sh +0 -0
  76. package/.codex/skills/genesis-harness/scripts/check-task-tracking.sh +0 -0
  77. package/.codex/skills/genesis-harness/scripts/compact-context.sh +0 -0
  78. package/.codex/skills/genesis-harness/scripts/create-adr.sh +0 -0
  79. package/.codex/skills/genesis-harness/scripts/create-bug.sh +0 -0
  80. package/.codex/skills/genesis-harness/scripts/create-feature.sh +0 -0
  81. package/.codex/skills/genesis-harness/scripts/detect-stack.sh +0 -0
  82. package/.codex/skills/genesis-harness/scripts/init-planning.sh +0 -0
  83. package/.codex/skills/genesis-harness/scripts/list-changed-files.sh +0 -0
  84. package/.codex/skills/genesis-harness/scripts/offload-log.sh +0 -0
  85. package/.codex/skills/genesis-harness/scripts/run-verification.sh +0 -0
  86. package/.codex/skills/genesis-harness/scripts/run-verify-loop.sh +0 -0
  87. package/.codex/skills/genesis-harness/scripts/update-state.sh +0 -0
  88. package/.codex/skills/genesis-mvp-planning/SKILL.md +114 -0
  89. package/.codex/skills/genesis-mvp-planning/agents/openai.yaml +6 -0
  90. package/.codex/skills/genesis-mvp-planning/checklists/mvp-readiness.md +18 -0
  91. package/.codex/skills/genesis-mvp-planning/examples/5-phase-roadmap-example.md +43 -0
  92. package/.codex/skills/genesis-mvp-planning/templates/phase-1-core.md +17 -0
  93. package/.codex/skills/genesis-mvp-planning/templates/phase-2-auth.md +17 -0
  94. package/.codex/skills/genesis-mvp-planning/templates/phase-3-features.md +17 -0
  95. package/.codex/skills/genesis-mvp-planning/templates/phase-4-integrations.md +17 -0
  96. package/.codex/skills/genesis-mvp-planning/templates/phase-5-readiness.md +17 -0
  97. package/.codex/skills/genesis-new-design/agents/openai.yaml +3 -3
  98. package/.codex/skills/genesis-observability-automation/checklists/.gitkeep +0 -0
  99. package/.codex/skills/genesis-observability-automation/observability/.gitkeep +0 -0
  100. package/.codex/skills/genesis-observability-automation/playbooks/.gitkeep +0 -0
  101. package/.codex/skills/genesis-observability-automation/templates/.gitkeep +0 -0
  102. package/.codex/skills/genesis-release-orchestration/SKILL.md +653 -653
  103. package/.codex/skills/genesis-release-orchestration/checklists/post-deployment-verification.md +274 -274
  104. package/.codex/skills/genesis-release-orchestration/checklists/pre-release-validation.md +220 -220
  105. package/.codex/skills/genesis-release-orchestration/observability/release-tracking.md +253 -253
  106. package/.codex/skills/genesis-release-orchestration/playbooks/canary-deployment-orchestration.md +472 -472
  107. package/.codex/skills/genesis-release-orchestration/playbooks/semantic-versioning-automation.md +494 -494
  108. package/.codex/skills/genesis-release-orchestration/templates/deployment-strategy-template.md +303 -303
  109. package/.codex/skills/genesis-release-orchestration/templates/release-runbook-template.md +420 -420
  110. package/.codex/skills/genesis-research-first/SKILL.md +237 -237
  111. package/.codex/skills/genesis-research-first/templates/.gitkeep +0 -0
  112. package/.codex/skills/genesis-spec-propagation/SKILL.md +534 -534
  113. package/.codex/skills/genesis-spec-propagation/checklists/phase-update-verification.md +384 -384
  114. package/.codex/skills/genesis-spec-propagation/checklists/spec-change-detection.md +257 -257
  115. package/.codex/skills/genesis-spec-propagation/observability/propagation-tracking.md +373 -373
  116. package/.codex/skills/genesis-spec-propagation/playbooks/breaking-change-propagation.md +692 -692
  117. package/.codex/skills/genesis-spec-propagation/playbooks/feature-change-propagation.md +434 -434
  118. package/.codex/skills/genesis-spec-propagation/templates/migration-guide-template.md +407 -407
  119. package/.codex/skills/genesis-upgrade-design/agents/openai.yaml +3 -3
  120. package/.codex/skills/spec-impact-engine/SKILL.md +504 -504
  121. package/.codex/skills/spec-impact-engine/detect-spec-changes.sh +0 -0
  122. package/.codex-plugin/plugin.json +19 -19
  123. package/CHANGELOG.md +42 -0
  124. package/LICENSE +22 -22
  125. package/README.EN.md +784 -730
  126. package/README.VI.md +776 -723
  127. package/README.md +102 -247
  128. package/VERSION +2 -2
  129. package/bin/genesis-harness.js +90 -87
  130. package/package.json +9 -3
  131. package/scripts/README.md +342 -342
  132. package/scripts/compact-context.sh +0 -0
  133. package/scripts/contract_integrity_gate.js +83 -0
  134. package/scripts/detect-changes.sh +0 -0
  135. package/scripts/healing_telemetry.js +118 -0
  136. package/scripts/install.sh +4 -1
  137. package/scripts/offload-log.sh +0 -0
  138. package/scripts/prompt_sentinel.js +84 -0
  139. package/scripts/run-evals.sh +1 -0
  140. package/scripts/run-verify-loop.sh +11 -0
  141. package/scripts/spec_visual_sync.js +157 -0
  142. package/scripts/test_generator.js +142 -0
  143. package/scripts/transition_state.sh +0 -0
  144. package/scripts/uninstall.sh +1 -0
  145. package/scripts/validation_gates.sh +40 -1
  146. package/scripts/verify.sh +5 -0
  147. package/tests/unit/contract_integrity_gate.test.js +74 -0
  148. package/tests/unit/healing_telemetry.test.js +58 -0
  149. package/tests/unit/prompt_sentinel.test.js +50 -0
  150. package/tests/unit/spec_visual_sync.test.js +77 -0
  151. package/tests/unit/test_generator.test.js +62 -0
@@ -1,420 +1,420 @@
- # Release Runbook Template
-
- **Environment**: [Development|Staging|Production]
- **Release Version**: v[X.Y.Z]
- **Release Date**: [YYYY-MM-DD HH:MM UTC]
- **Release Manager**: [Name]
- **Approval Chain**: [List approvers]
-
- ---
-
- ## Pre-Deployment Phase (30-60 minutes)
-
- ### 1. Pre-Deployment Verification (10 min)
-
- **Checklist**:
- - [ ] All tests passing (80%+ coverage)
- - [ ] Version correctly updated in VERSION file
- - [ ] CHANGELOG.md updated with release notes
- - [ ] Breaking changes documented (if applicable)
- - [ ] Migration guides created (if breaking changes)
- - [ ] Database migrations tested in staging
- - [ ] Configuration validated for this environment
- - [ ] Rollback plan tested and documented
- - [ ] Team on-call and ready
- - [ ] Status page updated (if applicable)
-
- **Command to verify**:
- ```bash
- # Check tests
- npm run test -- --coverage
-
- # Verify version
- cat VERSION # Should show v[X.Y.Z]
-
- # Verify changelog
- cat CHANGELOG.md | head -20 # Should have new version entry
-
- # Verify migrations (if applicable)
- ls -la db/migrations/ # Check latest migration script
- ```
-
- ### 2. Database Migrations (if applicable, 10-20 min)
-
- **For Development/Staging**:
- ```bash
- # Run migrations
- ./db/migrate-up.sh
-
- # Verify: Check table structure
- psql -d database_dev -c "\d users"
-
- # Verify: Check data integrity
- psql -d database_dev -c "SELECT COUNT(*) FROM users"
-
- # Save migration timestamp
- echo "Migration completed at $(date)" >> MIGRATION_LOG.md
- ```
-
- **For Production** (if breaking changes):
- - [ ] Backup database created: `db_backup_v[X.Y.Z]_$(date).sql`
- - [ ] Backup tested: Can restore from backup
- - [ ] Estimated migration time: [X] minutes
- - [ ] Data integrity verified post-migration
- - [ ] Rollback migration script tested
-
- ### 3. Configuration Validation (5-10 min)
-
- **Check environment-specific config**:
- ```bash
- # Verify environment variables loaded
- env | grep -E "DATABASE_URL|API_KEY|FEATURE_FLAG"
-
- # Verify secrets available
- grep -r "SECRET_KEY" config/ | grep -v "# PLACEHOLDER"
-
- # Verify no hardcoded values
- grep -r "hardcoded_value\|TODO_REPLACE\|CHANGE_ME" src/
-
- # Validate config syntax
- npm run config:validate
- ```
-
- **Config checklist**:
- - [ ] Database URL correct for environment
- - [ ] API keys/tokens available
- - [ ] Feature flags set correctly
- - [ ] Logging level appropriate (DEBUG in dev, INFO in prod)
- - [ ] TLS certificates valid
- - [ ] Domain names correct
-
- ### 4. Pre-Deployment Approval (5 min)
-
- **Get sign-off from**:
- - [ ] Release Manager: Confirmed ready
- - [ ] Tech Lead: All checks pass
- - [ ] Ops Lead: Infrastructure ready
- - [ ] Product (if breaking changes): Consumer communication sent
-
- **Approval timestamp**: [HH:MM UTC]
-
- ---
-
- ## Deployment Phase (20-60 minutes)
-
- ### For Development Environment
-
- ```bash
- # 1. Build Docker image
- docker build -t myapp:v[X.Y.Z] .
-
- # 2. Tag image
- docker tag myapp:v[X.Y.Z] myapp:latest
-
- # 3. Stop old container
- docker stop myapp-container || true
-
- # 4. Remove old container
- docker rm myapp-container || true
-
- # 5. Run new container
- docker run -d \
- --name myapp-container \
- --env-file .env.dev \
- -p 3000:3000 \
- myapp:v[X.Y.Z]
-
- # 6. Wait for startup (verify health endpoint)
- sleep 5
- curl -f http://localhost:3000/health || exit 1
-
- echo "✅ Development deployment complete: v[X.Y.Z]"
- ```
-
- ### For Staging Environment (via Kubernetes)
-
- ```bash
- # 1. Build and push image to registry
- docker build -t registry.example.com/myapp:v[X.Y.Z] .
- docker push registry.example.com/myapp:v[X.Y.Z]
-
- # 2. Update deployment
- kubectl set image deployment/myapp \
- myapp=registry.example.com/myapp:v[X.Y.Z] \
- -n staging
-
- # 3. Wait for rollout
- kubectl rollout status deployment/myapp -n staging --timeout=5m
-
- # 4. Verify pods running
- kubectl get pods -n staging -l app=myapp
-
- # 5. Port forward for testing
- kubectl port-forward svc/myapp 3000:3000 -n staging
-
- echo "✅ Staging deployment complete: v[X.Y.Z]"
- ```
-
- ### For Production Environment (Blue-Green or Canary)
-
- **Blue-Green**:
- ```bash
- # 1. Build and push to registry
- docker build -t registry.example.com/myapp:v[X.Y.Z] .
- docker push registry.example.com/myapp:v[X.Y.Z]
-
- # 2. Deploy to "green" environment (parallel to prod)
- kubectl apply -f deployment-green-v[X.Y.Z].yaml -n production
-
- # 3. Wait for health checks
- kubectl rollout status deployment/myapp-green -n production --timeout=5m
-
- # 4. Verify green environment healthy
- curl -f https://green.myapp.example.com/health || exit 1
-
- # 5. Switch traffic: blue (old) → green (new)
- # This is typically done via load balancer or DNS switch
- aws elbv2 modify-rule --rule-arn <arn> \
- --actions Type=forward,TargetGroups="[{TargetGroupArn=<green-tg>,Weight=100}]"
-
- # 6. Monitor: Keep blue running for instant rollback (1-2 hours)
- echo "✅ Production deployment complete: Traffic switched to green (v[X.Y.Z])"
- echo "⏱️ Blue environment available for 2 hours for instant rollback"
- ```
-
- **Canary** (for breaking changes):
- ```bash
- # 1-4: Same as blue-green deployment steps
-
- # 5. Route small % of traffic to new version (canary)
- aws elbv2 modify-rule --rule-arn <arn> \
- --actions Type=forward,TargetGroups="[{TargetGroupArn=<v3-tg>,Weight=1},{TargetGroupArn=<v2-tg>,Weight=99}]"
-
- # 6. Monitor Stage 1 (1 hour at 1% traffic)
- # ... (see canary-deployment-orchestration.md for full stages)
- ```
-
- ---
-
- ## Post-Deployment Verification (15-30 minutes)
-
- ### 1. Health Check Verification (5 min)
-
- ```bash
- # Check liveness probe
- curl -f http://[app-url]/health
- # Expected: 200 OK with version info
-
- # Check readiness probe
- curl -f http://[app-url]/ready
- # Expected: 200 OK with dependency status
-
- # Check metrics endpoint
- curl -f http://[app-url]/metrics | head -20
- # Expected: Prometheus metrics output
- ```
-
- **Verification checklist**:
- - [ ] Liveness endpoint: 200 OK
- - [ ] Readiness endpoint: 200 OK
- - [ ] Version matches v[X.Y.Z]
- - [ ] Database connected: Yes
- - [ ] Cache connected: Yes (if applicable)
- - [ ] External services: Available
-
- ### 2. Smoke Test Scenarios (5-10 min)
-
- **Critical Workflow #1: Authentication**
- ```bash
- # Test login
- curl -X POST http://[app-url]/api/login \
- -H "Content-Type: application/json" \
- -d '{"email":"test@example.com","password":"password"}'
-
- # Expected response:
- # { "token": "...", "user": { "id": 1, "email": "test@example.com" } }
- ```
-
- **Critical Workflow #2: Create & Read Data**
- ```bash
- # Create user
- curl -X POST http://[app-url]/api/users \
- -H "Authorization: Bearer [token]" \
- -H "Content-Type: application/json" \
- -d '{"name":"John","email":"john@example.com"}'
-
- # Read user (verify format is correct for this version)
- curl http://[app-url]/api/users/1 \
- -H "Authorization: Bearer [token]"
-
- # Expected for v3.0.0: { "data": { "id": 1, "name": "John" } }
- # NOT: { "user": { "id": 1, "name": "John" } }
- ```
-
- **Critical Workflow #3: Business Logic**
- ```bash
- # Test key business workflow (e.g., payment processing)
- curl -X POST http://[app-url]/api/payments \
- -H "Authorization: Bearer [token]" \
- -H "Content-Type: application/json" \
- -d '{"amount":99.99,"currency":"USD"}'
-
- # Expected: 200 OK with payment ID
- ```
-
- **Smoke Test Checklist**:
- - [ ] Login works
- - [ ] Create resource works
- - [ ] Read resource works (new format if applicable)
- - [ ] Update resource works
- - [ ] Delete resource works
- - [ ] Business critical endpoint works
- - [ ] Error handling works (test 404, 400, 500 scenarios)
-
- ### 3. Database Integrity Check (5 min)
-
- ```bash
- # Verify database accessible
- psql -d [database] -c "SELECT 1"
-
- # Check recent data
- psql -d [database] -c "SELECT COUNT(*) FROM users"
-
- # Check for any errors in migration
- psql -d [database] -c "SELECT * FROM schema_migrations ORDER BY version DESC LIMIT 5"
-
- # Verify no data corruption
- psql -d [database] -c "SELECT * FROM users LIMIT 1" | head -10
- ```
-
- **Database checklist**:
- - [ ] Database connection successful
- - [ ] Tables present
- - [ ] Data accessible
- - [ ] No schema errors
- - [ ] Data integrity checks pass
-
- ### 4. Performance Baseline (5 min)
-
- ```bash
- # Get baseline metrics
- curl http://[app-url]/metrics | grep http_request_duration_seconds | head -10
-
- # Expected: Requests processing in <200ms (P95)
- ```
-
- **Performance checklist**:
- - [ ] Response time: <200ms P95
- - [ ] Error rate: <0.1%
- - [ ] CPU usage: <70%
- - [ ] Memory usage: <80%
- - [ ] Cache hit rate: >80% (if applicable)
-
- ---
-
- ## Rollback Phase (If Needed - < 5 minutes)
-
- ### For Development/Staging
-
- ```bash
- # Get previous version
- PREVIOUS_VERSION=$(git tag | sort -V | tail -2 | head -1)
-
- # Stop current version
- docker stop myapp-container
-
- # Run previous version
- docker run -d \
- --name myapp-container \
- --env-file .env.dev \
- -p 3000:3000 \
- myapp:${PREVIOUS_VERSION}
-
- # Verify rollback successful
- sleep 5
- curl -f http://localhost:3000/health || exit 1
-
- echo "✅ Rollback to ${PREVIOUS_VERSION} complete"
- ```
-
- ### For Production
-
- ```bash
- # Revert to previous version immediately
- # If Blue-Green: Switch traffic back to blue (original)
- aws elbv2 modify-rule --rule-arn <arn> \
- --actions Type=forward,TargetGroups="[{TargetGroupArn=<blue-tg>,Weight=100}]"
-
- # If Canary: Immediately route 100% back to previous version
- aws elbv2 modify-rule --rule-arn <arn> \
- --actions Type=forward,TargetGroups="[{TargetGroupArn=<v2-tg>,Weight=100}]"
-
- # Verify rollback
- curl -f http://[prod-url]/health
-
- # Notify stakeholders
- # Send: Slack message, Status page update, Email to on-call
-
- echo "✅ Rollback to v[PREVIOUS] complete"
- echo "🔴 Incident: v[X.Y.Z] deployment ROLLED BACK"
- echo "📋 Investigation: See INCIDENT.md"
- ```
-
- ---
-
- ## Post-Deployment Monitoring (24 hours)
-
- **First 1 hour (Critical)**:
- - [ ] Error rate: Maintain <0.1%
- - [ ] Latency: Within 10% of baseline
- - [ ] Throughput: Expected level
- - [ ] Health checks: Continuous passing
- - [ ] Consumer feedback: No critical issues
-
- **24-hour window**:
- - [ ] Error rate: Maintain <0.1%
- - [ ] Latency: Stable, no spikes
- - [ ] All business workflows: Operational
- - [ ] Consumer migrations: On track (if breaking changes)
- - [ ] On-call team: Standing down after 24 hours
-
- **Checklist**:
- - [ ] 1-hour post-deployment: All green
- - [ ] 4-hour checkpoint: All green
- - [ ] 24-hour checkpoint: Ready to remove rollback capability
-
- ---
-
- ## Incident Log
-
- **If any issues occur, document**:
-
- ```
- Time: [HH:MM UTC]
- Severity: [LOW|MEDIUM|HIGH|CRITICAL]
- Description: [What happened]
- Impact: [How many users affected]
- Root cause: [Why did it happen]
- Action taken: [What was done]
- Resolution: [How was it fixed]
- Prevention: [How to prevent next time]
- ```
-
- ---
-
- ## Sign-Off & Completion
-
- **Deployment Complete**:
- - Version deployed: v[X.Y.Z]
- - Environment: [Development|Staging|Production]
- - Date/Time: [YYYY-MM-DD HH:MM UTC]
- - Deployed by: [Name]
- - Approvals: [List all approvers]
- - All health checks: ✅ PASS
- - Smoke tests: ✅ PASS
- - Issues: [None / List any]
- - Status: ✅ **READY FOR NEXT STAGE**
-
- ---
-
- **RUNBOOK COMPLETE**
1
+ # Release Runbook Template
2
+
3
+ **Environment**: [Development|Staging|Production]
4
+ **Release Version**: v[X.Y.Z]
5
+ **Release Date**: [YYYY-MM-DD HH:MM UTC]
6
+ **Release Manager**: [Name]
7
+ **Approval Chain**: [List approvers]
8
+
9
+ ---
10
+
11
+ ## Pre-Deployment Phase (30-60 minutes)
12
+
13
+ ### 1. Pre-Deployment Verification (10 min)
14
+
15
+ **Checklist**:
16
+ - [ ] All tests passing (80%+ coverage)
17
+ - [ ] Version correctly updated in VERSION file
18
+ - [ ] CHANGELOG.md updated with release notes
19
+ - [ ] Breaking changes documented (if applicable)
20
+ - [ ] Migration guides created (if breaking changes)
21
+ - [ ] Database migrations tested in staging
22
+ - [ ] Configuration validated for this environment
23
+ - [ ] Rollback plan tested and documented
24
+ - [ ] Team on-call and ready
25
+ - [ ] Status page updated (if applicable)
26
+
27
+ **Command to verify**:
28
+ ```bash
29
+ # Check tests
30
+ npm run test -- --coverage
31
+
32
+ # Verify version
33
+ cat VERSION # Should show v[X.Y.Z]
34
+
35
+ # Verify changelog
36
+ cat CHANGELOG.md | head -20 # Should have new version entry
37
+
38
+ # Verify migrations (if applicable)
39
+ ls -la db/migrations/ # Check latest migration script
40
+ ```
41
+
42
+ ### 2. Database Migrations (if applicable, 10-20 min)
43
+
44
+ **For Development/Staging**:
45
+ ```bash
46
+ # Run migrations
47
+ ./db/migrate-up.sh
48
+
49
+ # Verify: Check table structure
50
+ psql -d database_dev -c "\d users"
51
+
52
+ # Verify: Check data integrity
53
+ psql -d database_dev -c "SELECT COUNT(*) FROM users"
54
+
55
+ # Save migration timestamp
56
+ echo "Migration completed at $(date)" >> MIGRATION_LOG.md
57
+ ```
58
+
59
+ **For Production** (if breaking changes):
60
+ - [ ] Backup database created: `db_backup_v[X.Y.Z]_$(date).sql`
61
+ - [ ] Backup tested: Can restore from backup
62
+ - [ ] Estimated migration time: [X] minutes
63
+ - [ ] Data integrity verified post-migration
64
+ - [ ] Rollback migration script tested
65
+
66
+ ### 3. Configuration Validation (5-10 min)
67
+
68
+ **Check environment-specific config**:
69
+ ```bash
70
+ # Verify environment variables loaded
71
+ env | grep -E "DATABASE_URL|API_KEY|FEATURE_FLAG"
72
+
73
+ # Verify secrets available
74
+ grep -r "SECRET_KEY" config/ | grep -v "# PLACEHOLDER"
75
+
76
+ # Verify no hardcoded values
77
+ grep -r "hardcoded_value\|TODO_REPLACE\|CHANGE_ME" src/
78
+
79
+ # Validate config syntax
80
+ npm run config:validate
81
+ ```
82
+
83
+ **Config checklist**:
84
+ - [ ] Database URL correct for environment
85
+ - [ ] API keys/tokens available
86
+ - [ ] Feature flags set correctly
87
+ - [ ] Logging level appropriate (DEBUG in dev, INFO in prod)
88
+ - [ ] TLS certificates valid
89
+ - [ ] Domain names correct
90
+
91
+ ### 4. Pre-Deployment Approval (5 min)
92
+
93
+ **Get sign-off from**:
94
+ - [ ] Release Manager: Confirmed ready
95
+ - [ ] Tech Lead: All checks pass
96
+ - [ ] Ops Lead: Infrastructure ready
97
+ - [ ] Product (if breaking changes): Consumer communication sent
98
+
99
+ **Approval timestamp**: [HH:MM UTC]
100
+
101
+ ---
102
+
103
+ ## Deployment Phase (20-60 minutes)
104
+
105
+ ### For Development Environment
106
+
107
+ ```bash
108
+ # 1. Build Docker image
109
+ docker build -t myapp:v[X.Y.Z] .
110
+
111
+ # 2. Tag image
112
+ docker tag myapp:v[X.Y.Z] myapp:latest
113
+
114
+ # 3. Stop old container
115
+ docker stop myapp-container || true
116
+
117
+ # 4. Remove old container
118
+ docker rm myapp-container || true
119
+
120
+ # 5. Run new container
121
+ docker run -d \
122
+ --name myapp-container \
123
+ --env-file .env.dev \
124
+ -p 3000:3000 \
125
+ myapp:v[X.Y.Z]
126
+
127
+ # 6. Wait for startup (verify health endpoint)
128
+ sleep 5
129
+ curl -f http://localhost:3000/health || exit 1
130
+
131
+ echo "✅ Development deployment complete: v[X.Y.Z]"
132
+ ```
133
+
134
+ ### For Staging Environment (via Kubernetes)
135
+
136
+ ```bash
137
+ # 1. Build and push image to registry
138
+ docker build -t registry.example.com/myapp:v[X.Y.Z] .
139
+ docker push registry.example.com/myapp:v[X.Y.Z]
140
+
141
+ # 2. Update deployment
142
+ kubectl set image deployment/myapp \
143
+ myapp=registry.example.com/myapp:v[X.Y.Z] \
144
+ -n staging
145
+
146
+ # 3. Wait for rollout
147
+ kubectl rollout status deployment/myapp -n staging --timeout=5m
148
+
149
+ # 4. Verify pods running
150
+ kubectl get pods -n staging -l app=myapp
151
+
152
+ # 5. Port forward for testing
153
+ kubectl port-forward svc/myapp 3000:3000 -n staging
154
+
155
+ echo "✅ Staging deployment complete: v[X.Y.Z]"
156
+ ```
157
+
158
+ ### For Production Environment (Blue-Green or Canary)
159
+
160
+ **Blue-Green**:
161
+ ```bash
162
+ # 1. Build and push to registry
163
+ docker build -t registry.example.com/myapp:v[X.Y.Z] .
164
+ docker push registry.example.com/myapp:v[X.Y.Z]
165
+
166
+ # 2. Deploy to "green" environment (parallel to prod)
167
+ kubectl apply -f deployment-green-v[X.Y.Z].yaml -n production
168
+
169
+ # 3. Wait for health checks
170
+ kubectl rollout status deployment/myapp-green -n production --timeout=5m
171
+
172
+ # 4. Verify green environment healthy
173
+ curl -f https://green.myapp.example.com/health || exit 1
174
+
175
+ # 5. Switch traffic: blue (old) → green (new)
176
+ # This is typically done via load balancer or DNS switch
177
+ aws elbv2 modify-rule --rule-arn <arn> \
178
+ --actions Type=forward,TargetGroups="[{TargetGroupArn=<green-tg>,Weight=100}]"
179
+
180
+ # 6. Monitor: Keep blue running for instant rollback (1-2 hours)
181
+ echo "✅ Production deployment complete: Traffic switched to green (v[X.Y.Z])"
182
+ echo "⏱️ Blue environment available for 2 hours for instant rollback"
183
+ ```
184
+
185
+ **Canary** (for breaking changes):
186
+ ```bash
187
+ # 1-4: Same as blue-green deployment steps
188
+
189
+ # 5. Route small % of traffic to new version (canary)
190
+ aws elbv2 modify-rule --rule-arn <arn> \
191
+ --actions Type=forward,TargetGroups="[{TargetGroupArn=<v3-tg>,Weight=1},{TargetGroupArn=<v2-tg>,Weight=99}]"
192
+
193
+ # 6. Monitor Stage 1 (1 hour at 1% traffic)
194
+ # ... (see canary-deployment-orchestration.md for full stages)
195
+ ```
196
+
197
+ ---
198
+
199
+ ## Post-Deployment Verification (15-30 minutes)
200
+
201
+ ### 1. Health Check Verification (5 min)
202
+
203
+ ```bash
204
+ # Check liveness probe
205
+ curl -f http://[app-url]/health
206
+ # Expected: 200 OK with version info
207
+
208
+ # Check readiness probe
209
+ curl -f http://[app-url]/ready
210
+ # Expected: 200 OK with dependency status
211
+
212
+ # Check metrics endpoint
213
+ curl -f http://[app-url]/metrics | head -20
214
+ # Expected: Prometheus metrics output
215
+ ```
216
+
217
+ **Verification checklist**:
218
+ - [ ] Liveness endpoint: 200 OK
219
+ - [ ] Readiness endpoint: 200 OK
220
+ - [ ] Version matches v[X.Y.Z]
221
+ - [ ] Database connected: Yes
222
+ - [ ] Cache connected: Yes (if applicable)
223
+ - [ ] External services: Available
224
+
225
+ ### 2. Smoke Test Scenarios (5-10 min)
226
+
227
+ **Critical Workflow #1: Authentication**
228
+ ```bash
229
+ # Test login
230
+ curl -X POST http://[app-url]/api/login \
231
+ -H "Content-Type: application/json" \
232
+ -d '{"email":"test@example.com","password":"password"}'
233
+
234
+ # Expected response:
235
+ # { "token": "...", "user": { "id": 1, "email": "test@example.com" } }
236
+ ```
237
+
238
+ **Critical Workflow #2: Create & Read Data**
239
+ ```bash
240
+ # Create user
241
+ curl -X POST http://[app-url]/api/users \
242
+ -H "Authorization: Bearer [token]" \
243
+ -H "Content-Type: application/json" \
244
+ -d '{"name":"John","email":"john@example.com"}'
245
+
246
+ # Read user (verify format is correct for this version)
247
+ curl http://[app-url]/api/users/1 \
248
+ -H "Authorization: Bearer [token]"
249
+
250
+ # Expected for v3.0.0: { "data": { "id": 1, "name": "John" } }
251
+ # NOT: { "user": { "id": 1, "name": "John" } }
252
+ ```
253
+
254
+ **Critical Workflow #3: Business Logic**
255
+ ```bash
256
+ # Test key business workflow (e.g., payment processing)
257
+ curl -X POST http://[app-url]/api/payments \
258
+ -H "Authorization: Bearer [token]" \
259
+ -H "Content-Type: application/json" \
260
+ -d '{"amount":99.99,"currency":"USD"}'
261
+
262
+ # Expected: 200 OK with payment ID
263
+ ```
264
+
265
+ **Smoke Test Checklist**:
266
+ - [ ] Login works
267
+ - [ ] Create resource works
268
+ - [ ] Read resource works (new format if applicable)
269
+ - [ ] Update resource works
270
+ - [ ] Delete resource works
271
+ - [ ] Business critical endpoint works
272
+ - [ ] Error handling works (test 404, 400, 500 scenarios)
273
+
274
+ ### 3. Database Integrity Check (5 min)
275
+
276
+ ```bash
277
+ # Verify database accessible
278
+ psql -d [database] -c "SELECT 1"
279
+
280
+ # Check that the users table is readable and has a plausible row count
281
+ psql -d [database] -c "SELECT COUNT(*) FROM users"
282
+
283
+ # Confirm the latest migrations were applied (compare against the expected versions)
284
+ psql -d [database] -c "SELECT * FROM schema_migrations ORDER BY version DESC LIMIT 5"
285
+
286
+ # Spot-check one row (a sanity check, not a full corruption scan)
287
+ psql -d [database] -c "SELECT * FROM users LIMIT 1" | head -10
288
+ ```
289
+
290
+ **Database checklist**:
291
+ - [ ] Database connection successful
292
+ - [ ] Tables present
293
+ - [ ] Data accessible
294
+ - [ ] No schema errors
295
+ - [ ] Data integrity checks pass
296
+
297
+ ### 4. Performance Baseline (5 min)
298
+
299
+ ```bash
300
+ # Get baseline metrics
301
+ curl http://[app-url]/metrics | grep http_request_duration_seconds | head -10
302
+
303
+ # Expected: P95 request duration under 200 ms (0.2 in this metric's unit of seconds)
304
+ ```
305
+
306
+ **Performance checklist**:
307
+ - [ ] Response time: <200ms P95
308
+ - [ ] Error rate: <0.1%
309
+ - [ ] CPU usage: <70%
310
+ - [ ] Memory usage: <80%
311
+ - [ ] Cache hit rate: >80% (if applicable)
312
+
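The raw `/metrics` output lists cumulative histogram buckets rather than a P95 value, so the `<200ms P95` item needs a small calculation. The following is a minimal sketch, assuming a single `http_request_duration_seconds` series in Prometheus text format with buckets in ascending `le` order and no trailing timestamps. The function name `p95_upper_bound` is hypothetical. It prints the upper bound of the first bucket that covers 95% of requests, which is coarser than an interpolated quantile.

```shell
# Hypothetical helper: read Prometheus histogram lines on stdin and print the
# smallest bucket bound (le) whose cumulative count covers 95% of all samples.
p95_upper_bound() {
  awk '
    index($0, "_bucket{") > 0 {
      # Extract the value of the le="..." label.
      if (match($0, /le="[^"]*"/)) {
        le = substr($0, RSTART + 4, RLENGTH - 5)
        n++
        bound[n] = le
        count[n] = $NF + 0
        # The +Inf bucket holds the total number of observations.
        if (le == "+Inf") total = $NF + 0
      }
    }
    END {
      if (total == 0) { print "no samples"; exit 1 }
      target = 0.95 * total
      for (i = 1; i <= n; i++) {
        if (count[i] >= target) { print bound[i]; exit 0 }
      }
    }
  '
}
```

For example, `curl -s "http://[app-url]/metrics" | grep '^http_request_duration_seconds_bucket' | p95_upper_bound` printing `0.2` or lower would be consistent with the checklist target. If the service exports several label sets (per route or method), the series would need to be filtered or aggregated first.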
313
+ ---
314
+
315
+ ## Rollback Phase (If Needed - < 5 minutes)
316
+
317
+ ### For Development/Staging
318
+
319
+ ```bash
320
+ # Get previous version
321
+ PREVIOUS_VERSION=$(git tag | sort -V | tail -2 | head -1)
322
+
323
+ # Stop current version
324
+ docker stop myapp-container && docker rm myapp-container  # remove it so the name can be reused
325
+
326
+ # Run previous version
327
+ docker run -d \
328
+ --name myapp-container \
329
+ --env-file .env.dev \
330
+ -p 3000:3000 \
331
+ myapp:${PREVIOUS_VERSION}
332
+
333
+ # Verify rollback successful
334
+ sleep 5
335
+ curl -f http://localhost:3000/health || exit 1
336
+
337
+ echo "✅ Rollback to ${PREVIOUS_VERSION} complete"
338
+ ```
339
+
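The `git tag | sort -V | tail -2 | head -1` pipeline depends on version-aware sorting: a plain lexical sort would place `v1.10.0` before `v1.9.0`. It also returns the current tag when only one tag exists. A minimal sketch of the same selection with that guard added, assuming a `sort` that supports `-V` (GNU coreutils); the function name `previous_version` is hypothetical:

```shell
# Hypothetical helper: read tag names on stdin and print the second-highest
# version. Fails when fewer than two tags exist, since there is then no
# previous version to roll back to.
previous_version() (
  tags=$(sort -V)
  if [ "$(printf '%s\n' "$tags" | grep -c .)" -lt 2 ]; then
    echo "need at least two tags to find a previous version" >&2
    exit 1
  fi
  printf '%s\n' "$tags" | tail -n 2 | head -n 1
)
```

It could be used as `PREVIOUS_VERSION=$(git tag | previous_version) || exit 1`. Pre-release tags such as release candidates may sort in unexpected positions, so the result is worth checking by eye before rolling back.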
340
+ ### For Production
341
+
342
+ ```bash
343
+ # Revert to previous version immediately
344
+ # If Blue-Green: Switch traffic back to blue (original)
345
+ aws elbv2 modify-rule --rule-arn <arn> \
346
+ --actions 'Type=forward,ForwardConfig={TargetGroups=[{TargetGroupArn=<blue-tg>,Weight=100}]}'
347
+
348
+ # If Canary: Immediately route 100% back to previous version
349
+ aws elbv2 modify-rule --rule-arn <arn> \
350
+ --actions 'Type=forward,ForwardConfig={TargetGroups=[{TargetGroupArn=<v2-tg>,Weight=100}]}'
351
+
352
+ # Verify rollback
353
+ curl -f http://[prod-url]/health || exit 1
354
+
355
+ # Notify stakeholders
356
+ # Send: Slack message, Status page update, Email to on-call
357
+
358
+ echo "✅ Rollback to v[PREVIOUS] complete"
359
+ echo "🔴 Incident: v[X.Y.Z] deployment ROLLED BACK"
360
+ echo "📋 Investigation: See INCIDENT.md"
361
+ ```
362
+
363
+ ---
364
+
365
+ ## Post-Deployment Monitoring (24 hours)
366
+
367
+ **First 1 hour (Critical)**:
368
+ - [ ] Error rate: Maintain <0.1%
369
+ - [ ] Latency: Within 10% of baseline
370
+ - [ ] Throughput: Expected level
371
+ - [ ] Health checks: Continuously passing
372
+ - [ ] Consumer feedback: No critical issues
373
+
374
+ **24-hour window**:
375
+ - [ ] Error rate: Maintain <0.1%
376
+ - [ ] Latency: Stable, no spikes
377
+ - [ ] All business workflows: Operational
378
+ - [ ] Consumer migrations: On track (if breaking changes)
379
+ - [ ] On-call team: Standing down after 24 hours
380
+
381
+ **Checklist**:
382
+ - [ ] 1-hour post-deployment: All green
383
+ - [ ] 4-hour checkpoint: All green
384
+ - [ ] 24-hour checkpoint: Ready to remove rollback capability
385
+
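The first-hour thresholds can be turned into a simple pass/fail gate. The sketch below is one possible reading of them, not the runbook's own tooling: it treats "within 10% of baseline" as "no more than 10% above baseline", and the function name `slo_gate` and its argument order are assumptions. Inputs are the observed error rate in percent, the observed latency, and the baseline latency in the same unit.

```shell
# Hypothetical gate: PASS when error rate is below 0.1% and latency is at most
# 10% above baseline; otherwise ROLLBACK with a non-zero exit status.
# Usage: slo_gate ERROR_RATE_PERCENT LATENCY BASELINE_LATENCY
slo_gate() {
  awk -v err="$1" -v lat="$2" -v base="$3" 'BEGIN {
    ok = (err < 0.1) && (lat <= base * 1.10)
    print (ok ? "PASS" : "ROLLBACK")
    exit (ok ? 0 : 1)
  }'
}
```

Calling `slo_gate 0.05 210 200` at each checkpoint gives a single signal, and a non-zero exit could trigger the rollback steps described earlier.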
386
+ ---
387
+
388
+ ## Incident Log
389
+
390
+ **If any issues occur, document**:
391
+
392
+ ```
393
+ Time: [HH:MM UTC]
394
+ Severity: [LOW|MEDIUM|HIGH|CRITICAL]
395
+ Description: [What happened]
396
+ Impact: [How many users affected]
397
+ Root cause: [Why did it happen]
398
+ Action taken: [What was done]
399
+ Resolution: [How was it fixed]
400
+ Prevention: [How to prevent next time]
401
+ ```
402
+
403
+ ---
404
+
405
+ ## Sign-Off & Completion
406
+
407
+ **Deployment Complete**:
408
+ - Version deployed: v[X.Y.Z]
409
+ - Environment: [Development|Staging|Production]
410
+ - Date/Time: [YYYY-MM-DD HH:MM UTC]
411
+ - Deployed by: [Name]
412
+ - Approvals: [List all approvers]
413
+ - All health checks: ✅ PASS
414
+ - Smoke tests: ✅ PASS
415
+ - Issues: [None / List any]
416
+ - Status: ✅ **READY FOR NEXT STAGE**
417
+
418
+ ---
419
+
420
+ **RUNBOOK COMPLETE**