specsmd 0.0.1
- package/README.md +300 -0
- package/bin/cli.js +21 -0
- package/flows/aidlc/README.md +372 -0
- package/flows/aidlc/agents/construction-agent.md +81 -0
- package/flows/aidlc/agents/inception-agent.md +95 -0
- package/flows/aidlc/agents/master-agent.md +61 -0
- package/flows/aidlc/agents/operations-agent.md +89 -0
- package/flows/aidlc/commands/construction-agent.md +63 -0
- package/flows/aidlc/commands/inception-agent.md +55 -0
- package/flows/aidlc/commands/master-agent.md +47 -0
- package/flows/aidlc/commands/operations-agent.md +77 -0
- package/flows/aidlc/context-config.yaml +41 -0
- package/flows/aidlc/memory-bank.yaml +104 -0
- package/flows/aidlc/quick-start.md +315 -0
- package/flows/aidlc/skills/construction/bolt-list.md +163 -0
- package/flows/aidlc/skills/construction/bolt-replan.md +343 -0
- package/flows/aidlc/skills/construction/bolt-start.md +289 -0
- package/flows/aidlc/skills/construction/bolt-status.md +185 -0
- package/flows/aidlc/skills/construction/navigator.md +196 -0
- package/flows/aidlc/skills/inception/bolt-plan.md +338 -0
- package/flows/aidlc/skills/inception/context.md +171 -0
- package/flows/aidlc/skills/inception/intent-create.md +211 -0
- package/flows/aidlc/skills/inception/intent-list.md +124 -0
- package/flows/aidlc/skills/inception/navigator.md +207 -0
- package/flows/aidlc/skills/inception/requirements.md +227 -0
- package/flows/aidlc/skills/inception/review.md +248 -0
- package/flows/aidlc/skills/inception/story-create.md +304 -0
- package/flows/aidlc/skills/inception/units.md +271 -0
- package/flows/aidlc/skills/master/analyze-context.md +132 -0
- package/flows/aidlc/skills/master/answer-question.md +141 -0
- package/flows/aidlc/skills/master/explain-flow.md +146 -0
- package/flows/aidlc/skills/master/project-init.md +281 -0
- package/flows/aidlc/skills/master/route-request.md +126 -0
- package/flows/aidlc/skills/operations/build.md +237 -0
- package/flows/aidlc/skills/operations/deploy.md +259 -0
- package/flows/aidlc/skills/operations/monitor.md +265 -0
- package/flows/aidlc/skills/operations/navigator.md +209 -0
- package/flows/aidlc/skills/operations/verify.md +224 -0
- package/flows/aidlc/templates/construction/bolt-template.md +193 -0
- package/flows/aidlc/templates/construction/bolt-types/bdd-construction-bolt.md +250 -0
- package/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/adr-template.md +49 -0
- package/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-01-domain-model-template.md +55 -0
- package/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-02-technical-design-template.md +67 -0
- package/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt/ddd-03-test-report-template.md +62 -0
- package/flows/aidlc/templates/construction/bolt-types/ddd-construction-bolt.md +528 -0
- package/flows/aidlc/templates/construction/bolt-types/simple-construction-bolt.md +273 -0
- package/flows/aidlc/templates/construction/bolt-types/spike-bolt.md +240 -0
- package/flows/aidlc/templates/construction/bolt-types/tdd-construction-bolt.md +259 -0
- package/flows/aidlc/templates/construction/construction-log-template.md +129 -0
- package/flows/aidlc/templates/construction/standards/coding-standards.md +29 -0
- package/flows/aidlc/templates/construction/standards/system-architecture.md +22 -0
- package/flows/aidlc/templates/construction/standards/tech-stack.md +19 -0
- package/flows/aidlc/templates/inception/inception-log-template.md +134 -0
- package/flows/aidlc/templates/inception/project/README.md +55 -0
- package/flows/aidlc/templates/inception/requirements-template.md +144 -0
- package/flows/aidlc/templates/inception/stories-template.md +38 -0
- package/flows/aidlc/templates/inception/story-template.md +147 -0
- package/flows/aidlc/templates/inception/system-context-template.md +29 -0
- package/flows/aidlc/templates/inception/unit-brief-template.md +177 -0
- package/flows/aidlc/templates/inception/units-template.md +52 -0
- package/flows/aidlc/templates/standards/catalog.yaml +345 -0
- package/flows/aidlc/templates/standards/coding-standards.guide.md +553 -0
- package/flows/aidlc/templates/standards/data-stack.guide.md +162 -0
- package/flows/aidlc/templates/standards/tech-stack.guide.md +280 -0
- package/lib/InstallerFactory.js +36 -0
- package/lib/cli-utils.js +372 -0
- package/lib/constants.js +31 -0
- package/lib/installer.js +314 -0
- package/lib/installers/AntigravityInstaller.js +22 -0
- package/lib/installers/ClaudeInstaller.js +85 -0
- package/lib/installers/ClineInstaller.js +21 -0
- package/lib/installers/CodexInstaller.js +21 -0
- package/lib/installers/CopilotInstaller.js +113 -0
- package/lib/installers/CursorInstaller.js +63 -0
- package/lib/installers/GeminiInstaller.js +75 -0
- package/lib/installers/KiroInstaller.js +22 -0
- package/lib/installers/OpenCodeInstaller.js +22 -0
- package/lib/installers/RooInstaller.js +22 -0
- package/lib/installers/ToolInstaller.js +73 -0
- package/lib/installers/WindsurfInstaller.js +76 -0
- package/lib/markdown-validator.ts +175 -0
- package/lib/yaml-validator.ts +99 -0
- package/package.json +65 -0
|
@@ -0,0 +1,265 @@
|
|
|
1
|
+
# Skill: Setup Monitoring
|
|
2
|
+
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
## Progress Display
|
|
6
|
+
|
|
7
|
+
Show at start of this skill:
|
|
8
|
+
|
|
9
|
+
```text
|
|
10
|
+
### Operations Progress
|
|
11
|
+
- [x] Build approval
|
|
12
|
+
- [x] Staging deploy
|
|
13
|
+
- [x] Production deploy
|
|
14
|
+
- [ ] Monitoring setup ← current
|
|
15
|
+
```
|
|
16
|
+
|
|
17
|
+
---
|
|
18
|
+
|
|
19
|
+
## Checkpoints in This Skill
|
|
20
|
+
|
|
21
|
+
| Checkpoint | Purpose | Wait For |
|
|
22
|
+
|------------|---------|----------|
|
|
23
|
+
| Checkpoint 4 | Monitoring setup approval | User confirmation |
|
|
24
|
+
|
|
25
|
+
---
|
|
26
|
+
|
|
27
|
+
## Goal
|
|
28
|
+
|
|
29
|
+
Configure observability (metrics, logging, alerting) for the unit and document operational runbooks.
|
|
30
|
+
|
|
31
|
+
---
|
|
32
|
+
|
|
33
|
+
## Input
|
|
34
|
+
|
|
35
|
+
- **Required**: `--unit` - The unit to monitor
|
|
36
|
+
- **Required**: `.specsmd/aidlc/memory-bank.yaml` - artifact schema
|
|
37
|
+
- **Optional**: `--env` - Specific environment (default: all)
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
## Process
|
|
42
|
+
|
|
43
|
+
### 1. Identify Key Metrics
|
|
44
|
+
|
|
45
|
+
Track the RED method signals (Rate, Errors, Duration), plus Saturation for capacity:
|
|
46
|
+
|
|
47
|
+
- **Rate**: Requests/sec → Traffic patterns
|
|
48
|
+
- **Errors**: Error rate, error types → Health indicator
|
|
49
|
+
- **Duration**: Latency percentiles → Performance
|
|
50
|
+
- **Saturation**: CPU, Memory, Connections → Capacity
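
As a rough illustration of how the first three signals can be derived from raw request records, the sketch below computes request rate, error ratio, and a nearest-rank p95 latency. The `Request` record and the `red_summary` helper are hypothetical names used only for this example; they are not part of specsmd.

```python
import math
from dataclasses import dataclass


@dataclass
class Request:
    status: int         # HTTP status code of the response
    duration_ms: float  # time taken to serve the request


def percentile(values, pct):
    """Nearest-rank percentile for an integer pct in (0, 100]."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


def red_summary(requests, window_seconds):
    """Summarise Rate, Errors, Duration for one observation window."""
    total = len(requests)
    errors = sum(1 for r in requests if r.status >= 500)  # assumption: only 5xx counts as an error
    return {
        "rate_per_sec": total / window_seconds,
        "error_ratio": errors / total if total else 0.0,
        "p95_ms": percentile([r.duration_ms for r in requests], 95) if total else None,
    }
```

Saturation is left out because it normally comes from host or runtime metrics rather than from request records.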
|
|
51
|
+
|
|
52
|
+
### 2. Define SLIs/SLOs
|
|
53
|
+
|
|
54
|
+
Service Level Indicators and Objectives:
|
|
55
|
+
|
|
56
|
+
- **Availability**: 99.9% (measured by uptime)
|
|
57
|
+
- **Latency**: p95 < 200ms (measured by response time)
|
|
58
|
+
- **Error Rate**: < 0.1% (measured by 5xx responses)
|
|
59
|
+
- **Throughput**: > 1000 req/s (measured by requests per second)
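
An availability target is easier to reason about once it is converted into an error budget. The following is a minimal sketch, assuming a rolling window measured in days (the monitoring template in this skill uses a 30-day window); the function names are illustrative only.

```python
def error_budget_minutes(slo_percent, window_days):
    """Allowed downtime, in minutes, for an availability SLO over a window."""
    window_minutes = window_days * 24 * 60
    return window_minutes * (100 - slo_percent) / 100


def budget_remaining(slo_percent, window_days, downtime_minutes):
    """Minutes of budget left; a negative value means the SLO is already violated."""
    return error_budget_minutes(slo_percent, window_days) - downtime_minutes
```

For a 99.9% target over 30 days, the budget works out to roughly 43.2 minutes of downtime.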
|
|
60
|
+
|
|
61
|
+
### 3. Configure Alerting
|
|
62
|
+
|
|
63
|
+
Set up alerts for SLO violations:
|
|
64
|
+
|
|
65
|
+
- **High Error Rate**: > 1% for 5min → Critical → Page on-call
|
|
66
|
+
- **High Latency**: p95 > 500ms for 10min → Warning → Investigate
|
|
67
|
+
- **Service Down**: Health check failing → Critical → Page on-call
|
|
68
|
+
- **Resource Exhaustion**: CPU/Memory > 90% → Warning → Scale up
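
Rules such as "> 1% for 5min" fire only when the condition holds continuously for the whole duration. The sketch below shows one way to evaluate that, assuming samples arrive as `(timestamp_seconds, value)` pairs sorted by time; `sustained_breach` is a hypothetical helper, not an API of any particular alerting tool.

```python
def sustained_breach(samples, threshold, duration_seconds):
    """Return True if the value has stayed above threshold for at least
    duration_seconds, measured back from the most recent sample."""
    if not samples:
        return False
    latest_ts = samples[-1][0]
    breach_start = None
    # Walk backwards until the first sample that is not in breach.
    for ts, value in reversed(samples):
        if value <= threshold:
            break
        breach_start = ts
    if breach_start is None:
        return False
    return latest_ts - breach_start >= duration_seconds
```

With one sample per minute, a single healthy reading inside the window resets the clock, which keeps short spikes from paging the on-call engineer.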
|
|
69
|
+
|
|
70
|
+
### 4. Setup Logging
|
|
71
|
+
|
|
72
|
+
Configure structured logging:
|
|
73
|
+
|
|
74
|
+
```markdown
|
|
75
|
+
### Log Configuration
|
|
76
|
+
|
|
77
|
+
**Format**: JSON structured logs
|
|
78
|
+
**Fields**:
|
|
79
|
+
- timestamp
|
|
80
|
+
- level
|
|
81
|
+
- service
|
|
82
|
+
- trace_id
|
|
83
|
+
- message
|
|
84
|
+
- context
|
|
85
|
+
|
|
86
|
+
**Aggregation**: {log service}
|
|
87
|
+
**Retention**: {days}
|
|
88
|
+
```
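
The field list above maps naturally onto a custom formatter. This is a minimal sketch using Python's standard `logging` module; the `JsonFormatter` class name and the convention of passing `trace_id` and `context` through `extra` are assumptions made for the example.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object with the fields listed above."""

    def __init__(self, service):
        super().__init__()
        self.service = service

    def format(self, record):
        entry = {
            "timestamp": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "service": self.service,
            # trace_id and context are optional attributes supplied via `extra`
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
            "context": getattr(record, "context", {}),
        }
        return json.dumps(entry)
```

A handler would use it via `handler.setFormatter(JsonFormatter("orders"))`, and callers would attach per-request data with `logger.info("created", extra={"trace_id": "abc", "context": {"order": 7}})`.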
|
|
89
|
+
|
|
90
|
+
### 5. Create Dashboards
|
|
91
|
+
|
|
92
|
+
Dashboard layout recommendation:
|
|
93
|
+
|
|
94
|
+
```markdown
|
|
95
|
+
### Dashboard Sections
|
|
96
|
+
|
|
97
|
+
1. **Overview Panel**
|
|
98
|
+
- Request rate
|
|
99
|
+
- Error rate
|
|
100
|
+
- p50/p95/p99 latency
|
|
101
|
+
- Active instances
|
|
102
|
+
|
|
103
|
+
2. **Errors Panel**
|
|
104
|
+
- Error breakdown by type
|
|
105
|
+
- Error rate trend
|
|
106
|
+
- Recent error logs
|
|
107
|
+
|
|
108
|
+
3. **Performance Panel**
|
|
109
|
+
- Latency distribution
|
|
110
|
+
- Throughput trend
|
|
111
|
+
- Slow endpoints
|
|
112
|
+
|
|
113
|
+
4. **Resources Panel**
|
|
114
|
+
- CPU usage
|
|
115
|
+
- Memory usage
|
|
116
|
+
- Connection pools
|
|
117
|
+
```
|
|
118
|
+
|
|
119
|
+
### 6. Document Runbooks
|
|
120
|
+
|
|
121
|
+
Create operational runbooks:
|
|
122
|
+
|
|
123
|
+
```markdown
|
|
124
|
+
### Runbook: High Error Rate
|
|
125
|
+
|
|
126
|
+
**Trigger**: Error rate > 1% for 5 minutes
|
|
127
|
+
|
|
128
|
+
**Steps**:
|
|
129
|
+
1. Check recent deployments
|
|
130
|
+
2. Review error logs for patterns
|
|
131
|
+
3. Check external dependencies
|
|
132
|
+
4. If deployment-related: rollback
|
|
133
|
+
5. If external: check status pages
|
|
134
|
+
|
|
135
|
+
**Escalation**: If unresolved in 15 min, escalate to {team}
|
|
136
|
+
```
|
|
137
|
+
|
|
138
|
+
### 7. Document Configuration
|
|
139
|
+
|
|
140
|
+
Create/update `deployment/monitoring.md`:
|
|
141
|
+
|
|
142
|
+
```markdown
|
|
143
|
+
---
|
|
144
|
+
unit: {unit-name}
|
|
145
|
+
configured: {timestamp}
|
|
146
|
+
---
|
|
147
|
+
|
|
148
|
+
## Monitoring Configuration: {unit-name}
|
|
149
|
+
|
|
150
|
+
### Dashboards
|
|
151
|
+
|
|
152
|
+
- **Overview**: {url} - General health
|
|
153
|
+
- **Errors**: {url} - Error analysis
|
|
154
|
+
- **Performance**: {url} - Latency tracking
|
|
155
|
+
|
|
156
|
+
### Alerts
|
|
157
|
+
|
|
158
|
+
- **High Error Rate**: > 1% → PagerDuty
|
|
159
|
+
- **High Latency**: p95 > 500ms → Slack
|
|
160
|
+
- **Service Down**: Health failing → PagerDuty
|
|
161
|
+
|
|
162
|
+
### SLOs
|
|
163
|
+
|
|
164
|
+
- **Availability**: 99.9% (30-day window)
|
|
165
|
+
- **Latency (p95)**: < 200ms (30-day window)
|
|
166
|
+
|
|
167
|
+
### Logs
|
|
168
|
+
|
|
169
|
+
- **Location**: {log aggregator URL}
|
|
170
|
+
- **Query**: `service="{unit}"`
|
|
171
|
+
- **Retention**: {days}
|
|
172
|
+
|
|
173
|
+
### Runbooks
|
|
174
|
+
|
|
175
|
+
- **High Error Rate**: `runbook/high-error-rate.md`
|
|
176
|
+
- **Performance Degradation**: `runbook/performance.md`
|
|
177
|
+
- **Service Recovery**: `runbook/recovery.md`
|
|
178
|
+
```
|
|
179
|
+
|
|
180
|
+
---
|
|
181
|
+
|
|
182
|
+
## Output
|
|
183
|
+
|
|
184
|
+
```markdown
|
|
185
|
+
## Monitoring Configured: {unit-name}
|
|
186
|
+
|
|
187
|
+
### Status: ✅ COMPLETE
|
|
188
|
+
|
|
189
|
+
### Observability Stack
|
|
190
|
+
|
|
191
|
+
- ✅ **Metrics**: Configured at {metrics-url}
|
|
192
|
+
- ✅ **Logging**: Configured at {logs-url}
|
|
193
|
+
- ✅ **Alerting**: Configured at {alerts-url}
|
|
194
|
+
- ✅ **Dashboards**: Created at {dashboard-url}
|
|
195
|
+
|
|
196
|
+
### Alert Channels
|
|
197
|
+
|
|
198
|
+
- **Critical**: PagerDuty
|
|
199
|
+
- **Warning**: Slack
|
|
200
|
+
- **Info**: Email
|
|
201
|
+
|
|
202
|
+
### SLOs Defined
|
|
203
|
+
|
|
204
|
+
- **Availability**: 99.9%
|
|
205
|
+
- **Latency (p95)**: < 200ms
|
|
206
|
+
- **Error Rate**: < 0.1%
|
|
207
|
+
|
|
208
|
+
### Documentation Created
|
|
209
|
+
- `{unit-path}/deployment/monitoring.md`
|
|
210
|
+
|
|
211
|
+
### Operations Complete
|
|
212
|
+
✅ Unit `{unit-name}` is now fully operational with monitoring.
|
|
213
|
+
|
|
214
|
+
### Actions
|
|
215
|
+
|
|
216
|
+
1 - **adjust**: Fine-tune alert thresholds
|
|
217
|
+
2 - **runbook**: Create additional runbooks
|
|
218
|
+
3 - **menu**: Return to operations menu
|
|
219
|
+
|
|
220
|
+
### Suggested Next Step
|
|
221
|
+
→ **menu** - Monitor for 24-48 hours, then return for adjustments
|
|
222
|
+
|
|
223
|
+
**Type a number or press Enter for suggested action.**
|
|
224
|
+
```
|
|
225
|
+
|
|
226
|
+
---
|
|
227
|
+
|
|
228
|
+
## Monitoring Setup Confirmation
|
|
229
|
+
|
|
230
|
+
**Checkpoint 4**: Ask user to confirm monitoring setup:
|
|
231
|
+
|
|
232
|
+
```text
|
|
233
|
+
Ready to configure monitoring?
|
|
234
|
+
|
|
235
|
+
This will set up:
|
|
236
|
+
1. Dashboards (Overview, Errors, Performance)
|
|
237
|
+
2. Alerts (Error rate, Latency, Health)
|
|
238
|
+
3. SLOs (Availability, Latency targets)
|
|
239
|
+
4. Runbooks (Incident response)
|
|
240
|
+
|
|
241
|
+
Proceed with monitoring setup?
|
|
242
|
+
1 - Yes, configure monitoring
|
|
243
|
+
2 - Skip (not recommended for production)
|
|
244
|
+
```
|
|
245
|
+
|
|
246
|
+
**Wait for user response.**
|
|
247
|
+
|
|
248
|
+
---
|
|
249
|
+
|
|
250
|
+
## Transition
|
|
251
|
+
|
|
252
|
+
After monitoring is approved and completed:
|
|
253
|
+
|
|
254
|
+
- → **Operations Complete** - unit is fully deployed and monitored
|
|
255
|
+
|
|
256
|
+
---
|
|
257
|
+
|
|
258
|
+
## Test Contract
|
|
259
|
+
|
|
260
|
+
```yaml
|
|
261
|
+
input: Unit name, environment
|
|
262
|
+
output: Dashboards, alerts, SLOs, runbooks, monitoring.md
|
|
263
|
+
checkpoints: 1
|
|
264
|
+
# Checkpoint 4: Monitoring setup approved by user
|
|
265
|
+
```
|
|
@@ -0,0 +1,209 @@
|
|
|
1
|
+
# Skill: Operations Navigator
|
|
2
|
+
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
## Role
|
|
6
|
+
|
|
7
|
+
Entry point for the Operations Agent. Routes to the appropriate skill based on the current deployment state.
|
|
8
|
+
|
|
9
|
+
**NO Checkpoint** - Navigator is a routing skill, not a decision point.
|
|
10
|
+
|
|
11
|
+
---
|
|
12
|
+
|
|
13
|
+
## Progress Display
|
|
14
|
+
|
|
15
|
+
Show workflow position with checkpoint markers:
|
|
16
|
+
|
|
17
|
+
```text
|
|
18
|
+
### Operations Workflow (4 Checkpoints)
|
|
19
|
+
|
|
20
|
+
[Prerequisites] Construction complete?
|
|
21
|
+
|
|
|
22
|
+
[Checkpoint 1] Build approval --> build skill
|
|
23
|
+
|
|
|
24
|
+
[Build + Deploy to Dev]
|
|
25
|
+
|
|
|
26
|
+
[Checkpoint 2] Staging deploy --> deploy skill
|
|
27
|
+
|
|
|
28
|
+
[Deploy to Staging + Verify]
|
|
29
|
+
|
|
|
30
|
+
[Checkpoint 3] Production deploy --> deploy skill
|
|
31
|
+
|
|
|
32
|
+
[Deploy to Production + Verify]
|
|
33
|
+
|
|
|
34
|
+
[Checkpoint 4] Monitoring setup --> monitor skill
|
|
35
|
+
|
|
|
36
|
+
[Operations Complete]
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
---
|
|
40
|
+
|
|
41
|
+
## Goal
|
|
42
|
+
|
|
43
|
+
Present the Operations Agent's skills and guide the user through the deployment and monitoring workflow.
|
|
44
|
+
|
|
45
|
+
---
|
|
46
|
+
|
|
47
|
+
## Input
|
|
48
|
+
|
|
49
|
+
- **Required**: `--unit` - The unit to operate on
|
|
50
|
+
- **Required**: `.specsmd/aidlc/memory-bank.yaml` - artifact schema
|
|
51
|
+
|
|
52
|
+
---
|
|
53
|
+
|
|
54
|
+
## Process
|
|
55
|
+
|
|
56
|
+
### 1. Determine Context
|
|
57
|
+
|
|
58
|
+
Load current operations state:
|
|
59
|
+
|
|
60
|
+
**Prerequisites** (must pass before proceeding):
|
|
61
|
+
|
|
62
|
+
- **Unit exists** - Path from `schema.units` (Required)
|
|
63
|
+
- **All bolts complete** - Path from `schema.bolts` (Required)
|
|
64
|
+
- **Tests passing** - Bolt completion status (Required)
|
|
65
|
+
|
|
66
|
+
**Deployment Status** (from `deployment/` artifacts):
|
|
67
|
+
|
|
68
|
+
- Build status and version
|
|
69
|
+
- Deployment history per environment
|
|
70
|
+
- Verification status
|
|
71
|
+
- Monitoring status
|
|
72
|
+
|
|
73
|
+
### 2. Present Menu
|
|
74
|
+
|
|
75
|
+
Build menu dynamically using the Output sections below based on current state.
|
|
76
|
+
|
|
77
|
+
### 3. Context-Aware Suggestions
|
|
78
|
+
|
|
79
|
+
Based on deployment state:
|
|
80
|
+
|
|
81
|
+
- **No build** → Build artifacts
|
|
82
|
+
- **Build done, not deployed** → Deploy to dev
|
|
83
|
+
- **Deployed to dev, not verified** → Verify dev
|
|
84
|
+
- **Dev verified, not in staging** → Deploy to staging
|
|
85
|
+
- **Staging verified, not in prod** → Deploy to production
|
|
86
|
+
- **Prod deployed, not verified** → Verify production
|
|
87
|
+
- **Prod verified, no monitoring** → Setup monitoring
|
|
88
|
+
- **All complete** → Operations complete
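
The mapping above amounts to a small state machine. The sketch below is one possible encoding, assuming the state is tracked as sets of environment names; `OpsState` and `suggest_next_step` are hypothetical. It also includes a "Verify staging" step, which the list does not spell out but which appears in the `Build → Dev → Verify → Staging → Verify → Prod → Verify → Monitor` workflow line of this skill's output.

```python
from dataclasses import dataclass, field


@dataclass
class OpsState:
    built: bool = False
    deployed: set = field(default_factory=set)  # environments deployed to
    verified: set = field(default_factory=set)  # environments verified
    monitored: bool = False


def suggest_next_step(state):
    """Return the first outstanding step in pipeline order."""
    if not state.built:
        return "Build artifacts"
    for env, label in (("dev", "dev"), ("staging", "staging"), ("prod", "production")):
        if env not in state.deployed:
            return f"Deploy to {label}"
        if env not in state.verified:
            return f"Verify {label}"
    if not state.monitored:
        return "Setup monitoring"
    return "Operations complete"
```

Checking environments in a fixed order means a later environment is never suggested while an earlier one is still unverified.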
|
|
89
|
+
|
|
90
|
+
### 4. Handle Selection
|
|
91
|
+
|
|
92
|
+
When user selects an option:
|
|
93
|
+
|
|
94
|
+
1. Acknowledge the selection
|
|
95
|
+
2. Load the corresponding skill file
|
|
96
|
+
3. Execute with current context
|
|
97
|
+
|
|
98
|
+
---
|
|
99
|
+
|
|
100
|
+
## Output (Ready for Operations)
|
|
101
|
+
|
|
102
|
+
```markdown
|
|
103
|
+
## Operations Agent
|
|
104
|
+
|
|
105
|
+
### Unit: `{unit-name}`
|
|
106
|
+
**Construction Status**: ✅ Complete ({n} bolts)
|
|
107
|
+
**Stories Delivered**: {n}
|
|
108
|
+
|
|
109
|
+
### Deployment Status
|
|
110
|
+
|
|
111
|
+
- [ ] Development: Not deployed
|
|
112
|
+
- [ ] Staging: Not deployed
|
|
113
|
+
- [ ] Production: Not deployed
|
|
114
|
+
|
|
115
|
+
### Quick Actions
|
|
116
|
+
|
|
117
|
+
1 - **Build artifacts**: Create deployment package (`build --unit="{unit}"`)
|
|
118
|
+
2 - **View build history**: See previous builds (`build --unit="{unit}" --history`)
|
|
119
|
+
|
|
120
|
+
### Workflow
|
|
121
|
+
Build → Dev → Verify → Staging → Verify → Prod → Verify → Monitor
|
|
122
|
+
|
|
123
|
+
### Suggested Next Step
|
|
124
|
+
→ **Build deployment artifacts** to start the deployment pipeline
|
|
125
|
+
|
|
126
|
+
**Type a number to continue.**
|
|
127
|
+
```
|
|
128
|
+
|
|
129
|
+
---
|
|
130
|
+
|
|
131
|
+
## Output (Partially Deployed)
|
|
132
|
+
|
|
133
|
+
```markdown
|
|
134
|
+
## Operations Agent
|
|
135
|
+
|
|
136
|
+
### Unit: `{unit-name}`
|
|
137
|
+
**Latest Build**: `v{version}`
|
|
138
|
+
|
|
139
|
+
### Deployment Status
|
|
140
|
+
|
|
141
|
+
- ✅ Development: `v{version}` - Deployed, Verified
|
|
142
|
+
- ⏳ Staging: `v{version}` - Deployed, Pending verification ← current
|
|
143
|
+
- ⚠️ Production: `v{prev}` - Outdated, Verified
|
|
144
|
+
|
|
145
|
+
### Quick Actions
|
|
146
|
+
|
|
147
|
+
1 - **Verify staging**: Validate deployment (`verify --unit="{unit}" --env="staging"`)
|
|
148
|
+
2 - **Deploy to prod**: Promote to production (`deploy --unit="{unit}" --env="prod"`)
|
|
149
|
+
3 - **View history**: See deployment history (`history --unit="{unit}"`)
|
|
150
|
+
|
|
151
|
+
### Suggested Next Step
|
|
152
|
+
→ **Verify staging deployment** before promoting to production
|
|
153
|
+
|
|
154
|
+
**Type a number to continue.**
|
|
155
|
+
```
|
|
156
|
+
|
|
157
|
+
---
|
|
158
|
+
|
|
159
|
+
## Output (Fully Deployed)
|
|
160
|
+
|
|
161
|
+
```markdown
|
|
162
|
+
## Operations Agent
|
|
163
|
+
|
|
164
|
+
### Unit: `{unit-name}`
|
|
165
|
+
**Status**: ✅ FULLY OPERATIONAL
|
|
166
|
+
|
|
167
|
+
### All Environments
|
|
168
|
+
|
|
169
|
+
- ✅ Development: `v{version}` - Verified, Monitored
|
|
170
|
+
- ✅ Staging: `v{version}` - Verified, Monitored
|
|
171
|
+
- ✅ Production: `v{version}` - Verified, Monitored
|
|
172
|
+
|
|
173
|
+
### Resources
|
|
174
|
+
- Dashboard: {dashboard-url}
|
|
175
|
+
- Logs: {logs-url}
|
|
176
|
+
- Alerts: {alerts-url}
|
|
177
|
+
|
|
178
|
+
### Available Actions
|
|
179
|
+
|
|
180
|
+
1 - **View metrics**: Open monitoring dashboard
|
|
181
|
+
2 - **View logs**: Open log aggregator
|
|
182
|
+
3 - **Rollback**: Deploy previous version
|
|
183
|
+
4 - **Deploy new version**: When code changes
|
|
184
|
+
|
|
185
|
+
### Unit Complete
|
|
186
|
+
✅ Unit `{unit-name}` is fully deployed and monitored.
|
|
187
|
+
|
|
188
|
+
**Type a number or return to Master Agent.**
|
|
189
|
+
```
|
|
190
|
+
|
|
191
|
+
---
|
|
192
|
+
|
|
193
|
+
## Transition
|
|
194
|
+
|
|
195
|
+
After user selection:
|
|
196
|
+
|
|
197
|
+
- → Load selected skill
|
|
198
|
+
- → Skill contains the Checkpoint markers
|
|
199
|
+
- → Execute skill process
|
|
200
|
+
|
|
201
|
+
---
|
|
202
|
+
|
|
203
|
+
## Test Contract
|
|
204
|
+
|
|
205
|
+
```yaml
|
|
206
|
+
input: Unit state, deployment status
|
|
207
|
+
output: Menu with skill options, suggested next step
|
|
208
|
+
checkpoints: 0 (routing only)
|
|
209
|
+
```
|
|
@@ -0,0 +1,224 @@
|
|
|
1
|
+
# Skill: Verify Deployment
|
|
2
|
+
|
|
3
|
+
---
|
|
4
|
+
|
|
5
|
+
## Role
|
|
6
|
+
|
|
7
|
+
Post-deployment validation skill. Runs health checks and smoke tests.
|
|
8
|
+
|
|
9
|
+
**NO Checkpoint** - Verification is an automated check, not a decision point.
|
|
10
|
+
|
|
11
|
+
Verification happens after each deploy skill checkpoint. If verification fails, recommend rollback.
|
|
12
|
+
|
|
13
|
+
---
|
|
14
|
+
|
|
15
|
+
## Goal
|
|
16
|
+
|
|
17
|
+
Confirm that a deployment is healthy, functional, and meets its acceptance criteria before proceeding to the next environment.
|
|
18
|
+
|
|
19
|
+
---
|
|
20
|
+
|
|
21
|
+
## Input
|
|
22
|
+
|
|
23
|
+
- **Required**: `--unit` - The unit to verify
|
|
24
|
+
- **Required**: `.specsmd/aidlc/memory-bank.yaml` - artifact schema
|
|
25
|
+
- **Optional**: `--env` - Target environment (default: last deployed)
|
|
26
|
+
- **Optional**: `--version` - Specific version (default: currently deployed)
|
|
27
|
+
|
|
28
|
+
---
|
|
29
|
+
|
|
30
|
+
## Process
|
|
31
|
+
|
|
32
|
+
### 1. Load Deployment Context
|
|
33
|
+
|
|
34
|
+
From `deployment/history.md`:
|
|
35
|
+
|
|
36
|
+
- Current deployed version
|
|
37
|
+
- Deployment timestamp
|
|
38
|
+
- Previous version (for comparison)
|
|
39
|
+
|
|
40
|
+
### 2. Execute Health Checks
|
|
41
|
+
|
|
42
|
+
- [ ] **Endpoint Health**: GET `/health` → 200 OK
|
|
43
|
+
- [ ] **Readiness**: GET `/ready` → 200 OK
|
|
44
|
+
- [ ] **Database**: Connection test → Connected
|
|
45
|
+
- [ ] **Cache**: Connection test → Connected
|
|
46
|
+
- [ ] **External APIs**: Connectivity check → Reachable
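
One way to run this checklist is to treat every item as a probe that either succeeds or fails. The sketch below assumes each probe is a zero-argument callable supplied by the caller (an HTTP GET against `/health`, a database ping, and so on); `run_health_checks` is a hypothetical helper, and the probes themselves are deliberately left out.

```python
def run_health_checks(checks):
    """checks: mapping of check name -> zero-argument callable that returns
    True when healthy. A probe that raises is recorded as failed."""
    results = {}
    for name, probe in checks.items():
        try:
            results[name] = bool(probe())
        except Exception:
            results[name] = False
    return results


def all_healthy(results):
    """True only when at least one check ran and every check passed."""
    return bool(results) and all(results.values())
```

Catching exceptions per probe keeps one unreachable dependency from hiding the status of the others.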
|
|
47
|
+
|
|
48
|
+
### 3. Run Smoke Tests
|
|
49
|
+
|
|
50
|
+
Execute critical path tests:
|
|
51
|
+
|
|
52
|
+
- [ ] **Authentication**: Login flow works (if applicable)
|
|
53
|
+
- [ ] **Core CRUD**: Basic operations work (required)
|
|
54
|
+
- [ ] **API Endpoints**: Key endpoints respond (required)
|
|
55
|
+
- [ ] **Error Handling**: Errors handled gracefully (required)
|
|
56
|
+
|
|
57
|
+
### 4. Check Metrics
|
|
58
|
+
|
|
59
|
+
Verify operational metrics are within bounds:
|
|
60
|
+
|
|
61
|
+
- [ ] **Response Time**: < SLA (p95 latency)
|
|
62
|
+
- [ ] **Error Rate**: < 1% (4xx/5xx responses)
|
|
63
|
+
- [ ] **CPU**: < 80% (resource utilization)
|
|
64
|
+
- [ ] **Memory**: < 80% (resource utilization)
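
The bounds above can be expressed as a table of limits and checked in one pass. In this sketch the metric names are invented for illustration, and the 200 ms latency limit is an assumption borrowed from the sample verification report in this skill, since the checklist itself only says "< SLA".

```python
# Hypothetical metric names; the p95 limit stands in for the unit's actual SLA.
LIMITS = {
    "p95_latency_ms": 200,
    "error_rate_pct": 1.0,
    "cpu_pct": 80,
    "memory_pct": 80,
}


def check_metrics(observed, limits=LIMITS):
    """Return {metric: passed}. A metric passes when it was observed and is
    strictly below its limit; a missing metric counts as a failure."""
    results = {}
    for name, limit in limits.items():
        value = observed.get(name)
        results[name] = value is not None and value < limit
    return results


def verification_passed(observed, limits=LIMITS):
    return all(check_metrics(observed, limits).values())
```

Treating a missing metric as a failure is a conservative choice: a deployment that cannot report its latency should not be promoted on the strength of the remaining checks.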
|
|
65
|
+
|
|
66
|
+
### 5. Compare to Baseline
|
|
67
|
+
|
|
68
|
+
If previous version exists:
|
|
69
|
+
|
|
70
|
+
- Error rate comparison
|
|
71
|
+
- Latency comparison
|
|
72
|
+
- Resource usage comparison
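
A baseline comparison usually reduces to a relative change per metric plus a tolerance. The following is a minimal sketch; the function names and the idea of a single percentage tolerance are assumptions for illustration, not something the skill prescribes.

```python
def relative_change_pct(previous, current):
    """Percentage change from previous to current, or None when the
    previous value is zero and the ratio is undefined."""
    if previous == 0:
        return None
    return (current - previous) / previous * 100


def regressed(previous, current, tolerance_pct):
    """True when the metric got worse (higher) by more than tolerance_pct."""
    change = relative_change_pct(previous, current)
    return change is not None and change > tolerance_pct
```

For the latency figures in the sample report (142 ms to 145 ms), the relative change is about 2%, which a 10% tolerance would not flag.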
|
|
73
|
+
|
|
74
|
+
### 6. Document Results
|
|
75
|
+
|
|
76
|
+
Create `deployment/verification-{version}.md`:
|
|
77
|
+
|
|
78
|
+
```markdown
|
|
79
|
+
---
|
|
80
|
+
version: {version}
|
|
81
|
+
environment: {env}
|
|
82
|
+
verified: {timestamp}
|
|
83
|
+
status: passed|failed
|
|
84
|
+
---
|
|
85
|
+
|
|
86
|
+
## Verification Report: {version}
|
|
87
|
+
|
|
88
|
+
### Health Checks
|
|
89
|
+
|
|
90
|
+
- ✅ **Endpoint**: 200 OK, 15ms
|
|
91
|
+
- ✅ **Database**: Connected
|
|
92
|
+
- ✅ **Cache**: Connected
|
|
93
|
+
|
|
94
|
+
### Smoke Tests
|
|
95
|
+
|
|
96
|
+
- ✅ **Login Flow**: 234ms
|
|
97
|
+
- ✅ **Create Item**: 156ms
|
|
98
|
+
- ✅ **API Health**: 12ms
|
|
99
|
+
|
|
100
|
+
### Metrics
|
|
101
|
+
|
|
102
|
+
- ✅ **p95 Latency**: 145ms (threshold: <200ms)
|
|
103
|
+
- ✅ **Error Rate**: 0.02% (threshold: <1%)
|
|
104
|
+
|
|
105
|
+
### Baseline Comparison
|
|
106
|
+
|
|
107
|
+
- **p95 Latency**: 142ms → 145ms (+2%)
|
|
108
|
+
- **Error Rate**: 0.01% → 0.02% (+0.01%)
|
|
109
|
+
|
|
110
|
+
### Conclusion
|
|
111
|
+
{passed|failed}: {summary}
|
|
112
|
+
```
|
|
113
|
+
|
|
114
|
+
---
|
|
115
|
+
|
|
116
|
+
## Output (Verification Passed)
|
|
117
|
+
|
|
118
|
+
```markdown
|
|
119
|
+
## Verification Passed: {unit-name}
|
|
120
|
+
|
|
121
|
+
### Status: ✅ VERIFIED
|
|
122
|
+
|
|
123
|
+
### Results Summary
|
|
124
|
+
|
|
125
|
+
- ✅ **Health Checks**: {n}/{n} passed
|
|
126
|
+
- ✅ **Smoke Tests**: {n}/{n} passed
|
|
127
|
+
- ✅ **Metric Checks**: {n}/{n} passed
|
|
128
|
+
|
|
129
|
+
### Key Metrics
|
|
130
|
+
|
|
131
|
+
- ✅ **Response Time (p95)**: {value}ms
|
|
132
|
+
- ✅ **Error Rate**: {value}%
|
|
133
|
+
- ✅ **Uptime**: 100%
|
|
134
|
+
|
|
135
|
+
### Environment Status
|
|
136
|
+
|
|
137
|
+
- ✅ **{env}**: `{version}` - Verified
|
|
138
|
+
|
|
139
|
+
### Documentation Created
|
|
140
|
+
- `{unit-path}/deployment/verification-{version}.md`
|
|
141
|
+
|
|
142
|
+
### Actions
|
|
143
|
+
|
|
144
|
+
1 - **monitor**: Setup monitoring for this unit
|
|
145
|
+
2 - **deploy**: Deploy to next environment
|
|
146
|
+
3 - **menu**: Return to operations menu
|
|
147
|
+
|
|
148
|
+
### Suggested Next Step
|
|
149
|
+
→ **monitor** - Setup monitoring for `{unit-name}`
|
|
150
|
+
|
|
151
|
+
**Type a number or press Enter for suggested action.**
|
|
152
|
+
```
|
|
153
|
+
|
|
154
|
+
---
|
|
155
|
+
|
|
156
|
+
## Output (Verification Failed)
|
|
157
|
+
|
|
158
|
+
```markdown
|
|
159
|
+
## Verification Failed: {unit-name}
|
|
160
|
+
|
|
161
|
+
### Status: ❌ FAILED
|
|
162
|
+
|
|
163
|
+
### Failed Checks
|
|
164
|
+
|
|
165
|
+
- ❌ **{check1}**: Expected {expected}, got {actual}
|
|
166
|
+
- ❌ **{check2}**: Expected {expected}, got {actual}
|
|
167
|
+
|
|
168
|
+
### Error Details
|
|
169
|
+
{error messages or logs}
|
|
170
|
+
|
|
171
|
+
### Impact Assessment
|
|
172
|
+
|
|
173
|
+
- **Severity**: {critical|high|medium|low}
|
|
174
|
+
- **Affected**: {what's broken}
|
|
175
|
+
|
|
176
|
+
### Recommended Action
|
|
177
|
+
⚠️ **ROLLBACK RECOMMENDED**
|
|
178
|
+
|
|
179
|
+
Previous stable version: `{prev-version}`
|
|
180
|
+
|
|
181
|
+
Rollback command:
|
|
182
|
+
deploy --unit="{unit}" --env="{env}" --version="{prev-version}"
|
|
183
|
+
|
|
184
|
+
### Actions
|
|
185
|
+
|
|
186
|
+
1 - **rollback**: Rollback to previous version
|
|
187
|
+
2 - **investigate**: Investigate root cause
|
|
188
|
+
3 - **menu**: Return to operations menu
|
|
189
|
+
|
|
190
|
+
### Suggested Next Step
|
|
191
|
+
→ **rollback** - Restore `{prev-version}` immediately
|
|
192
|
+
|
|
193
|
+
**Type a number or press Enter for suggested action.**
|
|
194
|
+
```
|
|
195
|
+
|
|
196
|
+
---
|
|
197
|
+
|
|
198
|
+
## Human Validation Point
|
|
199
|
+
|
|
200
|
+
On success:
|
|
201
|
+
> "Verification passed for `{unit}` v`{version}` in {env}. All {n} checks passed. Ready to proceed to {next-action}?"
|
|
202
|
+
|
|
203
|
+
On failure:
|
|
204
|
+
> "⚠️ Verification FAILED for `{unit}` v`{version}`. {n} checks failed. Recommend rollback to `{prev-version}`. Proceed with rollback?"
|
|
205
|
+
|
|
206
|
+
---
|
|
207
|
+
|
|
208
|
+
## Transition
|
|
209
|
+
|
|
210
|
+
After verification:
|
|
211
|
+
|
|
212
|
+
- → **Monitor** (`.specsmd/skills/operations/monitor.md`) - if verified and final environment
|
|
213
|
+
- → **Deploy** (`.specsmd/skills/operations/deploy.md`) - to next environment if verified
|
|
214
|
+
- → **Rollback** - if verification fails (deploy previous version)
|
|
215
|
+
|
|
216
|
+
---
|
|
217
|
+
|
|
218
|
+
## Test Contract
|
|
219
|
+
|
|
220
|
+
```yaml
|
|
221
|
+
input: Unit name, environment, version
|
|
222
|
+
output: Verification report with health checks, smoke tests, metrics
|
|
223
|
+
checkpoints: 0 (automated validation only)
|
|
224
|
+
```
|