@patricio0312rev/skillset 0.1.0
- package/CHANGELOG.md +29 -0
- package/LICENSE +21 -0
- package/README.md +176 -0
- package/bin/cli.js +37 -0
- package/package.json +55 -0
- package/src/commands/init.js +301 -0
- package/src/index.js +168 -0
- package/src/lib/config.js +200 -0
- package/src/lib/generator.js +166 -0
- package/src/utils/display.js +95 -0
- package/src/utils/readme.js +196 -0
- package/src/utils/tool-specific.js +233 -0
- package/templates/ai-engineering/agent-orchestration-planner/SKILL.md +266 -0
- package/templates/ai-engineering/cost-latency-optimizer/SKILL.md +270 -0
- package/templates/ai-engineering/doc-to-vector-dataset-generator/SKILL.md +239 -0
- package/templates/ai-engineering/evaluation-harness/SKILL.md +219 -0
- package/templates/ai-engineering/guardrails-safety-filter-builder/SKILL.md +226 -0
- package/templates/ai-engineering/llm-debugger/SKILL.md +283 -0
- package/templates/ai-engineering/prompt-regression-tester/SKILL.md +216 -0
- package/templates/ai-engineering/prompt-template-builder/SKILL.md +393 -0
- package/templates/ai-engineering/rag-pipeline-builder/SKILL.md +244 -0
- package/templates/ai-engineering/tool-function-schema-designer/SKILL.md +219 -0
- package/templates/architecture/adr-writer/SKILL.md +250 -0
- package/templates/architecture/api-versioning-deprecation-planner/SKILL.md +331 -0
- package/templates/architecture/domain-model-boundaries-mapper/SKILL.md +300 -0
- package/templates/architecture/migration-planner/SKILL.md +376 -0
- package/templates/architecture/performance-budget-setter/SKILL.md +318 -0
- package/templates/architecture/reliability-strategy-builder/SKILL.md +286 -0
- package/templates/architecture/rfc-generator/SKILL.md +362 -0
- package/templates/architecture/scalability-playbook/SKILL.md +279 -0
- package/templates/architecture/system-design-generator/SKILL.md +339 -0
- package/templates/architecture/tech-debt-prioritizer/SKILL.md +329 -0
- package/templates/backend/api-contract-normalizer/SKILL.md +487 -0
- package/templates/backend/api-endpoint-generator/SKILL.md +415 -0
- package/templates/backend/auth-module-builder/SKILL.md +99 -0
- package/templates/backend/background-jobs-designer/SKILL.md +166 -0
- package/templates/backend/caching-strategist/SKILL.md +190 -0
- package/templates/backend/error-handling-standardizer/SKILL.md +174 -0
- package/templates/backend/rate-limiting-abuse-protection/SKILL.md +147 -0
- package/templates/backend/rbac-permissions-builder/SKILL.md +158 -0
- package/templates/backend/service-layer-extractor/SKILL.md +269 -0
- package/templates/backend/webhook-receiver-hardener/SKILL.md +211 -0
- package/templates/ci-cd/artifact-sbom-publisher/SKILL.md +236 -0
- package/templates/ci-cd/caching-strategy-optimizer/SKILL.md +195 -0
- package/templates/ci-cd/deployment-checklist-generator/SKILL.md +381 -0
- package/templates/ci-cd/github-actions-pipeline-creator/SKILL.md +348 -0
- package/templates/ci-cd/monorepo-ci-optimizer/SKILL.md +298 -0
- package/templates/ci-cd/preview-environments-builder/SKILL.md +187 -0
- package/templates/ci-cd/quality-gates-enforcer/SKILL.md +342 -0
- package/templates/ci-cd/release-automation-builder/SKILL.md +281 -0
- package/templates/ci-cd/rollback-workflow-builder/SKILL.md +372 -0
- package/templates/ci-cd/secrets-env-manager/SKILL.md +242 -0
- package/templates/db-management/backup-restore-runbook-generator/SKILL.md +505 -0
- package/templates/db-management/data-integrity-auditor/SKILL.md +505 -0
- package/templates/db-management/data-retention-archiving-planner/SKILL.md +430 -0
- package/templates/db-management/data-seeding-fixtures-builder/SKILL.md +375 -0
- package/templates/db-management/db-performance-watchlist/SKILL.md +425 -0
- package/templates/db-management/etl-sync-job-builder/SKILL.md +457 -0
- package/templates/db-management/multi-tenant-safety-checker/SKILL.md +398 -0
- package/templates/db-management/prisma-migration-assistant/SKILL.md +379 -0
- package/templates/db-management/schema-consistency-checker/SKILL.md +440 -0
- package/templates/db-management/sql-query-optimizer/SKILL.md +324 -0
- package/templates/foundation/changelog-writer/SKILL.md +431 -0
- package/templates/foundation/code-formatter-installer/SKILL.md +320 -0
- package/templates/foundation/codebase-summarizer/SKILL.md +360 -0
- package/templates/foundation/dependency-doctor/SKILL.md +163 -0
- package/templates/foundation/dev-environment-bootstrapper/SKILL.md +259 -0
- package/templates/foundation/dev-onboarding-builder/SKILL.md +556 -0
- package/templates/foundation/docs-starter-kit/SKILL.md +574 -0
- package/templates/foundation/explaining-code/SKILL.md +13 -0
- package/templates/foundation/git-hygiene-enforcer/SKILL.md +455 -0
- package/templates/foundation/project-scaffolder/SKILL.md +65 -0
- package/templates/foundation/project-scaffolder/references/templates.md +126 -0
- package/templates/foundation/repo-structure-linter/SKILL.md +0 -0
- package/templates/foundation/repo-structure-linter/references/conventions.md +98 -0
- package/templates/frontend/animation-micro-interaction-pack/SKILL.md +41 -0
- package/templates/frontend/component-scaffold-generator/SKILL.md +562 -0
- package/templates/frontend/design-to-component-translator/SKILL.md +547 -0
- package/templates/frontend/form-wizard-builder/SKILL.md +553 -0
- package/templates/frontend/frontend-refactor-planner/SKILL.md +37 -0
- package/templates/frontend/i18n-frontend-implementer/SKILL.md +44 -0
- package/templates/frontend/modal-drawer-system/SKILL.md +377 -0
- package/templates/frontend/page-layout-builder/SKILL.md +630 -0
- package/templates/frontend/state-ux-flow-builder/SKILL.md +23 -0
- package/templates/frontend/table-builder/SKILL.md +350 -0
- package/templates/performance/alerting-dashboard-builder/SKILL.md +162 -0
- package/templates/performance/backend-latency-profiler-helper/SKILL.md +108 -0
- package/templates/performance/caching-cdn-strategy-planner/SKILL.md +150 -0
- package/templates/performance/capacity-planning-helper/SKILL.md +242 -0
- package/templates/performance/core-web-vitals-tuner/SKILL.md +126 -0
- package/templates/performance/incident-runbook-generator/SKILL.md +162 -0
- package/templates/performance/load-test-scenario-builder/SKILL.md +256 -0
- package/templates/performance/observability-setup/SKILL.md +232 -0
- package/templates/performance/postmortem-writer/SKILL.md +203 -0
- package/templates/performance/structured-logging-standardizer/SKILL.md +122 -0
- package/templates/security/auth-security-reviewer/SKILL.md +428 -0
- package/templates/security/dependency-vulnerability-triage/SKILL.md +495 -0
- package/templates/security/input-validation-sanitization-auditor/SKILL.md +76 -0
- package/templates/security/pii-redaction-logging-policy-builder/SKILL.md +65 -0
- package/templates/security/rbac-policy-tester/SKILL.md +80 -0
- package/templates/security/secrets-scanner/SKILL.md +462 -0
- package/templates/security/secure-headers-csp-builder/SKILL.md +404 -0
- package/templates/security/security-incident-playbook-generator/SKILL.md +76 -0
- package/templates/security/security-pr-checklist-skill/SKILL.md +62 -0
- package/templates/security/threat-model-generator/SKILL.md +394 -0
- package/templates/testing/contract-testing-builder/SKILL.md +492 -0
- package/templates/testing/coverage-strategist/SKILL.md +436 -0
- package/templates/testing/e2e-test-builder/SKILL.md +382 -0
- package/templates/testing/flaky-test-detective/SKILL.md +416 -0
- package/templates/testing/integration-test-builder/SKILL.md +525 -0
- package/templates/testing/mocking-assistant/SKILL.md +383 -0
- package/templates/testing/snapshot-test-refactorer/SKILL.md +375 -0
- package/templates/testing/test-data-factory-builder/SKILL.md +449 -0
- package/templates/testing/test-reporting-triage-skill/SKILL.md +469 -0
- package/templates/testing/unit-test-generator/SKILL.md +548 -0
@@ -0,0 +1,279 @@
---
name: scalability-playbook
description: Identifies performance bottlenecks and provides ordered scaling strategies with triggers, phased plans, and cost implications. Use for "scalability planning", "performance bottlenecks", "capacity planning", or "growth strategy".
---

# Scalability Playbook

Systematic approach to identifying and resolving scalability bottlenecks.

## Bottleneck Analysis

### Current System Profile

```
Traffic: 1,000 req/min
Users: 10,000 active
Data: 100GB database
Response time: p95 = 500ms
```

### Identified Bottlenecks

#### 1. Database Queries

**Symptom:** Slow page loads (2-3s)
**Measurement:** Query time p95 = 800ms
**Impact:** HIGH - affects all reads
**Trigger:** When p95 >500ms

#### 2. Single Server

**Symptom:** High CPU (>80%)
**Measurement:** Load average >4
**Impact:** MEDIUM - intermittent slowdowns
**Trigger:** When CPU >70%

#### 3. No Caching

**Symptom:** Repeated DB queries
**Measurement:** Cache hit rate = 0%
**Impact:** MEDIUM - unnecessary load
**Trigger:** When query volume >10k/min

## Scaling Strategies (Ordered)

### Level 1: Quick Wins (Days)

#### 1.1 Add Database Indexes

**Problem:** Slow queries
**Solution:**

```sql
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_user_created ON orders(user_id, created_at);
```

**Expected Impact:** 80% faster queries
**Cost:** $0
**Effort:** 1 day

#### 1.2 Enable Query Caching

**Problem:** Repeated queries
**Solution:** Redis cache layer

```typescript
// Cache-aside read: try Redis first, fall back to the database on a miss.
// `redis` and `db` are the application's existing clients.
async function getUser(userId: string) {
  const cached = await redis.get(`user:${userId}`);
  if (cached) return JSON.parse(cached);

  const user = await db.users.findById(userId);
  await redis.setex(`user:${userId}`, 3600, JSON.stringify(user)); // 1-hour TTL
  return user;
}
```

**Expected Impact:** 60% reduction in DB load
**Cost:** $50/month
**Effort:** 2 days
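
The TTL behaviour and the cache hit rate named under "No Caching" can be explored without a Redis instance. The following is a minimal sketch, assuming an in-memory `Map` as a stand-in for Redis and an injectable clock; `TtlCache` and its members are hypothetical names, not part of any Redis client.

```typescript
// In-memory stand-in for a Redis-style cache with per-key TTL (hypothetical helper).
class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private now: () => number;
  hits = 0;
  misses = 0;

  // `now` is injectable so expiry can be exercised without waiting.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined || entry.expiresAt <= this.now()) {
      this.entries.delete(key); // drop expired entries lazily
      this.misses++;
      return undefined;
    }
    this.hits++;
    return entry.value;
  }

  // Fraction of lookups served from the cache (0 when nothing was looked up).
  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

// Usage: a cold miss, a hit, then a miss once the TTL has elapsed.
let fakeClockMs = 0;
const userCache = new TtlCache<string>(() => fakeClockMs);
userCache.get("user:1"); // miss (cold cache)
userCache.set("user:1", "Ada", 3600); // same 1-hour TTL as the snippet above
userCache.get("user:1"); // hit
fakeClockMs += 3600 * 1000; // advance the fake clock by one hour
userCache.get("user:1"); // miss (expired)
```

Tracking hits and misses next to the cache makes the "Cache hit rate" measurement available from day one, so the expected reduction in DB load can be checked against real traffic rather than assumed.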

### Level 2: Horizontal Scaling (Weeks)

#### 2.1 Add Read Replicas

**Problem:** Read-heavy workload
**Solution:** Route reads to replicas

```
Write Load: Primary DB
Read Load: 3x Read Replicas
```

**Expected Impact:** 3x read capacity
**Cost:** $300/month
**Effort:** 1 week
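
Routing reads to replicas usually comes down to a small piece of selection logic in the data-access layer. The sketch below assumes connections are identified by plain strings and that replicas are chosen round-robin; `ReadWriteRouter` and the connection names are hypothetical.

```typescript
// Routes writes to the primary and spreads reads across replicas round-robin.
// Connection names are plain strings here; a real pool would hold client objects.
class ReadWriteRouter {
  private primary: string;
  private replicas: string[];
  private next = 0;

  constructor(primary: string, replicas: string[]) {
    this.primary = primary;
    this.replicas = replicas;
  }

  forWrite(): string {
    return this.primary;
  }

  forRead(): string {
    // With no replicas configured, reads fall back to the primary.
    if (this.replicas.length === 0) return this.primary;
    const target = this.replicas[this.next];
    this.next = (this.next + 1) % this.replicas.length;
    return target;
  }
}

const dbRouter = new ReadWriteRouter("primary", ["replica-1", "replica-2", "replica-3"]);
const readTargets = [dbRouter.forRead(), dbRouter.forRead(), dbRouter.forRead(), dbRouter.forRead()];
```

Replicas typically lag the primary slightly, so reads that must observe a write that has just been made are often sent to the primary instead; that policy is left out of this sketch.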

#### 2.2 Load Balancer + Multiple Servers

**Problem:** Single point of failure
**Solution:**

```
ALB
├── Server 1
├── Server 2
└── Server 3
```

**Expected Impact:** 3x throughput
**Cost:** $400/month
**Effort:** 1 week

### Level 3: Architecture Changes (Months)

#### 3.1 CDN for Static Assets

**Problem:** Slow asset delivery
**Solution:** CloudFront CDN
**Expected Impact:** 90% faster asset loads
**Cost:** $100/month
**Effort:** 1 week

#### 3.2 Async Processing

**Problem:** Slow sync operations
**Solution:** Background job queues

```typescript
// Before: Sync
await sendEmail(user);
await processPayment(order);
await updateAnalytics(event);
return response; // Waits 5+ seconds

// After: Async
await queue.add("send-email", { userId });
await queue.add("process-payment", { orderId });
await queue.add("update-analytics", { event });
return response; // Returns immediately
```

**Expected Impact:** 80% faster responses
**Cost:** $50/month (SQS)
**Effort:** 2 weeks
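
The split between the request path (enqueue only) and the worker path (do the work later) can be shown with a toy queue. This is a minimal sketch, assuming an in-memory array in place of SQS or another broker; `InMemoryQueue`, `drain`, and the job names are illustrative, not a real queue client API.

```typescript
// Minimal in-memory job queue (a stand-in for SQS or a Redis-backed queue).
type JobPayload = Record<string, unknown>;
type JobHandler = (payload: JobPayload) => void;

class InMemoryQueue {
  private jobs: { name: string; payload: JobPayload }[] = [];

  // Called on the request path: only records the job, performs no work.
  add(name: string, payload: JobPayload): void {
    this.jobs.push({ name, payload });
  }

  get depth(): number {
    return this.jobs.length;
  }

  // Called by a worker: runs each pending job through its handler, oldest first.
  // Returns the number of jobs processed.
  drain(handlers: Record<string, JobHandler>): number {
    let processed = 0;
    let job = this.jobs.shift();
    while (job !== undefined) {
      const handler = handlers[job.name];
      if (handler === undefined) throw new Error(`no handler for ${job.name}`);
      handler(job.payload);
      processed++;
      job = this.jobs.shift();
    }
    return processed;
  }
}

const jobQueue = new InMemoryQueue();
const emailedUsers: unknown[] = [];

// Request path: enqueue and return.
jobQueue.add("send-email", { userId: "u1" });
jobQueue.add("update-analytics", { event: "signup" });
const depthAfterRequest = jobQueue.depth; // jobs are waiting, nothing has run yet

// Worker path: process the backlog later.
const processedCount = jobQueue.drain({
  "send-email": (payload) => { emailedUsers.push(payload.userId); },
  "update-analytics": () => {},
});
```

A real broker adds durability, retries, and visibility timeouts; the point here is only that the response no longer waits on the work itself. Steps whose result the caller needs before responding are usually kept synchronous.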

### Level 4: Data Layer Optimization (Months)

#### 4.1 Database Sharding

**Problem:** Single DB too large
**Solution:** Shard by user_id

```
Shard 1: user_id 0-24999
Shard 2: user_id 25000-49999
Shard 3: user_id 50000-74999
Shard 4: user_id 75000-99999
```

**Expected Impact:** 4x capacity
**Cost:** $1,200/month
**Effort:** 2 months
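
The range layout above maps directly onto a lookup function. The sketch below assumes numeric, non-negative `user_id` values and exactly the four ranges listed; `shardFor` is a hypothetical helper name.

```typescript
// Range-based shard lookup for the four-shard layout above.
// Each shard owns a contiguous block of 25,000 user IDs.
const SHARD_SIZE = 25000;
const SHARD_COUNT = 4;

function shardFor(userId: number): number {
  if (!Number.isInteger(userId) || userId < 0 || userId >= SHARD_SIZE * SHARD_COUNT) {
    throw new RangeError(`user_id ${userId} is outside the sharded range`);
  }
  return Math.floor(userId / SHARD_SIZE) + 1; // shards are numbered from 1
}

const firstShard = shardFor(0); // lower edge of shard 1
const boundaryShard = shardFor(25000); // first ID owned by shard 2
const lastShard = shardFor(99999); // upper edge of shard 4
```

Fixed ranges are easy to reason about but need a plan for IDs beyond the last range; hash-based or directory-based schemes are common alternatives when the ID space keeps growing.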

#### 4.2 Event-Driven Architecture

**Problem:** Tight coupling, cascading failures
**Solution:** Message broker (Kafka)

```
Service A → Kafka → Service B
↘ ↗ Service C
```

**Expected Impact:** Better isolation, resilience
**Cost:** $500/month
**Effort:** 3 months

## Scaling Triggers

```markdown
| Metric           | Current | Warning | Critical | Action                  |
| ---------------- | ------- | ------- | -------- | ----------------------- |
| CPU              | 40%     | 70%     | 85%      | Add servers             |
| Memory           | 50%     | 75%     | 90%      | Upgrade instances       |
| DB Connections   | 20      | 40      | 50       | Add read replicas       |
| Query Time (p95) | 200ms   | 500ms   | 1000ms   | Add indexes             |
| Queue Depth      | 100     | 1000    | 5000     | Add workers             |
| Error Rate       | 0.1%    | 1%      | 5%       | Investigate immediately |
```
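
A trigger table like this is straightforward to encode so that a monitoring job can evaluate it. The following is a minimal sketch, assuming a threshold fires when the value is at or above it; the metric keys (`cpu_percent`, `query_p95_ms`, and so on) are hypothetical names chosen for illustration.

```typescript
type Level = "ok" | "warning" | "critical";

interface Trigger {
  metric: string; // hypothetical metric key, not a name from any monitoring tool
  warning: number;
  critical: number;
  action: string;
}

// Thresholds copied from the table above; units follow the table (%, count, ms).
const TRIGGERS: Trigger[] = [
  { metric: "cpu_percent", warning: 70, critical: 85, action: "Add servers" },
  { metric: "memory_percent", warning: 75, critical: 90, action: "Upgrade instances" },
  { metric: "db_connections", warning: 40, critical: 50, action: "Add read replicas" },
  { metric: "query_p95_ms", warning: 500, critical: 1000, action: "Add indexes" },
  { metric: "queue_depth", warning: 1000, critical: 5000, action: "Add workers" },
  { metric: "error_rate_percent", warning: 1, critical: 5, action: "Investigate immediately" },
];

// Assumption: a threshold fires when the value is at or above it.
function classify(trigger: Trigger, value: number): Level {
  if (value >= trigger.critical) return "critical";
  if (value >= trigger.warning) return "warning";
  return "ok";
}

interface Alert {
  metric: string;
  level: Level;
  action: string;
}

// Returns one alert per reported metric that is at warning or critical level.
function evaluate(samples: Record<string, number>): Alert[] {
  const alerts: Alert[] = [];
  for (const trigger of TRIGGERS) {
    const value = samples[trigger.metric];
    if (value === undefined) continue; // metric not reported in this sample
    const level = classify(trigger, value);
    if (level !== "ok") alerts.push({ metric: trigger.metric, level, action: trigger.action });
  }
  return alerts;
}

const firing = evaluate({ cpu_percent: 72, memory_percent: 50, query_p95_ms: 1200 });
```

Keeping the thresholds in data rather than in code means the table and the alerting rules can be reviewed side by side when limits change.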

## Phased Scaling Plan

### Phase 1: Current → 10x (0-3 months)

**Target:** 10,000 req/min, 100K users

**Actions:**

1. Add database indexes (Week 1)
2. Implement Redis caching (Week 2)
3. Add 3x read replicas (Week 4)
4. Horizontal scale app servers (Week 6)
5. CDN for static assets (Week 8)

**Cost:** $500 → $1,000/month

### Phase 2: 10x → 100x (3-12 months)

**Target:** 100,000 req/min, 1M users

**Actions:**

1. Database sharding (Month 4-6)
2. Multi-region deployment (Month 6-8)
3. Microservices extraction (Month 8-12)
4. Event-driven architecture (Month 10-12)

**Cost:** $1,000 → $10,000/month

### Phase 3: 100x → 1000x (12-24 months)

**Target:** 1M req/min, 10M users

**Actions:**

1. Global CDN (Month 13)
2. Advanced caching (L1/L2) (Month 14-15)
3. Custom DB solutions (Month 16-18)
4. Edge computing (Month 18-20)

**Cost:** $10,000 → $100,000/month

## Load Testing Plan

```bash
# Current baseline
hey -n 10000 -c 100 https://api.example.com/users

# Target 10x
hey -n 100000 -c 1000 https://api.example.com/users

# Measure:
# - Requests/sec
# - p50, p95, p99 latency
# - Error rate
# - Resource utilization
```
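
When raw latency samples are exported from a load test, the p50/p95/p99 figures listed above can be recomputed independently of the tool's own summary. This sketch uses the nearest-rank definition of a percentile, which is one of several common definitions; the sample values are made up for illustration.

```typescript
// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  if (p <= 0 || p > 100) throw new RangeError("p must be in (0, 100]");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[rank - 1];
}

// Illustrative latencies in milliseconds (not measured data).
const latenciesMs = [120, 80, 200, 95, 310, 150, 110, 90, 400, 130];
const p50 = percentile(latenciesMs, 50);
const p95 = percentile(latenciesMs, 95);
const p99 = percentile(latenciesMs, 99);
```

With only ten samples, p95 and p99 both land on the slowest request, which is a reminder that tail percentiles need a large sample count to be meaningful.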

## Cost-Benefit Analysis

```markdown
| Strategy      | Cost/Month | Expected Impact    | ROI | Priority |
| ------------- | ---------- | ------------------ | --- | -------- |
| DB Indexes    | $0         | 80% faster queries | ∞   | HIGH     |
| Redis Cache   | $50        | 60% less DB load   | 12x | HIGH     |
| Read Replicas | $300       | 3x capacity        | 10x | MEDIUM   |
| Load Balancer | $400       | 3x throughput      | 7x  | MEDIUM   |
| DB Sharding   | $1,200     | 4x capacity        | 3x  | LOW      |
```

## Best Practices

1. **Measure first**: Don't optimize blindly
2. **Low-hanging fruit**: Start with easy wins
3. **Load test**: Validate before production
4. **Monitor continuously**: Set up alerts
5. **Plan ahead**: Scale before hitting limits
6. **Cost-conscious**: ROI-driven decisions
7. **Incremental**: Small, safe changes

## Output Checklist

- [ ] Current system profile
- [ ] Bottlenecks identified and measured
- [ ] Scaling strategies ordered by effort
- [ ] Triggers defined for each action
- [ ] Phased plan (1x → 10x → 100x)
- [ ] Cost estimates per phase
- [ ] Load testing plan
- [ ] Monitoring dashboard
- [ ] Rollback procedures
@@ -0,0 +1,339 @@
---
name: system-design-generator
description: Produces comprehensive system architecture plans for features and products including component breakdown, data flow diagrams, system boundaries, API contracts, and scaling considerations. Use for "system design", "architecture planning", "feature design", or "technical specs".
---

# System Design Generator

Create comprehensive system architecture plans from requirements.

## System Design Document Template

````markdown
# System Design: [Feature/Product Name]

## Overview

Brief description of what we're building and why.

## Requirements

### Functional

- User can upload videos (max 1GB)
- System processes video within 5 minutes
- User receives notification when complete

### Non-Functional

- Handle 1000 uploads/day
- 99.9% uptime
- Process videos in <5 minutes (p95)
- Cost: <$0.50 per video

## High-Level Architecture

```
┌─────────┐      ┌──────────┐      ┌─────────────┐
│ Client  │─────▶│   API    │─────▶│   Upload    │
│         │      │ Gateway  │      │   Service   │
└─────────┘      └──────────┘      └─────────────┘
                                          │
                                          ▼
                                   ┌─────────────┐
                                   │   Storage   │
                                   │    (S3)     │
                                   └─────────────┘
                                          │
                                          ▼
                                   ┌─────────────┐
                                   │ Processing  │◀─┐
                                   │   Queue     │  │
                                   └─────────────┘  │
                                          │         │
                                          ▼         │
                                   ┌─────────────┐  │
                                   │  Processor  │──┘
                                   │   Workers   │
                                   └─────────────┘
                                          │
                                          ▼
                                   ┌─────────────┐
                                   │Notification │
                                   │   Service   │
                                   └─────────────┘
```

## Components

### 1. API Gateway
**Responsibilities:**
- Authentication
- Rate limiting
- Request routing

**Technology:** Kong/AWS API Gateway
**Scaling:** Auto-scale based on requests/sec

### 2. Upload Service
**Responsibilities:**
- Generate pre-signed S3 URLs
- Validate file metadata
- Enqueue processing jobs

**API:**
```
POST /uploads
Request: { filename, size, content_type }
Response: { upload_url, upload_id }
```

**Technology:** Node.js + Express
**Scaling:** Horizontal (stateless)

### 3. Storage (S3)
**Responsibilities:**
- Store raw videos
- Store processed outputs
- Serve content via CDN

**Structure:**
```
/uploads/{user_id}/{upload_id}/original.mp4
/processed/{user_id}/{upload_id}/output.mp4
```

### 4. Processing Queue
**Responsibilities:**
- Buffer processing jobs
- Ensure at-least-once delivery
- DLQ for failed jobs

**Technology:** AWS SQS
**Configuration:**
- Visibility timeout: 15 minutes
- DLQ after 3 retries

### 5. Processor Workers
**Responsibilities:**
- Transcode videos
- Generate thumbnails
- Update database

**Technology:** Python + FFmpeg
**Scaling:** Auto-scale on queue depth

## Data Flow

### Upload Flow
1. Client requests upload URL from Upload Service
2. Upload Service generates pre-signed S3 URL
3. Client uploads directly to S3
4. Client notifies Upload Service of completion
5. Upload Service enqueues processing job
6. Returns upload_id to client

### Processing Flow
1. Worker polls queue for jobs
2. Downloads video from S3
3. Processes video (transcode, thumbnail)
4. Uploads results to S3
5. Updates database status
6. Sends notification
7. Deletes message from queue

## Data Model

```typescript
interface Upload {
  id: string;
  user_id: string;
  filename: string;
  size: number;
  status: 'pending' | 'processing' | 'complete' | 'failed';
  original_url: string;
  processed_url?: string;
  created_at: Date;
  processed_at?: Date;
}

interface ProcessingJob {
  upload_id: string;
  attempts: number;
  error?: string;
}
```

## API Contract

### Upload Endpoints

```
POST   /uploads      - Request upload URL
GET    /uploads/:id  - Get upload status
DELETE /uploads/:id  - Cancel upload
GET    /uploads      - List user uploads
```

### Webhooks

```
POST {webhook_url}
{
  "event": "upload.completed",
  "upload_id": "...",
  "status": "complete",
  "processed_url": "..."
}
```

## Scaling Considerations

### Current Capacity

- 1000 uploads/day = ~1 per minute
- Single worker can process 1 video every 5 minutes
- Need 5 workers for current load

### 10x Scale (10,000/day)

- ~10 uploads per minute
- Need 50 workers
- Use spot instances for cost savings
- Add Redis cache for status checks

### 100x Scale (100,000/day)

- ~100 uploads per minute
- Partition by region
- Use Kafka instead of SQS
- Database sharding by user_id

## Failure Modes

### S3 Unavailable

- Impact: Uploads fail
- Mitigation: Multi-region S3 replication

### Queue Backed Up

- Impact: Processing delays
- Mitigation: Auto-scale workers faster

### Worker Crash During Processing

- Impact: Job retried
- Mitigation: Idempotent processing

## Cost Estimate

**Monthly (1000 uploads/day):**

- S3 Storage: $50
- S3 Transfer: $100
- SQS: $10
- Workers (EC2): $300
- Database: $100
**Total: ~$560/month**

## Security

- Pre-signed URLs expire in 1 hour
- Videos in private S3 buckets
- CloudFront signed URLs for delivery
- Rate limiting per user

## Monitoring

**Metrics:**

- Upload success rate
- Processing time (p50, p95, p99)
- Queue depth
- Worker CPU/memory
- Error rate by type

**Alerts:**

- Queue depth >1000
- Processing time p95 >10 minutes
- Error rate >5%

## Open Questions

- [ ] Video retention policy? (30 days? 1 year?)
- [ ] Maximum video duration? (affects processing time)
- [ ] Regional data residency requirements?

````
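
The worker counts in the template's "Scaling Considerations" section follow from multiplying the arrival rate by the per-video processing time. The sketch below makes that arithmetic explicit, assuming uploads are spread evenly over the day; `workersNeeded` and the `headroom` factor are hypothetical, and the template's own figures round the arrival rate up to whole uploads per minute, which yields slightly larger numbers.

```typescript
const MINUTES_PER_DAY = 24 * 60;

// Workers needed = arrival rate (jobs/min) x service time (min/job),
// scaled by a headroom factor for bursts, then rounded up.
// Assumes uploads arrive evenly across the day.
function workersNeeded(uploadsPerDay: number, minutesPerVideo: number, headroom = 1): number {
  const arrivalsPerMinute = uploadsPerDay / MINUTES_PER_DAY;
  return Math.ceil(arrivalsPerMinute * minutesPerVideo * headroom);
}

const baseline = workersNeeded(1000, 5); // no headroom
const withHeadroom = workersNeeded(1000, 5, 1.4); // 40% burst allowance
const tenX = workersNeeded(10000, 5); // 10x scale, no headroom
```

Stating the headroom factor separately keeps the steady-state requirement and the burst allowance visible as two distinct decisions in the design document.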

## Component Template

````markdown
### Component Name

**Responsibilities:**
- Primary responsibility
- Secondary responsibility

**Technology Stack:**
- Language: [Python/Node/Go]
- Framework: [Express/FastAPI/Gin]
- Database: [PostgreSQL/MongoDB]

**API/Interface:**
```typescript
interface ComponentAPI {
  method(params): ReturnType;
}
```

**Scaling Strategy:**

- Horizontal: Stateless, load balanced
- Vertical: Cache layer, connection pooling

**Dependencies:**

- Service A (for X)
- Database B (for persistence)

**Failure Handling:**

- Retry with exponential backoff
- Circuit breaker for downstream services
- Fallback to cached data

````
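
The "Retry with exponential backoff" entry under Failure Handling can be sketched in a few lines. This is a minimal illustration, assuming a base delay of 100 ms, a cap of 10 seconds, and no jitter; `backoffDelayMs` and `retryWithBackoff` are hypothetical helpers, and the defaults are placeholders rather than recommended values.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function backoffDelayMs(attempt: number, baseMs = 100, capMs = 10000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Runs `operation`, retrying on any thrown error up to `maxAttempts` total tries.
// `sleep` is injectable so callers can control how waiting is performed.
async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      // Wait before the next try, but not after the final failure.
      if (attempt < maxAttempts - 1) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}

// Delays for the first few retries with the default settings.
const delaySchedule = [0, 1, 2, 3].map((attempt) => backoffDelayMs(attempt));
```

Production implementations commonly add random jitter to each delay so that many clients recovering at once do not retry in lockstep, and they usually retry only errors known to be transient.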

## Best Practices

1. **Start with requirements**: Functional + non-functional
2. **Draw diagrams first**: Visual clarity
3. **Define boundaries**: What's in scope vs out
4. **Document tradeoffs**: Every choice has costs
5. **Plan for failure**: What breaks and how to handle
6. **Consider scale**: Current, 10x, 100x
7. **Estimate costs**: Build vs buy decisions
8. **Leave open questions**: Don't pretend to know everything

## Output Checklist

- [ ] Requirements documented (functional + non-functional)
- [ ] High-level architecture diagram
- [ ] Component breakdown (3-7 components)
- [ ] Data flow documented
- [ ] Data model defined
- [ ] API contracts specified
- [ ] Scaling considerations (1x, 10x, 100x)
- [ ] Failure modes identified
- [ ] Cost estimate provided
- [ ] Security considerations
- [ ] Monitoring plan