opencode-skills-collection 3.0.45 → 3.0.47

Files changed (71)
  1. package/bundled-skills/.antigravity-install-manifest.json +10 -1
  2. package/bundled-skills/2slides-ppt-generator/SKILL.md +1 -1
  3. package/bundled-skills/2slides-ppt-generator/scripts/create_pdf_slides.py +2 -1
  4. package/bundled-skills/2slides-ppt-generator/scripts/generate_narration.py +2 -1
  5. package/bundled-skills/2slides-ppt-generator/scripts/generate_slides.py +13 -7
  6. package/bundled-skills/android-dev/references/hybrid.md +7 -4
  7. package/bundled-skills/android-dev/references/react-native.md +5 -2
  8. package/bundled-skills/atlas-contract/SKILL.md +4 -4
  9. package/bundled-skills/atlas-ledger/SKILL.md +10 -7
  10. package/bundled-skills/bun-development/SKILL.md +1 -1
  11. package/bundled-skills/cloud-penetration-testing/SKILL.md +1 -1
  12. package/bundled-skills/codebase-to-wordpress-converter/SKILL.md +1 -0
  13. package/bundled-skills/docs/integrations/jetski-cortex.md +3 -3
  14. package/bundled-skills/docs/integrations/jetski-gemini-loader/README.md +1 -1
  15. package/bundled-skills/docs/maintainers/repo-growth-seo.md +3 -3
  16. package/bundled-skills/docs/maintainers/skills-update-guide.md +1 -1
  17. package/bundled-skills/docs/users/bundles.md +1 -1
  18. package/bundled-skills/docs/users/claude-code-skills.md +1 -1
  19. package/bundled-skills/docs/users/gemini-cli-skills.md +1 -1
  20. package/bundled-skills/docs/users/getting-started.md +1 -1
  21. package/bundled-skills/docs/users/kiro-integration.md +1 -1
  22. package/bundled-skills/docs/users/usage.md +4 -4
  23. package/bundled-skills/docs/users/visual-guide.md +4 -4
  24. package/bundled-skills/dos-verify-done-claims/SKILL.md +173 -0
  25. package/bundled-skills/ecl-harness-engineer/LICENSE +21 -0
  26. package/bundled-skills/ecl-harness-engineer/SKILL.md +714 -0
  27. package/bundled-skills/ecl-harness-engineer/agents/analyzer.md +119 -0
  28. package/bundled-skills/ecl-harness-engineer/agents/auditor.md +212 -0
  29. package/bundled-skills/ecl-harness-engineer/agents/creator-config.md +343 -0
  30. package/bundled-skills/ecl-harness-engineer/agents/creator-docs.md +201 -0
  31. package/bundled-skills/ecl-harness-engineer/agents/creator-linters.md +123 -0
  32. package/bundled-skills/ecl-harness-engineer/references/adapters/adapter-schema.md +204 -0
  33. package/bundled-skills/ecl-harness-engineer/references/adapters/generic.md +156 -0
  34. package/bundled-skills/ecl-harness-engineer/references/adapters/go.md +212 -0
  35. package/bundled-skills/ecl-harness-engineer/references/adapters/java.md +205 -0
  36. package/bundled-skills/ecl-harness-engineer/references/adapters/python.md +225 -0
  37. package/bundled-skills/ecl-harness-engineer/references/adapters/rust.md +220 -0
  38. package/bundled-skills/ecl-harness-engineer/references/adapters/typescript.md +245 -0
  39. package/bundled-skills/ecl-harness-engineer/references/architecture-diagrams.md +420 -0
  40. package/bundled-skills/ecl-harness-engineer/references/audit-templates.md +649 -0
  41. package/bundled-skills/ecl-harness-engineer/references/capability-registry.md +485 -0
  42. package/bundled-skills/ecl-harness-engineer/references/darwin-eval-prompts.md +373 -0
  43. package/bundled-skills/ecl-harness-engineer/references/documentation-templates.md +741 -0
  44. package/bundled-skills/ecl-harness-engineer/references/durability-patterns.md +423 -0
  45. package/bundled-skills/ecl-harness-engineer/references/ecl-harness.md +1431 -0
  46. package/bundled-skills/ecl-harness-engineer/references/environment-config-guide.md +534 -0
  47. package/bundled-skills/ecl-harness-engineer/references/environment-detection-guide.md +751 -0
  48. package/bundled-skills/ecl-harness-engineer/references/eval-templates.md +377 -0
  49. package/bundled-skills/ecl-harness-engineer/references/gc-templates.md +798 -0
  50. package/bundled-skills/ecl-harness-engineer/references/greenfield-templates.md +1385 -0
  51. package/bundled-skills/ecl-harness-engineer/references/linter-templates.md +448 -0
  52. package/bundled-skills/ecl-harness-engineer/references/observability-templates.md +315 -0
  53. package/bundled-skills/environment-setup-guide/SKILL.md +2 -2
  54. package/bundled-skills/evolution/SKILL.md +1 -1
  55. package/bundled-skills/gitops-workflow/SKILL.md +1 -1
  56. package/bundled-skills/linkerd-patterns/SKILL.md +1 -1
  57. package/bundled-skills/loki-mode/examples/todo-app-generated/frontend/package-lock.json +504 -1317
  58. package/bundled-skills/loki-mode/examples/todo-app-generated/frontend/package.json +2 -2
  59. package/bundled-skills/lovable-cleanup/SKILL.md +416 -0
  60. package/bundled-skills/monopoly/SKILL.md +397 -0
  61. package/bundled-skills/monopoly/patterns/SKILL.md +331 -0
  62. package/bundled-skills/monopoly/scale-benchmarks/SKILL.md +174 -0
  63. package/bundled-skills/monopoly/security-checklist/SKILL.md +69 -0
  64. package/bundled-skills/monopoly/tech-matrix/SKILL.md +268 -0
  65. package/bundled-skills/pagespeed-enhancer/SKILL.md +579 -0
  66. package/bundled-skills/polis-protocol/SKILL.md +6 -3
  67. package/bundled-skills/unship/SKILL.md +11 -5
  68. package/bundled-skills/uv-package-manager/resources/implementation-playbook.md +1 -1
  69. package/bundled-skills/varlock/SKILL.md +2 -2
  70. package/package.json +1 -1
  71. package/skills_index.json +204 -4
@@ -0,0 +1,331 @@
1
+ ---
2
+ name: patterns
3
+ description: Reference document for monopoly patterns.
4
+ risk: safe
5
+ reports-to: monopoly
6
+ ---
7
+
8
+ # MONOPOLY — Design Patterns Deep Dive
9
+
10
+ ## Table of Contents
11
+ 1. CQRS
12
+ 2. Event Sourcing
13
+ 3. Saga Pattern
14
+ 4. Circuit Breaker
15
+ 5. Bulkhead
16
+ 6. Strangler Fig
17
+ 7. Outbox Pattern
18
+ 8. Consistent Hashing
19
+ 9. Backpressure
20
+ 10. Leader Election
21
+ 11. Two-Phase Commit
22
+ 12. Read-Through / Write-Through / Write-Behind Cache
23
+
24
+ ---
25
+
26
+ ## 1. CQRS (Command Query Responsibility Segregation)
27
+
28
+ **What it is:** Separate the read model (Query) from the write model (Command) into distinct services, databases, or code paths.
29
+
30
+ **When to use:**
31
+ - Read load is 10×+ write load (most web apps)
32
+ - Read queries are complex aggregations over write data
33
+ - Need to optimize read and write paths independently
34
+ - Domain model is complex (DDD contexts)
35
+
36
+ **Implementation:**
37
+ ```
38
+ Write Path: Client → Command API → Write DB (normalized, PostgreSQL)
39
+ Read Path: Client → Query API → Read DB (denormalized, Redis / Elasticsearch)
40
+ Sync: Write DB → CDC (Debezium) → Message Queue → Read DB updater
41
+ ```
42
+
43
+ **Trade-offs:**
44
+ - ✅ Independent scaling of read and write
45
+ - ✅ Optimized schemas for each operation type
46
+ - ❌ Eventual consistency between write and read models
47
+ - ❌ Increased complexity; two models to maintain
48
+
49
+ **Real-world users:** Amazon (order service), LinkedIn (feed)
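
The split between the two paths can be sketched in a few lines of Python. This is a minimal in-memory illustration, not a production design: the class names, the `changes` list standing in for the CDC stream, and the `sync` method are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class OrderWriteModel:
    """Command side: normalized rows keyed by order id."""
    orders: dict = field(default_factory=dict)
    changes: list = field(default_factory=list)  # stand-in for a CDC stream

    def place_order(self, order_id, customer, amount):
        self.orders[order_id] = {"customer": customer, "amount": amount}
        self.changes.append(("order_placed", order_id, customer, amount))


@dataclass
class CustomerTotalsReadModel:
    """Query side: denormalized totals per customer."""
    totals: dict = field(default_factory=dict)
    applied: int = 0  # number of change records already consumed

    def sync(self, changes):
        # Apply only the change records that have not been seen yet.
        for _, _, customer, amount in changes[self.applied:]:
            self.totals[customer] = self.totals.get(customer, 0) + amount
        self.applied = len(changes)

    def total_for(self, customer):
        return self.totals.get(customer, 0)


write_side = OrderWriteModel()
read_side = CustomerTotalsReadModel()
write_side.place_order("o1", "alice", 30)
write_side.place_order("o2", "alice", 12)
stale_total = read_side.total_for("alice")   # read model has not synced yet
read_side.sync(write_side.changes)
fresh_total = read_side.total_for("alice")
```

The gap between `stale_total` and `fresh_total` is the eventual consistency listed under trade-offs: reads lag writes until the updater runs.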
50
+
51
+ ---
52
+
53
+ ## 2. Event Sourcing
54
+
55
+ **What it is:** Store state as a sequence of immutable events rather than current state. Rebuild current state by replaying events.
56
+
57
+ **When to use:**
58
+ - Full audit trail is a regulatory requirement (fintech, healthcare)
59
+ - Need to replay history for debugging or analytics
60
+ - Complex domain with many state transitions
61
+ - Need to derive multiple read projections from same data
62
+
63
+ **Implementation:**
64
+ ```
65
+ Event Store: append-only log (Kafka, EventStoreDB)
66
+ Snapshots: periodic snapshots to speed up state rebuild
67
+ Projections: consumers build read models from events
68
+ ```
69
+
70
+ **Trade-offs:**
71
+ - ✅ Complete audit history; perfect for compliance
72
+ - ✅ Replay and time-travel debugging
73
+ - ❌ Querying current state requires projection maintenance
74
+ - ❌ Event schema evolution is hard
75
+ - ❌ High storage overhead over time
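
As a rough illustration of replay and snapshots, the sketch below models a bank balance. The `Event` type, the event kinds, and the `(version, balance)` snapshot shape are assumptions chosen for the example, not part of any event-store API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str      # "deposited" or "withdrew"
    amount: int


def apply_event(balance, event):
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrew":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, snapshot=(0, 0)):
    """Rebuild state from a (version, balance) snapshot plus later events."""
    version, balance = snapshot
    for event in events[version:]:
        balance = apply_event(balance, event)
    return len(events), balance


log = [Event("deposited", 100), Event("withdrew", 30), Event("deposited", 5)]
full = replay(log)                           # replay from the beginning
snap = replay(log[:2])                       # snapshot taken after two events
from_snapshot = replay(log, snapshot=snap)   # only the third event is applied
```

Replaying from the snapshot gives the same state as a full replay while touching fewer events, which is the point of periodic snapshots.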
76
+
77
+ ---
78
+
79
+ ## 3. Saga Pattern
80
+
81
+ **What it is:** Manage distributed transactions across microservices via a sequence of local transactions, each publishing an event. If a step fails, compensating transactions undo previous steps.
82
+
83
+ **Two variants:**
84
+ - **Choreography:** Services react to events autonomously (decentralized)
85
+ - **Orchestration:** A central Saga Orchestrator coordinates steps (centralized)
86
+
87
+ **When to use:**
88
+ - Multi-service workflows where ACID across services is impossible
89
+ - Long-running business transactions (order → payment → inventory → shipping)
90
+ - Need rollback across service boundaries
91
+
92
+ **Choreography Example:**
93
+ ```
94
+ OrderService creates order →
95
+ [event: OrderCreated] →
96
+ PaymentService charges card →
97
+ [event: PaymentProcessed] →
98
+ InventoryService reserves stock →
99
+ [event: StockReserved] →
100
+ ShippingService books courier
101
+ ```
102
+
103
+ **Compensating Transactions (on failure):**
104
+ ```
105
+ ShippingService fails →
106
+ [event: ShippingFailed] →
107
+ InventoryService releases stock →
108
+ PaymentService refunds card →
109
+ OrderService marks order failed
110
+ ```
111
+
112
+ **Trade-offs:**
113
+ - ✅ No distributed locking; high availability
114
+ - ✅ Scales well across services
115
+ - ❌ Hard to debug; distributed trace required
116
+ - ❌ Compensating transactions are complex to implement correctly
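
The orchestration variant can be sketched as a loop that records completed steps and runs their compensations in reverse order on failure. This is a single-process illustration; the `run_saga` helper and the step functions are hypothetical, and a real orchestrator would also persist saga state and retry compensations.

```python
def run_saga(steps):
    """Run (name, action, compensation) steps; undo completed ones on failure."""
    completed = []
    log = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"{name}: ok")
            completed.append((name, compensate))
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            # Compensate in reverse order of completion.
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"{done_name}: compensated")
            return False, log
    return True, log


state = {"order": None, "charged": False, "reserved": False}

def create_order(): state["order"] = "pending"
def fail_order(): state["order"] = "failed"
def charge_card(): state["charged"] = True
def refund_card(): state["charged"] = False
def reserve_stock(): state["reserved"] = True
def release_stock(): state["reserved"] = False
def book_courier(): raise RuntimeError("no courier available")


ok, log = run_saga([
    ("order", create_order, fail_order),
    ("payment", charge_card, refund_card),
    ("inventory", reserve_stock, release_stock),
    ("shipping", book_courier, lambda: None),
])
```

After the shipping step fails, stock is released, the card is refunded, and the order is marked failed, mirroring the compensation flow listed earlier.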
117
+
118
+ ---
119
+
120
+ ## 4. Circuit Breaker
121
+
122
+ **What it is:** A proxy that monitors calls to a service. If the failure rate exceeds a threshold, the circuit "opens" and calls fail fast instead of waiting for a timeout.
123
+
124
+ **States:**
125
+ ```
126
+ CLOSED → calls pass through; monitor failure rate
127
+ OPEN → calls fail immediately; no calls to downstream
128
+ HALF-OPEN → let a probe call through; if it succeeds, close; if it fails, reopen
129
+ ```
130
+
131
+ **When to use:**
132
+ - Calling any external service (payment gateway, SMS, email)
133
+ - Microservices calling each other
134
+ - Preventing timeout cascade when downstream is slow
135
+
136
+ **Implementation tools:** Hystrix (deprecated), Resilience4j, Polly (.NET), Envoy proxy
137
+
138
+ **Thresholds (starting point):**
139
+ - Open after 50% failure rate over 10 requests
140
+ - Stay open for 30 seconds
141
+ - Half-open: allow 1 probe request
142
+
143
+ **Trade-offs:**
144
+ - ✅ Prevents cascade failures
145
+ - ✅ Gives downstream time to recover
146
+ - ❌ Adds latency overhead for monitoring
147
+ - ❌ Requires fallback behavior when circuit is open
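
The three states and the starting thresholds can be sketched as a small class. This is an illustrative sketch only, not how Resilience4j or Polly implement it: the class and parameter names are assumptions, it is not thread-safe, and it treats "50% over 10 requests" as "at least half of the last 10 calls failed". The clock is injectable so the demo does not need to sleep.

```python
import time


class CircuitOpenError(Exception):
    pass


class CircuitBreaker:
    def __init__(self, failure_rate=0.5, window=10, open_seconds=30.0,
                 clock=time.monotonic):
        self.failure_rate = failure_rate
        self.window = window
        self.open_seconds = open_seconds
        self.clock = clock
        self.state = "CLOSED"
        self.results = []        # recent outcomes, True = success
        self.opened_at = 0.0

    def call(self, func):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.open_seconds:
                raise CircuitOpenError("failing fast")
            self.state = "HALF_OPEN"   # let one probe call through
        try:
            value = func()
        except Exception:
            self._record(False)
            raise
        self._record(True)
        return value

    def _record(self, success):
        if self.state == "HALF_OPEN":
            if success:
                self.state, self.results = "CLOSED", []
            else:
                self._open()
            return
        self.results = (self.results + [success])[-self.window:]
        failures = self.results.count(False)
        if len(self.results) >= self.window and failures / self.window >= self.failure_rate:
            self._open()

    def _open(self):
        self.state, self.opened_at, self.results = "OPEN", self.clock(), []


now = [0.0]                                   # fake clock for the demo
breaker = CircuitBreaker(clock=lambda: now[0])

def failing_call():
    raise RuntimeError("downstream is down")

for _ in range(10):
    try:
        breaker.call(failing_call)
    except RuntimeError:
        pass
state_after_failures = breaker.state

now[0] = 31.0                                 # the 30-second open period has passed
probe_result = breaker.call(lambda: "ok")
state_after_probe = breaker.state
```

Callers still need a fallback for `CircuitOpenError`, which is the last trade-off listed above.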
148
+
149
+ ---
150
+
151
+ ## 5. Bulkhead
152
+
153
+ **What it is:** Isolate components so a failure in one doesn't consume resources of others. Named after the watertight compartments in ship hulls.
154
+
155
+ **Types:**
156
+ - **Thread Pool Bulkhead:** Separate thread pools per service call
157
+ - **Semaphore Bulkhead:** Limit concurrent calls per service
158
+ - **Process Bulkhead:** Separate processes/containers per service type
159
+
160
+ **When to use:**
161
+ - Multiple tenants sharing infrastructure (SaaS)
162
+ - One slow service consuming all connection pool slots
163
+ - Protecting critical services from being starved by non-critical ones
164
+
165
+ **Example:**
166
+ ```
167
+ Without bulkhead:
168
+ [Recommendation Service hangs] → fills shared thread pool → [Payment Service starves]
169
+
170
+ With bulkhead:
171
+ [Recommendation Service hangs] → fills its own thread pool (10 threads) → [Payment Service unaffected, has its own 50 threads]
172
+ ```
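
A semaphore bulkhead is the simplest of the three types to sketch. The example below is a minimal illustration with hypothetical names (`SemaphoreBulkhead`, `BulkheadFullError`) and small limits chosen so the effect is visible in a single thread.

```python
import threading


class BulkheadFullError(Exception):
    pass


class SemaphoreBulkhead:
    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, func, *args):
        # Reject immediately instead of queueing, so callers fail fast.
        if not self._slots.acquire(blocking=False):
            raise BulkheadFullError("bulkhead is full")
        try:
            return func(*args)
        finally:
            self._slots.release()


recommendations = SemaphoreBulkhead(max_concurrent=1)
payments = SemaphoreBulkhead(max_concurrent=5)


def slow_recommendation():
    # While this call holds the only recommendation slot, a second
    # recommendation call is rejected, but payments still run.
    try:
        recommendations.run(lambda: None)
        nested = "accepted"
    except BulkheadFullError:
        nested = "rejected"
    return nested, payments.run(lambda: "payment ok")


result = recommendations.run(slow_recommendation)
```

Because each dependency has its own semaphore, exhausting the recommendation slots cannot starve payment calls.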
173
+
174
+ ---
175
+
176
+ ## 6. Strangler Fig Pattern
177
+
178
+ **What it is:** Incrementally replace a legacy monolith by routing new functionality to new microservices, while keeping the monolith alive for unchanged features.
179
+
180
+ **Migration steps:**
181
+ ```
182
+ Phase 1: Deploy proxy in front of monolith (no user impact)
183
+ Phase 2: Route one feature to new microservice
184
+ Phase 3: Verify; deprecate that feature in monolith
185
+ Phase 4: Repeat for each feature
186
+ Phase 5: Monolith is empty; decommission
187
+ ```
188
+
189
+ **When to use:**
190
+ - Migrating legacy monolith to microservices
191
+ - Can't do a big-bang rewrite (too risky)
192
+ - Need to ship new features during migration
193
+
194
+ **Trade-offs:**
195
+ - ✅ Zero downtime migration
196
+ - ✅ Incremental risk
197
+ - ❌ Dual maintenance burden during migration (monolith + new services)
198
+ - ❌ Proxy adds latency; must be managed carefully
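
The routing decision made by the proxy in Phases 2 to 4 can be sketched as a prefix table that grows as features migrate. The paths and service names below are hypothetical; a real deployment would express the same rule in a gateway or load balancer configuration.

```python
# Hypothetical routing table: path prefixes already migrated off the monolith.
MIGRATED_PREFIXES = {
    "/invoices": "billing-service",
    "/profiles": "profile-service",
}


def route(path, migrated=MIGRATED_PREFIXES, legacy="monolith"):
    """Return the backend that should serve the path."""
    for prefix, service in migrated.items():
        # Match the prefix itself or anything below it, not look-alike paths.
        if path == prefix or path.startswith(prefix + "/"):
            return service
    return legacy


invoice_backend = route("/invoices/42")
orders_backend = route("/orders/7")
lookalike_backend = route("/invoices-archive")
```

Each migration step adds one entry to the table; when nothing falls through to `legacy` any more, the monolith can be decommissioned.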
199
+
200
+ ---
201
+
202
+ ## 7. Outbox Pattern
203
+
204
+ **What it is:** Solve the dual-write problem (write to DB AND publish to queue atomically) by writing the event to an "outbox" table in the same DB transaction, then having a separate process relay it to the queue.
205
+
206
+ **Problem it solves:**
207
+ ```
208
+ ❌ WRONG (dual-write race):
209
+ BEGIN;
210
+ UPDATE orders SET status='paid' WHERE id = :order_id;
211
+ COMMIT;
212
+ // Crash here → event never published, DB and queue are inconsistent
213
+ publish(PaymentProcessed);
214
+ ```
215
+
216
+ ```
217
+ ✅ CORRECT (outbox):
218
+ BEGIN;
219
+ UPDATE orders SET status='paid' WHERE id = :order_id;
220
+ INSERT INTO outbox (event_type, payload) VALUES ('PaymentProcessed', {...});
221
+ COMMIT;
222
+ // Relay process reads outbox and publishes to Kafka
223
+ // At-least-once delivery guaranteed; make consumers idempotent
224
+ ```
225
+
226
+ **Relay options:** Debezium (CDC), polling relay, transaction log tailing
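
A polling relay, the simplest of these options, can be sketched with SQLite from the Python standard library. The table layout, the `published` flag, and the `publish` callback are assumptions for illustration; a real relay would publish to a broker such as Kafka.

```python
import json
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT);
    CREATE TABLE outbox (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        event_type TEXT, payload TEXT, published INTEGER DEFAULT 0
    );
    INSERT INTO orders (id, status) VALUES (1, 'pending');
""")


def mark_paid(order_id):
    # One transaction: the state change and the event commit together or not at all.
    with db:
        db.execute("UPDATE orders SET status = 'paid' WHERE id = ?", (order_id,))
        db.execute(
            "INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
            ("PaymentProcessed", json.dumps({"order_id": order_id})),
        )


def relay_once(publish):
    """Polling relay: publish unpublished rows in order, then mark them sent."""
    rows = db.execute(
        "SELECT id, event_type, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, json.loads(payload))   # may repeat after a crash
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)


sent = []
mark_paid(1)
first = relay_once(lambda kind, body: sent.append((kind, body)))
second = relay_once(lambda kind, body: sent.append((kind, body)))
```

If the relay crashes between `publish` and the flag update, the row is published again on the next pass, which is why consumers must be idempotent.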
227
+
228
+ ---
229
+
230
+ ## 8. Consistent Hashing
231
+
232
+ **What it is:** A hashing scheme where adding or removing a node remaps only about K/N keys on average (K = number of keys, N = number of nodes), instead of remapping nearly all keys as `hash(key) mod N` does.
233
+
234
+ **When to use:**
235
+ - Distributing cache keys across Redis cluster nodes
236
+ - Routing requests to servers in a distributed system
237
+ - Partitioning data across database nodes
238
+
239
+ **Virtual nodes:** Assign multiple positions per physical node on the hash ring to ensure even distribution even with few nodes.
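
A hash ring with virtual nodes can be sketched as a sorted list of positions. This is a minimal illustration: the `HashRing` class, the choice of 100 virtual nodes per server, and the use of MD5 (for placement only, not security) are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value):
    # MD5 is used only to spread positions around the ring, not for security.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._points = []    # sorted hash positions on the ring
        self._owner = {}     # position -> physical node
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            point = _hash(f"{node}#{i}")
            self._owner[point] = node
            bisect.insort(self._points, point)

    def remove(self, node):
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def lookup(self, key):
        # First position clockwise from the key's hash, wrapping around.
        index = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[index]]


keys = [f"user:{i}" for i in range(10_000)]
ring = HashRing(["cache-a", "cache-b", "cache-c"])
before = {key: ring.lookup(key) for key in keys}
ring.add("cache-d")
after = {key: ring.lookup(key) for key in keys}
moved = [key for key in keys if after[key] != before[key]]
```

Only keys that now belong to `cache-d` change owner; with four nodes that should be roughly a quarter of the keys, while every other key stays where it was.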
240
+
241
+ ---
242
+
243
+ ## 9. Backpressure
244
+
245
+ **What it is:** A mechanism for consumers to signal producers to slow down when they can't keep up, preventing memory exhaustion and cascade failures.
246
+
247
+ **Strategies:**
248
+ - **Drop:** Discard overflow messages (acceptable for metrics, logs)
249
+ - **Buffer:** Queue up to a limit, then block or drop
250
+ - **Block:** Producer waits until consumer catches up (simplest, may cause timeout)
251
+ - **Rate Limit:** Throttle producers at ingestion point
252
+
253
+ **When to use:**
254
+ - Message queue consumers are slower than producers
255
+ - Real-time data pipeline ingestion spikes
256
+ - API rate limiting for upstream clients
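
The Buffer and Drop strategies can be sketched with a bounded queue whose `offer` method tells the producer whether the item was accepted. The class and policy names are hypothetical, and this single-threaded sketch leaves out the Block strategy, which the standard library's `queue.Queue(maxsize=...)` provides.

```python
from collections import deque


class BoundedBuffer:
    """Buffer up to `capacity` items, then apply an overflow policy."""

    def __init__(self, capacity, on_full="reject"):
        if on_full not in ("reject", "drop_oldest"):
            raise ValueError("on_full must be 'reject' or 'drop_oldest'")
        self.capacity = capacity
        self.on_full = on_full
        self.items = deque()
        self.dropped = 0

    def offer(self, item):
        """Return True if accepted; False tells the producer to slow down."""
        if len(self.items) >= self.capacity:
            if self.on_full == "reject":
                return False
            self.items.popleft()      # discard the oldest item to make room
            self.dropped += 1
        self.items.append(item)
        return True

    def poll(self):
        return self.items.popleft() if self.items else None


buffer = BoundedBuffer(capacity=2)
accepted = [buffer.offer(n) for n in (1, 2, 3)]    # third offer is refused

lossy = BoundedBuffer(capacity=2, on_full="drop_oldest")
for n in (1, 2, 3):
    lossy.offer(n)
```

Rejecting pushes the decision back to the producer, while dropping the oldest item keeps the freshest data, which suits metrics and logs.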
257
+
258
+ ---
259
+
260
+ ## 10. Leader Election
261
+
262
+ **What it is:** In a distributed system, elect a single node to perform a privileged task (e.g., writing to the DB, running scheduled jobs, coordinating work).
263
+
264
+ **Algorithms:**
265
+ - **Raft:** Used by etcd, CockroachDB, Consul. Practical and well-understood.
266
+ - **ZooKeeper (ZAB):** Used by HBase and by Kafka deployments that predate KRaft mode. Mature but operationally heavy.
267
+ - **Bully Algorithm:** Simple; the highest-ID live node wins. Assumes reliable failure detection, so it is unsafe under network partitions.
268
+
269
+ **When to use:**
270
+ - Scheduled jobs that should only run once (cron replacement)
271
+ - Primary/replica database failover coordination
272
+ - Distributed lock management
273
+
274
+ **Tools:** etcd, ZooKeeper, Consul, Redis (Redlock — use with caution)
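
Most of these tools expose leadership as a lease: a node holds leadership while it keeps renewing a key with a time-to-live. The sketch below imitates that idea in a single process. It is not Raft or ZAB, the `LeaseStore` class is hypothetical, and it omits the fencing tokens a real system needs to stop a paused ex-leader from acting.

```python
import time


class LeaseStore:
    """In-memory stand-in for a store with an atomic compare-and-set."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.holder = None
        self.expires_at = 0.0

    def try_acquire(self, node_id, ttl):
        now = self.clock()
        # Grant the lease if it is free, expired, or being renewed by its holder.
        if self.holder is None or now >= self.expires_at or self.holder == node_id:
            self.holder, self.expires_at = node_id, now + ttl
            return True
        return False

    def leader(self):
        return self.holder if self.clock() < self.expires_at else None


now = [0.0]                                   # fake clock for the demo
store = LeaseStore(clock=lambda: now[0])
a_wins = store.try_acquire("node-a", ttl=10)
b_loses = store.try_acquire("node-b", ttl=10)
leader_at_start = store.leader()
now[0] = 11.0                                 # node-a stopped renewing; lease expired
b_takes_over = store.try_acquire("node-b", ttl=10)
leader_after_expiry = store.leader()
```

A scheduled job would run only on the node for which `try_acquire` keeps returning `True`.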
275
+
276
+ ---
277
+
278
+ ## 11. Two-Phase Commit (2PC)
279
+
280
+ **What it is:** A distributed algorithm that ensures all participants in a transaction either all commit or all abort.
281
+
282
+ **Phases:**
283
+ ```
284
+ Phase 1 (Prepare): Coordinator asks all participants "can you commit?"
285
+ All say YES → proceed to Phase 2
286
+ Any says NO → abort
287
+
288
+ Phase 2 (Commit): Coordinator tells all participants to commit
289
+ ```
290
+
291
+ **When to use (sparingly):**
292
+ - Strong consistency is an absolute requirement across services
293
+ - Data loss is catastrophic (financial settlements)
294
+
295
+ **Why to avoid:**
296
+ - Coordinator is a SPOF
297
+ - Blocks on participant failure
298
+ - Very low throughput under contention
299
+ - Prefer Saga Pattern in most microservice architectures
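
The two phases can be sketched with an in-process coordinator. This illustration uses hypothetical `Participant` objects and deliberately ignores timeouts, durable logging, and coordinator crashes, which are the hard parts behind the drawbacks listed above.

```python
class Participant:
    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "INIT"

    def prepare(self):
        # Vote YES only if the local work can be made durable.
        self.state = "PREPARED" if self.can_commit else "ABORTED"
        return self.can_commit

    def commit(self):
        self.state = "COMMITTED"

    def abort(self):
        self.state = "ABORTED"


def two_phase_commit(participants):
    # Phase 1: collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    if all(votes):
        # Phase 2: unanimous YES, so everyone commits.
        for p in participants:
            p.commit()
        return "COMMITTED"
    # Any NO vote aborts the transaction everywhere.
    for p in participants:
        p.abort()
    return "ABORTED"


healthy = [Participant("orders"), Participant("payments")]
outcome_ok = two_phase_commit(healthy)

mixed = [Participant("orders"), Participant("payments", can_commit=False)]
outcome_abort = two_phase_commit(mixed)
```

A participant that has voted YES must wait in `PREPARED` until it hears the decision, which is where the blocking behavior comes from.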
300
+
301
+ ---
302
+
303
+ ## 12. Read-Through / Write-Through / Write-Behind Cache
304
+
305
+ **Read-Through:**
306
+ ```
307
+ Client → Cache (miss) → Cache fetches from DB → Returns to client
308
+ ```
309
+ Cache is always populated on miss. Simple for clients. Risk: cold start.
310
+
311
+ **Write-Through:**
312
+ ```
313
+ Client → Cache → Cache writes to DB synchronously → Confirms
314
+ ```
315
+ Strong consistency between cache and DB. Higher write latency. Good for read-heavy workloads that need consistency.
316
+
317
+ **Write-Behind (Write-Back):**
318
+ ```
319
+ Client → Cache → Confirms immediately → Async flush to DB
320
+ ```
321
+ Very low write latency. Risk of data loss if cache fails before flush. Good for high-throughput counters, analytics.
322
+
323
+ **Cache-Aside (Lazy Loading):**
324
+ ```
325
+ Client → Cache (miss) → Client fetches from DB → Client writes to Cache
326
+ ```
327
+ Most common. Application owns cache logic. Risk: thundering herd on cold start.
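
The difference between these strategies is mostly about who writes where, and when. The sketch below uses plain dictionaries for both the cache and the DB; the class names are hypothetical and eviction, TTLs, and concurrency are left out.

```python
class LazyCache:
    """Cache-aside style read: check the cache, load from the DB on a miss."""

    def __init__(self, db):
        self.db, self.cache, self.db_reads = db, {}, 0

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value


class WriteThroughCache(LazyCache):
    """Writes reach the DB and the cache before the call returns."""

    def put(self, key, value):
        self.db[key] = value
        self.cache[key] = value


class WriteBehindCache(LazyCache):
    """Writes land in the cache at once; flush() persists them later."""

    def __init__(self, db):
        super().__init__(db)
        self.dirty = set()

    def put(self, key, value):
        self.cache[key] = value
        self.dirty.add(key)

    def flush(self):
        for key in self.dirty:
            self.db[key] = self.cache[key]
        self.dirty.clear()


reader = LazyCache({"user:1": "Ada"})
reads = [reader.get("user:1"), reader.get("user:1")]   # second read is a cache hit

through = WriteThroughCache({})
through.put("k", 2)

behind = WriteBehindCache({})
behind.put("k", 3)
db_before_flush = dict(behind.db)   # nothing persisted yet
behind.flush()
```

Anything still in `dirty` when the cache process dies is lost, which is the write-behind risk noted above.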
328
+
329
+
330
+ ## Limitations
331
+ - This is a reference document and may not cover all edge cases. Always verify architectures before production.
@@ -0,0 +1,174 @@
1
+ ---
2
+ name: scale-benchmarks
3
+ description: Reference document for monopoly scale-benchmarks.
4
+ risk: safe
5
+ reports-to: monopoly
6
+ ---
7
+
8
+ # MONOPOLY — Scale Benchmarks & Estimation Formulas
9
+
10
+ ## Quick Estimation Formulas
11
+
12
+ ### User → RPS Conversion
13
+ ```
14
+ Requests per second (avg) = DAU × avg_requests_per_user_per_day / 86400
15
+ Requests per second (peak) = avg_RPS × peak_multiplier
16
+
17
+ Peak multipliers by app type:
18
+ Social media: 5–10×
19
+ E-commerce: 3–5× (higher during sales)
20
+ News / media: 10–20× (breaking news spike)
21
+ B2B SaaS: 2–3× (business hours spike)
22
+ Gaming: 5–15× (event-driven)
23
+ ```
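
As a worked example of the conversion, the sketch below plugs in assumed inputs: 1M DAU, 100 requests per user per day, and a 10× peak multiplier. The function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def average_rps(dau, requests_per_user_per_day):
    return dau * requests_per_user_per_day / SECONDS_PER_DAY


def peak_rps(avg_rps, peak_multiplier):
    return avg_rps * peak_multiplier


# Assumed inputs: 1M DAU, 100 requests per user per day, 10x peak multiplier.
avg = average_rps(1_000_000, 100)
peak = peak_rps(avg, 10)
```

With these inputs the average is about 1,157 RPS and the peak is about 11,574 RPS.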
24
+
25
+ ### Storage Estimation
26
+ ```
27
+ Storage per day = requests_per_day × avg_payload_size
28
+ Storage per year = storage_per_day × 365
29
+ With replication = storage_per_year × replication_factor (3× typical)
30
+ With CDN/cache = origin read load × (1 - cache_hit_ratio) (80% hit = 20% origin load; stored bytes are unchanged)
31
+
32
+ Common payload sizes:
33
+ Tweet / short text: 500B
34
+ Social post with text: 2KB
35
+ Profile data: 5KB
36
+ Image (compressed): 200KB–2MB
37
+ Video (per minute): 50MB (720p), 150MB (1080p)
38
+ API JSON response: 1–20KB
39
+ ```
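
A worked example of the storage formulas, under assumed inputs: 10M stored writes per day at 2 KB each (taken as 2,000 bytes) with 3× replication. Only requests that persist data should be counted in `requests_per_day`.

```python
def storage_per_year_bytes(requests_per_day, avg_payload_bytes, replication_factor=3):
    per_day = requests_per_day * avg_payload_bytes
    return per_day * 365 * replication_factor


# Assumed workload: 10M stored writes per day, 2,000 bytes each, 3x replication.
total_bytes = storage_per_year_bytes(10_000_000, 2_000)
total_tb = total_bytes / 1e12   # decimal terabytes
```

That comes to 20 GB per day, 7.3 TB per year before replication, and 21.9 TB per year with three copies.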
40
+
41
+ ### Bandwidth Estimation
42
+ ```
43
+ Inbound bandwidth = avg_request_size × RPS
44
+ Outbound bandwidth = avg_response_size × RPS
45
+
46
+ Convert: 1 Gbps = 125 MB/s
47
+ 10 Gbps = 1.25 GB/s
48
+ ```
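
A worked example of the bandwidth formulas, with assumed traffic of 5,000 RPS, 1 KB requests, and 20 KB responses (decimal units). The helper converts bytes per second to megabits per second so the result can be compared with link speeds.

```python
def bandwidth_mbps(avg_size_bytes, rps):
    """Bytes per request times requests per second, in megabits per second."""
    return avg_size_bytes * rps * 8 / 1_000_000


# Assumed traffic: 5,000 RPS, 1,000-byte requests, 20,000-byte responses.
inbound = bandwidth_mbps(1_000, 5_000)
outbound = bandwidth_mbps(20_000, 5_000)
```

Inbound traffic is 40 Mbps and outbound is 800 Mbps, so responses alone use most of a 1 Gbps link at this load.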
49
+
50
+ ---
51
+
52
+ ## Known Scale Limits of Common Technologies
53
+
54
+ ### Databases
55
+
56
+ | Technology | Single Node Writes | Reads (with replicas) | Recommended Shard/Cluster Trigger |
57
+ |------------|-------------------|----------------------|----------------------------------|
58
+ | PostgreSQL | ~5K–20K writes/s | ~50K–200K reads/s | >5TB data or >20K writes/s |
59
+ | MySQL | ~10K–25K writes/s | ~60K–250K reads/s | >5TB or >25K writes/s |
60
+ | MongoDB | ~20K–50K writes/s | ~50K–100K reads/s | >100GB or >50K writes/s |
61
+ | Cassandra | ~200K–1M writes/s | ~200K–500K reads/s | Almost never needs explicit sharding |
62
+ | DynamoDB | Unlimited (managed) | Unlimited (managed) | Use provisioned capacity mode |
63
+ | Redis | ~500K–1M ops/s | Same | >50GB data or cluster needed |
64
+ | Elasticsearch | ~10K–50K docs/s | ~1K–10K queries/s | >100M documents per index |
65
+
66
+ ### Queues / Streams
67
+
68
+ | Technology | Max Throughput | Max Consumers | Retention |
69
+ |------------|----------------|---------------|-----------|
70
+ | Kafka | 1M+ msgs/s per cluster | Unlimited consumer groups | Configurable (days–forever) |
71
+ | RabbitMQ | ~50K–100K msgs/s | Limited by connections | Until consumed |
72
+ | SQS Standard | Nearly unlimited (AWS-managed) | Unlimited | Up to 14 days (default 4) |
73
+ | SQS FIFO | 3K msgs/s per queue with batching (300/s without) | Per group | Up to 14 days (default 4) |
74
+ | Redis Pub/Sub | ~1M msgs/s | Limited by subscribers | None (fire-and-forget) |
75
+
76
+ ### Caching
77
+
78
+ | Technology | Max Memory (single) | Max Throughput | Latency |
79
+ |------------|--------------------|--------------|----|
80
+ | Redis | ~1TB RAM | ~1M ops/s | <1ms |
81
+ | Memcached | ~64GB RAM | ~1M ops/s | <1ms |
82
+ | In-process (Caffeine/Guava) | JVM heap | Unlimited (local) | <0.1ms |
83
+
84
+ ---
85
+
86
+ ## Capacity Planning by User Scale
87
+
88
+ ### 1K DAU
89
+ ```
90
+ Avg RPS: ~1–5 RPS
91
+ Peak RPS: ~10–50 RPS
92
+ DB size/year: ~10–50GB
93
+ Infra needed: Single server, managed DB (RDS t3.medium), basic CDN
94
+ Monthly cost: $50–200
95
+ ```
96
+
97
+ ### 10K DAU
98
+ ```
99
+ Avg RPS: ~10–50 RPS
100
+ Peak RPS: ~100–500 RPS
101
+ DB size/year: ~100–500GB
102
+ Infra needed: 2–4 app servers, RDS r5.large, Redis t3.medium, CDN
103
+ Monthly cost: $300–800
104
+ ```
105
+
106
+ ### 100K DAU
107
+ ```
108
+ Avg RPS: ~100–500 RPS
109
+ Peak RPS: ~1K–5K RPS
110
+ DB size/year: ~1–5TB
111
+ Infra needed: ASG (5–10 app servers), RDS r5.xlarge + 2 replicas, Redis cluster, CDN, ALB
112
+ Monthly cost: $2K–8K
113
+ ```
114
+
115
+ ### 1M DAU
116
+ ```
117
+ Avg RPS: ~1K–5K RPS
118
+ Peak RPS: ~10K–50K RPS
119
+ DB size/year: ~10–50TB
120
+ Infra needed: ASG (20–50 servers), DB sharding or Aurora, Redis cluster, Kafka, CDN, WAF
121
+ Monthly cost: $20K–80K
122
+ ```
123
+
124
+ ### 10M DAU
125
+ ```
126
+ Avg RPS: ~10K–50K RPS
127
+ Peak RPS: ~100K–500K RPS
128
+ DB size/year: ~100–500TB
129
+ Infra needed: Multi-region, microservices, distributed DB (Cassandra/CockroachDB), full CDN, dedicated SRE
130
+ Monthly cost: $200K–2M+
131
+ ```
132
+
133
+ ---
134
+
135
+ ## Common SLO Targets
136
+
137
+ | Availability | Tier | Monthly Downtime Allowed |
138
+ |--------------|------|--------------------------|
139
+ | 99% | Basic | 7.3 hours/month |
140
+ | 99.9% (three nines) | Standard production | 43.8 minutes/month |
141
+ | 99.95% | Important services | 21.9 minutes/month |
142
+ | 99.99% (four nines) | Critical services | 4.38 minutes/month |
143
+ | 99.999% (five nines) | Telecom / payments | 26 seconds/month |
144
+
145
+ **Achieving four nines requires:** Multi-AZ deployment, automated failover, zero-downtime deploys, chaos engineering, 24/7 on-call.
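
The downtime column follows directly from the availability figure. The sketch below assumes an average month of 730.5 hours (365.25 days / 12); using a 30-day month instead gives slightly smaller budgets.

```python
HOURS_PER_MONTH = 365.25 * 24 / 12   # 730.5 hours in an average month


def allowed_downtime_minutes(availability_percent):
    """Monthly error budget in minutes for a given availability target."""
    return (1 - availability_percent / 100) * HOURS_PER_MONTH * 60


three_nines = allowed_downtime_minutes(99.9)
five_nines_seconds = allowed_downtime_minutes(99.999) * 60
```

Three nines leaves a budget of roughly 43.8 minutes per month; five nines leaves roughly 26 seconds.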
146
+
147
+ ---
148
+
149
+ ## Latency Budget Guidelines
150
+
151
+ ```
152
+ User perceived latency targets:
153
+ < 100ms → Feels instant
154
+ 100–300ms → Acceptable for most interactions
155
+ 300ms–1s → Noticeable; optimize if possible
156
+ > 1s → Frustrating; unacceptable for critical paths
157
+
158
+ Network latency by distance (approximate):
159
+ Same datacenter: 0.5ms
160
+ Same region (AZ): 1–2ms
161
+ Cross-region US: 30–60ms
162
+ US to Europe: 80–120ms
163
+ US to Asia: 150–250ms
164
+
165
+ Database query targets:
166
+ Simple key-value: < 1ms (cache)
167
+ Simple DB query: < 5ms
168
+ Complex query: < 50ms
169
+ Reporting query: < 500ms (async if > 1s)
170
+ ```
171
+
172
+
173
+ ## Limitations
174
+ - This is a reference document and may not cover all edge cases. Always verify architectures before production.
@@ -0,0 +1,69 @@
1
+ ---
2
+ name: security-checklist
3
+ description: Reference document for monopoly security-checklist.
4
+ risk: safe
5
+ reports-to: monopoly
6
+ ---
7
+
8
+ # MONOPOLY — Security Hardening Checklist
9
+
10
+ ## Network Security
11
+ - [ ] All services inside private VPC; only LB/API GW exposed publicly
12
+ - [ ] Security groups follow least-privilege (deny all, allow specific ports/CIDRs)
13
+ - [ ] NACLs as secondary defense layer
14
+ - [ ] WAF enabled with OWASP top 10 ruleset
15
+ - [ ] DDoS protection (Cloudflare / AWS Shield Standard minimum)
16
+ - [ ] VPN or Private Link for inter-service communication in multi-region deployments
17
+
18
+ ## Authentication & Authorization
19
+ - [ ] JWTs with short expiry (15-minute access token, 7-day refresh token)
20
+ - [ ] OAuth 2.0 / OIDC for third-party auth
21
+ - [ ] MFA enforced for admin accounts
22
+ - [ ] RBAC or ABAC for authorization
23
+ - [ ] No secrets in JWT payload (use opaque references)
24
+ - [ ] Token revocation strategy (Redis blocklist or short TTL)
25
+
26
+ ## API Security
27
+ - [ ] Rate limiting at API gateway (per user, per IP, per endpoint)
28
+ - [ ] Input validation and sanitization on all endpoints
29
+ - [ ] SQL injection prevention (parameterized queries, ORM)
30
+ - [ ] XSS prevention (output encoding, CSP headers)
31
+ - [ ] CSRF protection (SameSite cookies, CSRF tokens)
32
+ - [ ] CORS policy locked down (not wildcard `*`)
33
+ - [ ] HTTP security headers (HSTS, X-Frame-Options, X-Content-Type-Options)
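
Per-key rate limiting is commonly built on a token bucket. The sketch below is an in-memory illustration with hypothetical class names; a gateway would normally keep the buckets in a shared store so that every instance enforces the same limit.

```python
import time


class TokenBucket:
    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill for the elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class RateLimiter:
    """One bucket per key, for example a user id, an IP, or an endpoint."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.args = (rate_per_second, capacity, clock)
        self.buckets = {}

    def allow(self, key):
        if key not in self.buckets:
            self.buckets[key] = TokenBucket(*self.args)
        return self.buckets[key].allow()


now = [0.0]                                   # fake clock for the demo
limiter = RateLimiter(rate_per_second=1, capacity=2, clock=lambda: now[0])
burst = [limiter.allow("user-1") for _ in range(3)]   # third call exceeds the burst
other_user = limiter.allow("user-2")                  # separate bucket, unaffected
now[0] = 1.0                                          # one second refills one token
after_refill = limiter.allow("user-1")
```

The capacity sets the allowed burst and the rate sets the sustained limit, so the two can be tuned separately per user, IP, or endpoint.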
34
+
35
+ ## Data Security
36
+ - [ ] Encryption in transit (TLS 1.2+ everywhere, TLS 1.3 preferred)
37
+ - [ ] Encryption at rest (AES-256 for DBs, S3 SSE)
38
+ - [ ] PII data identified, minimized, and encrypted at field level where needed
39
+ - [ ] Database backups encrypted
40
+ - [ ] No sensitive data in logs (PII, passwords, tokens, card numbers)
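
One way to back the logging item with code is a `logging.Filter` that masks sensitive values before records are emitted. The patterns below are illustrative assumptions only (13 to 19 digit runs for card numbers, `password=`, `token=`, and `secret=` pairs) and are not a complete PII detector.

```python
import logging
import re


class RedactingFilter(logging.Filter):
    """Mask card-like digit runs and secret-like key=value pairs."""

    PATTERNS = [
        (re.compile(r"\b\d{13,19}\b"), "[REDACTED_PAN]"),
        (re.compile(r"(?i)\b(password|token|secret)=\S+"), r"\1=[REDACTED]"),
    ]

    def filter(self, record):
        message = record.getMessage()
        for pattern, replacement in self.PATTERNS:
            message = pattern.sub(replacement, message)
        # Replace the formatted message so handlers only see the masked text.
        record.msg, record.args = message, None
        return True


def redact(text):
    """Helper for checking the filter's behavior on a single message."""
    record = logging.LogRecord("app", logging.INFO, "example.py", 0, text, None, None)
    RedactingFilter().filter(record)
    return record.getMessage()


masked_card = redact("charge card 4111111111111111 ok")
masked_password = redact("login password=hunter2 user=bob")
```

Attaching the filter with `handler.addFilter(RedactingFilter())` applies it to every record that handler emits; filtering at the source is still preferable to relying on pattern matching.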
41
+
42
+ ## Secrets Management
43
+ - [ ] No plain-text secrets in code or environment variables
44
+ - [ ] Secrets manager in use (HashiCorp Vault, AWS Secrets Manager, GCP Secret Manager)
45
+ - [ ] Secrets rotation automated
46
+ - [ ] IAM roles for service-to-service auth (not static credentials)
47
+
48
+ ## Supply Chain & Dependencies
49
+ - [ ] Dependency scanning (Snyk, Dependabot, npm audit)
50
+ - [ ] Container image scanning (Trivy, ECR scanning)
51
+ - [ ] Pin dependency versions in production
52
+ - [ ] SBOM (Software Bill of Materials) generated for compliance
53
+
54
+ ## Incident Response
55
+ - [ ] Audit logs for all admin actions and data access
56
+ - [ ] Alerting on anomalous access patterns
57
+ - [ ] Incident response runbook documented
58
+ - [ ] Data breach notification process defined (GDPR 72-hour rule)
59
+ - [ ] Regular penetration testing scheduled
60
+
61
+ ## Compliance (as applicable)
62
+ - [ ] GDPR: data residency, right to deletion, consent tracking
63
+ - [ ] PCI-DSS: if handling card data — never store raw PANs
64
+ - [ ] HIPAA: if health data — encryption, audit logs, BAA with vendors
65
+ - [ ] SOC 2 Type II: access control, availability, confidentiality evidence
66
+
67
+
68
+ ## Limitations
69
+ - This is a reference document and may not cover all edge cases. Always verify architectures before production.