engsys 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +202 -0
- package/core/agents/aaron.md +152 -0
- package/core/agents/bert.md +115 -0
- package/core/agents/isabelle.md +136 -0
- package/core/agents/jody.md +150 -0
- package/core/agents/leith.md +111 -0
- package/core/agents/marcelo.md +282 -0
- package/core/agents/melvin.md +101 -0
- package/core/agents/nyx.md +152 -0
- package/core/agents/otto.md +168 -0
- package/core/agents/patricia.md +283 -0
- package/core/commands/design-audit-local.md +155 -0
- package/core/commands/design-audit.md +235 -0
- package/core/commands/design-critique.md +96 -0
- package/core/commands/file-issue.md +22 -0
- package/core/commands/generate-project.md +45 -0
- package/core/commands/implement-issue.md +37 -0
- package/core/commands/implement-project.md +40 -0
- package/core/commands/naturalize.md +61 -0
- package/core/commands/pre-push.md +29 -0
- package/core/commands/prep-review-collect.md +130 -0
- package/core/commands/prep-review-finalize.md +121 -0
- package/core/commands/prep-review-publish.md +113 -0
- package/core/commands/prep-review.md +65 -0
- package/core/commands/project-closeout.md +25 -0
- package/core/skills/agentic-eval/SKILL.md +195 -0
- package/core/skills/chrome-devtools/SKILL.md +97 -0
- package/core/skills/code-review/SKILL.md +26 -0
- package/core/skills/gh-cli/SKILL.md +2202 -0
- package/core/skills/git-commit/SKILL.md +124 -0
- package/core/skills/git-workflow-agents/SKILL.md +462 -0
- package/core/skills/git-workflow-agents/reference.md +220 -0
- package/core/skills/github-actions/SKILL.md +190 -0
- package/core/skills/github-issues/SKILL.md +154 -0
- package/core/skills/llm-structured-outputs/SKILL.md +323 -0
- package/core/skills/llm-structured-outputs/references/provider-details.md +392 -0
- package/core/skills/pre-push/SKILL.md +115 -0
- package/core/skills/refactor/SKILL.md +645 -0
- package/core/skills/web-design-reviewer/SKILL.md +371 -0
- package/core/skills/webapp-testing/SKILL.md +127 -0
- package/core/skills/webapp-testing/test-helper.js +56 -0
- package/core/templates/CLAUDE.md.tmpl +98 -0
- package/core/templates/adr-template.md +67 -0
- package/core/templates/gh-issue-templates/bug.md +39 -0
- package/core/templates/gh-issue-templates/content.md +42 -0
- package/core/templates/gh-issue-templates/enhancement.md +36 -0
- package/core/templates/gh-issue-templates/feature.md +39 -0
- package/core/templates/gh-issue-templates/infrastructure.md +41 -0
- package/core/templates/post-edit-reminders.sh.tmpl +19 -0
- package/core/templates/settings.json.tmpl +90 -0
- package/core/templates/settings.local.json.tmpl +3 -0
- package/core/workflows/agent-implementation-workflow.md +346 -0
- package/core/workflows/generate-project.md +258 -0
- package/core/workflows/implement-project-workflow.md +190 -0
- package/core/workflows/issue-tracking.md +89 -0
- package/core/workflows/project-closeout-ceremony.md +77 -0
- package/core/workflows/review-workflow.md +266 -0
- package/engsys.config.example.yaml +46 -0
- package/install +202 -0
- package/lessons-library/README.md +80 -0
- package/lessons-library/async-callbacks-verify-liveness.md +15 -0
- package/lessons-library/change-isnt-done-until-every-surface-updated.md +15 -0
- package/lessons-library/claim-then-act-for-irreversible-ops.md +16 -0
- package/lessons-library/co-commit-entangled-work.md +15 -0
- package/lessons-library/dependabot-triage-playbook.md +17 -0
- package/lessons-library/deploy-by-digest-and-verify-the-running-revision.md +15 -0
- package/lessons-library/enforce-your-guarantee-at-your-boundary.md +16 -0
- package/lessons-library/gate-changes-on-measurement-not-vibes.md +15 -0
- package/lessons-library/iac-first-no-console-changes.md +15 -0
- package/lessons-library/independent-objective-review-gate.md +15 -0
- package/lessons-library/keep-an-immutable-source-of-truth.md +15 -0
- package/lessons-library/long-agent-runs-checkpoint-not-poll.md +15 -0
- package/lessons-library/model-identity-with-stable-ids-and-provenance.md +15 -0
- package/lessons-library/operator-choices-are-first-class.md +15 -0
- package/lessons-library/prefer-tool-enforced-structured-output.md +15 -0
- package/lessons-library/prove-causation-before-acting.md +15 -0
- package/lessons-library/re-read-state-before-acting.md +14 -0
- package/lessons-library/read-layer-tolerates-unbackfilled-rows.md +15 -0
- package/lessons-library/shell-safety-pipefail-and-validate-before-teardown.md +14 -0
- package/lessons-library/shift-correctness-left-and-distrust-false-greens.md +15 -0
- package/lessons-library/stray-control-bytes-hide-changes.md +14 -0
- package/lessons-library/tests-can-assert-the-bug.md +15 -0
- package/lessons-library/verify-ground-truth-not-reports.md +15 -0
- package/lessons-library/worktrees-need-bootstrap-from-origin-main.md +15 -0
- package/lib/commands.js +356 -0
- package/lib/generate-team-avatars.mjs +251 -0
- package/lib/manifest.js +155 -0
- package/lib/render.js +135 -0
- package/lib/selftest.js +90 -0
- package/lib/util.js +89 -0
- package/lib/yaml.js +156 -0
- package/optional-agents/gary.md +86 -0
- package/optional-agents/jos.md +136 -0
- package/optional-agents/sandy.md +101 -0
- package/optional-agents/steve.md +161 -0
- package/package.json +43 -0
- package/stacks/cloud/aws/claude.fragment.md +17 -0
- package/stacks/cloud/aws/settings.fragment.json +39 -0
- package/stacks/cloud/aws/skills/aws-deployment-preflight/SKILL.md +165 -0
- package/stacks/cloud/aws/skills/cloud-architecture-aws/SKILL.md +265 -0
- package/stacks/cloud/azure/claude.fragment.md +17 -0
- package/stacks/cloud/azure/settings.fragment.json +45 -0
- package/stacks/cloud/azure/skills/azure-deployment-preflight/SKILL.md +175 -0
- package/stacks/cloud/azure/skills/cloud-architecture-azure/SKILL.md +211 -0
- package/stacks/cloud/cloudflare/claude.fragment.md +21 -0
- package/stacks/cloud/cloudflare/settings.fragment.json +31 -0
- package/stacks/cloud/cloudflare/skills/cloud-architecture-cloudflare/SKILL.md +294 -0
- package/stacks/cloud/cloudflare/skills/cloudflare-deployment-preflight/SKILL.md +175 -0
- package/stacks/cloud/gcp/claude.fragment.md +17 -0
- package/stacks/cloud/gcp/settings.fragment.json +40 -0
- package/stacks/cloud/gcp/skills/cloud-architecture-gcp/SKILL.md +208 -0
- package/stacks/cloud/gcp/skills/gcp-deployment-preflight/SKILL.md +137 -0
- package/stacks/db/mongo/skills/mongo-conventions/SKILL.md +96 -0
- package/stacks/db/prisma/claude.fragment.md +49 -0
- package/stacks/db/prisma/skills/docker-database-package-copy/SKILL.md +44 -0
- package/stacks/db/prisma/skills/prisma-conventions/SKILL.md +37 -0
- package/stacks/domain/mobile-growth/skills/apple-ads/SKILL.md +184 -0
- package/stacks/domain/mobile-growth/skills/apple-ads/references/benchmark-notes.md +47 -0
- package/stacks/domain/mobile-growth/skills/apple-ads/references/official-links.md +53 -0
- package/stacks/domain/mobile-growth/skills/google-play-growth/SKILL.md +197 -0
- package/stacks/domain/mobile-growth/skills/google-play-growth/references/benchmark-notes.md +47 -0
- package/stacks/domain/mobile-growth/skills/google-play-growth/references/official-links.md +45 -0
- package/stacks/iac/bicep/claude.fragment.md +14 -0
- package/stacks/iac/bicep/settings.fragment.json +20 -0
- package/stacks/iac/bicep/skills/iac-bicep/SKILL.md +113 -0
- package/stacks/iac/cdk/claude.fragment.md +14 -0
- package/stacks/iac/cdk/settings.fragment.json +23 -0
- package/stacks/iac/cdk/skills/iac-cdk/SKILL.md +104 -0
- package/stacks/iac/terraform/claude.fragment.md +13 -0
- package/stacks/iac/terraform/settings.fragment.json +25 -0
- package/stacks/iac/terraform/skills/iac-terraform/SKILL.md +93 -0
- package/stacks/iac/terraform/skills/terraform-conventions/SKILL.md +87 -0
- package/stacks/lang/kotlin/skills/android-testing/SKILL.md +263 -0
- package/stacks/lang/kotlin/skills/jetpack-compose/SKILL.md +264 -0
- package/stacks/lang/kotlin/skills/kotlin-coroutines/SKILL.md +329 -0
- package/stacks/lang/python/skills/python-conventions/SKILL.md +61 -0
- package/stacks/lang/shell/skills/shell-scripting/SKILL.md +110 -0
- package/stacks/lang/swift/skills/swift-concurrency/SKILL.md +423 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/approachable-concurrency.md +80 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/concurrency-patterns.md +233 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/swiftui-concurrency.md +187 -0
- package/stacks/lang/swift/skills/swift-concurrency/references/synchronization-primitives.md +341 -0
- package/stacks/lang/swift/skills/swift-testing/SKILL.md +497 -0
- package/stacks/lang/swift/skills/swift-testing/references/testing-advanced.md +106 -0
- package/stacks/lang/swift/skills/swift-testing/references/testing-patterns.md +504 -0
- package/stacks/lang/swift/skills/swiftdata/SKILL.md +334 -0
- package/stacks/lang/swift/skills/swiftdata/references/core-data-coexistence.md +504 -0
- package/stacks/lang/swift/skills/swiftdata/references/swiftdata-advanced.md +975 -0
- package/stacks/lang/swift/skills/swiftdata/references/swiftdata-queries.md +675 -0
- package/stacks/lang/swift/skills/swiftui-patterns/SKILL.md +371 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/architecture-patterns.md +486 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/deprecated-migration.md +1097 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/design-polish.md +780 -0
- package/stacks/lang/swift/skills/swiftui-patterns/references/platform-and-sharing.md +696 -0
- package/stacks/lang/typescript/skills/typescript-conventions/SKILL.md +91 -0
- package/stacks/platform/android/claude.fragment.md +40 -0
- package/stacks/platform/android/hooks/pre-push-gradle.sh +70 -0
- package/stacks/platform/android/settings.fragment.json +13 -0
- package/stacks/platform/android/skills/android-build-conventions/SKILL.md +247 -0
- package/stacks/platform/ios/claude.fragment.md +24 -0
- package/stacks/platform/ios/hooks/pre-push-xcodebuild.sh +82 -0
- package/stacks/platform/ios/settings.fragment.json +21 -0
- package/stacks/platform/ios/skills/xcodebuildmcp-simulator-logs/SKILL.md +76 -0
- package/stacks/platform/web/skills/frontend-testing/SKILL.md +246 -0
- package/stacks/platform/web/skills/react-conventions/SKILL.md +261 -0
- package/stacks/platform/web/skills/web-platform-conventions/SKILL.md +55 -0
- package/stacks/tooling/issue-tracker-github/claude.fragment.md +10 -0
- package/stacks/tooling/issue-tracker-github/settings.fragment.json +24 -0
- package/stacks/tooling/issue-tracker-github/skills/issue-tracker-github/SKILL.md +278 -0
- package/stacks/tooling/issue-tracker-linear/claude.fragment.md +17 -0
- package/stacks/tooling/issue-tracker-linear/settings.fragment.json +9 -0
- package/stacks/tooling/issue-tracker-linear/skills/issue-tracker-linear/SKILL.md +183 -0
@@ -0,0 +1,211 @@
---
name: cloud-architecture-azure
description: Azure service-level architecture knowledge covering compute (Container Apps/AKS/Functions), data (PostgreSQL Flexible Server/Cosmos DB), messaging (Service Bus/Event Grid/Event Hubs), edge (Front Door/API Management), identity (Entra), storage + secrets (Blob/Key Vault/ACR), and Azure OpenAI. Cost models, service limits, failure modes, and cold-start gotchas. Activate when the active cloud is Azure and the work involves designing, scaling, costing, or diagnosing Azure architecture (Container Apps cold starts, PostgreSQL connection limits, Service Bus quotas, Front Door, NAT egress).
---

# Azure Architecture Knowledge

Service-level detail for an Azure-backed project. Pairs with Melvin's cloud-agnostic
diagnostic checklist (traffic pattern, state location, SLAs, blast radius, cost
explosion, coordination, limits, observability); this pack supplies the Azure-specific
answers. For concrete topology, cost tiers, and stack context, read the architecture
docs named in `CLAUDE.md`. For deep service docs use the `microsoft-docs` skill
(Microsoft Learn MCP).

## Compute

### Azure Container Apps (ACA)

- Serverless containers on managed Kubernetes (KEDA-based autoscaling). **Scale to
  zero** is the cost win and the latency trap: a scaled-to-zero app pays a **cold
  start** (image pull + container start, seconds) on the next request. For
  latency-sensitive services set **`minReplicas >= 1`** (a warm replica), the ACA
  analogue of provisioned concurrency.
- **Revisions:** each revision-scope change (image or template config) creates a
  revision; traffic-split between them for blue/green and canary. The phantom
  "revision nobody recognizes" is usually a stale active revision still taking
  traffic; check the revision list and traffic split.
- **Scaling:** KEDA scale rules on HTTP concurrency, CPU/memory, or queue length
  (Service Bus, etc.). Per-app and per-environment replica ceilings apply; know them
  before you bet a spike on autoscale.
- **Good for:** microservices, HTTP APIs, queue/event workers. Prefer it over AKS unless
  you genuinely need Kubernetes primitives.

### AKS

- Full Kubernetes: reach for it only when you need operators, complex scheduling,
  service mesh, or a multi-tenant platform. Otherwise it's operational overhead you'll
  regret; Container Apps or Functions usually suffice.

### Azure Functions

- Event-driven serverless. **Consumption** plan scales to zero (cold starts) and bills
  per execution+GB-s; **Premium** keeps pre-warmed instances (no cold start, VNet
  integration); **Dedicated/App Service** for predictable steady load. Durable Functions
  for orchestration/fan-out-fan-in workflows.

## Data

### PostgreSQL Flexible Server

- **Connection limits** scale with tier/size and are the classic bottleneck:
  serverless/many-replica apps exhaust them. Use **PgBouncer** (built-in pooler), but
  it is **not available on the Burstable tier**; only GeneralPurpose+ supports it (see
  the IaC lessons). On Burstable, pool in-app or upsize.
- **Tiers:** Burstable (`B`-series: dev/cheap, throttled baseline CPU, no PgBouncer) →
  GeneralPurpose (`D`-series) → MemoryOptimized (`E`-series). HA = zone-redundant standby
  (failover drops connections, so apps must reconnect/retry). Read replicas scale reads.
- **The metric is `active_connections`**, not `connection_percent` (that's Azure SQL);
  this matters for alerts (see IaC lessons).
- IOPS and storage scale together up to a point; provisioned IOPS available on higher
  tiers. Watch IOPS on write-heavy workloads.
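
The connection-limit bottleneck described above is simple arithmetic that can be checked before a deploy. The following is a minimal sketch, not an Azure API: the function names are hypothetical, and the numbers (`max_connections=50`, 15 reserved slots, 10 replicas) are illustrative assumptions rather than real tier limits. Read the actual limit for your tier from the docs or from `SHOW max_connections`.

```python
def connection_budget(max_connections: int, reserved: int,
                      max_replicas: int, pool_size_per_replica: int) -> dict:
    """Compare worst-case app connections against the server's limit.

    All inputs are assumptions supplied by the caller; nothing here
    queries Azure or PostgreSQL.
    """
    usable = max_connections - reserved            # slots left for the app
    worst_case = max_replicas * pool_size_per_replica
    return {
        "usable": usable,
        "worst_case": worst_case,
        "headroom": usable - worst_case,
        "fits": worst_case <= usable,
    }


def max_safe_pool_size(max_connections: int, reserved: int, max_replicas: int) -> int:
    """Largest per-replica pool that still fits when every replica is running."""
    return (max_connections - reserved) // max_replicas


# Illustrative numbers only: 10 autoscaled replicas with a pool of 5 each.
report = connection_budget(max_connections=50, reserved=15,
                           max_replicas=10, pool_size_per_replica=5)
print(report)
print(max_safe_pool_size(50, 15, 10))
```

If the budget does not fit, the options are the ones listed above: a server-side pooler where the tier supports it, a smaller in-app pool, a lower replica ceiling, or a larger tier.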

### Cosmos DB

- Globally-distributed multi-model. **Partition key design is everything**: a hot
  partition throttles (429) regardless of total RU/s; each physical partition caps at
  ~10,000 RU/s. Pick a high-cardinality, evenly-accessed key.
- **Throughput:** provisioned RU/s (manual or autoscale) vs **serverless**
  (pay-per-request, good for spiky/dev). Five tunable consistency levels (Strong →
  Eventual). Stronger levels cost more RU and latency; Session is the usual default.
- Cost is RU-driven: large items, cross-partition queries, and indexing-everything
  inflate it. Tune the indexing policy.
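
A candidate partition key can be sanity-checked against a sample of real request keys before committing to it. This is a rough sketch under stated assumptions: the sample data, the `partition_skew` and `likely_throttled` names, and the 20,000 RU/s workload figure are all hypothetical, and the estimate treats the hottest key's share of requests as its share of RU, which is only an approximation.

```python
from collections import Counter


def partition_skew(keys):
    """Return (hottest_key, share_of_requests, distinct_keys) for a key sample."""
    counts = Counter(keys)
    hottest, hits = counts.most_common(1)[0]
    return hottest, hits / len(keys), len(counts)


def likely_throttled(share, total_ru_per_s, partition_cap=10_000):
    """True if the hottest key alone would exceed one physical partition's cap.

    partition_cap defaults to the ~10,000 RU/s figure quoted above.
    """
    return share * total_ru_per_s > partition_cap


# Hypothetical sample: one tenant dominates the traffic.
sample = ["tenant-a"] * 70 + ["tenant-b"] * 20 + ["tenant-c"] * 10
hottest, share, distinct = partition_skew(sample)
print(hottest, share, distinct)
print(likely_throttled(share, total_ru_per_s=20_000))
```

A skewed result like this one suggests a higher-cardinality key or a composite key, since raising total RU/s does not lift the per-partition ceiling.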

## Messaging & Orchestration

### Service Bus

- Enterprise broker: **queues** (point-to-point) and **topics/subscriptions** (pub/sub
  with SQL/correlation filters). **Sessions** for ordered/grouped processing, **DLQ**
  built-in (configure max delivery count), scheduled + deferred messages, duplicate
  detection.
- **Tiers:** Basic (queues only) / Standard (topics, pay-per-op) / **Premium**
  (dedicated capacity, predictable latency, VNet, larger messages, required for serious
  throughput). Standard has per-namespace throughput limits; use Premium for high volume.
- Lock duration must exceed processing time or you get redelivery, so idempotent
  consumers are required.
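
The idempotent-consumer requirement above can be sketched without any SDK. This is an illustration only: `IdempotentConsumer` is a hypothetical class, the in-memory set stands in for a durable store (for example a table with a unique constraint on the message id), and the `(message_id, body)` shape is an assumption rather than the Service Bus message type.

```python
class IdempotentConsumer:
    """Skip side effects for messages that were already processed."""

    def __init__(self, handler):
        self._handler = handler
        # Stand-in for a durable store; an in-memory set does not survive restarts.
        self._processed = set()

    def handle(self, message_id, body):
        """Return True if the handler ran, False if this was a redelivery."""
        if message_id in self._processed:
            return False
        self._handler(body)
        # Record only after success, so a crash mid-handler leads to a retry.
        self._processed.add(message_id)
        return True


charged = []
consumer = IdempotentConsumer(charged.append)

first = consumer.handle("msg-1", {"order": 42})
# Lock expired during processing, so the broker delivers the same message again.
second = consumer.handle("msg-1", {"order": 42})
print(first, second, charged)
```

Marking the message after the handler succeeds means a failure between the two steps can still repeat the side effect, so the handler's own write should ideally be idempotent as well.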

### Event Grid / Event Hubs

- **Event Grid:** lightweight reactive pub/sub for discrete events (resource changes,
  custom topics) with filtering: fan-out, low latency, cheap.
- **Event Hubs:** high-throughput streaming/ingestion (Kafka-compatible), partitioned,
  consumer groups; suited to telemetry and log/event firehoses. Use Event Hubs for
  *streams*, Event Grid for *discrete events*, Service Bus for *transactional work
  queues*.

### Durable Functions / Logic Apps

- Durable Functions for code-first orchestration; Logic Apps for low-code
  connector-driven integration workflows.

## Edge & Networking

### Front Door

- Global L7 load balancer + CDN + WAF at the edge. Anycast routing, TLS termination,
  caching, path/host routing to origins. Use it for global entry, edge caching, and WAF;
  cache key + rules drive hit ratio.

### API Management (APIM)

- Full API gateway: policies (rate limiting, transformation, auth), product/subscription
  keys, developer portal. Heavier and pricier than Front Door routing (Consumption tier
  for serverless-style billing, Developer/Standard/Premium otherwise); use when you
  need real API-management features.

### VNet / Networking: cost landmines

- **NAT Gateway** + outbound data processing, and **cross-zone / cross-region data
  transfer** are the egress budget-eaters (same shape as every cloud). Use **Private
  Endpoints / Private Link** to keep traffic off the public path and **Service
  Endpoints** where applicable. Default to private; public only where required.

## Identity

### Microsoft Entra

- **Entra ID** (workforce) and **Entra External ID** (customers/CIAM, the successor to
  Azure AD B2C) for app auth: OIDC/OAuth2, social + federated IdPs, MFA, conditional
  access. **Managed identities** (system- or user-assigned) are the right way for
  Azure resources to authenticate to each other, with no secrets in code. Prefer managed
  identity + Key Vault references over connection strings everywhere.

## Storage, Secrets & Registry

### Blob Storage

- Tiers: Hot → Cool → Cold → Archive (storage gets cheaper, retrieval latency/cost
  rises). Lifecycle policies to age data down. Encryption at rest by default.
  Cost = storage + transactions + egress.

### Key Vault

- Secrets, keys, certs. Reference from Container Apps / App Service as secret refs and
  from Bicep via `getSecret()` to keep secrets out of templates and env files. **Name is
  globally unique** and **soft-deleted on delete** (purge protection can block name
  reuse); see IaC lessons. RBAC or access-policy authorization model; managed identity
  for access.

### Azure Container Registry (ACR)

- Private image registry. **SKU restrictions bite** (see IaC lessons): Basic may be
  unavailable on some subscriptions; `retentionPolicy` requires **Premium** (remove it
  from Standard dev/staging). Use managed identity / ACR tokens for pulls.

## Azure OpenAI / AI

- Managed OpenAI + Azure-hosted models via deployments. **Capacity is in TPM
  (tokens-per-minute) quota per deployment per region**, which is the throughput
  ceiling; a naive high-volume pipeline hits 429s, so request quota early and add
  backoff. **Provisioned Throughput Units (PTUs)** reserve guaranteed capacity/latency
  (committed spend) vs pay-as-you-go. Model availability varies by region. Pair with
  **Azure AI Search** for managed RAG. Cost is token-driven, so right-size the model
  per task.
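
The "add backoff" advice above could look roughly like this. It is a generic sketch, not the Azure OpenAI SDK: `RateLimited` is a hypothetical stand-in for a 429 response, `call_with_backoff` is an illustrative helper, and the default delays are assumptions to tune against your own quota.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for an HTTP 429 from the model endpoint."""

    def __init__(self, retry_after=None):
        super().__init__("429 Too Many Requests")
        self.retry_after = retry_after  # seconds, if the server suggested a wait


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Retry `call` on RateLimited with capped exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimited as exc:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the 429 to the caller
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Prefer the server's hint; otherwise pick a random delay up to the ceiling.
            delay = exc.retry_after if exc.retry_after is not None else random.uniform(0, ceiling)
            sleep(delay)


# Usage with a fake endpoint that is throttled twice, then succeeds.
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RateLimited(retry_after=0.5)
    return "ok"

slept = []
result = call_with_backoff(flaky, sleep=slept.append)
print(result, slept)
```

Injecting `sleep` keeps the helper testable without real waits. Random jitter spreads retries from many workers so they do not all hit the quota window at the same instant.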

## Cost realism (where Azure bills explode)

1. **NAT Gateway / outbound data processing**: per-GB egress. Use Private Endpoints.
2. **Cross-zone / cross-region data transfer**: per-GB.
3. **Container Apps / Functions over-provisioning**: vCPU-s on idle min-replicas.
4. **PostgreSQL IOPS + over-sized tier**; Cosmos RU/s over-provisioning.
5. **Service Bus Premium** dedicated capacity; APIM tier.
6. **Azure OpenAI tokens / PTUs**.
7. **Front Door + Blob egress**.

Levers: Reservations / Savings Plans (steady baseline), Spot VMs (fault-tolerant batch),
right-sizing from Azure Monitor, Private Endpoints, Blob lifecycle tiering, Cost
Management budgets + alerts.

## Service limits (check before betting on them)

Container Apps replicas per app/environment, Functions scale limits, Service Bus
per-namespace throughput (Standard) + entity counts, PostgreSQL `max_connections` per
tier, Cosmos RU/s per partition, Front Door routes, APIM throughput per tier, Azure
OpenAI TPM/PTU per region, Key Vault transactions/sec, subscription-level vCPU quotas
per region/family. Raise soft quotas via support with lead time.

## Observability

Azure Monitor (metrics + alerts), **Log Analytics workspace** (the store the
`AppRequests`/`ContainerAppConsoleLogs` tables live in; KQL queries scope to the
*workspace*, not App Insights), Application Insights (APM/distributed tracing). Alarm on
Container Apps replica count + restarts, Service Bus DLQ depth + active message count,
PostgreSQL `active_connections` + CPU, Cosmos 429 rate + RU consumption, Front Door
backend health.

## Hard-won lessons

### Container Apps VNet integration cannot be retrofitted
**Symptom:** You need Private Endpoints (e.g. a private DB path) but the Container
Apps environment was created without VNet integration; there's no in-place toggle.
**Cause:** A VNet-integrated ACA environment routes egress through the VNet, and that
is a creation-time property of the *environment*, not a setting you can flip later.
**Fix:** Create the environment **VNet-integrated from the start** whenever Private
Endpoints/Link are on the roadmap (do it while the resource group is still empty).
Retrofitting means tearing down and recreating the environment.

### Deploy by immutable digest + unique revision; verify the active revision
**Symptom:** A "green" deploy reports success but the running app is still the old
build: the new code never actually took traffic.
**Cause:** A mutable image tag (`:latest`) can resolve to a stale image, or a config
change that produces no new revision leaves the prior revision active.
**Fix:** Deploy by **immutable image digest** with a **unique revision suffix**, then
**verify the active revision** after deploy (revision list + traffic split). Treat
the post-deploy revision check as part of the deploy, not an afterthought.
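
The post-deploy revision check described in the fix above can be automated. The sketch below assumes the revision list has already been fetched and flattened into plain dicts. The field names (`name`, `active`, `traffic_weight`, `image`), the `--` separator before the revision suffix, and the sample values are illustrative assumptions, not the exact CLI output schema.

```python
def verify_active_revision(revisions, expected_digest, expected_suffix):
    """Return a list of problems; an empty list means the deploy really took traffic."""
    problems = []
    serving = [r for r in revisions if r["active"] and r["traffic_weight"] > 0]
    if not serving:
        problems.append("no active revision is receiving traffic")
    for rev in serving:
        if not rev["image"].endswith("@" + expected_digest):
            problems.append(f"{rev['name']} serves an unexpected image: {rev['image']}")
        if not rev["name"].endswith("--" + expected_suffix):
            problems.append(f"{rev['name']} is not the revision just deployed")
    if serving and sum(r["traffic_weight"] for r in serving) != 100:
        problems.append("traffic weights do not add up to 100")
    return problems


# Hypothetical state after a "green" deploy that never shifted traffic.
digest = "sha256:" + "ab" * 32
stale = [
    {"name": "api--build41", "active": True, "traffic_weight": 100,
     "image": "example.azurecr.io/api:latest"},
    {"name": "api--build42", "active": True, "traffic_weight": 0,
     "image": "example.azurecr.io/api@" + digest},
]
for problem in verify_active_revision(stale, digest, "build42"):
    print(problem)
```

Failing the pipeline when the returned list is non-empty makes the check part of the deploy rather than a manual follow-up.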
@@ -0,0 +1,21 @@
## Cloud stack

- **Active cloud: Cloudflare.** Architecture and deploys target Cloudflare's edge
  (Workers/Pages + R2/D1/KV/Durable Objects/Queues); agents load the
  `cloud-architecture-cloudflare` and `cloudflare-deployment-preflight` skill packs.
- **Tool preference order** (when investigating or validating cloud state):
  1. **Wrangler CLI, read-only**: `wrangler whoami`, `wrangler deploy --dry-run`,
     `wrangler d1 list` / `wrangler d1 info`, `wrangler kv namespace list`,
     `wrangler r2 bucket list`, `wrangler queues list`, `wrangler versions list`,
     `wrangler secret list`, `wrangler tail` and similar inspection commands. Never
     mutate state to answer a question.
  2. **Docs source**: official Cloudflare documentation (developers.cloudflare.com) for
     service limits, plan tiers, pricing, and binding/API behavior. Limits are plan-tier
     dependent and change, so verify against docs rather than from memory.
- Mutating actions (`wrangler deploy`, `secret put`, `d1 execute`, `d1 migrations apply`,
  `r2 object delete`, `delete`) go through the `cloudflare-deployment-preflight` gate and
  a staged `versions upload` + gradual rollout, never an ad-hoc bare deploy.

<!-- naturalize: confirm the Cloudflare account, the plan tier (Free/Paid/Enterprise;
limits depend on it), the state primitives in use (KV/D1/DO/R2), and the path to the
architecture/cost docs Melvin and Aaron should read for concrete topology. -->
@@ -0,0 +1,31 @@
{
  "permissions": {
    "allow": [
      "Bash(wrangler whoami:*)",
      "Bash(wrangler deploy --dry-run:*)",
      "Bash(wrangler pages functions build:*)",
      "Bash(wrangler d1 list:*)",
      "Bash(wrangler d1 info:*)",
      "Bash(wrangler d1 migrations list:*)",
      "Bash(wrangler kv:key list:*)",
      "Bash(wrangler kv:namespace list:*)",
      "Bash(wrangler kv namespace list:*)",
      "Bash(wrangler r2 bucket list:*)",
      "Bash(wrangler queues list:*)",
      "Bash(wrangler versions list:*)",
      "Bash(wrangler versions view:*)",
      "Bash(wrangler secret list:*)",
      "Bash(wrangler tail:*)"
    ],
    "deny": [
      "Bash(wrangler delete:*)",
      "Bash(wrangler d1 execute:*)",
      "Bash(wrangler d1 migrations apply:*)",
      "Bash(wrangler secret put:*)",
      "Bash(wrangler secret delete:*)",
      "Bash(wrangler r2 object delete:*)",
      "Bash(wrangler kv:key delete:*)"
    ]
  },
  "mcpServers": {}
}
@@ -0,0 +1,294 @@
---
name: cloud-architecture-cloudflare
description: Cloudflare edge/serverless architecture knowledge covering compute (Workers isolates/Containers, smart placement), Pages, storage (R2/D1/KV/Durable Objects/Vectorize), messaging (Queues), AI (Workers AI/AI Gateway), and origin pooling (Hyperdrive/Cache API). CPU-time and subrequest limits, consistency tradeoffs, failure modes, and request+CPU pricing gotchas (no R2 egress). Activate when the active cloud is Cloudflare and the work involves designing, scaling, costing, or diagnosing Cloudflare architecture (Workers CPU limits, D1 write limits, KV eventual consistency, Durable Object single-threading, R2 egress savings, subrequest caps).
---

# Cloudflare Architecture Knowledge

Service-level detail for a Cloudflare-backed project. Pairs with Melvin's cloud-agnostic
diagnostic checklist (traffic pattern, state location, SLAs, blast radius, cost
explosion, coordination, limits, observability); this pack supplies the
Cloudflare-specific answers for each. Cloudflare's model is fundamentally different from
AWS/Azure: there are no regions you provision into and no VMs. Code runs in **V8
isolates** at the edge POP nearest the user, and state lives in purpose-built edge
primitives. For concrete project topology, cost tiers, and stack context, read the
architecture docs named in `CLAUDE.md`.

## Compute

### Workers (isolates)

- **No cold starts.** Workers run as **V8 isolates**, not containers/VMs: a new isolate
  spins up in <5ms (often "zero" perceived) because there's no OS/runtime boot. This is
  the headline architectural difference from Lambda/Cloud Functions/Container Apps: the
  "scale to zero costs you a cold start" tradeoff that dominates AWS/Azure design simply
  doesn't exist here. Design for it; short-lived, stateless request handlers are ideal.
- **CPU-time limit, not wall-clock.** The cap is **CPU time**, default **30s** on paid,
  configurable up to **5 minutes** (`limits.cpu_ms` in `wrangler.toml`, max 300000).
  Free plan is **10ms** CPU/request. Crucially, **time spent awaiting I/O (fetch, KV, D1)
  does not count** against CPU time: a Worker can wait minutes on a slow subrequest and
  burn near-zero CPU. The killer is *compute* (crypto, JSON parsing huge payloads, image
  work, regex backtracking), not waiting. The historical "50ms CPU" number was the old
  default; the platform now allows far more, but **a CPU-bound hot path is still where you
  get `Exceeded CPU` (Error 1102) errors**, so profile and offload heavy compute.
- **Subrequest limit** is the other hard ceiling: **50 subrequests/request on Free, 1000
  on paid** (`fetch` + binding calls to KV/D1/R2/service bindings each count). A Worker
  that fans out per-item to an API or DB will hit this; batch, cache, or move the fan-out
  into a Durable Object / Queue consumer. Simultaneous open connections cap at ~6.
- **Memory** is **128MB per isolate**, a hard limit: `Exceeded Memory` (1102) on overflow.
  No tuning knob like Lambda's memory→CPU slider. Stream large bodies; never buffer a big
  R2 object fully into memory.
- **Bundle size:** 3MB (Free) / 10MB (paid) compressed per Worker script. Heavy npm deps
  (especially Node built-ins needing `nodejs_compat`) bloat this, so tree-shake.
- **Good for:** edge APIs, auth/routing/transform middleware, request shaping, glue.
  **Bad for:** CPU-heavy batch, anything needing >128MB RAM, long-running stateful compute
  (use Containers or push to a real backend via Hyperdrive).
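
The subrequest ceiling above is easy to budget for ahead of time. The sketch below is language-neutral arithmetic written in Python for illustration (a real Worker would do the same in its own runtime language). `plan_subrequests` and `chunk` are hypothetical helpers, the `reserved` allowance for auth, logging and cache calls is an assumption, and the cap defaults to the paid-plan figure quoted above.

```python
import math


def plan_subrequests(item_count, batch_size, subrequest_cap=1000, reserved=10):
    """Estimate whether a fan-out fits under the per-request subrequest cap."""
    batches = math.ceil(item_count / batch_size)
    budget = subrequest_cap - reserved  # leave room for other calls in the request
    return {"batches": batches, "budget": budget, "fits": batches <= budget}


def chunk(items, batch_size):
    """Split items into consecutive batches of at most batch_size."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


# 5,000 items fetched one by one blows the cap; batches of 100 need only 50 calls.
per_item = plan_subrequests(item_count=5000, batch_size=1)
batched = plan_subrequests(item_count=5000, batch_size=100)
print(per_item)
print(batched)
print([len(b) for b in chunk(list(range(250)), 100)])
```

When even batched calls do not fit, that is the signal to move the fan-out into a Queue consumer or Durable Object, as described above.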
|
|
45
|
+
|
|
46
|
+
### Workers Containers
|
|
47
|
+
|
|
48
|
+
- For workloads that don't fit the isolate model (full Linux, large memory, arbitrary
|
|
49
|
+
runtimes, heavy CPU, existing Docker images), **Workers Containers** run actual
|
|
50
|
+
containers, *orchestrated by a Durable Object* and programmatically started/stopped from
|
|
51
|
+
a Worker. Unlike isolates, containers **do have cold starts** (image pull + boot) and
|
|
52
|
+
bill for the time they're running — the AWS Fargate tradeoff reappears here. Use them as
|
|
53
|
+
the escape hatch, not the default; keep the hot path on isolates.

### Smart Placement

- By default a Worker runs at the POP nearest the *user*. If the Worker makes several
round trips to a **centralized origin** (a DB in one region, a slow upstream API), edge
placement means each subrequest pays the full edge→origin latency, which is roughly the
user→origin distance. **Smart Placement** (`placement = { mode = "smart" }`) lets
Cloudflare instead run the Worker near the *origin*, collapsing N long origin round trips
into one long user round trip. It wins when the Worker is back-end-chatty; there is no
benefit (or a slightly worse result) for a Worker that mostly serves edge data
(KV/cache/static). Pair origin-DB Workers with Smart Placement + Hyperdrive.
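
The trade-off above reduces to simple arithmetic. The sketch below is a deliberately simplified latency model, not a measurement: the field names and the round-trip figures are illustrative assumptions, and real placement decisions are made by Cloudflare from observed traffic.

```typescript
interface PlacementInputs {
  userToEdgeMs: number;     // round trip between the user and the nearest POP
  edgeToOriginMs: number;   // round trip between that POP and the origin region
  nearOriginMs: number;     // round trip from a Worker placed near the origin to the origin
  originRoundTrips: number; // sequential origin calls made per request
}

// Default placement: every origin call crosses the long edge-to-origin hop.
function edgePlacedMs(p: PlacementInputs): number {
  return p.userToEdgeMs + p.originRoundTrips * p.edgeToOriginMs;
}

// Smart Placement: the long hop is paid once, origin calls become short.
function smartPlacedMs(p: PlacementInputs): number {
  return p.userToEdgeMs + p.edgeToOriginMs + p.originRoundTrips * p.nearOriginMs;
}

// Hypothetical numbers for a Worker that makes four sequential DB calls.
const chattyWorker: PlacementInputs = {
  userToEdgeMs: 10,
  edgeToOriginMs: 120,
  nearOriginMs: 5,
  originRoundTrips: 4,
};
```

With these made-up inputs the model gives 490 ms at the edge and 150 ms near the origin. Setting `originRoundTrips` to 0 flips the result (10 ms against 130 ms), which is the "no benefit or slightly worse" case for Workers that serve edge data.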

### Pages

- Git-integrated hosting for static sites + SPA/SSR frameworks, with **Pages Functions**
(Workers under the hood, file-based routing in `functions/`). Automatic preview
deployments per branch/PR. Increasingly converging with Workers (Workers now also serves
static assets via the `assets` binding) — for new projects prefer **Workers + static
assets** unless you specifically want Pages' Git/CI ergonomics. Same isolate runtime,
same limits apply to the Functions.

## Storage & Data

The hardest part of Cloudflare design is **picking the right state primitive** — each has
a sharp consistency/latency/cost profile and they are *not* interchangeable.

### R2 (object storage)

- S3-compatible object storage with the headline feature: **zero egress fees.** You pay
storage + per-operation (Class A writes/lists ~$4.50/M, Class B reads ~$0.36/M) but
**never** for data transferred out. This inverts the AWS cost calculus — for
egress-heavy workloads (media, model weights, large downloads, multi-cloud data sharing)
R2 can be dramatically cheaper than S3. The cost trap moves to **operation counts**: a
workload doing millions of tiny PUTs/LISTs pays on Class A ops even though bytes are
cheap.
- Strongly read-after-write consistent for new objects. Supports multipart upload,
presigned URLs, lifecycle rules, event notifications (→ Queues), and bucket-level
jurisdiction (EU). Access from a Worker via an R2 binding (no egress, no auth round
trip) or via the S3 API. **Stream** objects through the Worker — don't buffer (128MB
isolate cap).
- **Failure mode:** R2 ops still count as subrequests from a Worker and are subject to the
per-request subrequest cap.
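
To see how operation counts become the cost driver, here is a rough estimator built only from the approximate per-million rates quoted above. It is a sketch under stated assumptions: it ignores storage charges and any free allowance, and the function and constant names are hypothetical.

```typescript
// Approximate per-million operation rates quoted in the text above.
const CLASS_A_USD_PER_MILLION = 4.5;  // writes and lists
const CLASS_B_USD_PER_MILLION = 0.36; // reads

// Operation charges only; storage and free-tier allowances are left out.
function r2OperationCostUsd(classAOps: number, classBOps: number): number {
  return (
    (classAOps / 1e6) * CLASS_A_USD_PER_MILLION +
    (classBOps / 1e6) * CLASS_B_USD_PER_MILLION
  );
}

// Number of PUTs needed to write `totalBytes` as objects of `objectBytes` each.
function putsFor(totalBytes: number, objectBytes: number): number {
  return Math.ceil(totalBytes / objectBytes);
}
```

Writing the same volume as many small objects multiplies `putsFor` and therefore the Class A charge, even though the bytes stored are identical; packing small records into larger objects is the usual lever.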

### D1 (SQLite at the edge)

- Managed **SQLite**, exposed to Workers via a binding. Real SQL, transactions, and now
**read replicas** (Sessions API routes reads to a nearby replica and guarantees
read-your-writes for that session). The primary is single-region; replicas are eventually
consistent — writes always go to the primary, so **write latency from a far POP includes
the round trip to the primary region.**
- **Limits that bite:** **10GB max per database** (it's SQLite, not a sharded cluster —
this is a hard product ceiling, plan sharding/partitioning early), roughly **50 D1
database bindings** per Worker, a **100MB max** on a query result, plus row-size limits
and parameter limits per statement. Billing is by **rows read + rows written** (not
queries) — an unindexed query that scans the table bills every row scanned. **Index
aggressively**; watch `rows_read` in the query metadata.
- **Good for:** per-tenant/per-app relational data, config, low-to-moderate write volume.
**Bad for:** a single large multi-tenant OLTP database (10GB ceiling, single-writer),
high-write-fan-in. For those, use Hyperdrive → a real Postgres, or many D1s sharded by
tenant.
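
One way to act on "watch `rows_read`" is to compare rows scanned with rows returned. The sketch below assumes the query metadata carries `rows_read` and `rows_written` fields as named above; the `QueryMeta` interface is a local stand-in, and the threshold of 100 is an arbitrary illustrative value.

```typescript
// Local stand-in for the part of the query metadata this check needs.
interface QueryMeta {
  rows_read: number;
  rows_written: number;
}

// Rows scanned per row returned; large values suggest a full-table scan.
function scanRatio(meta: QueryMeta, rowsReturned: number): number {
  return meta.rows_read / Math.max(rowsReturned, 1);
}

// Flag a query as probably unindexed when the ratio crosses the threshold.
function looksUnindexed(meta: QueryMeta, rowsReturned: number, threshold = 100): boolean {
  return scanRatio(meta, rowsReturned) >= threshold;
}
```

A query that returns 3 rows after reading 50,000 is billed for all 50,000, so logging this ratio per statement tends to surface missing indexes before the invoice does.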

### KV (key-value)

- Global, **eventually consistent** key-value store optimized for **high-read, low-write,
read-from-everywhere** (config, feature flags, routing tables, cached tokens, session
lookups). Reads are fast at the edge (cached at the POP after first read); writes
propagate globally with **eventual consistency — up to ~60s** to be visible everywhere.
- **Do not use KV where you need read-after-write or coordination.** Last-write-wins, no
transactions, no conditional writes across the global view. A counter or a "did I already
process this" flag in KV will lose updates and return stale reads — that's a Durable
Object job.
- **Limits:** value max 25MB, key max 512 bytes, and a soft guidance of **~1 write/sec per
key** (writes to the *same* key are rate-limited, and slow propagation makes write-heavy
patterns a poor fit). Billing per read/write/delete/list op + storage.
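
The lost-update problem is easy to reproduce without any Cloudflare runtime. The following is a minimal simulation, assuming only last-write-wins semantics; `LastWriteWinsStore` is an in-memory stand-in and deliberately not the real KV API.

```typescript
// In-memory stand-in for a last-write-wins store; not the real KV binding.
class LastWriteWinsStore {
  private data = new Map<string, number>();

  get(key: string): number {
    return this.data.get(key) ?? 0;
  }

  put(key: string, value: number): void {
    this.data.set(key, value);
  }
}

// Two writers interleave: both read before either writes.
function racedIncrements(store: LastWriteWinsStore, key: string): number {
  const seenByA = store.get(key);
  const seenByB = store.get(key);
  store.put(key, seenByA + 1);
  store.put(key, seenByB + 1); // silently overwrites A's update
  return store.get(key);
}

// One serialized owner applies the increments in turn.
function serializedIncrements(store: LastWriteWinsStore, key: string): number {
  store.put(key, store.get(key) + 1);
  store.put(key, store.get(key) + 1);
  return store.get(key);
}
```

The raced version ends at 1 after two increments, while the serialized version ends at 2. Serialization through a single owner is the property a Durable Object provides and KV does not.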

### Durable Objects (single-threaded coordination)

- The coordination primitive — a **single-threaded, globally-unique, addressable** object
instance (one per ID, with **transactional, strongly-consistent storage** colocated with
the compute). Because exactly one instance handles all requests for a given ID,
**serially**, it's the right tool for anything KV/D1 can't do safely: counters,
rate limiters, locks, leader election, real-time **WebSocket** rooms/hubs, collaborative
state, per-entity state machines.
- **Single-threaded = the throughput ceiling.** All requests to one DO ID queue and run one
at a time. A "hot object" (one room/tenant taking all the traffic) becomes a bottleneck
that no horizontal scaling fixes — **shard the keyspace** so load spreads across many DO
IDs. This is the DO analogue of DynamoDB's hot partition.
- **WebSockets:** DOs are *the* way to do stateful WebSockets at the edge. Use the
**Hibernation API** — a DO with idle WebSockets can be evicted from memory (you stop
paying for active duration) while keeping connections open, rehydrating on the next
message. Without hibernation, thousands of idle long-lived sockets pin the DO in memory
and bill continuously.
- **Alarms:** a DO can schedule itself to wake later (`storage.setAlarm`) — the primitive
for per-object timers, retries, scheduled flushes, debouncing, and reliable background
work without a cron. Survives eviction.
- **Placement:** a DO lives in one location (near first access, or pinned by jurisdiction).
Cross-region access to a DO pays that latency — colocate the DO with its primary traffic.
- **Cost:** billed on **active duration (GB-s) + requests**; SQLite-backed DOs add
rows-read/written billing. Hibernation is the key lever to avoid paying for idle.
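
Sharding the keyspace usually comes down to deriving the object name deterministically from something that varies per caller. A minimal sketch, assuming a 32-bit FNV-1a hash is good enough for spreading load; the `shardName` helper and the `base:index` naming scheme are hypothetical choices, not a Cloudflare convention.

```typescript
// 32-bit FNV-1a over UTF-16 code units; adequate for load spreading in a sketch.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Map a caller (for example a user or connection id) onto one of `shardCount` names.
function shardName(base: string, callerId: string, shardCount: number): string {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError("shardCount must be a positive integer");
  }
  return `${base}:${fnv1a(callerId) % shardCount}`;
}
```

In a Worker the resulting string would be the name used to look up the object ID, so each shard becomes its own single-threaded instance. The cost is on the read side: anything that needs the global total has to fan in across all shards.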

### Vectorize

- Managed **vector database** for embeddings — semantic search and RAG over Workers AI /
external embeddings. Create indexes with a fixed dimension + distance metric (cosine/
euclidean/dot). Supports metadata filtering and namespaces. **Limits** on vectors per
index, dimensions, and metadata size — check before betting a large corpus on it; for
very large/complex vector workloads a dedicated vector DB via Hyperdrive may fit better.
Pairs naturally with Workers AI (embed) + an LLM (generate) for an all-edge RAG stack.

### Cache API

- Programmatic access to Cloudflare's **CDN cache** from inside a Worker
(`caches.default.match/put`). Distinct from KV: it's **per-POP** (not global — a cache
put in one POP isn't visible in another), tied to the CDN, and ideal for caching
subrequest/compute results at the edge with normal HTTP cache semantics. Use it to
collapse repeated origin/compute work per-POP and stay under the subrequest cap. Respect
`Cache-Control`; a bad cache key (unique query param) tanks hit ratio just like CloudFront.
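
A common fix for the bad-cache-key problem is to normalize the URL before using it as the key. The sketch below is one possible approach using plain string handling; the list of ignored parameters is a hypothetical example, and URL fragments are not handled.

```typescript
// Query parameters that should not fragment the cache (hypothetical list).
const IGNORED_PARAMS = new Set(["utm_source", "utm_medium", "utm_campaign", "fbclid"]);

// Drop ignored parameters and sort the rest so equivalent URLs share one key.
function normalizeCacheKey(url: string): string {
  const queryStart = url.indexOf("?");
  if (queryStart === -1) return url;

  const base = url.slice(0, queryStart);
  const kept = url
    .slice(queryStart + 1)
    .split("&")
    .filter((pair) => {
      const paramName = pair.split("=")[0] ?? "";
      return pair !== "" && !IGNORED_PARAMS.has(paramName);
    })
    .sort();

  return kept.length > 0 ? `${base}?${kept.join("&")}` : base;
}
```

The normalized string would then be used as the cache key in place of the raw request URL, so that tracking parameters and parameter order stop producing distinct cache entries.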

## Messaging

### Queues

- Managed **message queue** with Worker producers and consumers, **at-least-once** delivery
(idempotent consumers mandatory), **batching** (consumer gets a batch — tune
`max_batch_size` / `max_batch_timeout`), **retries** with configurable `max_retries`, and
a **dead-letter queue** for poison messages (always configure one, exactly as with SQS).
- The right tool to **decouple** work from the request path and to **get under the
subrequest limit**: instead of fanning out 500 API calls inside one Worker, enqueue 500
messages and let the consumer process them in batches across many invocations. Also
smooths spikes and isolates a slow downstream from user latency.
- Throughput/message-size limits apply (message ≤128KB, throughput quotas per queue) —
check before betting a very high-volume pipeline on it.
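
Because delivery is at-least-once, the consumer has to tolerate seeing the same message twice. The sketch below shows the shape of an idempotent batch handler. `QueueMessage` is a simplified local interface modelled on a queue message, the in-memory `Set` stands in for durable dedup state, and keying on the message id is an assumption made for brevity (a business-level idempotency key is often the safer choice).

```typescript
// Simplified local shape of a queue message; not the full Workers types.
interface QueueMessage<T> {
  id: string;
  body: T;
  ack(): void;
  retry(): void;
}

// `alreadyProcessed` stands in for durable dedup state kept outside the consumer.
function consumeBatch<T>(
  messages: QueueMessage<T>[],
  alreadyProcessed: Set<string>,
  handle: (body: T) => void,
): void {
  for (const msg of messages) {
    if (alreadyProcessed.has(msg.id)) {
      msg.ack(); // redelivery of finished work: acknowledge and skip
      continue;
    }
    try {
      handle(msg.body);
      alreadyProcessed.add(msg.id);
      msg.ack();
    } catch {
      msg.retry(); // retried up to max_retries, then routed to the dead-letter queue
    }
  }
}
```

Acknowledging per message rather than per batch means one poison message does not force the whole batch to be redelivered.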

## AI

### Workers AI

- Run inference (LLMs, embeddings, image, speech, classification) on Cloudflare's **GPU
edge network** via a binding — no infra, no GPU to provision. Billed on **Neurons** (a
normalized compute unit) with a daily free allocation. **Quotas/rate limits per model**
will throttle a naive high-volume pipeline — back off and batch. Model availability and
context limits vary by model; pick the smallest model that does the job (don't run a
large LLM for a classification). Pairs with Vectorize for edge RAG.
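
"Back off" typically means capped exponential delays between retries after a rate-limit response. A minimal sketch of the schedule only, assuming a doubling factor; the function name and the example base and cap values are illustrative, and production code would normally add random jitter.

```typescript
// Delay before each retry: base, 2x base, 4x base, ... never above the cap.
function backoffDelaysMs(attempts: number, baseMs: number, capMs: number): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < attempts; attempt++) {
    delays.push(Math.min(capMs, baseMs * 2 ** attempt));
  }
  return delays;
}

// Four retries starting at 100 ms with a 500 ms ceiling.
const retrySchedule = backoffDelaysMs(4, 100, 500); // [100, 200, 400, 500]
```

The cap keeps a long outage from pushing individual waits past what the calling request can tolerate.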

### AI Gateway

- A **proxy/control-plane in front of *any* model provider** (Workers AI, OpenAI,
Anthropic, etc.) that adds **caching** (dedup identical prompts → big cost saver),
**rate limiting**, **retries/fallbacks** across providers, request logging, and analytics
— without changing your application code beyond the endpoint. Use it as the single
chokepoint for all LLM traffic to control cost, observe spend, and add resilience. The
caching layer alone often pays for itself on repetitive prompts.

## Origin connectivity

### Hyperdrive

- **Connection pooling + query caching in front of an external origin database**
(Postgres/MySQL — e.g. RDS, Cloud SQL, Neon, Supabase). Solves the exact problem that
bites serverless + Postgres everywhere: each Worker invocation would otherwise open a new
DB connection and exhaust `max_connections` (the Lambda+RDS mismatch). Hyperdrive
**pools** connections at the edge so thousands of Workers share a small pool, and
**caches** read queries to cut round trips. It also keeps a warm connection to the
origin, hiding connection-setup latency.
- Use it whenever Workers talk to a traditional regional SQL database. Pair with **Smart
Placement** so the Worker runs near the origin. This is the Cloudflare answer to RDS
Proxy / PgBouncer — without it, a busy Worker fleet will knock over the origin DB on
connection count alone.

## Consistency model cheat-sheet (pick the right primitive)

| Need | Use | Consistency |
| --- | --- | --- |
| High-read config / flags / global lookups | **KV** | Eventual (~60s) |
| Relational data, real SQL, transactions | **D1** | Strong on primary; replicas eventual |
| Coordination / counters / locks / WebSockets / per-entity state | **Durable Objects** | Strong, serialized, single-writer |
| Large blobs / media / egress-heavy | **R2** | Read-after-write (new objects) |
| Per-POP HTTP/compute caching | **Cache API** | Per-POP, TTL-based |
| Pool/cache to an external SQL DB | **Hyperdrive** | Inherits origin |
| Vector/embedding search | **Vectorize** | Index-level |

The classic mistake: reaching for KV because it's simple, then needing read-after-write or
a counter — that's always a Durable Object. And reaching for D1 for a single big
multi-tenant DB — that's the 10GB ceiling and single-writer, so shard D1 or use Hyperdrive.

## Failure modes (what breaks and how it shows up)

- **`Exceeded CPU` / `Exceeded Memory` (Error 1102)** — CPU-bound or >128MB hot path.
Profile, offload heavy compute, stream large bodies, raise `limits.cpu_ms` if it's
legitimately compute-heavy and on paid.
- **Subrequest limit exceeded** — per-item fan-out inside one Worker. Batch, cache (Cache
API), or move fan-out to Queues / a Durable Object.
- **KV stale reads / lost writes** — using KV where read-after-write or coordination was
needed. Move to a Durable Object.
- **D1 `rows_read` blowup / slow queries** — unindexed scans (bills every row) or hitting
the 10GB / single-writer ceiling. Index, shard, or move to Hyperdrive+Postgres.
- **Durable Object hot-object bottleneck** — all traffic to one DO ID serializes. Shard
the keyspace across many IDs.
- **DB connections exhausted** — Workers opening direct connections to a regional Postgres.
Put **Hyperdrive** in front.
- **Idle WebSocket memory/billing** — DOs pinned by idle sockets. Use the Hibernation API.

## Cost realism (where Cloudflare bills behave differently)

1. **No R2 egress** — the headline saving; egress-heavy workloads are far cheaper than S3.
The cost moves to **Class A/B operation counts** — millions of tiny ops add up.
2. **Workers = requests + CPU time**, not GB-seconds of wall-clock. Idle-waiting on I/O is
nearly free; the bill is driven by request count and *compute*. A handler that does
little CPU and waits on fetches is cheap even if slow.
3. **D1 = rows read + rows written**, not queries — unindexed scans are the silent
multiplier. Index and watch `rows_read`.
4. **Durable Objects = active duration (GB-s) + requests** — idle DOs pinned in memory
(esp. WebSockets without hibernation) bill continuously. Hibernate.
5. **KV = per-op + storage** — read-heavy is its sweet spot; write-heavy is both wrong and
costly.
6. **Workers AI = Neurons per inference** — model size and volume drive it; AI Gateway
caching cuts repetitive spend.
7. **Queues = per-operation** on push/pull/retry — fine, but DLQ loops on poison messages
waste ops.

Levers: AI Gateway caching, Cache API + good cache keys, batching to stay under subrequest
limits, indexing D1, DO hibernation, sharding hot DOs, and Smart Placement to cut origin
round trips.

## Limits to check before betting on them (request increases early)

Workers CPU-time (30s default / 300s max paid, 10ms Free), **subrequests 50 Free / 1000
paid**, 128MB isolate memory, bundle size (3MB/10MB), **D1 10GB/database** + rows-billing,
KV value 25MB / ~1 write/s per key / ~60s propagation, R2 operation classes, Durable Object
single-thread throughput, Queues message ≤128KB + throughput quotas, Workers AI per-model
rate limits, Vectorize index dimension/count limits. Many are plan-tier (Free vs Paid vs
Enterprise) — verify the current numbers against Cloudflare docs (they change), and confirm
the plan tier before designing around a limit.

## Observability

`wrangler tail` for live request logs, **Workers Logs / Logpush** for persisted logs to a
destination, **Analytics Engine** for high-cardinality custom metrics written from a
Worker, **Workers Trace Events / Tail Workers** for structured per-invocation traces, and
the dashboard's per-binding analytics (D1 query metrics, KV/R2 op counts, DO duration,
Queue depth, AI Gateway logs). Alarm on the things that predict pain: Worker error rate +
CPU-limit (1102) exceptions, subrequest-limit errors, D1 `rows_read` trends, DO duration +
request queueing, Queue backlog/DLQ depth, and Workers AI rate-limit (429) responses.