@zigrivers/scaffold 3.6.0 → 3.8.0
- package/README.md +127 -12
- package/content/knowledge/backend/backend-api-design.md +103 -0
- package/content/knowledge/backend/backend-architecture.md +100 -0
- package/content/knowledge/backend/backend-async-patterns.md +101 -0
- package/content/knowledge/backend/backend-auth-patterns.md +100 -0
- package/content/knowledge/backend/backend-conventions.md +105 -0
- package/content/knowledge/backend/backend-data-modeling.md +102 -0
- package/content/knowledge/backend/backend-deployment.md +100 -0
- package/content/knowledge/backend/backend-dev-environment.md +102 -0
- package/content/knowledge/backend/backend-observability.md +102 -0
- package/content/knowledge/backend/backend-project-structure.md +100 -0
- package/content/knowledge/backend/backend-requirements.md +103 -0
- package/content/knowledge/backend/backend-security.md +104 -0
- package/content/knowledge/backend/backend-testing.md +101 -0
- package/content/knowledge/backend/backend-worker-patterns.md +100 -0
- package/content/knowledge/cli/cli-architecture.md +101 -0
- package/content/knowledge/cli/cli-conventions.md +117 -0
- package/content/knowledge/cli/cli-dev-environment.md +121 -0
- package/content/knowledge/cli/cli-distribution-patterns.md +106 -0
- package/content/knowledge/cli/cli-interactivity-patterns.md +116 -0
- package/content/knowledge/cli/cli-output-patterns.md +107 -0
- package/content/knowledge/cli/cli-project-structure.md +124 -0
- package/content/knowledge/cli/cli-requirements.md +101 -0
- package/content/knowledge/cli/cli-shell-integration.md +130 -0
- package/content/knowledge/cli/cli-testing.md +134 -0
- package/content/knowledge/library/library-api-design.md +306 -0
- package/content/knowledge/library/library-architecture.md +247 -0
- package/content/knowledge/library/library-bundling.md +244 -0
- package/content/knowledge/library/library-conventions.md +229 -0
- package/content/knowledge/library/library-dev-environment.md +220 -0
- package/content/knowledge/library/library-documentation.md +300 -0
- package/content/knowledge/library/library-project-structure.md +237 -0
- package/content/knowledge/library/library-requirements.md +173 -0
- package/content/knowledge/library/library-security.md +257 -0
- package/content/knowledge/library/library-testing.md +319 -0
- package/content/knowledge/library/library-type-definitions.md +284 -0
- package/content/knowledge/library/library-versioning.md +300 -0
- package/content/knowledge/mobile-app/mobile-app-architecture.md +283 -0
- package/content/knowledge/mobile-app/mobile-app-conventions.md +180 -0
- package/content/knowledge/mobile-app/mobile-app-deployment.md +298 -0
- package/content/knowledge/mobile-app/mobile-app-dev-environment.md +257 -0
- package/content/knowledge/mobile-app/mobile-app-distribution.md +264 -0
- package/content/knowledge/mobile-app/mobile-app-observability.md +317 -0
- package/content/knowledge/mobile-app/mobile-app-offline-patterns.md +311 -0
- package/content/knowledge/mobile-app/mobile-app-project-structure.md +245 -0
- package/content/knowledge/mobile-app/mobile-app-push-notifications.md +321 -0
- package/content/knowledge/mobile-app/mobile-app-requirements.md +147 -0
- package/content/knowledge/mobile-app/mobile-app-security.md +338 -0
- package/content/knowledge/mobile-app/mobile-app-testing.md +400 -0
- package/content/knowledge/web-app/web-app-api-patterns.md +224 -0
- package/content/knowledge/web-app/web-app-architecture.md +116 -0
- package/content/knowledge/web-app/web-app-auth-patterns.md +256 -0
- package/content/knowledge/web-app/web-app-conventions.md +121 -0
- package/content/knowledge/web-app/web-app-data-patterns.md +218 -0
- package/content/knowledge/web-app/web-app-deployment-workflow.md +143 -0
- package/content/knowledge/web-app/web-app-deployment.md +134 -0
- package/content/knowledge/web-app/web-app-design-system.md +158 -0
- package/content/knowledge/web-app/web-app-dev-environment.md +173 -0
- package/content/knowledge/web-app/web-app-observability.md +221 -0
- package/content/knowledge/web-app/web-app-project-structure.md +160 -0
- package/content/knowledge/web-app/web-app-rendering-strategies.md +133 -0
- package/content/knowledge/web-app/web-app-requirements.md +112 -0
- package/content/knowledge/web-app/web-app-security.md +193 -0
- package/content/knowledge/web-app/web-app-session-patterns.md +214 -0
- package/content/knowledge/web-app/web-app-testing.md +249 -0
- package/content/knowledge/web-app/web-app-ux-patterns.md +162 -0
- package/content/methodology/backend-overlay.yml +73 -0
- package/content/methodology/cli-overlay.yml +69 -0
- package/content/methodology/library-overlay.yml +67 -0
- package/content/methodology/mobile-app-overlay.yml +71 -0
- package/content/methodology/web-app-overlay.yml +79 -0
- package/dist/cli/commands/init.d.ts +21 -0
- package/dist/cli/commands/init.d.ts.map +1 -1
- package/dist/cli/commands/init.js +261 -13
- package/dist/cli/commands/init.js.map +1 -1
- package/dist/cli/commands/init.test.js +206 -0
- package/dist/cli/commands/init.test.js.map +1 -1
- package/dist/config/schema.d.ts +1392 -64
- package/dist/config/schema.d.ts.map +1 -1
- package/dist/config/schema.js +82 -5
- package/dist/config/schema.js.map +1 -1
- package/dist/config/schema.test.js +302 -1
- package/dist/config/schema.test.js.map +1 -1
- package/dist/core/assembly/overlay-loader.d.ts.map +1 -1
- package/dist/core/assembly/overlay-loader.js +2 -1
- package/dist/core/assembly/overlay-loader.js.map +1 -1
- package/dist/core/assembly/overlay-loader.test.js +56 -0
- package/dist/core/assembly/overlay-loader.test.js.map +1 -1
- package/dist/e2e/game-pipeline.test.js +1 -0
- package/dist/e2e/game-pipeline.test.js.map +1 -1
- package/dist/e2e/project-type-overlays.test.d.ts +16 -0
- package/dist/e2e/project-type-overlays.test.d.ts.map +1 -0
- package/dist/e2e/project-type-overlays.test.js +834 -0
- package/dist/e2e/project-type-overlays.test.js.map +1 -0
- package/dist/types/config.d.ts +19 -2
- package/dist/types/config.d.ts.map +1 -1
- package/dist/types/index.d.ts +0 -1
- package/dist/types/index.d.ts.map +1 -1
- package/dist/types/index.js +0 -1
- package/dist/types/index.js.map +1 -1
- package/dist/wizard/questions.d.ts +27 -1
- package/dist/wizard/questions.d.ts.map +1 -1
- package/dist/wizard/questions.js +142 -3
- package/dist/wizard/questions.js.map +1 -1
- package/dist/wizard/questions.test.js +206 -8
- package/dist/wizard/questions.test.js.map +1 -1
- package/dist/wizard/wizard.d.ts +21 -0
- package/dist/wizard/wizard.d.ts.map +1 -1
- package/dist/wizard/wizard.js +27 -1
- package/dist/wizard/wizard.js.map +1 -1
- package/package.json +1 -1
- package/dist/types/wizard.d.ts +0 -14
- package/dist/types/wizard.d.ts.map +0 -1
- package/dist/types/wizard.js +0 -2
- package/dist/types/wizard.js.map +0 -1
package/README.md
CHANGED
|
@@ -29,7 +29,7 @@ Either way, Scaffold constructs the prompt and the target AI tool does the work.
|
|
|
29
29
|
|
|
30
30
|
**Assembly engine** — At execution time, Scaffold builds a 7-section prompt from: system metadata, the meta-prompt, knowledge base entries, project context (artifacts from prior steps), methodology settings, layered instructions, and depth-specific execution guidance.
|
|
31
31
|
|
|
32
|
-
**Knowledge base** —
|
|
32
|
+
**Knowledge base** — 158 domain expertise entries in `content/knowledge/` organized in thirteen categories (core, product, review, validation, finalization, execution, tools, game, web-app, backend, cli, library, mobile-app) covering testing strategy, domain modeling, API design, security best practices, eval craft, TDD execution, task claiming, worktree management, release management, rendering strategies, data stores, CLI patterns, game engines, library bundling, mobile deployment, and more. These get injected into prompts based on each step's `knowledge-base` frontmatter field. Knowledge files with a `## Deep Guidance` section are optimized for CLI assembly — only the deep guidance content is loaded, avoiding redundancy with the prompt text. Teams can add project-local overrides in `.scaffold/knowledge/` that layer on top of the global entries.
|
|
33
33
|
|
|
34
34
|
**Methodology presets** — Three built-in presets control which steps run and how deep the analysis goes:
|
|
35
35
|
- **deep** (depth 5) — all steps enabled, exhaustive analysis
|
|
@@ -371,6 +371,52 @@ Every `scaffold init` wizard question can be answered via CLI flags, making scaf
|
|
|
371
371
|
| `--project-type` | string | web-app, mobile-app, backend, cli, library, game |
|
|
372
372
|
| `--auto` | boolean | Non-interactive mode (uses Zod defaults for unset flags) |
|
|
373
373
|
|
|
374
|
+
#### Web-App Config Flags (require `--project-type web-app` or auto-set it)
|
|
375
|
+
|
|
376
|
+
| Flag | Type | Values |
|
|
377
|
+
|------|------|--------|
|
|
378
|
+
| `--web-rendering` | string | spa, ssr, ssg, hybrid |
|
|
379
|
+
| `--web-deploy-target` | string | static, serverless, container, edge, long-running |
|
|
380
|
+
| `--web-realtime` | string | none, websocket, sse |
|
|
381
|
+
| `--web-auth-flow` | string | none, session, oauth, passkey |
|
|
382
|
+
|
|
383
|
+
#### Backend Config Flags (require `--project-type backend` or auto-set it)
|
|
384
|
+
|
|
385
|
+
| Flag | Type | Values |
|
|
386
|
+
|------|------|--------|
|
|
387
|
+
| `--backend-api-style` | string | rest, graphql, grpc, trpc, none |
|
|
388
|
+
| `--backend-data-store` | comma-sep | relational, document, key-value |
|
|
389
|
+
| `--backend-auth` | string | none, jwt, session, oauth, apikey |
|
|
390
|
+
| `--backend-messaging` | string | none, queue, event-driven |
|
|
391
|
+
| `--backend-deploy-target` | string | serverless, container, long-running |
|
|
392
|
+
|
|
393
|
+
#### CLI Config Flags (require `--project-type cli` or auto-set it)
|
|
394
|
+
|
|
395
|
+
| Flag | Type | Values |
|
|
396
|
+
|------|------|--------|
|
|
397
|
+
| `--cli-interactivity` | string | args-only, interactive, hybrid |
|
|
398
|
+
| `--cli-distribution` | comma-sep | package-manager, system-package-manager, standalone-binary, container |
|
|
399
|
+
| `--cli-structured-output` | boolean | `--cli-structured-output` / `--no-cli-structured-output` |
|
|
400
|
+
|
|
401
|
+
#### Library Config Flags (require `--project-type library` or auto-set it)
|
|
402
|
+
|
|
403
|
+
| Flag | Type | Values |
|
|
404
|
+
|------|------|--------|
|
|
405
|
+
| `--lib-visibility` | string | public, internal |
|
|
406
|
+
| `--lib-runtime-target` | string | node, browser, isomorphic, edge |
|
|
407
|
+
| `--lib-bundle-format` | string | esm, cjs, dual, unbundled |
|
|
408
|
+
| `--lib-type-definitions` | boolean | `--lib-type-definitions` / `--no-lib-type-definitions` |
|
|
409
|
+
| `--lib-doc-level` | string | none, readme, api-docs, full-site |
|
|
410
|
+
|
|
411
|
+
#### Mobile-App Config Flags (require `--project-type mobile-app` or auto-set it)
|
|
412
|
+
|
|
413
|
+
| Flag | Type | Values |
|
|
414
|
+
|------|------|--------|
|
|
415
|
+
| `--mobile-platform` | string | ios, android, cross-platform |
|
|
416
|
+
| `--mobile-distribution` | string | public, private, mixed |
|
|
417
|
+
| `--mobile-offline` | string | none, cache, offline-first |
|
|
418
|
+
| `--mobile-push-notifications` | boolean | `--mobile-push-notifications` / `--no-mobile-push-notifications` |
|
|
419
|
+
|
|
374
420
|
#### Game Config Flags (require `--project-type game` or auto-set it)
|
|
375
421
|
|
|
376
422
|
| Flag | Type | Values |
|
|
@@ -387,16 +433,66 @@ Every `scaffold init` wizard question can be answered via CLI flags, making scaf
|
|
|
387
433
|
| `--modding` | boolean | `--modding` / `--no-modding` |
|
|
388
434
|
| `--persistence` | string | none, settings-only, profile, progression, cloud |
|
|
389
435
|
|
|
436
|
+
> **Flag aliases**: Game flags have `--game-*` aliases for consistency with other project types (e.g., `--game-engine` is equivalent to `--engine`). Bare flags like `--engine` still work.
|
|
437
|
+
|
|
390
438
|
#### How Flags Interact
|
|
391
439
|
|
|
392
440
|
- **Flag > auto > interactive**: Flags always take highest precedence. `--auto --engine unreal` uses defaults for everything except engine.
|
|
393
441
|
- **Partial flags + interactive**: Provide some flags and the wizard asks only the remaining questions. `scaffold init --project-type game --engine unreal` prompts interactively for multiplayer, platforms, etc.
|
|
394
|
-
- **
|
|
395
|
-
- **
|
|
442
|
+
- **Type-specific flags auto-set project type**: `--engine unity` automatically sets `--project-type game`, `--web-rendering ssr` sets `--project-type web-app`, `--backend-api-style rest` sets `--project-type backend`, `--cli-interactivity hybrid` sets `--project-type cli`, `--lib-visibility public` sets `--project-type library`, `--mobile-platform ios` sets `--project-type mobile-app`. Error if conflicting type.
|
|
443
|
+
- **Cannot mix flag families**: `--web-rendering ssr --backend-api-style rest` is an error. Each flag family is exclusive.
|
|
444
|
+
- **Validation**: `--depth` requires `--methodology custom`. `--online-services` requires `--multiplayer online` or `hybrid`. SSR/hybrid rendering is incompatible with static deploy target. Session auth requires server state (not static).
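The interaction rules above (auto-setting the project type, rejecting mixed families, and rejecting a conflicting explicit type) can be sketched as follows. This is a hypothetical illustration, not Scaffold's actual implementation: `inferProjectType`, `FLAG_FAMILIES`, and the prefix-matching approach are assumptions; only the flag prefixes and project type names come from the tables above.

```typescript
// Hypothetical sketch; names below are illustrative, not Scaffold's actual code.
type ProjectType = "web-app" | "backend" | "cli" | "library" | "mobile-app" | "game";

// Flag prefixes taken from the README tables. Bare game flags such as --engine
// share no prefix, so a real implementation would need an explicit list for them.
const FLAG_FAMILIES: Array<[string, ProjectType]> = [
  ["--web-", "web-app"],
  ["--backend-", "backend"],
  ["--cli-", "cli"],
  ["--lib-", "library"],
  ["--mobile-", "mobile-app"],
  ["--game-", "game"],
];

function inferProjectType(flagNames: string[], explicitType?: ProjectType): ProjectType | undefined {
  const inferred: ProjectType[] = [];
  for (const rawName of flagNames) {
    // Treat a negated boolean (--no-cli-structured-output) like its positive form.
    const name = rawName.startsWith("--no-") ? "--" + rawName.slice("--no-".length) : rawName;
    for (const [prefix, type] of FLAG_FAMILIES) {
      if (name.startsWith(prefix) && inferred.indexOf(type) === -1) {
        inferred.push(type);
      }
    }
  }
  // Rule: flag families are mutually exclusive.
  if (inferred.length > 1) {
    throw new Error(`Cannot mix flag families: ${inferred.join(", ")}`);
  }
  const type = inferred.length === 1 ? inferred[0] : undefined;
  // Rule: a type-specific flag must agree with an explicit --project-type.
  if (type !== undefined && explicitType !== undefined && type !== explicitType) {
    throw new Error(`Flags for "${type}" conflict with --project-type ${explicitType}`);
  }
  return type !== undefined ? type : explicitType;
}
```

Under these assumptions, `inferProjectType(["--web-rendering"])` yields `"web-app"`, while passing both `--web-rendering` and `--backend-api-style` throws.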
|
|
396
445
|
|
|
397
446
|
#### CI Examples
|
|
398
447
|
|
|
399
448
|
```bash
|
|
449
|
+
# Web-app project (SSR with serverless deploy)
|
|
450
|
+
scaffold init --auto --methodology deep --project-type web-app \
|
|
451
|
+
--web-rendering ssr --web-deploy-target serverless
|
|
452
|
+
|
|
453
|
+
# Web-app with real-time features and OAuth
|
|
454
|
+
scaffold init --auto --methodology deep --project-type web-app \
|
|
455
|
+
--web-rendering ssr --web-deploy-target container \
|
|
456
|
+
--web-realtime websocket --web-auth-flow oauth
|
|
457
|
+
|
|
458
|
+
# Backend project (GraphQL with relational + key-value stores)
|
|
459
|
+
scaffold init --auto --methodology deep --project-type backend \
|
|
460
|
+
--backend-api-style graphql --backend-data-store relational,key-value
|
|
461
|
+
|
|
462
|
+
# Backend with event-driven messaging and JWT auth
|
|
463
|
+
scaffold init --auto --methodology deep --project-type backend \
|
|
464
|
+
--backend-api-style rest --backend-data-store relational \
|
|
465
|
+
--backend-auth jwt --backend-messaging event-driven \
|
|
466
|
+
--backend-deploy-target container
|
|
467
|
+
|
|
468
|
+
# CLI project (interactive with multiple distribution channels)
|
|
469
|
+
scaffold init --auto --methodology mvp --project-type cli \
|
|
470
|
+
--cli-interactivity hybrid --cli-distribution package-manager,standalone-binary
|
|
471
|
+
|
|
472
|
+
# CLI with structured JSON output
|
|
473
|
+
scaffold init --auto --methodology deep --project-type cli \
|
|
474
|
+
--cli-interactivity args-only --cli-distribution package-manager \
|
|
475
|
+
--cli-structured-output
|
|
476
|
+
|
|
477
|
+
# Public library with full API docs and ESM bundle
|
|
478
|
+
scaffold init --auto --methodology deep --project-type library \
|
|
479
|
+
--lib-visibility public --lib-runtime-target isomorphic \
|
|
480
|
+
--lib-bundle-format esm --lib-doc-level api-docs
|
|
481
|
+
|
|
482
|
+
# Internal library (Node-only, no docs)
|
|
483
|
+
scaffold init --auto --methodology mvp --project-type library \
|
|
484
|
+
--lib-visibility internal --lib-runtime-target node \
|
|
485
|
+
--lib-bundle-format cjs --lib-doc-level none
|
|
486
|
+
|
|
487
|
+
# Cross-platform mobile app with offline support
|
|
488
|
+
scaffold init --auto --methodology deep --project-type mobile-app \
|
|
489
|
+
--mobile-platform cross-platform --mobile-offline offline-first \
|
|
490
|
+
--mobile-push-notifications
|
|
491
|
+
|
|
492
|
+
# iOS app with private distribution
|
|
493
|
+
scaffold init --auto --methodology mvp --project-type mobile-app \
|
|
494
|
+
--mobile-platform ios --mobile-distribution private
|
|
495
|
+
|
|
400
496
|
# Multiplayer mobile game with Unity
|
|
401
497
|
scaffold init --project-type game --methodology deep --auto \
|
|
402
498
|
--engine unity --multiplayer online --target-platforms ios,android \
|
|
@@ -405,10 +501,6 @@ scaffold init --project-type game --methodology deep --auto \
|
|
|
405
501
|
# Simple puzzle game
|
|
406
502
|
scaffold init --project-type game --auto --engine godot
|
|
407
503
|
|
|
408
|
-
# Web app with multiple AI adapters
|
|
409
|
-
scaffold init --project-type web-app --methodology mvp --auto \
|
|
410
|
-
--adapters claude-code,gemini --traits web,mobile
|
|
411
|
-
|
|
412
504
|
# Custom methodology at depth 3
|
|
413
505
|
scaffold init --methodology custom --depth 3 --auto
|
|
414
506
|
|
|
@@ -421,6 +513,25 @@ scaffold init --project-type game --methodology deep --auto \
|
|
|
421
513
|
--content-structure open-world
|
|
422
514
|
```
|
|
423
515
|
|
|
516
|
+
### Project-Type Overlays
|
|
517
|
+
|
|
518
|
+
Scaffold supports **project-type overlays** — domain-specific knowledge and pipeline customizations that activate based on your project type. When you set a project type during `scaffold init`, the corresponding overlay layers on top of your chosen methodology (mvp, deep, or custom):
|
|
519
|
+
|
|
520
|
+
- **Injects domain knowledge** into existing pipeline steps (e.g., SSR caching strategies into `tech-stack`, API pagination patterns into `coding-standards`)
|
|
521
|
+
|
|
522
|
+
The game overlay additionally adjusts step enablement, remaps artifact references, and adds dependency overrides (because game development has fundamentally different artifacts). The web-app, backend, CLI, library, and mobile-app overlays are **knowledge-only** — they inject domain expertise into existing steps without changing which steps run or how they depend on each other.
|
|
523
|
+
|
|
524
|
+
Overlays are composable with methodology presets. An MVP web-app gets fewer steps at lower depth; a deep backend project gets exhaustive analysis of every architectural decision.
|
|
525
|
+
|
|
526
|
+
| Project Type | Overlay | Knowledge Entries | Config Options |
|
|
527
|
+
|-------------|---------|-------------------|----------------|
|
|
528
|
+
| `web-app` | `web-app-overlay.yml` | 17 entries (rendering, state management, auth, SSR, deploy targets, real-time, PWA, testing) | Rendering strategy, deploy target, real-time, auth flow |
|
|
529
|
+
| `backend` | `backend-overlay.yml` | 14 entries (API design, data stores, auth, messaging, observability, deploy, caching, rate limiting) | API style, data store(s), auth, messaging, deploy target |
|
|
530
|
+
| `cli` | `cli-overlay.yml` | 10 entries (argument parsing, config management, output formatting, distribution, testing, error handling) | Interactivity model, distribution channels, structured output |
|
|
531
|
+
| `library` | `library-overlay.yml` | 12 entries (API design, bundling, type definitions, versioning, documentation, testing, security) | Visibility, runtime target, bundle format, type definitions, documentation level |
|
|
532
|
+
| `mobile-app` | `mobile-app-overlay.yml` | 12 entries (architecture, offline patterns, push notifications, deployment, distribution, testing, security) | Platform, distribution model, offline support, push notifications |
|
|
533
|
+
| `game` | `game-overlay.yml` | 24 entries (engines, networking, audio, VR/AR, economy, save systems, certification) | Engine, multiplayer, platforms, economy, narrative, and 6 more |
|
|
534
|
+
|
|
424
535
|
### Game Development
|
|
425
536
|
|
|
426
537
|
Scaffold fully supports game development projects. When you select `game` as your project type, a **project-type overlay** activates 24 game-specific pipeline steps and injects game domain expertise into existing steps — all while keeping the standard pipeline workflow (status, next, rework, multi-model review) fully functional.
|
|
@@ -1113,15 +1224,19 @@ scaffold dashboard
|
|
|
1113
1224
|
|
|
1114
1225
|
## Knowledge System
|
|
1115
1226
|
|
|
1116
|
-
Scaffold ships with
|
|
1227
|
+
Scaffold ships with 134 domain expertise entries organized in eleven categories:
|
|
1117
1228
|
|
|
1118
|
-
- **core/** (
|
|
1119
|
-
- **product/** (
|
|
1120
|
-
- **review/** (
|
|
1229
|
+
- **core/** (26 entries) — eval craft, testing strategy, domain modeling, API design, database design, system architecture, ADR craft, security best practices, operations, task decomposition, user stories, UX specification, design system tokens, user story innovation, AI memory management, coding conventions, tech stack selection, project structure patterns, task tracking, CLAUDE.md patterns, multi-model review dispatch, review step template, dev environment, git workflow patterns, automated review tooling, vision craft
|
|
1230
|
+
- **product/** (5 entries) — PRD craft, PRD innovation, gap analysis, vision craft, vision innovation
|
|
1231
|
+
- **review/** (20 entries) — review methodology (shared), plus domain-specific review passes for PRD, user stories, domain modeling, ADRs, architecture, API design, database design, UX specification, testing, security, operations, implementation tasks, game design, game economy, game UI, netcode, and more
|
|
1121
1232
|
- **validation/** (7 entries) — critical path analysis, cross-phase consistency, scope management, traceability, implementability, decision completeness, dependency validation
|
|
1122
1233
|
- **finalization/** (3 entries) — implementation playbook, developer onboarding, apply-fixes-and-freeze
|
|
1123
1234
|
- **execution/** (4 entries) — TDD execution loop, task claiming strategy, worktree management, enhancement workflow
|
|
1124
|
-
- **tools/** (
|
|
1235
|
+
- **tools/** (4 entries) — release management, version strategy, session analysis, and more
|
|
1236
|
+
- **game/** (24 entries) — game engines, networking/netcode, audio middleware, save systems, input patterns, VR/AR, localization, modding/UGC, live operations, platform certification, economy design, AI/behavior, level design, performance, accessibility
|
|
1237
|
+
- **web-app/** (17 entries) — rendering strategies (SSR/SSG/SPA), state management, authentication, deploy targets, real-time patterns, PWA, performance, security, testing, session patterns, UX patterns, caching, API integration, accessibility
|
|
1238
|
+
- **backend/** (14 entries) — API design patterns, data store selection, authentication mechanisms, messaging/event systems, observability, deploy strategies, caching, rate limiting, error handling, database migrations, testing, security
|
|
1239
|
+
- **cli/** (10 entries) — argument parsing, config management, output formatting, distribution channels, testing patterns, error handling, plugin architecture, shell integration, structured output, interactive prompts
|
|
1125
1240
|
|
|
1126
1241
|
Each pipeline step declares which knowledge entries it needs in its frontmatter. The assembly engine injects them automatically. Knowledge files with a `## Deep Guidance` section are optimized for the CLI — only the deep guidance content is loaded into the assembled prompt, skipping the summary to avoid redundancy with the prompt text.
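As a rough illustration of the `## Deep Guidance` behavior described above, the sketch below extracts only that section from a knowledge file. It is a hypothetical example: `extractDeepGuidance` is not a Scaffold function, and the fallback of returning the whole file when the heading is absent is an assumption.

```typescript
// Hypothetical sketch of loading only the "## Deep Guidance" section.
// The function name and the fallback behavior are assumptions.
function extractDeepGuidance(markdown: string): string {
  const lines = markdown.split("\n");
  const start = lines.findIndex((line) => line.trim() === "## Deep Guidance");
  if (start === -1) {
    // Assumed fallback: files without the section are used as-is.
    return markdown;
  }
  // The section runs until the next level-2 heading or the end of the file.
  // Limitation: a line starting with "## " inside a fenced code block would
  // also end the section in this simplified version.
  let end = lines.length;
  for (let i = start + 1; i < lines.length; i++) {
    if (/^## /.test(lines[i])) {
      end = i;
      break;
    }
  }
  return lines.slice(start + 1, end).join("\n").trim();
}
```

Level-3 headings such as `### REST Maturity Levels` stay inside the extracted section because only a level-2 heading ends it.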
|
|
1127
1242
|
|
|
package/content/knowledge/backend/backend-api-design.md
ADDED
@@ -0,0 +1,103 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: backend-api-design
|
|
3
|
+
description: REST maturity levels, GraphQL schema-first design, gRPC protobuf conventions, tRPC router patterns, API versioning strategies, pagination, and filtering
|
|
4
|
+
topics: [backend, api-design, rest, graphql, grpc, trpc, versioning, pagination]
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
API design is a long-lived contract. Every structural decision — URL shape, error format, pagination scheme, versioning strategy — is expensive to change after consumers depend on it. Design APIs from the consumer's perspective first. The best API is one where new developers can predict the shape of an endpoint they have never seen before, because every other endpoint follows the same patterns.
|
|
8
|
+
|
|
9
|
+
## Summary
|
|
10
|
+
|
|
11
|
+
REST, GraphQL, gRPC, and tRPC each solve different problems. REST (Level 2 Richardson Maturity) is the default for most public APIs — use HTTP verbs correctly, plural nouns for collections, and nested paths for containment. GraphQL suits complex, multi-entity queries with schema-first design. gRPC is the standard for high-performance internal service-to-service calls. tRPC provides end-to-end type safety for TypeScript monorepos.
|
|
12
|
+
|
|
13
|
+
API versioning, pagination, and filtering are long-lived contracts. URL path versioning is preferred for public APIs. Cursor-based pagination is the correct choice for any list that may grow large. All filter parameters must be validated against a schema before reaching the data layer.
|
|
14
|
+
|
|
15
|
+
## Deep Guidance
|
|
16
|
+
|
|
17
|
+
### REST Maturity Levels (Richardson Model)
|
|
18
|
+
|
|
19
|
+
The Richardson Maturity Model is a pragmatic rubric, not a goal to maximize:
|
|
20
|
+
|
|
21
|
+
- **Level 0**: Single endpoint, all operations via POST. Avoid — not REST.
|
|
22
|
+
- **Level 1**: Separate resources at different URLs. Basic REST. Sufficient for many internal APIs.
|
|
23
|
+
- **Level 2**: HTTP verbs used correctly (GET reads, POST creates, PUT/PATCH updates, DELETE removes). Standard REST. The target for most APIs.
|
|
24
|
+
- **Level 3 (HATEOAS)**: Responses include hypermedia links describing available actions. Rarely justified in practice — adds response payload complexity and client coupling to URL structure. Only implement if clients genuinely traverse links without prior URL knowledge.
|
|
25
|
+
|
|
26
|
+
Target Level 2 for REST APIs. Do not chase Level 3 as an ideological goal.
|
|
27
|
+
|
|
28
|
+
**REST URL conventions**: Use plural nouns for collections (`/orders`), an identifier under the collection for specific resources (`/orders/{id}`), nested paths for containment (`/orders/{id}/items`), and action-oriented sub-resources for operations that do not map onto a CRUD verb (`/orders/{id}/cancel`). Otherwise, avoid verbs in URL paths.
|
|
29
|
+
|
|
30
|
+
### GraphQL Schema-First Design
|
|
31
|
+
|
|
32
|
+
Write the schema before writing resolvers. The schema is the API contract:
|
|
33
|
+
|
|
34
|
+
- **Schema-first**: Define types, queries, mutations, and subscriptions in the SDL (Schema Definition Language) before any implementation. Generate types and resolver stubs from the schema.
|
|
35
|
+
- **Single responsibility**: Each resolver does one thing. Business logic belongs in a service layer called by the resolver, not inside the resolver function.
|
|
36
|
+
- **N+1 problem**: Every GraphQL API must address the N+1 query problem. Use DataLoader to batch and deduplicate database calls per request. An un-addressed N+1 in production is a performance crisis at scale.
|
|
37
|
+
- **Pagination**: Use cursor-based pagination (Relay Connection spec) for any collection that may exceed 100 items. Offset pagination degrades at scale and produces incorrect results under concurrent inserts/deletes.
|
|
38
|
+
- **Schema directives**: Use directives for cross-cutting concerns — `@auth`, `@deprecated`, `@rateLimit` — rather than duplicating logic in each resolver.
|
|
39
|
+
|
|
40
|
+
### gRPC and Protobuf Conventions
|
|
41
|
+
|
|
42
|
+
gRPC is the standard for high-performance internal service-to-service communication:
|
|
43
|
+
|
|
44
|
+
- **Protobuf schema-first**: Define `.proto` files before any implementation. Store them in a shared repository or a dedicated `proto/` directory. Generate client and server stubs from the proto at build time.
|
|
45
|
+
- **Naming**: Service names in `PascalCase`, RPC methods in `PascalCase`, message fields in `snake_case`. Follow the official Google API style guide for proto naming.
|
|
46
|
+
- **Field numbering**: Never reuse a field number once it has been in production, even after the field is removed. Removing a field requires marking it `reserved`. Changing a field type is a breaking change.
|
|
47
|
+
- **Streaming**: Use server-streaming for large result sets, bidirectional streaming for real-time updates, and unary calls for everything else. Don't over-use streaming — it complicates error handling and testing.
|
|
48
|
+
|
|
49
|
+
### tRPC Router Patterns
|
|
50
|
+
|
|
51
|
+
tRPC provides end-to-end type safety for TypeScript monorepos without a schema definition step:
|
|
52
|
+
|
|
53
|
+
- **Procedure organization**: Group procedures into routers by domain (`appRouter.orders.create`, `appRouter.users.findById`). Each domain router lives in its own file.
|
|
54
|
+
- **Input validation**: Every procedure must validate its input with a Zod schema. This is both a type contract and a runtime guard.
|
|
55
|
+
- **Context**: Pass request-scoped data (authenticated user, database connection, logger) via the context object — never as procedure parameters.
|
|
56
|
+
- **Middleware**: Apply authentication, rate limiting, and logging via tRPC middleware (`.use()`), not inside individual procedures.
|
|
57
|
+
|
|
58
|
+
### API Versioning Strategies
|
|
59
|
+
|
|
60
|
+
- **URL path versioning** (`/api/v1/`, `/api/v2/`): Most discoverable, easy to proxy and document. Preferred for public APIs. The version lives in the path, not just in headers.
|
|
61
|
+
- **Header versioning** (`Accept: application/vnd.myapi.v2+json`): Cleaner URLs, harder to test in a browser. Preferred when URL cleanliness is a hard requirement.
|
|
62
|
+
- **Query parameter** (`?v=2`): Easy to add but pollutes request URLs and caching keys.
|
|
63
|
+
- **Sunset headers**: For deprecated versions, return `Sunset: Wed, 01 Jan 2025 00:00:00 GMT` and `Deprecation: true` headers on every response. Clients can detect imminent removal programmatically.
|
|
64
|
+
|
|
65
|
+
### Pagination
|
|
66
|
+
|
|
67
|
+
- **Cursor-based**: Use for any list that may grow large. Return a `cursor` (opaque string encoding the last item's sort key) in the response. The client passes `?after=<cursor>` for the next page. Stable under concurrent writes. Efficient at any offset depth.
|
|
68
|
+
- **Offset-based** (`?page=3&limit=25`): Simple to implement. Acceptable for small, stable datasets. Degrades at large offsets (database must skip rows) and produces duplicates/gaps under concurrent mutations.
|
|
69
|
+
- **Standard response envelope**: Every paginated list response should include `data: []`, `nextCursor` (or `nextPage`), and `total` (when computationally cheap).
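The cursor-based scheme above can be sketched with an in-memory list standing in for the database. Everything here is illustrative: `Order`, `listOrders`, and the cursor encoding are assumptions, and a production API would more likely base64-encode or sign the cursor so clients cannot read or forge it.

```typescript
// Minimal in-memory sketch of cursor-based pagination (names are hypothetical).
interface Order { id: number; status: string }
interface Page<T> { data: T[]; nextCursor: string | null }

// The cursor encodes the last item's sort key (here, the id).
function encodeCursor(lastId: number): string {
  return encodeURIComponent(JSON.stringify({ lastId }));
}

function decodeCursor(cursor: string): number {
  const parsed = JSON.parse(decodeURIComponent(cursor)) as { lastId?: unknown };
  if (typeof parsed.lastId !== "number") {
    throw new Error("Invalid cursor");
  }
  return parsed.lastId;
}

function listOrders(all: Order[], limit: number, after?: string): Page<Order> {
  const lastId = after === undefined ? -Infinity : decodeCursor(after);
  // Roughly: SELECT ... WHERE id > :lastId ORDER BY id LIMIT :limit + 1
  const rows = all
    .filter((order) => order.id > lastId)
    .sort((a, b) => a.id - b.id)
    .slice(0, limit + 1);
  // Fetching one extra row tells us whether another page exists.
  const hasMore = rows.length > limit;
  const data = hasMore ? rows.slice(0, limit) : rows;
  const nextCursor = hasMore ? encodeCursor(data[data.length - 1].id) : null;
  return { data, nextCursor };
}
```

Because each page is defined by "rows after this key" rather than "skip N rows", an insert or delete earlier in the list does not shift later pages.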
|
|
70
|
+
|
|
71
|
+
### Filtering and Sorting
|
|
72
|
+
|
|
73
|
+
- Expose filtering via query parameters: `?status=active&createdAfter=2024-01-01`.
|
|
74
|
+
- Expose sorting via `?sort=createdAt:desc` or `?sortBy=createdAt&order=desc`.
|
|
75
|
+
- Validate all filter parameters against a schema before passing to the data layer. Never interpolate raw query parameters into SQL — use parameterized queries unconditionally.
|
|
76
|
+
- Limit the surface area of filterable/sortable fields to those backed by indexes. Undocumented table scans at scale are a reliability incident.
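One way to apply the sorting rules above is an allowlist that maps public field names to column names, so that no raw query text reaches the SQL string. This is a sketch under assumptions: `SORTABLE_COLUMNS`, `parseSort`, and `orderByClause` are hypothetical names, the listed columns are assumed to be indexed, and defaulting to ascending order when no direction is given is a choice made for the example.

```typescript
// Hypothetical allowlist: public field name -> column assumed to be indexed.
const SORTABLE_COLUMNS: Record<string, string> = {
  createdAt: "created_at",
  status: "status",
};

interface SortSpec { column: string; direction: "ASC" | "DESC" }

// Parses values of the form "createdAt:desc" (as in ?sort=createdAt:desc).
function parseSort(raw: string): SortSpec {
  const parts = raw.split(":");
  if (parts.length > 2) {
    throw new Error(`Malformed sort parameter: ${raw}`);
  }
  const field = parts[0];
  const direction = parts.length === 2 ? parts[1] : "asc"; // assumed default
  // hasOwnProperty avoids matching inherited keys such as "constructor".
  if (!Object.prototype.hasOwnProperty.call(SORTABLE_COLUMNS, field)) {
    throw new Error(`Unsupported sort field: ${field}`);
  }
  if (direction !== "asc" && direction !== "desc") {
    throw new Error(`Unsupported sort direction: ${direction}`);
  }
  return { column: SORTABLE_COLUMNS[field], direction: direction === "asc" ? "ASC" : "DESC" };
}

// Only allowlisted identifiers are interpolated; filter values would still be
// passed separately as query parameters.
function orderByClause(spec: SortSpec): string {
  return `ORDER BY ${spec.column} ${spec.direction}`;
}
```

Identifiers such as column names cannot be bound as query parameters, which is why the sketch resolves them through a fixed map instead of trusting the request.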
|
|
77
|
+
|
|
78
|
+
### Error Response Standards
|
|
79
|
+
|
|
80
|
+
Standardize error responses across all API styles:
|
|
81
|
+
|
|
82
|
+
```json
|
|
83
|
+
{
|
|
84
|
+
"error": {
|
|
85
|
+
"code": "VALIDATION_ERROR",
|
|
86
|
+
"message": "Invalid request parameters",
|
|
87
|
+
"details": [
|
|
88
|
+
{ "field": "email", "issue": "must be a valid email address" }
|
|
89
|
+
],
|
|
90
|
+
"requestId": "req_abc123"
|
|
91
|
+
}
|
|
92
|
+
}
|
|
93
|
+
```
|
|
94
|
+
|
|
95
|
+
Include a machine-readable `code` field that consumers can switch on without string-matching human-readable messages. Include the `requestId` in every error response to enable correlation with server-side logs.
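A small sketch of the envelope above, seen from both the server and the consumer side. The type and function names are hypothetical, and the error codes other than `VALIDATION_ERROR` (`RATE_LIMITED`, `SERVICE_UNAVAILABLE`) are invented for illustration only.

```typescript
// Hypothetical types mirroring the JSON envelope shown above.
interface ErrorDetail { field: string; issue: string }
interface ApiErrorBody {
  error: {
    code: string;
    message: string;
    details: ErrorDetail[];
    requestId: string;
  };
}

// Server side: build a validation error that carries the request id.
function validationError(details: ErrorDetail[], requestId: string): ApiErrorBody {
  return {
    error: {
      code: "VALIDATION_ERROR",
      message: "Invalid request parameters",
      details,
      requestId,
    },
  };
}

// Consumer side: branch on the machine-readable code, never on the message.
// RATE_LIMITED and SERVICE_UNAVAILABLE are assumed codes for this example.
function isRetryable(body: ApiErrorBody): boolean {
  switch (body.error.code) {
    case "RATE_LIMITED":
    case "SERVICE_UNAVAILABLE":
      return true;
    default:
      return false;
  }
}
```

Keeping the decision in `code` means the human-readable `message` can be reworded or localized without breaking clients.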
|
|
96
|
+
|
|
97
|
+
### API Design Review Checklist
|
|
98
|
+
|
|
99
|
+
Before shipping any new endpoint:

- Does the URL follow the naming convention?
- Are all error responses structured with an error code?
- Is the success response envelope consistent with existing endpoints?
- Is pagination implemented for list endpoints?
- Are inputs validated with a schema?
- Is the endpoint documented in OpenAPI / GraphQL schema / proto?
- Are breaking changes versioned?
|
|
100
|
+
|
|
101
|
+
### Rate Limiting Headers
|
|
102
|
+
|
|
103
|
+
Every API should communicate rate limit state to callers via response headers. Include `X-RateLimit-Limit` (maximum requests per window), `X-RateLimit-Remaining` (requests left in current window), and `X-RateLimit-Reset` (UTC epoch seconds when the window resets). When the limit is exceeded, return `429 Too Many Requests` with a `Retry-After` header specifying seconds until the caller may retry. This enables well-behaved clients to self-throttle without guessing.
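
A rough sketch of how these headers might be computed, assuming a simple fixed-window counter kept in process memory. The `FixedWindowLimiter` class is hypothetical; a real deployment would typically keep counters in a shared store and expire old windows.

```python
import math


class FixedWindowLimiter:
    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window = window_seconds
        self.counts = {}  # (caller, window_start) -> requests used; never purged in this sketch

    def check(self, caller: str, now: float):
        """Return (status_code, headers) for a request arriving at epoch time `now`."""
        window_start = int(now // self.window) * self.window
        reset = window_start + self.window
        key = (caller, window_start)
        used = self.counts.get(key, 0)
        allowed = used < self.limit
        if allowed:
            used += 1
            self.counts[key] = used
        headers = {
            "X-RateLimit-Limit": str(self.limit),
            "X-RateLimit-Remaining": str(max(self.limit - used, 0)),
            "X-RateLimit-Reset": str(reset),
        }
        if not allowed:
            # Seconds until the window resets, rounded up, never below 1.
            headers["Retry-After"] = str(max(math.ceil(reset - now), 1))
        return (200 if allowed else 429), headers
```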
|
|
@@ -0,0 +1,100 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: backend-architecture
|
|
3
|
+
description: Monolith vs microservices decision framework, layered architecture patterns, CQRS, event sourcing, hexagonal architecture, and service mesh considerations
|
|
4
|
+
topics: [backend, architecture, microservices, monolith, cqrs, event-sourcing, hexagonal, clean-architecture]
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Backend architecture is the set of structural decisions that determine how the system scales, how teams work independently, and how expensive future changes will be. The single most common backend architecture mistake is choosing microservices before the problem demands them. Start with the simplest architecture that solves the current problem, and evolve to complexity only when specific pain points — not hypothetical future ones — force the change.
|
|
8
|
+
|
|
9
|
+
## Summary
|
|
10
|
+
|
|
11
|
+
The single most common backend architecture mistake is choosing microservices before the problem demands them. Start with a monolith when the team is under 10–15 engineers or domain boundaries are unclear. Choose microservices only when parts of the system have genuinely different scaling requirements or teams need independent release cadences. The "modular monolith" is the underrated middle path.
|
|
12
|
+
|
|
13
|
+
The foundational organizational pattern is layered architecture: controllers own HTTP, services own business logic, repositories own data access. Advanced patterns like CQRS, event sourcing, and hexagonal architecture solve specific problems and should not be adopted speculatively.
|
|
14
|
+
|
|
15
|
+
## Deep Guidance
|
|
16
|
+
|
|
17
|
+
### Monolith vs Microservices Decision Framework
|
|
18
|
+
|
|
19
|
+
The choice is not philosophical — it is economic. Evaluate on these axes:
|
|
20
|
+
|
|
21
|
+
**Choose a monolith when:**
|
|
22
|
+
- Team size is under 10–15 engineers; coordination overhead of microservices exceeds the benefit
|
|
23
|
+
- The domain boundaries are not yet clear; premature decomposition creates the wrong service boundaries that are expensive to undo
|
|
24
|
+
- Operational maturity is low; microservices require observability, deployment pipelines, and network failure handling that monoliths do not
|
|
25
|
+
- The project is new; a well-structured monolith can be extracted into services later, at a fraction of the cost of rebuilding distributed services into a monolith
|
|
26
|
+
|
|
27
|
+
**Choose microservices when:**
|
|
28
|
+
- Different parts of the system have genuinely different scaling requirements (image processing vs user authentication vs real-time chat)
|
|
29
|
+
- Teams have hard ownership boundaries with independent release cadences; shared deployment is blocking velocity
|
|
30
|
+
- Specific services need different technology choices (ML inference in Python, payment processing in Go, main app in Node.js)
|
|
31
|
+
- The monolith's deployment coupling is causing real incidents — a change to a low-risk module blocking a high-stakes deployment
|
|
32
|
+
|
|
33
|
+
The "modular monolith" is the underrated middle path: a single deployment with strong internal module boundaries, co-located services with strict import rules, and a clear extraction path to microservices when the time comes.
|
|
34
|
+
|
|
35
|
+
### Layered Architecture (Controller → Service → Repository)
|
|
36
|
+
|
|
37
|
+
The foundational pattern for backend organization separates concerns across three layers:
|
|
38
|
+
|
|
39
|
+
- **Controller / Handler layer**: Owns HTTP. Parses requests, validates inputs, calls services, formats responses. Zero business logic.
|
|
40
|
+
- **Service layer**: Owns business logic. Orchestrates domain operations. Depends on repository interfaces. Returns domain objects or throws domain errors. No knowledge of HTTP, SQL, or external API SDKs.
|
|
41
|
+
- **Repository layer**: Owns data access. Translates between domain objects and persistence format. Exposes a domain-language interface: `findById`, `save`, `delete`.
|
|
42
|
+
|
|
43
|
+
This separation enables unit testing services with mocked repositories — the most valuable test in the suite.
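
The three layers can be sketched in a few lines. Everything here is illustrative: the `User` type, the `UserNotFound` error, the handler signature, and the in-memory repository are assumptions standing in for a real framework and database.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class User:
    id: str
    email: str


class UserNotFound(Exception):
    """Domain error; carries no HTTP knowledge."""


class UserRepository(Protocol):
    # Repository interface expressed in domain language.
    def find_by_id(self, user_id: str) -> Optional[User]: ...


class UserService:
    """Service layer: business logic only, depends on the interface above."""

    def __init__(self, repo: UserRepository):
        self.repo = repo

    def get_user(self, user_id: str) -> User:
        user = self.repo.find_by_id(user_id)
        if user is None:
            raise UserNotFound(user_id)
        return user


def get_user_handler(service: UserService, user_id: str):
    """Controller layer: translates domain results and errors into HTTP shapes."""
    try:
        user = service.get_user(user_id)
    except UserNotFound:
        return 404, {"error": {"code": "USER_NOT_FOUND", "message": "User not found"}}
    return 200, {"data": {"id": user.id, "email": user.email}}


class InMemoryUserRepository:
    """Test double; a SQL-backed class would satisfy the same interface."""

    def __init__(self, users):
        self.users = {u.id: u for u in users}

    def find_by_id(self, user_id: str) -> Optional[User]:
        return self.users.get(user_id)
```

Swapping `InMemoryUserRepository` for a database-backed implementation requires no change to `UserService`, which is what makes the service unit-testable.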
|
|
44
|
+
|
|
45
|
+
### CQRS (Command Query Responsibility Segregation)
|
|
46
|
+
|
|
47
|
+
CQRS separates read and write models. Commands mutate state; queries return state. This is not a default pattern — apply it when reads and writes have meaningfully different performance, consistency, or complexity requirements:
|
|
48
|
+
|
|
49
|
+
- **Separate models**: The write model enforces invariants and emits events. The read model is denormalized for query performance — often a materialized view or a read-optimized document.
|
|
50
|
+
- **When to use**: High-read / low-write ratios where the read query complexity is a bottleneck, systems where read and write consistency requirements differ, event-driven systems where the write side already produces events.
|
|
51
|
+
- **When NOT to use**: Simple CRUD with no complex business rules; small teams where the dual-model overhead is not justified.
|
|
52
|
+
|
|
53
|
+
### Event Sourcing
|
|
54
|
+
|
|
55
|
+
Event sourcing stores the full history of state changes as an immutable event log, deriving current state by replaying events. This is a specialized pattern for specific use cases:
|
|
56
|
+
|
|
57
|
+
- **When justified**: Audit trails are a hard requirement (financial transactions, medical records), temporal queries ("what was the account balance at 3pm Tuesday?"), complex event-driven workflows where the event history has business value.
|
|
58
|
+
- **Cost**: Event replay infrastructure, eventual consistency complexity, developer mental overhead. Do not adopt speculatively.
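
To make the replay idea concrete, here is a small sketch of deriving an account balance, including a temporal query, from an event list. The `Event` fields and the two event kinds are hypothetical; a real system would also need persistence, snapshots, and versioned event schemas.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    at: int      # event time as epoch seconds (hypothetical field)
    kind: str    # "deposited" or "withdrawn"
    amount: int  # minor units, e.g. cents


def balance_at(events, as_of=None):
    """Replay events in time order; stop at `as_of` to answer a temporal query."""
    balance = 0
    for event in sorted(events, key=lambda e: e.at):
        if as_of is not None and event.at > as_of:
            break
        if event.kind == "deposited":
            balance += event.amount
        elif event.kind == "withdrawn":
            balance -= event.amount
        else:
            raise ValueError(f"unknown event kind: {event.kind}")
    return balance
```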
|
|
59
|
+
|
|
60
|
+
### Hexagonal / Clean Architecture
|
|
61
|
+
|
|
62
|
+
Hexagonal architecture (Ports and Adapters) places the domain model at the center, surrounded by ports (interfaces) that define how the domain interacts with the outside world, and adapters that implement those interfaces for specific technologies:
|
|
63
|
+
|
|
64
|
+
- **Core domain**: Pure business logic with no framework or infrastructure dependencies.
|
|
65
|
+
- **Ports**: Interfaces the core defines. `UserRepository` is a port; `PostgresUserRepository` is an adapter.
|
|
66
|
+
- **Adapters**: HTTP controllers, database repositories, message queue consumers, external API clients.
|
|
67
|
+
|
|
68
|
+
The benefit: the domain is testable in isolation from all infrastructure. The cost: more interfaces and indirection. Apply when the domain is complex and long-lived; overkill for simple CRUD services.
|
|
69
|
+
|
|
70
|
+
### Service Mesh Considerations
|
|
71
|
+
|
|
72
|
+
Service meshes (Istio, Linkerd, Consul Connect) add a sidecar proxy to each service pod, providing mTLS, traffic management, circuit breaking, and observability without application code changes. Consider only when:
|
|
73
|
+
|
|
74
|
+
- You have 10+ microservices and the operational complexity of per-service networking configuration is unsustainable
|
|
75
|
+
- Zero-trust networking is a compliance requirement
|
|
76
|
+
- You need traffic splitting for canary deployments at the infrastructure level
|
|
77
|
+
|
|
78
|
+
A service mesh adds significant operational overhead. Validate the need against simpler alternatives (application-level circuit breakers with Resilience4j / Polly / opossum, API gateway traffic management) before committing.
|
|
79
|
+
|
|
80
|
+
### Modular Monolith Implementation
|
|
81
|
+
|
|
82
|
+
A modular monolith combines the deployment simplicity of a monolith with the code isolation of microservices:
|
|
83
|
+
|
|
84
|
+
- **Module boundaries**: Each module has its own directory with a public API (exported functions/types) and private internals. Modules communicate through well-defined interfaces, not direct imports of internal implementation.
|
|
85
|
+
- **Enforce boundaries**: Use ESLint import rules, TypeScript project references, or build-time checks to prevent cross-module imports that bypass the public API. The rule: Module A may import from Module B's public API but never from its internal implementation files.
|
|
86
|
+
- **Database isolation**: Each module owns its tables and never queries another module's tables directly. Cross-module data access goes through the module's service interface. This discipline makes future extraction to microservices possible without rewriting the data layer.
|
|
87
|
+
- **Shared kernel**: A small shared layer provides cross-cutting concerns (auth, logging, error types) used by all modules. Keep the shared kernel minimal — every shared type is a coupling point.
|
|
88
|
+
|
|
89
|
+
The modular monolith is the underrated default architecture for teams of 5–15 engineers. It avoids the operational overhead of microservices while providing the code isolation that prevents monolith rot.
|
|
90
|
+
|
|
91
|
+
### API Gateway Pattern
|
|
92
|
+
|
|
93
|
+
An API gateway sits in front of backend services and provides a unified entry point:
|
|
94
|
+
|
|
95
|
+
- **Routing**: Route requests to the appropriate service based on URL path or header
|
|
96
|
+
- **Cross-cutting concerns**: Authentication, rate limiting, request logging, and CORS handled once at the gateway rather than duplicated in each service
|
|
97
|
+
- **Response aggregation**: Combine responses from multiple services into a single response for the client (BFF-like behavior)
|
|
98
|
+
- **Tools**: Kong, AWS API Gateway, Traefik, Envoy, or a custom Express/Fastify proxy
|
|
99
|
+
|
|
100
|
+
Use an API gateway when you have 3+ services that share auth and rate limiting requirements. Do not use one for a single service — the added hop is pure overhead.
|
|
@@ -0,0 +1,101 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: backend-async-patterns
|
|
3
|
+
description: Message queue patterns, event-driven architecture, saga patterns, retry strategies, and idempotency keys
|
|
4
|
+
topics: [backend, async, message-queues, event-driven, saga, retry, idempotency, cqrs]
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Asynchronous patterns decouple services in time and space, enabling systems to absorb load spikes, survive partial failures, and scale independently — but they introduce delivery guarantees and consistency tradeoffs that must be designed for explicitly from the start.
|
|
8
|
+
|
|
9
|
+
## Summary
|
|
10
|
+
|
|
11
|
+
Asynchronous patterns come in two primary shapes: pub/sub for fan-out scenarios and work queues for competing consumers. Dead letter queues capture messages that exhaust retry attempts. Always configure a DLQ — without one, unprocessable messages either loop forever or are silently lost.
|
|
12
|
+
|
|
13
|
+
For distributed transactions, the saga pattern coordinates multi-step operations across services using either choreography (event-driven) or orchestration (central coordinator). All retry strategies should use exponential backoff with jitter to prevent retry storms, and circuit breakers prevent cascading failures when downstream services are degraded.
|
|
14
|
+
|
|
15
|
+
## Deep Guidance
|
|
16
|
+
|
|
17
|
+
### Message Queue Patterns
|
|
18
|
+
|
|
19
|
+
**Pub/sub (publish-subscribe):** Publishers emit events to a topic without knowing who consumes them. Multiple subscribers independently receive each message. Use for fan-out scenarios — an order placed event triggers inventory update, email notification, and analytics simultaneously.
|
|
20
|
+
|
|
21
|
+
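**Work queues (competing consumers):** Messages are distributed across multiple workers; each message is delivered to a single worker rather than to all of them. Most brokers deliver at least once, so a message can occasionally be redelivered and handlers should be idempotent. Use for task distribution, such as image processing, email sending, and PDF generation. Enables horizontal scaling: add workers to increase throughput.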
|
|
22
|
+
|
|
23
|
+
**Dead letter queues (DLQ):** Messages that fail after the maximum retry attempts are moved to a DLQ instead of being dropped. Monitor DLQ depth as a health signal. Inspect failed messages, fix the root cause, then replay from the DLQ. Always configure a DLQ — without one, unprocessable messages either loop forever or are silently lost.
|
|
24
|
+
|
|
25
|
+
### Event-Driven Architecture
|
|
26
|
+
|
|
27
|
+
**Event sourcing:** Store the sequence of state-changing events as the system of record, not the current state. The current state is a projection derived by replaying events. Benefits: complete audit history, ability to reconstruct past state, event replay for debugging. Cost: more complex reads (projections), eventual consistency, snapshot management for performance.
|
|
28
|
+
|
|
29
|
+
**CQRS (Command Query Responsibility Segregation):** Separate the write model (commands that change state) from the read model (queries optimized for display). Commands go through validation and business logic; read models are denormalized for query performance. Event sourcing and CQRS are often paired but are independent patterns — CQRS is valuable without event sourcing when read and write workloads have very different shapes.
|
|
30
|
+
|
|
31
|
+
**Event schema evolution:** Version event schemas from the start. Consumers must handle older event versions gracefully. Use a schema registry (Confluent Schema Registry, AWS Glue Schema Registry) to enforce compatibility. Prefer additive changes (new optional fields) over breaking changes.
|
|
32
|
+
|
|
33
|
+
### Saga Pattern for Distributed Transactions
|
|
34
|
+
|
|
35
|
+
Sagas coordinate multi-step transactions across services without two-phase commit. Each step has a compensating transaction that undoes its effect.
|
|
36
|
+
|
|
37
|
+
**Choreography:** Each service emits events and other services react. No central coordinator. Simpler but harder to trace and reason about for long workflows.
|
|
38
|
+
|
|
39
|
+
**Orchestration:** A saga orchestrator service drives the workflow, calling each participant and issuing compensating calls on failure. Easier to trace and monitor; the orchestrator is a single point of failure.
|
|
40
|
+
|
|
41
|
+
Use sagas when: a business operation spans multiple services or databases, and full ACID transactions are not available. Always design compensating transactions before implementing the forward path.
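
An orchestrated saga can be reduced to a loop that runs each step and, on failure, runs the compensations of the completed steps in reverse order. The sketch below is deliberately in-process and synchronous; `run_saga` and the step tuple shape are assumptions for illustration, and a real orchestrator would persist its progress so it can resume after a crash.

```python
def run_saga(steps):
    """Run `steps`, a list of (name, action, compensation) tuples, in order.

    If an action raises, compensations for the already completed steps
    run in reverse order and the failure is reported.
    """
    completed = []
    for name, action, compensate in steps:
        try:
            action()
        except Exception as exc:
            for _done_name, undo in reversed(completed):
                undo()
            return {"status": "compensated", "failed_step": name, "error": str(exc)}
        completed.append((name, compensate))
    return {"status": "completed"}
```

Note that the failed step's own compensation is not invoked here, on the assumption that a failed action had no effect; steps that can partially succeed need compensations that are safe to run in that case too.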
|
|
42
|
+
|
|
43
|
+
### Retry Strategies
|
|
44
|
+
|
|
45
|
+
**Exponential backoff:** After a failure, wait before retrying. Double the wait on each subsequent attempt: 1s, 2s, 4s, 8s, 16s. Add jitter (randomness of ±25%) to prevent retry storms — without jitter, all callers retry simultaneously and overwhelm the recovering service.
|
|
46
|
+
|
|
47
|
+
**Maximum attempts:** Cap total retries (typically 3–5). After the maximum, either raise the error to the caller, move the message to a DLQ, or trigger an alert.
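
A minimal sketch of exponential backoff with ±25% jitter and a capped attempt count. The function names and the injectable `sleep` and `rng` parameters are assumptions added so the behavior can be tested without real waiting.

```python
import random
import time


def backoff_delays(max_attempts=5, base=1.0, jitter=0.25, rng=random.random):
    """Yield the wait before each retry: base * 2**n, scaled by +/- `jitter`."""
    for attempt in range(max_attempts - 1):
        delay = base * (2 ** attempt)
        # rng() is in [0, 1); map it to a factor in [1 - jitter, 1 + jitter).
        yield delay * (1 + jitter * (2 * rng() - 1))


def call_with_retry(fn, max_attempts=5, sleep=time.sleep, **backoff_options):
    """Call `fn`, retrying on any exception until `max_attempts` is reached."""
    delays = backoff_delays(max_attempts, **backoff_options)
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts:
                raise  # caller decides: surface the error, send to a DLQ, or alert
            sleep(next(delays))
```

In practice the `except` clause would be narrowed to errors that are actually retryable, such as timeouts, rather than every exception.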
|
|
48
|
+
|
|
49
|
+
**Circuit breaker:** Track the failure rate of calls to a dependency. When failures exceed a threshold (e.g., 50% of calls in the last 10 seconds), open the circuit and immediately return an error without attempting the call. After a cooldown period, allow a single probe request — if it succeeds, close the circuit; if it fails, stay open. Circuit breakers prevent cascading failures when a downstream service is slow or down.
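
The state machine can be sketched as below. As a simplification, this version opens after a number of consecutive failures rather than tracking a failure rate over a time window, and it is not thread-safe; the class and parameter names are hypothetical.

```python
import time


class CircuitOpenError(Exception):
    """Raised instead of calling the dependency while the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, cooldown=10.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, fn):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.cooldown:
                raise CircuitOpenError("dependency unavailable")
            # Cooldown elapsed: fall through and let this call act as the probe.
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open, or re-open after a failed probe
            raise
        # Success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```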
|
|
50
|
+
|
|
51
|
+
### Idempotency Keys
|
|
52
|
+
|
|
53
|
+
Make all mutating operations idempotent so they can be safely retried. The client generates a unique idempotency key (UUID) and sends it with the request. The server records the key and the response. On a duplicate request with the same key, return the stored response without re-executing the operation.
|
|
54
|
+
|
|
55
|
+
**Storage:** Store idempotency keys in Redis or the database with the operation result. Set TTL based on reasonable retry windows (24 hours for payments, 1 hour for most operations).
|
|
56
|
+
|
|
57
|
+
**Scope:** Idempotency keys must be scoped to a user or API key — global keys are a DoS vector. Return `409 Conflict` if the same key is used with different request parameters (key collision detection).
|
|
58
|
+
|
|
59
|
+
Design database operations to be naturally idempotent where possible: `INSERT ... ON CONFLICT DO NOTHING`, an upsert, or check-then-insert in a transaction backed by a unique constraint (a check alone can race under concurrent requests).
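
The key handling described in this section might look like the following sketch. The in-memory dict stands in for Redis or a database table with a TTL, and `IdempotentExecutor` and `IdempotencyConflict` are hypothetical names; the sketch also ignores concurrent requests racing on the same key.

```python
import hashlib
import json


class IdempotencyConflict(Exception):
    """Maps to HTTP 409: the same key was reused with different parameters."""


class IdempotentExecutor:
    def __init__(self):
        # (owner, key) -> (request fingerprint, stored response)
        self.store = {}

    def execute(self, owner, key, params, operation):
        fingerprint = hashlib.sha256(
            json.dumps(params, sort_keys=True).encode()
        ).hexdigest()
        scoped = (owner, key)  # keys are scoped per user or API key, never global
        if scoped in self.store:
            saved_fingerprint, saved_response = self.store[scoped]
            if saved_fingerprint != fingerprint:
                raise IdempotencyConflict(key)
            return saved_response  # replay without re-executing
        response = operation(params)
        self.store[scoped] = (fingerprint, response)
        return response
```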
|
|
60
|
+
|
|
61
|
+
### Message Ordering Guarantees
|
|
62
|
+
|
|
63
|
+
Different messaging systems provide different ordering guarantees:
|
|
64
|
+
|
|
65
|
+
- **FIFO (First In, First Out)**: SQS FIFO queues, Kafka partitions. Messages are delivered in the order they were sent. Useful for operations where order matters (sequential state transitions, financial transactions).
|
|
66
|
+
- **Best-effort ordering**: Standard SQS, most pub/sub systems. Messages may arrive out of order. Design consumers to handle reordering — use sequence numbers or timestamps to detect and resolve ordering conflicts.
|
|
67
|
+
- **Partition-level ordering**: Kafka guarantees order within a partition. Use a partition key (user ID, order ID) to ensure all related messages go to the same partition and are processed in order.
|
|
68
|
+
|
|
69
|
+
When order matters, choose a system that guarantees it at the partition level rather than trying to enforce ordering at the application level. Application-level ordering (hold messages in a buffer, sort, then process) is fragile and adds latency.
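
The effect of a partition key can be illustrated with a stable hash: every message with the same key maps to the same partition, so its relative order is preserved. The function below is only a stand-in to show the idea; real clients apply their own hashing, and SHA-256 is an assumption made for this example.

```python
import hashlib


def partition_for(key: str, num_partitions: int) -> int:
    """Map a partition key (user ID, order ID) to a stable partition index."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions
```

Changing `num_partitions` remaps keys, which is one reason repartitioning an ordered topic needs care.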
|
|
70
|
+
|
|
71
|
+
### Backpressure Strategies
|
|
72
|
+
|
|
73
|
+
When a producer emits messages faster than consumers can process them, the system needs a backpressure strategy:
|
|
74
|
+
|
|
75
|
+
- **Queue depth limits**: Set a maximum queue depth. When the queue is full, the producer receives an error and must retry or drop the message. This prevents unbounded memory growth.
|
|
76
|
+
- **Rate limiting at the producer**: Throttle the producer based on consumer throughput. The producer monitors queue depth and reduces its emit rate when the queue approaches capacity.
|
|
77
|
+
- **Consumer scaling**: Automatically scale consumer instances based on queue depth. When depth exceeds a threshold, add workers. When it drops, scale down. Cloud-native options: AWS Lambda with SQS triggers (auto-scales), Kubernetes KEDA (queue-based autoscaler).
|
|
78
|
+
- **Shedding**: When overwhelmed, intentionally drop low-priority messages rather than processing everything slowly. Useful for telemetry or analytics events where some data loss is acceptable.
|
|
79
|
+
|
|
80
|
+
Monitor queue depth and consumer lag as primary health indicators. A growing queue depth means consumers are falling behind — this is a capacity planning signal, not a bug to ignore.
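
Queue depth limits and shedding can be sketched with the standard library's bounded queue. The `try_publish` helper and its return values are hypothetical; a broker-backed producer would surface the same three outcomes through its own API.

```python
import queue


def try_publish(q: queue.Queue, message, low_priority=False) -> str:
    """Enqueue without blocking; report what happened when the queue is full."""
    try:
        q.put_nowait(message)
        return "accepted"
    except queue.Full:
        # Low-priority traffic (telemetry, analytics) is dropped on purpose;
        # everything else is rejected so the producer can retry or slow down.
        return "shed" if low_priority else "rejected"
```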
|
|
81
|
+
|
|
82
|
+
### Transactional Outbox Pattern
|
|
83
|
+
|
|
84
|
+
The transactional outbox pattern ensures that database writes and message publishing are atomic — either both happen or neither does:
|
|
85
|
+
|
|
86
|
+
1. Write the business data and the outbox event to the database in the same transaction
|
|
87
|
+
2. A separate process (poller or CDC) reads the outbox table and publishes events to the message queue
|
|
88
|
+
3. Mark outbox entries as published after successful delivery
|
|
89
|
+
|
|
90
|
+
This eliminates the dual-write problem where the database commit succeeds but the message publish fails (or vice versa). Use database CDC (Change Data Capture) with Debezium for production-grade implementations.
|
|
91
|
+
|
|
92
|
+
### Choosing a Message Broker
|
|
93
|
+
|
|
94
|
+
Select the broker based on throughput, ordering, and operational requirements:
|
|
95
|
+
|
|
96
|
+
- **Redis (via BullMQ, Redis Streams)**: Good for moderate throughput, simple to operate, already present in most stacks. Limited durability if Redis is not configured with AOF persistence. Best for job queues and task distribution.
|
|
97
|
+
- **RabbitMQ**: Full-featured broker with routing, exchanges, and consumer acknowledgment. Excellent for complex routing topologies. More operational overhead than Redis.
|
|
98
|
+
- **Kafka**: Designed for high-throughput event streaming with durable, ordered, replayable logs. Best for event-driven architectures where consumers need to replay history. Highest operational overhead.
|
|
99
|
+
- **SQS / Cloud Pub/Sub**: Managed services with zero operational overhead. Limited features compared to self-hosted brokers but eliminate infrastructure management entirely. Default choice for cloud-native services unless a specific feature gap forces self-hosting.
|
|
100
|
+
|
|
101
|
+
Start with the simplest broker that meets the requirements. Migrating from Redis to Kafka later is far easier if producers and consumers depend on a thin messaging interface with broker-specific adapters rather than on the broker's client library directly.
|
|
@@ -0,0 +1,100 @@
|
|
|
1
|
+
---
|
|
2
|
+
name: backend-auth-patterns
|
|
3
|
+
description: JWT lifecycle, OAuth2 authorization code flow, API key management, and service-to-service authentication
|
|
4
|
+
topics: [backend, auth, jwt, oauth2, api-keys, mtls, security]
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
Authentication and authorization are the first line of defense for any backend service — mistakes here compromise the entire system, making it essential to use proven patterns like JWTs with rotation, OAuth2 with PKCE, and workload identity from the start.
|
|
8
|
+
|
|
9
|
+
## Summary
|
|
10
|
+
|
|
11
|
+
Authentication and authorization patterns for backend services center on three areas: JWTs with short expiry and refresh-token rotation for user sessions, OAuth2 authorization code flow with PKCE for third-party integrations, and scoped API keys with hashed storage for programmatic access. Service-to-service authentication uses mTLS, HMAC request signing, or cloud workload identity.
|
|
12
|
+
|
|
13
|
+
Every auth mechanism requires explicit key rotation, revocation procedures, and scope enforcement. Never put sensitive data in JWT payloads and never store API keys unhashed.
|
|
14
|
+
|
|
15
|
+
## Deep Guidance
|
|
16
|
+
|
|
17
|
+
### JWT Lifecycle
|
|
18
|
+
|
|
19
|
+
Issue JWTs with short expiry (15–60 minutes) signed with RS256 or ES256. Include only the minimum claims needed: `sub`, `iat`, `exp`, `iss`, `aud`, and application-specific role or scope claims. Never put sensitive data in the payload — it is base64-encoded, not encrypted.
|
|
20
|
+
|
|
21
|
+
**Refresh flow:** Issue a long-lived refresh token (7–30 days) stored in an HttpOnly, Secure, SameSite=Strict cookie. On access-token expiry the client posts to `/auth/refresh`; the server validates the refresh token, issues a new access token, and optionally rotates the refresh token (sliding expiry). Implement refresh-token rotation to detect token theft: if an already-used refresh token is presented, revoke the entire family and force re-login.
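
Reuse detection for rotated refresh tokens might be sketched like this. The in-memory store and the `RefreshTokenStore` class are hypothetical; a real store would persist hashed tokens with expiry times.

```python
import secrets


class RefreshTokenStore:
    def __init__(self):
        self.tokens = {}  # token -> {"family": str, "used": bool}
        self.revoked_families = set()

    def issue(self, family=None):
        """Issue a token; a new login starts a new family."""
        token = secrets.token_urlsafe(32)
        self.tokens[token] = {
            "family": family or secrets.token_urlsafe(8),
            "used": False,
        }
        return token

    def rotate(self, token):
        """Exchange a refresh token for a new one in the same family."""
        record = self.tokens.get(token)
        if record is None or record["family"] in self.revoked_families:
            raise PermissionError("invalid refresh token")
        if record["used"]:
            # An already-used token was presented: assume theft, revoke the family.
            self.revoked_families.add(record["family"])
            raise PermissionError("refresh token reuse detected")
        record["used"] = True
        return self.issue(record["family"])
```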
|
|
22
|
+
|
|
23
|
+
**Revocation:** JWTs are stateless by design, so revocation requires a blocklist. Store revoked token JTIs in Redis with a TTL matching each token's remaining lifetime. Check the blocklist for high-security operations; for low-risk reads where performance matters, you can skip the check and accept the short window in which a revoked token remains valid.
|
|
24
|
+
|
|
25
|
+
**Key rotation:** Support multiple active signing keys identified by `kid` header. Add the new key to the JWKS endpoint, start signing with it, keep the old key for validation until all tokens signed with it have expired, then remove it.
|
|
26
|
+
|
|
27
|
+
### OAuth2 Authorization Code Flow
|
|
28
|
+
|
|
29
|
+
Use the authorization code flow (with PKCE) for any user-facing OAuth2 integration — never implicit or client credentials on the frontend.
|
|
30
|
+
|
|
31
|
+
1. Generate a cryptographically random `state` parameter and PKCE `code_verifier`/`code_challenge`. Store both in the session.
|
|
32
|
+
2. Redirect the user to the provider's authorization endpoint with `response_type=code`, `client_id`, `redirect_uri`, `scope`, `state`, and `code_challenge`.
|
|
33
|
+
3. On callback, verify `state` matches the session value (CSRF protection), then exchange `code` + `code_verifier` for tokens via a server-side POST.
|
|
34
|
+
4. Store the provider's access token server-side (never expose it to the browser). Use it to fetch the user's profile and map to a local user record.
|
|
35
|
+
5. Issue your own session or JWT — do not use the provider's token as your application's auth token.
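
Step 1 of the flow above can be sketched with the standard library, using the `S256` challenge method defined in RFC 7636. The function names are illustrative; session storage and the HTTP redirects are left out.

```python
import base64
import hashlib
import hmac
import secrets


def code_challenge(code_verifier: str) -> str:
    """S256 challenge: base64url(SHA-256(verifier)) without padding."""
    digest = hashlib.sha256(code_verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def make_pkce_pair():
    # 64 random bytes encode to 86 URL-safe characters (allowed range: 43 to 128).
    verifier = secrets.token_urlsafe(64)
    return verifier, code_challenge(verifier)


def make_state() -> str:
    return secrets.token_urlsafe(32)


def state_matches(session_state: str, callback_state: str) -> bool:
    """Constant-time comparison of the stored and returned `state` values."""
    return hmac.compare_digest(session_state.encode(), callback_state.encode())
```

The verifier stays in the server-side session; only the challenge is sent in the authorization redirect, and the verifier is sent later with the code exchange.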
|
|
36
|
+
|
|
37
|
+
### API Key Management
|
|
38
|
+
|
|
39
|
+
**Generation:** Use cryptographically random 32-byte values encoded as hex or base58. Prefix keys with a service identifier (`sk_live_`, `pk_test_`) for easy identification in logs and leaked-credential scanners.
|
|
40
|
+
|
|
41
|
+
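**Storage:** Hash the key (SHA-256) before storing it in the database; never store it in plaintext. Because API keys are long random values, a fast hash such as SHA-256 is sufficient, unlike passwords, which need a slow hash such as bcrypt or Argon2. Only show the full key once at creation time. Store metadata alongside the hash: name, scopes, last-used timestamp, expiry, owner.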
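
Generation and hashed storage can be sketched as follows. The `sk_live_` prefix comes from the text above; the function names and the choice of hex encoding are assumptions for the example.

```python
import hashlib
import hmac
import secrets


def generate_api_key(prefix: str = "sk_live_"):
    """Return (full_key, stored_hash). Show full_key to the user exactly once."""
    key = prefix + secrets.token_hex(32)  # 32 random bytes, hex-encoded
    return key, hashlib.sha256(key.encode()).hexdigest()


def verify_api_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare against the stored hash in constant time."""
    candidate = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)
```

Since the stored value is a deterministic hash, it can be indexed and looked up directly on each request.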
|
|
42
|
+
|
|
43
|
+
**Scoping:** Define fine-grained scopes (`orders:read`, `webhooks:write`) and require callers to request minimum necessary scopes. Validate scope on each request against the endpoint's required permissions.
|
|
44
|
+
|
|
45
|
+
**Rotation:** Provide a rotation endpoint that issues a new key and returns it once; the old key cannot be returned because only its hash is stored. Set a grace period (e.g., 24 hours) during which both keys are valid, then revoke the old key. Send email/webhook notifications before expiry.
|
|
46
|
+
|
|
47
|
+
### Service-to-Service Authentication
|
|
48
|
+
|
|
49
|
+
**mTLS:** Both client and server present TLS certificates. The server verifies the client certificate against a trusted CA. Use a private CA (cert-manager on Kubernetes, AWS Private CA) to issue short-lived service certificates. Rotate certificates automatically before expiry. mTLS is the strongest service-to-service option and integrates with service meshes (Istio, Linkerd).
|
|
50
|
+
|
|
51
|
+
**Shared secrets / HMAC request signing:** For simpler setups, sign requests with an HMAC-SHA256 of the canonical request (method + path + timestamp + body hash) using a shared secret. Include the signature and timestamp in a request header. The receiving service verifies the signature and rejects requests with timestamps older than 5 minutes (replay protection). Rotate shared secrets via a dual-key window.
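
A sketch of the signing and verification steps, assuming a newline-joined canonical string and a hex-encoded signature. Both conventions, and the function names, are choices made for this example; the two services only need to agree on them.

```python
import hashlib
import hmac

MAX_SKEW_SECONDS = 300  # reject timestamps more than 5 minutes from `now`


def sign_request(secret: bytes, method: str, path: str, timestamp: int, body: bytes) -> str:
    """HMAC-SHA256 over method + path + timestamp + body hash."""
    body_hash = hashlib.sha256(body).hexdigest()
    canonical = "\n".join([method.upper(), path, str(timestamp), body_hash])
    return hmac.new(secret, canonical.encode(), hashlib.sha256).hexdigest()


def verify_request(secret, method, path, timestamp, body, signature, now) -> bool:
    if abs(now - timestamp) > MAX_SKEW_SECONDS:
        return False  # stale or future-dated: possible replay
    expected = sign_request(secret, method, path, timestamp, body)
    return hmac.compare_digest(expected, signature)
```

During a dual-key rotation window the verifier would try the current and the previous secret before rejecting.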
|
|
52
|
+
|
|
53
|
+
**Workload identity:** On cloud platforms, prefer workload identity (AWS IAM roles for service accounts, GCP Workload Identity Federation) over static shared secrets. Credentials are issued dynamically by the platform and rotate automatically.
|
|
54
|
+
|
|
55
|
+
### Session Management Patterns
|
|
56
|
+
|
|
57
|
+
For server-rendered applications or APIs that need session state:
|
|
58
|
+
|
|
59
|
+
- **Server-side sessions**: Store session data in Redis or a database. The client holds only an opaque session ID in an HttpOnly, Secure, SameSite=Strict cookie. Server looks up the session on each request. Benefits: easy revocation, no client-side state management. Cost: every request hits the session store.
|
|
60
|
+
- **Stateless JWT sessions**: The JWT itself is the session. Benefits: no session store, horizontal scaling without shared state. Cost: revocation requires a blocklist, token size grows with claims, tokens cannot be invalidated before expiry without extra infrastructure.
|
|
61
|
+
- **Hybrid approach**: Use JWTs for short-lived access (15 minutes) and server-side sessions for refresh tokens. This provides the performance benefits of stateless access tokens with the revocability of server-side sessions.
|
|
62
|
+
|
|
63
|
+
Choose server-side sessions for applications with strict revocation requirements (banking, healthcare). Choose stateless JWTs for high-throughput APIs where the operational overhead of a session store is not justified.
|
|
64
|
+
|
|
65
|
+
### Permission Models
|
|
66
|
+
|
|
67
|
+
Authorization beyond authentication requires an explicit permission model:
|
|
68
|
+
|
|
69
|
+
- **RBAC (Role-Based Access Control)**: Users are assigned roles (admin, editor, viewer). Roles map to permissions. Simple to implement, hard to evolve when permission granularity requirements grow. Suitable for most applications with fewer than 20 distinct permission levels.
|
|
70
|
+
- **ABAC (Attribute-Based Access Control)**: Permissions evaluated based on user attributes, resource attributes, and environmental context. More flexible than RBAC but more complex to implement and audit. Use when the same user needs different permissions on different resources based on attributes (department, project membership, data sensitivity).
|
|
71
|
+
- **ReBAC (Relationship-Based Access Control)**: Permissions derived from the relationship between users and resources (ownership, team membership, sharing). Google Zanzibar model. Implemented by OpenFGA, SpiceDB, Ory Keto. Use for applications with complex sharing and collaboration features (Google Docs-style permissions).
|
|
72
|
+
|
|
73
|
+
Regardless of model, enforce authorization at the service layer — not at the controller level. A controller that checks permissions directly is duplicating security logic that should be centralized.
|
|
74
|
+
|
|
75
|
+
### Token Storage Best Practices
|
|
76
|
+
|
|
77
|
+
Where tokens are stored determines the attack surface:
|
|
78
|
+
|
|
79
|
+
- **HttpOnly cookies**: Protected from XSS (JavaScript cannot read them). Vulnerable to CSRF without SameSite and CSRF tokens. The recommended storage for refresh tokens.
|
|
80
|
+
- **Authorization header (Bearer token)**: Tokens stored in memory. Lost on page refresh. Not vulnerable to CSRF. Suitable for SPAs where the token is fetched from a server-side session on load.
|
|
81
|
+
- **localStorage**: Persistent across sessions but fully accessible to any JavaScript on the page. Never store refresh tokens or long-lived credentials in localStorage — a single XSS vulnerability compromises them.
|
|
82
|
+
|
|
83
|
+
### CSRF Protection
|
|
84
|
+
|
|
85
|
+
Cross-Site Request Forgery (CSRF) attacks trick authenticated browsers into making unintended requests. Protection strategies:
|
|
86
|
+
|
|
87
|
+
- **SameSite cookies**: Set `SameSite=Strict` or `SameSite=Lax` on session cookies. `Strict` prevents the cookie from being sent on any cross-site request, including navigation. `Lax` allows the cookie on top-level GET navigations (safe for most cases).
|
|
88
|
+
- **Double-submit cookie**: Generate a random CSRF token, set it as a cookie, and require the client to include it in a custom header (`X-CSRF-Token`). The server verifies the header matches the cookie. Attackers cannot read cross-origin cookies to include the header.
|
|
89
|
+
- **Origin header validation**: Check the `Origin` or `Referer` header on state-changing requests. Reject requests from unknown origins. This is a defense-in-depth measure, not a standalone protection.
|
|
90
|
+
- **Token-per-request**: For highest security, generate a unique CSRF token per form render and validate it on submission. More complex but eliminates token reuse attacks.
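
The double-submit check from the list above reduces to comparing the cookie value with the header value. A minimal sketch, with hypothetical function names and the cookie and header parsing left to the framework:

```python
import hmac
import secrets


def issue_csrf_token() -> str:
    """Random token to set as a cookie and echo back in `X-CSRF-Token`."""
    return secrets.token_urlsafe(32)


def csrf_ok(cookie_token, header_token) -> bool:
    """Accept only when both values are present and equal (constant-time compare)."""
    if not cookie_token or not header_token:
        return False
    return hmac.compare_digest(cookie_token.encode(), header_token.encode())
```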
|
|
91
|
+
|
|
92
|
+
When using JWT bearer tokens in Authorization headers (not cookies), CSRF protection is not needed — the token is not sent automatically by the browser.
|
|
93
|
+
|
|
94
|
+
### Multi-Factor Authentication Integration
|
|
95
|
+
|
|
96
|
+
For applications requiring MFA:
|
|
97
|
+
|
|
98
|
+
- **TOTP (Time-based One-Time Password)**: Standard algorithm (RFC 6238) compatible with Google Authenticator, Authy, 1Password. Generate a secret key, encode as a QR code URI, and verify the 6-digit code during login. Store the secret key encrypted in the database.
|
|
99
|
+
- **WebAuthn / Passkeys**: Hardware key or biometric authentication via the Web Authentication API. Strongest option — phishing-resistant. Support as the primary MFA method for security-sensitive applications.
|
|
100
|
+
- **Recovery codes**: Generate 8–10 single-use recovery codes during MFA enrollment. Hash them like passwords. Display only once at enrollment. These are the escape hatch when the user loses their MFA device.
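
For reference, TOTP verification as specified in RFC 6238 (HMAC-SHA1, 30-second steps) fits in a few lines of standard-library code. This is a sketch for understanding the algorithm; the `window` tolerance for clock drift is an assumption, and production code would also rate-limit attempts and reject reused codes.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, now: int, step: int = 30, digits: int = 6) -> str:
    """Compute the TOTP code for raw `secret` bytes at epoch time `now`."""
    counter = struct.pack(">Q", now // step)
    mac = hmac.new(secret, counter, hashlib.sha1).digest()
    offset = mac[-1] & 0x0F  # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)


def verify_totp(secret: bytes, submitted: str, now: int, step: int = 30, window: int = 1) -> bool:
    """Accept codes from the current step and `window` steps on either side."""
    for drift in range(-window, window + 1):
        t = now + drift * step
        if t < 0:
            continue
        if hmac.compare_digest(totp(secret, t, step).encode(), submitted.encode()):
            return True
    return False
```

Authenticator apps exchange the secret in base32 form, so a stored secret would be base32-decoded (after decryption) before being passed to `totp`.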
|