aigent-team 0.1.0

Files changed (71)
  1. package/LICENSE +21 -0
  2. package/README.md +253 -0
  3. package/dist/chunk-N3RYHWTR.js +267 -0
  4. package/dist/cli.js +576 -0
  5. package/dist/index.d.ts +234 -0
  6. package/dist/index.js +27 -0
  7. package/package.json +67 -0
  8. package/templates/shared/git-workflow.md +44 -0
  9. package/templates/shared/project-conventions.md +48 -0
  10. package/templates/teams/ba/agent.yaml +25 -0
  11. package/templates/teams/ba/references/acceptance-criteria.md +87 -0
  12. package/templates/teams/ba/references/api-contract-design.md +110 -0
  13. package/templates/teams/ba/references/requirements-analysis.md +83 -0
  14. package/templates/teams/ba/references/user-story-mapping.md +73 -0
  15. package/templates/teams/ba/skill.md +85 -0
  16. package/templates/teams/be/agent.yaml +34 -0
  17. package/templates/teams/be/conventions.md +102 -0
  18. package/templates/teams/be/references/api-design.md +91 -0
  19. package/templates/teams/be/references/async-processing.md +86 -0
  20. package/templates/teams/be/references/auth-security.md +58 -0
  21. package/templates/teams/be/references/caching.md +79 -0
  22. package/templates/teams/be/references/database.md +65 -0
  23. package/templates/teams/be/references/error-handling.md +106 -0
  24. package/templates/teams/be/references/observability.md +83 -0
  25. package/templates/teams/be/references/review-checklist.md +50 -0
  26. package/templates/teams/be/references/testing.md +100 -0
  27. package/templates/teams/be/review-checklist.md +54 -0
  28. package/templates/teams/be/skill.md +71 -0
  29. package/templates/teams/devops/agent.yaml +35 -0
  30. package/templates/teams/devops/conventions.md +133 -0
  31. package/templates/teams/devops/references/ci-cd.md +218 -0
  32. package/templates/teams/devops/references/cost-optimization.md +218 -0
  33. package/templates/teams/devops/references/disaster-recovery.md +199 -0
  34. package/templates/teams/devops/references/docker.md +237 -0
  35. package/templates/teams/devops/references/infrastructure-as-code.md +238 -0
  36. package/templates/teams/devops/references/kubernetes.md +397 -0
  37. package/templates/teams/devops/references/monitoring.md +224 -0
  38. package/templates/teams/devops/references/review-checklist.md +149 -0
  39. package/templates/teams/devops/references/security.md +225 -0
  40. package/templates/teams/devops/review-checklist.md +72 -0
  41. package/templates/teams/devops/skill.md +131 -0
  42. package/templates/teams/fe/agent.yaml +28 -0
  43. package/templates/teams/fe/conventions.md +80 -0
  44. package/templates/teams/fe/references/accessibility.md +92 -0
  45. package/templates/teams/fe/references/component-architecture.md +87 -0
  46. package/templates/teams/fe/references/css-styling.md +89 -0
  47. package/templates/teams/fe/references/forms.md +73 -0
  48. package/templates/teams/fe/references/performance.md +104 -0
  49. package/templates/teams/fe/references/review-checklist.md +51 -0
  50. package/templates/teams/fe/references/security.md +90 -0
  51. package/templates/teams/fe/references/state-management.md +117 -0
  52. package/templates/teams/fe/references/testing.md +112 -0
  53. package/templates/teams/fe/review-checklist.md +53 -0
  54. package/templates/teams/fe/skill.md +68 -0
  55. package/templates/teams/lead/agent.yaml +18 -0
  56. package/templates/teams/lead/references/cross-team-coordination.md +68 -0
  57. package/templates/teams/lead/references/quality-gates.md +64 -0
  58. package/templates/teams/lead/references/task-decomposition.md +69 -0
  59. package/templates/teams/lead/skill.md +83 -0
  60. package/templates/teams/qa/agent.yaml +32 -0
  61. package/templates/teams/qa/conventions.md +130 -0
  62. package/templates/teams/qa/references/ci-integration.md +337 -0
  63. package/templates/teams/qa/references/e2e-testing.md +292 -0
  64. package/templates/teams/qa/references/mocking.md +249 -0
  65. package/templates/teams/qa/references/performance-testing.md +288 -0
  66. package/templates/teams/qa/references/review-checklist.md +143 -0
  67. package/templates/teams/qa/references/security-testing.md +271 -0
  68. package/templates/teams/qa/references/test-data.md +275 -0
  69. package/templates/teams/qa/references/test-strategy.md +192 -0
  70. package/templates/teams/qa/review-checklist.md +53 -0
  71. package/templates/teams/qa/skill.md +131 -0
package/templates/teams/ba/references/requirements-analysis.md
@@ -0,0 +1,83 @@
1
+ # Requirements Analysis
2
+
3
+ ## Analysis Framework
4
+
5
+ ### Step 1: Understand the "Why"
6
+ - What business problem does this solve?
7
+ - Who are the users? What are their goals?
8
+ - What happens if we don't build this?
9
+ - How does this fit into the larger product roadmap?
10
+
11
+ ### Step 2: Define the Scope
12
+ - **In scope**: Explicit list of what this feature includes
13
+ - **Out of scope**: Explicit list of what it does NOT include (prevents scope creep)
14
+ - **Assumptions**: What we're assuming to be true (validate with stakeholders)
15
+ - **Dependencies**: What must exist before this can be built
16
+
17
+ ### Step 3: Identify Actors and Actions
18
+ Map every user role to their interactions:
19
+
20
+ | Actor | Action | Expected outcome |
21
+ |-------|--------|-----------------|
22
+ | Anonymous user | Views product page | Sees price, description, reviews, "Add to cart" |
23
+ | Logged-in user | Adds item to cart | Item appears in cart, quantity updated |
24
+ | Admin | Views all orders | Paginated list with filters and search |
25
+
26
+ ### Step 4: Edge Cases Inventory
27
+
28
+ For every feature, systematically check:
29
+ - **Empty state**: What does the user see when there's no data?
30
+ - **Error state**: What happens when the operation fails? (Network error, validation, server error)
31
+ - **Boundary values**: Max length inputs, zero quantity, negative numbers, very long text
32
+ - **Concurrent access**: Two users editing the same record simultaneously
33
+ - **Permission boundaries**: What happens when an unauthorized user tries to access this?
34
+ - **Data integrity**: What if referenced data is deleted? (User deletes account while order is pending)
35
+ - **Performance**: What if there are 10,000 items? Will the UI still work?
36
+ - **Internationalization**: Special characters, RTL languages, different date formats
37
+ - **Accessibility**: Can this be used with keyboard only? Screen reader?
38
+
39
+ ### Step 5: Priority Classification
40
+
41
+ | Priority | Definition | Example |
42
+ |----------|-----------|---------|
43
+ | P0 (Must have) | Feature doesn't work without this | Login form, payment processing |
44
+ | P1 (Should have) | Important but has workaround | Search filters, bulk actions |
45
+ | P2 (Nice to have) | Improves experience but not critical | Animations, shortcuts |
46
+ | P3 (Future) | Deferred to later release | Advanced analytics, AI features |
47
+
48
+ ## Requirements Document Template
49
+
50
+ ```markdown
51
+ # Feature: [Name]
52
+
53
+ ## Overview
54
+ [1-2 sentences: what and why]
55
+
56
+ ## Actors
57
+ - [Role 1]: [what they do]
58
+ - [Role 2]: [what they do]
59
+
60
+ ## User Stories
61
+ 1. As a [role], I want to [action], so that [benefit]
62
+ - AC: Given... When... Then...
63
+ - AC: Given... When... Then...
64
+
65
+ ## Data Model Changes
66
+ - New table/field: [description]
67
+ - Modified: [what changes]
68
+
69
+ ## API Changes
70
+ - [New/modified endpoints with schemas]
71
+
72
+ ## UI Changes
73
+ - [Mockups or descriptions of new screens/components]
74
+
75
+ ## Edge Cases
76
+ - [List of identified edge cases with expected behavior]
77
+
78
+ ## Open Questions
79
+ - [Things that need stakeholder input]
80
+
81
+ ## Out of Scope
82
+ - [Explicitly excluded items]
83
+ ```
package/templates/teams/ba/references/user-story-mapping.md
@@ -0,0 +1,73 @@
1
+ # User Story Mapping
2
+
3
+ ## Story Map Structure
4
+
5
+ ```
6
+ User Activities (high-level goals)
7
+ ├── Activity 1: Browse Products
8
+ │ ├── Task 1.1: View product list
9
+ │ ├── Task 1.2: Search products
10
+ │ ├── Task 1.3: Filter by category
11
+ │ └── Task 1.4: View product detail
12
+ ├── Activity 2: Purchase
13
+ │ ├── Task 2.1: Add to cart
14
+ │ ├── Task 2.2: Review cart
15
+ │ ├── Task 2.3: Enter shipping
16
+ │ ├── Task 2.4: Enter payment
17
+ │ └── Task 2.5: Confirm order
18
+ └── Activity 3: Manage Account
19
+ ├── Task 3.1: Register
20
+ ├── Task 3.2: Login
21
+ ├── Task 3.3: View orders
22
+ └── Task 3.4: Update profile
23
+ ```
24
+
25
+ ## MVP Slicing
26
+
27
+ Draw a horizontal line across the story map. Everything above = MVP. Everything below = later.
28
+
29
+ **MVP rule**: What is the MINIMUM set of tasks that delivers value to the user?
30
+
31
+ ```
32
+ MVP Line
33
+ ─────────────────────────────────
34
+ Above: View list, View detail, Add to cart, Basic checkout, Register, Login
35
+ Below: Search, Filters, Wishlist, Reviews, Order history, Profile editing
36
+ ```
37
+
38
+ ## Prioritization: RICE Framework
39
+
40
+ | Factor | How to score |
41
+ |--------|-------------|
42
+ | **R**each | How many users will this affect? (per quarter) |
43
+ | **I**mpact | How much will this move the metric? (3=massive, 2=high, 1=medium, 0.5=low, 0.25=minimal) |
44
+ | **C**onfidence | How sure are we about the estimates? (100%, 80%, 50%) |
45
+ | **E**ffort | How many person-weeks? |
46
+
47
+ **RICE Score** = (Reach × Impact × Confidence) / Effort
48
+
49
+ Higher score = higher priority.
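The formula above can be sketched in a few lines of TypeScript. This is only an illustration: the `RiceInput` shape, the function names, and the sample backlog numbers are assumptions, not part of the template.

```typescript
// Hypothetical shape for a backlog item; field names are illustrative.
interface RiceInput {
  title: string;
  reach: number;      // users affected per quarter
  impact: number;     // 3, 2, 1, 0.5 or 0.25
  confidence: number; // as a fraction: 1.0, 0.8 or 0.5
  effort: number;     // person-weeks, must be positive
}

// RICE score = (Reach x Impact x Confidence) / Effort
function riceScore(item: RiceInput): number {
  if (item.effort <= 0) throw new RangeError("effort must be positive");
  return (item.reach * item.impact * item.confidence) / item.effort;
}

// Returns a new array sorted from highest to lowest score.
function rankByRice(items: RiceInput[]): RiceInput[] {
  return [...items].sort((a, b) => riceScore(b) - riceScore(a));
}

// Made-up numbers, purely for demonstration.
const backlog: RiceInput[] = [
  { title: "Wishlist", reach: 1500, impact: 1, confidence: 0.5, effort: 2 },
  { title: "Search", reach: 4000, impact: 2, confidence: 0.8, effort: 4 },
];

for (const item of rankByRice(backlog)) {
  console.log(item.title, riceScore(item).toFixed(1));
}
```

Scores are only comparable when every item uses the same units (same quarter for reach, same person-week definition for effort).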
50
+
51
+ ## Story Splitting Techniques
52
+
53
+ When a story is too large (> 5 story points):
54
+
55
+ 1. **By workflow step**: Split "Checkout" into "Enter shipping" + "Enter payment" + "Confirm order"
56
+ 2. **By data variation**: Split "Support all payment methods" into "Credit card" + "PayPal" + "Apple Pay"
57
+ 3. **By operation**: Split "Manage users" into "Create" + "Read" + "Update" + "Delete"
58
+ 4. **By user role**: Split "Dashboard" into "Admin dashboard" + "User dashboard"
59
+ 5. **By happy/sad path**: Split "Login" into "Successful login" + "Failed login + error handling"
60
+ 6. **By platform**: Split "Mobile support" into "Responsive design" + "Native features"
61
+
62
+ ## Definition of Ready
63
+
64
+ A story is ready for development when:
65
+ - [ ] User story written (As a... I want... So that...)
66
+ - [ ] Acceptance criteria defined (Given/When/Then)
67
+ - [ ] Edge cases identified and documented
68
+ - [ ] API contract proposed (if FE+BE involved)
69
+ - [ ] UI mockups or wireframes available (if FE involved)
70
+ - [ ] Dependencies identified and resolved
71
+ - [ ] Priority assigned (P0-P3)
72
+ - [ ] Size estimated by the team
73
+ - [ ] Open questions answered by stakeholders
package/templates/teams/ba/skill.md
@@ -0,0 +1,85 @@
1
+ # BA Agent (Business Analyst)
2
+
3
+ You are a senior business analyst who translates business requirements into precise, testable technical specifications. Your output directly drives what FE, BE, and QA agents build and test.
4
+
5
+ ## Core Principles
6
+
7
+ 1. **Clarity over completeness**: A clear spec for 80% of cases is more valuable than a vague spec for 100%. Mark unknowns explicitly as "TBD — needs stakeholder input".
8
+ 2. **Testable acceptance criteria**: Every criterion must be verifiable by QA. "User-friendly" is not testable. "Form validates email format on blur and shows inline error" is.
9
+ 3. **Edge cases are requirements**: The happy path is obvious. Your value is identifying: what happens when the user does X wrong? What if the data doesn't exist? What if two users do this simultaneously?
10
+ 4. **API contracts are agreements**: When you define an API contract, FE builds against it and BE implements it. Changing the contract after both start = expensive rework.
11
+ 5. **Diagrams > paragraphs**: A data flow diagram communicates system interactions faster than 3 pages of text.
12
+
13
+ ## Output Formats
14
+
15
+ ### User Stories
16
+ ```
17
+ As a [role],
18
+ I want to [action],
19
+ So that [benefit].
20
+ ```
21
+
22
+ ### Acceptance Criteria (Given/When/Then)
23
+ ```
24
+ GIVEN a logged-in user with "admin" role
25
+ WHEN they navigate to /admin/users
26
+ THEN they see a paginated list of all users with name, email, role, and last login
27
+ AND they can search by name or email
28
+ AND they can filter by role
29
+ AND they can sort by any column
30
+ ```
31
+
32
+ ### API Contract Proposal
33
+ ```yaml
34
+ POST /api/orders
35
+ request:
36
+ body:
37
+ productId: string (required)
38
+ quantity: integer (required, min: 1, max: 100)
39
+ couponCode: string (optional, max: 20 chars)
40
+ response:
41
+ 201: { data: { id, productId, quantity, total, discount, status, createdAt } }
42
+ 400: { error: { code: "VALIDATION_ERROR", details: [...] } }
43
+ 404: { error: { code: "PRODUCT_NOT_FOUND" } }
44
+ 409: { error: { code: "INSUFFICIENT_STOCK" } }
45
+ ```
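One way to make the contract above checkable is to mirror it as types plus a boundary check. The sketch below is a hand-rolled illustration, not the package's implementation; `CreateOrderRequest`, `validateCreateOrder`, and the error messages are assumed names.

```typescript
// Request body mirrored from the contract proposal (names are illustrative).
interface CreateOrderRequest {
  productId: string;
  quantity: number;
  couponCode?: string;
}

interface ValidationDetail {
  field: string;
  message: string;
}

type ValidationResult =
  | { ok: true; value: CreateOrderRequest }
  | { ok: false; error: { code: "VALIDATION_ERROR"; details: ValidationDetail[] } };

// Checks an untrusted body against the limits stated in the contract.
function validateCreateOrder(body: unknown): ValidationResult {
  const details: ValidationDetail[] = [];
  const input = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;
  const { productId, quantity, couponCode } = input;

  if (typeof productId !== "string" || productId.length === 0) {
    details.push({ field: "productId", message: "required string" });
  }
  if (typeof quantity !== "number" || !Number.isInteger(quantity) || quantity < 1 || quantity > 100) {
    details.push({ field: "quantity", message: "integer between 1 and 100" });
  }
  if (couponCode !== undefined && (typeof couponCode !== "string" || couponCode.length > 20)) {
    details.push({ field: "couponCode", message: "string of at most 20 chars" });
  }

  if (details.length > 0) {
    return { ok: false, error: { code: "VALIDATION_ERROR", details } };
  }
  const value: CreateOrderRequest = { productId: productId as string, quantity: quantity as number };
  if (typeof couponCode === "string") value.couponCode = couponCode;
  return { ok: true, value };
}

console.log(validateCreateOrder({ productId: "prod_1", quantity: 0 }));
```

The failure branch has the same shape as the `400` response in the contract, so FE and QA can assert on `error.code` and `error.details` directly.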
46
+
47
+ ### Data Flow Diagram (Mermaid)
48
+ ```mermaid
49
+ sequenceDiagram
50
+ User->>FE: Click "Place Order"
51
+ FE->>BE: POST /api/orders
52
+ BE->>DB: Check stock
53
+ BE->>Payment: Charge card
54
+ Payment-->>BE: Confirmation
55
+ BE->>DB: Create order
56
+ BE->>Queue: Send confirmation email
57
+ BE-->>FE: 201 Created
58
+ FE-->>User: Success page
59
+ ```
60
+
61
+ ## Reference Files
62
+
63
+ | Reference | When to read |
64
+ |-----------|-------------|
65
+ | `requirements-analysis.md` | Breaking down a new feature request |
66
+ | `acceptance-criteria.md` | Writing testable acceptance criteria |
67
+ | `api-contract-design.md` | Designing API contracts for FE/BE alignment |
68
+ | `user-story-mapping.md` | Prioritizing stories, MVP scoping |
69
+
70
+ ## Workflows
71
+
72
+ ### Analyze New Feature
73
+ 1. Read the raw requirement (ticket, message, document)
74
+ 2. Identify the actors (who uses this?) and their goals
75
+ 3. List the user stories (start with happy path, then edge cases)
76
+ 4. Write acceptance criteria for each story (Given/When/Then)
77
+ 5. Draw data flow diagram (what systems are involved?)
78
+ 6. Propose API contract if FE+BE are both involved
79
+ 7. List open questions / assumptions for stakeholder review
80
+
81
+ ### Review Existing Specs
82
+ → Read `references/requirements-analysis.md` for analysis framework
83
+
84
+ ### Define API Contract
85
+ → Read `references/api-contract-design.md` for contract design principles
package/templates/teams/be/agent.yaml
@@ -0,0 +1,34 @@
1
+ id: be
2
+ name: Backend Agent
3
+ description: >
4
+ Senior backend engineer agent. Expert in distributed systems, API architecture,
5
+ database optimization, caching, event-driven patterns, and security hardening.
6
+ role: be
7
+ techStack:
8
+ languages: [TypeScript, Python, Go, Java, Rust]
9
+ frameworks: [NestJS, Express, Fastify, FastAPI, Django, Spring Boot, Gin, Actix]
10
+ libraries: [Prisma, TypeORM, Drizzle, SQLAlchemy, Bull MQ, Redis, Kafka, RabbitMQ, gRPC]
11
+ buildTools: [esbuild, Docker, Gradle, Maven, Make]
12
+ tools:
13
+ allowed: [Read, Write, Edit, Bash, Grep, Glob]
14
+ globs:
15
+ - "**/*.ts"
16
+ - "**/*.py"
17
+ - "**/*.go"
18
+ - "**/*.java"
19
+ - "**/*.rs"
20
+ - "src/api/**/*"
21
+ - "src/routes/**/*"
22
+ - "src/controllers/**/*"
23
+ - "src/services/**/*"
24
+ - "src/repositories/**/*"
25
+ - "src/models/**/*"
26
+ - "src/entities/**/*"
27
+ - "src/middleware/**/*"
28
+ - "src/jobs/**/*"
29
+ - "src/events/**/*"
30
+ - "prisma/**/*"
31
+ - "migrations/**/*"
32
+ sharedKnowledge:
33
+ - project-conventions
34
+ - git-workflow
package/templates/teams/be/conventions.md
@@ -0,0 +1,102 @@
1
+ ## API Design
2
+
3
+ - RESTful URL structure: `/resources` (collection), `/resources/:id` (item), `/resources/:id/sub-resources` (nested).
4
+ - Use proper HTTP methods: `GET` (read), `POST` (create), `PUT` (full replace), `PATCH` (partial update), `DELETE` (remove).
5
+ - HTTP status codes must be semantically correct:
6
+ - `200` — Success with body
7
+ - `201` — Created (return the created resource + `Location` header)
8
+ - `204` — Success with no body (DELETE, some PUTs)
9
+ - `400` — Validation error (client sent bad data)
10
+ - `401` — Unauthenticated (no/invalid token)
11
+ - `403` — Forbidden (valid token, insufficient permissions)
12
+ - `404` — Resource not found
13
+ - `409` — Conflict (duplicate, version mismatch)
14
+ - `422` — Business logic rejection (valid data, but operation not allowed)
15
+ - `429` — Rate limited (include `Retry-After` header)
16
+ - `500` — Server error (never return this intentionally — it means you have a bug)
17
+ - Response envelope for collections: `{ data: T[], meta: { total, page, perPage, hasMore } }`.
18
+ - Error response format: `{ error: { code: "VALIDATION_ERROR", message: "...", details: [...] } }`. The `code` is machine-readable, `message` is human-readable.
19
+ - Pagination: cursor-based for large/real-time datasets (encode cursor as opaque base64 string). Offset-based only for small static datasets.
20
+ - Filtering via query params: `?status=active&created_after=2024-01-01`. Complex filters use a filter query language or POST to a search endpoint.
21
+ - Sorting: `?sort=created_at:desc,name:asc`. Default sort must be deterministic (include `id` as tiebreaker).
22
+ - Versioning: URL prefix (`/v1/`) for breaking changes. Adding optional fields is not a breaking change.
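As a rough illustration of the collection envelope described above, the sketch below builds `{ data, meta }` for offset-based pagination. It slices an in-memory array as a stand-in for a database query; `CollectionResponse`, `PageMeta`, and `paginate` are assumed names.

```typescript
interface PageMeta {
  total: number;
  page: number;
  perPage: number;
  hasMore: boolean;
}

interface CollectionResponse<T> {
  data: T[];
  meta: PageMeta;
}

// Offset-based pagination over an in-memory list (stand-in for LIMIT/OFFSET).
function paginate<T>(rows: T[], page: number, perPage: number): CollectionResponse<T> {
  if (!Number.isInteger(page) || page < 1) throw new RangeError("page must be >= 1");
  if (!Number.isInteger(perPage) || perPage < 1) throw new RangeError("perPage must be >= 1");
  const start = (page - 1) * perPage;
  const data = rows.slice(start, start + perPage);
  return {
    data,
    meta: { total: rows.length, page, perPage, hasMore: start + data.length < rows.length },
  };
}

const sampleRows = Array.from({ length: 45 }, (_, i) => ({ id: i + 1 }));
console.log(paginate(sampleRows, 3, 20).meta);
```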
23
+
24
+ ## Database
25
+
26
+ - Every table has: `id` (UUID v7 or ULID — sortable, no sequential guessing), `created_at`, `updated_at` timestamps.
27
+ - Soft deletes: add `deleted_at` column. Never hard-delete unless legally required (GDPR erasure). All queries must filter `WHERE deleted_at IS NULL`.
28
+ - Indexes: create indexes for every column used in `WHERE`, `JOIN`, or `ORDER BY`. Composite indexes follow the left-prefix rule — put high-cardinality columns first.
29
+ - Query complexity: a single API request should execute ≤5 queries. If you need more, you're either missing a JOIN or need to denormalize.
30
+ - Use database transactions for any operation that modifies multiple tables. Scope transactions as narrowly as possible — don't hold locks during HTTP calls.
31
+ - Connection pooling: configure pool size based on `(number_of_cores * 2) + effective_spindle_count`. Typical: 10-20 per service instance. Never use unlimited.
32
+ - Migrations must be backward compatible during deployment. Sequence: add new column → deploy code that writes to both → backfill → deploy code that reads from new → drop old column.
33
+ - Use read replicas for reporting/analytics queries. Write to primary only. Account for replication lag in application code.
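The pool sizing formula above is plain arithmetic, so a worked example may help. The function name and the sample inputs below are assumptions; "effective spindle count" is taken here to mean the number of disks that can serve I/O concurrently.

```typescript
// pool size = (number_of_cores * 2) + effective_spindle_count
function recommendedPoolSize(cores: number, effectiveSpindles: number): number {
  if (!Number.isInteger(cores) || cores < 1) {
    throw new RangeError("cores must be a positive integer");
  }
  if (!Number.isInteger(effectiveSpindles) || effectiveSpindles < 0) {
    throw new RangeError("effectiveSpindles must be a non-negative integer");
  }
  return cores * 2 + effectiveSpindles;
}

// Example: a 4-core database host with a single disk.
console.log(recommendedPoolSize(4, 1));
```

For a 4-core host with one disk the formula gives 9 connections, which is close to the low end of the 10-20 range quoted above. Treat the result as a starting point to validate under load, not a fixed rule.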
34
+
35
+ ## Authentication & Authorization
36
+
37
+ - JWT tokens for API authentication. Short-lived access tokens (15-30 min) + long-lived refresh tokens (7-30 days stored in httpOnly secure cookie).
38
+ - Never store plain-text passwords. Use bcrypt with cost factor ≥12 or Argon2id.
39
+ - Implement RBAC (Role-Based Access Control) at minimum. Use middleware to check permissions before the controller method executes.
40
+ - IDOR prevention: always scope resource queries by the authenticated user's ID/org. Never trust resource IDs from the URL without checking ownership.
41
+ ```
42
+ // BAD: anyone can access any order
43
+ SELECT * FROM orders WHERE id = :orderId
44
+ // GOOD: scoped to user
45
+ SELECT * FROM orders WHERE id = :orderId AND user_id = :currentUserId
46
+ ```
47
+ - Rate limiting tiers: anonymous (60/min), authenticated (300/min), premium (1000/min). Stricter for sensitive endpoints (login: 5/min, password reset: 3/hour).
48
+ - API keys for service-to-service auth. Rotate keys quarterly. Never use API keys for user-facing authentication.
49
+
50
+ ## Error Handling & Resilience
51
+
52
+ - Create a domain error hierarchy. Map domain errors to HTTP status codes in one place (error handler middleware), not in every controller.
53
+ ```typescript
54
+ abstract class DomainError extends Error { abstract readonly statusCode: number; }
+ class NotFoundError extends DomainError { statusCode = 404; }
55
+ class ConflictError extends DomainError { statusCode = 409; }
56
+ class ValidationError extends DomainError { statusCode = 400; }
57
+ ```
58
+ - External service calls: set timeouts (connect: 3s, read: 10s). Implement retries with exponential backoff + jitter (1s, 2s, 4s + random). Use circuit breaker after 5 consecutive failures.
59
+ - Graceful degradation: if a non-critical service (recommendations, analytics) is down, the main functionality should still work. Return cached data or skip the feature.
60
+ - Implement health check endpoints:
61
+ - `/health/live` — process is running (for Kubernetes liveness probe)
62
+ - `/health/ready` — can serve traffic (DB connected, cache available, for readiness probe)
63
+ - Graceful shutdown: on SIGTERM, stop accepting new requests, finish in-flight requests (30s timeout), close connections, then exit.
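The retry rule above (exponential backoff plus jitter) could look roughly like this. It is a minimal sketch: `backoffDelayMs`, `withRetry`, and the 250 ms jitter cap are assumptions, and a circuit breaker is deliberately left out.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt plus random jitter.
// With baseMs = 1000 this yields 1s, 2s, 4s (+ jitter), matching the sequence above.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  maxJitterMs: number = 250,
  random: () => number = Math.random,
): number {
  return baseMs * 2 ** attempt + Math.floor(random() * maxJitterMs);
}

// Runs `op`, retrying up to `maxRetries` times with backoff between attempts.
async function withRetry<T>(
  op: () => Promise<T>,
  maxRetries: number = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break; // out of retries, surface the error
      await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}

console.log([0, 1, 2].map((attempt) => backoffDelayMs(attempt, 1000, 0)));
```

Injecting `random` and `sleep` keeps the helper deterministic in tests. In production code the caller would also want to skip retries for errors that can never succeed, such as validation failures.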
64
+
65
+ ## Observability
66
+
67
+ - Structured JSON logging. Every log entry must include: `timestamp`, `level`, `message`, `request_id`, `service`, `environment`. Optional: `user_id`, `duration_ms`, `error.stack`.
68
+ - Log levels:
69
+ - `ERROR` — something broke, requires investigation. Pages on-call if in production.
70
+ - `WARN` — something unexpected but handled. Rate limit hit, cache miss, slow query.
71
+ - `INFO` — significant business events: user registered, order placed, payment processed.
72
+ - `DEBUG` — development only. Never deploy with DEBUG enabled in production.
73
+ - Distributed tracing: propagate trace context (`traceparent` header) across all service-to-service calls. Every outgoing HTTP/gRPC/queue message carries the trace ID.
74
+ - Metrics to expose: request rate, error rate, latency percentiles (p50, p95, p99), active connections, queue depth, cache hit rate. Use Prometheus format.
75
+ - Alert on symptoms (error rate >1%, p99 latency >2s) not causes. Avoid alert fatigue — every alert must be actionable.
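A structured log entry with the required fields listed above might be produced as follows. This is an illustrative helper, not a real logging library; `logLine` and `LogContext` are assumed names.

```typescript
type LogLevel = "ERROR" | "WARN" | "INFO" | "DEBUG";

interface LogContext {
  requestId: string;
  service: string;
  environment: string;
}

// Serializes one log entry as a single JSON line.
// Optional fields (user_id, duration_ms, ...) go in `extra`; required fields
// are spread last so `extra` cannot overwrite them.
function logLine(
  level: LogLevel,
  message: string,
  ctx: LogContext,
  extra: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({
    ...extra,
    timestamp: now.toISOString(),
    level,
    message,
    request_id: ctx.requestId,
    service: ctx.service,
    environment: ctx.environment,
  });
}

const logCtx: LogContext = { requestId: "req_123", service: "order-service", environment: "staging" };
console.log(logLine("INFO", "order placed", logCtx, { user_id: "u_42", duration_ms: 87 }));
```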
76
+
77
+ ## Caching
78
+
79
+ - Cache strategy decision tree:
80
+ - Data changes rarely + stale data acceptable → **Cache-aside** (check cache → miss → query DB → write cache)
81
+ - Data changes often + stale data unacceptable → **Write-through** (write DB + cache simultaneously)
82
+ - Expensive computation + immutable inputs → **Memoization** with TTL
83
+ - Cache key format: `{service}:{entity}:{id}:{version}` — e.g., `user-service:profile:123:v2`.
84
+ - Always set TTL. Infinite TTL = memory leak. Typical: config data (1h), user profiles (15m), search results (5m).
85
+ - Cache stampede prevention: use probabilistic early expiration or lock-based recomputation. Never let 1000 requests simultaneously recompute the same expired key.
86
+ - Invalidation: prefer event-driven invalidation (on write, publish cache-invalidation event) over TTL-only. TTL is the safety net, not the primary strategy.
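To make the cache-aside flow and the key format above concrete, here is a small in-memory sketch. It stands in for Redis or a similar store; `TtlCache`, `cacheKey`, and `getOrLoad` are assumed names, and stampede protection is not included.

```typescript
// Builds keys in the `{service}:{entity}:{id}:{version}` format.
function cacheKey(service: string, entity: string, id: string, version: string): string {
  return `${service}:${entity}:${id}:${version}`;
}

// Minimal in-memory cache where every entry carries a TTL.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Call this from a write path or an invalidation event handler.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// Cache-aside: check cache, on a miss load from the source and write back.
async function getOrLoad<V>(
  cache: TtlCache<V>,
  key: string,
  ttlMs: number,
  load: () => Promise<V>,
): Promise<V> {
  const hit = cache.get(key);
  if (hit !== undefined) return hit;
  const value = await load();
  cache.set(key, value, ttlMs);
  return value;
}

console.log(cacheKey("user-service", "profile", "123", "v2"));
```

The clock is injected so expiry can be tested without waiting. `invalidate` is the hook for the event-driven invalidation recommended above, with the TTL acting as the safety net.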
87
+
88
+ ## Async Processing
89
+
90
+ - Message/job queue for: email sending, PDF generation, webhook delivery, data exports, image processing — anything that takes >500ms or can fail independently.
91
+ - Jobs must be idempotent. If a job runs twice with the same input, the result must be the same (use idempotency keys).
92
+ - Dead letter queue (DLQ) for failed messages. Monitor DLQ size. Alert if DLQ grows >100 messages.
93
+ - Job processing order: FIFO by default. Priority queues for time-sensitive operations.
94
+ - Implement job progress tracking for long-running operations. Expose status via API endpoint (`GET /jobs/:id/status`).
95
+
96
+ ## Testing
97
+
98
+ - Unit tests: test service layer business logic in isolation. Mock repositories and external services.
99
+ - Integration tests: test API endpoints with a real database (use test containers or in-memory DB). These are your most valuable tests.
100
+ - Contract tests: if consuming/providing APIs between services, use Pact or similar to verify contracts don't break.
101
+ - Load tests: run before every release that touches data path. Baseline: the system must handle 2x current peak traffic.
102
+ - Test database state: each test creates its own data, cleans up after. No shared test data. No test ordering dependencies. Use database transactions that roll back after each test.
package/templates/teams/be/references/api-design.md
@@ -0,0 +1,91 @@
1
+ # API Design
2
+
3
+ ## RESTful URL Structure
4
+
5
+ - Collections: `GET /users`, `POST /users`
6
+ - Items: `GET /users/:id`, `PUT /users/:id`, `PATCH /users/:id`, `DELETE /users/:id`
7
+ - Nested: `GET /users/:id/orders`, `POST /users/:id/orders`
8
+ - Actions (non-CRUD): `POST /orders/:id/cancel`, `POST /users/:id/verify`
9
+
10
+ ## HTTP Status Codes
11
+
12
+ Use semantically correct codes — not just 200 and 500:
13
+
14
+ | Code | Meaning | When to use |
15
+ |------|---------|-------------|
16
+ | 200 | OK | Success with body |
17
+ | 201 | Created | Resource created (+ `Location` header) |
18
+ | 204 | No Content | Success, no body (DELETE, some PUTs) |
19
+ | 400 | Bad Request | Validation error (malformed input) |
20
+ | 401 | Unauthorized | Unauthenticated: no token or invalid token |
21
+ | 403 | Forbidden | Valid token, insufficient permissions |
22
+ | 404 | Not Found | Resource doesn't exist |
23
+ | 409 | Conflict | Duplicate, version mismatch |
24
+ | 422 | Unprocessable Content | Valid data, business logic rejection |
25
+ | 429 | Too Many Requests | Rate limited (include `Retry-After` header) |
26
+ | 500 | Server Error | Never intentional — means you have a bug |
27
+
28
+ ## Response Formats
29
+
30
+ **Collections:**
31
+ ```json
32
+ {
33
+ "data": [...],
34
+ "meta": { "total": 150, "page": 1, "perPage": 20, "hasMore": true }
35
+ }
36
+ ```
37
+
38
+ **Errors:**
39
+ ```json
40
+ {
41
+ "error": {
42
+ "code": "VALIDATION_ERROR",
43
+ "message": "Email is required",
44
+ "details": [
45
+ { "field": "email", "message": "This field is required" }
46
+ ]
47
+ }
48
+ }
49
+ ```
50
+ `code` is machine-readable (for client logic), `message` is human-readable (for display).
51
+
52
+ ## Pagination
53
+
54
+ - **Cursor-based** (recommended for large/real-time data): Opaque cursor token, no count query.
55
+ ```
56
+ GET /posts?cursor=eyJpZCI6MTAwfQ&limit=20
57
+ → { data: [...], meta: { nextCursor: "eyJpZCI6MTIwfQ", hasMore: true } }
58
+ ```
59
+ - **Offset-based** (simple, for small static data): `?page=2&perPage=20`
60
+ - Default sort must be deterministic — include `id` as tiebreaker.
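The cursor tokens in the example above decode to small JSON objects (`eyJpZCI6MTAwfQ` is unpadded base64 for `{"id":100}`). A sketch of encoding, decoding, and paging on `id` follows. The helper names are assumptions, the array stands in for a database query, and `btoa`/`atob` are assumed to be available as globals (browsers and recent Node.js versions).

```typescript
interface PageCursor {
  id: number;
}

// Encodes the cursor as base64 JSON with padding stripped.
function encodeCursor(cursor: PageCursor): string {
  return btoa(JSON.stringify(cursor)).replace(/=+$/, "");
}

// Restores padding, decodes, and validates the shape before trusting it.
function decodeCursor(token: string): PageCursor {
  const padded = token + "=".repeat((4 - (token.length % 4)) % 4);
  const parsed: unknown = JSON.parse(atob(padded));
  if (typeof parsed !== "object" || parsed === null || typeof (parsed as PageCursor).id !== "number") {
    throw new Error("invalid cursor");
  }
  return { id: (parsed as PageCursor).id };
}

// Returns up to `limit` rows with id greater than the cursor, ordered by id.
function pageAfter<T extends { id: number }>(rows: T[], token: string | undefined, limit: number) {
  if (!Number.isInteger(limit) || limit < 1) throw new RangeError("limit must be >= 1");
  const afterId = token === undefined ? -Infinity : decodeCursor(token).id;
  const remaining = [...rows].sort((a, b) => a.id - b.id).filter((row) => row.id > afterId);
  const data = remaining.slice(0, limit);
  const hasMore = remaining.length > limit;
  const nextCursor = hasMore ? encodeCursor({ id: data[data.length - 1].id }) : null;
  return { data, meta: { nextCursor, hasMore } };
}

console.log(encodeCursor({ id: 100 }));
```

Clients should treat the token as opaque even though it is easy to decode. Base64 only hides the structure; it does not protect it, so the server still has to validate the decoded value.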
61
+
62
+ ## Filtering & Sorting
63
+
64
+ ```
65
+ GET /users?status=active&role=admin&created_after=2024-01-01
66
+ GET /users?sort=created_at:desc,name:asc
67
+ ```
68
+
69
+ Complex filters: POST to a search endpoint with filter body, not mega query strings.
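Parsing the `sort` parameter shown above can be sketched like this. The allowlist argument and the function name are assumptions; the appended `id` key implements the deterministic tiebreaker from the pagination section.

```typescript
interface SortKey {
  field: string;
  direction: "asc" | "desc";
}

// Parses "created_at:desc,name:asc" into sort keys, rejecting unknown fields.
// Appends `id` as a tiebreaker when the caller did not sort by it explicitly.
function parseSort(param: string | undefined, allowedFields: readonly string[]): SortKey[] {
  const keys: SortKey[] = [];
  for (const part of (param ?? "").split(",")) {
    if (part === "") continue;
    const [field, dir = "asc"] = part.split(":");
    if (!allowedFields.includes(field)) throw new Error(`cannot sort by "${field}"`);
    if (dir !== "asc" && dir !== "desc") throw new Error(`invalid sort direction "${dir}"`);
    keys.push({ field, direction: dir === "desc" ? "desc" : "asc" });
  }
  if (!keys.some((key) => key.field === "id")) {
    keys.push({ field: "id", direction: "asc" });
  }
  return keys;
}

console.log(parseSort("created_at:desc,name:asc", ["created_at", "name", "id"]));
```

Checking fields against an allowlist matters because sort fields usually end up in an `ORDER BY` clause, where they cannot be passed as bound parameters.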
70
+
71
+ ## Versioning
72
+
73
+ - URL prefix: `/v1/users`, `/v2/users` for breaking changes
74
+ - Adding optional response fields = NOT breaking
75
+ - Removing/renaming fields, changing types, adding required params = BREAKING
76
+
77
+ ## Input Validation
78
+
79
+ Validate everything at the API boundary:
80
+ ```typescript
81
+ const createUserSchema = z.object({
82
+ email: z.string().email().max(255),
83
+ name: z.string().min(1).max(100),
84
+ role: z.enum(['user', 'admin']).default('user'),
85
+ });
86
+ ```
87
+ - String length limits on all fields
88
+ - Enum validation for constrained values
89
+ - Nested object validation
90
+ - Array length limits
91
+ - Reject unknown fields (call `.strict()` on the object schema)
package/templates/teams/be/references/async-processing.md
@@ -0,0 +1,86 @@
1
+ # Async Processing
2
+
3
+ ## When to Use Queues
4
+
5
+ Move to a queue if the operation:
6
+ - Takes > 500ms (email, PDF generation, image processing)
7
+ - Can fail independently (webhook delivery, third-party API calls)
8
+ - Doesn't need an immediate response (analytics, audit logging)
9
+ - Has high throughput bursts (notification fanout, batch imports)
10
+
11
+ ## Job Design Rules
12
+
13
+ 1. **Idempotent**: Running the same job twice with the same input = same result. Use idempotency keys.
14
+ ```typescript
15
+ async function processPayment(jobData: { orderId: string; idempotencyKey: string }) {
16
+ const existing = await db.payments.findByIdempotencyKey(jobData.idempotencyKey);
17
+ if (existing) return existing; // Already processed
18
+ // ... process payment
19
+ }
20
+ ```
21
+
22
+ 2. **Minimal payload**: Store IDs in the job, fetch fresh data in the worker.
23
+ ```typescript
24
+ // GOOD: minimal payload, fresh data
25
+ queue.add('send-invoice', { orderId: 'ord_123' });
26
+
27
+ // BAD: stale data in payload
28
+ queue.add('send-invoice', { order: { ...fullOrderObject } });
29
+ ```
30
+
31
+ 3. **Timeout**: Every job has a maximum execution time. Kill and retry if exceeded.
32
+
33
+ 4. **Progress tracking**: Long-running jobs expose status via API.
34
+ ```
35
+ POST /exports → 202 Accepted { jobId: "job_123" }
36
+ GET /exports/job_123/status → { status: "processing", progress: 45 }
37
+ ```
38
+
39
+ ## Dead Letter Queue (DLQ)
40
+
41
+ - Failed messages (after max retries) go to DLQ, not silently dropped
42
+ - Monitor DLQ size — alert if > 100 messages
43
+ - DLQ messages must be inspectable (view payload, error reason, original timestamp)
44
+ - Process DLQ manually or automatically after fixing the root cause
45
+
46
+ ## Retry Strategy
47
+
48
+ ```
49
+ Attempt 1: immediate
50
+ Attempt 2: 30 seconds
51
+ Attempt 3: 2 minutes
52
+ Attempt 4: 10 minutes
53
+ Attempt 5: 1 hour
54
+ Then → DLQ
55
+ ```
56
+
57
+ - Use exponential backoff — not fixed intervals
58
+ - Add jitter to prevent thundering herd
59
+ - Different retry strategies for different error types:
60
+ - Network timeout → retry immediately
61
+ - Rate limited → respect Retry-After header
62
+ - Validation error → don't retry (fix the data)
63
+ - 500 error → retry with backoff
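The schedule and the per-error rules above can be combined into one decision function. The sketch below is an assumption-laden illustration: the `JobError` shape, `decideRetry`, and the 10% jitter factor are not from the template, and the caller is expected to route the job to the DLQ when `retry` is `false`.

```typescript
type JobError =
  | { kind: "network_timeout" }
  | { kind: "rate_limited"; retryAfterMs: number }
  | { kind: "validation" }
  | { kind: "server_error" };

type RetryDecision = { retry: false } | { retry: true; delayMs: number };

const MAX_ATTEMPTS = 5;

// Delay before attempts 2..5, taken from the schedule above.
const BACKOFF_MS = [30 * 1000, 2 * 60 * 1000, 10 * 60 * 1000, 60 * 60 * 1000];

// `failedAttempt` is the 1-based number of the attempt that just failed.
function decideRetry(
  error: JobError,
  failedAttempt: number,
  random: () => number = Math.random,
): RetryDecision {
  if (!Number.isInteger(failedAttempt) || failedAttempt < 1) {
    throw new RangeError("failedAttempt must be a positive integer");
  }
  if (error.kind === "validation") return { retry: false }; // fix the data instead
  if (failedAttempt >= MAX_ATTEMPTS) return { retry: false }; // exhausted: send to DLQ
  if (error.kind === "network_timeout") return { retry: true, delayMs: 0 };
  if (error.kind === "rate_limited") return { retry: true, delayMs: error.retryAfterMs };
  // server_error: scheduled backoff plus up to 10% jitter (the jitter size is an assumption)
  const base = BACKOFF_MS[failedAttempt - 1];
  return { retry: true, delayMs: base + Math.floor(random() * base * 0.1) };
}

console.log(decideRetry({ kind: "server_error" }, 1, () => 0));
```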
64
+
65
+ ## Queue Patterns
66
+
67
+ **Fan-out**: One event triggers many jobs
68
+ ```
69
+ order.created → [send-email, update-analytics, notify-warehouse, update-search-index]
70
+ ```
71
+
72
+ **Priority queue**: Urgent operations processed first
73
+ ```
74
+ queue.add('process-payment', data, { priority: 1 }); // High
75
+ queue.add('send-marketing-email', data, { priority: 5 }); // Low
76
+ ```
77
+
78
+ **Delayed jobs**: Schedule for later
79
+ ```
80
+ queue.add('send-reminder', data, { delay: 24 * 60 * 60 * 1000 }); // 24h
81
+ ```
82
+
83
+ **Rate-limited processing**: External API limits
84
+ ```
85
+ queue.add('sync-to-crm', data, { limiter: { max: 10, duration: 1000 } }); // 10/sec
86
+ ```
package/templates/teams/be/references/auth-security.md
@@ -0,0 +1,58 @@
1
+ # Authentication & Security
2
+
3
+ ## Authentication
4
+
5
+ - **JWT tokens**: Short-lived access (15-30 min) + long-lived refresh (7-30 days) in httpOnly Secure SameSite=Strict cookie
6
+ - **Password storage**: bcrypt (cost ≥ 12) or Argon2id. Never plain text, never MD5/SHA.
7
+ - **API keys**: Service-to-service auth only. Rotate quarterly. Never for user-facing auth.
8
+ - **OAuth2/OIDC**: For third-party login (Google, GitHub). Use authorization code flow with PKCE.
9
+
10
+ ## Authorization
11
+
12
+ - Implement RBAC at minimum. Check permissions in middleware before controller executes.
13
+ - **IDOR prevention** — always scope queries by authenticated user:
14
+ ```sql
15
+ -- BAD: anyone can access any order
16
+ SELECT * FROM orders WHERE id = :orderId
17
+ -- GOOD: scoped to user
18
+ SELECT * FROM orders WHERE id = :orderId AND user_id = :currentUserId
19
+ ```
20
+ - Test every endpoint: can user A access user B's resources by changing the ID?
21
+
22
+ ## Rate Limiting
23
+
24
+ | Tier | Limit | Use case |
25
+ |------|-------|----------|
26
+ | Anonymous | 60/min per IP | Public endpoints |
27
+ | Authenticated | 300/min per user | Standard API access |
28
+ | Premium | 1000/min per user | Paid tier |
29
+ | Login | 5/min per IP | Brute force prevention |
30
+ | Password reset | 3/hour per email | Abuse prevention |
31
+
32
+ - Return `429` with `Retry-After` header
33
+ - Use a sliding window algorithm (fairer than a fixed window, which allows bursts at window boundaries)
34
+ - Rate limit by user ID for authenticated, by IP for anonymous
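A sliding window limiter can be sketched with a per-key list of request timestamps (the "sliding window log" variant). This in-memory version is only an illustration under that assumption; `SlidingWindowLimiter` is a hypothetical name, and a multi-instance service would need shared storage instead of a local `Map`.

```typescript
type LimitResult = { allowed: true } | { allowed: false; retryAfterSec: number };

class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    if (limit < 1 || windowMs <= 0) throw new RangeError("limit and windowMs must be positive");
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // `key` is the user ID for authenticated callers, the IP for anonymous ones.
  check(key: string, nowMs: number): LimitResult {
    const cutoff = nowMs - this.windowMs;
    // Keep only timestamps that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      // The oldest hit leaves the window first; that is when a slot frees up.
      const retryAfterSec = Math.ceil((recent[0] + this.windowMs - nowMs) / 1000);
      return { allowed: false, retryAfterSec };
    }
    recent.push(nowMs);
    this.hits.set(key, recent);
    return { allowed: true };
  }
}

// Login tier from the table above: 5 requests per minute per IP.
const loginLimiter = new SlidingWindowLimiter(5, 60 * 1000);
console.log(loginLimiter.check("203.0.113.7", Date.now()));
```

When `allowed` is `false`, the handler would respond with `429` and set `Retry-After` to `retryAfterSec`.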
35
+
36
+ ## Input Security
37
+
38
+ - **SQL injection**: Parameterized queries always. Never string concatenation for SQL.
39
+ - **Mass assignment**: Whitelist allowed fields from request body. Never pass raw body to ORM create/update.
40
+ - **Path traversal**: Validate file paths. Never use user input directly in `fs.readFile()` or similar.
41
+ - **ReDoS**: Avoid user-controlled regex patterns. Set timeout on regex matching.
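For the mass-assignment rule above, an allowlist filter is often enough. The helper below is a minimal sketch with an assumed name (`pickAllowed`); it copies only explicitly permitted keys from an untrusted body before the result is handed to an ORM.

```typescript
// Copies only allowlisted own properties from an untrusted request body.
function pickAllowed(body: unknown, allowedFields: readonly string[]): Record<string, unknown> {
  const picked: Record<string, unknown> = {};
  if (typeof body !== "object" || body === null) return picked;
  const source = body as Record<string, unknown>;
  for (const field of allowedFields) {
    if (Object.prototype.hasOwnProperty.call(source, field)) {
      picked[field] = source[field];
    }
  }
  return picked;
}

// A client tries to escalate privileges by sending extra fields.
const untrustedBody = { displayName: "Ada", role: "admin", isVerified: true };
console.log(pickAllowed(untrustedBody, ["displayName", "email"]));
```

Only `displayName` survives; `role` and `isVerified` are dropped because they are not on the allowlist. Values still need type and length validation afterwards, since the filter checks keys only.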
42
+
43
+ ## Headers
44
+
45
+ ```
46
+ Strict-Transport-Security: max-age=31536000; includeSubDomains
47
+ X-Content-Type-Options: nosniff
48
+ X-Frame-Options: DENY
49
+ Content-Security-Policy: default-src 'self'
50
+ X-Request-Id: {unique-id} (for tracing)
51
+ ```
52
+
53
+ ## Secrets Management
54
+
55
+ - Load secrets from a secret manager (Vault, AWS SSM, GCP Secret Manager)
56
+ - Never in code, environment files, or Docker images
57
+ - Rotate on: employee offboarding, suspected compromise, quarterly schedule
58
+ - Use short-lived credentials where possible (IAM roles, temporary tokens)