aigent-team 0.1.0
- package/LICENSE +21 -0
- package/README.md +253 -0
- package/dist/chunk-N3RYHWTR.js +267 -0
- package/dist/cli.js +576 -0
- package/dist/index.d.ts +234 -0
- package/dist/index.js +27 -0
- package/package.json +67 -0
- package/templates/shared/git-workflow.md +44 -0
- package/templates/shared/project-conventions.md +48 -0
- package/templates/teams/ba/agent.yaml +25 -0
- package/templates/teams/ba/references/acceptance-criteria.md +87 -0
- package/templates/teams/ba/references/api-contract-design.md +110 -0
- package/templates/teams/ba/references/requirements-analysis.md +83 -0
- package/templates/teams/ba/references/user-story-mapping.md +73 -0
- package/templates/teams/ba/skill.md +85 -0
- package/templates/teams/be/agent.yaml +34 -0
- package/templates/teams/be/conventions.md +102 -0
- package/templates/teams/be/references/api-design.md +91 -0
- package/templates/teams/be/references/async-processing.md +86 -0
- package/templates/teams/be/references/auth-security.md +58 -0
- package/templates/teams/be/references/caching.md +79 -0
- package/templates/teams/be/references/database.md +65 -0
- package/templates/teams/be/references/error-handling.md +106 -0
- package/templates/teams/be/references/observability.md +83 -0
- package/templates/teams/be/references/review-checklist.md +50 -0
- package/templates/teams/be/references/testing.md +100 -0
- package/templates/teams/be/review-checklist.md +54 -0
- package/templates/teams/be/skill.md +71 -0
- package/templates/teams/devops/agent.yaml +35 -0
- package/templates/teams/devops/conventions.md +133 -0
- package/templates/teams/devops/references/ci-cd.md +218 -0
- package/templates/teams/devops/references/cost-optimization.md +218 -0
- package/templates/teams/devops/references/disaster-recovery.md +199 -0
- package/templates/teams/devops/references/docker.md +237 -0
- package/templates/teams/devops/references/infrastructure-as-code.md +238 -0
- package/templates/teams/devops/references/kubernetes.md +397 -0
- package/templates/teams/devops/references/monitoring.md +224 -0
- package/templates/teams/devops/references/review-checklist.md +149 -0
- package/templates/teams/devops/references/security.md +225 -0
- package/templates/teams/devops/review-checklist.md +72 -0
- package/templates/teams/devops/skill.md +131 -0
- package/templates/teams/fe/agent.yaml +28 -0
- package/templates/teams/fe/conventions.md +80 -0
- package/templates/teams/fe/references/accessibility.md +92 -0
- package/templates/teams/fe/references/component-architecture.md +87 -0
- package/templates/teams/fe/references/css-styling.md +89 -0
- package/templates/teams/fe/references/forms.md +73 -0
- package/templates/teams/fe/references/performance.md +104 -0
- package/templates/teams/fe/references/review-checklist.md +51 -0
- package/templates/teams/fe/references/security.md +90 -0
- package/templates/teams/fe/references/state-management.md +117 -0
- package/templates/teams/fe/references/testing.md +112 -0
- package/templates/teams/fe/review-checklist.md +53 -0
- package/templates/teams/fe/skill.md +68 -0
- package/templates/teams/lead/agent.yaml +18 -0
- package/templates/teams/lead/references/cross-team-coordination.md +68 -0
- package/templates/teams/lead/references/quality-gates.md +64 -0
- package/templates/teams/lead/references/task-decomposition.md +69 -0
- package/templates/teams/lead/skill.md +83 -0
- package/templates/teams/qa/agent.yaml +32 -0
- package/templates/teams/qa/conventions.md +130 -0
- package/templates/teams/qa/references/ci-integration.md +337 -0
- package/templates/teams/qa/references/e2e-testing.md +292 -0
- package/templates/teams/qa/references/mocking.md +249 -0
- package/templates/teams/qa/references/performance-testing.md +288 -0
- package/templates/teams/qa/references/review-checklist.md +143 -0
- package/templates/teams/qa/references/security-testing.md +271 -0
- package/templates/teams/qa/references/test-data.md +275 -0
- package/templates/teams/qa/references/test-strategy.md +192 -0
- package/templates/teams/qa/review-checklist.md +53 -0
- package/templates/teams/qa/skill.md +131 -0
|
@@ -0,0 +1,83 @@
|
|
|
1
|
+
# Requirements Analysis
|
|
2
|
+
|
|
3
|
+
## Analysis Framework
|
|
4
|
+
|
|
5
|
+
### Step 1: Understand the "Why"
|
|
6
|
+
- What business problem does this solve?
|
|
7
|
+
- Who are the users? What are their goals?
|
|
8
|
+
- What happens if we don't build this?
|
|
9
|
+
- How does this fit into the larger product roadmap?
|
|
10
|
+
|
|
11
|
+
### Step 2: Define the Scope
|
|
12
|
+
- **In scope**: Explicit list of what this feature includes
|
|
13
|
+
- **Out of scope**: Explicit list of what it does NOT include (prevents scope creep)
|
|
14
|
+
- **Assumptions**: What we're assuming to be true (validate with stakeholders)
|
|
15
|
+
- **Dependencies**: What must exist before this can be built
|
|
16
|
+
|
|
17
|
+
### Step 3: Identify Actors and Actions
|
|
18
|
+
Map every user role to their interactions:
|
|
19
|
+
|
|
20
|
+
| Actor | Action | Expected outcome |
|-------|--------|------------------|
| Anonymous user | Views product page | Sees price, description, reviews, "Add to cart" |
| Logged-in user | Adds item to cart | Item appears in cart, quantity updated |
| Admin | Views all orders | Paginated list with filters and search |
|
|
25
|
+
|
|
26
|
+
### Step 4: Edge Cases Inventory
|
|
27
|
+
|
|
28
|
+
For every feature, systematically check:
|
|
29
|
+
- **Empty state**: What does the user see when there's no data?
|
|
30
|
+
- **Error state**: What happens when the operation fails? (Network error, validation, server error)
|
|
31
|
+
- **Boundary values**: Max length inputs, zero quantity, negative numbers, very long text
|
|
32
|
+
- **Concurrent access**: Two users editing the same record simultaneously
|
|
33
|
+
- **Permission boundaries**: What happens when an unauthorized user tries to access this?
|
|
34
|
+
- **Data integrity**: What if referenced data is deleted? (User deletes account while order is pending)
|
|
35
|
+
- **Performance**: What if there are 10,000 items? Will the UI still work?
|
|
36
|
+
- **Internationalization**: Special characters, RTL languages, different date formats
|
|
37
|
+
- **Accessibility**: Can this be used with keyboard only? Screen reader?
|
|
38
|
+
|
|
39
|
+
### Step 5: Priority Classification
|
|
40
|
+
|
|
41
|
+
| Priority | Definition | Example |
|----------|------------|---------|
| P0 (Must have) | Feature doesn't work without this | Login form, payment processing |
| P1 (Should have) | Important but has workaround | Search filters, bulk actions |
| P2 (Nice to have) | Improves experience but not critical | Animations, shortcuts |
| P3 (Future) | Deferred to later release | Advanced analytics, AI features |
|
|
47
|
+
|
|
48
|
+
## Requirements Document Template
|
|
49
|
+
|
|
50
|
+
```markdown
|
|
51
|
+
# Feature: [Name]
|
|
52
|
+
|
|
53
|
+
## Overview
|
|
54
|
+
[1-2 sentences: what and why]
|
|
55
|
+
|
|
56
|
+
## Actors
|
|
57
|
+
- [Role 1]: [what they do]
|
|
58
|
+
- [Role 2]: [what they do]
|
|
59
|
+
|
|
60
|
+
## User Stories
|
|
61
|
+
1. As a [role], I want to [action], so that [benefit]
|
|
62
|
+
- AC: Given... When... Then...
|
|
63
|
+
- AC: Given... When... Then...
|
|
64
|
+
|
|
65
|
+
## Data Model Changes
|
|
66
|
+
- New table/field: [description]
|
|
67
|
+
- Modified: [what changes]
|
|
68
|
+
|
|
69
|
+
## API Changes
|
|
70
|
+
- [New/modified endpoints with schemas]
|
|
71
|
+
|
|
72
|
+
## UI Changes
|
|
73
|
+
- [Mockups or descriptions of new screens/components]
|
|
74
|
+
|
|
75
|
+
## Edge Cases
|
|
76
|
+
- [List of identified edge cases with expected behavior]
|
|
77
|
+
|
|
78
|
+
## Open Questions
|
|
79
|
+
- [Things that need stakeholder input]
|
|
80
|
+
|
|
81
|
+
## Out of Scope
|
|
82
|
+
- [Explicitly excluded items]
|
|
83
|
+
```
|
|
@@ -0,0 +1,73 @@
|
|
|
1
|
+
# User Story Mapping
|
|
2
|
+
|
|
3
|
+
## Story Map Structure
|
|
4
|
+
|
|
5
|
+
```
User Activities (high-level goals)
├── Activity 1: Browse Products
│   ├── Task 1.1: View product list
│   ├── Task 1.2: Search products
│   ├── Task 1.3: Filter by category
│   └── Task 1.4: View product detail
├── Activity 2: Purchase
│   ├── Task 2.1: Add to cart
│   ├── Task 2.2: Review cart
│   ├── Task 2.3: Enter shipping
│   ├── Task 2.4: Enter payment
│   └── Task 2.5: Confirm order
└── Activity 3: Manage Account
    ├── Task 3.1: Register
    ├── Task 3.2: Login
    ├── Task 3.3: View orders
    └── Task 3.4: Update profile
```
|
|
24
|
+
|
|
25
|
+
## MVP Slicing
|
|
26
|
+
|
|
27
|
+
Draw a horizontal line across the story map. Everything above = MVP. Everything below = later.
|
|
28
|
+
|
|
29
|
+
**MVP rule**: What is the MINIMUM set of tasks that delivers value to the user?
|
|
30
|
+
|
|
31
|
+
```
|
|
32
|
+
MVP Line
|
|
33
|
+
─────────────────────────────────
|
|
34
|
+
Above: View list, View detail, Add to cart, Basic checkout, Register, Login
|
|
35
|
+
Below: Search, Filters, Wishlist, Reviews, Order history, Profile editing
|
|
36
|
+
```
|
|
37
|
+
|
|
38
|
+
## Prioritization: RICE Framework
|
|
39
|
+
|
|
40
|
+
| Factor | How to score |
|--------|--------------|
| **R**each | How many users will this affect? (per quarter) |
| **I**mpact | How much will this move the metric? (3=massive, 2=high, 1=medium, 0.5=low, 0.25=minimal) |
| **C**onfidence | How sure are we about the estimates? (100%, 80%, 50%) |
| **E**ffort | How many person-weeks? |
|
|
46
|
+
|
|
47
|
+
**RICE Score** = (Reach × Impact × Confidence) / Effort
|
|
48
|
+
|
|
49
|
+
Higher score = higher priority.
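
As a worked example of the formula above, here is a minimal sketch that scores and ranks two backlog items. The `RiceInput` shape, the item names, and all numbers are illustrative assumptions, not part of the template; confidence is expressed as a fraction (80% becomes `0.8`).

```typescript
// Hypothetical shape for a scored backlog item (field names are assumptions).
interface RiceInput {
  reach: number;      // users affected per quarter
  impact: number;     // 3, 2, 1, 0.5, or 0.25
  confidence: number; // fraction between 0 and 1 (80% -> 0.8)
  effort: number;     // person-weeks, must be greater than zero
}

// RICE score = (Reach * Impact * Confidence) / Effort
function riceScore(item: RiceInput): number {
  if (item.effort <= 0) {
    throw new RangeError("effort must be greater than zero");
  }
  return (item.reach * item.impact * item.confidence) / item.effort;
}

// Made-up backlog, sorted so the highest score comes first.
const backlog: Array<RiceInput & { name: string }> = [
  { name: "Wishlist", reach: 500, impact: 0.5, confidence: 0.5, effort: 2 },
  { name: "Search", reach: 2000, impact: 1, confidence: 0.8, effort: 4 },
];
const ranked = [...backlog].sort((a, b) => riceScore(b) - riceScore(a));
```

With these numbers, "Search" scores about 400 and "Wishlist" 62.5, so "Search" ranks first.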
|
|
50
|
+
|
|
51
|
+
## Story Splitting Techniques
|
|
52
|
+
|
|
53
|
+
When a story is too large (> 5 story points):
|
|
54
|
+
|
|
55
|
+
1. **By workflow step**: Split "Checkout" into "Enter shipping" + "Enter payment" + "Confirm order"
|
|
56
|
+
2. **By data variation**: Split "Support all payment methods" into "Credit card" + "PayPal" + "Apple Pay"
|
|
57
|
+
3. **By operation**: Split "Manage users" into "Create" + "Read" + "Update" + "Delete"
|
|
58
|
+
4. **By user role**: Split "Dashboard" into "Admin dashboard" + "User dashboard"
|
|
59
|
+
5. **By happy/sad path**: Split "Login" into "Successful login" + "Failed login + error handling"
|
|
60
|
+
6. **By platform**: Split "Mobile support" into "Responsive design" + "Native features"
|
|
61
|
+
|
|
62
|
+
## Definition of Ready
|
|
63
|
+
|
|
64
|
+
A story is ready for development when:
|
|
65
|
+
- [ ] User story written (As a... I want... So that...)
|
|
66
|
+
- [ ] Acceptance criteria defined (Given/When/Then)
|
|
67
|
+
- [ ] Edge cases identified and documented
|
|
68
|
+
- [ ] API contract proposed (if FE+BE involved)
|
|
69
|
+
- [ ] UI mockups or wireframes available (if FE involved)
|
|
70
|
+
- [ ] Dependencies identified and resolved
|
|
71
|
+
- [ ] Priority assigned (P0-P3)
|
|
72
|
+
- [ ] Size estimated by the team
|
|
73
|
+
- [ ] Open questions answered by stakeholders
|
|
@@ -0,0 +1,85 @@
|
|
|
1
|
+
# BA Agent (Business Analyst)
|
|
2
|
+
|
|
3
|
+
You are a senior business analyst who translates business requirements into precise, testable technical specifications. Your output directly drives what FE, BE, and QA agents build and test.
|
|
4
|
+
|
|
5
|
+
## Core Principles
|
|
6
|
+
|
|
7
|
+
1. **Clarity over completeness**: A clear spec for 80% of cases is more valuable than a vague spec for 100%. Mark unknowns explicitly as "TBD — needs stakeholder input".
|
|
8
|
+
2. **Testable acceptance criteria**: Every criterion must be verifiable by QA. "User-friendly" is not testable. "Form validates email format on blur and shows inline error" is.
|
|
9
|
+
3. **Edge cases are requirements**: The happy path is obvious. Your value is identifying: what happens when the user does X wrong? What if the data doesn't exist? What if two users do this simultaneously?
|
|
10
|
+
4. **API contracts are agreements**: When you define an API contract, FE builds against it and BE implements it. Changing the contract after both start = expensive rework.
|
|
11
|
+
5. **Diagrams > paragraphs**: A data flow diagram communicates system interactions faster than 3 pages of text.
|
|
12
|
+
|
|
13
|
+
## Output Formats
|
|
14
|
+
|
|
15
|
+
### User Stories
|
|
16
|
+
```
|
|
17
|
+
As a [role],
|
|
18
|
+
I want to [action],
|
|
19
|
+
So that [benefit].
|
|
20
|
+
```
|
|
21
|
+
|
|
22
|
+
### Acceptance Criteria (Given/When/Then)
|
|
23
|
+
```
|
|
24
|
+
GIVEN a logged-in user with "admin" role
|
|
25
|
+
WHEN they navigate to /admin/users
|
|
26
|
+
THEN they see a paginated list of all users with name, email, role, and last login
|
|
27
|
+
AND they can search by name or email
|
|
28
|
+
AND they can filter by role
|
|
29
|
+
AND they can sort by any column
|
|
30
|
+
```
|
|
31
|
+
|
|
32
|
+
### API Contract Proposal
|
|
33
|
+
```yaml
POST /api/orders
request:
  body:
    productId: string (required)
    quantity: integer (required, min: 1, max: 100)
    couponCode: string (optional, max: 20 chars)
response:
  201: { data: { id, productId, quantity, total, discount, status, createdAt } }
  400: { error: { code: "VALIDATION_ERROR", details: [...] } }
  404: { error: { code: "PRODUCT_NOT_FOUND" } }
  409: { error: { code: "INSUFFICIENT_STOCK" } }
```
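
To show how the request constraints in the contract above translate into checks, here is a minimal sketch of a validator for the `POST /api/orders` body. The function names and the per-field layout of `details` are assumptions made for illustration; the contract itself only fixes the field rules and the `VALIDATION_ERROR` code.

```typescript
// Assumed layout for one entry of `details` (not specified by the contract).
interface FieldError {
  field: string;
  message: string;
}

// Checks the body against the contract: productId required, quantity an
// integer in 1..100, couponCode optional with at most 20 characters.
function validateCreateOrder(body: unknown): FieldError[] {
  const errors: FieldError[] = [];
  const input = (typeof body === "object" && body !== null ? body : {}) as Record<string, unknown>;

  const productId = input["productId"];
  if (typeof productId !== "string" || productId.length === 0) {
    errors.push({ field: "productId", message: "productId is required" });
  }

  const quantity = input["quantity"];
  if (typeof quantity !== "number" || !Number.isInteger(quantity) || quantity < 1 || quantity > 100) {
    errors.push({ field: "quantity", message: "quantity must be an integer between 1 and 100" });
  }

  const couponCode = input["couponCode"];
  if (couponCode !== undefined && (typeof couponCode !== "string" || couponCode.length > 20)) {
    errors.push({ field: "couponCode", message: "couponCode must be a string of at most 20 characters" });
  }
  return errors;
}

// Wraps the field errors in the 400 response shape from the contract.
function toValidationErrorResponse(details: FieldError[]) {
  return { error: { code: "VALIDATION_ERROR", details } };
}
```

An empty result means the body satisfies the contract; otherwise the caller would respond with `400` and the wrapped error object.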
|
|
46
|
+
|
|
47
|
+
### Data Flow Diagram (Mermaid)
|
|
48
|
+
```mermaid
sequenceDiagram
    User->>FE: Click "Place Order"
    FE->>BE: POST /api/orders
    BE->>DB: Check stock
    BE->>Payment: Charge card
    Payment-->>BE: Confirmation
    BE->>DB: Create order
    BE->>Queue: Send confirmation email
    BE-->>FE: 201 Created
    FE-->>User: Success page
```
|
|
60
|
+
|
|
61
|
+
## Reference Files
|
|
62
|
+
|
|
63
|
+
| Reference | When to read |
|
|
64
|
+
|-----------|-------------|
|
|
65
|
+
| `requirements-analysis.md` | Breaking down a new feature request |
|
|
66
|
+
| `acceptance-criteria.md` | Writing testable acceptance criteria |
|
|
67
|
+
| `api-contract-design.md` | Designing API contracts for FE/BE alignment |
|
|
68
|
+
| `user-story-mapping.md` | Prioritizing stories, MVP scoping |
|
|
69
|
+
|
|
70
|
+
## Workflows
|
|
71
|
+
|
|
72
|
+
### Analyze New Feature
|
|
73
|
+
1. Read the raw requirement (ticket, message, document)
|
|
74
|
+
2. Identify the actors (who uses this?) and their goals
|
|
75
|
+
3. List the user stories (start with happy path, then edge cases)
|
|
76
|
+
4. Write acceptance criteria for each story (Given/When/Then)
|
|
77
|
+
5. Draw a data flow diagram (what systems are involved?)
|
|
78
|
+
6. Propose API contract if FE+BE are both involved
|
|
79
|
+
7. List open questions / assumptions for stakeholder review
|
|
80
|
+
|
|
81
|
+
### Review Existing Specs
|
|
82
|
+
→ Read `references/requirements-analysis.md` for analysis framework
|
|
83
|
+
|
|
84
|
+
### Define API Contract
|
|
85
|
+
→ Read `references/api-contract-design.md` for contract design principles
|
|
@@ -0,0 +1,34 @@
|
|
|
1
|
+
id: be
name: Backend Agent
description: >
  Senior backend engineer agent. Expert in distributed systems, API architecture,
  database optimization, caching, event-driven patterns, and security hardening.
role: be
techStack:
  languages: [TypeScript, Python, Go, Java, Rust]
  frameworks: [NestJS, Express, Fastify, FastAPI, Django, Spring Boot, Gin, Actix]
  libraries: [Prisma, TypeORM, Drizzle, SQLAlchemy, Bull MQ, Redis, Kafka, RabbitMQ, gRPC]
  buildTools: [esbuild, Docker, Gradle, Maven, Make]
tools:
  allowed: [Read, Write, Edit, Bash, Grep, Glob]
globs:
  - "**/*.ts"
  - "**/*.py"
  - "**/*.go"
  - "**/*.java"
  - "**/*.rs"
  - "src/api/**/*"
  - "src/routes/**/*"
  - "src/controllers/**/*"
  - "src/services/**/*"
  - "src/repositories/**/*"
  - "src/models/**/*"
  - "src/entities/**/*"
  - "src/middleware/**/*"
  - "src/jobs/**/*"
  - "src/events/**/*"
  - "prisma/**/*"
  - "migrations/**/*"
sharedKnowledge:
  - project-conventions
  - git-workflow
|
|
@@ -0,0 +1,102 @@
|
|
|
1
|
+
## API Design
|
|
2
|
+
|
|
3
|
+
- RESTful URL structure: `/resources` (collection), `/resources/:id` (item), `/resources/:id/sub-resources` (nested).
|
|
4
|
+
- Use proper HTTP methods: `GET` (read), `POST` (create), `PUT` (full replace), `PATCH` (partial update), `DELETE` (remove).
|
|
5
|
+
- HTTP status codes must be semantically correct:
  - `200`: Success with body
  - `201`: Created (return the created resource + `Location` header)
  - `204`: Success with no body (DELETE, some PUTs)
  - `400`: Validation error (client sent bad data)
  - `401`: Unauthenticated (no/invalid token)
  - `403`: Forbidden (valid token, insufficient permissions)
  - `404`: Resource not found
  - `409`: Conflict (duplicate, version mismatch)
  - `422`: Business logic rejection (valid data, but operation not allowed)
  - `429`: Rate limited (include `Retry-After` header)
  - `500`: Server error (never return this intentionally; it means you have a bug)
|
|
17
|
+
- Response envelope for collections: `{ data: T[], meta: { total, page, perPage, hasMore } }`.
|
|
18
|
+
- Error response format: `{ error: { code: "VALIDATION_ERROR", message: "...", details: [...] } }`. The `code` is machine-readable, `message` is human-readable.
|
|
19
|
+
- Pagination: cursor-based for large/real-time datasets (encode cursor as opaque base64 string). Offset-based only for small static datasets.
|
|
20
|
+
- Filtering via query params: `?status=active&created_after=2024-01-01`. Complex filters use a filter query language or POST to a search endpoint.
|
|
21
|
+
- Sorting: `?sort=created_at:desc,name:asc`. Default sort must be deterministic (include `id` as tiebreaker).
|
|
22
|
+
- Versioning: URL prefix (`/v1/`) for breaking changes. Adding optional fields is not a breaking change.
|
|
23
|
+
|
|
24
|
+
## Database
|
|
25
|
+
|
|
26
|
+
- Every table has: `id` (UUID v7 or ULID — sortable, no sequential guessing), `created_at`, `updated_at` timestamps.
|
|
27
|
+
- Soft deletes: add `deleted_at` column. Never hard-delete unless legally required (GDPR erasure). All queries must filter `WHERE deleted_at IS NULL`.
|
|
28
|
+
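- Indexes: create indexes for columns used in `WHERE`, `JOIN`, or `ORDER BY` by frequently run queries; every index adds write overhead, so drop unused ones. Composite indexes follow the left-prefix rule: put columns filtered by equality first, then range or sort columns, and prefer the higher-cardinality column when two are otherwise equivalent.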
|
|
29
|
+
- Query complexity: a single API request should execute ≤5 queries. If you need more, you're either missing a JOIN or need to denormalize.
|
|
30
|
+
- Use database transactions for any operation that modifies multiple tables. Scope transactions as narrowly as possible — don't hold locks during HTTP calls.
|
|
31
|
+
- Connection pooling: as a starting point, size the pool with `(number_of_cores * 2) + effective_spindle_count`, where the cores and spindles are those of the database server, then tune under load. Typical: 10-20 per service instance. Never use unlimited.
|
|
32
|
+
- Migrations must be backward compatible during deployment. Sequence: add new column → deploy code that writes to both → backfill → deploy code that reads from new → drop old column.
|
|
33
|
+
- Use read replicas for reporting/analytics queries. Write to primary only. Account for replication lag in application code.
|
|
34
|
+
|
|
35
|
+
## Authentication & Authorization
|
|
36
|
+
|
|
37
|
+
- JWT tokens for API authentication. Short-lived access tokens (15-30 min) + long-lived refresh tokens (7-30 days stored in httpOnly secure cookie).
|
|
38
|
+
- Never store plain-text passwords. Use bcrypt with cost factor ≥12 or Argon2id.
|
|
39
|
+
- Implement RBAC (Role-Based Access Control) at minimum. Use middleware to check permissions before the controller method executes.
|
|
40
|
+
- IDOR prevention: always scope resource queries by the authenticated user's ID/org. Never trust resource IDs from the URL without checking ownership.
|
|
41
|
+
  ```sql
  -- BAD: anyone can access any order
  SELECT * FROM orders WHERE id = :orderId;
  -- GOOD: scoped to user
  SELECT * FROM orders WHERE id = :orderId AND user_id = :currentUserId;
  ```
|
|
47
|
+
- Rate limiting tiers: anonymous (60/min), authenticated (300/min), premium (1000/min). Stricter for sensitive endpoints (login: 5/min, password reset: 3/hour).
|
|
48
|
+
- API keys for service-to-service auth. Rotate keys quarterly. Never use API keys for user-facing authentication.
|
|
49
|
+
|
|
50
|
+
## Error Handling & Resilience
|
|
51
|
+
|
|
52
|
+
- Create a domain error hierarchy. Map domain errors to HTTP status codes in one place (error handler middleware), not in every controller.
|
|
53
|
+
```typescript
|
|
54
|
+
class NotFoundError extends DomainError { statusCode = 404; }
|
|
55
|
+
class ConflictError extends DomainError { statusCode = 409; }
|
|
56
|
+
class ValidationError extends DomainError { statusCode = 400; }
|
|
57
|
+
```
|
|
58
|
+
- External service calls: set timeouts (connect: 3s, read: 10s). Implement retries with exponential backoff + jitter (1s, 2s, 4s + random). Use circuit breaker after 5 consecutive failures.
|
|
59
|
+
- Graceful degradation: if a non-critical service (recommendations, analytics) is down, the main functionality should still work. Return cached data or skip the feature.
|
|
60
|
+
- Implement health check endpoints:
  - `/health/live`: process is running (for Kubernetes liveness probe)
  - `/health/ready`: can serve traffic (DB connected, cache available; for readiness probe)
|
|
63
|
+
- Graceful shutdown: on SIGTERM, stop accepting new requests, finish in-flight requests (30s timeout), close connections, then exit.
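
The circuit breaker rule above (open after 5 consecutive failures) can be sketched as follows. This is a simplified illustration, not a production implementation: the class name, the 30-second cool-down before a trial call, and the injectable clock are assumptions added for the example.

```typescript
// Minimal circuit breaker: opens after N consecutive failures and allows a
// trial call once the cool-down has elapsed (cool-down value is an assumption).
class CircuitBreaker {
  private consecutiveFailures = 0;
  private openedAt: number | null = null;
  private readonly failureThreshold: number;
  private readonly resetTimeoutMs: number;
  private readonly now: () => number;

  constructor(failureThreshold = 5, resetTimeoutMs = 30000, now: () => number = Date.now) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  // True when closed, or when open but the cool-down has passed.
  canRequest(): boolean {
    return this.openedAt === null || this.now() - this.openedAt >= this.resetTimeoutMs;
  }

  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.openedAt = null;
  }

  recordFailure(): void {
    this.consecutiveFailures += 1;
    if (this.consecutiveFailures >= this.failureThreshold) {
      this.openedAt = this.now(); // open (or re-open) the circuit
    }
  }

  // Wraps an external call; rejects immediately while the circuit is open.
  async call<T>(operation: () => Promise<T>): Promise<T> {
    if (!this.canRequest()) {
      throw new Error("circuit open: call rejected without contacting the service");
    }
    try {
      const result = await operation();
      this.recordSuccess();
      return result;
    } catch (err) {
      this.recordFailure();
      throw err;
    }
  }
}
```

A caller would typically catch the "circuit open" error and fall back to cached data or skip the feature, in line with the graceful degradation rule.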
|
|
64
|
+
|
|
65
|
+
## Observability
|
|
66
|
+
|
|
67
|
+
- Structured JSON logging. Every log entry must include: `timestamp`, `level`, `message`, `request_id`, `service`, `environment`. Optional: `user_id`, `duration_ms`, `error.stack`.
|
|
68
|
+
- Log levels:
  - `ERROR`: something broke, requires investigation. Pages on-call if in production.
  - `WARN`: something unexpected but handled. Rate limit hit, cache miss, slow query.
  - `INFO`: significant business events: user registered, order placed, payment processed.
  - `DEBUG`: development only. Never deploy with DEBUG enabled in production.
|
|
73
|
+
- Distributed tracing: propagate trace context (`traceparent` header) across all service-to-service calls. Every outgoing HTTP/gRPC/queue message carries the trace ID.
|
|
74
|
+
- Metrics to expose: request rate, error rate, latency percentiles (p50, p95, p99), active connections, queue depth, cache hit rate. Use Prometheus format.
|
|
75
|
+
- Alert on symptoms (error rate >1%, p99 latency >2s) not causes. Avoid alert fatigue — every alert must be actionable.
|
|
76
|
+
|
|
77
|
+
## Caching
|
|
78
|
+
|
|
79
|
+
- Cache strategy decision tree:
  - Data changes rarely + stale data acceptable → **Cache-aside** (check cache → miss → query DB → write cache)
  - Data changes often + stale data unacceptable → **Write-through** (write DB + cache simultaneously)
  - Expensive computation + immutable inputs → **Memoization** with TTL
|
|
83
|
+
- Cache key format: `{service}:{entity}:{id}:{version}` — e.g., `user-service:profile:123:v2`.
|
|
84
|
+
- Always set TTL. Infinite TTL = memory leak. Typical: config data (1h), user profiles (15m), search results (5m).
|
|
85
|
+
- Cache stampede prevention: use probabilistic early expiration or lock-based recomputation. Never let 1000 requests simultaneously recompute the same expired key.
|
|
86
|
+
- Invalidation: prefer event-driven invalidation (on write, publish cache-invalidation event) over TTL-only. TTL is the safety net, not the primary strategy.
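
The cache-aside flow, the key format, and the mandatory TTL described above can be combined in a small sketch. It uses an in-memory `Map` in place of a real cache and a synchronous loader to stay short; `TtlCache`, `cacheKey`, and `getOrLoad` are hypothetical names, and a real service would use an async client such as Redis.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

// In-memory stand-in for a cache server; every entry carries a TTL.
class TtlCache<V> {
  private readonly entries = new Map<string, CacheEntry<V>>();
  private readonly now: () => number;

  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: treat as a miss
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, ttlMs: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlMs });
  }

  // Called on writes so TTL stays the safety net, not the primary strategy.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}

// Key format from the guideline: {service}:{entity}:{id}:{version}
function cacheKey(service: string, entity: string, id: string, version: string): string {
  return `${service}:${entity}:${id}:${version}`;
}

// Cache-aside: check cache -> miss -> load from source -> write cache.
function getOrLoad<V>(cache: TtlCache<V>, key: string, ttlMs: number, load: () => V): V {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const fresh = load();
  cache.set(key, fresh, ttlMs);
  return fresh;
}
```

This sketch does not address stampede prevention; concurrent misses on the same key would each call the loader.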
|
|
87
|
+
|
|
88
|
+
## Async Processing
|
|
89
|
+
|
|
90
|
+
- Message/job queue for: email sending, PDF generation, webhook delivery, data exports, image processing — anything that takes >500ms or can fail independently.
|
|
91
|
+
- Jobs must be idempotent. If a job runs twice with the same input, the result must be the same (use idempotency keys).
|
|
92
|
+
- Dead letter queue (DLQ) for failed messages. Monitor DLQ size. Alert if DLQ grows >100 messages.
|
|
93
|
+
- Job processing order: FIFO by default. Priority queues for time-sensitive operations.
|
|
94
|
+
- Implement job progress tracking for long-running operations. Expose status via API endpoint (`GET /jobs/:id/status`).
|
|
95
|
+
|
|
96
|
+
## Testing
|
|
97
|
+
|
|
98
|
+
- Unit tests: test service layer business logic in isolation. Mock repositories and external services.
|
|
99
|
+
- Integration tests: test API endpoints with a real database (use test containers or in-memory DB). These are your most valuable tests.
|
|
100
|
+
- Contract tests: if consuming/providing APIs between services, use Pact or similar to verify contracts don't break.
|
|
101
|
+
- Load tests: run before every release that touches the data path. Baseline: the system must handle 2x current peak traffic.
|
|
102
|
+
- Test database state: each test creates its own data, cleans up after. No shared test data. No test ordering dependencies. Use database transactions that roll back after each test.
|
|
@@ -0,0 +1,91 @@
|
|
|
1
|
+
# API Design
|
|
2
|
+
|
|
3
|
+
## RESTful URL Structure
|
|
4
|
+
|
|
5
|
+
- Collections: `GET /users`, `POST /users`
|
|
6
|
+
- Items: `GET /users/:id`, `PUT /users/:id`, `PATCH /users/:id`, `DELETE /users/:id`
|
|
7
|
+
- Nested: `GET /users/:id/orders`, `POST /users/:id/orders`
|
|
8
|
+
- Actions (non-CRUD): `POST /orders/:id/cancel`, `POST /users/:id/verify`
|
|
9
|
+
|
|
10
|
+
## HTTP Status Codes
|
|
11
|
+
|
|
12
|
+
Use semantically correct codes — not just 200 and 500:
|
|
13
|
+
|
|
14
|
+
| Code | Meaning | When to use |
|------|---------|-------------|
| 200 | OK | Success with body |
| 201 | Created | Resource created (+ `Location` header) |
| 204 | No Content | Success, no body (DELETE, some PUTs) |
| 400 | Bad Request | Validation error (malformed input) |
| 401 | Unauthorized | Unauthenticated: no token or invalid token |
| 403 | Forbidden | Valid token, insufficient permissions |
| 404 | Not Found | Resource doesn't exist |
| 409 | Conflict | Duplicate, version mismatch |
| 422 | Unprocessable Content | Valid data, business logic rejection |
| 429 | Too Many Requests | Rate limited (include `Retry-After` header) |
| 500 | Internal Server Error | Never intentional; it means you have a bug |
|
|
27
|
+
|
|
28
|
+
## Response Formats
|
|
29
|
+
|
|
30
|
+
**Collections:**
|
|
31
|
+
```json
{
  "data": [...],
  "meta": { "total": 150, "page": 1, "perPage": 20, "hasMore": true }
}
```
|
|
37
|
+
|
|
38
|
+
**Errors:**
|
|
39
|
+
```json
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "Email is required",
    "details": [
      { "field": "email", "message": "This field is required" }
    ]
  }
}
```
|
|
50
|
+
`code` is machine-readable (for client logic), `message` is human-readable (for display).
|
|
51
|
+
|
|
52
|
+
## Pagination
|
|
53
|
+
|
|
54
|
+
- **Cursor-based** (recommended for large/real-time data): Opaque cursor token, no count query.
|
|
55
|
+
```
|
|
56
|
+
GET /posts?cursor=eyJpZCI6MTAwfQ&limit=20
|
|
57
|
+
→ { data: [...], meta: { nextCursor: "eyJpZCI6MTIwfQ", hasMore: true } }
|
|
58
|
+
```
|
|
59
|
+
- **Offset-based** (simple, for small static data): `?page=2&perPage=20`
|
|
60
|
+
- Default sort must be deterministic — include `id` as tiebreaker.
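
A minimal sketch of the opaque cursor idea, assuming the cursor simply wraps the `id` of the last row returned (unpadded base64url of a small JSON object, which is consistent with the `eyJpZCI6MTAwfQ` token in the example above). The function names and the in-memory `paginate` helper are hypothetical; a real endpoint would push the `id > cursor` filter and `LIMIT` into the database query.

```typescript
// Assumed cursor payload: the id of the last row on the previous page.
interface Cursor {
  id: number;
}

function encodeCursor(cursor: Cursor): string {
  // btoa is sufficient here because the JSON payload is ASCII-only.
  return btoa(JSON.stringify(cursor))
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

function decodeCursor(token: string): Cursor {
  const base64 = token.replace(/-/g, "+").replace(/_/g, "/");
  const padded = base64 + "=".repeat((4 - (base64.length % 4)) % 4);
  const parsed: unknown = JSON.parse(atob(padded));
  if (typeof parsed !== "object" || parsed === null || typeof (parsed as Cursor).id !== "number") {
    throw new Error("invalid cursor");
  }
  return { id: (parsed as Cursor).id };
}

// Fetches limit + 1 rows to learn whether another page exists, so no count query is needed.
function paginate<T extends { id: number }>(rows: T[], limit: number, token?: string) {
  if (!Number.isInteger(limit) || limit < 1) throw new RangeError("limit must be a positive integer");
  const afterId = token ? decodeCursor(token).id : -Infinity;
  const sorted = [...rows].sort((a, b) => a.id - b.id); // deterministic order by id
  const slice = sorted.filter((row) => row.id > afterId).slice(0, limit + 1);
  const hasMore = slice.length > limit;
  const data = slice.slice(0, limit);
  const nextCursor = hasMore ? encodeCursor({ id: data[data.length - 1].id }) : null;
  return { data, meta: { nextCursor, hasMore } };
}
```

Clients should treat the token as opaque and pass it back unchanged; encoding it is what lets the server change the cursor contents later without breaking them.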
|
|
61
|
+
|
|
62
|
+
## Filtering & Sorting
|
|
63
|
+
|
|
64
|
+
```
|
|
65
|
+
GET /users?status=active&role=admin&created_after=2024-01-01
|
|
66
|
+
GET /users?sort=created_at:desc,name:asc
|
|
67
|
+
```
|
|
68
|
+
|
|
69
|
+
Complex filters: POST to a search endpoint with filter body, not mega query strings.
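
As an illustration of the `sort` syntax above, here is a hedged sketch of a parser that validates fields against a whitelist and appends `id` as the tiebreaker so the order stays deterministic. `parseSort` and the whitelist parameter are assumptions for this example, not part of any framework.

```typescript
type Direction = "asc" | "desc";

interface SortKey {
  field: string;
  direction: Direction;
}

function isDirection(value: string): value is Direction {
  return value === "asc" || value === "desc";
}

// Parses "created_at:desc,name:asc" into sort keys.
// `allowed` is the caller-supplied whitelist of sortable columns (assumption).
function parseSort(param: string | undefined, allowed: readonly string[]): SortKey[] {
  const keys: SortKey[] = [];
  for (const rawPart of (param ?? "").split(",")) {
    const part = rawPart.trim();
    if (part === "") continue;
    const pieces = part.split(":");
    const field = pieces[0];
    const direction = pieces.length > 1 ? pieces[1] : "asc";
    if (allowed.indexOf(field) === -1) {
      throw new Error(`unsupported sort field: ${field}`);
    }
    if (!isDirection(direction)) {
      throw new Error(`unsupported sort direction: ${direction}`);
    }
    keys.push({ field, direction });
  }
  // Deterministic default: always end with `id` as the tiebreaker.
  if (!keys.some((key) => key.field === "id")) {
    keys.push({ field: "id", direction: "asc" });
  }
  return keys;
}
```

Rejecting unknown fields at this boundary also prevents the sort parameter from being used to probe or order by unindexed columns.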
|
|
70
|
+
|
|
71
|
+
## Versioning
|
|
72
|
+
|
|
73
|
+
- URL prefix: `/v1/users`, `/v2/users` for breaking changes
|
|
74
|
+
- Adding optional response fields = NOT breaking
|
|
75
|
+
- Removing/renaming fields, changing types, adding required params = BREAKING
|
|
76
|
+
|
|
77
|
+
## Input Validation
|
|
78
|
+
|
|
79
|
+
Validate everything at the API boundary:
|
|
80
|
+
```typescript
import { z } from "zod";

const createUserSchema = z.object({
  email: z.string().email().max(255),
  name: z.string().min(1).max(100),
  role: z.enum(['user', 'admin']).default('user'),
});
```
|
|
87
|
+
- String length limits on all fields
|
|
88
|
+
- Enum validation for constrained values
|
|
89
|
+
- Nested object validation
|
|
90
|
+
- Array length limits
|
|
91
|
+
- Reject unknown fields (call `.strict()` on the object schema, e.g. `z.object({ ... }).strict()`)
|
|
@@ -0,0 +1,86 @@
|
|
|
1
|
+
# Async Processing
|
|
2
|
+
|
|
3
|
+
## When to Use Queues
|
|
4
|
+
|
|
5
|
+
Move to a queue if the operation:
|
|
6
|
+
- Takes > 500ms (email, PDF generation, image processing)
|
|
7
|
+
- Can fail independently (webhook delivery, third-party API calls)
|
|
8
|
+
- Doesn't need an immediate response (analytics, audit logging)
|
|
9
|
+
- Has high throughput bursts (notification fanout, batch imports)
|
|
10
|
+
|
|
11
|
+
## Job Design Rules
|
|
12
|
+
|
|
13
|
+
1. **Idempotent**: Running the same job twice with the same input = same result. Use idempotency keys.
|
|
14
|
+
   ```typescript
   async function processPayment(jobData: { orderId: string; idempotencyKey: string }) {
     const existing = await db.payments.findByIdempotencyKey(jobData.idempotencyKey);
     if (existing) return existing; // Already processed
     // ... process payment
   }
   ```
|
|
21
|
+
|
|
22
|
+
2. **Minimal payload**: Store IDs in the job, fetch fresh data in the worker.
|
|
23
|
+
```typescript
|
|
24
|
+
// GOOD: minimal payload, fresh data
|
|
25
|
+
queue.add('send-invoice', { orderId: 'ord_123' });
|
|
26
|
+
|
|
27
|
+
// BAD: stale data in payload
|
|
28
|
+
queue.add('send-invoice', { order: { ...fullOrderObject } });
|
|
29
|
+
```
|
|
30
|
+
|
|
31
|
+
3. **Timeout**: Every job has a maximum execution time. Kill and retry if exceeded.
|
|
32
|
+
|
|
33
|
+
4. **Progress tracking**: Long-running jobs expose status via API.
|
|
34
|
+
```
|
|
35
|
+
POST /exports → 202 Accepted { jobId: "job_123" }
|
|
36
|
+
GET /exports/job_123/status → { status: "processing", progress: 45 }
|
|
37
|
+
```
|
|
38
|
+
|
|
39
|
+
## Dead Letter Queue (DLQ)
|
|
40
|
+
|
|
41
|
+
- Failed messages (after max retries) go to DLQ, not silently dropped
|
|
42
|
+
- Monitor DLQ size — alert if > 100 messages
|
|
43
|
+
- DLQ messages must be inspectable (view payload, error reason, original timestamp)
|
|
44
|
+
- Process DLQ manually or automatically after fixing the root cause
|
|
45
|
+
|
|
46
|
+
## Retry Strategy
|
|
47
|
+
|
|
48
|
+
```
|
|
49
|
+
Attempt 1: immediate
|
|
50
|
+
Attempt 2: 30 seconds
|
|
51
|
+
Attempt 3: 2 minutes
|
|
52
|
+
Attempt 4: 10 minutes
|
|
53
|
+
Attempt 5: 1 hour
|
|
54
|
+
Then → DLQ
|
|
55
|
+
```
|
|
56
|
+
|
|
57
|
+
- Use exponential backoff — not fixed intervals
|
|
58
|
+
- Add jitter to prevent thundering herd
|
|
59
|
+
- Different retry strategies for different error types:
|
|
60
|
+
- Network timeout → retry immediately
|
|
61
|
+
- Rate limited → respect Retry-After header
|
|
62
|
+
- Validation error → don't retry (fix the data)
|
|
63
|
+
- 500 error → retry with backoff
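The rules above can be combined into one decision function. This is a sketch under assumptions: the delay doubles per attempt from a 30-second base and is capped at 1 hour (the schedule above grows faster), jitter uses the "full jitter" variant, and the names `FailureKind`, `decideRetry`, and `backoffWithJitter` are hypothetical:

```typescript
type FailureKind = 'network-timeout' | 'rate-limited' | 'validation' | 'server-error';

interface RetryDecision {
  retry: boolean;
  delayMs: number;
}

const BASE_DELAY_MS = 30000;   // assumed base delay (30 seconds)
const MAX_DELAY_MS = 3600000;  // assumed cap (1 hour)
const MAX_ATTEMPTS = 5;        // after this the job goes to the DLQ

// Full jitter: a random delay in [0, min(cap, base * 2^(attempt - 1))).
// `random` is injectable so the function can be tested deterministically.
function backoffWithJitter(attempt: number, random: () => number = Math.random): number {
  const ceiling = Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt - 1));
  return Math.floor(random() * ceiling);
}

// attemptsMade: how many attempts have already failed (1 after the first failure).
function decideRetry(
  kind: FailureKind,
  attemptsMade: number,
  retryAfterSec?: number,
  random: () => number = Math.random,
): RetryDecision {
  // Validation errors never succeed on retry; exhausted jobs go to the DLQ.
  if (kind === 'validation' || attemptsMade >= MAX_ATTEMPTS) {
    return { retry: false, delayMs: 0 };
  }
  if (kind === 'network-timeout') {
    return { retry: true, delayMs: 0 };
  }
  // Respect the server's Retry-After value when one was provided.
  if (kind === 'rate-limited' && retryAfterSec !== undefined) {
    return { retry: true, delayMs: retryAfterSec * 1000 };
  }
  return { retry: true, delayMs: backoffWithJitter(attemptsMade, random) };
}
```

When `retry` is `false` for a non-validation error, the caller is expected to move the job to the DLQ.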
|
|
64
|
+
|
|
65
|
+
## Queue Patterns
|
|
66
|
+
|
|
67
|
+
**Fan-out**: One event triggers many jobs
|
|
68
|
+
```
|
|
69
|
+
order.created → [send-email, update-analytics, notify-warehouse, update-search-index]
|
|
70
|
+
```
|
|
71
|
+
|
|
72
|
+
**Priority queue**: Urgent operations processed first
|
|
73
|
+
```typescript
|
|
74
|
+
queue.add('process-payment', data, { priority: 1 }); // High
|
|
75
|
+
queue.add('send-marketing-email', data, { priority: 5 }); // Low
|
|
76
|
+
```
|
|
77
|
+
|
|
78
|
+
**Delayed jobs**: Schedule for later
|
|
79
|
+
```typescript
|
|
80
|
+
queue.add('send-reminder', data, { delay: 24 * 60 * 60 * 1000 }); // 24h
|
|
81
|
+
```
|
|
82
|
+
|
|
83
|
+
**Rate-limited processing**: External API limits
|
|
84
|
+
```typescript
|
|
85
|
+
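const crmQueue = new Queue('sync-to-crm', { limiter: { max: 10, duration: 1000 } }); // 10/sec; assumes Bull, where the limiter is a queue option (BullMQ: a worker option), not a per-job option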
|
|
86
|
+
```
|
|
@@ -0,0 +1,58 @@
|
|
|
1
|
+
# Authentication & Security
|
|
2
|
+
|
|
3
|
+
## Authentication
|
|
4
|
+
|
|
5
|
+
- **JWT tokens**: Short-lived access (15-30 min) + long-lived refresh (7-30 days) in httpOnly Secure SameSite=Strict cookie
|
|
6
|
+
- **Password storage**: bcrypt (cost ≥ 12) or Argon2id. Never plain text, never fast general-purpose hashes such as MD5 or the SHA family.
|
|
7
|
+
- **API keys**: Service-to-service auth only. Rotate quarterly. Never for user-facing auth.
|
|
8
|
+
- **OAuth2/OIDC**: For third-party login (Google, GitHub). Use authorization code flow with PKCE.
|
|
9
|
+
|
|
10
|
+
## Authorization
|
|
11
|
+
|
|
12
|
+
- Implement RBAC at minimum. Check permissions in middleware before controller executes.
|
|
13
|
+
- **IDOR prevention** — always scope queries by authenticated user:
|
|
14
|
+
```sql
|
|
15
|
+
-- BAD: anyone can access any order
|
|
16
|
+
SELECT * FROM orders WHERE id = :orderId
|
|
17
|
+
-- GOOD: scoped to user
|
|
18
|
+
SELECT * FROM orders WHERE id = :orderId AND user_id = :currentUserId
|
|
19
|
+
```
|
|
20
|
+
- Test every endpoint: can user A access user B's resources by changing the ID?
|
|
21
|
+
|
|
22
|
+
## Rate Limiting
|
|
23
|
+
|
|
24
|
+
| Tier | Limit | Use case |
|
|
25
|
+
|------|-------|----------|
|
|
26
|
+
| Anonymous | 60/min per IP | Public endpoints |
|
|
27
|
+
| Authenticated | 300/min per user | Standard API access |
|
|
28
|
+
| Premium | 1000/min per user | Paid tier |
|
|
29
|
+
| Login | 5/min per IP | Brute force prevention |
|
|
30
|
+
| Password reset | 3/hour per email | Abuse prevention |
|
|
31
|
+
|
|
32
|
+
- Return `429` with `Retry-After` header
|
|
33
|
+
- Use a sliding window algorithm (fairer than a fixed window, which allows bursts at window boundaries)
|
|
34
|
+
- Rate limit by user ID for authenticated, by IP for anonymous
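A sliding window limiter can be sketched as follows. This is the "sliding window log" variant held in process memory, which assumes a single instance; `SlidingWindowLimiter` and `check` are hypothetical names, and a multi-instance service would need shared storage instead:

```typescript
interface LimitResult {
  allowed: boolean;
  retryAfterSec: number; // value for the Retry-After header when denied
}

class SlidingWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  // Timestamps of accepted requests, per key (user ID or IP address).
  private readonly hits = new Map<string, number[]>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(key: string, nowMs: number = Date.now()): LimitResult {
    const windowStart = nowMs - this.windowMs;
    // Keep only requests that still fall inside the window.
    const recent = (this.hits.get(key) ?? []).filter((t) => t > windowStart);

    if (recent.length >= this.limit) {
      this.hits.set(key, recent);
      // The oldest request leaving the window frees the next slot.
      const oldest = recent[0] ?? nowMs;
      const retryAfterMs = oldest + this.windowMs - nowMs;
      return { allowed: false, retryAfterSec: Math.ceil(retryAfterMs / 1000) };
    }

    recent.push(nowMs);
    this.hits.set(key, recent);
    return { allowed: true, retryAfterSec: 0 };
  }
}
```

For the login tier in the table, `new SlidingWindowLimiter(5, 60000)` keyed by IP would allow five attempts per minute; a denied result maps to a `429` response with `Retry-After: retryAfterSec`.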
|
|
35
|
+
|
|
36
|
+
## Input Security
|
|
37
|
+
|
|
38
|
+
- **SQL injection**: Parameterized queries always. Never string concatenation for SQL.
|
|
39
|
+
- **Mass assignment**: Whitelist allowed fields from request body. Never pass raw body to ORM create/update.
|
|
40
|
+
- **Path traversal**: Validate file paths. Never use user input directly in `fs.readFile()` or similar.
|
|
41
|
+
- **ReDoS**: Avoid user-controlled regex patterns. Set timeout on regex matching.
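The mass-assignment rule can be illustrated with a small allowlist helper. This is a sketch; `pickAllowedFields` and the field names are hypothetical, and many validation libraries offer an equivalent:

```typescript
// Copy only explicitly allowed fields from an untrusted request body.
function pickAllowedFields(
  body: Record<string, unknown>,
  allowed: readonly string[],
): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const field of allowed) {
    // Own properties only, so inherited keys are never copied.
    if (Object.prototype.hasOwnProperty.call(body, field)) {
      result[field] = body[field];
    }
  }
  return result;
}

// Hypothetical allowlist for a profile update endpoint.
const USER_UPDATE_FIELDS = ['displayName', 'email'] as const;

// An attacker adds `role` and `isVerified` to the request body.
const rawBody: Record<string, unknown> = {
  displayName: 'Ada',
  email: 'ada@example.com',
  role: 'admin',
  isVerified: true,
};

// Only displayName and email survive; pass this to the ORM, not rawBody.
const safeUpdate = pickAllowedFields(rawBody, USER_UPDATE_FIELDS);
```

Keeping the allowlist next to the endpoint definition makes it clear which fields a client may change.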
|
|
42
|
+
|
|
43
|
+
## Headers
|
|
44
|
+
|
|
45
|
+
```
|
|
46
|
+
Strict-Transport-Security: max-age=31536000; includeSubDomains
|
|
47
|
+
X-Content-Type-Options: nosniff
|
|
48
|
+
X-Frame-Options: DENY
|
|
49
|
+
Content-Security-Policy: default-src 'self'
|
|
50
|
+
X-Request-Id: {unique-id} (for tracing)
|
|
51
|
+
```
|
|
52
|
+
|
|
53
|
+
## Secrets Management
|
|
54
|
+
|
|
55
|
+
- Secrets from secret manager (Vault, AWS SSM, GCP Secret Manager)
|
|
56
|
+
- Never in code, environment files, or Docker images
|
|
57
|
+
- Rotate on: employee offboarding, suspected compromise, quarterly schedule
|
|
58
|
+
- Use short-lived credentials where possible (IAM roles, temporary tokens)
|