@kodrunhq/opencode-autopilot 1.12.2 → 1.14.0
- package/assets/commands/oc-brainstorm.md +1 -0
- package/assets/commands/oc-new-agent.md +1 -0
- package/assets/commands/oc-new-command.md +1 -0
- package/assets/commands/oc-new-skill.md +1 -0
- package/assets/commands/oc-quick.md +1 -0
- package/assets/commands/oc-refactor.md +26 -0
- package/assets/commands/oc-review-agents.md +1 -0
- package/assets/commands/oc-review-pr.md +1 -0
- package/assets/commands/oc-security-audit.md +20 -0
- package/assets/commands/oc-stocktake.md +1 -0
- package/assets/commands/oc-tdd.md +1 -0
- package/assets/commands/oc-update-docs.md +1 -0
- package/assets/commands/oc-write-plan.md +1 -0
- package/assets/skills/api-design/SKILL.md +391 -0
- package/assets/skills/brainstorming/SKILL.md +1 -0
- package/assets/skills/code-review/SKILL.md +1 -0
- package/assets/skills/coding-standards/SKILL.md +1 -0
- package/assets/skills/csharp-patterns/SKILL.md +1 -0
- package/assets/skills/database-patterns/SKILL.md +270 -0
- package/assets/skills/docker-deployment/SKILL.md +326 -0
- package/assets/skills/e2e-testing/SKILL.md +1 -0
- package/assets/skills/frontend-design/SKILL.md +1 -0
- package/assets/skills/git-worktrees/SKILL.md +1 -0
- package/assets/skills/go-patterns/SKILL.md +1 -0
- package/assets/skills/java-patterns/SKILL.md +1 -0
- package/assets/skills/plan-executing/SKILL.md +1 -0
- package/assets/skills/plan-writing/SKILL.md +1 -0
- package/assets/skills/python-patterns/SKILL.md +1 -0
- package/assets/skills/rust-patterns/SKILL.md +1 -0
- package/assets/skills/security-patterns/SKILL.md +312 -0
- package/assets/skills/strategic-compaction/SKILL.md +1 -0
- package/assets/skills/systematic-debugging/SKILL.md +1 -0
- package/assets/skills/tdd-workflow/SKILL.md +1 -0
- package/assets/skills/typescript-patterns/SKILL.md +1 -0
- package/assets/skills/verification/SKILL.md +1 -0
- package/package.json +1 -1
- package/src/agents/db-specialist.ts +295 -0
- package/src/agents/devops.ts +352 -0
- package/src/agents/frontend-engineer.ts +541 -0
- package/src/agents/index.ts +12 -0
- package/src/agents/security-auditor.ts +348 -0
- package/src/hooks/anti-slop.ts +40 -1
- package/src/hooks/slop-patterns.ts +24 -4
- package/src/installer.ts +29 -2
- package/src/memory/capture.ts +9 -4
- package/src/memory/decay.ts +11 -0
- package/src/memory/retrieval.ts +31 -2
- package/src/orchestrator/artifacts.ts +7 -2
- package/src/orchestrator/confidence.ts +3 -2
- package/src/orchestrator/handlers/architect.ts +11 -8
- package/src/orchestrator/handlers/build.ts +12 -10
- package/src/orchestrator/handlers/challenge.ts +9 -3
- package/src/orchestrator/handlers/plan.ts +5 -4
- package/src/orchestrator/handlers/recon.ts +9 -4
- package/src/orchestrator/handlers/retrospective.ts +3 -1
- package/src/orchestrator/handlers/ship.ts +8 -7
- package/src/orchestrator/handlers/types.ts +1 -0
- package/src/orchestrator/lesson-memory.ts +2 -1
- package/src/orchestrator/orchestration-logger.ts +40 -0
- package/src/orchestrator/phase.ts +14 -0
- package/src/orchestrator/schemas.ts +1 -0
- package/src/orchestrator/skill-injection.ts +11 -6
- package/src/orchestrator/state.ts +2 -1
- package/src/review/selection.ts +4 -32
- package/src/skills/adaptive-injector.ts +96 -5
- package/src/skills/loader.ts +4 -1
- package/src/tools/orchestrate.ts +141 -18
- package/src/tools/review.ts +2 -1
|
@@ -0,0 +1,312 @@
|
|
|
1
|
+
---
|
|
2
|
+
# opencode-autopilot
|
|
3
|
+
name: security-patterns
|
|
4
|
+
description: OWASP Top 10 security patterns, authentication, authorization, input validation, secret management, and secure coding practices
|
|
5
|
+
stacks: []
|
|
6
|
+
requires: []
|
|
7
|
+
---
|
|
8
|
+
|
|
9
|
+
# Security Patterns
|
|
10
|
+
|
|
11
|
+
Actionable security patterns for building, reviewing, and hardening applications. Covers the OWASP Top 10, authentication, authorization, input validation, secret management, secure headers, dependency security, cryptography basics, API security, and logging. Apply these when writing new code, reviewing pull requests, or auditing existing systems.
|
|
12
|
+
|
|
13
|
+
## 1. Injection Prevention (OWASP A03)
|
|
14
|
+
|
|
15
|
+
**DO:** Use parameterized queries and prepared statements for all database interactions. Never concatenate user input into queries.
|
|
16
|
+
|
|
17
|
+
```sql
|
|
18
|
+
-- DO: Parameterized query
|
|
19
|
+
SELECT * FROM users WHERE email = ? AND status = ?
|
|
20
|
+
|
|
21
|
+
-- DON'T: String concatenation
|
|
22
|
+
SELECT * FROM users WHERE email = '" + userInput + "' AND status = 'active'
|
|
23
|
+
```
|
|
24
|
+
|
|
25
|
+
- Use ORM query builders with bound parameters
|
|
26
|
+
- Apply the same principle to LDAP, OS commands, and XML parsers
|
|
27
|
+
- Use allowlists for dynamic column/table names (never interpolate directly)
|
|
28
|
+
|
|
29
|
+
**DON'T:**
|
|
30
|
+
|
|
31
|
+
- Build SQL strings with template literals or concatenation
|
|
32
|
+
- Trust "sanitized" input as a substitute for parameterization
|
|
33
|
+
- Use dynamic code evaluation with user-controlled input
|
|
34
|
+
- Pass user input directly to shell commands -- use argument arrays instead:
|
|
35
|
+
```
|
|
36
|
+
// DO: Argument array (no shell interpretation)
|
|
37
|
+
spawn("convert", [inputFile, "-resize", "200x200", outputFile])
|
|
38
|
+
|
|
39
|
+
// DON'T: Shell string (command injection risk)
|
|
40
|
+
runShellCommand("convert " + inputFile + " -resize 200x200 " + outputFile)
|
|
41
|
+
```
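The allowlist rule for dynamic column names can be sketched as follows. This is a minimal illustration, not part of the skill itself: `SORTABLE_COLUMNS`, `buildUserQuery`, and the `users` columns are hypothetical names, and the `?` placeholder style is assumed to match the database driver in use.

```typescript
// Hypothetical allowlist: the only columns a caller may sort by.
const SORTABLE_COLUMNS = ["created_at", "email", "status"] as const;
type SortableColumn = (typeof SORTABLE_COLUMNS)[number];

function isSortableColumn(value: string): value is SortableColumn {
  return (SORTABLE_COLUMNS as readonly string[]).includes(value);
}

interface Query {
  text: string;
  values: string[];
}

// Only an allowlisted identifier is interpolated into the SQL text;
// the user-supplied value travels separately as a bound parameter.
function buildUserQuery(status: string, sortBy: string): Query {
  if (!isSortableColumn(sortBy)) {
    throw new Error("Unsupported sort column");
  }
  return {
    text: `SELECT id, email FROM users WHERE status = ? ORDER BY ${sortBy}`,
    values: [status],
  };
}
```

Anything outside the allowlist is rejected before a query string exists, so there is no escaping logic to get wrong.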
|
|
42
|
+
|
|
43
|
+
## 2. Authentication Patterns
|
|
44
|
+
|
|
45
|
+
**DO:** Use proven authentication libraries and standards. Never roll your own crypto or session management.
|
|
46
|
+
|
|
47
|
+
- **JWT best practices:**
|
|
48
|
+
- Use short-lived access tokens (5-15 minutes) with refresh token rotation
|
|
49
|
+
- Validate `iss`, `aud`, `exp`, and `nbf` claims on every request
|
|
50
|
+
- Use asymmetric signing (RS256/ES256) for distributed systems; symmetric (HS256) only for single-service
|
|
51
|
+
- Store refresh tokens server-side (database or Redis) with revocation support
|
|
52
|
+
- Never store JWTs in `localStorage` -- use `httpOnly` cookies
|
|
53
|
+
|
|
54
|
+
- **Session management:**
|
|
55
|
+
- Regenerate session ID after login (prevent session fixation)
|
|
56
|
+
- Set absolute session timeout (e.g., 8 hours) and idle timeout (e.g., 30 minutes)
|
|
57
|
+
- Invalidate sessions on password change and logout
|
|
58
|
+
- Store sessions server-side; the cookie holds only the session ID
|
|
59
|
+
|
|
60
|
+
- **Password handling:**
|
|
61
|
+
- Hash with bcrypt (cost factor 12+), scrypt, or Argon2id -- never MD5 or SHA-256 alone
|
|
62
|
+
- Enforce a minimum length (12+ characters); if you cap the length, set the maximum no lower than 128 characters
|
|
63
|
+
- Check against breached password databases (Have I Been Pwned API)
|
|
64
|
+
- Use constant-time comparison for password verification
|
|
65
|
+
|
|
66
|
+
**DON'T:**
|
|
67
|
+
|
|
68
|
+
- Store passwords in plaintext or with reversible encryption
|
|
69
|
+
- Implement custom JWT libraries -- use well-maintained ones (jose, jsonwebtoken)
|
|
70
|
+
- Send tokens in URL query parameters (logged in server logs, browser history, referrer headers)
|
|
71
|
+
- Use predictable session IDs or sequential tokens
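The absolute and idle timeouts from the session management list could be enforced along these lines. This is a sketch under assumptions: the `Session` shape and function names are hypothetical, and the 8 hour and 30 minute values are the example figures given above, not required settings.

```typescript
interface Session {
  createdAtMs: number;
  lastSeenAtMs: number;
}

const ABSOLUTE_TIMEOUT_MS = 8 * 60 * 60 * 1000; // 8 hours
const IDLE_TIMEOUT_MS = 30 * 60 * 1000; // 30 minutes

// A session is usable only while it is inside both the absolute and idle windows.
function isSessionActive(session: Session, nowMs: number): boolean {
  const withinAbsolute = nowMs - session.createdAtMs < ABSOLUTE_TIMEOUT_MS;
  const withinIdle = nowMs - session.lastSeenAtMs < IDLE_TIMEOUT_MS;
  return withinAbsolute && withinIdle;
}

// Refreshes the idle timer on activity. Returns null when the session has
// expired so the caller can destroy the server-side record.
function touchSession(session: Session, nowMs: number): Session | null {
  if (!isSessionActive(session, nowMs)) return null;
  return { ...session, lastSeenAtMs: nowMs };
}
```

Activity only moves `lastSeenAtMs`, so a busy session still ends when the absolute timeout is reached.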
|
|
72
|
+
|
|
73
|
+
## 3. Authorization (OWASP A01)
|
|
74
|
+
|
|
75
|
+
**DO:** Enforce authorization on every request, server-side. Never rely on client-side checks alone.
|
|
76
|
+
|
|
77
|
+
- **RBAC (Role-Based Access Control):**
|
|
78
|
+
```
|
|
79
|
+
// Middleware checks role before handler runs
|
|
80
|
+
authorize(["admin", "manager"])
|
|
81
|
+
function deleteUser(userId) { ... }
|
|
82
|
+
```
|
|
83
|
+
|
|
84
|
+
- **ABAC (Attribute-Based Access Control):**
|
|
85
|
+
```
|
|
86
|
+
// Policy: user can edit only their own posts, admins can edit any
|
|
87
|
+
function canEditPost(user, post) {
|
|
88
|
+
return user.role === "admin" || post.authorId === user.id
|
|
89
|
+
}
|
|
90
|
+
```
|
|
91
|
+
|
|
92
|
+
- Check ownership on every resource access (IDOR prevention):
|
|
93
|
+
```
|
|
94
|
+
// DO: Verify ownership
|
|
95
|
+
post = await getPost(postId)
|
|
96
|
+
if (post.authorId !== currentUser.id && !currentUser.isAdmin) {
|
|
97
|
+
throw new ForbiddenError()
|
|
98
|
+
}
|
|
99
|
+
|
|
100
|
+
// DON'T: Trust that the user only accesses their own resources
|
|
101
|
+
post = await getPost(postId) // No ownership check
|
|
102
|
+
```
|
|
103
|
+
|
|
104
|
+
- Apply the principle of least privilege -- default deny, explicitly grant
|
|
105
|
+
- Log all authorization failures for monitoring
|
|
106
|
+
|
|
107
|
+
**DON'T:**
|
|
108
|
+
|
|
109
|
+
- Hide UI elements as a security measure (security by obscurity)
|
|
110
|
+
- Use sequential/guessable IDs for sensitive resources -- use UUIDs
|
|
111
|
+
- Check permissions only at the UI layer
|
|
112
|
+
- Grant broad roles when narrow permissions suffice
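Default deny with explicit grants, combined with an ownership check, might look like the sketch below. The roles, actions, and the `PERMISSIONS` map are hypothetical; in particular, letting `manager` edit any post is an assumption made for this illustration and not something the policy above states.

```typescript
type Role = "admin" | "manager" | "member";
type Action = "post:read" | "post:edit" | "user:delete";

// Hypothetical role-to-permission map. Anything not listed is denied.
const PERMISSIONS: Record<Role, readonly Action[]> = {
  admin: ["post:read", "post:edit", "user:delete"],
  manager: ["post:read", "post:edit"],
  member: ["post:read"],
};

interface User {
  id: string;
  role: Role;
}

interface Post {
  id: string;
  authorId: string;
}

// RBAC: the role must explicitly list the action.
function hasPermission(user: User, action: Action): boolean {
  return PERMISSIONS[user.role].includes(action);
}

// ABAC-style rule: authors may edit their own posts; roles holding
// post:edit may edit any post.
function canEditPost(user: User, post: Post): boolean {
  return hasPermission(user, "post:edit") || post.authorId === user.id;
}
```

Both checks belong on the server, in the request path, before the handler touches the resource.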
|
|
113
|
+
|
|
114
|
+
## 4. Cross-Site Scripting Prevention (OWASP A03)
|
|
115
|
+
|
|
116
|
+
**DO:** Escape all output by default. Use context-aware encoding.
|
|
117
|
+
|
|
118
|
+
- Use framework auto-escaping (React JSX, Vue templates, Angular binding)
|
|
119
|
+
- Sanitize HTML when rich text is required (use libraries like DOMPurify or sanitize-html)
|
|
120
|
+
- Use `textContent` instead of `innerHTML` for dynamic text
|
|
121
|
+
- Apply Content Security Policy headers (see Section 7)
|
|
122
|
+
|
|
123
|
+
**DON'T:**
|
|
124
|
+
|
|
125
|
+
- Use raw HTML injection APIs (`dangerouslySetInnerHTML` in React, `v-html` in Vue) with user-supplied content
|
|
126
|
+
- Insert user data into script tags, event handlers, or `href="javascript:..."`
|
|
127
|
+
- Trust server-side sanitization alone -- defense in depth means escaping at every layer
|
|
128
|
+
- Disable framework auto-escaping without explicit justification
|
|
129
|
+
|
|
130
|
+
## 5. Cross-Site Request Forgery Prevention (OWASP A01)
|
|
131
|
+
|
|
132
|
+
**DO:** Protect state-changing operations with anti-CSRF tokens.
|
|
133
|
+
|
|
134
|
+
- Use the synchronizer token pattern (server-generated, per-session or per-request)
|
|
135
|
+
- For SPAs: use the double-submit cookie pattern or custom request headers
|
|
136
|
+
- Set `SameSite=Lax` or `SameSite=Strict` on session cookies
|
|
137
|
+
- Verify `Origin` and `Referer` headers as an additional layer
|
|
138
|
+
|
|
139
|
+
**DON'T:**
|
|
140
|
+
|
|
141
|
+
- Rely solely on `SameSite` cookies (older browsers may not support it)
|
|
142
|
+
- Use GET requests for state-changing operations
|
|
143
|
+
- Accept CSRF tokens in query parameters (leaks via referrer)
|
|
144
|
+
|
|
145
|
+
## 6. Server-Side Request Forgery Prevention (OWASP A10)
|
|
146
|
+
|
|
147
|
+
**DO:** Validate and restrict all server-initiated outbound requests.
|
|
148
|
+
|
|
149
|
+
- Maintain an allowlist of permitted hostnames or URL patterns
|
|
150
|
+
- Block requests to private/internal IP ranges (10.x, 172.16-31.x, 192.168.x, 127.x, ::1) and to link-local addresses (169.254.x, which includes cloud metadata endpoints)
|
|
151
|
+
- Use a dedicated HTTP client with timeout, redirect limits, and DNS rebinding protection
|
|
152
|
+
- Resolve DNS and validate the IP before connecting (prevent DNS rebinding)
|
|
153
|
+
|
|
154
|
+
**DON'T:**
|
|
155
|
+
|
|
156
|
+
- Allow user-controlled URLs to reach internal services
|
|
157
|
+
- Follow redirects blindly from user-provided URLs
|
|
158
|
+
- Trust URL parsing alone -- resolve and check the actual IP address
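A check of a resolved IPv4 address against the blocked ranges could be sketched like this. The function names are hypothetical, IPv6 handling is omitted, and the link-local check (169.254.x) is included on the assumption that cloud metadata endpoints should also be unreachable. The input is meant to be the address obtained from DNS resolution, not the hostname from the URL.

```typescript
// Parses a dotted-quad IPv4 string into four octets, or returns null.
function parseIPv4(address: string): number[] | null {
  const parts = address.split(".");
  if (parts.length !== 4) return null;
  const octets: number[] = [];
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const value = Number(part);
    if (value > 255) return null;
    octets.push(value);
  }
  return octets;
}

// True when a server-initiated request must not connect to this address.
// Unparseable input is treated as blocked (default deny).
function isBlockedIPv4(address: string): boolean {
  const octets = parseIPv4(address);
  if (octets === null) return true;
  const [a, b] = octets;
  return (
    a === 10 || // 10.0.0.0/8
    a === 127 || // loopback
    (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
    (a === 192 && b === 168) || // 192.168.0.0/16
    (a === 169 && b === 254) // link-local
  );
}
```

To guard against DNS rebinding, the connection should then be made to the exact address that passed this check rather than resolving the hostname a second time.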
|
|
159
|
+
|
|
160
|
+
## 7. Secure Headers
|
|
161
|
+
|
|
162
|
+
**DO:** Set security headers on all HTTP responses.
|
|
163
|
+
|
|
164
|
+
```
|
|
165
|
+
Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self' 'unsafe-inline'; img-src 'self' data: https:; frame-ancestors 'none'
|
|
166
|
+
Strict-Transport-Security: max-age=63072000; includeSubDomains; preload
|
|
167
|
+
X-Content-Type-Options: nosniff
|
|
168
|
+
X-Frame-Options: DENY
|
|
169
|
+
Referrer-Policy: strict-origin-when-cross-origin
|
|
170
|
+
Permissions-Policy: camera=(), microphone=(), geolocation=()
|
|
171
|
+
```
|
|
172
|
+
|
|
173
|
+
- Start with a strict CSP and loosen only as needed
|
|
174
|
+
- Use `nonce` or `hash` for inline scripts instead of `'unsafe-inline'`
|
|
175
|
+
- Enable HSTS preloading for production domains
|
|
176
|
+
- Set `X-Frame-Options: DENY` unless embedding is required
|
|
177
|
+
|
|
178
|
+
**DON'T:**
|
|
179
|
+
|
|
180
|
+
- Use `'unsafe-eval'` in CSP (enables XSS via code evaluation)
|
|
181
|
+
- Skip HSTS on HTTPS-only sites
|
|
182
|
+
- Set permissive CORS (`Access-Control-Allow-Origin: *`) on authenticated endpoints
|
|
183
|
+
|
|
184
|
+
## 8. Input Validation and Sanitization
|
|
185
|
+
|
|
186
|
+
**DO:** Validate all input at system boundaries. Reject invalid input before processing.
|
|
187
|
+
|
|
188
|
+
- Use schema validation (Zod, Joi, JSON Schema) for structured input
|
|
189
|
+
- Validate type, length, range, and format
|
|
190
|
+
- Use allowlists over blocklists for security-sensitive fields
|
|
191
|
+
- Sanitize for the output context (HTML-encode for HTML, parameterize for SQL)
|
|
192
|
+
- Validate file uploads: check MIME type, file extension, file size, and magic bytes
|
|
193
|
+
|
|
194
|
+
**DON'T:**
|
|
195
|
+
|
|
196
|
+
- Trust `Content-Type` headers alone for file type validation
|
|
197
|
+
- Use regex-only validation for complex formats (emails, URLs) -- use dedicated parsers
|
|
198
|
+
- Validate on the client only -- always re-validate server-side
|
|
199
|
+
- Accept unbounded input (always set maximum lengths)
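Validating type, length, range, and format at a boundary can be sketched without dependencies as below; a schema library such as Zod or Joi would express the same checks declaratively. The `SignupInput` fields and their limits are hypothetical, and the `@` test is only a placeholder for a dedicated email parser.

```typescript
interface SignupInput {
  email: string;
  displayName: string;
  age: number;
}

type ValidationResult =
  | { ok: true; value: SignupInput }
  | { ok: false; errors: string[] };

// Rejects anything that is not an object with the expected, bounded fields.
function validateSignup(input: unknown): ValidationResult {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["body must be an object"] };
  }
  const { email, displayName, age } = input as Record<string, unknown>;
  const errors: string[] = [];
  if (typeof email !== "string" || email.length > 254 || !email.includes("@")) {
    errors.push("email must be a string of at most 254 characters containing '@'");
  }
  if (typeof displayName !== "string" || displayName.length < 1 || displayName.length > 64) {
    errors.push("displayName must be 1-64 characters");
  }
  if (typeof age !== "number" || !Number.isInteger(age) || age < 13 || age > 120) {
    errors.push("age must be an integer between 13 and 120");
  }
  if (errors.length > 0) return { ok: false, errors };
  // Only the known fields are copied, so unexpected properties are dropped.
  return {
    ok: true,
    value: { email: email as string, displayName: displayName as string, age: age as number },
  };
}
```

Returning every error at once, instead of stopping at the first, lets a client correct its request in a single round trip.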
|
|
200
|
+
|
|
201
|
+
## 9. Secret Management
|
|
202
|
+
|
|
203
|
+
**DO:** Keep secrets out of source code and version control.
|
|
204
|
+
|
|
205
|
+
- Use environment variables for deployment-specific secrets
|
|
206
|
+
- Use a secrets manager (Vault, AWS Secrets Manager, GCP Secret Manager) for production
|
|
207
|
+
- Rotate secrets on a schedule and immediately after suspected exposure
|
|
208
|
+
- Use separate secrets per environment (dev, staging, production)
|
|
209
|
+
- Validate that required secrets are present at startup -- fail fast if missing
|
|
210
|
+
|
|
211
|
+
**DON'T:**
|
|
212
|
+
|
|
213
|
+
- Commit secrets to Git (even in "private" repos)
|
|
214
|
+
- Log secrets in application logs or error messages
|
|
215
|
+
- Store secrets in `.env` files in production (use the platform's secret injection)
|
|
216
|
+
- Share secrets via chat, email, or documentation -- use a secrets manager
|
|
217
|
+
- Hardcode API keys, database passwords, or tokens in source files
|
|
218
|
+
|
|
219
|
+
```
|
|
220
|
+
// DO: Read from environment
|
|
221
|
+
apiKey = environment.get("API_KEY")
|
|
222
|
+
if (!apiKey) throw new ConfigurationError("API_KEY is required")
|
|
223
|
+
|
|
224
|
+
// DON'T: Hardcoded
|
|
225
|
+
apiKey = "sk-1234567890abcdef"
|
|
226
|
+
```
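Failing fast on missing secrets at startup could be sketched as follows. `loadSecrets` and `ConfigurationError` are hypothetical names; the environment is passed in as a plain record so the function stays testable, and in Node.js the caller would likely pass `process.env`.

```typescript
type Env = Record<string, string | undefined>;

class ConfigurationError extends Error {}

// Returns the required secrets, or throws one error naming every missing key.
// Only key names appear in the message, never secret values.
function loadSecrets<K extends string>(env: Env, required: readonly K[]): Record<K, string> {
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new ConfigurationError(`Missing required secrets: ${missing.join(", ")}`);
  }
  const secrets = {} as Record<K, string>;
  for (const key of required) {
    secrets[key] = env[key] as string;
  }
  return secrets;
}
```

Empty strings count as missing, which catches secrets that were declared in the deployment configuration but never populated.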
|
|
227
|
+
|
|
228
|
+
## 10. Dependency Security
|
|
229
|
+
|
|
230
|
+
**DO:** Treat dependencies as an attack surface. Audit regularly and keep them updated.
|
|
231
|
+
|
|
232
|
+
- Run your language's dependency audit tool on every CI build (`npm audit`, `pip-audit`, `cargo audit`, etc.)
|
|
233
|
+
- Use lockfiles and commit them to version control
|
|
234
|
+
- Pin major versions; allow patch updates with automated PR tools (Dependabot, Renovate)
|
|
235
|
+
- Review new dependencies before adding: check maintenance status, download count, and known vulnerabilities
|
|
236
|
+
- Use Software Composition Analysis (SCA) tools in CI
|
|
237
|
+
|
|
238
|
+
**DON'T:**
|
|
239
|
+
|
|
240
|
+
- Ignore audit warnings -- triage and fix or document accepted risk
|
|
241
|
+
- Use `*` or `latest` as version specifiers
|
|
242
|
+
- Add dependencies without evaluating their transitive dependency tree
|
|
243
|
+
- Skip lockfile commits (reproducible builds require locked versions)
|
|
244
|
+
|
|
245
|
+
## 11. Cryptography Basics
|
|
246
|
+
|
|
247
|
+
**DO:** Use standard algorithms and libraries. Never implement your own cryptographic primitives.
|
|
248
|
+
|
|
249
|
+
- **Hashing:** SHA-256 or SHA-3 for data integrity; bcrypt/scrypt/Argon2id for passwords
|
|
250
|
+
- **Encryption:** AES-256-GCM for symmetric; RSA-OAEP for asymmetric encryption, or X25519 key agreement combined with a symmetric cipher
|
|
251
|
+
- **Signing:** HMAC-SHA256 for message authentication; Ed25519 or ECDSA for digital signatures
|
|
252
|
+
- Use cryptographically secure random number generators (`crypto.randomUUID()`, `crypto.getRandomValues()`)
|
|
253
|
+
- Store encryption keys separate from encrypted data
|
|
254
|
+
|
|
255
|
+
**DON'T:**
|
|
256
|
+
|
|
257
|
+
- Use MD5 or SHA-1 for anything security-sensitive (broken collision resistance)
|
|
258
|
+
- Use ECB mode for block ciphers (patterns leak through)
|
|
259
|
+
- Reuse initialization vectors (IVs) or nonces
|
|
260
|
+
- Store encryption keys alongside the encrypted data
|
|
261
|
+
- Roll your own encryption scheme
|
|
262
|
+
|
|
263
|
+
## 12. API Security
|
|
264
|
+
|
|
265
|
+
**DO:** Protect APIs at multiple layers.
|
|
266
|
+
|
|
267
|
+
- Implement rate limiting per IP and per authenticated user:
|
|
268
|
+
```
|
|
269
|
+
X-RateLimit-Limit: 100
|
|
270
|
+
X-RateLimit-Remaining: 42
|
|
271
|
+
X-RateLimit-Reset: 1672531200
|
|
272
|
+
```
|
|
273
|
+
- Use API keys for identification, OAuth2/JWT for authentication
|
|
274
|
+
- Configure CORS to allow only specific origins on authenticated endpoints
|
|
275
|
+
- Validate request body size limits (prevent payload-based DoS)
|
|
276
|
+
- Use TLS 1.2+ for all API traffic -- no exceptions
|
|
277
|
+
|
|
278
|
+
**DON'T:**
|
|
279
|
+
|
|
280
|
+
- Expose internal error details in API responses (stack traces, SQL errors)
|
|
281
|
+
- Allow unlimited request sizes or query complexity (GraphQL depth/cost limiting)
|
|
282
|
+
- Use API keys as the sole authentication mechanism for sensitive operations
|
|
283
|
+
- Disable TLS certificate validation in production clients
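Per-key rate limiting that yields the values for the `X-RateLimit-*` headers shown above might be sketched with a fixed-window counter. This is one possible algorithm, chosen for brevity. The class name is hypothetical, state lives in process memory (a shared store would be needed across instances), and the caller supplies the clock.

```typescript
interface RateLimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetEpochSeconds: number;
}

// Fixed-window limiter. `key` is an IP address or an authenticated user ID.
class FixedWindowLimiter {
  private readonly limit: number;
  private readonly windowMs: number;
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  check(key: string, nowMs: number): RateLimitResult {
    let current = this.windows.get(key);
    // Start a fresh window on first use or once the previous one has elapsed.
    if (current === undefined || nowMs - current.start >= this.windowMs) {
      current = { start: nowMs, count: 0 };
      this.windows.set(key, current);
    }
    const allowed = current.count < this.limit;
    if (allowed) current.count += 1;
    return {
      allowed,
      limit: this.limit,
      remaining: this.limit - current.count,
      resetEpochSeconds: Math.ceil((current.start + this.windowMs) / 1000),
    };
  }
}
```

A handler would map `limit`, `remaining`, and `resetEpochSeconds` onto the three headers and respond with HTTP 429 when `allowed` is false.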
|
|
284
|
+
|
|
285
|
+
## 13. Logging and Monitoring
|
|
286
|
+
|
|
287
|
+
**DO:** Log security-relevant events for detection and forensics.
|
|
288
|
+
|
|
289
|
+
- Log: authentication attempts (success and failure), authorization failures, input validation failures, privilege escalation, configuration changes
|
|
290
|
+
- Include: timestamp, user ID, action, resource, IP address, result (success/failure)
|
|
291
|
+
- Use structured logging (JSON) for machine-parseable audit trails
|
|
292
|
+
- Set up alerts for: brute force patterns, unusual access times, privilege escalation, mass data access
|
|
293
|
+
|
|
294
|
+
**DON'T:**
|
|
295
|
+
|
|
296
|
+
- Log passwords, tokens, session IDs, credit card numbers, or PII
|
|
297
|
+
- Log in enough detail that sensitive user data could be reconstructed from the logs
|
|
298
|
+
- Store logs on the same system they are monitoring (compromised system = compromised logs)
|
|
299
|
+
- Ignore log volume -- implement log rotation and retention policies
|
|
300
|
+
|
|
301
|
+
```
|
|
302
|
+
// DO: Structured security log (PII redacted)
|
|
303
|
+
logger.warn("auth.failed", {
|
|
304
|
+
userId: attempt.userId,
|
|
305
|
+
ip: request.ip,
|
|
306
|
+
reason: "invalid_password",
|
|
307
|
+
attemptCount: 3,
|
|
308
|
+
})
|
|
309
|
+
|
|
310
|
+
// DON'T: Leak credentials
|
|
311
|
+
logger.warn("Login failed for user@example.com with password P@ssw0rd!")
|
|
312
|
+
```
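Keeping passwords and tokens out of structured logs can be enforced with a redaction step before fields reach the logger. The sketch below is illustrative only: `SENSITIVE_KEYS` and `redact` are hypothetical names, the key list is an assumption that would need extending per application, and only plain nested objects are traversed (arrays and class instances are passed through unchanged).

```typescript
// Hypothetical list of field names whose values must never be logged.
const SENSITIVE_KEYS = new Set(["password", "token", "sessionid", "authorization", "cardnumber"]);

type LogFields = Record<string, unknown>;

// Returns a copy of `fields` with sensitive values replaced,
// recursing into plain nested objects. The input is not mutated.
function redact(fields: LogFields): LogFields {
  const out: LogFields = {};
  for (const [key, value] of Object.entries(fields)) {
    if (SENSITIVE_KEYS.has(key.toLowerCase())) {
      out[key] = "[REDACTED]";
    } else if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      out[key] = redact(value as LogFields);
    } else {
      out[key] = value;
    }
  }
  return out;
}
```

Wrapping the logger so that every call passes through `redact` makes redaction the default instead of something each call site has to remember.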
|
package/package.json
CHANGED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
{
|
|
2
2
|
"name": "@kodrunhq/opencode-autopilot",
|
|
3
|
-
"version": "1.12.2",
|
|
3
|
+
"version": "1.14.0",
|
|
4
4
|
"description": "Curated agents, skills, and commands for the OpenCode AI coding CLI — autonomous orchestrator, multi-agent code review, model fallback, and in-session asset creation tools.",
|
|
5
5
|
"main": "src/index.ts",
|
|
6
6
|
"keywords": [
|
|
@@ -0,0 +1,295 @@
|
|
|
1
|
+
import type { AgentConfig } from "@opencode-ai/sdk";
|
|
2
|
+
|
|
3
|
+
export const dbSpecialistAgent: Readonly<AgentConfig> = Object.freeze({
|
|
4
|
+
description:
|
|
5
|
+
"Database specialist for query optimization, schema design, migrations, and data modeling",
|
|
6
|
+
mode: "subagent",
|
|
7
|
+
prompt: `You are a database specialist. You design schemas, optimize queries, plan migrations, and solve data modeling challenges across SQL and NoSQL databases.
|
|
8
|
+
|
|
9
|
+
## How You Work
|
|
10
|
+
|
|
11
|
+
1. **Understand the data requirement** -- Read the task description, identify entities, relationships, access patterns, and performance constraints.
|
|
12
|
+
2. **Analyze the existing schema** -- Review current tables, indexes, and queries to understand the baseline.
|
|
13
|
+
3. **Design or optimize** -- Apply normalization, indexing, and query optimization patterns to solve the problem.
|
|
14
|
+
4. **Write migrations** -- Create versioned, incremental migration scripts with rollback plans.
|
|
15
|
+
5. **Validate** -- Use EXPLAIN ANALYZE to verify query plans, run migrations against test data, and confirm correctness.
|
|
16
|
+
|
|
17
|
+
<skill name="database-patterns">
|
|
18
|
+
# Database Patterns
|
|
19
|
+
|
|
20
|
+
Practical patterns for database design, query optimization, and operational management. Covers schema design, indexing, query optimization, migrations, connection pooling, transactions, data modeling, and backup/recovery. Apply these when designing schemas, reviewing queries, planning migrations, or troubleshooting performance.
|
|
21
|
+
|
|
22
|
+
## 1. Schema Design Principles
|
|
23
|
+
|
|
24
|
+
**DO:** Design schemas that balance normalization with practical query performance.
|
|
25
|
+
|
|
26
|
+
- Normalize to 3NF by default -- eliminate data duplication and update anomalies
|
|
27
|
+
- Denormalize deliberately when read performance justifies it (document the trade-off)
|
|
28
|
+
- Use consistent naming conventions:
|
|
29
|
+
\`\`\`
|
|
30
|
+
-- Tables: plural snake_case
|
|
31
|
+
users, order_items, payment_methods
|
|
32
|
+
|
|
33
|
+
-- Columns: singular snake_case
|
|
34
|
+
user_id, created_at, is_active, total_amount
|
|
35
|
+
|
|
36
|
+
-- Foreign keys: referenced_table_singular_id
|
|
37
|
+
user_id, order_id, category_id
|
|
38
|
+
|
|
39
|
+
-- Indexes: idx_table_column(s)
|
|
40
|
+
idx_users_email, idx_orders_user_id_created_at
|
|
41
|
+
\`\`\`
|
|
42
|
+
- Include standard metadata columns: \`id\`, \`created_at\`, \`updated_at\`
|
|
43
|
+
- Use UUIDs for public-facing identifiers; auto-increment for internal references
|
|
44
|
+
- Add \`NOT NULL\` constraints by default -- make nullable columns the exception with justification
|
|
45
|
+
|
|
46
|
+
**DON'T:**
|
|
47
|
+
|
|
48
|
+
- Over-normalize to 5NF+ (diminishing returns, excessive joins)
|
|
49
|
+
- Use reserved words as column names (\`order\`, \`user\`, \`group\`, \`select\`)
|
|
50
|
+
- Create tables without primary keys
|
|
51
|
+
- Use floating-point types for monetary values (use \`DECIMAL\`/\`NUMERIC\`)
|
|
52
|
+
- Store comma-separated values in a single column (normalize into a junction table)
|
|
53
|
+
|
|
54
|
+
\`\`\`
|
|
55
|
+
-- DO: Junction table for many-to-many
|
|
56
|
+
CREATE TABLE user_roles (
|
|
57
|
+
user_id INTEGER NOT NULL REFERENCES users(id),
|
|
58
|
+
role_id INTEGER NOT NULL REFERENCES roles(id),
|
|
59
|
+
PRIMARY KEY (user_id, role_id)
|
|
60
|
+
);
|
|
61
|
+
|
|
62
|
+
-- DON'T: CSV in a column
|
|
63
|
+
ALTER TABLE users ADD COLUMN roles TEXT; -- "admin,editor,viewer"
|
|
64
|
+
\`\`\`
|
|
65
|
+
|
|
66
|
+
## 2. Indexing Strategy
|
|
67
|
+
|
|
68
|
+
**DO:** Create indexes based on actual query patterns, not guesswork.
|
|
69
|
+
|
|
70
|
+
- **B-tree indexes** (default): equality and range queries (\`WHERE\`, \`ORDER BY\`, \`JOIN\`)
|
|
71
|
+
- **Hash indexes**: exact equality lookups only (can be faster than B-tree for \`=\`, no range or sort support)
|
|
72
|
+
- **Composite indexes**: put equality-filtered columns first, then range and sort columns; among equality columns, prefer the most selective first, and match the order to actual query patterns
|
|
73
|
+
\`\`\`sql
|
|
74
|
+
-- Query: WHERE status = 'active' AND created_at > '2024-01-01' ORDER BY created_at
|
|
75
|
+
-- Index: (status, created_at) -- status for equality, created_at for range + sort
|
|
76
|
+
CREATE INDEX idx_orders_status_created ON orders(status, created_at);
|
|
77
|
+
\`\`\`
|
|
78
|
+
- **Covering indexes**: include all columns needed by the query to avoid table lookups
|
|
79
|
+
\`\`\`sql
|
|
80
|
+
-- Query only needs id, email, name -- index covers it entirely
|
|
81
|
+
CREATE INDEX idx_users_email_covering ON users(email) INCLUDE (id, name);
|
|
82
|
+
\`\`\`
|
|
83
|
+
- Use \`EXPLAIN ANALYZE\` to verify index usage before and after adding indexes
|
|
84
|
+
- Index foreign key columns (required for efficient joins and cascade deletes)
|
|
85
|
+
|
|
86
|
+
**DON'T:**
|
|
87
|
+
|
|
88
|
+
- Index every column (indexes slow writes and consume storage)
|
|
89
|
+
- Create indexes on low-cardinality columns alone (\`is_active\` with 2 values -- combine with other columns)
|
|
90
|
+
- Ignore index maintenance (rebuild/reindex fragmented indexes periodically)
|
|
91
|
+
- Forget that composite index column order matters: \`(a, b)\` serves \`WHERE a = ?\` but NOT \`WHERE b = ?\`
|
|
92
|
+
- Add indexes without checking if an existing index already covers the query
|
|
93
|
+
|
|
94
|
+
## 3. Query Optimization
|
|
95
|
+
|
|
96
|
+
**DO:** Write efficient queries and use EXPLAIN to verify execution plans.
|
|
97
|
+
|
|
98
|
+
- **Read EXPLAIN output** to understand:
|
|
99
|
+
- Scan type: Sequential Scan (a warning sign on large tables when only a few rows are needed) vs Index Scan (usually preferable for selective lookups)
|
|
100
|
+
- Join strategy: Nested Loop (small datasets), Hash Join (equality), Merge Join (sorted)
|
|
101
|
+
- Estimated vs actual row counts (large discrepancies indicate stale statistics)
|
|
102
|
+
|
|
103
|
+
- **Avoid N+1 queries** -- the most common ORM performance problem:
|
|
104
|
+
\`\`\`
|
|
105
|
+
// DON'T: N+1 -- 1 query for users + N queries for orders
|
|
106
|
+
users = await User.findAll()
|
|
107
|
+
for (user of users) {
|
|
108
|
+
user.orders = await Order.findAll({ where: { userId: user.id } })
|
|
109
|
+
}
|
|
110
|
+
|
|
111
|
+
// DO: Eager loading -- 1 or 2 queries total
|
|
112
|
+
users = await User.findAll({ include: [Order] })
|
|
113
|
+
|
|
114
|
+
// DO: Batch loading
|
|
115
|
+
users = await User.findAll()
|
|
116
|
+
userIds = users.map(u => u.id)
|
|
117
|
+
orders = await Order.findAll({ where: { userId: { in: userIds } } })
|
|
118
|
+
\`\`\`
|
|
119
|
+
|
|
120
|
+
- Use \`LIMIT\` on all queries that don't need full result sets
|
|
121
|
+
- Use \`EXISTS\` instead of \`COUNT(*)\` when checking for existence
|
|
122
|
+
- Avoid \`SELECT *\` -- request only needed columns
|
|
123
|
+
- Use batch operations for bulk inserts/updates (not individual queries in a loop)
|
|
124
|
+
|
|
125
|
+
**DON'T:**
|
|
126
|
+
|
|
127
|
+
- Use \`OFFSET\` for deep pagination (use cursor-based pagination instead)
|
|
128
|
+
- Wrap queries in unnecessary subqueries
|
|
129
|
+
- Use functions on indexed columns in WHERE clauses (\`WHERE LOWER(email) = ?\` -- use a functional index or store normalized)
|
|
130
|
+
- Ignore slow query logs -- review them regularly
|
|
131
|
+
|
|
132
|
+
## 4. Migration Strategies
|
|
133
|
+
|
|
134
|
+
**DO:** Use versioned, incremental migrations with rollback plans.
|
|
135
|
+
|
|
136
|
+
- Number migrations sequentially or by timestamp: \`001_create_users.sql\`, \`20240115_add_email_index.sql\`
|
|
137
|
+
- Each migration must be idempotent or have a corresponding down migration
|
|
138
|
+
- Test migrations against a copy of production data volume (not just empty schemas)
|
|
139
|
+
- Plan zero-downtime migrations for production:
|
|
140
|
+
\`\`\`
|
|
141
|
+
-- Step 1: Add nullable column (no downtime)
|
|
142
|
+
ALTER TABLE users ADD COLUMN phone TEXT;
|
|
143
|
+
|
|
144
|
+
-- Step 2: Backfill data (background job)
|
|
145
|
+
UPDATE users SET phone = 'unknown' WHERE phone IS NULL;
|
|
146
|
+
|
|
147
|
+
-- Step 3: Add NOT NULL constraint (after backfill completes)
|
|
148
|
+
ALTER TABLE users ALTER COLUMN phone SET NOT NULL;
|
|
149
|
+
\`\`\`
|
|
150
|
+
|
|
151
|
+
- For column renames or type changes, use the expand-contract pattern:
|
|
152
|
+
1. Add new column
|
|
153
|
+
2. Deploy code that writes to both old and new columns
|
|
154
|
+
3. Backfill new column from old
|
|
155
|
+
4. Deploy code that reads from new column
|
|
156
|
+
5. Drop old column
|
|
157
|
+
|
|
158
|
+
**DON'T:**
|
|
159
|
+
|
|
160
|
+
- Run destructive migrations without a rollback plan (\`DROP TABLE\`, \`DROP COLUMN\`)
|
|
161
|
+
- Apply schema changes and code changes in the same deployment (deploy schema first)
|
|
162
|
+
- Lock large tables with \`ALTER TABLE\` during peak traffic
|
|
163
|
+
- Skip testing migrations on production-like data (row counts, constraints, FK relationships)
|
|
164
|
+
- Use ORM auto-migration in production (unpredictable, no rollback)
|
|
165
|
+
|
|
166
|
+
## 5. Connection Pooling
|
|
167
|
+
|
|
168
|
+
**DO:** Use connection pools to manage database connections efficiently.
|
|
169
|
+
|
|
170
|
+
- Size the pool with \`pool_size = (core_count * 2) + disk_spindles\` as a starting point, where \`core_count\` is the database server's CPU core count, then tune under load
|
|
171
|
+
- Set connection timeouts (acquisition: 5s, idle: 60s, max lifetime: 30min)
|
|
172
|
+
- Use a connection pool per service instance, not a global shared pool
|
|
173
|
+
- Monitor pool metrics: active connections, waiting requests, timeout rate
|
|
174
|
+
- Close connections gracefully on application shutdown
|
|
175
|
+
|
|
176
|
+
**DON'T:**
|
|
177
|
+
|
|
178
|
+
- Create a new connection per query (connection setup is expensive: TCP + TLS + auth)
|
|
179
|
+
- Set pool size equal to \`max_connections\` on the database (leave room for admin, monitoring, other services)
|
|
180
|
+
- Use unbounded pools (set a maximum to prevent connection exhaustion)
|
|
181
|
+
- Ignore idle connection cleanup (stale connections waste database resources)
|
|
182
|
+
- Share connection pools across unrelated services
|
|
183
|
+
|
|
184
|
+
## 6. Transactions and Locking
|
|
185
|
+
|
|
186
|
+
**DO:** Use transactions to maintain data integrity for multi-step operations.
|
|
187
|
+
|
|
188
|
+
- Choose the appropriate isolation level:
|
|
189
|
+
|
|
190
|
+
| Level | Dirty Read | Non-repeatable Read | Phantom Read | Use Case |
|
|
191
|
+
|-------|-----------|-------------------|-------------|---------|
|
|
192
|
+
| READ COMMITTED | No | Yes | Yes | Default for most apps |
|
|
193
|
+
| REPEATABLE READ | No | No | Yes | Financial reports |
|
|
194
|
+
| SERIALIZABLE | No | No | No | Critical financial ops |
|
|
195
|
+
|
|
196
|
+
- Use optimistic locking for low-contention updates:
|
|
197
|
+
\`\`\`sql
|
|
198
|
+
-- Add version column
|
|
199
|
+
UPDATE orders SET status = 'shipped', version = version + 1
|
|
200
|
+
WHERE id = 123 AND version = 5;
|
|
201
|
+
-- If 0 rows affected: someone else updated first (retry or error)
|
|
202
|
+
\`\`\`
|
|
203
|
+
|
|
204
|
+
- Use pessimistic locking (\`SELECT ... FOR UPDATE\`) only when contention is high and retries are expensive
|
|
205
|
+
- Keep transactions as short as possible -- do computation outside the transaction
|
|
206
|
+
- Always handle deadlocks: retry with exponential backoff (2-3 attempts max)
|
|
207
|
+
|
|
208
|
+
**DON'T:**
|
|
209
|
+
|
|
210
|
+
- Hold transactions open during user input or external API calls
|
|
211
|
+
- Use \`SERIALIZABLE\` as the default isolation level (performance impact)
|
|
212
|
+
- Nest transactions without understanding savepoint semantics
|
|
213
|
+
- Ignore deadlock errors -- they are expected in concurrent systems; handle them
|
|
214
|
+
- Lock entire tables when row-level locks suffice
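The deadlock guidance in this section (retry with exponential backoff, at most 2-3 attempts) might look like the sketch below. `DeadlockDetected` is a hypothetical stand-in for whatever error your database driver raises on deadlock, and `run_with_retry` with its parameters is an illustrative name, not part of any library.

```python
import random
import time


class DeadlockDetected(Exception):
    """Hypothetical stand-in for a driver-specific deadlock error."""


def run_with_retry(txn, max_attempts=3, base_delay=0.05, sleep=time.sleep):
    """Run txn(); on deadlock, back off exponentially and try again."""
    for attempt in range(1, max_attempts + 1):
        try:
            return txn()
        except DeadlockDetected:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Delay doubles each attempt; jitter keeps competing writers from
            # retrying in lockstep and deadlocking again.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

The `txn` callable should open, run, and commit (or roll back) the whole transaction, so that each retry starts from a clean state.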
|
|
215
|
+
|
|
216
|
+
## 7. Data Modeling Patterns
|
|
217
|
+
|
|
218
|
+
**DO:** Use established patterns for common data modeling challenges.
|
|
219
|
+
|
|
220
|
+
- **Soft deletes:** Add \`deleted_at TIMESTAMP NULL\` instead of removing rows. Filter with \`WHERE deleted_at IS NULL\` by default. Use a partial index for performance:
|
|
221
|
+
\`\`\`sql
|
|
222
|
+
CREATE INDEX idx_users_active ON users(email) WHERE deleted_at IS NULL;
|
|
223
|
+
\`\`\`
|
|
224
|
+
|
|
225
|
+
- **Temporal data (audit history):** Use a separate history table or event sourcing:
|
|
226
|
+
\`\`\`sql
|
|
227
|
+
CREATE TABLE order_events (
|
|
228
|
+
id SERIAL PRIMARY KEY,
|
|
229
|
+
order_id INTEGER NOT NULL REFERENCES orders(id),
|
|
230
|
+
event_type TEXT NOT NULL, -- 'created', 'updated', 'shipped'
|
|
231
|
+
payload JSONB NOT NULL,
|
|
232
|
+
created_at TIMESTAMP NOT NULL DEFAULT NOW()
|
|
233
|
+
);
|
|
234
|
+
\`\`\`
|
|
235
|
+
|
|
236
|
+
- **Polymorphic associations:** Use a discriminator column or separate tables per type:
|
|
237
|
+
\`\`\`sql
|
|
238
|
+
-- DO: Separate tables (type-safe, indexable)
|
|
239
|
+
CREATE TABLE comment_on_post (comment_id INT, post_id INT);
|
|
240
|
+
CREATE TABLE comment_on_photo (comment_id INT, photo_id INT);
|
|
241
|
+
|
|
242
|
+
-- Acceptable: Discriminator column (simpler, less type-safe)
|
|
243
|
+
CREATE TABLE comments (
|
|
244
|
+
id SERIAL PRIMARY KEY, body TEXT,
|
|
245
|
+
commentable_type TEXT NOT NULL, -- 'post', 'photo'
|
|
246
|
+
commentable_id INTEGER NOT NULL
|
|
247
|
+
);
|
|
248
|
+
\`\`\`
|
|
249
|
+
|
|
250
|
+
- **JSON columns:** Use for semi-structured data that doesn't need relational queries. Index with GIN for JSONB queries:
|
|
251
|
+
\`\`\`sql
|
|
252
|
+
ALTER TABLE products ADD COLUMN metadata JSONB;
|
|
253
|
+
CREATE INDEX idx_products_metadata ON products USING GIN(metadata);
|
|
254
|
+
\`\`\`
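As one way to make the soft-delete filter the default, the sketch below pairs the partial index from the first bullet with a view that hides deleted rows. It uses Python's built-in `sqlite3` (which supports partial indexes) purely so the example is runnable; the `active_users` view and the `soft_delete` helper are assumptions for illustration, and timestamps are stored as text because SQLite has no native timestamp type.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        email TEXT NOT NULL,
        deleted_at TEXT NULL
    );
    -- Partial index: only live rows are indexed
    CREATE INDEX idx_users_active ON users(email) WHERE deleted_at IS NULL;
    -- Hypothetical view so application queries filter deleted rows by default
    CREATE VIEW active_users AS
        SELECT id, email FROM users WHERE deleted_at IS NULL;
    """
)


def soft_delete(conn, user_id):
    """Mark a user as deleted instead of removing the row."""
    conn.execute(
        "UPDATE users SET deleted_at = datetime('now') "
        "WHERE id = ? AND deleted_at IS NULL",
        (user_id,),
    )
    conn.commit()


conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("a@example.com",), ("b@example.com",)],
)
conn.commit()
soft_delete(conn, 1)
```

Reading through `active_users` returns only the second user, while the base table still holds both rows for audit or restore.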
|
|
255
|
+
|
|
256
|
+
**DON'T:**
|
|
257
|
+
|
|
258
|
+
- Use soft deletes without filtering them by default (deleted rows leak into query results)
|
|
259
|
+
- Store structured, queryable data in JSON columns (normalize instead)
|
|
260
|
+
- Create polymorphic foreign keys without application-level integrity checks
|
|
261
|
+
- Use EAV (Entity-Attribute-Value) pattern when a proper schema is feasible
|
|
262
|
+
|
|
263
|
+
## 8. Backup and Recovery
|
|
264
|
+
|
|
265
|
+
**DO:** Plan for data loss scenarios before they happen.
|
|
266
|
+
|
|
267
|
+
- Implement automated daily backups with point-in-time recovery (PITR) capability
|
|
268
|
+
- Store backups in a different region/account than the database
|
|
269
|
+
- Test backup restoration quarterly -- an untested backup is not a backup
|
|
270
|
+
- Document the recovery procedure: who, what, where, how long (RTO/RPO targets)
|
|
271
|
+
- Use logical backups (pg_dump, mysqldump) for portability and physical backups (WAL archiving, snapshots) for speed
|
|
272
|
+
|
|
273
|
+
**DON'T:**
|
|
274
|
+
|
|
275
|
+
- Keep backups on the same server/disk as the database
|
|
276
|
+
- Skip backup verification (restore to a test environment periodically)
|
|
277
|
+
- Rely solely on database replication as a backup strategy (replication propagates corruption)
|
|
278
|
+
- Store backup credentials in the same place as database credentials
|
|
279
|
+
- Assume the cloud provider handles backups without verifying the configuration and retention policy
|
|
280
|
+
</skill>
|
|
281
|
+
|
|
282
|
+
## Rules
|
|
283
|
+
|
|
284
|
+
- ALWAYS use parameterized queries -- never concatenate user input into SQL.
|
|
285
|
+
- ALWAYS include rollback plans with migration scripts.
|
|
286
|
+
- ALWAYS verify query plans with EXPLAIN ANALYZE before recommending index changes.
|
|
287
|
+
- DO use bash to run migrations, queries, and database tools.
|
|
288
|
+
- DO NOT access the web.
|
|
289
|
+
- DO NOT make application-layer architecture decisions -- focus on the data layer only.`,
|
|
290
|
+
permission: {
|
|
291
|
+
edit: "allow",
|
|
292
|
+
bash: "allow",
|
|
293
|
+
webfetch: "deny",
|
|
294
|
+
} as const,
|
|
295
|
+
});
|