@pageai/ralph-loop 1.8.0 → 1.10.0

Files changed (46)
  1. package/.agent/PROMPT.md +3 -2
  2. package/.agents/skills/mysql/SKILL.md +81 -0
  3. package/.agents/skills/mysql/references/character-sets.md +66 -0
  4. package/.agents/skills/mysql/references/composite-indexes.md +59 -0
  5. package/.agents/skills/mysql/references/connection-management.md +70 -0
  6. package/.agents/skills/mysql/references/covering-indexes.md +47 -0
  7. package/.agents/skills/mysql/references/data-types.md +69 -0
  8. package/.agents/skills/mysql/references/deadlocks.md +72 -0
  9. package/.agents/skills/mysql/references/explain-analysis.md +66 -0
  10. package/.agents/skills/mysql/references/fulltext-indexes.md +28 -0
  11. package/.agents/skills/mysql/references/index-maintenance.md +110 -0
  12. package/.agents/skills/mysql/references/isolation-levels.md +49 -0
  13. package/.agents/skills/mysql/references/json-column-patterns.md +77 -0
  14. package/.agents/skills/mysql/references/n-plus-one.md +77 -0
  15. package/.agents/skills/mysql/references/online-ddl.md +53 -0
  16. package/.agents/skills/mysql/references/partitioning.md +92 -0
  17. package/.agents/skills/mysql/references/primary-keys.md +70 -0
  18. package/.agents/skills/mysql/references/query-optimization-pitfalls.md +117 -0
  19. package/.agents/skills/mysql/references/replication-lag.md +46 -0
  20. package/.agents/skills/mysql/references/row-locking-gotchas.md +63 -0
  21. package/.agents/skills/postgres/SKILL.md +46 -0
  22. package/.agents/skills/postgres/references/backup-recovery.md +41 -0
  23. package/.agents/skills/postgres/references/index-optimization.md +69 -0
  24. package/.agents/skills/postgres/references/indexing.md +61 -0
  25. package/.agents/skills/postgres/references/memory-management-ops.md +39 -0
  26. package/.agents/skills/postgres/references/monitoring.md +59 -0
  27. package/.agents/skills/postgres/references/mvcc-transactions.md +38 -0
  28. package/.agents/skills/postgres/references/mvcc-vacuum.md +41 -0
  29. package/.agents/skills/postgres/references/optimization-checklist.md +19 -0
  30. package/.agents/skills/postgres/references/partitioning.md +79 -0
  31. package/.agents/skills/postgres/references/process-architecture.md +46 -0
  32. package/.agents/skills/postgres/references/ps-cli-api-insights.md +53 -0
  33. package/.agents/skills/postgres/references/ps-cli-commands.md +72 -0
  34. package/.agents/skills/postgres/references/ps-connection-pooling.md +72 -0
  35. package/.agents/skills/postgres/references/ps-connections.md +37 -0
  36. package/.agents/skills/postgres/references/ps-extensions.md +27 -0
  37. package/.agents/skills/postgres/references/ps-insights.md +62 -0
  38. package/.agents/skills/postgres/references/query-patterns.md +80 -0
  39. package/.agents/skills/postgres/references/replication.md +49 -0
  40. package/.agents/skills/postgres/references/schema-design.md +66 -0
  41. package/.agents/skills/postgres/references/storage-layout.md +41 -0
  42. package/.agents/skills/postgres/references/wal-operations.md +42 -0
  43. package/README.md +2 -2
  44. package/bin/cli.js +3 -1
  45. package/bin/lib/shadcn.js +1 -1
  46. package/package.json +1 -1
package/.agent/PROMPT.md CHANGED
@@ -34,10 +34,11 @@ Tasks are listed in @.agent/tasks.json
34
34
 
35
35
  ## Rules
36
36
 
37
- - **IMPORTANT**: only work on one task at a time and **exit closing all background processes**. **DO NOT** start another task.
37
+ - **CRITICAL**: Only work on **ONE task per invocation**. After committing the task, output `<promise>TASK-{ID}:DONE</promise>` and **STOP immediately**. Do NOT read the next task. Do NOT continue working. Your response **must END** after the promise tag. Any output after it is a violation.
38
+ - Kill all background processes (dev server, etc.) before outputting the promise tag.
38
39
  - No git init/remote changes. **No git push**.
39
40
  - Check the last 5 tasks in `.agent/logs/LOG.md` for past work
40
- - **CRITICAL**: When ALL tasks pass → output `<promise>COMPLETE</promise>` and **nothing else**.
41
+ - **CRITICAL**: When **ALL** tasks pass → output `<promise>COMPLETE</promise>` and **nothing else**.
41
42
 
42
43
  ## Help Tags
43
44
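The promise-tag rule in the changed prompt implies that whatever drives the loop has to recognise the tag at the very end of the agent's output. The sketch below is one hypothetical way to do that; it is not taken from the package's own code, and the allowed characters in the task ID (`[\w.-]+`) are an assumption because the prompt only shows `TASK-{ID}`.

```python
import re

# Tag grammar assumed from the prompt text: TASK-{ID}:DONE or COMPLETE.
# The ID character class is a guess; the prompt does not define it.
PROMISE_RE = re.compile(
    r"<promise>(?:TASK-(?P<task_id>[\w.-]+):DONE|(?P<complete>COMPLETE))</promise>"
)


def parse_promise(output: str):
    """Return ("task", id), ("complete", None), or None.

    None is also returned when non-whitespace text follows the tag,
    because the prompt says the response must end after the promise tag.
    """
    match = PROMISE_RE.search(output)
    if match is None:
        return None
    if output[match.end():].strip():
        return None  # trailing output after the tag breaks the rule
    if match.group("complete"):
        return ("complete", None)
    return ("task", match.group("task_id"))
```

Treating trailing output as a failure mirrors the "any output after it is a violation" wording; a more lenient runner could accept the tag and log the extra text instead.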
 
@@ -0,0 +1,81 @@
1
+ ---
2
+ name: mysql
3
+ description: Plan and review MySQL/InnoDB schema, indexing, query tuning, transactions, and operations. Use when creating or modifying MySQL tables, indexes, or queries; diagnosing slow/locking behavior; planning migrations; or troubleshooting replication and connection issues. Load when using a MySQL database.
4
+ ---
5
+
6
+ # MySQL
7
+
8
+ Use this skill to make safe, measurable MySQL/InnoDB changes.
9
+
10
+ ## Workflow
11
+ 1. Define workload and constraints (read/write mix, latency target, data volume, MySQL version, hosting platform).
12
+ 2. Read only the relevant reference files linked in each section below.
13
+ 3. Propose the smallest change that can solve the problem, including trade-offs.
14
+ 4. Validate with evidence (`EXPLAIN`, `EXPLAIN ANALYZE`, lock/connection metrics, and production-safe rollout steps).
15
+ 5. For production changes, include rollback and post-deploy verification.
16
+
17
+ ## Schema Design
18
+ - Prefer narrow, monotonic PKs (`BIGINT UNSIGNED AUTO_INCREMENT`) for write-heavy OLTP tables.
19
+ - Avoid random UUID values as clustered PKs; if external IDs are required, keep UUID in a secondary unique column.
20
+ - Always `utf8mb4` / `utf8mb4_0900_ai_ci`. Prefer `NOT NULL`, `DATETIME` over `TIMESTAMP`.
21
+ - Lookup tables over `ENUM`. Normalize to 3NF; denormalize only for measured hot paths.
22
+
23
+ References:
24
+ - [primary-keys](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/primary-keys.md)
25
+ - [data-types](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/data-types.md)
26
+ - [character-sets](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/character-sets.md)
27
+ - [json-column-patterns](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/json-column-patterns.md)
28
+
29
+ ## Indexing
30
+ - Composite order: equality first, then range/sort (leftmost prefix rule).
31
+ - Range predicates stop index usage for subsequent columns.
32
+ - Secondary indexes include PK implicitly. Prefix indexes for long strings.
33
+ - Audit via `performance_schema` — drop indexes with `count_read = 0`.
34
+
35
+ References:
36
+ - [composite-indexes](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/composite-indexes.md)
37
+ - [covering-indexes](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/covering-indexes.md)
38
+ - [fulltext-indexes](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/fulltext-indexes.md)
39
+ - [index-maintenance](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/index-maintenance.md)
40
+
41
+ ## Partitioning
42
+ - Partition time-series (>50M rows) or large tables (>100M rows). Plan early — retrofit = full rebuild.
43
+ - Include partition column in every unique/PK. Always add a `MAXVALUE` catch-all.
44
+
45
+ References:
46
+ - [partitioning](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/partitioning.md)
47
+
48
+ ## Query Optimization
49
+ - Check `EXPLAIN` — red flags: `type: ALL`, `Using filesort`, `Using temporary`.
50
+ - Cursor pagination, not `OFFSET`. Avoid functions on indexed columns in `WHERE`.
51
+ - Batch inserts (500–5000 rows). `UNION ALL` over `UNION` when dedup unnecessary.
52
+
53
+ References:
54
+ - [explain-analysis](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/explain-analysis.md)
55
+ - [query-optimization-pitfalls](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/query-optimization-pitfalls.md)
56
+ - [n-plus-one](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/n-plus-one.md)
57
+
58
+ ## Transactions & Locking
59
+ - Default: `REPEATABLE READ` (gap locks). Use `READ COMMITTED` for high contention.
60
+ - Consistent row access order prevents deadlocks. Retry error 1213 with backoff.
61
+ - Do I/O outside transactions. Use `SELECT ... FOR UPDATE` sparingly.
62
+
63
+ References:
64
+ - [isolation-levels](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/isolation-levels.md)
65
+ - [deadlocks](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/deadlocks.md)
66
+ - [row-locking-gotchas](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/row-locking-gotchas.md)
67
+
68
+ ## Operations
69
+ - Use online DDL (`ALGORITHM=INPLACE`) when possible; test on replicas first.
70
+ - Tune connection pooling — avoid `max_connections` exhaustion under load.
71
+ - Monitor replication lag; avoid stale reads from replicas during writes.
72
+
73
+ References:
74
+ - [online-ddl](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/online-ddl.md)
75
+ - [connection-management](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/connection-management.md)
76
+ - [replication-lag](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/mysql/references/replication-lag.md)
77
+
78
+ ## Guardrails
79
+ - Prefer measured evidence over blanket rules of thumb.
80
+ - Note MySQL-version-specific behavior when giving advice.
81
+ - Ask for explicit human approval before destructive data operations (drops/deletes/truncates).
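The "batch inserts (500–5000 rows)" guidance in the skill above can be sketched as follows. This is a minimal illustration, not part of the skill: the helper names are made up, the `%s` placeholder style is an assumption that matches common Python MySQL drivers, and table and column names are interpolated directly, so they must come from trusted code rather than user input.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence, Tuple


def batched_rows(rows: Iterable[Tuple], batch_size: int = 1000) -> Iterator[List[Tuple]]:
    """Yield lists of at most batch_size rows, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_statement(table: str, columns: Sequence[str], batch_len: int) -> str:
    """Build one multi-row INSERT; the values are bound separately by the driver."""
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    rows = ", ".join([row] * batch_len)
    return f"INSERT INTO {table} ({', '.join(columns)}) VALUES {rows}"
```

Each batch would then be flattened into one parameter list and executed as a single statement, ideally one short transaction per batch.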
@@ -0,0 +1,66 @@
1
+ ---
2
+ title: Character Sets and Collations
3
+ description: Charset config guide
4
+ tags: mysql, character-sets, utf8mb4, collation, encoding
5
+ ---
6
+
7
+ # Character Sets and Collations
8
+
9
+ ## Always Use utf8mb4
10
+ MySQL's `utf8` = `utf8mb3` (3-byte only, no emoji/many CJK). Always `utf8mb4`.
11
+
12
+ ```sql
13
+ CREATE DATABASE myapp DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
14
+ ```
15
+
16
+ ## Collation Quick Reference
17
+ | Collation | Behavior | Use for |
18
+ |---|---|---|
19
+ | `utf8mb4_0900_ai_ci` | Case-insensitive, accent-insensitive | Default |
20
+ | `utf8mb4_0900_as_cs` | Case/accent sensitive | Exact matching |
21
+ | `utf8mb4_bin` | Byte-by-byte comparison | Tokens, hashes |
22
+
23
+ `_0900_` = Unicode 9.0 (preferred over older `_unicode_` variants).
24
+
25
+ ## Collation Behavior
26
+
27
+ Collations affect string comparisons, sorting (`ORDER BY`), and pattern matching (`LIKE`):
28
+
29
+ - **Case-insensitive (`_ci`)**: `'A' = 'a'` evaluates to true, `LIKE 'a%'` matches 'Apple'
30
+ - **Case-sensitive (`_cs`)**: `'A' = 'a'` evaluates to false, `LIKE 'a%'` matches only lowercase
31
+ - **Accent-insensitive (`_ai`)**: `'e' = 'é'` evaluates to true
32
+ - **Accent-sensitive (`_as`)**: `'e' = 'é'` evaluates to false
33
+ - **Binary (`_bin`)**: strict byte-by-byte comparison (most restrictive)
34
+
35
+ You can override collation per query:
36
+
37
+ ```sql
38
+ SELECT * FROM users
39
+ WHERE name COLLATE utf8mb4_0900_as_cs = 'José';
40
+ ```
41
+
42
+ ## Migrating from utf8/utf8mb3
43
+
44
+ ```sql
45
+ -- Find columns still using utf8
46
+ SELECT table_name, column_name FROM information_schema.columns
47
+ WHERE table_schema = 'mydb' AND character_set_name IN ('utf8', 'utf8mb3');
48
+ -- Convert
49
+ ALTER TABLE users CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci;
50
+ ```
51
+
52
+ **Warning**: index key length limits depend on InnoDB row format:
53
+ - DYNAMIC/COMPRESSED: 3072 bytes max (≈768 chars with utf8mb4)
54
+ - REDUNDANT/COMPACT: 767 bytes max (≈191 chars with utf8mb4)
55
+
56
+ `VARCHAR(255)` with utf8mb4 = up to 1020 bytes (4×255). That's safe for DYNAMIC/COMPRESSED but exceeds REDUNDANT/COMPACT limits.
57
+
58
+ ## Connection
59
+ Ensure client uses `utf8mb4`: `SET NAMES utf8mb4;` (most modern drivers default to this).
60
+
61
+ `SET NAMES utf8mb4` sets three session variables:
62
+ - `character_set_client` (encoding for statements sent to server)
63
+ - `character_set_connection` (encoding for statement processing)
64
+ - `character_set_results` (encoding for results sent to client)
65
+
66
+ It also sets `collation_connection` to the default collation for utf8mb4.
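The index key length warning in the character-sets reference is plain arithmetic, and it can be checked before running a conversion. The sketch below uses only the limits quoted in that file; the function names are illustrative.

```python
# InnoDB index key limits in bytes, by row format, as quoted in the reference.
INDEX_KEY_LIMIT_BYTES = {
    "DYNAMIC": 3072,
    "COMPRESSED": 3072,
    "REDUNDANT": 767,
    "COMPACT": 767,
}
UTF8MB4_MAX_BYTES_PER_CHAR = 4


def max_key_bytes(varchar_length: int) -> int:
    """Worst-case byte size of a full-column utf8mb4 index key."""
    return varchar_length * UTF8MB4_MAX_BYTES_PER_CHAR


def fits_index(varchar_length: int, row_format: str) -> bool:
    """True if a full-column index on VARCHAR(varchar_length) stays within the limit."""
    return max_key_bytes(varchar_length) <= INDEX_KEY_LIMIT_BYTES[row_format.upper()]


def max_indexable_chars(row_format: str) -> int:
    """Longest utf8mb4 prefix, in characters, that fits the row format's limit."""
    return INDEX_KEY_LIMIT_BYTES[row_format.upper()] // UTF8MB4_MAX_BYTES_PER_CHAR
```

For example, `fits_index(255, "COMPACT")` is false, which is the case where the conversion needs a shorter column or a prefix index.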
@@ -0,0 +1,59 @@
1
+ ---
2
+ title: Composite Index Design
3
+ description: Multi-column indexes
4
+ tags: mysql, indexes, composite, query-optimization, leftmost-prefix
5
+ ---
6
+
7
+ # Composite Indexes
8
+
9
+ ## Leftmost Prefix Rule
10
+ Index `(a, b, c)` is usable for:
11
+ - `WHERE a` (uses column `a`)
12
+ - `WHERE a AND b` (uses columns `a`, `b`)
13
+ - `WHERE a AND b AND c` (uses all columns)
14
+ - `WHERE a AND c` (uses only column `a`; `c` can't filter without `b`)
15
+
16
+ NOT usable for `WHERE b` alone or `WHERE b AND c` (the search must start from the leftmost column).
17
+
18
+ ## Column Order: Equality First, Then Range/Sort
19
+
20
+ ```sql
21
+ -- Query: WHERE tenant_id = ? AND status = ? AND created_at > ?
22
+ CREATE INDEX idx_orders_tenant_status_created ON orders (tenant_id, status, created_at);
23
+ ```
24
+
25
+ **Critical**: Range predicates (`>`, `<`, `BETWEEN`, `LIKE 'prefix%'`, and sometimes large `IN (...)`) stop index usage for filtering subsequent columns. However, columns after a range predicate can still be useful for:
26
+ - Covering index reads (avoid table lookups)
27
+ - `ORDER BY`/`GROUP BY` in some cases, when the ordering/grouping matches the usable index prefix
28
+
29
+ ## Sort Order Must Match Index
30
+
31
+ ```sql
32
+ -- Index: (status, created_at)
33
+ ORDER BY status ASC, created_at ASC -- ✓ matches (optimal)
34
+ ORDER BY status DESC, created_at DESC -- ✓ full reverse OK (reverse scan)
35
+ ORDER BY status ASC, created_at DESC -- ⚠️ mixed directions (may use filesort)
36
+
37
+ -- MySQL 8.0+: descending index components
38
+ CREATE INDEX idx_orders_status_created ON orders (status ASC, created_at DESC);
39
+ ```
40
+
41
+ ## Composite vs Multiple Single-Column Indexes
42
+ MySQL can merge single-column indexes (`index_merge` union/intersection) but a composite index is typically faster. Index merge is useful when queries filter on different column combinations that don't share a common prefix, but it adds overhead and may not scale well under load.
43
+
44
+ ## Selectivity Considerations
45
+ Within equality columns, place higher-cardinality (more selective) columns first when possible. However, query patterns and frequency usually matter more than pure selectivity.
46
+
47
+ ## GROUP BY and Composite Indexes
48
+ `GROUP BY` can benefit from composite indexes when the GROUP BY columns match the index prefix. MySQL may use the index to avoid sorting.
49
+
50
+ ## Design for Multiple Queries
51
+
52
+ ```sql
53
+ -- One index covers: WHERE user_id=?, WHERE user_id=? AND status=?,
54
+ -- and WHERE user_id=? AND status=? ORDER BY created_at DESC
55
+ CREATE INDEX idx_orders_user_status_created ON orders (user_id, status, created_at DESC);
56
+ ```
57
+
58
+ ## InnoDB Secondary Index Behavior
59
+ InnoDB secondary indexes implicitly store the primary key value with each index entry. This means a secondary index can sometimes "cover" primary key lookups without adding the PK columns explicitly.
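The leftmost-prefix and range rules in the composite-index reference can be expressed as a small function. This is a simplified model for reasoning about index design, not how the MySQL optimizer is implemented; it ignores index merge, skip scan, and sorting.

```python
from typing import Iterable, List, Optional, Sequence


def usable_prefix(
    index_columns: Sequence[str],
    equality: Iterable[str],
    range_column: Optional[str] = None,
) -> List[str]:
    """Return the index columns that can be used for filtering.

    Columns are consumed left to right: equality predicates keep the scan
    going, a range predicate is used and then ends it, and the first column
    with no predicate ends it immediately.
    """
    equality_set = set(equality)
    used: List[str] = []
    for column in index_columns:
        if column in equality_set:
            used.append(column)
            continue
        if column == range_column:
            used.append(column)  # used for the range, nothing after it
        break
    return used
```

With index `(a, b, c)`, predicates on `a` and `c` give `["a"]`, and predicates on `b` and `c` give an empty list, matching the bullets in the reference.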
@@ -0,0 +1,70 @@
1
+ ---
2
+ title: Connection Pooling and Limits
3
+ description: Connection management best practices
4
+ tags: mysql, connections, pooling, max-connections, performance
5
+ ---
6
+
7
+ # Connection Management
8
+
9
+ Every MySQL connection costs memory (~1–10 MB depending on buffers). Unbounded connections cause OOM or `Too many connections` errors.
10
+
11
+ ## Sizing `max_connections`
12
+ Default is 151. Don't blindly raise it — more connections = more memory + more contention.
13
+
14
+ ```sql
15
+ SHOW VARIABLES LIKE 'max_connections'; -- current limit
16
+ SHOW STATUS LIKE 'Max_used_connections'; -- high-water mark
17
+ SHOW STATUS LIKE 'Threads_connected'; -- current count
18
+ ```
19
+
20
+ ## Pool Sizing Formula
21
+ A good starting point for OLTP: **pool size = (CPU cores * N)** where N is typically 2-10. This is a baseline — tune based on:
22
+ - Query characteristics (I/O-bound queries may benefit from more connections)
23
+ - Actual connection usage patterns (monitor `Threads_connected` vs `Max_used_connections`)
24
+ - Application concurrency requirements
25
+
26
+ Connections beyond what the CPUs can serve concurrently add context-switch overhead without improving throughput.
27
+
28
+ ## Timeout Tuning
29
+
30
+ ### Idle Connection Timeouts
31
+ ```sql
32
+ -- Kill idle connections after 5 minutes (default is 28800 seconds / 8 hours — way too long)
33
+ SET GLOBAL wait_timeout = 300; -- Non-interactive connections (apps)
34
+ SET GLOBAL interactive_timeout = 300; -- Interactive connections (CLI)
35
+ ```
36
+
37
+ **Note**: These are server-side timeouts. The server closes idle connections after this period. Client-side connection timeouts (e.g., `connectTimeout` in JDBC) are separate and control connection establishment.
38
+
39
+ ### Active Query Timeouts
40
+ ```sql
41
+ -- Increase for bulk operations or large result sets (defaults: net_read_timeout 30 s, net_write_timeout 60 s)
42
+ SET GLOBAL net_read_timeout = 60; -- Seconds the server waits for more data from the client
43
+ SET GLOBAL net_write_timeout = 120; -- Seconds the server waits for a block to be written to the client
44
+ ```
45
+
46
+ These apply to active data transmission, not idle connections. Increase if you see errors like `Lost connection to MySQL server during query` during bulk inserts or large SELECTs.
47
+
48
+ ## Thread Handling
49
+ MySQL uses a **one-thread-per-connection** model by default: each connection gets its own OS thread. This means `max_connections` directly impacts thread count and memory usage.
50
+
51
+ MySQL also caches threads for reuse. If connections fluctuate frequently, increase `thread_cache_size` to reduce thread creation overhead.
52
+
53
+ ## Common Pitfalls
54
+ - **ORM default pools too large**: Rails default is 5 per process — 20 Puma workers = 100 connections from one app server. Multiply by app server count.
55
+ - **No pool at all**: PHP/CGI models open a new connection per request. Use persistent connections or ProxySQL.
56
+ - **Connection storms on deploy**: All app servers reconnect simultaneously when restarted, potentially exhausting `max_connections`. Mitigations: stagger deployments, use connection pool warm-up (gradually open connections), or use a proxy layer.
57
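+ - **Idle transactions**: Connections with open transactions (`BEGIN` without `COMMIT`/`ROLLBACK`) keep holding their locks while idle. `wait_timeout` does eventually close the connection and roll the transaction back, but with the 8-hour default that can take hours. Meanwhile other sessions hit lock waits and timeouts, and the pool leaks connections. Always commit or roll back promptly, and use application-level transaction timeouts.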
58
+
59
+ ## Prepared Statements
60
+ Use prepared statements with connection pooling for performance and safety:
61
+ - **Performance**: reduces repeated parsing for parameterized queries
62
+ - **Security**: helps prevent SQL injection
63
+
64
+ Note: prepared statements are typically connection-scoped; some pools/drivers provide statement caching.
65
+
66
+ ## When to Use a Proxy
67
+ Use **ProxySQL** or **PlanetScale connection pooling** when: multiple app services share a DB, you need query routing (read/write split), or total connection demand exceeds safe `max_connections`.
68
+
69
+ ## Vitess / PlanetScale Note
70
+ If running on **PlanetScale** (or Vitess), connection pooling is handled at the Vitess `vtgate` layer. This means your app can open many connections to vtgate without each one mapping 1:1 to a MySQL backend connection. Backend connection issues are minimized under this architecture.
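The pool-sizing arithmetic in the connection-management reference is easy to get wrong once several app servers are involved. The sketch below just multiplies it out; the function names and the `reserve` headroom for admin and monitoring sessions are assumptions, not something the reference prescribes.

```python
def starting_pool_size(cpu_cores: int, n: int = 2) -> int:
    """Baseline pool size from the reference: CPU cores * N, with N in 2..10."""
    if not 2 <= n <= 10:
        raise ValueError("n is expected to be between 2 and 10")
    return cpu_cores * n


def app_connection_demand(app_servers: int, workers_per_server: int, pool_size_per_worker: int) -> int:
    """Worst-case connections if every worker fills its pool."""
    return app_servers * workers_per_server * pool_size_per_worker


def fits_within(demand: int, max_connections: int = 151, reserve: int = 10) -> bool:
    """True if demand leaves `reserve` connections free below max_connections."""
    return demand <= max_connections - reserve
```

The Rails example from the pitfalls list works out to `app_connection_demand(1, 20, 5) == 100` for one server, so four such servers already exceed the default limit of 151.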
@@ -0,0 +1,47 @@
1
+ ---
2
+ title: Covering Indexes
3
+ description: Index-only scans
4
+ tags: mysql, indexes, covering-index, query-optimization, explain
5
+ ---
6
+
7
+ # Covering Indexes
8
+
9
+ A covering index contains all columns a query needs — InnoDB satisfies it from the index alone (`Using index` in EXPLAIN Extra).
10
+
11
+ ```sql
12
+ -- Query: SELECT user_id, status, total FROM orders WHERE user_id = 42
13
+ -- Covering index (filter columns first, then included columns):
14
+ CREATE INDEX idx_orders_cover ON orders (user_id, status, total);
15
+ ```
16
+
17
+ ## InnoDB Implicit Covering
18
+ Because InnoDB secondary indexes store the primary key value with each index entry, `INDEX(status)` already covers `SELECT id FROM t WHERE status = ?` (where `id` is the PK).
19
+
20
+ ## ICP vs Covering Index
21
+ - **ICP (`Using index condition`)**: engine filters at the index level before accessing table rows, but still requires table lookups.
22
+ - **Covering index (`Using index`)**: query is satisfied entirely from the index, with no table lookups.
23
+
24
+ ## EXPLAIN Signals
25
+ Look for `Using index` in the `Extra` column:
26
+
27
+ ```sql
28
+ EXPLAIN SELECT user_id, status, total FROM orders WHERE user_id = 42;
29
+ -- Extra: Using index ✓
30
+ ```
31
+
32
+ If you see `Using index condition` instead, the index is helping but not covering — you may need to add selected columns to the index.
33
+
34
+ ## When to Use
35
+ - High-frequency reads selecting few columns from wide tables.
36
+ - Not worth it for: wide result sets (TEXT/BLOB), write-heavy tables, low-frequency queries.
37
+
38
+ ## Tradeoffs
39
+ - **Write amplification**: every INSERT/UPDATE/DELETE must update all relevant indexes.
40
+ - **Index size**: wide indexes consume more disk and buffer pool memory.
41
+ - **Maintenance**: larger indexes take longer to rebuild during `ALTER TABLE`.
42
+
43
+ ## Guidelines
44
+ - Add columns to existing indexes rather than creating new ones.
45
+ - Order: filter columns first, then additional covered columns.
46
+ - Verify `Using index` appears in EXPLAIN after adding the index.
47
+ - **Pitfall**: `SELECT *` defeats covering indexes — select only the columns you need.
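Whether an index can cover a query comes down to a set comparison that includes the implicit primary key columns. The sketch below captures only that rule; it is a design aid with made-up names, and it does not account for prefix indexes or for the optimizer choosing a different plan, so `EXPLAIN` remains the real check.

```python
from typing import Iterable


def is_covering(
    index_columns: Iterable[str],
    primary_key: Iterable[str],
    needed_columns: Iterable[str],
) -> bool:
    """True if every column the query touches is in the index or the primary key.

    InnoDB secondary indexes store the primary key with each entry, so the
    PK columns count as available even when they are not listed in the index.
    """
    available = set(index_columns) | set(primary_key)
    return set(needed_columns) <= available
```

This reproduces the two cases in the reference: `INDEX(status)` covers a query on `id` and `status`, and the three-column index covers `user_id`, `status`, `total`.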
@@ -0,0 +1,69 @@
1
+ ---
2
+ title: MySQL Data Type Selection
3
+ description: Data type reference
4
+ tags: mysql, data-types, numeric, varchar, datetime, json
5
+ ---
6
+
7
+ # Data Types
8
+
9
+ Choose the smallest correct type — more rows per page, better cache, faster queries.
10
+
11
+ ## Numeric Sizes
12
+ | Type | Bytes | Unsigned Max |
13
+ |---|---|---|
14
+ | `TINYINT` | 1 | 255 |
15
+ | `SMALLINT` | 2 | 65,535 |
16
+ | `MEDIUMINT` | 3 | 16.7M |
17
+ | `INT` | 4 | 4.3B |
18
+ | `BIGINT` | 8 | 18.4 quintillion |
19
+
20
+ Use `BIGINT UNSIGNED` for PKs: `INT UNSIGNED` exhausts at ~4.3B rows (signed `INT` at ~2.1B). Use `DECIMAL(19,4)` for money, never `FLOAT`.
21
+
22
+ ## Strings
23
+ - `VARCHAR(N)` over `TEXT` when bounded — can be indexed directly.
24
+ - **`N` matters**: `VARCHAR(255)` vs `VARCHAR(50)` affects memory allocation for temp tables and sorts.
25
+
26
+ ## TEXT/BLOB Indexing
27
+ - You generally can't index `TEXT`/`BLOB` fully; use prefix indexes: `INDEX(text_col(255))`.
28
+ - Prefix length limits depend on InnoDB row format:
29
+ - DYNAMIC/COMPRESSED: 3072 bytes max (≈768 chars with utf8mb4)
30
+ - REDUNDANT/COMPACT: 767 bytes max (≈191 chars with utf8mb4)
31
+ - For keyword search, consider `FULLTEXT` indexes instead of large prefix indexes.
32
+
33
+ ## Date/Time
34
+ - `TIMESTAMP`: 4 bytes, auto-converts timezone, but **2038 limit**. Use `DATETIME` for dates beyond 2038.
35
+
36
+ ```sql
37
+ created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
38
+ updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
39
+ ```
40
+
41
+ ## JSON
42
+ Use for truly dynamic data only. Index JSON values via generated columns:
43
+
44
+ ```sql
45
+ ALTER TABLE products
46
+ ADD COLUMN color VARCHAR(50) GENERATED ALWAYS AS (attributes->>'$.color') STORED,
47
+ ADD INDEX idx_color (color);
48
+ ```
49
+
50
+ Prefer simpler types like integers and strings over JSON.
51
+
52
+ ## Generated Columns
53
+ Use generated columns for computed values, JSON extraction, or functional indexing:
54
+
55
+ ```sql
56
+ -- VIRTUAL (default): computed on read, no storage
57
+ ALTER TABLE orders
58
+ ADD COLUMN total_cents INT GENERATED ALWAYS AS (price_cents * quantity) VIRTUAL;
59
+
60
+ -- STORED: computed on write, can be indexed
61
+ ALTER TABLE products
62
+ ADD COLUMN name_lower VARCHAR(255) GENERATED ALWAYS AS (LOWER(name)) STORED,
63
+ ADD INDEX idx_name_lower (name_lower);
64
+ ```
65
+
66
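+ Choose **VIRTUAL** for simple expressions when space matters; InnoDB can still build secondary indexes on virtual columns. Choose **STORED** when the expression is expensive to recompute or the value must be materialized (for example, in a primary key or `FULLTEXT` index).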
67
+
68
+ ## ENUM/SET
69
+ Prefer lookup tables — `ENUM`/`SET` changes require `ALTER TABLE`, which can be slow on large tables.
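The numeric table in the data-types reference follows directly from the byte widths, and the "never `FLOAT` for money" advice can be seen with two additions. The sketch below is illustrative only; `add_money` is a hypothetical helper standing in for what a `DECIMAL` column does on the server.

```python
from decimal import Decimal

# Storage size in bytes for MySQL integer types, as listed in the table.
INTEGER_BYTES = {"TINYINT": 1, "SMALLINT": 2, "MEDIUMINT": 3, "INT": 4, "BIGINT": 8}


def unsigned_max(type_name: str) -> int:
    """Largest value an UNSIGNED column of this type can hold."""
    return 2 ** (8 * INTEGER_BYTES[type_name]) - 1


def signed_max(type_name: str) -> int:
    """Largest value a signed column of this type can hold."""
    return 2 ** (8 * INTEGER_BYTES[type_name] - 1) - 1


def add_money(a: str, b: str) -> Decimal:
    """Exact decimal addition, analogous to arithmetic on DECIMAL columns."""
    return Decimal(a) + Decimal(b)
```

Binary floats cannot represent most decimal fractions exactly, so `0.1 + 0.2 == 0.3` is false in Python, while `add_money("0.1", "0.2") == Decimal("0.3")` holds.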
@@ -0,0 +1,72 @@
1
+ ---
2
+ title: InnoDB Deadlock Resolution
3
+ description: Deadlock diagnosis
4
+ tags: mysql, deadlocks, innodb, transactions, locking, concurrency
5
+ ---
6
+
7
+ # Deadlocks
8
+
9
+ InnoDB auto-detects deadlocks and rolls back one transaction (the "victim").
10
+
11
+ ## Common Causes
12
+ 1. **Opposite row ordering** — Transactions accessing the same rows in different order can deadlock. Fix: always access rows in a consistent order (typically by primary key or a common index) so locks are acquired in the same sequence.
13
+ 2. **Next-key lock conflicts** (REPEATABLE READ) — InnoDB uses next-key locks (row + gap) to prevent phantoms. Fix: use READ COMMITTED (reduces gap locking) or narrow lock scope.
14
+ 3. **Missing index on WHERE column** — UPDATE/DELETE without an index may require a full table scan, locking many rows unnecessarily and increasing deadlock risk.
15
+ 4. **AUTO_INCREMENT lock contention** — Concurrent INSERT patterns can deadlock while contending on the auto-inc lock. Fix: use `innodb_autoinc_lock_mode=2` (interleaved) for better concurrency when safe for your workload, or batch inserts.
16
+
17
+ Note: SERIALIZABLE also uses gap/next-key locks. READ COMMITTED reduces some gap-lock deadlocks but doesn't eliminate deadlocks from opposite ordering or missing indexes.
18
+
19
+ ## Diagnosing
20
+
21
+ ```sql
22
+ -- Last deadlock details
23
+ SHOW ENGINE INNODB STATUS\G
24
+ -- Look for "LATEST DETECTED DEADLOCK" section
25
+
26
+ -- Current lock waits (MySQL 8.0+)
27
+ SELECT object_name, lock_type, lock_mode, lock_status, lock_data
28
+ FROM performance_schema.data_locks WHERE lock_status = 'WAITING';
29
+
30
+ -- Lock wait relationships (MySQL 8.0+)
31
+ SELECT
32
+ w.requesting_thread_id,
33
+ w.requested_lock_id,
34
+ w.blocking_thread_id,
35
+ w.blocking_lock_id,
36
+ l.lock_type,
37
+ l.lock_mode,
38
+ l.lock_data
39
+ FROM performance_schema.data_lock_waits w
40
+ JOIN performance_schema.data_locks l ON w.requested_lock_id = l.lock_id;
41
+ ```
42
+
43
+ ## Prevention
44
+ - Keep transactions short. Do I/O outside transactions.
45
+ - Ensure WHERE columns in UPDATE/DELETE are indexed.
46
+ - Use `SELECT ... FOR UPDATE` sparingly. Batch large updates with `LIMIT`.
47
+ - Access rows in a consistent order (by PK or index) across all transactions.
48
+
49
+ ## Retry Pattern (Error 1213)
50
+
51
+ In applications, retries are a common workaround for occasional deadlocks.
52
+
53
+ **Important**: ensure the operation is idempotent (or can be safely retried) before adding automatic retries, especially if there are side effects outside the database.
54
+
55
+ ```pseudocode
56
+ def execute_with_retry(db, fn, max_retries=3):
57
+ for attempt in range(max_retries):
58
+ try:
59
+ with db.begin():
60
+ return fn()
61
+ except OperationalError as e:
62
+ if e.args[0] == 1213 and attempt < max_retries - 1:
63
+ time.sleep(0.05 * (2 ** attempt))
64
+ continue
65
+ raise
66
+ ```
67
+
68
+ ## Common Misconceptions
69
+ - **"Deadlocks are bugs"** — deadlocks are a normal part of concurrent systems. The goal is to minimize frequency, not eliminate them entirely.
70
+ - **"READ COMMITTED eliminates deadlocks"** — it reduces gap/next-key lock deadlocks, but deadlocks still happen from opposite ordering, missing indexes, and lock contention.
71
+ - **"All deadlocks are from gap locks"** — many are caused by opposite row ordering even without gap locks.
72
+ - **"Victim selection is random"** — InnoDB generally chooses the transaction with lower rollback cost (fewer rows changed).
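The retry pseudocode in the deadlocks reference can be made runnable with a few adjustments. The version below is a sketch under stated assumptions: `OperationalError` is a local stand-in for a driver exception whose `args[0]` is the MySQL error code, `fn` is expected to open and commit its own transaction, and `sleep` is injectable so the backoff can be tested without waiting.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

ER_LOCK_DEADLOCK = 1213  # MySQL error code for "Deadlock found"


class OperationalError(Exception):
    """Stand-in for a driver's OperationalError; args[0] is the error code."""


def execute_with_retry(
    fn: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.05,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run fn, retrying only on deadlock (1213) with exponential backoff."""
    for attempt in range(max_retries):
        try:
            return fn()
        except OperationalError as exc:
            is_deadlock = bool(exc.args) and exc.args[0] == ER_LOCK_DEADLOCK
            if not is_deadlock or attempt == max_retries - 1:
                raise  # other errors, or the final attempt, propagate
            sleep(base_delay * (2 ** attempt))
    raise ValueError("max_retries must be at least 1")
```

Adding random jitter to the delay is a common refinement when many workers deadlock on the same rows; it is left out here to keep the behaviour deterministic.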
@@ -0,0 +1,66 @@
1
+ ---
2
+ title: EXPLAIN Plan Analysis
3
+ description: EXPLAIN output guide
4
+ tags: mysql, explain, query-plan, performance, indexes
5
+ ---
6
+
7
+ # EXPLAIN Analysis
8
+
9
+ ```sql
10
+ EXPLAIN SELECT ...; -- estimated plan
11
+ EXPLAIN FORMAT=JSON SELECT ...; -- detailed with cost estimates
12
+ EXPLAIN FORMAT=TREE SELECT ...; -- tree format (8.0.16+)
13
+ EXPLAIN ANALYZE SELECT ...; -- actual execution (8.0.18+, runs the query, uses TREE format)
14
+ ```
15
+
16
+ ## Access Types (Best → Worst)
17
+ `system` → `const` → `eq_ref` → `ref` → `range` → `index` (full index scan) → `ALL` (full table scan)
18
+
19
+ Target `ref` or better. `ALL` on >1000 rows almost always needs an index.
20
+
21
+ ## Key Extra Flags
22
+ | Flag | Meaning | Action |
23
+ |---|---|---|
24
+ | `Using index` | Covering index (optimal) | None |
25
+ | `Using filesort` | Sort not via index | Index the ORDER BY columns |
26
+ | `Using temporary` | Temp table for GROUP BY | Index the grouped columns |
27
+ | `Using join buffer` | No index on join column | Add index on join column |
28
+ | `Using index condition` | ICP — engine filters at index level | Generally good |
29
+
30
+ ## key_len — How Much of Composite Index Is Used
31
+ Byte sizes: `TINYINT`=1, `INT`=4, `BIGINT`=8, `DATE`=3, `DATETIME`=5, `VARCHAR(N)` utf8mb4: N×4+2 (`key_len` always counts a 2-byte length prefix for `VARCHAR`). Add 1 byte per nullable column.
32
+
33
+ ```sql
34
+ -- Index: (status TINYINT, created_at DATETIME)
35
+ -- key_len=2 → only status (1 + 1 null byte). key_len=8 → both columns (2 + 5 + 1 null byte), assuming both are nullable.
36
+ ```
37
+
38
+ ## rows vs filtered
39
+ - `rows`: estimated rows examined after index access (before additional WHERE filtering)
40
+ - `filtered`: percent of examined rows expected to pass the full WHERE conditions
41
+ - Rough estimate of rows that satisfy the query: `rows × filtered / 100`
42
+ - Low `filtered` often means additional (non-indexed) predicates are filtering out lots of rows
43
+
44
+ ## Join Order
45
+ Row order in EXPLAIN output reflects execution order: the first row is typically the first table read, and subsequent rows are joined in order. Use this to spot suboptimal join ordering (e.g., starting with a large table when a selective table could drive the join).
46
+
47
+ ## EXPLAIN ANALYZE
48
+ **Availability:** MySQL 8.0.18+
49
+
50
+ **Important:** `EXPLAIN ANALYZE` actually executes the query (it does not return the result rows). It uses `FORMAT=TREE` automatically.
51
+
52
+ **Metrics (TREE output):**
53
+ - `actual time`: milliseconds, shown as `first..last` (time to return the first row, then all rows), averaged per loop
54
+ - `rows`: actual rows produced by that iterator
55
+ - `loops`: number of times the iterator ran
56
+
57
+ Compare estimated vs actual to find optimizer misestimates. Large discrepancies often improve after refreshing statistics:
58
+
59
+ ```sql
60
+ ANALYZE TABLE your_table;
61
+ ```
62
+
63
+ **Limitations / pitfalls:**
64
+ - Adds instrumentation overhead (measurements are not perfectly "free")
65
+ - Cost units (arbitrary) and time (ms) are different; don't compare them directly
66
+ - Results reflect real execution, including buffer pool/cache effects (warm cache can hide I/O problems)
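The `key_len` and `rows × filtered` arithmetic in the EXPLAIN reference can be written out as a small calculator. This is a sketch for reading plans, with made-up function names; it assumes `DATETIME` without fractional seconds and counts a 2-byte length prefix for `VARCHAR`, which is how `key_len` reports it.

```python
from itertools import accumulate
from typing import List, Sequence

# Key sizes in bytes for fixed-width types, as listed in the reference.
FIXED_KEY_BYTES = {"TINYINT": 1, "INT": 4, "BIGINT": 8, "DATE": 3, "DATETIME": 5}


def column_key_len(type_name: str, nullable: bool, varchar_length: int = 0, bytes_per_char: int = 4) -> int:
    """Bytes one column contributes to key_len."""
    if type_name == "VARCHAR":
        size = varchar_length * bytes_per_char + 2  # 2-byte length prefix
    else:
        size = FIXED_KEY_BYTES[type_name]
    return size + (1 if nullable else 0)  # 1 extra byte for the NULL flag


def prefix_key_lens(column_lens: Sequence[int]) -> List[int]:
    """Cumulative key_len after using 1, 2, ... leading index columns."""
    return list(accumulate(column_lens))


def estimated_matches(rows: int, filtered_percent: float) -> float:
    """Rough rows passing the full WHERE clause: rows * filtered / 100."""
    return rows * filtered_percent / 100
```

Comparing the `key_len` in a plan against `prefix_key_lens` shows how many leading columns of the composite index were actually used.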
@@ -0,0 +1,28 @@
1
+ ---
2
+ title: Fulltext Search Indexes
3
+ description: Fulltext index guide
4
+ tags: mysql, fulltext, search, indexes, boolean-mode
5
+ ---
6
+
7
+ # Fulltext Indexes
8
+
9
+ Fulltext indexes are useful for keyword text search in MySQL. For advanced ranking, fuzzy matching, or complex document search, prefer a dedicated search engine.
10
+
11
+ ```sql
12
+ ALTER TABLE articles ADD FULLTEXT INDEX ft_title_body (title, body);
13
+
14
+ -- Natural language (default, sorted by relevance)
15
+ SELECT *, MATCH(title, body) AGAINST('database performance') AS score
16
+ FROM articles WHERE MATCH(title, body) AGAINST('database performance');
17
+
18
+ -- Boolean mode: + required, - excluded, * suffix wildcard, "exact phrase"
19
+ SELECT * FROM articles WHERE MATCH(title, body) AGAINST('+mysql -postgres +optim*' IN BOOLEAN MODE);
20
+ ```
21
+
22
+ ## Key Gotchas
23
+ - **Min word length**: default 3 chars (`innodb_ft_min_token_size`). Shorter words are ignored. Changing this requires rebuilding the FULLTEXT index (drop/recreate) to take effect.
24
+ - **Stopwords**: common words excluded. Control stopwords with `innodb_ft_enable_stopword` and customize via `innodb_ft_user_stopword_table` / `innodb_ft_server_stopword_table` (set before creating the index, then rebuild to apply changes).
25
+ - **No partial matching**: unlike `LIKE '%term%'`, requires whole tokens (except `*` in boolean mode).
26
+ - **MATCH() columns must correspond to an index definition**: `MATCH(title, body)` needs a FULLTEXT index that covers the same column set (e.g. `(title, body)`).
27
+ - Boolean mode without required terms (no leading `+`) can match a very large portion of the index and be slow.
28
+ - Fulltext adds write overhead — consider Elasticsearch/Meilisearch for complex search needs.
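The boolean-mode operators and the minimum token size gotcha from the fulltext reference can be combined in a helper that builds the `AGAINST` string and reports which terms would be ignored. This is a hypothetical sketch: it models only the default `innodb_ft_min_token_size` of 3, does not consult the stopword list, and passes `*` wildcard terms through unchanged because short-prefix behaviour is not modeled here.

```python
from typing import Iterable, List, Tuple


def boolean_query(
    required: Iterable[str],
    excluded: Iterable[str] = (),
    min_token_size: int = 3,
) -> Tuple[str, List[str]]:
    """Return (query string for AGAINST(... IN BOOLEAN MODE), ignored terms)."""
    parts: List[str] = []
    ignored: List[str] = []
    for operator, terms in (("+", required), ("-", excluded)):
        for term in terms:
            if not term.endswith("*") and len(term) < min_token_size:
                ignored.append(term)  # shorter than the indexed token size
                continue
            parts.append(operator + term)
    return " ".join(parts), ignored
```

The resulting string should be passed to the driver as a bound parameter rather than concatenated into the SQL text.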