@pageai/ralph-loop 1.9.0 → 1.10.0

Files changed (45)
  1. package/.agents/skills/mysql/SKILL.md +81 -0
  2. package/.agents/skills/mysql/references/character-sets.md +66 -0
  3. package/.agents/skills/mysql/references/composite-indexes.md +59 -0
  4. package/.agents/skills/mysql/references/connection-management.md +70 -0
  5. package/.agents/skills/mysql/references/covering-indexes.md +47 -0
  6. package/.agents/skills/mysql/references/data-types.md +69 -0
  7. package/.agents/skills/mysql/references/deadlocks.md +72 -0
  8. package/.agents/skills/mysql/references/explain-analysis.md +66 -0
  9. package/.agents/skills/mysql/references/fulltext-indexes.md +28 -0
  10. package/.agents/skills/mysql/references/index-maintenance.md +110 -0
  11. package/.agents/skills/mysql/references/isolation-levels.md +49 -0
  12. package/.agents/skills/mysql/references/json-column-patterns.md +77 -0
  13. package/.agents/skills/mysql/references/n-plus-one.md +77 -0
  14. package/.agents/skills/mysql/references/online-ddl.md +53 -0
  15. package/.agents/skills/mysql/references/partitioning.md +92 -0
  16. package/.agents/skills/mysql/references/primary-keys.md +70 -0
  17. package/.agents/skills/mysql/references/query-optimization-pitfalls.md +117 -0
  18. package/.agents/skills/mysql/references/replication-lag.md +46 -0
  19. package/.agents/skills/mysql/references/row-locking-gotchas.md +63 -0
  20. package/.agents/skills/postgres/SKILL.md +46 -0
  21. package/.agents/skills/postgres/references/backup-recovery.md +41 -0
  22. package/.agents/skills/postgres/references/index-optimization.md +69 -0
  23. package/.agents/skills/postgres/references/indexing.md +61 -0
  24. package/.agents/skills/postgres/references/memory-management-ops.md +39 -0
  25. package/.agents/skills/postgres/references/monitoring.md +59 -0
  26. package/.agents/skills/postgres/references/mvcc-transactions.md +38 -0
  27. package/.agents/skills/postgres/references/mvcc-vacuum.md +41 -0
  28. package/.agents/skills/postgres/references/optimization-checklist.md +19 -0
  29. package/.agents/skills/postgres/references/partitioning.md +79 -0
  30. package/.agents/skills/postgres/references/process-architecture.md +46 -0
  31. package/.agents/skills/postgres/references/ps-cli-api-insights.md +53 -0
  32. package/.agents/skills/postgres/references/ps-cli-commands.md +72 -0
  33. package/.agents/skills/postgres/references/ps-connection-pooling.md +72 -0
  34. package/.agents/skills/postgres/references/ps-connections.md +37 -0
  35. package/.agents/skills/postgres/references/ps-extensions.md +27 -0
  36. package/.agents/skills/postgres/references/ps-insights.md +62 -0
  37. package/.agents/skills/postgres/references/query-patterns.md +80 -0
  38. package/.agents/skills/postgres/references/replication.md +49 -0
  39. package/.agents/skills/postgres/references/schema-design.md +66 -0
  40. package/.agents/skills/postgres/references/storage-layout.md +41 -0
  41. package/.agents/skills/postgres/references/wal-operations.md +42 -0
  42. package/README.md +1 -1
  43. package/bin/cli.js +2 -0
  44. package/bin/lib/shadcn.js +1 -1
  45. package/package.json +1 -1
@@ -0,0 +1,110 @@
1
+ ---
2
+ title: Index Maintenance and Cleanup
3
+ description: Index maintenance
4
+ tags: mysql, indexes, maintenance, unused-indexes, performance
5
+ ---
6
+
7
+ # Index Maintenance
8
+
9
+ ## Find Unused Indexes
10
+
11
+ ```sql
12
+ -- Requires performance_schema enabled (default in MySQL 5.7+)
13
+ -- "Unused" here means no reads/writes since last restart.
14
+ SELECT object_schema, object_name, index_name, COUNT_READ, COUNT_WRITE
15
+ FROM performance_schema.table_io_waits_summary_by_index_usage
16
+ WHERE object_schema = 'mydb'
17
+ AND index_name IS NOT NULL AND index_name != 'PRIMARY'
18
+ AND COUNT_READ = 0 AND COUNT_WRITE = 0
19
+ ORDER BY object_name, index_name;
20
+ ```
21
+
22
+ Sometimes you'll also see indexes with **writes but no reads** (overhead without query benefit). Review these carefully: some are required for constraints (UNIQUE/PK) even if not used in query plans.
23
+
24
+ ```sql
25
+ SELECT object_schema, object_name, index_name, COUNT_READ, COUNT_WRITE
26
+ FROM performance_schema.table_io_waits_summary_by_index_usage
27
+ WHERE object_schema = 'mydb'
28
+ AND index_name IS NOT NULL AND index_name != 'PRIMARY'
29
+ AND COUNT_READ = 0 AND COUNT_WRITE > 0
30
+ ORDER BY COUNT_WRITE DESC;
31
+ ```
32
+
33
+ Counters reset on restart — ensure 1+ full business cycle of uptime before dropping.
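The two queries above effectively sort indexes into buckets by their read and write counters. A minimal Python sketch of that classification, assuming the rows have already been fetched as `(index_name, COUNT_READ, COUNT_WRITE)` tuples; the function name and the sample rows are hypothetical:

```python
def classify_index(count_read: int, count_write: int) -> str:
    """Bucket an index by its performance_schema I/O counters."""
    if count_read == 0 and count_write == 0:
        return "unused"      # no activity since the counters were last reset
    if count_read == 0:
        return "write-only"  # maintained on writes, never read by a query
    return "in-use"


# Hypothetical rows: (index_name, COUNT_READ, COUNT_WRITE)
rows = [
    ("idx_status", 0, 0),
    ("idx_legacy_ref", 0, 1842),
    ("idx_created_at", 9310, 1842),
]

report = {name: classify_index(reads, writes) for name, reads, writes in rows}
for name, bucket in report.items():
    print(f"{name}: {bucket}")
```

Indexes in the "write-only" bucket still need the constraint check mentioned above before they are treated as drop candidates.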
34
+
35
+ ## Find Redundant Indexes
36
+
37
+ Index on `(a)` is redundant if `(a, b)` exists (leftmost prefix covers it). Pairs sharing only the first column (e.g. `(a,b)` vs `(a,c)`) need manual review — neither is redundant.
38
+
39
+ ```sql
40
+ -- Prefer sys schema view (MySQL 5.7.7+)
41
+ SELECT table_schema, table_name,
42
+ redundant_index_name, redundant_index_columns,
43
+ dominant_index_name, dominant_index_columns
44
+ FROM sys.schema_redundant_indexes
45
+ WHERE table_schema = 'mydb';
46
+ ```
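The leftmost-prefix rule can also be sketched outside the database. The following Python is an illustration only, not how `sys.schema_redundant_indexes` is implemented; the index names and column tuples are hypothetical, and it ignores uniqueness (a UNIQUE index on `(a)` enforces a constraint and is not redundant even if `(a, b)` exists):

```python
from __future__ import annotations


def is_redundant(candidate: tuple[str, ...], other: tuple[str, ...]) -> bool:
    """True if `candidate` is a strict leftmost prefix of `other`."""
    return len(candidate) < len(other) and other[: len(candidate)] == candidate


def find_redundant(indexes: dict[str, tuple[str, ...]]) -> list[tuple[str, str]]:
    """Return (redundant_index, dominant_index) pairs."""
    pairs = []
    for name, cols in indexes.items():
        for other_name, other_cols in indexes.items():
            if name != other_name and is_redundant(cols, other_cols):
                pairs.append((name, other_name))
    return pairs


# Hypothetical index definitions for one table
indexes = {
    "idx_a": ("a",),
    "idx_a_b": ("a", "b"),
    "idx_a_c": ("a", "c"),
}
redundant = find_redundant(indexes)
print(redundant)
```

Here `idx_a` is reported against both composite indexes, while `idx_a_b` and `idx_a_c` are not reported against each other, matching the manual-review case described above.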
47
+
48
+ ## Check Index Sizes
49
+
50
+ ```sql
51
+ SELECT database_name, table_name, index_name,
52
+ ROUND(stat_value * @@innodb_page_size / 1024 / 1024, 2) AS size_mb
53
+ FROM mysql.innodb_index_stats
54
+ WHERE stat_name = 'size' AND database_name = 'mydb'
55
+ ORDER BY stat_value DESC;
56
+ -- stat_value is in pages; multiply by innodb_page_size for bytes
57
+ ```
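The size arithmetic in the query above is pages times page size, converted to megabytes. A small worked example in Python, assuming the default `innodb_page_size` of 16384 bytes; the page count is a made-up value:

```python
def index_size_mb(pages: int, page_size_bytes: int = 16384) -> float:
    """Convert an InnoDB page count to megabytes, rounded like the SQL above."""
    return round(pages * page_size_bytes / 1024 / 1024, 2)


size = index_size_mb(6400)  # hypothetical stat_value of 6400 pages
print(size)  # 100.0
```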
58
+
59
+ ## Index Write Overhead
60
+ Each index must be updated on INSERT, UPDATE, and DELETE operations. More indexes = slower writes.
61
+
62
+ - **INSERT**: each secondary index adds a write
63
+ - **UPDATE**: changing indexed columns updates all affected indexes
64
+ - **DELETE**: removes entries from all indexes
65
+
66
+ InnoDB can defer some secondary index updates via the change buffer, but excessive indexing still reduces write throughput.
67
+
68
+ ## Update Statistics (ANALYZE TABLE)
69
+ The optimizer relies on index cardinality and distribution statistics. After large data changes, refresh statistics:
70
+
71
+ ```sql
72
+ ANALYZE TABLE orders;
73
+ ```
74
+
75
+ This updates statistics (does not rebuild the table).
76
+
77
+ ## Rebuild / Reclaim Space (OPTIMIZE TABLE)
78
+ `OPTIMIZE TABLE` can reclaim space and rebuild indexes:
79
+
80
+ ```sql
81
+ OPTIMIZE TABLE orders;
82
+ ```
83
+
84
+ For InnoDB this effectively rebuilds the table and indexes and can be slow on large tables.
85
+
86
+ ## Invisible Indexes (MySQL 8.0+)
87
+ Test removing an index without dropping it:
88
+
89
+ ```sql
90
+ ALTER TABLE orders ALTER INDEX idx_status INVISIBLE;
91
+ ALTER TABLE orders ALTER INDEX idx_status VISIBLE;
92
+ ```
93
+
94
+ Invisible indexes are still maintained on writes (overhead remains), but the optimizer won't consider them.
95
+
96
+ ## Index Maintenance Tools
97
+
98
+ ### Online DDL (Built-in)
99
+ Most add/drop index operations are online-ish but still take brief metadata locks:
100
+
101
+ ```sql
102
+ ALTER TABLE orders ADD INDEX idx_status (status), ALGORITHM=INPLACE, LOCK=NONE;
103
+ ```
104
+
105
+ ### pt-online-schema-change / gh-ost
106
+ For very large tables or high-write workloads, online schema change tools can reduce blocking by using a shadow table and a controlled cutover (tradeoffs: operational complexity, privileges, triggers/binlog requirements).
107
+
108
+ ## Guidelines
109
+ - 1–5 indexes per table is normal. 6+: audit for redundancy.
110
+ - Combine `performance_schema` data with `EXPLAIN` of frequent queries monthly.
@@ -0,0 +1,49 @@
1
+ ---
2
+ title: InnoDB Transaction Isolation Levels
3
+ description: Best practices for choosing and using isolation levels
4
+ tags: mysql, transactions, isolation, innodb, locking, concurrency
5
+ ---
6
+
7
+ # Isolation Levels (InnoDB Best Practices)
8
+
9
+ **Default to REPEATABLE READ.** It is the InnoDB default, most tested, and prevents phantom reads. Only change per-session with a measured reason.
10
+
11
+ ```sql
12
+ SELECT @@transaction_isolation;
13
+ SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED; -- per-session only
14
+ ```
15
+
16
+ ## Autocommit Interaction
17
+ - Default: `autocommit=1` (each statement is its own transaction).
18
+ - With `autocommit=0`, transactions span multiple statements until `COMMIT`/`ROLLBACK`.
19
+ - Isolation level applies per transaction. SERIALIZABLE behavior differs based on autocommit setting (see SERIALIZABLE section).
20
+
21
+ ## Locking vs Non-Locking Reads
22
+ - **Non-locking reads**: plain `SELECT` statements use consistent reads (MVCC snapshots). They don't acquire locks and don't block writers.
23
+ - **Locking reads**: `SELECT ... FOR UPDATE` (exclusive) or `SELECT ... FOR SHARE` (shared) acquire locks and can block concurrent modifications.
24
+ - `UPDATE` and `DELETE` statements are implicitly locking reads.
25
+
26
+ ## REPEATABLE READ (Default — Prefer This)
27
+ - Consistent reads: snapshot established at first read; all plain SELECTs within the transaction read from that same snapshot (MVCC). Plain SELECTs are non-locking and don't block writers.
28
+ - Locking reads/writes use **next-key locks** (row + gap) — prevents phantoms. Exception: a unique index with a unique search condition locks only the index record, not the gap.
29
+ - **Use for**: OLTP, check-then-insert, financial logic, reports needing consistent snapshots.
30
+ - **Avoid mixing** locking statements (`SELECT ... FOR UPDATE`, `UPDATE`, `DELETE`) with non-locking `SELECT` statements in the same transaction — they can observe different states (current vs snapshot) and lead to surprises.
31
+
32
+ ## READ COMMITTED (Per-Session Only, When Needed)
33
+ - Fresh snapshot per SELECT; **record locks only** (gap locks disabled for searches/index scans, but still used for foreign-key and duplicate-key checks) — more concurrency, but phantoms possible.
34
+ - **Switch only when**: gap-lock deadlocks confirmed via `SHOW ENGINE INNODB STATUS`, bulk imports with contention, or high-write concurrency on overlapping ranges.
35
+ - **Never switch globally.** Check-then-insert patterns break — use `INSERT ... ON DUPLICATE KEY` or `FOR UPDATE` instead.
36
+
37
+ ## SERIALIZABLE — Avoid
38
+ Converts all plain SELECTs to `SELECT ... FOR SHARE` **if autocommit is disabled**. If autocommit is enabled, SELECTs are consistent (non-locking) reads. SERIALIZABLE can cause massive contention when autocommit is disabled. Prefer explicit `SELECT ... FOR UPDATE` at REPEATABLE READ instead — same safety, far less lock scope.
39
+
40
+ ## READ UNCOMMITTED — Never Use
41
+ Dirty reads with no valid production use case.
42
+
43
+ ## Decision Guide
44
+ | Scenario | Recommendation |
45
+ |---|---|
46
+ | General OLTP / check-then-insert / reports | **REPEATABLE READ** (default) |
47
+ | Bulk import or gap-lock deadlocks | **READ COMMITTED** (per-session), benchmark first |
48
+ | Need serializability | Explicit `FOR UPDATE` at REPEATABLE READ; SERIALIZABLE only as last resort |
49
+
@@ -0,0 +1,77 @@
1
+ ---
2
+ title: JSON Column Best Practices
3
+ description: When and how to use JSON columns safely
4
+ tags: mysql, json, generated-columns, indexes, data-modeling
5
+ ---
6
+
7
+ # JSON Column Patterns
8
+
9
+ MySQL 5.7+ supports native JSON columns. Useful, but with important caveats.
10
+
11
+ ## When JSON Is Appropriate
12
+ - Truly schema-less data (user preferences, metadata bags, webhook payloads).
13
+ - Rarely filtered/joined — if you query a JSON path frequently, extract it to a real column.
14
+
15
+ ## Indexing JSON: Use Generated Columns
16
+ You **cannot** index a JSON column directly. Create a virtual generated column and index that:
17
+ ```sql
18
+ ALTER TABLE events
19
+ ADD COLUMN event_type VARCHAR(50) GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(data, '$.type'))) VIRTUAL,
20
+ ADD INDEX idx_event_type (event_type);
21
+ ```
22
+
23
+ ## Extraction Operators
24
+ | Syntax | Returns | Use for |
25
+ |---|---|---|
26
+ | `JSON_EXTRACT(col, '$.key')` | JSON type value (e.g., `"foo"` for strings) | When you need JSON type semantics |
27
+ | `col->'$.key'` | Same as `JSON_EXTRACT(col, '$.key')` | Shorthand |
28
+ | `col->>'$.key'` | Unquoted scalar (equivalent to `JSON_UNQUOTE(JSON_EXTRACT(col, '$.key'))`) | WHERE comparisons, display |
29
+
30
+ Always use `->>` (unquote) in WHERE clauses, otherwise you compare against `"foo"` (with quotes).
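As a rough analogy for what the two operators return (this is Python's `json` module, not MySQL, and it does not model MySQL's own comparison rules), the JSON text of a string carries its quotes while the unquoted scalar does not:

```python
import json

doc = json.loads('{"type": "foo", "price": 900}')

extracted = json.dumps(doc["type"])  # JSON text, analogous to JSON_EXTRACT / ->
unquoted = doc["type"]               # plain string, analogous to ->>

print(extracted)           # "foo"  (the quotes are part of the value)
print(unquoted)            # foo
print(extracted == "foo")  # False
print(unquoted == "foo")   # True
```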
31
+
32
+ Tip: the generated column example above can be written more concisely as:
33
+
34
+ ```sql
35
+ ALTER TABLE events
36
+ ADD COLUMN event_type VARCHAR(50) GENERATED ALWAYS AS (data->>'$.type') VIRTUAL,
37
+ ADD INDEX idx_event_type (event_type);
38
+ ```
39
+
40
+ ## Multi-Valued Indexes (MySQL 8.0.17+)
41
+ If you store arrays in JSON (e.g., `tags: ["electronics","sale"]`), MySQL 8.0.17+ supports multi-valued indexes to index array elements:
42
+
43
+ ```sql
44
+ ALTER TABLE products
45
+ ADD INDEX idx_tags ((CAST(tags AS CHAR(50) ARRAY)));
46
+ ```
47
+
48
+ This can accelerate membership queries such as:
49
+
50
+ ```sql
51
+ SELECT * FROM products WHERE 'electronics' MEMBER OF (tags);
52
+ ```
53
+
54
+ ## Collation and Type Casting Pitfalls
55
+ - **JSON type comparisons**: `JSON_EXTRACT` returns JSON type. Comparing directly to strings can be wrong for numbers/dates.
56
+
57
+ ```sql
58
+ -- WRONG: lexicographic string comparison
59
+ WHERE data->>'$.price' <= '1200'
60
+
61
+ -- CORRECT: cast to numeric
62
+ WHERE CAST(data->>'$.price' AS UNSIGNED) <= 1200
63
+ ```
64
+
65
+ - **Collation**: values extracted with `->>` behave like strings and use a collation. Use `COLLATE` when you need a specific comparison behavior.
66
+
67
+ ```sql
68
+ WHERE data->>'$.status' COLLATE utf8mb4_0900_as_cs = 'Active'
69
+ ```
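The numeric-cast pitfall above comes down to strings being compared character by character. A short Python illustration with made-up prices; MySQL string comparison goes through a collation, but digits order the same way:

```python
prices = ["900", "1200", "1000000"]

# String comparison: "900" is excluded ('9' > '1'), "1000000" is included ('0' < '2').
as_strings = [p for p in prices if p <= "1200"]

# Numeric comparison after an explicit cast, as in the CORRECT query.
as_numbers = [p for p in prices if int(p) <= 1200]

print(as_strings)  # ['1200', '1000000']
print(as_numbers)  # ['900', '1200']
```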
70
+
71
+ ## Common Pitfalls
72
+ - **Heavy update cost**: `JSON_SET`/`JSON_REPLACE` can touch large portions of a JSON document and generate significant redo/undo work on large blobs.
73
+ - **No direct JSON indexes**: You can only index extracted scalar paths via generated columns, or array elements via multi-valued indexes (MySQL 8.0.17+).
74
+ - **Large documents hurt**: JSON is stored inline in the row; documents over roughly 8 KB spill to overflow pages, hurting read performance.
75
+ - **Type mismatches**: `JSON_EXTRACT` returns a JSON type. Comparing with `= 'foo'` may not match — use `->>` or `JSON_UNQUOTE`.
76
+ - **VIRTUAL vs STORED generated columns**: VIRTUAL columns compute on read (less storage, more CPU). STORED columns materialize on write (more storage, faster reads if selected often). Both can be indexed; for indexed paths, the index stores the computed value either way.
77
+
@@ -0,0 +1,77 @@
1
+ ---
2
+ title: N+1 Query Detection and Fixes
3
+ description: N+1 query solutions
4
+ tags: mysql, n-plus-one, orm, query-optimization, performance
5
+ ---
6
+
7
+ # N+1 Query Detection
8
+
9
+ ## What Is N+1?
10
+ The N+1 pattern occurs when you fetch N parent records, then execute N additional queries (one per parent) to fetch related data.
11
+
12
+ Example: 1 query for users + N queries for posts.
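To make the query count concrete, here is a minimal sketch using Python's built-in `sqlite3` as a stand-in for MySQL. The schema, the sample rows, and the `run` helper are all hypothetical; only the number of round trips is the point:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER, title TEXT);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
    INSERT INTO posts VALUES (1, 1, 'a1'), (2, 1, 'a2'), (3, 2, 'b1');
""")

stats = {"queries": 0}


def run(sql, params=()):
    """Execute a query and count it as one round trip."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


# N+1: one query for the parents, then one query per parent.
users = run("SELECT id FROM users")
for (user_id,) in users:
    run("SELECT title FROM posts WHERE user_id = ?", (user_id,))
n_plus_one_queries = stats["queries"]

# Batched: one query for the parents, one query for all children.
stats["queries"] = 0
users = run("SELECT id FROM users")
ids = [user_id for (user_id,) in users]
placeholders = ", ".join("?" for _ in ids)
posts = run(
    f"SELECT user_id, title FROM posts WHERE user_id IN ({placeholders})", ids
)
batched_queries = stats["queries"]

print(n_plus_one_queries, batched_queries)  # 4 2
```

With three users, the loop issues 1 + 3 queries while the batched version issues 2 regardless of how many users there are.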
13
+
14
+ ## ORM Fixes (Quick Reference)
15
+
16
+ - **SQLAlchemy 1.x**: `session.query(User).options(joinedload(User.posts))`
17
+ - **SQLAlchemy 2.0**: `select(User).options(joinedload(User.posts))`
18
+ - **Django**: `select_related('fk_field')` for FK/O2O, `prefetch_related('m2m_field')` for M2M/reverse FK
19
+ - **ActiveRecord**: `User.includes(:orders)`
20
+ - **Prisma**: `findMany({ include: { orders: true } })`
21
+ - **Drizzle**: use `.leftJoin()` instead of loop queries
22
+
23
+ ```typescript
24
+ // Drizzle example: avoid N+1 with a join
25
+ const rows = await db
26
+ .select()
27
+ .from(users)
28
+ .leftJoin(posts, eq(users.id, posts.userId));
29
+ ```
30
+
31
+ ## Detecting in MySQL Production
32
+
33
+ ```sql
34
+ -- High-frequency simple queries often indicate N+1
35
+ -- Requires performance_schema enabled (default in MySQL 5.7+)
36
+ SELECT digest_text, count_star, avg_timer_wait
37
+ FROM performance_schema.events_statements_summary_by_digest
38
+ ORDER BY count_star DESC LIMIT 20;
39
+ ```
40
+
41
+ Also check the slow query log, aggregated and sorted by occurrence count (for example with `mysqldumpslow -s c`), for frequently repeated simple SELECTs.
42
+
43
+ ## Batch Consolidation
44
+ Replace sequential queries with `WHERE id IN (...)`.
45
+
46
+ Practical limits:
47
+ - Total statement size is capped by `max_allowed_packet` (default 4MB in MySQL 5.7, 64MB in MySQL 8.0).
48
+ - Very large IN lists increase parsing/planning overhead and can hurt performance.
49
+
50
+ Strategies:
51
+ - Up to ~1000–5000 ids: `IN (...)` is usually fine.
52
+ - Larger: chunk the list (e.g. batches of 500–1000) or use a temporary table and join.
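The chunking strategy can be sketched as a small helper that yields one parameterized statement per batch. This is an illustration under assumptions: the `posts`/`user_id` names are borrowed from the examples in this file, the `%s` placeholder style is the one used by common MySQL drivers for Python, and the helper names are hypothetical:

```python
from typing import Iterator, List, Sequence, Tuple


def chunked(ids: Sequence[int], size: int) -> Iterator[Sequence[int]]:
    """Yield consecutive slices of at most `size` ids."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def in_clause_queries(
    ids: Sequence[int], size: int = 1000
) -> Iterator[Tuple[str, List[int]]]:
    """Yield (sql, params) pairs, one per batch of ids."""
    for batch in chunked(ids, size):
        placeholders = ", ".join(["%s"] * len(batch))
        sql = f"SELECT * FROM posts WHERE user_id IN ({placeholders})"
        yield sql, list(batch)


queries = list(in_clause_queries(list(range(1, 2501)), size=1000))
print(len(queries))                            # 3
print([len(params) for _, params in queries])  # [1000, 1000, 500]
```

Each pair would then be passed to the driver's `execute(sql, params)`, so ids are always bound as parameters rather than interpolated into the statement text.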
53
+
54
+ ```sql
55
+ -- Temporary table approach for large batches
56
+ CREATE TEMPORARY TABLE temp_user_ids (id BIGINT PRIMARY KEY);
57
+ INSERT INTO temp_user_ids VALUES (1), (2), (3);
58
+
59
+ SELECT p.*
60
+ FROM posts p
61
+ JOIN temp_user_ids t ON p.user_id = t.id;
62
+ ```
63
+
64
+ ## Joins vs Separate Queries
65
+ - Prefer **JOINs** when you need related data for most/all parent rows and the result set stays reasonable.
66
+ - Prefer **separate queries** (batched) when JOINs would explode rows (one-to-many) or over-fetch too much data.
67
+
68
+ ## Eager Loading Caveats
69
+ - **Over-fetching**: eager loading pulls *all* related rows unless you filter it.
70
+ - **Memory**: loading large collections can blow up memory.
71
+ - **Row multiplication**: JOIN-based eager loading can create huge result sets; in some ORMs, a "select-in" strategy is safer.
72
+
73
+ ## Prepared Statements
74
+ Prepared statements reduce repeated parse/optimize overhead for repeated parameterized queries, but they do **not** eliminate N+1: you still execute N queries. Use batching/eager loading to reduce query count.
75
+
76
+ ## Pagination Pitfalls
77
+ N+1 often reappears per page. Ensure eager loading or batching is applied to the paginated query, not inside the per-row loop.
@@ -0,0 +1,53 @@
1
+ ---
2
+ title: Online DDL and Schema Migrations
3
+ description: Lock-safe ALTER TABLE guidance
4
+ tags: mysql, ddl, schema-migration, alter-table, innodb
5
+ ---
6
+
7
+ # Online DDL
8
+
9
+ Not all `ALTER TABLE` is equal — some block writes for the entire duration.
10
+
11
+ ## Algorithm Spectrum
12
+
13
+ | Algorithm | What Happens | DML During? |
14
+ |---|---|---|
15
+ | `INSTANT` | Metadata-only change | Yes |
16
+ | `INPLACE` | Rebuilds in background | Usually yes |
17
+ | `COPY` | Full table copy to tmp table | **Blocked** |
18
+
19
+ MySQL picks the fastest available. Specify explicitly to fail-safe:
20
+ ```sql
21
+ ALTER TABLE orders ADD COLUMN note VARCHAR(255) DEFAULT NULL, ALGORITHM=INSTANT;
22
+ -- Fails loudly if INSTANT isn't possible, rather than silently falling back to COPY.
23
+ ```
24
+
25
+ ## What Supports INSTANT (MySQL 8.0.12+)
26
+ - Adding a column (at any position as of 8.0.29; only at end before 8.0.29)
27
+ - Dropping a column (8.0.29+)
28
+ - Renaming a column (8.0.28+)
29
+
30
+ **Not INSTANT**: adding indexes (uses INPLACE), dropping indexes (uses INPLACE; typically metadata-only), changing column type, extending VARCHAR (uses INPLACE), adding columns when INSTANT isn't supported for the table/operation.
31
+
32
+ ## Lock Levels
33
+ `LOCK=NONE` (concurrent DML), `LOCK=SHARED` (reads only), `LOCK=EXCLUSIVE` (full block), `LOCK=DEFAULT` (server chooses maximum concurrency; default).
34
+
35
+ Always request an explicit `ALGORITHM`, plus `LOCK=NONE` for `INPLACE` operations, to surface conflicts early instead of silently falling back to a more blocking method. (`ALGORITHM=INSTANT` accepts only `LOCK=DEFAULT`.)
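One way to enforce this habit is a pre-flight check in the migration tooling. The sketch below is a hypothetical lint written for illustration, not part of any MySQL tool. It assumes the MySQL manual's rule that `ALGORITHM=INSTANT` permits only `LOCK=DEFAULT`, so it does not demand `LOCK=NONE` for INSTANT statements:

```python
import re


def check_alter(statement: str) -> str:
    """Return "ok" or a short reason why the ALTER TABLE looks unsafe."""
    algo = re.search(r"\bALGORITHM\s*=\s*(\w+)", statement, re.IGNORECASE)
    lock = re.search(r"\bLOCK\s*=\s*(\w+)", statement, re.IGNORECASE)
    if not algo:
        return "missing ALGORITHM"
    algorithm = algo.group(1).upper()
    if algorithm == "COPY":
        return "COPY blocks writes"
    if algorithm == "INPLACE" and (not lock or lock.group(1).upper() != "NONE"):
        return "INPLACE without LOCK=NONE"
    return "ok"


statements = [
    "ALTER TABLE orders ADD INDEX idx_status (status)",
    "ALTER TABLE orders ADD INDEX idx_status (status), ALGORITHM=INPLACE, LOCK=NONE",
    "ALTER TABLE orders ADD COLUMN note VARCHAR(255) DEFAULT NULL, ALGORITHM=INSTANT",
    "ALTER TABLE orders MODIFY note TEXT, ALGORITHM=COPY",
]
results = [check_alter(s) for s in statements]
for statement, result in zip(statements, results):
    print(f"{result}: {statement}")
```

The check only inspects the statement text; whether the server accepts the requested algorithm is still decided when the statement runs.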
36
+
37
+ ## Large Tables (millions+ rows)
38
+ Even `INPLACE` operations typically hold brief metadata locks at start/end. The commit phase requires an exclusive metadata lock and will wait for concurrent transactions to finish; long-running transactions can block DDL from completing.
39
+
40
+ On huge tables, consider external tools:
41
+ - **pt-online-schema-change**: creates shadow table, syncs via triggers.
42
+ - **gh-ost**: triggerless, uses binlog stream. Preferred for high-write tables.
43
+
44
+ ## Replication Considerations
45
+ - DDL replicates to replicas and executes there, potentially causing lag (especially COPY-like rebuilds).
46
+ - INSTANT operations minimize replication impact because they complete quickly.
47
+ - INPLACE operations can still cause lag and metadata lock waits on replicas during apply.
48
+
49
+ ## PlanetScale Users
50
+ On PlanetScale, use **deploy requests** instead of manual DDL tools. Vitess handles non-blocking migrations automatically. Use this whenever possible because it offers much safer schema migrations.
51
+
52
+ ## Key Rule
53
+ Never run `ALTER TABLE` on production without checking the algorithm. A surprise `COPY` on a 100M-row table can lock writes for hours.
@@ -0,0 +1,92 @@
1
+ ---
2
+ title: MySQL Partitioning
3
+ description: Partition types and management operations
4
+ tags: mysql, partitioning, range, list, hash, maintenance, data-retention
5
+ ---
6
+
7
+ # Partitioning
8
+
9
+ All columns used in the partitioning expression must be part of every UNIQUE/PRIMARY KEY.
10
+
11
+ ## Partition Pruning
12
+ The optimizer can eliminate partitions that cannot contain matching rows based on the WHERE clause ("partition pruning"). Partitioning helps most when queries frequently filter by the partition key/expression:
13
+ - Equality: `WHERE partition_key = ?` (HASH/KEY)
14
+ - Ranges: `WHERE partition_key BETWEEN ? AND ?` (RANGE)
15
+ - IN lists: `WHERE partition_key IN (...)` (LIST)
16
+
17
+ ## Types
18
+
19
+ | Need | Type |
20
+ |---|---|
21
+ | Time-ordered / data retention | RANGE |
22
+ | Discrete categories | LIST |
23
+ | Even distribution | HASH / KEY |
24
+ | Two access patterns | RANGE + HASH sub |
25
+
26
+ ```sql
27
+ -- RANGE COLUMNS (direct date comparisons; avoids function wrapper)
28
+ PARTITION BY RANGE COLUMNS (created_at) (
29
+ PARTITION p2025_q1 VALUES LESS THAN ('2025-04-01'),
30
+ PARTITION p_future VALUES LESS THAN (MAXVALUE)
31
+ );
32
+
33
+ -- RANGE with function (use when you must partition by an expression)
34
+ PARTITION BY RANGE (TO_DAYS(created_at)) (
35
+ PARTITION p2025_q1 VALUES LESS THAN (TO_DAYS('2025-04-01')),
36
+ PARTITION p_future VALUES LESS THAN MAXVALUE
37
+ );
38
+ -- LIST (discrete categories — unlisted values cause errors, ensure full coverage)
39
+ PARTITION BY LIST COLUMNS (region) (
40
+ PARTITION p_americas VALUES IN ('us', 'ca', 'br'),
41
+ PARTITION p_europe VALUES IN ('uk', 'de', 'fr')
42
+ );
43
+ -- HASH/KEY (even distribution, equality pruning only)
44
+ PARTITION BY HASH (user_id) PARTITIONS 8;
45
+ ```
46
+
47
+ ## Foreign Key Restrictions (InnoDB)
48
+ Partitioned InnoDB tables do not support foreign keys:
49
+ - A partitioned table cannot define foreign key constraints to other tables.
50
+ - Other tables cannot reference a partitioned table with a foreign key.
51
+
52
+ If you need foreign keys, partitioning may not be an option.
53
+
54
+ ## When Partitioning Helps vs Hurts
55
+ **Helps:**
56
+ - Very large tables (millions+ rows) with time-ordered access patterns
57
+ - Data retention workflows (drop old partitions vs DELETE)
58
+ - Queries that filter by the partition key/expression (enables pruning)
59
+ - Maintenance on subsets of data (operate on partitions vs whole table)
60
+
61
+ **Hurts:**
62
+ - Small tables (overhead without benefit)
63
+ - Queries that don't filter by the partition key (no pruning)
64
+ - Workloads that require foreign keys
65
+ - Complex UNIQUE key requirements (partition key columns must be included everywhere)
66
+
67
+ ## Management Operations
68
+
69
+ ```sql
70
+ -- Add: split catch-all MAXVALUE partition
71
+ ALTER TABLE events REORGANIZE PARTITION p_future INTO (
72
+ PARTITION p2026_01 VALUES LESS THAN (TO_DAYS('2026-02-01')),
73
+ PARTITION p_future VALUES LESS THAN MAXVALUE
74
+ );
75
+ -- Drop aged-out data (orders of magnitude faster than DELETE)
76
+ ALTER TABLE events DROP PARTITION p2025_q1;
77
+ -- Merge partitions
78
+ ALTER TABLE events REORGANIZE PARTITION p2025_01, p2025_02, p2025_03 INTO (
79
+ PARTITION p2025_q1 VALUES LESS THAN (TO_DAYS('2025-04-01'))
80
+ );
81
+ -- Archive via exchange (LIKE copies the partitioning too, so remove it; structures must otherwise match)
82
+ CREATE TABLE events_archive LIKE events;
83
+ ALTER TABLE events_archive REMOVE PARTITIONING;
84
+ ALTER TABLE events EXCHANGE PARTITION p2025_q1 WITH TABLE events_archive;
85
+ ```
86
+
87
+ Notes:
88
+ - `REORGANIZE PARTITION` rebuilds the affected partition(s).
89
+ - `EXCHANGE PARTITION` requires an exact structure match (including indexes) and the target table must not be partitioned.
90
+ - `DROP PARTITION` is DDL (fast) vs `DELETE` (DML; slow on large datasets).
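Partition clauses like the ones in the `REORGANIZE PARTITION` example are usually generated rather than typed by hand. A minimal Python sketch that produces monthly clauses in the same naming style; the function names are hypothetical, and the output is only the clause text, to be reviewed before it goes into an `ALTER TABLE`:

```python
from datetime import date
from typing import List


def next_month(d: date) -> date:
    """First day of the month after `d`."""
    return date(d.year + (d.month == 12), d.month % 12 + 1, 1)


def monthly_partitions(start: date, months: int) -> List[str]:
    """Build `months` RANGE partition clauses starting at the month of `start`."""
    clauses = []
    current = start.replace(day=1)
    for _ in range(months):
        upper = next_month(current)
        clauses.append(
            f"PARTITION p{current:%Y_%m} "
            f"VALUES LESS THAN (TO_DAYS('{upper.isoformat()}'))"
        )
        current = upper
    return clauses


clauses = monthly_partitions(date(2026, 1, 1), 3)
for clause in clauses:
    print(clause)
```

The first generated clause is `PARTITION p2026_01 VALUES LESS THAN (TO_DAYS('2026-02-01'))`, the same shape as the hand-written one above.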
91
+
92
+ Always ask for human approval before dropping, deleting, or archiving data.
@@ -0,0 +1,70 @@
1
+ ---
2
+ title: Primary Key Design
3
+ description: Primary key patterns
4
+ tags: mysql, primary-keys, auto-increment, uuid, innodb
5
+ ---
6
+
7
+ # Primary Keys
8
+
9
+ InnoDB stores rows in primary key order (clustered index). This means:
10
+ - **Sequential keys = optimal inserts**: new rows append, minimizing page splits and fragmentation.
11
+ - **Random keys = fragmentation**: random inserts cause page splits to maintain PK order, wasting space and slowing inserts.
12
+ - **Secondary index lookups**: secondary indexes store the PK value and use it to fetch the full row from the clustered index.
13
+
14
+ ## INT vs BIGINT for Primary Keys
15
+ - **INT UNSIGNED**: 4 bytes, max ~4.3B rows.
16
+ - **BIGINT UNSIGNED**: 8 bytes, max ~18.4 quintillion rows.
17
+
18
+ Guideline: default to **BIGINT UNSIGNED** unless you're certain the table will never approach the INT limit. The extra 4 bytes is usually cheaper than the risk of exhausting INT.
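A quick back-of-the-envelope calculation shows why the INT limit is closer than it looks. The insert rate below is an arbitrary assumption, and the sketch assumes every id is consumed by an insert, ignoring gaps:

```python
INT_UNSIGNED_MAX = 2**32 - 1
BIGINT_UNSIGNED_MAX = 2**64 - 1


def days_until_exhausted(max_id: int, inserts_per_second: float) -> float:
    """Days until an auto-increment column reaches max_id at a steady rate."""
    return max_id / inserts_per_second / 86400


print(INT_UNSIGNED_MAX)     # 4294967295
print(BIGINT_UNSIGNED_MAX)  # 18446744073709551615

int_days = days_until_exhausted(INT_UNSIGNED_MAX, 1000)
print(round(int_days, 1))   # 49.7
```

At a sustained 1000 inserts per second, INT UNSIGNED lasts under two months, while BIGINT UNSIGNED at the same rate outlasts any realistic system lifetime.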
19
+
20
+ ## Avoid Random UUID as Clustered PK
21
+ - UUID PK stored as `BINARY(16)`: 16 bytes (vs 8 for BIGINT). Random inserts cause page splits, and every secondary index entry carries the PK.
22
+ - UUID stored as `CHAR(36)`/`VARCHAR(36)`: 36 bytes (+ overhead) and is generally worse for storage and index size.
23
+ - If external identifiers are required, store UUID as `BINARY(16)` in a secondary unique column:
24
+
25
+ ```sql
26
+ CREATE TABLE users (
27
+ id BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
28
+ public_id BINARY(16) NOT NULL,
29
+ UNIQUE KEY idx_public_id (public_id)
30
+ );
31
+ -- UUID_TO_BIN(uuid, 1) reorders UUIDv1 bytes to be roughly time-sorted (reduces fragmentation)
32
+ -- MySQL's UUID() returns UUIDv1 (time-based), which is what the swap flag is for. App-generated UUIDv7 values are already time-ordered; store them with UUID_TO_BIN(?) without the swap flag.
33
+ INSERT INTO users (public_id) VALUES (UUID_TO_BIN(?, 1)); -- app provides UUID string
34
+ ```
35
+
36
+ If UUIDs are required, prefer time-ordered variants such as UUIDv7 (app-generated) to reduce index fragmentation.
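The storage sizes and the effect of the swap flag can be checked outside the database. The Python below is a re-implementation for illustration, not MySQL code. It assumes the byte reordering described in the MySQL manual for `UUID_TO_BIN(uuid, 1)`, where the time-low and time-high groups trade places; the sample UUID and the expected hex string are taken from the manual's example:

```python
import uuid


def uuid_to_bin(text: str, swap: bool = False) -> bytes:
    """Sketch of UUID_TO_BIN: 16 raw bytes, optionally with time parts swapped."""
    raw = uuid.UUID(text).bytes  # time_low(4) time_mid(2) time_hi(2) rest(8)
    if swap:
        return raw[6:8] + raw[4:6] + raw[0:4] + raw[8:]
    return raw


sample = "6ccd780c-baba-1026-9564-5b8c656024db"
plain = uuid_to_bin(sample)
swapped = uuid_to_bin(sample, swap=True)

print(len(sample))            # 36 characters as CHAR(36)
print(len(plain))             # 16 bytes as BINARY(16)
print(plain.hex().upper())    # 6CCD780CBABA102695645B8C656024DB
print(swapped.hex().upper())  # 1026BABA6CCD780C95645B8C656024DB
```

With the swap applied, the slowly changing high bits of the UUIDv1 timestamp lead the value, which is why consecutive UUIDv1 values sort close together.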
37
+
38
+ ## Secondary Indexes Include the Primary Key
39
+ InnoDB secondary indexes store the primary key value with each index entry. Implications:
40
+ - **Larger secondary indexes**: a secondary index entry includes (indexed columns + PK bytes).
41
+ - **Covering reads**: `SELECT id FROM users WHERE email = ?` can often be satisfied from `INDEX(email)` because `id` (PK) is already present in the index entry.
42
+ - **UUID penalty**: a `BINARY(16)` PK makes every secondary index entry 8 bytes larger than a BIGINT PK.
43
+
44
+ ## Auto-Increment Considerations
45
+ - **Hot spot**: inserts target the end of the clustered index (usually fine; can bottleneck at extreme insert rates).
46
+ - **Gaps are normal**: rollbacks or failed inserts can leave gaps.
47
+ - **Locking**: auto-increment allocation can introduce contention under very high concurrency.
48
+
49
+ ## Alternative Ordered IDs (Snowflake / ULID / UUIDv7)
50
+ If you need globally unique IDs generated outside the database:
51
+ - **Snowflake-style**: 64-bit integers (fits in BIGINT), time-ordered, compact.
52
+ - **ULID / UUIDv7**: 128-bit (store as `BINARY(16)`), time-ordered, better insert locality than random UUIDv4.
53
+
54
+ Recommendation: prefer `BIGINT AUTO_INCREMENT` unless you need distributed ID generation or externally meaningful identifiers.
55
+
56
+ ## Replication Considerations
57
+ - Random-key insert patterns (UUIDv4) can amplify page splits and I/O on replicas too, increasing lag.
58
+ - Time-ordered IDs reduce fragmentation and tend to replicate more smoothly under heavy insert workloads.
59
+
60
+ ## Composite Primary Keys
61
+
62
+ Use for join/many-to-many tables. Most-queried column first:
63
+
64
+ ```sql
65
+ CREATE TABLE user_roles (
66
+ user_id BIGINT UNSIGNED NOT NULL,
67
+ role_id BIGINT UNSIGNED NOT NULL,
68
+ PRIMARY KEY (user_id, role_id)
69
+ );
70
+ ```
@@ -0,0 +1,117 @@
1
+ ---
2
+ title: Query Optimization Pitfalls
3
+ description: Common anti-patterns that silently kill performance
4
+ tags: mysql, query-optimization, anti-patterns, performance, indexes
5
+ ---
6
+
7
+ # Query Optimization Pitfalls
8
+
9
+ These patterns look correct but bypass indexes or cause full scans.
10
+
11
+ ## Non-Sargable Predicates
12
+ A **sargable** predicate can use an index. Common non-sargable patterns:
13
+ - functions/arithmetic on indexed columns
14
+ - implicit type conversions
15
+ - leading wildcards (`LIKE '%x'`)
16
+ - some negations (`!=`, `NOT IN`, `NOT LIKE`) depending on shape/data
17
+
18
+ ## Functions on Indexed Columns
19
+ ```sql
20
+ -- BAD: function prevents index use on created_at
21
+ WHERE YEAR(created_at) = 2024
22
+
23
+ -- GOOD: sargable range
24
+ WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01'
25
+ ```
26
+
27
+ MySQL 8.0.13+ can use expression (functional) indexes for some cases:
28
+
29
+ ```sql
30
+ CREATE INDEX idx_users_upper_name ON users ((UPPER(name)));
31
+ -- Now this can use idx_users_upper_name:
32
+ WHERE UPPER(name) = 'SMITH'
33
+ ```
34
+
35
+ ## Implicit Type Conversions
36
+ Implicit casts can make indexes unusable:
37
+
38
+ ```sql
39
+ -- If phone is VARCHAR, MySQL converts each phone value to a number for the comparison, so the index on phone cannot be used
40
+ WHERE phone = 1234567890
41
+
42
+ -- Better: match the column type
43
+ WHERE phone = '1234567890'
44
+ ```
45
+
46
+ ## LIKE Patterns
47
+ ```sql
48
+ -- BAD: leading wildcard cannot use a B-Tree index
49
+ WHERE name LIKE '%smith'
50
+ WHERE name LIKE '%smith%'
51
+
52
+ -- GOOD: prefix match can use an index
53
+ WHERE name LIKE 'smith%'
54
+ ```
55
+
56
+ For suffix search, consider storing a reversed generated column + prefix search:
57
+
58
+ ```sql
59
+ ALTER TABLE users
60
+ ADD COLUMN name_reversed VARCHAR(255) AS (REVERSE(name)) STORED,
61
+ ADD INDEX idx_users_name_reversed (name_reversed);
62
+
63
+ WHERE name_reversed LIKE CONCAT(REVERSE('smith'), '%');
64
+ ```
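The reversed-column trick works because a suffix match on a string is a prefix match on its reverse. A short Python check of that equivalence, using made-up names:

```python
names = ["smith", "goldsmith", "smithson", "jones"]
suffix = "smith"

# Direct suffix match: the equivalent of LIKE '%smith'.
direct = [n for n in names if n.endswith(suffix)]

# Prefix match on the reversed value: the equivalent of
# name_reversed LIKE CONCAT(REVERSE('smith'), '%').
reversed_prefix = suffix[::-1]
via_reversed = [n for n in names if n[::-1].startswith(reversed_prefix)]

print(direct)        # ['smith', 'goldsmith']
print(via_reversed)  # ['smith', 'goldsmith']
```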
65
+
66
+ For infix search at scale, use `FULLTEXT` (when appropriate) or a dedicated search engine.
67
+
68
+ ## `OR` Across Different Columns
69
+ `OR` across different columns often prevents efficient index use.
70
+
71
+ ```sql
72
+ -- Often suboptimal
73
+ WHERE status = 'active' OR region = 'us-east'
74
+
75
+ -- Often better: two indexed queries
76
+ SELECT * FROM orders WHERE status = 'active'
77
+ UNION -- deduplicates rows matching both predicates (UNION ALL would return them twice)
78
+ SELECT * FROM orders WHERE region = 'us-east';
79
+ ```
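One caveat when rewriting `OR` as a union: a row that satisfies both predicates is returned by both branches, so `UNION ALL` yields it twice while `UNION` removes the duplicate. A minimal demonstration using Python's built-in `sqlite3` as a stand-in for MySQL; the table contents are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT, region TEXT);
    INSERT INTO orders VALUES
        (1, 'active', 'us-east'),
        (2, 'active', 'eu-west'),
        (3, 'closed', 'us-east'),
        (4, 'closed', 'eu-west');
""")


def ids(sql: str) -> list:
    """Run a query and return the sorted first column."""
    return sorted(row[0] for row in conn.execute(sql))


with_or = ids(
    "SELECT id FROM orders WHERE status = 'active' OR region = 'us-east'"
)
with_union_all = ids(
    "SELECT id FROM orders WHERE status = 'active' "
    "UNION ALL SELECT id FROM orders WHERE region = 'us-east'"
)
with_union = ids(
    "SELECT id FROM orders WHERE status = 'active' "
    "UNION SELECT id FROM orders WHERE region = 'us-east'"
)

print(with_or)         # [1, 2, 3]
print(with_union_all)  # [1, 1, 2, 3]
print(with_union)      # [1, 2, 3]
```

Order 1 is both active and in us-east, so only the `UNION` form matches the original `OR` result.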
80
+
81
+ MySQL can sometimes use `index_merge`, but it's frequently slower than a purpose-built composite index or a UNION rewrite.
82
+
83
+ ## ORDER BY + LIMIT Without an Index
84
+ `LIMIT` does not automatically make sorting cheap. If no index supports the order, MySQL may sort many rows (`Using filesort`) and then apply LIMIT.
85
+
86
+ ```sql
87
+ -- Needs an index on created_at (or it will filesort)
88
+ SELECT * FROM orders ORDER BY created_at DESC LIMIT 10;
89
+
90
+ -- For WHERE + ORDER BY, you usually need a composite index:
91
+ -- (status, created_at DESC)
92
+ SELECT * FROM orders
93
+ WHERE status = 'pending'
94
+ ORDER BY created_at DESC
95
+ LIMIT 10;
96
+ ```
97
+
98
+ ## DISTINCT / GROUP BY
99
+ `DISTINCT` and `GROUP BY` can trigger temp tables and sorts (`Using temporary`, `Using filesort`) when indexes don't match.
100
+
101
+ ```sql
102
+ -- Often improved by an index on (status)
103
+ SELECT DISTINCT status FROM orders;
104
+
105
+ -- Often improved by an index on (status)
106
+ SELECT status, COUNT(*) FROM orders GROUP BY status;
107
+ ```
108
+
109
+ ## Derived Tables / CTE Materialization
110
+ Derived tables and CTEs may be materialized into temporary tables, which can be slower than a flattened query. If performance is surprising, check `EXPLAIN` and consider rewriting the query or adding supporting indexes.
111
+
112
+ ## Other Quick Rules
113
+ - **`OFFSET` pagination**: `OFFSET N` scans and discards N rows. Use cursor-based pagination.
114
+ - **`SELECT *`** defeats covering indexes. Select only needed columns.
115
+ - **`NOT IN` with NULLs**: `NOT IN (subquery)` returns no rows if subquery contains any NULL. Use `NOT EXISTS`.
116
+ - **`COUNT(*)` vs `COUNT(col)`**: `COUNT(*)` counts all rows; `COUNT(col)` skips NULLs.
117
+ - **Arithmetic on indexed columns**: `WHERE price * 1.1 > 100` prevents index use. Rewrite to keep the column bare: `WHERE price > 100 / 1.1`.
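Two of the rules above, `NOT IN` with NULLs and `COUNT(*)` vs `COUNT(col)`, follow standard SQL NULL semantics, so they can be reproduced with Python's built-in `sqlite3` as a stand-in for MySQL. The tables and rows are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO customers VALUES (1), (2), (3);
    INSERT INTO orders VALUES (10, 1), (11, NULL);
""")

# NOT IN: the NULL customer_id makes the predicate unknown for every row.
not_in = conn.execute(
    "SELECT id FROM customers "
    "WHERE id NOT IN (SELECT customer_id FROM orders) ORDER BY id"
).fetchall()

# NOT EXISTS: NULLs simply fail the correlation and are ignored.
not_exists = conn.execute(
    "SELECT c.id FROM customers c "
    "WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id) "
    "ORDER BY c.id"
).fetchall()

# COUNT(*) counts rows; COUNT(customer_id) skips the NULL.
count_all, count_col = conn.execute(
    "SELECT COUNT(*), COUNT(customer_id) FROM orders"
).fetchone()

print(not_in)                # []
print(not_exists)            # [(2,), (3,)]
print(count_all, count_col)  # 2 1
```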