@pageai/ralph-loop 1.9.0 → 1.10.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (45)
  1. package/.agents/skills/mysql/SKILL.md +81 -0
  2. package/.agents/skills/mysql/references/character-sets.md +66 -0
  3. package/.agents/skills/mysql/references/composite-indexes.md +59 -0
  4. package/.agents/skills/mysql/references/connection-management.md +70 -0
  5. package/.agents/skills/mysql/references/covering-indexes.md +47 -0
  6. package/.agents/skills/mysql/references/data-types.md +69 -0
  7. package/.agents/skills/mysql/references/deadlocks.md +72 -0
  8. package/.agents/skills/mysql/references/explain-analysis.md +66 -0
  9. package/.agents/skills/mysql/references/fulltext-indexes.md +28 -0
  10. package/.agents/skills/mysql/references/index-maintenance.md +110 -0
  11. package/.agents/skills/mysql/references/isolation-levels.md +49 -0
  12. package/.agents/skills/mysql/references/json-column-patterns.md +77 -0
  13. package/.agents/skills/mysql/references/n-plus-one.md +77 -0
  14. package/.agents/skills/mysql/references/online-ddl.md +53 -0
  15. package/.agents/skills/mysql/references/partitioning.md +92 -0
  16. package/.agents/skills/mysql/references/primary-keys.md +70 -0
  17. package/.agents/skills/mysql/references/query-optimization-pitfalls.md +117 -0
  18. package/.agents/skills/mysql/references/replication-lag.md +46 -0
  19. package/.agents/skills/mysql/references/row-locking-gotchas.md +63 -0
  20. package/.agents/skills/postgres/SKILL.md +46 -0
  21. package/.agents/skills/postgres/references/backup-recovery.md +41 -0
  22. package/.agents/skills/postgres/references/index-optimization.md +69 -0
  23. package/.agents/skills/postgres/references/indexing.md +61 -0
  24. package/.agents/skills/postgres/references/memory-management-ops.md +39 -0
  25. package/.agents/skills/postgres/references/monitoring.md +59 -0
  26. package/.agents/skills/postgres/references/mvcc-transactions.md +38 -0
  27. package/.agents/skills/postgres/references/mvcc-vacuum.md +41 -0
  28. package/.agents/skills/postgres/references/optimization-checklist.md +19 -0
  29. package/.agents/skills/postgres/references/partitioning.md +79 -0
  30. package/.agents/skills/postgres/references/process-architecture.md +46 -0
  31. package/.agents/skills/postgres/references/ps-cli-api-insights.md +53 -0
  32. package/.agents/skills/postgres/references/ps-cli-commands.md +72 -0
  33. package/.agents/skills/postgres/references/ps-connection-pooling.md +72 -0
  34. package/.agents/skills/postgres/references/ps-connections.md +37 -0
  35. package/.agents/skills/postgres/references/ps-extensions.md +27 -0
  36. package/.agents/skills/postgres/references/ps-insights.md +62 -0
  37. package/.agents/skills/postgres/references/query-patterns.md +80 -0
  38. package/.agents/skills/postgres/references/replication.md +49 -0
  39. package/.agents/skills/postgres/references/schema-design.md +66 -0
  40. package/.agents/skills/postgres/references/storage-layout.md +41 -0
  41. package/.agents/skills/postgres/references/wal-operations.md +42 -0
  42. package/README.md +1 -1
  43. package/bin/cli.js +2 -0
  44. package/bin/lib/shadcn.js +1 -1
  45. package/package.json +1 -1
@@ -0,0 +1,46 @@
+ ---
+ title: Process Architecture
+ description: PostgreSQL multi-process model, connection management, and auxiliary processes
+ tags: postgres, processes, connections, pooling, memory, operations
+ ---
+
+ # Process Architecture
+
+ PostgreSQL uses a **multi-process** model, not multi-threaded: one OS process per client connection. The postmaster is the parent; it spawns backend processes per connection. Each backend has private memory (`work_mem`, temp buffers). 1000 connections = 1000 processes (~5–10MB base + query memory each). All backends also share one large common cache, `shared_buffers`.
+
+ ## Auxiliary Processes
+
+ WAL Writer, Background Writer, Checkpointer, Autovacuum Launcher/Workers, Archiver, WAL Summarizer (PG 17+). These run alongside backends and are not spawned per connection.
+
+ ## Memory Risk
+
+ `work_mem` is per-operation, not per-query. Estimate: `work_mem × operations_per_query × parallel_workers × connections` can grow very large at high concurrency. Rein in connections and parallelism before raising `work_mem`.
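The estimate above is worth doing explicitly before touching `work_mem`; a minimal sketch, with illustrative numbers and function name:

```python
def worst_case_sort_memory_mb(work_mem_mb: int,
                              operations_per_query: int,
                              parallel_workers: int,
                              connections: int) -> int:
    """Upper-bound estimate from the formula above: work_mem is granted
    per sort/hash operation, multiplied by parallelism and by every
    connection running such a query at once."""
    return work_mem_mb * operations_per_query * parallel_workers * connections

# 64MB work_mem looks harmless until 200 connections each run a
# 3-operation query with 2 parallel workers: 64 * 3 * 2 * 200 = 76800 MB (~75 GB).
print(worst_case_sort_memory_mb(64, 3, 2, 200))
```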
+
+ ## Connection Pooling (Critical)
+
+ Each connection = OS process (fork overhead, context switching, memory). PgBouncer can multiplex many app connections to fewer DB connections. Typical: 1000 app connections → pooler → 20–50 backends. Implement pooling before raising `max_connections`; `max_connections` requires a full restart to change (default 100). Note: `superuser_reserved_connections` (default 3) reserves slots for emergency superuser access, so non-superusers are rejected before `max_connections` is fully reached.
+
+ ## Monitoring
+
+ ```sql
+ SELECT state, count(*) FROM pg_stat_activity WHERE backend_type = 'client backend' GROUP BY state;
+ ```
+
+ ```sql
+ -- Show used and free connection slots
+ SELECT count(*) AS used, max(max_conn) - count(*) AS free
+ FROM pg_stat_activity, (SELECT setting::int AS max_conn FROM pg_settings WHERE name = 'max_connections') s
+ WHERE backend_type = 'client backend';
+ ```
+
+ Use `pg_activity` for interactive top-like monitoring. Alert at 80% connection usage, critical at 95%. Count by state to find idle-in-transaction leaks — these hold locks and **block VACUUM** from reclaiming dead tuples.
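A minimal sketch of turning the 80%/95% thresholds above into an alert level; the function name and reserved-slot handling are illustrative, and `used`/`max_conn` would come from the slot query shown earlier:

```python
def connection_alert_level(used: int, max_conn: int,
                           superuser_reserved: int = 3) -> str:
    """Classify connection usage against the thresholds above.
    Non-superusers are refused once only the reserved slots remain,
    so usable capacity is max_conn - superuser_reserved."""
    usable = max_conn - superuser_reserved
    ratio = used / usable
    if ratio >= 0.95:
        return "critical"
    if ratio >= 0.80:
        return "warning"
    return "ok"

print(connection_alert_level(82, 100))   # 82/97 ≈ 0.85 → warning
print(connection_alert_level(95, 100))   # 95/97 ≈ 0.98 → critical
```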
+
+ ## Common Problems
+
+ | Problem | Fix |
+ | ------- | --- |
+ | `too many clients already` | Implement pooling; find idle connections; check for connection leaks |
+ | High memory / OOM | Reduce `work_mem`; add pooling; set `statement_timeout` |
+ | Stuck process | `SELECT pg_cancel_backend(pid);` then `SELECT pg_terminate_backend(pid);` — **always confirm with a human before terminating backends**, as this may abort in-flight transactions and cause data issues for the application |
+
+ Prefer pooling + conservative `max_connections` over raising limits reactively.
@@ -0,0 +1,53 @@
+ ---
+ title: CLI Query Insights API
+ description: CLI insights usage
+ tags: postgres, planetscale, cli, insights, query-patterns, api
+ ---
+
+ # Query Insights via pscale CLI
+
+ Analyze slow queries and missing indexes using `pscale api`. Endpoints may change—see https://planetscale.com/docs/api/reference/getting-started-with-planetscale-api for current API docs.
+
+ ## Using pscale api
+
+ The `pscale api` command makes authenticated API calls using your current login or service token (see [ps-cli-commands.md](ps-cli-commands.md#service-token-cicd) for auth setup). No need to manage auth headers manually.
+
+ ```bash
+ pscale api "<endpoint>" [--method POST] [--field key=value] [--org <org>]
+ ```
+
+ ## Query Patterns Reports
+
+ ```bash
+ # Create a new report
+ pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports" \
+   --method POST --org my-org
+
+ # Check status (poll until state=complete)
+ pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports/{id}/status"
+
+ # Download completed report
+ pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports/{id}"
+
+ # List all reports
+ pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports"
+ ```
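The create, poll, download flow above can be wrapped in a small polling helper. This is a sketch: `fetch_status` is a stand-in for shelling out to the status endpoint, and the `state` values mirror those shown in the comments above.

```python
import time

def poll_until_complete(fetch_status, interval_s=2.0, timeout_s=600.0):
    """Poll a report's status until it completes or the deadline passes.
    fetch_status() returns a dict like {"state": "..."}; in real use it
    would invoke `pscale api .../query-patterns-reports/{id}/status`."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status.get("state") == "complete":
            return status
        if status.get("state") == "failed":
            raise RuntimeError(f"report failed: {status}")
        time.sleep(interval_s)
    raise TimeoutError("report did not complete in time")

# Stub standing in for the API call: completes on the third poll.
states = iter(["pending", "running", "complete"])
result = poll_until_complete(lambda: {"state": next(states)}, interval_s=0.0)
print(result["state"])  # complete
```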
+
+ ## Schema Analysis
+
+ ```bash
+ # Get branch schema
+ pscale api "organizations/{org}/databases/{db}/branches/{branch}/schema"
+
+ # Lint schema for issues
+ pscale api "organizations/{org}/databases/{db}/branches/{branch}/schema/lint"
+ ```
+
+ ## What to Look For
+
+ | Metric | Indicates | Action |
+ | -------------------------------- | --------------------- | ------------------------------- |
+ | High `rows_read / rows_returned` | Missing or poor index | Add index on WHERE/JOIN columns |
+ | High `total_time_s` | Heavy query | Optimize or cache |
+ | High `count` with same pattern | N+1 queries | Batch or eager-load |
+ | `indexed: false` | Full table scan | Add index |
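The heuristics in this table can be applied mechanically to a downloaded report row. A hedged sketch: the field names mirror the metrics listed above, but the thresholds are illustrative, not PlanetScale defaults.

```python
def flag_query_pattern(row: dict) -> list[str]:
    """Apply the heuristics from the table above to one report row."""
    flags = []
    returned = max(row.get("rows_returned", 0), 1)
    if row.get("rows_read", 0) / returned > 100:
        flags.append("possible missing index (high read/returned ratio)")
    if row.get("total_time_s", 0) > 60:
        flags.append("heavy query: optimize or cache")
    if row.get("count", 0) > 10_000:
        flags.append("possible N+1: batch or eager-load")
    if row.get("indexed") is False:
        flags.append("full table scan: add index")
    return flags

print(flag_query_pattern({"rows_read": 500_000, "rows_returned": 20,
                          "total_time_s": 3, "count": 50, "indexed": False}))
```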
@@ -0,0 +1,72 @@
+ ---
+ title: PlanetScale CLI Reference
+ description: CLI command guide
+ tags: planetscale, cli, branches, deploy-requests, authentication
+ ---
+
+ # pscale CLI Commands
+
+ Full CLI reference: https://planetscale.com/docs/cli. Use `pscale <command> --help` for subcommands and flags.
+
+ ## Authentication
+
+ ```bash
+ pscale auth login    # Opens browser
+ pscale auth logout
+ pscale org list
+ pscale org switch <name>
+ ```
+
+ ### Service Token (CI/CD)
+
+ ```bash
+ # Create and configure
+ pscale service-token create
+ pscale service-token add-access <id> read_branch --database <db>
+ # Use in CI/CD
+ export PLANETSCALE_SERVICE_TOKEN_ID="<id>"
+ export PLANETSCALE_SERVICE_TOKEN="<token>"
+ ```
+
+ ## Core Commands
+
+ ```bash
+ # Databases
+ pscale database list
+ pscale database create <name>
+
+ # Branches
+ pscale branch list <db>
+ pscale branch create <db> <branch> [--from <parent>]
+ pscale branch delete <db> <branch>   # DESTRUCTIVE — always confirm with a human first
+ pscale branch schema <db> <branch>
+
+ # Deploy requests (schema changes) — Vitess only
+ pscale deploy-request create <db> <branch>
+ pscale deploy-request list <db>
+ pscale deploy-request deploy <db> <number>
+
+ # Connect
+ pscale shell <db> <branch>     # Opens psql (Postgres) or mysql (Vitess)
+ pscale connect <db> <branch>   # Proxy for GUI tools (secure tunnel) — Vitess only
+
+ # Credentials
+ pscale role create <db> <branch> <name>       # Postgres
+ pscale password create <db> <branch> <name>   # Vitess
+
+ # Other
+ pscale ping          # Check latency to regions
+ pscale region list   # Available regions
+ pscale backup list <db> <branch>
+ pscale backup create <db> <branch>
+ ```
+
+ ## Useful Flags
+
+ ```bash
+ --format json   # Output as JSON (also: csv, human)
+ --org <name>    # Specify organization
+ --debug         # Debug output
+ ```
+
+ For API calls via CLI, see [ps-cli-api-insights.md](ps-cli-api-insights.md).
@@ -0,0 +1,72 @@
+ ---
+ title: PgBouncer Connection Pooling
+ description: Pooling setup guide
+ tags: postgres, pgbouncer, connection-pooling, performance, transactions
+ ---
+
+ # Connection Pooling with PgBouncer
+
+ PlanetScale provides PgBouncer for connection pooling. Connect on port `6432` instead of `5432`.
+
+ ## When to Use PgBouncer (Port 6432)
+
+ All OLTP application workloads: web apps, APIs, high-concurrency read/write operations.
+
+ ## When to Use Direct Connections (Port 5432)
+
+ - Schema changes (DDL)
+ - Analytics, reporting, batch processing
+ - Session-specific features (temp tables, session variables)
+ - ETL, data streaming, `pg_dump`
+ - Long-running admin transactions
+
+ ## PgBouncer Types
+
+ PlanetScale offers three PgBouncer options. All use port `6432`.
+
+ | Type | Runs On | Routes To | Key Trait |
+ | ---- | ------- | --------- | --------- |
+ | **Local** | Same node as primary | Primary only | Included with every database; no replica routing |
+ | **Dedicated Primary** | Separate node | Primary | Connections persist through resizes, upgrades, and most failovers |
+ | **Dedicated Replica** | Separate node | Replicas | Read-only traffic; supports AZ affinity for lower latency |
+
+ - **Local PgBouncer** — use same credentials as direct, just change port to `6432`. Always routes to primary regardless of username.
+ - **Dedicated Primary** — runs off-server for improved HA. Use for production OLTP write traffic.
+ - **Dedicated Replica** — runs off-server for read-heavy workloads. Supports AZ affinity to prefer same-zone replicas. Multiple can be created for capacity or per-app isolation.
+
+ To connect to a dedicated PgBouncer, append `|pgbouncer-name` to the username (e.g., `postgres.xxx|write-pool` or `postgres.xxx|read-bouncer`).
+
+ ## Transaction Pooling Limitations
+
+ PlanetScale PgBouncer uses **transaction pooling mode**. These features are unavailable:
+
+ - Prepared statements that persist across transactions
+ - Temporary tables
+ - `LISTEN`/`NOTIFY`
+ - Session-level advisory locks
+ - `SET` commands persisting beyond a transaction
+
+ ## Recommended Patterns
+
+ - Size pools from observed concurrency, query memory behavior, and connection limits.
+ - Keep pooled app traffic on `6432` and reserve direct connections for DDL/admin/long-running jobs.
+
+ ## Avoid Patterns
+
+ - Avoid setting pool size with only `CPU_cores * N` while ignoring query-memory amplification.
+ - Avoid running session-dependent workflows through transaction pooling.
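To make the sizing advice concrete, here is a sketch that caps a CPU-based heuristic with a memory-based bound, so query-memory amplification limits the pool. All formulas and names are illustrative assumptions, not PlanetScale guidance:

```python
def suggest_pool_size(cpu_cores: int,
                      ram_mb: int,
                      work_mem_mb: int,
                      ops_per_query: int,
                      reserved_ram_mb: int) -> int:
    """Take the smaller of a CPU-based heuristic and a memory-based cap,
    so query-memory amplification bounds the pool size."""
    cpu_based = cpu_cores * 4                      # common starting heuristic
    per_backend_mb = work_mem_mb * ops_per_query   # rough per-connection peak
    memory_based = (ram_mb - reserved_ram_mb) // per_backend_mb
    return max(1, min(cpu_based, memory_based))

# 8 cores suggests 32 backends, but 64MB work_mem x 4 ops caps it at 24
# on an 8GB box with 2GB reserved for shared_buffers and the OS.
print(suggest_pool_size(8, 8192, 64, 4, 2048))
```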
+
+ ## Connecting
+
+ ```bash
+ # Local PgBouncer (same credentials, port 6432)
+ psql 'host=xxx.horizon.psdb.cloud port=6432 user=postgres.xxx password=pscale_pw_xxx dbname=mydb sslnegotiation=direct sslmode=verify-full sslrootcert=system'
+
+ # Dedicated primary PgBouncer (append |pgbouncer-name to user)
+ psql 'host=xxx.horizon.psdb.cloud port=6432 user=postgres.xxx|write-pool password=pscale_pw_xxx dbname=mydb sslnegotiation=direct sslmode=verify-full sslrootcert=system'
+
+ # Dedicated replica PgBouncer (append |pgbouncer-name to user)
+ psql 'host=xxx.horizon.psdb.cloud port=6432 user=postgres.xxx|read-bouncer password=pscale_pw_xxx dbname=mydb sslnegotiation=direct sslmode=verify-full sslrootcert=system'
+ ```
+
+ Docs: https://planetscale.com/docs/postgres/connecting/pgbouncer
@@ -0,0 +1,37 @@
+ ---
+ title: PlanetScale Postgres Connections
+ description: Connection guide for PlanetScale Postgres
+ tags: planetscale, postgres, connections, ssl, troubleshooting
+ ---
+
+ # PlanetScale Postgres Connections
+
+ Postgres docs: https://planetscale.com/docs/postgres/connecting
+
+ | Protocol | Standard Port | Pooled Port | SSL |
+ | -------- | ------------- | ---------------- | -------- |
+ | Postgres | 5432 | 6432 (PgBouncer) | Required |
+
+ Credentials (roles) are branch-specific and cannot be recovered after creation.
+
+ ## Connection String
+
+ ```
+ postgresql://<user>:<password>@<host>.horizon.psdb.cloud:5432/<database>?sslmode=verify-full&sslrootcert=system&sslnegotiation=direct
+ ```
+
+ Use port **6432** for PgBouncer (applications/OLTP).
+ Use port **5432** for DDL, admin tasks, and migrations.
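A small sketch of assembling that connection string for either port; the function name and placeholder values are illustrative, and the SSL parameters are copied from the template above:

```python
def planetscale_dsn(user: str, password: str, host: str, database: str,
                    pooled: bool = True) -> str:
    """Build the connection string from the template above.
    pooled=True targets PgBouncer (6432); pooled=False is a direct
    connection (5432) for DDL/admin work."""
    port = 6432 if pooled else 5432
    params = "sslmode=verify-full&sslrootcert=system&sslnegotiation=direct"
    return f"postgresql://{user}:{password}@{host}:{port}/{database}?{params}"

print(planetscale_dsn("myrole.branch123", "pscale_pw_xxx",
                      "xxx.horizon.psdb.cloud", "mydb", pooled=True))
```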
+
+ ## Troubleshooting
+
+ | Error | Fix |
+ | -------------------------------- | --------------------------------------- |
+ | `password authentication failed` | Check role format: `<role>.<branch_id>` |
+ | `too many clients already` | Use PgBouncer (port 6432) |
+ | `SSL connection is required` | Add `sslmode=verify-full&sslrootcert=system` |
+
+ **Best practices:**
+ - Use the PlanetScale Postgres metrics page to monitor direct and PgBouncer connections.
+ - Route OLTP traffic to port 6432 and reserve 5432 for admin/migrations.
+ - Add pooling instead of reactively raising `max_connections`.
@@ -0,0 +1,27 @@
+ ---
+ title: PlanetScale PostgreSQL Extensions
+ description: Extension reference
+ tags: postgres, extensions
+ ---
+
+ # PostgreSQL Extensions on PlanetScale
+
+ Only use PlanetScale-supported extensions. For the complete and up-to-date list of available extensions, see: https://planetscale.com/docs/postgres/extensions
+
+ Do not rely on hard-coded extension lists — always check the documentation above for current availability.
+
+ ## Enabling Extensions
+
+ Some extensions must first be **enabled in the PlanetScale Dashboard** (Clusters > Extensions) before they can be created in SQL. This often requires a database restart.
+
+ Once enabled in the dashboard, create the extension in SQL:
+
+ ```sql
+ CREATE EXTENSION IF NOT EXISTS <extension_name>;
+ ```
+
+ ## Recommended Patterns
+
+ - Always check the [PlanetScale extensions docs](https://planetscale.com/docs/postgres/extensions) before assuming an extension is available.
+ - Verify extension availability in PlanetScale configuration and docs before schema design depends on it.
+ - Enable `pg_stat_statements` early for baseline query telemetry.
@@ -0,0 +1,62 @@
+ ---
+ title: PlanetScale Query Insights
+ description: Query insights guide
+ tags: postgres, planetscale, insights, monitoring, optimization
+ ---
+
+ # PlanetScale Insights
+
+ ## Fetch current documentation first
+
+ Prefer retrieval over pre-training knowledge. Docs: https://planetscale.com/docs
+
+ ## MCP Server (Preferred)
+
+ When the PlanetScale MCP server is configured in your environment, prefer it over CLI. Key tools:
+
+ - `planetscale_get_branch_schema` — Get schema for a branch
+ - `planetscale_execute_read_query` — Run SELECT, SHOW, DESCRIBE, EXPLAIN
+ - `planetscale_get_insights` — Query performance insights
+ - `planetscale_list_schema_recommendations` — Index and schema suggestions
+ - `planetscale_search_documentation` — Search PlanetScale docs
+
+ MCP setup: https://planetscale.com/docs/connect/mcp
+
+ The MCP server is the ideal way to interact with insights from an AI agent.
+ If not installed, prompt the user to install it to make the agent more effective.
+
+ ## Query Insights (CLI)
+
+ Generating reports via CLI is a multi-step process (create → wait → download).
+
+ See [ps-cli-api-insights.md](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/postgres/references/ps-cli-api-insights.md) for how to use.
+
+ What to look for:
+
+ - High `rows_read / rows_returned` ratio → missing index
+ - High `total_time_s` → optimization target
+
+ ## Insights UI (Dashboard)
+
+ In the [PlanetScale dashboard](https://app.planetscale.com/), select your database and click **Insights**.
+
+ - **Filtering** — Pick a branch, choose primary or replica, and scroll through the last 7 days. Click-and-drag on graphs to zoom into a time window.
+ - **Graphs** — Four tabs: Query latency (p50/p95/p99/p99.9), Queries per second, Rows read/s, and Rows written/s.
+ - **Queries table** — All queries in the selected timeframe, normalized into patterns. Sortable and filterable by SQL, schema, table, latency, index usage, and more. Customizable columns (count, total time, latency percentiles, rows read/returned/affected, CPU/IO time, cache hit ratio, etc.). Enable sparklines for inline trend graphs. Orange icons flag full table scans.
+ - **Query deep dive** — Click any query to see per-pattern graphs, summary stats, index usage breakdown, and a table of notable executions (>1 s, >10k rows read, or errors). Use "Summarize query" for an LLM-generated plain-English description.
+ - **Anomalies tab** — Flags periods with elevated slow-running queries and surfaces the responsible patterns.
+ - **Errors tab** — Surfaces queries that produced errors.
+ - **pginsights settings** — `pginsights.raw_queries` enables full query text collection for notable queries; `pginsights.normalize_schema_names` groups identical patterns across schemas (useful for schema-per-tenant designs). Both configurable in the Extensions tab on the Clusters page.
+
+ More: [PlanetScale Insights docs](https://planetscale.com/docs/postgres/monitoring/query-insights)
+
+ ## Optimization Checklist
+
+ - Remove unused indexes (0 scans)
+ - Remove duplicate indexes
+ - Archive audit/log tables >10 GB
+ - Review tables >100 GB for partitioning
+
+ **Always confirm with a human before removing indexes, dropping tables/partitions, or archiving data.** These are destructive actions that cannot be easily undone.
+
+ More: [optimization-checklist.md](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/postgres/references/optimization-checklist.md)
@@ -0,0 +1,80 @@
+ ---
+ title: SQL Query Patterns
+ description: Common SQL anti-patterns and optimized alternatives
+ tags: postgres, sql, query-optimization, n-plus-one, pagination
+ ---
+
+ # SQL Query Patterns
+
+ ## Query Structure
+
+ **SELECT specific columns** — avoids fetching unnecessary data and enables covering indexes:
+ ```sql
+ -- Bad
+ SELECT * FROM app_user WHERE status = 'active';
+ -- Good
+ SELECT id, name, email FROM app_user WHERE status = 'active';
+ ```
+
+ **Subqueries → JOINs** — correlated subqueries re-execute per row:
+ ```sql
+ -- Bad
+ SELECT id, (SELECT COUNT(*) FROM app_order WHERE app_order.user_id = app_user.id) FROM app_user;
+ -- Good
+ SELECT u.id, COUNT(o.id) FROM app_user u LEFT JOIN app_order o ON o.user_id = u.id GROUP BY u.id;
+ ```
+
+ **Always LIMIT unbounded queries** — prevent runaway result sets:
+ ```sql
+ SELECT id, message FROM log WHERE level = 'error' ORDER BY created_at DESC LIMIT 100;
+ ```
+
+ **Avoid functions on indexed columns (SARGable)** — functions prevent index usage unless a functional index exists:
+ ```sql
+ -- Bad: full table scan
+ SELECT * FROM app_user WHERE date_trunc('day', created_at) = '2023-01-01';
+ -- Good: index scan
+ SELECT * FROM app_user WHERE created_at >= '2023-01-01' AND created_at < '2023-01-02';
+ ```
+
+ ## N+1 Detection
+
+ **Queries inside loops → batch with ANY/IN:**
+ ```python
+ # Bad
+ for uid in user_ids:
+     cursor.execute("SELECT name FROM app_user WHERE id = %s", (uid,))
+ # Good (Postgres specific)
+ cursor.execute("SELECT id, name FROM app_user WHERE id = ANY(%s)", (list(user_ids),))
+ # Good (Standard SQL)
+ # cursor.execute("SELECT id, name FROM app_user WHERE id IN %s", (tuple(user_ids),))
+ ```
+
+ **ORM lazy loading → eager loading:**
+ ```python
+ # Bad: N+1 — each iteration fires a query
+ for user in User.query.all():
+     print(user.posts)
+ # Good
+ users = User.query.options(joinedload(User.posts)).all()
+ ```
+
+ ## Query Rewrites
+
+ **UNION → UNION ALL** — skip deduplication when duplicates are impossible or acceptable.
+
+ **IN subquery → EXISTS** — EXISTS short-circuits on first match:
+ ```sql
+ SELECT id, name FROM app_user u
+ WHERE EXISTS (SELECT 1 FROM app_order o WHERE o.user_id = u.id AND o.total > 100);
+ ```
+
+ **OFFSET → cursor pagination** — OFFSET scans and discards rows, degrading at depth:
+ ```sql
+ -- Bad: OFFSET 10000 scans 10020 rows
+ SELECT id, title FROM article ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
+ -- Good: cursor-based (requires index on (created_at DESC, id DESC))
+ SELECT id, title FROM article
+ WHERE (created_at, id) < ('2025-06-15T12:00:00Z', 987654)
+ ORDER BY created_at DESC, id DESC LIMIT 20;
+ ```
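With keyset pagination, the application must carry the last row's `(created_at, id)` between requests. One common approach is an opaque page token; a minimal sketch, with an illustrative encoding format:

```python
import base64
import json

def encode_cursor(created_at: str, row_id: int) -> str:
    """Pack the last row's (created_at, id) into an opaque page token,
    matching the keyset predicate in the SQL above."""
    raw = json.dumps({"created_at": created_at, "id": row_id})
    return base64.urlsafe_b64encode(raw.encode()).decode()

def decode_cursor(token: str) -> tuple[str, int]:
    """Recover the (created_at, id) pair to bind into the WHERE clause."""
    data = json.loads(base64.urlsafe_b64decode(token.encode()))
    return data["created_at"], data["id"]

token = encode_cursor("2025-06-15T12:00:00Z", 987654)
print(decode_cursor(token))  # ('2025-06-15T12:00:00Z', 987654)
```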
@@ -0,0 +1,49 @@
+ ---
+ title: Replication
+ description: Streaming replication, replication slots, synchronous commit levels, failover, and standby management
+ tags: postgres, replication, streaming, slots, synchronous, failover, standby, operations
+ ---
+
+ # Replication
+
+ ## Streaming Replication
+
+ Use physical (byte-for-byte) replication via WAL stream from primary to standbys. Standbys are read-only (hot standby); same major PG version and architecture required (same minor recommended). Without replication slots, the primary may recycle WAL before the standby receives it → standby needs full resync via `pg_basebackup`. Use replication slots to guarantee WAL retention for specific standbys.
+
+ ## Replication Slots
+
+ Postgres supports physical slots (streaming replication) and logical slots (logical replication). Slots prevent WAL deletion even if the standby is offline — can exhaust `pg_wal/` disk. Use `max_slot_wal_keep_size` to cap retained WAL per slot. Use `idle_replication_slot_timeout` (PG 17+) to auto-invalidate idle slots. `wal_keep_size` is a simpler alternative to slots for WAL retention. Drop inactive slots promptly to prevent disk exhaustion.
+
+ Slot lag (MB behind): `SELECT slot_name, pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)/1024/1024 AS mb_behind FROM pg_replication_slots;`
+
+ Drop inactive slot: `SELECT pg_drop_replication_slot('slot_name');`
+
+ **Always confirm with a human before dropping replication slots.** Dropping an active or needed slot can cause downstream issues.
+
+ ## Synchronous Commit Levels
+
+ | Level | Behavior | Use Case |
+ |-------|----------|----------|
+ | `off` | Returns immediately, no wait | Non-critical writes; risks losing ~600ms of commits on crash (no inconsistency) |
+ | `local` | Waits for local WAL fsync only | Local durability only; no standby wait |
+ | `remote_write` | Waits for standby OS buffer | Data loss on standby OS crash |
+ | `on` | Waits for standby WAL to disk when `synchronous_standby_names` is set; otherwise same as `local` | **Default. This level or higher recommended for HA** |
+ | `remote_apply` | Waits for standby to apply WAL | Strongest; read-your-writes |
+
+ Configure with `synchronous_standby_names`. Use `ANY N` for quorum or `FIRST N` for priority-based sync.
+
+ ## Quorum and Failure
+
+ `FIRST 2 (s1, s2, s3)` is priority-based: waits for the 2 highest-priority connected standbys (s1+s2; s3 takes over only if one disconnects). `ANY 2 (s1, s2, s3)` is quorum-based: waits for any 2. With either, if only 1 is healthy, commits hang. Provision at least N+1 standbys: need 2 confirmations → provision 3. PostgreSQL never commits unless required standbys confirm — no inconsistency, but clients may time out.
+
+ ## Failover
+
+ `pg_ctl promote` or `SELECT pg_promote()` (SQL function, PG 12+) converts a standby to primary. One-way: a promoted standby cannot rejoin as a standby without a rebuild. `pg_rewind` can resync the old primary to the new primary (requires `wal_log_hints=on` or data checksums) — faster than a full rebuild. After promotion: update connection strings, rebuild the old primary as a standby, reconfigure other standbys.
+
+ ## Monitoring
+
+ On the primary, query `pg_stat_replication` for each connected standby's `state` (`streaming` = healthy, `catchup` = behind), `sync_state` (`sync`/`async`), and LSN positions (`sent_lsn`, `write_lsn`, `flush_lsn`, `replay_lsn`) to compute lag. On standbys, `pg_stat_wal_receiver` shows the receiver process status and `flushed_lsn`; compare `pg_last_wal_receive_lsn()` vs `pg_last_wal_replay_lsn()` for local replay lag.
+
+ Replication lag (MB): `SELECT application_name, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)/1024/1024 AS lag_mb FROM pg_stat_replication;`
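`pg_wal_lsn_diff` does this arithmetic server-side; when post-processing LSN strings in monitoring code, the same calculation can be done client-side. A sketch, relying only on the documented `pg_lsn` text format `X/Y` (high and low 32 bits, hex):

```python
def lsn_to_bytes(lsn: str) -> int:
    """Parse a pg_lsn string like '16/B374D848' into an absolute byte
    position: high 32 bits before the slash, low 32 bits after, both hex."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def lag_mb(current_lsn: str, replay_lsn: str) -> float:
    """Client-side equivalent of pg_wal_lsn_diff(...) / 1024 / 1024."""
    return (lsn_to_bytes(current_lsn) - lsn_to_bytes(replay_lsn)) / 1024 / 1024

# A standby whose replay position trails the primary by 256MB:
print(lag_mb("16/B374D848", "16/A374D848"))  # 256.0
```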
+
+ Enable `wal_compression` (`pglz`, `lz4`, or `zstd`) to compress full page images in WAL (not all WAL data) — reduces WAL size for bandwidth-limited replication.
@@ -0,0 +1,66 @@
+ ---
+ title: PostgreSQL Schema Design
+ description: Schema design guide
+ tags: postgres, schema, primary-keys, data-types, foreign-keys, naming
+ ---
+
+ # Schema Design
+
+ ## Primary Keys
+
+ Prefer `BIGINT GENERATED ALWAYS AS IDENTITY`. Avoid random UUIDs (UUIDv4) as primary keys; use `uuidv7()` (PostgreSQL 18+) when you need UUIDs.
+
+ ```sql
+ -- `user` and `order` are reserved words in Postgres, so examples use app_user/app_order
+ CREATE TABLE app_user (
+   id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
+   email TEXT NOT NULL UNIQUE
+ );
+ ```
+
+ Random UUID PKs (v4) can cause index fragmentation; UUIDs are also larger (16 vs 8 bytes for BIGINT) and can slow joins.
+
+ ## Data Types
+
+ | Use | Avoid |
+ | --- | --- |
+ | `TEXT`, `VARCHAR` | Extension-specific types |
+ | `JSONB` | Custom ENUMs (use CHECK instead) |
+ | `TIMESTAMPTZ` | `TIMESTAMP` without time zone |
+ | `BIGINT`, `INTEGER` | Platform-specific types |
+
+ Prefer CHECK constraints over ENUM types — they're easier to modify:
+
+ ```sql
+ CREATE TABLE app_order (
+   id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
+   status TEXT NOT NULL CHECK (status IN ('pending', 'shipped', 'delivered'))
+ );
+ ```
+
+ ## Foreign Keys
+
+ - Always index FK columns (PostgreSQL does not auto-create these)
+ - Avoid circular FK dependencies
+ - Specify `ON DELETE CASCADE` or `ON DELETE SET NULL` explicitly
+
+ ```sql
+ CREATE TABLE app_order (
+   id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
+   customer_id BIGINT NOT NULL REFERENCES customer(id) ON DELETE CASCADE
+ );
+ CREATE INDEX app_order_customer_id_idx ON app_order (customer_id);
+ ```
+
+ ## Naming Conventions
+
+ - Tables: singular snake_case (`user_account`, `order_item`)
+ - Columns: singular snake_case (`created_at`, `user_id`)
+ - Indexes: `{table}_{column}_idx`
+ - Constraints: `{table}_{column}_{type}` (e.g., `app_order_status_check`)
+
+ ## General Guidelines
+
+ - Add `NOT NULL` to as many columns as possible
+ - Add `created_at TIMESTAMPTZ DEFAULT NOW()` to all tables
+ - Use `BIGINT` for all IDs and foreign keys, even on small tables
+ - Keep tables normalized; denormalize only for proven hot read paths
@@ -0,0 +1,41 @@
1
+ ---
2
+ title: Storage Layout and Tablespaces
3
+ description: PGDATA directory structure, TOAST, fillfactor, tablespaces, and disk management
4
+ tags: postgres, storage, pgdata, toast, fillfactor, tablespaces, disk, operations
5
+ ---
6
+
7
+ # Storage Layout and Tablespaces
8
+
9
+ ## PGDATA Structure
10
+
11
+ - **base/** — database files (one subdirectory per database, named by OID)
12
+ - **global/** — cluster-wide shared catalogs (pg_database, pg_authid, pg_tablespace)
13
+ - **pg_wal/** — WAL files
14
+ - **pg_xact/** — transaction commit status
15
+
16
+ "Cluster" in PostgreSQL = single instance with one PGDATA, not an HA cluster. Each table/index = one or more files, split into 1GB segments. Tables have companion **_fsm** (free space map) and **_vm** (visibility map); indexes have **_fsm** only (no _vm), except hash indexes.
17
+
18
+ ## Visibility Map and Free Space Map
19
+
20
+ - **_vm** tracks all-visible pages — VACUUM skips these
21
+ - **_fsm** tracks free space per page — INSERT uses this to find pages with room
22
+ - Both are small files but critical for performance
23
+
24
+ ## TOAST
25
+
26
+ TOAST triggers when a **row** exceeds ~2KB. Large values are compressed and/or moved out-of-line to `pg_toast.pg_toast_<oid>` tables. **Strategies:** PLAIN (no TOAST), EXTENDED (compress+out-of-line, default for text/bytea), EXTERNAL (out-of-line, no compression — use for pre-compressed data), MAIN (compress, avoid out-of-line). TOAST tables bloat like regular tables — they need VACUUM. `SELECT *` fetches all TOAST columns; always SELECT only needed columns. Move large rarely-accessed columns to separate tables.
27
+
28
+ ## Fillfactor
29
+
30
+ Controls how full pages are packed (default 100%). Lower fillfactor (70–80%) leaves room for HOT (Heap-Only Tuple) updates, which avoid index entries and reduce bloat on UPDATE-heavy tables. Keep 100% for insert-only or read-mostly tables. `ALTER TABLE t SET (fillfactor = 70);`
31
+
32
+ ## Tablespaces
+
+ `pg_default` (base/), `pg_global` (global/) are built-in. Custom tablespaces: symbolic links in **pg_tblspc/** to other filesystem locations. Use for separating hot data (SSD) from archives (HDD). Moving a table to another tablespace (`ALTER TABLE ... SET TABLESPACE`) rewrites it under an ACCESS EXCLUSIVE lock.
+
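+ A minimal sketch of the hot/cold split described above (the path, tablespace name, and table name are illustrative):
+
+ ```sql
+ CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg';  -- directory must exist and be owned by the postgres OS user
+ ALTER TABLE hot_events SET TABLESPACE fast_ssd;     -- rewrites the table under an exclusive lock
+ ```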
+ ## Disk Monitoring
+
+ - `pg_database_size('dbname')`, `pg_total_relation_size('tablename')`, `pg_relation_size('tablename')`
+ - Monitor disk usage: >80% = at risk; >90% = critical (VACUUM may fail if disk capacity is insufficient)
+ - Check inode usage (`df -i`) — can run out even with free space
+ - `pg_wal/` suddenly large = check replication slots and archiving
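+ The size functions above combine into a quick "largest relations" check (a sketch; ordinary tables only):
+
+ ```sql
+ -- Top 5 tables by total size (heap + indexes + TOAST)
+ SELECT relname, pg_size_pretty(pg_total_relation_size(oid)) AS total
+ FROM pg_class
+ WHERE relkind = 'r'
+ ORDER BY pg_total_relation_size(oid) DESC
+ LIMIT 5;
+ ```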
@@ -0,0 +1,42 @@
+ ---
+ title: WAL and Checkpoint Operations
+ description: Write-ahead log internals, checkpoint tuning, durability guarantees, and WAL disk management
+ tags: postgres, wal, checkpoints, durability, crash-recovery, fsync, operations
+ ---
+
+ # WAL and Checkpoint Operations
+
+ ## WAL Fundamentals
+
+ Write-Ahead Logging: logs changes to `pg_wal/` **before** modifying data files. WAL segments are 16MB (fixed at initdb). On COMMIT, PostgreSQL fsyncs WAL to disk and returns SUCCESS — data files are updated lazily. WAL records are written for all changes (including uncommitted transactions and rollbacks). **Never disable `fsync` in production** — power loss without fsync risks unrecoverable data loss.
+
+ `wal_level`: `minimal` (crash recovery only), `replica` (default; replication + archiving), `logical` (logical replication).
+
+ ## Dirty Pages and Checkpoints
+
+ A dirty page is modified in shared_buffers but not yet written to data files. A checkpoint flushes all dirty pages to disk and writes a checkpoint record to WAL; recovery only replays WAL since the last checkpoint.
+
+ - `checkpoint_timeout` (default 5 min) and `max_wal_size` (default 1GB) — a checkpoint runs when whichever limit triggers first.
+ - `checkpoint_completion_target=0.9` spreads checkpoint I/O over 90% of the interval to avoid spikes.
+ - "Checkpoints are occurring too frequently" in logs → increase `max_wal_size`.
+ - **Target: >90% of checkpoints should be time-based** (`num_timed` in `pg_stat_checkpointer`), not size-based (`num_requested`). If num_requested/(num_timed+num_requested) > 10%, tune `max_wal_size` up.
+
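+ The 10% target above can be computed directly (PG17+; on older versions read the equivalent `checkpoints_timed`/`checkpoints_req` columns from `pg_stat_bgwriter`):
+
+ ```sql
+ -- Share of checkpoints forced by WAL volume rather than the timer
+ SELECT num_timed, num_requested,
+        round(100.0 * num_requested / nullif(num_timed + num_requested, 0), 1)
+          AS requested_pct   -- above ~10%: raise max_wal_size
+ FROM pg_stat_checkpointer;
+ ```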
+ ## WAL Disk Management
+
+ Replication slots prevent WAL deletion even when standbys are offline — they can fill the disk. WAL archiving failures also block recycling. `max_wal_size` is a *soft* limit; WAL can grow beyond it under heavy load.
+
+ WAL size: `SELECT count(*) AS files, pg_size_pretty(sum(size)) AS total FROM pg_ls_waldir();`
+
+ Slot lag: `SELECT slot_name, pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes FROM pg_replication_slots;`
+
+ ## Checkpoint Monitoring
+
+ PG17+ moved checkpoint stats from `pg_stat_bgwriter` to `pg_stat_checkpointer` and renamed columns.
+
+ `SELECT num_timed, num_requested, write_time, sync_time, buffers_written FROM pg_stat_checkpointer;`
+
+ Backend-direct writes (formerly `buffers_backend` in `pg_stat_bgwriter`) are now tracked in `pg_stat_io`: `SELECT writes FROM pg_stat_io WHERE backend_type = 'client backend' AND object = 'relation';`
+
+ ## Crash Recovery
+
+ On crash, PostgreSQL replays WAL from the last checkpoint. Longer checkpoint intervals → more WAL to replay → longer recovery. Trade-off: frequent checkpoints (faster recovery, more I/O) vs infrequent (less I/O, slower recovery). For most workloads, `checkpoint_timeout=5min` and `max_wal_size` tuned to keep checkpoints time-based is the right balance.
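+ Applied as settings, that balance looks roughly like this (the values are illustrative, not defaults; all three parameters reload without a restart):
+
+ ```sql
+ ALTER SYSTEM SET checkpoint_timeout = '5min';
+ ALTER SYSTEM SET max_wal_size = '4GB';               -- raise until num_requested stays low
+ ALTER SYSTEM SET checkpoint_completion_target = 0.9;
+ SELECT pg_reload_conf();
+ ```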
package/README.md CHANGED
@@ -41,7 +41,7 @@ This is an implementation that actually works, containing a hackable script so y
  I recommend using a CLI to bootstrap your project with the necessary tools and dependencies, e.g.:
 
  ```bash
- npx @tanstack/cli create lib --add-ons shadcn,eslint,form,tanstack-query --no-git
+ npx @tanstack/cli create lib --add-ons shadcn,eslint,form,tanstack-query,nitro --no-git
  ```
 
  > If you must start from a blank slate, which is not recommended, see [Starting from scratch](#starting-from-scratch). You can also go for a more barebones start by running `npx create-vite@latest src --template react-ts`