@pageai/ralph-loop 1.8.0 → 1.10.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/.agent/PROMPT.md +3 -2
- package/.agents/skills/mysql/SKILL.md +81 -0
- package/.agents/skills/mysql/references/character-sets.md +66 -0
- package/.agents/skills/mysql/references/composite-indexes.md +59 -0
- package/.agents/skills/mysql/references/connection-management.md +70 -0
- package/.agents/skills/mysql/references/covering-indexes.md +47 -0
- package/.agents/skills/mysql/references/data-types.md +69 -0
- package/.agents/skills/mysql/references/deadlocks.md +72 -0
- package/.agents/skills/mysql/references/explain-analysis.md +66 -0
- package/.agents/skills/mysql/references/fulltext-indexes.md +28 -0
- package/.agents/skills/mysql/references/index-maintenance.md +110 -0
- package/.agents/skills/mysql/references/isolation-levels.md +49 -0
- package/.agents/skills/mysql/references/json-column-patterns.md +77 -0
- package/.agents/skills/mysql/references/n-plus-one.md +77 -0
- package/.agents/skills/mysql/references/online-ddl.md +53 -0
- package/.agents/skills/mysql/references/partitioning.md +92 -0
- package/.agents/skills/mysql/references/primary-keys.md +70 -0
- package/.agents/skills/mysql/references/query-optimization-pitfalls.md +117 -0
- package/.agents/skills/mysql/references/replication-lag.md +46 -0
- package/.agents/skills/mysql/references/row-locking-gotchas.md +63 -0
- package/.agents/skills/postgres/SKILL.md +46 -0
- package/.agents/skills/postgres/references/backup-recovery.md +41 -0
- package/.agents/skills/postgres/references/index-optimization.md +69 -0
- package/.agents/skills/postgres/references/indexing.md +61 -0
- package/.agents/skills/postgres/references/memory-management-ops.md +39 -0
- package/.agents/skills/postgres/references/monitoring.md +59 -0
- package/.agents/skills/postgres/references/mvcc-transactions.md +38 -0
- package/.agents/skills/postgres/references/mvcc-vacuum.md +41 -0
- package/.agents/skills/postgres/references/optimization-checklist.md +19 -0
- package/.agents/skills/postgres/references/partitioning.md +79 -0
- package/.agents/skills/postgres/references/process-architecture.md +46 -0
- package/.agents/skills/postgres/references/ps-cli-api-insights.md +53 -0
- package/.agents/skills/postgres/references/ps-cli-commands.md +72 -0
- package/.agents/skills/postgres/references/ps-connection-pooling.md +72 -0
- package/.agents/skills/postgres/references/ps-connections.md +37 -0
- package/.agents/skills/postgres/references/ps-extensions.md +27 -0
- package/.agents/skills/postgres/references/ps-insights.md +62 -0
- package/.agents/skills/postgres/references/query-patterns.md +80 -0
- package/.agents/skills/postgres/references/replication.md +49 -0
- package/.agents/skills/postgres/references/schema-design.md +66 -0
- package/.agents/skills/postgres/references/storage-layout.md +41 -0
- package/.agents/skills/postgres/references/wal-operations.md +42 -0
- package/README.md +2 -2
- package/bin/cli.js +3 -1
- package/bin/lib/shadcn.js +1 -1
- package/package.json +1 -1
@@ -0,0 +1,46 @@

---
title: Process Architecture
description: PostgreSQL multi-process model, connection management, and auxiliary processes
tags: postgres, processes, connections, pooling, memory, operations
---

# Process Architecture

PostgreSQL uses a **multi-process** model, not a multi-threaded one: one OS process per client connection. The postmaster is the parent process; it spawns a backend process for each connection. Each backend has private memory (`work_mem`, temp buffers), so 1000 connections = 1000 processes (~5–10MB base + query memory each). All backends also share a common memory region, dominated by `shared_buffers`.

## Auxiliary Processes

WAL Writer, Background Writer, Checkpointer, Autovacuum Launcher/Workers, Archiver, WAL Summarizer (PG 17+). These run alongside backends and are not spawned per connection.

## Memory Risk

`work_mem` is per-operation, not per-query. The worst case is roughly `work_mem × operations_per_query × parallel_workers × connections`, which can grow very large at high concurrency. Account for connection count and parallelism before raising `work_mem`.
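The multiplication above can be sketched numerically (a rough illustration of the upper bound, not PostgreSQL's actual allocator behavior; the function name and figures are assumptions):

```python
def worst_case_work_mem_bytes(work_mem_mb: int,
                              ops_per_query: int,
                              parallel_workers: int,
                              connections: int) -> int:
    """Rough upper bound on concurrent work_mem usage, in bytes.

    Mirrors the estimate above: work_mem can be allocated per sort/hash
    operation, per parallel worker, per active connection.
    """
    return work_mem_mb * 1024 * 1024 * ops_per_query * parallel_workers * connections

# 4MB work_mem, 3 operations/query, 2 workers, 500 connections:
# ~12.5 GB in the worst case, from a seemingly small 4MB setting.
```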
## Connection Pooling (Critical)

Each connection = one OS process (fork overhead, context switching, memory). PgBouncer can multiplex many app connections onto fewer DB connections. Typical: 1000 app connections → pooler → 20–50 backends. Implement pooling before raising `max_connections`; `max_connections` requires a full restart to change (default 100). Note: `superuser_reserved_connections` (default 3) reserves slots for emergency superuser access, so non-superusers are rejected before `max_connections` is fully reached.

## Monitoring

```sql
SELECT state, count(*) FROM pg_stat_activity WHERE backend_type = 'client backend' GROUP BY state;
```

```sql
-- Show used and free connection slots
SELECT count(*) AS used, max(max_conn) - count(*) AS free
FROM pg_stat_activity, (SELECT setting::int AS max_conn FROM pg_settings WHERE name = 'max_connections') s
WHERE backend_type = 'client backend';
```

Use `pg_activity` for interactive top-like monitoring. Alert at 80% connection usage, critical at 95%. Count by state to find idle-in-transaction leaks — these hold locks and **block VACUUM** from reclaiming dead tuples.

## Common Problems

| Problem | Fix |
| ------- | --- |
| `too many clients already` | Implement pooling; find idle connections; check for connection leaks |
| High memory / OOM | Reduce `work_mem`; add pooling; set `statement_timeout` |
| Stuck process | `SELECT pg_cancel_backend(pid);` then `SELECT pg_terminate_backend(pid);` — **always confirm with a human before terminating backends**, as this may abort in-flight transactions and cause data issues for the application |

Prefer pooling + conservative `max_connections` over raising limits reactively.
@@ -0,0 +1,53 @@

---
title: CLI Query Insights API
description: CLI insights usage
tags: postgres, planetscale, cli, insights, query-patterns, api
---

# Query Insights via pscale CLI

Analyze slow queries and missing indexes using `pscale api`. Endpoints may change — see https://planetscale.com/docs/api/reference/getting-started-with-planetscale-api for current API docs.

## Using pscale api

The `pscale api` command makes authenticated API calls using your current login or service token (see [ps-cli-commands.md](ps-cli-commands.md#service-token-cicd) for auth setup). No need to manage auth headers manually.

```bash
pscale api "<endpoint>" [--method POST] [--field key=value] [--org <org>]
```

## Query Patterns Reports

```bash
# Create a new report
pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports" \
  --method POST --org my-org

# Check status (poll until state=complete)
pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports/{id}/status"

# Download completed report
pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports/{id}"

# List all reports
pscale api "organizations/{org}/databases/{db}/branches/{branch}/query-patterns-reports"
```
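The create → poll → download flow can be scripted. Here is a sketch of the polling step in Python, with the status fetch passed in as a callable (e.g. a thin wrapper around the `pscale api .../status` call above); the `state` values `complete` and `failed` are assumptions about the report lifecycle, not documented constants:

```python
import time

def poll_until_complete(fetch_status, interval_s=2.0, timeout_s=300.0):
    """Poll a status callable until it reports a terminal state.

    fetch_status: callable returning a dict with a 'state' key, e.g. the
    parsed JSON from the query-patterns-report status endpoint.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        status = fetch_status()
        if status.get("state") == "complete":
            return status
        if status.get("state") == "failed":
            raise RuntimeError(f"report failed: {status}")
        time.sleep(interval_s)
    raise TimeoutError("report did not complete in time")
```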
## Schema Analysis

```bash
# Get branch schema
pscale api "organizations/{org}/databases/{db}/branches/{branch}/schema"

# Lint schema for issues
pscale api "organizations/{org}/databases/{db}/branches/{branch}/schema/lint"
```

## What to Look For

| Metric | Indicates | Action |
| -------------------------------- | --------------------- | ------------------------------- |
| High `rows_read / rows_returned` | Missing or poor index | Add index on WHERE/JOIN columns |
| High `total_time_s` | Heavy query | Optimize or cache |
| High `count` with same pattern | N+1 queries | Batch or eager-load |
| `indexed: false` | Full table scan | Add index |
@@ -0,0 +1,72 @@

---
title: PlanetScale CLI Reference
description: CLI command guide
tags: planetscale, cli, branches, deploy-requests, authentication
---

# pscale CLI Commands

Full CLI reference: https://planetscale.com/docs/cli. Use `pscale <command> --help` for subcommands and flags.

## Authentication

```bash
pscale auth login        # Opens browser
pscale auth logout
pscale org list
pscale org switch <name>
```

### Service Token (CI/CD)

```bash
# Create and configure
pscale service-token create
pscale service-token add-access <id> read_branch --database <db>
# Use in CI/CD
export PLANETSCALE_SERVICE_TOKEN_ID="<id>"
export PLANETSCALE_SERVICE_TOKEN="<token>"
```

## Core Commands

```bash
# Databases
pscale database list
pscale database create <name>

# Branches
pscale branch list <db>
pscale branch create <db> <branch> [--from <parent>]
pscale branch delete <db> <branch>   # DESTRUCTIVE — always confirm with a human first
pscale branch schema <db> <branch>

# Deploy requests (schema changes) — Vitess only
pscale deploy-request create <db> <branch>
pscale deploy-request list <db>
pscale deploy-request deploy <db> <number>

# Connect
pscale shell <db> <branch>     # Opens psql (Postgres) or mysql (Vitess)
pscale connect <db> <branch>   # Proxy for GUI tools (secure tunnel) — Vitess only

# Credentials
pscale role create <db> <branch> <name>       # Postgres
pscale password create <db> <branch> <name>   # Vitess

# Other
pscale ping          # Check latency to regions
pscale region list   # Available regions
pscale backup list <db> <branch>
pscale backup create <db> <branch>
```

## Useful Flags

```bash
--format json   # Output as JSON (also: csv, human)
--org <name>    # Specify organization
--debug         # Debug output
```

For API calls via CLI, see [ps-cli-api-insights.md](ps-cli-api-insights.md).
@@ -0,0 +1,72 @@

---
title: PgBouncer Connection Pooling
description: Pooling setup guide
tags: postgres, pgbouncer, connection-pooling, performance, transactions
---

# Connection Pooling with PgBouncer

PlanetScale provides PgBouncer for connection pooling. Connect on port `6432` instead of `5432`.

## When to Use PgBouncer (Port 6432)

All OLTP application workloads: web apps, APIs, high-concurrency read/write operations.

## When to Use Direct Connections (Port 5432)

- Schema changes (DDL)
- Analytics, reporting, batch processing
- Session-specific features (temp tables, session variables)
- ETL, data streaming, `pg_dump`
- Long-running admin transactions

## PgBouncer Types

PlanetScale offers three PgBouncer options. All use port `6432`.

| Type | Runs On | Routes To | Key Trait |
| ---- | ------- | --------- | --------- |
| **Local** | Same node as primary | Primary only | Included with every database; no replica routing |
| **Dedicated Primary** | Separate node | Primary | Connections persist through resizes, upgrades, and most failovers |
| **Dedicated Replica** | Separate node | Replicas | Read-only traffic; supports AZ affinity for lower latency |

- **Local PgBouncer** — use the same credentials as direct, just change the port to `6432`. Always routes to the primary regardless of username.
- **Dedicated Primary** — runs off-server for improved HA. Use for production OLTP write traffic.
- **Dedicated Replica** — runs off-server for read-heavy workloads. Supports AZ affinity to prefer same-zone replicas. Multiple can be created for capacity or per-app isolation.

To connect to a dedicated PgBouncer, append `|pgbouncer-name` to the username (e.g., `postgres.xxx|write-pool` or `postgres.xxx|read-bouncer`).

## Transaction Pooling Limitations

PlanetScale PgBouncer uses **transaction pooling mode**. These features are unavailable:

- Prepared statements that persist across transactions
- Temporary tables
- `LISTEN`/`NOTIFY`
- Session-level advisory locks
- `SET` commands persisting beyond a transaction

## Recommended Patterns

- Size pools from observed concurrency, query memory behavior, and connection limits.
- Keep pooled app traffic on `6432` and reserve direct connections for DDL/admin/long-running jobs.
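One way to make "size pools from observed concurrency and memory" concrete is a heuristic like the following (a sketch, not a PlanetScale recommendation; the function name, defaults, and formula are illustrative):

```python
def suggest_pool_size(available_mem_mb: int,
                      work_mem_mb: int,
                      ops_per_query: int,
                      observed_peak_concurrency: int,
                      backend_base_mb: int = 10) -> int:
    """Cap the pool at whichever is smaller: memory headroom or real demand.

    per_backend_mb approximates one backend's footprint: base process
    memory plus work_mem per sort/hash operation in a typical query.
    """
    per_backend_mb = backend_base_mb + work_mem_mb * ops_per_query
    memory_cap = available_mem_mb // per_backend_mb
    return max(1, min(memory_cap, observed_peak_concurrency))
```

With plenty of memory the pool tracks observed demand; on a small instance the memory cap wins, which is the signal to lower `work_mem` or upsize before growing the pool.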
## Avoid Patterns

- Avoid setting pool size with only `CPU_cores * N` while ignoring query-memory amplification.
- Avoid running session-dependent workflows through transaction pooling.

## Connecting

```bash
# Local PgBouncer (same credentials, port 6432)
psql 'host=xxx.horizon.psdb.cloud port=6432 user=postgres.xxx password=pscale_pw_xxx dbname=mydb sslnegotiation=direct sslmode=verify-full sslrootcert=system'

# Dedicated primary PgBouncer (append |pgbouncer-name to user)
psql 'host=xxx.horizon.psdb.cloud port=6432 user=postgres.xxx|write-pool password=pscale_pw_xxx dbname=mydb sslnegotiation=direct sslmode=verify-full sslrootcert=system'

# Dedicated replica PgBouncer (append |pgbouncer-name to user)
psql 'host=xxx.horizon.psdb.cloud port=6432 user=postgres.xxx|read-bouncer password=pscale_pw_xxx dbname=mydb sslnegotiation=direct sslmode=verify-full sslrootcert=system'
```

Docs: https://planetscale.com/docs/postgres/connecting/pgbouncer
@@ -0,0 +1,37 @@

---
title: PlanetScale Postgres Connections
description: Connection guide for PlanetScale Postgres
tags: planetscale, postgres, connections, ssl, troubleshooting
---

# PlanetScale Postgres Connections

Postgres docs: https://planetscale.com/docs/postgres/connecting

| Protocol | Standard Port | Pooled Port | SSL |
| -------- | ------------- | ---------------- | -------- |
| Postgres | 5432 | 6432 (PgBouncer) | Required |

Credentials (roles) are branch-specific, and passwords cannot be recovered after creation.

## Connection String

```
postgresql://<user>:<password>@<host>.horizon.psdb.cloud:5432/<database>?sslmode=verify-full&sslrootcert=system&sslnegotiation=direct
```

Use port **6432** for PgBouncer (applications/OLTP).
Use port **5432** for DDL, admin tasks, and migrations.
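A small helper can assemble this string and pick the port by workload (a sketch; the function name and defaults are illustrative, and the `|name` username suffix follows the dedicated-PgBouncer convention from ps-connection-pooling.md):

```python
from urllib.parse import quote

def pscale_postgres_url(user, password, host, database,
                        pooled=True, pgbouncer_name=None):
    """Build a PlanetScale Postgres connection string.

    Port 6432 targets PgBouncer, 5432 is a direct connection; a dedicated
    PgBouncer is selected by appending '|<name>' to the username (the '|'
    must be percent-encoded inside a URL).
    """
    if pgbouncer_name:
        user = f"{user}|{pgbouncer_name}"
    port = 6432 if (pooled or pgbouncer_name) else 5432
    params = "sslmode=verify-full&sslrootcert=system&sslnegotiation=direct"
    return (f"postgresql://{quote(user, safe='')}:{quote(password, safe='')}@"
            f"{host}:{port}/{database}?{params}")
```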
## Troubleshooting

| Error | Fix |
| -------------------------------- | -------------------------------------------- |
| `password authentication failed` | Check role format: `<role>.<branch_id>` |
| `too many clients already` | Use PgBouncer (port 6432) |
| `SSL connection is required` | Add `sslmode=verify-full&sslrootcert=system` |

**Best practices:**
- Use the PlanetScale Postgres metrics page to monitor direct and PgBouncer connections.
- Route OLTP traffic to port 6432 and reserve 5432 for admin/migrations.
- Pool connections instead of reactively raising `max_connections`.
@@ -0,0 +1,27 @@

---
title: PlanetScale PostgreSQL Extensions
description: Extension reference
tags: postgres, extensions
---

# PostgreSQL Extensions on PlanetScale

Only use PlanetScale-supported extensions. For the complete and up-to-date list of available extensions, see: https://planetscale.com/docs/postgres/extensions

Do not rely on hard-coded extension lists — always check the documentation above for current availability.

## Enabling Extensions

Some extensions must first be **enabled in the PlanetScale Dashboard** (Clusters > Extensions) before they can be created in SQL. This often requires a database restart.

Once enabled in the dashboard, create the extension in SQL:

```sql
CREATE EXTENSION IF NOT EXISTS <extension_name>;
```

## Recommended Patterns

- Always check the [PlanetScale extensions docs](https://planetscale.com/docs/postgres/extensions) before assuming an extension is available.
- Verify extension availability in PlanetScale configuration and docs before schema design depends on it.
- Enable `pg_stat_statements` early for baseline query telemetry.
@@ -0,0 +1,62 @@

---
title: PlanetScale Query Insights
description: Query insights guide
tags: postgres, planetscale, insights, monitoring, optimization
---

# PlanetScale Insights

## Fetch current documentation first

Prefer retrieval over pre-training knowledge. Docs: https://planetscale.com/docs

## MCP Server (Preferred)

When the PlanetScale MCP server is configured in your environment, prefer it over the CLI. Key tools:

- `planetscale_get_branch_schema` — Get schema for a branch
- `planetscale_execute_read_query` — Run SELECT, SHOW, DESCRIBE, EXPLAIN
- `planetscale_get_insights` — Query performance insights
- `planetscale_list_schema_recommendations` — Index and schema suggestions
- `planetscale_search_documentation` — Search PlanetScale docs

MCP setup: https://planetscale.com/docs/connect/mcp

The MCP server is the ideal way to interact with Insights from an AI agent.
If it is not installed, prompt the user to install it to make the agent more effective.

## Query Insights (CLI)

Generating reports via CLI is a multi-step process (create → wait → download).

See [ps-cli-api-insights.md](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/postgres/references/ps-cli-api-insights.md) for usage.

What to look for:

- High `rows_read / rows_returned` ratio → missing index
- High `total_time_s` → optimization target
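Those two heuristics can be applied mechanically to report rows; a small triage sketch (field names follow the metrics above, the thresholds are illustrative starting points, not official guidance):

```python
def triage_query_patterns(rows, read_ratio_threshold=100.0, total_time_threshold_s=60.0):
    """Flag query patterns worth investigating.

    rows: iterable of dicts with 'pattern', 'rows_read', 'rows_returned',
    and 'total_time_s' keys, as in a downloaded query-patterns report.
    Returns a list of (pattern, [reasons]) pairs.
    """
    flagged = []
    for row in rows:
        reasons = []
        returned = max(row.get("rows_returned", 0), 1)  # avoid division by zero
        if row.get("rows_read", 0) / returned > read_ratio_threshold:
            reasons.append("high rows_read/rows_returned: likely missing index")
        if row.get("total_time_s", 0.0) > total_time_threshold_s:
            reasons.append("high total_time_s: optimization target")
        if reasons:
            flagged.append((row["pattern"], reasons))
    return flagged
```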
## Insights UI (Dashboard)

In the [PlanetScale dashboard](https://app.planetscale.com/), select your database and click **Insights**.

- **Filtering** — Pick a branch, choose primary or replica, and scroll through the last 7 days. Click-and-drag on graphs to zoom into a time window.
- **Graphs** — Four tabs: Query latency (p50/p95/p99/p99.9), Queries per second, Rows read/s, and Rows written/s.
- **Queries table** — All queries in the selected timeframe, normalized into patterns. Sortable and filterable by SQL, schema, table, latency, index usage, and more. Customizable columns (count, total time, latency percentiles, rows read/returned/affected, CPU/IO time, cache hit ratio, etc.). Enable sparklines for inline trend graphs. Orange icons flag full table scans.
- **Query deep dive** — Click any query to see per-pattern graphs, summary stats, index usage breakdown, and a table of notable executions (>1 s, >10k rows read, or errors). Use "Summarize query" for an LLM-generated plain-English description.
- **Anomalies tab** — Flags periods with elevated slow-running queries and surfaces the responsible patterns.
- **Errors tab** — Surfaces queries that produced errors.
- **pginsights settings** — `pginsights.raw_queries` enables full query text collection for notable queries; `pginsights.normalize_schema_names` groups identical patterns across schemas (useful for schema-per-tenant designs). Both are configurable in the Extensions tab on the Clusters page.

More: [PlanetScale Insights docs](https://planetscale.com/docs/postgres/monitoring/query-insights)

## Optimization Checklist

- Remove unused indexes (0 scans)
- Remove duplicate indexes
- Archive audit/log tables >10 GB
- Review tables >100 GB for partitioning

**Always confirm with a human before removing indexes, dropping tables/partitions, or archiving data.** These are destructive actions that cannot be easily undone.

More: [optimization-checklist.md](https://raw.githubusercontent.com/planetscale/database-skills/main/skills/postgres/references/optimization-checklist.md)
@@ -0,0 +1,80 @@

---
title: SQL Query Patterns
description: Common SQL anti-patterns and optimized alternatives
tags: postgres, sql, query-optimization, n-plus-one, pagination
---

# SQL Query Patterns

## Query Structure

**SELECT specific columns** — avoids fetching unnecessary data and enables covering indexes:
```sql
-- Bad:
SELECT * FROM app_user WHERE status = 'active';
-- Good:
SELECT id, name, email FROM app_user WHERE status = 'active';
```

**Subqueries → JOINs** — correlated subqueries re-execute per row:
```sql
-- Bad
SELECT id, (SELECT COUNT(*) FROM app_order WHERE app_order.user_id = app_user.id) FROM app_user;
-- Good
SELECT u.id, COUNT(o.id) FROM app_user u LEFT JOIN app_order o ON o.user_id = u.id GROUP BY u.id;
```

**Always LIMIT unbounded queries** — prevent runaway result sets:
```sql
SELECT id, message FROM log WHERE level = 'error' ORDER BY created_at DESC LIMIT 100;
```

**Avoid functions on indexed columns (SARGable)** — functions prevent index usage unless a functional index exists:
```sql
-- Bad: Full table scan
SELECT * FROM app_user WHERE date_trunc('day', created_at) = '2023-01-01';
-- Good: Index scan
SELECT * FROM app_user WHERE created_at >= '2023-01-01' AND created_at < '2023-01-02';
```

## N+1 Detection

**Queries inside loops → batch with ANY/IN:**
```python
# Bad
for uid in user_ids:
    cursor.execute("SELECT name FROM app_user WHERE id = %s", (uid,))
# Good (Postgres specific)
cursor.execute("SELECT id, name FROM app_user WHERE id = ANY(%s)", (list(user_ids),))
# Good (Standard SQL)
# cursor.execute("SELECT id, name FROM app_user WHERE id IN %s", (tuple(user_ids),))
```

**ORM lazy loading → eager loading:**
```python
# Bad: N+1 — each iteration fires a query
for user in User.query.all():
    print(user.posts)
# Good
users = User.query.options(joinedload(User.posts)).all()
```

## Query Rewrites

**UNION → UNION ALL** — skip deduplication when duplicates are impossible or acceptable.

**IN subquery → EXISTS** — EXISTS short-circuits on first match:
```sql
SELECT id, name FROM app_user u
WHERE EXISTS (SELECT 1 FROM app_order o WHERE o.user_id = u.id AND o.total > 100);
```

**OFFSET → cursor pagination** — OFFSET scans and discards rows, degrading at depth:
```sql
-- Bad: OFFSET 10000 scans 10020 rows
SELECT id, title FROM article ORDER BY created_at DESC LIMIT 20 OFFSET 10000;
-- Good: cursor-based (requires index on (created_at DESC, id DESC))
SELECT id, title FROM article
WHERE (created_at, id) < ('2025-06-15T12:00:00Z', 987654)
ORDER BY created_at DESC, id DESC LIMIT 20;
```
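The keyset comparison in the cursor-pagination rewrite can be mirrored in plain Python (a sketch over an already-sorted in-memory list, no database; names are illustrative):

```python
def keyset_page(rows, cursor=None, limit=20):
    """Keyset-paginate rows already sorted by (created_at, id) descending.

    rows: list of (created_at, id, title) tuples, newest first.
    cursor: (created_at, id) of the last row of the previous page, or None.
    Mirrors: WHERE (created_at, id) < :cursor ORDER BY ... DESC LIMIT :limit
    """
    if cursor is not None:
        rows = [r for r in rows if (r[0], r[1]) < cursor]
    page = rows[:limit]
    # A short page means we reached the end; no further cursor.
    next_cursor = (page[-1][0], page[-1][1]) if len(page) == limit else None
    return page, next_cursor
```

Unlike OFFSET, each page costs the same regardless of depth, because the filter seeks directly to the cursor instead of discarding earlier rows.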
@@ -0,0 +1,49 @@

---
title: Replication
description: Streaming replication, replication slots, synchronous commit levels, failover, and standby management
tags: postgres, replication, streaming, slots, synchronous, failover, standby, operations
---

# Replication

## Streaming Replication for Followers

Use physical (byte-for-byte) replication via the WAL stream from primary to standbys. Standbys are read-only (hot standby); the same major PG version and architecture are required (same minor recommended). Without replication slots, the primary may recycle WAL before the standby receives it → the standby needs a full resync via `pg_basebackup`. Use replication slots to guarantee WAL retention for specific standbys.

## Replication Slots

Postgres supports physical slots (streaming replication) and logical slots (logical replication). Slots prevent WAL deletion even if the standby is offline — this can exhaust `pg_wal/` disk. Use `max_slot_wal_keep_size` to cap retained WAL per slot. Use `idle_replication_slot_timeout` (PG 17+) to auto-invalidate idle slots. `wal_keep_size` is a simpler alternative to slots for WAL retention. Drop inactive slots immediately to prevent disk exhaustion.

Slot lag (MB behind): `SELECT slot_name, pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)/1024/1024 AS mb_behind FROM pg_replication_slots;`
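The same arithmetic can be reproduced client-side, since an LSN like `16/B374D848` is just a 64-bit byte position written as two hex halves (a sketch for monitoring scripts):

```python
def lsn_to_bytes(lsn: str) -> int:
    """Convert a Postgres LSN such as '16/B374D848' to an absolute byte offset."""
    high, low = lsn.split("/")
    return (int(high, 16) << 32) | int(low, 16)

def slot_lag_mb(current_lsn: str, restart_lsn: str) -> float:
    """Same arithmetic as pg_wal_lsn_diff(current, restart) / 1024 / 1024."""
    return (lsn_to_bytes(current_lsn) - lsn_to_bytes(restart_lsn)) / (1024 * 1024)
```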
Drop inactive slot: `SELECT pg_drop_replication_slot('slot_name');`

**Always confirm with a human before dropping replication slots.** Dropping an active or needed slot can cause downstream issues.

## Synchronous Commit Levels

| Level | Behavior | Use Case |
| ----- | -------- | -------- |
| `off` | Returns immediately, no wait | Non-critical writes; risks losing ~600ms of commits on crash (no inconsistency) |
| `local` | Waits for local WAL fsync only | Local durability only; no standby wait |
| `remote_write` | Waits for standby OS buffer | Data loss on standby OS crash |
| `on` | Waits for standby WAL to disk when `synchronous_standby_names` is set; otherwise same as `local` | **Default. This level or higher recommended for HA** |
| `remote_apply` | Waits for standby to apply WAL | Strongest; read-your-writes |

Configure with `synchronous_standby_names`. Use `ANY N` for quorum or `FIRST N` for priority-based sync.

## Quorum and Failure

`FIRST 2 (s1, s2, s3)` is priority-based: waits for the 2 highest-priority connected standbys (s1+s2; s3 takes over only if one disconnects). `ANY 2 (s1, s2, s3)` is quorum-based: waits for any 2. With either, if only 1 standby is healthy, commits hang. Provision at least N+1 standbys: need 2 confirmations → provision 3. PostgreSQL never commits unless the required standbys confirm — no inconsistency, but clients may time out.
## Failover

`pg_ctl promote` or `SELECT pg_promote()` (SQL function, PG 12+) converts a standby to primary. This is one-way: the promoted standby cannot rejoin as a standby without a rebuild. `pg_rewind` can resync the old primary to the new primary (requires `wal_log_hints=on` or data checksums) — faster than a full rebuild. After promotion: update connection strings, rebuild the old primary as a standby, reconfigure other standbys.

## Monitoring

On the primary, query `pg_stat_replication` for each connected standby's `state` (`streaming` = healthy, `catchup` = behind), `sync_state` (`sync`/`async`), and LSN positions (`sent_lsn`, `write_lsn`, `flush_lsn`, `replay_lsn`) to compute lag. On standbys, `pg_stat_wal_receiver` shows the receiver process status and `flushed_lsn`; compare `pg_last_wal_receive_lsn()` vs `pg_last_wal_replay_lsn()` for local replay lag.

Replication lag (MB): `SELECT application_name, pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)/1024/1024 AS lag_mb FROM pg_stat_replication;`

Enable `wal_compression` (`pglz`, `lz4`, or `zstd`) to compress full page images in WAL (not all WAL data) — reduces WAL size for bandwidth-limited replication.
@@ -0,0 +1,66 @@
|
|
|
1
|
+
---
|
|
2
|
+
title: PostgreSQL Schema Design
|
|
3
|
+
description: Schema design guide
|
|
4
|
+
tags: postgres, schema, primary-keys, data-types, foreign-keys, naming
|
|
5
|
+
---
|
|
6
|
+
|
|
7
|
+
# Schema Design
|
|
8
|
+
|
|
9
|
+
## Primary Keys
|
|
10
|
+
|
|
11
|
+
Prefer `BIGINT GENERATED ALWAYS AS IDENTITY`. Avoid random UUIDs (UUIDv4) as primary keys; use `uuidv7()` when you need UUIDs.
|
|
12
|
+
|
|
13
|
+
```sql
|
|
14
|
+
CREATE TABLE user (
|
|
15
|
+
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
|
|
16
|
+
email TEXT NOT NULL UNIQUE
|
|
17
|
+
);
|
|
18
|
+
```
|
|
19
|
+
|
|
20
|
+
Random UUID PKs (v4) can cause index fragmentation; UUIDs are also larger (16 vs 8 bytes for BIGINT) and can slow joins.
|
|
21
|
+
|
|
22
|
+
## Data Types
|
|
23
|
+
|
|
24
|
+
| Use | Avoid |
|
|
25
|
+
| --- | --- |
|
|
26
|
+
| `TEXT`, `VARCHAR` | Extension-specific types |
|
|
27
|
+
| `JSONB` | Custom ENUMs (use CHECK instead) |
|
|
28
|
+
| `TIMESTAMPTZ` | `TIMESTAMP` without time zone |
|
|
29
|
+
| `BIGINT`, `INTEGER` | Platform-specific types |
|
|
30
|
+
|
|
31
|
+
Prefer CHECK constraints over ENUM types — they're easier to modify:
|
|
32
|
+
|
|
33
|
+
```sql
|
|
34
|
+
CREATE TABLE order (
|
|
35
|
+
id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
|
|
36
|
+
status TEXT NOT NULL CHECK (status IN ('pending', 'shipped', 'delivered'))
|
|
37
|
+
);
|
|
38
|
+
```

## Foreign Keys

- Always index FK columns (PostgreSQL does not auto-create these)
- Avoid circular FK dependencies
- Specify `ON DELETE` behavior explicitly (`CASCADE`, `SET NULL`, or `RESTRICT`)

```sql
CREATE TABLE customer_order (
  id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  customer_id BIGINT NOT NULL REFERENCES customer(id) ON DELETE CASCADE
);
CREATE INDEX customer_order_customer_id_idx ON customer_order (customer_id);
```
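One way to audit for unindexed FK columns (an approximate sketch — it only checks that some index *leads* with the FK's first column):

```sql
SELECT c.conrelid::regclass AS table_name,
       c.conname            AS constraint_name
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (
    SELECT 1
    FROM pg_index i
    WHERE i.indrelid = c.conrelid
      AND i.indkey[0] = c.conkey[1]  -- leading index column matches the FK's first column
  );
```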

## Naming Conventions

- Tables: singular snake_case (`user_account`, `order_item`)
- Columns: singular snake_case (`created_at`, `user_id`)
- Indexes: `{table}_{column}_idx`
- Constraints: `{table}_{column}_{type}` (e.g., `order_status_check`)

## General Guidelines

- Add `NOT NULL` to as many columns as possible
- Add `created_at TIMESTAMPTZ NOT NULL DEFAULT now()` to all tables
- Use `BIGINT` for all IDs and foreign keys, even on small tables
- Keep tables normalized; denormalize only for proven hot read paths
@@ -0,0 +1,41 @@
---
title: Storage Layout and Tablespaces
description: PGDATA directory structure, TOAST, fillfactor, tablespaces, and disk management
tags: postgres, storage, pgdata, toast, fillfactor, tablespaces, disk, operations
---

# Storage Layout and Tablespaces

## PGDATA Structure

- **base/** — database files (one subdirectory per database, named by OID)
- **global/** — cluster-wide shared catalogs (pg_database, pg_authid, pg_tablespace)
- **pg_wal/** — WAL files
- **pg_xact/** — transaction commit status

"Cluster" in PostgreSQL = single instance with one PGDATA, not an HA cluster. Each table/index = one or more files, split into 1GB segments. Tables have companion **_fsm** (free space map) and **_vm** (visibility map); indexes have **_fsm** only (no _vm), except hash indexes.

## Visibility Map and Free Space Map

- **_vm** tracks all-visible pages — VACUUM skips these
- **_fsm** tracks free space per page — INSERT uses this to find pages with room
- Both are small files but critical for performance

## TOAST

TOAST triggers when a **row** exceeds ~2KB. Large values are compressed and/or moved out-of-line to `pg_toast.pg_toast_<oid>` tables.

**Strategies:** PLAIN (no TOAST), EXTENDED (compress + out-of-line, default for text/bytea), EXTERNAL (out-of-line, no compression — use for pre-compressed data), MAIN (compress, avoid out-of-line).

TOAST tables bloat like regular tables — they need VACUUM. `SELECT *` fetches all TOAST columns; always SELECT only needed columns. Move large, rarely-accessed columns to separate tables.
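The per-column strategy can be changed with `ALTER TABLE`; for example, for a column holding already-compressed blobs (table and column names here are hypothetical):

```sql
-- Skip TOAST compression but keep out-of-line storage for pre-compressed data
ALTER TABLE attachment ALTER COLUMN payload SET STORAGE EXTERNAL;
```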

## Fillfactor

Controls how full pages are packed (default 100%). Lower fillfactor (70–80%) leaves room for HOT (Heap-Only Tuple) updates, which avoid creating new index entries and reduce bloat on UPDATE-heavy tables. Keep 100% for insert-only or read-mostly tables. `ALTER TABLE t SET (fillfactor = 70);`
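To check whether a lower fillfactor is paying off, compare HOT vs total updates in `pg_stat_user_tables` (a sketch):

```sql
-- Share of updates that were HOT (higher is better on UPDATE-heavy tables)
SELECT relname, n_tup_upd, n_tup_hot_upd,
       round(100.0 * n_tup_hot_upd / NULLIF(n_tup_upd, 0), 1) AS hot_pct
FROM pg_stat_user_tables
ORDER BY n_tup_upd DESC
LIMIT 10;
```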

## Tablespaces

`pg_default` (base/) and `pg_global` (global/) are built-in. Custom tablespaces are symbolic links in **pg_tblspc/** pointing to other filesystem locations. Use them to separate hot data (SSD) from archives (HDD). Moving a table to another tablespace takes an ACCESS EXCLUSIVE lock and rewrites the table's files.
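A minimal sketch (the paths and names are assumptions; the directory must exist, be empty, and be owned by the postgres OS user):

```sql
CREATE TABLESPACE fast_ssd LOCATION '/mnt/ssd/pg_tblspc';

-- Place a hot table on the SSD tablespace at creation time...
CREATE TABLE hot_events (id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY) TABLESPACE fast_ssd;

-- ...or move an existing one (ACCESS EXCLUSIVE lock; rewrites the table's files)
ALTER TABLE cold_archive SET TABLESPACE pg_default;
```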

## Disk Monitoring

- `pg_database_size('dbname')`, `pg_total_relation_size('tablename')`, `pg_relation_size('tablename')`
- Monitor disk usage: >80% full = at risk; >90% = critical (VACUUM and WAL writes can fail without free space)
- Check inode usage (`df -i`) — can run out even with free space
- `pg_wal/` suddenly large = check replication slots and archiving
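The size functions above compose into a quick "largest relations" report (a sketch):

```sql
-- Ten largest tables, including their indexes and TOAST data
SELECT c.oid::regclass AS table_name,
       pg_size_pretty(pg_total_relation_size(c.oid)) AS total_size
FROM pg_class c
WHERE c.relkind = 'r'
ORDER BY pg_total_relation_size(c.oid) DESC
LIMIT 10;
```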
@@ -0,0 +1,42 @@
---
title: WAL and Checkpoint Operations
description: Write-ahead log internals, checkpoint tuning, durability guarantees, and WAL disk management
tags: postgres, wal, checkpoints, durability, crash-recovery, fsync, operations
---

# WAL and Checkpoint Operations

## WAL Fundamentals

Write-Ahead Logging: changes are logged to `pg_wal/` **before** data files are modified. WAL segments are 16MB by default (set at initdb). On COMMIT, PostgreSQL fsyncs WAL to disk and returns success — data files are updated lazily. WAL records are written for all changes (including uncommitted transactions and rollbacks). **Never disable `fsync` in production** — power loss without fsync risks unrecoverable data loss.

`wal_level`: `minimal` (crash recovery only), `replica` (default; replication + archiving), `logical` (logical replication).

## Dirty Pages and Checkpoints

A dirty page is one modified in shared_buffers but not yet written to data files. A checkpoint flushes all dirty pages to disk and writes a checkpoint record to WAL; recovery only replays WAL since the last checkpoint.

- `checkpoint_timeout` (default 5 min) and `max_wal_size` (default 1GB) — a checkpoint runs when whichever triggers first.
- `checkpoint_completion_target = 0.9` spreads checkpoint I/O over 90% of the interval to avoid spikes.
- "Checkpoints are occurring too frequently" in the logs → increase `max_wal_size`.
- **Target: >90% of checkpoints should be time-based** (`num_timed` in `pg_stat_checkpointer`), not size-based (`num_requested`). If `num_requested / (num_timed + num_requested)` > 10%, raise `max_wal_size`.
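Both knobs are reload-safe. A sketch of applying the tuning above — the 4GB figure is an assumption; size it to roughly the WAL volume written per `checkpoint_timeout` interval:

```sql
ALTER TABLE  -- (not needed; shown for contrast) tuning is done via ALTER SYSTEM:
```

```sql
ALTER SYSTEM SET max_wal_size = '4GB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
SELECT pg_reload_conf();  -- no restart required for either setting
```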

## WAL Disk Management

Replication slots prevent WAL deletion even when standbys are offline — they can fill the disk. WAL archiving failures also block WAL recycling. `max_wal_size` is a *soft* limit; WAL can grow beyond it under heavy load.

WAL size: `SELECT count(*) AS files, pg_size_pretty(sum(size)) AS total FROM pg_ls_waldir();`

Slot lag: `SELECT slot_name, pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS lag_bytes FROM pg_replication_slots;`
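When slot lag keeps growing, check whether the consumer is gone before reclaiming disk — dropping a slot discards WAL a standby may still need, so only drop slots that are truly abandoned (`dead_standby` is a placeholder name):

```sql
-- Find slots with no connected consumer, then drop one that is confirmed abandoned
SELECT slot_name, active, restart_lsn FROM pg_replication_slots WHERE NOT active;
SELECT pg_drop_replication_slot('dead_standby');
```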

## Checkpoint Monitoring

PG17+ moved checkpoint stats from `pg_stat_bgwriter` to `pg_stat_checkpointer` and renamed columns.

`SELECT num_timed, num_requested, write_time, sync_time, buffers_written FROM pg_stat_checkpointer;`

Backend-direct writes (formerly `buffers_backend` in `pg_stat_bgwriter`) are now tracked in `pg_stat_io`: `SELECT writes FROM pg_stat_io WHERE backend_type = 'client backend' AND object = 'relation';`

## Crash Recovery

On crash, PostgreSQL replays WAL from the last checkpoint. Longer checkpoint intervals → more WAL to replay → longer recovery. Trade-off: frequent checkpoints (faster recovery, more I/O) vs infrequent (less I/O, slower recovery). For most workloads, `checkpoint_timeout = 5min` with `max_wal_size` tuned to keep checkpoints time-based is the right balance.
|