@claude-code-mastery/starter-kit 1.2.0 → 1.3.0

@@ -0,0 +1,69 @@
1
+ ---
2
+ name: mongodb-backups
3
+ description: Production MongoDB backup and restore practices that the documentation gets wrong. Use when writing a mongodump/mongorestore pipeline, a backup cron job, an S3 backup, or a selective restore, or when planning recovery. Covers streaming to object storage with no temp file, saving a collection inventory, the --nsInclude trap on gzipped archives, collection tiering for fast restores, and matching write concern to data criticality. Defers replica-set topology and tuning to mongodb-replica-sets, and query patterns to mongodb-rules.
4
+ when_to_use: |
5
+ - Writing or reviewing a mongodump / mongorestore pipeline or backup cron job
6
+ - Streaming a backup to S3 or other object storage
7
+ - Doing a selective restore (only some collections) from a gzipped archive
8
+ - Planning recovery, retention, or what to restore first during an incident
9
+ - Do NOT use for replica-set setup or tuning (mongodb-replica-sets) or query shape (mongodb-rules)
10
+ ---
11
+
12
+ # MongoDB Backups and Restore
13
+
14
+ From running thousands of production backups. The defaults and the docs leave out the parts that bite during an actual restore.
15
+
16
+ ## Dump from a secondary, stream straight to object storage
17
+
18
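+ Point mongodump at a secondary so the backup doesn't add load to the primary, and pipe the archive directly to S3 with no intermediate file on disk. On a replica set, `--oplog` captures a consistent point-in-time snapshot, but only for a full-instance dump: mongodump rejects `--oplog` combined with `--db` or `--collection`. The per-database command here therefore omits it; drop `--db` and add `--oplog` when you need the point-in-time guarantee.
19
+
20
+ ```bash
21
+ mongodump --host mongodb-secondary.internal:27017 \
22
+ --username "$U" --password "$P" --authenticationDatabase admin \
23
+ --db "$DB" --gzip --archive \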
24
+ | aws s3 cp - "s3://bucket/$DB/$(date +%Y%m%d_%H%M%S).dump.gz"
25
+ ```
26
+
27
+ Keep a `latest.dump.gz` alias next to the timestamped file so restore scripts always know where to look.
28
+
29
+ ## Save a collection inventory with every backup
30
+
31
+ You cannot inspect a `--gzip --archive` after the fact. There is no list, peek, or inspect flag, it's an opaque binary blob, and `--dryRun` finishes before the archive is demuxed so it tells you nothing. So write the collection list at backup time, right beside the dump:
32
+
33
+ ```bash
34
+ mongosh --quiet --host "$HOST" -u "$U" -p "$P" --authenticationDatabase admin \
35
+ --eval "db.getSiblingDB('$DB').getCollectionNames().forEach(c => print(c))" \
36
+ | aws s3 cp - "s3://bucket/$DB/$(date +%Y%m%d_%H%M%S).collections.txt"
37
+ ```
38
+
39
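+ Six months later, when you need a selective restore, you read the file instead of trying to remember what was in the archive. Compute the timestamp once and reuse it for both uploads; two separate `$(date ...)` calls can land on different seconds and leave the dump and its inventory with mismatched names.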
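
A minimal sketch of deriving every object key for one backup run from a single timestamp, so the dump, its inventory, and the `latest` alias always line up. The function name `backupKeys` and the key layout are assumptions for illustration, and the timestamp here is UTC, whereas `date` in the shell commands uses local time.

```javascript
// Hypothetical helper: build all S3 object keys for one backup run
// from a single timestamp so the dump and inventory share a name.
function backupKeys(db, now = new Date()) {
  // "2024-05-01T12:34:56.789Z" -> "20240501_123456" (UTC)
  const ts = now
    .toISOString()
    .slice(0, 19)
    .replace(/[-:]/g, "")
    .replace("T", "_");
  const prefix = `${db}/${ts}`;
  return {
    dump: `${prefix}.dump.gz`,
    inventory: `${prefix}.collections.txt`,
    latest: `${db}/latest.dump.gz`, // stable alias for restore scripts
  };
}

console.log(backupKeys("shop"));
```

A backup script would call this once and pass `dump` and `inventory` to the two uploads, then copy `dump` to `latest` after both succeed.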
40
+
41
+ ## The `--nsInclude` trap on gzipped archives
42
+
43
+ The docs say `--nsInclude` filters a restore to specific collections. It does, from a directory dump (one `.bson` per collection). But from a `--gzip --archive`, which is what almost every production pipeline uses, `--nsInclude` (and `--nsFrom`/`--nsTo`) silently restore everything anyway and throw duplicate-key errors on the collections you never asked for. The archive is a single multiplexed stream that mongorestore can't seek, so the namespace filter doesn't hold. This is real and long-standing (JIRA TOOLS-2023, open over six years).
44
+
45
+ The reliable approach is the inverse: `--nsExclude` every collection you don't want, generated from the inventory file. Don't hand-build 100+ exclude flags at 2 AM; script it to read the inventory and emit the restore command.
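
One way the exclude list could be generated, as a sketch rather than a finished tool: read the saved inventory, subtract the collections you want, and emit one `--nsExclude` per remaining name. `buildRestoreArgs` and the sample collection names are hypothetical; `--gzip`, `--archive`, and `--nsExclude` are the real mongorestore flags the text refers to.

```javascript
// Hypothetical helper: turn the saved inventory into mongorestore arguments
// that exclude every collection not on the keep-list.
function buildRestoreArgs(db, inventoryText, keep) {
  const keepSet = new Set(keep);
  const excludes = inventoryText
    .split("\n")
    .map((line) => line.trim())
    .filter((name) => name.length > 0 && !keepSet.has(name))
    .map((name) => `--nsExclude=${db}.${name}`);
  // --archive with no value reads the archive from stdin
  return ["mongorestore", "--gzip", "--archive", ...excludes];
}

// Sample inventory, as written by the backup job (names are made up)
const inventory = "orders\ncustomers\nsessions\naudit_log\n";
console.log(buildRestoreArgs("shop", inventory, ["orders", "customers"]).join(" "));
```

Passing the result as an argument array to a process spawner, instead of joining it into a shell string, avoids quoting problems with unusual collection names.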
46
+
47
+ ## Tier collections so restore is a command, not improvisation
48
+
49
+ Decide ahead of time what gets restored, and keep the lists in the restore script:
50
+
51
+ - **Tier 1, critical business data** (orders, customers, products, inventory): always restore.
52
+ - **Tier 2, regenerable** (sessions, caches, tokens, search indexes): never restore. Restoring stale sessions is worse than having none, you log people back into dead state.
53
+ - **Tier 3, historical/analytical** (audit logs, history, analytics rollups): restore only on demand. This is the bulk of the exclude list.
54
+
55
+ When the incident hits you want to run a command, not write one.
56
+
57
+ ## A replica set is only a backup at `w:"majority"`
58
+
59
+ Write concern quietly decides whether replication is durability or just a live mirror. `w:1` acknowledges on the primary alone, so a write lost before it replicates never existed anywhere else. `w:"majority"` means the data is on a majority of members before the app gets the OK. Match it to data criticality rather than setting one value globally: `w:"majority"` for data you can't lose, `w:1` for the regenerable and the disposable. See mongodb-replica-sets for the full read/write semantics.
60
+
61
+ ## Test the restore before you need it
62
+
63
+ A backup you've never restored is a hypothesis. Practice the selective restore in staging, and time how long a replica-set member rebuilds from zero (delete a secondary's data dir, restart, and watch `db.adminCommand({ replSetGetStatus: 1, initialSync: 1 }).initialSyncStatus`). That number is what tells you, at 2 AM, whether to wait for self-healing or restore from backup. Point-in-time recovery needs `--oplog` backups plus `mongorestore --oplogReplay --oplogLimit "<ts>:<inc>"`.
64
+
65
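+ One adjacent footgun that kills backups: unrotated MongoDB diagnostic logs fill the disk and the primary goes read-only. MongoDB rotates its log only when told to (the `logRotate` admin command or `SIGUSR1`), so schedule that rotation, set `systemLog.logRotate` to `rename` (or `reopen` if an external logrotate owns the file), and alert on disk at 80%.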
66
+
67
+ ---
68
+
69
+ This skill is built to grow. Add a rule when a real restore surprise has a stable, defensible fix.
@@ -0,0 +1,73 @@
1
+ ---
2
+ name: mongodb-replica-sets
3
+ description: Production MongoDB replica-set operation: topology, durability, host tuning, and the container-specific gotchas Claude gets wrong. Use when setting up or configuring a replica set, writing its connection string, choosing read preference or write concern, running MongoDB in Docker or Swarm, or planning failover and upgrades. Covers odd-member quorum, connecting via the set rather than one node, WiredTiger cache sizing, the OS tuning MongoDB requires, why ingress-mode ports break a replica set, and oplog/failover discipline. Defers query and index shape to mongodb-rules and backup pipelines to mongodb-backups.
4
+ when_to_use: |
5
+ - Setting up, configuring, or initializing a replica set, or writing its connection string
6
+ - Choosing read preference or write concern, or reasoning about staleness and durability
7
+ - Running MongoDB in Docker or Docker Swarm
8
+ - Sizing the WiredTiger cache, tuning the host, sizing the oplog, or planning failover/upgrades
9
+ - Do NOT use for query/aggregation/index shape (mongodb-rules) or backup pipelines (mongodb-backups)
10
+ ---
11
+
12
+ # MongoDB Replica Sets in Production
13
+
14
+ How to run a replica set, not how to query it (that's mongodb-rules). These are the operational decisions Claude tends to get wrong.
15
+
16
+ ## Topology: odd voting members, real hostnames, never localhost
17
+
18
+ Use an odd number of voting members (typically three) so a majority can still elect a primary when one is lost. A two-member set has no majority if either member dies, so the survivor cannot stay primary and the set goes read-only. Avoid arbiters unless you truly must: an arbiter holds no data, so a three-node primary-secondary-arbiter set that loses its one data secondary can no longer satisfy `w:"majority"`. Address every member by a hostname that all other members and all clients can resolve and reach; `localhost` in the config is a classic break because the other members can't reach it. Initialize once, from a single member, after all members are up, and wait for the election before creating users.
19
+
20
+ ```javascript
21
+ rs.initiate({ _id: "rs0", members: [
22
+ { _id: 0, host: "mongo1.internal:27017", priority: 2 },
23
+ { _id: 1, host: "mongo2.internal:27017", priority: 1 },
24
+ { _id: 2, host: "mongo3.internal:27017", priority: 1 }
25
+ ]})
26
+ ```
27
+
28
+ ## Connect to the set, not to one node
29
+
30
+ The connection string must list the seed members and name the set, so the driver can find the current primary and follow failover. Pointing the app at a single host throws away the high availability the replica set exists to provide.
31
+
32
+ ```
33
+ mongodb://mongo1.internal:27017,mongo2.internal:27017,mongo3.internal:27017/db?replicaSet=rs0
34
+ ```
35
+
36
+ ## Durability and reads
37
+
38
+ Writes always go to the primary. `w:"majority"` means a majority of members acknowledged the write before the app gets the OK; that is durability, so use it for data you can't lose. `w:1` (primary only) is fine for the regenerable. Add `wtimeoutMS` so a degraded set doesn't block writes forever. Reads default to the primary and are strongly consistent. Secondaries are eventually consistent because they lag, so only route reads to them (`secondaryPreferred`, `secondary`, or `nearest`) when stale data is acceptable: analytics and reporting, not a read-after-write the user just made.
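
The per-criticality choice could be kept in one small lookup instead of being scattered across call sites. This is a sketch under assumptions: the tier names, the 5-second timeout, and both function names are invented for illustration, and the option spelling (`wtimeoutMS`) should be checked against the driver version in use.

```javascript
// Hypothetical tiers mapped to write concern settings.
const WRITE_CONCERN = {
  critical: { w: "majority", wtimeoutMS: 5000 }, // data you can't lose
  regenerable: { w: 1 }, // sessions, caches, anything rebuildable
};

function writeConcernFor(tier) {
  const wc = WRITE_CONCERN[tier];
  if (!wc) throw new Error(`unknown tier: ${tier}`);
  return { ...wc }; // copy so callers can't mutate the shared table
}

// Reads go to a secondary only when the caller accepts stale data.
function readPreferenceFor({ staleOk }) {
  return staleOk ? "secondaryPreferred" : "primary";
}

console.log(writeConcernFor("critical"), readPreferenceFor({ staleOk: false }));
```

Throwing on an unknown tier is deliberate: a typo should fail loudly instead of silently falling back to a weaker guarantee.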
39
+
40
+ ## Authentication between members
41
+
42
+ A replica set needs internal auth or any host that can reach it can join. Generate a keyfile (`openssl rand -base64 756`), copy the same file to every member with owner-only permissions (`chmod 400`), and point `security.keyFile` at it; setting a keyfile also enforces `security.authorization: enabled` for clients. Never expose 27017 to the internet; bind to the private network and firewall it. Use TLS for client and inter-node traffic when the network isn't fully trusted.
43
+
44
+ ## OS tuning MongoDB actually requires (self-hosted)
45
+
46
+ These aren't optional polish, MongoDB warns about them and they cause real instability if skipped:
47
+
48
+ - **Disable Transparent Huge Pages (THP).** THP hurts database memory access patterns badly; set `enabled` and `defrag` to `never` at boot.
49
+ - **`vm.swappiness=1`.** Keep the working set in RAM instead of swapping out the cache.
50
+ - **Raise ulimits.** `nofile` and `nproc` to 64000/32000, the defaults are too low for a busy mongod's connections and threads.
51
+ - **XFS for the data volume.** MongoDB recommends XFS over ext4 for WiredTiger. Never put data on NFS or network storage, the latency wrecks it.
52
+
53
+ ## WiredTiger cache, especially in a container
54
+
55
+ Set `--wiredTigerCacheSizeGB` explicitly from the container's memory limit, roughly 50% of (limit minus 1 GB), and cap the container's memory in the deploy block. Don't rely on the default: it targets about half of system RAM, and the cache is only part of mongod's footprint (connections, aggregation, and sort buffers live outside it). A cache sized against the host rather than the container, plus those buffers, will exceed the container limit and get the container OOM-killed. Leave headroom on purpose.
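
The sizing rule is easy to get wrong by hand, so here is a small sketch of the arithmetic. It reads the rule as 50% of (limit minus 1 GB) with a 256 MB floor, the same shape as MongoDB's documented default; the function name is an assumption.

```javascript
// Hypothetical helper: cache size in GB for a container memory limit in GB.
// 50% of (limit - 1 GB), never below 0.25 GB (256 MB).
function wiredTigerCacheGB(containerLimitGB) {
  const MIN_CACHE_GB = 0.25;
  return Math.max(MIN_CACHE_GB, 0.5 * (containerLimitGB - 1));
}

console.log(wiredTigerCacheGB(8)); // 3.5
console.log(wiredTigerCacheGB(1)); // 0.25
```

For an 8 GB container that gives 3.5 GB of cache and leaves the rest for connections, sorts, and aggregation buffers.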
56
+
57
+ ## Running it in Docker or Swarm
58
+
59
+ The replica-set-specific traps on top of the general docker and docker-swarm skills:
60
+
61
+ - **Publish the port in `mode: host`, never ingress.** The routing mesh load-balances `27017` across all three members, which breaks the replica set, clients and members must reach a specific member. Use `ports: [{ target: 27017, published: 27017, mode: host }]`.
62
+ - **Pin each member to its own node.** Without placement constraints all three can land on one host and you have zero fault tolerance. Label nodes and constrain each service (`node.labels.mongo.replica == 1`), one member per host.
63
+ - **Bind-mount the data directory to a host path.** Anonymous or named-without-bind volumes risk data loss on recreation and are hard to back up. Pre-create the dir and `chown 999:999` (the image runs as UID 999). XFS host filesystem.
64
+ - **Real healthcheck, with a start period.** `test: ["CMD","mongosh","--eval","db.adminCommand('ping')"]` and `start_period: 60s`, so a slow first start isn't read as unhealthy.
65
+ - **Keyfile and root password via Docker secrets**, not plaintext env (which shows in `docker inspect` and image history).
66
+
67
+ ## Oplog and failover discipline
68
+
69
+ The oplog is your recovery window: a secondary that falls further behind than the oplog covers needs a full resync, and point-in-time recovery can only reach back as far as the oplog. Size it for your write volume (24h minimum, 48 to 72h is a safer target; `db.getReplicationInfo()` shows the current window). Test failover on a schedule with `rs.stepDown()`, don't discover at an outage that the app doesn't reconnect. Roll upgrades secondaries-first, one at a time, step the primary down last, and let each node resync before moving on.
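
A sketch of turning the oplog guidance into a check that a monitoring script could run. It assumes the object returned by `db.getReplicationInfo()` is passed in and uses its `timeDiffHours` field; the function name and the status labels are invented for illustration.

```javascript
// Hypothetical check: classify the oplog window against the 24h / 48h guidance.
function oplogWindowStatus(info, { minHours = 24, targetHours = 48 } = {}) {
  const hours = info.timeDiffHours;
  if (hours < minHours) return "too-small"; // resync risk, resize the oplog
  if (hours < targetHours) return "minimum"; // acceptable, but little margin
  return "ok";
}

// Sample input shaped like db.getReplicationInfo() output (values made up)
console.log(oplogWindowStatus({ timeDiffHours: 30 })); // "minimum"
```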
70
+
71
+ ---
72
+
73
+ This skill is built to grow. Add a rule when a real replica-set operation has a stable, defensible fix.
@@ -59,4 +59,7 @@ Non-negotiable for this codebase. From production, not preference.
59
59
 
60
60
  ## Schema
61
61
 
62
- - **Collections enforce no structure by default.** Add a `$jsonSchema` validator as a collection stabilizes and more code depends on its shape. Roll out safely: `validationAction: "warn"` to observe first, then `error`; `validationLevel: "moderate"` to spare existing nonconforming documents.
62
+ Collections enforce no structure by default, so validate in two layers.
63
+
64
+ - **Parse before every write.** Validate each document against its Zod schema right before it hits the database, so a malformed shape never lands. The schema itself isn't Mongo-specific, it's the same contract the API and frontend use, see the `schema-source-of-truth` skill for defining it once and deriving every layer from one base.
65
+ - **Keep a `$jsonSchema` collection validator as the floor.** Zod only guards writes that go through your app, so the DB validator is the last line that also catches mongosh, scripts, and other services. Add it as a collection stabilizes and more code depends on its shape. Roll out safely: `validationAction: "warn"` to observe first, then `error`; `validationLevel: "moderate"` to spare existing nonconforming documents.
@@ -0,0 +1,148 @@
1
+ ---
2
+ name: nginx
3
+ description: Production NGINX configuration best practices, especially as a reverse proxy in front of containerized backends. Use when writing or editing nginx.conf, server blocks, upstreams, SSL, proxy caching, security headers, structured logging, or stream (TCP/UDP) proxying. Covers the Docker-DNS resolver that keeps upstreams from going stale, separate access-controlled health ports, the stream-block placement gotcha, and headers that must be sent on errors too. Kept separate from the docker and docker-swarm skills.
4
+ when_to_use: |
5
+ - Writing or editing nginx.conf, a server block, an upstream, or an SSL block
6
+ - Putting NGINX in front of containerized services as a reverse proxy
7
+ - Upstreams that resolve once at startup and then break when a container is replaced
8
+ - Structured logging, proxy caching, security headers, or TCP/UDP (stream) proxying
9
+ - Do NOT use for Dockerfile or Swarm deploy concerns, those are the docker and docker-swarm skills
10
+ ---
11
+
12
+ # NGINX: Production Reverse-Proxy Config
13
+
14
+ Aimed at NGINX in front of containerized backends. The defaults are fine for a static site, these are the things that bite in production.
15
+
16
+ ## Resolve upstreams through Docker DNS with a short TTL
17
+
18
+ By default NGINX resolves an upstream host once, at startup, and caches the IP forever. In Docker that IP belongs to a container that will be replaced, so the proxy keeps sending traffic to a dead address. Point NGINX at Docker's internal DNS and force re-resolution:
19
+
20
+ ```nginx
21
+ http {
22
+ resolver 127.0.0.11 ipv6=off valid=10s; # Docker DNS, re-resolve every 10s
23
+ }
24
+ ```
25
+
26
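+ `valid=10s` caps how long NGINX caches a DNS answer, overriding the record's TTL. The resolver is only consulted at request time, though: a hostname written literally in `proxy_pass` or in an `upstream` `server` line is still resolved once at startup. To actually pick up a replaced container, put the name in a variable (`set $backend http://backend-service:8080; proxy_pass $backend;`) or, where your NGINX build supports it, add the `resolve` parameter to the upstream `server` line together with a shared-memory `zone`. This is the NGINX side of "never hardcode IPs"; see the docker-swarm skill for the principle.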
27
+
28
+ ## Upstreams by service name, with keepalive
29
+
30
+ Reference backends by service name, never IP. Reuse connections with `keepalive`, which needs HTTP/1.1 and a cleared Connection header:
31
+
32
+ ```nginx
33
+ upstream backend {
34
+ server backend-service:8080;
35
+ keepalive 32;
36
+ }
37
+ server {
38
+ location / {
39
+ proxy_pass http://backend;
40
+ proxy_http_version 1.1;
41
+ proxy_set_header Connection "";
42
+ }
43
+ }
44
+ ```
45
+
46
+ ## Structured JSON logs to stdout/stderr
47
+
48
+ Log JSON so an aggregator can parse it, and write to stdout/stderr so Docker's logging driver captures it. Never log to a file inside the container.
49
+
50
+ ```nginx
51
+ log_format json_log escape=json '{'
52
+ '"time":$msec,"method":"$request_method","status":$status,'
53
+ '"uri":"$request_uri","rt":$request_time,'
54
+ '"upstream":"$upstream_addr","cache":"$upstream_cache_status",'
55
+ '"client":"$remote_addr","xff":"$http_x_forwarded_for"'
56
+ '}';
57
+ access_log /dev/stdout json_log;
58
+ error_log /dev/stderr warn;
59
+ ```
60
+
61
+ ## Health and status on separate, access-restricted ports
62
+
63
+ Keep health checks and metrics off the production port: different access control, no log noise, no interference with real traffic. Restrict to internal networks and turn off access logging.
64
+
65
+ ```nginx
66
+ server { # load balancer health check
67
+ listen 82;
68
+ allow 10.0.0.0/8; allow 172.16.0.0/12; allow 127.0.0.1; deny all;
69
+ location /health { access_log off; return 200 "OK"; }
70
+ }
71
+ server { # stub_status for Prometheus/Datadog
72
+ listen 81;
73
+ allow 10.0.0.0/8; allow 127.0.0.1; deny all;
74
+ location /nginx_status { stub_status on; }
75
+ }
76
+ ```
77
+
78
+ A deep health check that proxies an upstream's own `/health` is worth a third port when a service's liveness depends on its backend being reachable.
79
+
80
+ ## Stream (TCP/UDP) blocks go OUTSIDE the http block
81
+
82
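+ Proxying a non-HTTP protocol such as MongoDB's wire protocol or another database uses the `stream` module, which is a top-level block, not a child of `http`. Nesting it inside `http` is a configuration error: `nginx -t` fails with `"stream" directive is not allowed here`. HTTP services (an Elasticsearch REST proxy, say) stay inside `http`. The `load_module` line is only needed when the stream module was built as a dynamic module.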
83
+
84
+ ```nginx
85
+ load_module modules/ngx_stream_module.so;
86
+ include /etc/nginx/mongo.conf; # stream { ... } OUTSIDE http
87
+ http {
88
+ include /etc/nginx/elasticsearch.conf; # HTTP proxy, INSIDE http
89
+ }
90
+ ```
91
+
92
+ ## SSL and security headers
93
+
94
+ Modern protocols and ciphers, session cache, OCSP stapling. When certs come from Docker secrets they are mounted at `/run/secrets/<name>`:
95
+
96
+ ```nginx
97
+ ssl_certificate /run/secrets/server_pem;
98
+ ssl_certificate_key /run/secrets/server_key;
99
+ ssl_protocols TLSv1.2 TLSv1.3;
100
+ ssl_session_cache shared:SSL:60m;
101
+ ssl_stapling on; ssl_stapling_verify on;
102
+ ```
103
+
104
+ Send security headers with `always` so they are present on error responses too, not just 2xx/3xx:
105
+
106
+ ```nginx
107
+ add_header X-Frame-Options "SAMEORIGIN" always;
108
+ add_header X-Content-Type-Options "nosniff" always;
109
+ add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
110
+ ```
111
+
112
+ ## Always add a Content-Security-Policy, this is the one that gets skipped
113
+
114
+ CSP is the highest-value security header and the one almost always left out. The others harden edges; CSP is the primary defense against XSS and content injection, it tells the browser which sources are allowed to load scripts, styles, images, and frames, so an injected `<script>` from an attacker simply doesn't execute. A site without a CSP has no second line of defense once markup injection gets through. Add it by default, do not wait to be asked.
115
+
116
+ `default-src 'self'` alone is technically a CSP but it breaks most real apps (CDNs, inline styles, analytics) and lulls you into thinking you're covered, so set the directives explicitly:
117
+
118
+ ```nginx
119
+ add_header Content-Security-Policy "default-src 'self'; script-src 'self'; style-src 'self'; img-src 'self' data:; font-src 'self'; connect-src 'self'; object-src 'none'; base-uri 'self'; frame-ancestors 'self'" always;
120
+ ```
121
+
122
+ Rules that matter:
123
+ - **Avoid `'unsafe-inline'` and `'unsafe-eval'` in `script-src`.** They re-open the XSS hole CSP exists to close. If you have inline scripts, use a per-request nonce or a hash, not a blanket unsafe allow.
124
+ - **`object-src 'none'` and `base-uri 'self'`** are free wins that block plugin and base-tag injection. Set them every time.
125
+ - **`frame-ancestors`** controls who can iframe you and supersedes `X-Frame-Options`, so put your clickjacking policy here.
126
+ - **Roll out with report-only first.** A too-strict CSP breaks the page silently. Ship `Content-Security-Policy-Report-Only` to collect violations without enforcing, watch what trips, tighten, then promote to the enforcing header. The real policy is app-specific and is built by tuning, not guessed in one line.
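
When inline scripts need a per-request nonce, the header usually has to be produced by the application instead of a static NGINX `add_header`. A minimal sketch of assembling the policy string in Node follows; `buildCsp`, `policyWithNonce`, and the `policy` object are hypothetical names, and the directives mirror the example policy in this section.

```javascript
const { randomBytes } = require("node:crypto");

// Base policy as data, so directives can be tuned without string surgery.
const policy = {
  "default-src": ["'self'"],
  "script-src": ["'self'"],
  "object-src": ["'none'"],
  "base-uri": ["'self'"],
  "frame-ancestors": ["'self'"],
};

// Serialize { directive: [sources] } into a CSP header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(" ")}`)
    .join("; ");
}

// Return a copy of the policy that allows scripts carrying this nonce.
function policyWithNonce(nonce) {
  return { ...policy, "script-src": [...policy["script-src"], `'nonce-${nonce}'`] };
}

const nonce = randomBytes(16).toString("base64"); // fresh value per response
console.log(buildCsp(policyWithNonce(nonce)));
```

The same string can be sent under `Content-Security-Policy-Report-Only` first and promoted to the enforcing header once violations stop.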
127
+
128
+ ## Proxy caching for read-heavy upstreams
129
+
130
+ Cache GET/HEAD, and serve stale on upstream error or timeout so a backend hiccup doesn't reach users:
131
+
132
+ ```nginx
133
+ proxy_cache es_cache; # zone must be declared with proxy_cache_path ... keys_zone=es_cache:<size> in http {}
134
+ proxy_cache_methods GET HEAD;
135
+ proxy_cache_valid 200 1m;
136
+ proxy_cache_key $host$uri$is_args$args; # $is_args keeps "/a?b" and "/ab" from sharing a key
137
+ proxy_cache_use_stale updating error timeout http_500 http_502 http_503 http_504;
138
+ proxy_hide_header X-Powered-By;
139
+ add_header X-Proxy-Cache $upstream_cache_status;
140
+ ```
141
+
142
+ ## Watch line endings in config files
143
+
144
+ NGINX config copied in with Windows CRLF line endings can fail to parse or behave oddly in a Linux container, which is why production NGINX images often run `dos2unix` on the configs at build time. If `nginx -t` reports something that makes no sense, check the line endings first, see the dev-pitfalls skill.
145
+
146
+ ---
147
+
148
+ This skill is built to grow. Add a directive when a real production NGINX problem has a stable, defensible fix. ModSecurity/WAF setup (build as a dynamic module, `load_module`) is deep enough to deserve its own section when needed.
@@ -0,0 +1,4 @@
1
+ [ZoneTransfer]
2
+ ZoneId=3
3
+ ReferrerUrl=https://claude.ai/chat/597ba4c7-56c2-4a17-8c34-c62cc3cd01b9
4
+ HostUrl=https://claude.ai/api/organizations/72c98b96-eeec-4437-8319-588863e85078/conversations/597ba4c7-56c2-4a17-8c34-c62cc3cd01b9/wiggle/download-file?path=%2Fmnt%2Fuser-data%2Foutputs%2Fskills%2Fnginx%2FSKILL.md
@@ -0,0 +1,121 @@
1
+ ---
2
+ name: nodejs
3
+ description: Node.js backend runtime and process-lifecycle rules that Claude reliably gets wrong. Use when writing a Node server or long-running script, an Express app, worker threads, or process signal handling, and when choosing packages. Covers correct graceful shutdown, crash-on-fault instead of swallowing errors, not blocking the event loop, loading instrumentation first, securing the session cookie, and replacing deprecated packages with built-ins. Defers MongoDB to mongodb-rules, schema/validation to schema-source-of-truth, and image/deploy to docker and docker-swarm.
4
+ when_to_use: |
5
+ - Writing a Node server (`server.js`, Express) or a long-running script or worker
6
+ - Adding or reviewing process signal handling, graceful shutdown, or error handling
7
+ - CPU-heavy work that might block the event loop
8
+ - Choosing an HTTP client, UUID lib, date lib, or any dependency with a modern built-in
9
+ - Do NOT use for Mongo query shape (mongodb-rules), validation schemas (schema-source-of-truth), or Dockerfiles (docker)
10
+ ---
11
+
12
+ # Node.js: Process Lifecycle and Runtime Rules
13
+
14
+ Claude writes Node services that work in a demo and fall over in production, almost always around the process lifecycle. These are the parts to get right.
15
+
16
+ ## Graceful shutdown, done correctly
17
+
18
+ A server must shut down cleanly on `SIGTERM` (what `docker stop`, Swarm, and Kubernetes send) and `SIGINT` (Ctrl-C). Claude usually omits this entirely, so the orchestrator waits the grace period and then SIGKILLs, dropping in-flight requests. The correct sequence is: stop accepting new connections, drain in-flight ones, close dependencies, exit 0, with a hard timeout so a stuck connection can't block shutdown forever.
19
+
20
+ ```javascript
21
+ const server = app.listen(port);
22
+ let shuttingDown = false;
23
+
24
+ async function shutdown(signal) {
25
+ if (shuttingDown) return; // ignore repeat signals
26
+ shuttingDown = true;
27
+ logger.info("shutting down", { signal });
28
+
29
+ const force = setTimeout(() => { // drain hung? force it
30
+ logger.error("shutdown timed out, forcing exit");
31
+ process.exit(1);
32
+ }, 10_000);
33
+ force.unref();
34
+
35
+ try {
36
+ await new Promise((resolve, reject) => server.close((err) => (err ? reject(err) : resolve()))); // stop new conns, let in-flight finish, surface close errors
37
+ await db.close(); // then close DB, redis, change streams
38
+ clearTimeout(force);
39
+ process.exit(0);
40
+ } catch (err) {
41
+ logger.error("error during shutdown", { err });
42
+ process.exit(1);
43
+ }
44
+ }
45
+
46
+ process.on("SIGTERM", () => shutdown("SIGTERM"));
47
+ process.on("SIGINT", () => shutdown("SIGINT"));
48
+ ```
49
+
50
+ Two things that look fine but are bugs. Do not put async cleanup in a `process.on("exit", ...)` handler: the event loop is already stopped, so nothing async runs; `exit` is for synchronous work only. And do not trap `SIGUSR1`; Node uses it for the debugger. All of this only works if the process actually receives the signal, which means an exec-form `ENTRYPOINT` so no shell sits between the container runtime and Node (see the docker skill), and `init: true` so a minimal init runs as PID 1 and forwards signals (see docker-swarm).
51
+
52
+ ## Let it crash, never swallow a fault
53
+
54
+ On `uncaughtException` or `unhandledRejection` the process is in an unknown, possibly corrupt state. Log it and exit non-zero so the orchestrator restarts a clean process. Do not catch-and-continue. Node 15 and later already terminate on an unhandled rejection by default, so code that relies on swallowing one is both wrong and fragile.
55
+
56
+ ```javascript
57
+ process.on("uncaughtException", (err) => {
58
+ logger.error("uncaught exception", { err });
59
+ process.exit(1); // exit non-zero so restart_policy: on-failure restarts
60
+ });
61
+ process.on("unhandledRejection", (reason) => {
62
+ logger.error("unhandled rejection", { reason });
63
+ process.exit(1);
64
+ });
65
+ ```
66
+
67
+ The non-zero exit is what makes Swarm/K8s self-healing fire, an `exit(0)` on a crash reads as success and the dead service is never restarted (see docker-swarm). You can route these through `shutdown()` to drain first, but never let the process keep serving after one.
68
+
69
+ ## Don't block the event loop
70
+
71
+ Node runs your JavaScript on a single thread. A CPU-bound stretch, parsing a huge payload, hashing, image work, a tight loop over a large array, freezes every concurrent request until it finishes. I/O is already async and is not the problem. For real CPU work, offload to `worker_threads`, not `child_process` (for in-process JS) and not "just make it async" (await doesn't yield during a synchronous loop).
72
+
73
+ ```javascript
74
+ const { Worker } = require("node:worker_threads");
75
+ new Worker("./workers/process.js", {
76
+ workerData,
77
+ resourceLimits: { maxOldGenerationSizeMb: 512 }, // cap so one worker can't OOM the host
78
+ });
79
+ ```
80
+
81
+ ## Load instrumentation before anything else
82
+
83
+ APM and tracing libraries (`dd-trace`, the OpenTelemetry SDK) work by monkey-patching `http`, `express`, and your DB driver. They can only patch modules loaded after them, so the init call must be the very first thing in the entry file, before any `require("express")`. Required late, it silently instruments nothing.
84
+
85
+ ```javascript
86
+ // server.js, line 1
87
+ require("dd-trace").init({ /* ... */ });
88
+ const express = require("express"); // now traced
89
+ ```
90
+
91
+ ## Lock down the session cookie
92
+
93
+ When Claude sets up sessions it usually sets `httpOnly` and stops. Set all three: `httpOnly` (no JS access), `secure` in production (HTTPS only), and `sameSite` (CSRF defense), which is the one that gets missed.
94
+
95
+ ```javascript
96
+ cookie: { httpOnly: true, secure: isProd, sameSite: "lax", maxAge: 86_400_000 }
97
+ ```
98
+
99
+ Related CORS gotcha: `credentials: true` cannot be combined with `origin: "*"`, the browser rejects it. Echo a specific allowed origin instead.
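
Echoing a specific origin can be sketched as a plain allowlist lookup. This is framework-neutral and hypothetical (`corsHeaders` and the example origins are not from any library); a real app would wire the same logic into its CORS middleware.

```javascript
// Hypothetical helper: headers for a credentialed CORS response.
// Returns no CORS headers at all when the origin is not allowlisted.
function corsHeaders(requestOrigin, allowlist) {
  if (!allowlist.includes(requestOrigin)) return {};
  return {
    "Access-Control-Allow-Origin": requestOrigin, // echo, never "*"
    "Access-Control-Allow-Credentials": "true",
    Vary: "Origin", // caches must not reuse one origin's response for another
  };
}

const allowed = ["https://app.example.com"];
console.log(corsHeaders("https://app.example.com", allowed));
console.log(corsHeaders("https://evil.example.net", allowed));
```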
100
+
101
+ ## Reach for built-ins, replace deprecated packages
102
+
103
+ Claude's training pulls in libraries that are now deprecated or unnecessary. Prefer the platform:
104
+
105
+ - `crypto.randomUUID()` over the `uuid` package for a v4 id, and `uuid` over `node-uuid`
106
+ - native `fetch` (Node 18+) or `axios` over `request` (unmaintained since 2020)
107
+ - `node:test` + `node:assert` for simple suites, `structuredClone()` over a deep-clone dep
108
+ - the Intl APIs or `date-fns`/Temporal over `moment` (in maintenance mode)
109
+ - `@aws-sdk/client-*` v3 (modular) over the monolithic `aws-sdk` v2
110
+ - `sass` (dart-sass) over the deprecated `node-sass`
111
+
112
+ Use the `node:` prefix on built-in imports (`require("node:fs")`) so there's no ambiguity with a same-named package.
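
Two of the built-ins from the list, shown together as a quick sketch. It assumes a Node version recent enough to ship `structuredClone` as a global (17 or later); the sample object is made up.

```javascript
const { randomUUID } = require("node:crypto");

// v4 UUID without the uuid package
const id = randomUUID();

// Deep copy without a clone dependency; Dates survive as Dates
const original = { user: { name: "a" }, when: new Date(0) };
const copy = structuredClone(original);
copy.user.name = "b"; // does not touch the original

console.log(id, original.user.name, copy.user.name);
```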
113
+
114
+ ## Two smaller ones
115
+
116
+ - **Logging:** a structured logger (pino or winston) emitting JSON to stdout in production, never `console.log` in a hot path. stdout because the container's logging driver collects it (see docker).
117
+ - **PM2:** cluster mode is for using all cores on a VM or bare-metal host. Inside a Swarm or K8s container, run one Node process and scale with replicas plus `init: true`, don't stack two process managers that both try to own restarts.
118
+
119
+ ---
120
+
121
+ This skill is built to grow. Add a rule when a real Node production failure has a stable, defensible fix.
@@ -0,0 +1,4 @@
1
+ [ZoneTransfer]
2
+ ZoneId=3
3
+ ReferrerUrl=https://claude.ai/chat/597ba4c7-56c2-4a17-8c34-c62cc3cd01b9
4
+ HostUrl=https://claude.ai/api/organizations/72c98b96-eeec-4437-8319-588863e85078/conversations/597ba4c7-56c2-4a17-8c34-c62cc3cd01b9/wiggle/download-file?path=%2Fmnt%2Fuser-data%2Foutputs%2Fskills%2Fnodejs%2FSKILL.md
@@ -0,0 +1,128 @@
1
+ ---
2
+ name: responsive-css
3
+ description: Writing CSS and markup that works on phone and desktop at the same time, the responsive failures Claude repeats. Use when building or editing any web page or component, or when something overflows horizontally, a code block blows out the page, text is huge on mobile, or content won't scroll on touch. Covers the viewport meta tag, the flex/grid min-width:0 rule that fixes most overflow, code blocks that scroll instead of overflowing, fluid type with clamp(), and mobile-first breakpoints. This is authoring guidance, design-review evaluates the result.
4
+ when_to_use: |
5
+ - Writing or editing CSS or HTML for a page or component that renders in a browser
6
+ - Anything that has to look right on both a phone and a desktop
7
+ - Fixing horizontal page overflow, a code block or table that overflows, text too large on mobile, or a region that won't scroll on touch
8
+ - The whole page suddenly scrolls, jumps, or rubber-bands on iPhone/Safari, or background scrolls under an open modal
9
+ - Building docs/landing pages with code blocks, which overflow on mobile constantly
10
+ - Do NOT use for native mobile (React Native, Flutter) or non-visual code
11
+ ---
12
+
13
+ # Responsive CSS: Mobile and Desktop at Once
14
+
15
+ Claude writes CSS for the desktop it's picturing and never checks the narrow viewport, so the same few things break every time: content overflows sideways, code blocks blow out the page, and text that's right on desktop is huge on a phone. Design for the small screen first and these mostly disappear.
16
+
17
+ ## Set the viewport, stop iOS inflating text
18
+
19
+ Without the viewport meta tag, mobile browsers lay the page out at a ~980px virtual width and scale the result down to fit, which is why everything looks tiny and zoomed-out on a phone and width-based media queries never match. This one line is non-negotiable on every page:
20
+
21
+ ```html
22
+ <meta name="viewport" content="width=device-width, initial-scale=1" />
23
+ ```
24
+
25
+ And stop iOS from auto-enlarging text:
26
+
27
+ ```css
28
+ html { -webkit-text-size-adjust: 100%; text-size-adjust: 100%; }
29
+ ```
30
+
31
+ ## The flex/grid `min-width: 0` rule, this fixes most overflow
32
+
33
+ This is the single most common cause of mysterious horizontal scroll. Flex and grid children default to `min-width: auto`, which means they refuse to shrink below their content's intrinsic width. So one long line, a URL, or a `<pre>` inside a flex/grid item pushes the whole layout wider than the screen. Set `min-width: 0` on the child (or `overflow: hidden`) and it shrinks correctly.
34
+
35
+ ```css
36
+ .flex-child, .grid-child { min-width: 0; } /* lets long content shrink instead of overflowing */
37
+ ```
38
+
39
+ If you fix nothing else, fix this. It's behind the code-block overflow in most "works on desktop, scrolls sideways on mobile" pages.
40
+
41
+ ## Code blocks scroll, they don't overflow the page
42
+
43
+ A `<pre>` doesn't wrap and has no scroll affordance by default, so long lines expand the page. Make the block scroll inside itself, and never line-wrap code (wrapping changes what the code means). If the block sits inside a flex or grid item, that item needs `min-width: 0` per the rule above, or this still overflows.
44
+
45
+ ```css
46
+ pre {
47
+ overflow-x: auto; /* scroll inside the block */
48
+ max-width: 100%;
49
+ }
50
+ pre code { white-space: pre; } /* keep code on its own lines, scroll horizontally */
51
+ ```
52
+
53
+ For prose, the opposite: let long words and URLs break instead of overflowing.
54
+
55
+ ```css
56
+ p, li, h1, h2, h3 { overflow-wrap: break-word; }
57
+ ```
58
+
59
+ Wrap wide tables the same way you handle code, in a scroll container: `<div style="overflow-x:auto">…table…</div>`.
60
+
61
+ ## Fluid type with `clamp()`, not fixed desktop sizes
62
+
63
+ A heading sized for desktop is enormous on a phone and forces wrapping and overflow. `clamp()` scales the size smoothly between a mobile floor and a desktop ceiling with no breakpoints to juggle:
64
+
65
+ ```css
66
+ h1 { font-size: clamp(1.75rem, 4vw + 1rem, 3rem); }
67
+ body { font-size: clamp(1rem, 0.5vw + 0.9rem, 1.125rem); line-height: 1.5; }
68
+ ```
69
+
70
+ The middle value does the scaling; the floor and ceiling keep it readable at both ends.
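To see where the floor and ceiling take over, the arithmetic behind `clamp()` can be worked through by hand. The sketch below is illustrative only: `fluidSizePx` and `viewportWhere` are hypothetical helper names, and it assumes the browser default of `1rem = 16px`.

```typescript
// Assumption: 1rem = 16px (browser default root font size).
const REM_PX = 16;

// Resolve clamp(minRem, vwCoeff*vw + baseRem, maxRem) to pixels for a viewport width.
function fluidSizePx(
  viewportPx: number,
  minRem: number,
  vwCoeff: number,
  baseRem: number,
  maxRem: number,
): number {
  const preferred = (vwCoeff * viewportPx) / 100 + baseRem * REM_PX;
  return Math.min(Math.max(preferred, minRem * REM_PX), maxRem * REM_PX);
}

// Viewport width (px) at which the preferred value equals targetRem.
function viewportWhere(targetRem: number, vwCoeff: number, baseRem: number): number {
  return ((targetRem - baseRem) * REM_PX * 100) / vwCoeff;
}

// The h1 rule above: clamp(1.75rem, 4vw + 1rem, 3rem)
const h1SizeAt = (viewportPx: number): number => fluidSizePx(viewportPx, 1.75, 4, 1, 3);

console.log(h1SizeAt(375));              // 31  (scaling: 15px + 16px)
console.log(h1SizeAt(1280));             // 48  (capped at the 3rem ceiling)
console.log(viewportWhere(1.75, 4, 1));  // 300 (floor applies below this width)
console.log(viewportWhere(3, 4, 1));     // 800 (ceiling applies above this width)
```

Under those assumptions the heading scales only between 300px and 800px of viewport width, which is a quick way to check that a `clamp()` range covers the phone-to-tablet span you care about.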
71
+
72
+ ## Mobile-first: base styles small, enhance up
73
+
74
+ Write the base styles for the narrow screen, then add `@media (min-width: ...)` to enhance for larger ones. Desktop-first with `max-width` patches is exactly how you get a layout that works on the desktop and falls apart on mobile, because the mobile case is an afterthought bolted on.
75
+
76
+ ```css
77
+ .layout { display: grid; grid-template-columns: 1fr; gap: 1rem; } /* phone */
78
+ @media (min-width: 48rem) {
79
+ .layout { grid-template-columns: 240px 1fr; } /* tablet up */
80
+ }
81
+ ```
82
+
83
+ Check the phone width first, not last.
84
+
85
+ ## Sideways scroll: the rest of the causes, and the touch-scroll bug
86
+
87
+ If the page scrolls horizontally on mobile, the `min-width: 0` rule above is the first thing to check. The other usual suspects:
88
+
89
+ - **Set `box-sizing: border-box` globally.** Without it, `width: 100%` plus any `padding` or `border` adds up to wider than the parent and overflows. This is a top cause of sideways scroll and Claude often omits it from scratch CSS:
90
+ ```css
91
+ *, *::before, *::after { box-sizing: border-box; }
92
+ ```
93
+ - **No fixed pixel widths wider than the phone.** `width: 800px` or `min-width: 600px` on a 375px screen forces horizontal scroll. Use `max-width`, percentages, or `min(800px, 100%)` so the element caps at the viewport.
94
+ - **Watch positioned and decorative elements.** Absolutely positioned, transformed, or negative-margin elements (offset images, background blobs, things nudged with `right:` or `translateX`) stick out past the right edge and widen the scrollable area even when the layout looks fine. Constrain them, or clip on a wrapper with `overflow-x: clip` (preferred over `hidden`, it doesn't create a scroll container and so won't break `position: sticky`).
95
+ - **Use `width: 100%`, not `100vw`.** `100vw` includes the scrollbar width wherever the scrollbar takes up layout space (classic desktop scrollbars), so a full-width element ends up wider than the content area and scrolls the page sideways.
96
+ - **Cap media, and give images dimensions:** `img, video, svg, canvas { max-width: 100%; height: auto; }`, and set the intrinsic `width`/`height` attributes on every `<img>` so the browser reserves the box and the page doesn't shift or repaint when the image loads (see web-performance).
97
+ - **Don't paper over it with `body { overflow-x: hidden }`.** That hides the symptom and breaks `position: sticky`. Find the offender: temporarily add `* { outline: 1px solid red; }` (outline, not border, so it doesn't change layout) and look for the element wider than the viewport, then fix that element.
98
+ - **Can't scroll a code block or editor on touch?** It's almost always an ancestor with `overflow: hidden` clipping it, or the region has a fixed height with no `overflow: auto`. Give the scroll region `overflow: auto` and a sane `max-height`, and make sure no ancestor sets `overflow: hidden` or `touch-action: none` over it.
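The `box-sizing` and `100vw` bullets above both come down to simple width arithmetic. A minimal sketch of that arithmetic follows; the helper names are hypothetical, and the 15px scrollbar width is an assumed figure for a classic desktop scrollbar, not a value from this skill.

```typescript
type BoxSizing = "content-box" | "border-box";

// Outer width of an element declared `width: 100%` inside a parent of parentPx.
// Simplification: assumes padding + border are smaller than the parent width.
function outerWidthPx(
  parentPx: number,
  paddingEachSidePx: number,
  borderEachSidePx: number,
  sizing: BoxSizing,
): number {
  return sizing === "border-box"
    ? parentPx // padding and border are counted inside the declared width
    : parentPx + 2 * paddingEachSidePx + 2 * borderEachSidePx;
}

// How far an element sticks out past its container (0 when it fits).
function overflowPx(elementPx: number, containerPx: number): number {
  return Math.max(0, elementPx - containerPx);
}

// width:100% + 16px padding + 1px border on a 375px phone
console.log(overflowPx(outerWidthPx(375, 16, 1, "content-box"), 375)); // 34
console.log(overflowPx(outerWidthPx(375, 16, 1, "border-box"), 375));  // 0

// width:100vw on a 1280px viewport with an assumed 15px scrollbar
console.log(overflowPx(1280, 1280 - 15)); // 15
```

Even a few pixels of overflow are enough to make the whole page scroll sideways, which is why these two causes are easy to miss by eye.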
99
+
100
+ ## iOS Safari: when the whole page suddenly scrolls or jumps
101
+
102
+ A classic, and it's almost always one of four iOS-specific behaviors, each with a different fix:
103
+
104
+ - **Scroll chaining.** Drag inside a scrollable region (a modal, drawer, code block, chat list), hit its top or bottom, keep dragging, and the scroll "leaks" to the page so the whole thing moves and rubber-bands underneath. Stop it by containing the scroll on that region:
105
+ ```css
106
+ .modal-body, .drawer, .scroll-region { overscroll-behavior: contain; }
107
+ ```
108
+ - **Background scrolls under an open modal.** `body { overflow: hidden }` does not reliably hold on iOS Safari, the page still scrolls behind the overlay. The robust lock is to fix the body and restore the scroll position on close:
109
+ ```js
110
+ // open: remember position, freeze the body in place
111
+ const y = window.scrollY;
112
+ document.body.style.cssText = `position:fixed; top:${-y}px; left:0; right:0;`;
113
+ // close: release and jump back exactly where they were
114
+ document.body.style.cssText = "";
115
+ window.scrollTo(0, y);
116
+ ```
117
+ - **The `100vh` toolbar jump.** On iOS, `100vh` counts the area behind Safari's address bar, so a `height:100vh` section is taller than the visible viewport and the page jumps as the toolbar shows and hides. Use the dynamic viewport unit instead, with a legacy fallback:
118
+ ```css
119
+ .full-height { height: 100vh; height: 100dvh; } /* dvh tracks the real visible height */
120
+ ```
121
+ - **Tap an input and the page zooms and scrolls.** iOS Safari auto-zooms when you focus an input whose `font-size` is under 16px, which scrolls and rescales the whole page. Give form controls at least 16px. Do not "fix" this with `maximum-scale=1` or `user-scalable=no`, that disables pinch-zoom and hurts accessibility.
122
+ ```css
123
+ input, select, textarea { font-size: 16px; }
124
+ ```
125
+
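The body-lock steps in the modal bullet above can be packaged so that the saved scroll position travels with the unlock call. This is a sketch under stated assumptions, not a definitive implementation: `lockBodyScroll` is a hypothetical name, and the two small interfaces exist only so the function can be exercised outside a browser (in a page, pass `window` and `document.body`). Unlike assigning an empty string on close, it restores whatever inline styles the body had before.

```typescript
// Minimal structural types; `window` and `document.body` satisfy them in a browser.
interface ScrollWindow {
  scrollY: number;
  scrollTo(x: number, y: number): void;
}
interface BodyLike {
  style: { cssText: string };
}

// Freeze the body at the current scroll offset and return the matching unlock function.
function lockBodyScroll(win: ScrollWindow, body: BodyLike): () => void {
  const y = win.scrollY;                  // remember where the user was
  const previousCss = body.style.cssText; // keep any existing inline styles
  const lockCss = `position:fixed; top:${-y}px; left:0; right:0;`;
  body.style.cssText = previousCss ? `${previousCss}; ${lockCss}` : lockCss;

  return function unlock(): void {
    body.style.cssText = previousCss;     // release the body
    win.scrollTo(0, y);                   // jump back to the saved offset
  };
}
```

A caller would typically run `const unlock = lockBodyScroll(window, document.body)` when the modal opens and call `unlock()` when it closes.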
126
+ ---
127
+
128
+ This skill is built to grow. Add a rule when a real responsive failure has a stable, defensible fix.
@@ -0,0 +1,50 @@
1
+ ---
2
+ name: schema-source-of-truth
3
+ description: One canonical Zod schema per entity, reused across the stack instead of redeclared at each layer. Use whenever defining or changing a data shape, a TypeScript type or interface for an entity, an API request/response validator, an Express body/query/params check, a frontend form validator, or a DB document shape. Catches the same-entity-defined-four-times drift. TypeScript-first, derive types and per-layer variants from one base instead of hand-writing parallel copies.
4
+ when_to_use: |
5
+ - Defining or editing an entity's shape (a User, Order, etc.) in types, an API, a form, or the DB
6
+ - Writing API request/response validation or Express middleware that checks req.body / query / params
7
+ - Writing a frontend form validator, or a TypeScript interface/type for data that also lives on the backend
8
+ - Any moment you're about to write a second definition of a shape that already exists somewhere
9
+ - Do NOT use for one-off internal types with no cross-layer counterpart
10
+ ---
11
+
12
+ # Schema as a Single Source of Truth
13
+
14
+ A data entity should be defined once, as a Zod schema, and everything else derived from it. The failure pattern to kill: declaring the same entity separately at each layer, a frontend `interface User`, a backend `interface User`, a hand-written API validator, and a manual Mongo `$jsonSchema`. Four shapes for one entity, kept in sync by hand, guaranteed to drift the first time a field is added or renamed.
15
+
16
+ ## One schema per entity, derive the rest
17
+
18
+ Define the canonical schema once and generate every other representation from it:
19
+
20
+ - **TypeScript type:** `type User = z.infer<typeof UserSchema>`. Never hand-write an `interface` that parallels a schema, infer it so it can't fall out of sync.
21
+ - **API validation:** parse `req.body` / `req.query` / `req.params` through the schema in middleware. `safeParse` and return 400 on failure, no field-by-field `if` checks.
22
+ - **Frontend forms:** the same schema drives form validation (`zodResolver` with react-hook-form), so client and server reject the same inputs by the same rules.
23
+ - **Pre-write guard:** parse before writing to the DB. See the `mongodb-rules` skill for the Mongo-specific parse-before-write and `$jsonSchema` floor.
24
+ - **OpenAPI and `$jsonSchema`:** generate them from the schema (`zod-to-openapi`, `zod-to-json-schema`) rather than maintaining them by hand.
25
+
26
+ Zod is TypeScript-first with zero runtime dependencies, so this costs one small library and removes every duplicated definition.
27
+
28
+ ## One base, many variants (they are not identical)
29
+
30
+ "Same schema everywhere" is the goal, but a create payload is not the stored document and a response is not the request, so don't pretend they're one object. Model it honestly: one base schema, with per-layer variants derived from it, sharing a single origin while differing where they genuinely must.
31
+
32
+ ```typescript
33
+ const UserSchema = z.object({
34
+ _id: z.string(),
35
+ email: z.string().email(),
36
+ name: z.string().min(1),
37
+ createdAt: z.date(),
38
+ });
39
+
40
+ const CreateUser = UserSchema.omit({ _id: true, createdAt: true }); // POST body
41
+ const UpdateUser = CreateUser.partial(); // PATCH body
42
+ const UserResponse = UserSchema.extend({ displayName: z.string() }); // adds computed field
43
+ type User = z.infer<typeof UserSchema>;
44
+ ```
45
+
46
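+ Derive variants with `.omit()`, `.partial()`, `.pick()`, `.extend()`. When the base gains a field, every variant built with `.omit()`, `.partial()`, or `.extend()` inherits it automatically, which is the entire point; a `.pick()` variant is an explicit allow-list, so a new field has to be added to it deliberately. A hand-copied variant is just the drift problem at smaller scale.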
47
+
48
+ ## Make it physically shared
49
+
50
+ Single source of truth only holds if there is literally one file. Put entity schemas in a shared module both sides import, a `packages/schemas` workspace, or a shared `src/schemas/` reachable by frontend and backend. If each side keeps its own copy, they diverge no matter how disciplined the intent. The shared import is the enforcement, not the convention.
@@ -52,7 +52,7 @@ await page.goto('/dashboard'); // no assertion at all
52
52
  The data layer has rules that test data must respect, or the test passes while masking the exact bug that bites in production.
53
53
 
54
54
  - **Seed real `ObjectId` values, not string ids.** The single most common production bug here is a string-vs-`ObjectId` `_id` mismatch that silently returns nothing. A test seeded with string ids passes and hides it. Use actual `ObjectId` types in fixtures.
55
- - **Exercise the StrictDB adapter, not a hand-rolled driver mock.** Tests go through the same `adapters/` boundary the handlers use. Mock at the network or data boundary, not by reimplementing the driver.
55
+ - **Exercise the data adapter (StrictDB or native), not a hand-rolled driver mock.** Tests go through the same `adapters/` boundary the handlers use. Mock at the network or data boundary, not by reimplementing the driver.
56
56
  - **Test the round trip.** Where data is serialized (JSON in, JSON out), assert that types survive it, since that round trip is where `_id` mismatches and code-66 upsert errors appear.
57
57
 
58
58
  ## Unit tests (Vitest)