homegames-common 1.5.2 → 1.5.4

package/docker-helper.js CHANGED
@@ -101,7 +101,7 @@ const runGameContainer = async ({
  saveDataPath,
  assetCachePath = null,
  imageName = 'homegames-runner',
- memoryLimit = '128m',
+ memoryLimit = '196m',
  cpuLimit = '1',
  gameEntryRelative = null,
  noFrame = false,
@@ -395,4 +395,5 @@ module.exports = {
  stopContainer,
  isContainerRunning,
  streamContainerLogs,
+ parseMemoryString,
  };
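The second hunk exports `parseMemoryString`, but its body is not part of this diff. As a rough illustration of what a helper with that name might do, the sketch below converts a Docker-style limit such as `'196m'` into bytes. The name `parseMemoryStringSketch`, the accepted suffixes, and the 1024-based units are assumptions, not the package's actual implementation.

```javascript
// Hypothetical sketch; the real parseMemoryString in docker-helper.js is not shown in this diff.
const UNIT_BYTES = { '': 1, b: 1, k: 1024, m: 1024 ** 2, g: 1024 ** 3 };

function parseMemoryStringSketch(value) {
  // Accept a whole number followed by an optional unit suffix (assumed set: b, k, m, g).
  const match = /^(\d+)([bkmg]?)$/i.exec(String(value).trim());
  if (!match) {
    throw new Error(`Unrecognized memory string: ${value}`);
  }
  return Number(match[1]) * UNIT_BYTES[match[2].toLowerCase()];
}

console.log(parseMemoryStringSketch('128m')); // old default, in bytes
console.log(parseMemoryStringSketch('196m')); // new default, in bytes
```

A byte count like this is the form the Docker Engine API expects for a container memory limit, which may be why the helper is now exported.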
package/docs/ARCHITECTURE.md ADDED
@@ -0,0 +1,173 @@
1
+ # Homegames — Architecture
2
+
3
+ Component-by-component. For where these physically run, see INFRA.md; for traces
4
+ through them, see FLOWS.md.
5
+
6
+ ---
7
+
8
+ ## squish (npm: `squishjs`)
9
+
10
+ The shared contract between server and client — a compact binary serialization
11
+ of a game's scene graph, plus the node types and base classes games are built
12
+ from.
13
+
14
+ - **Node types** (`GameNode`): `Shape` (polygons/lines), `Text`, `Asset` (image/audio).
15
+ Everything is positioned in a virtual **0–100 coordinate plane**.
16
+ - **`squish(node)` / `unsquish(bytes)`** (`src/squish.js`): TLV-style encoder.
17
+ Each node frame starts with magic byte `3`, a class code, then per-property
18
+ sub-frames (color, coordinates, text, asset, playerIds, etc.), each with its
19
+ own type tag + length. Numbers are stored as integer + 2-decimal-fraction byte
20
+ pairs (~0.01 resolution).
21
+ - **`Game` / `ViewableGame`** base classes: games extend these. `ViewableGame`
22
+ adds a large world "plane" + `ViewUtils.getView()` for per-player cameras into
23
+ a world bigger than one screen.
24
+ - **`Squisher`** (`src/Squisher.js`): walks a game's layer tree, squishes every
25
+ node, builds **per-player frames** (via `playerIds` filtering — this is how
26
+ hidden info / private UI works and why it's bandwidth-efficient and
27
+ cheat-resistant), bundles binary assets, and coalesces per-tick mutations into
28
+ one broadcast.
29
+ - **Versioning:** published as pinned versions; the canonical map is
30
+ `homegames-common/game-loader.js → squishMap`. A game's `squishVersion` selects
31
+ which package both server and client use. **Image cropping / spritesheets
32
+ require `squish-140`+.**
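The "integer + 2-decimal-fraction byte pair" number encoding described above can be sketched as follows. This only illustrates the idea: `encodeNumber` and `decodeNumber` are hypothetical names, the sketch assumes non-negative values, and the real byte layout in `src/squish.js` may differ.

```javascript
// Hypothetical sketch of an "integer byte + hundredths byte" encoding.
function encodeNumber(value) {
  // Snap to 0.01 resolution first so the fraction can never reach 100.
  const hundredths = Math.round(value * 100);
  return [Math.floor(hundredths / 100), hundredths % 100];
}

function decodeNumber([whole, fraction]) {
  return whole + fraction / 100;
}

console.log(encodeNumber(12.34));
console.log(decodeNumber(encodeNumber(7.256)));
```

Rounding before splitting means a value such as 99.999 carries into the integer byte instead of producing an out-of-range fraction.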
33
+
34
+ Authoring contract: **squishjs-game-authoring.md** (this folder).
35
+
36
+ ---
37
+
38
+ ## homegames-common
39
+
40
+ The shared backend library. Notable modules:
41
+
42
+ - **`game-loader.js`** — `squishMap` (version → npm alias, **single source of
43
+ truth**), `parseSquishVersion` (AST-reads a game's `squishVersion`),
44
+ `loadGameClass*`, and `fetchGameFromForgejo` (download a game's repo archive,
45
+ find `index.js`).
46
+ - **`docker-helper.js`** — the isolation layer. `runGameContainer` (live session:
47
+ read-only code mount, mem/CPU/PID limits, `CapDrop: ALL`, tmpfs, auto-remove)
48
+ and `validateGame` (publish gate: `--network=none`, read-only rootfs, noexec
49
+ tmpfs, timeout). Uses `dockerode`.
50
+ - **`game-session-manager.js`** — starts a session by `versionId` (fetch from
51
+ Forgejo) or path; **uses Docker when available, falls back to `fork()` when
52
+ not** (see security_notes.md — fail closed in production).
53
+ - **`index.js`** — config, logging, `getAppDataPath`, and the **authoring-doc
54
+ accessor** (`authoringDocPath`, `getAuthoringDoc()`) so every consumer reads
55
+ one doc.
56
+ - **`docs/`** — this documentation.
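The `runGameContainer` restrictions listed above map onto Docker Engine API `HostConfig` fields, which `dockerode` passes through when a container is created. The sketch below builds such an object. `buildHostConfig`, the `/game` mount target, the tmpfs options, and the PID limit are illustrative assumptions rather than the values `docker-helper.js` actually uses.

```javascript
// Hypothetical sketch: assembles a HostConfig resembling the limits described above.
function buildHostConfig({ gamePath, memoryBytes, cpus, pidsLimit }) {
  return {
    Binds: [`${gamePath}:/game:ro`], // read-only code mount (target path assumed)
    Memory: memoryBytes, // memory limit in bytes
    NanoCpus: Math.round(cpus * 1e9), // CPU limit in billionths of a CPU
    PidsLimit: pidsLimit, // cap on processes inside the container
    CapDrop: ['ALL'], // drop every Linux capability
    Tmpfs: { '/tmp': 'rw,noexec,nosuid,size=16m' }, // scratch space (options assumed)
    AutoRemove: true, // delete the container when it exits
  };
}

const hostConfig = buildHostConfig({
  gamePath: '/srv/games/example', // hypothetical host path
  memoryBytes: 196 * 1024 * 1024,
  cpus: 1,
  pidsLimit: 64,
});
console.log(JSON.stringify(hostConfig, null, 2));
```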
57
+
58
+ ---
59
+
60
+ ## homegames-core (game-session server)
61
+
62
+ Runs a single published game and streams it to players.
63
+
64
+ - Loads the game class for the requested `squishVersion`, instantiates it,
65
+ constructs a `Squisher`, and on each tick/state-change squishes the scene and
66
+ **broadcasts frames over WebSocket**. Receives input messages (click/key/etc.)
67
+ and routes them to the game instance's handlers.
68
+ - **Per-session isolation:** each live game runs in a `homegames-runner` Docker
69
+ container (built from `homegames-core/docker/`: `Dockerfile`,
70
+ `container-entry.js`, and `validate.js` used by the publish gate).
71
+ - Session orchestration / port assignment / handing players to the right session
72
+ is done with the **Homenames** registry and the socket layer (`src/util/socket.js`),
73
+ which also speaks the binary client protocol (init / asset bundle / state /
74
+ port-redirect / aspect-ratio messages).
75
+ - Built-in games live in `src/games/` (e.g. `image-test`, `singularity`,
76
+ `enhanced-view-test`). Published user games are fetched from Forgejo at run time.
77
+
78
+ ---
79
+
80
+ ## homegames-client (browser renderer)
81
+
82
+ The engine that turns squished frames into pixels.
83
+
84
+ - `src/index.js` — `HomegamesClient`: WebSocket lifecycle (inline or via a Web
85
+ Worker, `socket-worker.js`), handles the binary message types, picks the right
86
+ `unsquish` by version (`squish-map.js`), runs a rAF render loop that only
87
+ repaints when state actually changed.
88
+ - `src/renderer.js` — draws polygons/text/images/audio + effects; records
89
+ hit-test data; applies image **crop** (9-arg `drawImage`).
90
+ - `src/input.js` — mouse/touch/keyboard/gamepad → game protocol messages;
91
+ point-in-polygon hit-testing; held-key repeat throttled to ~30/s; clears stuck
92
+ mouse state on blur / off-window release.
93
+ - `src/assets.js` — decodes the binary asset bundle (image/audio/font) and caches.
94
+ - Built to `dist/homegames-client.js` (webpack) and served to players.
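Two ideas from the input layer above, mapping pixel positions into the 0–100 plane and point-in-polygon hit-testing, can be sketched like this. The function names are hypothetical, and the ray-casting test is a standard technique, not necessarily what `src/input.js` implements.

```javascript
// Hypothetical sketch: convert a canvas pixel position into the virtual 0-100 plane.
function toPlane(pixelX, pixelY, canvasWidth, canvasHeight) {
  return [(pixelX / canvasWidth) * 100, (pixelY / canvasHeight) * 100];
}

// Ray casting: count how many polygon edges a horizontal ray from (x, y) crosses.
function pointInPolygon(x, y, vertices) {
  let inside = false;
  for (let i = 0, j = vertices.length - 1; i < vertices.length; j = i++) {
    const [xi, yi] = vertices[i];
    const [xj, yj] = vertices[j];
    const crosses =
      (yi > y) !== (yj > y) && x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

const button = [[10, 10], [20, 10], [20, 20], [10, 20]];
const [x, y] = toPlane(120, 90, 800, 600);
console.log(x, y, pointInPolygon(x, y, button));
```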
95
+
96
+ ---
97
+
98
+ ## homegamesio (the website)
99
+
100
+ Static site on S3+CloudFront; `app.js` is the Node origin that maps URL paths to
101
+ HTML files. Surfaces:
102
+
103
+ - **Landing / catalog / play / view** pages.
104
+ - **Studio** (`studio.html` + `studio.js`) — the in-browser game editor: code
105
+ editor + file tree, live **Preview**, **Versions** history, **AI Edit**, a
106
+ consolidated **Settings** panel (description / thumbnail / clone), one-click
107
+ **Publish**, and a full **Assets** workspace (Upload / Draw / Record / Keyboard).
108
+ Top-level UI is two modes: **GAMES** and **ASSETS**. (Redesigned to an
109
+ "analog dashboard" — big labeled buttons, progressive disclosure.)
110
+ - **Admin** (`admin.html` + `admin.js`) — moderation console at `/admin`,
111
+ admin-only: inspect/search/delete users, games, assets; publish-request review;
112
+ support messages; stats. (One-click "delete user + everything they made"
113
+ cascades Mongo + Forgejo repos + the Forgejo account.)
114
+ - **Reset-password / verify** pages.
115
+ - Talks to `api.homegames.io` for everything dynamic; serves the **authoring
116
+ guide** at `/authoring-guide.md` (read from `homegames-common`).
117
+
118
+ ---
119
+
120
+ ## api (backend REST + publish worker)
121
+
122
+ Node HTTP API. Routing in `router.js` (regex → handler, with `requiresAuth` /
123
+ `requiresVerified` gates). Key areas:
124
+
125
+ - **auth.js** — signup (internal `userId` ≠ immutable `displayName`), login (by
126
+ display name or email), **email verification by 6-digit code**, **password
127
+ reset by code** (anti-enumeration). JWT in `crypto.js`.
128
+ - **studio-handlers.js** — the Studio backend: provisions a per-user Forgejo
129
+ account (password derived via HMAC from a server secret), creates/edits game
130
+ repos, lists versions, restores, sets thumbnail, submits **publish requests**
131
+ (rate-limited), receives Forgejo **push webhooks** (HMAC-verified), and queues
132
+ build/LLM jobs.
133
+ - **handlers.js** — catalog, asset upload (**calls `nsfw.js` `classifyImage`
134
+ in-process at upload time** — TensorFlow `nsfwjs`), admin endpoints, delete
135
+ (game/asset/developer with cascade), email-verify/reset handlers.
136
+ - **worker.js** — the **publish-validation worker** (consumes the
137
+ `publish_requests` RabbitMQ queue): checks `index.js` + GPLv3 LICENSE, size
138
+ limits, runs **`ast-scanner.js`** over every JS file, then **`validateGame`**
139
+ (Docker sandbox: loads/instantiates/runs the game ~5s, no network), reads NSFW
140
+ flags, and on success writes a `gameVersions` record with `published: true`.
141
+ - **forgejo.js** — admin-token client for the Forgejo server (create user/repo,
142
+ files, webhooks, archives, delete user/repo).
143
+ - **nsfw.js / detect.js / ast-scanner.js / crypto.js / db.js / queue.js / email.js**
144
+ — moderation model, mime detection, the static-analysis gate, JWT+hashing,
145
+ Mongo access, RabbitMQ publishing, SES email.
146
+
147
+ Note the API is a **monolith process** that also embeds the NSFW model — NSFW
148
+ moderation is *not* a separate worker.
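The "regex → handler, with `requiresAuth` / `requiresVerified` gates" routing style can be sketched as below. The route table, the `dispatch` function, and the 401/403 status codes are assumptions for illustration; the real `router.js` is not shown in this diff.

```javascript
// Hypothetical route table: regex -> handler, with auth gates expressed as flags.
const routes = [
  { method: 'GET', pattern: /^\/assets\/([^/]+)$/, handler: (id) => `asset ${id}` },
  {
    method: 'POST',
    pattern: /^\/asset$/,
    requiresAuth: true,
    requiresVerified: true,
    handler: () => 'uploaded',
  },
];

function dispatch(method, path, user) {
  for (const route of routes) {
    const match = route.pattern.exec(path);
    if (route.method !== method || !match) continue;
    // Gates run before the handler, so handlers never see unauthorized requests.
    if (route.requiresAuth && !user) return { status: 401 };
    if (route.requiresVerified && !(user && user.verified)) return { status: 403 };
    return { status: 200, body: route.handler(...match.slice(1), user) };
  }
  return { status: 404 };
}

console.log(dispatch('POST', '/asset', { verified: false }));
```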
149
+
150
+ ---
151
+
152
+ ## worker (LLM "AI edit")
153
+
154
+ Runs on the Mac Studio (MLX needs Apple Silicon).
155
+
156
+ - `index.js` (Node) — pulls LLM jobs from the EC2's RabbitMQ, manages a
157
+ long-lived Python child, posts results back to the API (authenticated with
158
+ `LLM_WORKER_SECRET`).
159
+ - `llm/model_server.py` (MLX) — holds a warm code model (Qwen2.5-Coder), prefills
160
+ the **authoring guide** as the system prompt (resolved from `homegames-common`
161
+ via `AUTHORING_DOC_PATH`), rewrites a game's `index.js` from a natural-language
162
+ request, with validation-error retry.
163
+ - The result is dropped into the Studio editor as an unsaved change for the user
164
+ to review and save.
165
+
166
+ ---
167
+
168
+ ## Supporting services (on the EC2 host)
169
+
170
+ - **MongoDB** — all application data (users, games, versions, assets, etc.). See DATA-MODEL.md.
171
+ - **RabbitMQ** — job queues: `publish_requests` (publish validation) and the
172
+ unified jobs queue (`homegames-jobs`, incl. `LLM_REQUEST`, `BUILD_GAME`).
173
+ - **Forgejo** — git server; one repo per game; the canonical store of game source.
package/docs/DATA-MODEL.md ADDED
@@ -0,0 +1,115 @@
1
+ # Homegames — Data Model
2
+
3
+ Where state lives: **MongoDB** (app data), **Forgejo** (game source), the **asset
4
+ store** (binaries in Mongo), and the **squish** wire format (transient, on the
5
+ wire). `> TODO: confirm` marks fields inferred from code that should be
6
+ spot-checked against a live DB.
7
+
8
+ ---
9
+
10
+ ## Identity model (important)
11
+
12
+ - **`userId`** — the canonical internal id (generated, `md5(uuid)`). Never shown.
13
+ It's what the JWT carries, what `developerId` references, and the **Forgejo
14
+ account/owner name**.
15
+ - **`displayName`** — chosen at signup, **immutable**, unique (case-insensitive).
16
+ The only user-facing name.
17
+ - **`email`** — unique (case-insensitive), used for verification + reset.
18
+
19
+ This split was deliberate: display names stay stable and independent of the internal
20
+ identity, which leaves room for future changes. Anything that shows `developerId` as an author is showing the
21
+ opaque id; map it to `displayName` for display. (`> TODO`: a sweep of catalog/profile
22
+ pages to show displayName everywhere.)
23
+
24
+ ---
25
+
26
+ ## MongoDB collections
27
+
28
+ Connection + helpers: `api/db.js`. DB name default `homegames`.
29
+
30
+ ### `users`
31
+ `userId`, `displayName`, `displayNameLower`, `email`, `emailLower`, `verified`,
32
+ `verificationCodeHash`, `verificationCodeExpires`, `resetCodeHash`,
33
+ `resetCodeExpires`, `passwordHash`, `passwordSalt`, `created`, `isAdmin`,
34
+ `forgejoAccountCreated`, `forgejoPasswordSynced`, plus profile fields
35
+ (`image`, `description`, `btcAddress`). Passwords are hashed with pbkdf2 (`crypto.js`). Codes are
36
+ stored **hashed** (`hashValue`/sha256).
37
+
38
+ ### `games`
39
+ `gameId`, `name`, `description`, `developerId` (= `userId`), `created`,
40
+ `forgejoRepo` (`"<userId>/<repoName>"`), `featured`, `thumbnail` (an assetId),
41
+ `nsfw`.
42
+
43
+ ### `gameVersions`
44
+ `versionId`, `gameId`, `commitSha`, `publishedAt`, `publishedBy`, `published`,
45
+ `approved`, `nsfw`. **This is what makes a game launchable** — a session runs the
46
+ `commitSha` of a published version.
47
+
48
+ ### `publishRequests`
49
+ `requestId`, `gameId`, `commitSha`, `userId`, `status`
50
+ (`PENDING` → `PROCESSING` → `PUBLISHED` | `FAILED`; an older manual-approval path
51
+ uses `PENDING_PUBLISH_APPROVAL` → `REJECTED`), `created`, `completedAt`,
52
+ `versionId`, `error`, `adminMessage`. Rate-limited to 1 per 10 min per user.
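The automated `status` lifecycle and the per-user rate limit described for `publishRequests` can be sketched as a small transition table plus a time check. The helper names are hypothetical, and the older manual-approval statuses are left out.

```javascript
// Hypothetical transition table for the automated publish path described above.
const TRANSITIONS = {
  PENDING: ['PROCESSING'],
  PROCESSING: ['PUBLISHED', 'FAILED'],
  PUBLISHED: [],
  FAILED: [],
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}

const RATE_LIMIT_MS = 10 * 60 * 1000; // 1 request per 10 minutes per user

// lastRequestCreated: timestamp (ms) of the user's most recent request, or null.
function isRateLimited(lastRequestCreated, now = Date.now()) {
  return lastRequestCreated != null && now - lastRequestCreated < RATE_LIMIT_MS;
}

console.log(canTransition('PENDING', 'PROCESSING'), canTransition('FAILED', 'PUBLISHED'));
```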
53
+
54
+ ### `builds`
55
+ `buildId`, `gameId`, `commitSha`, `commitMessage`, `triggeredBy`, `status`,
56
+ `error`, `created`, `completed`. Created on Forgejo push webhooks.
57
+
58
+ ### `assets`
59
+ `assetId`, `developerId`, `name`, `type` (`image`/`audio`/`font`), `fileType`,
60
+ `nsfw`, `created`, `tags`, `description`, public/approval flags. `> TODO: confirm`
61
+ exact public/approval field names. Max 100 assets/user (`MAX_ASSETS_PER_USER`).
62
+
63
+ ### `documents`
64
+ The **binary asset store**: `developerId`, `assetId`, `data` (BSON `Binary`),
65
+ `fileSize`, `fileType`. Served at `GET /assets/:id`.
66
+
67
+ ### `supportMessages`
68
+ `id`, `created`, `ipHash`, `status` (`PENDING` → ack'd), `message`, `email`.
69
+
70
+ ### `blog`
71
+ `id`, `publishedBy`, `created`, `title`, `content`.
72
+
73
+ `> TODO: confirm` any other collections in the live DB (sessions/homenames state, certs, etc.).
74
+
75
+ ---
76
+
77
+ ## Forgejo (game source of truth)
78
+
79
+ - A self-hosted Gitea fork. **One repo per game**, at `"<userId>/<repoName>"`,
80
+ owned by the user's internal id. Created/managed by the API via the **admin
81
+ token** (`api/forgejo.js`).
82
+ - Each user has a Forgejo account (username = `userId`); its password is **derived
83
+ on demand** via `HMAC(FORGEJO_USER_SECRET, userId)` — nothing stored. CLI clone
84
+ URLs embed those derived credentials.
85
+ - A game's `index.js` (and other files) are committed here; the published
86
+ `commitSha` in `gameVersions` pins exactly what runs.
87
+ - Push webhooks (HMAC `FORGEJO_WEBHOOK_SECRET`) drive the build/publish pipeline.
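Deriving a password on demand as `HMAC(FORGEJO_USER_SECRET, userId)` might look like the sketch below. The source only says "HMAC"; the SHA-256 digest, the hex output, and the function name are assumptions.

```javascript
const crypto = require('crypto');

// Hypothetical sketch: the digest algorithm and output encoding are assumed.
function deriveForgejoPassword(secret, userId) {
  return crypto.createHmac('sha256', secret).update(userId).digest('hex');
}

// The same secret and userId always yield the same password, so nothing needs storing.
const first = deriveForgejoPassword('example-secret', 'user-a');
const second = deriveForgejoPassword('example-secret', 'user-a');
console.log(first === second);
```

The trade-off is that the secret becomes the single point of failure: losing it means no user's credential can be reproduced.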
88
+
89
+ ---
90
+
91
+ ## Asset store / serving
92
+
93
+ - Binaries live in Mongo `documents`; metadata in `assets`. Served by the API at
94
+ `/assets/:id`.
95
+ - The squish `Asset` class (client + game code) references assets by **id**, and
96
+ historically downloads from an asset endpoint (`api.homegames.io/assets`,
97
+ older `assets.homegames.io`). `> TODO: confirm` whether any assets are also in
98
+ S3 vs all in Mongo now.
99
+
100
+ ---
101
+
102
+ ## squish wire format (on the wire, not stored)
103
+
104
+ Transient binary frames between homegames-core and the browser. Full spec +
105
+ authoring rules: **squishjs-game-authoring.md**. In brief:
106
+
107
+ - A game's scene = a tree of `Shape`/`Text`/`Asset` nodes in 0–100 space.
108
+ - `squish()` encodes each node as `[3, len…, classCode, …sub-frames]`; each
109
+ property (color, coordinates2d, text, asset, playerIds, …) is a typed,
110
+ length-prefixed sub-frame. Numbers are integer + 2-decimal-fraction byte pairs.
111
+ - The server sends: **init** (player id, aspect ratio, squishVersion) →
112
+ **asset bundle** → **state frames**, plus port-redirect / aspect-ratio messages.
113
+ - **Versioned**: `squishVersion` in a game's `metadata()` selects the pinned
114
+ squish package used to encode (server) and decode (client). Never mutate a
115
+ published version's format; ship a new version (see `game-loader.js → squishMap`).
package/docs/FLOWS.md ADDED
@@ -0,0 +1,125 @@
1
+ # Homegames — End-to-End Flows
2
+
3
+ Traces through the system for the things that matter. See ARCHITECTURE.md for the
4
+ components named here.
5
+
6
+ ---
7
+
8
+ ## 1. Playing a game (hosted)
9
+
10
+ 1. Player opens `homegames.io` (S3/CloudFront) and picks a game from the catalog,
11
+ or hits a play link with a `gameId`/`versionId`.
12
+ 2. The page asks the **API** to create/find a **session** for that published
13
+ version. The API (via **Homenames** + homegames-core) ensures a
14
+ **homegames-core session is running in a Docker container** for that game and
15
+ returns a WebSocket endpoint/port.
16
+ 3. The **homegames-client** (bundled in the page) opens the WebSocket. The server
17
+ sends an **init** message (player id, aspect ratio, bezel, and the game's
18
+ **squishVersion**), then the **asset bundle** (type 1), then **state frames**
19
+ (type 3).
20
+ 4. The client picks the matching `unsquish` for that squishVersion, decodes each
21
+ frame, and **renders** polygons/text/images to the canvas (0–100 space → pixels).
22
+ 5. Player input (click/key/touch) is normalized to 0–100 and sent back over the
23
+ socket; homegames-core routes it to the game instance's handlers
24
+ (`onClick`, `handleKeyDown`, …), which mutate state; the `Squisher` coalesces
25
+ and broadcasts the next frame.
26
+ 6. Per-player visibility (`playerIds`) means each player can receive a different
27
+ frame — the server only squishes what that player should see.
28
+
29
+ Multiple players share **one** game instance (it's authoritative); there is no
30
+ per-player game process.
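Step 6's per-player visibility can be illustrated with a small filter. The node shape and the rule that an empty `playerIds` list means "visible to everyone" are assumptions made for this sketch; the actual `Squisher` logic is not shown in this diff.

```javascript
// Hypothetical node shape: { id, playerIds }; empty playerIds is assumed to mean "all players".
function nodesForPlayer(nodes, playerId) {
  return nodes.filter(
    (node) =>
      !node.playerIds ||
      node.playerIds.length === 0 ||
      node.playerIds.includes(playerId)
  );
}

const scene = [
  { id: 'board', playerIds: [] },
  { id: 'hand-of-1', playerIds: [1] },
  { id: 'hand-of-2', playerIds: [2] },
];
console.log(nodesForPlayer(scene, 1).map((node) => node.id));
```

Because filtering happens on the server before encoding, a player's client never receives the bytes for another player's private nodes.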
31
+
32
+ ---
33
+
34
+ ## 2. Authoring + publishing a game
35
+
36
+ **Author (in the Studio):**
37
+ 1. User signs in to the Studio (`homegames.io/studio`). On first studio action the
38
+ API lazily provisions them a **Forgejo account** (username = their internal
39
+ `userId`; password derived by HMAC from `FORGEJO_USER_SECRET`).
40
+ 2. "New Game" → API creates a **Forgejo repo** (`<userId>/<repo>`), commits a
41
+ GPLv3 `LICENSE` + a chosen starter template, and a `games` record in Mongo.
42
+ 3. Editing in the Studio writes files via the API → Forgejo commits (the **Save
43
+ Version** path). They can Preview (a real session of the working tree), use
44
+ **AI Edit**, set Description/Thumbnail in **Settings**, etc.
45
+ 4. They can also `git clone` the repo (Settings → Clone) and push from the CLI.
46
+
47
+ **Publish pipeline:**
48
+ 5. **Publish** (or a Forgejo push webhook) → the API enqueues a job on RabbitMQ.
49
+ Webhooks are HMAC-verified (`FORGEJO_WEBHOOK_SECRET`).
50
+ 6. The **API worker** (`api/worker.js`) consumes `publish_requests` and validates
51
+ the target commit:
52
+ - `index.js` exists; a **GPLv3 LICENSE** matches; size limits OK.
53
+ - **AST scan** (`ast-scanner.js`) every `.js`: no banned `require`s (fs, net,
54
+ child_process, …), no `eval`/`Function`, no dynamic `require`/`import`, etc.
55
+ - **Docker validation** (`validateGame`): load the class, check `metadata()`
56
+ + `squishVersion`, instantiate, run ~5s in a **no-network, read-only**
57
+ container; collect asset ids.
58
+ - **NSFW**: the game's assets already carry an `nsfw` flag (set at upload —
59
+ see flow 4); if any are flagged, the version is marked nsfw.
60
+ 7. On success it writes a `gameVersions` record with `published: true`
61
+ (+ `commitSha`). The game is now in the catalog and launchable; sessions run
62
+ that exact pinned commit.
63
+
64
+ > Trust boundary: publishing **is** running attacker-controlled code. The AST scan
65
+ > is a filter; the Docker container is the real boundary. See `security_notes.md`.
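The HMAC verification of push webhooks in step 5 could be sketched as follows. The sketch assumes the signature arrives as a hex-encoded HMAC-SHA256 of the raw request body; the function name and that encoding are assumptions, not details taken from the API code.

```javascript
const crypto = require('crypto');

// Hypothetical sketch: returns true only if signatureHex matches HMAC-SHA256(secret, rawBody).
function verifyWebhookSignature(rawBody, signatureHex, secret) {
  const expected = crypto.createHmac('sha256', secret).update(rawBody).digest();
  const received = Buffer.from(String(signatureHex), 'hex');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return received.length === expected.length && crypto.timingSafeEqual(received, expected);
}

const body = JSON.stringify({ ref: 'refs/heads/main' });
const goodSignature = crypto.createHmac('sha256', 'example-secret').update(body).digest('hex');
console.log(verifyWebhookSignature(body, goodSignature, 'example-secret'));
```

The comparison must run against the raw bytes of the request, before any JSON parsing, because re-serializing the payload can change the bytes the sender signed.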
66
+
67
+ ---
68
+
69
+ ## 3. Developer signup + email verification
70
+
71
+ 1. Studio signup form posts `{ displayName, email, password }` to `/auth/signup`.
72
+ 2. API creates a user: generated internal **`userId`** (canonical identity),
73
+ separate **immutable `displayName`**, `verified: false`, a hashed **6-digit
74
+ code** (24h expiry). A JWT is returned (they're logged in but unverified).
75
+ 3. SES emails the **code** (not a link — avoids email-client prefetch consuming
76
+ it).
77
+ 4. In the Studio, the unverified banner takes the code → `POST /auth/verify`
78
+ (authenticated, scoped to the user) → `verified: true`.
79
+ 5. **Gating:** unverified users can browse/edit but the API blocks the
80
+ abuse-surface routes (`requiresVerified`): create game, save, publish, AI edit,
81
+ asset upload, thumbnail. Playing/browsing is fully anonymous and ungated.
82
+
83
+ Forgot password (`/auth/forgot` → `/auth/reset`) mirrors this: anti-enumeration
84
+ (always 200), 8-char emailed code, 1h expiry, rate-limited; standalone page at
85
+ `/reset-password`.
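The "hashed 6-digit code with a 24h expiry" from steps 2 to 4 can be sketched like this. The field names come from the `users` collection in DATA-MODEL.md; the helper names and the exact comparison logic are assumptions.

```javascript
const crypto = require('crypto');

const CODE_TTL_MS = 24 * 60 * 60 * 1000; // 24h expiry, per the flow above

function sha256Hex(value) {
  return crypto.createHash('sha256').update(value).digest('hex');
}

// Hypothetical sketch: returns the plaintext code (to email) and the record to store.
function issueVerificationCode(now = Date.now()) {
  const code = String(crypto.randomInt(0, 1000000)).padStart(6, '0');
  return {
    code,
    record: {
      verificationCodeHash: sha256Hex(code),
      verificationCodeExpires: now + CODE_TTL_MS,
    },
  };
}

function checkVerificationCode(record, submitted, now = Date.now()) {
  if (now > record.verificationCodeExpires) return false;
  const expected = Buffer.from(record.verificationCodeHash, 'hex');
  const actual = Buffer.from(sha256Hex(String(submitted)), 'hex');
  return crypto.timingSafeEqual(expected, actual); // both are 32-byte digests
}

const issued = issueVerificationCode();
console.log(checkVerificationCode(issued.record, issued.code));
```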
86
+
87
+ ---
88
+
89
+ ## 4. Asset upload + moderation
90
+
91
+ 1. Studio Assets workspace → Upload/Draw/Record/Keyboard → `POST /asset`.
92
+ 2. The API stores asset metadata in `assets` and binary data in `documents`
93
+ (MongoDB Binary), and **synchronously classifies the image** in-process
94
+ (`nsfw.js`, TensorFlow `nsfwjs`); the `nsfw` flag is saved on the asset.
95
+ 3. Assets are served back at `/assets/:id` and referenced from games by id.
96
+ 4. Admins moderate via `/admin` (delete asset, flag/unflag NSFW). Publishing a
97
+ game inherits its assets' NSFW status.
98
+
99
+ ---
100
+
101
+ ## 5. Moderation / admin
102
+
103
+ 1. Admin (a user with `isAdmin: true` in Mongo — set manually to bootstrap) opens
104
+ `/admin`.
105
+ 2. The console lists/searches **users / games / assets** (paginated), shows
106
+ **stats**, **publish requests** (approve/reject), and **support messages**
107
+ (acknowledge) — all gated server-side on `isAdmin`.
108
+ 3. **Delete user** (`DELETE /admin/developers/:id`) cascades: their Forgejo
109
+ repos, the Forgejo account, search index entries, and all Mongo data (games,
110
+ versions, publishRequests, builds, assets, documents, user). One click removes
111
+ everything they made.
112
+
113
+ ---
114
+
115
+ ## 6. AI "edit my game" (LLM)
116
+
117
+ 1. Studio → AI Edit → `POST /studio/games/:id/llm-modify` with a prompt; the API
118
+ enqueues an `LLM_REQUEST` on RabbitMQ.
119
+ 2. The **Mac Studio worker** pulls the job, the MLX model server rewrites
120
+ `index.js` grounded by the **authoring guide** (system prompt), with
121
+ validation-error retries.
122
+ 3. The worker posts the result back to the API (auth'd by `LLM_WORKER_SECRET`);
123
+ the Studio polls status and drops the rewrite into the editor as an **unsaved
124
+ change** for the user to review + save.
125
+ 4. If the Mac is offline, jobs simply wait — nothing else is affected.
package/docs/INFRA.md ADDED
@@ -0,0 +1,111 @@
1
+ # Homegames — Infrastructure
2
+
3
+ What physically runs where, the domains, ports, and secrets. Design principle:
4
+ **keep it runnable on a single box** so others can self-host the whole thing
5
+ without orchestration. `> TODO: confirm` marks things to verify against the live
6
+ environment.
7
+
8
+ ---
9
+
10
+ ## Hosts
11
+
12
+ ### 1. The single EC2 instance (the backend)
13
+ Everything below runs on **one EC2 host**, on purpose (simplest possible to run):
14
+
15
+ - **API** (`api.homegames.io`) — the Node monolith (`api/`). Also embeds the
16
+ **NSFW model** in-process (TensorFlow `nsfwjs`).
17
+ - **API worker** — `api/worker.js`, the publish-validation consumer.
18
+ - **homegames-core** + **Docker** — game sessions, each in a `homegames-runner`
19
+ container. **Planned split:** move homegames-core (and its Docker host duty)
20
+ onto its **own instance** for scalability, keeping it separate from the API.
21
+ This is the one intended departure from single-host.
22
+ - **MongoDB** — all app data.
23
+ - **RabbitMQ** — job queues (port 5672).
24
+ - **Forgejo** — git server (port 3000). `api/config.js` currently hardcodes
25
+ `FORGEJO_URL = 'http://52.32.110.71:3000'` → that IP is the EC2 host.
26
+ `> GOTCHA`: this is a hardcoded IP, not config/DNS — update it if the instance
27
+ IP changes, and ideally move it to an env var.
28
+
29
+ ### 2. Mac Studio (the LLM worker) — Joseph's personal machine, at home
30
+ - Runs `worker/` (Node `index.js` + Python MLX `llm/model_server.py`). MLX
31
+ requires Apple Silicon, which is why this is not on EC2.
32
+ - **Pulls LLM jobs from the EC2's RabbitMQ** (`amqp://api.homegames.io:5672`) and
33
+ posts results back to the API. AWS/networking was configured so the Mac can
34
+ reach the queue. `> TODO: confirm` exactly how (security-group allowance for the
35
+ Mac's IP to 5672? VPN?).
36
+ - If this machine is off, "AI edit" requests just queue and wait — nothing else
37
+ is affected.
38
+
39
+ ### 3. homegames.io (the website) — AWS S3 + CloudFront
40
+ - Static hosting for the site, Studio, Admin, client bundle, and assets like the
41
+ authoring guide. `> TODO: confirm` the exact deploy story (see OPERATIONS.md —
42
+ `deploy.sh` only pushes `bundle.js`; need to confirm how the HTML pages,
43
+ `studio.js`, `admin.*`, `reset-password.*`, and `/authoring-guide.md` reach
44
+ prod, i.e. whether CloudFront's origin is S3 or the EC2 `app.js`).
45
+
46
+ ---
47
+
48
+ ## Domains / DNS
49
+
50
+ - **`homegames.io`** — the website (S3/CloudFront).
51
+ - **`api.homegames.io`** — the API; also the RabbitMQ host (`:5672`), Homenames,
52
+ and asset serving (`/assets/...`). `> TODO: confirm` it points at the EC2.
53
+ - **`homegames.link` / `public.homegames.link`** — the **relay** for the old
54
+ self-hosted model (exposing a home instance to the public internet without
55
+ port-forwarding). It worked but is **not maintained** now that the hosted
56
+ service exists. Code references remain (`LINK_PROXY_URL`, port 81/82) but treat
57
+ it as legacy unless revived.
58
+ - `> TODO: confirm` Route53 zones and the asset/cert domains (`homegames.link`
59
+ cert domain appears in config).
60
+
61
+ ---
62
+
63
+ ## Ports (on the EC2 host)
64
+
65
+ | Port | Service | Notes |
66
+ |------|---------|-------|
67
+ | 80 / 443 | API (and/or web) | `> TODO: confirm` TLS termination (in-process? nginx? CloudFront?) |
68
+ | 3000 | Forgejo | hardcoded in `api/config.js` |
69
+ | 5672 | RabbitMQ | the Mac worker connects here |
70
+ | 27017 | MongoDB | `> TODO: confirm` |
71
+ | 7400 | Homenames | session/naming registry (`HOMENAMES_PORT`) |
72
+ | 8300–8400 | game sessions | per-session container ports (`GAME_SERVER_PORT_RANGE`) |
73
+ | 7001 / 9801 | self-host home port | legacy/self-host (`HOME_PORT`) |
74
+
75
+ ---
76
+
77
+ ## Secrets & credentials inventory
78
+
79
+ All currently provided via environment to the relevant service. **Bus-factor: know
80
+ where these are set** (`> TODO: confirm` — systemd unit `Environment=`/EnvironmentFile,
81
+ a `.env`, AWS SSM, etc.).
82
+
83
+ - **`JWT_SECRET`** (api) — signs user JWTs. If lost/rotated, everyone is logged out.
84
+ - **`FORGEJO_USER_SECRET`** (api) — HMAC key from which **every user's Forgejo
85
+ password is derived**. If lost, you can't reproduce users' git credentials;
86
+ rotating it requires re-syncing all users' Forgejo passwords
87
+ (`rotate-forgejo-secret.js`).
88
+ - **`FORGEJO_WEBHOOK_SECRET`** (api + Forgejo) — verifies push webhooks.
89
+ - **Forgejo admin token** — the API does all Forgejo ops with an admin token.
90
+ `> TODO: confirm` where it's stored.
91
+ - **`LLM_WORKER_SECRET`** (api + Mac worker) — authenticates the Mac worker
92
+ posting results back to the API.
93
+ - **AWS credentials / IAM** — EC2 instance role (Route53, SES, possibly S3); the
94
+ Mac's AWS config for queue access. `> TODO: confirm` the IAM role's exact
95
+ permissions (least privilege — see security_notes.md re: IMDS).
96
+ - **SES** — `SES_FROM_ADDRESS` (verified identity), `SES_REGION`; account must be
97
+ out of the SES sandbox for public email.
98
+ - **Mongo** — `DB_USERNAME` / `DB_PASSWORD` (if auth enabled).
99
+ - **TLS certs** — `> TODO: confirm` (Let's Encrypt? CloudFront-managed? the
100
+ `acme-client` dep in the worker suggests ACME somewhere).
101
+
102
+ ---
103
+
104
+ ## squish version aliasing (deploy-relevant)
105
+
106
+ `squishjs` is published under many npm aliases (`squish-135`, `squish-138`,
107
+ `squish-140`, …) so every game keeps running on the exact version it declared.
108
+ The map is `homegames-common/game-loader.js → squishMap`. Adding a squish feature
109
+ = publish a new version + add the alias in **every consumer** (`api`,
110
+ `homegames-core`, `homegames-client`, `homegames-web`, `homegamesio`) and the map.
111
+ The image-crop / `getView` work lives in **`squish-140`**.
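The version-to-alias lookup described above might be shaped like the sketch below. Only the alias names `squish-135`, `squish-138`, and `squish-140` appear in the text; the key format and the `resolveSquishPackage` helper are hypothetical, and the real `squishMap` may look different.

```javascript
// Hypothetical shape for squishMap; the real map lives in homegames-common/game-loader.js.
const squishMap = {
  135: 'squish-135',
  138: 'squish-138',
  140: 'squish-140',
};

function resolveSquishPackage(squishVersion) {
  const key = String(squishVersion);
  // Own-property check avoids accidental hits on inherited keys like "constructor".
  if (!Object.prototype.hasOwnProperty.call(squishMap, key)) {
    throw new Error(`Unknown squishVersion: ${squishVersion}`);
  }
  return squishMap[key];
}

console.log(resolveSquishPackage('140'));
```

Failing loudly on an unknown version surfaces a consumer that is missing an alias, rather than letting server and client silently disagree on the wire format.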
package/docs/OPERATIONS.md ADDED
@@ -0,0 +1,121 @@
1
+ # Homegames — Operations / Runbook
2
+
3
+ How to deploy, restart, observe, and recover. `> TODO: confirm` marks gaps to fill
4
+ in with current practice.
5
+
6
+ ---
7
+
8
+ ## The host
9
+
10
+ The EC2 backend runs its services under **systemd**. Manage with `systemctl`,
11
+ read logs with `journalctl`.
12
+
13
+ ```bash
14
+ # status / restart / logs (service names: > TODO confirm exact unit names)
15
+ systemctl status homegames-core api homegames-worker
16
+ systemctl restart api
17
+ journalctl -u api -f # follow API logs
18
+ journalctl -u homegames-core -f
19
+ journalctl -u homegames-worker -f
20
+ ```
21
+
22
+ Services (one host today): **api**, **homegames-core**, **the API worker**, plus
23
+ **mongod**, **rabbitmq-server**, **forgejo**, and **docker**.
24
+ `> TODO: confirm` the exact systemd unit names and whether Mongo/Rabbit/Forgejo are
25
+ distro packages or also custom units.
26
+
27
+ ---
28
+
29
+ ## Deploying / updating
30
+
31
+ ### API, homegames-core, API worker (on the EC2 host)
32
+ Current practice (`> TODO: confirm` precise steps):
33
+ ```bash
34
+ cd <repo> && git pull
35
+ npm install # if deps changed
36
+ systemctl restart <service>
37
+ journalctl -u <service> -n 100 --no-pager # verify it came up
38
+ ```
39
+ - After changing **homegames-common**, dependents that pin it via `file:` link
40
+ pick it up on `npm install`; restart them.
41
+ - After changing the **Docker runner** (`homegames-core/docker/`), rebuild the
42
+ `homegames-runner` image (the session manager can build it on boot if
43
+ `dockerImageDir` is set; otherwise rebuild manually).
44
+
45
+ ### Website (homegames.io — S3/CloudFront)
46
+ - `homegamesio/deploy.sh` currently does:
47
+ `aws s3 cp bundle.js s3://homegames.io/bundle.js` + a CloudFront invalidation.
48
+ - `> TODO: CONFIRM (important)`: that script only pushes `bundle.js`, but the site
49
+ also serves `index.html`, `studio.html`/`studio.js`, `admin.html`/`admin.js`,
50
+ `reset-password.*`, `catalog.html`, etc., and `/authoring-guide.md`. **How do
51
+ those reach production?** Either CloudFront's origin is the EC2 `app.js`
52
+ (so HTML routes are dynamic), or all files must be `aws s3 sync`'d and the
53
+ routing is S3/CloudFront behaviors. **This must be nailed down** — recent
54
+ studio/admin/reset/authoring-guide changes only go live once this path is
55
+ correct. (If S3-static, `deploy.sh` needs to sync those files too; if the EC2
56
+ is the origin, `app.js` routes are what matter and `deploy.sh` is incomplete.)
57
+
58
+ ### LLM worker (Mac Studio)
59
+ - Pull `worker/`, `npm install`, ensure the Python venv (`worker/llm/env`) +
60
+ model are present, run `worker/index.js` (it spawns the MLX model server).
61
+ `> TODO: confirm` how it's kept running (launchd? a `run.sh` in tmux? manual?)
62
+ and how it's updated.
63
+ - It needs `homegames-common` installed (`npm install`) so it can resolve the
64
+ authoring-guide path it feeds the model.
65
+
66
+ ### Publishing a new squish version
67
+ 1. Bump + `npm publish` `squishjs`. 2. Add the alias (`squish-<v>`) to **every**
68
+ consumer's `package.json` and to `homegames-common/game-loader.js → squishMap`.
69
+ 3. `npm install` + restart consumers; rebuild the client bundle + the runner image.
70
+
71
+ ---
72
+
73
+ ## Bootstrapping / common tasks
74
+
75
+ - **Make someone an admin:** set `isAdmin: true` on their `users` doc directly in
76
+ Mongo (there is intentionally no self-serve admin grant):
77
+ `db.users.updateOne({ displayName: "X" }, { $set: { isAdmin: true } })`.
78
+ - **Wipe/reset:** the user model changed (internal id + email); a fresh start
79
+ means clearing `users` (and dependent collections).
80
+ - **SES out of sandbox:** required before verification/reset emails can go to
81
+ arbitrary addresses; set `SES_FROM_ADDRESS`/`SES_REGION`.
82
+
83
+ ---
84
+
85
+ ## Backups & disaster recovery
86
+
87
+ `> TODO: CONFIRM` — define and verify these, they're the bus-factor core:
88
+ - **MongoDB** — is there a scheduled `mongodump` / snapshot? Where to?
89
+ - **Forgejo** — game source lives here; is the Forgejo data dir / repos backed up
90
+ (or the EBS volume snapshotted)?
91
+ - **Assets** — binaries are in Mongo `documents`, so covered by the Mongo backup
92
+ (confirm).
93
+ - **Secrets** — where is the canonical copy of `JWT_SECRET`,
94
+ `FORGEJO_USER_SECRET`, etc. so the host can be rebuilt? (Losing
95
+ `FORGEJO_USER_SECRET` orphans every user's git credentials.)
96
+ - **EBS snapshots** of the single host would capture Mongo + Forgejo + configs in
97
+ one shot — `> TODO` confirm a snapshot schedule exists.
98
+
99
+ ---
100
+
101
+ ## Failure playbook
102
+
103
+ | Symptom | Likely cause / check |
104
+ |---------|----------------------|
105
+ | Games won't start / blank | homegames-core or Docker down (`systemctl status docker homegames-core`). **Note:** the session manager falls back to in-process `fork()` if Docker is unavailable — for the public host this is a security risk; prefer fail-closed (see security_notes.md). |
106
+ | Publishes stuck in PENDING | API worker down, or RabbitMQ down (`systemctl status homegames-worker rabbitmq-server`); check `journalctl -u homegames-worker`. |
107
+ | "AI edit" never completes | Mac Studio worker offline or can't reach RabbitMQ. Safe to ignore short-term. |
108
+ | Verification / reset emails not arriving | SES in sandbox, `SES_FROM_ADDRESS` unset, or DKIM/SPF missing. With SES unset, the API logs the code instead of sending. |
109
+ | Site changes not live | the homegames.io deploy path (see TODO above) — invalidate CloudFront / sync S3. |
110
+ | Forgejo errors on publish/clone | Forgejo down or the hardcoded `FORGEJO_URL` IP changed (`api/config.js`). |
111
+ | Login broken for everyone | `JWT_SECRET` changed/lost. |
112
+
113
+ ---
114
+
115
+ ## Security posture (pointer)
116
+
117
+ The publish pipeline runs untrusted code; containment is the real boundary. Before
118
+ scaling or hardening, read **`homegames-core/../security_notes.md`** (Docker
119
+ fail-closed, IMDS/credential exposure from session containers, network egress,
120
+ least-privilege IAM, validation-vs-runtime parity). The single-host design means a
121
+ container escape reaches everything — weigh that as usage grows.
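The "Docker fail-closed" recommendation can be made concrete with a small policy sketch. `chooseRunner` and its `failClosed` option are hypothetical; the real session manager's option names and fallback logic are not shown in this diff.

```javascript
// Hypothetical policy helper illustrating fail-closed versus fallback behavior.
function chooseRunner({ dockerAvailable, failClosed }) {
  if (dockerAvailable) return 'docker';
  if (failClosed) {
    // On a public host, refusing to start is safer than running game code unsandboxed.
    throw new Error('Docker unavailable: refusing to run game code unsandboxed');
  }
  return 'fork'; // unsandboxed fallback, suitable only for trusted local use
}

console.log(chooseRunner({ dockerAvailable: true, failClosed: true }));
console.log(chooseRunner({ dockerAvailable: false, failClosed: false }));
```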