romdevtools 0.22.1 → 0.24.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (54)
  1. package/AGENTS.md +169 -494
  2. package/CHANGELOG.md +103 -0
  3. package/examples/genesis/templates/platformer.c +5 -1
  4. package/examples/genesis/templates/two_plane_parallax.c +166 -0
  5. package/package.json +2 -2
  6. package/src/cores/wasm/vice_x64_libretro.js +1 -1
  7. package/src/cores/wasm/vice_x64_libretro.wasm +0 -0
  8. package/src/host/LibretroHost.js +225 -2
  9. package/src/host/framebuffer.js +37 -0
  10. package/src/http/skill-doc.js +1 -1
  11. package/src/mcp/tools/audio.js +2 -2
  12. package/src/mcp/tools/frame.js +13 -34
  13. package/src/mcp/tools/index.js +2 -2
  14. package/src/mcp/tools/input-layout.js +10 -0
  15. package/src/mcp/tools/input.js +26 -2
  16. package/src/mcp/tools/metasprite-tools.js +1 -1
  17. package/src/mcp/tools/platform-tools.js +18 -11
  18. package/src/mcp/tools/playtest.js +17 -2
  19. package/src/mcp/tools/project.js +9 -1
  20. package/src/mcp/tools/rendering-context.js +1 -1
  21. package/src/mcp/tools/symbols.js +130 -39
  22. package/src/mcp/tools/tile-inspect.js +1 -1
  23. package/src/mcp/tools/toolchain.js +3 -2
  24. package/src/mcp/tools/watch-memory.js +58 -6
  25. package/src/platforms/_guides/ROMHACKING_PLAYBOOK.md +155 -6
  26. package/src/platforms/atari2600/MENTAL_MODEL.md +37 -0
  27. package/src/platforms/atari7800/MENTAL_MODEL.md +36 -0
  28. package/src/platforms/c64/MENTAL_MODEL.md +83 -6
  29. package/src/platforms/gb/MENTAL_MODEL.md +74 -0
  30. package/src/platforms/gb/lib/c/SDCC_GOTCHAS.md +91 -0
  31. package/src/platforms/gba/MENTAL_MODEL.md +57 -3
  32. package/src/platforms/gba/lib/arm-archives/libc.a +0 -0
  33. package/src/platforms/gba/lib/arm-archives/libgcc.a +0 -0
  34. package/src/platforms/gba/lib/arm-archives/libnosys.a +0 -0
  35. package/src/platforms/gbc/MENTAL_MODEL.md +34 -0
  36. package/src/platforms/gbc/lib/c/SDCC_GOTCHAS.md +91 -0
  37. package/src/platforms/genesis/MENTAL_MODEL.md +180 -0
  38. package/src/platforms/genesis/TROUBLESHOOTING.md +32 -0
  39. package/src/platforms/genesis/lib/c/libc.a +0 -0
  40. package/src/platforms/genesis/lib/c/libgcc.a +0 -0
  41. package/src/platforms/genesis/lib/c/libm.a +0 -0
  42. package/src/platforms/gg/MENTAL_MODEL.md +24 -0
  43. package/src/platforms/gg/lib/c/gg_crt0.s +30 -0
  44. package/src/platforms/lynx/MENTAL_MODEL.md +33 -7
  45. package/src/platforms/msx/MENTAL_MODEL.md +27 -0
  46. package/src/platforms/nes/MENTAL_MODEL.md +35 -0
  47. package/src/platforms/sms/MENTAL_MODEL.md +51 -0
  48. package/src/platforms/sms/lib/c/sms_crt0.s +40 -0
  49. package/src/platforms/snes/MENTAL_MODEL.md +21 -0
  50. package/src/platforms/snes/TROUBLESHOOTING.md +43 -0
  51. package/src/playtest/playtest.js +48 -0
  52. package/src/toolchains/sdcc/preflight-lint.js +164 -8
  53. package/examples/msx/catch_game/_verify.mjs +0 -93
  54. package/examples/pce/catch_game/_verify.mjs +0 -75
@@ -47,6 +47,19 @@ the same wall.
47
47
  `s__DATA` for `l__DATA` bytes; bring-your-own crt0 should do the
48
48
  same.
49
49
 
50
+ 6. **Don't poke a hardcoded `$C0xx` WRAM pointer for game state — it
51
+ overlaps your statics.** SDCC links the C runtime's data + BSS (every
52
+ `static` global: your PRNG seed, your grids, your scores) at the BOTTOM
53
+ of WRAM starting `$C000`. A `volatile uint8_t *board = (uint8_t*)0xC000;`
54
+ then scribbles right over `static uint32_t rng = ...;` et al. Symptom
55
+ looks exactly like an SDCC *codegen* bug — e.g. a 32-bit xorshift PRNG
56
+ that "degenerates" so every roll is identical (its seed is being
57
+ clobbered, not miscompiled). **Use a `static` array and let the linker
58
+ place it** (`static uint8_t board[78]; board[i]=p;`), or hardcode at
59
+ `$C200`+ and confirm with the linker map (`build({includeSymbols:true})`
60
+ → check `s__DATA`/`s__BSS`). Full write-up + repro in
61
+ `lib/c/SDCC_GOTCHAS.md` § "sm83 codegen traps in plain game logic".
62
+
50
63
  - **Two VRAM banks** (switched via VBK at $FF4F) — bank 0 holds tile
51
64
  pattern data, bank 1 holds per-tile BG attributes (palette index,
52
65
  H/V flip, priority, tile bank).
@@ -61,6 +74,27 @@ the same wall.
61
74
  - **HDMA** ($FF51-$FF55) for fast block transfers during HBlank —
62
75
  used for live tile streaming.
63
76
 
77
+ ## MCP debug & inspection tooling
78
+
79
+ GBC shares the patched gambatte core with DMG, so **all the live inspectors
80
+ and `gb_*` memory regions documented in the GB MENTAL_MODEL apply unchanged
81
+ here** — `sprites({op:'inspect'})`, `tiles({op:'png'})`, `cpu({op:'read'})`,
82
+ `audioDebug({op:'inspect', chip:'gb'})`, and the `gb_vram` / `gb_oam` / `gb_io`
83
+ / `gb_hram` / `gb_cpu_regs` regions (same gotcha: it's `gb_vram`, NOT the
84
+ generic `video_ram`). Disassembly routes through the same `-m gbz80` objdump.
85
+ See the GB MENTAL_MODEL for the shared gambatte debug tooling.
86
+
87
+ CGB-only deltas on top of that shared set:
88
+
89
+ - **`palette({source:'live'})`** on a CGB ROM decodes the **64-byte BCPS/OCPS
90
+ palette RAM** into **8 palettes × 4 colors in BGR555** (the DMG path that
91
+ decodes BGP/OBP0/OBP1 bytes is what runs on a `gb` build instead). The raw
92
+ CGB palette RAM is also readable directly via the **`gb_bgpdata`** (BG, 64
93
+ bytes) and **`gb_objpdata`** (OBJ, 64 bytes) memory regions.
94
+ - **`background({view:'renderState'})`** reports the CGB extras the DMG path
95
+ doesn't have: the current **VRAM bank** (VBK), **KEY1** (double-speed state),
96
+ and the live **BCPS/OCPS palette index**.
97
+
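To make the "8 palettes × 4 colors in BGR555" layout concrete, here is a minimal sketch of decoding one entry from a 64-byte dump of `gb_bgpdata` or `gb_objpdata`. It assumes the standard CGB encoding (two little-endian bytes per color, red in bits 0-4, green in bits 5-9, blue in bits 10-14). The `Rgb888` type and the `cgb_color` / `expand5` names are hypothetical helpers for illustration, not part of the tool.

```c
#include <stdint.h>

/* Hypothetical output type for illustration. */
typedef struct { uint8_t r, g, b; } Rgb888;

/* Expand a 5-bit channel (0..31) to 8 bits (0..255). */
static uint8_t expand5(uint8_t v)
{
    return (uint8_t)((v << 3) | (v >> 2));
}

/* Decode color `color` (0..3) of palette `palette` (0..7) from a
   64-byte CGB palette RAM dump. Assumes little-endian BGR555. */
static Rgb888 cgb_color(const uint8_t *pal_ram, unsigned palette, unsigned color)
{
    unsigned off = (palette * 4u + color) * 2u;
    uint16_t raw = (uint16_t)(pal_ram[off] | ((uint16_t)pal_ram[off + 1u] << 8));
    Rgb888 out;
    out.r = expand5((uint8_t)(raw & 0x1Fu));
    out.g = expand5((uint8_t)((raw >> 5) & 0x1Fu));
    out.b = expand5((uint8_t)((raw >> 10) & 0x1Fu));
    return out;
}
```

Feeding it the bytes read from `gb_bgpdata` should give the same colors that `palette({source:'live'})` reports, which makes it a quick cross-check when a palette looks wrong.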
64
98
  ## CGB vs DMG mode
65
99
 
66
100
  The CGB boot ROM checks header byte **`$0143`**:
@@ -188,3 +188,94 @@ build({
188
188
  },
189
189
  })
190
190
  ```
191
+
192
+ ## sm83 codegen traps in plain game logic (WRAM integer/array code)
193
+
194
+ Every footgun above is about VRAM / OAM-DMA / the cart header — the stuff
195
+ that makes sprites vanish. This section is the opposite: **plain WRAM game
196
+ logic** — PRNGs, collision grids, score math. Two such "miscompiles" were
197
+ reported from a real GBC Columns build session and chased to ground here.
198
+ **Verdict: neither was an sm83 codegen bug.** They are documented so you
199
+ don't burn hours blaming the compiler for what is actually a memory-layout
200
+ or static-init trap.
201
+
202
+ ### NOT a bug: 32-bit math / `uint32_t` shifts ≥ 16
203
+
204
+ Reported: *"`static uint32_t rng=0x1357; rng ^= rng<<13; rng ^= rng>>17;
205
+ rng ^= rng<<5;` degenerates — every `1+xorshift()%6` roll comes out the
206
+ same (near-monochrome)."*
207
+
208
+ **Reproduced on sm83: it does NOT degenerate.** A ROM that seeds the PRNG,
209
+ calls `xorshift()` 20×, and writes `1 + (result % 6)` to WRAM reads back a
210
+ fully-varied `5,5,5,1,5,5,4,1,3,2,1,...` — the exact sequence a reference
211
+ implementation produces. Full 32-bit fidelity was confirmed byte-for-byte
212
+ across several seeds (`0xDEADBEEF`, `0x00000001`, …). The `<<13` / `>>17` /
213
+ `<<5` shifts (including the ≥16-bit right shift) and `% 6` are all correct.
214
+ **Do not rewrite a working 32-bit xorshift into 16-bit to "dodge" this.**
215
+ 32-bit ops are bigger/slower than 16-bit on an 8-bit CPU, so prefer 16-bit
216
+ PRNGs for *speed* — but not for correctness; both are correct.
217
+
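For reference, the 13/17/5 xorshift discussed above can be written as the following minimal sketch. The function and variable names (`xorshift32`, `roll_d6`, `rng_state`) are illustrative, not taken from the reported project. The state is an ordinary linker-placed `static`, which is the pattern the rest of this section recommends.

```c
#include <stdint.h>

/* PRNG state. Must never be 0: xorshift maps 0 to 0 forever. */
static uint32_t rng_state = 0x1357u;

/* One step of the 32-bit xorshift with the 13 / 17 / 5 shift triple. */
static uint32_t xorshift32(void)
{
    uint32_t x = rng_state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rng_state = x;
    return x;
}

/* Map one PRNG step to a 1..6 roll, as in the reported code. */
static uint8_t roll_d6(void)
{
    return (uint8_t)(1u + (xorshift32() % 6u));
}
```

If a build of this shape produces identical rolls, the first thing to check is whether `rng_state` still holds a plausible non-zero value, not whether the shifts compiled correctly.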
218
+ ### The REAL trap behind "monochrome RNG": writing game state to a fixed
219
+ `0xC0xx` WRAM address that overlaps your statics
220
+
221
+ This is what actually produces the reported symptom. SDCC links the C
222
+ runtime's `_DATA` / `_INITIALIZED` segment (every value-initialised
223
+ `static`, e.g. `static uint32_t rng = 0x1357;`) **at the very bottom of
224
+ WRAM, starting `$C000`**, with `_BSS` (zero-init statics like
225
+ `static uint8_t grid[78];`) right after it. If your code also pokes a
226
+ **hardcoded** `$C000`-area pointer for game state —
227
+
228
+ ```c
229
+ volatile uint8_t *board = (volatile uint8_t *)0xC000; /* DON'T */
230
+ board[i] = piece; /* clobbers `rng` and friends! */
231
+ ```
232
+
233
+ — you are scribbling directly over your own statics. Then `xorshift()`
234
+ reads a trashed `rng`, the PRNG collapses, and every roll looks the same.
235
+ It presents *exactly* like a compiler bug; it is not.
236
+
237
+ **Fixes (any one):**
238
+ - **Best — let the linker place it.** Use a `static` array and take its
239
+ address; never hardcode a WRAM pointer:
240
+ `static uint8_t board[6*13]; ... board[i] = piece;`
241
+ - If you *must* use a fixed address, put it well clear of the runtime data:
242
+ `$C200`+ is safe for small projects (statics here end far below `$C100`;
243
+ `shadow_oam` is pinned at `$C100`). Confirm with the linker map — build
244
+ with `includeSymbols:true` and look at `s__DATA` / `s__BSS` (e.g.
245
+ `s__DATA = $C000`, `s__BSS = $C006`): your scratch RAM must start ABOVE
246
+ the end of `_BSS`.
247
+ - **Diagnose it in seconds:** read `system_ram` offset 0 right after boot
248
+ and compare against your initialised statics' expected bytes. If a
249
+ `static uint32_t x = 0x1357;` doesn't read back `57 13 00 00` at its map
250
+ address, something is overwriting it.
251
+
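The "scratch RAM must start above the end of `_BSS`" rule can be checked mechanically once you have the segment values from the linker map. The sketch below is a host-side helper under stated assumptions: the segment lengths are whatever your map reports (the tests use made-up lengths), and the 160-byte size for `shadow_oam` is an assumption based on 40 OAM entries of 4 bytes each. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* shadow_oam is pinned at $C100 per the text; 160 bytes is assumed. */
#define SHADOW_OAM_ADDR 0xC100u
#define SHADOW_OAM_LEN  160u

/* True when [a, a+a_len) and [b, b+b_len) share at least one byte. */
static bool ranges_overlap(uint16_t a, uint16_t a_len,
                           uint16_t b, uint16_t b_len)
{
    uint32_t a_end = (uint32_t)a + a_len;
    uint32_t b_end = (uint32_t)b + b_len;
    return a_len != 0 && b_len != 0 && a < b_end && b < a_end;
}

/* A fixed scratch block is safe only if it misses _DATA, _BSS and
   shadow_oam. Pass s__DATA / l__DATA / s__BSS / l__BSS from the map. */
static bool scratch_is_safe(uint16_t scratch, uint16_t scratch_len,
                            uint16_t s_data, uint16_t l_data,
                            uint16_t s_bss, uint16_t l_bss)
{
    return !ranges_overlap(scratch, scratch_len, s_data, l_data)
        && !ranges_overlap(scratch, scratch_len, s_bss, l_bss)
        && !ranges_overlap(scratch, scratch_len,
                           SHADOW_OAM_ADDR, SHADOW_OAM_LEN);
}
```

With `s__DATA = $C000` and `s__BSS = $C006` as in the example above, a 78-byte board at `$C000` fails the check, while the same board at `$C200` passes.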
252
+ ### NOT a bug: short `for` loop with an indexed `static` array read
253
+
254
+ Reported: *"`for(i=0;i<3;i++){ if(grid[r*6+col]) return 1; }` reads the
255
+ wrong cells (pieces lock mid-air / floating gaps); unrolling the 3
256
+ iterations fixed it."*
257
+
258
+ **Reproduced on sm83: the looped form reads the CORRECT cells.** A ROM that
259
+ seeds `grid[]` with a sparse occupied/empty pattern and runs `collides()`
260
+ both looped and hand-unrolled, for 8 straddling `(col,topy)` inputs, gets
261
+ **identical, correct** results from both forms (`1,0,1,0,1,1,1,1`). The
262
+ `grid[r*6+col]` index math and the 3-iteration loop are fine. If your real
263
+ collision check "floats," look first at the WRAM-collision trap above (a
264
+ clobbered `grid[]`), at off-by-one row/col limits, or at signed/unsigned
265
+ mix-ups — not at loop codegen. **Don't pre-emptively unroll loops as a
266
+ compiler workaround; with the stack-overflow fix in place, sm83 loops with
267
+ indexed array reads are reliable.**
268
+
269
+ ### z80 (SMS/GG) ONLY — fixed: value-initialised statics booted as 0
270
+
271
+ Investigating the above on the **z80** port (SMS/GG share the SDCC family)
272
+ surfaced a real bug — but a **crt0** bug, not codegen. The bundled
273
+ `sms_crt0.s` / `gg_crt0.s` placed `_INITIALIZER` (the ROM image of
274
+ value-initialised statics) *after* the `_DATA` RAM block in the area list,
275
+ so sdld put it in RAM; the gsinit `ldir` then copied uninitialised RAM onto
276
+ itself and **every `static uint8_t x = 5;` booted as 0** (and BSS wasn't
277
+ zeroed either). On z80 *this* is what made the xorshift PRNG monochrome
278
+ (seed `rng` booted 0 → stayed 0). Fixed 2026-06-08 by ROM-placing
279
+ `_INITIALIZER` + adding a `_DATA` zero loop, mirroring this sm83 crt0 (which
280
+ was already correct — hence sm83 was never affected). If you bring your own
281
+ z80 crt0, model gsinit on `gb_crt0.s`.
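As a rough mental model of what a correct gsinit has to do at boot, the two steps can be sketched in C. This is an illustration only: the real work is done in assembly by the crt0, and the function and parameter names below are hypothetical. The point it shows is that the copy source must be the ROM image. If the initializer image is itself placed in RAM, the copy reads garbage and value-initialised statics never receive their values.

```c
#include <stddef.h>
#include <stdint.h>

/* Model of gsinit: zero the zero-init block, then copy the ROM image
   of value-initialised statics into their RAM home. */
static void gsinit_model(uint8_t *data, size_t l_data,
                         uint8_t *initialized,
                         const uint8_t *initializer_rom,
                         size_t l_initializer)
{
    size_t i;

    /* Step 1: every zero-init static must read back 0. */
    for (i = 0; i < l_data; i++)
        data[i] = 0;

    /* Step 2: ROM image -> RAM. */
    for (i = 0; i < l_initializer; i++)
        initialized[i] = initializer_rom[i];
}
```

After this runs, a `static uint32_t rng = 0x1357;` occupies four RAM bytes reading `57 13 00 00`, which is the readback check described earlier in this section.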
@@ -161,6 +161,167 @@ while (1) {
161
161
  your sprite updates never appear on screen. It's the single most
162
162
  important call in any SGDK game loop.
163
163
 
164
+ ## Scrolling, parallax & the feel trap ⭐
165
+
166
+ This is the section to read before you build a side-scroller. The #1
167
+ "my horizontal movement feels choppy/juddery" bug on Genesis is a
168
+ software mistake, not a hardware limit:
169
+
170
+ > ### ⚠️ DO NOT rewrite full tilemaps in the frame loop.
171
+ > The Genesis scrolls in HARDWARE. Moving the world is **two register
172
+ > writes** (`VDP_setHorizontalScroll`), which are free. If instead you
173
+ > redraw the plane each frame (a big `VDP_setTileMapXY`/`VDP_loadTileMap`
174
+ > burst or a per-frame DMA), you overrun vblank, drop frames, and the
175
+ > scroll judders. **Paint the planes ONCE at setup; the loop only nudges
176
+ > scroll registers and re-stages sprites.** Use the
177
+ > `template:"two_plane_parallax"` scaffold as the known-good shape.
178
+
179
+ ### Hardware scroll, the whole loop
180
+
181
+ A two-plane parallax scroller's *entire* per-frame render cost is:
182
+
183
+ ```c
184
+ VDP_setHorizontalScroll(BG_A, -camX); // foreground: 1:1 with world
185
+ VDP_setHorizontalScroll(BG_B, -(camX >> 4)); // background: 1/16 speed = far depth
186
+ /* ...stage sprites in SCREEN space... */
187
+ VDP_setSprite(0, playerScreenX, playerY, SPRITE_SIZE(2,2), attr);
188
+ VDP_updateSprites(1, DMA); // flush the SAT
189
+ SYS_doVBlankProcess(); // flush DMA queue, sync vblank
190
+ ```
191
+
192
+ No `VDP_setTileMapXY` / `VDP_fillTileMapRect` / `VDP_loadTileMap` in the
193
+ loop. Those are SETUP calls (and tiny one-off updates — a coin that
194
+ vanishes, a door that opens). They are NOT for whole-plane runtime
195
+ redraws. Positive `camX` scrolls the plane LEFT, so you write the
196
+ NEGATIVE camera offset. `VDP_setVerticalScroll` is the vertical twin
197
+ (it writes VSRAM — see `genesis_vsram`).
198
+
199
+ ### Logical plane size vs HARDWARE plane size
200
+
201
+ A common confusion: **the Genesis has ONE shared plane-size setting for
202
+ BOTH planes A and B** (VDP register 16). You pick 32×32 / 64×32 / 32×64 /
203
+ 64×64 *cells* once; you do NOT get an independent size per plane. So a
204
+ "32-cell-wide level" still lives inside a 64-cell **physical** plane if
205
+ that's the hardware size you set — the extra cells are just offscreen
206
+ buffer. The scroll value wraps within the physical plane
207
+ (64 cells = 512 px), which is exactly what makes a fully-painted plane
208
+ tile forever with no redraw. Don't fight this: pick a hardware plane
209
+ size and treat your logical world coords separately.
210
+
211
+ | Plane size (cells) | Pixels | Use |
212
+ |--------------------|----------|--------------------------------------|
213
+ | 32×32 | 256×256 | single-screen / small wrap |
214
+ | **64×32** (default)| 512×256 | horizontal scroller (one plane wide) |
215
+ | 32×64 | 256×512 | vertical scroller |
216
+ | 64×64 | 512×512 | uses the most VRAM for name tables |
217
+
218
+ ### How Sonic-style large maps REALLY work (wider than one plane)
219
+
220
+ You do NOT make the plane "as wide as the level," and you do NOT redraw
221
+ the plane. The 64-cell hardware plane is a **circular buffer**: as the
222
+ camera advances, the column scrolling OFF the left re-appears on the
223
+ right (the scroll wraps mod 512 px). You keep the visible window full by
224
+ updating exactly **ONE offscreen column** each time the camera crosses
225
+ an 8-px tile boundary:
226
+
227
+ ```c
228
+ // camX in pixels; world is an array wider than 512 px.
229
+ s16 newTileCol = camX >> 3;
230
+ if (newTileCol != lastTileCol) {
231
+ // the column about to enter view on the right edge:
232
+ s16 worldCol = (camX + SCREEN_W) >> 3;
233
+ s16 planeCol = worldCol & 63; // wrap into the 64-cell plane
234
+ drawWorldColumn(planeCol, worldCol); // ONE column, ~28 cells — tiny
235
+ lastTileCol = newTileCol;
236
+ }
237
+ ```
238
+
239
+ That's ~28 tile writes per 8 px of travel, not a 1792-cell plane redraw.
240
+ The `template:"platformer"` scaffold scrolls within one plane (no
241
+ streaming); add the column-stream above to go wider. (Real Sonic also
242
+ splits the screen with H-blank raster effects for independent strips —
243
+ that's an IRQ/raster topic, see the `asm` template.)
244
+
245
+ ## Why does horizontal movement feel choppy? — motion-trace it headlessly ⭐
246
+
247
+ When movement feels off, don't trial-and-error with screenshots. Sample
248
+ the player's world-X, the camera scroll, and the actual VDP scroll
249
+ values over ~180 frames while holding a direction, and read the curve.
250
+ Two signatures to look for:
251
+
252
+ 1. **Camera scroll changes while the sprite's screen-X barely moves**
253
+ (or vice-versa) → your camera-follow math is off; the world slides
254
+ under a frozen-looking player, or the player slides on a frozen world.
255
+ 2. **Scroll JUMPS** (non-monotone, big steps) → you're scrolling by a
256
+ non-constant amount per frame (variable-rate camera, or you only
257
+ update scroll on a tile boundary instead of every frame).
258
+
259
+ The exact call — hold RIGHT, sample player-X + both planes' HSCROLL +
260
+ VSRAM over 180 frames. Expose the player/camera vars as `volatile`
261
+ globals so they resolve (see "Reading your C globals headlessly"); the
262
+ HSCROLL table lives in VRAM (`video_ram`), default base **$F000**
263
+ (`frame({op:'verify'})`'s render summary prints "H-scroll table: $Fxxx"):
264
+
265
+ ```js
266
+ b = build({output:'romWithDebug', platform:'genesis', source, inline:true,
267
+ resolveSymbols:['g_player_x','g_cam_x']})
268
+ // → resolvedSymbols.g_player_x.ramOffset (system_ram offset)
269
+ recordSession({
270
+ frames:180, sampleEvery:10, includeScreenshots:false,
271
+ holdInputs:[{right:true}],
272
+ memorySamples:[
273
+ {label:'player_x', region:'system_ram', offset: PLAYER_X_OFF, length:2},
274
+ {label:'cam_x', region:'system_ram', offset: CAM_X_OFF, length:2},
275
+ {label:'hscrollA', region:'video_ram', offset:0xF000, length:2},
276
+ {label:'hscrollB', region:'video_ram', offset:0xF002, length:2},
277
+ {label:'vsram', region:'genesis_vsram', offset:0, length:4},
278
+ ],
279
+ })
280
+ ```
281
+
282
+ Read the columns: `player_x` should ramp smoothly; `hscrollA` should
283
+ move 1:1 with the camera and `hscrollB` at the parallax ratio; both
284
+ should be **monotone** (no jumps) while RIGHT is held. ⚠ Genesis WRAM +
285
+ VRAM read **word-byte-swapped** in gpgx (a 16-bit `0x00F0` reads as
286
+ bytes `F0 00`) — account for the swap, or read single bytes. For a
287
+ compact value-vs-frame curve of just the HSCROLL table use
288
+ `watch({on:'mem', region:'video_ram', offset:0xF000, length:4,
289
+ format:'series', pressDuring:[{frame:0, button:'right', holdFrames:180}]})`.
290
+
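When post-processing sampled bytes off-line, the swap can be undone with a tiny helper. This is a sketch that assumes a 2-byte sample arrives exactly as described above (a 16-bit `0x00F0` shows up as the bytes `F0 00`). The function names are hypothetical.

```c
#include <stdint.h>

/* Rebuild a 16-bit value from two bytes sampled out of gpgx WRAM or
   VRAM. Assumes the swapped order described above: low byte first. */
static uint16_t gpgx_sample_u16(const uint8_t *sample)
{
    return (uint16_t)(sample[0] | ((uint16_t)sample[1] << 8));
}

/* Same bytes read as a signed value, handy for scroll offsets, which
   are written as the NEGATIVE camera position. */
static int16_t gpgx_sample_s16(const uint8_t *sample)
{
    return (int16_t)gpgx_sample_u16(sample);
}
```

Decoding `hscrollA` and `cam_x` this way lets you compare them as numbers (the scroll value should track the negated camera offset) instead of eyeballing hex pairs.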
291
+ ## Is the loop doing too much VDP work? — per-frame DMA budget ⭐
292
+
293
+ The render-side cause of choppy scroll is **too many VDP/DMA bytes per
294
+ frame** (a tilemap rewrite, an asset re-upload). Measure it directly,
295
+ no core rebuild:
296
+
297
+ ```js
298
+ watch({on:'dma', perFrame:true, frames:120,
299
+ pressDuring:[{frame:0, button:'right', holdFrames:120}]})
300
+ ```
301
+
302
+ returns a per-frame timeline `[{frame, dmas, bytes, romBytes, ramBytes}]`
303
+ plus `avgBytesPerFrame`, `peakFrame`/`peakBytes`, and `spikes`.
304
+
305
+ - A **smooth hardware-scroll loop** shows a LOW, FLAT curve — after boot
306
+ it's mostly the steady SAT/scroll refresh (`ramBytes`, single/low
307
+ double digits per frame).
308
+ - A **`spikes` entry** (bytes ≫ average, especially `romBytes` — an
309
+ asset upload FROM cart ROM) is the "I rewrote a tilemap / re-uploaded
310
+ tiles in the frame loop" smell. Move that work to setup, or stream
311
+ ONE column per 8-px scroll step (above).
312
+
313
+ **CEILING / what this does NOT catch:** this counts mem→VDP **DMA**
314
+ bytes (the dominant cost). Plain CPU writes to the VDP data port —
315
+ `VDP_setTileMapXY` without DMA, single-cell pokes — are not DMA and are
316
+ NOT counted; catching *those* would need a core-side VDP-data-port write
317
+ hook (a gpgx patch, not shipped). In practice the expensive per-frame
318
+ mistakes (whole-plane fills, `VDP_loadTileMap`, big `DMA_*` transfers)
319
+ ALL go through DMA and DO show up here, so the budget is a reliable
320
+ choppiness diagnostic today. There is no exposed per-frame
321
+ "vblank-cycles-used / overrun" counter either — infer overrun from the
322
+ byte budget (DMA bandwidth in vblank is finite: ~7.6 KB to VRAM in NTSC
323
+ vblank, more in PAL's longer vblank; a frame moving multiple KB to VRAM is at risk).
324
+
164
325
  ## Input
165
326
 
166
327
  `u16 pad = JOY_readJoypad(JOY_1)` returns a packed bitmask. The
@@ -252,6 +413,25 @@ headless per-PCM-channel "is it playing" readout for Genesis yet (it would need
252
413
  core patch to expose the XGM2 Z80 driver state), so audio verification here is
253
414
  record-and-listen, not assert.
254
415
 
416
+ ## MCP debug & inspection tooling
417
+
418
+ The shipped genesis_plus_gx (gpgx) core is patched for live introspection.
419
+ Video is deeply readable; the FM audio chip is only partially exposed:
420
+
421
+ - **Sprites:** `sprites({op:'inspect'})` decodes the live SAT.
422
+ - **Palette:** `palette({source:'live'})` reads live CRAM.
423
+ - **CPU:** `cpu({op:'read', cpu:'main'})` reads the 68000.
424
+ - **Audio (limited):** `getYm2612State` returns the YM2612's internal
425
+ struct as a raw blob — gpgx doesn't expose it in a safely per-channel
426
+ decodable form (good for frame-to-frame diffing, see "Debugging sound").
427
+ `getPsgState` decodes the SN76489 (3 tone + 1 noise channels).
428
+ - **Memory regions:** `memory({op:'read'})` exposes CRAM, VSRAM, VDP_REGS,
429
+ Z80_RAM (the sound CPU's RAM), M68K work RAM, YM2612, PSG, and VRAM.
430
+ Remember the gpgx byte-swap quirk: VRAM and WRAM read host-LE
431
+ word-byte-swapped (a 16-bit value's two bytes are swapped at the offset)
432
+ — account for it or read single bytes (see "Reading your C globals
433
+ headlessly").
434
+
255
435
  ## ROM layout
256
436
 
257
437
  ```
@@ -230,3 +230,35 @@ pixels per byte). The shipped `hello_sprite`, `tile_engine`, `shmup`,
230
230
  `platformer`, `puzzle` templates use this approach. **But never
231
231
  hand-encode a full-screen image this way** — that's the red/choppy
232
232
  failure above.
233
+
234
+ ## "Horizontal movement / scrolling feels choppy or judders"
235
+
236
+ Almost always: **you're rewriting the tilemap in the frame loop.** The
237
+ Genesis scrolls in HARDWARE — moving the world is two register writes
238
+ (`VDP_setHorizontalScroll`), which cost nothing. If you instead redraw a
239
+ plane every frame (a `VDP_fillTileMapRect` / `VDP_loadTileMap` / big
240
+ `DMA_*` each frame), you overrun the vblank DMA budget and drop frames →
241
+ judder. Fix: paint the planes ONCE at setup; the loop only nudges scroll
242
+ registers + re-stages sprites. The `template:"two_plane_parallax"`
243
+ scaffold is the known-good shape.
244
+
245
+ Diagnose it without guessing (no core rebuild):
246
+
247
+ - **Per-frame VDP work:** `watch({on:'dma', perFrame:true, frames:120,
248
+ pressDuring:[{frame:0, button:'right', holdFrames:120}]})` → a per-frame
249
+ `[{frame, bytes, romBytes, ramBytes}]` timeline + `spikes`. A smooth
250
+ loop is a LOW, FLAT curve (just the SAT/scroll refresh). A `spikes`
251
+ entry (bytes ≫ avg, esp. `romBytes`) IS the per-frame asset-upload /
252
+ tilemap-rewrite mistake. (Counts DMA bytes only — non-DMA single-cell
253
+ `VDP_setTileMapXY` pokes aren't counted; the expensive whole-plane work
254
+ always uses DMA and does show.)
255
+ - **Motion curve:** sample the player-X + both planes' HSCROLL ($F000 in
256
+ `video_ram`) over 180 frames while holding a direction — see
257
+ MENTAL_MODEL.md "Why does horizontal movement feel choppy?". Look for
258
+ scroll that jumps (non-constant per-frame delta) or a camera that moves
259
+ while the sprite's screen-X is frozen.
260
+
261
+ For a world WIDER than one 512-px plane, don't make the plane bigger and
262
+ don't redraw it — stream ONE offscreen column per 8-px camera step
263
+ (circular-buffer the 64-cell plane). See MENTAL_MODEL.md "How Sonic-style
264
+ large maps REALLY work".
Binary file
Binary file
@@ -160,6 +160,30 @@ void main(void) {
160
160
  }
161
161
  ```
162
162
 
163
+ ## MCP debug & inspection tooling
164
+
165
+ Game Gear runs on the same genesis_plus_gx (gpgx, patched) core as SMS, so the
166
+ inspectors are identical. **The canonical reference lives in the SMS
167
+ MENTAL_MODEL** (`src/platforms/sms/MENTAL_MODEL.md`, "MCP debug & inspection
168
+ tooling" section): `sprites({op:'inspect'})`, `tiles({op:'png'})`,
169
+ `cpu({op:'read'})` (Z80), `background({view:'renderState'})`,
170
+ `audioDebug({op:'inspect', chip:'psg'})` (SN76489, the shared
171
+ SMS/GG/Genesis region), and the z80 `objdump` disasm pipeline all apply
172
+ unchanged.
173
+
174
+ Game-Gear-only deltas:
175
+
176
+ - **`palette({source:'live'})`** decodes **12-bit BGR (4-4-4)**, twice the
177
+ depth of SMS's 6-bit. CRAM is **64 bytes** (2 little-endian bytes per
178
+ entry) instead of 32.
179
+ - The Game-Gear memory regions are **`gg_vram`** and **`gg_cram`** (the
180
+ 64-byte palette); use these instead of `sms_vram` / `sms_cram`. The
181
+ `sms_vdp_regs` / `sms_z80_regs` register regions are shared (same VDP and
182
+ Z80).
183
+ - `sprites({op:'inspect'})` X/Y fields are reported in **256×192 hardware
184
+ coordinates**, not the 160×144 visible window — match them with
185
+ hardware-coord arithmetic (see "Sprite coords are hardware-space" above).
186
+
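A minimal sketch of decoding one Game Gear CRAM entry from a 64-byte `gg_cram` dump follows. It assumes the usual Game Gear layout (low byte holds green in the high nibble and red in the low nibble, high byte holds blue in its low nibble). The `Rgb888` type and `gg_color` name are hypothetical, for illustration only.

```c
#include <stdint.h>

/* Hypothetical output type for illustration. */
typedef struct { uint8_t r, g, b; } Rgb888;

/* Decode palette entry `index` (0..31) from a 64-byte gg_cram dump.
   Each entry is two little-endian bytes forming 0x0BGR. */
static Rgb888 gg_color(const uint8_t *cram, unsigned index)
{
    uint16_t raw = (uint16_t)(cram[index * 2u]
                 | ((uint16_t)cram[index * 2u + 1u] << 8));
    Rgb888 out;
    /* Multiply by 17 to stretch a 4-bit channel (0..15) to 0..255. */
    out.r = (uint8_t)((raw & 0x0Fu) * 17u);
    out.g = (uint8_t)(((raw >> 4) & 0x0Fu) * 17u);
    out.b = (uint8_t)(((raw >> 8) & 0x0Fu) * 17u);
    return out;
}
```

The same dump read with an SMS-style one-byte-per-entry decoder would produce wrong colors, which is a quick way to spot that the wrong region or decoder is in use.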
163
187
  ## Differences from SMS — quick reference
164
188
 
165
189
  - Visible 160×144 vs 256×192 — center content
@@ -31,6 +31,8 @@
31
31
  .globl l__INITIALIZER
32
32
  .globl s__INITIALIZER
33
33
  .globl s__INITIALIZED
34
+ .globl s__DATA
35
+ .globl l__DATA
34
36
 
35
37
  ;; ─── Reset vector at $0000 ────────────────────────────────────────
36
38
  .area _HEADER (ABS)
@@ -83,7 +85,15 @@
83
85
  ;; call main. The initializer area is filled by sdcc when it sees
84
86
  ;; global initializations.
85
87
 
88
+ ;; AREA ORDERING IS LOAD-BEARING. `_INITIALIZER` (the ROM image of
89
+ ;; every value-initialised `static` global) MUST be declared in the
90
+ ;; ROM group here — BEFORE the `_DATA` RAM block. If it isn't, sdld
91
+ ;; places `_INITIALIZER` in RAM right after `_INITIALIZED`, so the
92
+ ;; gsinit copy below copies uninitialised RAM onto itself and every
93
+ ;; `static uint8_t x = 5;` boots as 0. (Bug found 2026-06-08; see the
94
+ ;; matching note in sms_crt0.s — both z80 crt0s were missing this.)
86
95
  .area _HOME
96
+ .area _INITIALIZER
87
97
  .area _CODE
88
98
  .area _GSINIT
89
99
  .area _GSFINAL
@@ -97,6 +107,26 @@
97
107
  .area _CODE
98
108
 
99
109
  gsinit:
110
+ ;; ── Zero the BSS segment (`_DATA`). ──────────────────────────
111
+ ;; Every uninitialised `static` global lands in `_DATA` and MUST
112
+ ;; read back 0 at boot. Mirrors the sm83 GB crt0's gsinit_data loop.
113
+ ld bc, #l__DATA
114
+ ld a, b
115
+ or a, c
116
+ jr Z, gsinit_bss_done
117
+ ld hl, #s__DATA
118
+ ld (hl), #0x00
119
+ ld d, h
120
+ ld e, l
121
+ inc de
122
+ dec bc
123
+ ld a, b
124
+ or a, c
125
+ jr Z, gsinit_bss_done
126
+ ldir ; propagate the 0 across _DATA
127
+ gsinit_bss_done:
128
+
129
+ ;; ── Copy `_INITIALIZER` (ROM) → `_INITIALIZED` (RAM). ────────
100
130
  ld bc, #l__INITIALIZER
101
131
  ld a, b
102
132
  or a, c
@@ -35,13 +35,39 @@ Mikey handles:
35
35
  - **Joystick** — read via SWITCHES register at `$FCB0`. cc65 provides
36
36
  `joy_read(JOY_1)` + `JOY_LEFT/RIGHT/UP/DOWN/BTN_1/BTN_2` macros.
37
37
 
38
- **Live debug:** `audioDebug({op:'inspect', chip:"mikey"})` decodes the 4 audio voices
39
- (volume, period→freq→note, LFSR state); `palette({source:'live'})` reads the 16-entry
40
- palette; `background({view:'renderState'})` shows DISPCTL + the display base address;
41
- `cpu({op:'read'})` reads the 65C02; `breakpoint({on:'write'})` is the write watchpoint. **Sprites
42
- are the exception** the Lynx has no fixed OAM (sprites are SCB linked lists
43
- walked by Suzy), so `sprites({op:'inspect'})` returns the SCB list head ($FC10/$FC11) and
44
- you walk the chain over `system_ram`, rather than reading a sprite table.
38
+ **Live debug:** the MCP inspectors (`palette` / `cpu` / `audioDebug` /
39
+ `background` / `breakpoint`, and the SCB-list-head `sprites` special case) are
40
+ documented in "MCP debug & inspection tooling" below.
41
+
42
+ ## MCP debug & inspection tooling
43
+
44
+ The Lynx runs on handy (patched). The inspectors read the *live* core state —
45
+ reach for them when a sprite or palette renders wrong and the source alone
46
+ doesn't explain it. Details and the per-tool facts:
47
+
48
+ - **`palette({source:'live'})`** — the **16-entry, 12-bit RGB** Mikey palette
49
+ (`$FDA0-$FDBF`) converted to RGB.
50
+ - **`cpu({op:'read'})`** — 65C02 dump: A / X / Y / P / SP / PC plus the
51
+ decoded flag bits.
52
+ - **`audioDebug({op:'inspect', chip:'mikey'})`** — the 4 Mikey voices: volume,
53
+ the timer→period→frequency→note chain, and the **12-bit LFSR** state.
54
+ - **`background({view:'renderState'})`** — decodes DISPCTL: the DMA-enable,
55
+ flip, and color-mode bits plus the display base address.
56
+ - **`sprites({op:'inspect'})` is the special case.** The Lynx has **no fixed
57
+ OAM** — sprites are **SCB (Sprite Control Block) linked lists in RAM** that
58
+ Suzy walks at blit time. So this tool can't return a sprite table; instead
59
+ it returns the **SCB list head (SCBNEXT, `$FC10`/`$FC11`)** plus
60
+ instructions to walk the chain yourself over `system_ram`.
61
+
62
+ ### Memory regions (`memory({op:'read', region:…})`)
63
+
64
+ | Region | Address / size | Contents |
65
+ |-----------------|-----------------------|-------------------------------------------------------|
66
+ | `lynx_cpu_regs` | — | 65C02 register snapshot |
67
+ | `lynx_hw_regs` | $FC00-$FDFF window | the **Suzy + Mikey** register window — sprite-engine regs, LCD control, audio, palette |
68
+ | `system_ram` | 64 KB | full address space (also where the SCB chain lives) |
69
+
70
+ Pair these with `breakpoint({on:'write'})` for the full live-debug loop.
45
71
 
46
72
  ## Frame heartbeat (cc65 + tgi)
47
73
 
@@ -117,3 +117,30 @@ exactly this.
117
117
  generator + the envelope (period + shape bits).
118
118
  - `memory({op:'read'})` regions: `msx_vram`, `msx_vdp_regs`, `msx_vdp_status`,
119
119
  `msx_palette`, `msx_cpu_regs`, `msx_psg_regs`, plus `system_ram` (work RAM).
120
+
121
+ ## MCP debug & inspection tooling
122
+
123
+ MSX is a **Tier-1** platform with deep introspection — the full set of
124
+ inspectors and memory regions is listed under **"Debugging tools"** above
125
+ (`cpu` / `background` / `palette` / `sprites` / `symbols` / `audioDebug` for
126
+ the AY-3-8910 PSG, and the `msx_vram` / `msx_vdp_regs` / `msx_vdp_status` /
127
+ `msx_palette` / `msx_cpu_regs` / `msx_psg_regs` / `system_ram` regions). The
128
+ PSG-channel decode means `audioDebug({op:'inspect', chip:'ay8910'})` gives
129
+ you the 3 square-wave channels plus the shared noise generator and envelope
130
+ without poking at `msx_psg_regs` by hand.
131
+
132
+ ### ColecoVision shares this core family — but is bring-up only
133
+
134
+ ColecoVision runs the same toolchain family and exposes only the **standard**
135
+ introspection: `system_ram` + `save_ram` + `video_ram`. It has **no deep
136
+ inspectors** (no `palette` / `sprites` / `background` / `audioDebug` decode)
137
+ and **no MENTAL_MODEL of its own** — treat it as a bring-up target, not a
138
+ finished Tier-1 platform.
139
+
140
+ ### Extending introspection (for whoever adds a platform)
141
+
142
+ Deeper, decoded inspectors are not free — each is implemented by **patching
143
+ the emulator core** to expose the extra register/VRAM regions, then wiring a
144
+ decoder. To add deep introspection to ColecoVision (or any thin platform),
145
+ follow the existing core-patch pattern used for snes9x / gpgx / fceumm / vice
146
+ under **`scripts/patches/`**.
@@ -25,6 +25,13 @@ The cc65 runtime claims:
25
25
  - ZP $1C+ available to your game (with our chr-ram crt0)
26
26
  - `$0500-$07FF` (3 pages): cc65 C parameter stack
27
27
 
28
+ > **cc65 zero-page starts at $02, not $00 (applies to every cc65 platform —
29
+ > NES, C64, Atari, Lynx, …).** cc65 reserves `$00-$01` for its runtime, so your
30
+ > first `.res 1` in the `ZEROPAGE` segment lands at **$02**, not $00. If you
31
+ > hand-write asm that assumes a zero-page var is at $00 you'll clobber the
32
+ > runtime. Confirm actual addresses with `symbols({op:'map'})` after
33
+ > `build({output:'romWithDebug'})`.
34
+
28
35
  ## PPU memory map (separate from CPU bus!)
29
36
 
30
37
  ```
@@ -94,6 +101,15 @@ The `nes_runtime` helper `tile_set_palette(nt, x, y, palette)` does
94
101
  the read-modify-write dance and the bit-twiddling — use it instead
95
102
  of writing attributes by hand.
96
103
 
104
+ > **256-tile cap per pattern table (the busy-image trap).** The nametable's
105
+ > tile index is 8-bit, so a single pattern table holds at most **256 unique
106
+ > tiles** — and a per-frame BG can therefore use at most 256 distinct tiles.
107
+ > Auto-converting a busy full-screen illustration almost always needs more than
108
+ > 256 unique 8×8 tiles and **overflows**; `encodeArt({stage:'tilemap'})` warns
109
+ > when it does. The only real workaround is mid-frame CHR bank switching
110
+ > (an MMC3-class mapper) — the bundled NROM presets can't do it, so design BG
111
+ > art to reuse tiles (≤256 unique per table).
112
+
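To see whether a converted image fits before building, you can count its unique tiles on the host. The sketch below is a plain brute-force check, assuming the image has already been cut into 16-byte 2bpp NES tiles (a full 32×30 screen is 960 of them). The function names are hypothetical, and this is not how `encodeArt` is implemented.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TILE_BYTES        16u   /* one 8x8 tile at 2 bits per pixel */
#define PATTERN_TABLE_MAX 256u  /* 8-bit nametable index */

/* Count distinct tiles in `tiles` (tile_count * 16 bytes).
   O(n^2) compare, fine for a 960-tile screen. */
static size_t count_unique_tiles(const uint8_t *tiles, size_t tile_count)
{
    size_t unique = 0;
    for (size_t i = 0; i < tile_count; i++) {
        size_t j;
        for (j = 0; j < i; j++) {
            if (memcmp(tiles + i * TILE_BYTES,
                       tiles + j * TILE_BYTES, TILE_BYTES) == 0)
                break;  /* tile i duplicates an earlier tile */
        }
        if (j == i)
            unique++;
    }
    return unique;
}

/* Nonzero when the image fits one pattern table. */
static int fits_pattern_table(const uint8_t *tiles, size_t tile_count)
{
    return count_unique_tiles(tiles, tile_count) <= PATTERN_TABLE_MAX;
}
```

If the count comes back above 256, simplify the art or reuse tiles rather than expecting the NROM presets to display it.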
97
113
  ## Palettes
98
114
 
99
115
  32 bytes at $3F00-$3F1F:
@@ -329,6 +345,25 @@ incorrectly aligned."
329
345
  scrolling worlds you need to manage the nametable buffer + bank
330
346
  switching yourself.
331
347
 
348
+ ## MCP debug & inspection tooling
349
+
350
+ The shipped fceumm core is patched for live introspection — read state
351
+ instead of guessing:
352
+
353
+ - **Sprites:** `sprites({op:'inspect'})` decodes live OAM.
354
+ - **Palette:** `palette({source:'live'})` reads the live 32-byte palette RAM.
355
+ - **CPU:** `cpu({op:'read'})` reads the 6502.
356
+ - **Background render state:** `background({view:'renderState'})` decodes
357
+ PPUCTRL/PPUMASK and resolves the active CHR bank (plus its file offset) —
358
+ this is what tells you which pattern table BG vs sprites are fetching from
359
+ (the bit-4 footgun above).
360
+ - **Memory regions:** `memory({op:'read'})` exposes OAM, Palette,
361
+ Nametables (CIRAM — including the 2-bit-per-16x16 attribute data that
362
+ selects each tile group's sub-palette, decoded by `inspectBackgroundMap`),
363
+ CHR (live MMC1-banked CHR — don't parse the iNES file), CPU_REGS,
364
+ PPU_REGS, and APU_REGS (the synthesized $4000-$4017 snapshot consumed by
365
+ `audioDebug`).
366
+
332
367
  ## Rebuilding a CHR-ROM NROM image (reverse-engineering)
333
368
 
334
369
  The homebrew presets above are CHR-**RAM** (the CPU uploads tiles at runtime).