romdevtools 0.22.1 → 0.24.0
- package/AGENTS.md +169 -494
- package/CHANGELOG.md +103 -0
- package/examples/genesis/templates/platformer.c +5 -1
- package/examples/genesis/templates/two_plane_parallax.c +166 -0
- package/package.json +2 -2
- package/src/cores/wasm/vice_x64_libretro.js +1 -1
- package/src/cores/wasm/vice_x64_libretro.wasm +0 -0
- package/src/host/LibretroHost.js +225 -2
- package/src/host/framebuffer.js +37 -0
- package/src/http/skill-doc.js +1 -1
- package/src/mcp/tools/audio.js +2 -2
- package/src/mcp/tools/frame.js +13 -34
- package/src/mcp/tools/index.js +2 -2
- package/src/mcp/tools/input-layout.js +10 -0
- package/src/mcp/tools/input.js +26 -2
- package/src/mcp/tools/metasprite-tools.js +1 -1
- package/src/mcp/tools/platform-tools.js +18 -11
- package/src/mcp/tools/playtest.js +17 -2
- package/src/mcp/tools/project.js +9 -1
- package/src/mcp/tools/rendering-context.js +1 -1
- package/src/mcp/tools/symbols.js +130 -39
- package/src/mcp/tools/tile-inspect.js +1 -1
- package/src/mcp/tools/toolchain.js +3 -2
- package/src/mcp/tools/watch-memory.js +58 -6
- package/src/platforms/_guides/ROMHACKING_PLAYBOOK.md +155 -6
- package/src/platforms/atari2600/MENTAL_MODEL.md +37 -0
- package/src/platforms/atari7800/MENTAL_MODEL.md +36 -0
- package/src/platforms/c64/MENTAL_MODEL.md +83 -6
- package/src/platforms/gb/MENTAL_MODEL.md +74 -0
- package/src/platforms/gb/lib/c/SDCC_GOTCHAS.md +91 -0
- package/src/platforms/gba/MENTAL_MODEL.md +57 -3
- package/src/platforms/gba/lib/arm-archives/libc.a +0 -0
- package/src/platforms/gba/lib/arm-archives/libgcc.a +0 -0
- package/src/platforms/gba/lib/arm-archives/libnosys.a +0 -0
- package/src/platforms/gbc/MENTAL_MODEL.md +34 -0
- package/src/platforms/gbc/lib/c/SDCC_GOTCHAS.md +91 -0
- package/src/platforms/genesis/MENTAL_MODEL.md +180 -0
- package/src/platforms/genesis/TROUBLESHOOTING.md +32 -0
- package/src/platforms/genesis/lib/c/libc.a +0 -0
- package/src/platforms/genesis/lib/c/libgcc.a +0 -0
- package/src/platforms/genesis/lib/c/libm.a +0 -0
- package/src/platforms/gg/MENTAL_MODEL.md +24 -0
- package/src/platforms/gg/lib/c/gg_crt0.s +30 -0
- package/src/platforms/lynx/MENTAL_MODEL.md +33 -7
- package/src/platforms/msx/MENTAL_MODEL.md +27 -0
- package/src/platforms/nes/MENTAL_MODEL.md +35 -0
- package/src/platforms/sms/MENTAL_MODEL.md +51 -0
- package/src/platforms/sms/lib/c/sms_crt0.s +40 -0
- package/src/platforms/snes/MENTAL_MODEL.md +21 -0
- package/src/platforms/snes/TROUBLESHOOTING.md +43 -0
- package/src/playtest/playtest.js +48 -0
- package/src/toolchains/sdcc/preflight-lint.js +164 -8
- package/examples/msx/catch_game/_verify.mjs +0 -93
- package/examples/pce/catch_game/_verify.mjs +0 -75
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
@@ -47,6 +47,19 @@ the same wall.
|
|
|
47
47
|
`s__DATA` for `l__DATA` bytes; bring-your-own crt0 should do the
|
|
48
48
|
same.
|
|
49
49
|
|
|
50
|
+
6. **Don't poke a hardcoded `$C0xx` WRAM pointer for game state — it
|
|
51
|
+
overlaps your statics.** SDCC links the C runtime's data + BSS (every
|
|
52
|
+
`static` global: your PRNG seed, your grids, your scores) at the BOTTOM
|
|
53
|
+
of WRAM starting `$C000`. A `volatile uint8_t *board = (uint8_t*)0xC000;`
|
|
54
|
+
then scribbles right over `static uint32_t rng = ...;` et al. Symptom
|
|
55
|
+
looks exactly like an SDCC *codegen* bug — e.g. a 32-bit xorshift PRNG
|
|
56
|
+
that "degenerates" so every roll is identical (its seed is being
|
|
57
|
+
clobbered, not miscompiled). **Use a `static` array and let the linker
|
|
58
|
+
place it** (`static uint8_t board[78]; board[i]=p;`), or hardcode at
|
|
59
|
+
`$C200`+ and confirm with the linker map (`build({includeSymbols:true})`
|
|
60
|
+
→ check `s__DATA`/`s__BSS`). Full write-up + repro in
|
|
61
|
+
`lib/c/SDCC_GOTCHAS.md` § "sm83 codegen traps in plain game logic".
|
|
62
|
+
|
|
50
63
|
- **Two VRAM banks** (switched via VBK at $FF4F) — bank 0 holds tile
|
|
51
64
|
pattern data, bank 1 holds per-tile BG attributes (palette index,
|
|
52
65
|
H/V flip, priority, tile bank).
|
|
@@ -61,6 +74,27 @@ the same wall.
|
|
|
61
74
|
- **HDMA** ($FF51-$FF55) for fast block transfers during HBlank —
|
|
62
75
|
used for live tile streaming.
|
|
63
76
|
|
|
77
|
+
## MCP debug & inspection tooling
|
|
78
|
+
|
|
79
|
+
GBC shares the patched gambatte core with DMG, so **all the live inspectors
|
|
80
|
+
and `gb_*` memory regions documented in the GB MENTAL_MODEL apply unchanged
|
|
81
|
+
here** — `sprites({op:'inspect'})`, `tiles({op:'png'})`, `cpu({op:'read'})`,
|
|
82
|
+
`audioDebug({op:'inspect', chip:'gb'})`, and the `gb_vram` / `gb_oam` / `gb_io`
|
|
83
|
+
/ `gb_hram` / `gb_cpu_regs` regions (same gotcha: it's `gb_vram`, NOT the
|
|
84
|
+
generic `video_ram`). Disassembly routes through the same `-m gbz80` objdump.
|
|
85
|
+
See the GB MENTAL_MODEL for the shared gambatte debug tooling.
|
|
86
|
+
|
|
87
|
+
CGB-only deltas on top of that shared set:
|
|
88
|
+
|
|
89
|
+
- **`palette({source:'live'})`** on a CGB ROM decodes the **64-byte BCPS/OCPS
|
|
90
|
+
palette RAM** into **8 palettes × 4 colors in BGR555** (the DMG path that
|
|
91
|
+
decodes BGP/OBP0/OBP1 bytes is what runs on a `gb` build instead). The raw
|
|
92
|
+
CGB palette RAM is also readable directly via the **`gb_bgpdata`** (BG, 64
|
|
93
|
+
bytes) and **`gb_objpdata`** (OBJ, 64 bytes) memory regions.
|
|
94
|
+
- **`background({view:'renderState'})`** reports the CGB extras the DMG path
|
|
95
|
+
doesn't have: the current **VRAM bank** (VBK), **KEY1** (double-speed state),
|
|
96
|
+
and the live **BCPS/OCPS palette index**.
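For reading `gb_bgpdata` / `gb_objpdata` by hand, the 64-byte layout described above (8 palettes × 4 colors, 2 bytes each) can be decoded with a few lines of host-side C. This is a minimal sketch, not the tool's own decoder: it assumes the usual CGB encoding (little-endian 15-bit value, red in bits 0-4, green in bits 5-9, blue in bits 10-14), and the names `Rgb8`, `expand5`, and `cgb_decode_color` are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb8;

/* Expand a 5-bit channel (0..31) to 8 bits (0..255) by bit replication. */
static uint8_t expand5(uint8_t v) {
    return (uint8_t)((v << 3) | (v >> 2));
}

/* Decode one CGB colour (palette 0..7, colour 0..3) from a 64-byte dump.
 * Assumed encoding: little-endian, R = bits 0-4, G = bits 5-9, B = bits 10-14. */
static Rgb8 cgb_decode_color(const uint8_t pal_ram[64],
                             unsigned palette, unsigned color) {
    assert(palette < 8 && color < 4);
    size_t off = (palette * 4u + color) * 2u;
    uint16_t raw = (uint16_t)(pal_ram[off] | (pal_ram[off + 1] << 8));
    Rgb8 out = {
        expand5((uint8_t)(raw & 0x1F)),
        expand5((uint8_t)((raw >> 5) & 0x1F)),
        expand5((uint8_t)((raw >> 10) & 0x1F)),
    };
    return out;
}
```

Bit replication is one common way to widen 5-bit channels; an emulator front end may apply a different colour correction, so expect small differences against screenshots.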
|
|
97
|
+
|
|
64
98
|
## CGB vs DMG mode
|
|
65
99
|
|
|
66
100
|
The CGB boot ROM checks header byte **`$0143`**:
|
|
@@ -188,3 +188,94 @@ build({
|
|
|
188
188
|
},
|
|
189
189
|
})
|
|
190
190
|
```
|
|
191
|
+
|
|
192
|
+
## sm83 codegen traps in plain game logic (WRAM integer/array code)
|
|
193
|
+
|
|
194
|
+
Every footgun above is about VRAM / OAM-DMA / the cart header — the stuff
|
|
195
|
+
that makes sprites vanish. This section is the opposite: **plain WRAM game
|
|
196
|
+
logic** — PRNGs, collision grids, score math. Two such "miscompiles" were
|
|
197
|
+
reported from a real GBC Columns build session and chased to ground here.
|
|
198
|
+
**Verdict: neither was an sm83 codegen bug.** They are documented so you
|
|
199
|
+
don't burn hours blaming the compiler for what is actually a memory-layout
|
|
200
|
+
or static-init trap.
|
|
201
|
+
|
|
202
|
+
### NOT a bug: 32-bit math / `uint32_t` shifts ≥ 16
|
|
203
|
+
|
|
204
|
+
Reported: *"`static uint32_t rng=0x1357; rng ^= rng<<13; rng ^= rng>>17;
|
|
205
|
+
rng ^= rng<<5;` degenerates — every `1+xorshift()%6` roll comes out the
|
|
206
|
+
same (near-monochrome)."*
|
|
207
|
+
|
|
208
|
+
**Reproduced on sm83: it does NOT degenerate.** A ROM that seeds the PRNG,
|
|
209
|
+
calls `xorshift()` 20×, and writes `1 + (result % 6)` to WRAM reads back a
|
|
210
|
+
fully-varied `5,5,5,1,5,5,4,1,3,2,1,...` — the exact sequence a reference
|
|
211
|
+
implementation produces. Full 32-bit fidelity was confirmed byte-for-byte
|
|
212
|
+
across several seeds (`0xDEADBEEF`, `0x00000001`, …). The `<<13` / `>>17` /
|
|
213
|
+
`<<5` shifts (including the ≥16-bit right shift) and `% 6` are all correct.
|
|
214
|
+
**Do not rewrite a working 32-bit xorshift into 16-bit to "dodge" this.**
|
|
215
|
+
32-bit ops are bigger/slower than 16-bit on an 8-bit CPU, so prefer 16-bit
|
|
216
|
+
PRNGs for *speed* — but not for correctness; both are correct.
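To check a ROM's output against a host-side reference, the generator quoted in the report can be written as portable C and run on a PC. This is a sketch of such a reference, not the harness used for the repro above; the function names `xorshift32_step` and `roll_d6` are illustrative.

```c
#include <assert.h>
#include <stdint.h>

/* One xorshift32 step with the 13/17/5 shift triple quoted in the report. */
static uint32_t xorshift32_step(uint32_t state) {
    assert(state != 0);  /* zero is a fixed point: 0 stays 0 forever */
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
}

/* Advance *state and return a roll in 1..6, as in `1 + xorshift() % 6`. */
static uint8_t roll_d6(uint32_t *state) {
    *state = xorshift32_step(*state);
    return (uint8_t)(1u + (*state % 6u));
}
```

Starting from seed `0x1357`, the state after one step is `0x4F34BE22` and the first two rolls are 5 and 5, which agrees with the start of the sequence quoted above. The `assert` documents why a seed that boots as 0 produces identical rolls forever.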
|
|
217
|
+
|
|
218
|
+
### The REAL trap behind "monochrome RNG": writing game state to a fixed
|
|
219
|
+
`0xC0xx` WRAM address that overlaps your statics
|
|
220
|
+
|
|
221
|
+
This is what actually produces the reported symptom. SDCC links the C
|
|
222
|
+
runtime's `_DATA` / `_INITIALIZED` segment (every value-initialised
|
|
223
|
+
`static`, e.g. `static uint32_t rng = 0x1357;`) **at the very bottom of
|
|
224
|
+
WRAM, starting `$C000`**, with `_BSS` (zero-init statics like
|
|
225
|
+
`static uint8_t grid[78];`) right after it. If your code also pokes a
|
|
226
|
+
**hardcoded** `$C000`-area pointer for game state —
|
|
227
|
+
|
|
228
|
+
```c
|
|
229
|
+
volatile uint8_t *board = (volatile uint8_t *)0xC000; /* DON'T */
|
|
230
|
+
board[i] = piece; /* clobbers `rng` and friends! */
|
|
231
|
+
```
|
|
232
|
+
|
|
233
|
+
— you are scribbling directly over your own statics. Then `xorshift()`
|
|
234
|
+
reads a trashed `rng`, the PRNG collapses, and every roll looks the same.
|
|
235
|
+
It presents *exactly* like a compiler bug; it is not.
|
|
236
|
+
|
|
237
|
+
**Fixes (any one):**
|
|
238
|
+
- **Best — let the linker place it.** Use a `static` array and take its
|
|
239
|
+
address; never hardcode a WRAM pointer:
|
|
240
|
+
`static uint8_t board[6*13]; ... board[i] = piece;`
|
|
241
|
+
- If you *must* use a fixed address, put it well clear of the runtime data:
|
|
242
|
+
`$C200`+ is safe for small projects (statics here end far below `$C100`;
|
|
243
|
+
`shadow_oam` is pinned at `$C100`). Confirm with the linker map — build
|
|
244
|
+
with `includeSymbols:true` and look at `s__DATA` / `s__BSS` (e.g.
|
|
245
|
+
`s__DATA = $C000`, `s__BSS = $C006`): your scratch RAM must start ABOVE
|
|
246
|
+
the end of `_BSS`.
|
|
247
|
+
- **Diagnose it in seconds:** read `system_ram` offset 0 right after boot
|
|
248
|
+
and compare against your initialised statics' expected bytes. If a
|
|
249
|
+
`static uint32_t x = 0x1357;` doesn't read back `57 13 00 00` at its map
|
|
250
|
+
address, something is overwriting it.
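The read-back comparison in the last bullet can be automated on the host once you have a `system_ram` dump as a byte array. A minimal sketch, assuming the dump is already in memory; `static_u32_intact` is a hypothetical helper name, and the only platform fact it relies on is that sm83 stores multi-byte values little-endian.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if the 4 bytes at `offset` in a RAM dump hold `expected`,
 * stored little-endian (least significant byte first). */
static bool static_u32_intact(const uint8_t *dump, size_t dump_len,
                              size_t offset, uint32_t expected) {
    assert(offset + 4 <= dump_len);
    for (unsigned i = 0; i < 4; i++) {
        if (dump[offset + i] != (uint8_t)(expected >> (8 * i))) {
            return false;  /* something overwrote this static */
        }
    }
    return true;
}
```

Pass the static's offset from the linker map (its address minus `$C000`) as `offset`. A `false` result right after boot points at the crt0; a `false` result that only appears after gameplay starts points at a stray pointer write.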
|
|
251
|
+
|
|
252
|
+
### NOT a bug: short `for` loop with an indexed `static` array read
|
|
253
|
+
|
|
254
|
+
Reported: *"`for(i=0;i<3;i++){ if(grid[r*6+col]) return 1; }` reads the
|
|
255
|
+
wrong cells (pieces lock mid-air / floating gaps); unrolling the 3
|
|
256
|
+
iterations fixed it."*
|
|
257
|
+
|
|
258
|
+
**Reproduced on sm83: the looped form reads the CORRECT cells.** A ROM that
|
|
259
|
+
seeds `grid[]` with a sparse occupied/empty pattern and runs `collides()`
|
|
260
|
+
both looped and hand-unrolled, for 8 straddling `(col,topy)` inputs, gets
|
|
261
|
+
**identical, correct** results from both forms (`1,0,1,0,1,1,1,1`). The
|
|
262
|
+
`grid[r*6+col]` index math and the 3-iteration loop are fine. If your real
|
|
263
|
+
collision check "floats," look first at the WRAM-collision trap above (a
|
|
264
|
+
clobbered `grid[]`), at off-by-one row/col limits, or at signed/unsigned
|
|
265
|
+
mix-ups — not at loop codegen. **Don't pre-emptively unroll loops as a
|
|
266
|
+
compiler workaround; with the stack-overflow fix in place, sm83 loops with
|
|
267
|
+
indexed array reads are reliable.**
|
|
268
|
+
|
|
269
|
+
### z80 (SMS/GG) ONLY — fixed: value-initialised statics booted as 0
|
|
270
|
+
|
|
271
|
+
Investigating the above on the **z80** port (SMS/GG share the SDCC family)
|
|
272
|
+
surfaced a real bug — but a **crt0** bug, not codegen. The bundled
|
|
273
|
+
`sms_crt0.s` / `gg_crt0.s` placed `_INITIALIZER` (the ROM image of
|
|
274
|
+
value-initialised statics) *after* the `_DATA` RAM block in the area list,
|
|
275
|
+
so sdld put it in RAM; the gsinit `ldir` then copied uninitialised RAM onto
|
|
276
|
+
itself and **every `static uint8_t x = 5;` booted as 0** (and BSS wasn't
|
|
277
|
+
zeroed either). On z80 *this* is what made the xorshift PRNG monochrome
|
|
278
|
+
(seed `rng` booted 0 → stayed 0). Fixed 2026-06-08 by ROM-placing
|
|
279
|
+
`_INITIALIZER` + adding a `_DATA` zero loop, mirroring this sm83 crt0 (which
|
|
280
|
+
was already correct — hence sm83 was never affected). If you bring your own
|
|
281
|
+
z80 crt0, model gsinit on `gb_crt0.s`.
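The failure mode is easier to see in a small C model of what gsinit does. This is a simplified sketch over a flat byte array, not a translation of the shipped crt0; `gsinit_model` and its parameter names are illustrative and only loosely mirror the linker symbols.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Model of the boot-time data setup a crt0 performs before main():
 * 1. zero the uninitialised statics (the `_DATA` area),
 * 2. copy the ROM image (`_INITIALIZER`) over the RAM copy (`_INITIALIZED`).
 * All positions are offsets into one flat `mem` array. */
static void gsinit_model(uint8_t *mem,
                         size_t s_data, size_t l_data,
                         size_t s_initializer, size_t s_initialized,
                         size_t l_initializer) {
    assert(mem != NULL);
    memset(mem + s_data, 0, l_data);
    /* memmove keeps the model well-defined even when the two areas are
     * adjacent or overlapping, which is exactly the buggy layout. */
    memmove(mem + s_initialized, mem + s_initializer, l_initializer);
}
```

With `s_initializer` pointing into ROM, a `static uint8_t x = 5;` reads back 5 after the copy. With `s_initializer` pointing into RAM just past `_INITIALIZED`, the copy source holds whatever RAM contained at power-on, so in an emulator that zeroes RAM the static reads back 0.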
|
|
@@ -161,6 +161,167 @@ while (1) {
|
|
|
161
161
|
your sprite updates never appear on screen. It's the single most
|
|
162
162
|
important call in any SGDK game loop.
|
|
163
163
|
|
|
164
|
+
## Scrolling, parallax & the feel trap ⭐
|
|
165
|
+
|
|
166
|
+
This is the section to read before you build a side-scroller. The #1
|
|
167
|
+
"my horizontal movement feels choppy/juddery" bug on Genesis is a
|
|
168
|
+
software mistake, not a hardware limit:
|
|
169
|
+
|
|
170
|
+
> ### ⚠️ DO NOT rewrite full tilemaps in the frame loop.
|
|
171
|
+
> The Genesis scrolls in HARDWARE. Moving the world is **two register
|
|
172
|
+
> writes** (`VDP_setHorizontalScroll`), which are free. If instead you
|
|
173
|
+
> redraw the plane each frame (a big `VDP_setTileMapXY`/`VDP_loadTileMap`
|
|
174
|
+
> burst or a per-frame DMA), you overrun vblank, drop frames, and the
|
|
175
|
+
> scroll judders. **Paint the planes ONCE at setup; the loop only nudges
|
|
176
|
+
> scroll registers and re-stages sprites.** Use the
|
|
177
|
+
> `template:"two_plane_parallax"` scaffold as the known-good shape.
|
|
178
|
+
|
|
179
|
+
### Hardware scroll, the whole loop
|
|
180
|
+
|
|
181
|
+
A two-plane parallax scroller's *entire* per-frame render cost is:
|
|
182
|
+
|
|
183
|
+
```c
|
|
184
|
+
VDP_setHorizontalScroll(BG_A, -camX); // foreground: 1:1 with world
|
|
185
|
+
VDP_setHorizontalScroll(BG_B, -(camX >> 4)); // background: 1/16 speed = far depth
|
|
186
|
+
/* ...stage sprites in SCREEN space... */
|
|
187
|
+
VDP_setSprite(0, playerScreenX, playerY, SPRITE_SIZE(2,2), attr);
|
|
188
|
+
VDP_updateSprites(1, DMA); // flush the SAT
|
|
189
|
+
SYS_doVBlankProcess(); // flush DMA queue, sync vblank
|
|
190
|
+
```
|
|
191
|
+
|
|
192
|
+
No `VDP_setTileMapXY` / `VDP_fillTileMapRect` / `VDP_loadTileMap` in the
|
|
193
|
+
loop. Those are SETUP calls (and tiny one-off updates — a coin that
|
|
194
|
+
vanishes, a door that opens). They are NOT for whole-plane runtime
|
|
195
|
+
redraws. Positive `camX` scrolls the plane LEFT, so you write the
|
|
196
|
+
NEGATIVE camera offset. `VDP_setVerticalScroll` is the vertical twin
|
|
197
|
+
(it writes VSRAM — see `genesis_vsram`).
|
|
198
|
+
|
|
199
|
+
### Logical plane size vs HARDWARE plane size
|
|
200
|
+
|
|
201
|
+
A common confusion: **the Genesis has ONE shared plane-size setting for
|
|
202
|
+
BOTH planes A and B** (VDP register 16). You pick 32×32 / 64×32 / 32×64 /
|
|
203
|
+
64×64 *cells* once; you do NOT get an independent size per plane. So a
|
|
204
|
+
"32-cell-wide level" still lives inside a 64-cell **physical** plane if
|
|
205
|
+
that's the hardware size you set — the extra cells are just offscreen
|
|
206
|
+
buffer. The scroll value wraps within the physical plane
|
|
207
|
+
(64 cells = 512 px), which is exactly what makes a fully-painted plane
|
|
208
|
+
tile forever with no redraw. Don't fight this: pick a hardware plane
|
|
209
|
+
size and treat your logical world coords separately.
|
|
210
|
+
|
|
211
|
+
| Plane size (cells) | Pixels | Use |
|
|
212
|
+
|--------------------|----------|--------------------------------------|
|
|
213
|
+
| 32×32 | 256×256 | single-screen / small wrap |
|
|
214
|
+
| **64×32** (default)| 512×256 | horizontal scroller (one plane wide) |
|
|
215
|
+
| 32×64 | 256×512 | vertical scroller |
|
|
216
|
+
| 64×64 | 512×512 | uses the most VRAM for name tables |
|
|
217
|
+
|
|
218
|
+
### How Sonic-style large maps REALLY work (wider than one plane)
|
|
219
|
+
|
|
220
|
+
You do NOT make the plane "as wide as the level," and you do NOT redraw
|
|
221
|
+
the plane. The 64-cell hardware plane is a **circular buffer**: as the
|
|
222
|
+
camera advances, the column scrolling OFF the left re-appears on the
|
|
223
|
+
right (the scroll wraps mod 512 px). You keep the visible window full by
|
|
224
|
+
updating exactly **ONE offscreen column** each time the camera crosses
|
|
225
|
+
an 8-px tile boundary:
|
|
226
|
+
|
|
227
|
+
```c
|
|
228
|
+
// camX in pixels; world is an array wider than 512 px.
|
|
229
|
+
s16 newTileCol = camX >> 3;
|
|
230
|
+
if (newTileCol != lastTileCol) {
|
|
231
|
+
// the column about to enter view on the right edge:
|
|
232
|
+
s16 worldCol = (camX + SCREEN_W) >> 3;
|
|
233
|
+
s16 planeCol = worldCol & 63; // wrap into the 64-cell plane
|
|
234
|
+
drawWorldColumn(planeCol, worldCol); // ONE column, ~28 cells — tiny
|
|
235
|
+
lastTileCol = newTileCol;
|
|
236
|
+
}
|
|
237
|
+
```
|
|
238
|
+
|
|
239
|
+
That's ~28 tile writes per 8 px of travel, not a 1792-cell plane redraw.
|
|
240
|
+
The `template:"platformer"` scaffold scrolls within one plane (no
|
|
241
|
+
streaming); add the column-stream above to go wider. (Real Sonic also
|
|
242
|
+
splits the screen with H-blank raster effects for independent strips —
|
|
243
|
+
that's an IRQ/raster topic, see the `asm` template.)
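The column-stream bookkeeping can be checked on a PC before it goes anywhere near SGDK. The sketch below is a host-side model under stated assumptions: a 64-cell-wide plane, a 320-px visible width (H40), rightward-only motion, and a stub `draw_world_column` that only counts calls. The `ColumnStreamer` struct and both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define PLANE_W_CELLS 64   /* hardware plane width in cells (power of two) */
#define SCREEN_W_PX   320  /* visible width in pixels, assuming H40 mode */

typedef struct {
    int16_t  last_tile_col;   /* camX >> 3 at the previous update */
    uint32_t columns_drawn;   /* how many single-column updates were issued */
    int16_t  last_plane_col;  /* plane column most recently rewritten */
} ColumnStreamer;

/* Stand-in for the real tile writes: it just records the call. */
static void draw_world_column(ColumnStreamer *cs, int16_t plane_col,
                              int16_t world_col) {
    (void)world_col;
    cs->columns_drawn++;
    cs->last_plane_col = plane_col;
}

/* Call once per frame with the new camera X (pixels, moving right). */
static void stream_columns(ColumnStreamer *cs, int16_t cam_x) {
    assert(cam_x >= 0);
    int16_t new_tile_col = (int16_t)(cam_x >> 3);
    if (new_tile_col != cs->last_tile_col) {
        int16_t world_col = (int16_t)((cam_x + SCREEN_W_PX) >> 3);
        int16_t plane_col = (int16_t)(world_col & (PLANE_W_CELLS - 1));
        draw_world_column(cs, plane_col, world_col);
        cs->last_tile_col = new_tile_col;
    }
}
```

Scrolling 16 px at 1 px per frame issues exactly two column updates. Note the model updates at most one column per call, so a camera that moves more than 8 px in a frame would skip columns; a real implementation would loop over every tile column crossed.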
|
|
244
|
+
|
|
245
|
+
## Why does horizontal movement feel choppy? — motion-trace it headlessly ⭐
|
|
246
|
+
|
|
247
|
+
When movement feels off, don't trial-and-error with screenshots. Sample
|
|
248
|
+
the player's world-X, the camera scroll, and the actual VDP scroll
|
|
249
|
+
values over ~180 frames while holding a direction, and read the curve.
|
|
250
|
+
Two signatures to look for:
|
|
251
|
+
|
|
252
|
+
1. **Camera scroll changes while the sprite's screen-X barely moves**
|
|
253
|
+
(or vice-versa) → your camera-follow math is off; the world slides
|
|
254
|
+
under a frozen-looking player, or the player slides on a frozen world.
|
|
255
|
+
2. **Scroll JUMPS** (non-monotone, big steps) → you're scrolling by a
|
|
256
|
+
non-constant amount per frame (variable-rate camera, or you only
|
|
257
|
+
update scroll on a tile boundary instead of every frame).
|
|
258
|
+
|
|
259
|
+
The exact call — hold RIGHT, sample player-X + both planes' HSCROLL +
|
|
260
|
+
VSRAM over 180 frames. Expose the player/camera vars as `volatile`
|
|
261
|
+
globals so they resolve (see "Reading your C globals headlessly"); the
|
|
262
|
+
HSCROLL table lives in VRAM (`video_ram`), default base **$F000**
|
|
263
|
+
(`frame({op:'verify'})`'s render summary prints "H-scroll table: $Fxxx"):
|
|
264
|
+
|
|
265
|
+
```js
|
|
266
|
+
b = build({output:'romWithDebug', platform:'genesis', source, inline:true,
|
|
267
|
+
resolveSymbols:['g_player_x','g_cam_x']})
|
|
268
|
+
// → resolvedSymbols.g_player_x.ramOffset (system_ram offset)
|
|
269
|
+
recordSession({
|
|
270
|
+
frames:180, sampleEvery:10, includeScreenshots:false,
|
|
271
|
+
holdInputs:[{right:true}],
|
|
272
|
+
memorySamples:[
|
|
273
|
+
{label:'player_x', region:'system_ram', offset: PLAYER_X_OFF, length:2},
|
|
274
|
+
{label:'cam_x', region:'system_ram', offset: CAM_X_OFF, length:2},
|
|
275
|
+
{label:'hscrollA', region:'video_ram', offset:0xF000, length:2},
|
|
276
|
+
{label:'hscrollB', region:'video_ram', offset:0xF002, length:2},
|
|
277
|
+
{label:'vsram', region:'genesis_vsram', offset:0, length:4},
|
|
278
|
+
],
|
|
279
|
+
})
|
|
280
|
+
```
|
|
281
|
+
|
|
282
|
+
Read the columns: `player_x` should ramp smoothly; `hscrollA` should
|
|
283
|
+
move 1:1 with the camera and `hscrollB` at the parallax ratio; both
|
|
284
|
+
should be **monotone** (no jumps) while RIGHT is held. ⚠ Genesis WRAM +
|
|
285
|
+
VRAM read **word-byte-swapped** in gpgx (a 16-bit `0x00F0` reads as
|
|
286
|
+
bytes `F0 00`) — account for the swap, or read single bytes. For a
|
|
287
|
+
compact value-vs-frame curve of just the HSCROLL table use
|
|
288
|
+
`watch({on:'mem', region:'video_ram', offset:0xF000, length:4,
|
|
289
|
+
format:'series', pressDuring:[{frame:0, button:'right', holdFrames:180}]})`.
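When post-processing sampled bytes on the host, the swap described above amounts to treating each 16-bit word as low-byte-first. A small sketch, assuming the samples arrive as a plain byte array; `read_swapped_u16` and `read_swapped_s16` are illustrative names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode a 16-bit value from a dump whose bytes are swapped within each
 * 16-bit word: the low byte comes first, so 0x00F0 appears as F0 00. */
static uint16_t read_swapped_u16(const uint8_t *dump, size_t offset) {
    assert((offset & 1u) == 0);  /* words start on even offsets */
    return (uint16_t)(dump[offset] | (dump[offset + 1] << 8));
}

/* Scroll values written as -camX are signed: -16 is stored as 0xFFF0. */
static int16_t read_swapped_s16(const uint8_t *dump, size_t offset) {
    uint16_t raw = read_swapped_u16(dump, offset);
    return (raw & 0x8000u) ? (int16_t)((int32_t)raw - 0x10000)
                           : (int16_t)raw;
}
```

Use the signed variant for `hscrollA` / `hscrollB` so a camera at 16 px decodes as -16 rather than 65520, which keeps the curve readable.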
|
|
290
|
+
|
|
291
|
+
## Is the loop doing too much VDP work? — per-frame DMA budget ⭐
|
|
292
|
+
|
|
293
|
+
The render-side cause of choppy scroll is **too many VDP/DMA bytes per
|
|
294
|
+
frame** (a tilemap rewrite, an asset re-upload). Measure it directly,
|
|
295
|
+
no core rebuild:
|
|
296
|
+
|
|
297
|
+
```js
|
|
298
|
+
watch({on:'dma', perFrame:true, frames:120,
|
|
299
|
+
pressDuring:[{frame:0, button:'right', holdFrames:120}]})
|
|
300
|
+
```
|
|
301
|
+
|
|
302
|
+
returns a per-frame timeline `[{frame, dmas, bytes, romBytes, ramBytes}]`
|
|
303
|
+
plus `avgBytesPerFrame`, `peakFrame`/`peakBytes`, and `spikes`.
|
|
304
|
+
|
|
305
|
+
- A **smooth hardware-scroll loop** shows a LOW, FLAT curve — after boot
|
|
306
|
+
it's mostly the steady SAT/scroll refresh (`ramBytes`, single/low
|
|
307
|
+
double digits per frame).
|
|
308
|
+
- A **`spikes` entry** (bytes ≫ average, especially `romBytes` — an
|
|
309
|
+
asset upload FROM cart ROM) is the "I rewrote a tilemap / re-uploaded
|
|
310
|
+
tiles in the frame loop" smell. Move that work to setup, or stream
|
|
311
|
+
ONE column per 8-px scroll step (above).
|
|
312
|
+
|
|
313
|
+
**CEILING / what this does NOT catch:** this counts mem→VDP **DMA**
|
|
314
|
+
bytes (the dominant cost). Plain CPU writes to the VDP data port —
|
|
315
|
+
`VDP_setTileMapXY` without DMA, single-cell pokes — are not DMA and are
|
|
316
|
+
NOT counted; catching *those* would need a core-side VDP-data-port write
|
|
317
|
+
hook (a gpgx patch, not shipped). In practice the expensive per-frame
|
|
318
|
+
mistakes (whole-plane fills, `VDP_loadTileMap`, big `DMA_*` transfers)
|
|
319
|
+
ALL go through DMA and DO show up here, so the budget is a reliable
|
|
320
|
+
choppiness diagnostic today. There is no exposed per-frame
|
|
321
|
+
"vblank-cycles-used / overrun" counter either — infer overrun from the
|
|
322
|
+
byte budget (DMA bandwidth in vblank is finite: ~7.6 KB to VRAM in PAL
|
|
323
|
+
vblank, less in NTSC; a frame moving multiple KB to VRAM is at risk).
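If you log the per-frame `bytes` values yourself, the same kind of summary can be recomputed offline. The sketch below is not the tool's algorithm: the spike rule (a frame above `spike_factor` times the average) is an assumption chosen for illustration, and `DmaBudget` / `summarize_dma` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t avg_bytes;    /* mean DMA bytes per frame (integer division) */
    uint32_t peak_bytes;   /* largest single-frame total */
    size_t   peak_frame;   /* index of that frame */
    size_t   spike_count;  /* frames above the spike threshold */
} DmaBudget;

/* Summarise per-frame DMA byte counts. A frame counts as a spike when it
 * moves more than `spike_factor` times the average (assumed rule). */
static DmaBudget summarize_dma(const uint32_t *bytes, size_t frames,
                               uint32_t spike_factor) {
    assert(frames > 0);
    DmaBudget out = {0, 0, 0, 0};
    uint64_t total = 0;
    for (size_t i = 0; i < frames; i++) {
        total += bytes[i];
        if (bytes[i] > out.peak_bytes) {
            out.peak_bytes = bytes[i];
            out.peak_frame = i;
        }
    }
    out.avg_bytes = (uint32_t)(total / frames);
    for (size_t i = 0; i < frames; i++) {
        if ((uint64_t)bytes[i] > (uint64_t)out.avg_bytes * spike_factor) {
            out.spike_count++;
        }
    }
    return out;
}
```

A flat loop gives `spike_count == 0` and a peak close to the average; one whole-plane upload in an otherwise quiet run shows up as a single spike whose `peak_frame` tells you where to look.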
|
|
324
|
+
|
|
164
325
|
## Input
|
|
165
326
|
|
|
166
327
|
`u16 pad = JOY_readJoypad(JOY_1)` returns a packed bitmask. The
|
|
@@ -252,6 +413,25 @@ headless per-PCM-channel "is it playing" readout for Genesis yet (it would need
|
|
|
252
413
|
core patch to expose the XGM2 Z80 driver state), so audio verification here is
|
|
253
414
|
record-and-listen, not assert.
|
|
254
415
|
|
|
416
|
+
## MCP debug & inspection tooling
|
|
417
|
+
|
|
418
|
+
The shipped genesis_plus_gx (gpgx) core is patched for live introspection.
|
|
419
|
+
Video is deeply readable; the FM audio chip is only partially exposed:
|
|
420
|
+
|
|
421
|
+
- **Sprites:** `sprites({op:'inspect'})` decodes the live SAT.
|
|
422
|
+
- **Palette:** `palette({source:'live'})` reads live CRAM.
|
|
423
|
+
- **CPU:** `cpu({op:'read', cpu:'main'})` reads the 68000.
|
|
424
|
+
- **Audio (limited):** `getYm2612State` returns the YM2612's internal
|
|
425
|
+
struct as a raw blob — gpgx doesn't expose it in a safely per-channel
|
|
426
|
+
decodable form (good for frame-to-frame diffing, see "Debugging sound").
|
|
427
|
+
`getPsgState` decodes the SN76489 (3 tone + 1 noise channels).
|
|
428
|
+
- **Memory regions:** `memory({op:'read'})` exposes CRAM, VSRAM, VDP_REGS,
|
|
429
|
+
Z80_RAM (the sound CPU's RAM), M68K work RAM, YM2612, PSG, and VRAM.
|
|
430
|
+
Remember the gpgx byte-swap quirk: VRAM and WRAM read host-LE
|
|
431
|
+
word-byte-swapped (a 16-bit value's two bytes are swapped at the offset)
|
|
432
|
+
— account for it or read single bytes (see "Reading your C globals
|
|
433
|
+
headlessly").
|
|
434
|
+
|
|
255
435
|
## ROM layout
|
|
256
436
|
|
|
257
437
|
```
|
|
@@ -230,3 +230,35 @@ pixels per byte). The shipped `hello_sprite`, `tile_engine`, `shmup`,
|
|
|
230
230
|
`platformer`, `puzzle` templates use this approach. **But never
|
|
231
231
|
hand-encode a full-screen image this way** — that's the red/choppy
|
|
232
232
|
failure above.
|
|
233
|
+
|
|
234
|
+
## "Horizontal movement / scrolling feels choppy or judders"
|
|
235
|
+
|
|
236
|
+
Almost always: **you're rewriting the tilemap in the frame loop.** The
|
|
237
|
+
Genesis scrolls in HARDWARE — moving the world is two register writes
|
|
238
|
+
(`VDP_setHorizontalScroll`), which cost nothing. If you instead redraw a
|
|
239
|
+
plane every frame (a `VDP_fillTileMapRect` / `VDP_loadTileMap` / big
|
|
240
|
+
`DMA_*` each frame), you overrun the vblank DMA budget and drop frames →
|
|
241
|
+
judder. Fix: paint the planes ONCE at setup; the loop only nudges scroll
|
|
242
|
+
registers + re-stages sprites. The `template:"two_plane_parallax"`
|
|
243
|
+
scaffold is the known-good shape.
|
|
244
|
+
|
|
245
|
+
Diagnose it without guessing (no core rebuild):
|
|
246
|
+
|
|
247
|
+
- **Per-frame VDP work:** `watch({on:'dma', perFrame:true, frames:120,
|
|
248
|
+
pressDuring:[{frame:0, button:'right', holdFrames:120}]})` → a per-frame
|
|
249
|
+
`[{frame, bytes, romBytes, ramBytes}]` timeline + `spikes`. A smooth
|
|
250
|
+
loop is a LOW, FLAT curve (just the SAT/scroll refresh). A `spikes`
|
|
251
|
+
entry (bytes ≫ avg, esp. `romBytes`) IS the per-frame asset-upload /
|
|
252
|
+
tilemap-rewrite mistake. (Counts DMA bytes only — non-DMA single-cell
|
|
253
|
+
`VDP_setTileMapXY` pokes aren't counted; the expensive whole-plane work
|
|
254
|
+
always uses DMA and does show.)
|
|
255
|
+
- **Motion curve:** sample the player-X + both planes' HSCROLL ($F000 in
|
|
256
|
+
`video_ram`) over 180 frames while holding a direction — see
|
|
257
|
+
MENTAL_MODEL.md "Why does horizontal movement feel choppy?". Look for
|
|
258
|
+
scroll that jumps (non-constant per-frame delta) or a camera that moves
|
|
259
|
+
while the sprite's screen-X is frozen.
|
|
260
|
+
|
|
261
|
+
For a world WIDER than one 512-px plane, don't make the plane bigger and
|
|
262
|
+
don't redraw it — stream ONE offscreen column per 8-px camera step
|
|
263
|
+
(circular-buffer the 64-cell plane). See MENTAL_MODEL.md "How Sonic-style
|
|
264
|
+
large maps REALLY work".
|
|
Binary file
|
|
Binary file
|
|
Binary file
|
|
@@ -160,6 +160,30 @@ void main(void) {
|
|
|
160
160
|
}
|
|
161
161
|
```
|
|
162
162
|
|
|
163
|
+
## MCP debug & inspection tooling
|
|
164
|
+
|
|
165
|
+
Game Gear runs on the same genesis_plus_gx (gpgx, patched) core as SMS, so the
|
|
166
|
+
inspectors are identical. **The canonical reference lives in the SMS
|
|
167
|
+
MENTAL_MODEL** (`src/platforms/sms/MENTAL_MODEL.md`, "MCP debug & inspection
|
|
168
|
+
tooling" section): `sprites({op:'inspect'})`, `tiles({op:'png'})`,
|
|
169
|
+
`cpu({op:'read'})` (Z80), `background({view:'renderState'})`,
|
|
170
|
+
`audioDebug({op:'inspect', chip:'psg'})` (SN76489, the shared
|
|
171
|
+
SMS/GG/Genesis region), and the z80 `objdump` disasm pipeline all apply
|
|
172
|
+
unchanged.
|
|
173
|
+
|
|
174
|
+
Game-Gear-only deltas:
|
|
175
|
+
|
|
176
|
+
- **`palette({source:'live'})`** decodes **12-bit BGR (4-4-4)**, twice the
|
|
177
|
+
depth of SMS's 6-bit. CRAM is **64 bytes** (2 little-endian bytes per
|
|
178
|
+
entry) instead of 32.
|
|
179
|
+
- The Game-Gear memory regions are **`gg_vram`** and **`gg_cram`** (the
|
|
180
|
+
64-byte palette); use these instead of `sms_vram` / `sms_cram`. The
|
|
181
|
+
`sms_vdp_regs` / `sms_z80_regs` register regions are shared (same VDP and
|
|
182
|
+
Z80).
|
|
183
|
+
- `sprites({op:'inspect'})` X/Y fields are reported in **256×192 hardware
|
|
184
|
+
coordinates**, not the 160×144 visible window — match them with
|
|
185
|
+
hardware-coord arithmetic (see "Sprite coords are hardware-space" above).
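For reading `gg_cram` directly, the 64-byte layout from the palette bullet above (2 little-endian bytes per entry, 4 bits per channel) can be decoded as follows. This is a sketch, assuming the usual Game Gear bit order (low byte `GGGGRRRR`, high byte `----BBBB`); `Rgb8` and `gg_decode_color` are illustrative names, not part of the tooling.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b; } Rgb8;

/* Decode Game Gear CRAM entry `index` (0..31) from a 64-byte dump.
 * Assumed layout: low byte = GGGGRRRR, high byte = ----BBBB. */
static Rgb8 gg_decode_color(const uint8_t cram[64], unsigned index) {
    assert(index < 32);
    uint16_t raw = (uint16_t)(cram[index * 2] | (cram[index * 2 + 1] << 8));
    Rgb8 out = {
        (uint8_t)((raw & 0x0F) * 17),         /* 0..15 scaled to 0..255 */
        (uint8_t)(((raw >> 4) & 0x0F) * 17),
        (uint8_t)(((raw >> 8) & 0x0F) * 17),
    };
    return out;
}
```

Multiplying by 17 maps the 4-bit range exactly onto 0..255 (15 × 17 = 255), so full-intensity channels decode to pure white.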
|
|
186
|
+
|
|
163
187
|
## Differences from SMS — quick reference
|
|
164
188
|
|
|
165
189
|
- Visible 160×144 vs 256×192 — center content
|
|
@@ -31,6 +31,8 @@
|
|
|
31
31
|
.globl l__INITIALIZER
|
|
32
32
|
.globl s__INITIALIZER
|
|
33
33
|
.globl s__INITIALIZED
|
|
34
|
+
.globl s__DATA
|
|
35
|
+
.globl l__DATA
|
|
34
36
|
|
|
35
37
|
;; ─── Reset vector at $0000 ────────────────────────────────────────
|
|
36
38
|
.area _HEADER (ABS)
|
|
@@ -83,7 +85,15 @@
|
|
|
83
85
|
;; call main. The initializer area is filled by sdcc when it sees
|
|
84
86
|
;; global initializations.
|
|
85
87
|
|
|
88
|
+
;; AREA ORDERING IS LOAD-BEARING. `_INITIALIZER` (the ROM image of
|
|
89
|
+
;; every value-initialised `static` global) MUST be declared in the
|
|
90
|
+
;; ROM group here — BEFORE the `_DATA` RAM block. If it isn't, sdld
|
|
91
|
+
;; places `_INITIALIZER` in RAM right after `_INITIALIZED`, so the
|
|
92
|
+
;; gsinit copy below copies uninitialised RAM onto itself and every
|
|
93
|
+
;; `static uint8_t x = 5;` boots as 0. (Bug found 2026-06-08; see the
|
|
94
|
+
;; matching note in sms_crt0.s — both z80 crt0s were missing this.)
|
|
86
95
|
.area _HOME
|
|
96
|
+
.area _INITIALIZER
|
|
87
97
|
.area _CODE
|
|
88
98
|
.area _GSINIT
|
|
89
99
|
.area _GSFINAL
|
|
@@ -97,6 +107,26 @@
|
|
|
97
107
|
.area _CODE
|
|
98
108
|
|
|
99
109
|
gsinit:
|
|
110
|
+
;; ── Zero the BSS segment (`_DATA`). ──────────────────────────
|
|
111
|
+
;; Every uninitialised `static` global lands in `_DATA` and MUST
|
|
112
|
+
;; read back 0 at boot. Mirrors the sm83 GB crt0's gsinit_data loop.
|
|
113
|
+
ld bc, #l__DATA
|
|
114
|
+
ld a, b
|
|
115
|
+
or a, c
|
|
116
|
+
jr Z, gsinit_bss_done
|
|
117
|
+
ld hl, #s__DATA
|
|
118
|
+
ld (hl), #0x00
|
|
119
|
+
ld d, h
|
|
120
|
+
ld e, l
|
|
121
|
+
inc de
|
|
122
|
+
dec bc
|
|
123
|
+
ld a, b
|
|
124
|
+
or a, c
|
|
125
|
+
jr Z, gsinit_bss_done
|
|
126
|
+
ldir ; propagate the 0 across _DATA
|
|
127
|
+
gsinit_bss_done:
|
|
128
|
+
|
|
129
|
+
;; ── Copy `_INITIALIZER` (ROM) → `_INITIALIZED` (RAM). ────────
|
|
100
130
|
ld bc, #l__INITIALIZER
|
|
101
131
|
ld a, b
|
|
102
132
|
or a, c
|
|
@@ -35,13 +35,39 @@ Mikey handles:
|
|
|
35
35
|
- **Joystick** — read via SWITCHES register at `$FCB0`. cc65 provides
|
|
36
36
|
`joy_read(JOY_1)` + `JOY_LEFT/RIGHT/UP/DOWN/BTN_1/BTN_2` macros.
|
|
37
37
|
|
|
38
|
-
**Live debug:**
|
|
39
|
-
|
|
40
|
-
|
|
41
|
-
|
|
42
|
-
|
|
43
|
-
|
|
44
|
-
|
|
38
|
+
**Live debug:** the MCP inspectors (`palette` / `cpu` / `audioDebug` /
|
|
39
|
+
`background` / `breakpoint`, and the SCB-list-head `sprites` special case) are
|
|
40
|
+
documented in "MCP debug & inspection tooling" below.
|
|
41
|
+
|
|
42
|
+
## MCP debug & inspection tooling
|
|
43
|
+
|
|
44
|
+
The Lynx runs on handy (patched). The inspectors read the *live* core state —
|
|
45
|
+
reach for them when a sprite or palette renders wrong and the source alone
|
|
46
|
+
doesn't explain it. Details and the per-tool facts:
|
|
47
|
+
|
|
48
|
+
- **`palette({source:'live'})`** — the **16-entry, 12-bit RGB** Mikey palette
|
|
49
|
+
(`$FDA0-$FDBF`) converted to RGB.
|
|
50
|
+
- **`cpu({op:'read'})`** — 65C02 dump: A / X / Y / P / SP / PC plus the
|
|
51
|
+
decoded flag bits.
|
|
52
|
+
- **`audioDebug({op:'inspect', chip:'mikey'})`** — the 4 Mikey voices: volume,
|
|
53
|
+
the timer→period→frequency→note chain, and the **12-bit LFSR** state.
|
|
54
|
+
- **`background({view:'renderState'})`** — decodes DISPCTL: the DMA-enable,
|
|
55
|
+
flip, and color-mode bits plus the display base address.
|
|
56
|
+
- **`sprites({op:'inspect'})` is the special case.** The Lynx has **no fixed
|
|
57
|
+
OAM** — sprites are **SCB (Sprite Control Block) linked lists in RAM** that
|
|
58
|
+
Suzy walks at blit time. So this tool can't return a sprite table; instead
|
|
59
|
+
it returns the **SCB list head (SCBNEXT, `$FC10`/`$FC11`)** plus
|
|
60
|
+
instructions to walk the chain yourself over `system_ram`.
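Walking the chain over a `system_ram` dump might look like the sketch below. It rests on two assumptions that are not stated above and should be checked against the Lynx hardware documentation: that each SCB stores its next pointer as a little-endian word at byte offset 3 (after SPRCTL0, SPRCTL1, SPRCOLL), and that the chain ends when the high byte of that pointer is zero. `walk_scb_chain` and `read_le16` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SCB_NEXT_OFFSET 3  /* assumed: SPRCTL0, SPRCTL1, SPRCOLL, then next */

static uint16_t read_le16(const uint8_t *ram, uint16_t addr) {
    return (uint16_t)(ram[addr] | (ram[(uint16_t)(addr + 1)] << 8));
}

/* Walk the SCB chain in a 64 KB RAM dump starting at `head`.
 * Stores up to `max` SCB addresses in `out` and returns how many were found.
 * Stops when the pointer's high byte is zero, or at `max` (loop guard). */
static size_t walk_scb_chain(const uint8_t *ram, uint16_t head,
                             uint16_t *out, size_t max) {
    assert(ram != NULL && out != NULL);
    size_t count = 0;
    uint16_t addr = head;
    while ((addr >> 8) != 0 && count < max) {
        out[count++] = addr;
        addr = read_le16(ram, (uint16_t)(addr + SCB_NEXT_OFFSET));
    }
    return count;
}
```

The `max` bound matters: a corrupted chain can point back at itself, and without the guard the walk would never terminate.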
|
|
61
|
+
|
|
62
|
+
### Memory regions (`memory({op:'read', region:…})`)
|
|
63
|
+
|
|
64
|
+
| Region | Address / size | Contents |
|
|
65
|
+
|-----------------|-----------------------|-------------------------------------------------------|
|
|
66
|
+
| `lynx_cpu_regs` | — | 65C02 register snapshot |
|
|
67
|
+
| `lynx_hw_regs` | $FC00-$FDFF window | the **Suzy + Mikey** register window — sprite-engine regs, LCD control, audio, palette |
|
|
68
|
+
| `system_ram` | 64 KB | full address space (also where the SCB chain lives) |
|
|
69
|
+
|
|
70
|
+
Pair these with `breakpoint({on:'write'})` for the full live-debug loop.
|
|
45
71
|
|
|
46
72
|
## Frame heartbeat (cc65 + tgi)
|
|
47
73
|
|
|
@@ -117,3 +117,30 @@ exactly this.
|
|
|
117
117
|
generator + the envelope (period + shape bits).
|
|
118
118
|
- `memory({op:'read'})` regions: `msx_vram`, `msx_vdp_regs`, `msx_vdp_status`,
|
|
119
119
|
`msx_palette`, `msx_cpu_regs`, `msx_psg_regs`, plus `system_ram` (work RAM).
|
|
120
|
+
|
|
121
|
+
## MCP debug & inspection tooling
|
|
122
|
+
|
|
123
|
+
MSX is a **Tier-1** platform with deep introspection — the full set of
|
|
124
|
+
inspectors and memory regions is listed under **"Debugging tools"** above
|
|
125
|
+
(`cpu` / `background` / `palette` / `sprites` / `symbols` / `audioDebug` for
|
|
126
|
+
the AY-3-8910 PSG, and the `msx_vram` / `msx_vdp_regs` / `msx_vdp_status` /
|
|
127
|
+
`msx_palette` / `msx_cpu_regs` / `msx_psg_regs` / `system_ram` regions). The
|
|
128
|
+
PSG-channel decode means `audioDebug({op:'inspect', chip:'ay8910'})` gives
|
|
129
|
+
you the 3 square-wave channels plus the shared noise generator and envelope
|
|
130
|
+
without poking at `msx_psg_regs` by hand.
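
For reference, the by-hand decode that `audioDebug` saves you from looks roughly like the sketch below. `decodeAy8910` is a hypothetical helper, not the tool's implementation. It assumes `regs` holds the 14 PSG registers R0-R13 in order and that the MSX PSG clock is about 1.79 MHz (3.579545 MHz / 2). The register meanings follow the AY-3-8910 datasheet.

```javascript
// Sketch: decode AY-3-8910 registers R0-R13 into channel state.
const MSX_PSG_CLOCK_HZ = 1789772.5; // assumption: 3.579545 MHz / 2

function decodeAy8910(regs, clockHz = MSX_PSG_CLOCK_HZ) {
  const mixer = regs[7]; // bits 0-2: tone disable A/B/C, bits 3-5: noise disable
  const channels = ['A', 'B', 'C'].map((name, i) => {
    // 12-bit tone period: fine byte plus the low nibble of the coarse byte.
    const period = regs[i * 2] | ((regs[i * 2 + 1] & 0x0f) << 8);
    const amp = regs[8 + i];
    return {
      name,
      period,
      // Period 0 is left as null here rather than guessing chip behavior.
      frequencyHz: period === 0 ? null : clockHz / (16 * period),
      toneEnabled: ((mixer >> i) & 1) === 0, // mixer bits are active-low
      noiseEnabled: ((mixer >> (i + 3)) & 1) === 0,
      usesEnvelope: (amp & 0x10) !== 0,
      volume: amp & 0x0f,
    };
  });
  return {
    channels,
    noisePeriod: regs[6] & 0x1f,
    envelopePeriod: regs[11] | (regs[12] << 8),
    envelopeShape: regs[13] & 0x0f,
  };
}
```

A tone period of 254 comes out near 440 Hz with this clock, which is a quick sanity check when comparing against what the inspector reports.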
|
|
131
|
+
|
|
132
|
+
### ColecoVision shares this core family — but is bring-up only
|
|
133
|
+
|
|
134
|
+
ColecoVision runs the same toolchain family and exposes only the **standard**
|
|
135
|
+
introspection: `system_ram` + `save_ram` + `video_ram`. It has **no deep
|
|
136
|
+
inspectors** (no `palette` / `sprites` / `background` / `audioDebug` decode)
|
|
137
|
+
and **no MENTAL_MODEL of its own** — treat it as a bring-up target, not a
|
|
138
|
+
finished Tier-1 platform.
|
|
139
|
+
|
|
140
|
+
### Extending introspection (for whoever adds a platform)
|
|
141
|
+
|
|
142
|
+
Deeper, decoded inspectors are not free — each is implemented by **patching
|
|
143
|
+
the emulator core** to expose the extra register/VRAM regions, then wiring a
|
|
144
|
+
decoder. To add deep introspection to ColecoVision (or any thin platform),
|
|
145
|
+
follow the existing core-patch pattern used for snes9x / gpgx / fceumm / vice
|
|
146
|
+
under **`scripts/patches/`**.
|
|
@@ -25,6 +25,13 @@ The cc65 runtime claims:
|
|
|
25
25
|
- ZP $1C+ available to your game (with our chr-ram crt0)
|
|
26
26
|
- `$0500-$07FF` (3 pages): cc65 C parameter stack
|
|
27
27
|
|
|
28
|
+
> **cc65 zero-page starts at $02, not $00 (applies to every cc65 platform —
|
|
29
|
+
> NES, C64, Atari, Lynx, …).** cc65 reserves `$00-$01` for its runtime, so your
|
|
30
|
+
> first `.res 1` in the `ZEROPAGE` segment lands at **$02**, not $00. If you
|
|
31
|
+
> hand-write asm that assumes a zero-page var is at $00 you'll clobber the
|
|
32
|
+
> runtime. Confirm actual addresses with `symbols({op:'map'})` after
|
|
33
|
+
> `build({output:'romWithDebug'})`.
|
|
34
|
+
|
|
28
35
|
## PPU memory map (separate from CPU bus!)
|
|
29
36
|
|
|
30
37
|
```
|
|
@@ -94,6 +101,15 @@ The `nes_runtime` helper `tile_set_palette(nt, x, y, palette)` does
|
|
|
94
101
|
the read-modify-write dance and the bit-twiddling — use it instead
|
|
95
102
|
of writing attributes by hand.
|
|
96
103
|
|
|
104
|
+
> **256-tile cap per pattern table (the busy-image trap).** The nametable's
|
|
105
|
+
> tile index is 8-bit, so a single pattern table holds at most **256 unique
|
|
106
|
+
> tiles**, and the BG in any single frame can therefore use at most 256 distinct tiles.
|
|
107
|
+
> Auto-converting a busy full-screen illustration almost always needs more than
|
|
108
|
+
> 256 unique 8×8 tiles and **overflows**; `encodeArt({stage:'tilemap'})` warns
|
|
109
|
+
> when it does. The only real workaround is mid-frame CHR bank switching
|
|
110
|
+
> (an MMC3-class mapper) — the bundled NROM presets can't do it, so design BG
|
|
111
|
+
> art to reuse tiles (≤256 unique per table).
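
To check an image against the cap before converting it, you can count its distinct 8×8 tiles. The sketch below is a standalone illustration, not how `encodeArt` works internally. `countUniqueTiles` is a hypothetical helper that assumes `pixels` is a row-major array of palette indices and that both dimensions are multiples of 8.

```javascript
// Sketch: count distinct 8x8 tiles in an indexed image.
// Tiles are compared exactly; NES background tiles cannot be flipped,
// so a mirrored tile counts as a separate tile.
function countUniqueTiles(pixels, width, height) {
  if (width % 8 !== 0 || height % 8 !== 0) {
    throw new RangeError('width and height must be multiples of 8');
  }
  const unique = new Set();
  for (let ty = 0; ty < height; ty += 8) {
    for (let tx = 0; tx < width; tx += 8) {
      const rows = [];
      for (let y = 0; y < 8; y++) {
        const start = (ty + y) * width + tx;
        rows.push(pixels.slice(start, start + 8).join(','));
      }
      unique.add(rows.join(';')); // one string key per tile
    }
  }
  return unique.size;
}

// True when the image fits a single 256-tile pattern table.
function fitsPatternTable(pixels, width, height) {
  return countUniqueTiles(pixels, width, height) <= 256;
}
```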
|
|
112
|
+
|
|
97
113
|
## Palettes
|
|
98
114
|
|
|
99
115
|
32 bytes at $3F00-$3F1F:
|
|
@@ -329,6 +345,25 @@ incorrectly aligned."
|
|
|
329
345
|
scrolling worlds you need to manage the nametable buffer + bank
|
|
330
346
|
switching yourself.
|
|
331
347
|
|
|
348
|
+
## MCP debug & inspection tooling
|
|
349
|
+
|
|
350
|
+
The shipped fceumm core is patched for live introspection — read state
|
|
351
|
+
instead of guessing:
|
|
352
|
+
|
|
353
|
+
- **Sprites:** `sprites({op:'inspect'})` decodes live OAM.
|
|
354
|
+
- **Palette:** `palette({source:'live'})` reads the live 32-byte palette RAM.
|
|
355
|
+
- **CPU:** `cpu({op:'read'})` reads the 6502.
|
|
356
|
+
- **Background render state:** `background({view:'renderState'})` decodes
|
|
357
|
+
PPUCTRL/PPUMASK and resolves the active CHR bank (plus its file offset) —
|
|
358
|
+
this is what tells you which pattern table BG vs sprites are fetching from
|
|
359
|
+
(the bit-4 footgun above).
|
|
360
|
+
- **Memory regions:** `memory({op:'read'})` exposes OAM, Palette,
|
|
361
|
+
Nametables (CIRAM — including the 2-bit-per-16x16 attribute data that
|
|
362
|
+
selects each tile group's sub-palette, decoded by `inspectBackgroundMap`),
|
|
363
|
+
CHR (live MMC1-banked CHR — don't parse the iNES file), CPU_REGS,
|
|
364
|
+
PPU_REGS, and APU_REGS (the synthesized $4000-$4017 snapshot consumed by
|
|
365
|
+
`audioDebug`).
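
The attribute lookup mentioned in the Nametables entry can be sketched as follows, for when you are reading the raw bytes instead of using the decoded view. `tilePalette` is a hypothetical helper. It assumes `nametable` is a single 1024-byte nametable (960 tile indices followed by 64 attribute bytes), so slice the region accordingly if the dump holds more than one.

```javascript
// Sketch: find the sub-palette (0-3) for a BG tile from raw nametable bytes.
// Each attribute byte covers a 4x4-tile (32x32 px) block and holds four
// 2-bit fields, one per 2x2-tile (16x16 px) quadrant:
//   bits 0-1 top-left, 2-3 top-right, 4-5 bottom-left, 6-7 bottom-right.
function tilePalette(nametable, tileX, tileY) {
  const attr = nametable[0x3c0 + (tileY >> 2) * 8 + (tileX >> 2)];
  const shift = ((tileY & 2) << 1) | (tileX & 2);
  return (attr >> shift) & 0x03;
}
```

For example, with an attribute byte of `0b11100100` at offset `$3C0`, tiles (0,0), (2,0), (0,2), and (2,2) resolve to palettes 0, 1, 2, and 3.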
|
|
366
|
+
|
|
332
367
|
## Rebuilding a CHR-ROM NROM image (reverse-engineering)
|
|
333
368
|
|
|
334
369
|
The homebrew presets above are CHR-**RAM** (the CPU uploads tiles at runtime).
|