romdevtools 0.15.0 → 0.16.0

package/AGENTS.md CHANGED
@@ -43,7 +43,7 @@ Skip playtest only when there's clearly no human in the loop: CI runs, automated
 
  ## Tool surface: everything is loaded — just call the tool
 
- **All ~34 tools are registered and callable from session init — there is no loading step.** If you see a tool name anywhere in this doc or via `catalog({op:'categories'})`, you can call it right now. Each tool is a small VERB with an operation axis — `memory({op})`, `build({output})`, `sprites({op})`, `breakpoint({on})`, `cpu({op})` — so the whole surface is a few dozen names, not a few hundred.
+ **All ~32 tools are registered and callable from session init — there is no loading step.** If you see a tool name anywhere in this doc or via `catalog({op:'categories'})`, you can call it right now. Each tool is a small VERB with an operation axis — `memory({op})`, `build({output})`, `sprites({op})`, `breakpoint({on})`, `cpu({op})` — so the whole surface is a few dozen names, not a few hundred.
 
  (We used to lazy-load tools behind a `loadCategory` call. It caused more harm than good — agents burned round-trips re-loading categories, and dynamic registration never propagated reliably to clients anyway. The consolidation shrank the surface enough that the entire thing loads up front; the old `loadCategory`/`describeTool` discovery tools are gone.)
 
package/CHANGELOG.md CHANGED
@@ -67,6 +67,37 @@ instead of assuming the server is broken and installing their own tools.
  - The `uint8-loop-bound` preflight lint is scope-aware (no longer false-flags a
  `uint16_t` loop counter that shares a name with a `uint8_t` in another function).
 
+ ## 0.16.0
+
+ **Build diagnostics: agents were building blind — errors AND warnings now reach
+ the response as structured `issues[]`.** An agent can only fix what the toolchain
+ tells it, where it tells it. Audited the whole build surface and closed the gaps
+ so diagnostics (file/line/message/stage) come back in the tool result, not buried
+ in the raw log. (Also bumps a doc count: the surface is 32 tools after 0.15.0's
+ dmaTrace→watch / patchGbHeader→romPatch consolidation; stale "34" references in
+ the docs + source comments updated.)
+
+ ### Fixed
+ - **Warnings were OFF.** No C compiler was being asked for them. gcc (GBA/Genesis)
+ now compiles USER source with `-Wall -Wextra -Wno-unused-parameter` (the bundled
+ SDK stays warning-free so its noise doesn't bury the agent's); cc65 enables its
+ valid high-value `-W` set. So unused vars, implicit declarations, etc. are now
+ emitted and surfaced.
+ - **Swallowed errors now structured:** SDCC's keyword-less `file:line: syntax
+ error: …` and `warning NNN: …` (GB/GBC/SMS/MSX previously returned an empty
+ `issues[]` on a syntax error); the sdld/ASlink `Undefined Global '_x'` link
+ error; vasm errors (Genesis asm emits no stage marker, so they hit the
+ fallback, which had skipped the vasm parser); and a **missing `incbin` asset**
+ — the #1 thing an agent forgets to pass — now reports `could not open <x.bin>`
+ with the exact filename.
+ - **Fixed a build crash that ate the real error:** `build({output:'rom', path})`
+ fell into the source builder with no source and threw "Cannot read properties
+ of undefined (reading 'split')" instead of the compiler error; it now routes to
+ the project-dir builder like `output:'run'`/`'project'`.
+ - Verified live across all 14 platforms; a `parse-errors-coverage` test locks the
+ formats in. (Known limit: asar/SNES-asm only yields a wrapper "aborted"
+ message — its WASM build aborts without printing line info.)
+
  ## 0.14.0
 
  **Two platform-specific top-level tools folded into their domain verbs, a
package/README.md CHANGED
@@ -52,7 +52,7 @@ Agents: the server delivers [`AGENTS.md`](./AGENTS.md) as connection-time instru
  Most agents support MCP, but you don't have to use it. Run the server
  (`npx romdevtools`) and **skip wiring it into your agent's MCP
  config** — no `claude mcp add`, no `mcp.json` entry, no MCP client at all. The
- same 34 tools are reachable over plain HTTP / as an Agent Skill against the
+ same 32 tools are reachable over plain HTTP / as an Agent Skill against the
  running server:
 
  - **Plain HTTP:** `POST http://127.0.0.1:7331/tool/{name}` with the args as a JSON
package/package.json CHANGED
@@ -1,6 +1,6 @@
  {
  "name": "romdevtools",
- "version": "0.15.0",
+ "version": "0.16.0",
  "description": "Tool server giving coding agents full control of homebrew ROM development AND reverse-engineering/romhacking across 14 retro platforms (NES, SNES, GB, Genesis, Atari, C64, PC Engine, MSX, ...) via WASM toolchains + emulator cores. Use over plain HTTP, as an Agent Skill, or as an MCP server.",
  "type": "module",
  "main": "src/mcp/server.js",
@@ -18,7 +18,7 @@ import { toolJsonSchema } from "./tool-registry.js";
  */
  export const mcpPreamble = [
  "romdev: homebrew retro game development + reverse-engineering for coding agents.",
- "All ~34 tools register at session init — call any by name directly, no loading step. Each is a domain VERB with an operation axis: memory({op}), build({output}), breakpoint({on}), cpu({op}), sprites({op}), tiles({op}), disasm({target}), romPatch({op}), …",
+ "All ~32 tools register at session init — call any by name directly, no loading step. Each is a domain VERB with an operation axis: memory({op}), build({output}), breakpoint({on}), cpu({op}), sprites({op}), tiles({op}), disasm({target}), romPatch({op}), …",
  "catalog({op:'categories'}) maps the tools by purpose (a guide, not a gate); catalog({op:'status'}) is a session re-orient.",
  ].join("\n");
 
@@ -1,7 +1,7 @@
  // Tool registry harvester — the single source the HTTP route/skill/OpenAPI
  // surfaces build from.
  //
- // The MCP path registers 34 tools via registerTools(server, z, sessionKey),
+ // The MCP path registers 32 tools via registerTools(server, z, sessionKey),
  // where `server` is an McpServer and each handler closes over `sessionKey` for
  // per-session host isolation. The HTTP surfaces (POST /tool/{name},
  // /skills/romdev/SKILL.md, /openapi.json, /documentation) want the EXACT same handlers,
@@ -1,6 +1,6 @@
  // Register MCP tools on the server.
  //
- // The surface is ~34 consolidated domain tools (memory({op}), build({output}),
+ // The surface is ~32 consolidated domain tools (memory({op}), build({output}),
  // breakpoint({on}), …) and EVERY one registers at session init. There is no
  // progressive-disclosure / lean mode anymore: the dynamic loadCategory dance
  // never propagated reliably to clients (they don't re-read tools/list after a
@@ -208,7 +208,7 @@ export function registerTools(server, z, sessionKey) {
  if (!sessionKey) sessionKey = randomUUID();
  // Clear validation errors for EVERY tool registered below: turns the SDK's
  // raw JSON validation dump into a plain sentence and catches unknown/misspelled
- // params (which the SDK otherwise drops silently). One wrap, all 34 tools.
+ // params (which the SDK otherwise drops silently). One wrap, all 32 tools.
  // This is what lets the param descriptions stay terse — the guidance lives in
  // the error (paid only on a bad call), not in every agent's initial context.
  server = withClearToolErrors(server, z);
@@ -259,7 +259,7 @@ export function registerTools(server, z, sessionKey) {
  );
 
  // loadCategory + describeTool DELETED with the progressive-disclosure path:
- // the whole surface is ~34 tools now and every one registers at session init,
+ // the whole surface is ~32 tools now and every one registers at session init,
  // so the dynamic lean-mode dance (which never worked reliably — clients don't
  // re-read tools/list after list_changed) has no reason to exist. `catalog`
  // still exposes the category map for orientation. (See the consolidation.)
@@ -293,7 +293,7 @@ export function registerTools(server, z, sessionKey) {
  // So by default we register EVERY category at session init. listCategories
  // / loadCategory still exist (idempotent, harmless) for clients that probe
  // Register EVERY category now — there is no lean/deferred mode anymore. The
- // surface is small enough (~34 tools) that loading it all up front is the
+ // surface is small enough (~32 tools) that loading it all up front is the
  // right call (the dynamic loadCategory dance never propagated reliably to
  // clients). `disclosure.loadCategory("all")` is just the internal "register
  // all categories" helper here, not a user-facing tool.
@@ -68,6 +68,7 @@ const LOG_TAIL = 1200;
  // stays visible. Left intact on failure (timing can matter when diagnosing a
  // hang/OOM). The full untrimmed log is still written to logPath when spilled.
  export function denoiseSuccessLog(log) {
+ if (typeof log !== "string") return log ?? "";
  const lines = log.split("\n");
  const kept = [];
  let inTiming = false;
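
Review note: the added guard can be tried in isolation. The following is a minimal sketch, assuming a hypothetical helper `coerceLog` that mirrors only the new first line of `denoiseSuccessLog`; the real denoising logic is not reproduced.

```javascript
// Hypothetical helper: mirrors only the new type guard, not the real denoising.
function coerceLog(log) {
  // null/undefined become "", other non-strings are returned unchanged.
  if (typeof log !== "string") return log ?? "";
  return log;
}

// Without the guard, an undefined log would reach `log.split("\n")` and throw
// a TypeError, which is the crash quoted in the CHANGELOG entry.
const safeLines = String(coerceLog(undefined)).split("\n");
console.log(safeLines.length);
```

Note that the guard passes non-string, non-nullish values through untouched, so callers that expect a string still need to handle that case themselves.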
@@ -657,7 +658,18 @@ export function registerToolchainTools(server, z, sessionKey) {
  },
  safeTool(async (args) => {
  switch (args.output) {
- case "rom": return await buildSourceImpl(args);
+ case "rom": {
+ // `build({output:'rom', path})` (a project dir, no explicit sources) is
+ // the natural "build my scaffolded dir to a ROM file" call. Route it to
+ // the dir builder (same recipe as output:'project'/'run') — otherwise it
+ // fell into buildSourceImpl with no source and crashed on an undefined
+ // log. With path AND explicit sources, the sources win (manual build).
+ if (args.path && args.source == null && args.sources == null &&
+ args.sourcePath == null && args.sourcesPaths == null) {
+ return await buildProjectImpl(args);
+ }
+ return await buildSourceImpl(args);
+ }
  case "run": return await runSourceImpl(args);
  case "project": {
  if (!args.path) throw new Error("build({output:'project'}): `path` (the project directory) is required.");
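
Review note: the routing rule in the new `case "rom"` branch can be read as a pure decision function. This is a sketch only; `pickRomBuilder` is a hypothetical name, and the returned strings stand in for the real `buildProjectImpl` / `buildSourceImpl` calls.

```javascript
// Hypothetical helper: reports which builder the "rom" case would choose.
function pickRomBuilder(args) {
  const noExplicitSources =
    args.source == null && args.sources == null &&
    args.sourcePath == null && args.sourcesPaths == null;
  // A project directory with no explicit sources goes to the dir builder.
  if (args.path && noExplicitSources) return "project";
  // Everything else, including path plus explicit sources, is a manual source build.
  return "source";
}

console.log(pickRomBuilder({ output: "rom", path: "my-game" }));
console.log(pickRomBuilder({ output: "rom", path: "my-game", source: "int main(void){return 0;}" }));
```

The `== null` comparisons treat both `undefined` and `null` as "not provided", so an explicitly passed empty string still counts as an explicit source.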
@@ -257,7 +257,14 @@ export function parseRamUsage(mapText) {
  * @param {string} [args.linkerConfig] custom ld65 .cfg (overrides per-target default)
  */
  export async function buildC(args) {
- const ccOpts = args.debug ? ["-g"] : [];
+ // Enable cc65's high-value warnings so the agent SEES real bugs (parsed into
+ // structured issues[]). These are the valid cc65 -W names that catch actual
+ // mistakes; unused-param is left off (scaffold callbacks commonly ignore
+ // params). cc65's warning set is thin to begin with, but errors always surface
+ // and these are pure upside. (NOTE: cc65 errors on an unknown -W name, so this
+ // list is verified valid against the bundled cc65.)
+ const ccWarn = ["-W", "unused-var,unused-func,unused-label,const-comparison,struct-param,pointer-sign"];
+ const ccOpts = args.debug ? ["-g", ...ccWarn] : [...ccWarn];
  const caOpts = args.debug ? ["-g"] : [];
  const sources = normalizeSources(args, "main.c");
 
@@ -92,7 +92,12 @@ export async function buildGbaC(args) {
  // per-symbol `.bss.<name>`/`.data.<name>` line — that's what lets
  // symbols({op:'resolve'}) turn a static C global's name into an address on GBA
  // (same as SGDK does for Genesis). Pure metadata; no codegen change to what's kept.
- const cc1Options = args.cc1Options ?? ["-O2", "-mthumb", "-ffunction-sections", "-fdata-sections"];
+ // -Wall -Wextra so the agent SEES warnings (unused vars, implicit decls,
+ // sign-compare, etc.) — they're parsed into structured issues[]. Without these
+ // gcc is silent and agents build blind. -Wno-unused-parameter keeps the common
+ // intentional `(void)`-style scaffold params from being noise. Applied to USER
+ // .c only (the libtonc/maxmod SDK is a prebuilt seed, not recompiled here).
+ const cc1Options = args.cc1Options ?? ["-O2", "-mthumb", "-ffunction-sections", "-fdata-sections", "-Wall", "-Wextra", "-Wno-unused-parameter"];
  const sources = normalizeGbaSources(args);
  const binaryIncludes = args.binaryIncludes ?? {};
 
@@ -264,6 +264,12 @@ async function buildWithSgdk({ sources, headers, binaryIncludes, cc1Options, reb
  "-fdata-sections",
  ...cc1Options,
  ];
+ // USER source gets warnings on so the agent SEES its bugs (unused vars,
+ // implicit decls, …) parsed into structured issues[]. The SGDK runtime is
+ // compiled WITHOUT these (sgdkCc1Options) — we can't fix SDK warnings and they'd
+ // bury the agent's own. -Wno-unused-parameter avoids the common `(void)hard`
+ // scaffold-param noise.
+ const userCc1Options = [...sgdkCc1Options, "-Wall", "-Wextra", "-Wno-unused-parameter"];
 
  // ── Stage A: gather SGDK headers (visible to tcc via tcc-style flat mount) ──
  // cc1's -iquote /work picks up sibling files mounted alongside main.c.
@@ -280,7 +286,7 @@ async function buildWithSgdk({ sources, headers, binaryIncludes, cc1Options, reb
  const cc = await runCc1m68k({
  source: sources[cName],
  headers: tccHeaders,
- options: sgdkCc1Options,
+ options: userCc1Options,
  });
  log += `--- cc1 (${cName}) ---\n` + (cc.log || "(ok)") + "\n";
  if (cc.exitCode !== 0 || !cc.asmSource) {
@@ -469,12 +475,14 @@ async function buildMinimal(args) {
  // ── Stage 1: compile each .c file via cc1 → .s ─────────────────
  /** @type {Record<string, Uint8Array>} */
  const userObjs = {};
+ // User .c gets warnings on (minimal path has no SDK to flood). See buildWithSgdk.
+ const userCc1Options = [...cc1Options, "-Wall", "-Wextra", "-Wno-unused-parameter"];
  const cFiles = Object.keys(sources).filter((n) => /\.c$/i.test(n));
  for (const cName of cFiles) {
  const cc = await runCc1m68k({
  source: sources[cName],
  headers,
- options: cc1Options,
+ options: userCc1Options,
  });
  log += `--- cc1 (${cName}) ---\n` + (cc.log || "(ok)") + "\n";
  if (cc.exitCode !== 0 || !cc.asmSource) {
@@ -41,10 +41,12 @@ export function parseBuildLog(log) {
  } else if (/^vasm/.test(baseStage)) {
  issues.push(...parseVasm(text));
  } else if (/^sdcc$|^sdasz80$|^sdasgb$|^sdld$|^mcpp$/.test(baseStage)) {
- // SDCC family: sdcc / sdasz80 / sdasgb / sdld / mcpp emit cc65-style
- // `file:line: severity: msg` errors. Tag with the actual originating
- // tool, not "asar" (the old fallback was wrong).
+ // SDCC family: sdcc / sdasz80 / sdasgb / sdld / mcpp. Some diagnostics use
+ // the cc65-style `file:line: Error: msg`; SDCC's frontend ALSO emits a
+ // keyword-less form `main.c:2: syntax error: token -> ';' ; column 44`
+ // and `main.c:N: warning NNN: msg` — which parseCc65Like misses. Run both.
  issues.push(...parseCc65Like(text, baseStage));
+ issues.push(...parseSdcc(text, baseStage));
  } else if (/^wla|^wlalink|^wladx/.test(baseStage)) {
  // SNES C path: wla-65816 assembler + wlalink linker. wlalink floods a
  // symbol-table dump on failure — parseWla extracts just the diagnostics.
@@ -60,11 +62,18 @@ export function parseBuildLog(log) {
  // Tag everything with the (possibly empty) actual stage name so an
  // assembler error doesn't mistakenly report as "asar" on a non-SNES
  // build.
+ // Try EVERY parser — some toolchains (vasm genesis-asm) emit no
+ // "--- stage ---" marker, so the whole log lands here unnamed; if we skip
+ // a parser the error is silently swallowed (issues[] empty on a real
+ // failure). Include vasm + sdcc + wla, which the old fallback omitted.
  const tag = baseStage || "unknown";
  issues.push(...parseCc65Like(text, tag));
+ issues.push(...parseSdcc(text, tag));
  issues.push(...parseDasm(text));
  issues.push(...parseAsar(text, tag));
  issues.push(...parseRgbds(text, tag));
+ issues.push(...parseVasm(text));
+ issues.push(...parseWla(text, tag));
  issues.push(...parseGnuToolchain(text, tag));
  }
  }
@@ -122,6 +131,57 @@ function parseCc65Like(text, stage) {
  return out;
  }
 
+ // SDCC's keyword-less diagnostics that parseCc65Like (which requires an explicit
+ // "Error:"/"Warning:" word) doesn't catch:
+ // /work/main.c:2: syntax error: token -> ';' ; column 44
+ // /work/main.c:7: warning 112: function 'foo' implicit declaration
+ // /work/main.c:9: error 20: undefined identifier 'x'
+ // We classify by the leading word after `file:line:` (syntax error / error / warning).
+ function parseSdcc(text, stage) {
+ const out = [];
+ const re = /^(?<file>[^\n:]+):(?<line>\d+):\s*(?<kind>syntax error|error(?:\s+\d+)?|warning(?:\s+\d+)?):?\s*(?<msg>.*)$/gm;
+ let m;
+ while ((m = re.exec(text))) {
+ const kind = m.groups.kind.toLowerCase();
+ // Skip the forms parseCc65Like already caught ("Error:" capitalized w/ colon)
+ // — this regex is case-insensitive on `error`/`warning`, but parseCc65Like
+ // only matches when a colon immediately follows the keyword AND it's
+ // capitalized; SDCC's lowercase keyword-less form is what we add here.
+ const severity = kind.startsWith("warning") ? "warning"
+ : (kind.startsWith("error") || kind === "syntax error") ? "error" : "info";
+ const message = kind === "syntax error"
+ ? ("syntax error: " + m.groups.msg).trim().replace(/\s*;\s*$/, "")
+ : m.groups.msg.trim();
+ out.push({
+ severity,
+ file: m.groups.file,
+ line: parseInt(m.groups.line, 10),
+ message: message.replace(/\x1b\[[0-9;]*m/g, ""),
+ stage,
+ });
+ }
+ // sdld/ASlink linker diagnostics have NO file:line — they reference a symbol +
+ // module. The most common is an undefined symbol (a call to a function that was
+ // never defined/linked). Without parsing these the agent sees "build failed"
+ // with no reason in issues[] (the error lived only in the raw log).
+ // ?ASlink-Warning-Undefined Global '_foo' referenced by module '_main'
+ // ?ASlink-Error-...
+ const linkRe = /^\?ASlink-(?<sev>Warning|Error)-(?<msg>.+)$/gm;
+ let lm;
+ while ((lm = linkRe.exec(text))) {
+ const msg = lm.groups.msg.trim();
+ // An "Undefined Global" is effectively an error even though ASlink labels it
+ // a warning — the ROM won't run. Promote it so the agent treats it as fatal.
+ const isUndef = /undefined\s+global/i.test(msg);
+ out.push({
+ severity: lm.groups.sev === "Error" || isUndef ? "error" : "warning",
+ message: "linker: " + msg,
+ stage: "sdld",
+ });
+ }
+ return out;
+ }
+
  // dasm example:
  // main.asm (1): error: Unknown Mnemonic 'is'.
  function parseDasm(text) {
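
Review note: to see what the new SDCC parser yields, here is a condensed sketch that reuses the two regexes from the hunk above and runs them over the sample lines quoted in its comments. `sketchParseSdcc` is a hypothetical name; the sketch leaves out the trailing-semicolon and ANSI-escape cleanup that the real `parseSdcc` performs.

```javascript
// Condensed sketch of parseSdcc; hypothetical name, simplified message cleanup.
function sketchParseSdcc(text, stage) {
  const out = [];
  // Compiler diagnostics: `file:line: <kind>[:] message`, lowercase keywords only.
  const re = /^(?<file>[^\n:]+):(?<line>\d+):\s*(?<kind>syntax error|error(?:\s+\d+)?|warning(?:\s+\d+)?):?\s*(?<msg>.*)$/gm;
  for (const m of text.matchAll(re)) {
    const kind = m.groups.kind;
    const msg = m.groups.msg.trim();
    out.push({
      severity: kind.startsWith("warning") ? "warning" : "error",
      file: m.groups.file,
      line: parseInt(m.groups.line, 10),
      message: kind === "syntax error" ? "syntax error: " + msg : msg,
      stage,
    });
  }
  // Linker diagnostics carry no file:line; "Undefined Global" is promoted to error.
  const linkRe = /^\?ASlink-(?<sev>Warning|Error)-(?<msg>.+)$/gm;
  for (const m of text.matchAll(linkRe)) {
    const msg = m.groups.msg.trim();
    const fatal = m.groups.sev === "Error" || /undefined\s+global/i.test(msg);
    out.push({ severity: fatal ? "error" : "warning", message: "linker: " + msg, stage: "sdld" });
  }
  return out;
}

// Sample lines taken from the comments in the hunk above.
const sdccSampleLog = [
  "/work/main.c:2: syntax error: token -> ';' ; column 44",
  "/work/main.c:7: warning 112: function 'foo' implicit declaration",
  "/work/main.c:9: error 20: undefined identifier 'x'",
  "?ASlink-Warning-Undefined Global '_foo' referenced by module '_main'",
].join("\n");

for (const issue of sketchParseSdcc(sdccSampleLog, "sdcc")) {
  console.log(issue.severity, issue.file ?? "(no file)", issue.line ?? "-", issue.message);
}
```

Both regexes are written without the `i` flag, so a capitalized cc65-style `Error:` line is left to `parseCc65Like`, which keeps the two parsers from double-reporting the same diagnostic.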
@@ -248,13 +308,15 @@ function parseWla(text, stage = "wla") {
  // vasm example:
  // error 22 in line 5 of "/work/main.s": ...
  // warning 1003 in line 8 of "main.s": ...
+ // fatal error 13 in line 1 of "/work/main.s": could not open <x.bin> for input
+ // (← a MISSING incbin asset: the #1 thing an agent forgets to pass)
  function parseVasm(text) {
  const out = [];
- const re = /^(?<sev>error|warning)\s+\d+\s+in\s+line\s+(?<line>\d+)\s+of\s+"(?<file>[^"]+)":\s*(?<msg>.+)$/gm;
+ const re = /^(?<sev>fatal error|error|warning)\s+\d+\s+in\s+line\s+(?<line>\d+)\s+of\s+"(?<file>[^"]+)":\s*(?<msg>.+)$/gm;
  let m;
  while ((m = re.exec(text))) {
  out.push({
- severity: m.groups.sev,
+ severity: m.groups.sev === "warning" ? "warning" : "error",
  file: m.groups.file,
  line: parseInt(m.groups.line, 10),
  message: m.groups.msg.trim(),
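
Review note: the effect of adding `fatal error` to the vasm regex can be checked with the three sample lines from the comment block of this hunk. The sketch below uses the updated regex as shown; `sketchParseVasm` is a hypothetical name, and it emits only the fields visible in the hunk (the real function may set more).

```javascript
// Sketch of the updated vasm parser; hypothetical name, fields limited to the hunk.
function sketchParseVasm(text) {
  const re = /^(?<sev>fatal error|error|warning)\s+\d+\s+in\s+line\s+(?<line>\d+)\s+of\s+"(?<file>[^"]+)":\s*(?<msg>.+)$/gm;
  return [...text.matchAll(re)].map((m) => ({
    // "fatal error" and "error" both collapse to "error"; only warnings stay warnings.
    severity: m.groups.sev === "warning" ? "warning" : "error",
    file: m.groups.file,
    line: parseInt(m.groups.line, 10),
    message: m.groups.msg.trim(),
  }));
}

// Sample lines taken verbatim from the comments in the hunk above.
const vasmSampleLog = [
  'error 22 in line 5 of "/work/main.s": ...',
  'warning 1003 in line 8 of "main.s": ...',
  'fatal error 13 in line 1 of "/work/main.s": could not open <x.bin> for input',
].join("\n");

for (const issue of sketchParseVasm(vasmSampleLog)) {
  console.log(`${issue.file}:${issue.line}: ${issue.severity}: ${issue.message}`);
}
```

With the pre-0.16.0 regex the third line would not match at all, because the line starts with `fatal` and the pattern is anchored at `^`; that is how a missing `incbin` asset ended up with an empty `issues[]`.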