windows-exe-decompiler-mcp-server 0.1.3 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (44)
  1. package/CHANGELOG.md +51 -0
  2. package/CLAUDE_INSTALLATION.md +14 -0
  3. package/CODEX_INSTALLATION.md +11 -0
  4. package/COPILOT_INSTALLATION.md +14 -0
  5. package/README.md +89 -3
  6. package/README_zh.md +532 -0
  7. package/dist/analysis-task-runner.js +3 -0
  8. package/dist/config.d.ts +43 -0
  9. package/dist/config.js +57 -0
  10. package/dist/decompiler-worker.d.ts +14 -1
  11. package/dist/decompiler-worker.js +314 -55
  12. package/dist/ghidra-analysis-status.d.ts +26 -0
  13. package/dist/ghidra-config.d.ts +21 -0
  14. package/dist/ghidra-config.js +142 -8
  15. package/dist/ghidra-execution-summary.d.ts +158 -0
  16. package/dist/ghidra-execution-summary.js +174 -0
  17. package/dist/polling-guidance.d.ts +28 -0
  18. package/dist/polling-guidance.js +75 -0
  19. package/dist/server.js +1 -1
  20. package/dist/setup-guidance.d.ts +2 -0
  21. package/dist/setup-guidance.js +92 -1
  22. package/dist/tools/code-reconstruct-export.d.ts +12 -12
  23. package/dist/tools/dotnet-reconstruct-export.d.ts +6 -6
  24. package/dist/tools/ghidra-analyze.d.ts +2 -0
  25. package/dist/tools/ghidra-analyze.js +8 -1
  26. package/dist/tools/ghidra-health.js +7 -3
  27. package/dist/tools/report-generate.js +31 -3
  28. package/dist/tools/report-summarize.d.ts +255 -0
  29. package/dist/tools/report-summarize.js +12 -4
  30. package/dist/tools/system-health.js +10 -5
  31. package/dist/tools/system-setup-guide.d.ts +8 -8
  32. package/dist/tools/system-setup-guide.js +11 -6
  33. package/dist/tools/task-status.js +36 -4
  34. package/dist/tools/tool-help.js +13 -2
  35. package/dist/workflows/deep-static.js +7 -0
  36. package/dist/workflows/function-explanation-review.d.ts +372 -4
  37. package/dist/workflows/function-explanation-review.js +13 -0
  38. package/dist/workflows/module-reconstruction-review.d.ts +372 -4
  39. package/dist/workflows/module-reconstruction-review.js +13 -0
  40. package/dist/workflows/reconstruct.d.ts +306 -4
  41. package/dist/workflows/reconstruct.js +12 -0
  42. package/dist/workflows/semantic-name-review.d.ts +372 -4
  43. package/dist/workflows/semantic-name-review.js +13 -0
  44. package/package.json +3 -1
package/CHANGELOG.md ADDED
@@ -0,0 +1,51 @@
+ # Changelog
+
+ All notable changes to this project will be documented in this file.
+
+ The format is based on Keep a Changelog, and this project follows Semantic
+ Versioning where practical.
+
+ ## [Unreleased]
+
+ ## [0.1.4] - 2026-03-14
+
+ - Added safer Ghidra defaults for `GHIDRA_PROJECT_ROOT` / `GHIDRA_LOG_ROOT`, automatic project-parent creation, and safer Windows defaults that avoid unstable per-repo relative paths
+ - Fixed bundled `ghidra_scripts` resolution so helper scripts are loaded from the installed package or repository root instead of the current working directory
+ - Added richer Ghidra diagnostics: persisted command/runtime logs, parsed Java exception summaries, normalized remediation hints, and stage progress callbacks for queued analysis
+ - Surfaced structured `ghidra_execution` summaries through `workflow.reconstruct`, `workflow.semantic_name_review`, `workflow.function_explanation_review`, `workflow.module_reconstruction_review`, `report.summarize`, and `report.generate`
+ - Added Java runtime detection and Java 21+ setup guidance across `ghidra.health`, `system.health`, `system.setup.guide`, and high-level workflows
+ - Extended module reconstruction review refresh so all three high-level semantic review workflows now expose the same Ghidra project/log/progress context after export refresh
+ - Stabilized unit coverage for Ghidra analysis failure handling, timeout reporting, Java fallback extraction, and degraded function-index recovery
+
+ ## [0.1.3] - 2026-03-14
+
+ - Added DLL- and COM-oriented profiling with `dll.export.profile` and `com.role.profile`
+ - Added module-level LLM review primitives: `code.module.review.prepare`, `code.module.review`, `code.module.review.apply`, prompt `reverse.module_reconstruction_review`, and `workflow.module_reconstruction_review`
+ - Extended `workflow.reconstruct` with role-aware export strategy so DLL/COM/Rust preflight can influence module grouping and reconstruction priority
+ - Improved runtime memory ingestion with segment/module hints, region ownership, and richer runtime provenance
+ - Added structured setup guidance with `system.setup.guide` and surfaced install/input requirements from health checks and high-level workflows
+ - Refined README, installation docs, and release packaging for the `0.1.3` npm/GitHub release
+
+ ## [0.1.2] - 2026-03-12
+
+ - Upgraded `workflow.reconstruct` with universal preflight orchestration, including binary role profiling, Rust-specific profiling, and optional automatic function-index recovery before export
+ - Aligned `workflow.semantic_name_review` and `workflow.function_explanation_review` with reconstruct refresh preflight, provenance, and selection diff semantics
+ - Added `.pdata`-driven PE recovery tooling: `pe.pdata.extract`, `code.functions.smart_recover`, `pe.symbols.recover`, and `code.functions.define`
+ - Added `workflow.function_index_recover` and `rust_binary.analyze` to make Rust and hard-to-index native samples recoverable even when Ghidra function extraction fails
+ - Hardened sample/original and Ghidra project fallback handling so analysis can continue when older workspaces are incomplete
+ - Stabilized runtime state defaults by moving workspace, database, cache, and audit paths to persistent user-level configuration roots
+
+ ## [0.1.1] - 2026-03-11
+
+ - Added `binary.role.profile` for universal EXE/DLL/.NET/driver role profiling, export surface triage, and COM/service/plugin indicators
+ - Added quality scaffolding with benchmark corpus example and evaluation guidance for future regression baselines
+ - Added async job mode for `workflow.reconstruct`, `workflow.semantic_name_review`, and `workflow.function_explanation_review`
+ - Wired queued workflow execution into the background analysis task runner
+ - Integrated binary role profile output into `report.summarize` and `report.generate`
+ - Added report coverage for runtime/semantic provenance plus binary role context in generated markdown and JSON output
+ - Continued repository and packaging cleanup for public GitHub/npm release
+
+ ## [0.1.0] - 2026-03-11
+
+ - Initial public packaging baseline
+ - MCP server with static PE analysis, Ghidra integration hooks, runtime evidence tools, and reconstruction workflows
package/CLAUDE_INSTALLATION.md CHANGED
@@ -38,6 +38,15 @@ It also pins:
  - `DB_PATH`
  - `CACHE_ROOT`
  - `AUDIT_LOG_PATH`
+ - `GHIDRA_PROJECT_ROOT`
+ - `GHIDRA_LOG_ROOT`
+
+ The server's bundled `ghidra_scripts/` directory is resolved from the installed
+ package or repository root, not from the shell's current working directory. You
+ do not need to manually point Claude at `ExtractFunctions.py`.
+
+ For Ghidra 12.0.4, keep Java 21+ available. If Java is installed outside the
+ system default location, also set `JAVA_HOME`.
 
  ## Pass Ghidra Explicitly
 
@@ -47,6 +56,11 @@ It also pins:
 
  The script writes both `GHIDRA_PATH` and `GHIDRA_INSTALL_DIR`.
 
+ If you want to pin Ghidra project/log roots explicitly, set:
+
+ - `GHIDRA_PROJECT_ROOT`
+ - `GHIDRA_LOG_ROOT`
+
  If you want a different persistent workspace root:
 
  ```powershell
package/CODEX_INSTALLATION.md CHANGED
@@ -23,6 +23,15 @@ It also pins:
  - `DB_PATH`
  - `CACHE_ROOT`
  - `AUDIT_LOG_PATH`
+ - `GHIDRA_PROJECT_ROOT`
+ - `GHIDRA_LOG_ROOT`
+
+ The server's bundled `ghidra_scripts/` directory is resolved from the installed
+ package or repository root, not from the shell's current working directory. You
+ do not need to manually configure a script path for `ExtractFunctions.py`.
+
+ For Ghidra 12.0.4, keep Java 21+ available. If Java is installed in a custom
+ location, set `JAVA_HOME` before starting Codex.
 
  If Ghidra is not already configured through `GHIDRA_PATH` or
  `GHIDRA_INSTALL_DIR`, pass it explicitly:
@@ -44,6 +53,8 @@ If you want a different persistent workspace root:
  - updates `~/.codex/config.toml`
  - writes `WORKSPACE_ROOT` so workspaces do not depend on the current repo path
  - writes `GHIDRA_PATH` and `GHIDRA_INSTALL_DIR` when a Ghidra path is provided
+ - honors `GHIDRA_PROJECT_ROOT` and `GHIDRA_LOG_ROOT` when you want Ghidra
+ projects and runtime logs under a fixed location
 
  ## Manual configuration example
 
package/COPILOT_INSTALLATION.md CHANGED
@@ -15,6 +15,15 @@ It also pins:
  - `DB_PATH`
  - `CACHE_ROOT`
  - `AUDIT_LOG_PATH`
+ - `GHIDRA_PROJECT_ROOT`
+ - `GHIDRA_LOG_ROOT`
+
+ The server's bundled `ghidra_scripts/` directory is resolved from the installed
+ package or repository root, not from the shell's current working directory. You
+ do not need to separately point Copilot at `ExtractFunctions.py`.
+
+ For Ghidra 12.0.4, keep Java 21+ available. If Java is installed outside the
+ default system location, set `JAVA_HOME` before launching Copilot clients.
 
  Build the project first:
 
@@ -28,6 +37,11 @@ If Ghidra is not already configured in the environment, pass it explicitly:
  .\install-to-copilot.ps1 -GhidraPath "C:\tools\ghidra"
  ```
 
+ If you want to pin Ghidra projects and logs under a fixed location, set:
+
+ - `GHIDRA_PROJECT_ROOT`
+ - `GHIDRA_LOG_ROOT`
+
  If you want a different persistent workspace root:
 
  ```powershell
package/README.md CHANGED
@@ -4,6 +4,36 @@ Chinese version: [`README_zh.md`](./README_zh.md)
 
  An MCP server for Windows reverse engineering. It exposes PE triage, Ghidra-backed inspection, DLL/COM profiling, runtime evidence ingestion, Rust/.NET recovery, source-like reconstruction, and LLM-assisted review as reusable MCP tools for any tool-calling LLM.
 
+ ## Feature highlights
+
+ - Universal Windows PE coverage: EXE, DLL, COM-oriented libraries, Rust-native samples, and .NET assemblies all have dedicated profiling or recovery paths.
+ - Recover-first design: when Ghidra function extraction is empty or degraded, the server can continue with `.pdata` parsing, boundary recovery, symbol recovery, and imported function definitions.
+ - Observable Ghidra runs: command logs, runtime logs, staged progress, project/log roots, and parsed Java exception summaries are surfaced through high-level outputs.
+ - Runtime-aware reconstruction: static evidence, trace imports, memory snapshots, and semantic review artifacts can all be correlated back into reconstruct and report workflows.
+ - LLM-assisted review layers: function naming, function explanation, and module reconstruction review are exposed as structured MCP flows instead of ad hoc prompts.
+ - Queue-friendly orchestration: long-running workflows return `job_id`, progress, and `polling_guidance` so MCP clients can wait efficiently instead of burning tokens on tight polling loops.
+
+ ## Typical analysis flows
+
+ ### Quick triage
+
+ 1. `sample.ingest`
+ 2. `workflow.triage`
+ 3. `report.summarize`
+
+ ### Hard native recovery
+
+ 1. `ghidra.analyze`
+ 2. `workflow.function_index_recover`
+ 3. `workflow.reconstruct`
+
+ ### LLM-assisted refinement
+
+ 1. `workflow.reconstruct`
+ 2. `workflow.semantic_name_review`
+ 3. `workflow.function_explanation_review`
+ 4. `workflow.module_reconstruction_review`
+
  ## What this server is for
 
  This project is meant to be a reusable reverse-engineering tool surface, not a pile of one-off local scripts.
@@ -14,6 +44,8 @@ It is designed to help MCP clients:
  - inspect imports, exports, strings, packers, runtime hints, and binary role
  - use Ghidra when available for decompile, CFG, search, and reconstruction
  - recover usable function indexes when Ghidra function extraction fails
+ - surface actionable setup guidance when Java, Python extras, or Ghidra are missing
+ - expose richer Ghidra diagnostics, command logs, and stage/progress metadata when analysis fails
  - correlate static evidence, runtime traces, memory snapshots, and semantic review artifacts
  - export source-like reconstruction output with optional build and harness validation
 
@@ -117,6 +149,8 @@ It can:
  - export native or .NET reconstruction output
  - optionally validate build and run the generated harness
  - tune export strategy based on role-aware preflight for native Rust, DLL, and COM-oriented samples
+ - return structured setup guidance when Java, Ghidra, or optional dependencies are not ready
+ - expose stage-oriented progress metadata for queued and foreground runs
  - carry runtime and semantic provenance through the result
 
  ### `workflow.function_index_recover`
@@ -131,15 +165,15 @@ Use this when Ghidra analysis exists but function extraction is empty or degrade
 
  ### `workflow.semantic_name_review`
 
- High-level semantic naming review workflow for external LLM clients. It can prepare evidence, request model review through MCP sampling when available, apply accepted names, and optionally refresh reconstruct/export output.
+ High-level semantic naming review workflow for external LLM clients. It can prepare evidence, request model review through MCP sampling when available, apply accepted names, and optionally refresh reconstruct/export output. When export refresh runs, the workflow now carries the same `ghidra_execution` summary used by `workflow.reconstruct`, including project root, log root, command/runtime log paths, progress stages, and parsed Java exception context.
 
  ### `workflow.function_explanation_review`
 
- High-level explanation workflow for external LLM clients. It can prepare evidence, request structured explanations, apply them, and optionally rerun reconstruct/export.
+ High-level explanation workflow for external LLM clients. It can prepare evidence, request structured explanations, apply them, and optionally rerun reconstruct/export. Export refresh results also surface `ghidra_execution` so explanation-heavy review chains still expose Ghidra project/log context and progress metadata.
 
  ### `workflow.module_reconstruction_review`
 
- High-level module review workflow for external LLM clients. It can prepare reconstructed modules for review, request structured module refinements through MCP sampling when available, apply accepted module summaries and guidance, and optionally refresh reconstruct/export output.
+ High-level module review workflow for external LLM clients. It can prepare reconstructed modules for review, request structured module refinements through MCP sampling when available, apply accepted module summaries and guidance, and optionally refresh reconstruct/export output. When export refresh runs, the workflow also carries `ghidra_execution` so module-level review chains expose Ghidra project/log context and progress metadata just like reconstruct and function-level review workflows.
 
  ## Universal recovery model
 
@@ -212,6 +246,11 @@ Use these with:
  - `task.cancel`
  - `task.sweep`
 
+ Queued workflow responses and `task.status` now include `polling_guidance`.
+ When a long-running Ghidra or reconstruct job is still queued/running, MCP
+ clients should prefer one client-side sleep/wait using that recommendation
+ instead of repeated immediate polling.
+
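The client-side wait described in the added README paragraph could look roughly like the sketch below. The shape of `polling_guidance` is not visible in this diff, so the `recommended_wait_ms` field, the status strings, and the 5-second fallback are all illustrative assumptions rather than the package's actual contract.

```typescript
// Hypothetical shapes: the real `polling_guidance` fields may differ.
interface PollingGuidance {
  recommended_wait_ms?: number; // assumed field name
}

interface TaskStatus {
  status: "queued" | "running" | "completed" | "failed"; // assumed values
  polling_guidance?: PollingGuidance;
}

const DEFAULT_WAIT_MS = 5_000; // fallback chosen for this sketch only

// Decide how long to sleep before the next `task.status` call.
// Returns 0 when the job is finished and no further polling is needed.
function nextWaitMs(task: TaskStatus): number {
  if (task.status !== "queued" && task.status !== "running") {
    return 0;
  }
  const hinted = task.polling_guidance?.recommended_wait_ms;
  return typeof hinted === "number" && hinted > 0 ? hinted : DEFAULT_WAIT_MS;
}

// One sleep per poll instead of a tight loop.
async function waitForTask(
  fetchStatus: () => Promise<TaskStatus>, // caller-supplied wrapper around `task.status`
): Promise<TaskStatus> {
  for (;;) {
    const task = await fetchStatus();
    const waitMs = nextWaitMs(task);
    if (waitMs === 0) return task;
    await new Promise<void>((resolve) => setTimeout(resolve, waitMs));
  }
}
```

Keeping the wait decision in a pure function (`nextWaitMs`) makes it easy to unit-test separately from the network call.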
  ## Environment bootstrap and setup guidance
 
  If a client starts using the server before Python, dynamic-analysis extras, or Ghidra are configured, use:
@@ -224,9 +263,47 @@ If a client starts using the server before Python, dynamic-analysis extras, or G
  These return structured setup actions and required user inputs so an MCP client can explicitly ask for:
 
  - `python -m pip install ...`
+ - `JAVA_HOME`
  - `GHIDRA_PATH` / `GHIDRA_INSTALL_DIR`
+ - `GHIDRA_PROJECT_ROOT` / `GHIDRA_LOG_ROOT`
  - optional dynamic-analysis extras such as Speakeasy/Frida dependencies
 
+ For Ghidra 12.0.4, the server expects Java 21+ and will report explicit Java compatibility hints through:
+
+ - `ghidra.health`
+ - `system.health`
+ - `system.setup.guide`
+
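To make the Java 21+ requirement concrete, here is a minimal sketch of a version check against `java -version` output. It is not the package's detection code, which this diff does not show. The function names are hypothetical; only the `21` threshold comes from the README text.

```typescript
// Extract the Java feature (major) version from `java -version` output.
// Handles both the legacy "1.8.0_381" scheme and the modern "21.0.2" scheme.
function parseJavaMajor(versionOutput: string): number | null {
  const match = /version "(\d+)(?:\.(\d+))?/.exec(versionOutput);
  if (!match) return null;
  const first = Number(match[1]);
  // Legacy scheme: "1.x" means Java x (for example 1.8 is Java 8).
  if (first === 1 && match[2] !== undefined) return Number(match[2]);
  return first;
}

const MIN_JAVA_MAJOR = 21; // from the README: Ghidra 12.0.4 expects Java 21+

function isJavaSupported(versionOutput: string): boolean {
  const major = parseJavaMajor(versionOutput);
  return major !== null && major >= MIN_JAVA_MAJOR;
}
```

Note that `java -version` prints to stderr rather than stdout, so a caller spawning the process would need to capture that stream.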
+ When Ghidra commands fail, the server now persists command logs and, when available, Ghidra runtime logs. Normalized diagnostics include Java exception summaries and remediation hints instead of only returning `exit code 1`.
+
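As an illustration of what a "Java exception summary" might involve, the sketch below pulls the first exception line and its `Caused by:` chain out of log text. It relies only on the standard Java stack-trace layout. The `JavaExceptionSummary` shape and function name are hypothetical and do not reflect the fields the package emits.

```typescript
interface JavaExceptionSummary {
  exceptionClass: string;
  message: string;
  causes: string[]; // "Caused by:" lines, in order of appearance
}

// Find the first Java exception line in a log and collect its "Caused by:" chain.
// Returns null when the log contains no recognizable exception.
function summarizeJavaException(log: string): JavaExceptionSummary | null {
  // Matches e.g. `java.lang.IllegalStateException: message`, optionally
  // prefixed with `Exception in thread "main" `.
  const head =
    /^(?:Exception in thread "[^"]*" )?((?:[\w$]+\.)+[\w$]*(?:Exception|Error))(?::\s*(.*))?$/;
  let summary: JavaExceptionSummary | null = null;
  for (const raw of log.split(/\r?\n/)) {
    const line = raw.trim();
    if (summary === null) {
      const m = head.exec(line);
      if (m) {
        const [, exceptionClass = "", message = ""] = m;
        summary = { exceptionClass, message, causes: [] };
      }
    } else if (line.startsWith("Caused by: ")) {
      summary.causes.push(line.slice("Caused by: ".length));
    }
  }
  return summary;
}
```

A real diagnostic layer would likely also map known exception classes to remediation hints. That mapping is left out here because the diff does not show it.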
+ The bundled `ghidra_scripts/` directory is resolved from the installed package
+ or repository root, not from the current working directory. This prevents
+ `ExtractFunctions.py` / `ExtractFunctions.java` lookup failures when the server
+ is launched from a different folder.
+
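The difference between package-relative and working-directory-relative resolution can be sketched as follows. This assumes the compiled module lives in a `dist/` folder one level below the package root, which matches the file list in this diff. The function name is hypothetical and the actual logic in `ghidra-config.js` may differ.

```typescript
import * as path from "node:path";

// Resolve the bundled scripts directory relative to the module's own location
// rather than process.cwd(). `moduleDir` would typically be `__dirname` (CommonJS)
// or `path.dirname(fileURLToPath(import.meta.url))` (ES modules).
// Assumption: `moduleDir` is the `dist/` folder directly under the package root.
function resolveBundledScriptsDir(moduleDir: string): string {
  const packageRoot = path.resolve(moduleDir, "..");
  return path.join(packageRoot, "ghidra_scripts");
}
```

Because the result depends only on where the module file sits, launching the server from another folder does not change which scripts directory is used.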
+ ## Ghidra execution visibility
+
+ High-level outputs now expose a structured `ghidra_execution` block instead of hiding Ghidra details behind generic success/failure states.
+
+ You can now see:
+
+ - which analysis record was selected
+ - whether the result came from the best ready analysis or only the latest attempt
+ - project path, project root, and log root
+ - persisted command logs and runtime logs
+ - function extraction status and script name
+ - staged progress metadata
+ - parsed Java exception summaries when Ghidra fails
+
+ This summary is surfaced through:
+
+ - `workflow.reconstruct`
+ - `workflow.semantic_name_review` when export refresh runs
+ - `workflow.function_explanation_review` when export refresh runs
+ - `workflow.module_reconstruction_review` when export refresh runs
+ - `report.summarize`
+ - `report.generate`
+
  ## Project layout
 
  ```text
@@ -333,10 +410,19 @@ By default, runtime state is stored under the user profile instead of the curren
  - SQLite database: `%USERPROFILE%/.windows-exe-decompiler-mcp-server/data/database.db`
  - File cache: `%USERPROFILE%/.windows-exe-decompiler-mcp-server/cache`
  - Audit log: `%USERPROFILE%/.windows-exe-decompiler-mcp-server/audit.log`
+ - Ghidra project root: `%ProgramData%/.windows-exe-decompiler-mcp-server/ghidra-projects`
+ - Ghidra log root: `%ProgramData%/.windows-exe-decompiler-mcp-server/ghidra-logs`
+ - Bundled Ghidra scripts: resolved from the installed package root
 
  You can override these with environment variables or the user config file:
 
  - `%USERPROFILE%/.windows-exe-decompiler-mcp-server/config.json`
+ - `WORKSPACE_ROOT`
+ - `DB_PATH`
+ - `CACHE_ROOT`
+ - `AUDIT_LOG_PATH`
+ - `GHIDRA_PROJECT_ROOT`
+ - `GHIDRA_LOG_ROOT`
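A minimal sketch of how the two Ghidra roots listed above could be resolved, with environment variables taking precedence over the documented `%ProgramData%` defaults. The function name and the `C:\ProgramData` fallback are assumptions. The user config file is left out because the diff does not show how it ranks against environment variables.

```typescript
import * as path from "node:path";

const APP_DIR = ".windows-exe-decompiler-mcp-server";

interface GhidraRoots {
  projectRoot: string;
  logRoot: string;
}

// Explicit environment variables win; otherwise fall back to the documented
// %ProgramData% defaults. The `C:\ProgramData` value used when ProgramData is
// unset is an assumption of this sketch.
function resolveGhidraRoots(env: Record<string, string | undefined>): GhidraRoots {
  const base = path.win32.join(env["ProgramData"] ?? "C:\\ProgramData", APP_DIR);
  return {
    projectRoot: env["GHIDRA_PROJECT_ROOT"] ?? path.win32.join(base, "ghidra-projects"),
    logRoot: env["GHIDRA_LOG_ROOT"] ?? path.win32.join(base, "ghidra-logs"),
  };
}
```

In a real server this would be called with `process.env`. Passing the environment in as a parameter keeps the function testable on any platform.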
 
  ## Sample ingest note