windows-exe-decompiler-mcp-server 0.1.3 → 0.1.4

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (44)

package/CHANGELOG.md +51 -0
package/CLAUDE_INSTALLATION.md +14 -0
package/CODEX_INSTALLATION.md +11 -0
package/COPILOT_INSTALLATION.md +14 -0
package/README.md +89 -3
package/README_zh.md +532 -0
package/dist/analysis-task-runner.js +3 -0
package/dist/config.d.ts +43 -0
package/dist/config.js +57 -0
package/dist/decompiler-worker.d.ts +14 -1
package/dist/decompiler-worker.js +314 -55
package/dist/ghidra-analysis-status.d.ts +26 -0
package/dist/ghidra-config.d.ts +21 -0
package/dist/ghidra-config.js +142 -8
package/dist/ghidra-execution-summary.d.ts +158 -0
package/dist/ghidra-execution-summary.js +174 -0
package/dist/polling-guidance.d.ts +28 -0
package/dist/polling-guidance.js +75 -0
package/dist/server.js +1 -1
package/dist/setup-guidance.d.ts +2 -0
package/dist/setup-guidance.js +92 -1
package/dist/tools/code-reconstruct-export.d.ts +12 -12
package/dist/tools/dotnet-reconstruct-export.d.ts +6 -6
package/dist/tools/ghidra-analyze.d.ts +2 -0
package/dist/tools/ghidra-analyze.js +8 -1
package/dist/tools/ghidra-health.js +7 -3
package/dist/tools/report-generate.js +31 -3
package/dist/tools/report-summarize.d.ts +255 -0
package/dist/tools/report-summarize.js +12 -4
package/dist/tools/system-health.js +10 -5
package/dist/tools/system-setup-guide.d.ts +8 -8
package/dist/tools/system-setup-guide.js +11 -6
package/dist/tools/task-status.js +36 -4
package/dist/tools/tool-help.js +13 -2
package/dist/workflows/deep-static.js +7 -0
package/dist/workflows/function-explanation-review.d.ts +372 -4
package/dist/workflows/function-explanation-review.js +13 -0
package/dist/workflows/module-reconstruction-review.d.ts +372 -4
package/dist/workflows/module-reconstruction-review.js +13 -0
package/dist/workflows/reconstruct.d.ts +306 -4
package/dist/workflows/reconstruct.js +12 -0
package/dist/workflows/semantic-name-review.d.ts +372 -4
package/dist/workflows/semantic-name-review.js +13 -0
package/package.json +3 -1

package/CHANGELOG.md ADDED

@@ -0,0 +1,51 @@
+# Changelog
+All notable changes to this project will be documented in this file.
+The format is based on Keep a Changelog, and this project follows Semantic
+Versioning where practical.
+## [Unreleased]
+## [0.1.4] - 2026-03-14
+- Added safer Ghidra defaults for `GHIDRA_PROJECT_ROOT` / `GHIDRA_LOG_ROOT`, automatic project-parent creation, and safer Windows defaults that avoid unstable per-repo relative paths
+- Fixed bundled `ghidra_scripts` resolution so helper scripts are loaded from the installed package or repository root instead of the current working directory
+- Added richer Ghidra diagnostics: persisted command/runtime logs, parsed Java exception summaries, normalized remediation hints, and stage progress callbacks for queued analysis
+- Surfaced structured `ghidra_execution` summaries through `workflow.reconstruct`, `workflow.semantic_name_review`, `workflow.function_explanation_review`, `workflow.module_reconstruction_review`, `report.summarize`, and `report.generate`
+- Added Java runtime detection and Java 21+ setup guidance across `ghidra.health`, `system.health`, `system.setup.guide`, and high-level workflows
+- Extended module reconstruction review refresh so all three high-level semantic review workflows now expose the same Ghidra project/log/progress context after export refresh
+- Stabilized unit coverage for Ghidra analysis failure handling, timeout reporting, Java fallback extraction, and degraded function-index recovery
+## [0.1.3] - 2026-03-14
+- Added DLL- and COM-oriented profiling with `dll.export.profile` and `com.role.profile`
+- Added module-level LLM review primitives: `code.module.review.prepare`, `code.module.review`, `code.module.review.apply`, prompt `reverse.module_reconstruction_review`, and `workflow.module_reconstruction_review`
+- Extended `workflow.reconstruct` with role-aware export strategy so DLL/COM/Rust preflight can influence module grouping and reconstruction priority
+- Improved runtime memory ingestion with segment/module hints, region ownership, and richer runtime provenance
+- Added structured setup guidance with `system.setup.guide` and surfaced install/input requirements from health checks and high-level workflows
+- Refined README, installation docs, and release packaging for the `0.1.3` npm/GitHub release
+## [0.1.2] - 2026-03-12
+- Upgraded `workflow.reconstruct` with universal preflight orchestration, including binary role profiling, Rust-specific profiling, and optional automatic function-index recovery before export
+- Aligned `workflow.semantic_name_review` and `workflow.function_explanation_review` with reconstruct refresh preflight, provenance, and selection diff semantics
+- Added `.pdata`-driven PE recovery tooling: `pe.pdata.extract`, `code.functions.smart_recover`, `pe.symbols.recover`, and `code.functions.define`
+- Added `workflow.function_index_recover` and `rust_binary.analyze` to make Rust and hard-to-index native samples recoverable even when Ghidra function extraction fails
+- Hardened sample/original and Ghidra project fallback handling so analysis can continue when older workspaces are incomplete
+- Stabilized runtime state defaults by moving workspace, database, cache, and audit paths to persistent user-level configuration roots
+## [0.1.1] - 2026-03-11
+- Added `binary.role.profile` for universal EXE/DLL/.NET/driver role profiling, export surface triage, and COM/service/plugin indicators
+- Added quality scaffolding with benchmark corpus example and evaluation guidance for future regression baselines
+- Added async job mode for `workflow.reconstruct`, `workflow.semantic_name_review`, and `workflow.function_explanation_review`
+- Wired queued workflow execution into the background analysis task runner
+- Integrated binary role profile output into `report.summarize` and `report.generate`
+- Added report coverage for runtime/semantic provenance plus binary role context in generated markdown and JSON output
+- Continued repository and packaging cleanup for public GitHub/npm release
+## [0.1.0] - 2026-03-11
+- Initial public packaging baseline
+- MCP server with static PE analysis, Ghidra integration hooks, runtime evidence tools, and reconstruction workflows

package/CLAUDE_INSTALLATION.md CHANGED

@@ -38,6 +38,15 @@ It also pins:
 - `DB_PATH`
 - `CACHE_ROOT`
 - `AUDIT_LOG_PATH`
+- `GHIDRA_PROJECT_ROOT`
+- `GHIDRA_LOG_ROOT`
+The server's bundled `ghidra_scripts/` directory is resolved from the installed
+package or repository root, not from the shell's current working directory. You
+do not need to manually point Claude at `ExtractFunctions.py`.
+For Ghidra 12.0.4, keep Java 21+ available. If Java is installed outside the
+system default location, also set `JAVA_HOME`.
 ## Pass Ghidra Explicitly
@@ -47,6 +56,11 @@ It also pins:
 The script writes both `GHIDRA_PATH` and `GHIDRA_INSTALL_DIR`.
+If you want to pin Ghidra project/log roots explicitly, set:
+- `GHIDRA_PROJECT_ROOT`
+- `GHIDRA_LOG_ROOT`
 If you want a different persistent workspace root:
 ```powershell

package/CODEX_INSTALLATION.md CHANGED

@@ -23,6 +23,15 @@ It also pins:
 - `DB_PATH`
 - `CACHE_ROOT`
 - `AUDIT_LOG_PATH`
+- `GHIDRA_PROJECT_ROOT`
+- `GHIDRA_LOG_ROOT`
+The server's bundled `ghidra_scripts/` directory is resolved from the installed
+package or repository root, not from the shell's current working directory. You
+do not need to manually configure a script path for `ExtractFunctions.py`.
+For Ghidra 12.0.4, keep Java 21+ available. If Java is installed in a custom
+location, set `JAVA_HOME` before starting Codex.
 If Ghidra is not already configured through `GHIDRA_PATH` or
 `GHIDRA_INSTALL_DIR`, pass it explicitly:
@@ -44,6 +53,8 @@ If you want a different persistent workspace root:
 - updates `~/.codex/config.toml`
 - writes `WORKSPACE_ROOT` so workspaces do not depend on the current repo path
 - writes `GHIDRA_PATH` and `GHIDRA_INSTALL_DIR` when a Ghidra path is provided
+- honors `GHIDRA_PROJECT_ROOT` and `GHIDRA_LOG_ROOT` when you want Ghidra
+  projects and runtime logs under a fixed location
 ## Manual configuration example

package/COPILOT_INSTALLATION.md CHANGED

@@ -15,6 +15,15 @@ It also pins:
 - `DB_PATH`
 - `CACHE_ROOT`
 - `AUDIT_LOG_PATH`
+- `GHIDRA_PROJECT_ROOT`
+- `GHIDRA_LOG_ROOT`
+The server's bundled `ghidra_scripts/` directory is resolved from the installed
+package or repository root, not from the shell's current working directory. You
+do not need to separately point Copilot at `ExtractFunctions.py`.
+For Ghidra 12.0.4, keep Java 21+ available. If Java is installed outside the
+default system location, set `JAVA_HOME` before launching Copilot clients.
 Build the project first:
@@ -28,6 +37,11 @@ If Ghidra is not already configured in the environment, pass it explicitly:
 .\install-to-copilot.ps1 -GhidraPath "C:\tools\ghidra"
 ```
+If you want to pin Ghidra projects and logs under a fixed location, set:
+- `GHIDRA_PROJECT_ROOT`
+- `GHIDRA_LOG_ROOT`
 If you want a different persistent workspace root:
 ```powershell

package/README.md CHANGED

@@ -4,6 +4,36 @@ Chinese version: [`README_zh.md`](./README_zh.md)
 An MCP server for Windows reverse engineering. It exposes PE triage, Ghidra-backed inspection, DLL/COM profiling, runtime evidence ingestion, Rust/.NET recovery, source-like reconstruction, and LLM-assisted review as reusable MCP tools for any tool-calling LLM.
+## Feature highlights
+- Universal Windows PE coverage: EXE, DLL, COM-oriented libraries, Rust-native samples, and .NET assemblies all have dedicated profiling or recovery paths.
+- Recover-first design: when Ghidra function extraction is empty or degraded, the server can continue with `.pdata` parsing, boundary recovery, symbol recovery, and imported function definitions.
+- Observable Ghidra runs: command logs, runtime logs, staged progress, project/log roots, and parsed Java exception summaries are surfaced through high-level outputs.
+- Runtime-aware reconstruction: static evidence, trace imports, memory snapshots, and semantic review artifacts can all be correlated back into reconstruct and report workflows.
+- LLM-assisted review layers: function naming, function explanation, and module reconstruction review are exposed as structured MCP flows instead of ad hoc prompts.
+- Queue-friendly orchestration: long-running workflows return `job_id`, progress, and `polling_guidance` so MCP clients can wait efficiently instead of burning tokens on tight polling loops.
+## Typical analysis flows
+### Quick triage
+1. `sample.ingest`
+2. `workflow.triage`
+3. `report.summarize`
+### Hard native recovery
+1. `ghidra.analyze`
+2. `workflow.function_index_recover`
+3. `workflow.reconstruct`
+### LLM-assisted refinement
+1. `workflow.reconstruct`
+2. `workflow.semantic_name_review`
+3. `workflow.function_explanation_review`
+4. `workflow.module_reconstruction_review`
 ## What this server is for
 This project is meant to be a reusable reverse-engineering tool surface, not a pile of one-off local scripts.
@@ -14,6 +44,8 @@ It is designed to help MCP clients:
 - inspect imports, exports, strings, packers, runtime hints, and binary role
 - use Ghidra when available for decompile, CFG, search, and reconstruction
 - recover usable function indexes when Ghidra function extraction fails
+- surface actionable setup guidance when Java, Python extras, or Ghidra are missing
+- expose richer Ghidra diagnostics, command logs, and stage/progress metadata when analysis fails
 - correlate static evidence, runtime traces, memory snapshots, and semantic review artifacts
 - export source-like reconstruction output with optional build and harness validation
@@ -117,6 +149,8 @@ It can:
 - export native or .NET reconstruction output
 - optionally validate build and run the generated harness
 - tune export strategy based on role-aware preflight for native Rust, DLL, and COM-oriented samples
+- return structured setup guidance when Java, Ghidra, or optional dependencies are not ready
+- expose stage-oriented progress metadata for queued and foreground runs
 - carry runtime and semantic provenance through the result
 ### `workflow.function_index_recover`
@@ -131,15 +165,15 @@ Use this when Ghidra analysis exists but function extraction is empty or degrade
 ### `workflow.semantic_name_review`
-High-level semantic naming review workflow for external LLM clients. It can prepare evidence, request model review through MCP sampling when available, apply accepted names, and optionally refresh reconstruct/export output.
+High-level semantic naming review workflow for external LLM clients. It can prepare evidence, request model review through MCP sampling when available, apply accepted names, and optionally refresh reconstruct/export output. When export refresh runs, the workflow now carries the same `ghidra_execution` summary used by `workflow.reconstruct`, including project root, log root, command/runtime log paths, progress stages, and parsed Java exception context.
 ### `workflow.function_explanation_review`
-High-level explanation workflow for external LLM clients. It can prepare evidence, request structured explanations, apply them, and optionally rerun reconstruct/export.
+High-level explanation workflow for external LLM clients. It can prepare evidence, request structured explanations, apply them, and optionally rerun reconstruct/export. Export refresh results also surface `ghidra_execution` so explanation-heavy review chains still expose Ghidra project/log context and progress metadata.
 ### `workflow.module_reconstruction_review`
-High-level module review workflow for external LLM clients. It can prepare reconstructed modules for review, request structured module refinements through MCP sampling when available, apply accepted module summaries and guidance, and optionally refresh reconstruct/export output.
+High-level module review workflow for external LLM clients. It can prepare reconstructed modules for review, request structured module refinements through MCP sampling when available, apply accepted module summaries and guidance, and optionally refresh reconstruct/export output. When export refresh runs, the workflow also carries `ghidra_execution` so module-level review chains expose Ghidra project/log context and progress metadata just like reconstruct and function-level review workflows.
 ## Universal recovery model
@@ -212,6 +246,11 @@ Use these with:
 - `task.cancel`
 - `task.sweep`
+Queued workflow responses and `task.status` now include `polling_guidance`.
+When a long-running Ghidra or reconstruct job is still queued/running, MCP
+clients should prefer one client-side sleep/wait using that recommendation
+instead of repeated immediate polling.
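
The waiting behaviour recommended above could be sketched on the client side roughly as follows. This is an illustration only, not the server's or any MCP client's actual code: the diff does not show the shape of `polling_guidance`, so the `recommended_wait_ms` field, the status values, the default and cap constants, and the `getStatus`/`sleep` callbacks are all hypothetical.

```typescript
// Hypothetical shapes: the real `polling_guidance` fields are not shown in this diff.
interface PollingGuidance {
  recommended_wait_ms?: number; // assumed field name
}

interface TaskStatus {
  status: "queued" | "running" | "completed" | "failed"; // assumed status values
  polling_guidance?: PollingGuidance;
}

const DEFAULT_WAIT_MS = 5_000; // assumed fallback when no hint is present
const MAX_WAIT_MS = 60_000; // assumed upper bound on a single wait

// Returns how long to wait before the next status check, or null when the
// task has reached a terminal state and no further polling is needed.
function nextWaitMs(task: TaskStatus): number | null {
  if (task.status === "completed" || task.status === "failed") {
    return null;
  }
  const hinted = task.polling_guidance?.recommended_wait_ms;
  const wait = typeof hinted === "number" && hinted > 0 ? hinted : DEFAULT_WAIT_MS;
  return Math.min(wait, MAX_WAIT_MS);
}

// One sleep per status check, sized by the server's recommendation,
// instead of repeated immediate polling.
async function waitForTask(
  getStatus: () => Promise<TaskStatus>,
  sleep: (ms: number) => Promise<void>,
): Promise<TaskStatus> {
  for (;;) {
    const task = await getStatus();
    const wait = nextWaitMs(task);
    if (wait === null) {
      return task;
    }
    await sleep(wait);
  }
}
```

Passing `getStatus` and `sleep` in as parameters keeps the sketch independent of any particular MCP client library; a real client would wrap its own `task.status` call and timer.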
 ## Environment bootstrap and setup guidance
 If a client starts using the server before Python, dynamic-analysis extras, or Ghidra are configured, use:
@@ -224,9 +263,47 @@ If a client starts using the server before Python, dynamic-analysis extras, or G
 These return structured setup actions and required user inputs so an MCP client can explicitly ask for:
 - `python -m pip install ...`
+- `JAVA_HOME`
 - `GHIDRA_PATH` / `GHIDRA_INSTALL_DIR`
+- `GHIDRA_PROJECT_ROOT` / `GHIDRA_LOG_ROOT`
 - optional dynamic-analysis extras such as Speakeasy/Frida dependencies
+For Ghidra 12.0.4, the server expects Java 21+ and will report explicit Java compatibility hints through:
+- `ghidra.health`
+- `system.health`
+- `system.setup.guide`
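
A Java 21+ compatibility check of the kind described above might look like the sketch below. It is not taken from the package: the function names are hypothetical, and it only assumes the standard `java -version` banner formats (`openjdk version "21.0.2" ...` and the legacy `java version "1.8.0_391"` scheme).

```typescript
// Hypothetical helper: extracts the major Java version from `java -version` output.
function parseJavaMajor(versionOutput: string): number | null {
  const match = /version "(\d+)(?:\.(\d+))?/.exec(versionOutput);
  if (!match) {
    return null; // banner not recognized
  }
  const first = Number(match[1]);
  // Legacy scheme: "1.8.0_391" means Java 8, so the second component is the major.
  if (first === 1 && match[2] !== undefined) {
    return Number(match[2]);
  }
  return first;
}

// Hypothetical helper: true when the detected runtime satisfies the minimum
// major version (21 by default, matching the Java 21+ guidance above).
function meetsJavaRequirement(versionOutput: string, minMajor = 21): boolean {
  const major = parseJavaMajor(versionOutput);
  return major !== null && major >= minMajor;
}
```

An unrecognized banner is treated as "not compatible" rather than raising an error, so a caller can turn it into a setup hint such as asking the user for `JAVA_HOME`.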
+When Ghidra commands fail, the server now persists command logs and, when available, Ghidra runtime logs. Normalized diagnostics include Java exception summaries and remediation hints instead of only returning `exit code 1`.
+The bundled `ghidra_scripts/` directory is resolved from the installed package
+or repository root, not from the current working directory. This prevents
+`ExtractFunctions.py` / `ExtractFunctions.java` lookup failures when the server
+is launched from a different folder.
+## Ghidra execution visibility
+High-level outputs now expose a structured `ghidra_execution` block instead of hiding Ghidra details behind generic success/failure states.
+You can now see:
+- which analysis record was selected
+- whether the result came from the best ready analysis or only the latest attempt
+- project path, project root, and log root
+- persisted command logs and runtime logs
+- function extraction status and script name
+- staged progress metadata
+- parsed Java exception summaries when Ghidra fails
+This summary is surfaced through:
+- `workflow.reconstruct`
+- `workflow.semantic_name_review` when export refresh runs
+- `workflow.function_explanation_review` when export refresh runs
+- `workflow.module_reconstruction_review` when export refresh runs
+- `report.summarize`
+- `report.generate`
 ## Project layout
 ```text
@@ -333,10 +410,19 @@ By default, runtime state is stored under the user profile instead of the curren
 - SQLite database: `%USERPROFILE%/.windows-exe-decompiler-mcp-server/data/database.db`
 - File cache: `%USERPROFILE%/.windows-exe-decompiler-mcp-server/cache`
 - Audit log: `%USERPROFILE%/.windows-exe-decompiler-mcp-server/audit.log`
+- Ghidra project root: `%ProgramData%/.windows-exe-decompiler-mcp-server/ghidra-projects`
+- Ghidra log root: `%ProgramData%/.windows-exe-decompiler-mcp-server/ghidra-logs`
+- Bundled Ghidra scripts: resolved from the installed package root
 You can override these with environment variables or the user config file:
 - `%USERPROFILE%/.windows-exe-decompiler-mcp-server/config.json`
+- `WORKSPACE_ROOT`
+- `DB_PATH`
+- `CACHE_ROOT`
+- `AUDIT_LOG_PATH`
+- `GHIDRA_PROJECT_ROOT`
+- `GHIDRA_LOG_ROOT`
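
The override behaviour for the two Ghidra roots can be illustrated with the sketch below. It is an assumption-laden example, not the code in `dist/config.js`: the function and type names are hypothetical, the `C:\ProgramData` fallback is assumed, and the user config file is left out because the diff does not state its precedence relative to environment variables.

```typescript
// Hypothetical sketch of default-plus-override resolution for the Ghidra roots.
type Env = Record<string, string | undefined>;

interface GhidraRoots {
  projectRoot: string;
  logRoot: string;
}

const APP_DIR = ".windows-exe-decompiler-mcp-server";

function resolveGhidraRoots(env: Env): GhidraRoots {
  // Assumed fallback when %ProgramData% is not set in the environment.
  const programData = env.ProgramData ?? "C:\\ProgramData";
  const base = `${programData}\\${APP_DIR}`;
  return {
    // An explicit environment variable wins over the documented default location.
    projectRoot: env.GHIDRA_PROJECT_ROOT ?? `${base}\\ghidra-projects`,
    logRoot: env.GHIDRA_LOG_ROOT ?? `${base}\\ghidra-logs`,
  };
}
```

Taking the environment as a plain object, rather than reading `process.env` directly, keeps the sketch easy to test; note that a plain object lookup is case-sensitive, unlike Windows environment variable names.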
 ## Sample ingest note