metasm 1.0.1 → 1.0.2
Sign up to get free protection for your applications and to get access to all the features.
- checksums.yaml +7 -0
- data/.gitignore +1 -0
- data/.hgtags +3 -0
- data/Gemfile +1 -0
- data/INSTALL +61 -0
- data/LICENCE +458 -0
- data/README +29 -21
- data/Rakefile +10 -0
- data/TODO +10 -12
- data/doc/code_organisation.txt +2 -0
- data/doc/core/DynLdr.txt +247 -0
- data/doc/core/ExeFormat.txt +43 -0
- data/doc/core/Expression.txt +220 -0
- data/doc/core/GNUExports.txt +27 -0
- data/doc/core/Ia32.txt +236 -0
- data/doc/core/SerialStruct.txt +108 -0
- data/doc/core/VirtualString.txt +145 -0
- data/doc/core/WindowsExports.txt +61 -0
- data/doc/core/index.txt +1 -0
- data/doc/style.css +6 -3
- data/doc/usage/debugger.txt +327 -0
- data/doc/usage/index.txt +1 -0
- data/doc/use_cases.txt +2 -2
- data/metasm.gemspec +22 -0
- data/{lib/metasm.rb → metasm.rb} +11 -3
- data/{lib/metasm → metasm}/compile_c.rb +13 -7
- data/metasm/cpu/arc.rb +8 -0
- data/metasm/cpu/arc/decode.rb +425 -0
- data/metasm/cpu/arc/main.rb +191 -0
- data/metasm/cpu/arc/opcodes.rb +588 -0
- data/{lib/metasm → metasm/cpu}/arm.rb +7 -5
- data/{lib/metasm → metasm/cpu}/arm/debug.rb +2 -2
- data/{lib/metasm → metasm/cpu}/arm/decode.rb +13 -12
- data/{lib/metasm → metasm/cpu}/arm/encode.rb +23 -8
- data/{lib/metasm → metasm/cpu}/arm/main.rb +0 -3
- data/metasm/cpu/arm/opcodes.rb +324 -0
- data/{lib/metasm → metasm/cpu}/arm/parse.rb +25 -13
- data/{lib/metasm → metasm/cpu}/arm/render.rb +2 -2
- data/metasm/cpu/arm64.rb +15 -0
- data/metasm/cpu/arm64/debug.rb +38 -0
- data/metasm/cpu/arm64/decode.rb +289 -0
- data/metasm/cpu/arm64/encode.rb +41 -0
- data/metasm/cpu/arm64/main.rb +105 -0
- data/metasm/cpu/arm64/opcodes.rb +232 -0
- data/metasm/cpu/arm64/parse.rb +20 -0
- data/metasm/cpu/arm64/render.rb +95 -0
- data/{lib/metasm/ppc.rb → metasm/cpu/bpf.rb} +2 -4
- data/metasm/cpu/bpf/decode.rb +142 -0
- data/metasm/cpu/bpf/main.rb +60 -0
- data/metasm/cpu/bpf/opcodes.rb +81 -0
- data/metasm/cpu/bpf/render.rb +41 -0
- data/metasm/cpu/cy16.rb +9 -0
- data/metasm/cpu/cy16/decode.rb +253 -0
- data/metasm/cpu/cy16/main.rb +63 -0
- data/metasm/cpu/cy16/opcodes.rb +78 -0
- data/metasm/cpu/cy16/render.rb +41 -0
- data/metasm/cpu/dalvik.rb +11 -0
- data/{lib/metasm → metasm/cpu}/dalvik/decode.rb +35 -13
- data/{lib/metasm → metasm/cpu}/dalvik/main.rb +51 -2
- data/{lib/metasm → metasm/cpu}/dalvik/opcodes.rb +19 -11
- data/metasm/cpu/ia32.rb +17 -0
- data/{lib/metasm → metasm/cpu}/ia32/compile_c.rb +5 -7
- data/{lib/metasm → metasm/cpu}/ia32/debug.rb +5 -5
- data/{lib/metasm → metasm/cpu}/ia32/decode.rb +246 -59
- data/{lib/metasm → metasm/cpu}/ia32/decompile.rb +7 -7
- data/{lib/metasm → metasm/cpu}/ia32/encode.rb +19 -13
- data/{lib/metasm → metasm/cpu}/ia32/main.rb +51 -8
- data/metasm/cpu/ia32/opcodes.rb +1424 -0
- data/{lib/metasm → metasm/cpu}/ia32/parse.rb +47 -16
- data/{lib/metasm → metasm/cpu}/ia32/render.rb +31 -4
- data/metasm/cpu/mips.rb +14 -0
- data/{lib/metasm → metasm/cpu}/mips/compile_c.rb +1 -1
- data/metasm/cpu/mips/debug.rb +42 -0
- data/{lib/metasm → metasm/cpu}/mips/decode.rb +46 -16
- data/{lib/metasm → metasm/cpu}/mips/encode.rb +4 -3
- data/{lib/metasm → metasm/cpu}/mips/main.rb +11 -4
- data/{lib/metasm → metasm/cpu}/mips/opcodes.rb +86 -17
- data/{lib/metasm → metasm/cpu}/mips/parse.rb +1 -1
- data/{lib/metasm → metasm/cpu}/mips/render.rb +1 -1
- data/{lib/metasm/dalvik.rb → metasm/cpu/msp430.rb} +1 -1
- data/metasm/cpu/msp430/decode.rb +247 -0
- data/metasm/cpu/msp430/main.rb +62 -0
- data/metasm/cpu/msp430/opcodes.rb +101 -0
- data/{lib/metasm → metasm/cpu}/pic16c/decode.rb +6 -7
- data/{lib/metasm → metasm/cpu}/pic16c/main.rb +0 -0
- data/{lib/metasm → metasm/cpu}/pic16c/opcodes.rb +1 -1
- data/{lib/metasm/mips.rb → metasm/cpu/ppc.rb} +4 -4
- data/{lib/metasm → metasm/cpu}/ppc/decode.rb +18 -12
- data/{lib/metasm → metasm/cpu}/ppc/decompile.rb +3 -3
- data/{lib/metasm → metasm/cpu}/ppc/encode.rb +2 -2
- data/{lib/metasm → metasm/cpu}/ppc/main.rb +17 -12
- data/{lib/metasm → metasm/cpu}/ppc/opcodes.rb +11 -5
- data/metasm/cpu/ppc/parse.rb +55 -0
- data/metasm/cpu/python.rb +8 -0
- data/metasm/cpu/python/decode.rb +136 -0
- data/metasm/cpu/python/main.rb +36 -0
- data/metasm/cpu/python/opcodes.rb +180 -0
- data/{lib/metasm → metasm/cpu}/sh4.rb +1 -1
- data/{lib/metasm → metasm/cpu}/sh4/decode.rb +48 -17
- data/{lib/metasm → metasm/cpu}/sh4/main.rb +13 -4
- data/{lib/metasm → metasm/cpu}/sh4/opcodes.rb +7 -8
- data/metasm/cpu/x86_64.rb +15 -0
- data/{lib/metasm → metasm/cpu}/x86_64/compile_c.rb +28 -17
- data/{lib/metasm → metasm/cpu}/x86_64/debug.rb +4 -4
- data/{lib/metasm → metasm/cpu}/x86_64/decode.rb +57 -15
- data/{lib/metasm → metasm/cpu}/x86_64/encode.rb +55 -26
- data/{lib/metasm → metasm/cpu}/x86_64/main.rb +14 -6
- data/metasm/cpu/x86_64/opcodes.rb +136 -0
- data/{lib/metasm → metasm/cpu}/x86_64/parse.rb +10 -2
- data/metasm/cpu/x86_64/render.rb +35 -0
- data/metasm/cpu/z80.rb +9 -0
- data/metasm/cpu/z80/decode.rb +313 -0
- data/metasm/cpu/z80/main.rb +67 -0
- data/metasm/cpu/z80/opcodes.rb +224 -0
- data/metasm/cpu/z80/render.rb +59 -0
- data/{lib/metasm/os/main.rb → metasm/debug.rb} +160 -401
- data/{lib/metasm → metasm}/decode.rb +35 -4
- data/{lib/metasm → metasm}/decompile.rb +15 -16
- data/{lib/metasm → metasm}/disassemble.rb +201 -45
- data/{lib/metasm → metasm}/disassemble_api.rb +651 -87
- data/{lib/metasm → metasm}/dynldr.rb +220 -133
- data/{lib/metasm → metasm}/encode.rb +10 -1
- data/{lib/metasm → metasm}/exe_format/a_out.rb +9 -6
- data/{lib/metasm → metasm}/exe_format/autoexe.rb +1 -0
- data/{lib/metasm → metasm}/exe_format/bflt.rb +57 -27
- data/{lib/metasm → metasm}/exe_format/coff.rb +11 -3
- data/{lib/metasm → metasm}/exe_format/coff_decode.rb +53 -20
- data/{lib/metasm → metasm}/exe_format/coff_encode.rb +11 -13
- data/{lib/metasm → metasm}/exe_format/dex.rb +13 -5
- data/{lib/metasm → metasm}/exe_format/dol.rb +1 -0
- data/{lib/metasm → metasm}/exe_format/elf.rb +93 -57
- data/{lib/metasm → metasm}/exe_format/elf_decode.rb +143 -34
- data/{lib/metasm → metasm}/exe_format/elf_encode.rb +122 -31
- data/metasm/exe_format/gb.rb +65 -0
- data/metasm/exe_format/javaclass.rb +424 -0
- data/{lib/metasm → metasm}/exe_format/macho.rb +204 -16
- data/{lib/metasm → metasm}/exe_format/main.rb +26 -3
- data/{lib/metasm → metasm}/exe_format/mz.rb +1 -0
- data/{lib/metasm → metasm}/exe_format/nds.rb +7 -4
- data/{lib/metasm → metasm}/exe_format/pe.rb +71 -8
- data/metasm/exe_format/pyc.rb +167 -0
- data/{lib/metasm → metasm}/exe_format/serialstruct.rb +67 -14
- data/{lib/metasm → metasm}/exe_format/shellcode.rb +7 -3
- data/metasm/exe_format/shellcode_rwx.rb +114 -0
- data/metasm/exe_format/swf.rb +205 -0
- data/{lib/metasm → metasm}/exe_format/xcoff.rb +7 -7
- data/metasm/exe_format/zip.rb +335 -0
- data/metasm/gui.rb +13 -0
- data/{lib/metasm → metasm}/gui/cstruct.rb +35 -41
- data/{lib/metasm → metasm}/gui/dasm_coverage.rb +11 -11
- data/{lib/metasm → metasm}/gui/dasm_decomp.rb +7 -20
- data/{lib/metasm → metasm}/gui/dasm_funcgraph.rb +0 -0
- data/metasm/gui/dasm_graph.rb +1695 -0
- data/{lib/metasm → metasm}/gui/dasm_hex.rb +12 -8
- data/{lib/metasm → metasm}/gui/dasm_listing.rb +43 -28
- data/{lib/metasm → metasm}/gui/dasm_main.rb +310 -53
- data/{lib/metasm → metasm}/gui/dasm_opcodes.rb +5 -19
- data/{lib/metasm → metasm}/gui/debug.rb +93 -27
- data/{lib/metasm → metasm}/gui/gtk.rb +162 -40
- data/{lib/metasm → metasm}/gui/qt.rb +12 -2
- data/{lib/metasm → metasm}/gui/win32.rb +179 -42
- data/{lib/metasm → metasm}/gui/x11.rb +59 -59
- data/{lib/metasm → metasm}/main.rb +389 -264
- data/{lib/metasm/os/remote.rb → metasm/os/gdbremote.rb} +146 -54
- data/{lib/metasm → metasm}/os/gnu_exports.rb +1 -1
- data/{lib/metasm → metasm}/os/linux.rb +628 -151
- data/metasm/os/main.rb +330 -0
- data/{lib/metasm → metasm}/os/windows.rb +132 -42
- data/{lib/metasm → metasm}/os/windows_exports.rb +141 -0
- data/{lib/metasm → metasm}/parse.rb +26 -24
- data/{lib/metasm → metasm}/parse_c.rb +221 -116
- data/{lib/metasm → metasm}/preprocessor.rb +55 -40
- data/{lib/metasm → metasm}/render.rb +14 -38
- data/misc/hexdump.rb +2 -1
- data/misc/lint.rb +58 -0
- data/misc/txt2html.rb +9 -7
- data/samples/bindiff.rb +3 -4
- data/samples/dasm-plugins/bindiff.rb +15 -0
- data/samples/dasm-plugins/bookmark.rb +133 -0
- data/samples/dasm-plugins/c_constants.rb +57 -0
- data/samples/dasm-plugins/colortheme_solarized.rb +125 -0
- data/samples/dasm-plugins/cppobj_funcall.rb +60 -0
- data/samples/dasm-plugins/dasm_all.rb +70 -0
- data/samples/dasm-plugins/demangle_cpp.rb +31 -0
- data/samples/dasm-plugins/deobfuscate.rb +251 -0
- data/samples/dasm-plugins/dump_text.rb +35 -0
- data/samples/dasm-plugins/export_graph_svg.rb +86 -0
- data/samples/dasm-plugins/findgadget.rb +75 -0
- data/samples/dasm-plugins/hl_opcode.rb +32 -0
- data/samples/dasm-plugins/hotfix_gtk_dbg.rb +19 -0
- data/samples/dasm-plugins/imm2off.rb +34 -0
- data/samples/dasm-plugins/match_libsigs.rb +93 -0
- data/samples/dasm-plugins/patch_file.rb +95 -0
- data/samples/dasm-plugins/scanfuncstart.rb +36 -0
- data/samples/dasm-plugins/scanxrefs.rb +26 -0
- data/samples/dasm-plugins/selfmodify.rb +197 -0
- data/samples/dasm-plugins/stringsxrefs.rb +28 -0
- data/samples/dasmnavig.rb +1 -1
- data/samples/dbg-apihook.rb +24 -9
- data/samples/dbg-plugins/heapscan.rb +283 -0
- data/samples/dbg-plugins/heapscan/compiled_heapscan_lin.c +155 -0
- data/samples/dbg-plugins/heapscan/compiled_heapscan_win.c +128 -0
- data/samples/dbg-plugins/heapscan/graphheap.rb +616 -0
- data/samples/dbg-plugins/heapscan/heapscan.rb +709 -0
- data/samples/dbg-plugins/heapscan/winheap.h +174 -0
- data/samples/dbg-plugins/heapscan/winheap7.h +307 -0
- data/samples/dbg-plugins/trace_func.rb +214 -0
- data/samples/disassemble-gui.rb +35 -5
- data/samples/disassemble.rb +31 -6
- data/samples/dump_upx.rb +24 -12
- data/samples/dynamic_ruby.rb +12 -3
- data/samples/exeencode.rb +6 -5
- data/samples/factorize-headers-peimports.rb +1 -1
- data/samples/lindebug.rb +175 -381
- data/samples/metasm-shell.rb +1 -2
- data/samples/peldr.rb +2 -2
- data/tests/all.rb +1 -1
- data/tests/arc.rb +26 -0
- data/tests/dynldr.rb +22 -4
- data/tests/expression.rb +55 -0
- data/tests/graph_layout.rb +285 -0
- data/tests/ia32.rb +79 -26
- data/tests/mips.rb +9 -2
- data/tests/x86_64.rb +66 -18
- metadata +330 -218
- data/lib/metasm/arm/opcodes.rb +0 -177
- data/lib/metasm/gui.rb +0 -23
- data/lib/metasm/gui/dasm_graph.rb +0 -1354
- data/lib/metasm/ia32.rb +0 -14
- data/lib/metasm/ia32/opcodes.rb +0 -873
- data/lib/metasm/ppc/parse.rb +0 -52
- data/lib/metasm/x86_64.rb +0 -12
- data/lib/metasm/x86_64/opcodes.rb +0 -118
- data/samples/gdbclient.rb +0 -583
- data/samples/rubstop.rb +0 -399
data/README
CHANGED
@@ -21,6 +21,10 @@ Ready-to-use scripts can be found in the samples/ subdirectory, check the
|
|
21
21
|
comments in the scripts headers. You can also try the --help argument if
|
22
22
|
you're feeling lucky.
|
23
23
|
|
24
|
+
For more information, check the doc/ subdirectory. The text files can be
|
25
|
+
compiled to html using the misc/txt2html.rb script.
|
26
|
+
|
27
|
+
|
24
28
|
|
25
29
|
Here is a short overview of the Metasm internals.
|
26
30
|
|
@@ -167,8 +171,8 @@ You can encode/decode an ExeFormat (ie decode sections, imports, headers etc)
|
|
167
171
|
Constructor: ExeFormat.decode_file(str), ExeFormat.decode_file_header(str)
|
168
172
|
Methods: ExeFormat#encode_file(filename), ExeFormat#encode_string
|
169
173
|
|
170
|
-
PE and ELF files have a LoadedPE/LoadedELF counterpart, that
|
171
|
-
with memory-mmaped versions of those formats (e.g. to
|
174
|
+
PE and ELF files have a LoadedPE/LoadedELF counterpart, that are able to work
|
175
|
+
with memory-mmaped versions of those formats (e.g. to debug running
|
172
176
|
processes)
|
173
177
|
|
174
178
|
|
@@ -198,27 +202,31 @@ disassembly/patching easily (using LoadedPE/LoadedELF as ExeFormat)
|
|
198
202
|
|
199
203
|
Debugging:
|
200
204
|
|
201
|
-
Metasm includes a few interfaces to
|
205
|
+
Metasm includes a few interfaces to handle debugging.
|
202
206
|
The WinOS and LinOS classes offer access to the underlying OS processes (e.g.
|
203
207
|
OS.current.find_process('foobar') will retrieve a running process with foobar
|
204
208
|
in its filename ; then process.mem can be used to access its memory.)
|
205
209
|
|
206
|
-
The Windows and Linux debugging APIs
|
207
|
-
(
|
208
|
-
|
210
|
+
The Windows and Linux low-level debugging APIs have a basic ruby interface
|
211
|
+
(PTrace and WinAPI) ; which are used by the unified high-end Debugger class.
|
212
|
+
Remote debugging is supported through the GDB server wire protocol.
|
209
213
|
|
210
|
-
|
211
|
-
|
212
|
-
This interface can talk to a gdb-server through samples/gdbclient.rb ; use
|
213
|
-
[udp:]<host:port> as target.
|
214
|
+
High-level debuggers can be created with the following ruby line:
|
215
|
+
Metasm::OS.current.create_debugger('foo')
|
214
216
|
|
215
|
-
|
216
|
-
|
217
|
+
Only one kind of host debugger class can exist at a time ; to debug multiple
|
218
|
+
processes, attach to other processes using the existing class. This is due
|
219
|
+
to the way the OS debugging API works on Windows and Linux.
|
217
220
|
|
218
|
-
|
219
|
-
|
221
|
+
The low-level backends are defined in the os/ subdirectory, the front-end is
|
222
|
+
defined in debug.rb.
|
220
223
|
|
221
|
-
|
224
|
+
A linux console debugging interface is available in samples/lindebug.rb ; it
|
225
|
+
uses a (simplified) SoftICE-like look and feel.
|
226
|
+
It can talk to a gdb-server socket ; use a [udp:]<host:port> target.
|
227
|
+
|
228
|
+
The disassembler-gui sample allow live process interaction when using as
|
229
|
+
target 'live:<pid or part of program name>'.
|
222
230
|
|
223
231
|
|
224
232
|
C Parser:
|
@@ -236,7 +244,11 @@ It handles all the constructs i am aware of, except hex floats:
|
|
236
244
|
- __int8 etc native types
|
237
245
|
- Label addresses (&&label)
|
238
246
|
Also note that all those things are parsed, but most of them will fail to
|
239
|
-
compile on the Ia32 backend (the only one implemented so far.)
|
247
|
+
compile on the Ia32/X64 backend (the only one implemented so far.)
|
248
|
+
|
249
|
+
Parsing C files should be done using an existing ExeFormat, with the
|
250
|
+
parse_c_file method. This ensures that format-specific macros/ABI are correctly
|
251
|
+
defined (ex: size of the 'long' type, ABI to pass parameters to functions, etc)
|
240
252
|
|
241
253
|
When you parse a C String using C::Parser.parse(text), you receive a Parser
|
242
254
|
object. It holds a #toplevel field, which is a C::Block, which holds #structs,
|
@@ -249,15 +261,11 @@ CExpressions...)
|
|
249
261
|
|
250
262
|
A C::Parser may be #precompiled to transform it into a simplified version that
|
251
263
|
is easier to compile: typedefs are removed, control sequences are transformed
|
252
|
-
|
264
|
+
into 'if (XX) goto YY;' etc.
|
253
265
|
|
254
266
|
To compile a C program, use PE/ELF.compile_c, that will create a C::Parser with
|
255
267
|
exe-specific macros defined (eg __PE__ or __ELF__).
|
256
268
|
|
257
|
-
The prefered way to create a C::Parser is to initialize it with a CPU and the
|
258
|
-
desired ExeFormat, so that it is
|
259
|
-
correctly initialized (eg type sizes: is long 4 or 8 bytes? etc) ; and
|
260
|
-
may define preprocessor macros needed to correctly parse standard headers.
|
261
269
|
Vendor-specific headers may need to use either #pragma prepare_visualstudio
|
262
270
|
(to parse the Microsoft Visual Studio headers) or prepare_gcc (for gcc), the
|
263
271
|
latter may be auto-detected (or may not).
|
data/Rakefile
ADDED
data/TODO
CHANGED
@@ -1,14 +1,13 @@
|
|
1
1
|
List of TODO items, by section, in random order
|
2
2
|
|
3
3
|
Ia32
|
4
|
-
emu fpu
|
5
|
-
add all sse2 instrs
|
6
4
|
realmode
|
7
5
|
|
8
6
|
X86_64
|
9
7
|
decompiler
|
10
8
|
|
11
9
|
CPU
|
10
|
+
Arm
|
12
11
|
Sparc
|
13
12
|
Cell
|
14
13
|
|
@@ -26,19 +25,20 @@ Assembler
|
|
26
25
|
Disasm
|
27
26
|
DecodedData
|
28
27
|
Exe decoding generate decodeddata ?
|
29
|
-
Function-local namespace (esp+12 -> esp+var_42)
|
30
28
|
Fix thunk detection (thunk: mov ecx, 42 jmp [iat_thiscall] is not a thunk)
|
31
29
|
Test with ET_REL style exe
|
32
30
|
Store stuff out of mem (to handle big binaries)
|
33
31
|
Better :default usage
|
34
32
|
good on call eax, but not on <600k instrs> ret
|
35
33
|
use binary personality ? (uses call vs uses pushret..)
|
36
|
-
Improve backtrace
|
34
|
+
Improve 'backtrace => patch di.instr.args'
|
37
35
|
path-specific backtracking ( foo: call a ; a: jmp retloc ; bar: call b ; b: jmp retloc ; retloc: ret ; call foo ; ret : last ret trackback should only reach a:)
|
38
36
|
Decode pseudo/macro-instrs (mips 'li')
|
39
37
|
Deoptimizer (instr reordering for readability)
|
40
38
|
Optimizer (deobfuscating)
|
41
39
|
Per-instr context (allows to mix cell/ppc, x86 32/16bits, arm/armthumb..)
|
40
|
+
Better save/load dasm state
|
41
|
+
Parse symbol.map generated by IDA for ELF files
|
42
42
|
|
43
43
|
Compiler
|
44
44
|
Optimizer
|
@@ -69,6 +69,7 @@ Decompiler
|
|
69
69
|
Handle/hide compiler-generated stuff (getip, stack cookie setup/check..)
|
70
70
|
Handle call 1f ; 1: pop eax
|
71
71
|
More user control (force/forbid register arg, return type, etc)
|
72
|
+
Preserve C decompiled line association to range of asm decoded addrs
|
72
73
|
|
73
74
|
Debugger
|
74
75
|
OSX
|
@@ -77,14 +78,9 @@ Debugger
|
|
77
78
|
Generic remote process manip
|
78
79
|
create blank state
|
79
80
|
linux virtualallocex
|
80
|
-
pax-compatible code patch through mmap
|
81
81
|
Remote debugging (small standalone C client)
|
82
82
|
Support dbghelp.dll (ms symbol server info)
|
83
83
|
Support debugee function call (gdb 'call')
|
84
|
-
Manipulate memory through C struct casts
|
85
|
-
|
86
|
-
ExeFormat
|
87
|
-
Handle minor editing without decode/reencode (eg patch ELF entrypoint)
|
88
84
|
|
89
85
|
ELF
|
90
86
|
test encoding openbsd binaries
|
@@ -98,6 +94,7 @@ PE
|
|
98
94
|
resource editor ?
|
99
95
|
rc compiler ?
|
100
96
|
add simple accessor for resource stuff (manifest, icon, ...)
|
97
|
+
parse PDB
|
101
98
|
|
102
99
|
GUI
|
103
100
|
debugger
|
@@ -105,10 +102,11 @@ GUI
|
|
105
102
|
show breakpoints
|
106
103
|
show jump direction from current flag values
|
107
104
|
have a console frontend
|
108
|
-
better graph positionning fallback
|
109
105
|
zoom font when zooming graph
|
110
|
-
|
106
|
+
text selection
|
107
|
+
copy/paste
|
111
108
|
map (part of) the binary & debug it (map a PE on a linux host & run it)
|
109
|
+
html frontend
|
112
110
|
|
113
111
|
Ruby
|
114
|
-
|
112
|
+
write a fast ruby-like interpreter
|
data/doc/code_organisation.txt
CHANGED
data/doc/core/DynLdr.txt
ADDED
@@ -0,0 +1,247 @@
|
|
1
|
+
DynLdr
|
2
|
+
======
|
3
|
+
|
4
|
+
DynLdr is a class that uses metasm to dynamically add native methods,
|
5
|
+
or native method wrappers, available to the running ruby interpreter.
|
6
|
+
|
7
|
+
It leverages the built-in C parser / compiler.
|
8
|
+
|
9
|
+
It is implemented in `metasm/dynldr.rb`.
|
10
|
+
|
11
|
+
Currently only supported for <core/Ia32.txt> and <core/X86_64.txt> under
|
12
|
+
Windows and Linux.
|
13
|
+
|
14
|
+
|
15
|
+
Basics
|
16
|
+
------
|
17
|
+
|
18
|
+
Native library wrapper
|
19
|
+
######################
|
20
|
+
|
21
|
+
The main usage is to generate interfaces to native libraries.
|
22
|
+
|
23
|
+
This is done through the `#new_api_c` method.
|
24
|
+
|
25
|
+
The following exemple will read the specified C header fragment,
|
26
|
+
define ruby constants for all `#define`/`enum`, and define ruby
|
27
|
+
method wrappers to call the native functions whose prototype is
|
28
|
+
present in the header.
|
29
|
+
|
30
|
+
All referenced native functions must be exported by the given
|
31
|
+
library file.
|
32
|
+
|
33
|
+
class MyInterface < DynLdr
|
34
|
+
c_header = <<EOS
|
35
|
+
#define SomeConst 42
|
36
|
+
enum { V1, V2 };
|
37
|
+
|
38
|
+
__stdcall int methodist(char*, int);
|
39
|
+
EOS
|
40
|
+
|
41
|
+
new_api_c c_header, 'mylib.dll'
|
42
|
+
end
|
43
|
+
|
44
|
+
Then you can call, from the ruby:
|
45
|
+
|
46
|
+
MyInterface.methodist("lol", MyInterface::SOMECONST)
|
47
|
+
|
48
|
+
Constant/enum names are converted to full uppercase, and method
|
49
|
+
names are converted to full lowercase.
|
50
|
+
|
51
|
+
Dynamic native inline function
|
52
|
+
##############################
|
53
|
+
|
54
|
+
You can also dynamically compile native functions, that are compiled
|
55
|
+
in memory and copied to RWX memory with the right ruby wrapper:
|
56
|
+
|
57
|
+
class MyInterface < DynLdr
|
58
|
+
new_func_c <<EOS
|
59
|
+
int bla(char*arg) {
|
60
|
+
if (strlen(arg) > 4)
|
61
|
+
return 1;
|
62
|
+
else
|
63
|
+
return 0;
|
64
|
+
}
|
65
|
+
EOS
|
66
|
+
end
|
67
|
+
|
68
|
+
References to external functions are allowed, and resolved automatically.
|
69
|
+
|
70
|
+
The ruby objects used as arguments to the wrapper method are
|
71
|
+
automatically converted to the right C type.
|
72
|
+
|
73
|
+
|
74
|
+
You can also write native functions in assembly, but you must specify a
|
75
|
+
C prototype, used for argument and return value conversion.
|
76
|
+
|
77
|
+
class MyInterface < DynLdr
|
78
|
+
new_func_asm "int increment(int i);", <<EOS
|
79
|
+
mov eax, [esp+4]
|
80
|
+
inc eax
|
81
|
+
ret
|
82
|
+
EOS
|
83
|
+
|
84
|
+
p increment(4)
|
85
|
+
|
86
|
+
end
|
87
|
+
|
88
|
+
|
89
|
+
Structures
|
90
|
+
----------
|
91
|
+
|
92
|
+
`DynLdr` handles C structures.
|
93
|
+
|
94
|
+
Once a structure is specified in the C part, you can create a ruby object
|
95
|
+
using `MyClass.alloc_c_struct(structname)`, which will allocate an object of the
|
96
|
+
right size to hold all the structure members, and with the right accessors.
|
97
|
+
|
98
|
+
To access/modify struct members, you can either use a `Hash`-style access
|
99
|
+
|
100
|
+
structobj['membername'] = 42
|
101
|
+
|
102
|
+
or `Struct`-style access
|
103
|
+
|
104
|
+
structobj.membername = 42
|
105
|
+
|
106
|
+
Member names are matched case-insensitively, and nested structures/unions
|
107
|
+
are also searched.
|
108
|
+
|
109
|
+
The struct members can be initially populated by passing a `Hash` argument
|
110
|
+
to the `alloc_c_struct` constructor. Additionally, this hash may use the
|
111
|
+
special value `:size` to reference the byte size of the current structure.
|
112
|
+
|
113
|
+
class MyInterface < DynLdr
|
114
|
+
new_api_c <<EOS
|
115
|
+
struct sname {
|
116
|
+
int s_mysize;
|
117
|
+
int s_value;
|
118
|
+
union {
|
119
|
+
struct {
|
120
|
+
int s_bits:4;
|
121
|
+
int s_bits2:4;
|
122
|
+
};
|
123
|
+
int s_union;
|
124
|
+
}
|
125
|
+
};
|
126
|
+
EOS
|
127
|
+
end
|
128
|
+
|
129
|
+
# field s_mysize holds the size of the structure in bytes, ie 12
|
130
|
+
s_obj = MyInterface.alloc_c_struct('sname', :s_mysize => :size, :s_value => 42)
|
131
|
+
|
132
|
+
# we can access fields using Hash-style access
|
133
|
+
s_obj['s_UniOn'] = 0xa8
|
134
|
+
|
135
|
+
# or Struct-style access
|
136
|
+
puts '0x%x' % s_obj.s_BiTS2 # => '0xa'
|
137
|
+
|
138
|
+
This object can be directly passed as argument to a wrapped function, and
|
139
|
+
the native function will receive a pointer to this structure (that it can
|
140
|
+
freely modify).
|
141
|
+
|
142
|
+
This object is a `C::AllocStruct`, defined in `metasm/parse_c.rb`.
|
143
|
+
Internally, it is based on a ruby `String`, and has a reference to the parser's
|
144
|
+
`Struct` to find the mapping membername -> offsets/length.
|
145
|
+
|
146
|
+
See <core/CParser.txt> for more details.
|
147
|
+
|
148
|
+
|
149
|
+
Callbacks
|
150
|
+
---------
|
151
|
+
|
152
|
+
`DynLdr` handles C callbacks, with arbitrary ABI.
|
153
|
+
|
154
|
+
Any number of callbacks can be defined at any time.
|
155
|
+
|
156
|
+
C callbacks are backed by a ruby `Proc`, eg `lambda {}`.
|
157
|
+
|
158
|
+
|
159
|
+
class MyInterface < DynLdr
|
160
|
+
new_api_c <<EOS
|
161
|
+
void qsort(void *, int, int, int(*)(void*, void*));
|
162
|
+
EOS
|
163
|
+
|
164
|
+
str = "sanotheusnaonetuh"
|
165
|
+
cmp = lambda { |p1, p2|
|
166
|
+
memory_read(p1, 1) <=> memory_read(p2, 1)
|
167
|
+
}
|
168
|
+
qsort(str, str.length, 1, cmp)
|
169
|
+
p str
|
170
|
+
end
|
171
|
+
|
172
|
+
|
173
|
+
|
174
|
+
Argument conversion
|
175
|
+
-------------------
|
176
|
+
|
177
|
+
Ruby objects passed to a wrapper method are converted to the corresponding
|
178
|
+
C type
|
179
|
+
|
180
|
+
* `Strings` are converted to a C pointer to the byte buffer (also directly
|
181
|
+
accessible from the ruby through `DynLdr.str_ptr(obj)`
|
182
|
+
* `Integers` are converted to their C equivalent, according to the prototype
|
183
|
+
(`char`, `unsigned long long`, ...)
|
184
|
+
* `Procs` are converted to a C callback
|
185
|
+
* `Floats` are not supported for now.
|
186
|
+
|
187
|
+
|
188
|
+
Working with memory
|
189
|
+
-------------------
|
190
|
+
|
191
|
+
DynLdr provides different ways to allocate memory.
|
192
|
+
|
193
|
+
* `alloc_c_struct` to allocate a C structure
|
194
|
+
* `alloc_c_ary` to allocate C array of some type
|
195
|
+
* `alloc_c_ptr`, which is just an ary of size 1
|
196
|
+
* `memory_alloc` allocates memory from a new memory page
|
197
|
+
|
198
|
+
`memory_alloc` works by calling `mmap` under linux and `VirtualAlloc` under windows,
|
199
|
+
and is suitable for allocating memory where you want to control
|
200
|
+
the memory permissions (read, write, execute). This is done through `memory_perm`.
|
201
|
+
|
202
|
+
`memory_perm` takes for argument the start address, the length, and the new permission, specified as a String (e.g. 'r', 'rwx')
|
203
|
+
|
204
|
+
To work with memory that may be returned by an API (e.g. `malloc`),
|
205
|
+
DynLdr provides ways to read and write arbitrary pointers from the ruby
|
206
|
+
interpreter memory.
|
207
|
+
Take care, those may generate faults when called with invalid addresses that
|
208
|
+
will crash the ruby interpreter.
|
209
|
+
|
210
|
+
* `memory_read` takes a pointer and a length, and returns a String
|
211
|
+
* `memory_read_int` takes a pointer, and returns an Integer (of pointer size,
|
212
|
+
e.g. 64 bit in a 64-bit interpreter)
|
213
|
+
* `memory_write` takes a pointer and a String, and writes it to memory
|
214
|
+
* `memory_write_int`
|
215
|
+
|
216
|
+
|
217
|
+
Hacking
|
218
|
+
-------
|
219
|
+
|
220
|
+
Internally, DynLdr relies on a number of features that are not directly
|
221
|
+
available from the ruby interpreter.
|
222
|
+
|
223
|
+
So the first thing done by the script is to generate a binary native module
|
224
|
+
that will act as a C extension to the ruby interpreter.
|
225
|
+
This binary is necessarily different depending on the interpreter.
|
226
|
+
The binary name includes the target architecture, in the format
|
227
|
+
dynldr-*arch*-*cpu*-*19*.so, e.g.
|
228
|
+
|
229
|
+
* dynldr-linux-ia32.so
|
230
|
+
* dynldr-windows-x64-19.so
|
231
|
+
|
232
|
+
This native module is (re)generated if it does not exist, or is older than the
|
233
|
+
`dynldr.rb` script.
|
234
|
+
|
235
|
+
A special trick is used in this module, as it does not know the actual name
|
236
|
+
of the ruby library used by the interpreter. So on linux, the `libruby` is
|
237
|
+
removed from the `DT_NEEDED` library list, and on windows a special stub
|
238
|
+
is assembled to manually resolve the ruby imports needed by the module from
|
239
|
+
any instance of `libruby` present in the running process.
|
240
|
+
|
241
|
+
The native file is written to a directory writeably by the current user.
|
242
|
+
The following list of directories are tried, until a suitable one is found:
|
243
|
+
|
244
|
+
* the `metasm` directory itself
|
245
|
+
* the `$HOME`/`$APPDATA`/`$USERPROFILE` directory
|
246
|
+
* the `$TMP`/`$TEMP`/current directory
|
247
|
+
|
@@ -0,0 +1,43 @@
|
|
1
|
+
ExeFormat
|
2
|
+
=========
|
3
|
+
|
4
|
+
This class is the parent of all executable format handlers.
|
5
|
+
|
6
|
+
It is defined in `metasm/exe_format/main.rb`.
|
7
|
+
|
8
|
+
It defines some standard shortcut functions, such as:
|
9
|
+
|
10
|
+
* `Exe.decode_file(filename)`
|
11
|
+
* `Exe.assemble(cpu,asm_source)`
|
12
|
+
* `Exe.compile_c(cpu,c_source)`
|
13
|
+
* `Exe#encode_file(filename)`
|
14
|
+
|
15
|
+
These methods will instanciate a new Exe, and call the corresponding
|
16
|
+
methods, *e.g.* `load` with the file content, and `decode`.
|
17
|
+
|
18
|
+
The handling of the different structures in the binary format should be
|
19
|
+
done using the <core/SerialStruct.txt> facility.
|
20
|
+
|
21
|
+
The subclasses are expected to implement various functions, depending on the
|
22
|
+
usage (refer to the ELF and COFF implementations for more details):
|
23
|
+
|
24
|
+
File decoding/disassembly
|
25
|
+
-------------------------
|
26
|
+
|
27
|
+
* `#decode_header`: parse the raw data in `#encoded` only to parse the file header
|
28
|
+
* `#decode`: parse all the raw data in `#encoded`
|
29
|
+
* `#cpu_from_headers`: return a <core/CPU.txt> instance according to the exe header information
|
30
|
+
* `#get_default_entrypoints`: the list of entrypoints (exported functions, etc)
|
31
|
+
* `#dump_section_header`: return a string that may be assembled to recreate the specified section
|
32
|
+
* `#section_info`: return a list of generic section informations for the disassembler
|
33
|
+
|
34
|
+
|
35
|
+
File encoding/source parsing
|
36
|
+
----------------------------
|
37
|
+
|
38
|
+
* `#tune_prepro`: define exe-specific macros for the preprocessor (optional)
|
39
|
+
* `#parse_init`: initialize the `@cursource` array to receive the parsed asm source
|
40
|
+
* `#parse_parser_instruction`: parse exe-specific instructions, eg `.text`, `.import`...
|
41
|
+
* `#assemble`: assemble the content of the @cursource into binary section contents
|
42
|
+
* `#encode`: assemble the various sections and a binary header into `@encoded`
|
43
|
+
|