furnace-avm2 1.0.0 → 1.0.1

data/README.md ADDED
@@ -0,0 +1,122 @@
+ Furnace-AVM2
+ ============
+
+ Furnace-AVM2 is a library for manipulating Adobe Flash ActionScript 3 bytecode. The library contains routines for reading and writing the bytecode in the native Flash binary format, and transformations between various flavors of abstract syntax trees suited for automatic and manual analysis.
+
+ This library can solve a wide range of tasks, including but not limited to:
+
+ * **Deobfuscation.** Currently, only an opcode-level dead code eliminator is provided, which has nevertheless proved quite useful. A name normalization routine which transforms the names in the source code back to a human-readable form is also provided.
+ * **Decompilation.** One of the provided AST transformations emits ActionScript 3 source code. One of the development goals was to produce code which can be compiled back and performs identical actions. This goal was (mostly) achieved.
+ * **Behavioral matching of code.** Furnace matchers provide a powerful [regular language](http://en.wikipedia.org/wiki/Regular_languages) and can be used to locate statement-level or control-flow-level constructs. For example, a matcher could:
+
+     * Find all statement sequences of the form
+
+             var <var>:String = Loader.loadURL(<url>);
+             trace(<var>);
+
+       where each `<x>` is a wildcard which captures and remembers a value.
+
+     * Find all loops of the form
+
+             for(var <var>:<type> in [object.getParent().getList()|object.getList()]) {
+                 ...
+                 var descendant:<type> = <var>.getDescendant();
+                 ...
+             }
+
+       where each `<x>` is a wildcard as described above and `[y|x]` means that either of the variants `y` or `x` is accepted in place of the construct.
+
+ * **Patching.** The library does not allow transformation of abstract syntax trees back to bytecode, but it retains a lot of information about the origin of constructs and allows you to modify every other aspect of the bitstream.
+
+ The library supports all known AVM2 opcodes, including undocumented (generic types) and Alchemy ones. Most of these opcodes are supported down to source level, with the notable exception of some [E4X](http://en.wikipedia.org/wiki/E4X) opcodes, which are just too braindead to implement.
+
+ The library is extensible. A new transformation can easily be plugged in, should such a need arise. Adding support for a certain obfuscator boils down to adding one or two stages to the pipeline, which would normalize the mangled code.
+
+ The library is portable and fast. It works on 1.9 rubies: MRI, JRuby and (if you're lucky) Rubinius. It can decompile about 9000 methods in 30 s (on JRuby 1.7 and an 8-core Intel i7).
+
+ Installation
+ ------------
+
+ Furnace-AVM2 is written in Ruby 1.9. You will need to install a compatible Ruby implementation. JRuby is recommended as it supports real multithreading, but Ruby MRI is also acceptable.
+
+ Install the required gems:
+
+     $ gem install furnace-avm2 furnace-swf
+
+ Command line interface
+ ----------------------
+
+ Furnace-AVM2 has two main command-line utilities, `furnace-avm2` and `furnace-avm2-decompiler`. There is also a supplementary utility called `furnace-swf`, which is contained in the gem with the same name.
+
+ Note that this library operates only on raw bytecode. It does not know anything about SWF files, nor can it parse them.
+
+ To analyze a real-world file, which most certainly will be an SWF, you will need to use `furnace-swf` first. It can parse the whole SWF file (including compressed ones), but only supports DoABC2 tags which contain AVM2 bytecode.
+
+ The `furnace-swf` utility currently has three subcommands: `abclist`, `abcextract` and `abcreplace`. They should be mostly self-explanatory; an example session is shown below.
+
+     $ furnace-swf -i sample.swf abclist
+     ABC tags:
+     "frame1": 1488672 byte(s)
+     $ furnace-swf -i sample.swf abcextract -n frame1 -o frame1.abc
+     $
+
+ After you have extracted the AVM2 bytecode, you can use Furnace-AVM2 itself. First, if you think that the file might be obfuscated, you need to preprocess it to clean up the obfuscation artifacts. Run the `furnace-avm2` utility in DCE mode (the names are normalized by default; if you don't need that, pass `-q`).
+
+     $ furnace-avm2 -i frame1.abc -d -o frame1.abc
+     $
+
+ `furnace-avm2` generally works on the method level. It builds a set of methods to work on (`-O` and `-E` options), and then performs various transformations on them. If you need to determine why a particular method fails, or to retrieve the AST in free-text form, it's the right utility to use. Check its inline help and don't hesitate to experiment with different options.
+
+ In contrast, `furnace-avm2-decompiler` works on the class level. You can include and exclude objects to decompile with class granularity, and it doesn't have much more configuration than that.
+
+     $ furnace-avm2-decompiler -i frame1.abc -d -D funids >frame1.as
+     Reading input data...
+     Found 2434 classes and packages.
+     Decompiling... 2402/2434 /
+     Decompiled: 9167/9201 (99%)
+     Partially decompiled: 8/9201 (0%)
+     Failed: 26/9201 (0%)
+     Time taken: 69.27s
+     $
+
+ The `-D funids` option adds a comment with the method body index for each decompiled method. It can be used for debugging decompiler failures.
+
+ You'll notice that some methods probably will not get decompiled. (The file I used in this example is quite complex.) Not every possible bytecode sequence can be directly represented in ActionScript 3, and there are some corner cases yet to be described in the decompiler. For "partially decompiled" methods (i.e. those where there were no control flow uncertainties, but some expressions were impossible to transform to ActionScript), the relevant NF-AST code is automatically emitted. You can look at it manually with `furnace-avm2 -n`. For "failed" methods there is no generated code, but you might try to look at the control flow graph (`furnace-avm2 -C`, look for the emitted `method-*.dot` file) in [Graphviz](http://en.wikipedia.org/wiki/Graphviz) format to understand the logic.
+
+ Programming interface
+ ---------------------
+
+ The programming interface will get an in-depth description later. For now, you are advised to look at the source code of the [ABC metadata](https://github.com/whitequark/furnace-avm2/tree/master/lib/furnace-avm2/abc/metadata) parser/storage code and the [Furnace](https://github.com/whitequark/furnace/tree/master/lib/furnace) source code. Neither of these is particularly large, and you will probably need to read them anyway.
+
+ You can also use [Pry](http://pry.github.com/) to explore the interfaces. Try launching the utility `furnace-avm2-shell`.
+
+ Contact
+ -------
+
+ If you experience any difficulties, you can ask me (*whitequark*) on the `#ruby-lang` channel at `irc.freenode.net` or drop me an email.
+
+ License
+ -------
+
+ Furnace-AVM2 is distributed under the terms of the MIT license.
+
+ Copyright (c) 2012 Peter Zotov <whitequark@whitequark.org>
+
+ Permission is hereby granted, free of charge, to any person obtaining a
+ copy of this software and associated documentation files (the
+ "Software"), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be included
+ in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
+ OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
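
The command-line session described in the README can be scripted. The following is a minimal sketch, assuming the three utilities are on `PATH` and accept exactly the flags shown in the README; `furnace_pipeline` is a hypothetical helper name, not part of either gem. It only builds the command lines and prints them, so nothing is executed.

```ruby
require "shellwords"

# Hypothetical helper (not part of furnace-avm2 or furnace-swf).
# Builds the commands from the README session for one ABC tag:
# extract the tag, run dead code elimination in place, then decompile.
def furnace_pipeline(swf, tag)
  abc = "#{tag}.abc"
  [
    ["furnace-swf", "-i", swf, "abcextract", "-n", tag, "-o", abc],
    ["furnace-avm2", "-i", abc, "-d", "-o", abc],
    ["furnace-avm2-decompiler", "-i", abc, "-d", "-D", "funids"],
  ]
end

# Print the commands instead of running them.
furnace_pipeline("sample.swf", "frame1").each do |argv|
  puts Shellwords.join(argv)
end
```

To actually run a step, each array could be passed to `Kernel#system` (for example `system(*argv)`). The decompiler writes to standard output, so its result would still need to be redirected to a file such as `frame1.as`.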
data/furnace-avm2.gemspec CHANGED
@@ -17,6 +17,6 @@ Gem::Specification.new do |s|
  s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
  s.require_paths = ["lib"]
 
- s.add_runtime_dependency "furnace", '= 0.2.3'
+ s.add_runtime_dependency "furnace", '= 0.2.4'
  s.add_runtime_dependency "trollop"
  end
@@ -688,6 +688,41 @@ module Furnace::AVM2
  alias :expr_pre_increment_local :expr_prepost_incdec_local
  alias :expr_pre_decrement_local :expr_prepost_incdec_local
 
+ PrePostIncDecSlot = Matcher.new do
+ [any,
+ capture(:index),
+ either[
+ [:get_global_scope],
+ [:get_scope_object, capture(:scope_pos)]
+ ]
+ ]
+ end
+
+ def expr_prepost_incdec_slot(opcode)
+ if captures = PrePostIncDecSlot.match(opcode)
+ scope = @scopes[captures[:scope_pos] || 0]
+
+ if @closure_slots && scope == :activation
+ slot_trait = @closure_slots[captures[:index]]
+ slot = token(VariableNameToken, slot_trait.name.name)
+
+ if opcode.type == :post_increment_slot
+ token(UnaryPostOperatorToken, slot, "++")
+ elsif opcode.type == :post_decrement_slot
+ token(UnaryPostOperatorToken, slot, "--")
+ elsif opcode.type == :pre_increment_slot
+ token(UnaryOperatorToken, slot, "++")
+ elsif opcode.type == :pre_decrement_slot
+ token(UnaryOperatorToken, slot, "--")
+ end
+ end
+ end
+ end
+ alias :expr_post_increment_slot :expr_prepost_incdec_slot
+ alias :expr_post_decrement_slot :expr_prepost_incdec_slot
+ alias :expr_pre_increment_slot :expr_prepost_incdec_slot
+ alias :expr_pre_decrement_slot :expr_prepost_incdec_slot
+
  OPERATOR_MAP = {
  :and => :"&&",
  :or => :"||",
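
The four-way `if`/`elsif` chain in `expr_prepost_incdec_slot` above maps each opcode type to an operator and a position. A table-driven sketch of the same mapping is shown below, assuming plain strings instead of the decompiler's token objects; `INC_DEC_SLOT_OPS` and `render_inc_dec` are illustrative names that do not appear in the gem.

```ruby
# Illustrative stand-in for the branch chain in the hunk above.
# Each opcode type maps to a position and an operator.
INC_DEC_SLOT_OPS = {
  post_increment_slot: [:postfix, "++"],
  post_decrement_slot: [:postfix, "--"],
  pre_increment_slot:  [:prefix,  "++"],
  pre_decrement_slot:  [:prefix,  "--"],
}.freeze

# Renders source text directly. The real method builds
# UnaryPostOperatorToken / UnaryOperatorToken objects instead.
def render_inc_dec(opcode_type, slot_name)
  entry = INC_DEC_SLOT_OPS[opcode_type]
  return nil unless entry # unknown opcode: nothing to emit

  position, operator = entry
  if position == :postfix
    "#{slot_name}#{operator}"
  else
    "#{operator}#{slot_name}"
  end
end

puts render_inc_dec(:post_increment_slot, "counter")
puts render_inc_dec(:pre_decrement_slot, "counter")
```

Like the method in the hunk, the sketch returns `nil` for anything it does not recognize, which leaves the decision about a fallback to the caller.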
@@ -15,7 +15,8 @@ module Furnace::AVM2
  SHORT_ASSIGN_OPERATORS = [ :add, :add_i, :subtract, :subtract_i, :multiply, :multiply_i,
  :divide, :modulo,
  :set_local, :set_local_0, :set_local_1, :set_local_2, :set_local_3,
- :new_catch, :new_activation ]
+ :new_catch, :new_activation,
+ :next_value ]
 
  def initialize(options)
  @validate = options[:validate] || false
@@ -59,7 +59,7 @@ module Furnace::AVM2
  next_label = next_node.metadata[:label] if next_node
 
  case node.type
- when :return_value, :return_void
+ when :return_value, :return_void, :throw
  cutoff(nil, [nil])
 
  when :jump
@@ -112,7 +112,7 @@ module Furnace::AVM2
  loop_stack.include?(@loop_tails[block])
  end
 
- def extended_block(block, stopgap=nil, loop_stack=[], nesting=0, upper_exc=nil)
+ def extended_block(block, stopgap=nil, loop_stack=[], nesting=0, upper_exc=nil, options={})
  nodes = []
  prev_block = nil
  current_exception = upper_exc
@@ -125,13 +125,20 @@ module Furnace::AVM2
  log nesting, "BLOCK: #{block.inspect}"
 
  if is_loop_head?(block, loop_stack)
- log nesting, "exit: loop head (continue stmt)"
+ if options[:infinite_loop_head]
+ # Infinite loop head is a special case where cti_block
+ # has back edges pointing to it, but just for once it
+ # should not be turned to (continue) statement.
+ options.delete(:infinite_loop_head)
+ else
+ log nesting, "exit: loop head (continue stmt)"
 
- check_nonlocal_loop(loop_stack, block) do |params|
- current_nodes << AST::Node.new(:continue, params)
- end
+ check_nonlocal_loop(loop_stack, block) do |params|
+ current_nodes << AST::Node.new(:continue, params)
+ end
 
- break
+ break
+ end
  elsif is_loop_tail?(block, loop_stack)
  log nesting, "exit: loop tail (break stmt)"
 
@@ -183,6 +190,8 @@ module Furnace::AVM2
  if block.cti.type == :lookup_switch
  log nesting, "is a switch"
 
+ append_instructions(block, current_nodes)
+
  # Group cases pointing to the same blocks of code.
  aliases = Hash[block.targets.each_index.
  group_by { |index| block.targets[index] }.values.
@@ -270,8 +279,8 @@ module Furnace::AVM2
  body.children << AST::Node.new(:break)
  end
 
- main_index = case_branches.index(next_branch)
- body = case_bodies[main_index]
+ main_index = block.targets.index(next_branch)
+ body = case_bodies[case_branches.index(next_branch)]
 
  [ main_index, *aliases[main_index] ].each do |index|
  if index == 0
@@ -299,7 +308,11 @@ module Furnace::AVM2
  block = exit_point
  elsif @loops.include?(block) && !@postcond_heads.include?(block)
  # we're trapped in a strange loop
- if block.insns.first == block.cti
+ if block.insns.first == block.cti &&
+ !(@loops[block].include?(block.targets.first) &&
+ @loops[block].include?(block.targets.last))
+ # Make sure that both branch targets don't reside within the
+ # loop. If they do, it's a do-while loop.
  log nesting, "is a while loop"
 
  loop_type = :head_cti
@@ -317,10 +330,15 @@ module Furnace::AVM2
  end
 
  if back_edges.count == 1
+ log nesting, "is a do-while loop"
+
  loop_type = :tail_cti
  cti_block = back_edges.first
  else
- raise "invalid back edge count"
+ log nesting, "is an infinite loop"
+
+ loop_type = :infinite
+ cti_block = block
  end
  end
 
@@ -335,9 +353,9 @@ module Furnace::AVM2
  reverse = !cti_block.cti.children[0]
  in_root, out_root = cti_block.targets
 
- # One of the branch targets should reside within
- # the loop.
  if !@loops[block].include?(in_root)
+ # One of the branch targets should reside within
+ # the loop.
  in_root, out_root = out_root, in_root
  reverse = !reverse
  end
@@ -366,7 +384,8 @@ module Furnace::AVM2
 
  append_instructions(block, body.children)
  else
- body = extended_block(in_root, nil, [ cti_block ] + loop_stack, nesting + 1, current_exception)
+ body = extended_block(in_root, nil, [ cti_block ] + loop_stack, nesting + 1, current_exception,
+ { infinite_loop_head: (loop_type == :infinite) })
  end
 
  # [(label name)]
@@ -463,9 +482,11 @@ module Furnace::AVM2
  if completely_dominated?(right_root, block)
  # Yes. Find merge point.
 
- # The function technically finds two merge points,
- # but in case of two heads they're identical.
- merge, = find_merge_point([ left_root, right_root ])
+ merge_left, merge_right = find_merge_point([ left_root, right_root ])
+
+ # One or both of the merge points could be nil, but they will
+ # not be different.
+ merge = merge_left || merge_right
 
  # If the merge search did not yield a valid node, use
  # stopgap for the current block to avoid runaway code
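
Taken together, the control flow hunks above distinguish three loop shapes: `:head_cti` (while), `:tail_cti` (do-while) and the new `:infinite` case that replaces the former `raise`. The sketch below restates that decision in isolation. It is a simplification under stated assumptions: `Block` and `classify_loop` are hypothetical names, blocks are referred to by symbol, and the surrounding code that collects back edges (not visible in the diff) is replaced by a plain argument.

```ruby
require "set"

# Hypothetical, simplified stand-in for a control flow graph block.
# starts_with_cti: true when the control transfer instruction is the
#                  first instruction of the block.
# targets:         names of the blocks the CTI can branch to.
Block = Struct.new(:name, :starts_with_cti, :targets)

# loop_blocks: Set of names of the blocks inside the loop headed by `head`.
# back_edges:  names of blocks inside the loop that jump back to `head`.
def classify_loop(head, loop_blocks, back_edges)
  both_inside = loop_blocks.include?(head.targets.first) &&
                loop_blocks.include?(head.targets.last)

  if head.starts_with_cti && !both_inside
    :head_cti  # while loop: the condition is tested at the head
  elsif back_edges.count == 1
    :tail_cti  # do-while loop: the condition is tested at the tail
  else
    :infinite  # previously an error ("invalid back edge count")
  end
end

head = Block.new(:a, true, [:b, :exit])
puts classify_loop(head, Set[:a, :b], [:b])
```

The `both_inside` test mirrors the added condition in the diff: a head whose branch targets both stay inside the loop cannot be the exit test of a while loop, so it falls through to the back edge count.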
@@ -23,10 +23,15 @@ module Furnace::AVM2
  end
 
  LocalIncDecMatcher = AST::Matcher.new do
- [:set_local, capture(:index),
+ [ either_multi[
+ [:set_slot, capture(:index), capture(:scope)],
+ [:set_local, capture(:index)],
+ ],
  either[
  [:convert, any,
  capture(:inner)],
+ [:coerce, :any,
+ capture(:inner)],
  capture(:inner)
  ]
  ]
@@ -36,13 +41,19 @@ module Furnace::AVM2
  [capture(:operator),
  either[
  [:convert, any,
- [:get_local, backref(:index)]],
- [:get_local, backref(:index)],
- capture(:abnormal)
+ capture(:getter)],
+ capture(:getter),
  ]
  ]
  end
 
+ LocalIncDecGetterMatcher = AST::Matcher.new do
+ either[
+ [:get_slot, backref(:index), backref(:scope)],
+ [:get_local, backref(:index)],
+ ]
+ end
+
  IncDecOperators = [
  :pre_increment, :post_increment,
  :pre_decrement, :post_decrement
@@ -51,22 +62,27 @@ module Furnace::AVM2
  def on_set_local(node)
  captures = {}
  if LocalIncDecMatcher.match(node, captures) &&
- LocalIncDecInnerMatcher.match(captures[:inner], captures)
- if IncDecOperators.include? captures[:operator]
- if captures[:abnormal]
- node.update(:add, [
- AST::Node.new(:set_local, [
- captures[:index],
- captures[:abnormal]
- ]),
- AST::Node.new(:integer, [ 1 ])
- ])
+ LocalIncDecInnerMatcher.match(captures[:inner], captures) &&
+ IncDecOperators.include?(captures[:operator])
+ if captures[:getter].is_a?(AST::Node) &&
+ LocalIncDecGetterMatcher.match(captures[:getter], captures)
+ if captures[:scope]
+ node.update(:"#{captures[:operator]}_slot", [ captures[:index], captures[:scope] ])
  else
  node.update(:"#{captures[:operator]}_local", [ captures[:index] ])
  end
+ else
+ node.update(:add, [
+ AST::Node.new(:set_local, [
+ captures[:index],
+ captures[:getter]
+ ]),
+ AST::Node.new(:integer, [ 1 ])
+ ])
  end
  end
  end
+ alias :on_set_slot :on_set_local
 
  ExpandedForInMatcher = AST::Matcher.new do
  [:if, [:has_next2, skip], skip]
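
The matchers in the hunks above rely on two ideas: `capture` remembers whatever sits at a position, and `backref` requires a later position to equal an earlier capture. The toy matcher below illustrates only those two ideas over nested arrays. It is not Furnace's `AST::Matcher`; `Capture`, `Backref`, `match` and `INC_PATTERN` are hypothetical names, and `either`, `any` and `skip` are left out.

```ruby
# Toy pattern matcher over nested arrays (not Furnace's AST::Matcher).
Capture = Struct.new(:name)  # remembers the subtree at this position
Backref = Struct.new(:name)  # must equal an earlier capture

def match(pattern, tree, captures = {})
  case pattern
  when Capture
    captures[pattern.name] = tree
    true
  when Backref
    captures.key?(pattern.name) && captures[pattern.name] == tree
  when Array
    tree.is_a?(Array) &&
      pattern.length == tree.length &&
      pattern.zip(tree).all? { |p, t| match(p, t, captures) }
  else
    pattern == tree
  end
end

# "Store into local N the result of <operator> applied to local N."
INC_PATTERN = [:set_local, Capture.new(:index),
               [Capture.new(:operator), [:get_local, Backref.new(:index)]]]

captures = {}
tree = [:set_local, 3, [:post_increment, [:get_local, 3]]]
if match(INC_PATTERN, tree, captures)
  puts "#{captures[:operator]}_local #{captures[:index]}"
end
```

The backreference is what separates `x = x++` style sequences from an unrelated store: if the inner `get_local` reads a different index, the match fails and the node is left alone.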
@@ -5,45 +5,60 @@ module Furnace::AVM2
  class PropagateConstants
  include AST::Visitor
 
- def transform(ast, *stuff)
- @local_nonconst = Set.new
- @local_sets = {}
- @local_gets = Hash.new { |h,k| h[k] = [] }
+ class Replacer
+ include AST::Visitor
 
- visit ast
+ def initialize(local_var, value)
+ @local_var, @value = local_var, value
+ end
 
- @local_sets.each do |index, set_node|
- *, value = set_node.children
+ def replace_in(nodes)
+ @nodes = nodes
+ @graceful_shutdown = true
 
- unless @local_nonconst.include? index
- @local_gets[index].each do |get_node|
- get_node.update(:find_property_strict,
- value.children.dup,
- get_node.metadata)
+ catch(:stop) {
+ @nodes.each do |node|
+ visit node
  end
+ }
+
+ @graceful_shutdown
+ end
 
- set_node.update(:nop, [])
+ def on_set_local(node)
+ index, value = node.children
+ if index == @local_var
+ @graceful_shutdown = @nodes.include?(node)
+ throw :stop
  end
  end
 
+ def on_get_local(node)
+ index, = node.children
+ if index == @local_var
+ node.update(@value.type, @value.children.dup, @value.metadata)
+ end
+ end
+ end
+
+ def transform(ast, *stuff)
+ visit ast
+
  [ ast, *stuff ]
  end
 
  def on_set_local(node)
  index, value = node.children
  if value.type == :find_property_strict
- if @local_sets.has_key?(index)
- @local_nonconst.add index
- else
- @local_sets[index] = node
+ block = node.parent
+ nodes = block.children[(block.children.index(node) + 1)..-1]
+
+ replacer = Replacer.new(index, value)
+ if replacer.replace_in(nodes)
+ node.update(:remove)
  end
  end
  end
-
- def on_get_local(node)
- index, = node.children
- @local_gets[index].push node
- end
  end
  end
  end
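
The new `Replacer` walks the statements that follow an assignment, substitutes reads of the variable, and uses `catch`/`throw` to stop as soon as the variable is assigned again. The sketch below shows that early-exit pattern on a flat list of statements. It is a simplified illustration, assuming a made-up encoding (`[:set, var, value]`, `[:get, var]`, `[:const, value]`) rather than Furnace AST nodes; `replace_reads` is a hypothetical name.

```ruby
# Replaces reads of `var` with a constant until `var` is reassigned.
# Statements use a made-up encoding, not Furnace AST nodes:
#   [:set, var, value]  assignment
#   [:get, var]         read
def replace_reads(statements, var, value)
  out = statements.map(&:dup) # leave the caller's arrays untouched

  catch(:stop) do
    out.each do |stmt|
      # A second assignment ends the region where the value is known.
      throw :stop if stmt[0] == :set && stmt[1] == var

      stmt.replace([:const, value]) if stmt == [:get, var]
    end
  end

  out
end

program = [[:get, 1], [:get, 2], [:set, 1, 9], [:get, 1]]
p replace_reads(program, 1, :Foo)
```

Unlike the old implementation, which gave up on any local that was assigned more than once, this forward scan can still propagate the value up to the point of reassignment.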
@@ -1,5 +1,5 @@
  module Furnace
  module AVM2
- VERSION = "1.0.0"
+ VERSION = "1.0.1"
  end
  end