memdump 0.1.0 → 0.2.0
- checksums.yaml +4 -4
- data/Gemfile +2 -0
- data/README.md +117 -8
- data/bin/memdump +1 -0
- data/lib/memdump.rb +19 -2
- data/lib/memdump/cli.rb +40 -21
- data/lib/memdump/common_ancestor.rb +44 -0
- data/lib/memdump/convert_to_gml.rb +23 -33
- data/lib/memdump/json_dump.rb +50 -7
- data/lib/memdump/memory_dump.rb +662 -0
- data/lib/memdump/out_degree.rb +7 -0
- data/lib/memdump/replace_class_address_by_name.rb +22 -5
- data/lib/memdump/version.rb +1 -1
- data/memdump.gemspec +2 -1
- metadata +21 -7
- data/lib/memdump/diff.rb +0 -44
- data/lib/memdump/stats.rb +0 -15
checksums.yaml
CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: deaf03849e0949a5cf0f6150ea598b02e55411cb
+  data.tar.gz: e8d53128488b1d0c83392b0b56eed940327d5a0a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bf78d4e885d83b66e47f8f642b90ba74117a3b7c5a6963ce602f0dbbbe5eab1c19ce2bce1a54f01290f787926a2170520b39e20661ce23ef9f4ee08d1fc2ee68
+  data.tar.gz: bbb79a73ec1e0dc13e42040380c12168ebdabc2e9c91d9801421aa025245c1eb3e6070b79f886642d275504ce74749afdd7bdc17c5ccd22d867db101164134b7
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -86,10 +86,10 @@ Allocation tracing is enabled with
 
 ~~~ ruby
 require 'objspace'
-ObjectSpace.
+ObjectSpace.trace_object_allocations_start
 ~~~
 
-##
+## Basic analysis
 
 The first thing you will probably want to do is to run the replace-class command
 on the dump. It replaces the class attribute, which in the original dump is the
@@ -105,13 +105,122 @@ count by class. For memory leaks, the **diff** command allows you to output the
 part of the graph that involves new objects (removing the
 "old-and-not-referred-to-by-new")
 
+Beyond this, analyzing the dump is best done through the interactive mode:
+
+```
+memdump interactive /tmp/mydump
+```
+
+will get you a pry shell in the context of the loaded MemoryDump object. Use
+the MemoryDump API to filter out what you need. If you're dealing with big dumps,
+it is usually a good idea to save them regularly with `#save`.
+
+One useful call to do at the beginning is #common_cleanup. It collapses the
+common collections (Array, Set, Hash) as well as internal bookkeeping objects
+(ICLASS, …). I usually run this, save the result and re-load the result (which
+is usually significantly smaller).
+
+After that, the usual process is to find out which non-standard classes are
+unexpectedly present in high numbers using `stats`, extract the objects from
+these classes with `dump = objects_of_class('classname')` and the subgraph that
+keeps them alive with `roots_of(dump)`
+
+```
+# Get the subgraph of all objects whose class name matches /Plan/ and export
+# it to GML to process with Gephi (see below)
+parent_dump, _ = roots_of(objects_of_class(/Plan/))
+parent_dump.to_gml('plan-subgraph.gml')
+```
+
+Once you start filtering dumps, don't forget to simplify your life by `cd`'ing
+into the context of the newly filtered dumps
+
 Beyond that, I usually go back and forth between the memory dump and
-[gephi](http://gephi.org), a graph analysis application.
-
-
-
-
-
+[gephi](http://gephi.org), a graph analysis application. `to_gml` allows you to
+convert the memory dump into a graph format that gephi can import. From there,
+use gephi's layouting and filtering algorithms to get an idea of the shape of
+the dump. Note that you first need to get a graph smaller than a few 10k objects
+before you can use gephi.
+
+## Dump diffs
+
+One powerful way to find out where memory is leaked is to look at objects that
+got allocated and find the interface between the long-term objects and these
+objects. memdump supports this by computing diffs.
+
+If you mean to use dump diffs you **MUST** enable allocation tracing. Not doing
+so will make the diffs inaccurate, as memdump will not be able to recognize that some
+object addresses have been reused after a garbage collection.
+
+Let's assume that we have "before.json" and "after.json" dumps. Start an interactive
+shell loading `before`:
+
+```
+memdump interactive before.json
+```
+
+Then, in the shell, let's load the after dump
+
+```
+> after = MemDump::JSONDump.load('after.json')
+```
+
+The set of objects that are in `after` but not in `before` is given by `#diff`
+
+```
+d = diff(after)
+```
+
+We'll also add a special marker to the records in `d` so that we can easily colorize
+them differently in Gephi.
+
+```
+d = d.map { |r| r['in_after'] = 1; r }
+```
+
+## Case 1: few new objects are linked to the old ones
+
+One possibility is that there are only a few objects in the diff that are kept
+alive from `before`. These objects in turn keep alive a lot more objects (which
+cause the noticeable memory leak). What's interesting in this case is to
+visualize the interface, that is, that set of objects.
+
+In memdump, one computes it with the `interface_with` method, which computes the
+interface between the receiver and the argument. The receiver must contain the
+edges between itself and the argument, which means in our case that we must use
+`after`.
+
+```
+self_border, diff_border = after.interface_with(d)
+```
+
+In addition to computing the border, it computes the count of objects that are
+kept alive by each object in `diff_border`. Each record in `diff_border` has an
+attribute called `keepalive_count` that counts the number of nodes in `after`
+that are reachable from (i.e. kept alive by) it. It is usually a good idea to
+visualize the distribution of `keepalive_count` to see whether there are indeed
+only a few nodes, and whether some are keeping a lot more objects alive than
+others. Note that cycles that involve more than one "border node" will be
+counted multiple times (so the sum of `keepalive_count` will be higher than
+`d.size`)
+
+```
+diff_border.size # is this much smaller than d.size ?
+diff_border.each_record.map { |r| r['keepalive_count'] }.sort.reverse # are there some high counts at the top ?
+```
+
+From there, one needs to do a bunch of back-and-forth between memdump and Gephi.
+What I usually do is start by dumping the whole subgraph that contains the border
+and visualize it. If I can't make any sense of it, I isolate the high-count elements
+in the border and visualize the related subgraph
+
+```
+full_subgraph = after.roots_of(diff_border)
+full_subgraph.to_gml 'full.gml'
+filtered_border = diff_border.find_all { |r| r['keepalive_count'] > 1000 }
+filtered_subgraph = after.roots_of(filtered_border)
+filtered_subgraph.to_gml 'filtered.gml'
+```
 
 ## Contributing
 
data/bin/memdump
CHANGED
data/lib/memdump.rb
CHANGED
@@ -1,5 +1,22 @@
+require 'rgl/adjacency'
+require 'rgl/dijkstra'
+require 'rgl/traversal'
+
 require "memdump/version"
+require 'memdump/json_dump'
+require 'memdump/memory_dump'
+
+require 'memdump/cleanup_references'
+require 'memdump/common_ancestor'
+require 'memdump/convert_to_gml'
+require 'memdump/out_degree'
+require 'memdump/remove_node'
+require 'memdump/replace_class_address_by_name'
+require 'memdump/root_of'
+require 'memdump/subgraph_of'
 
-module
-
+module MemDump
+    def self.pry(dump)
+        binding.pry
+    end
 end
data/lib/memdump/cli.rb
CHANGED
@@ -1,6 +1,6 @@
 require 'thor'
 require 'pathname'
-require 'memdump
+require 'memdump'
 
 module MemDump
     class CLI < Thor
@@ -17,17 +17,14 @@ module MemDump
 
         desc 'diff SOURCE TARGET OUTPUT', 'generate a memory dump that contains the objects in TARGET not in SOURCE, and all their parents'
         def diff(source, target, output)
-
-
-
-
-
-
-
-
-                io.puts JSON.dump(r)
-            end
-        end
+            from = MemDump::JSONDump.load(source)
+            to = MemDump::JSONDump.load(target)
+            diff = from.diff(to)
+            STDOUT.sync
+            puts "#{diff.size} nodes are in target but not in source"
+            diff = to.roots_of(diff)
+            puts "#{diff.size} nodes in final dump"
+            diff.save(output)
         end
 
         desc 'gml DUMP GML', 'converts a memory dump into a graph in the GML format (for processing by e.g. gephi)'
@@ -82,13 +79,9 @@ module MemDump
                 if output_path then Pathname.new(output_path)
                 else dump_path
                 end
-            dump = MemDump::JSONDump.
-
-
-            result.each do |r|
-                io.puts JSON.dump(r)
-            end
-        end
+            dump = MemDump::JSONDump.load(dump_path)
+            dump = dump.replace_class_id_by_class_name(add_reference_to_class: options[:add_ref])
+            dump.save(output_path)
         end
 
         desc 'cleanup-refs DUMP OUTPUT', "removes references to deleted objects"
@@ -121,13 +114,39 @@ module MemDump
         def stats(dump)
             require 'pp'
             require 'memdump/stats'
-            dump = MemDump::JSONDump.
-            unknown, by_type =
+            dump = MemDump::JSONDump.load(dump)
+            unknown, by_type = dump.stats
             puts "#{unknown} objects without a known type"
             by_type.sort_by { |n, v| v }.reverse.each do |n, v|
                 puts "#{n}: #{v}"
             end
         end
+
+        desc 'out_degree DUMP', 'display the direct count of objects held by each object in the dump'
+        option "min", desc: "hide the objects whose degree is lower than this",
+            type: :numeric
+        def out_degree(dump)
+            dump = MemDump::JSONDump.new(Pathname.new(dump))
+            min = options[:min] || 0
+            sorted = dump.each_record.sort_by { |r| (r['references'] || Array.new).size }
+            sorted.each do |r|
+                size = (r['references'] || Array.new).size
+                break if size > min
+                puts "#{size} #{r}"
+            end
+        end
+
+        desc 'interactive DUMP', 'loads a dump file and spawn a pry shell'
+        option :load, desc: 'load the whole dump in memory', type: :boolean, default: true
+        def interactive(dump)
+            require 'memdump'
+            require 'pry'
+            dump = MemDump::JSONDump.new(Pathname.new(dump))
+            if options[:load]
+                dump = dump.load
+            end
+            dump.pry
+        end
     end
 end
 
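The new `out_degree` command above sorts records by their direct reference count. A tiny standalone sketch of that computation, using hypothetical records rather than a real dump (records may lack a `references` key, which the CLI code also guards against):

```ruby
# Sketch of what `out_degree` computes: the number of direct references
# held by each record, sorted ascending.
records = [
  { 'address' => '0x1', 'references' => ['0x2', '0x3'] },
  { 'address' => '0x2', 'references' => [] },
  { 'address' => '0x3' }, # no 'references' key at all
]
sorted = records.sort_by { |r| (r['references'] || []).size }
sorted.each do |r|
  puts "#{(r['references'] || []).size} #{r['address']}"
end
```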
data/lib/memdump/common_ancestor.rb
ADDED

@@ -0,0 +1,44 @@
+module MemDump
+    def self.common_ancestors(dump, class_name, threshold: 0.1)
+        selected_records = Hash.new
+        remaining_records = Array.new
+        dump.each_record do |r|
+            if class_name === r['class']
+                selected_records[r['address']] = r
+            else
+                remaining_records << r
+            end
+        end
+
+        remaining_records = Array.new
+        selected_records = Hash.new
+        selected_root = root_address
+        dump.each_record do |r|
+            address = (r['address'] || r['root'])
+            if selected_root == address
+                selected_records[address] = r
+                selected_root = nil;
+            else
+                remaining_records << r
+            end
+        end
+
+        count = 0
+        while count != selected_records.size
+            count = selected_records.size
+            remaining_records.delete_if do |r|
+                references = r['references']
+                if references && references.any? { |a| selected_records.has_key?(a) }
+                    address = (r['address'] || r['root'])
+                    selected_records[address] = r
+                end
+            end
+        end
+
+        selected_records.values.reverse.each do |r|
+            if refs = r['references']
+                refs.delete_if { |a| !selected_records.has_key?(a) }
+            end
+        end
+    end
+end
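The core of `common_ancestors` is a fixpoint loop: any remaining record that references an already-selected address gets pulled into the selected set, and the loop repeats until a pass changes nothing. A standalone sketch of just that loop, with toy addresses instead of memdump's API:

```ruby
# Sketch of the fixpoint in common_ancestors: grow `selected` by moving
# in every remaining record that references a selected address.
selected = { '0xroot' => { 'address' => '0xroot', 'references' => [] } }
remaining = [
  { 'address' => '0xa', 'references' => ['0xroot'] },
  { 'address' => '0xb', 'references' => ['0xa'] },
  { 'address' => '0xc', 'references' => ['0xz'] }, # unrelated record
]
count = 0
while count != selected.size
  count = selected.size
  remaining.delete_if do |r|
    if (r['references'] || []).any? { |a| selected.key?(a) }
      selected[r['address']] = r # truthy => record removed from remaining
    end
  end
end
puts selected.keys.sort.join(',')
```

Note that records reachable only through several hops ('0xb' via '0xa') are picked up either later in the same pass or on a subsequent iteration, which is why the loop runs to a fixpoint.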
data/lib/memdump/convert_to_gml.rb
CHANGED

@@ -1,47 +1,37 @@
-require 'set'
-
 module MemDump
     def self.convert_to_gml(dump, io)
-        nodes = dump.each_record.map do |row|
-            if row['class_address'] # transformed with replace_class_address_by_name
-                name = row['class']
-            else
-                name = row['struct'] || row['root'] || row['type']
-            end
-
-            address = row['address'] || row['root']
-            refs = Hash.new
-            if row_refs = row['references']
-                row_refs.each { |r| refs[r] = nil }
-            end
-
-            [address, refs, name]
-        end
-
         io.puts "graph"
         io.puts "["
-
-
+
+        edges = []
+        dump.each_record do |row|
+            address = row['address']
+
             io.puts "  node"
             io.puts "  ["
             io.puts "    id #{address}"
-
+            row.each do |key, value|
+                if value.respond_to?(:to_str)
+                    io.puts "    #{key} \"#{value}\""
+                elsif value.kind_of?(Numeric)
+                    io.puts "    #{key} #{value}"
+                end
+            end
             io.puts "  ]"
-        end
 
-
-
-            io.puts "  edge"
-            io.puts "  ["
-            io.puts "    source #{address}"
-            io.puts "    target #{ref_address}"
-            if ref_label
-                io.puts "    label \"#{ref_label}\""
-            end
-            io.puts "  ]"
+            row['references'].each do |ref_address|
+                edges << address << ref_address
             end
         end
+
+        edges.each_slice(2) do |address, ref_address|
+            io.puts "  edge"
+            io.puts "  ["
+            io.puts "    source #{address}"
+            io.puts "    target #{ref_address}"
+            io.puts "  ]"
+        end
+
        io.puts "]"
    end
end
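The rewritten `convert_to_gml` emits one `node` block per record (keeping only string and numeric attributes), collects the edges while walking the records, and writes them out at the end. A self-contained sketch of the same structure, with two hypothetical records and a StringIO instead of a file:

```ruby
# Sketch of the GML structure the converter emits: nodes first (string
# and numeric attributes only), then one edge block per reference.
require 'stringio'

records = [
  { 'address' => 1, 'class' => 'Foo', 'references' => [2] },
  { 'address' => 2, 'class' => 'Bar', 'references' => [] },
]

io = StringIO.new
io.puts 'graph'
io.puts '['
edges = []
records.each do |row|
  io.puts '  node'
  io.puts '  ['
  io.puts "    id #{row['address']}"
  row.each do |key, value|
    if value.respond_to?(:to_str)
      io.puts "    #{key} \"#{value}\""   # strings are quoted
    elsif value.kind_of?(Numeric)
      io.puts "    #{key} #{value}"       # numbers are bare
    end                                   # arrays etc. are skipped
  end
  io.puts '  ]'
  row['references'].each { |ref| edges << row['address'] << ref }
end
edges.each_slice(2) do |source, target|
  io.puts '  edge'
  io.puts '  ['
  io.puts "    source #{source}"
  io.puts "    target #{target}"
  io.puts '  ]'
end
io.puts ']'
puts io.string
```

Deferring the edge blocks is what allows the converter to stream records in a single pass while still producing the nodes-then-edges layout that GML importers such as Gephi expect.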
data/lib/memdump/json_dump.rb
CHANGED
@@ -1,22 +1,65 @@
+require 'pathname'
 require 'json'
 module MemDump
     class JSONDump
+        def self.load(filename)
+            new(filename).load
+        end
+
         def initialize(filename)
-            @filename = filename
+            @filename = Pathname(filename)
         end
 
         def each_record
             return enum_for(__method__) if !block_given?
 
-
-
-
-
-
-
+            @filename.open do |f|
+                f.each_line do |line|
+                    r = JSON.parse(line)
+                    r['address'] ||= r['root']
+                    r['references'] ||= Set.new
+                    yield r
+                end
+            end
+        end
+
+        def load
+            address_to_record = Hash.new
+            generations = Hash.new
+            each_record do |r|
+                if !(address = r['address'])
+                    raise "no address in #{r}"
+                end
+                r = r.dup
+
+                if generation = r['generation']
+                    generations[address] = r['address'] = "#{address}:#{generation}"
+                end
+                r['references'] = r['references'].to_set
+                address_to_record[r['address']] = r
+            end
+
+            if !generations.empty?
+                address_to_record.each_value do |r|
+                    if class_address = r['class']
+                        r['class'] = generations.fetch(class_address, class_address)
+                    end
+                    if class_address = r['class_address']
+                        r['class_address'] = generations.fetch(class_address, class_address)
                    end
+
+                    refs = Set.new
+                    r['references'].each do |ref_address|
+                        refs << generations.fetch(ref_address, ref_address)
+                    end
+                    r['references'] = refs
                end
            end
+            MemoryDump.new(address_to_record)
+        end
+
+        def inspect
+            to_s
        end
    end
end
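The interesting part of the new `JSONDump#load` is the generation handling: when allocation tracing was on, each record carries a `generation` field, and the loader rewrites addresses to `address:generation`. That is what makes diffs reliable, since the same raw address can be reused by a different object after a GC cycle. A toy sketch of the rewrite (hand-written JSON lines, not memdump's API):

```ruby
# Sketch: two records sharing a reused address are kept distinct by
# appending the allocation generation to the address.
require 'json'

lines = [
  '{"address":"0x1","generation":3,"references":[]}',
  '{"address":"0x1","generation":7,"references":[]}',
]
records = {}
lines.each do |line|
  r = JSON.parse(line)
  if gen = r['generation']
    r['address'] = "#{r['address']}:#{gen}"
  end
  records[r['address']] = r
end
puts records.size
```

Without the generation suffix, the second record would silently overwrite the first in the address-indexed hash, which is exactly the inaccuracy the README warns about when allocation tracing is disabled.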
data/lib/memdump/memory_dump.rb
ADDED

@@ -0,0 +1,662 @@
+module MemDump
+    class MemoryDump
+        attr_reader :address_to_record
+
+        def initialize(address_to_record)
+            @address_to_record = address_to_record
+            @forward_graph = nil
+            @backward_graph = nil
+        end
+
+        def include?(address)
+            address_to_record.has_key?(address)
+        end
+
+        def each_record(&block)
+            address_to_record.each_value(&block)
+        end
+
+        def addresses
+            address_to_record.keys
+        end
+
+        def size
+            address_to_record.size
+        end
+
+        def find_by_address(address)
+            address_to_record[address]
+        end
+
+        def inspect
+            to_s
+        end
+
+        def save(io_or_path)
+            if io_or_path.respond_to?(:open)
+                io_or_path.open 'w' do |io|
+                    save(io)
+                end
+            else
+                each_record do |r|
+                    io_or_path.puts JSON.dump(r)
+                end
+            end
+        end
+
+        # Filter the records
+        #
+        # @yieldparam record a record
+        # @yieldreturn [Object] the record object that should be included in the
+        #   returned dump
+        # @return [MemoryDump]
+        def find_all
+            return enum_for(__method__) if !block_given?
+
+            address_to_record = Hash.new
+            each_record do |r|
+                if yield(r)
+                    address_to_record[r['address']] = r
+                end
+            end
+            MemoryDump.new(address_to_record)
+        end
+
+        # Map the records
+        #
+        # @yieldparam record a record
+        # @yieldreturn [Object] the record object that should be included in the
+        #   returned dump
+        # @return [MemoryDump]
+        def map
+            return enum_for(__method__) if !block_given?
+
+            address_to_record = Hash.new
+            each_record do |r|
+                address_to_record[r['address']] = yield(r.dup).to_hash
+            end
+            MemoryDump.new(address_to_record)
+        end
+
+        # Filter the entries, removing those for which the block returns falsy
+        #
+        # @yieldparam record a record
+        # @yieldreturn [nil,Object] either a record object, or falsy to remove
+        #   this record in the returned dump
+        # @return [MemoryDump]
+        def find_and_map
+            return enum_for(__method__) if !block_given?
+
+            address_to_record = Hash.new
+            each_record do |r|
+                if result = yield(r.dup)
+                    address_to_record[r['address']] = result.to_hash
+                end
+            end
+            MemoryDump.new(address_to_record)
+        end
+
+        # Return the records of a given type
+        #
+        # @param [String] name the type
+        # @return [MemoryDump] the matching records
+        #
+        # @example return all ICLASS (singleton) records
+        #   objects_of_class("ICLASS")
+        def objects_of_type(name)
+            find_all { |r| name === r['type'] }
+        end
+
+        # Return the records of a given class
+        #
+        # @param [String] name the class
+        # @return [MemoryDump] the matching entries
+        #
+        # @example return all string records
+        #   objects_of_class("String")
+        def objects_of_class(name)
+            find_all { |r| name === r['class'] }
+        end
+
+        # Return the entries that refer to the entries in the dump
+        #
+        # @param [MemoryDump] the set of entries whose parents we're looking for
+        # @param [Integer] min only return the entries in self that refer to
+        #   more than this much entries in 'dump'
+        # @param [Boolean] exclude_dump exclude the entries that are already in
+        #   'dump'
+        # @return [(MemoryDump,Hash)] the parent entries, and a mapping from
+        #   records in the parent entries to the count of entries in 'dump' they
+        #   refer to
+        def parents_of(dump, min: 0, exclude_dump: false)
+            children = dump.addresses.to_set
+            counts = Hash.new
+            filtered = find_all do |r|
+                next if exclude_dump && children.include?(r['address'])
+
+                count = r['references'].count { |r| children.include?(r) }
+                if count > min
+                    counts[r] = count
+                    true
+                end
+            end
+            return filtered, counts
+        end
+
+        # Remove entries from this dump, keeping the transitivity in the
+        # remaining graph
+        #
+        # @param [MemoryDump] entries entries to remove
+        #
+        # @example remove all entries that are of type HASH
+        #   collapse(objects_of_type('HASH'))
+        def collapse(entries)
+            collapsed_entries = Hash.new
+            entries.each_record do |r|
+                collapsed_entries[r['address']] = r['references'].dup
+            end
+
+            # Remove references in-between the entries to collapse
+            already_expanded = Hash.new { |h, k| h[k] = Set[k] }
+            begin
+                changed_entries = Hash.new
+                collapsed_entries.each do |address, references|
+                    sets = references.classify { |ref_address| collapsed_entries.has_key?(ref_address) }
+                    updated_references = sets[false] || Set.new
+                    if to_collapse = sets[true]
+                        to_collapse.each do |ref_address|
+                            next if already_expanded[address].include?(ref_address)
+                            updated_references.merge(collapsed_entries[ref_address])
+                        end
+                        already_expanded[address].merge(to_collapse)
+                        changed_entries[address] = updated_references
+                    end
+                end
+                puts "#{changed_entries.size} changed entries"
+                collapsed_entries.merge!(changed_entries)
+            end while !changed_entries.empty?
+
+            find_and_map do |record|
+                next if collapsed_entries.has_key?(record['address'])
+
+                sets = record['references'].classify do |ref_address|
+                    collapsed_entries.has_key?(ref_address)
+                end
+                updated_references = sets[false] || Set.new
+                if to_collapse = sets[true]
+                    to_collapse.each do |ref_address|
+                        updated_references.merge(collapsed_entries[ref_address])
+                    end
+                    record = record.dup
+                    record['references'] = updated_references
+                end
+                record
+            end
+        end
+
+        # Remove entries from the dump, and all references to them
+        #
+        # @param [MemoryDump] the set of entries to remove, as e.g. returned by
+        #   {#objects_of_class}
+        # @return [MemoryDump] the filtered dump
+        def without(entries)
+            find_and_map do |record|
+                next if entries.include?(record['address'])
+                record_refs = record['references']
+                references = record_refs.find_all { |r| !entries.include?(r) }
+                if references.size != record_refs.size
+                    record = record.dup
+                    record['references'] = references.to_set
+                end
+                record
+            end
+        end
+
+        # Write the dump to a GML file that can loaded by Gephi
+        #
+        # @param [Pathname,String,IO] the path or the IO stream into which we should
+        #   dump
+        def to_gml(io_or_path)
+            if io_or_path.kind_of?(IO)
+                MemDump.convert_to_gml(self, io_or_path)
+            else
+                Pathname(io_or_path).open 'w' do |io|
+                    to_gml(io)
+                end
+            end
+            nil
+        end
+
+        # Save the dump
+        def save(io_or_path)
+            if io_or_path.kind_of?(IO)
+                each_record do |r|
+                    r = r.dup
+                    r['address'] = r['address'].gsub(/:\d+$/, '')
+                    if r['class_address']
+                        r['class_address'] = r['class_address'].gsub(/:\d+$/, '')
+                    elsif r['address']
+                        r['address'] = r['address'].gsub(/:\d+$/, '')
+                    end
+                    r['references'] = r['references'].map { |ref_addr| ref_addr.gsub(/:\d+$/, '') }
+                    io_or_path.puts JSON.dump(r)
+                end
+                nil
+            else
+                Pathname(io_or_path).open 'w' do |io|
+                    save(io)
+                end
+            end
+        end
+
+        COMMON_COLLAPSE_TYPES = %w{IMEMO HASH ARRAY}
+        COMMON_COLLAPSE_CLASSES = %w{Set RubyVM::Env}
+
+        # Perform common initial cleanup
+        #
+        # It basically removes common classes that usually make a dump analysis
+        # more complicated without providing more information
+        #
+        # Namely, it collapses internal Ruby node types ROOT and IMEMO, as well
+        # as common collection classes {COMMON_COLLAPSE_CLASSES}.
+        #
+        # One usually analyses a cleaned-up dump before getting into the full
+        # dump
+        #
+        # @return [MemDump] the filtered dump
+        def common_cleanup
+            without_weakrefs = remove(objects_of_class 'WeakRef')
+            to_collapse = without_weakrefs.find_all do |r|
+                COMMON_COLLAPSE_CLASSES.include?(r['class']) ||
+                    COMMON_COLLAPSE_TYPES.include?(r['type']) ||
+                    r['method'] == 'dump_all'
+            end
+            without_weakrefs.collapse(to_collapse)
+        end
+
+        # Remove entries in the reference for which we can't find an object with
+        # the matching address
+        #
+        # @return [(MemoryDump,Set)] the filtered dump and the set of missing addresses found
+        def remove_invalid_references
+            addresses = self.addresses.to_set
+            missing = Set.new
+            result = map do |r|
+                common = (addresses & r['references'])
+                if common.size != r['references'].size
+                    missing.merge(r['references'] - common)
+                end
+                r = r.dup
+                r['references'] = common
+                r
+            end
+            return result, missing
+        end
+
+        # Return the graph of object that keeps objects in dump alive
+        #
+        # It contains only the shortest paths from the roots to the objects in
+        # dump
+        #
+        # @param [MemoryDump] dump
+        # @return [MemoryDump]
+        def roots_of(dump, root_dump: nil)
+            if root_dump && root_dump.empty?
+                raise ArgumentError, "no roots provided"
+            end
+
+            root_addresses =
+                if root_dump then root_dump.addresses
+                else
+                    ['ALL_ROOTS']
+                end
+
+            ensure_graphs_computed
+
+            result_nodes = Set.new
+            dump_addresses = dump.addresses
+            root_addresses.each do |root_address|
+                visitor = RGL::DijkstraVisitor.new(@forward_graph)
+                dijkstra = RGL::DijkstraAlgorithm.new(@forward_graph, Hash.new(1), visitor)
+                dijkstra.find_shortest_paths(root_address)
+                path_builder = RGL::PathBuilder.new(root_address, visitor.parents_map)
+
+                dump_addresses.each_with_index do |record_address, record_i|
+                    if path = path_builder.path(record_address)
+                        result_nodes.merge(path)
+                    end
+                end
+            end
+
+            find_and_map do |record|
+                address = record['address']
+                next if !result_nodes.include?(address)
+
+                # Prefer records in 'dump' to allow for annotations in the
+                # source
+                record = dump.find_by_address(address) || record
+                record = record.dup
+                record['references'] = result_nodes & record['references']
+                record
+            end
+        end
+
+        def minimum_spanning_tree(root_dump)
+            if root_dump.size != 1
+                raise ArgumentError, "there should be exactly one root"
+            end
+            root_address, _ = root_dump.address_to_record.first
+            if !(root = address_to_record[root_address])
+                raise ArgumentError, "no record with address #{root_address} in self"
+            end
+
+            ensure_graphs_computed
+
+            mst = @forward_graph.minimum_spanning_tree(root)
+            map = Hash.new
+            mst.each_vertex do |record|
+                record = record.dup
+                record['references'] = record['references'].dup
+                record['references'].delete_if { |ref_address| !mst.has_vertex?(ref_address) }
+            end
+            MemoryDump.new(map)
+        end
+
+        # @api private
+        #
+        # Ensure that @forward_graph and @backward_graph are computed
+        def ensure_graphs_computed
+            if !@forward_graph
+                @forward_graph, @backward_graph = compute_graphs
+            end
+        end
+
+        # @api private
+        #
+        # Force recomputation of the graph representation of the dump the next
+        # time it is needed
+        def clear_graph
+            @forward_graph = nil
+            @backward_graph = nil
+        end
+
+        # @api private
+        #
+        # Create two RGL::DirectedAdjacencyGraph, for the forward and backward edges of the graph
+        def compute_graphs
+            forward_graph = RGL::DirectedAdjacencyGraph.new
+            forward_graph.add_vertex 'ALL_ROOTS'
+            address_to_record.each do |address, record|
+                forward_graph.add_vertex(address)
+
+                if record['type'] == 'ROOT'
+                    forward_graph.add_edge('ALL_ROOTS', address)
+                end
+                record['references'].each do |ref_address|
+                    forward_graph.add_edge(address, ref_address)
+                end
+            end
+
+            backward_graph = RGL::DirectedAdjacencyGraph.new
+            forward_graph.each_edge do |u, v|
+                backward_graph.add_edge(v, u)
+            end
+            return forward_graph, backward_graph
+        end
+
+        def depth_first_visit(root, &block)
+            ensure_graphs_computed
+            @forward_graph.depth_first_visit(root, &block)
+        end
+
+        # Validate that all reference entries have a matching dump entry
+        #
+        # @raise [RuntimeError] if references have been found
+        def validate_references
+            addresses = self.addresses.to_set
+            each_record do |r|
+                common = addresses & r['references']
+                if common.size != r['references'].size
+                    missing = r['references'] - common
+                    raise "#{r} references #{missing.to_a.sort.join(", ")} which do not exist"
+                end
+            end
+            nil
+        end
+
+        # Get a random sample of the records
+        #
+        # The sampling is random, so the returned set might be bigger or smaller
+        # than expected. Do not use on small sets.
+        #
+        # @param [Float] the ratio of selected samples vs. total samples (0.1
+        #   will select approximately 10% of the samples)
+        def sample(ratio)
+            result = Hash.new
+            each_record do |record|
+                if rand <= ratio
+                    result[record['address']] = record
+                end
+            end
+            MemoryDump.new(result)
+        end
+
+        # @api private
+        #
+        # Return the set of record addresses that are the addresses of roots in
+        # the live graph
+        #
+        # @return [Set<String>]
+        def root_addresses
+            roots = self.addresses.to_set.dup
+            each_record do |r|
+                roots.subtract(r['references'])
+            end
+            roots
+        end
+
+        # Returns the set of roots
+        def roots(with_keepalive_count: false)
+            result = Hash.new
+            self.root_addresses.each do |addr|
+                record = find_by_address(addr)
+                if with_keepalive_count
+                    record = record.dup
+                    count = 0
+                    depth_first_visit(addr) { count += 1 }
+                    record['keepalive_count'] = count
+                end
+                result[addr] = record
+            end
+            MemoryDump.new(result)
+        end
+
+        def add_children(roots, with_keepalive_count: false)
+            result = Hash.new
+            roots.each_record do |root_record|
+                result[root_record['address']] = root_record
+
+                root_record['references'].each do |addr|
+                    ref_record = find_by_address(addr)
+                    next if !ref_record
+
+                    if with_keepalive_count
+                        ref_record = ref_record.dup
+                        count = 0
+                        depth_first_visit(addr) { count += 1 }
+                        ref_record['keepalive_count'] = count
+                    end
+                    result[addr] = ref_record
+                end
+            end
+            MemoryDump.new(result)
+        end
+
+        def dup
+            find_all { true }
+        end
+
+        # Simply remove the given objects
+        def remove(objects)
+            removed_addresses = objects.addresses.to_set
+            return dup if removed_addresses.empty?
+
+            find_and_map do |r|
+                if !removed_addresses.include?(r['address'])
+                    references = r['references'].dup
+                    references.delete_if { |a| removed_addresses.include?(a) }
|
509
|
+
r['references'] = references
|
510
|
+
r
|
511
|
+
end
|
512
|
+
end
|
513
|
+
end
|
514
|
+
|
515
|
+
# Remove all components that are smaller than the given number of nodes
|
516
|
+
#
|
517
|
+
# It really looks only at the number of nodes reachable from a root
|
518
|
+
# (i.e. won't notice if two smaller-than-threshold roots have nodes in
|
519
|
+
# common)
|
520
|
+
def remove_small_components(max_size: 1)
|
521
|
+
roots = self.addresses.to_set.dup
|
522
|
+
leaves = Set.new
|
523
|
+
each_record do |r|
|
524
|
+
refs = r['references']
|
525
|
+
if refs.empty?
|
526
|
+
leaves << r['address']
|
527
|
+
else
|
528
|
+
roots.subtract(r['references'])
|
529
|
+
end
|
530
|
+
end
|
531
|
+
|
532
|
+
to_remove = Set.new
|
533
|
+
roots.each do |root_address|
|
534
|
+
component = Set[]
|
535
|
+
queue = Set[root_address]
|
536
|
+
while !queue.empty? && (component.size <= max_size)
|
537
|
+
address = queue.first
|
538
|
+
queue.delete(address)
|
539
|
+
next if component.include?(address)
|
540
|
+
component << address
|
541
|
+
queue.merge(address_to_record[address]['references'])
|
542
|
+
end
|
543
|
+
|
544
|
+
if component.size <= max_size
|
545
|
+
to_remove.merge(component)
|
546
|
+
end
|
547
|
+
end
|
548
|
+
|
549
|
+
without(find_all { |r| to_remove.include?(r['address']) })
|
550
|
+
end
|
551
|
+
|
552
|
+
def stats
|
553
|
+
unknown_class = 0
|
554
|
+
by_class = Hash.new(0)
|
555
|
+
each_record do |r|
|
556
|
+
if klass = (r['class'] || r['type'] || r['root'])
|
557
|
+
by_class[klass] += 1
|
558
|
+
else
|
559
|
+
unknown_class += 1
|
560
|
+
end
|
561
|
+
end
|
562
|
+
return unknown_class, by_class
|
563
|
+
end
|
564
|
+
|
565
|
+
# Compute the set of records that are not in self but are in to
|
566
|
+
#
|
567
|
+
# @param [MemoryDump]
|
568
|
+
# @return [MemoryDump]
|
569
|
+
def diff(to)
|
570
|
+
diff = Hash.new
|
571
|
+
to.each_record do |r|
|
572
|
+
address = r['address']
|
573
|
+
if !@address_to_record.include?(address)
|
574
|
+
diff[address] = r
|
575
|
+
end
|
576
|
+
end
|
577
|
+
MemoryDump.new(diff)
|
578
|
+
end
|
579
|
+
|
580
|
+
# Compute the interface between self and the other dump, that is the
|
581
|
+
# elements of self that have a child in dump, and the elements of dump
|
582
|
+
# that have a parent in self
|
583
|
+
def interface_with(dump)
|
584
|
+
self_border = Hash.new
|
585
|
+
dump_border = Hash.new
|
586
|
+
each_record do |r|
|
587
|
+
next if dump.find_by_address(r['address'])
|
588
|
+
|
589
|
+
refs_in_dump = r['references'].map do |addr|
|
590
|
+
dump.find_by_address(addr)
|
591
|
+
end.compact
|
592
|
+
|
593
|
+
if !refs_in_dump.empty?
|
594
|
+
self_border[r['address']] = r
|
595
|
+
refs_in_dump.each do |child|
|
596
|
+
dump_border[child['address']] = child.dup
|
597
|
+
end
|
598
|
+
end
|
599
|
+
end
|
600
|
+
|
601
|
+
self_border = MemoryDump.new(self_border)
|
602
|
+
dump_border = MemoryDump.new(dump_border)
|
603
|
+
|
604
|
+
dump.update_keepalive_count(dump_border)
|
605
|
+
return self_border, dump_border
|
606
|
+
end
|
607
|
+
|
608
|
+
# Replace all objects in dump by a single "group" object
|
609
|
+
def group(name, dump, attributes = Hash.new)
|
610
|
+
group_addresses = Set.new
|
611
|
+
group_references = Set.new
|
612
|
+
dump.each_record do |r|
|
613
|
+
group_addresses << r['address']
|
614
|
+
group_references.merge(r['references'])
|
615
|
+
end
|
616
|
+
group_record = attributes.dup
|
617
|
+
group_record['address'] = name
|
618
|
+
group_record['references'] = group_references - group_addresses
|
619
|
+
|
620
|
+
updated = Hash[name => group_record]
|
621
|
+
each_record do |record|
|
622
|
+
next if group_addresses.include?(record['address'])
|
623
|
+
|
624
|
+
updated_record = record.dup
|
625
|
+
updated_record['references'] -= group_addresses
|
626
|
+
if updated_record['references'].size != record['references'].size
|
627
|
+
updated_record['references'] << name
|
628
|
+
end
|
629
|
+
|
630
|
+
if group_addresses.include?(updated_record['class_address'])
|
631
|
+
updated_record['class_address'] = name
|
632
|
+
end
|
633
|
+
if group_addresses.include?(updated_record['class'])
|
634
|
+
updated_record['class'] = name
|
635
|
+
end
|
636
|
+
|
637
|
+
updated[updated_record['address']] = updated_record
|
638
|
+
end
|
639
|
+
|
640
|
+
MemoryDump.new(updated)
|
641
|
+
end
|
642
|
+
|
643
|
+
def update_keepalive_count(dump)
|
644
|
+
ensure_graphs_computed
|
645
|
+
dump.each_record do |record|
|
646
|
+
count = 0
|
647
|
+
dump.depth_first_visit(record['address']) { |obj| count += 1 }
|
648
|
+
record['keepalive_count'] = count
|
649
|
+
record
|
650
|
+
end
|
651
|
+
end
|
652
|
+
|
653
|
+
def replace_class_id_by_class_name(add_reference_to_class: false)
|
654
|
+
MemDump.replace_class_address_by_name(self, add_reference_to_class: add_reference_to_class)
|
655
|
+
end
|
656
|
+
|
657
|
+
def to_s
|
658
|
+
"#<MemoryDump size=#{size}>"
|
659
|
+
end
|
660
|
+
end
|
661
|
+
end
|
662
|
+
|
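The subtraction scheme behind `root_addresses` is small enough to sketch standalone: a root is exactly an address that no record references. A minimal sketch on hypothetical toy records, using the same address-to-record hash layout the dump code works with:

~~~ ruby
require 'set'

# Toy dump in the same address => record layout the dump code uses
records = {
  '0x1' => { 'address' => '0x1', 'references' => ['0x2', '0x3'] },
  '0x2' => { 'address' => '0x2', 'references' => ['0x3'] },
  '0x3' => { 'address' => '0x3', 'references' => [] },
  '0x4' => { 'address' => '0x4', 'references' => ['0x2'] }
}

# Start from every address and subtract everything that appears
# as a reference; what survives is never pointed to, i.e. a root
roots = records.keys.to_set
records.each_value { |r| roots.subtract(r['references']) }

puts roots.to_a.sort.inspect  # → ["0x1", "0x4"]
~~~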
@@ -2,23 +2,40 @@ module MemDump
     # Replace the address in the 'class' attribute by the class name
     def self.replace_class_address_by_name(dump, add_reference_to_class: false)
         class_names = Hash.new
+        iclasses = Hash.new
         dump.each_record do |row|
             if row['type'] == 'CLASS' || row['type'] == 'MODULE'
                 class_names[row['address']] = row['name']
+            elsif row['type'] == 'ICLASS' || row['type'] == "IMEMO"
+                iclasses[row['address']] = row
             end
         end
 
-
+        iclass_size = 0
+        while !iclasses.empty? && (iclass_size != iclasses.size)
+            iclass_size = iclasses.size
+            iclasses.delete_if do |_, r|
+                if (klass = r['class']) && (class_name = class_names[klass])
+                    class_names[r['address']] = "I(#{class_name})"
+                    r['class'] = class_name
+                    r['class_address'] = klass
+                    if add_reference_to_class
+                        (r['references'] ||= Set.new) << klass
+                    end
+                    true
+                end
+            end
+        end
+
+        dump.map do |r|
             if klass = r['class']
+                r = r.dup
                 r['class'] = class_names[klass] || klass
                 r['class_address'] = klass
                 if add_reference_to_class
-                    (r['references'] ||=
+                    (r['references'] ||= Set.new) << klass
                 end
             end
-            if r['type'] == 'ICLASS'
-                r['class'] = "I(#{r['class']})"
-            end
             r
         end
     end
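The basic resolution pass above (mapping a class address to its name) can be sketched standalone. This ignores the ICLASS/IMEMO fixpoint loop and uses hypothetical addresses:

~~~ ruby
# Hypothetical records: one CLASS entry and one instance whose 'class'
# field holds the class address rather than its name
records = [
  { 'address' => '0xc0', 'type' => 'CLASS', 'name' => 'Array', 'references' => [] },
  { 'address' => '0x10', 'type' => 'OBJECT', 'class' => '0xc0', 'references' => [] }
]

# First pass: index class/module names by address
class_names = {}
records.each do |r|
  if r['type'] == 'CLASS' || r['type'] == 'MODULE'
    class_names[r['address']] = r['name']
  end
end

# Second pass: swap the address for the name, keeping the address
# around in 'class_address' for later lookups
resolved = records.map do |r|
  if (klass = r['class'])
    r = r.dup
    r['class'] = class_names[klass] || klass
    r['class_address'] = klass
  end
  r
end

puts resolved[1]['class']  # → Array
~~~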
data/lib/memdump/version.rb CHANGED
data/memdump.gemspec CHANGED
@@ -20,7 +20,8 @@ Gem::Specification.new do |spec|
     spec.require_paths = ["lib"]
 
     spec.add_dependency 'thor'
-    spec.add_dependency '
+    spec.add_dependency 'rgl'
+    spec.add_dependency 'pry'
     spec.add_development_dependency "bundler", "~> 1.11"
     spec.add_development_dependency "rake", "~> 10.0"
     spec.add_development_dependency "minitest", "~> 5.0"
metadata CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: memdump
 version: !ruby/object:Gem::Version
-  version: 0.1.0
+  version: 0.2.0
 platform: ruby
 authors:
 - Sylvain Joyeux
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2018-02-03 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: thor
@@ -25,7 +25,21 @@ dependencies:
   - !ruby/object:Gem::Version
     version: '0'
 - !ruby/object:Gem::Dependency
-  name:
+  name: rgl
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: pry
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
@@ -98,13 +112,14 @@ files:
 - lib/memdump.rb
 - lib/memdump/cleanup_references.rb
 - lib/memdump/cli.rb
+- lib/memdump/common_ancestor.rb
 - lib/memdump/convert_to_gml.rb
-- lib/memdump/diff.rb
 - lib/memdump/json_dump.rb
+- lib/memdump/memory_dump.rb
+- lib/memdump/out_degree.rb
 - lib/memdump/remove_node.rb
 - lib/memdump/replace_class_address_by_name.rb
 - lib/memdump/root_of.rb
-- lib/memdump/stats.rb
 - lib/memdump/subgraph_of.rb
 - lib/memdump/version.rb
 - memdump.gemspec
@@ -128,9 +143,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.5.1
 signing_key:
 specification_version: 4
 summary: Tools to manipulate Ruby 2.1+ memory dumps
 test_files: []
-has_rdoc:
data/lib/memdump/diff.rb DELETED
@@ -1,44 +0,0 @@
require 'set'

module MemDump
    def self.diff(from, to)
        from_objects = Set.new
        from.each_record { |r| from_objects << (r['address'] || r['root']) }
        puts "#{from_objects.size} objects found in source dump"

        selected_records = Hash.new
        remaining_records = Array.new
        to.each_record do |r|
            address = (r['address'] || r['root'])
            if !from_objects.include?(address)
                selected_records[address] = r
                r['only_in_target'] = 1
            else
                remaining_records << r
            end
        end

        total = remaining_records.size + selected_records.size
        count = 0
        while selected_records.size != count
            count = selected_records.size
            puts "#{count}/#{total} records selected so far"
            remaining_records.delete_if do |r|
                address = (r['address'] || r['root'])
                references = r['references']

                if references && references.any? { |r| selected_records.has_key?(r) }
                    selected_records[address] = r
                end
            end
        end
        puts "#{count}/#{total} records selected"

        selected_records.each_value do |r|
            if references = r['references']
                references.delete_if { |a| !selected_records.has_key?(a) }
            end
        end
        selected_records.each_value
    end
end
data/lib/memdump/stats.rb DELETED
@@ -1,15 +0,0 @@
module MemDump
    def self.stats(memdump)
        unknown_class = 0
        by_class = Hash.new(0)
        memdump.each_record do |r|
            if klass = (r['class'] || r['type'] || r['root'])
                by_class[klass] += 1
            else
                unknown_class += 1
            end
        end
        return unknown_class, by_class
    end
end
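The deleted `MemDump.stats` helper survives as `MemoryDump#stats`; the tally itself is only a few lines, sketched here on toy records:

~~~ ruby
# Toy records; each is classified by 'class', falling back to 'type'
# and then 'root', as in the stats tally
records = [
  { 'class' => 'Array' },
  { 'class' => 'Array' },
  { 'type' => 'STRING' },
  {}
]

unknown_class = 0
by_class = Hash.new(0)  # default count of 0 per class
records.each do |r|
  if (klass = r['class'] || r['type'] || r['root'])
    by_class[klass] += 1
  else
    unknown_class += 1
  end
end

puts by_class['Array']  # → 2
puts unknown_class      # → 1
~~~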