memdump 0.1.0 → 0.2.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/Gemfile +2 -0
- data/README.md +117 -8
- data/bin/memdump +1 -0
- data/lib/memdump.rb +19 -2
- data/lib/memdump/cli.rb +40 -21
- data/lib/memdump/common_ancestor.rb +44 -0
- data/lib/memdump/convert_to_gml.rb +23 -33
- data/lib/memdump/json_dump.rb +50 -7
- data/lib/memdump/memory_dump.rb +662 -0
- data/lib/memdump/out_degree.rb +7 -0
- data/lib/memdump/replace_class_address_by_name.rb +22 -5
- data/lib/memdump/version.rb +1 -1
- data/memdump.gemspec +2 -1
- metadata +21 -7
- data/lib/memdump/diff.rb +0 -44
- data/lib/memdump/stats.rb +0 -15
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: deaf03849e0949a5cf0f6150ea598b02e55411cb
+  data.tar.gz: e8d53128488b1d0c83392b0b56eed940327d5a0a
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: bf78d4e885d83b66e47f8f642b90ba74117a3b7c5a6963ce602f0dbbbe5eab1c19ce2bce1a54f01290f787926a2170520b39e20661ce23ef9f4ee08d1fc2ee68
+  data.tar.gz: bbb79a73ec1e0dc13e42040380c12168ebdabc2e9c91d9801421aa025245c1eb3e6070b79f886642d275504ce74749afdd7bdc17c5ccd22d867db101164134b7
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -86,10 +86,10 @@ Allocation tracing is enabled with
 
 ~~~ ruby
 require 'objspace'
-ObjectSpace.
+ObjectSpace.trace_object_allocations_start
 ~~~
 
-##
+## Basic analysis
 
 The first thing you will probably want to do is to run the replace-class command
 on the dump. It replaces the class attribute, which in the original dump is the
@@ -105,13 +105,122 @@ count by class. For memory leaks, the **diff** command allows you to output the
 part of the graph that involves new objects (removing the
 "old-and-not-referred-to-by-new")
 
+Beyond this, analyzing the dump is best done through the interactive mode:
+
+```
+memdump interactive /tmp/mydump
+```
+
+will get you a pry shell in the context of the loaded MemoryDump object. Use
+the MemoryDump API to filter out what you need. If you're dealing with big dumps,
+it is usually a good idea to save them regularly with `#dump`.
+
+One useful call to do at the beginning is #common_cleanup. It collapses the
+common collections (Array, Set, Hash) as well as internal bookkeeping objects
+(ICLASS, …). I usually run this, save the result and re-load the result (which
+is usually significantly smaller).
+
+After that, the usual process is to find out which non-standard classes are
+unexpectedly present in high numbers using `stats`, extract the objects from
+these classes with `dump = objects_of_class('classname')` and the subgraph that
+keeps them alive with `roots_of(dump)`
+
+```
+# Get the subgraph of all objects whose class name matches /Plan/ and export
+# it to GML to process with Gephi (see below)
+parent_dump, _ = roots_of(objects_of_class(/Plan/))
+parent_dump.to_gml('plan-subgraph.gml')
+```
+
+Once you start filtering dumps, don't forget to simplify your life by `cd`'ing
+in the context of the newly filtered dumps
+
 Beyond that, I usually go back and forth between the memory dump and
-[gephi](http://gephi.org), a graph analysis application.
-
-
-
-
-
+[gephi](http://gephi.org), a graph analysis application. `to_gml` allows you to
+convert the memory dump into a graph format that gephi can import. From there,
+use gephi's layouting and filtering algorithms to get an idea of the shape of
+the dump. Note that you first need to get the graph below a few tens of
+thousands of objects before you can use gephi.
+
+## Dump diffs
+
+One powerful way to find out where memory is leaked is to look at objects that
+got allocated and find the interface between the long-term objects and these
+objects. memdump supports this by computing diffs.
+
+If you mean to use dump diffs you **MUST** enable allocation tracing. Not doing
+so will make the diffs inaccurate, as memdump will not be able to recognize that some
+object addresses have been reused after a garbage collection.
+
+Let's assume that we have "before.json" and "after.json" dumps. Start an interactive
+shell loading `before`:
+
+```
+memdump interactive before.json
+```
+
+Then, in the shell, let's load the after dump
+
+```
+> after = MemDump::JSONDump.load('after.json')
+```
+
+The set of objects that are in `after` but not in `before` is given by `#diff`
+
+```
+d = diff(after)
+```
+
+We'll also add a special marker to the records in `d` so that we can easily colorize
+them differently in Gephi.
+
+```
+d = d.map { |r| r['in_after'] = 1; r }
+```
+
+## Case 1: few new objects are linked to the old ones
+
+One possibility is that there are only a few objects in the diff that are kept
+alive from `before`. These objects in turn keep alive a lot more objects (which
+cause the noticeable memory leak). What's interesting in this case is to
+visualize the interface, that is, that set of objects.
+
+In memdump, one computes it with the `interface_with` method, which computes the
+interface between the receiver and the argument. The receiver must contain the
+edges between itself and the argument, which means in our case that we must use
+`after`.
+
+```
+self_border, diff_border = after.interface_with(d)
+```
+
+In addition to computing the border, it computes the count of objects that are
+kept alive by each object in `diff_border`. Each record in `diff_border` has an
+attribute called `keepalive_count` that counts the number of nodes in `after`
+that are reachable from (i.e. kept alive by) it. It is usually a good idea to
+visualize the distribution of `keepalive_count` to see whether there are indeed
+only a few nodes, and whether some are keeping a lot more objects alive than
+others. Note that cycles that involve more than one "border node" will be
+counted multiple times (so the sum of `keepalive_count` will be higher than
+`d.size`)
+
+```
+diff_border.size # is this much smaller than d.size ?
+diff_border.each_record.map { |r| r['keepalive_count'] }.sort.reverse # are there some high counts at the top ?
+```
+
+From there, one needs to do a bunch of back-and-forth between memdump and Gephi.
+What I usually do is start by dumping the whole subgraph that contains the border
+and visualize it. If I can't make any sense of it, I isolate the high-count elements
+in the border and visualize the related subgraph
+
+```
+full_subgraph = after.roots_of(diff_border)
+full_subgraph.to_gml 'full.gml'
+filtered_border = diff_border.find_all { |r| r['keepalive_count'] > 1000 }
+filtered_subgraph = after.roots_of(filtered_border)
+filtered_subgraph.to_gml 'filtered.gml'
+```
 
 ## Contributing
 
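The address-reuse pitfall that the "Dump diffs" section of the README warns about can be sketched with plain Ruby hashes. This is an illustrative standalone sketch, not memdump's API: the `address:generation` key format mirrors what the JSON loader does when allocation tracing adds a `generation` field, and the records are made up.

```ruby
require 'set'

# Toy "dumps": lists of records. With allocation tracing enabled, each record
# carries a generation, so a reused address shows up as a new key.
def key_for(record)
  generation = record['generation']
  generation ? "#{record['address']}:#{generation}" : record['address']
end

def new_records(before, after)
  before_keys = before.map { |r| key_for(r) }.to_set
  after.reject { |r| before_keys.include?(key_for(r)) }
end

before = [
  { 'address' => '0x1', 'generation' => 10 },
  { 'address' => '0x2', 'generation' => 11 },
]
after = [
  { 'address' => '0x1', 'generation' => 10 }, # same object, survived
  { 'address' => '0x2', 'generation' => 52 }, # address reused after a GC
  { 'address' => '0x3', 'generation' => 53 },
]

p new_records(before, after).map { |r| key_for(r) }
# => ["0x2:52", "0x3:53"]
```

Without the generation in the key, the reused `0x2` would wrongly be treated as an old object and dropped from the diff.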
data/bin/memdump
CHANGED
data/lib/memdump.rb
CHANGED
@@ -1,5 +1,22 @@
+require 'rgl/adjacency'
+require 'rgl/dijkstra'
+require 'rgl/traversal'
+
 require "memdump/version"
+require 'memdump/json_dump'
+require 'memdump/memory_dump'
+
+require 'memdump/cleanup_references'
+require 'memdump/common_ancestor'
+require 'memdump/convert_to_gml'
+require 'memdump/out_degree'
+require 'memdump/remove_node'
+require 'memdump/replace_class_address_by_name'
+require 'memdump/root_of'
+require 'memdump/subgraph_of'
 
-module
-
+module MemDump
+  def self.pry(dump)
+    binding.pry
+  end
 end
data/lib/memdump/cli.rb
CHANGED
@@ -1,6 +1,6 @@
 require 'thor'
 require 'pathname'
-require 'memdump
+require 'memdump'
 
 module MemDump
   class CLI < Thor
@@ -17,17 +17,14 @@ module MemDump
 
     desc 'diff SOURCE TARGET OUTPUT', 'generate a memory dump that contains the objects in TARGET not in SOURCE, and all their parents'
     def diff(source, target, output)
-
-
-
-
-
-
-
-
-        io.puts JSON.dump(r)
-      end
-    end
+      from = MemDump::JSONDump.load(source)
+      to = MemDump::JSONDump.load(target)
+      diff = from.diff(to)
+      STDOUT.sync
+      puts "#{diff.size} nodes are in target but not in source"
+      diff = to.roots_of(diff)
+      puts "#{diff.size} nodes in final dump"
+      diff.save(output)
     end
 
     desc 'gml DUMP GML', 'converts a memory dump into a graph in the GML format (for processing by e.g. gephi)'
@@ -82,13 +79,9 @@ module MemDump
       if output_path then Pathname.new(output_path)
       else dump_path
       end
-      dump = MemDump::JSONDump.
-
-
-      result.each do |r|
-        io.puts JSON.dump(r)
-      end
-    end
+      dump = MemDump::JSONDump.load(dump_path)
+      dump = dump.replace_class_id_by_class_name(add_reference_to_class: options[:add_ref])
+      dump.save(output_path)
     end
 
     desc 'cleanup-refs DUMP OUTPUT', "removes references to deleted objects"
@@ -121,13 +114,39 @@ module MemDump
     def stats(dump)
       require 'pp'
       require 'memdump/stats'
-      dump = MemDump::JSONDump.
-      unknown, by_type =
+      dump = MemDump::JSONDump.load(dump)
+      unknown, by_type = dump.stats
       puts "#{unknown} objects without a known type"
       by_type.sort_by { |n, v| v }.reverse.each do |n, v|
         puts "#{n}: #{v}"
       end
     end
+
+    desc 'out_degree DUMP', 'display the direct count of objects held by each object in the dump'
+    option "min", desc: "hide the objects whose degree is lower than this",
+      type: :numeric
+    def out_degree(dump)
+      dump = MemDump::JSONDump.new(Pathname.new(dump))
+      min = options[:min] || 0
+      sorted = dump.each_record.sort_by { |r| (r['references'] || Array.new).size }
+      sorted.each do |r|
+        size = (r['references'] || Array.new).size
+        break if size > min
+        puts "#{size} #{r}"
+      end
+    end
+
+    desc 'interactive DUMP', 'loads a dump file and spawn a pry shell'
+    option :load, desc: 'load the whole dump in memory', type: :boolean, default: true
+    def interactive(dump)
+      require 'memdump'
+      require 'pry'
+      dump = MemDump::JSONDump.new(Pathname.new(dump))
+      if options[:load]
+        dump = dump.load
+      end
+      dump.pry
+    end
   end
 end
 
data/lib/memdump/common_ancestor.rb
ADDED
@@ -0,0 +1,44 @@
+module MemDump
+  def self.common_ancestors(dump, class_name, threshold: 0.1)
+    selected_records = Hash.new
+    remaining_records = Array.new
+    dump.each_record do |r|
+      if class_name === r['class']
+        selected_records[r['address']] = r
+      else
+        remaining_records << r
+      end
+    end
+
+    remaining_records = Array.new
+    selected_records = Hash.new
+    selected_root = root_address
+    dump.each_record do |r|
+      address = (r['address'] || r['root'])
+      if selected_root == address
+        selected_records[address] = r
+        selected_root = nil;
+      else
+        remaining_records << r
+      end
+    end
+
+    count = 0
+    while count != selected_records.size
+      count = selected_records.size
+      remaining_records.delete_if do |r|
+        references = r['references']
+        if references && references.any? { |a| selected_records.has_key?(a) }
+          address = (r['address'] || r['root'])
+          selected_records[address] = r
+        end
+      end
+    end
+
+    selected_records.values.reverse.each do |r|
+      if refs = r['references']
+        refs.delete_if { |a| !selected_records.has_key?(a) }
+      end
+    end
+  end
+end
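The heart of `common_ancestors` is a fixed-point loop: seed a selected set with the records of one class, then repeatedly absorb every record that references an already-selected record until nothing changes. A standalone sketch of that loop on a toy graph (the record layout and names here are illustrative, not memdump's API):

```ruby
# Records: address => { class, list of referenced addresses }
records = {
  'a' => { 'class' => 'Leak',  'references' => [] },
  'b' => { 'class' => 'Other', 'references' => ['a'] },
  'c' => { 'class' => 'Other', 'references' => ['b'] },
  'd' => { 'class' => 'Other', 'references' => [] },
}

# Seed with the records of the class of interest
selected = records.select { |_, r| r['class'] == 'Leak' }
remaining = records.reject { |addr, _| selected.key?(addr) }

# Fixed point: keep absorbing records that reference a selected record
loop do
  count = selected.size
  remaining.delete_if do |addr, r|
    # a truthy block return deletes the record from `remaining`
    if r['references'].any? { |ref| selected.key?(ref) }
      selected[addr] = r
    end
  end
  break if selected.size == count
end

p selected.keys.sort # => ["a", "b", "c"]
```

`d` stays out because nothing connects it to the seed; `b` and `c` are pulled in transitively.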
data/lib/memdump/convert_to_gml.rb
CHANGED
@@ -1,47 +1,37 @@
-require 'set'
-
 module MemDump
   def self.convert_to_gml(dump, io)
-    nodes = dump.each_record.map do |row|
-      if row['class_address'] # transformed with replace_class_address_by_name
-        name = row['class']
-      else
-        name = row['struct'] || row['root'] || row['type']
-      end
-
-      address = row['address'] || row['root']
-      refs = Hash.new
-      if row_refs = row['references']
-        row_refs.each { |r| refs[r] = nil }
-      end
-
-      [address, refs, name]
-    end
-
     io.puts "graph"
     io.puts "["
-
-
-
+
+    edges = []
+    dump.each_record do |row|
+      address = row['address']
+
       io.puts " node"
       io.puts " ["
       io.puts " id #{address}"
-
+      row.each do |key, value|
+        if value.respond_to?(:to_str)
+          io.puts " #{key} \"#{value}\""
+        elsif value.kind_of?(Numeric)
+          io.puts " #{key} #{value}"
+        end
+      end
       io.puts " ]"
-    end
 
-
-
-    io.puts " edge"
-    io.puts " ["
-    io.puts " source #{address}"
-    io.puts " target #{ref_address}"
-    if ref_label
-      io.puts " label \"#{ref_label}\""
-    end
-    io.puts " ]"
+      row['references'].each do |ref_address|
+        edges << address << ref_address
      end
    end
+
+    edges.each_slice(2) do |address, ref_address|
+      io.puts " edge"
+      io.puts " ["
+      io.puts " source #{address}"
+      io.puts " target #{ref_address}"
+      io.puts " ]"
+    end
+
    io.puts "]"
  end
end
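The rewritten converter emits one `node` block per record (keeping only string and numeric attributes) and one `edge` block per reference. A self-contained sketch of that output shape on a two-record toy dump; it mirrors the structure of the converter but is not memdump's own code:

```ruby
require 'stringio'

records = [
  { 'address' => 1, 'class' => 'Foo', 'references' => [2] },
  { 'address' => 2, 'class' => 'Bar', 'references' => [] },
]

io = StringIO.new
io.puts 'graph'
io.puts '['
edges = []
records.each do |row|
  io.puts '  node'
  io.puts '  ['
  io.puts "    id #{row['address']}"
  # string attributes are quoted, numeric attributes emitted bare,
  # everything else (like the references array) is skipped
  row.each do |key, value|
    if value.respond_to?(:to_str)
      io.puts "    #{key} \"#{value}\""
    elsif value.kind_of?(Numeric)
      io.puts "    #{key} #{value}"
    end
  end
  io.puts '  ]'
  row['references'].each { |ref| edges << [row['address'], ref] }
end
edges.each do |source, target|
  io.puts '  edge'
  io.puts '  ['
  io.puts "    source #{source}"
  io.puts "    target #{target}"
  io.puts '  ]'
end
io.puts ']'

puts io.string
```

Collecting the edges and emitting them after all the nodes means the GML never references a node id before declaring it, which some importers require.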
data/lib/memdump/json_dump.rb
CHANGED
@@ -1,22 +1,65 @@
+require 'pathname'
 require 'json'
 module MemDump
   class JSONDump
+    def self.load(filename)
+      new(filename).load
+    end
+
     def initialize(filename)
-      @filename = filename
+      @filename = Pathname(filename)
     end
 
     def each_record
       return enum_for(__method__) if !block_given?
 
-
-
-
-
-
-
+      @filename.open do |f|
+        f.each_line do |line|
+          r = JSON.parse(line)
+          r['address'] ||= r['root']
+          r['references'] ||= Set.new
+          yield r
+        end
+      end
+    end
+
+    def load
+      address_to_record = Hash.new
+      generations = Hash.new
+      each_record do |r|
+        if !(address = r['address'])
+          raise "no address in #{r}"
+        end
+        r = r.dup
+
+        if generation = r['generation']
+          generations[address] = r['address'] = "#{address}:#{generation}"
+        end
+        r['references'] = r['references'].to_set
+        address_to_record[r['address']] = r
+      end
+
+      if !generations.empty?
+        address_to_record.each_value do |r|
+          if class_address = r['class']
+            r['class'] = generations.fetch(class_address, class_address)
+          end
+          if class_address = r['class_address']
+            r['class_address'] = generations.fetch(class_address, class_address)
          end
+
+          refs = Set.new
+          r['references'].each do |ref_address|
+            refs << generations.fetch(ref_address, ref_address)
+          end
+          r['references'] = refs
        end
      end
+      MemoryDump.new(address_to_record)
+    end
+
+    def inspect
+      to_s
    end
  end
end
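Each line of a Ruby heap dump (as written by `ObjectSpace.dump_all`) is one standalone JSON record, and root records carry a `root` key instead of `address`. A minimal sketch of the normalization that `each_record` performs, run on in-memory strings instead of a file (the sample records are made up):

```ruby
require 'json'
require 'set'

lines = [
  '{"root":"vm", "references":["0x1"]}',
  '{"address":"0x1", "type":"OBJECT", "class":"0x2"}',
]

address_to_record = {}
lines.each do |line|
  r = JSON.parse(line)
  r['address'] ||= r['root']                    # roots get their 'root' key as address
  r['references'] = (r['references'] || []).to_set # missing references become an empty set
  address_to_record[r['address']] = r
end

p address_to_record.keys # => ["vm", "0x1"]
```

Normalizing both shapes up front is what lets the rest of the code treat roots and ordinary objects uniformly as graph nodes.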
data/lib/memdump/memory_dump.rb
ADDED
@@ -0,0 +1,662 @@
+module MemDump
+  class MemoryDump
+    attr_reader :address_to_record
+
+    def initialize(address_to_record)
+      @address_to_record = address_to_record
+      @forward_graph = nil
+      @backward_graph = nil
+    end
+
+    def include?(address)
+      address_to_record.has_key?(address)
+    end
+
+    def each_record(&block)
+      address_to_record.each_value(&block)
+    end
+
+    def addresses
+      address_to_record.keys
+    end
+
+    def size
+      address_to_record.size
+    end
+
+    def find_by_address(address)
+      address_to_record[address]
+    end
+
+    def inspect
+      to_s
+    end
+
+    def save(io_or_path)
+      if io_or_path.respond_to?(:open)
+        io_or_path.open 'w' do |io|
+          save(io)
+        end
+      else
+        each_record do |r|
+          io_or_path.puts JSON.dump(r)
+        end
+      end
+    end
+
+    # Filter the records
+    #
+    # @yieldparam record a record
+    # @yieldreturn [Object] the record object that should be included in the
+    #   returned dump
+    # @return [MemoryDump]
+    def find_all
+      return enum_for(__method__) if !block_given?
+
+      address_to_record = Hash.new
+      each_record do |r|
+        if yield(r)
+          address_to_record[r['address']] = r
+        end
+      end
+      MemoryDump.new(address_to_record)
+    end
+
+    # Map the records
+    #
+    # @yieldparam record a record
+    # @yieldreturn [Object] the record object that should be included in the
+    #   returned dump
+    # @return [MemoryDump]
+    def map
+      return enum_for(__method__) if !block_given?
+
+      address_to_record = Hash.new
+      each_record do |r|
+        address_to_record[r['address']] = yield(r.dup).to_hash
+      end
+      MemoryDump.new(address_to_record)
+    end
+
+    # Filter the entries, removing those for which the block returns falsy
+    #
+    # @yieldparam record a record
+    # @yieldreturn [nil,Object] either a record object, or falsy to remove
+    #   this record in the returned dump
+    # @return [MemoryDump]
+    def find_and_map
+      return enum_for(__method__) if !block_given?
+
+      address_to_record = Hash.new
+      each_record do |r|
+        if result = yield(r.dup)
+          address_to_record[r['address']] = result.to_hash
+        end
+      end
+      MemoryDump.new(address_to_record)
+    end
+
+    # Return the records of a given type
+    #
+    # @param [String] name the type
+    # @return [MemoryDump] the matching records
+    #
+    # @example return all ICLASS (singleton) records
+    #   objects_of_class("ICLASS")
+    def objects_of_type(name)
+      find_all { |r| name === r['type'] }
+    end
+
+    # Return the records of a given class
+    #
+    # @param [String] name the class
+    # @return [MemoryDump] the matching entries
+    #
+    # @example return all string records
+    #   objects_of_class("String")
+    def objects_of_class(name)
+      find_all { |r| name === r['class'] }
+    end
+
+    # Return the entries that refer to the entries in the dump
+    #
+    # @param [MemoryDump] the set of entries whose parents we're looking for
+    # @param [Integer] min only return the entries in self that refer to
+    #   more than this much entries in 'dump'
+    # @param [Boolean] exclude_dump exclude the entries that are already in
+    #   'dump'
+    # @return [(MemoryDump,Hash)] the parent entries, and a mapping from
+    #   records in the parent entries to the count of entries in 'dump' they
+    #   refer to
+    def parents_of(dump, min: 0, exclude_dump: false)
+      children = dump.addresses.to_set
+      counts = Hash.new
+      filtered = find_all do |r|
+        next if exclude_dump && children.include?(r['address'])
+
+        count = r['references'].count { |r| children.include?(r) }
+        if count > min
+          counts[r] = count
+          true
+        end
+      end
+      return filtered, counts
+    end
+
+    # Remove entries from this dump, keeping the transitivity in the
+    # remaining graph
+    #
+    # @param [MemoryDump] entries entries to remove
+    #
+    # @example remove all entries that are of type HASH
+    #   collapse(objects_of_type('HASH'))
+    def collapse(entries)
+      collapsed_entries = Hash.new
+      entries.each_record do |r|
+        collapsed_entries[r['address']] = r['references'].dup
+      end
+
+      # Remove references in-between the entries to collapse
+      already_expanded = Hash.new { |h, k| h[k] = Set[k] }
+      begin
+        changed_entries = Hash.new
+        collapsed_entries.each do |address, references|
+          sets = references.classify { |ref_address| collapsed_entries.has_key?(ref_address) }
+          updated_references = sets[false] || Set.new
+          if to_collapse = sets[true]
+            to_collapse.each do |ref_address|
+              next if already_expanded[address].include?(ref_address)
+              updated_references.merge(collapsed_entries[ref_address])
+            end
+            already_expanded[address].merge(to_collapse)
+            changed_entries[address] = updated_references
+          end
+        end
+        puts "#{changed_entries.size} changed entries"
+        collapsed_entries.merge!(changed_entries)
+      end while !changed_entries.empty?
+
+      find_and_map do |record|
+        next if collapsed_entries.has_key?(record['address'])
+
+        sets = record['references'].classify do |ref_address|
+          collapsed_entries.has_key?(ref_address)
+        end
+        updated_references = sets[false] || Set.new
+        if to_collapse = sets[true]
+          to_collapse.each do |ref_address|
+            updated_references.merge(collapsed_entries[ref_address])
+          end
+          record = record.dup
+          record['references'] = updated_references
+        end
+        record
+      end
+    end
+
+    # Remove entries from the dump, and all references to them
+    #
+    # @param [MemoryDump] the set of entries to remove, as e.g. returned by
+    #   {#objects_of_class}
+    # @return [MemoryDump] the filtered dump
+    def without(entries)
+      find_and_map do |record|
+        next if entries.include?(record['address'])
+        record_refs = record['references']
+        references = record_refs.find_all { |r| !entries.include?(r) }
+        if references.size != record_refs.size
+          record = record.dup
+          record['references'] = references.to_set
+        end
+        record
+      end
+    end
+
+    # Write the dump to a GML file that can loaded by Gephi
+    #
+    # @param [Pathname,String,IO] the path or the IO stream into which we should
+    #   dump
+    def to_gml(io_or_path)
+      if io_or_path.kind_of?(IO)
+        MemDump.convert_to_gml(self, io_or_path)
+      else
+        Pathname(io_or_path).open 'w' do |io|
+          to_gml(io)
+        end
+      end
+      nil
+    end
+
+    # Save the dump
+    def save(io_or_path)
+      if io_or_path.kind_of?(IO)
+        each_record do |r|
+          r = r.dup
+          r['address'] = r['address'].gsub(/:\d+$/, '')
+          if r['class_address']
+            r['class_address'] = r['class_address'].gsub(/:\d+$/, '')
+          elsif r['address']
+            r['address'] = r['address'].gsub(/:\d+$/, '')
+          end
+          r['references'] = r['references'].map { |ref_addr| ref_addr.gsub(/:\d+$/, '') }
+          io_or_path.puts JSON.dump(r)
+        end
+        nil
+      else
+        Pathname(io_or_path).open 'w' do |io|
+          save(io)
+        end
+      end
+    end
+
+    COMMON_COLLAPSE_TYPES = %w{IMEMO HASH ARRAY}
+    COMMON_COLLAPSE_CLASSES = %w{Set RubyVM::Env}
+
+    # Perform common initial cleanup
+    #
+    # It basically removes common classes that usually make a dump analysis
+    # more complicated without providing more information
+    #
+    # Namely, it collapses internal Ruby node types ROOT and IMEMO, as well
+    # as common collection classes {COMMON_COLLAPSE_CLASSES}.
+    #
+    # One usually analyses a cleaned-up dump before getting into the full
+    # dump
+    #
+    # @return [MemDump] the filtered dump
+    def common_cleanup
+      without_weakrefs = remove(objects_of_class 'WeakRef')
+      to_collapse = without_weakrefs.find_all do |r|
+        COMMON_COLLAPSE_CLASSES.include?(r['class']) ||
+          COMMON_COLLAPSE_TYPES.include?(r['type']) ||
+          r['method'] == 'dump_all'
+      end
+      without_weakrefs.collapse(to_collapse)
+    end
+
+    # Remove entries in the reference for which we can't find an object with
+    # the matching address
+    #
+    # @return [(MemoryDump,Set)] the filtered dump and the set of missing addresses found
+    def remove_invalid_references
+      addresses = self.addresses.to_set
+      missing = Set.new
+      result = map do |r|
+        common = (addresses & r['references'])
+        if common.size != r['references'].size
+          missing.merge(r['references'] - common)
+        end
+        r = r.dup
+        r['references'] = common
+        r
+      end
+      return result, missing
+    end
+
+    # Return the graph of object that keeps objects in dump alive
+    #
+    # It contains only the shortest paths from the roots to the objects in
+    # dump
+    #
+    # @param [MemoryDump] dump
+    # @return [MemoryDump]
+    def roots_of(dump, root_dump: nil)
+      if root_dump && root_dump.empty?
+        raise ArgumentError, "no roots provided"
+      end
+
+      root_addresses =
+        if root_dump then root_dump.addresses
+        else
+          ['ALL_ROOTS']
+        end
+
+      ensure_graphs_computed
+
+      result_nodes = Set.new
+      dump_addresses = dump.addresses
+      root_addresses.each do |root_address|
+        visitor = RGL::DijkstraVisitor.new(@forward_graph)
+        dijkstra = RGL::DijkstraAlgorithm.new(@forward_graph, Hash.new(1), visitor)
+        dijkstra.find_shortest_paths(root_address)
+        path_builder = RGL::PathBuilder.new(root_address, visitor.parents_map)
+
+        dump_addresses.each_with_index do |record_address, record_i|
+          if path = path_builder.path(record_address)
+            result_nodes.merge(path)
+          end
+        end
+      end
+
+      find_and_map do |record|
+        address = record['address']
+        next if !result_nodes.include?(address)
+
+        # Prefer records in 'dump' to allow for annotations in the
+        # source
+        record = dump.find_by_address(address) || record
+        record = record.dup
+        record['references'] = result_nodes & record['references']
+        record
+      end
+    end
+
+    def minimum_spanning_tree(root_dump)
+      if root_dump.size != 1
+        raise ArgumentError, "there should be exactly one root"
+      end
+      root_address, _ = root_dump.address_to_record.first
+      if !(root = address_to_record[root_address])
+        raise ArgumentError, "no record with address #{root_address} in self"
+      end
+
+      ensure_graphs_computed
+
+      mst = @forward_graph.minimum_spanning_tree(root)
+      map = Hash.new
+      mst.each_vertex do |record|
+        record = record.dup
+        record['references'] = record['references'].dup
+        record['references'].delete_if { |ref_address| !mst.has_vertex?(ref_address) }
+      end
+      MemoryDump.new(map)
+    end
+
+    # @api private
+    #
+    # Ensure that @forward_graph and @backward_graph are computed
+    def ensure_graphs_computed
+      if !@forward_graph
+        @forward_graph, @backward_graph = compute_graphs
+      end
+    end
+
+    # @api private
+    #
+    # Force recomputation of the graph representation of the dump the next
+    # time it is needed
+    def clear_graph
+      @forward_graph = nil
+      @backward_graph = nil
+    end
+
+    # @api private
+    #
+    # Create two RGL::DirectedAdjacencyGraph, for the forward and backward edges of the graph
+    def compute_graphs
+      forward_graph = RGL::DirectedAdjacencyGraph.new
+      forward_graph.add_vertex 'ALL_ROOTS'
+      address_to_record.each do |address, record|
+        forward_graph.add_vertex(address)
+
+        if record['type'] == 'ROOT'
+          forward_graph.add_edge('ALL_ROOTS', address)
+        end
+        record['references'].each do |ref_address|
+          forward_graph.add_edge(address, ref_address)
+        end
+      end
+
+      backward_graph = RGL::DirectedAdjacencyGraph.new
+      forward_graph.each_edge do |u, v|
+        backward_graph.add_edge(v, u)
+      end
+      return forward_graph, backward_graph
+    end
+
+    def depth_first_visit(root, &block)
+      ensure_graphs_computed
+      @forward_graph.depth_first_visit(root, &block)
+    end
+
+    # Validate that all reference entries have a matching dump entry
+    #
+    # @raise [RuntimeError] if references have been found
+    def validate_references
+      addresses = self.addresses.to_set
+      each_record do |r|
+        common = addresses & r['references']
+        if common.size != r['references'].size
+          missing = r['references'] - common
+          raise "#{r} references #{missing.to_a.sort.join(", ")} which do not exist"
+        end
+      end
+      nil
+    end
+
+    # Get a random sample of the records
+    #
+    # The sampling is random, so the returned set might be bigger or smaller
+    # than expected. Do not use on small sets.
+    #
+    # @param [Float] the ratio of selected samples vs. total samples (0.1
+    #   will select approximately 10% of the samples)
+    def sample(ratio)
+      result = Hash.new
+      each_record do |record|
+        if rand <= ratio
+          result[record['address']] = record
+        end
+      end
+      MemoryDump.new(result)
+    end
+
+    # @api private
+    #
+    # Return the set of record addresses that are the addresses of roots in
+    # the live graph
+    #
+    # @return [Set<String>]
+    def root_addresses
+      roots = self.addresses.to_set.dup
+      each_record do |r|
+        roots.subtract(r['references'])
+      end
+      roots
+    end
+
+    # Returns the set of roots
+    def roots(with_keepalive_count: false)
+      result = Hash.new
+      self.root_addresses.each do |addr|
+        record = find_by_address(addr)
+        if with_keepalive_count
+          record = record.dup
+          count = 0
+          depth_first_visit(addr) { count += 1 }
+          record['keepalive_count'] = count
+        end
+        result[addr] = record
+      end
+      MemoryDump.new(result)
+    end
+
+    def add_children(roots, with_keepalive_count: false)
+      result = Hash.new
+      roots.each_record do |root_record|
+        result[root_record['address']] = root_record
+
+        root_record['references'].each do |addr|
+          ref_record = find_by_address(addr)
+          next if !ref_record
+
+          if with_keepalive_count
+            ref_record = ref_record.dup
+            count = 0
+            depth_first_visit(addr) { count += 1 }
+            ref_record['keepalive_count'] = count
+          end
+          result[addr] = ref_record
+        end
+      end
+      MemoryDump.new(result)
+    end
+
+    def dup
+      find_all { true }
+    end
+
+    # Simply remove the given objects
+    def remove(objects)
+      removed_addresses = objects.addresses.to_set
return dup if removed_addresses.empty?
|
504
|
+
|
505
|
+
find_and_map do |r|
|
506
|
+
if !removed_addresses.include?(r['address'])
|
507
|
+
references = r['references'].dup
|
508
|
+
references.delete_if { |a| removed_addresses.include?(a) }
|
509
|
+
r['references'] = references
|
510
|
+
r
|
511
|
+
end
|
512
|
+
end
|
513
|
+
end
|
514
|
+
|
515
|
+
# Remove all components that are smaller than the given number of nodes
|
516
|
+
#
|
517
|
+
# It really looks only at the number of nodes reachable from a root
|
518
|
+
# (i.e. won't notice if two smaller-than-threshold roots have nodes in
|
519
|
+
# common)
|
520
|
+
def remove_small_components(max_size: 1)
|
521
|
+
roots = self.addresses.to_set.dup
|
522
|
+
leaves = Set.new
|
523
|
+
each_record do |r|
|
524
|
+
refs = r['references']
|
525
|
+
if refs.empty?
|
526
|
+
leaves << r['address']
|
527
|
+
else
|
528
|
+
roots.subtract(r['references'])
|
529
|
+
end
|
530
|
+
end
|
531
|
+
|
532
|
+
to_remove = Set.new
|
533
|
+
roots.each do |root_address|
|
534
|
+
component = Set[]
|
535
|
+
queue = Set[root_address]
|
536
|
+
while !queue.empty? && (component.size <= max_size)
|
537
|
+
address = queue.first
|
538
|
+
queue.delete(address)
|
539
|
+
next if component.include?(address)
|
540
|
+
component << address
|
541
|
+
queue.merge(address_to_record[address]['references'])
|
542
|
+
end
|
543
|
+
|
544
|
+
if component.size <= max_size
|
545
|
+
to_remove.merge(component)
|
546
|
+
end
|
547
|
+
end
|
548
|
+
|
549
|
+
without(find_all { |r| to_remove.include?(r['address']) })
|
550
|
+
end
|
551
|
+
|
552
|
+
def stats
|
553
|
+
unknown_class = 0
|
554
|
+
by_class = Hash.new(0)
|
555
|
+
each_record do |r|
|
556
|
+
if klass = (r['class'] || r['type'] || r['root'])
|
557
|
+
by_class[klass] += 1
|
558
|
+
else
|
559
|
+
unknown_class += 1
|
560
|
+
end
|
561
|
+
end
|
562
|
+
return unknown_class, by_class
|
563
|
+
end
|
564
|
+
|
565
|
+
# Compute the set of records that are not in self but are in to
|
566
|
+
#
|
567
|
+
# @param [MemoryDump]
|
568
|
+
# @return [MemoryDump]
|
569
|
+
def diff(to)
|
570
|
+
diff = Hash.new
|
571
|
+
to.each_record do |r|
|
572
|
+
address = r['address']
|
573
|
+
if !@address_to_record.include?(address)
|
574
|
+
diff[address] = r
|
575
|
+
end
|
576
|
+
end
|
577
|
+
MemoryDump.new(diff)
|
578
|
+
end
|
579
|
+
|
580
|
+
# Compute the interface between self and the other dump, that is the
|
581
|
+
# elements of self that have a child in dump, and the elements of dump
|
582
|
+
# that have a parent in self
|
583
|
+
def interface_with(dump)
|
584
|
+
self_border = Hash.new
|
585
|
+
dump_border = Hash.new
|
586
|
+
each_record do |r|
|
587
|
+
next if dump.find_by_address(r['address'])
|
588
|
+
|
589
|
+
refs_in_dump = r['references'].map do |addr|
|
590
|
+
dump.find_by_address(addr)
|
591
|
+
end.compact
|
592
|
+
|
593
|
+
if !refs_in_dump.empty?
|
594
|
+
self_border[r['address']] = r
|
595
|
+
refs_in_dump.each do |child|
|
596
|
+
dump_border[child['address']] = child.dup
|
597
|
+
end
|
598
|
+
end
|
599
|
+
end
|
600
|
+
|
601
|
+
self_border = MemoryDump.new(self_border)
|
602
|
+
dump_border = MemoryDump.new(dump_border)
|
603
|
+
|
604
|
+
dump.update_keepalive_count(dump_border)
|
605
|
+
return self_border, dump_border
|
606
|
+
end
|
607
|
+
|
608
|
+
# Replace all objects in dump by a single "group" object
|
609
|
+
def group(name, dump, attributes = Hash.new)
|
610
|
+
group_addresses = Set.new
|
611
|
+
group_references = Set.new
|
612
|
+
dump.each_record do |r|
|
613
|
+
group_addresses << r['address']
|
614
|
+
group_references.merge(r['references'])
|
615
|
+
end
|
616
|
+
group_record = attributes.dup
|
617
|
+
group_record['address'] = name
|
618
|
+
group_record['references'] = group_references - group_addresses
|
619
|
+
|
620
|
+
updated = Hash[name => group_record]
|
621
|
+
each_record do |record|
|
622
|
+
next if group_addresses.include?(record['address'])
|
623
|
+
|
624
|
+
updated_record = record.dup
|
625
|
+
updated_record['references'] -= group_addresses
|
626
|
+
if updated_record['references'].size != record['references'].size
|
627
|
+
updated_record['references'] << name
|
628
|
+
end
|
629
|
+
|
630
|
+
if group_addresses.include?(updated_record['class_address'])
|
631
|
+
updated_record['class_address'] = name
|
632
|
+
end
|
633
|
+
if group_addresses.include?(updated_record['class'])
|
634
|
+
updated_record['class'] = name
|
635
|
+
end
|
636
|
+
|
637
|
+
updated[updated_record['address']] = updated_record
|
638
|
+
end
|
639
|
+
|
640
|
+
MemoryDump.new(updated)
|
641
|
+
end
|
642
|
+
|
643
|
+
def update_keepalive_count(dump)
|
644
|
+
ensure_graphs_computed
|
645
|
+
dump.each_record do |record|
|
646
|
+
count = 0
|
647
|
+
dump.depth_first_visit(record['address']) { |obj| count += 1 }
|
648
|
+
record['keepalive_count'] = count
|
649
|
+
record
|
650
|
+
end
|
651
|
+
end
|
652
|
+
|
653
|
+
def replace_class_id_by_class_name(add_reference_to_class: false)
|
654
|
+
MemDump.replace_class_address_by_name(self, add_reference_to_class: add_reference_to_class)
|
655
|
+
end
|
656
|
+
|
657
|
+
def to_s
|
658
|
+
"#<MemoryDump size=#{size}>"
|
659
|
+
end
|
660
|
+
end
|
661
|
+
end
|
662
|
+
|
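The new `MemoryDump#root_addresses` above finds roots by starting from the full address set and subtracting every referenced address. A minimal, self-contained sketch of that set subtraction on plain record hashes (illustrative addresses; no memdump gem required — the record layout just mirrors the dump's `address`/`references` fields):

```ruby
require 'set'

# Each record mirrors the dump's layout: an address plus the addresses
# it references.
records = {
  '0x1' => { 'address' => '0x1', 'references' => ['0x2', '0x3'] },
  '0x2' => { 'address' => '0x2', 'references' => ['0x3'] },
  '0x3' => { 'address' => '0x3', 'references' => [] },
  '0x4' => { 'address' => '0x4', 'references' => [] }
}

# Roots are addresses no other record references: start from the full
# address set and subtract every reference.
roots = records.keys.to_set
records.each_value { |r| roots.subtract(r['references']) }

roots.to_a.sort # => ["0x1", "0x4"]
```

Note this is purely structural: it finds graph-theoretic roots of the reference graph, which is exactly why `remove_small_components` can reuse the same subtraction to seed its reachability walk.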
@@ -2,23 +2,40 @@ module MemDump
     # Replace the address in the 'class' attribute by the class name
     def self.replace_class_address_by_name(dump, add_reference_to_class: false)
         class_names = Hash.new
+        iclasses = Hash.new
         dump.each_record do |row|
             if row['type'] == 'CLASS' || row['type'] == 'MODULE'
                 class_names[row['address']] = row['name']
+            elsif row['type'] == 'ICLASS' || row['type'] == "IMEMO"
+                iclasses[row['address']] = row
             end
         end
 
-
+        iclass_size = 0
+        while !iclasses.empty? && (iclass_size != iclasses.size)
+            iclass_size = iclasses.size
+            iclasses.delete_if do |_, r|
+                if (klass = r['class']) && (class_name = class_names[klass])
+                    class_names[r['address']] = "I(#{class_name})"
+                    r['class'] = class_name
+                    r['class_address'] = klass
+                    if add_reference_to_class
+                        (r['references'] ||= Set.new) << klass
+                    end
+                    true
+                end
+            end
+        end
+
+        dump.map do |r|
             if klass = r['class']
+                r = r.dup
                 r['class'] = class_names[klass] || klass
                 r['class_address'] = klass
                 if add_reference_to_class
-                    (r['references'] ||=
+                    (r['references'] ||= Set.new) << klass
                 end
             end
-            if r['type'] == 'ICLASS'
-                r['class'] = "I(#{r['class']})"
-            end
             r
         end
     end
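The new loop above replaces the old one-shot `ICLASS` renaming with a fixed-point pass: an ICLASS whose `class` points at another ICLASS cannot be named until that one is, so the code keeps sweeping until a pass removes nothing. A simplified, self-contained sketch of the fixed-point idea (plain hashes, illustrative addresses; the real method also rewrites the record's `class`/`class_address` fields):

```ruby
# Named class records map address -> name; ICLASS records point at
# another address via 'class', possibly at another ICLASS (a chain).
class_names = { '0xA' => 'Array' }
iclasses = {
  '0xB' => { 'address' => '0xB', 'class' => '0xA' },
  '0xC' => { 'address' => '0xC', 'class' => '0xB' }  # only resolvable once 0xB is named
}

# Iterate to a fixed point: stop when a sweep deletes nothing, which
# also terminates if some chain can never be resolved.
size = nil
until iclasses.empty? || size == iclasses.size
  size = iclasses.size
  iclasses.delete_if do |_, r|
    if (name = class_names[r['class']])
      class_names[r['address']] = "I(#{name})"
      true
    end
  end
end

class_names
# => {"0xA"=>"Array", "0xB"=>"I(Array)", "0xC"=>"I(I(Array))"}
```

The `size == iclasses.size` guard is what makes unresolvable chains safe: a sweep that makes no progress ends the loop instead of spinning forever.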
data/lib/memdump/version.rb
CHANGED
data/memdump.gemspec
CHANGED
@@ -20,7 +20,8 @@ Gem::Specification.new do |spec|
     spec.require_paths = ["lib"]
 
     spec.add_dependency 'thor'
-    spec.add_dependency '
+    spec.add_dependency 'rgl'
+    spec.add_dependency 'pry'
     spec.add_development_dependency "bundler", "~> 1.11"
     spec.add_development_dependency "rake", "~> 10.0"
     spec.add_development_dependency "minitest", "~> 5.0"
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: memdump
 version: !ruby/object:Gem::Version
-  version: 0.
+  version: 0.2.0
 platform: ruby
 authors:
 - Sylvain Joyeux
 autorequire:
 bindir: bin
 cert_chain: []
-date:
+date: 2018-02-03 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: thor
@@ -25,7 +25,21 @@ dependencies:
   - !ruby/object:Gem::Version
     version: '0'
 - !ruby/object:Gem::Dependency
-  name:
+  name: rgl
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: '0'
+- !ruby/object:Gem::Dependency
+  name: pry
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
@@ -98,13 +112,14 @@ files:
 - lib/memdump.rb
 - lib/memdump/cleanup_references.rb
 - lib/memdump/cli.rb
+- lib/memdump/common_ancestor.rb
 - lib/memdump/convert_to_gml.rb
-- lib/memdump/diff.rb
 - lib/memdump/json_dump.rb
+- lib/memdump/memory_dump.rb
+- lib/memdump/out_degree.rb
 - lib/memdump/remove_node.rb
 - lib/memdump/replace_class_address_by_name.rb
 - lib/memdump/root_of.rb
-- lib/memdump/stats.rb
 - lib/memdump/subgraph_of.rb
 - lib/memdump/version.rb
 - memdump.gemspec
@@ -128,9 +143,8 @@ required_rubygems_version: !ruby/object:Gem::Requirement
     version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.
+rubygems_version: 2.5.1
 signing_key:
 specification_version: 4
 summary: Tools to manipulate Ruby 2.1+ memory dumps
 test_files: []
-has_rdoc:
data/lib/memdump/diff.rb
DELETED
@@ -1,44 +0,0 @@
-require 'set'
-
-module MemDump
-    def self.diff(from, to)
-        from_objects = Set.new
-        from.each_record { |r| from_objects << (r['address'] || r['root']) }
-        puts "#{from_objects.size} objects found in source dump"
-
-        selected_records = Hash.new
-        remaining_records = Array.new
-        to.each_record do |r|
-            address = (r['address'] || r['root'])
-            if !from_objects.include?(address)
-                selected_records[address] = r
-                r['only_in_target'] = 1
-            else
-                remaining_records << r
-            end
-        end
-
-        total = remaining_records.size + selected_records.size
-        count = 0
-        while selected_records.size != count
-            count = selected_records.size
-            puts "#{count}/#{total} records selected so far"
-            remaining_records.delete_if do |r|
-                address = (r['address'] || r['root'])
-                references = r['references']
-
-                if references && references.any? { |r| selected_records.has_key?(r) }
-                    selected_records[address] = r
-                end
-            end
-        end
-        puts "#{count}/#{total} records selected"
-
-        selected_records.each_value do |r|
-            if references = r['references']
-                references.delete_if { |a| !selected_records.has_key?(a) }
-            end
-        end
-        selected_records.each_value
-    end
-end
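The deleted `MemDump.diff` flood-filled the selection: it kept every record new in the target plus anything referencing the new records, with progress printed to the console. The `MemoryDump#diff` method that replaces it is a plain set-membership difference: keep only records whose address is absent from the source dump. A self-contained sketch of the new semantics on plain hashes (illustrative addresses):

```ruby
# Records keyed by address, as in a heap dump.
from = { '0x1' => { 'address' => '0x1' }, '0x2' => { 'address' => '0x2' } }
to   = { '0x1' => { 'address' => '0x1' }, '0x3' => { 'address' => '0x3' } }

# 0.2.0-style diff: records of `to` whose address is absent from `from` --
# no propagation through references, no side effects on the records.
leaked = to.reject { |address, _| from.key?(address) }

leaked.keys # => ["0x3"]
```

The narrower semantics compose better with the rest of the new API: the reference-closure behaviour of the old helper can be recovered by chaining with `add_children` or `interface_with` on the resulting dump.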
data/lib/memdump/stats.rb
DELETED
@@ -1,15 +0,0 @@
-module MemDump
-    def self.stats(memdump)
-        unknown_class = 0
-        by_class = Hash.new(0)
-        memdump.each_record do |r|
-            if klass = (r['class'] || r['type'] || r['root'])
-                by_class[klass] += 1
-            else
-                unknown_class += 1
-            end
-        end
-        return unknown_class, by_class
-    end
-end
-