sheap 0.1.0 → 0.2.0

Files changed (6)
  1. checksums.yaml +4 -4
  2. data/README.md +76 -12
  3. data/lib/sheap/version.rb +1 -1
  4. data/lib/sheap.rb +288 -53
  5. data/tmp/.keep +0 -0
  6. metadata +4 -3
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 0f61e05e0d4dbe85f032a53a7b61a9d5446d1f7bddd599ed47b46c132c3615ca
- data.tar.gz: f2c2aa947a8c2110df25add16f97b27a4a0df10473dd4e73c98a4634899489b1
+ metadata.gz: 63584faef266eb7722518ee68bbb9cb757786a6aec38779889466b3d215eb542
+ data.tar.gz: f4979afbb9ce9f1f7ae8aabb58a8b69e7fc86b676c35a451774311a8d6d212d1
  SHA512:
- metadata.gz: d75e2dce655e44a761c0f3a23c29afd59fe2e535f179b286175e1685d3e4552bf3c4a9eefc21495f1be67a31084db6ceb6a400f7fe40ac98bf21b0751665c50d
- data.tar.gz: 5410d3da66d793c4c552878738726e15e34341c017f62bc8c992baebab2933fe3780a9efec522717c8c01ee02ac7bc9cdbbc65c7c9691d69861299b4fda26cb3
+ metadata.gz: 5388e77ef86d256762f7388836f6aa2f34837e0a87da9e9fa842156fca972c475b87c506c7310cbde3739343d250bcc11714c875785627fd557da7bd0a7606e7
+ data.tar.gz: b3696454a64657a42c07c331d659938933417fb611ef5494127d76c9aa4b83ee2ef057c0c0e2d220d8cf27a1b822c9e5300b0a1fee9c1706ab07283335e21175
data/README.md CHANGED
@@ -1,24 +1,88 @@
1
1
  # Sheap
2
2
 
3
- TODO: Delete this and the text below, and describe your gem
3
+ Sheap is a library for interactively exploring Ruby heap dumps. Sheap contains a command-line tool and a library for use in IRB.
4
4
 
5
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/sheap`. To experiment with that code, run `bin/console` for an interactive prompt.
5
+ Some examples of things you can do with Sheap:
6
+ - Find all retained objects between two heap dumps, and analyze them by their properties
7
+ - Inspect an individual object in a heap dump, and interrogate the objects it references and the objects that reference it (reverse references)
8
+ - For a given object, discover all paths back to the root of the heap, which can help you understand why an object is retained.
6
9
 
7
- ## Installation
8
-
9
- TODO: Replace `UPDATE_WITH_YOUR_GEM_NAME_PRIOR_TO_RELEASE_TO_RUBYGEMS_ORG` with your gem name right after releasing it to RubyGems.org. Please do not do it earlier due to security reasons. Alternatively, replace this section with instructions to install your gem from git if you don't plan to release to RubyGems.org.
10
-
11
- Install the gem and add to the application's Gemfile by executing:
10
+ Why Ruby heap dumps, briefly:
11
+ - Ruby heap dumps are a snapshot of the state of the Ruby VM at a given point in time, which can be useful for understanding memory-related behavior such as bloat and retention issues.
12
+ - The heap contains objects that may be familiar to your application (constants, classes, instances of classes, and primitives like strings and arrays), as well as objects that are internal to the Ruby VM, such as instruction sequences and call caches.
13
+ - Ruby's garbage collector is a mark-and-sweep collector, which means that it starts at the root of the heap and marks all objects that are reachable from the root. It then sweeps the heap, freeing any objects that were not marked. This means that any object that is reachable from the root is retained, and any object that is not reachable from the root is freed. This is why it's useful to find all objects that are retained between two heap dumps (and thus multiple GC runs), inspect their properties, and understand their paths back to the root of the heap.
12
14
 
13
- $ bundle add UPDATE_WITH_YOUR_GEM_NAME_PRIOR_TO_RELEASE_TO_RUBYGEMS_ORG
14
-
15
- If bundler is not being used to manage dependencies, install the gem by executing:
15
+ ## Installation
16
16
 
17
- $ gem install UPDATE_WITH_YOUR_GEM_NAME_PRIOR_TO_RELEASE_TO_RUBYGEMS_ORG
17
+ You can `gem install sheap` to get sheap as a library and command line tool. You can also download `lib/sheap.rb` to a remote server and require it as a standalone file from IRB.
18
18
 
19
19
  ## Usage
20
20
 
21
- TODO: Write usage instructions here
21
+ Running the command-line tool opens an IRB session with the heap dumps loaded. You can then use the `$diff`, `$before`, and `$after` variables to explore the heap.
22
+
23
+ ```console
24
+ $ sheap [HEAP_BEFORE.dump] [HEAP_AFTER.dump]
25
+ ```
26
+
27
+ To use directly with IRB:
28
+
29
+ ```ruby
30
+ # $ irb
31
+
32
+ require './lib/sheap'
33
+
34
+ # Create a diff of two heap dumps
35
+ $diff = Sheap::Diff.new('tmp/heap_before.dump', 'tmp/heap_after.dump')
36
+
37
+ # Find all retained objects and count by type
38
+ $diff.retained.map(&:type_str).tally.sort_by(&:last)
39
+ # => [["DATA", 1], ["FILE", 1], ["IMEMO", 4], ["STRING", 4], ["ARRAY", 10000]]
40
+
41
+ # Find the largest array in the 'after' heap dump
42
+ $diff.after.of_type("ARRAY").sort_by { |o| o.data["length"] }.last
43
+ # => <ARRAY 0x1023effc8 (10000 refs)>
44
+
45
+ # Is it old?
46
+ $diff.after.of_type("ARRAY").sort_by { |o| o.data["length"] }.last.old?
47
+ # => false
48
+
49
+ # What else can be learned about it
50
+ $diff.after.of_type("ARRAY").sort_by { |o| o.data["length"] }.last.data
51
+ # =>
52
+ # { "address"=>"0x1023effc8",
53
+ # "type"=>"ARRAY",
54
+ # "shape_id"=>0,
55
+ # "slot_size"=>40,
56
+ # "class"=>"0x1024c33f0",
57
+ # "length"=>10000,
58
+ # "memsize"=>89712,
59
+ # "flags"=>{"wb_protected"=>true},
60
+ # "references" => [
61
+ # <ARRAY 0x1023efc80>,
62
+ # <ARRAY 0x1023efc58>,
63
+ # <ARRAY 0x1023efc30>,
64
+ # <ARRAY 0x1023efc08>,
65
+ # <ARRAY 0x1023efbe0>,
66
+ # <ARRAY 0x1023efbb8>,
67
+ # # ...
68
+ # ]
69
+ # }
70
+
71
+ # Find the first of its references by address
72
+ $diff.after.at("0x1023efc80")
73
+ # => <ARRAY 0x1023efc80>
74
+
75
+ # Show that object's path back to the root of the heap
76
+ $diff.after.find_path($diff.after.at("0x1023efc80"))
77
+ # => [<ROOT global_tbl (13 refs)>, <ARRAY 0x1023effc8 (10000 refs)>, <ARRAY 0x1023efc80>]
78
+ ```
79
+
80
+ ### Generating heap dumps
81
+
82
+ Sheap on its own will not generate heap dumps for you. Some options for generating heap dumps:
83
+
84
+ - `ObjectSpace.dump_all(output: open("tmp/snapshot1.dump", "w"))`
85
+ - [Derailed Benchmarks](https://github.com/zombocom/derailed_benchmarks) `bundle exec derailed exec perf:heap_diff` produces 3 generations of heap dumps.
22
86
 
23
87
  ## Development
24
88
 
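The README's "Generating heap dumps" section points at `ObjectSpace.dump_all`. As a minimal sketch of producing a before/after pair with it: `write_heap_dump`, the temporary directory, and the `RETAINED` constant are hypothetical names used only for illustration and are not part of sheap. The three GC passes mirror the `snapshot` helper that this release removes from `lib/sheap.rb`.

```ruby
require "objspace"
require "json"
require "tmpdir"

# Hypothetical helper (not part of sheap): optionally run GC, then write a
# heap dump of the current process to `path`.
def write_heap_dump(path, gc: true)
  # A few GC passes make it less likely that already-dead objects end up in the dump.
  3.times { GC.start } if gc
  File.open(path, "w") do |file|
    ObjectSpace.dump_all(output: file)
  end
  path
end

dir = Dir.mktmpdir("sheap-example")
before_path = write_heap_dump(File.join(dir, "heap_before.dump"))

# Allocate something that stays reachable, so it shows up only in the second dump.
RETAINED = Array.new(1_000) { [] }

after_path = write_heap_dump(File.join(dir, "heap_after.dump"))

# Each line of a dump is a separate JSON document describing one object or root.
first_entry = JSON.parse(File.open(after_path, &:readline))
puts first_entry["type"]
puts "sheap #{before_path} #{after_path}"
```

The two paths printed at the end can then be passed to the `sheap` command or to `Sheap::Diff.new` as shown in the README usage section.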
data/lib/sheap/version.rb CHANGED
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  class Sheap
- VERSION = "0.1.0"
+ VERSION = "0.2.0"
  end
data/lib/sheap.rb CHANGED
@@ -15,38 +15,94 @@ class Sheap
15
15
  FileUtils.mkdir_p(@dir)
16
16
  end
17
17
 
18
+ module Collection
19
+ def class_named(name)
20
+ filter do |obj|
21
+ obj.json.include?(name) &&
22
+ obj.type_str == "CLASS" &&
23
+ obj.name == name
24
+ end
25
+ end
26
+
27
+ def instances_of(klass)
28
+ addr = klass.address
29
+ filter do |obj|
30
+ obj.json.include?(addr) &&
31
+ obj.class_addr == addr
32
+ end
33
+ end
34
+
35
+ def of_type(type)
36
+ type = type.to_s.upcase
37
+ filter { |o| o.json.include?(type) && o.type_str == type }
38
+ end
39
+
40
+ def classes; of_type("CLASS"); end
41
+ def icasses; of_type("ICLASS"); end
42
+ def modules; of_type("MODULE"); end
43
+ def imemos; of_type("IMEMO"); end
44
+
45
+ def strings; of_type("STRING"); end
46
+ def hashes; of_type("HASH"); end
47
+ def arrays; of_type("ARRAY"); end
48
+
49
+ def plain_objects; of_type("OBJECT"); end
50
+ def structs; of_type("STRUCT"); end
51
+ def datas; of_type("DATA"); end
52
+ def files; of_type("FILE"); end
53
+
54
+ def regexps; of_type("REGEXP"); end
55
+ def matches; of_type("MATCH"); end
56
+
57
+ def bignums; of_type("BIGNUM"); end
58
+ def symbols; of_type("SYMBOL"); end
59
+ def floats; of_type("FLOAT"); end
60
+ def rationals; of_type("RATIONAL"); end
61
+ def complexes; of_type("COMPLEX"); end
62
+ end
63
+
18
64
  class Diff
19
- attr_reader :before, :after
20
- def initialize(before, after)
21
- @before = before
22
- @after = after
65
+ include Collection
66
+
67
+ attr_reader :before, :after, :later
68
+ def initialize(before, after, later = nil)
69
+ @before = Heap.wrap(before)
70
+ @after = Heap.wrap(after)
71
+ @later = Heap.wrap(later) if later
23
72
  end
24
73
 
25
74
  def retained
26
- @retained ||= after.objects - before.objects
75
+ @retained ||= HeapObjectCollection.new(calculate_retained, @after)
27
76
  end
28
- end
77
+ alias objects retained
29
78
 
30
- def self.load
31
- Dir["sheap/*"].map do |file|
32
- Heap.new(file)
79
+ def filter(&block)
80
+ retained.filter(&block)
33
81
  end
34
- end
35
82
 
36
- def self.load_diff
37
- before = Heap.new("sheap/snapshot-0.dump")
38
- after = Heap.new("sheap/snapshot-1.dump")
39
- Diff.new(before, after)
40
- end
83
+ def inspect
84
+ "#<#{self.class} (#{objects.size} objects)>"
85
+ end
41
86
 
42
- def snapshot(gc: true)
43
- 3.times { GC.start } if gc
87
+ private
88
+
89
+ def calculate_retained
90
+ set = Set.new
91
+ @after.objects.each do |obj|
92
+ set.add(obj)
93
+ end
94
+ @before.objects.each do |obj|
95
+ set.delete(obj)
96
+ end
97
+ if @later
98
+ later_set = Set.new(@later.objects)
99
+ set.select! do |obj|
100
+ later_set.include?(obj)
101
+ end
102
+ end
44
103
 
45
- output = File.join(@dir, "snapshot-#{@idx}.dump")
46
- File.open(output, "w") do |file|
47
- ObjectSpace.dump_all(output: file)
104
+ set.to_a
48
105
  end
49
- @idx += 1
50
106
  end
51
107
 
52
108
  class << self
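The new `calculate_retained` in the hunk above takes everything in the "after" dump, removes what was already in "before", and, when a third "later" dump is given, keeps only what is still present there. A minimal standalone sketch of that set logic follows. It works on plain address strings rather than `HeapObject` instances, and `retained_addresses` is a hypothetical name, so it illustrates the idea and is not the gem's code.

```ruby
require "set"

# Hypothetical helper: addresses present in `after` but not in `before`,
# optionally restricted to those that still exist in `later`.
def retained_addresses(before, after, later = nil)
  retained = (Set.new(after) - Set.new(before)).to_a
  if later
    later_set = Set.new(later)
    retained = retained.select { |addr| later_set.include?(addr) }
  end
  retained
end

before = %w[0x1 0x2]
after  = %w[0x1 0x2 0x3 0x4]
later  = %w[0x1 0x3]

p retained_addresses(before, after)        # => ["0x3", "0x4"]
p retained_addresses(before, after, later) # => ["0x3"]
```

Passing the third dump filters out objects that were allocated between the first two dumps but collected afterwards, which narrows the result to allocations that are more likely to be genuinely retained.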
@@ -61,6 +117,76 @@ class Sheap
61
117
 
62
118
  EMPTY_ARRAY = [].freeze
63
119
 
120
+ class HeapObjectCollection
121
+ include Enumerable
122
+ include Collection
123
+
124
+ attr_reader :heap, :objects
125
+
126
+ def initialize(objects, heap = nil)
127
+ objects = objects.to_a unless objects.instance_of?(Array)
128
+ @objects = objects
129
+ @heap = heap || objects.first&.heap
130
+ end
131
+
132
+ def filter(&block)
133
+ HeapObjectCollection.new(@objects.select(&block), @heap)
134
+ end
135
+ alias select filter
136
+
137
+ def sample(n = nil)
138
+ if n
139
+ HeapObjectCollection.new(@objects.sample(n))
140
+ else
141
+ @objects.sample
142
+ end
143
+ end
144
+
145
+ def last(n = nil)
146
+ if n
147
+ HeapObjectCollection.new(@objects.last(n))
148
+ else
149
+ objects.last
150
+ end
151
+ end
152
+
153
+ def each(&block)
154
+ @objects.each(&block)
155
+ end
156
+
157
+ def length
158
+ @objects.length
159
+ end
160
+ alias size length
161
+ alias count length
162
+
163
+ def pretty_print(q)
164
+ q.group(1, '[', ']') {
165
+ if size <= 20
166
+ q.seplist(self) {|v|
167
+ q.pp v
168
+ }
169
+ else
170
+ preview = 4
171
+ q.seplist(first(preview)) {|v|
172
+ q.pp v
173
+ }
174
+ q.comma_breakable
175
+ q.text "... (#{size - preview} more)"
176
+ end
177
+ }
178
+ end
179
+
180
+ def inspect
181
+ "#<#{self.class} (#{size} objects)>"
182
+ end
183
+
184
+ def to_a
185
+ @objects
186
+ end
187
+ alias to_ary to_a
188
+ end
189
+
64
190
  class HeapObject
65
191
  attr_reader :heap, :json
66
192
 
@@ -73,6 +199,10 @@ class Sheap
73
199
  @json[/"type":"([A-Z]+)"/, 1]
74
200
  end
75
201
 
202
+ def root?
203
+ @json.include?('"type":"ROOT"')
204
+ end
205
+
76
206
  def address
77
207
  @json[/"address":"(0x[0-9a-f]+)"/, 1] || @json[/"root":"([a-z_]+)"/, 1]
78
208
  end
@@ -87,13 +217,19 @@ class Sheap
87
217
  end
88
218
 
89
219
  def references
90
- referenced_addrs.map do |addr|
91
- @heap.at(addr)
92
- end
220
+ HeapObjectCollection.new(
221
+ referenced_addrs.map do |addr|
222
+ @heap.at(addr)
223
+ end,
224
+ heap
225
+ )
93
226
  end
94
227
 
95
228
  def inverse_references
96
- @heap.inverse_references[address] || EMPTY_ARRAY
229
+ HeapObjectCollection.new(
230
+ (@heap.inverse_references[address] || EMPTY_ARRAY),
231
+ heap
232
+ )
97
233
  end
98
234
 
99
235
  def data
@@ -109,17 +245,21 @@ class Sheap
109
245
  end
110
246
 
111
247
  def imemo_type
112
- @json[/"imemo_type":"([a-z]+)"/, 1]
248
+ @json[/"imemo_type":"([a-z_]+)"/, 1]
113
249
  end
114
250
 
115
251
  def struct
116
- @json[/"struct":"([a-zA-Z]+)"/, 1]
252
+ @json[/"struct":"([^"]+)"/, 1]
117
253
  end
118
254
 
119
255
  def wb_protected?
120
256
  @json.include?('"wb_protected":true')
121
257
  end
122
258
 
259
+ def old?
260
+ @json.include?('"old":true')
261
+ end
262
+
123
263
  def name
124
264
  data["name"]
125
265
  end
@@ -133,23 +273,30 @@ class Sheap
133
273
  heap.instances_of(self)
134
274
  end
135
275
 
276
+ def superclass
277
+ heap.at(data["superclass"])
278
+ end
279
+
136
280
  def inspect
137
281
  type_str = self.type_str
138
- s = +"<#{type_str} #{address}"
282
+ s = +"<#{type_str} #{address} #{inspect_hint}>"
283
+ end
139
284
 
285
+ def inspect_hint
286
+ s = +""
140
287
  case type_str
141
288
  when "CLASS"
142
- s << " " << (name || "(anonymous)")
289
+ s << (name || "(anonymous)")
143
290
  when "MODULE"
144
- s << " " << (name || "(anonymous)")
291
+ s << (name || "(anonymous)")
145
292
  when "STRING"
146
- s << " " << data["value"].inspect
293
+ s << data["value"].inspect
147
294
  when "IMEMO"
148
- s << " " << (imemo_type || "unknown")
295
+ s << (imemo_type || "unknown")
149
296
  when "OBJECT"
150
- s << " " << (klass.name || "(#{klass.address})")
297
+ s << (klass.name || "(#{klass.address})")
151
298
  when "DATA"
152
- s << " " << struct.to_s
299
+ s << struct.to_s
153
300
  end
154
301
 
155
302
  refs = referenced_addrs
@@ -157,7 +304,42 @@ class Sheap
157
304
  s << " (#{referenced_addrs.size} refs)"
158
305
  end
159
306
 
160
- s << ">"
307
+ s
308
+ end
309
+
310
+ def pretty_print(q)
311
+ current_depth = q.current_group.depth
312
+ q.group(1, "#<#{type_str}", '>') do
313
+ q.text " "
314
+ q.text address
315
+ if current_depth <= 1
316
+ data = self.data
317
+ attributes = data.keys - ["address"]
318
+ q.seplist(attributes, lambda { q.text ',' }) {|v|
319
+ q.breakable
320
+ q.text v
321
+ q.text "="
322
+ q.group(1) {
323
+ q.breakable ''
324
+ case v
325
+ when "class"
326
+ q.pp klass
327
+ when "superclass"
328
+ q.pp superclass
329
+ when "flags"
330
+ q.text flags.keys.join("|")
331
+ when "references"
332
+ q.text "(#{referenced_addrs.size} refs)"
333
+ else
334
+ q.pp data[v]
335
+ end
336
+ }
337
+ }
338
+ else
339
+ q.breakable
340
+ q.text inspect_hint
341
+ end
342
+ end
161
343
  end
162
344
 
163
345
  def value
@@ -198,9 +380,27 @@ class Sheap
198
380
  def hash
199
381
  address.hash
200
382
  end
383
+
384
+ def method_missing(name, *args)
385
+ if value = data[name.to_s]
386
+ value
387
+ else
388
+ super
389
+ end
390
+ end
391
+
392
+ def respond_to_missing?(name, *)
393
+ data.key?(name.to_s) || super
394
+ end
395
+
396
+ def [](key)
397
+ data[key.to_s]
398
+ end
201
399
  end
202
400
 
203
401
  class Heap
402
+ include Collection
403
+
204
404
  attr_reader :filename
205
405
 
206
406
  def initialize(filename)
@@ -210,15 +410,29 @@ class Sheap
210
410
  def each_object
211
411
  return enum_for(__method__) unless block_given?
212
412
 
213
- File.open(filename) do |file|
413
+ open_file do |file|
214
414
  file.each_line do |json|
215
415
  yield HeapObject.new(self, json)
216
416
  end
217
417
  end
218
418
  end
219
419
 
420
+ def open_file(&block)
421
+ # FIXME: look for magic header
422
+ if filename.end_with?(".gz")
423
+ require "zlib"
424
+ Zlib::GzipReader.open(filename, &block)
425
+ else
426
+ File.open(filename, &block)
427
+ end
428
+ end
429
+
220
430
  def objects
221
- @objects ||= each_object.to_a
431
+ @objects ||= HeapObjectCollection.new(each_object.to_a, self)
432
+ end
433
+
434
+ def filter(&block)
435
+ objects.filter(&block)
222
436
  end
223
437
 
224
438
  def objects_by_addr
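The `open_file` method added above picks the gzip reader from the `.gz` extension and leaves a FIXME about checking for a magic header. One possible way to do that check, sketched here as an assumption and not as the gem's implementation, is to compare the first two bytes of the file against the gzip magic number `0x1f 0x8b`. `GZIP_MAGIC`, `gzip_file?`, and `open_dump` are hypothetical names.

```ruby
require "zlib"
require "tmpdir"

# The two-byte magic number that starts every gzip stream.
GZIP_MAGIC = "\x1f\x8b".b

# Hypothetical helper: true when the file begins with the gzip magic bytes.
def gzip_file?(path)
  File.binread(path, 2) == GZIP_MAGIC
end

# Hypothetical helper: yield a readable IO for either a plain or gzipped dump.
def open_dump(path, &block)
  if gzip_file?(path)
    Zlib::GzipReader.open(path, &block)
  else
    File.open(path, &block)
  end
end

dir = Dir.mktmpdir("sheap-gzip-example")
plain_path = File.join(dir, "heap.dump")
gzip_path  = File.join(dir, "heap.dump.compressed") # deliberately not named .gz

line = %({"type":"ROOT", "root":"vm"}\n)
File.write(plain_path, line)
Zlib::GzipWriter.open(gzip_path) { |gz| gz.write(line) }

puts open_dump(plain_path) { |io| io.read }
puts open_dump(gzip_path) { |io| io.read }
```

Detecting by content means a compressed dump is handled even when it was saved without a `.gz` suffix, at the cost of one extra small read per file.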
@@ -247,33 +461,54 @@ class Sheap
247
461
  end
248
462
  end
249
463
 
250
- def at(addr)
251
- objects_by_addr[addr]
464
+ def roots
465
+ of_type("ROOT")
252
466
  end
253
467
 
254
- def class_named(name)
255
- objects.select do |obj|
256
- obj.json.include?(name) &&
257
- obj.type_str == "CLASS" &&
258
- obj.name == name
468
+ # Breadth-first search from `start_addresses` (the roots by default), following
469
+ # each object's references, until one of `end_addresses` is reached.
470
+ def find_path(start_addresses, end_addresses = nil)
471
+ if end_addresses.nil?
472
+ end_addresses = start_addresses
473
+ start_addresses = roots
259
474
  end
260
- end
475
+ start_addresses = Array(start_addresses)
476
+ end_addresses = Array(end_addresses)
261
477
 
262
- def instances_of(klass)
263
- addr = klass.address
264
- objects.select do |obj|
265
- obj.json.include?(addr) &&
266
- obj.class_addr == addr
478
+ q = start_addresses.map{|x| [x] }
479
+
480
+ visited = Set.new
481
+ while !q.empty?
482
+ current_path = q.shift
483
+ current_address = current_path.last
484
+
485
+ if end_addresses.include?(current_address)
486
+ return current_path.map{|addr| addr}
487
+ end
488
+
489
+ if !visited.include?(current_address)
490
+ visited.add(current_address)
491
+
492
+ current_references = current_address.references
493
+
494
+ current_references.each do |obj|
495
+ q.push([*current_path, obj])
496
+ end
497
+ end
267
498
  end
499
+ nil
268
500
  end
269
501
 
270
- def of_type(type)
271
- type = type.to_s.upcase
272
- objects.select { |o| o.type_str == type }
502
+ def at(addr)
503
+ objects_by_addr[addr]
273
504
  end
274
505
 
275
506
  def inspect
276
507
  "#<#{self.class} (#{objects.size} objects)>"
277
508
  end
509
+
510
+ def self.wrap(heap)
511
+ self === heap ? heap : new(heap)
512
+ end
278
513
  end
279
514
  end
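The `find_path` method added in the last hunk is a breadth-first search: it starts from the roots (or the given start objects), follows `references`, and returns the first path that reaches a target. The sketch below shows the same technique on a plain hash of node names, assuming `graph` maps each node to the nodes it references. `find_path_in` and the sample graph are hypothetical and exist only to make the traversal easy to follow.

```ruby
require "set"

# Hypothetical helper: breadth-first search over `graph`, returning the first
# path found from any of `starts` to any of `targets`, or nil if none exists.
def find_path_in(graph, starts, targets)
  targets = Set.new(Array(targets))
  queue = Array(starts).map { |node| [node] }
  visited = Set.new

  until queue.empty?
    path = queue.shift
    node = path.last
    return path if targets.include?(node)
    # Set#add? returns nil when the node was already visited, which also guards against cycles.
    next unless visited.add?(node)

    graph.fetch(node, []).each { |ref| queue.push([*path, ref]) }
  end
  nil
end

graph = {
  "root" => ["a", "b"],
  "a"    => ["c"],
  "b"    => ["c", "d"],
  "c"    => ["root"], # a cycle back to the root
}

p find_path_in(graph, "root", "d")       # => ["root", "b", "d"]
p find_path_in(graph, "root", "missing") # => nil
```

Because the queue is processed in first-in, first-out order, the returned path is one with the fewest hops, which is usually the most useful answer to "why is this object retained?".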
data/tmp/.keep ADDED
File without changes
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: sheap
  version: !ruby/object:Gem::Version
- version: 0.1.0
+ version: 0.2.0
  platform: ruby
  authors:
  - John Hawthorn
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2023-10-13 00:00:00.000000000 Z
+ date: 2024-02-22 00:00:00.000000000 Z
  dependencies: []
  description: A set of helpers for analyzing the output of ObjectSpace.dump_all
  email:
@@ -26,6 +26,7 @@ files:
  - exe/sheap
  - lib/sheap.rb
  - lib/sheap/version.rb
+ - tmp/.keep
  homepage: https://github.com/jhawthorn/sheap
  licenses:
  - MIT
@@ -48,7 +49,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
  - !ruby/object:Gem::Version
  version: '0'
  requirements: []
- rubygems_version: 3.4.10
+ rubygems_version: 3.5.3
  signing_key:
  specification_version: 4
  summary: A helpers for heap dumps