sheap 0.1.0 → 0.3.0

Files changed (6)
  1. checksums.yaml +4 -4
  2. data/README.md +88 -12
  3. data/lib/sheap/version.rb +1 -1
  4. data/lib/sheap.rb +307 -53
  5. data/tmp/.keep +0 -0
  6. metadata +4 -3
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 0f61e05e0d4dbe85f032a53a7b61a9d5446d1f7bddd599ed47b46c132c3615ca
4
- data.tar.gz: f2c2aa947a8c2110df25add16f97b27a4a0df10473dd4e73c98a4634899489b1
3
+ metadata.gz: 84cd499accc73ed5e94564c3703fd2ed19dfb380b1c62cbccb35a207ab670e1e
4
+ data.tar.gz: b44e356f1d76c36b1cd4bb335594765aafafcdf3e21af96b7cecda3bb3617fdb
5
5
  SHA512:
6
- metadata.gz: d75e2dce655e44a761c0f3a23c29afd59fe2e535f179b286175e1685d3e4552bf3c4a9eefc21495f1be67a31084db6ceb6a400f7fe40ac98bf21b0751665c50d
7
- data.tar.gz: 5410d3da66d793c4c552878738726e15e34341c017f62bc8c992baebab2933fe3780a9efec522717c8c01ee02ac7bc9cdbbc65c7c9691d69861299b4fda26cb3
6
+ metadata.gz: 69fdf739e7088d998f3968024f2fe2b086ba79d2a4707f3dc924a87a079f40c2b89958647ef92fafa8f7a9f513377bae229c715178071a2c19c30384c81374ec
7
+ data.tar.gz: 6b29e443a217dab6b283c9026b85c1ac358fca2c8653926774cb7356eff8693e61652c1d4bad1032539c9cb0e37fba56b2dc3267d334a38c6105d2bcb8bde69f
data/README.md CHANGED
@@ -1,24 +1,100 @@
1
1
  # Sheap
2
2
 
3
- TODO: Delete this and the text below, and describe your gem
3
+ Sheap is a tool for interactively exploring Ruby heap dumps. It includes a command-line tool and a library for use in IRB.
4
4
 
5
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/sheap`. To experiment with that code, run `bin/console` for an interactive prompt.
5
+ Some examples of things you can do with Sheap:
6
+ - Find all retained objects between two heap dumps, and analyze them by their properties
7
+ - Inspect an individual object in a heap dump, and interrogate the objects it references and the objects that reference it (reverse references)
8
+ - For a given object, discover all paths back to the root of the heap, which can help you understand why an object is retained.
6
9
 
7
- ## Installation
8
-
9
- TODO: Replace `UPDATE_WITH_YOUR_GEM_NAME_PRIOR_TO_RELEASE_TO_RUBYGEMS_ORG` with your gem name right after releasing it to RubyGems.org. Please do not do it earlier due to security reasons. Alternatively, replace this section with instructions to install your gem from git if you don't plan to release to RubyGems.org.
10
-
11
- Install the gem and add to the application's Gemfile by executing:
10
+ Why Ruby heap dumps, briefly:
11
+ - Ruby heap dumps are a snapshot of the state of the Ruby VM at a given point in time, which can be useful for understanding memory-related behavior such as bloat and retention issues.
12
+ - The heap contains objects that may be familiar to your application (constants, classes, instances of classes, and primitives like strings and arrays), as well as objects that are internal to the Ruby VM, such as instruction sequences and call caches.
13
+ - Ruby's garbage collector is a mark-and-sweep collector, which means that it starts at the root of the heap and marks all objects that are reachable from the root. It then sweeps the heap, freeing any objects that were not marked. This means that any object that is reachable from the root is retained, and any object that is not reachable from the root is freed. This is why it's useful to find all objects that are retained between two heap dumps (and thus multiple GC runs), inspect their properties, and understand their paths back to the root of the heap.
12
14
 
13
- $ bundle add UPDATE_WITH_YOUR_GEM_NAME_PRIOR_TO_RELEASE_TO_RUBYGEMS_ORG
14
-
15
- If bundler is not being used to manage dependencies, install the gem by executing:
15
+ ## Installation
16
16
 
17
- $ gem install UPDATE_WITH_YOUR_GEM_NAME_PRIOR_TO_RELEASE_TO_RUBYGEMS_ORG
17
+ You can `gem install sheap` to get sheap as a library and command line tool. You can also download `lib/sheap.rb` to a remote server and require it as a standalone file from IRB.
18
18
 
19
19
  ## Usage
20
20
 
21
- TODO: Write usage instructions here
21
+ Running the command-line tool opens an IRB session with the heaps loaded. You can then use the `$diff`, `$before`, and `$after` variables to explore them.
22
+
23
+ ```console
24
+ $ sheap [HEAP_BEFORE.dump] [HEAP_AFTER.dump]
25
+ ```
26
+
27
+ To use directly with IRB:
28
+
29
+ ```ruby
30
+ # $ irb
31
+
32
+ require './lib/sheap'
33
+
34
+ # Create a diff of two heap dumps
35
+ $diff = Sheap::Diff.new('tmp/heap_before.dump', 'tmp/heap_after.dump')
36
+
37
+ # Find all retained objects and count by type
38
+ $diff.retained.map(&:type_str).tally.sort_by(&:last)
39
+ # => [["DATA", 1], ["FILE", 1], ["IMEMO", 4], ["STRING", 4], ["ARRAY", 10000]]
40
+
41
+ # Find the 5 largest arrays in the 'after' heap dump
42
+ $diff.after.arrays.sort_by(&:length).last(5)
43
+ # =>
44
+ # [#<ARRAY 0x100ec0440 (512 refs)>,
45
+ # #<ARRAY 0x100ec9270 (512 refs)>,
46
+ # #<ARRAY 0x100f4b450 (512 refs)>,
47
+ # #<ARRAY 0x11bc6d5b0 (512 refs)>,
48
+ # #<ARRAY 0x11c137960 (10000 refs)>]
49
+
50
+ # Grab and examine just the largest array
51
+ large_arr = $diff.after.arrays.max_by(&:length)
52
+ # =>
53
+ # #<ARRAY 0x1023effc8
54
+ # type="ARRAY",
55
+ # shape_id=0,
56
+ # slot_size=40,
57
+ # class=#<CLASS 0x100e43350 Array (252 refs)>,
58
+ # length=10000,
59
+ # references=(10000 refs),
60
+ # memsize=89712,
61
+ # flags=wb_protected>
62
+
63
+ # Is it old?
64
+ large_arr.old?
65
+ # => false
66
+
67
+ # Find the first of its references
68
+ large_arr.references.first
69
+ # =>
70
+ # #<ARRAY 0x11c13fdb8
71
+ # type="ARRAY",
72
+ # shape_id=0,
73
+ # slot_size=40,
74
+ # class=#<CLASS 0x100e43350 Array (252 refs)>,
75
+ # length=0,
76
+ # embedded=true,
77
+ # memsize=40,
78
+ # flags=wb_protected>
79
+
80
+ # Reference that same object by address
81
+ $diff.after.at("0x11c13fdb8")
82
+ # =>
83
+ # #<ARRAY 0x11c13fdb8
84
+ # type="ARRAY",
85
+ # ...
86
+
87
+ # Show that object's path back to the root of the heap
88
+ $diff.after.find_path($diff.after.at("0x11c13fdb8"))
89
+ # => [#<ROOT global_tbl (13 refs)>, #<ARRAY 0x1023effc8 (10000 refs)>, #<ARRAY 0x11c13fdb8>]
90
+ ```
91
+
92
+ ### Generating heap dumps
93
+
94
+ Sheap on its own will not generate heap dumps for you. Some options for generating heap dumps:
95
+
96
+ - `ObjectSpace.dump_all(output: open("tmp/snapshot1.dump", "w"))`
97
+ - [Derailed Benchmarks](https://github.com/zombocom/derailed_benchmarks): `bundle exec derailed exec perf:heap_diff` produces 3 generations of heap dumps.
22
98
 
23
99
  ## Development
24
100
 
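The new README notes that Sheap does not generate heap dumps itself and points at `ObjectSpace.dump_all`. A minimal sketch of producing a before/after pair follows. The helper name `write_heap_dump`, the temporary directory, and the `$retained_example` global are hypothetical and only illustrate the idea; forcing a few GC runs before dumping mirrors the `snapshot` helper that shipped in 0.1.0.

```ruby
require "objspace"
require "tmpdir"

# Hypothetical helper (not part of sheap): writes one heap dump to `path`.
def write_heap_dump(path, gc: true)
  # Run the GC a few times first so short-lived garbage is less likely
  # to appear in the dump.
  3.times { GC.start } if gc
  File.open(path, "w") { |file| ObjectSpace.dump_all(output: file) }
  path
end

dir = Dir.mktmpdir("sheap-example")

before_path = write_heap_dump(File.join(dir, "heap_before.dump"))

# Allocate something that stays reachable (via a global) between the dumps.
$retained_example = Array.new(1_000) { [] }

after_path = write_heap_dump(File.join(dir, "heap_after.dump"))

# The two files can then be passed to the command-line tool.
puts "sheap #{before_path} #{after_path}"
```

Each line of the resulting files is one JSON object describing a heap slot or root, which is the format Sheap reads line by line.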
data/lib/sheap/version.rb CHANGED
@@ -1,5 +1,5 @@
1
1
  # frozen_string_literal: true
2
2
 
3
3
  class Sheap
4
- VERSION = "0.1.0"
4
+ VERSION = "0.3.0"
5
5
  end
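One behavior change in this release is that `Sheap::Diff` accepts an optional third dump (`later`). The README describes retained objects as those present in the second dump but not the first; with a third dump, only objects that still exist later are kept. The sketch below illustrates that set logic on plain address strings. The helper `retained_addresses` and the sample addresses are hypothetical, and the sketch makes no claim about how Sheap compares objects beyond what the diff shows (`HeapObject#hash` delegates to `address.hash`).

```ruby
require "set"

# Hypothetical helper: each argument is the list of object addresses
# found in one heap dump.
def retained_addresses(before, after, later = nil)
  # Present in `after` but not in `before`.
  retained = Set.new(after) - Set.new(before)
  # With a third dump, keep only what still exists there.
  retained &= Set.new(later) if later
  retained.to_a
end

before = %w[0x1 0x2]
after  = %w[0x1 0x2 0x3 0x4]
later  = %w[0x1 0x4 0x5]

p retained_addresses(before, after)         # => ["0x3", "0x4"]
p retained_addresses(before, after, later)  # => ["0x4"]
```

The third dump filters out objects that were merely waiting to be collected when the second dump was taken, which is why three generations of dumps tend to give a cleaner picture of retention.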
data/lib/sheap.rb CHANGED
@@ -15,38 +15,107 @@ class Sheap
15
15
  FileUtils.mkdir_p(@dir)
16
16
  end
17
17
 
18
+ module Collection
19
+ def class_named(name)
20
+ filter do |obj|
21
+ obj.json.include?(name) &&
22
+ obj.type_str == "CLASS" &&
23
+ obj.name == name
24
+ end
25
+ end
26
+
27
+ def instances_of(klass)
28
+ addr = klass.address
29
+ filter do |obj|
30
+ obj.json.include?(addr) &&
31
+ obj.class_addr == addr
32
+ end
33
+ end
34
+
35
+ def of_type(type)
36
+ type = type.to_s.upcase
37
+ filter { |o| o.json.include?(type) && o.type_str == type }
38
+ end
39
+
40
+ def of_imemo_type(type)
41
+ type = type.to_s.downcase
42
+ filter { |o| o.json.include?(type) && o.imemo_type == type }
43
+ end
44
+
45
+ def classes; of_type("CLASS"); end
46
+ def icasses; of_type("ICLASS"); end
47
+ def modules; of_type("MODULE"); end
48
+ def imemos; of_type("IMEMO"); end
49
+
50
+ def strings; of_type("STRING"); end
51
+ def hashes; of_type("HASH"); end
52
+ def arrays; of_type("ARRAY"); end
53
+
54
+ def plain_objects; of_type("OBJECT"); end
55
+ def structs; of_type("STRUCT"); end
56
+ def datas; of_type("DATA"); end
57
+ def files; of_type("FILE"); end
58
+
59
+ def regexps; of_type("REGEXP"); end
60
+ def matches; of_type("MATCH"); end
61
+
62
+ def bignums; of_type("BIGNUM"); end
63
+ def symbols; of_type("SYMBOL"); end
64
+ def floats; of_type("FLOAT"); end
65
+ def rationals; of_type("RATIONAL"); end
66
+ def complexes; of_type("COMPLEX"); end
67
+
68
+ # imemo types
69
+ def iseqs; of_imemo_type("iseq"); end
70
+ def callcaches; of_imemo_type("callcache"); end
71
+ def constcaches; of_imemo_type("constcache"); end
72
+ def callinfos; of_imemo_type("callinfo"); end
73
+ def crefs; of_imemo_type("cref"); end
74
+ def ments; of_imemo_type("ment"); end
75
+ end
76
+
18
77
  class Diff
19
- attr_reader :before, :after
20
- def initialize(before, after)
21
- @before = before
22
- @after = after
78
+ include Collection
79
+
80
+ attr_reader :before, :after, :later
81
+ def initialize(before, after, later = nil)
82
+ @before = Heap.wrap(before)
83
+ @after = Heap.wrap(after)
84
+ @later = Heap.wrap(later) if later
23
85
  end
24
86
 
25
87
  def retained
26
- @retained ||= after.objects - before.objects
88
+ @retained ||= HeapObjectCollection.new(calculate_retained, @after)
27
89
  end
28
- end
90
+ alias objects retained
29
91
 
30
- def self.load
31
- Dir["sheap/*"].map do |file|
32
- Heap.new(file)
92
+ def filter(&block)
93
+ retained.filter(&block)
33
94
  end
34
- end
35
95
 
36
- def self.load_diff
37
- before = Heap.new("sheap/snapshot-0.dump")
38
- after = Heap.new("sheap/snapshot-1.dump")
39
- Diff.new(before, after)
40
- end
96
+ def inspect
97
+ "#<#{self.class} (#{objects.size} objects)>"
98
+ end
41
99
 
42
- def snapshot(gc: true)
43
- 3.times { GC.start } if gc
100
+ private
44
101
 
45
- output = File.join(@dir, "snapshot-#{@idx}.dump")
46
- File.open(output, "w") do |file|
47
- ObjectSpace.dump_all(output: file)
102
+ def calculate_retained
103
+ set = Set.new
104
+ @after.objects.each do |obj|
105
+ set.add(obj)
106
+ end
107
+ @before.objects.each do |obj|
108
+ set.delete(obj)
109
+ end
110
+ if @later
111
+ later_set = Set.new(@later.objects)
112
+ set.select! do |obj|
113
+ later_set.include?(obj)
114
+ end
115
+ end
116
+
117
+ set.to_a
48
118
  end
49
- @idx += 1
50
119
  end
51
120
 
52
121
  class << self
@@ -61,6 +130,82 @@ class Sheap
61
130
 
62
131
  EMPTY_ARRAY = [].freeze
63
132
 
133
+ class HeapObjectCollection
134
+ include Enumerable
135
+ include Collection
136
+
137
+ attr_reader :heap, :objects
138
+
139
+ def initialize(objects, heap = nil)
140
+ objects = objects.to_a unless objects.instance_of?(Array)
141
+ @objects = objects
142
+ @heap = heap || objects.first&.heap
143
+ end
144
+
145
+ def filter(&block)
146
+ HeapObjectCollection.new(@objects.select(&block), @heap)
147
+ end
148
+ alias select filter
149
+
150
+ def sample(n = nil)
151
+ if n
152
+ HeapObjectCollection.new(@objects.sample(n))
153
+ else
154
+ @objects.sample
155
+ end
156
+ end
157
+
158
+ def [](*args)
159
+ @objects[*args]
160
+ end
161
+
162
+ def last(n = nil)
163
+ if n
164
+ HeapObjectCollection.new(@objects.last(n))
165
+ else
166
+ objects.last
167
+ end
168
+ end
169
+
170
+ def each(&block)
171
+ @objects.each(&block)
172
+ end
173
+
174
+ def length
175
+ @objects.length
176
+ end
177
+ alias size length
178
+ def count(&block)
179
+ @objects.count(&block)
180
+ end
181
+
182
+ def pretty_print(q)
183
+ q.group(1, '[', ']') {
184
+ if size <= 20
185
+ q.seplist(self) {|v|
186
+ q.pp v
187
+ }
188
+ else
189
+ preview = 4
190
+ q.seplist(first(preview)) {|v|
191
+ q.pp v
192
+ }
193
+ q.comma_breakable
194
+ q.text "... (#{size - preview} more)"
195
+ end
196
+ }
197
+ end
198
+
199
+ def inspect
200
+ "#<#{self.class} (#{size} objects)>"
201
+ end
202
+
203
+ def to_a
204
+ @objects
205
+ end
206
+ alias to_ary to_a
207
+ end
208
+
64
209
  class HeapObject
65
210
  attr_reader :heap, :json
66
211
 
@@ -73,6 +218,10 @@ class Sheap
73
218
  @json[/"type":"([A-Z]+)"/, 1]
74
219
  end
75
220
 
221
+ def root?
222
+ @json.include?('"type":"ROOT"')
223
+ end
224
+
76
225
  def address
77
226
  @json[/"address":"(0x[0-9a-f]+)"/, 1] || @json[/"root":"([a-z_]+)"/, 1]
78
227
  end
@@ -87,13 +236,19 @@ class Sheap
87
236
  end
88
237
 
89
238
  def references
90
- referenced_addrs.map do |addr|
91
- @heap.at(addr)
92
- end
239
+ HeapObjectCollection.new(
240
+ referenced_addrs.map do |addr|
241
+ @heap.at(addr)
242
+ end,
243
+ heap
244
+ )
93
245
  end
94
246
 
95
247
  def inverse_references
96
- @heap.inverse_references[address] || EMPTY_ARRAY
248
+ HeapObjectCollection.new(
249
+ (@heap.inverse_references[address] || EMPTY_ARRAY),
250
+ heap
251
+ )
97
252
  end
98
253
 
99
254
  def data
@@ -109,17 +264,21 @@ class Sheap
109
264
  end
110
265
 
111
266
  def imemo_type
112
- @json[/"imemo_type":"([a-z]+)"/, 1]
267
+ @json[/"imemo_type":"([a-z_]+)"/, 1]
113
268
  end
114
269
 
115
270
  def struct
116
- @json[/"struct":"([a-zA-Z]+)"/, 1]
271
+ @json[/"struct":"([^"]+)"/, 1]
117
272
  end
118
273
 
119
274
  def wb_protected?
120
275
  @json.include?('"wb_protected":true')
121
276
  end
122
277
 
278
+ def old?
279
+ @json.include?('"old":true')
280
+ end
281
+
123
282
  def name
124
283
  data["name"]
125
284
  end
@@ -133,23 +292,30 @@ class Sheap
133
292
  heap.instances_of(self)
134
293
  end
135
294
 
295
+ def superclass
296
+ heap.at(data["superclass"])
297
+ end
298
+
136
299
  def inspect
137
300
  type_str = self.type_str
138
- s = +"<#{type_str} #{address}"
301
+ s = +"<#{type_str} #{address} #{inspect_hint}>"
302
+ end
139
303
 
304
+ def inspect_hint
305
+ s = +""
140
306
  case type_str
141
307
  when "CLASS"
142
- s << " " << (name || "(anonymous)")
308
+ s << (name || "(anonymous)")
143
309
  when "MODULE"
144
- s << " " << (name || "(anonymous)")
310
+ s << (name || "(anonymous)")
145
311
  when "STRING"
146
- s << " " << data["value"].inspect
312
+ s << data["value"].inspect
147
313
  when "IMEMO"
148
- s << " " << (imemo_type || "unknown")
314
+ s << (imemo_type || "unknown")
149
315
  when "OBJECT"
150
- s << " " << (klass.name || "(#{klass.address})")
316
+ s << (klass.name || "(#{klass.address})")
151
317
  when "DATA"
152
- s << " " << struct.to_s
318
+ s << struct.to_s
153
319
  end
154
320
 
155
321
  refs = referenced_addrs
@@ -157,7 +323,42 @@ class Sheap
157
323
  s << " (#{referenced_addrs.size} refs)"
158
324
  end
159
325
 
160
- s << ">"
326
+ s
327
+ end
328
+
329
+ def pretty_print(q)
330
+ current_depth = q.current_group.depth
331
+ q.group(1, "#<#{type_str}", '>') do
332
+ q.text " "
333
+ q.text address
334
+ if current_depth <= 1
335
+ data = self.data
336
+ attributes = data.keys - ["address"]
337
+ q.seplist(attributes, lambda { q.text ',' }) {|v|
338
+ q.breakable
339
+ q.text v
340
+ q.text "="
341
+ q.group(1) {
342
+ q.breakable ''
343
+ case v
344
+ when "class"
345
+ q.pp klass
346
+ when "superclass"
347
+ q.pp superclass
348
+ when "flags"
349
+ q.text flags.keys.join("|")
350
+ when "references"
351
+ q.text "(#{referenced_addrs.size} refs)"
352
+ else
353
+ q.pp data[v]
354
+ end
355
+ }
356
+ }
357
+ else
358
+ q.breakable
359
+ q.text inspect_hint
360
+ end
361
+ end
161
362
  end
162
363
 
163
364
  def value
@@ -198,9 +399,27 @@ class Sheap
198
399
  def hash
199
400
  address.hash
200
401
  end
402
+
403
+ def method_missing(name, *args)
404
+ if value = data[name.to_s]
405
+ value
406
+ else
407
+ super
408
+ end
409
+ end
410
+
411
+ def respond_to_missing?(name, *)
412
+ data.key?(name.to_s) || super
413
+ end
414
+
415
+ def [](key)
416
+ data[key.to_s]
417
+ end
201
418
  end
202
419
 
203
420
  class Heap
421
+ include Collection
422
+
204
423
  attr_reader :filename
205
424
 
206
425
  def initialize(filename)
@@ -210,15 +429,29 @@ class Sheap
210
429
  def each_object
211
430
  return enum_for(__method__) unless block_given?
212
431
 
213
- File.open(filename) do |file|
432
+ open_file do |file|
214
433
  file.each_line do |json|
215
434
  yield HeapObject.new(self, json)
216
435
  end
217
436
  end
218
437
  end
219
438
 
439
+ def open_file(&block)
440
+ # FIXME: look for magic header
441
+ if filename.end_with?(".gz")
442
+ require "zlib"
443
+ Zlib::GzipReader.open(filename, &block)
444
+ else
445
+ File.open(filename, &block)
446
+ end
447
+ end
448
+
220
449
  def objects
221
- @objects ||= each_object.to_a
450
+ @objects ||= HeapObjectCollection.new(each_object.to_a, self)
451
+ end
452
+
453
+ def filter(&block)
454
+ objects.filter(&block)
222
455
  end
223
456
 
224
457
  def objects_by_addr
@@ -247,33 +480,54 @@ class Sheap
247
480
  end
248
481
  end
249
482
 
250
- def at(addr)
251
- objects_by_addr[addr]
483
+ def roots
484
+ of_type("ROOT")
252
485
  end
253
486
 
254
- def class_named(name)
255
- objects.select do |obj|
256
- obj.json.include?(name) &&
257
- obj.type_str == "CLASS" &&
258
- obj.name == name
487
+ # finds a path from `start_addresses` to `end_addresses` by following references
488
+ # breadth-first; with a single argument, the search starts at the heap roots
489
+ def find_path(start_addresses, end_addresses = nil)
490
+ if end_addresses.nil?
491
+ end_addresses = start_addresses
492
+ start_addresses = roots
259
493
  end
260
- end
494
+ start_addresses = Array(start_addresses)
495
+ end_addresses = Array(end_addresses)
261
496
 
262
- def instances_of(klass)
263
- addr = klass.address
264
- objects.select do |obj|
265
- obj.json.include?(addr) &&
266
- obj.class_addr == addr
497
+ q = start_addresses.map{|x| [x] }
498
+
499
+ visited = Set.new
500
+ while !q.empty?
501
+ current_path = q.shift
502
+ current_address = current_path.last
503
+
504
+ if end_addresses.include?(current_address)
505
+ return current_path.map{|addr| addr}
506
+ end
507
+
508
+ if !visited.include?(current_address)
509
+ visited.add(current_address)
510
+
511
+ current_references = current_address.references
512
+
513
+ current_references.each do |obj|
514
+ q.push([*current_path, obj])
515
+ end
516
+ end
267
517
  end
518
+ nil
268
519
  end
269
520
 
270
- def of_type(type)
271
- type = type.to_s.upcase
272
- objects.select { |o| o.type_str == type }
521
+ def at(addr)
522
+ objects_by_addr[addr]
273
523
  end
274
524
 
275
525
  def inspect
276
526
  "#<#{self.class} (#{objects.size} objects)>"
277
527
  end
528
+
529
+ def self.wrap(heap)
530
+ self === heap ? heap : new(heap)
531
+ end
278
532
  end
279
533
  end
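The `find_path` method added above performs a breadth-first search from the roots, following references until it reaches the target object. The same idea can be sketched on a toy graph, independent of Sheap's classes. The `references` hash, the addresses, and the root names in the sample data are made up for illustration, and this standalone function takes different arguments than the gem's method.

```ruby
require "set"

# `references` maps an address to the addresses it references (toy data).
# Returns the first path found from any root to `target`, or nil.
def find_path(references, roots, target)
  queue = roots.map { |root| [root] } # each queue entry is a whole path
  visited = Set.new

  until queue.empty?
    path = queue.shift
    current = path.last
    return path if current == target

    # Set#add? returns nil when the element was already present.
    next unless visited.add?(current)

    references.fetch(current, []).each do |addr|
      queue.push([*path, addr])
    end
  end

  nil
end

heap = {
  "global_tbl" => ["0x1"],
  "vm"         => ["0x2"],
  "0x1"        => ["0x3", "0x2"],
  "0x2"        => ["0x3"],
  "0x3"        => []
}

p find_path(heap, ["global_tbl", "vm"], "0x3")
# => ["global_tbl", "0x1", "0x3"]
```

Because the search is breadth-first, the path returned is one of the shortest by number of hops, which is usually the easiest one to read when asking why an object is still retained.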
data/tmp/.keep ADDED
File without changes
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: sheap
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.3.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - John Hawthorn
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2023-10-13 00:00:00.000000000 Z
11
+ date: 2024-05-26 00:00:00.000000000 Z
12
12
  dependencies: []
13
13
  description: A set of helpers for analyzing the output of ObjectSpace.dump_all
14
14
  email:
@@ -26,6 +26,7 @@ files:
26
26
  - exe/sheap
27
27
  - lib/sheap.rb
28
28
  - lib/sheap/version.rb
29
+ - tmp/.keep
29
30
  homepage: https://github.com/jhawthorn/sheap
30
31
  licenses:
31
32
  - MIT
@@ -48,7 +49,7 @@ required_rubygems_version: !ruby/object:Gem::Requirement
48
49
  - !ruby/object:Gem::Version
49
50
  version: '0'
50
51
  requirements: []
51
- rubygems_version: 3.4.10
52
+ rubygems_version: 3.5.3
52
53
  signing_key:
53
54
  specification_version: 4
54
55
  summary: A helpers for heap dumps