ruby-prof 2.0.4 → 2.0.5
- checksums.yaml +4 -4
- data/CHANGELOG.md +6 -2
- data/lib/ruby-prof/printers/flame_graph_printer.rb +80 -78
- data/lib/ruby-prof/version.rb +1 -1
- metadata +4 -85
- data/Rakefile +0 -98
- data/docs/advanced-usage.md +0 -132
- data/docs/alternatives.md +0 -98
- data/docs/architecture.md +0 -304
- data/docs/best-practices.md +0 -27
- data/docs/getting-started.md +0 -130
- data/docs/history.md +0 -11
- data/docs/index.md +0 -45
- data/docs/profiling-rails.md +0 -64
- data/docs/public/examples/example.rb +0 -33
- data/docs/public/examples/generate_reports.rb +0 -92
- data/docs/public/examples/reports/call_info.txt +0 -27
- data/docs/public/examples/reports/call_stack.html +0 -835
- data/docs/public/examples/reports/callgrind.out +0 -150
- data/docs/public/examples/reports/flame_graph.html +0 -408
- data/docs/public/examples/reports/flat.txt +0 -45
- data/docs/public/examples/reports/graph.dot +0 -129
- data/docs/public/examples/reports/graph.html +0 -1319
- data/docs/public/examples/reports/graph.txt +0 -100
- data/docs/public/examples/reports/graphviz_viewer.html +0 -1
- data/docs/public/images/call_stack.png +0 -0
- data/docs/public/images/class_diagram.png +0 -0
- data/docs/public/images/dot_printer.png +0 -0
- data/docs/public/images/flame_graph.png +0 -0
- data/docs/public/images/flat.png +0 -0
- data/docs/public/images/graph.png +0 -0
- data/docs/public/images/graph_html.png +0 -0
- data/docs/public/images/ruby-prof-logo.svg +0 -1
- data/docs/reports.md +0 -151
- data/docs/stylesheets/extra.css +0 -80
- data/ruby-prof.gemspec +0 -66
- data/test/abstract_printer_test.rb +0 -25
- data/test/alias_test.rb +0 -203
- data/test/call_tree_builder.rb +0 -126
- data/test/call_tree_test.rb +0 -94
- data/test/call_tree_visitor_test.rb +0 -27
- data/test/call_trees_test.rb +0 -66
- data/test/duplicate_names_test.rb +0 -32
- data/test/dynamic_method_test.rb +0 -50
- data/test/enumerable_test.rb +0 -23
- data/test/exceptions_test.rb +0 -24
- data/test/exclude_methods_test.rb +0 -363
- data/test/exclude_threads_test.rb +0 -48
- data/test/fiber_test.rb +0 -195
- data/test/gc_test.rb +0 -104
- data/test/inverse_call_tree_test.rb +0 -174
- data/test/line_number_test.rb +0 -563
- data/test/marshal_test.rb +0 -144
- data/test/measure_allocations.rb +0 -26
- data/test/measure_allocations_test.rb +0 -1511
- data/test/measure_process_time_test.rb +0 -3286
- data/test/measure_times.rb +0 -56
- data/test/measure_wall_time_test.rb +0 -774
- data/test/measurement_test.rb +0 -82
- data/test/merge_test.rb +0 -146
- data/test/method_info_test.rb +0 -100
- data/test/multi_printer_test.rb +0 -52
- data/test/no_method_class_test.rb +0 -15
- data/test/pause_resume_test.rb +0 -171
- data/test/prime.rb +0 -54
- data/test/prime_script.rb +0 -6
- data/test/printer_call_stack_test.rb +0 -28
- data/test/printer_call_tree_test.rb +0 -30
- data/test/printer_flame_graph_test.rb +0 -82
- data/test/printer_flat_test.rb +0 -110
- data/test/printer_graph_html_test.rb +0 -62
- data/test/printer_graph_test.rb +0 -42
- data/test/printers_test.rb +0 -162
- data/test/printing_recursive_graph_test.rb +0 -81
- data/test/profile_test.rb +0 -101
- data/test/rack_test.rb +0 -103
- data/test/recursive_test.rb +0 -796
- data/test/scheduler.rb +0 -367
- data/test/singleton_test.rb +0 -39
- data/test/stack_printer_test.rb +0 -61
- data/test/start_stop_test.rb +0 -106
- data/test/test_helper.rb +0 -24
- data/test/thread_test.rb +0 -229
- data/test/unique_call_path_test.rb +0 -123
- data/test/yarv_test.rb +0 -56
data/docs/architecture.md
DELETED
|
@@ -1,304 +0,0 @@
|
|
|
1
|
-
# Architecture
|
|
2
|
-
|
|
3
|
-
## Overview
|
|
4
|
-
|
|
5
|
-
ruby-prof is a C extension that uses Ruby's [TracePoint](https://docs.ruby-lang.org/en/master/TracePoint.html) API to intercept method calls and returns. Every time a method is entered or exited, ruby-prof records timing and (optionally) allocation data. This tracing approach means ruby-prof captures every method invocation, giving exact call counts and complete call graphs.
|
|
6
|
-
|
|
7
|
-
The diagram below shows the main classes that make up ruby-prof:
|
|
8
|
-
|
|
9
|
-
```mermaid
|
|
10
|
-
classDiagram
|
|
11
|
-
Profile "1" *-- "1" Measurer
|
|
12
|
-
Profile "1" *-- "*" Thread
|
|
13
|
-
Thread "1" *-- "1" Stack
|
|
14
|
-
Thread "1" *-- "*" MethodInfo
|
|
15
|
-
Thread "1" *-- "1" CallTree
|
|
16
|
-
Stack "1" o-- "*" Frame
|
|
17
|
-
Frame --> CallTree
|
|
18
|
-
CallTree "1" *-- "1" Measurement
|
|
19
|
-
CallTree --> MethodInfo : target
|
|
20
|
-
MethodInfo "1" *-- "1" CallTrees
|
|
21
|
-
MethodInfo "1" *-- "1" Measurement
|
|
22
|
-
MethodInfo "1" *-- "*" Allocation
|
|
23
|
-
CallTrees o-- "*" CallTree
|
|
24
|
-
|
|
25
|
-
class Profile {
|
|
26
|
-
+threads: Hash
|
|
27
|
-
+measurer: Measurer
|
|
28
|
-
}
|
|
29
|
-
class Measurer {
|
|
30
|
-
+mode: MeasurerMode
|
|
31
|
-
+track_allocations: boolean
|
|
32
|
-
+multiplier: double
|
|
33
|
-
+measure: function pointer
|
|
34
|
-
}
|
|
35
|
-
class Thread {
|
|
36
|
-
+methods: Hash
|
|
37
|
-
+stack: Stack
|
|
38
|
-
+callTree: CallTree
|
|
39
|
-
}
|
|
40
|
-
class Stack {
|
|
41
|
-
+frames: Array
|
|
42
|
-
}
|
|
43
|
-
class Frame {
|
|
44
|
-
+callTree: CallTree
|
|
45
|
-
}
|
|
46
|
-
class CallTree {
|
|
47
|
-
+parent: CallTree
|
|
48
|
-
+children: Hash
|
|
49
|
-
+target: MethodInfo
|
|
50
|
-
+measurement: Measurement
|
|
51
|
-
}
|
|
52
|
-
class MethodInfo {
|
|
53
|
-
+allocations: Hash
|
|
54
|
-
+callTrees: CallTrees
|
|
55
|
-
+measurement: Measurement
|
|
56
|
-
}
|
|
57
|
-
class Measurement {
|
|
58
|
-
+total_time: double
|
|
59
|
-
+self_time: double
|
|
60
|
-
+wait_time: double
|
|
61
|
-
+called: integer
|
|
62
|
-
}
|
|
63
|
-
class Allocation {
|
|
64
|
-
+count: integer
|
|
65
|
-
+source_file: string
|
|
66
|
-
+source_line: int
|
|
67
|
-
+klass: VALUE
|
|
68
|
-
}
|
|
69
|
-
class CallTrees {
|
|
70
|
-
+callTrees: Array
|
|
71
|
-
}
|
|
72
|
-
```
|
|
73
|
-
|
|
74
|
-
## Profile
|
|
75
|
-
|
|
76
|
-
Profile is the top-level object returned by a profiling run:
|
|
77
|
-
|
|
78
|
-
```ruby
|
|
79
|
-
profile = RubyProf::Profile.profile do
|
|
80
|
-
...
|
|
81
|
-
end
|
|
82
|
-
```
|
|
83
|
-
|
|
84
|
-
A Profile owns a Measurer that determines what is being measured, and a collection of Threads representing each thread (or fiber) that was active during profiling.
|
|
85
|
-
|
|
86
|
-
## Measurer and Measurement
|
|
87
|
-
|
|
88
|
-
The **Measurer** controls what ruby-prof measures. It holds a function pointer that is called on every method entry and exit to take a measurement. The three modes are:
|
|
89
|
-
|
|
90
|
-
- **Wall time** — elapsed real time
|
|
91
|
-
- **Process time** — CPU time consumed by the process (excludes time spent in sleep or I/O)
|
|
92
|
-
- **Allocations** — number of objects allocated
|
|
93
|
-
|
|
94
|
-
Each CallTree and MethodInfo holds a **Measurement** that accumulates the results: total time, self time (excluding children), wait time (time spent waiting on other threads), and call count.
|
|
95
|
-
|
|
96
|
-
## Thread
|
|
97
|
-
|
|
98
|
-
Each Thread tracks the methods called on that thread and owns the root of a call tree. It also maintains an internal Stack of Frames used during profiling to track the current call depth.
|
|
99
|
-
|
|
100
|
-
**Stack** and **Frame** are transient — they exist only while profiling is active. A Frame records timing data for a single method invocation on the stack, including start time and time spent in child calls. When a method returns, its Frame is popped and the accumulated timing is transferred to the corresponding CallTree node.
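The frame bookkeeping described above can be sketched in plain Ruby. This is a toy model under stated assumptions, not ruby-prof's C implementation: `ToyFrame`, `ToyStack`, and the `total_time`/`self_time` hash keys are hypothetical names, timestamps are passed in by hand instead of read from a clock, results are keyed by method name rather than by CallTree node, and wait time is ignored.

```ruby
# Toy model of the transient frame stack; names are illustrative, not ruby-prof's API.
ToyFrame = Struct.new(:name, :start_time, :child_time)

class ToyStack
  attr_reader :results

  def initialize
    @frames = []
    # Stand-in for the per-node Measurement: accumulated total and self time.
    @results = Hash.new { |hash, name| hash[name] = { total_time: 0.0, self_time: 0.0 } }
  end

  # Method entry: push a frame stamped with the current time.
  def push(name, now)
    @frames.push(ToyFrame.new(name, now, 0.0))
  end

  # Method exit: pop the frame and transfer its timing to the results.
  def pop(now)
    frame = @frames.pop
    total = now - frame.start_time
    @results[frame.name][:total_time] += total
    @results[frame.name][:self_time] += total - frame.child_time
    # Charge this call's total time to the caller's child time.
    @frames.last.child_time += total unless @frames.empty?
    total
  end
end

stack = ToyStack.new
stack.push("process", 0.0)
stack.push("validate", 1.0)
stack.pop(3.0)   # validate ran for 2.0
stack.push("save", 3.0)
stack.pop(4.5)   # save ran for 1.5
stack.pop(5.0)   # process ran for 5.0, of which 3.5 was spent in children
p stack.results["process"]
```

Tracking child time on the frame is what lets self time be computed at pop time without walking the tree afterwards.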
|
|
101
|
-
|
|
102
|
-
## CallTree and MethodInfo
|
|
103
|
-
|
|
104
|
-
These two classes are central to ruby-prof and represent two different views of the same profiling data:
|
|
105
|
-
|
|
106
|
-
- **CallTree** records the calling structure — which method called which, forming a tree. Each node has a parent, children, and a reference to its target MethodInfo. A method that is called from two different call sites will have two separate CallTree nodes, each with its own Measurement. Recursive methods are handled by creating a chain of CallTree nodes (see [Recursion](#recursion) below).
|
|
107
|
-
|
|
108
|
-
- **MethodInfo** represents a single method regardless of where it was called from. It aggregates data across all call sites. Each MethodInfo holds a CallTrees collection that links back to every CallTree node that invoked that method, providing both caller and callee information.
|
|
109
|
-
|
|
110
|
-
This separation is what allows ruby-prof to generate both call graph reports (which show calling relationships) and flat reports (which show per-method totals).
|
|
111
|
-
|
|
112
|
-
## Building the Call Tree
|
|
113
|
-
|
|
114
|
-
This section describes how the call tree is constructed during profiling.
|
|
115
|
-
|
|
116
|
-
Consider profiling this code:
|
|
117
|
-
|
|
118
|
-
```ruby
|
|
119
|
-
def process
|
|
120
|
-
validate
|
|
121
|
-
save
|
|
122
|
-
end
|
|
123
|
-
|
|
124
|
-
def save
|
|
125
|
-
validate
|
|
126
|
-
write
|
|
127
|
-
end
|
|
128
|
-
```
|
|
129
|
-
|
|
130
|
-
The resulting CallTree looks like:
|
|
131
|
-
|
|
132
|
-
```mermaid
|
|
133
|
-
graph TD
|
|
134
|
-
A{{"[global]"}} -->|child| B{{"process"}}
|
|
135
|
-
B -->|child| C{{"validate"}}
|
|
136
|
-
B -->|child| D{{"save"}}
|
|
137
|
-
D -->|child| E{{"validate"}}
|
|
138
|
-
D -->|child| F{{"write"}}
|
|
139
|
-
|
|
140
|
-
B -.->|parent| A
|
|
141
|
-
C -.->|parent| B
|
|
142
|
-
D -.->|parent| B
|
|
143
|
-
E -.->|parent| D
|
|
144
|
-
F -.->|parent| D
|
|
145
|
-
```
|
|
146
|
-
|
|
147
|
-
Notice that `validate` appears as two separate CallTree nodes — one under `process` and one under `save` — because it was called from two different call sites. Each has its own parent and its own Measurement. Both nodes reference the same `validate` MethodInfo, which aggregates the data across both call sites.
|
|
148
|
-
|
|
149
|
-
The following diagram shows both views together. CallTree nodes (hexagons) reference their target MethodInfo (rectangles) via dashed arrows:
|
|
150
|
-
|
|
151
|
-
```mermaid
|
|
152
|
-
graph TD
|
|
153
|
-
classDef calltree fill:#E8F4FD,stroke:#2E86C1
|
|
154
|
-
classDef methodinfo fill:#FADBD8,stroke:#E74C3C
|
|
155
|
-
|
|
156
|
-
CT1{{"[global]"}}:::calltree --> CT2{{"process"}}:::calltree
|
|
157
|
-
CT2 --> CT3{{"validate"}}:::calltree
|
|
158
|
-
CT2 --> CT4{{"save"}}:::calltree
|
|
159
|
-
CT4 --> CT5{{"validate"}}:::calltree
|
|
160
|
-
CT4 --> CT6{{"write"}}:::calltree
|
|
161
|
-
|
|
162
|
-
CT1 -.->|target| M1["[global]"]:::methodinfo
|
|
163
|
-
CT2 -.->|target| M2["process"]:::methodinfo
|
|
164
|
-
CT3 -.->|target| M3["validate"]:::methodinfo
|
|
165
|
-
CT4 -.->|target| M4["save"]:::methodinfo
|
|
166
|
-
CT5 -.->|target| M3
|
|
167
|
-
CT6 -.->|target| M5["write"]:::methodinfo
|
|
168
|
-
|
|
169
|
-
M3 -.->|call_trees| CT3
|
|
170
|
-
M3 -.->|call_trees| CT5
|
|
171
|
-
```
|
|
172
|
-
|
|
173
|
-
Both `validate` CallTree nodes point to the same `validate` MethodInfo via `target`. The MethodInfo points back to its CallTree nodes via `call_trees` — a flat array of every CallTree node that invoked this method. From this array, `callers` and `callees` are derived: `callers` walks each node's parent, and `callees` walks each node's children. Both are aggregated by method to produce a single entry per caller or callee method.
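As a rough illustration of how `callers` and `callees` can be derived from the flat `call_trees` array, here is a minimal pure-Ruby sketch. `ToyMethod`, `ToyNode`, and the helper methods are hypothetical stand-ins rather than ruby-prof's classes, and aggregation is reduced to de-duplicating method names (the real aggregation also combines measurements).

```ruby
# Toy stand-ins for MethodInfo and CallTree; not ruby-prof's real classes.
ToyMethod = Struct.new(:name, :call_trees)
ToyNode = Struct.new(:target, :parent, :children)

# Create a node for `name` under `parent` and register it with its method.
def add_toy_node(methods, name, parent = nil)
  method = (methods[name] ||= ToyMethod.new(name, []))
  node = ToyNode.new(method, parent, [])
  method.call_trees << node
  parent.children << node if parent
  node
end

# Callers: walk each node's parent, then aggregate by method name.
def toy_callers(method)
  method.call_trees.map(&:parent).compact.map { |node| node.target.name }.uniq
end

# Callees: walk each node's children, then aggregate by method name.
def toy_callees(method)
  method.call_trees.flat_map(&:children).map { |node| node.target.name }.uniq
end

# Rebuild the process/validate/save/write example.
def build_toy_methods
  methods = {}
  root = add_toy_node(methods, "[global]")
  process = add_toy_node(methods, "process", root)
  add_toy_node(methods, "validate", process)
  save = add_toy_node(methods, "save", process)
  add_toy_node(methods, "validate", save)
  add_toy_node(methods, "write", save)
  methods
end

methods = build_toy_methods
p toy_callers(methods["validate"])  # => ["process", "save"]
p toy_callees(methods["save"])      # => ["validate", "write"]
```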
|
|
174
|
-
|
|
175
|
-
### Parents
|
|
176
|
-
|
|
177
|
-
Each CallTree node has exactly one parent, set at creation. When a method call event fires, the profiler determines the parent from the current frame on the stack:
|
|
178
|
-
|
|
179
|
-
```c
|
|
180
|
-
parent_call_tree = frame->call_tree;
|
|
181
|
-
```
|
|
182
|
-
|
|
183
|
-
The parent is the CallTree node that was active (top of the stack) when this method was called. The root CallTree node for each thread has no parent.
|
|
184
|
-
|
|
185
|
-
### Children
|
|
186
|
-
|
|
187
|
-
A CallTree node's children are stored in a hash table keyed by method. The profiler looks up the method in the current parent's children. If a child already exists for that method, the existing CallTree node is **reused** and its `called` count increments. Otherwise a new node is created:
|
|
188
|
-
|
|
189
|
-
```c
|
|
190
|
-
call_tree = call_tree_table_lookup(parent_call_tree->children, method->key);
|
|
191
|
-
|
|
192
|
-
if (!call_tree)
|
|
193
|
-
{
|
|
194
|
-
call_tree = prof_call_tree_create(method, parent_call_tree, ...);
|
|
195
|
-
prof_call_tree_add_child(parent_call_tree, call_tree);
|
|
196
|
-
}
|
|
197
|
-
```
|
|
198
|
-
|
|
199
|
-
This means each parent has **one** child CallTree per method. For example, if `foo` calls `bar` ten times, there is a single `bar` CallTree node under `foo` with `called: 10`.
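The same lookup-or-create behavior can be modeled in a few lines of Ruby. This is only a sketch: `ToyCallTree` and its `enter` and `record_call` methods are hypothetical, and children are keyed by method name here, whereas the C code keys them by `method->key`.

```ruby
# Toy model of a call tree node; not ruby-prof's actual implementation.
class ToyCallTree
  attr_reader :method_name, :parent, :children, :called

  def initialize(method_name, parent = nil)
    @method_name = method_name
    @parent = parent
    @children = {}   # keyed by method name: at most one child per method
    @called = 0
  end

  # Reuse the existing child for this method, or create one on a miss.
  def enter(method_name)
    child = (@children[method_name] ||= ToyCallTree.new(method_name, self))
    child.record_call
    child
  end

  def record_call
    @called += 1
  end
end

foo = ToyCallTree.new("foo")
10.times { foo.enter("bar") }
puts foo.children.size           # prints 1
puts foo.children["bar"].called  # prints 10
```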
|
|
200
|
-
|
|
201
|
-
## Allocation
|
|
202
|
-
|
|
203
|
-
When allocation tracking is enabled, each MethodInfo records the objects it allocated. An Allocation tracks the class of object created, the source location, and the count.
|
|
204
|
-
|
|
205
|
-
## Memory Management
|
|
206
|
-
|
|
207
|
-
The Profile object is responsible for managing the memory of its child objects, which are C structures. When a Profile is garbage collected, it recursively frees all its objects. In the class diagram, composition relationships (filled diamond) indicate ownership — a Profile frees its Threads, Threads free their CallTrees and MethodInfo instances, and so on.
|
|
208
|
-
|
|
209
|
-
ruby-prof keeps a Profile alive as long as there are live references to any of its MethodInfo or CallTree objects. This is done via Ruby's GC mark phase: CallTree instances mark their associated MethodInfo, and MethodInfo instances mark their owning Profile.
|
|
210
|
-
|
|
211
|
-
Starting with version 1.5, it is possible to create Thread, CallTree and MethodInfo instances from Ruby (this was added to support testing). These Ruby-created objects are owned by Ruby's garbage collector rather than the C extension. An internal ownership flag on each instance tracks who is responsible for freeing it.
|
|
212
|
-
|
|
213
|
-
## Recursion
|
|
214
|
-
|
|
215
|
-
The call tree handles recursion naturally — each recursive call has a different parent, so new nodes are created at each level just like any other method call. The only special handling is in timing calculation, where care is needed to avoid double-counting.
|
|
216
|
-
|
|
217
|
-
### How Recursive Calls Create New Nodes
|
|
218
|
-
|
|
219
|
-
Consider a simple recursive method:
|
|
220
|
-
|
|
221
|
-
```ruby
|
|
222
|
-
def simple(n)
|
|
223
|
-
sleep(1)
|
|
224
|
-
return if n == 0
|
|
225
|
-
simple(n - 1)
|
|
226
|
-
end
|
|
227
|
-
|
|
228
|
-
simple(2)
|
|
229
|
-
```
|
|
230
|
-
|
|
231
|
-
Each recursive call to `simple` has a different parent CallTree node, so the lookup in the parent's children misses and a new node is created at each level:
|
|
232
|
-
|
|
233
|
-
```mermaid
|
|
234
|
-
graph TD
|
|
235
|
-
classDef calltree fill:#E8F4FD,stroke:#2E86C1
|
|
236
|
-
classDef methodinfo fill:#FADBD8,stroke:#E74C3C
|
|
237
|
-
|
|
238
|
-
A{{"[global]"}}:::calltree --> B{{"simple"}}:::calltree
|
|
239
|
-
B --> C{{"sleep"}}:::calltree
|
|
240
|
-
B --> D{{"simple"}}:::calltree
|
|
241
|
-
D --> E{{"sleep"}}:::calltree
|
|
242
|
-
D --> F{{"simple"}}:::calltree
|
|
243
|
-
F --> G{{"sleep"}}:::calltree
|
|
244
|
-
|
|
245
|
-
B -.-> A
|
|
246
|
-
C -.-> B
|
|
247
|
-
D -.-> B
|
|
248
|
-
E -.-> D
|
|
249
|
-
F -.-> D
|
|
250
|
-
G -.-> F
|
|
251
|
-
|
|
252
|
-
B -.-> M["simple"]:::methodinfo
|
|
253
|
-
D -.-> M
|
|
254
|
-
F -.-> M
|
|
255
|
-
```
|
|
256
|
-
|
|
257
|
-
The CallTree is always acyclic — each recursive call creates a new node at a deeper level. However, there is a single `simple` MethodInfo (red rectangle at the bottom of the diagram), and each CallTree node points to it.
|
|
258
|
-
|
|
259
|
-
### The Visits Counter
|
|
260
|
-
|
|
261
|
-
Both CallTree and MethodInfo have a `visits` field that tracks how many times that node or method is currently on the stack. This counter is incremented on method entry and decremented on method exit:
|
|
262
|
-
|
|
263
|
-
```c
|
|
264
|
-
// Method entry (prof_frame_push):
|
|
265
|
-
call_tree->visits++;
|
|
266
|
-
if (call_tree->method->visits > 0)
|
|
267
|
-
call_tree->method->recursive = true;
|
|
268
|
-
call_tree->method->visits++;
|
|
269
|
-
|
|
270
|
-
// Method exit (prof_frame_pop):
|
|
271
|
-
call_tree->visits--;
|
|
272
|
-
call_tree->method->visits--;
|
|
273
|
-
```
|
|
274
|
-
|
|
275
|
-
The MethodInfo `visits` counter serves two purposes:
|
|
276
|
-
|
|
277
|
-
1. Detecting recursion — if `method->visits > 0` when a method is entered, the method is currently an ancestor of itself in the call stack and is marked recursive.
|
|
278
|
-
|
|
279
|
-
2. Correct total_time accounting — total time is only added to the Measurement when a node's `visits` drops back to 1, meaning it is the outermost invocation:
|
|
280
|
-
|
|
281
|
-
```c
|
|
282
|
-
// Only accumulate total_time at the outermost visit
|
|
283
|
-
if (call_tree->visits == 1)
|
|
284
|
-
call_tree->measurement->total_time += total_time;
|
|
285
|
-
|
|
286
|
-
if (call_tree->method->visits == 1)
|
|
287
|
-
call_tree->method->measurement->total_time += total_time;
|
|
288
|
-
```
|
|
289
|
-
|
|
290
|
-
Without this guard, total time would be double-counted. Consider `simple(2)` with 1-second sleeps. The outermost call takes ~3 seconds total, the middle call ~2 seconds, and the innermost ~1 second. Naively summing all three would give 6 seconds, but the actual elapsed time is only 3 seconds. By only recording total_time at the outermost visit, the MethodInfo correctly reports 3 seconds.
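The arithmetic can be checked with a small simulation. The sketch below models only the MethodInfo-level `visits` counter; `ToyMethodTotals` and `simulate_simple` are hypothetical names, and the elapsed times are supplied by hand (1, 2, and 3 seconds for the innermost, middle, and outermost calls) instead of being measured.

```ruby
# Toy model of the `visits` guard; names are illustrative, not ruby-prof's API.
class ToyMethodTotals
  attr_reader :guarded_total, :naive_total, :recursive

  def initialize
    @visits = 0
    @guarded_total = 0.0
    @naive_total = 0.0
    @recursive = false
  end

  # Method entry: already on the stack means the method is recursive.
  def enter
    @recursive = true if @visits > 0
    @visits += 1
  end

  # Method exit: only the outermost invocation contributes to the guarded total.
  def leave(elapsed)
    @naive_total += elapsed
    @guarded_total += elapsed if @visits == 1
    @visits -= 1
  end
end

# Simulate simple(depth): depth + 1 nested calls, each sleeping `sleep_seconds`.
def simulate_simple(depth, sleep_seconds = 1.0)
  totals = ToyMethodTotals.new
  (depth + 1).times { totals.enter }
  # Frames unwind innermost first, so elapsed times grow: 1s, 2s, 3s, ...
  1.upto(depth + 1) { |level| totals.leave(level * sleep_seconds) }
  totals
end

totals = simulate_simple(2)
puts totals.naive_total    # prints 6.0
puts totals.guarded_total  # prints 3.0
```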
|
|
291
|
-
|
|
292
|
-
### Recursion at the MethodInfo Level
|
|
293
|
-
|
|
294
|
-
At the MethodInfo level, recursive methods create cycles. A recursive `simple` method has itself as both a caller and a callee:
|
|
295
|
-
|
|
296
|
-
```mermaid
|
|
297
|
-
graph TD
|
|
298
|
-
classDef methodinfo fill:#FADBD8,stroke:#E74C3C
|
|
299
|
-
A["global"]:::methodinfo -->|"calls"| B["simple"]:::methodinfo
|
|
300
|
-
B -->|"calls"| C["sleep"]:::methodinfo
|
|
301
|
-
B -->|"calls"| B
|
|
302
|
-
```
|
|
303
|
-
|
|
304
|
-
This is why MethodInfo has a `recursive?` flag — printers that operate on MethodInfo (such as the graph printer) need to be aware of these cycles. However, the underlying CallTree structure is always a tree with no structural cycles.
|
data/docs/best-practices.md
DELETED
|
@@ -1,27 +0,0 @@
|
|
|
1
|
-
# Best Practices
|
|
2
|
-
|
|
3
|
-
Profiling gives you amazing insight into your program. What you think is slow is almost never what is actually slow. Below are some best practices to help unlock this power.
|
|
4
|
-
|
|
5
|
-
## Start With Realistic Runs
|
|
6
|
-
|
|
7
|
-
When profiling data-heavy work, start with a smaller sample of the data instead of the full dataset. Profile a portion first (for example 1% or 10%). It is faster, easier to understand, and often enough to find the main bottleneck. Once you have a likely fix, validate it with a larger and more realistic workload so you know the result still holds in context. Run the same profile more than once and warm up before you measure so one-time startup work does not dominate the report.
|
|
8
|
-
|
|
9
|
-
## Choose The Right Measurement Mode
|
|
10
|
-
|
|
11
|
-
Pick the measurement mode based on the question you are asking. Use `WALL_TIME` for end-to-end latency, `PROCESS_TIME` for CPU-focused work, and `ALLOCATIONS` when object churn is the concern. See [Measurement Mode](advanced-usage.md#measurement-mode) for details.
|
|
12
|
-
|
|
13
|
-
## Reduce Noise Before Deep Analysis
|
|
14
|
-
|
|
15
|
-
When framework internals or concurrency noise dominate output, narrow the scope first. Use `exclude_common` or explicit method exclusions, and use thread filtering (`include_threads` / `exclude_threads`) when needed. For highly concurrent workloads, merging worker results (`merge!` or Rack `merge_fibers: true`) can make trends much easier to read. See [Profiling Options](advanced-usage.md#profiling-options), [Method Exclusion](advanced-usage.md#method-exclusion), and [Merging Threads and Fibers](advanced-usage.md#merging-threads-and-fibers).
|
|
16
|
-
|
|
17
|
-
## Use Reports In A Sequence
|
|
18
|
-
|
|
19
|
-
Start with a quick summary, then drill down. In practice, this usually means using `FlatPrinter` to find hotspots, `GraphHtmlPrinter` (or `GraphPrinter`) to understand caller/callee relationships, and `FlameGraphPrinter` to validate dominant paths visually. See [Reports](reports.md), especially [Creating Reports](reports.md#creating-reports) and [Report Types](reports.md#report-types).
|
|
20
|
-
|
|
21
|
-
## Use Threshold Filters Early
|
|
22
|
-
|
|
23
|
-
Threshold filters are one of the fastest ways to make a large profile readable. Start with `min_percent` to hide low-impact methods in most printers. For `GraphHtmlPrinter`, use `min_time` when you want to drop methods below an absolute time cutoff. These filters help you focus on the code that actually moves total runtime.
|
|
24
|
-
|
|
25
|
-
## Compare Trends, Not Single Snapshots
|
|
26
|
-
|
|
27
|
-
Do not optimize based on one run unless the signal is overwhelming. Compare before/after profiles under the same workload, then prioritize repeated hot paths over one-off spikes.
|
data/docs/getting-started.md
DELETED
|
@@ -1,130 +0,0 @@
|
|
|
1
|
-
# Getting Started
|
|
2
|
-
|
|
3
|
-
There are three ways to use ruby-prof:
|
|
4
|
-
|
|
5
|
-
- command line
|
|
6
|
-
- convenience API
|
|
7
|
-
- core API
|
|
8
|
-
|
|
9
|
-
## Command Line
|
|
10
|
-
|
|
11
|
-
The easiest way to use ruby-prof is via the command line, which requires no modifications to your program. The basic usage is:
|
|
12
|
-
|
|
13
|
-
```
|
|
14
|
-
ruby-prof [options] <script.rb> [--] [script-options]
|
|
15
|
-
```
|
|
16
|
-
|
|
17
|
-
Here, `script.rb` is the program you want to profile.
|
|
18
|
-
|
|
19
|
-
For a full list of options, see the RubyProf::Cmd documentation or execute the following command:
|
|
20
|
-
|
|
21
|
-
```
|
|
22
|
-
ruby-prof -h
|
|
23
|
-
```
|
|
24
|
-
|
|
25
|
-
## Convenience API
|
|
26
|
-
|
|
27
|
-
The second way to use ruby-prof is via its convenience API. This requires small modifications to the program you want to profile:
|
|
28
|
-
|
|
29
|
-
```ruby
|
|
30
|
-
require 'ruby-prof'
|
|
31
|
-
|
|
32
|
-
profile = RubyProf::Profile.new
|
|
33
|
-
|
|
34
|
-
# profile the code
|
|
35
|
-
profile.start
|
|
36
|
-
# ... code to profile ...
|
|
37
|
-
result = profile.stop
|
|
38
|
-
|
|
39
|
-
# print a flat profile to text
|
|
40
|
-
printer = RubyProf::FlatPrinter.new(result)
|
|
41
|
-
printer.print(STDOUT)
|
|
42
|
-
```
|
|
43
|
-
|
|
44
|
-
Alternatively, you can use a block to tell ruby-prof what to profile:
|
|
45
|
-
|
|
46
|
-
```ruby
|
|
47
|
-
require 'ruby-prof'
|
|
48
|
-
|
|
49
|
-
# profile the code
|
|
50
|
-
result = RubyProf::Profile.profile do
|
|
51
|
-
# ... code to profile ...
|
|
52
|
-
end
|
|
53
|
-
|
|
54
|
-
# print a graph profile to text
|
|
55
|
-
printer = RubyProf::GraphPrinter.new(result)
|
|
56
|
-
printer.print(STDOUT)
|
|
57
|
-
```
|
|
58
|
-
|
|
59
|
-
ruby-prof also supports pausing and resuming profiling runs.
|
|
60
|
-
|
|
61
|
-
```ruby
|
|
62
|
-
require 'ruby-prof'
|
|
63
|
-
|
|
64
|
-
profile = RubyProf::Profile.new
|
|
65
|
-
|
|
66
|
-
# profile the code
|
|
67
|
-
profile.start
|
|
68
|
-
# ... code to profile ...
|
|
69
|
-
|
|
70
|
-
profile.pause
|
|
71
|
-
# ... other code ...
|
|
72
|
-
|
|
73
|
-
profile.resume
|
|
74
|
-
# ... code to profile ...
|
|
75
|
-
|
|
76
|
-
result = profile.stop
|
|
77
|
-
```
|
|
78
|
-
|
|
79
|
-
Note that `resume` only works if `start` has been called previously. `resume` can also take a block:
|
|
80
|
-
|
|
81
|
-
```ruby
|
|
82
|
-
require 'ruby-prof'
|
|
83
|
-
|
|
84
|
-
profile = RubyProf::Profile.new
|
|
85
|
-
|
|
86
|
-
# profile the code
|
|
87
|
-
profile.start
|
|
88
|
-
# ... code to profile ...
|
|
89
|
-
|
|
90
|
-
profile.pause
|
|
91
|
-
# ... other code ...
|
|
92
|
-
|
|
93
|
-
profile.resume do
|
|
94
|
-
# ... code to profile...
|
|
95
|
-
end
|
|
96
|
-
|
|
97
|
-
result = profile.stop
|
|
98
|
-
```
|
|
99
|
-
|
|
100
|
-
With this usage, `resume` automatically calls `pause` at the end of the block.
|
|
101
|
-
|
|
102
|
-
The `RubyProf::Profile.profile` method can take various options, which are described in [Profiling Options](advanced-usage.md#profiling-options).
|
|
103
|
-
|
|
104
|
-
## Core API
|
|
105
|
-
|
|
106
|
-
The convenience API is a wrapper around the `RubyProf::Profile` class. Using the Profile class directly provides additional functionality, such as [method exclusion](advanced-usage.md#method-exclusion).
|
|
107
|
-
|
|
108
|
-
To create a new profile:
|
|
109
|
-
|
|
110
|
-
```ruby
|
|
111
|
-
require 'ruby-prof'
|
|
112
|
-
|
|
113
|
-
profile = RubyProf::Profile.new(measure_mode: RubyProf::WALL_TIME)
|
|
114
|
-
result = profile.profile do
|
|
115
|
-
...
|
|
116
|
-
end
|
|
117
|
-
```
|
|
118
|
-
|
|
119
|
-
Once a profile is completed, you can either generate a [report](reports.md) via a printer or [save](advanced-usage.md#saving-results) the results for later analysis. For a list of profiling options, please see the [Profiling Options](advanced-usage.md#profiling-options) section.
|
|
120
|
-
If you are unsure which report to generate first, see [Report Types](reports.md#report-types).
|
|
121
|
-
|
|
122
|
-
|
|
123
|
-
However, using ruby-prof also comes with two caveats:
|
|
124
|
-
|
|
125
|
-
- To use ruby-prof you generally need to include a few lines of extra code in your program (although see [command line usage](getting-started.md#command-line))
|
|
126
|
-
- Using ruby-prof will cause your program to run slower (see [Performance](index.md#performance) section)
|
|
127
|
-
|
|
128
|
-
Most of the time, these two caveats are acceptable. But if you need to determine why a program running in production is slow or hung, a sampling profiler will be a better choice. Excellent choices include [stackprof](https://github.com/tmm1/stackprof) or [rbspy](https://rbspy.github.io/).
|
|
129
|
-
|
|
130
|
-
If you are only interested in memory usage, you may also want to check out the [memory_profiler](https://github.com/SamSaffron/memory_profiler) gem (although ruby-prof provides similar information).
|
data/docs/history.md
DELETED
|
@@ -1,11 +0,0 @@
|
|
|
1
|
-
# History
|
|
2
|
-
|
|
3
|
-
For a full list of changes between versions, see the [Changelog](changelog.md).
|
|
4
|
-
|
|
5
|
-
The first version of ruby-prof, 0.1.1, was released on March 22, 2005 by [Shugo Maeda](https://shugo.net/). The original [source](https://shugo.net/archive/ruby-prof/) code is still available on his website (it is not in the git history). ruby-prof was a vast improvement at the time, running 30 times faster than the original Ruby profiler.
|
|
6
|
-
|
|
7
|
-
Version [0.4.0](https://rubygems.org/gems/ruby-prof/versions/0.4.0) was the first version packaged as a Ruby gem. Version 0.4.0 also introduced Windows support, thread support and added a number of additional reports such as the graph report in HTML and the call graph report.
|
|
8
|
-
|
|
9
|
-
A number of versions were subsequently released, with a 1.0.0 [release](https://cfis.savagexi.com/2019/07/29/ruby-prof-1-0/) finally happening in July of 2019. Version 1.0.0 was a major rewrite that significantly improved performance, correctly profiled recursive methods, redesigned reports, added allocation/memory measurement support and introduced saving and reloading profiling results. Since then ruby-prof has continued to evolve along with Ruby with 19 releases.
|
|
10
|
-
|
|
11
|
-
Version 2.0.0 marks the 20th release of ruby-prof since the 1.0.0 release. Version 2.0.0 supports Ruby 4 and includes new flame/icicle graph support, revamped reports and improved documentation. The reason for the jump to 2.0.0 is that memory-size profiling has been removed due to changes in Ruby 4.0.0. The old compatibility API was also removed.
|
data/docs/index.md
DELETED
|
@@ -1,45 +0,0 @@
|
|
|
1
|
-
# ruby-prof
|
|
2
|
-
|
|
3
|
-
ruby-prof is a [tracing](./alternatives.md#tracing-vs-sampling) profiler for MRI Ruby with a long [history](./history.md) that dates back to 2005! Its features include:
|
|
4
|
-
|
|
5
|
-
- Measurement Modes - ruby-prof can measure program [wall time](advanced-usage.md#wall-time), [process time](advanced-usage.md#process-time) and [object allocations](advanced-usage.md#object-allocations).
|
|
6
|
-
- Reports - ruby-prof can generate [flat](reports.md#flat), [graph (text)](reports.md#graph-text), [graph (HTML)](reports.md#graph-html), [flame graph](reports.md#flame-graph), [call stack](reports.md#call-stack), [graphviz](reports.md#graphviz), [cachegrind](reports.md#cachegrind), and [call info](reports.md#call-info-report) reports.
|
|
7
|
-
- Threads - supports profiling multiple threads simultaneously.
|
|
8
|
-
- Fibers - supports profiling multiple fibers simultaneously.
|
|
9
|
-
- Merging - supports merging results across fibers or threads.
|
|
10
|
-
- Recursive - supports profiling recursive methods.
|
|
11
|
-
|
|
12
|
-

|
|
13
|
-
|
|
14
|
-
## Why ruby-prof?
|
|
15
|
-
|
|
16
|
-
ruby-prof is helpful if your program is slow and you want to know why! It can help you track down methods that are either slow or allocate a large number of objects. The results will often surprise you: when profiling, what you think you know almost always turns out to be wrong.
|
|
17
|
-
|
|
18
|
-
## Installation
|
|
19
|
-
To install ruby-prof:
|
|
20
|
-
|
|
21
|
-
```
|
|
22
|
-
gem install ruby-prof
|
|
23
|
-
```
|
|
24
|
-
|
|
25
|
-
If you are running Linux or Unix you'll need to have a C compiler installed so the extension can be built when it is installed. If you are running Windows, then you should install the Windows specific gem or install [devkit](https://rubyinstaller.org/add-ons/devkit.html).
|
|
26
|
-
|
|
27
|
-
ruby-prof requires Ruby 3.2.0 or higher. If you need to work with older Ruby versions then you can download an older version of ruby-prof.
|
|
28
|
-
|
|
29
|
-
## Performance
|
|
30
|
-
ruby-prof is a tracing profiler, not a sampling profiler, and thus will cause your program to run slower. Our tests show that the overhead varies considerably based on the code being profiled. Significant effort has been put into reducing this overhead, but most programs will run approximately twice as slow, while highly recursive programs (like the Fibonacci series test) may run up to five times slower.
|
|
31
|
-
|
|
32
|
-
## History
|
|
33
|
-
ruby-prof has been under continuous development since 2005 — see the full [History](history.md) page.
|
|
34
|
-
|
|
35
|
-
## API Documentation
|
|
36
|
-
|
|
37
|
-
API documentation for each class is available at the [ruby-prof API docs](https://ruby-prof.github.io/doc/index.html).
|
|
38
|
-
|
|
39
|
-
## License
|
|
40
|
-
|
|
41
|
-
See [LICENSE](../LICENSE) for license information.
|
|
42
|
-
|
|
43
|
-
## Development
|
|
44
|
-
|
|
45
|
-
Code is located at [github.com/ruby-prof/ruby-prof](https://github.com/ruby-prof/ruby-prof).
|
data/docs/profiling-rails.md
DELETED
|
@@ -1,64 +0,0 @@
|
|
|
1
|
-
# Profiling Rails
|
|
2
|
-
|
|
3
|
-
To profile a Rails application it is vital to run it using production-like settings (cache classes, cache view lookups, etc.). Otherwise, Rails dependency loading code will overwhelm any time spent in the application itself (our tests show that Rails dependency loading causes a roughly 6x slowdown). The best way to do this is to create a new Rails environment, `profile`.
|
|
4
|
-
|
|
5
|
-
To profile Rails:
|
|
6
|
-
|
|
7
|
-
1. Add ruby-prof to your Gemfile:
|
|
8
|
-
|
|
9
|
-
```ruby
|
|
10
|
-
group :profile do
|
|
11
|
-
gem 'ruby-prof'
|
|
12
|
-
end
|
|
13
|
-
```
|
|
14
|
-
|
|
15
|
-
Then install it:
|
|
16
|
-
|
|
17
|
-
```bash
|
|
18
|
-
bundle install
|
|
19
|
-
```
|
|
20
|
-
|
|
21
|
-
2. Create `config/environments/profile.rb` with production-like settings and the ruby-prof middleware:
|
|
22
|
-
|
|
23
|
-
```ruby
|
|
24
|
-
# config/environments/profile.rb
|
|
25
|
-
require_relative "production"
|
|
26
|
-
|
|
27
|
-
Rails.application.configure do
|
|
28
|
-
# Optional: reduce noise while profiling.
|
|
29
|
-
config.log_level = :warn
|
|
30
|
-
|
|
31
|
-
# Optional: disable controller/view caching if you want raw app execution timing.
|
|
32
|
-
config.action_controller.perform_caching = false
|
|
33
|
-
|
|
34
|
-
config.middleware.use Rack::RubyProf, path: Rails.root.join("tmp/profile")
|
|
35
|
-
end
|
|
36
|
-
```
|
|
37
|
-
|
|
38
|
-
By default the rack adapter generates flat text, graph text, graph HTML, and call stack HTML reports.
|
|
39
|
-
|
|
40
|
-
3. Start Rails in the profile environment:
|
|
41
|
-
|
|
42
|
-
```bash
|
|
43
|
-
bin/rails server -e profile
|
|
44
|
-
```
|
|
45
|
-
|
|
46
|
-
You can run a console in the same environment with:
|
|
47
|
-
|
|
48
|
-
```bash
|
|
49
|
-
bin/rails console -e profile
|
|
50
|
-
```
|
|
51
|
-
|
|
52
|
-
4. Make a request to generate profile output:
|
|
53
|
-
|
|
54
|
-
```bash
|
|
55
|
-
curl http://127.0.0.1:3000/
|
|
56
|
-
```
|
|
57
|
-
|
|
58
|
-
5. Inspect reports in `tmp/profile`:
|
|
59
|
-
|
|
60
|
-
```bash
|
|
61
|
-
ls -1 tmp/profile
|
|
62
|
-
```
|
|
63
|
-
|
|
64
|
-
Reports are generated per request path. Repeating the same request path overwrites the previous report files for that path.
|
data/docs/public/examples/example.rb
DELETED
|
@@ -1,33 +0,0 @@
|
|
|
1
|
-
# A small synthetic workload for demonstrating ruby-prof reports.
|
|
2
|
-
# word_freq.rb
|
|
3
|
-
|
|
4
|
-
def normalize(text)
|
|
5
|
-
text.downcase.gsub(/[^a-z\s]/, "")
|
|
6
|
-
end
|
|
7
|
-
|
|
8
|
-
def tokenize(text)
|
|
9
|
-
text.split(/\s+/)
|
|
10
|
-
end
|
|
11
|
-
|
|
12
|
-
def count_words(words)
|
|
13
|
-
counts = Hash.new(0)
|
|
14
|
-
words.each { |w| counts[w] += 1 }
|
|
15
|
-
counts
|
|
16
|
-
end
|
|
17
|
-
|
|
18
|
-
def top_words(counts, n = 10)
|
|
19
|
-
counts.sort_by { |_, v| -v }.take(n)
|
|
20
|
-
end
|
|
21
|
-
|
|
22
|
-
def run_example
|
|
23
|
-
text = <<~EOS * 200
|
|
24
|
-
Ruby is a dynamic, open source programming language with a focus on
|
|
25
|
-
simplicity and productivity. It has an elegant syntax that is natural
|
|
26
|
-
to read and easy to write.
|
|
27
|
-
EOS
|
|
28
|
-
|
|
29
|
-
normalized = normalize(text)
|
|
30
|
-
tokens = tokenize(normalized)
|
|
31
|
-
counts = count_words(tokens)
|
|
32
|
-
top = top_words(counts)
|
|
33
|
-
end
|