rperf 0.7.0 → 0.9.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: f19061984f2ea33bbcd569c43e8a7ece03071b8fbb442ebe108468eb07d96a14
- data.tar.gz: 66bee438bd8459db8ce89129cef39bdaaba3ad82e348012c5d220fb5b6f3963f
+ metadata.gz: 7182c353301aa38afde2d65928219c46cee3e777b49842bf60331dd40e7b3ab2
+ data.tar.gz: ebb30b807d9b86a7ff48090bc5990d6bdcc0c9951f80b21758df2413bdbd39f2
  SHA512:
- metadata.gz: 3a9468eadacbb41afbc751bd767141a3db785a2eaa51e33549503fe160a8adb25f6612c0cc4c61381b8f8442836a970a23d29cda4fe696488ca85d2b048518a2
- data.tar.gz: 5e8e6c6c24fbb264f352c98511481e5d448b6a07e390c72cc31beeab2aa03699cde22f2f04268531ab50572e7cfd54cab7235359915caa6c6b3c9eb40f26d4e9
+ metadata.gz: d70dde1af1a4c3c9cec02e20981038c465facea4d3da32c32ce26e16cfc402de01f594c0b41386e3095f53603631f930fb2bd133d23b8a5757662d40f117d2a5
+ data.tar.gz: b7868f0237f84a47bda6286c99b7056f47738fb235ede0d21ac8f950feb9b891b1677dff38f25294393766cbe4f742b2eeffe73a43dded92562926db8d5bd392
data/LICENSE ADDED
@@ -0,0 +1,21 @@
+ MIT License
+
+ Copyright (c) 2026 Koichi Sasada
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in all
+ copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+ SOFTWARE.
data/README.md CHANGED
@@ -17,7 +17,7 @@
  </p>

  <p align="center">
- pprof / collapsed stacks / text report &nbsp;·&nbsp; CPU mode & wall mode (GVL + GC tracking)
+ Built-in flamegraph viewer &nbsp;·&nbsp; CPU mode & wall mode (GVL + GC tracking)
  </p>

  <p align="center">
@@ -29,29 +29,34 @@
  ## See It in Action

  ```bash
- $ gem install rperf
  $ rperf exec ruby fib.rb

  Performance stats for 'ruby fib.rb':

- 2,326.0 ms user
- 64.5 ms sys
- 2,035.5 ms real
+ 2,023.3 ms user
+ 4.3 ms sys
+ 2,001.8 ms real

- 2,034.2 ms 100.0% CPU execution
- 1 [Ruby] detected threads
- 7.0 ms [Ruby] GC time (7 count: 5 minor, 2 major)
- 106,078 [Ruby] allocated objects
- 22 MB [OS] peak memory (maxrss)
+ 2,000.3 ms 100.0% [Rperf] CPU execution
+ 3.0 ms [Ruby ] GC time (4 count: 2 minor, 2 major)
+ 48,741 [Ruby ] allocated objects
+ 27,034 [Ruby ] freed objects
+ 1 [Ruby ] detected threads
+ 16 MB [OS ] peak memory (maxrss)
+ 5,784 [OS ] page faults (5,783 minor, 1 major)
+ 22 [OS ] context switches (13 voluntary, 9 involuntary)
+ 0 MB [OS ] disk I/O (0 MB read, 0 MB write)

  Flat:
- 2,034.2 ms 100.0% Object#fibonacci (fib.rb)
+ 1,998.4 ms 99.9% Object#fibonacci (fib.rb)
+ 1.9 ms 0.1% Module#method_added (<C method>)

  Cumulative:
- 2,034.2 ms 100.0% Object#fibonacci (fib.rb)
- 2,034.2 ms 100.0% <main> (fib.rb)
+ 2,000.3 ms 100.0% <main> (fib.rb)
+ 1,998.4 ms 99.9% Object#fibonacci (fib.rb)
+ 1.9 ms 0.1% Module#method_added (<C method>)

- 2034 samples / 2034 triggers, 0.1% profiler overhead
+ 1999 samples / 1999 triggers, 0.1% profiler overhead
  ```

  ## Quick Start
@@ -60,26 +65,27 @@ $ rperf exec ruby fib.rb
  # Performance summary (wall mode, prints to stderr)
  rperf stat ruby app.rb

- # Record a pprof profile to file
- rperf record ruby app.rb # → rperf.data (cpu mode)
- rperf record -m wall -o profile.pb.gz ruby server.rb # wall mode, custom output
+ # Record a profile to file
+ rperf record ruby app.rb # → rperf.json.gz (cpu mode, default)
+ rperf record -m wall ruby server.rb # wall mode

- # View results (report/diff require Go: https://go.dev/dl/)
- rperf report # open rperf.data in browser
- rperf report --top profile.pb.gz # print top functions to terminal
+ # View results in browser
+ rperf report # open rperf.json.gz in viewer
+ rperf report --top profile.json.gz # print top functions to terminal

- # Compare two profiles
- rperf diff before.pb.gz after.pb.gz # open diff in browser
- rperf diff --top before.pb.gz after.pb.gz # print diff to terminal
+ # Compare two profiles (requires Go)
+ rperf diff before.json.gz after.json.gz # open diff in browser
  ```

+ On `rperf report`, you can see the profile result like this page: [rprof viewer](https://ko1.github.io/rperf/examples/cpu_intensive_profile.html)
+
  ### Ruby API

  ```ruby
  require "rperf"

  # Block form — profiles and saves to file
- Rperf.start(output: "profile.pb.gz", frequency: 500, mode: :cpu) do
+ Rperf.start(output: "profile.json.gz", frequency: 500, mode: :cpu) do
  # code to profile
  end

@@ -87,18 +93,37 @@ end
  Rperf.start(frequency: 1000, mode: :wall)
  # ...
  data = Rperf.stop
- Rperf.save("profile.pb.gz", data)
+ Rperf.save("profile.json.gz", data)
+ ```
+
+ ### In-browser Viewer
+
+ ```ruby
+ # config.ru
+ require "rperf/viewer"
+ require "rperf/rack"
+
+ Rperf.start(mode: :wall, defer: true)
+ use Rperf::Viewer # visit /rperf/ for flamegraph UI
+ use Rperf::RackMiddleware # labels each request
+ run MyApp
+
+ # Snapshot every 60 minutes
+ Thread.new { loop { sleep 3600; Rperf::Viewer.instance&.take_snapshot! } }
  ```

+ > **Note:** `Rperf::Viewer` has no built-in authentication. In production, restrict access with your framework's auth mechanisms (e.g., route constraints in Rails). See the [manual](https://ko1.github.io/rperf/docs/manual/) for examples.
+
  ### Environment Variables

  Profile without code changes (e.g., Rails):

  ```bash
- RPERF_ENABLED=1 RPERF_MODE=wall RPERF_OUTPUT=profile.pb.gz ruby app.rb
+ RPERF_ENABLED=1 RPERF_MODE=wall ruby app.rb # → rperf.json.gz
+ rperf report # open in viewer
  ```

- Run `rperf help` for full documentation, or see the [online manual](https://ko1.github.io/rperf/).
+ Run `rperf help` for full documentation, or see the [online manual](https://ko1.github.io/rperf/docs/manual/).

  ## Subcommands

@@ -106,11 +131,11 @@ Inspired by Linux `perf` — familiar subcommand interface for profiling workflo

  | Command | Description |
  |---------|-------------|
- | `rperf record` | Profile a command and save to file |
+ | `rperf record` | Profile a command and save to file (default: `.json.gz`) |
  | `rperf stat` | Profile a command and print summary to stderr |
  | `rperf exec` | Profile a command and print full report to stderr |
- | `rperf report` | Open pprof profile with `go tool pprof` (requires Go) |
- | `rperf diff` | Compare two pprof profiles (requires Go) |
+ | `rperf report` | Open viewer for `.json.gz`; wraps `go tool pprof` for `.pb.gz` (requires Go) |
+ | `rperf diff` | Compare two profiles (requires Go) |
  | `rperf help` | Show full reference documentation |

  ## How It Works
@@ -133,7 +158,7 @@ Timer (signal or thread) VM thread (postponed job)
  record(backtrace, weight)
  ```

- On Linux, the timer uses `timer_create` + signal delivery (no extra thread).
+ On Linux, the timer uses `timer_create` + signal delivery to a dedicated worker thread.
  On other platforms, a dedicated pthread with `nanosleep` is used.

  If a safepoint is delayed, the sample carries proportionally more weight. The total weight equals the total time, accurately distributed across call stacks.
@@ -147,44 +172,45 @@ If a safepoint is delayed, the sample carries proportionally more weight. The to

  Use `cpu` to find what consumes CPU. Use `wall` to find what makes things slow (I/O, GVL contention, GC).

- ### Synthetic Frames (wall mode)
+ ### GVL and GC Labels

- rperf hooks GVL and GC events to attribute non-CPU time:
+ rperf hooks GVL and GC events to attribute non-CPU time. These are recorded as labels on samples rather than synthetic stack frames:

- | Frame | Meaning |
- |-------|---------|
- | `[GVL blocked]` | Off-GVL time (I/O, sleep, C extension releasing GVL) |
- | `[GVL wait]` | Waiting to reacquire the GVL (contention) |
- | `[GC marking]` | Time in GC mark phase |
- | `[GC sweeping]` | Time in GC sweep phase |
+ | Label (key=value) | Mode | Meaning |
+ |-------|------|---------|
+ | `%GVL=blocked` | wall only | Off-GVL time (I/O, sleep, C extension releasing GVL) |
+ | `%GVL=wait` | wall only | Waiting to reacquire the GVL (contention) |
+ | `%GC=mark` | cpu and wall | Time in GC mark phase (wall time) |
+ | `%GC=sweep` | cpu and wall | Time in GC sweep phase (wall time) |

  ## Why rperf?

  - **Accurate despite safepoints** — Safepoint sampling is *safer* (no async-signal-safety issues), but normally *inaccurate*. rperf compensates with real time-delta weights, so profiles faithfully reflect where time is actually spent.
- - **See the whole picture** (wall mode) — GVL contention, off-GVL I/O, GC marking/sweeping — all attributed to the call stacks responsible, via synthetic frames.
- - **Low overhead** — Signal-based timer on Linux (no extra thread). ~1–5 µs per sample.
- - **pprof compatible** — Works with `go tool pprof`, speedscope, and other standard tools out of the box.
+ - **See the whole picture** (wall mode) — GVL contention, off-GVL I/O, GC marking/sweeping — all attributed to the call stacks responsible, via sample labels.
+ - **Built-in viewer** — Flamegraph, Top, Tags tabs with interactive tag filtering. No external tools needed to analyze profiles.
+ - **Low overhead** — Signal-based timer on Linux (no extra thread). ~1–5 us per sample.
  - **Zero code changes** — Profile any Ruby program via CLI or environment variables. Drop-in for Rails, too.
  - **`perf`-like CLI** — `record`, `stat`, `report`, `diff` — if you know Linux perf, you already know rperf.
+ - **Multi-process** — automatically profiles forked/spawned Ruby child processes (e.g., Unicorn/Puma workers). Use `--no-inherit` to disable.

  ### Limitations

  - **Method-level only** — no line-level granularity.
  - **Ruby >= 3.4.0** — uses recent VM internals (postponed jobs, thread event hooks).
  - **POSIX only** — Linux, macOS. No Windows.
- - **No fork support** — profiling does not follow fork(2) child processes.


  ## Output Formats

- | Format | Extension | Use case |
- |--------|-----------|----------|
- | pprof (default) | `.pb.gz` | `rperf report`, `go tool pprof`, speedscope |
- | collapsed | `.collapsed` | FlameGraph (`flamegraph.pl`), speedscope |
- | text | `.txt` | Human/AI-readable flat + cumulative report |
+ | Format | Extension | Viewer |
+ |--------|-----------|--------|
+ | JSON (default) | `.json.gz` | `rperf report` (built-in viewer), `Rperf.load`, any JSON tool |
+ | pprof | `.pb.gz` | `go tool pprof` (requires Go), speedscope |
+ | collapsed | `.collapsed` | FlameGraph, speedscope |
+ | text | `.txt` | any text viewer |

  Format is auto-detected from extension, or set explicitly with `--format`.

  ## License

- MIT
+ MIT
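
Reviewer note on the README text in this diff: it says that when a safepoint is delayed, the sample carries proportionally more weight, so the total weight equals the total time. The sketch below illustrates that idea only. It assumes nanosecond timestamps, and `WeightedSampler`, `record`, and `totals` are hypothetical names chosen for illustration; this is not rperf's API or its implementation.

```ruby
# Hypothetical illustration of time-delta sample weighting.
# Each sample is weighted by the time elapsed since the previous sample,
# so a late sample counts for more and the weights sum to the total time.
class WeightedSampler
  attr_reader :totals

  def initialize(start_ns)
    @last_ns = start_ns
    @totals = Hash.new(0) # call stack (Array of frame names) => accumulated ns
  end

  # Record one sample for `stack` observed at time `now_ns`.
  # Returns the weight assigned to this sample.
  def record(stack, now_ns)
    weight = now_ns - @last_ns
    @last_ns = now_ns
    @totals[stack] += weight
    weight
  end
end

sampler = WeightedSampler.new(0)
sampler.record(["<main>", "fibonacci"], 1_000_000) # on time: 1 ms
sampler.record(["<main>", "read"], 4_000_000)      # delayed safepoint: 3 ms
sampler.record(["<main>", "fibonacci"], 5_000_000) # on time: 1 ms

total_ns = sampler.totals.values.sum
puts "total weight: #{total_ns} ns"
```

With one-sample-equals-one-tick counting, the delayed sample would be under-counted; weighting by the elapsed delta keeps the per-stack totals consistent with the elapsed time between the first and last timestamps.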