fluent-plugin-perf-tools 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +15 -0
- data/.rubocop.yml +26 -0
- data/.ruby-version +1 -0
- data/CHANGELOG.md +5 -0
- data/CODE_OF_CONDUCT.md +84 -0
- data/Gemfile +5 -0
- data/LICENSE.txt +21 -0
- data/README.md +43 -0
- data/Rakefile +17 -0
- data/bin/console +15 -0
- data/bin/setup +8 -0
- data/fluent-plugin-perf-tools.gemspec +48 -0
- data/lib/fluent/plugin/in_perf_tools.rb +42 -0
- data/lib/fluent/plugin/perf_tools/cachestat.rb +65 -0
- data/lib/fluent/plugin/perf_tools/command.rb +30 -0
- data/lib/fluent/plugin/perf_tools/version.rb +9 -0
- data/lib/fluent/plugin/perf_tools.rb +11 -0
- data/perf-tools/LICENSE +339 -0
- data/perf-tools/README.md +205 -0
- data/perf-tools/bin/bitesize +1 -0
- data/perf-tools/bin/cachestat +1 -0
- data/perf-tools/bin/execsnoop +1 -0
- data/perf-tools/bin/funccount +1 -0
- data/perf-tools/bin/funcgraph +1 -0
- data/perf-tools/bin/funcslower +1 -0
- data/perf-tools/bin/functrace +1 -0
- data/perf-tools/bin/iolatency +1 -0
- data/perf-tools/bin/iosnoop +1 -0
- data/perf-tools/bin/killsnoop +1 -0
- data/perf-tools/bin/kprobe +1 -0
- data/perf-tools/bin/opensnoop +1 -0
- data/perf-tools/bin/perf-stat-hist +1 -0
- data/perf-tools/bin/reset-ftrace +1 -0
- data/perf-tools/bin/syscount +1 -0
- data/perf-tools/bin/tcpretrans +1 -0
- data/perf-tools/bin/tpoint +1 -0
- data/perf-tools/bin/uprobe +1 -0
- data/perf-tools/deprecated/README.md +1 -0
- data/perf-tools/deprecated/execsnoop-proc +150 -0
- data/perf-tools/deprecated/execsnoop-proc.8 +80 -0
- data/perf-tools/deprecated/execsnoop-proc_example.txt +46 -0
- data/perf-tools/disk/bitesize +175 -0
- data/perf-tools/examples/bitesize_example.txt +63 -0
- data/perf-tools/examples/cachestat_example.txt +58 -0
- data/perf-tools/examples/execsnoop_example.txt +153 -0
- data/perf-tools/examples/funccount_example.txt +126 -0
- data/perf-tools/examples/funcgraph_example.txt +2178 -0
- data/perf-tools/examples/funcslower_example.txt +110 -0
- data/perf-tools/examples/functrace_example.txt +341 -0
- data/perf-tools/examples/iolatency_example.txt +350 -0
- data/perf-tools/examples/iosnoop_example.txt +302 -0
- data/perf-tools/examples/killsnoop_example.txt +62 -0
- data/perf-tools/examples/kprobe_example.txt +379 -0
- data/perf-tools/examples/opensnoop_example.txt +47 -0
- data/perf-tools/examples/perf-stat-hist_example.txt +149 -0
- data/perf-tools/examples/reset-ftrace_example.txt +88 -0
- data/perf-tools/examples/syscount_example.txt +297 -0
- data/perf-tools/examples/tcpretrans_example.txt +93 -0
- data/perf-tools/examples/tpoint_example.txt +210 -0
- data/perf-tools/examples/uprobe_example.txt +321 -0
- data/perf-tools/execsnoop +292 -0
- data/perf-tools/fs/cachestat +167 -0
- data/perf-tools/images/perf-tools_2016.png +0 -0
- data/perf-tools/iolatency +296 -0
- data/perf-tools/iosnoop +296 -0
- data/perf-tools/kernel/funccount +146 -0
- data/perf-tools/kernel/funcgraph +259 -0
- data/perf-tools/kernel/funcslower +248 -0
- data/perf-tools/kernel/functrace +192 -0
- data/perf-tools/kernel/kprobe +270 -0
- data/perf-tools/killsnoop +263 -0
- data/perf-tools/man/man8/bitesize.8 +70 -0
- data/perf-tools/man/man8/cachestat.8 +111 -0
- data/perf-tools/man/man8/execsnoop.8 +104 -0
- data/perf-tools/man/man8/funccount.8 +76 -0
- data/perf-tools/man/man8/funcgraph.8 +166 -0
- data/perf-tools/man/man8/funcslower.8 +129 -0
- data/perf-tools/man/man8/functrace.8 +123 -0
- data/perf-tools/man/man8/iolatency.8 +116 -0
- data/perf-tools/man/man8/iosnoop.8 +169 -0
- data/perf-tools/man/man8/killsnoop.8 +100 -0
- data/perf-tools/man/man8/kprobe.8 +162 -0
- data/perf-tools/man/man8/opensnoop.8 +113 -0
- data/perf-tools/man/man8/perf-stat-hist.8 +111 -0
- data/perf-tools/man/man8/reset-ftrace.8 +49 -0
- data/perf-tools/man/man8/syscount.8 +96 -0
- data/perf-tools/man/man8/tcpretrans.8 +93 -0
- data/perf-tools/man/man8/tpoint.8 +140 -0
- data/perf-tools/man/man8/uprobe.8 +168 -0
- data/perf-tools/misc/perf-stat-hist +223 -0
- data/perf-tools/net/tcpretrans +311 -0
- data/perf-tools/opensnoop +280 -0
- data/perf-tools/syscount +192 -0
- data/perf-tools/system/tpoint +232 -0
- data/perf-tools/tools/reset-ftrace +123 -0
- data/perf-tools/user/uprobe +390 -0
- metadata +349 -0
@@ -0,0 +1,110 @@
Demonstrations of funcslower, the Linux ftrace version.


Show me ext3_readpages() calls slower than 1000 microseconds (1 ms):

# ./funcslower ext3_readpages 1000
Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
0) ! 8147.120 us | } /* ext3_readpages */
0) ! 8135.067 us | } /* ext3_readpages */
0) ! 12202.93 us | } /* ext3_readpages */
0) ! 12201.84 us | } /* ext3_readpages */
0) ! 8142.667 us | } /* ext3_readpages */
0) ! 12194.14 us | } /* ext3_readpages */
^C
Ending tracing...

Neat. So this confirms that there are ext3_readpages() calls that are taking
over 8000 us (8 ms).

funcslower uses the ftrace function graph profiler to dynamically instrument
the given kernel function, time it in-kernel, and emit only those events that
are slower than the given latency threshold. Since this all operates in
kernel context, the overheads are relatively low (compared to post-processing
in user space).
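
Because only the slow calls reach user space, what is left to post-process is small. As a rough sketch (not part of perf-tools), the durations can be pulled out of the output with a few lines of Python. The regular expression is an assumption based on the sample output above, and the layout may differ between kernel versions:

```python
import re

# Matches the duration column of a function graph exit line, e.g.
# "0) ! 8147.120 us | } /* ext3_readpages */". The layout is assumed
# from the sample output and is not a documented interface.
DURATION_RE = re.compile(r"([\d.]+)\s+us\s+\|\s+\}\s+/\*\s+(\S+)\s+\*/")


def parse_durations(lines):
    """Return (function, duration_us) pairs found in funcslower output."""
    results = []
    for line in lines:
        match = DURATION_RE.search(line)
        if match:
            results.append((match.group(2), float(match.group(1))))
    return results


sample = [
    'Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.',
    "0) ! 8147.120 us | } /* ext3_readpages */",
    "0) ! 12202.93 us | } /* ext3_readpages */",
    "Ending tracing...",
]
events = parse_durations(sample)
slowest = max(duration for _, duration in events)
print(len(events), "slow calls, slowest", slowest, "us")
```

The status lines ("Tracing ...", "Ending tracing...") do not match the pattern, so they are skipped without special handling.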


Now include the process name and PID (-P) of the process that is on-CPU, and
the absolute timestamp (-t) of the event:

# ./funcslower -Pt ext3_readpages 1000
Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
2678112.003180 | 0) cksum-26695 | ! 8145.268 us | } /* ext3_readpages */
2678113.538763 | 0) cksum-26695 | ! 8139.086 us | } /* ext3_readpages */
2678113.704901 | 0) cksum-26695 | ! 8147.549 us | } /* ext3_readpages */
2678113.721102 | 0) cksum-26695 | ! 8142.530 us | } /* ext3_readpages */
2678113.810269 | 0) cksum-26695 | ! 12234.70 us | } /* ext3_readpages */
2678113.996625 | 0) cksum-26695 | ! 8146.129 us | } /* ext3_readpages */
2678114.012832 | 0) cksum-26695 | ! 8148.153 us | } /* ext3_readpages */
^C
Ending tracing...

Great! Now I can see the process name, which in this case is the responsible
process. The timestamps also let me determine the rate of these slow events.
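
As an illustration of that last point, here is a minimal Python sketch that turns the -t timestamps into a rate. The helper name slow_event_rate() is hypothetical, and the parsing assumes the timestamp is the first "|"-separated field, as in the sample above:

```python
def slow_event_rate(timestamps):
    """Events per second across the observed span (needs two or more events)."""
    if len(timestamps) < 2:
        raise ValueError("need at least two timestamps")
    span = max(timestamps) - min(timestamps)
    if span <= 0:
        raise ValueError("timestamps span no time")
    # N events delimit N - 1 intervals.
    return (len(timestamps) - 1) / span


sample = """\
2678112.003180 | 0) cksum-26695 | ! 8145.268 us | } /* ext3_readpages */
2678113.538763 | 0) cksum-26695 | ! 8139.086 us | } /* ext3_readpages */
2678113.704901 | 0) cksum-26695 | ! 8147.549 us | } /* ext3_readpages */
2678113.721102 | 0) cksum-26695 | ! 8142.530 us | } /* ext3_readpages */
2678113.810269 | 0) cksum-26695 | ! 12234.70 us | } /* ext3_readpages */
2678113.996625 | 0) cksum-26695 | ! 8146.129 us | } /* ext3_readpages */
2678114.012832 | 0) cksum-26695 | ! 8148.153 us | } /* ext3_readpages */
"""
# The absolute timestamp (seconds) is the first "|"-separated field.
timestamps = [float(line.split("|")[0]) for line in sample.splitlines()]
rate = slow_event_rate(timestamps)
print(f"{rate:.2f} slow calls per second")
```

For this sample, seven events spread over roughly two seconds work out to about three slow calls per second.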


Now measure time differently: excluding time spent sleeping, so that we only
see on-CPU time:

# ./funcslower -PCt ext3_readpages 1000
Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
^C
Ending tracing...

I believe the workload hasn't changed, so these ext3_readpages() calls are
still happening; however, their CPU time doesn't exceed 1 ms. Compared to the
earlier output, this tells me that the latency in this function is due to time
spent blocked off-CPU, and not on-CPU. This makes sense: this function is
ultimately being blocked on disk I/O.

Were the function duration times to be similar with and without -C, that would
tell us that the high latency is due to time spent on-CPU executing code.
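
The reasoning can be written down as a small Python sketch. The helper dominant_latency() is hypothetical; funcslower does not pair the two measurements for the same call, so the inputs here come from separate runs:

```python
def dominant_latency(wall_us, on_cpu_us):
    """Classify where a call spent most of its time.

    wall_us:   duration measured without -C (elapsed time).
    on_cpu_us: duration measured with -C (on-CPU time only).
    """
    if wall_us <= 0 or not 0 <= on_cpu_us <= wall_us:
        raise ValueError("expected wall_us > 0 and 0 <= on_cpu_us <= wall_us")
    off_cpu_us = wall_us - on_cpu_us
    return "on-CPU" if on_cpu_us >= off_cpu_us else "off-CPU"


# Without -C a call took 8147.120 us. With -C nothing exceeded the 1000 us
# threshold, so 1000 us is an upper bound on the on-CPU time.
verdict = dominant_latency(8147.120, 1000.0)
print(verdict)
```

Even with the on-CPU time taken at its upper bound, most of the elapsed time is unaccounted for on-CPU, which matches the disk I/O explanation.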


This traces the sys_nanosleep() kernel function, and shows calls taking over
100 us:

# ./funcslower sys_nanosleep 100
Tracing "sys_nanosleep" slower than 100 us... Ctrl-C to end.
0) ! 2000147 us | } /* sys_nanosleep */
------------------------------------------
0) registe-27414 => vmstat-27419
------------------------------------------

0) ! 1000143 us | } /* sys_nanosleep */
0) ! 1000154 us | } /* sys_nanosleep */
------------------------------------------
0) vmstat-27419 => registe-27414
------------------------------------------

0) ! 2000183 us | } /* sys_nanosleep */
------------------------------------------
0) registe-27414 => vmstat-27419
------------------------------------------

0) ! 1000141 us | } /* sys_nanosleep */
^C
Ending tracing...

This is an example where I did not use -P, but ftrace has included process
information anyway. Look for the lines containing "=>", which indicate a process
switch on the given CPU.
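
Those "=>" lines are enough to attribute each duration to a process after the fact. The following Python sketch assumes that events printed after a switch line belong to the task on the right-hand side of "=>", and that events seen before any switch line have no known owner. Both regular expressions are inferred from the sample output above:

```python
import re

# "0) registe-27414 => vmstat-27419": CPU, previous task, next task.
SWITCH_RE = re.compile(r"^\s*(\d+)\)\s+(\S+)\s+=>\s+(\S+)")
# "0) ! 1000143 us | } /* sys_nanosleep */": CPU, optional marker, duration.
EXIT_RE = re.compile(r"^\s*(\d+)\)\s+[!+#*@$]?\s*([\d.]+)\s+us\s+\|")


def attribute_durations(lines):
    """Pair each duration with the task last switched in on that CPU."""
    current = {}  # CPU number -> task currently on that CPU, once known
    results = []
    for line in lines:
        switch = SWITCH_RE.match(line)
        if switch:
            current[switch.group(1)] = switch.group(3)
            continue
        done = EXIT_RE.match(line)
        if done:
            task = current.get(done.group(1), "unknown")
            results.append((task, float(done.group(2))))
    return results


sample = [
    "0) ! 2000147 us | } /* sys_nanosleep */",
    "------------------------------------------",
    "0) registe-27414 => vmstat-27419",
    "------------------------------------------",
    "0) ! 1000143 us | } /* sys_nanosleep */",
    "0) ! 1000154 us | } /* sys_nanosleep */",
    "------------------------------------------",
    "0) vmstat-27419 => registe-27414",
    "------------------------------------------",
    "0) ! 2000183 us | } /* sys_nanosleep */",
]
attributed = attribute_durations(sample)
for task, duration_us in attributed:
    print(task, duration_us)
```

With this sample, the roughly one-second sleeps land on vmstat-27419 and the two-second sleep after the second switch lands on registe-27414.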


Use -h to print the USAGE message:

# ./funcslower -h
USAGE: funcslower [-aChHPt] [-p PID] [-d secs] funcstring latency_us
                 -a              # all info (same as -HPt)
                 -C              # measure on-CPU time only
                 -d seconds      # trace duration, and use buffers
                 -h              # this usage message
                 -H              # include column headers
                 -p PID          # trace when this pid is on-CPU
                 -L TID          # trace when this thread is on-CPU
                 -P              # show process names & PIDs
                 -t              # show timestamps
   eg,
        funcslower vfs_read 10000 # trace vfs_read() slower than 10 ms

See the man page and example file for more info.
@@ -0,0 +1,341 @@
Demonstrations of functrace, the Linux ftrace version.


A (usually) good example to start with is do_nanosleep(), since it is not called
frequently and is easily triggered. Here it is traced using functrace:

# ./functrace 'do_nanosleep'
Tracing "do_nanosleep"... Ctrl-C to end.
svscan-1678 [000] .... 6412438.703521: do_nanosleep <-hrtimer_nanosleep
svscan-1678 [000] .... 6412443.703678: do_nanosleep <-hrtimer_nanosleep
svscan-1678 [000] .... 6412448.703865: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412453.216241: do_nanosleep <-hrtimer_nanosleep
svscan-1678 [000] .... 6412453.704049: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412454.216524: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412455.216816: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412456.217093: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412457.217378: do_nanosleep <-hrtimer_nanosleep
vmstat-28371 [000] .... 6412458.217660: do_nanosleep <-hrtimer_nanosleep
^C
Ending tracing...

While tracing, I ran a "vmstat 1" in another window. vmstat and its process ID
can be seen in the 1st column, and the timestamps, at one-second intervals, can
be seen in the 4th column.

These are basic details: who was on-CPU (process name and PID), flags, timestamp,
and calling function. Treat this as the next step, after funccount, for getting
a little more information on kernel function execution, before using more
capabilities to dig further.

This is Linux 3.16, and the output is the ftrace text buffer format, which has
changed slightly between kernel versions.


To see the column headers, use -H. This is Linux 3.16:

# ./functrace -H do_nanosleep
Tracing "do_nanosleep"... Ctrl-C to end.
# tracer: function
#
# entries-in-buffer/entries-written: 0/0   #P:2
#
#                              _-----=> irqs-off
#                             / _----=> need-resched
#                            | / _---=> hardirq/softirq
#                            || / _--=> preempt-depth
#                            ||| /     delay
#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
#              | |       |   ||||       |         |
          svscan-1678  [001] .... 6413283.729520: do_nanosleep <-hrtimer_nanosleep
          svscan-1678  [001] .... 6413288.729679: do_nanosleep <-hrtimer_nanosleep

For comparison, here's Linux 3.2:

# ./functrace -H do_nanosleep
Tracing "do_nanosleep"... Ctrl-C to end.
# tracer: function
#
#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
#              | |       |          |         |
          vmstat-11789 [000] 1763207.021204: do_nanosleep <-hrtimer_nanosleep
          vmstat-11789 [000] 1763208.022970: do_nanosleep <-hrtimer_nanosleep
          vmstat-11789 [000] 1763209.023267: do_nanosleep <-hrtimer_nanosleep

For documentation on the exact format, see the Linux kernel source under
Documentation/trace/ftrace.txt.
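
Since the format varies, any script that consumes this output should tolerate both layouts. Here is a minimal Python sketch that treats the flags column as optional. The regular expression is an assumption derived only from the two samples above, not from the kernel documentation, and the field names are my own:

```python
import re

LINE_RE = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+"       # who was on-CPU
    r"\[(?P<cpu>\d+)\]\s+"                    # CPU number
    r"(?:(?P<flags>\S+)\s+)?"                 # flags (absent in the 3.2 sample)
    r"(?P<timestamp>\d+\.\d+):\s+"            # seconds
    r"(?P<function>\S+)\s+<-(?P<caller>\S+)"  # traced function and its caller
)


def parse_line(line):
    """Return a dict of fields, or None for headers and status lines."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    event = match.groupdict()
    event["pid"] = int(event["pid"])
    event["cpu"] = int(event["cpu"])
    event["timestamp"] = float(event["timestamp"])
    return event


new = parse_line("svscan-1678 [001] .... 6413283.729520: do_nanosleep <-hrtimer_nanosleep")
old = parse_line("vmstat-11789 [000] 1763207.021204: do_nanosleep <-hrtimer_nanosleep")
print(new["task"], new["pid"], new["flags"], new["caller"])
print(old["task"], old["pid"], old["flags"], old["caller"])
```

Header lines beginning with "#" do not match and come back as None, so the same function can be run over output produced with or without -H.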


This error:

# ./functrace 'ext4_z*'
Tracing "ext4_z*"... Ctrl-C to end.
./functrace: line 136: echo: write error: Invalid argument
ERROR: enabling "ext4_z*". Exiting.

is because there were no functions beginning with "ext4_z". You can check
available functions in the /sys/kernel/debug/tracing/available_filter_functions
file.
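
A pattern can be checked against that list before tracing. The sketch below uses Python's fnmatchcase() as an approximation of ftrace's glob matching (it accepts a few constructs, such as "?" and "[...]", that ftrace may treat differently). It also assumes each line starts with the function name, optionally followed by a module name in brackets:

```python
from fnmatch import fnmatchcase


def matching_functions(available_lines, pattern):
    """Return the function names in available_lines that match pattern."""
    names = []
    for line in available_lines:
        fields = line.split()
        # The first field is the function name; anything after it
        # (for example a module name) is ignored.
        if fields and fnmatchcase(fields[0], pattern):
            names.append(fields[0])
    return names


# Names taken from the examples in this file, standing in for the real list.
sample = ["do_nanosleep", "ext4_bmap", "ext4_create", "ext4_readdir"]
print(matching_functions(sample, "ext4_z*"))
print(matching_functions(sample, "ext4_*"))
```

On a live system, the same function can be given the open available_filter_functions file instead of the sample list; reading it usually requires root. An empty result predicts the "Invalid argument" error shown above.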


You might want to use funccount to check the frequency of events before using
functrace. For example, counting ext3 events on a system:

# ./funccount -d 10 'ext3*'
Tracing "ext3*" for 10 seconds...

FUNC COUNT
ext3_journal_dirty_data 1
ext3_ordered_write_end 1
ext3_write_begin 1
ext3_writepage_trans_blocks 1
ext3_dirty_inode 2
ext3_do_update_inode 2
ext3_get_group_desc 2
ext3_get_inode_block.isra.20 2
ext3_get_inode_flags 2
ext3_get_inode_loc 2
ext3_mark_iloc_dirty 2
ext3_mark_inode_dirty 2
ext3_reserve_inode_write 2
ext3_journal_start_sb 3
ext3_block_to_path.isra.22 6
ext3_bmap 6
ext3_get_block 6
ext3_get_blocks_handle 6
ext3_get_branch 6
ext3_discard_reservation 11
ext3_ioctl 11
ext3_release_file 11

Ending tracing...

During 10 seconds, there weren't many ext3 calls. I might consider tracing
them all (warnings about dynamic tracing many kernel functions apply: test
before use, as in the past there have been bugs causing panics).

# ./functrace 'ext3_*'
Tracing "ext3_*"... Ctrl-C to end.
register_start.-17008 [000] 1763557.577985: ext3_release_file <-__fput
register_start.-17008 [000] 1763557.577987: ext3_discard_reservation <-ext3_release_file
register_start.-17026 [000] 1763558.163620: ext3_ioctl <-file_ioctl
register_start.-17026 [000] 1763558.481081: ext3_release_file <-__fput
register_start.-17026 [000] 1763558.481083: ext3_discard_reservation <-ext3_release_file
register_start.-17041 [000] 1763559.186984: ext3_ioctl <-file_ioctl
register_start.-17041 [000] 1763559.511267: ext3_release_file <-__fput
[...]

For comparison, here's a different system and ext4:

# ./funccount -d 10 'ext4*'
Tracing "ext4*" for 10 seconds...

FUNC COUNT
ext4_journal_commit_callback 2
ext4_htree_fill_tree 6
ext4_htree_free_dir_info 6
ext4_release_dir 6
ext4_readdir 12
ext4fs_dirhash 29
ext4_htree_store_dirent 29
ext4_follow_link 36
ext4_file_mmap 42
ext4_free_data_callback 44
ext4_getattr 45
ext4_bmap 62
ext4_get_block 62
ext4_add_entry 280
ext4_add_nondir 280
ext4_alloc_da_blocks 280
ext4_alloc_inode 280
ext4_bio_write_page 280
ext4_can_truncate 280
ext4_claim_free_clusters 280
ext4_clear_inode 280
ext4_create 280
ext4_da_get_block_prep 280
ext4_da_invalidatepage 280
ext4_da_update_reserve_space 280
ext4_da_write_begin 280
ext4_da_write_end 280
ext4_dec_count.isra.22 280
ext4_delete_entry 280
ext4_destroy_inode 280
ext4_drop_inode 280
ext4_end_bio 280
ext4_es_init_tree 280
ext4_es_lru_del 280
ext4_evict_inode 280
ext4_ext_calc_metadata_amount 280
ext4_ext_correct_indexes 280
ext4_ext_find_goal 280
ext4_ext_insert_extent 280
ext4_ext_remove_space 280
ext4_ext_tree_init 280
ext4_ext_truncate 280
ext4_ext_truncate_extend_resta 280
ext4_ext_try_to_merge 280
ext4_ext_try_to_merge_right 280
ext4_file_write_iter 280
ext4_find_dest_de 280
ext4_finish_bio 280
ext4_free_blocks 280
ext4_free_inode 280
ext4_generic_delete_entry 280
ext4_has_free_clusters 280
ext4_i_callback 280
ext4_init_acl 280
ext4_init_security 280
ext4_inode_attach_jinode 280
ext4_inode_to_goal_block 280
ext4_insert_dentry 280
ext4_invalidatepage 280
ext4_io_submit_init 280
ext4_itable_unused_count 280
ext4_lookup 280
ext4_mb_complex_scan_group 280
ext4_mb_find_by_goal 280
ext4_mb_free_metadata 280
ext4_mb_initialize_context 280
ext4_mb_mark_diskspace_used 280
ext4_mb_new_blocks 280
ext4_mb_normalize_request 280
ext4_mb_regular_allocator 280
ext4_mb_release_context 280
ext4_mb_use_best_found 280
ext4_mb_use_preallocated 280
ext4_nonda_switch 280
ext4_orphan_del 280
ext4_put_io_end_defer 280
ext4_releasepage 280
ext4_rename 280
ext4_set_aops 280
ext4_setent 280
ext4_set_inode_flags 280
ext4_truncate 280
ext4_writepages 280
ext4_writepage_trans_blocks 280
ext4_xattr_delete_inode 280
ext4_xattr_get 285
ext4_xattr_ibody_get 285
ext4_xattr_security_get 285
ext4_bread 286
ext4_release_file 288
ext4_file_open 305
ext4_superblock_csum_set 494
ext4_block_bitmap_csum_set 560
ext4_es_free_extent 560
ext4_es_insert_extent 560
ext4_es_remove_extent 560
ext4_ext_find_extent 560
ext4_ext_map_blocks 560
ext4_free_group_clusters_set 560
ext4_free_inodes_set 560
ext4_get_group_no_and_offset 560
ext4_get_reserved_space 560
ext4_init_io_end 560
ext4_inode_bitmap_csum_set 560
ext4_io_submit 560
ext4_mb_good_group 560
ext4_orphan_add 560
ext4_put_io_end 560
ext4_read_block_bitmap 560
ext4_read_block_bitmap_nowait 560
ext4_read_inode_bitmap 560
ext4_release_io_end 560
ext4_set_bits 560
ext4_validate_block_bitmap 560
ext4_wait_block_bitmap 560
ext4_mb_load_buddy 604
ext4_mb_unload_buddy.isra.24 604
ext4_block_bitmap 840
ext4_discard_preallocations 840
ext4_ext_drop_refs 840
ext4_ext_get_access.isra.30 840
ext4_ext_index_trans_blocks 840
ext4_find_entry 840
ext4_free_group_clusters 840
ext4_handle_dirty_dirent_node 840
ext4_inode_bitmap 840
ext4_meta_trans_blocks 840
ext4_dirty_inode 845
ext4_free_inodes_count 1120
ext4_group_desc_csum 1120
ext4_group_desc_csum_set 1120
ext4_getblk 1126
ext4_map_blocks 1468
ext4_es_lookup_extent 1748
ext4_mb_check_limits 1875
ext4_es_lru_add 2028
ext4_data_block_valid 2308
ext4_journal_check_start 3085
ext4_mark_inode_dirty 5325
ext4_get_inode_flags 5951
ext4_get_inode_loc 5951
ext4_mark_iloc_dirty 5951
ext4_reserve_inode_write 5951
ext4_inode_table 7071
ext4_get_group_desc 8471
ext4_has_inline_data 9486

Ending tracing...

There are many functions called frequently. Tracing them all may cost
significant performance overhead. I may read through this list and look for
the most interesting functions to trace, reducing overheads by only selecting
a few.
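
Reading a long list by eye gets tedious, so here is a small Python sketch that parses funccount output and shortlists the functions called below a chosen rate. The helper names and the 50 calls per second cut-off are hypothetical, and the sample is a four-row excerpt of the output above:

```python
def parse_funccount(text):
    """Return {function: count} from funccount's two-column output."""
    counts = {}
    for line in text.splitlines():
        fields = line.split()
        # Data rows are exactly "<name> <integer>"; headers and status
        # lines fail this test and are skipped.
        if len(fields) == 2 and fields[1].isdigit():
            counts[fields[0]] = int(fields[1])
    return counts


def quiet_functions(counts, duration_s, max_per_second):
    """Names whose call rate is at or below max_per_second, sorted by name."""
    return sorted(
        name for name, count in counts.items()
        if count / duration_s <= max_per_second
    )


sample = """\
Tracing "ext4*" for 10 seconds...

FUNC COUNT
ext4_readdir 12
ext4_create 280
ext4_get_group_desc 8471
ext4_has_inline_data 9486

Ending tracing...
"""
counts = parse_funccount(sample)
shortlist = quiet_functions(counts, duration_s=10, max_per_second=50)
print(shortlist)
```

The duration passed in should match the -d value given to funccount, since the counts are totals over that interval.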

For example, ext4_create() looks interesting:

# ./functrace ext4_create
Tracing "ext4_create"... Ctrl-C to end.
supervise-1681 [000] .... 6414396.700163: ext4_create <-vfs_create
supervise-1684 [001] .... 6414396.700287: ext4_create <-vfs_create
supervise-1681 [000] .... 6414396.700598: ext4_create <-vfs_create
supervise-1684 [001] .... 6414396.700636: ext4_create <-vfs_create
supervise-1687 [001] .... 6414396.701577: ext4_create <-vfs_create
supervise-1688 [000] .... 6414396.702590: ext4_create <-vfs_create
supervise-1693 [001] .... 6414396.702829: ext4_create <-vfs_create
supervise-1693 [001] .... 6414396.703592: ext4_create <-vfs_create
supervise-1688 [000] .... 6414396.703598: ext4_create <-vfs_create
supervise-1687 [001] .... 6414396.703988: ext4_create <-vfs_create
supervise-1685 [001] .... 6414396.704126: ext4_create <-vfs_create
supervise-1685 [001] .... 6414396.704458: ext4_create <-vfs_create
supervise-1682 [001] .... 6414396.704577: ext4_create <-vfs_create
supervise-1683 [000] .... 6414396.704984: ext4_create <-vfs_create
supervise-1682 [001] .... 6414396.704985: ext4_create <-vfs_create
[...]

Now I know that different PIDs of the supervise program are calling
ext4_create() at around the same time, and from vfs_create().
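
A per-process tally makes the same observation quantitative. This Python sketch counts events by the TASK-PID column. The regular expression assumes a task name without spaces followed by the "[CPU]" column, as in the sample above, and the sample is the first six lines of that output:

```python
import re
from collections import Counter

# "supervise-1681 [000] ...": capture the TASK-PID column.
TASK_RE = re.compile(r"^\s*(\S+-\d+)\s+\[\d+\]")


def events_per_task(lines):
    """Count traced events for each TASK-PID."""
    matches = (TASK_RE.match(line) for line in lines)
    return Counter(m.group(1) for m in matches if m)


sample = [
    "supervise-1681 [000] .... 6414396.700163: ext4_create <-vfs_create",
    "supervise-1684 [001] .... 6414396.700287: ext4_create <-vfs_create",
    "supervise-1681 [000] .... 6414396.700598: ext4_create <-vfs_create",
    "supervise-1684 [001] .... 6414396.700636: ext4_create <-vfs_create",
    "supervise-1687 [001] .... 6414396.701577: ext4_create <-vfs_create",
    "supervise-1688 [000] .... 6414396.702590: ext4_create <-vfs_create",
]
tally = events_per_task(sample)
for task, count in tally.most_common():
    print(task, count)
```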


The duration mode uses buffering, instead of printing events as they occur.
This greatly reduces overheads. For example:

# ./functrace -d 10 ext4_create > out.ext4_create
# wc out.ext4_create
283 1687 21059 out.ext4_create

Note that the buffer has a limited size. Check the timestamps to see if the
range does not match your duration, as one clue that the buffer was exhausted
and events were missed.
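
That timestamp check can be scripted. The Python sketch below compares the time span covered by the captured events with the requested duration. The 0.5 tolerance is an arbitrary assumption, and a short span is only a clue: it can also mean the events were rare or arrived in a burst. The sample lines are the first and last events of the ext4_create trace shown earlier:

```python
import re

# The timestamp is the "seconds.microseconds:" field of each event line.
TS_RE = re.compile(r"\s(\d+\.\d+):\s")


def covered_span(lines):
    """Seconds between the earliest and latest event timestamps."""
    stamps = [float(m.group(1)) for m in map(TS_RE.search, lines) if m]
    if not stamps:
        return 0.0
    return max(stamps) - min(stamps)


def looks_truncated(lines, duration_s, tolerance=0.5):
    """True if the events cover less than tolerance * duration_s seconds."""
    return covered_span(lines) < duration_s * tolerance


sample = [
    "supervise-1681 [000] .... 6414396.700163: ext4_create <-vfs_create",
    "supervise-1682 [001] .... 6414396.704985: ext4_create <-vfs_create",
]
span = covered_span(sample)
suspicious = looks_truncated(sample, duration_s=10)
print(f"events cover {span:.6f} s of a 10 s trace; suspicious: {suspicious}")
```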


Use -h to print the USAGE message:

# ./functrace -h
USAGE: functrace [-hH] [-p PID] [-L TID] [-d secs] funcstring
                 -d seconds      # trace duration, and use buffers
                 -h              # this usage message
                 -H              # include column headers
                 -p PID          # trace when this pid is on-CPU
                 -L TID          # trace when this thread is on-CPU
   eg,
         functrace do_nanosleep          # trace the do_nanosleep() function
         functrace '*sleep'              # trace functions ending in "sleep"
         functrace -p 198 'vfs*'         # trace "vfs*" funcs for PID 198
         functrace 'tcp*' > out          # trace all "tcp*" funcs to out file
         functrace -d 1 'tcp*' > out     # trace 1 sec, then write out file

See the man page and example file for more info.