fluent-plugin-perf-tools 0.1.0
- checksums.yaml +7 -0
- data/.gitignore +15 -0
- data/.rubocop.yml +26 -0
- data/.ruby-version +1 -0
- data/CHANGELOG.md +5 -0
- data/CODE_OF_CONDUCT.md +84 -0
- data/Gemfile +5 -0
- data/LICENSE.txt +21 -0
- data/README.md +43 -0
- data/Rakefile +17 -0
- data/bin/console +15 -0
- data/bin/setup +8 -0
- data/fluent-plugin-perf-tools.gemspec +48 -0
- data/lib/fluent/plugin/in_perf_tools.rb +42 -0
- data/lib/fluent/plugin/perf_tools/cachestat.rb +65 -0
- data/lib/fluent/plugin/perf_tools/command.rb +30 -0
- data/lib/fluent/plugin/perf_tools/version.rb +9 -0
- data/lib/fluent/plugin/perf_tools.rb +11 -0
- data/perf-tools/LICENSE +339 -0
- data/perf-tools/README.md +205 -0
- data/perf-tools/bin/bitesize +1 -0
- data/perf-tools/bin/cachestat +1 -0
- data/perf-tools/bin/execsnoop +1 -0
- data/perf-tools/bin/funccount +1 -0
- data/perf-tools/bin/funcgraph +1 -0
- data/perf-tools/bin/funcslower +1 -0
- data/perf-tools/bin/functrace +1 -0
- data/perf-tools/bin/iolatency +1 -0
- data/perf-tools/bin/iosnoop +1 -0
- data/perf-tools/bin/killsnoop +1 -0
- data/perf-tools/bin/kprobe +1 -0
- data/perf-tools/bin/opensnoop +1 -0
- data/perf-tools/bin/perf-stat-hist +1 -0
- data/perf-tools/bin/reset-ftrace +1 -0
- data/perf-tools/bin/syscount +1 -0
- data/perf-tools/bin/tcpretrans +1 -0
- data/perf-tools/bin/tpoint +1 -0
- data/perf-tools/bin/uprobe +1 -0
- data/perf-tools/deprecated/README.md +1 -0
- data/perf-tools/deprecated/execsnoop-proc +150 -0
- data/perf-tools/deprecated/execsnoop-proc.8 +80 -0
- data/perf-tools/deprecated/execsnoop-proc_example.txt +46 -0
- data/perf-tools/disk/bitesize +175 -0
- data/perf-tools/examples/bitesize_example.txt +63 -0
- data/perf-tools/examples/cachestat_example.txt +58 -0
- data/perf-tools/examples/execsnoop_example.txt +153 -0
- data/perf-tools/examples/funccount_example.txt +126 -0
- data/perf-tools/examples/funcgraph_example.txt +2178 -0
- data/perf-tools/examples/funcslower_example.txt +110 -0
- data/perf-tools/examples/functrace_example.txt +341 -0
- data/perf-tools/examples/iolatency_example.txt +350 -0
- data/perf-tools/examples/iosnoop_example.txt +302 -0
- data/perf-tools/examples/killsnoop_example.txt +62 -0
- data/perf-tools/examples/kprobe_example.txt +379 -0
- data/perf-tools/examples/opensnoop_example.txt +47 -0
- data/perf-tools/examples/perf-stat-hist_example.txt +149 -0
- data/perf-tools/examples/reset-ftrace_example.txt +88 -0
- data/perf-tools/examples/syscount_example.txt +297 -0
- data/perf-tools/examples/tcpretrans_example.txt +93 -0
- data/perf-tools/examples/tpoint_example.txt +210 -0
- data/perf-tools/examples/uprobe_example.txt +321 -0
- data/perf-tools/execsnoop +292 -0
- data/perf-tools/fs/cachestat +167 -0
- data/perf-tools/images/perf-tools_2016.png +0 -0
- data/perf-tools/iolatency +296 -0
- data/perf-tools/iosnoop +296 -0
- data/perf-tools/kernel/funccount +146 -0
- data/perf-tools/kernel/funcgraph +259 -0
- data/perf-tools/kernel/funcslower +248 -0
- data/perf-tools/kernel/functrace +192 -0
- data/perf-tools/kernel/kprobe +270 -0
- data/perf-tools/killsnoop +263 -0
- data/perf-tools/man/man8/bitesize.8 +70 -0
- data/perf-tools/man/man8/cachestat.8 +111 -0
- data/perf-tools/man/man8/execsnoop.8 +104 -0
- data/perf-tools/man/man8/funccount.8 +76 -0
- data/perf-tools/man/man8/funcgraph.8 +166 -0
- data/perf-tools/man/man8/funcslower.8 +129 -0
- data/perf-tools/man/man8/functrace.8 +123 -0
- data/perf-tools/man/man8/iolatency.8 +116 -0
- data/perf-tools/man/man8/iosnoop.8 +169 -0
- data/perf-tools/man/man8/killsnoop.8 +100 -0
- data/perf-tools/man/man8/kprobe.8 +162 -0
- data/perf-tools/man/man8/opensnoop.8 +113 -0
- data/perf-tools/man/man8/perf-stat-hist.8 +111 -0
- data/perf-tools/man/man8/reset-ftrace.8 +49 -0
- data/perf-tools/man/man8/syscount.8 +96 -0
- data/perf-tools/man/man8/tcpretrans.8 +93 -0
- data/perf-tools/man/man8/tpoint.8 +140 -0
- data/perf-tools/man/man8/uprobe.8 +168 -0
- data/perf-tools/misc/perf-stat-hist +223 -0
- data/perf-tools/net/tcpretrans +311 -0
- data/perf-tools/opensnoop +280 -0
- data/perf-tools/syscount +192 -0
- data/perf-tools/system/tpoint +232 -0
- data/perf-tools/tools/reset-ftrace +123 -0
- data/perf-tools/user/uprobe +390 -0
- metadata +349 -0
@@ -0,0 +1,110 @@
+Demonstrations of funcslower, the Linux ftrace version.
+
+
+Show me ext3_readpages() calls slower than 1000 microseconds (1 ms):
+
+# ./funcslower ext3_readpages 1000
+Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
+0) ! 8147.120 us | } /* ext3_readpages */
+0) ! 8135.067 us | } /* ext3_readpages */
+0) ! 12202.93 us | } /* ext3_readpages */
+0) ! 12201.84 us | } /* ext3_readpages */
+0) ! 8142.667 us | } /* ext3_readpages */
+0) ! 12194.14 us | } /* ext3_readpages */
+^C
+Ending tracing...
+
+Neat. So this confirms that there are ext3_readpages() calls that are taking
+over 8000 us (8 ms).
+
+funcslower uses the ftrace function graph profiler to dynamically instrument
+the given kernel function, time it in-kernel, and only emit events slower
+than the given latency threshold in-kernel. Since this all operates in
+kernel context, the overheads are relatively low (compared to post-processing
+in user space).
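
The in-kernel threshold described above can be pictured as a handful of writes to ftrace control files. The following is only a sketch under stated assumptions: the control file names (current_tracer, set_graph_function, tracing_thresh) are standard ftrace files, but the mount point, the write order, and the helper names threshold_trace_plan and apply_plan are illustrative and are not taken from the funcslower script itself.

```python
from pathlib import Path

# Assumed tracefs mount point; some systems use /sys/kernel/tracing instead.
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def threshold_trace_plan(func, latency_us):
    """Return the (control file, value) writes for an in-kernel latency filter.

    The order is an illustrative guess, not a copy of the funcslower script.
    """
    return [
        ("current_tracer", "nop"),             # start from a clean tracer state
        ("set_graph_function", func),          # restrict graph tracing to one function
        ("tracing_thresh", str(latency_us)),   # only record durations above this (us)
        ("current_tracer", "function_graph"),  # enable the function graph tracer
    ]


def apply_plan(plan, tracing_dir=TRACING_DIR):
    """Write each value to its control file (needs root and a mounted tracefs)."""
    for name, value in plan:
        (tracing_dir / name).write_text(value + "\n")


# Print the equivalent shell commands instead of touching the system.
plan = threshold_trace_plan("ext3_readpages", 1000)
for name, value in plan:
    print(f"echo {value} > {TRACING_DIR / name}")
```

Printing the plan rather than applying it keeps the sketch safe to run on any machine; apply_plan is only meaningful as root on a kernel with the function graph tracer enabled.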
+
+
+Now include the process name and PID (-P) of the process that is on-CPU, and the
+absolute timestamp (-t) of the event:
+
+# ./funcslower -Pt ext3_readpages 1000
+Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
+2678112.003180 | 0) cksum-26695 | ! 8145.268 us | } /* ext3_readpages */
+2678113.538763 | 0) cksum-26695 | ! 8139.086 us | } /* ext3_readpages */
+2678113.704901 | 0) cksum-26695 | ! 8147.549 us | } /* ext3_readpages */
+2678113.721102 | 0) cksum-26695 | ! 8142.530 us | } /* ext3_readpages */
+2678113.810269 | 0) cksum-26695 | ! 12234.70 us | } /* ext3_readpages */
+2678113.996625 | 0) cksum-26695 | ! 8146.129 us | } /* ext3_readpages */
+2678114.012832 | 0) cksum-26695 | ! 8148.153 us | } /* ext3_readpages */
+^C
+Ending tracing...
+
+Great! Now I can see the process name, which in this case is the responsible
+process. The timestamps also let me determine the rate of these slow events.
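
As a rough illustration of how the timestamps give a rate, the sketch below parses the -Pt output shown above and divides the number of intervals by the time span. The regular expression and the function name slow_event_rate are assumptions made for this example; they are not part of funcslower.

```python
import re

# Matches lines such as:
# 2678112.003180 | 0) cksum-26695 | ! 8145.268 us | } /* ext3_readpages */
EVENT_RE = re.compile(
    r"^\s*(?P<ts>\d+\.\d+)\s*\|\s*(?P<cpu>\d+)\)\s*(?P<comm>.+)-(?P<pid>\d+)\s*\|"
    r"\s*[^\d\s]?\s*(?P<dur>[\d.]+) us"
)


def slow_event_rate(lines):
    """Return (event_count, span_seconds, events_per_second) for -Pt output."""
    stamps = [float(m.group("ts")) for m in map(EVENT_RE.match, lines) if m]
    if len(stamps) < 2:
        return len(stamps), 0.0, 0.0
    span = max(stamps) - min(stamps)
    if span == 0:
        return len(stamps), 0.0, 0.0
    # N events bound N - 1 intervals, so divide the interval count by the span.
    return len(stamps), span, (len(stamps) - 1) / span


sample = [
    "2678112.003180 | 0) cksum-26695 | ! 8145.268 us | } /* ext3_readpages */",
    "2678113.538763 | 0) cksum-26695 | ! 8139.086 us | } /* ext3_readpages */",
    "2678113.704901 | 0) cksum-26695 | ! 8147.549 us | } /* ext3_readpages */",
    "2678113.721102 | 0) cksum-26695 | ! 8142.530 us | } /* ext3_readpages */",
    "2678113.810269 | 0) cksum-26695 | ! 12234.70 us | } /* ext3_readpages */",
    "2678113.996625 | 0) cksum-26695 | ! 8146.129 us | } /* ext3_readpages */",
    "2678114.012832 | 0) cksum-26695 | ! 8148.153 us | } /* ext3_readpages */",
    "^C",
]

count, span, rate = slow_event_rate(sample)
print(f"{count} slow events over {span:.3f} s, about {rate:.1f} per second")
```

With so few events the figure is only indicative; a longer trace would give a steadier estimate.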
+
+
+Now measure time differently: excluding time spent sleeping, so that we only
+see on-CPU time:
+
+# ./funcslower -PCt ext3_readpages 1000
+Tracing "ext3_readpages" slower than 1000 us... Ctrl-C to end.
+^C
+Ending tracing...
+
+I believe the workload hasn't changed, so these ext3_readpages() calls are
+still happening; however, their CPU time doesn't exceed 1 ms. Compared to the
+earlier output, this tells me that the latency in this function is due to time
+spent blocked off-CPU, and not on-CPU. This makes sense: this function is
+ultimately being blocked on disk I/O.
+
+Were the function duration times to be similar with and without -C, that would
+tell us that the high latency is due to time spent on-CPU executing code.
+
+
+This traces the sys_nanosleep() kernel function, and shows calls taking over
+100 us:
+
+# ./funcslower sys_nanosleep 100
+Tracing "sys_nanosleep" slower than 100 us... Ctrl-C to end.
+0) ! 2000147 us | } /* sys_nanosleep */
+------------------------------------------
+0) registe-27414 => vmstat-27419
+------------------------------------------
+
+0) ! 1000143 us | } /* sys_nanosleep */
+0) ! 1000154 us | } /* sys_nanosleep */
+------------------------------------------
+0) vmstat-27419 => registe-27414
+------------------------------------------
+
+0) ! 2000183 us | } /* sys_nanosleep */
+------------------------------------------
+0) registe-27414 => vmstat-27419
+------------------------------------------
+
+0) ! 1000141 us | } /* sys_nanosleep */
+^C
+Ending tracing...
+
+This is an example where I did not use -P, but ftrace has included process
+information anyway. Look for the lines containing "=>", which indicate a process
+switch on the given CPU.
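
The "=>" lines can be used to label each event with a task even without -P. The sketch below assumes that events printed after a switch banner belong to the task on the right-hand side of "=>", and that events before the first banner have no known owner; both the regular expressions and the name attribute_events are illustrative.

```python
import re

# "0) registe-27414 => vmstat-27419": a context switch on CPU 0.
SWITCH_RE = re.compile(r"^\s*(\d+)\)\s+(\S+)\s+=>\s+(\S+)\s*$")
# "0) ! 1000143 us | } /* sys_nanosleep */": a slow call on CPU 0.
GRAPH_EVENT_RE = re.compile(r"^\s*(\d+)\)\s*[^\d\s]?\s*([\d.]+) us\s*\|")


def attribute_events(lines):
    """Pair each duration with the task last switched onto that CPU."""
    current = {}  # CPU id -> task assumed to be running there
    events = []
    for line in lines:
        switch = SWITCH_RE.match(line)
        if switch:
            current[switch.group(1)] = switch.group(3)
            continue
        event = GRAPH_EVENT_RE.match(line)
        if event:
            task = current.get(event.group(1), "unknown")
            events.append((task, float(event.group(2))))
    return events


sample = [
    "0) ! 2000147 us | } /* sys_nanosleep */",
    "------------------------------------------",
    "0) registe-27414 => vmstat-27419",
    "------------------------------------------",
    "",
    "0) ! 1000143 us | } /* sys_nanosleep */",
    "0) ! 1000154 us | } /* sys_nanosleep */",
    "------------------------------------------",
    "0) vmstat-27419 => registe-27414",
    "------------------------------------------",
    "",
    "0) ! 2000183 us | } /* sys_nanosleep */",
]

events = attribute_events(sample)
for task, duration_us in events:
    print(f"{task:15s} {duration_us:10.0f} us")
```

Under that assumption the one-second sleeps land on vmstat, which fits a "vmstat 1" style workload, while the first event stays unattributed because no switch banner precedes it.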
+
+
+Use -h to print the USAGE message:
+
+# ./funcslower -h
+USAGE: funcslower [-aChHPt] [-p PID] [-d secs] funcstring latency_us
+                 -a              # all info (same as -HPt)
+                 -C              # measure on-CPU time only
+                 -d seconds      # trace duration, and use buffers
+                 -h              # this usage message
+                 -H              # include column headers
+                 -p PID          # trace when this pid is on-CPU
+                 -L TID          # trace when this thread is on-CPU
+                 -P              # show process names & PIDs
+                 -t              # show timestamps
+  eg,
+       funcslower vfs_read 10000 # trace vfs_read() slower than 10 ms
+
+See the man page and example file for more info.
@@ -0,0 +1,341 @@
+Demonstrations of functrace, the Linux ftrace version.
+
+
+A (usually) good example to start with is do_nanosleep(), since it is not called
+frequently, and is easily triggered. Here's tracing it using functrace:
+
+# ./functrace 'do_nanosleep'
+Tracing "do_nanosleep"... Ctrl-C to end.
+svscan-1678 [000] .... 6412438.703521: do_nanosleep <-hrtimer_nanosleep
+svscan-1678 [000] .... 6412443.703678: do_nanosleep <-hrtimer_nanosleep
+svscan-1678 [000] .... 6412448.703865: do_nanosleep <-hrtimer_nanosleep
+vmstat-28371 [000] .... 6412453.216241: do_nanosleep <-hrtimer_nanosleep
+svscan-1678 [000] .... 6412453.704049: do_nanosleep <-hrtimer_nanosleep
+vmstat-28371 [000] .... 6412454.216524: do_nanosleep <-hrtimer_nanosleep
+vmstat-28371 [000] .... 6412455.216816: do_nanosleep <-hrtimer_nanosleep
+vmstat-28371 [000] .... 6412456.217093: do_nanosleep <-hrtimer_nanosleep
+vmstat-28371 [000] .... 6412457.217378: do_nanosleep <-hrtimer_nanosleep
+vmstat-28371 [000] .... 6412458.217660: do_nanosleep <-hrtimer_nanosleep
+^C
+Ending tracing...
+
+While tracing, I ran a "vmstat 1" in another window. vmstat and its process ID
+can be seen as the 1st column, and the timestamp and one second intervals can
+be seen as the 4th column.
+
+These are basic details: who was on-CPU (process name and PID), flags, timestamp,
+and calling function. Treat this as the next step, after funccount, for getting
+a little more information on kernel function execution, before using more
+capabilities to dig further.
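
To make those columns concrete, here is a small parsing sketch that splits one functrace line into the fields just listed. The TraceEvent type, the regular expression, and parse_functrace_line are names invented for this example; the flags column is treated as optional because some kernel versions omit it.

```python
import re
from typing import NamedTuple, Optional


class TraceEvent(NamedTuple):
    comm: str         # process name
    pid: int
    cpu: int
    flags: str        # e.g. "....", or "" when the kernel prints no flags column
    timestamp: float  # seconds
    function: str     # traced function
    caller: str       # calling function, printed after "<-"


LINE_RE = re.compile(
    r"^\s*(?P<comm>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?:(?P<flags>\S+)\s+)?(?P<ts>\d+\.\d+):\s+"
    r"(?P<func>\S+)\s+<-(?P<caller>\S+)\s*$"
)


def parse_functrace_line(line: str) -> Optional[TraceEvent]:
    """Parse one event line, or return None for headers and other text."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    return TraceEvent(
        comm=m.group("comm"),
        pid=int(m.group("pid")),
        cpu=int(m.group("cpu")),
        flags=m.group("flags") or "",
        timestamp=float(m.group("ts")),
        function=m.group("func"),
        caller=m.group("caller"),
    )


event = parse_functrace_line(
    "vmstat-28371 [000] .... 6412453.216241: do_nanosleep <-hrtimer_nanosleep"
)
print(event)
```

Lines that are not events, such as the "Tracing ..." banner or "^C", simply yield None, so a whole capture can be filtered in one pass.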
+
+This is Linux 3.16, and the output is the ftrace text buffer format, which has
+changed slightly between kernel versions.
+
+
+To see the column headers, use -H. This is Linux 3.16:
+
+# ./functrace -H do_nanosleep
+Tracing "do_nanosleep"... Ctrl-C to end.
+# tracer: function
+#
+# entries-in-buffer/entries-written: 0/0   #P:2
+#
+#                              _-----=> irqs-off
+#                             / _----=> need-resched
+#                            | / _---=> hardirq/softirq
+#                            || / _--=> preempt-depth
+#                            ||| /     delay
+#           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
+#              | |       |   ||||       |         |
+          svscan-1678  [001] ....  6413283.729520: do_nanosleep <-hrtimer_nanosleep
+          svscan-1678  [001] ....  6413288.729679: do_nanosleep <-hrtimer_nanosleep
+
+For comparison, here's Linux 3.2:
+
+# ./functrace -H do_nanosleep
+Tracing "do_nanosleep"... Ctrl-C to end.
+# tracer: function
+#
+#           TASK-PID    CPU#    TIMESTAMP  FUNCTION
+#              | |       |          |         |
+          vmstat-11789 [000] 1763207.021204: do_nanosleep <-hrtimer_nanosleep
+          vmstat-11789 [000] 1763208.022970: do_nanosleep <-hrtimer_nanosleep
+          vmstat-11789 [000] 1763209.023267: do_nanosleep <-hrtimer_nanosleep
+
+For documentation on the exact format, see the Linux kernel source under
+Documentation/trace/ftrace.txt.
+
+
+This error:
+
+# ./functrace 'ext4_z*'
+Tracing "ext4_z*"... Ctrl-C to end.
+./functrace: line 136: echo: write error: Invalid argument
+ERROR: enabling "ext4_z*". Exiting.
+
+Is because there were no functions beginning with "ext4_z". You can check
+available functions in the /sys/kernel/debug/tracing/available_filter_functions
+file.
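
A pattern can be checked against that file before tracing. The sketch below uses Python's fnmatch as an approximation of the kernel's wildcard matching, which is an assumption: the kernel's own matcher may accept a narrower or different set of patterns depending on version. The helper names are hypothetical, and the demonstration runs on a small in-memory list so it works without tracefs.

```python
from fnmatch import fnmatchcase
from pathlib import Path

AVAILABLE = Path("/sys/kernel/debug/tracing/available_filter_functions")


def load_available(path=AVAILABLE):
    """Read traceable function names; lines may carry a trailing "[module]" tag."""
    return [line.split()[0] for line in path.read_text().splitlines() if line.strip()]


def matching_functions(pattern, names):
    """Return the names that match a shell-style wildcard pattern."""
    return [name for name in names if fnmatchcase(name, pattern)]


# A tiny stand-in for the real file, so the example runs anywhere.
names = ["ext4_create", "ext4_readdir", "ext4_bmap", "do_nanosleep"]

for pattern in ("ext4_z*", "ext4_*", "*sleep"):
    hits = matching_functions(pattern, names)
    status = ", ".join(hits) if hits else "no matches, tracing would fail"
    print(f"{pattern!r}: {status}")
```

On a real system, load_available() (run as root) would replace the stand-in list.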
+
+
+You might want to use funccount to check the frequency of events before using
+functrace. For example, counting ext3 events on a system:
+
+# ./funccount -d 10 'ext3*'
+Tracing "ext3*" for 10 seconds...
+
+FUNC COUNT
+ext3_journal_dirty_data 1
+ext3_ordered_write_end 1
+ext3_write_begin 1
+ext3_writepage_trans_blocks 1
+ext3_dirty_inode 2
+ext3_do_update_inode 2
+ext3_get_group_desc 2
+ext3_get_inode_block.isra.20 2
+ext3_get_inode_flags 2
+ext3_get_inode_loc 2
+ext3_mark_iloc_dirty 2
+ext3_mark_inode_dirty 2
+ext3_reserve_inode_write 2
+ext3_journal_start_sb 3
+ext3_block_to_path.isra.22 6
+ext3_bmap 6
+ext3_get_block 6
+ext3_get_blocks_handle 6
+ext3_get_branch 6
+ext3_discard_reservation 11
+ext3_ioctl 11
+ext3_release_file 11
+
+Ending tracing...
+
+During 10 seconds, there weren't many ext3 calls. I might consider tracing
+them all (warnings about dynamic tracing many kernel functions apply: test
+before use, as in the past there have been bugs causing panics).
+
+# ./functrace 'ext3_*'
+Tracing "ext3_*"... Ctrl-C to end.
+register_start.-17008 [000] 1763557.577985: ext3_release_file <-__fput
+register_start.-17008 [000] 1763557.577987: ext3_discard_reservation <-ext3_release_file
+register_start.-17026 [000] 1763558.163620: ext3_ioctl <-file_ioctl
+register_start.-17026 [000] 1763558.481081: ext3_release_file <-__fput
+register_start.-17026 [000] 1763558.481083: ext3_discard_reservation <-ext3_release_file
+register_start.-17041 [000] 1763559.186984: ext3_ioctl <-file_ioctl
+register_start.-17041 [000] 1763559.511267: ext3_release_file <-__fput
+[...]
+
+For comparison, here's a different system and ext4:
+
+# ./funccount -d 10 'ext4*'
+Tracing "ext4*" for 10 seconds...
+
+FUNC COUNT
+ext4_journal_commit_callback 2
+ext4_htree_fill_tree 6
+ext4_htree_free_dir_info 6
+ext4_release_dir 6
+ext4_readdir 12
+ext4fs_dirhash 29
+ext4_htree_store_dirent 29
+ext4_follow_link 36
+ext4_file_mmap 42
+ext4_free_data_callback 44
+ext4_getattr 45
+ext4_bmap 62
+ext4_get_block 62
+ext4_add_entry 280
+ext4_add_nondir 280
+ext4_alloc_da_blocks 280
+ext4_alloc_inode 280
+ext4_bio_write_page 280
+ext4_can_truncate 280
+ext4_claim_free_clusters 280
+ext4_clear_inode 280
+ext4_create 280
+ext4_da_get_block_prep 280
+ext4_da_invalidatepage 280
+ext4_da_update_reserve_space 280
+ext4_da_write_begin 280
+ext4_da_write_end 280
+ext4_dec_count.isra.22 280
+ext4_delete_entry 280
+ext4_destroy_inode 280
+ext4_drop_inode 280
+ext4_end_bio 280
+ext4_es_init_tree 280
+ext4_es_lru_del 280
+ext4_evict_inode 280
+ext4_ext_calc_metadata_amount 280
+ext4_ext_correct_indexes 280
+ext4_ext_find_goal 280
+ext4_ext_insert_extent 280
+ext4_ext_remove_space 280
+ext4_ext_tree_init 280
+ext4_ext_truncate 280
+ext4_ext_truncate_extend_resta 280
+ext4_ext_try_to_merge 280
+ext4_ext_try_to_merge_right 280
+ext4_file_write_iter 280
+ext4_find_dest_de 280
+ext4_finish_bio 280
+ext4_free_blocks 280
+ext4_free_inode 280
+ext4_generic_delete_entry 280
+ext4_has_free_clusters 280
+ext4_i_callback 280
+ext4_init_acl 280
+ext4_init_security 280
+ext4_inode_attach_jinode 280
+ext4_inode_to_goal_block 280
+ext4_insert_dentry 280
+ext4_invalidatepage 280
+ext4_io_submit_init 280
+ext4_itable_unused_count 280
+ext4_lookup 280
+ext4_mb_complex_scan_group 280
+ext4_mb_find_by_goal 280
+ext4_mb_free_metadata 280
+ext4_mb_initialize_context 280
+ext4_mb_mark_diskspace_used 280
+ext4_mb_new_blocks 280
+ext4_mb_normalize_request 280
+ext4_mb_regular_allocator 280
+ext4_mb_release_context 280
+ext4_mb_use_best_found 280
+ext4_mb_use_preallocated 280
+ext4_nonda_switch 280
+ext4_orphan_del 280
+ext4_put_io_end_defer 280
+ext4_releasepage 280
+ext4_rename 280
+ext4_set_aops 280
+ext4_setent 280
+ext4_set_inode_flags 280
+ext4_truncate 280
+ext4_writepages 280
+ext4_writepage_trans_blocks 280
+ext4_xattr_delete_inode 280
+ext4_xattr_get 285
+ext4_xattr_ibody_get 285
+ext4_xattr_security_get 285
+ext4_bread 286
+ext4_release_file 288
+ext4_file_open 305
+ext4_superblock_csum_set 494
+ext4_block_bitmap_csum_set 560
+ext4_es_free_extent 560
+ext4_es_insert_extent 560
+ext4_es_remove_extent 560
+ext4_ext_find_extent 560
+ext4_ext_map_blocks 560
+ext4_free_group_clusters_set 560
+ext4_free_inodes_set 560
+ext4_get_group_no_and_offset 560
+ext4_get_reserved_space 560
+ext4_init_io_end 560
+ext4_inode_bitmap_csum_set 560
+ext4_io_submit 560
+ext4_mb_good_group 560
+ext4_orphan_add 560
+ext4_put_io_end 560
+ext4_read_block_bitmap 560
+ext4_read_block_bitmap_nowait 560
+ext4_read_inode_bitmap 560
+ext4_release_io_end 560
+ext4_set_bits 560
+ext4_validate_block_bitmap 560
+ext4_wait_block_bitmap 560
+ext4_mb_load_buddy 604
+ext4_mb_unload_buddy.isra.24 604
+ext4_block_bitmap 840
+ext4_discard_preallocations 840
+ext4_ext_drop_refs 840
+ext4_ext_get_access.isra.30 840
+ext4_ext_index_trans_blocks 840
+ext4_find_entry 840
+ext4_free_group_clusters 840
+ext4_handle_dirty_dirent_node 840
+ext4_inode_bitmap 840
+ext4_meta_trans_blocks 840
+ext4_dirty_inode 845
+ext4_free_inodes_count 1120
+ext4_group_desc_csum 1120
+ext4_group_desc_csum_set 1120
+ext4_getblk 1126
+ext4_map_blocks 1468
+ext4_es_lookup_extent 1748
+ext4_mb_check_limits 1875
+ext4_es_lru_add 2028
+ext4_data_block_valid 2308
+ext4_journal_check_start 3085
+ext4_mark_inode_dirty 5325
+ext4_get_inode_flags 5951
+ext4_get_inode_loc 5951
+ext4_mark_iloc_dirty 5951
+ext4_reserve_inode_write 5951
+ext4_inode_table 7071
+ext4_get_group_desc 8471
+ext4_has_inline_data 9486
+
+Ending tracing...
+
+There are many functions called frequently. Tracing them all may cost
+significant performance overhead. I may read through this list and look for
+the most interesting functions to trace, reducing overheads by only selecting
+a few.
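
One way to shortlist candidates is to parse the funccount table and keep only functions below a call-count ceiling. This is a sketch, not part of the tools: parse_funccount, quiet_functions, and the ceiling of 300 calls are all illustrative choices, and the sample is a handful of rows taken from the output above.

```python
def parse_funccount(text):
    """Parse "FUNC COUNT" rows from funccount output into a dict."""
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        # Data rows have exactly two fields, the second being an integer.
        if len(parts) == 2 and parts[1].isdigit():
            counts[parts[0]] = int(parts[1])
    return counts


def quiet_functions(counts, max_calls):
    """Return the functions called at most max_calls times, sorted by name."""
    return sorted(name for name, calls in counts.items() if calls <= max_calls)


sample_output = """\
Tracing "ext4*" for 10 seconds...

FUNC COUNT
ext4_journal_commit_callback 2
ext4_readdir 12
ext4_create 280
ext4_get_group_desc 8471
ext4_has_inline_data 9486

Ending tracing...
"""

counts = parse_funccount(sample_output)
shortlist = quiet_functions(counts, max_calls=300)
print(shortlist)
```

A low count only says a function is cheap to trace, not that it is interesting; the final choice still needs a human reading of the list.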
+
+For example, ext4_create() looks interesting:
+
+# ./functrace ext4_create
+Tracing "ext4_create"... Ctrl-C to end.
+supervise-1681 [000] .... 6414396.700163: ext4_create <-vfs_create
+supervise-1684 [001] .... 6414396.700287: ext4_create <-vfs_create
+supervise-1681 [000] .... 6414396.700598: ext4_create <-vfs_create
+supervise-1684 [001] .... 6414396.700636: ext4_create <-vfs_create
+supervise-1687 [001] .... 6414396.701577: ext4_create <-vfs_create
+supervise-1688 [000] .... 6414396.702590: ext4_create <-vfs_create
+supervise-1693 [001] .... 6414396.702829: ext4_create <-vfs_create
+supervise-1693 [001] .... 6414396.703592: ext4_create <-vfs_create
+supervise-1688 [000] .... 6414396.703598: ext4_create <-vfs_create
+supervise-1687 [001] .... 6414396.703988: ext4_create <-vfs_create
+supervise-1685 [001] .... 6414396.704126: ext4_create <-vfs_create
+supervise-1685 [001] .... 6414396.704458: ext4_create <-vfs_create
+supervise-1682 [001] .... 6414396.704577: ext4_create <-vfs_create
+supervise-1683 [000] .... 6414396.704984: ext4_create <-vfs_create
+supervise-1682 [001] .... 6414396.704985: ext4_create <-vfs_create
+[...]
+
+Now I know that different PIDs of the supervise program are calling ext4_create,
+at around the same time, and from vfs_create().
+
+
+The duration mode uses buffering, instead of printing events as they occur.
+This greatly reduces overheads. For example:
+
+# ./functrace -d 10 ext4_create > out.ext4_create
+# wc out.ext4_create
+283 1687 21059 out.ext4_create
+
+Note that the buffer has a limited size. Check the timestamps to see if the
+range does not match your duration, as one clue that the buffer was exhausted
+and events were missed.
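
The timestamp check can be automated along these lines. The sketch computes the span between the earliest and latest timestamps in a capture and flags it when the span is much shorter than the requested duration. The function names and the 0.5 tolerance are assumptions, and the result is only a hint: a short span can also mean the events genuinely occurred in a brief burst.

```python
import re

# Timestamps look like " 6414396.700163: " in functrace output.
TS_RE = re.compile(r"\s(\d+\.\d+):\s")


def timestamp_span(lines):
    """Return seconds between the earliest and latest event timestamps."""
    stamps = [float(m.group(1)) for m in map(TS_RE.search, lines) if m]
    return (max(stamps) - min(stamps)) if stamps else 0.0


def buffer_probably_overflowed(lines, duration_s, tolerance=0.5):
    """Heuristic: a span far shorter than the duration hints at lost events."""
    return timestamp_span(lines) < duration_s * tolerance


sample = [
    "supervise-1681 [000] .... 6414396.700163: ext4_create <-vfs_create",
    "supervise-1684 [001] .... 6414396.700287: ext4_create <-vfs_create",
    "supervise-1682 [001] .... 6414396.704985: ext4_create <-vfs_create",
]

span = timestamp_span(sample)
print(f"span: {span:.6f} s, suspicious for a 10 s trace: "
      f"{buffer_probably_overflowed(sample, 10)}")
```

In practice the lines would come from a capture file such as out.ext4_create, read with open(...).read().splitlines().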
+
+
+Use -h to print the USAGE message:
+
+# ./functrace -h
+USAGE: functrace [-hH] [-p PID] [-L TID] [-d secs] funcstring
+                 -d seconds      # trace duration, and use buffers
+                 -h              # this usage message
+                 -H              # include column headers
+                 -p PID          # trace when this pid is on-CPU
+                 -L TID          # trace when this thread is on-CPU
+  eg,
+         functrace do_nanosleep  # trace the do_nanosleep() function
+         functrace '*sleep'      # trace functions ending in "sleep"
+         functrace -p 198 'vfs*' # trace "vfs*" funcs for PID 198
+         functrace 'tcp*' > out  # trace all "tcp*" funcs to out file
+         functrace -d 1 'tcp*' > out     # trace 1 sec, then write out file
+
+See the man page and example file for more info.