fluent-plugin-perf-tools 0.1.0

Files changed (98)
  1. checksums.yaml +7 -0
  2. data/.gitignore +15 -0
  3. data/.rubocop.yml +26 -0
  4. data/.ruby-version +1 -0
  5. data/CHANGELOG.md +5 -0
  6. data/CODE_OF_CONDUCT.md +84 -0
  7. data/Gemfile +5 -0
  8. data/LICENSE.txt +21 -0
  9. data/README.md +43 -0
  10. data/Rakefile +17 -0
  11. data/bin/console +15 -0
  12. data/bin/setup +8 -0
  13. data/fluent-plugin-perf-tools.gemspec +48 -0
  14. data/lib/fluent/plugin/in_perf_tools.rb +42 -0
  15. data/lib/fluent/plugin/perf_tools/cachestat.rb +65 -0
  16. data/lib/fluent/plugin/perf_tools/command.rb +30 -0
  17. data/lib/fluent/plugin/perf_tools/version.rb +9 -0
  18. data/lib/fluent/plugin/perf_tools.rb +11 -0
  19. data/perf-tools/LICENSE +339 -0
  20. data/perf-tools/README.md +205 -0
  21. data/perf-tools/bin/bitesize +1 -0
  22. data/perf-tools/bin/cachestat +1 -0
  23. data/perf-tools/bin/execsnoop +1 -0
  24. data/perf-tools/bin/funccount +1 -0
  25. data/perf-tools/bin/funcgraph +1 -0
  26. data/perf-tools/bin/funcslower +1 -0
  27. data/perf-tools/bin/functrace +1 -0
  28. data/perf-tools/bin/iolatency +1 -0
  29. data/perf-tools/bin/iosnoop +1 -0
  30. data/perf-tools/bin/killsnoop +1 -0
  31. data/perf-tools/bin/kprobe +1 -0
  32. data/perf-tools/bin/opensnoop +1 -0
  33. data/perf-tools/bin/perf-stat-hist +1 -0
  34. data/perf-tools/bin/reset-ftrace +1 -0
  35. data/perf-tools/bin/syscount +1 -0
  36. data/perf-tools/bin/tcpretrans +1 -0
  37. data/perf-tools/bin/tpoint +1 -0
  38. data/perf-tools/bin/uprobe +1 -0
  39. data/perf-tools/deprecated/README.md +1 -0
  40. data/perf-tools/deprecated/execsnoop-proc +150 -0
  41. data/perf-tools/deprecated/execsnoop-proc.8 +80 -0
  42. data/perf-tools/deprecated/execsnoop-proc_example.txt +46 -0
  43. data/perf-tools/disk/bitesize +175 -0
  44. data/perf-tools/examples/bitesize_example.txt +63 -0
  45. data/perf-tools/examples/cachestat_example.txt +58 -0
  46. data/perf-tools/examples/execsnoop_example.txt +153 -0
  47. data/perf-tools/examples/funccount_example.txt +126 -0
  48. data/perf-tools/examples/funcgraph_example.txt +2178 -0
  49. data/perf-tools/examples/funcslower_example.txt +110 -0
  50. data/perf-tools/examples/functrace_example.txt +341 -0
  51. data/perf-tools/examples/iolatency_example.txt +350 -0
  52. data/perf-tools/examples/iosnoop_example.txt +302 -0
  53. data/perf-tools/examples/killsnoop_example.txt +62 -0
  54. data/perf-tools/examples/kprobe_example.txt +379 -0
  55. data/perf-tools/examples/opensnoop_example.txt +47 -0
  56. data/perf-tools/examples/perf-stat-hist_example.txt +149 -0
  57. data/perf-tools/examples/reset-ftrace_example.txt +88 -0
  58. data/perf-tools/examples/syscount_example.txt +297 -0
  59. data/perf-tools/examples/tcpretrans_example.txt +93 -0
  60. data/perf-tools/examples/tpoint_example.txt +210 -0
  61. data/perf-tools/examples/uprobe_example.txt +321 -0
  62. data/perf-tools/execsnoop +292 -0
  63. data/perf-tools/fs/cachestat +167 -0
  64. data/perf-tools/images/perf-tools_2016.png +0 -0
  65. data/perf-tools/iolatency +296 -0
  66. data/perf-tools/iosnoop +296 -0
  67. data/perf-tools/kernel/funccount +146 -0
  68. data/perf-tools/kernel/funcgraph +259 -0
  69. data/perf-tools/kernel/funcslower +248 -0
  70. data/perf-tools/kernel/functrace +192 -0
  71. data/perf-tools/kernel/kprobe +270 -0
  72. data/perf-tools/killsnoop +263 -0
  73. data/perf-tools/man/man8/bitesize.8 +70 -0
  74. data/perf-tools/man/man8/cachestat.8 +111 -0
  75. data/perf-tools/man/man8/execsnoop.8 +104 -0
  76. data/perf-tools/man/man8/funccount.8 +76 -0
  77. data/perf-tools/man/man8/funcgraph.8 +166 -0
  78. data/perf-tools/man/man8/funcslower.8 +129 -0
  79. data/perf-tools/man/man8/functrace.8 +123 -0
  80. data/perf-tools/man/man8/iolatency.8 +116 -0
  81. data/perf-tools/man/man8/iosnoop.8 +169 -0
  82. data/perf-tools/man/man8/killsnoop.8 +100 -0
  83. data/perf-tools/man/man8/kprobe.8 +162 -0
  84. data/perf-tools/man/man8/opensnoop.8 +113 -0
  85. data/perf-tools/man/man8/perf-stat-hist.8 +111 -0
  86. data/perf-tools/man/man8/reset-ftrace.8 +49 -0
  87. data/perf-tools/man/man8/syscount.8 +96 -0
  88. data/perf-tools/man/man8/tcpretrans.8 +93 -0
  89. data/perf-tools/man/man8/tpoint.8 +140 -0
  90. data/perf-tools/man/man8/uprobe.8 +168 -0
  91. data/perf-tools/misc/perf-stat-hist +223 -0
  92. data/perf-tools/net/tcpretrans +311 -0
  93. data/perf-tools/opensnoop +280 -0
  94. data/perf-tools/syscount +192 -0
  95. data/perf-tools/system/tpoint +232 -0
  96. data/perf-tools/tools/reset-ftrace +123 -0
  97. data/perf-tools/user/uprobe +390 -0
  98. metadata +349 -0
@@ -0,0 +1,62 @@
1
+ Demonstrations of killsnoop, the Linux ftrace version.
2
+
3
+
4
+ What signals are happening on my system?
5
+
6
+ # ./killsnoop
7
+ Tracing kill()s. Ctrl-C to end.
8
+ COMM PID TPID SIGNAL RETURN
9
+ postgres 2209 2148 10 0
10
+ postgres 5416 2209 12 0
11
+ postgres 5416 2209 12 0
12
+ supervise 2135 5465 15 0
13
+ supervise 2135 5465 18 0
14
+ ^C
15
+ Ending tracing...
16
+
17
+ The first line of output shows that PID 2209, process name "postgres", has
18
+ sent a signal 10 (SIGUSR1) to target PID 2148. This signal returned success (0).
19
+
20
+ killsnoop traces the kill() syscall, which is used to send signals to other
21
+ processes. These signals can include SIGKILL and SIGTERM, both of which
22
+ ultimately kill the target process (in different fashions), but the signals
23
+ may also include other operations, including checking if a process still
24
+ exists (signal 0). To read more about signals, see "man -s7 signal".
25
+
26
+ killsnoop can be useful to identify why some processes are abruptly and
27
+ unexpectedly ending (also check for the OOM killer in dmesg).
28
+
29
+
30
+ The -s option can be used to print signal names instead of numbers:
31
+
32
+ # ./killsnoop -s
33
+ Tracing kill()s. Ctrl-C to end.
34
+ COMM PID KILLED SIGNAL RETURN
35
+ postgres 2209 2148 SIGUSR1 0
36
+ postgres 5665 2209 SIGUSR2 0
37
+ postgres 5665 2209 SIGUSR2 0
38
+ supervise 2135 5711 SIGTERM 0
39
+ supervise 2135 5711 SIGCONT 0
40
+ bash 27450 27450 0 0
41
+ [...]
42
+
43
+ On the last line: there wasn't a nice signal name for signal 0, so just numeric
44
+ 0 is printed. You'll see signal 0 used to check whether processes still exist.
45
+
46
+
47
+ Use -h to print the USAGE message:
48
+
49
+ # ./killsnoop -h
50
+ USAGE: killsnoop [-ht] [-d secs] [-p PID] [-n name] [filename]
51
+ -d seconds # trace duration, and use buffers
52
+ -n name # process name to match
53
+ -p PID # PID to match on kill issue
54
+ -t # include time (seconds)
55
+ -s # human readable signal names
56
+ -h # this usage message
57
+ eg,
58
+ killsnoop # watch kill()s live (unbuffered)
59
+ killsnoop -d 1 # trace 1 sec (buffered)
60
+ killsnoop -p 181 # trace kill()s issued to PID 181 only
61
+
62
+ See the man page and example file for more info.
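The killsnoop rows shown above are simple whitespace-separated records, so they are easy to post-process. The following is a minimal Python sketch, not part of killsnoop or of this gem: the names `KillEvent`, `parse_kill_line`, and `SIGNAL_NAMES` are hypothetical, and the signal table only covers the numbers that appear in the two example runs (as seen on that Linux system). It assumes the COMM field contains no spaces.

```python
from typing import NamedTuple, Optional

# Signal numbers as they appear in the two example runs above; this table is
# an illustrative assumption, not killsnoop's own lookup.
SIGNAL_NAMES = {10: "SIGUSR1", 12: "SIGUSR2", 15: "SIGTERM", 18: "SIGCONT"}


class KillEvent(NamedTuple):
    comm: str
    pid: int
    target_pid: int
    signal: str
    ret: int


def parse_kill_line(line: str) -> Optional[KillEvent]:
    """Parse one 'COMM PID TPID SIGNAL RETURN' row; return None for other lines."""
    fields = line.split()
    if len(fields) != 5 or not fields[1].isdigit():
        return None  # header, banner, "^C", or "[...]" lines
    comm, pid, tpid, sig, ret = fields
    # Numeric signals are mapped to names where known; signal 0 stays "0".
    if sig.isdigit():
        sig = SIGNAL_NAMES.get(int(sig), sig)
    return KillEvent(comm, int(pid), int(tpid), sig, int(ret))
```

Feeding it `"postgres 2209 2148 10 0"` yields an event whose `signal` field is `"SIGUSR1"`, while header and banner lines return `None`.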
@@ -0,0 +1,379 @@
1
+ Demonstrations of kprobe, the Linux ftrace version.
2
+
3
+
4
+ This traces the kernel do_sys_open() function, when it is called:
5
+
6
+ # ./kprobe p:do_sys_open
7
+ Tracing kprobe do_sys_open. Ctrl-C to end.
8
+ kprobe-26042 [001] d... 6910441.001452: do_sys_open: (do_sys_open+0x0/0x220)
9
+ kprobe-26042 [001] d... 6910441.001475: do_sys_open: (do_sys_open+0x0/0x220)
10
+ kprobe-26042 [001] d... 6910441.001866: do_sys_open: (do_sys_open+0x0/0x220)
11
+ kprobe-26042 [001] d... 6910441.001966: do_sys_open: (do_sys_open+0x0/0x220)
12
+ supervise-1689 [000] d... 6910441.083302: do_sys_open: (do_sys_open+0x0/0x220)
13
+ supervise-1693 [001] d... 6910441.083530: do_sys_open: (do_sys_open+0x0/0x220)
14
+ supervise-1689 [000] d... 6910441.083759: do_sys_open: (do_sys_open+0x0/0x220)
15
+ supervise-1693 [001] d... 6910441.083877: do_sys_open: (do_sys_open+0x0/0x220)
16
+ [...]
17
+
18
+ The "p:" is for creating a probe. Use "r:" to probe the return of the function:
19
+
20
+ # ./kprobe r:do_sys_open
21
+ Tracing kprobe do_sys_open. Ctrl-C to end.
22
+ kprobe-29475 [001] d... 6910688.229777: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
23
+ <...>-29476 [001] d... 6910688.231101: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
24
+ <...>-29476 [001] d... 6910688.231123: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
25
+ <...>-29476 [001] d... 6910688.231530: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
26
+ <...>-29476 [001] d... 6910688.231624: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
27
+ supervise-1685 [001] d... 6910688.328776: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
28
+ supervise-1689 [000] d... 6910688.328780: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
29
+ [...]
30
+
31
+ This output includes the function that the traced function is returning to.
32
+
33
+
34
+ The trace output can be a little different between kernel versions. Use -H to
35
+ print the header:
36
+
37
+ # ./kprobe -H p:do_sys_open
38
+ Tracing kprobe do_sys_open. Ctrl-C to end.
39
+ # tracer: nop
40
+ #
41
+ # entries-in-buffer/entries-written: 4/4 #P:2
42
+ #
43
+ # _-----=> irqs-off
44
+ # / _----=> need-resched
45
+ # | / _---=> hardirq/softirq
46
+ # || / _--=> preempt-depth
47
+ # ||| / delay
48
+ # TASK-PID CPU# |||| TIMESTAMP FUNCTION
49
+ # | | | |||| | |
50
+ kprobe-27952 [001] d... 6910580.008086: do_sys_open: (do_sys_open+0x0/0x220)
51
+ kprobe-27952 [001] d... 6910580.008109: do_sys_open: (do_sys_open+0x0/0x220)
52
+ kprobe-27952 [001] d... 6910580.008483: do_sys_open: (do_sys_open+0x0/0x220)
53
+ [...]
54
+
55
+ These columns are explained in the kernel source under Documentation/trace/ftrace.txt.
56
+
57
+
58
+ This traces do_sys_open() returns, using a probe alias "myopen", and showing
59
+ the return value ($retval):
60
+
61
+ # ./kprobe 'r:myopen do_sys_open $retval'
62
+ Tracing kprobe myopen. Ctrl-C to end.
63
+ kprobe-26386 [001] d... 6593278.858754: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
64
+ <...>-26387 [001] d... 6593278.860043: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
65
+ <...>-26387 [001] d... 6593278.860064: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
66
+ <...>-26387 [001] d... 6593278.860433: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
67
+ <...>-26387 [001] d... 6593278.860521: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
68
+ supervise-1685 [001] d... 6593279.178806: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
69
+ supervise-1689 [001] d... 6593279.228756: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
70
+ supervise-1689 [001] d... 6593279.229106: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
71
+ supervise-1688 [000] d... 6593279.229501: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
72
+ supervise-1695 [000] d... 6593279.229944: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
73
+ supervise-1685 [001] d... 6593279.230104: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
74
+ supervise-1687 [001] d... 6593279.230293: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
75
+ supervise-1699 [000] d... 6593279.230381: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
76
+ supervise-1692 [000] d... 6593279.230825: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
77
+ supervise-1698 [000] d... 6593279.230915: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
78
+ supervise-1698 [000] d... 6593279.231277: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
79
+ supervise-1690 [000] d... 6593279.231703: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
80
+ ^C
81
+ Ending tracing...
82
+
83
+ The string specified, 'r:myopen do_sys_open $retval', is a kprobe definition,
84
+ and is the same as those documented in the Linux kernel source under
85
+ Documentation/trace/kprobetrace.txt, which can be written to the
86
+ /sys/kernel/debug/tracing/kprobe_events file.
87
+
88
+ Apart from probe name aliases, you can also provide arbitrary names for
89
+ arguments. Eg, instead of the "arg1" default, calling it "rval":
90
+
91
+ # ./kprobe 'r:myopen do_sys_open rval=$retval'
92
+ Tracing kprobe myopen. Ctrl-C to end.
93
+ kprobe-27454 [001] d... 6593356.250019: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
94
+ <...>-27455 [001] d... 6593356.251280: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
95
+ <...>-27455 [001] d... 6593356.251301: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
96
+ <...>-27455 [001] d... 6593356.251672: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
97
+ <...>-27455 [001] d... 6593356.251769: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
98
+ supervise-1689 [000] d... 6593356.859758: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
99
+ supervise-1689 [000] d... 6593356.860143: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
100
+ supervise-1696 [000] d... 6593356.862682: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
101
+ supervise-1685 [001] d... 6593356.862684: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
102
+ [...]
103
+
104
+ That's a bit better.
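The definition strings used so far follow one pattern: a probe type ("p" or "r"), an optional alias, the function name, then optional named arguments. As a rough illustration only, here is a small Python helper that assembles such strings; `kprobe_definition` is a hypothetical name, it does no validation beyond the probe type, and it only covers the forms shown in these examples.

```python
from typing import Mapping, Optional


def kprobe_definition(kind: str, function: str, alias: Optional[str] = None,
                      args: Optional[Mapping[str, str]] = None) -> str:
    """Build a definition such as 'r:myopen do_sys_open rval=$retval'."""
    if kind not in ("p", "r"):
        raise ValueError("kind must be 'p' (entry) or 'r' (return)")
    # With an alias the form is 'kind:alias function'; without, 'kind:function'.
    head = f"{kind}:{alias} {function}" if alias else f"{kind}:{function}"
    # Each argument becomes 'name=expression', in insertion order.
    fetch = [f"{name}={expr}" for name, expr in (args or {}).items()]
    return " ".join([head, *fetch])
```

For example, `kprobe_definition("r", "do_sys_open", "myopen", {"rval": "$retval"})` produces the string passed to kprobe in the example above.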
105
+
106
+
107
+ Tracing the open() mode:
108
+
109
+ # ./kprobe 'p:myopen do_sys_open mode=%cx:u16'
110
+ Tracing kprobe myopen. Ctrl-C to end.
111
+ kprobe-29572 [001] d... 6593503.353923: myopen: (do_sys_open+0x0/0x220) mode=0x1
112
+ kprobe-29572 [001] d... 6593503.353945: myopen: (do_sys_open+0x0/0x220) mode=0x0
113
+ kprobe-29572 [001] d... 6593503.354307: myopen: (do_sys_open+0x0/0x220) mode=0x5c00
114
+ kprobe-29572 [001] d... 6593503.354401: myopen: (do_sys_open+0x0/0x220) mode=0x0
115
+ supervise-1689 [000] d... 6593503.944125: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
116
+ supervise-1688 [001] d... 6593503.944125: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
117
+ supervise-1688 [001] d... 6593503.944606: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
118
+ supervise-1689 [000] d... 6593503.944606: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
119
+ supervise-1698 [000] d... 6593503.944728: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
120
+ supervise-1698 [000] d... 6593503.945077: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
121
+ [...]
122
+
123
+ Here I guessed that the mode was in register %cx, and cast it as a 16-bit
124
+ unsigned integer (":u16"). Your platform and kernel may be different, and the
125
+ mode may be in a different register. If fiddling with such registers becomes too
126
+ painful or unreliable for you, consider installing kernel debuginfo and using
127
+ the named variables with perf_events "perf probe".
128
+
129
+
130
+ Tracing the open() filename:
131
+
132
+ # ./kprobe 'p:myopen do_sys_open filename=+0(%si):string'
133
+ Tracing kprobe myopen. Ctrl-C to end.
134
+ kprobe-32369 [001] d... 6593706.999728: myopen: (do_sys_open+0x0/0x220) filename="/etc/ld.so.cache"
135
+ kprobe-32369 [001] d... 6593706.999748: myopen: (do_sys_open+0x0/0x220) filename="/lib/x86_64-linux-gnu/libc.so.6"
136
+ kprobe-32369 [001] d... 6593707.000092: myopen: (do_sys_open+0x0/0x220) filename="/usr/lib/locale/locale-archive"
137
+ kprobe-32369 [001] d... 6593707.000176: myopen: (do_sys_open+0x0/0x220) filename="trace_pipe"
138
+ supervise-1699 [000] d... 6593707.254970: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
139
+ supervise-1689 [001] d... 6593707.254970: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
140
+ supervise-1689 [001] d... 6593707.255432: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
141
+ supervise-1699 [000] d... 6593707.255432: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
142
+ supervise-1695 [001] d... 6593707.258805: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
143
+ [...]
144
+
145
+ As mentioned previously, the %si register may be different on your platform.
146
+ In this example, I cast it as a string.
147
+
148
+
149
+ Specifying a duration will buffer in-kernel (reducing overhead), and write at
150
+ the end. Here's tracing for 10 seconds, and writing to the "out" file:
151
+
152
+ # ./kprobe -d 10 'p:myopen do_sys_open filename=+0(%si):string' > out
153
+
154
+
155
+ You can match on a single PID only:
156
+
157
+ # ./kprobe -p 1696 'p:myopen do_sys_open filename=+0(%si):string'
158
+ Tracing kprobe myopen. Ctrl-C to end.
159
+ supervise-1696 [001] d... 6593773.677033: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
160
+ supervise-1696 [001] d... 6593773.677332: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
161
+ supervise-1696 [001] d... 6593774.697144: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
162
+ supervise-1696 [001] d... 6593774.697675: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
163
+ supervise-1696 [001] d... 6593775.717986: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
164
+ supervise-1696 [001] d... 6593775.718499: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
165
+ ^C
166
+ Ending tracing...
167
+
168
+ This will only show events when that PID is on-CPU.
169
+
170
+
171
+ The -v option will show you the available variables you can use in custom
172
+ filters:
173
+
174
+ # ./kprobe -v 'p:myopen do_sys_open filename=+0(%si):string'
175
+ name: myopen
176
+ ID: 1443
177
+ format:
178
+ field:unsigned short common_type; offset:0; size:2; signed:0;
179
+ field:unsigned char common_flags; offset:2; size:1; signed:0;
180
+ field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
181
+ field:int common_pid; offset:4; size:4; signed:1;
182
+
183
+ field:unsigned long __probe_ip; offset:8; size:8; signed:0;
184
+ field:__data_loc char[] filename; offset:16; size:4; signed:1;
185
+
186
+ print fmt: "(%lx) filename=\"%s\"", REC->__probe_ip, __get_str(filename)
187
+
188
+
189
+ Tracing filenames that end in "stat", by adding a filter:
190
+
191
+ # ./kprobe 'p:myopen do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'
192
+ Tracing kprobe myopen. Ctrl-C to end.
193
+ postgres-1172 [000] d... 6594028.787166: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
194
+ postgres-1172 [001] d... 6594028.797410: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
195
+ postgres-1172 [001] d... 6594028.797467: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
196
+ postgres-4443 [001] d... 6594028.800908: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
197
+ postgres-4443 [000] d... 6594028.811237: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
198
+ postgres-4443 [000] d... 6594028.811290: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
199
+ ^C
200
+ Ending tracing...
201
+
202
+ This filtering is done in kernel context.
203
+
204
+
205
+ As an example of tracing a deeper kernel function, let's trace bio_alloc() and
206
+ entry registers:
207
+
208
+ # ./kprobe 'p:myprobe bio_alloc %ax %bx %cx %dx'
209
+ Tracing kprobe myprobe. Ctrl-C to end.
210
+ supervise-3055 [000] 2172148.728250: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
211
+ supervise-3055 [000] 2172148.728527: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
212
+ jbd2/xvda1-8-212 [000] 2172149.749474: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800ad1f87b8 arg3=ffff8800ba22c06c arg4=8
213
+ jbd2/xvda1-8-212 [000] 2172149.749485: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d053a8 arg3=10f16c5bb arg4=0
214
+ jbd2/xvda1-8-212 [000] 2172149.749487: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05958 arg3=5 arg4=0
215
+ jbd2/xvda1-8-212 [000] 2172149.749488: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05b60 arg3=5 arg4=0
216
+ jbd2/xvda1-8-212 [000] 2172149.749489: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05820 arg3=5 arg4=0
217
+ jbd2/xvda1-8-212 [000] 2172149.749489: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d055b0 arg3=5 arg4=0
218
+ jbd2/xvda1-8-212 [000] 2172149.749490: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88006ff22ea0 arg3=5 arg4=0
219
+ jbd2/xvda1-8-212 [000] 2172149.749491: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d1f000 arg3=5 arg4=0
220
+ jbd2/xvda1-8-212 [000] 2172149.749492: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d1f138 arg3=5 arg4=0
221
+ jbd2/xvda1-8-212 [000] 2172149.749493: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d267138 arg3=5 arg4=0
222
+ jbd2/xvda1-8-212 [000] 2172149.749494: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d267680 arg3=5 arg4=0
223
+ jbd2/xvda1-8-212 [000] 2172149.749495: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d2675b0 arg3=5 arg4=0
224
+ jbd2/xvda1-8-212 [000] 2172149.751044: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800cc241ea0 arg3=445f0300 arg4=ffff8800effba000
225
+ supervise-3055 [000] 2172149.751095: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
226
+ supervise-3055 [000] 2172149.751341: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
227
+ supervise-3055 [000] 2172150.772033: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
228
+ supervise-3055 [000] 2172150.772305: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
229
+ flush-202:1-409 [000] 2172151.087815: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800da51d6e8 arg3=16afd arg4=1
230
+ flush-202:1-409 [000] 2172151.087829: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7537f08 arg3=16afd arg4=2
231
+ flush-202:1-409 [000] 2172151.087844: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7519af8 arg3=16afd arg4=3
232
+ flush-202:1-409 [000] 2172151.087846: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7511478 arg3=16afd arg4=4
233
+ flush-202:1-409 [000] 2172151.087849: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e75e6a90 arg3=16afd arg4=5
234
+ flush-202:1-409 [000] 2172151.087851: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7512bc8 arg3=16afd arg4=6
235
+ flush-202:1-409 [000] 2172151.087853: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800eb3bf410 arg3=16afd arg4=7
236
+ ^C
237
+
238
+ The output includes who is on-CPU, high resolution timestamps, and the arguments
239
+ we requested (registers %ax to %dx). These registers are platform dependent,
240
+ and are mapped by the compiler to the entry arguments of the function.
241
+
242
+ How are these useful? If you are debugging this kernel function, you'll know. :)
243
+
244
+
245
+ Note that you can add qualifiers, eg, if I knew %ax was a uint32:
246
+
247
+ # ./kprobe 'p:myprobe bio_alloc %ax:u32'
248
+ Tracing kprobe myprobe. Ctrl-C to end.
249
+ supervise-3055 [000] 2172389.734606: myprobe: (bio_alloc+0x0/0x30) arg1=64acf948
250
+ supervise-3055 [000] 2172389.734865: myprobe: (bio_alloc+0x0/0x30) arg1=64acc8d0
251
+ supervise-3055 [000] 2172390.772391: myprobe: (bio_alloc+0x0/0x30) arg1=64acf948
252
+ supervise-3055 [000] 2172390.772676: myprobe: (bio_alloc+0x0/0x30) arg1=64acc8d0
253
+ ^C
254
+ Ending tracing...
255
+
256
+ You can give them aliases too, instead of the default arg1..N:
257
+
258
+ # ./kprobe 'p:myprobe bio_alloc ax=%ax'
259
+ Tracing kprobe myprobe. Ctrl-C to end.
260
+ supervise-3055 [000] 2172420.451663: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
261
+ supervise-3055 [000] 2172420.451938: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acf948
262
+ flush-202:1-409 [000] 2172421.163462: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
263
+ supervise-3055 [000] 2172421.500994: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
264
+ supervise-3055 [000] 2172421.501307: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acf948
265
+ ^C
266
+ Ending tracing...
267
+
268
+
269
+ Now for the return of bio_alloc():
270
+
271
+ # ./kprobe 'r:myprobe bio_alloc $retval'
272
+ Tracing kprobe myprobe. Ctrl-C to end.
273
+ supervise-3055 [000] 2172164.145533: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e55843c0
274
+ supervise-3055 [000] 2172164.145829: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e5584840
275
+ jbd2/xvda1-8-212 [000] 2172165.166453: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57596c0
276
+ jbd2/xvda1-8-212 [000] 2172165.166493: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759c00
277
+ jbd2/xvda1-8-212 [000] 2172165.166496: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759600
278
+ jbd2/xvda1-8-212 [000] 2172165.166497: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759e40
279
+ jbd2/xvda1-8-212 [000] 2172165.166498: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57590c0
280
+ jbd2/xvda1-8-212 [000] 2172165.166500: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57599c0
281
+ jbd2/xvda1-8-212 [000] 2172165.166500: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759a80
282
+ jbd2/xvda1-8-212 [000] 2172165.166502: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759f00
283
+ jbd2/xvda1-8-212 [000] 2172165.166503: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759540
284
+ jbd2/xvda1-8-212 [000] 2172165.166504: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759180
285
+ jbd2/xvda1-8-212 [000] 2172165.166504: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759900
286
+ jbd2/xvda1-8-212 [000] 2172165.166505: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759000
287
+ jbd2/xvda1-8-212 [000] 2172165.166506: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759480
288
+ <...>-212 [000] 2172165.176261: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759480
289
+ supervise-3055 [000] 2172165.176317: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e57596c0
290
+ supervise-3055 [000] 2172165.176586: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e5759900
291
+ ^C
292
+ Ending tracing...
293
+
294
+ Great. This output includes the function we are returning to, in most cases,
295
+ submit_bh().
296
+
297
+ Note that this mode (without a duration) prints events as they happen,
298
+ so the overheads can be high for frequent events. You could try the -d mode,
299
+ which buffers in-kernel.
300
+
301
+
302
+ The -s option will print the kernel stack trace after the event:
303
+
304
+ # ./kprobe -s 'p:mytcp tcp_init_cwnd'
305
+ Tracing kprobe mytcp. Ctrl-C to end.
306
+ sshd-5121 [000] d... 6897275.911301: mytcp: (tcp_init_cwnd+0x0/0x40)
307
+ sshd-5121 [000] d... 6897275.911309: <stack trace>
308
+ => tcp_write_xmit
309
+ => __tcp_push_pending_frames
310
+ => tcp_push
311
+ => tcp_sendmsg
312
+ => inet_sendmsg
313
+ => sock_aio_write
314
+ => do_sync_write
315
+ => vfs_write
316
+ => SyS_write
317
+ => system_call_fastpath
318
+ sshd-32219 [000] d... 6897275.911467: mytcp: (tcp_init_cwnd+0x0/0x40)
319
+ sshd-32219 [000] d... 6897275.911471: <stack trace>
320
+ => tcp_write_xmit
321
+ => __tcp_push_pending_frames
322
+ => tcp_push
323
+ => tcp_sendmsg
324
+ => inet_sendmsg
325
+ => sock_aio_write
326
+ => do_sync_write
327
+ => vfs_write
328
+ => SyS_write
329
+ => system_call_fastpath
330
+ sshd-5121 [000] d... 6897277.878794: mytcp: (tcp_init_cwnd+0x0/0x40)
331
+ sshd-5121 [000] d... 6897277.878801: <stack trace>
332
+ => tcp_write_xmit
333
+ => __tcp_push_pending_frames
334
+ => tcp_push
335
+ => tcp_sendmsg
336
+ => inet_sendmsg
337
+ => sock_aio_write
338
+ => do_sync_write
339
+ => vfs_write
340
+ => SyS_write
341
+ => system_call_fastpath
342
+
343
+ This makes use of the kernel options/stacktrace feature.
344
+
345
+
346
+ Use -h to print the USAGE message:
347
+
348
+ # ./kprobe -h
349
+ USAGE: kprobe [-FhHsv] [-d secs] [-p PID] [-L TID] kprobe_definition [filter]
350
+ -F # force. trace despite warnings.
351
+ -d seconds # trace duration, and use buffers
352
+ -p PID # PID to match on events
353
+ -L TID # thread id to match on events
354
+ -v # view format file (don't trace)
355
+ -H # include column headers
356
+ -s # show kernel stack traces
357
+ -h # this usage message
358
+
359
+ Note that these examples may need modification to match your kernel
360
+ version's function names and platform's register usage.
361
+ eg,
362
+ kprobe p:do_sys_open
363
+ # trace open() entry
364
+ kprobe r:do_sys_open
365
+ # trace open() return
366
+ kprobe 'r:do_sys_open $retval'
367
+ # trace open() return value
368
+ kprobe 'r:myopen do_sys_open $retval'
369
+ # use a custom probe name
370
+ kprobe 'p:myopen do_sys_open mode=%cx:u16'
371
+ # trace open() file mode
372
+ kprobe 'p:myopen do_sys_open filename=+0(%si):string'
373
+ # trace open() with filename
374
+ kprobe -s 'p:myprobe tcp_retransmit_skb'
375
+ # show kernel stacks
376
+ kprobe 'p:do_sys_open file=+0(%si):string' 'file ~ "*stat"'
377
+ # opened files ending in "stat"
378
+
379
+ See the man page and example file for more info.
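The event lines in the kprobe examples above share a common layout: task and PID, CPU in brackets, optional flag characters, a timestamp, the probe name, a location in parentheses, and any key=value arguments. The following Python sketch parses that layout for post-processing; it is an assumption-laden illustration (the names `parse_event`, `_EVENT`, and `_ARG` are hypothetical), and since the text notes that trace output can differ between kernel versions, the pattern may need adjusting.

```python
import re
from typing import Dict, Optional

# TASK-PID [CPU] optional-flags TIMESTAMP: probe: (location) key=value ...
_EVENT = re.compile(
    r"^\s*(?P<task>.+)-(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?:(?P<flags>\S+)\s+)?(?P<ts>\d+\.\d+):\s+"
    r"(?P<probe>\w+):\s+\((?P<location>[^)]*)\)\s*(?P<rest>.*)$"
)
# An argument value is either a double-quoted string or a run of non-spaces.
_ARG = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_event(line: str) -> Optional[Dict[str, object]]:
    """Return the fields of one event line, or None for non-event lines."""
    m = _EVENT.match(line)
    if m is None:
        return None  # banners, "[...]", stack trace lines, and so on
    args = {k: v.strip('"') for k, v in _ARG.findall(m.group("rest"))}
    return {
        "task": m.group("task"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "timestamp": float(m.group("ts")),
        "probe": m.group("probe"),
        "location": m.group("location"),
        "args": args,
    }
```

Task names may themselves contain dashes (for example `jbd2/xvda1-8-212`), so the pattern treats only the final `-digits` before the CPU field as the PID.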
@@ -0,0 +1,47 @@
1
+ Demonstrations of opensnoop, the Linux ftrace version.
2
+
3
+
4
+ # ./opensnoop
5
+ Tracing open()s. Ctrl-C to end.
6
+ COMM PID FD FILE
7
+ opensnoop 5334 0x3
8
+ <...> 5343 0x3 /etc/ld.so.cache
9
+ opensnoop 5342 0x3 /etc/ld.so.cache
10
+ <...> 5343 0x3 /lib/x86_64-linux-gnu/libc.so.6
11
+ opensnoop 5342 0x3 /lib/x86_64-linux-gnu/libm.so.6
12
+ opensnoop 5342 0x3 /lib/x86_64-linux-gnu/libc.so.6
13
+ <...> 5343 0x3 /usr/lib/locale/locale-archive
14
+ <...> 5343 0x3 trace_pipe
15
+ supervise 1684 0x9 supervise/status.new
16
+ supervise 1684 0x9 supervise/status.new
17
+ supervise 1688 0x9 supervise/status.new
18
+ supervise 1688 0x9 supervise/status.new
19
+ supervise 1686 0x9 supervise/status.new
20
+ supervise 1685 0x9 supervise/status.new
21
+ supervise 1685 0x9 supervise/status.new
22
+ supervise 1686 0x9 supervise/status.new
23
+ [...]
24
+
25
+ The first several lines show opensnoop catching itself initializing.
26
+
27
+
28
+ Use -h to print the USAGE message:
29
+
30
+ # ./opensnoop -h
31
+ USAGE: opensnoop [-htx] [-d secs] [-p PID] [-L TID] [-n name] [filename]
32
+ -d seconds # trace duration, and use buffers
33
+ -n name # process name to match on open
34
+ -p PID # PID to match on open
35
+ -L TID # thread id to match on open
36
+ -t # include time (seconds)
37
+ -x # only show failed opens
38
+ -h # this usage message
39
+ filename # match filename (partials, REs, ok)
40
+ eg,
41
+ opensnoop # watch open()s live (unbuffered)
42
+ opensnoop -d 1 # trace 1 sec (buffered)
43
+ opensnoop -p 181 # trace I/O issued by PID 181 only
44
+ opensnoop conf # trace filenames containing "conf"
45
+ opensnoop 'log$' # filenames ending in "log"
46
+
47
+ See the man page and example file for more info.
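The usage message says the filename argument accepts partial names and regular expressions. How opensnoop applies the pattern internally is not shown here; as a sketch of the described semantics only, the Python below keeps names where the pattern matches anywhere in the string. The function `match_filenames` and the sample paths in the usage note are hypothetical.

```python
import re
from typing import Iterable, List


def match_filenames(pattern: str, filenames: Iterable[str]) -> List[str]:
    """Keep names where the regular expression matches anywhere in the name."""
    rx = re.compile(pattern)
    # search() (not match()) gives "partial" matching; anchors such as $ still work.
    return [name for name in filenames if rx.search(name)]
```

With sample names `/etc/ld.so.conf`, `/var/log/messages`, and `/tmp/app.log`, the pattern `conf` keeps only the first, while `log$` keeps only `/tmp/app.log` because `$` anchors the match to the end of the name.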
@@ -0,0 +1,149 @@
1
+ Demonstrations of perf-stat-hist, the Linux perf_events version.
2
+
3
+
4
+ Tracing the net:net_dev_xmit tracepoint, and building a power-of-4 histogram
5
+ for the "len" variable, for 10 seconds:
6
+
7
+ # ./perf-stat-hist net:net_dev_xmit len 10
8
+ Tracing net:net_dev_xmit, power-of-4, max 1048576, for 10 seconds...
9
+
10
+ Range : Count Distribution
11
+ 0 : 0 | |
12
+ 1 -> 3 : 0 | |
13
+ 4 -> 15 : 0 | |
14
+ 16 -> 63 : 2 |# |
15
+ 64 -> 255 : 30 |### |
16
+ 256 -> 1023 : 3 |# |
17
+ 1024 -> 4095 : 446 |######################################|
18
+ 4096 -> 16383 : 0 | |
19
+ 16384 -> 65535 : 0 | |
20
+ 65536 -> 262143 : 0 | |
21
+ 262144 -> 1048575 : 0 | |
22
+ 1048576 -> : 0 | |
23
+
24
+ This showed that most of the network transmits were between 1024 and 4095 bytes,
25
+ with a handful between 64 and 255 bytes.
26
+
27
+ Cat the format file for the tracepoint to see what other variables are available
28
+ to trace. Eg:
29
+
30
+ # cat /sys/kernel/debug/tracing/events/net/net_dev_xmit/format
31
+ name: net_dev_xmit
32
+ ID: 1078
33
+ format:
34
+ field:unsigned short common_type; offset:0; size:2; signed:0;
35
+ field:unsigned char common_flags; offset:2; size:1; signed:0;
36
+ field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
37
+ field:int common_pid; offset:4; size:4; signed:1;
38
+
39
+ field:void * skbaddr; offset:8; size:8; signed:0;
40
+ field:unsigned int len; offset:16; size:4; signed:0;
41
+ field:int rc; offset:20; size:4; signed:1;
42
+ field:__data_loc char[] name; offset:24; size:4; signed:1;
43
+
44
+ print fmt: "dev=%s skbaddr=%p len=%u rc=%d", __get_str(name), REC->skbaddr, REC->len, REC->rc
45
+
46
+ That's where "len" came from.
47
+
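The field names can also be pulled out of a format file programmatically. The following is a minimal Python sketch, assuming the `field:<type> <name>; offset:N; size:N; signed:N;` layout shown above. `tracepoint_fields` and `SAMPLE` are hypothetical names used for illustration and are not part of perf-tools.

```python
import re

# Matches lines such as "field:unsigned int len; offset:16; size:4; signed:0;"
FIELD_RE = re.compile(
    r"^\s*field:(?P<decl>[^;]+);\s*offset:(?P<offset>\d+);\s*size:(?P<size>\d+);"
)


def tracepoint_fields(format_text):
    """Return (name, type, size) tuples for each field line in a format file."""
    fields = []
    for line in format_text.splitlines():
        m = FIELD_RE.match(line)
        if not m:
            continue  # skip "name:", "ID:", "print fmt:" and blank lines
        decl = m.group("decl").strip()
        # The name is the last token of the declaration; the rest is the type.
        ctype, _, name = decl.rpartition(" ")
        fields.append((name, ctype, int(m.group("size"))))
    return fields


# Abbreviated copy of the net_dev_xmit format text shown above.
SAMPLE = """\
name: net_dev_xmit
format:
    field:int common_pid; offset:4; size:4; signed:1;

    field:void * skbaddr; offset:8; size:8; signed:0;
    field:unsigned int len; offset:16; size:4; signed:0;
    field:__data_loc char[] name; offset:24; size:4; signed:1;
"""

if __name__ == "__main__":
    for name, ctype, size in tracepoint_fields(SAMPLE):
        print(f"{name:12} {ctype} ({size} bytes)")
```

Numeric fields such as "len" and "rc" are the natural candidates for a histogram variable. The sketch does not handle array declarations where the size follows the name.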
48
+ This works by creating one tracepoint and filter pair for each
49
+ histogram bucket, and doing in-kernel counts. The overhead should in many cases
50
+ be lower than with user-space post-processing; however, this approach is still
51
+ not ideal. I've called it a "perf hacktogram". The overhead is proportional to
52
+ the frequency of events, multiplied by the number of buckets. You can modify
53
+ the script to use power-of-2 instead, or whatever you like, but the overhead
54
+ for more buckets will be higher.
55
+
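To make the per-bucket filter idea concrete, here is a rough Python sketch that builds one filter expression per bucket. It is an illustration under stated assumptions, not the script's actual code: `power_points` and `bucket_filters` are hypothetical helpers, and the `var >= lo && var < hi` form is assumed to be acceptable as a tracepoint filter expression.

```python
def power_points(power=4, max_value=1048576):
    """Bucket boundaries 1, power, power**2, ... up to and including max_value."""
    if power < 2:
        raise ValueError("power must be at least 2")
    points, p = [], 1
    while p <= max_value:
        points.append(p)
        p *= power
    return points


def bucket_filters(var, points):
    """Return one filter expression per histogram bucket, lowest bucket first."""
    points = sorted(points)
    # Everything below the first boundary goes into the first bucket.
    filters = [f"{var} < {points[0]}"]
    # One bounded bucket between each pair of neighbouring boundaries.
    for lo, hi in zip(points, points[1:]):
        filters.append(f"{var} >= {lo} && {var} < {hi}")
    # Everything at or above the last boundary goes into the final bucket.
    filters.append(f"{var} >= {points[-1]}")
    return filters


if __name__ == "__main__":
    filters = bucket_filters("len", power_points(4, 1048576))
    for expr in filters:
        print(expr)
    # Every event is checked against every bucket's filter, which is why
    # the overhead grows with the number of buckets.
    print(f"{len(filters)} tracepoint and filter pairs")
```

With the default power-of-4 and a max of 1048576 this yields 12 filters, one per row of the histogram above. Switching to power-of-2 with the same max yields 22, which is the extra overhead the text warns about.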
56
+
57
+ Histogram of the returned read() syscall sizes:
58
+
59
+ # ./perf-stat-hist syscalls:sys_exit_read ret 10
60
+ Tracing syscalls:sys_exit_read, power-of-4, max 1048576, for 10 seconds...
61
+
62
+ Range : Count Distribution
63
+ 0 : 90 |# |
64
+ 1 -> 3 : 9587 |######################################|
65
+ 4 -> 15 : 69 |# |
66
+ 16 -> 63 : 590 |### |
67
+ 64 -> 255 : 250 |# |
68
+ 256 -> 1023 : 389 |## |
69
+ 1024 -> 4095 : 296 |## |
70
+ 4096 -> 16383 : 183 |# |
71
+ 16384 -> 65535 : 12 |# |
72
+ 65536 -> 262143 : 0 | |
73
+ 262144 -> 1048575 : 0 | |
74
+ 1048576 -> : 0 | |
75
+
76
+ Most of our read()s were tiny, between 1 and 3 bytes.
77
+
78
+
79
+ Using power-of-2, and a max of 1024:
80
+
81
+ # ./perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret
82
+ Tracing syscalls:sys_exit_read, power-of-2, max 1024, until Ctrl-C...
83
+ ^C
84
+ Range : Count Distribution
85
+ -> -1 : 29 |## |
86
+ 0 -> 0 : 1 |# |
87
+ 1 -> 1 : 959 |######################################|
88
+ 2 -> 3 : 1 |# |
89
+ 4 -> 7 : 0 | |
90
+ 8 -> 15 : 2 |# |
91
+ 16 -> 31 : 14 |# |
92
+ 32 -> 63 : 1 |# |
93
+ 64 -> 127 : 0 | |
94
+ 128 -> 255 : 0 | |
95
+ 256 -> 511 : 0 | |
96
+ 512 -> 1023 : 1 |# |
97
+ 1024 -> : 1 |# |
98
+
99
+
100
+ Specifying custom bucket sizes:
101
+
102
+ # ./perf-stat-hist -b "10 50 100 5000" syscalls:sys_exit_read ret
103
+ Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
104
+ ^C
105
+ Range : Count Distribution
106
+ -> 9 : 989 |######################################|
107
+ 10 -> 49 : 5 |# |
108
+ 50 -> 99 : 0 | |
109
+ 100 -> 4999 : 2 |# |
110
+ 5000 -> : 0 | |
111
+
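The way the bucket points map onto the printed ranges can be sketched in a few lines of Python. This is only an illustration of the mapping seen in the output above: `range_labels` and `bucket_index` are hypothetical helpers, and the column alignment of the real output is not reproduced.

```python
from bisect import bisect_right


def range_labels(points):
    """Turn bucket points into range labels like those in the Range column."""
    points = sorted(points)
    labels = [f"-> {points[0] - 1}"]  # open-ended bucket below the first point
    for lo, hi in zip(points, points[1:]):
        labels.append(f"{lo} -> {hi - 1}")  # ranges are inclusive at both ends
    labels.append(f"{points[-1]} ->")  # open-ended bucket at or above the last point
    return labels


def bucket_index(value, points):
    """Index of the bucket that an integer value falls into."""
    return bisect_right(sorted(points), value)


if __name__ == "__main__":
    points = [10, 50, 100, 5000]
    labels = range_labels(points)
    for value in (0, 9, 10, 75, 4999, 5000):
        print(f"{value:5} falls in bucket '{labels[bucket_index(value, points)]}'")
```

N bucket points always produce N + 1 buckets, so a single point such as "10" gives the two-bucket split.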
112
+
113
+ Specifying a single value to bifurcate statistics:
114
+
115
+ # ./perf-stat-hist -b 10 syscalls:sys_exit_read ret
116
+ Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
117
+ ^C
118
+ Range : Count Distribution
119
+ -> 9 : 2959 |######################################|
120
+ 10 -> : 7 |# |
121
+
122
+ This has the lowest overhead for collection, since only two tracepoint
123
+ filter pairs are used.
124
+
125
+
126
+ Use -h to print the USAGE message:
127
+
128
+ # ./perf-stat-hist -h
129
+ USAGE: perf-stat-hist [-h] [-b buckets|-P power] [-m max] tracepoint
130
+ variable [seconds]
131
+ -b buckets # specify histogram bucket points
132
+ -P power # power-of (default is 4)
133
+ -m max # max value for power-of
134
+ -h # this usage message
135
+ eg,
136
+ perf-stat-hist syscalls:sys_enter_read count 5
137
+ # read() request histogram, 5 seconds
138
+ perf-stat-hist syscalls:sys_exit_read ret 5
139
+ # read() return histogram, 5 seconds
140
+ perf-stat-hist -P 10 syscalls:sys_exit_read ret 5
141
+ # ... use power-of-10
142
+ perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret 5
143
+ # ... use power-of-2, max 1024
144
+ perf-stat-hist -b "10 50 100 500" syscalls:sys_exit_read ret 5
145
+ # ... histogram based on these bucket ranges
146
+ perf-stat-hist -b 10 syscalls:sys_exit_read ret 5
147
+ # ... bifurcate by the value 10 (lowest overhead)
148
+
149
+ See the man page and example file for more info.