fluent-plugin-perf-tools 0.1.0

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (98)
  1. checksums.yaml +7 -0
  2. data/.gitignore +15 -0
  3. data/.rubocop.yml +26 -0
  4. data/.ruby-version +1 -0
  5. data/CHANGELOG.md +5 -0
  6. data/CODE_OF_CONDUCT.md +84 -0
  7. data/Gemfile +5 -0
  8. data/LICENSE.txt +21 -0
  9. data/README.md +43 -0
  10. data/Rakefile +17 -0
  11. data/bin/console +15 -0
  12. data/bin/setup +8 -0
  13. data/fluent-plugin-perf-tools.gemspec +48 -0
  14. data/lib/fluent/plugin/in_perf_tools.rb +42 -0
  15. data/lib/fluent/plugin/perf_tools/cachestat.rb +65 -0
  16. data/lib/fluent/plugin/perf_tools/command.rb +30 -0
  17. data/lib/fluent/plugin/perf_tools/version.rb +9 -0
  18. data/lib/fluent/plugin/perf_tools.rb +11 -0
  19. data/perf-tools/LICENSE +339 -0
  20. data/perf-tools/README.md +205 -0
  21. data/perf-tools/bin/bitesize +1 -0
  22. data/perf-tools/bin/cachestat +1 -0
  23. data/perf-tools/bin/execsnoop +1 -0
  24. data/perf-tools/bin/funccount +1 -0
  25. data/perf-tools/bin/funcgraph +1 -0
  26. data/perf-tools/bin/funcslower +1 -0
  27. data/perf-tools/bin/functrace +1 -0
  28. data/perf-tools/bin/iolatency +1 -0
  29. data/perf-tools/bin/iosnoop +1 -0
  30. data/perf-tools/bin/killsnoop +1 -0
  31. data/perf-tools/bin/kprobe +1 -0
  32. data/perf-tools/bin/opensnoop +1 -0
  33. data/perf-tools/bin/perf-stat-hist +1 -0
  34. data/perf-tools/bin/reset-ftrace +1 -0
  35. data/perf-tools/bin/syscount +1 -0
  36. data/perf-tools/bin/tcpretrans +1 -0
  37. data/perf-tools/bin/tpoint +1 -0
  38. data/perf-tools/bin/uprobe +1 -0
  39. data/perf-tools/deprecated/README.md +1 -0
  40. data/perf-tools/deprecated/execsnoop-proc +150 -0
  41. data/perf-tools/deprecated/execsnoop-proc.8 +80 -0
  42. data/perf-tools/deprecated/execsnoop-proc_example.txt +46 -0
  43. data/perf-tools/disk/bitesize +175 -0
  44. data/perf-tools/examples/bitesize_example.txt +63 -0
  45. data/perf-tools/examples/cachestat_example.txt +58 -0
  46. data/perf-tools/examples/execsnoop_example.txt +153 -0
  47. data/perf-tools/examples/funccount_example.txt +126 -0
  48. data/perf-tools/examples/funcgraph_example.txt +2178 -0
  49. data/perf-tools/examples/funcslower_example.txt +110 -0
  50. data/perf-tools/examples/functrace_example.txt +341 -0
  51. data/perf-tools/examples/iolatency_example.txt +350 -0
  52. data/perf-tools/examples/iosnoop_example.txt +302 -0
  53. data/perf-tools/examples/killsnoop_example.txt +62 -0
  54. data/perf-tools/examples/kprobe_example.txt +379 -0
  55. data/perf-tools/examples/opensnoop_example.txt +47 -0
  56. data/perf-tools/examples/perf-stat-hist_example.txt +149 -0
  57. data/perf-tools/examples/reset-ftrace_example.txt +88 -0
  58. data/perf-tools/examples/syscount_example.txt +297 -0
  59. data/perf-tools/examples/tcpretrans_example.txt +93 -0
  60. data/perf-tools/examples/tpoint_example.txt +210 -0
  61. data/perf-tools/examples/uprobe_example.txt +321 -0
  62. data/perf-tools/execsnoop +292 -0
  63. data/perf-tools/fs/cachestat +167 -0
  64. data/perf-tools/images/perf-tools_2016.png +0 -0
  65. data/perf-tools/iolatency +296 -0
  66. data/perf-tools/iosnoop +296 -0
  67. data/perf-tools/kernel/funccount +146 -0
  68. data/perf-tools/kernel/funcgraph +259 -0
  69. data/perf-tools/kernel/funcslower +248 -0
  70. data/perf-tools/kernel/functrace +192 -0
  71. data/perf-tools/kernel/kprobe +270 -0
  72. data/perf-tools/killsnoop +263 -0
  73. data/perf-tools/man/man8/bitesize.8 +70 -0
  74. data/perf-tools/man/man8/cachestat.8 +111 -0
  75. data/perf-tools/man/man8/execsnoop.8 +104 -0
  76. data/perf-tools/man/man8/funccount.8 +76 -0
  77. data/perf-tools/man/man8/funcgraph.8 +166 -0
  78. data/perf-tools/man/man8/funcslower.8 +129 -0
  79. data/perf-tools/man/man8/functrace.8 +123 -0
  80. data/perf-tools/man/man8/iolatency.8 +116 -0
  81. data/perf-tools/man/man8/iosnoop.8 +169 -0
  82. data/perf-tools/man/man8/killsnoop.8 +100 -0
  83. data/perf-tools/man/man8/kprobe.8 +162 -0
  84. data/perf-tools/man/man8/opensnoop.8 +113 -0
  85. data/perf-tools/man/man8/perf-stat-hist.8 +111 -0
  86. data/perf-tools/man/man8/reset-ftrace.8 +49 -0
  87. data/perf-tools/man/man8/syscount.8 +96 -0
  88. data/perf-tools/man/man8/tcpretrans.8 +93 -0
  89. data/perf-tools/man/man8/tpoint.8 +140 -0
  90. data/perf-tools/man/man8/uprobe.8 +168 -0
  91. data/perf-tools/misc/perf-stat-hist +223 -0
  92. data/perf-tools/net/tcpretrans +311 -0
  93. data/perf-tools/opensnoop +280 -0
  94. data/perf-tools/syscount +192 -0
  95. data/perf-tools/system/tpoint +232 -0
  96. data/perf-tools/tools/reset-ftrace +123 -0
  97. data/perf-tools/user/uprobe +390 -0
  98. metadata +349 -0
@@ -0,0 +1,62 @@
1
+ Demonstrations of killsnoop, the Linux ftrace version.
2
+
3
+
4
+ What signals are happening on my system?
5
+
6
+ # ./killsnoop
7
+ Tracing kill()s. Ctrl-C to end.
8
+ COMM PID TPID SIGNAL RETURN
9
+ postgres 2209 2148 10 0
10
+ postgres 5416 2209 12 0
11
+ postgres 5416 2209 12 0
12
+ supervise 2135 5465 15 0
13
+ supervise 2135 5465 18 0
14
+ ^C
15
+ Ending tracing...
16
+
17
+ The first line of output shows that PID 2209, process name "postgres", has
18
+ sent a signal 10 (SIGUSR1) to target PID 2148. This signal returned success (0).
19
+
20
+ killsnoop traces the kill() syscall, which is used to send signals to other
21
+ processes. These signals can include SIGKILL and SIGTERM, both of which
22
+ ultimately kill the target process (in different fashions), but the signals
23
+ may also include other operations, including checking if a process still
24
+ exists (signal 0). To read more about signals, see "man -s7 signal".
25
+
26
+ killsnoop can be useful to identify why some processes are abruptly and
27
+ unexpectedly ending (also check for the OOM killer in dmesg).
28
+
29
+
30
+ The -s option can be used to print signal names instead of numbers:
31
+
32
+ # ./killsnoop -s
33
+ Tracing kill()s. Ctrl-C to end.
34
+ COMM PID KILLED SIGNAL RETURN
35
+ postgres 2209 2148 SIGUSR1 0
36
+ postgres 5665 2209 SIGUSR2 0
37
+ postgres 5665 2209 SIGUSR2 0
38
+ supervise 2135 5711 SIGTERM 0
39
+ supervise 2135 5711 SIGCONT 0
40
+ bash 27450 27450 0 0
41
+ [...]
42
+
43
+ On the last line: there wasn't a nice signal name for signal 0, so just numeric
44
+ 0 is printed. You'll see signal 0's used to check if processes still exist.
45
+
46
+
47
+ Use -h to print the USAGE message:
48
+
49
+ # ./killsnoop -h
50
+ USAGE: killsnoop [-ht] [-d secs] [-p PID] [-n name] [filename]
51
+ -d seconds # trace duration, and use buffers
52
+ -n name # process name to match
53
+ -p PID # PID to match on kill issue
54
+ -t # include time (seconds)
55
+ -s # human readable signal names
56
+ -h # this usage message
57
+ eg,
58
+ killsnoop # watch kill()s live (unbuffered)
59
+ killsnoop -d 1 # trace 1 sec (buffered)
60
+ killsnoop -p 181 # trace kill()s issued to PID 181 only
61
+
62
+ See the man page and example file for more info.
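The column layout shown in the killsnoop demonstrations above (COMM, PID, TPID or KILLED, SIGNAL, RETURN) is regular enough to parse line by line. The following is a minimal Ruby sketch of such a parser, not code from this package: the helper name `parse_killsnoop_line` and the returned hash keys are hypothetical, and it assumes whitespace-separated columns with no whitespace inside COMM.

```ruby
# Hypothetical helper (not part of the package): parse one line of killsnoop
# output into a Hash, or return nil for header and status lines.
# Assumption: five whitespace-separated columns, COMM has no embedded spaces.
def parse_killsnoop_line(line)
  fields = line.strip.split(/\s+/)
  return nil unless fields.size == 5

  comm, pid, tpid, signal, ret = fields
  # Header and banner lines do not carry numeric PID and RETURN columns.
  return nil unless pid =~ /\A\d+\z/ && tpid =~ /\A\d+\z/ && ret =~ /\A-?\d+\z/

  {
    comm: comm,
    pid: pid.to_i,
    tpid: tpid.to_i,
    # With -s the tool prints names such as SIGUSR1; keep those as strings.
    signal: signal =~ /\A\d+\z/ ? signal.to_i : signal,
    ret: ret.to_i
  }
end
```

Returning nil for non-data lines lets a caller feed the raw output through the helper and simply drop the nils.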
@@ -0,0 +1,379 @@
1
+ Demonstrations of kprobe, the Linux ftrace version.
2
+
3
+
4
+ This traces the kernel do_sys_open() function, when it is called:
5
+
6
+ # ./kprobe p:do_sys_open
7
+ Tracing kprobe do_sys_open. Ctrl-C to end.
8
+ kprobe-26042 [001] d... 6910441.001452: do_sys_open: (do_sys_open+0x0/0x220)
9
+ kprobe-26042 [001] d... 6910441.001475: do_sys_open: (do_sys_open+0x0/0x220)
10
+ kprobe-26042 [001] d... 6910441.001866: do_sys_open: (do_sys_open+0x0/0x220)
11
+ kprobe-26042 [001] d... 6910441.001966: do_sys_open: (do_sys_open+0x0/0x220)
12
+ supervise-1689 [000] d... 6910441.083302: do_sys_open: (do_sys_open+0x0/0x220)
13
+ supervise-1693 [001] d... 6910441.083530: do_sys_open: (do_sys_open+0x0/0x220)
14
+ supervise-1689 [000] d... 6910441.083759: do_sys_open: (do_sys_open+0x0/0x220)
15
+ supervise-1693 [001] d... 6910441.083877: do_sys_open: (do_sys_open+0x0/0x220)
16
+ [...]
17
+
18
+ The "p:" is for creating a probe. Use "r:" to probe the return of the function:
19
+
20
+ # ./kprobe r:do_sys_open
21
+ Tracing kprobe do_sys_open. Ctrl-C to end.
22
+ kprobe-29475 [001] d... 6910688.229777: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
23
+ <...>-29476 [001] d... 6910688.231101: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
24
+ <...>-29476 [001] d... 6910688.231123: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
25
+ <...>-29476 [001] d... 6910688.231530: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
26
+ <...>-29476 [001] d... 6910688.231624: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
27
+ supervise-1685 [001] d... 6910688.328776: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
28
+ supervise-1689 [000] d... 6910688.328780: do_sys_open: (SyS_open+0x1e/0x20 <- do_sys_open)
29
+ [...]
30
+
31
+ This output includes the function that the traced function is returning to.
32
+
33
+
34
+ The trace output can be a little different between kernel versions. Use -H to
35
+ print the header:
36
+
37
+ # ./kprobe -H p:do_sys_open
38
+ Tracing kprobe do_sys_open. Ctrl-C to end.
39
+ # tracer: nop
40
+ #
41
+ # entries-in-buffer/entries-written: 4/4 #P:2
42
+ #
43
+ # _-----=> irqs-off
44
+ # / _----=> need-resched
45
+ # | / _---=> hardirq/softirq
46
+ # || / _--=> preempt-depth
47
+ # ||| / delay
48
+ # TASK-PID CPU# |||| TIMESTAMP FUNCTION
49
+ # | | | |||| | |
50
+ kprobe-27952 [001] d... 6910580.008086: do_sys_open: (do_sys_open+0x0/0x220)
51
+ kprobe-27952 [001] d... 6910580.008109: do_sys_open: (do_sys_open+0x0/0x220)
52
+ kprobe-27952 [001] d... 6910580.008483: do_sys_open: (do_sys_open+0x0/0x220)
53
+ [...]
54
+
55
+ These columns are explained in the kernel source under Documentation/trace/ftrace.txt.
56
+
57
+
58
+ This traces do_sys_open() returns, using a probe alias "myopen", and showing
59
+ the return value ($retval):
60
+
61
+ # ./kprobe 'r:myopen do_sys_open $retval'
62
+ Tracing kprobe myopen. Ctrl-C to end.
63
+ kprobe-26386 [001] d... 6593278.858754: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
64
+ <...>-26387 [001] d... 6593278.860043: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
65
+ <...>-26387 [001] d... 6593278.860064: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
66
+ <...>-26387 [001] d... 6593278.860433: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
67
+ <...>-26387 [001] d... 6593278.860521: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x3
68
+ supervise-1685 [001] d... 6593279.178806: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
69
+ supervise-1689 [001] d... 6593279.228756: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
70
+ supervise-1689 [001] d... 6593279.229106: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
71
+ supervise-1688 [000] d... 6593279.229501: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
72
+ supervise-1695 [000] d... 6593279.229944: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
73
+ supervise-1685 [001] d... 6593279.230104: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
74
+ supervise-1687 [001] d... 6593279.230293: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
75
+ supervise-1699 [000] d... 6593279.230381: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
76
+ supervise-1692 [000] d... 6593279.230825: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
77
+ supervise-1698 [000] d... 6593279.230915: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
78
+ supervise-1698 [000] d... 6593279.231277: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
79
+ supervise-1690 [000] d... 6593279.231703: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) arg1=0x9
80
+ ^C
81
+ Ending tracing...
82
+
83
+ The string specified, 'r:myopen do_sys_open $retval', is a kprobe definition,
84
+ and is the same as those documented in the Linux kernel source under
85
+ Documentation/trace/kprobetrace.txt, which can be written to the
86
+ /sys/kernel/debug/tracing/kprobe_events file.
87
+
88
+ Apart from probe name aliases, you can also provide arbitrary names for
89
+ arguments. Eg, instead of the "arg1" default, calling it "rval":
90
+
91
+ # ./kprobe 'r:myopen do_sys_open rval=$retval'
92
+ Tracing kprobe myopen. Ctrl-C to end.
93
+ kprobe-27454 [001] d... 6593356.250019: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
94
+ <...>-27455 [001] d... 6593356.251280: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
95
+ <...>-27455 [001] d... 6593356.251301: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
96
+ <...>-27455 [001] d... 6593356.251672: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
97
+ <...>-27455 [001] d... 6593356.251769: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x3
98
+ supervise-1689 [000] d... 6593356.859758: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
99
+ supervise-1689 [000] d... 6593356.860143: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
100
+ supervise-1696 [000] d... 6593356.862682: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
101
+ supervise-1685 [001] d... 6593356.862684: myopen: (SyS_open+0x1e/0x20 <- do_sys_open) rval=0x9
102
+ [...]
103
+
104
+ That's a bit better.
105
+
106
+
107
+ Tracing the open() mode:
108
+
109
+ # ./kprobe 'p:myopen do_sys_open mode=%cx:u16'
110
+ Tracing kprobe myopen. Ctrl-C to end.
111
+ kprobe-29572 [001] d... 6593503.353923: myopen: (do_sys_open+0x0/0x220) mode=0x1
112
+ kprobe-29572 [001] d... 6593503.353945: myopen: (do_sys_open+0x0/0x220) mode=0x0
113
+ kprobe-29572 [001] d... 6593503.354307: myopen: (do_sys_open+0x0/0x220) mode=0x5c00
114
+ kprobe-29572 [001] d... 6593503.354401: myopen: (do_sys_open+0x0/0x220) mode=0x0
115
+ supervise-1689 [000] d... 6593503.944125: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
116
+ supervise-1688 [001] d... 6593503.944125: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
117
+ supervise-1688 [001] d... 6593503.944606: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
118
+ supervise-1689 [000] d... 6593503.944606: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
119
+ supervise-1698 [000] d... 6593503.944728: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
120
+ supervise-1698 [000] d... 6593503.945077: myopen: (do_sys_open+0x0/0x220) mode=0x1a4
121
+ [...]
122
+
123
+ Here I guessed that the mode was in register %cx, and cast it as a 16-bit
124
+ unsigned integer (":u16"). Your platform and kernel may be different, and the
125
+ mode may be in a different register. If fiddling with such registers becomes too
126
+ painful or unreliable for you, consider installing kernel debuginfo and using
127
+ the named variables with perf_events "perf probe".
128
+
129
+
130
+ Tracing the open() filename:
131
+
132
+ # ./kprobe 'p:myopen do_sys_open filename=+0(%si):string'
133
+ Tracing kprobe myopen. Ctrl-C to end.
134
+ kprobe-32369 [001] d... 6593706.999728: myopen: (do_sys_open+0x0/0x220) filename="/etc/ld.so.cache"
135
+ kprobe-32369 [001] d... 6593706.999748: myopen: (do_sys_open+0x0/0x220) filename="/lib/x86_64-linux-gnu/libc.so.6"
136
+ kprobe-32369 [001] d... 6593707.000092: myopen: (do_sys_open+0x0/0x220) filename="/usr/lib/locale/locale-archive"
137
+ kprobe-32369 [001] d... 6593707.000176: myopen: (do_sys_open+0x0/0x220) filename="trace_pipe"
138
+ supervise-1699 [000] d... 6593707.254970: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
139
+ supervise-1689 [001] d... 6593707.254970: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
140
+ supervise-1689 [001] d... 6593707.255432: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
141
+ supervise-1699 [000] d... 6593707.255432: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
142
+ supervise-1695 [001] d... 6593707.258805: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
143
+ [...]
144
+
145
+ As mentioned previously, the %si register may be different on your platform.
146
+ In this example, I cast it as a string.
147
+
148
+
149
+ Specifying a duration will buffer in-kernel (reducing overhead), and write at
150
+ the end. Here's tracing for 10 seconds, and writing to the "out" file:
151
+
152
+ # ./kprobe -d 10 'p:myopen do_sys_open filename=+0(%si):string' > out
153
+
154
+
155
+ You can match on a single PID only:
156
+
157
+ # ./kprobe -p 1696 'p:myopen do_sys_open filename=+0(%si):string'
158
+ Tracing kprobe myopen. Ctrl-C to end.
159
+ supervise-1696 [001] d... 6593773.677033: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
160
+ supervise-1696 [001] d... 6593773.677332: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
161
+ supervise-1696 [001] d... 6593774.697144: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
162
+ supervise-1696 [001] d... 6593774.697675: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
163
+ supervise-1696 [001] d... 6593775.717986: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
164
+ supervise-1696 [001] d... 6593775.718499: myopen: (do_sys_open+0x0/0x220) filename="supervise/status.new"
165
+ ^C
166
+ Ending tracing...
167
+
168
+ This will only show events when that PID is on-CPU.
169
+
170
+
171
+ The -v option will show you the available variables you can use in custom
172
+ filters:
173
+
174
+ # ./kprobe -v 'p:myopen do_sys_open filename=+0(%si):string'
175
+ name: myopen
176
+ ID: 1443
177
+ format:
178
+ field:unsigned short common_type; offset:0; size:2; signed:0;
179
+ field:unsigned char common_flags; offset:2; size:1; signed:0;
180
+ field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
181
+ field:int common_pid; offset:4; size:4; signed:1;
182
+
183
+ field:unsigned long __probe_ip; offset:8; size:8; signed:0;
184
+ field:__data_loc char[] filename; offset:16; size:4; signed:1;
185
+
186
+ print fmt: "(%lx) filename=\"%s\"", REC->__probe_ip, __get_str(filename)
187
+
188
+
189
+ Tracing filenames that end in "stat", by adding a filter:
190
+
191
+ # ./kprobe 'p:myopen do_sys_open filename=+0(%si):string' 'filename ~ "*stat"'
192
+ Tracing kprobe myopen. Ctrl-C to end.
193
+ postgres-1172 [000] d... 6594028.787166: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
194
+ postgres-1172 [001] d... 6594028.797410: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
195
+ postgres-1172 [001] d... 6594028.797467: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
196
+ postgres-4443 [001] d... 6594028.800908: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
197
+ postgres-4443 [000] d... 6594028.811237: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
198
+ postgres-4443 [000] d... 6594028.811290: myopen: (do_sys_open+0x0/0x220) filename="pg_stat_tmp/pgstat.stat"
199
+ ^C
200
+ Ending tracing...
201
+
202
+ This filtering is done in kernel context.
203
+
204
+
205
+ As an example of tracing a deeper kernel function, let's trace bio_alloc() and
206
+ entry registers:
207
+
208
+ # ./kprobe 'p:myprobe bio_alloc %ax %bx %cx %dx'
209
+ Tracing kprobe myprobe. Ctrl-C to end.
210
+ supervise-3055 [000] 2172148.728250: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
211
+ supervise-3055 [000] 2172148.728527: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
212
+ jbd2/xvda1-8-212 [000] 2172149.749474: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800ad1f87b8 arg3=ffff8800ba22c06c arg4=8
213
+ jbd2/xvda1-8-212 [000] 2172149.749485: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d053a8 arg3=10f16c5bb arg4=0
214
+ jbd2/xvda1-8-212 [000] 2172149.749487: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05958 arg3=5 arg4=0
215
+ jbd2/xvda1-8-212 [000] 2172149.749488: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05b60 arg3=5 arg4=0
216
+ jbd2/xvda1-8-212 [000] 2172149.749489: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d05820 arg3=5 arg4=0
217
+ jbd2/xvda1-8-212 [000] 2172149.749489: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d055b0 arg3=5 arg4=0
218
+ jbd2/xvda1-8-212 [000] 2172149.749490: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88006ff22ea0 arg3=5 arg4=0
219
+ jbd2/xvda1-8-212 [000] 2172149.749491: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d1f000 arg3=5 arg4=0
220
+ jbd2/xvda1-8-212 [000] 2172149.749492: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff880089d1f138 arg3=5 arg4=0
221
+ jbd2/xvda1-8-212 [000] 2172149.749493: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d267138 arg3=5 arg4=0
222
+ jbd2/xvda1-8-212 [000] 2172149.749494: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d267680 arg3=5 arg4=0
223
+ jbd2/xvda1-8-212 [000] 2172149.749495: myprobe: (bio_alloc+0x0/0x30) arg1=0 arg2=ffff88005d2675b0 arg3=5 arg4=0
224
+ jbd2/xvda1-8-212 [000] 2172149.751044: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800cc241ea0 arg3=445f0300 arg4=ffff8800effba000
225
+ supervise-3055 [000] 2172149.751095: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
226
+ supervise-3055 [000] 2172149.751341: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
227
+ supervise-3055 [000] 2172150.772033: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acc8d0 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acc910
228
+ supervise-3055 [000] 2172150.772305: myprobe: (bio_alloc+0x0/0x30) arg1=ffff880064acf948 arg2=ffff8800e56a7990 arg3=0 arg4=ffff880064acf988
229
+ flush-202:1-409 [000] 2172151.087815: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800da51d6e8 arg3=16afd arg4=1
230
+ flush-202:1-409 [000] 2172151.087829: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7537f08 arg3=16afd arg4=2
231
+ flush-202:1-409 [000] 2172151.087844: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7519af8 arg3=16afd arg4=3
232
+ flush-202:1-409 [000] 2172151.087846: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7511478 arg3=16afd arg4=4
233
+ flush-202:1-409 [000] 2172151.087849: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e75e6a90 arg3=16afd arg4=5
234
+ flush-202:1-409 [000] 2172151.087851: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800e7512bc8 arg3=16afd arg4=6
235
+ flush-202:1-409 [000] 2172151.087853: myprobe: (bio_alloc+0x0/0x30) arg1=ffffffff arg2=ffff8800eb3bf410 arg3=16afd arg4=7
236
+ ^C
237
+
238
+ The output includes who is on-CPU, high resolution timestamps, and the arguments
239
+ we requested (registers %ax to %dx). These registers are platform dependent,
240
+ and are mapped by the compiler to the entry arguments of the function.
241
+
242
+ How are these useful? If you are debugging this kernel function, you'll know. :)
243
+
244
+
245
+ Note that you can add qualifiers, eg, if I knew %ax was a uint32:
246
+
247
+ # ./kprobe 'p:myprobe bio_alloc %ax:u32'
248
+ Tracing kprobe myprobe. Ctrl-C to end.
249
+ supervise-3055 [000] 2172389.734606: myprobe: (bio_alloc+0x0/0x30) arg1=64acf948
250
+ supervise-3055 [000] 2172389.734865: myprobe: (bio_alloc+0x0/0x30) arg1=64acc8d0
251
+ supervise-3055 [000] 2172390.772391: myprobe: (bio_alloc+0x0/0x30) arg1=64acf948
252
+ supervise-3055 [000] 2172390.772676: myprobe: (bio_alloc+0x0/0x30) arg1=64acc8d0
253
+ ^C
254
+ Ending tracing...
255
+
256
+ You can give them aliases too, instead of the default arg1..N:
257
+
258
+ # ./kprobe 'p:myprobe bio_alloc ax=%ax'
259
+ Tracing kprobe myprobe. Ctrl-C to end.
260
+ supervise-3055 [000] 2172420.451663: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
261
+ supervise-3055 [000] 2172420.451938: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acf948
262
+ flush-202:1-409 [000] 2172421.163462: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
263
+ supervise-3055 [000] 2172421.500994: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acc8d0
264
+ supervise-3055 [000] 2172421.501307: myprobe: (bio_alloc+0x0/0x30) ax=ffff880064acf948
265
+ ^C
266
+ Ending tracing...
267
+
268
+
269
+ Now for the return of bio_alloc():
270
+
271
+ # ./kprobe 'r:myprobe bio_alloc $retval'
272
+ Tracing kprobe myprobe. Ctrl-C to end.
273
+ supervise-3055 [000] 2172164.145533: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e55843c0
274
+ supervise-3055 [000] 2172164.145829: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e5584840
275
+ jbd2/xvda1-8-212 [000] 2172165.166453: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57596c0
276
+ jbd2/xvda1-8-212 [000] 2172165.166493: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759c00
277
+ jbd2/xvda1-8-212 [000] 2172165.166496: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759600
278
+ jbd2/xvda1-8-212 [000] 2172165.166497: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759e40
279
+ jbd2/xvda1-8-212 [000] 2172165.166498: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57590c0
280
+ jbd2/xvda1-8-212 [000] 2172165.166500: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e57599c0
281
+ jbd2/xvda1-8-212 [000] 2172165.166500: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759a80
282
+ jbd2/xvda1-8-212 [000] 2172165.166502: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759f00
283
+ jbd2/xvda1-8-212 [000] 2172165.166503: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759540
284
+ jbd2/xvda1-8-212 [000] 2172165.166504: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759180
285
+ jbd2/xvda1-8-212 [000] 2172165.166504: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759900
286
+ jbd2/xvda1-8-212 [000] 2172165.166505: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759000
287
+ jbd2/xvda1-8-212 [000] 2172165.166506: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759480
288
+ <...>-212 [000] 2172165.176261: myprobe: (submit_bh+0x76/0x120 <- bio_alloc) arg1=ffff8800e5759480
289
+ supervise-3055 [000] 2172165.176317: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e57596c0
290
+ supervise-3055 [000] 2172165.176586: myprobe: (io_submit_init.isra.6+0x74/0x100 <- bio_alloc) arg1=ffff8800e5759900
291
+ ^C
292
+ Ending tracing...
293
+
294
+ Great. This output includes the function we are returning to, in most cases,
295
+ submit_bh().
296
+
297
+ Note that this mode (without a duration) prints events as they happen,
298
+ so the overheads can be high for frequent events. You could try the -d mode,
299
+ which buffers in-kernel.
300
+
301
+
302
+ The -s option will print the kernel stack trace after the event:
303
+
304
+ # ./kprobe -s 'p:mytcp tcp_init_cwnd'
305
+ Tracing kprobe mytcp. Ctrl-C to end.
306
+ sshd-5121 [000] d... 6897275.911301: mytcp: (tcp_init_cwnd+0x0/0x40)
307
+ sshd-5121 [000] d... 6897275.911309: <stack trace>
308
+ => tcp_write_xmit
309
+ => __tcp_push_pending_frames
310
+ => tcp_push
311
+ => tcp_sendmsg
312
+ => inet_sendmsg
313
+ => sock_aio_write
314
+ => do_sync_write
315
+ => vfs_write
316
+ => SyS_write
317
+ => system_call_fastpath
318
+ sshd-32219 [000] d... 6897275.911467: mytcp: (tcp_init_cwnd+0x0/0x40)
319
+ sshd-32219 [000] d... 6897275.911471: <stack trace>
320
+ => tcp_write_xmit
321
+ => __tcp_push_pending_frames
322
+ => tcp_push
323
+ => tcp_sendmsg
324
+ => inet_sendmsg
325
+ => sock_aio_write
326
+ => do_sync_write
327
+ => vfs_write
328
+ => SyS_write
329
+ => system_call_fastpath
330
+ sshd-5121 [000] d... 6897277.878794: mytcp: (tcp_init_cwnd+0x0/0x40)
331
+ sshd-5121 [000] d... 6897277.878801: <stack trace>
332
+ => tcp_write_xmit
333
+ => __tcp_push_pending_frames
334
+ => tcp_push
335
+ => tcp_sendmsg
336
+ => inet_sendmsg
337
+ => sock_aio_write
338
+ => do_sync_write
339
+ => vfs_write
340
+ => SyS_write
341
+ => system_call_fastpath
342
+
343
+ This makes use of the kernel options/stacktrace feature.
344
+
345
+
346
+ Use -h to print the USAGE message:
347
+
348
+ # ./kprobe -h
349
+ USAGE: kprobe [-FhHsv] [-d secs] [-p PID] [-L TID] kprobe_definition [filter]
350
+ -F # force. trace despite warnings.
351
+ -d seconds # trace duration, and use buffers
352
+ -p PID # PID to match on events
353
+ -L TID # thread id to match on events
354
+ -v # view format file (don't trace)
355
+ -H # include column headers
356
+ -s # show kernel stack traces
357
+ -h # this usage message
358
+
359
+ Note that these examples may need modification to match your kernel
360
+ version's function names and platform's register usage.
361
+ eg,
362
+ kprobe p:do_sys_open
363
+ # trace open() entry
364
+ kprobe r:do_sys_open
365
+ # trace open() return
366
+ kprobe 'r:do_sys_open $retval'
367
+ # trace open() return value
368
+ kprobe 'r:myopen do_sys_open $retval'
369
+ # use a custom probe name
370
+ kprobe 'p:myopen do_sys_open mode=%cx:u16'
371
+ # trace open() file mode
372
+ kprobe 'p:myopen do_sys_open filename=+0(%si):string'
373
+ # trace open() with filename
374
+ kprobe -s 'p:myprobe tcp_retransmit_skb'
375
+ # show kernel stacks
376
+ kprobe 'p:do_sys_open file=+0(%si):string' 'file ~ "*stat"'
377
+ # opened files ending in "stat"
378
+
379
+ See the man page and example file for more info.
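The kprobe trace lines in the demonstrations above share one shape: TASK-PID, CPU in brackets, an optional flags column, a timestamp, the probe name, a location in parentheses, and then any requested arguments as key=value pairs. The Ruby sketch below shows one way to split such a line. It is an illustration under assumptions, not part of the package: the names `KPROBE_LINE` and `parse_kprobe_line` are hypothetical, the pattern is derived only from the sample lines shown here, and, as the text notes, trace output can differ between kernel versions.

```ruby
# Hypothetical pattern derived from the sample output above.
# Task names may contain "-" or "/", so the PID is taken as the last
# "-digits" group before the [CPU] column. The flags column (e.g. "d...")
# is optional because some of the sample lines omit it.
KPROBE_LINE = /\A\s*(?<task>.+)-(?<pid>\d+)\s+\[(?<cpu>\d+)\]\s+(?:(?<flags>\S+)\s+)?(?<timestamp>\d+\.\d+):\s+(?<probe>\w+):\s+\((?<location>[^)]*)\)\s*(?<args>.*)\z/

# Parse one trace line into a Hash, or return nil if it does not match
# (for example banner lines or "<stack trace>" lines).
def parse_kprobe_line(line)
  m = KPROBE_LINE.match(line.chomp)
  return nil unless m

  # Arguments look like name=value; string values are double-quoted.
  args = m[:args].scan(/(\w+)=("[^"]*"|\S+)/).map do |key, value|
    [key, value.start_with?('"') ? value[1..-2] : value]
  end.to_h

  {
    task: m[:task],
    pid: m[:pid].to_i,
    cpu: m[:cpu].to_i,
    flags: m[:flags],
    timestamp: m[:timestamp].to_f,
    probe: m[:probe],
    location: m[:location],
    args: args
  }
end
```

Argument values are left as strings because their meaning (hex pointer, mode bits, filename) depends on the probe definition that produced them.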
@@ -0,0 +1,47 @@
1
+ Demonstrations of opensnoop, the Linux ftrace version.
2
+
3
+
4
+ # ./opensnoop
5
+ Tracing open()s. Ctrl-C to end.
6
+ COMM PID FD FILE
7
+ opensnoop 5334 0x3
8
+ <...> 5343 0x3 /etc/ld.so.cache
9
+ opensnoop 5342 0x3 /etc/ld.so.cache
10
+ <...> 5343 0x3 /lib/x86_64-linux-gnu/libc.so.6
11
+ opensnoop 5342 0x3 /lib/x86_64-linux-gnu/libm.so.6
12
+ opensnoop 5342 0x3 /lib/x86_64-linux-gnu/libc.so.6
13
+ <...> 5343 0x3 /usr/lib/locale/locale-archive
14
+ <...> 5343 0x3 trace_pipe
15
+ supervise 1684 0x9 supervise/status.new
16
+ supervise 1684 0x9 supervise/status.new
17
+ supervise 1688 0x9 supervise/status.new
18
+ supervise 1688 0x9 supervise/status.new
19
+ supervise 1686 0x9 supervise/status.new
20
+ supervise 1685 0x9 supervise/status.new
21
+ supervise 1685 0x9 supervise/status.new
22
+ supervise 1686 0x9 supervise/status.new
23
+ [...]
24
+
25
+ The first several lines show opensnoop catching itself initializing.
26
+
27
+
28
+ Use -h to print the USAGE message:
29
+
30
+ # ./opensnoop -h
31
+ USAGE: opensnoop [-htx] [-d secs] [-p PID] [-L TID] [-n name] [filename]
32
+ -d seconds # trace duration, and use buffers
33
+ -n name # process name to match on open
34
+ -p PID # PID to match on open
35
+ -L TID # thread id to match on open
36
+ -t # include time (seconds)
37
+ -x # only show failed opens
38
+ -h # this usage message
39
+ filename # match filename (partials, REs, ok)
40
+ eg,
41
+ opensnoop # watch open()s live (unbuffered)
42
+ opensnoop -d 1 # trace 1 sec (buffered)
43
+ opensnoop -p 181 # trace open()s issued by PID 181 only
44
+ opensnoop conf # trace filenames containing "conf"
45
+ opensnoop 'log$' # filenames ending in "log"
46
+
47
+ See the man page and example file for more info.
@@ -0,0 +1,149 @@
1
+ Demonstrations of perf-stat-hist, the Linux perf_events version.
2
+
3
+
4
+ Tracing the net:net_dev_xmit tracepoint, and building a power-of-4 histogram
5
+ for the "len" variable, for 10 seconds:
6
+
7
+ # ./perf-stat-hist net:net_dev_xmit len 10
8
+ Tracing net:net_dev_xmit, power-of-4, max 1048576, for 10 seconds...
9
+
10
+ Range : Count Distribution
11
+ 0 : 0 | |
12
+ 1 -> 3 : 0 | |
13
+ 4 -> 15 : 0 | |
14
+ 16 -> 63 : 2 |# |
15
+ 64 -> 255 : 30 |### |
16
+ 256 -> 1023 : 3 |# |
17
+ 1024 -> 4095 : 446 |######################################|
18
+ 4096 -> 16383 : 0 | |
19
+ 16384 -> 65535 : 0 | |
20
+ 65536 -> 262143 : 0 | |
21
+ 262144 -> 1048575 : 0 | |
22
+ 1048576 -> : 0 | |
23
+
24
+ This showed that most of the network transmits were between 1024 and 4095 bytes,
25
+ with a handful between 64 and 255 bytes.
26
+
+ Cat the format file for the tracepoint to see what other variables are available
+ to trace. Eg:
+
+ # cat /sys/kernel/debug/tracing/events/net/net_dev_xmit/format
+ name: net_dev_xmit
+ ID: 1078
+ format:
+     field:unsigned short common_type; offset:0; size:2; signed:0;
+     field:unsigned char common_flags; offset:2; size:1; signed:0;
+     field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
+     field:int common_pid; offset:4; size:4; signed:1;
+
+     field:void * skbaddr; offset:8; size:8; signed:0;
+     field:unsigned int len; offset:16; size:4; signed:0;
+     field:int rc; offset:20; size:4; signed:1;
+     field:__data_loc char[] name; offset:24; size:4; signed:1;
+
+ print fmt: "dev=%s skbaddr=%p len=%u rc=%d", __get_str(name), REC->skbaddr, REC->len, REC->rc
+
+ That's where "len" came from.
+
+ This works by creating one tracepoint and filter pair per histogram bucket,
+ and doing in-kernel counts. In many cases the overhead should be lower than
+ user space post-processing; however, this approach is still not ideal. I've
+ called it a "perf hacktogram". The overhead is proportional to the frequency
+ of events multiplied by the number of buckets. You can use power-of-2 instead
+ (-P 2), or whatever bucket points you like (-b), but the overhead for more
+ buckets will be higher.
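To make the "one filter per bucket" idea concrete, here is a minimal sketch in Python of how the power-of-4 ranges shown earlier could be derived and turned into filter expressions. The function names are hypothetical and the filter strings only illustrate the general tracepoint filter syntax; the script itself may build its buckets and filters differently.

```python
def power_buckets(power=4, max_value=1048576):
    """Return (low, high) pairs; high is None for the open-ended last bucket."""
    if power < 2:
        raise ValueError("power must be at least 2")
    ranges = [(0, 0)]  # assumption: zero gets its own bucket, as in the output above
    low = 1
    while low < max_value:
        high = min(low * power, max_value) - 1
        ranges.append((low, high))
        low *= power
    ranges.append((max_value, None))
    return ranges


def filter_for(var, low, high):
    """Build an illustrative filter expression for one bucket."""
    if high is None:
        return f"{var} >= {low}"
    if low == high:
        return f"{var} == {low}"
    return f"{var} >= {low} && {var} <= {high}"


def tracepoint_filter_pairs(tracepoint, var, ranges):
    """One (tracepoint, filter) pair per histogram bucket."""
    return [(tracepoint, filter_for(var, low, high)) for low, high in ranges]


pairs = tracepoint_filter_pairs("net:net_dev_xmit", "len", power_buckets())
for tracepoint, expr in pairs:
    print(tracepoint, "--filter", repr(expr))
print(len(pairs), "tracepoint and filter pairs")
```

With the defaults this yields 12 pairs, one per row of the power-of-4 histogram, which is why adding buckets adds overhead.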
+
+
+ Histogram of the returned read() syscall sizes:
+
+ # ./perf-stat-hist syscalls:sys_exit_read ret 10
+ Tracing syscalls:sys_exit_read, power-of-4, max 1048576, for 10 seconds...
+
+    Range            : Count Distribution
+        0            : 90    |#                                     |
+        1 -> 3       : 9587  |######################################|
+        4 -> 15      : 69    |#                                     |
+       16 -> 63      : 590   |###                                   |
+       64 -> 255     : 250   |#                                     |
+      256 -> 1023    : 389   |##                                    |
+     1024 -> 4095    : 296   |##                                    |
+     4096 -> 16383   : 183   |#                                     |
+    16384 -> 65535   : 12    |#                                     |
+    65536 -> 262143  : 0     |                                      |
+   262144 -> 1048575 : 0     |                                      |
+  1048576 ->         : 0     |                                      |
+
+ Most of our read()s were tiny, between 1 and 3 bytes.
+
+
+ Using power-of-2, and a max of 1024:
+
+ # ./perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret
+ Tracing syscalls:sys_exit_read, power-of-2, max 1024, until Ctrl-C...
+ ^C
+    Range            : Count Distribution
+          -> -1      : 29    |##                                    |
+        0 -> 0       : 1     |#                                     |
+        1 -> 1       : 959   |######################################|
+        2 -> 3       : 1     |#                                     |
+        4 -> 7       : 0     |                                      |
+        8 -> 15      : 2     |#                                     |
+       16 -> 31      : 14    |#                                     |
+       32 -> 63      : 1     |#                                     |
+       64 -> 127     : 0     |                                      |
+      128 -> 255     : 0     |                                      |
+      256 -> 511     : 0     |                                      |
+      512 -> 1023    : 1     |#                                     |
+     1024 ->         : 1     |#                                     |
+
+
+ Specifying custom bucket sizes:
+
+ # ./perf-stat-hist -b "10 50 100 5000" syscalls:sys_exit_read ret
+ Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
+ ^C
+    Range            : Count Distribution
+          -> 9       : 989   |######################################|
+       10 -> 49      : 5     |#                                     |
+       50 -> 99      : 0     |                                      |
+      100 -> 4999    : 2     |#                                     |
+     5000 ->         : 0     |                                      |
+
+
+ Specifying a single value to bifurcate statistics:
+
+ # ./perf-stat-hist -b 10 syscalls:sys_exit_read ret
+ Tracing syscalls:sys_exit_read, specified buckets, until Ctrl-C...
+ ^C
+    Range            : Count Distribution
+          -> 9       : 2959  |######################################|
+       10 ->         : 7     |#                                     |
+
+ This has the lowest overhead for collection, since only two tracepoint
+ filter pairs are used.
+
+
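The way -b bucket points turn into ranges can be sketched as follows. This is a minimal Python illustration with hypothetical helper names, modeled on the two -b outputs above rather than on the script's source: N bucket points give N+1 ranges, so a single point gives the two-way split.

```python
import bisect


def ranges_from_points(points):
    """Turn bucket points into (low, high) ranges; None marks an open end."""
    if not points:
        raise ValueError("at least one bucket point is required")
    points = sorted(points)
    ranges = [(None, points[0] - 1)]
    ranges += [(low, nxt - 1) for low, nxt in zip(points, points[1:])]
    ranges.append((points[-1], None))
    return ranges


def label(low, high):
    """Format a range the way the Range column prints it."""
    left = "" if low is None else str(low)
    right = "" if high is None else str(high)
    return f"{left} -> {right}".strip()


def bucket_index(points, value):
    """Index of the range that a value falls into."""
    return bisect.bisect_right(sorted(points), value)


for low, high in ranges_from_points([10, 50, 100, 5000]):
    print(label(low, high))
print(len(ranges_from_points([10])), "ranges for a single bucket point")
```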
+ Use -h to print the USAGE message:
+
+ # ./perf-stat-hist -h
+ USAGE: perf-stat-hist [-h] [-b buckets|-P power] [-m max] tracepoint
+                       variable [seconds]
+                  -b buckets      # specify histogram bucket points
+                  -P power        # power-of (default is 4)
+                  -m max          # max value for power-of
+                  -h              # this usage message
+    eg,
+     perf-stat-hist syscalls:sys_enter_read count 5
+          # read() request histogram, 5 seconds
+     perf-stat-hist syscalls:sys_exit_read ret 5
+          # read() return histogram, 5 seconds
+     perf-stat-hist -P 10 syscalls:sys_exit_read ret 5
+          # ... use power-of-10
+     perf-stat-hist -P 2 -m 1024 syscalls:sys_exit_read ret 5
+          # ... use power-of-2, max 1024
+     perf-stat-hist -b "10 50 100 500" syscalls:sys_exit_read ret 5
+          # ... histogram based on these bucket ranges
+     perf-stat-hist -b 10 syscalls:sys_exit_read ret 5
+          # ... bifurcate by the value 10 (lowest overhead)
+
+ See the man page and example file for more info.