ruby-prof 0.5.0-mswin32 → 0.5.1-mswin32

data/CHANGES CHANGED
@@ -1,33 +1,61 @@
1
- 0.5.0 (?)
1
+ 0.5.1 (2007-07-18)
2
+ ========================
3
+
4
+ ruby-prof 0.5.1 is a bug fix and performance release.
5
+
6
+ Performance
7
+ --------
8
+ * Significantly reduced the number of thread lookups by
9
+ caching the last executed thread.
10
+
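The caching idea in the Performance entry can be illustrated with a minimal Ruby sketch. This is not ruby-prof's C implementation; the `ThreadDataCache` class, its `fetch` method, and the `lookups` counter are hypothetical names used only to show why remembering the last executed thread avoids most table lookups.

```ruby
# Hypothetical sketch: cache the most recently seen thread's data so that
# repeated events on the same thread skip the table lookup.
class ThreadDataCache
  attr_reader :lookups

  def initialize
    @table = {}        # thread id => per-thread data
    @last_id = nil
    @last_data = nil
    @lookups = 0       # number of table lookups actually performed
  end

  def fetch(thread_id)
    # Fast path: same thread as the previous event, no lookup needed.
    return @last_data if @last_data && @last_id == thread_id

    # Slow path: a context switch happened, so consult the table.
    @lookups += 1
    @last_data = (@table[thread_id] ||= { :id => thread_id, :events => 0 })
    @last_id = thread_id
    @last_data
  end
end

cache = ThreadDataCache.new
[1, 1, 1, 2, 2, 1].each { |id| cache.fetch(id)[:events] += 1 }
puts cache.lookups   # prints 3: one lookup per context switch, not per event
```

Six events cause only three lookups here, because consecutive events on the same thread hit the cached entry.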
11
+ Fixes
12
+ -------
13
+ * Properly escape method names in HTML reports
14
+ * Fix use of -m and --min-percent command line switches
15
+ * Default source file information to ruby_runtime#0 for c calls
16
+ * Moved rails_plugin to top level so it is more obvious
17
+ * Updated rails_plugin to write reports to the current
18
+ Rails log directory
19
+ * Added additional tests
20
+
21
+
22
+ 0.5.0 (2007-07-09)
2
23
  ========================
3
24
 
4
25
  Features
5
26
  --------
6
- * Added support for 64 bit systems (patch from Diego 'Flameeyes' Pettenò)
27
+ * Added support for timing multi-threaded applications
28
+ * Added support for 64 bit systems (patch from Diego 'Flameeyes' Petten)
7
29
  * Added support for outputting data in the format used by
8
30
  KCacheGrind (patch from Carl Shimer)
9
31
  * Add filename and line numbers to call tree information (patch from Carl Shimer)
10
32
  * Added Visual Studio 2005 project file.
33
+ * Added replace-progname switch, a la rcov.
34
+ * Added better support for recursive methods
35
+ * Added better support for profiling Rails applications
11
36
 
12
37
  Fixes
13
38
  -------
14
- * Fixes a bug when the type of an attached object (singleton) is inherited
15
- from T_OBJECT as opposed to being a T_OBJECT (identified by Francis Cianfrocca)
16
-
39
+ * Fixes bug when the type of an attached object (singleton) is inherited
40
+ from T_OBJECT as opposed to being a T_OBJECT (identified by Francis Cianfrocca)
41
+ * ruby-prof now works in IRB.
42
+ * Fix sort order in reports.
43
+ * Fixed rdoc compile error.
44
+ * Fix tabs in erb template for graph html report on windows.
17
45
 
18
46
  0.4.1 (2006-06-26)
19
47
  ========================
20
48
 
21
49
  Features
22
50
  --------
23
- * Added a RubyProf.running? method to indicate whether a profile is in progress.
51
+ * Added a RubyProf.running? method to indicate whether a profile is in progress.
24
52
  * Added tgz and zip archives to release
25
53
 
26
54
  Fixes
27
55
  -------
28
- * Duplicate method names are now allowed
29
- * The documentation has been updated to show the correct API usage is RubyProf.stop not RubyProf.end
30
-
56
+ * Duplicate method names are now allowed
57
+ * The documentation has been updated to show the correct API usage is RubyProf.stop not RubyProf.end
58
+
31
59
 
32
60
  0.4.0 (2006-06-16)
33
61
  ========================
data/README CHANGED
@@ -61,7 +61,7 @@ particular segments of code.
61
61
  # Print a flat profile to text
62
62
  printer = RubyProf::TextPrinter.new(result)
63
63
  printer.print(STDOUT, 0)
64
-
64
+
65
65
  Alternatively, you can use a block to tell ruby-prof what
66
66
  to profile:
67
67
 
@@ -83,7 +83,7 @@ to profile:
83
83
 
84
84
  The third way of using ruby-prof is by requiring unprof.rb:
85
85
 
86
- require 'unprof'
86
+ require 'unprof'
87
87
 
88
88
  This will start profiling immediately and will output the results
89
89
  using a flat profile report.
@@ -195,11 +195,11 @@ You may also specify the measure_mode by using the RUBY_PROF_MEASURE_MODE
195
195
  environment variable:
196
196
 
197
197
  * export RUBY_PROF_MEASURE_MODE=process
198
- * export RUBY_PROF_MEASURE_MODE=wall
198
+ * export RUBY_PROF_MEASURE_MODE=wall
199
199
  * export RUBY_PROF_MEASURE_MODE=cpu
200
200
  * export RUBY_PROF_MEASURE_MODE=allocations
201
201
 
202
- Note that these values have changed since ruby-prof-0.3.0.
202
+ Note that these values have changed since ruby-prof-0.3.0.
203
203
 
204
204
  On Linux, process time is measured using the clock method provided
205
205
  by the C runtime library. Note that the clock method does not
@@ -214,6 +214,12 @@ provided by the C runtime library. Note though, these values are
214
214
  wall times on Windows and not process times like on Linux.
215
215
  Wall time is measured using the GetLocalTime API.
216
216
 
217
+ If you use wall time, the results will be affected by other
218
+ processes running on your computer, network delays, disk access,
219
+ etc. As a result, for the best results, try to make sure your
220
+ computer is only performing your profiling run and is
221
+ otherwise quiescent.
222
+
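The difference between wall time and process time can be seen with a short Ruby sketch. It does not use ruby-prof itself; it assumes a Ruby version that provides `Process.clock_gettime` (2.1 or later) on a platform that supports `CLOCK_PROCESS_CPUTIME_ID`, and the variable names are illustrative only.

```ruby
# Compare wall-clock time with process CPU time around a sleep.
wall_start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
cpu_start  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID)

sleep 0.2  # the process is idle here, but the wall clock keeps running

wall_elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - wall_start
cpu_elapsed  = Process.clock_gettime(Process::CLOCK_PROCESS_CPUTIME_ID) - cpu_start

puts format("wall: %.3fs  cpu: %.3fs", wall_elapsed, cpu_elapsed)
```

The wall figure includes the time spent sleeping, while the CPU figure stays close to zero. Anything else that delays the process (other programs, disk, network) inflates the wall figure in the same way.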
217
223
  On both platforms, cpu time is measured using the RDTSC assembly
218
224
  function provided by the Pentium and PowerPC platforms. CPU time
219
225
  is dependent on the cpu's frequency. On Linux, ruby-prof attempts
@@ -221,31 +227,62 @@ to read this value from "/proc/cpuinfo." On Windows, you must
221
227
  specify the clock frequency. This can be done using the
222
228
  RUBY_PROF_CPU_FREQUENCY environment variable:
223
229
 
224
- export RUBY_PROF_CPU_FREQUENCY=<value>
225
-
230
+ export RUBY_PROF_CPU_FREQUENCY=<value>
231
+
226
232
  You can also directly set the cpu frequency by calling:
227
233
 
228
- RubyProf.cpu_frequency = <value>
234
+ RubyProf.cpu_frequency = <value>
229
235
 
230
236
 
231
237
  == Recursive Calls
232
238
 
233
239
  Recursive calls occur when method A calls method A and cycles
234
240
  occur when method A calls method B calls method C calls method A.
235
- ruby-prof can detect recursive calls any cycle calls, but does not
236
- currently report these in its output.
241
+ ruby-prof detects both direct recursive calls and cycles. Both
242
+ are indicated in reports by a dash and number following a method
243
+ name. For example, here is a flat profile from the test method
244
+ RecursiveTest#test_recursive:
245
+
246
+
247
+ %self total self wait child calls name
248
+ 100.00 2.00 2.00 0.00 0.00 2 Kernel#sleep
249
+ 0.00 2.00 0.00 0.00 2.00 0 RecursiveTest#test_cycle
250
+ 0.00 0.00 0.00 0.00 0.00 2 Fixnum#==
251
+ 0.00 0.00 0.00 0.00 0.00 2 Fixnum#-
252
+ 0.00 1.00 0.00 0.00 1.00 1 Object#sub_cycle-1
253
+ 0.00 2.00 0.00 0.00 2.00 1 Object#sub_cycle
254
+ 0.00 2.00 0.00 0.00 2.00 1 Object#cycle
255
+ 0.00 1.00 0.00 0.00 1.00 1 Object#cycle-1
256
+
257
+ Notice the presence of Object#cycle and Object#cycle-1. The -1 means
258
+ the method was called recursively, either directly or indirectly.
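As a rough illustration of the naming convention, the sketch below builds report names with a dash-and-number suffix for nested activations. The `report_name` helper and the simulated call sequence are hypothetical and are not taken from ruby-prof's source.

```ruby
# Hypothetical helper: build a report name, adding "-N" for the Nth nested
# (recursive or cyclic) activation of a method. Not ruby-prof's actual code.
def report_name(klass, method, depth)
  base = "#{klass}##{method}"
  depth.zero? ? base : "#{base}-#{depth}"
end

active = Hash.new(0)   # method name => activations currently on the stack
names = []

# Simulated call sequence for a cycle: cycle -> sub_cycle -> cycle -> sub_cycle
%w[cycle sub_cycle cycle sub_cycle].each do |m|
  names << report_name("Object", m, active[m])
  active[m] += 1
end

puts names
```

Under these assumptions the first activation of each method keeps its plain name, and the second activation appears as Object#cycle-1 and Object#sub_cycle-1.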
237
259
 
238
260
  However, the self time values for recursive calls should always
239
261
  be accurate. It is also believed that the total times are
240
262
  accurate, but these should be carefully analyzed to verify their veracity.
241
263
 
264
+ == Multi-threaded Applications
265
+
266
+ Unfortunately, Ruby does not provide an internal api
267
+ for detecting thread context switches. As a result, the
268
+ timings ruby-prof reports for each thread may be slightly
269
+ inaccurate. In particular, this will happen for newly
270
+ spawned threads that immediately go to sleep. For instance,
271
+ if you use Ruby's timeout library to wait for 2 seconds,
272
+ the 2 seconds will be assigned to the foreground thread
273
+ and not the newly created background thread. These errors
274
+ can largely be avoided if the background thread performs an
275
+ operation before going to sleep.
276
+
277
+
242
278
  == Performance
243
279
 
244
280
  Significant effort has been put into reducing ruby-prof's overhead
245
281
  as much as possible. Our tests show that the overhead associated
246
282
  with profiling code varies considerably with the code being
247
- profiled. On the low end overhead is around 10% while on the
248
- high end its can around 80%.
283
+ profiled. Most programs will run approximately twice as slow
284
+ while highly recursive programs (like the fibonacci series test)
285
+ will run three times slower.
249
286
 
250
287
  == Windows Binary
251
288
 
data/Rakefile CHANGED
@@ -5,7 +5,7 @@ require 'rake/rdoctask'
5
5
  SO_NAME = "ruby_prof.so"
6
6
 
7
7
  # ------- Default Package ----------
8
- RUBY_PROF_VERSION = "0.5.0"
8
+ RUBY_PROF_VERSION = "0.5.1"
9
9
 
10
10
  FILES = FileList[
11
11
  'Rakefile',
@@ -117,16 +117,16 @@ Rake::RDocTask.new("rdoc") do |rdoc|
117
117
  rdoc.options << "--inline-source" << "--line-numbers"
118
118
  # Make the readme file the start page for the generated html
119
119
  rdoc.options << '--main' << 'README'
120
- rdoc.rdoc_files.include('bin/**/*',
121
- 'doc/*.rdoc',
122
- 'examples/flat.txt',
123
- 'examples/graph.txt',
124
- 'examples/graph.html',
125
- 'lib/**/*.rb',
126
- 'ext/**/ruby_prof.c',
127
- 'README',
128
- 'LICENSE')
129
- end
120
+ rdoc.rdoc_files.include('bin/**/*',
121
+ 'doc/*.rdoc',
122
+ 'examples/flat.txt',
123
+ 'examples/graph.txt',
124
+ 'examples/graph.html',
125
+ 'lib/**/*.rb',
126
+ 'ext/**/ruby_prof.c',
127
+ 'README',
128
+ 'LICENSE')
129
+ end
130
130
 
131
131
 
132
132
  # --------- Publish to RubyForge ----------------
data/bin/ruby-prof CHANGED
@@ -15,7 +15,10 @@
15
15
  # graph_html - Prints a graph profile as html.
16
16
  # call_tree - format for KCacheGrind
17
17
  # -f, --file=path Output results to a file instead of standard out.
18
- # -m, --measure-mode=measure_mode Select a measurement mode:
18
+ # -m, --min_percent=min_percent The minimum percent a method must take before
19
+ # being included in output reports. Should be an
20
+ # integer between 1 and 100. 0 means all methods are printed.
21
+ # --mode=measure_mode Select a measurement mode:
19
22
  # process - Use process time (default).
20
23
  # wall - Use wall time.
21
24
  # cpu - Use the CPU clock counter
@@ -79,16 +82,15 @@ opts = OptionParser.new do |opts|
79
82
  options.file = file
80
83
  end
81
84
 
82
- opts.on('-m measure_mode', '--measure-mode=measure_mode',
85
+ opts.on('--mode=measure_mode',
83
86
  [:process, :wall, :cpu, :allocations],
84
87
  'Select what ruby-prof should measure:',
85
88
  ' process - Process time (default).',
86
89
  ' wall - Wall time.',
87
- ' cpu - CPU time',
88
- ' (only supported on Pentium and PowerPCs).',
89
- ' allocations - O3bject allocations (required patched Ruby interpreter).') do |measure_mode|
90
+ ' cpu - CPU time (Pentium and PowerPCs only).',
91
+ ' allocations - Object allocations (requires patched Ruby interpreter).') do |measure_mode|
90
92
 
91
- case measure_mode
93
+ case mode
92
94
  when :process
93
95
  options.measure_mode = RubyProf::PROCESS_TIME
94
96
  when :wall
@@ -100,14 +102,15 @@ opts = OptionParser.new do |opts|
100
102
  end
101
103
  end
102
104
 
103
- opts.on("--replace-progname", "Replace $0 when loading the .rb files.") do
104
- options.replace_prog_name = true
105
- end
105
+ opts.on("--replace-progname", "Replace $0 when loading the .rb files.") do
106
+ options.replace_prog_name = true
107
+ end
106
108
 
107
109
  opts.on_tail("-h", "--help", "Show help message") do
108
110
  puts opts
109
111
  exit
110
112
  end
113
+
111
114
  opts.on_tail("-v", "--version", "Show version") do
112
115
  puts "ruby_prof " + RubyProf::VERSION
113
116
  exit
@@ -147,11 +150,11 @@ at_exit {
147
150
  # Get output
148
151
  if options.file
149
152
  File.open(options.file, 'w') do |file|
150
- printer.print(file, options.min_percent)
153
+ printer.print(file, {:min_percent => options.min_percent})
151
154
  end
152
155
  else
153
156
  # Print out results
154
- printer.print(STDOUT, options.min_percent)
157
+ printer.print(STDOUT, {:min_percent => options.min_percent})
155
158
  end
156
159
  }
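The switch from a positional argument to an options hash in the `print` calls can be sketched as follows. `SketchPrinter` is a hypothetical stand-in written for this illustration, not one of ruby-prof's printer classes; only the `:min_percent` key comes from the change above, and the default of 0 is an assumption.

```ruby
require 'stringio'

# Hypothetical stand-in for a printer; not ruby-prof's implementation.
class SketchPrinter
  def initialize(rows)
    @rows = rows   # array of [method_name, percent_of_total]
  end

  # Options are passed as a hash so further keys can be added later
  # without changing the method signature.
  def print(output = STDOUT, options = {})
    min_percent = options[:min_percent] || 0   # assumed default: print everything
    @rows.each do |name, percent|
      next if percent < min_percent
      output.puts format("%6.2f  %s", percent, name)
    end
  end
end

rows = [["Kernel#sleep", 97.5], ["Fixnum#==", 1.5], ["Fixnum#-", 1.0]]
out = StringIO.new
SketchPrinter.new(rows).print(out, { :min_percent => 1.5 })
puts out.string
```

With a threshold of 1.5, the row for Fixnum#- is filtered out and the other two rows are printed.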
157
160
 
data/ext/ruby_prof.c CHANGED
@@ -57,7 +57,7 @@
57
57
 
58
58
  /* ================ Constants =================*/
59
59
  #define INITIAL_STACK_SIZE 8
60
- #define PROF_VERSION "0.5.0"
60
+ #define PROF_VERSION "0.5.1"
61
61
 
62
62
 
63
63
  /* ================ Measurement =================*/
@@ -159,7 +159,9 @@ static thread_data_t* last_thread_data = NULL;
159
159
  static inline long
160
160
  get_thread_id(VALUE thread)
161
161
  {
162
- return NUM2ULONG(rb_obj_id(thread));
162
+ //return NUM2ULONG(rb_obj_id(thread));
163
+ // From line 1997 in gc.c
164
+ return (long)thread;
163
165
  }
164
166
 
165
167
  static VALUE
@@ -572,7 +574,8 @@ the RubyProf::Result object.
572
574
 
573
575
  /* :nodoc: */
574
576
  static prof_method_t *
575
- prof_method_create(NODE *node, st_data_t key, VALUE klass, ID mid, int depth)
577
+ prof_method_create(st_data_t key, VALUE klass, ID mid, int depth,
578
+ const char* source_file, int line)
576
579
  {
577
580
  prof_method_t *result = ALLOC(prof_method_t);
578
581
 
@@ -590,8 +593,8 @@ prof_method_create(NODE *node, st_data_t key, VALUE klass, ID mid, int depth)
590
593
  result->active_frame = 0;
591
594
  result->base = result;
592
595
 
593
- result->source_file = (node ? node->nd_file : 0);
594
- result->line = (node ? nd_line(node) : 0);
596
+ result->source_file = source_file;
597
+ result->line = line;
595
598
  return result;
596
599
  }
597
600
 
@@ -706,7 +709,7 @@ static VALUE prof_method_source_file(VALUE self)
706
709
  const char* sf = get_prof_method(self)->source_file;
707
710
  if(!sf)
708
711
  {
709
- return Qnil;
712
+ return rb_str_new2("ruby_runtime");
710
713
  }
711
714
  else
712
715
  {
@@ -917,25 +920,23 @@ threads_table_insert(st_table *table, VALUE thread, thread_data_t *thread_data)
917
920
  }
918
921
 
919
922
  static inline thread_data_t *
920
- threads_table_lookup(st_table *table, VALUE thread)
923
+ threads_table_lookup(st_table *table, long thread_id)
921
924
  {
922
925
  thread_data_t* result;
923
926
  st_data_t val;
924
927
 
925
928
  /* It's too slow to key on the real thread id so just typecast thread instead. */
926
- if (st_lookup(table, (st_data_t) thread, &val))
929
+ if (st_lookup(table, (st_data_t) thread_id, &val))
927
930
  {
928
931
  result = (thread_data_t *) val;
929
932
  }
930
933
  else
931
934
  {
932
935
  result = thread_data_create();
933
-
934
- /* Store the real thread id here so it can be shown in the results. */
935
- result->thread_id = get_thread_id(thread);
936
+ result->thread_id = thread_id;
936
937
 
937
938
  /* Insert the table */
938
- threads_table_insert(threads_tbl, thread, result);
939
+ threads_table_insert(threads_tbl, thread_id, result);
939
940
  }
940
941
  return result;
941
942
  }
@@ -1076,10 +1077,11 @@ prof_event_hook(rb_event_t event, NODE *node, VALUE self, ID mid, VALUE klass)
1076
1077
  VALUE thread;
1077
1078
  prof_measure_t now = 0;
1078
1079
  thread_data_t* thread_data = NULL;
1080
+ long thread_id = 0;
1079
1081
  prof_frame_t *frame = NULL;
1080
1082
 
1081
-
1082
- /* {
1083
+ /*
1084
+ {
1083
1085
  st_data_t key = 0;
1084
1086
  static unsigned long last_thread_id = 0;
1085
1087
 
@@ -1110,29 +1112,42 @@ prof_event_hook(rb_event_t event, NODE *node, VALUE self, ID mid, VALUE klass)
1110
1112
  /* Get current measurement*/
1111
1113
  now = get_measurement();
1112
1114
 
1113
- /* Get the current thread and thread data. */
1115
+ /* Get the current thread information. */
1114
1116
  thread = rb_thread_current();
1115
- thread_data = threads_table_lookup(threads_tbl, thread);
1117
+ thread_id = get_thread_id(thread);
1116
1118
 
1117
- /* Get the frame at the top of the stack. This may represent
1118
- the current method (EVENT_LINE, EVENT_RETURN) or the
1119
- previous method (EVENT_CALL).*/
1120
- frame = stack_peek(thread_data->stack);
1121
-
1122
- /* Check for a context switch */
1123
- if (last_thread_data && last_thread_data != thread_data)
1119
+ /* Was there a context switch? */
1120
+ if (!last_thread_data || last_thread_data->thread_id != thread_id)
1124
1121
  {
1125
- /* Note how long have we been waiting. */
1126
- prof_measure_t wait_time = now - thread_data->last_switch;
1122
+ prof_measure_t wait_time = 0;
1123
+
1124
+ /* Get new thread information. */
1125
+ thread_data = threads_table_lookup(threads_tbl, thread_id);
1126
+
1127
+ /* How long has this thread been waiting? */
1128
+ wait_time = now - thread_data->last_switch;
1129
+ thread_data->last_switch = 0;
1130
+
1131
+ /* Get the frame at the top of the stack. This may represent
1132
+ the current method (EVENT_LINE, EVENT_RETURN) or the
1133
+ previous method (EVENT_CALL).*/
1134
+ frame = stack_peek(thread_data->stack);
1135
+
1127
1136
  if (frame)
1128
1137
  frame->wait_time += wait_time;
1129
1138
 
1130
1139
  /* Save on the last thread the time of the context switch
1131
1140
  and reset this thread's last context switch to 0.*/
1132
- last_thread_data->last_switch = now;
1133
- thread_data->last_switch = 0;
1141
+ if (last_thread_data)
1142
+ last_thread_data->last_switch = now;
1143
+
1144
+ last_thread_data = thread_data;
1145
+ }
1146
+ else
1147
+ {
1148
+ thread_data = last_thread_data;
1149
+ frame = stack_peek(thread_data->stack);
1134
1150
  }
1135
- last_thread_data = thread_data;
1136
1151
 
1137
1152
  switch (event) {
1138
1153
  case RUBY_EVENT_LINE:
@@ -1156,7 +1171,7 @@ prof_event_hook(rb_event_t event, NODE *node, VALUE self, ID mid, VALUE klass)
1156
1171
  int depth = 0;
1157
1172
  st_data_t key = 0;
1158
1173
  prof_method_t *method = NULL;
1159
-
1174
+
1160
1175
  /* Is this an include for a module? If so get the actual
1161
1176
  module class since we want to combine all profiling
1162
1177
  results for that module. */
@@ -1170,7 +1185,17 @@ prof_event_hook(rb_event_t event, NODE *node, VALUE self, ID mid, VALUE klass)
1170
1185
 
1171
1186
  if (!method)
1172
1187
  {
1173
- method = prof_method_create(node, key, klass, mid, depth);
1188
+ const char* source_file = (node ? node->nd_file : 0);
1189
+ int line = (node ? nd_line(node) : 0);
1190
+
1191
+ /* Line numbers are not accurate for c method calls */
1192
+ if (event == RUBY_EVENT_C_CALL)
1193
+ {
1194
+ line = 0;
1195
+ source_file = NULL;
1196
+ }
1197
+
1198
+ method = prof_method_create(key, klass, mid, depth, source_file, line);
1174
1199
  method_info_table_insert(thread_data->method_info_table, key, method);
1175
1200
  }
1176
1201
 
@@ -1185,7 +1210,17 @@ prof_event_hook(rb_event_t event, NODE *node, VALUE self, ID mid, VALUE klass)
1185
1210
 
1186
1211
  if (!method)
1187
1212
  {
1188
- method = prof_method_create(node, key, klass, mid, depth);
1213
+ const char* source_file = (node ? node->nd_file : 0);
1214
+ int line = (node ? nd_line(node) : 0);
1215
+
1216
+ /* Line numbers are not accurate for c method calls */
1217
+ if (event == RUBY_EVENT_C_CALL)
1218
+ {
1219
+ line = 0;
1220
+ source_file = NULL;
1221
+ }
1222
+
1223
+ method = prof_method_create(key, klass, mid, depth, source_file, line);
1189
1224
  method->base = base_method;
1190
1225
  method_info_table_insert(thread_data->method_info_table, key, method);
1191
1226
  }