progressor 0.0.1 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: fa7eacefe70e6b186f9a72625d6158db19b7f9b45b327c397b7b4402dc56bc47
-   data.tar.gz: 5cb7a42fffe3d6650a48454e41c73d8f74e7cb72fdf521fb2fdd6e59bb2cae8c
+   metadata.gz: 67733b5ccdcf4efdbe628a511246b38238f5bd368e320ae9251ec34dc1be7cc5
+   data.tar.gz: 2c84c065e48fdfab6e9ab4079c7c595be60cbfe16f7378970a451cebe36af326
  SHA512:
-   metadata.gz: 2d11d38113127328cb2d0cd6b6d98652f9e6010816aebdbbd0c44cc27b331ba03c9b961308f2ec71e6abb5f0b0694f4952132214a126cdbecb5d062741e7cbe0
-   data.tar.gz: cba7008ee6ca3151ba3c89159dc7835d8d76b5cb981da4017d3dffe0fdb14a9bf0c3ef9e18a6f532ab220f4abececc487a8145e86e34b77b226b9a81f9473f13
+   metadata.gz: 4735274940fcb4bd4b54089dce074bbbb6fe42cc530db202f3a3f4738c396624685d3d6740e8707882d8cfa9c0f4e2053851f5ed1c987c3d0a88c1fb665b2f28
+   data.tar.gz: 375e03f4109ea4a74de4483b55aa889611d7d2d0ec8901950363b2e5196f8652fb8995d20bcddb8d8118e842e6ead21d71bb47dec00cd071795962361f505371
data/README.md CHANGED
@@ -1,8 +1,31 @@
- A very basic library to measure loops in a long-running task.
+ Full documentation for the latest released version can be found at: https://www.rubydoc.info/gems/progressor
 
- *Note: Very incomplete, so mostly for personal usage. Will hopefully flesh it out, write tests, configuration, etc, at some point (PRs welcome). Until then, a similar library can be found here: https://github.com/mkdynamic/ke*
+ ## Basic example
 
- Example usage:
+ Here's an example long-running task:
+
+ ``` ruby
+ Product.find_each do |product|
+   next if product.not_something_we_want_to_process?
+   product.calculate_interesting_stats
+ end
+ ```
+
+ In order to understand how it's progressing, we might add some print statements:
+
+ ``` ruby
+ Product.find_each do |product|
+   if product.not_something_we_want_to_process?
+     puts "Skipping product: #{product.id}"
+     next
+   end
+
+   puts "Working on product: #{product.id}"
+   product.calculate_interesting_stats
+ end
+ ```
+
+ This gives us some indication of progress, but no idea how much time is left. We could take a count and maintain a manual index, and then eyeball it based on how fast the numbers are adding up. Progressor automates that process:
 
  ``` ruby
  progressor = Progressor.new(total_count: Product.count)
@@ -20,12 +43,79 @@ Product.find_each do |product|
  end
  ```
 
- Example output:
+ Each invocation of `run` measures how long its block took and records it. The yielded `progress` parameter is an object that can be `to_s`-ed to provide progress information.
+
+ The output might look like this:
 
  ```
  ...
- [0038/1000, (004%), t/i: 0.5s, ETA: 8m:0.27s] Product 38
- [0039/1000, (004%), t/i: 0.5s, ETA: 7m:58.47s] Product 39
- [0040/1000, (004%), t/i: 0.5s, ETA: 7m:57.08s] Product 40
+ [0038/1000, (004%), t/i: 0.5s, ETA: 8m:00s] Product 38
+ [0039/1000, (004%), t/i: 0.5s, ETA: 7m:58s] Product 39
+ [0040/1000, (004%), t/i: 0.5s, ETA: 7m:57s] Product 40
  ...
  ```
+
+ You can check the documentation for the [Progressor](https://www.rubydoc.info/gems/progressor/Progressor) class for details on the methods you can call to get the individual pieces of data shown in the report.
+
+ ## Limited and unlimited sequences
+
+ Initializing a `Progressor` with a provided `total_count:` parameter gives you a limited sequence, which can give you not only a progress report, but an estimation of when it'll be done:
+
+ ```
+ [<current loop>/<total count>, (<progress>%), t/i: <time per iteration>, ETA: <time until it's done>]
+ ```
+
+ The calculation is done by maintaining a list of measurements with a limited size, and a list of averages of those measurements. The average of averages is the "time per iteration" and it's multiplied by the remaining count to produce the estimation.
+
+ I can't really say how reliable this is, but it seems to provide smoothly changing estimations that seem more or less correct to me, for similarly-sized chunks of work per iteration.
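
The average-of-averages estimate described in the README text above can be sketched in a few lines of plain Ruby. This is a simplified illustration rather than the gem's code: `RollingEstimate` and its method names are hypothetical, and the window size of 3 is chosen only to keep the arithmetic easy to follow.

```ruby
# Hypothetical helper illustrating the average-of-averages estimate.
# It is not part of the gem; names and defaults are assumptions.
class RollingEstimate
  def initialize(total_count:, max_samples: 3)
    @total_count  = total_count
    @max_samples  = max_samples
    @current      = 0
    @measurements = []
    @averages     = []
  end

  # Record the duration (in seconds) of one finished iteration.
  def push(duration)
    @current += 1
    @measurements << duration
    @measurements.shift if @measurements.size > @max_samples

    # Store the average of the current window, in a bounded list of its own.
    @averages << average(@measurements)
    @averages.shift if @averages.size > @max_samples
  end

  # Time per iteration: the average of the stored averages.
  def per_iteration
    return nil if @averages.empty?
    average(@averages)
  end

  # Remaining time: time per iteration multiplied by the remaining count.
  def eta
    return nil if per_iteration.nil?
    per_iteration * (@total_count - @current)
  end

  private

  def average(values)
    values.sum / values.size.to_f
  end
end

estimate = RollingEstimate.new(total_count: 10)
[1.0, 2.0, 3.0].each { |duration| estimate.push(duration) }

puts estimate.per_iteration # average of the window averages 1.0, 1.5 and 2.0
puts estimate.eta           # that value multiplied by the 7 remaining loops
```

Averaging the averages means a single slow iteration moves the estimate less than it would with a plain mean of the raw measurements, which is consistent with the "smoothly changing estimations" mentioned above.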
+
+ **Not** providing a `total_count:` parameter leads to less available information:
+
+ ``` ruby
+ progressor = Progressor.new
+
+ (1..100).each do |i|
+   progressor.run do |progress|
+     sleep rand
+     puts progress
+   end
+ end
+ ```
+
+ A sample of output might look like this:
+
+ ```
+ ...
+ 11, t: 5.32s, t/i: 442.39ms
+ 12, t: 5.58s, t/i: 446.11ms
+ ...
+ ```
+
+ The format is:
+
+ ```
+ <current>, t: <time from start>, t/i: <time per iteration>
+ ```
+
+ ## Configuration
+
+ Apart from `total_count`, which is optional and affects the kind of sequence that will be stored, you can provide `min_samples` and `max_samples`. You can also provide a custom formatter:
+
+ ``` ruby
+ progressor = Progressor.new({
+   total_count: 1000,
+   min_samples: 5,
+   max_samples: 10,
+   formatter: -> (p) { p.eta }
+ })
+ ```
+
+ The option `min_samples` determines how many loops the tool will wait until trying to produce an estimation. A higher number means no information in the beginning, but no wild fluctuations, either. It needs to be at least 1 and the default is 1.
+
+ The option `max_samples` is how many measurements will be retained. Those measurements will be averaged, and then those averages averaged to get a time-per-iteration estimate. A smaller number means giving more weight to later events, while a larger one would average over a larger amount of samples. The default is 100.
+
+ The `formatter` is a callback that gets a progress object as an argument and you can return your own string to output on every loop. Check `LimitedSequence` and `UnlimitedSequence` for the available methods and accessors you can use.
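
As a rough illustration of the formatter callback described above, here is a sketch that runs without the gem installed. `FakeProgress` is a hypothetical stand-in for the progress object; it only mirrors the `current`, `total_count`, `per_iteration` and `eta` readers that `LimitedSequence` exposes in this release, and the wording of the output string is made up for the example.

```ruby
# Hypothetical stand-in for the progress object, so the sketch runs on its own.
# With the gem, the formatter would receive a Progressor::LimitedSequence.
FakeProgress = Struct.new(:current, :total_count, :per_iteration, :eta)

# A formatter is any callable that takes the progress object and returns a string.
formatter = lambda do |progress|
  # `eta` is nil until enough samples have been collected.
  remaining = progress.eta ? "#{progress.eta.round}s left" : "estimating..."
  "#{progress.current} of #{progress.total_count} (#{remaining})"
end

puts formatter.call(FakeProgress.new(40, 1000, 0.5, 480.0))
puts formatter.call(FakeProgress.new(1, 1000, nil, nil))
```

With the gem loaded, the same lambda would presumably be passed as `Progressor.new(total_count: 1000, formatter: formatter)`, matching the configuration example above. Handling a `nil` ETA matters because `eta` returns `nil` before `min_samples` measurements exist.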
+
+ ## Related work
+
+ A very similar tool is the gem [ke](https://github.com/mkdynamic/ke). It provides its estimation by maintaining the median quartile range of the stored measurements, removing outliers. It also automates the output of the progress report, only printing it every N loops. Depending on your needs and preferences, it might be better for your use case.
data/lib/progressor.rb CHANGED
@@ -1,7 +1,13 @@
+ require 'progressor/version'
+ require 'progressor/error'
+ require 'progressor/formatting'
+ require 'progressor/limited_sequence'
+ require 'progressor/unlimited_sequence'
+
  require 'benchmark'
 
  # Used to measure the running time of parts of a long-running task and output
- # an estimation based on the average of the last 10-100 measurements.
+ # an estimation based on the average of the last 1-100 measurements.
  #
  # Example usage:
  #
@@ -22,21 +28,23 @@ require 'benchmark'
  # Example output:
  #
  # ...
- # [0038/1000, (004%), t/i: 0.5s, ETA: 8m:0.27s] Product 38
- # [0039/1000, (004%), t/i: 0.5s, ETA: 7m:58.47s] Product 39
- # [0040/1000, (004%), t/i: 0.5s, ETA: 7m:57.08s] Product 40
+ # [0038/1000, 004%, t/i: 0.5s, ETA: 8m:00s] Product 38
+ # [0039/1000, 004%, t/i: 0.5s, ETA: 7m:58s] Product 39
+ # [0040/1000, 004%, t/i: 0.5s, ETA: 7m:57s] Product 40
  # ...
  #
  class Progressor
-   VERSION = '0.0.1'
+   include Formatting
 
    # Utility method to print a message with the time it took to run the contents
    # of the block.
    #
-   # > Progressor.puts("Working on a thing") { thing_work }
+   # Progressor.puts("Working on a thing") { thing_work }
+   #
+   # Output:
    #
-   # Working on a thing...
-   # Working on a thing DONE: 2.1s
+   # Working on a thing...
+   # Working on a thing DONE: 2.1s
    #
    def self.puts(message, &block)
      Kernel.puts "#{message}..."
@@ -44,76 +52,49 @@ class Progressor
      Kernel.puts "#{message} DONE: #{format_time(measurement.real)}"
    end
 
-   def initialize(total_count:)
-     @total_count = total_count
-     @total_count_digits = total_count.to_s.length
-     @current = 0
-     @measurements = []
-     @averages = []
+   # Set up a new Progressor instance. Optional parameters:
+   #
+   # - total_count: If given, the tool will be able to provide an ETA.
+   #
+   # - min_samples: The number of samples to collect before attempting to
+   # calculate a time per iteration. Default: 1
+   #
+   # - max_samples: The maximum number of measurements to collect and average.
+   # Default: 100.
+   #
+   # - formatter: A callable that accepts a progress object and returns a
+   # custom formatted string.
+   #
+   def initialize(total_count: nil, min_samples: 1, max_samples: 100, formatter: nil)
+     params = {
+       min_samples: min_samples,
+       max_samples: max_samples,
+       formatter: formatter,
+     }
+
+     if total_count
+       @sequence = LimitedSequence.new(total_count: total_count, **params)
+     else
+       @sequence = UnlimitedSequence.new(**params)
+     end
    end
 
+   # Run the given block of code, yielding a sequence object that holds progress
+   # information.
+   #
+   # Example usage:
+   #
+   # progressor.run { |progress| puts progress; long_running_task() }
+   #
    def run
-     @current += 1
-
-     measurement = Benchmark.measure { yield self }
-
-     @measurements << measurement.real
-     # only keep last 1000
-     @measurements.shift if @measurements.count > 1000
-
-     @averages << average(@measurements)
-     @averages = @averages.compact
-     # only keep last 100
-     @averages.shift if @averages.count > 100
+     measurement = Benchmark.measure { yield @sequence }
+     @sequence.push(measurement.real)
    end
 
+   # Skips the given number of loops (will likely be 1), updating the
+   # estimations appropriately.
+   #
    def skip(n)
-     @total_count -= n
-   end
-
-   def to_s
-     [
-       "#{@current.to_s.rjust(@total_count_digits, '0')}/#{@total_count}",
-       "(#{((@current / @total_count.to_f) * 100).round.to_s.rjust(3, '0')}%)",
-       "t/i: #{self.class.format_time(per_iteration)}",
-       "ETA: #{self.class.format_time(eta)}",
-     ].join(', ')
-   end
-
-   def per_iteration
-     return nil if @measurements.count < 10
-     average(@averages)
-   end
-
-   def eta
-     return nil if @measurements.count < 10
-
-     remaining_time = per_iteration * (@total_count - @current)
-     remaining_time.round(2)
-   end
-
-   private
-
-   def self.format_time(time)
-     return "?s" if time.nil?
-
-     if time < 0.1
-       "#{(time * 1000).round(2)}ms"
-     elsif time < 60
-       "#{time.round(2)}s"
-     elsif time < 3600
-       minutes = time.to_i / 60
-       seconds = (time - minutes * 60).round(2)
-       "#{minutes}m:#{seconds}s"
-     else
-       hours = time.to_i / 3600
-       minutes = (time.to_i % 3600) / 60
-       seconds = (time - (hours * 3600 + minutes * 60)).round(2)
-       "#{hours}h:#{minutes}m:#{seconds}s"
-     end
-   end
-
-   def average(collection)
-     collection.inject(&:+) / collection.count.to_f
+     @sequence.skip(n)
    end
  end
data/lib/progressor/error.rb ADDED
@@ -0,0 +1,7 @@
+ class Progressor
+   # A custom error class for targeted catching. All Progressor errors will be
+   # wrapped in a Progressor::Error.
+   #
+   class Error < RuntimeError
+   end
+ end
data/lib/progressor/formatting.rb ADDED
@@ -0,0 +1,39 @@
+ class Progressor
+   module Formatting
+     # Formats the given time in seconds to something human readable. Examples:
+     #
+     # - 1 second: 1.00s
+     # - 0.123 seconds: 123.00ms
+     # - 100 seconds: 01m:40s
+     # - 101.5 seconds: 01m:41s
+     # - 3661 seconds: 01h:01m:01s
+     def format_time(time)
+       return "?s" if time.nil?
+
+       if time < 1
+         "#{format_float((time * 1000).round(2))}ms"
+       elsif time < 60
+         "#{format_float(time.round(2))}s"
+       elsif time < 3600
+         minutes = time.to_i / 60
+         seconds = (time - minutes * 60).round(2)
+         "#{format_int(minutes)}m:#{format_int(seconds)}s"
+       else
+         hours = time.to_i / 3600
+         minutes = (time.to_i % 3600) / 60
+         seconds = (time - (hours * 3600 + minutes * 60)).round(2)
+         "#{format_int(hours)}h:#{format_int(minutes)}m:#{format_int(seconds)}s"
+       end
+     end
+
+     # :nodoc:
+     def format_int(value)
+       sprintf("%02d", value)
+     end
+
+     # :nodoc:
+     def format_float(value)
+       sprintf("%0.2f", value)
+     end
+   end
+ end
data/lib/progressor/limited_sequence.rb ADDED
@@ -0,0 +1,126 @@
+ class Progressor
+   class LimitedSequence
+     include Formatting
+
+     attr_reader :total_count, :min_samples, :max_samples
+
+     # The current loop index, starts at 1
+     attr_reader :current
+
+     # The time the object was created
+     attr_reader :start_time
+
+     # Creates a new LimitedSequence with the given parameters:
+     #
+     # - total_count: The expected number of loops.
+     #
+     # - min_samples: The number of samples to collect before attempting to
+     # calculate a time per iteration. Default: 1
+     #
+     # - max_samples: The maximum number of measurements to collect and average.
+     # Default: 100.
+     #
+     # - formatter: A callable that accepts the sequence object and returns a
+     # custom formatted string.
+     #
+     def initialize(total_count:, min_samples: 1, max_samples: 100, formatter: nil)
+       @total_count = total_count
+       @min_samples = min_samples
+       @max_samples = [max_samples, total_count].min
+       @formatter = formatter
+
+       raise Error.new("min_samples needs to be a positive number") if min_samples <= 0
+       raise Error.new("max_samples needs to be larger than min_samples") if max_samples <= min_samples
+
+       @start_time = Time.now
+       @total_count_digits = total_count.to_s.length
+       @current = 0
+       @measurements = []
+       @averages = []
+     end
+
+     # Adds a duration in seconds to the internal storage of samples. Updates
+     # averages accordingly.
+     #
+     def push(duration)
+       @current += 1
+       @measurements << duration
+       # only keep last `max_samples`
+       @measurements.shift if @measurements.count > max_samples
+
+       @averages << average(@measurements)
+       @averages = @averages.compact
+       # only keep last `max_samples`
+       @averages.shift if @averages.count > max_samples
+     end
+
+     # Skips an iteration, updating the total count and ETA
+     #
+     def skip(n)
+       @total_count -= n
+     end
+
+     # Outputs a textual representation of the current state of the
+     # LimitedSequence. Shows:
+     #
+     # - the current number of iterations and the total count
+     # - completion level in percentage
+     # - how long a single iteration takes
+     # - estimated time of arrival (ETA) -- time until it's done
+     #
+     # A custom `formatter` provided at construction time overrides this default
+     # output.
+     #
+     # If the "current" number of iterations goes over the total count, an ETA
+     # can't be shown anymore, so it'll just be the current number over the
+     # expected one, and the time per iteration.
+     #
+     def to_s
+       return @formatter.call(self).to_s if @formatter
+
+       if @current > @total_count
+         return [
+           "#{@current} (expected #{@total_count})",
+           "t/i: #{format_time(per_iteration)}",
+           "ETA: ???",
+         ].join(', ')
+       end
+
+       [
+         "#{@current.to_s.rjust(@total_count_digits, '0')}/#{@total_count}",
+         "#{((@current / @total_count.to_f) * 100).round.to_s.rjust(3, '0')}%",
+         "t/i: #{format_time(per_iteration)}",
+         "ETA: #{format_time(eta)}",
+       ].join(', ')
+     end
+
+     # Returns an estimation for the time per single iteration. Implemented as
+     # an average of averages to provide a smoother gradient from loop to loop.
+     #
+     # Returns nil if not enough samples have been collected yet.
+     #
+     def per_iteration
+       return nil if @measurements.count < min_samples
+       average(@averages)
+     end
+
+     # Returns an estimation for the Estimated Time of Arrival (time until
+     # done).
+     #
+     # Calculated by multiplying the average time per iteration with the
+     # remaining number of loops.
+     #
+     def eta
+       return nil if @measurements.count < min_samples
+
+       remaining_time = per_iteration * (@total_count - @current)
+       remaining_time.round(2)
+     end
+
+     private
+
+     def average(collection)
+       collection.inject(&:+) / collection.count.to_f
+     end
+   end
+ end
data/lib/progressor/unlimited_sequence.rb ADDED
@@ -0,0 +1,104 @@
+ class Progressor
+   class UnlimitedSequence
+     include Formatting
+
+     attr_reader :min_samples, :max_samples
+
+     # The current loop index, starts at 1
+     attr_reader :current
+
+     # The time the object was created
+     attr_reader :start_time
+
+     # Creates a new UnlimitedSequence with the given parameters:
+     #
+     # - min_samples: The number of samples to collect before attempting to
+     # calculate a time per iteration. Default: 1
+     #
+     # - max_samples: The maximum number of measurements to collect and average.
+     # Default: 100.
+     #
+     # - formatter: A callable that accepts the sequence object and returns a
+     # custom formatted string.
+     #
+     def initialize(min_samples: 1, max_samples: 100, formatter: nil)
+       @min_samples = min_samples
+       @max_samples = max_samples
+       @formatter = formatter
+
+       raise Error.new("min_samples needs to be a positive number") if min_samples <= 0
+       raise Error.new("max_samples needs to be larger than min_samples") if max_samples <= min_samples
+
+       @start_time = Time.now
+       @current = 0
+       @measurements = []
+       @averages = []
+     end
+
+     # Adds a duration in seconds to the internal storage of samples. Updates
+     # averages accordingly.
+     #
+     def push(duration)
+       @current += 1
+       @measurements << duration
+       # only keep last `max_samples`
+       @measurements.shift if @measurements.count > max_samples
+
+       @averages << average(@measurements)
+       @averages = @averages.compact
+       # only keep last `max_samples`
+       @averages.shift if @averages.count > max_samples
+     end
+
+     # "Skips" an iteration, which, in the context of an UnlimitedSequence is a no-op.
+     #
+     def skip(_n)
+       # Nothing to do
+     end
+
+     # Outputs a textual representation of the current state of the
+     # UnlimitedSequence. Shows:
+     #
+     # - the current (1-indexed) number of iterations
+     # - how long since the start time
+     # - how long a single iteration takes
+     #
+     # A custom `formatter` provided at construction time overrides this default
+     # output.
+     #
+     def to_s
+       return @formatter.call(self).to_s if @formatter
+
+       [
+         "#{@current + 1}",
+         "t: #{format_time(Time.now - @start_time)}",
+         "t/i: #{format_time(per_iteration)}",
+       ].join(', ')
+     end
+
+     # Returns an estimation for the time per single iteration. Implemented as
+     # an average of averages to provide a smoother gradient from loop to loop.
+     #
+     # Returns nil if not enough samples have been collected yet.
+     #
+     def per_iteration
+       return nil if @measurements.count < min_samples
+       average(@averages)
+     end
+
+     # Is supposed to return an estimation for the Estimated Time of Arrival
+     # (time until done).
+     #
+     # For an UnlimitedSequence, this always returns nil.
+     #
+     def eta
+       # No estimation possible
+     end
+
+     private
+
+     def average(collection)
+       collection.inject(&:+) / collection.count.to_f
+     end
+   end
+ end
data/lib/progressor/version.rb ADDED
@@ -0,0 +1,3 @@
+ class Progressor
+   VERSION = '0.1.0'
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: progressor
  version: !ruby/object:Gem::Version
-   version: 0.0.1
+   version: 0.1.0
  platform: ruby
  authors:
  - Andrew Radev
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2019-02-20 00:00:00.000000000 Z
+ date: 2019-03-15 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -52,6 +52,20 @@ dependencies:
      - - "~>"
        - !ruby/object:Gem::Version
          version: '3.0'
+ - !ruby/object:Gem::Dependency
+   name: timecop
+   requirement: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.9'
+   type: :development
+   prerelease: false
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - "~>"
+       - !ruby/object:Gem::Version
+         version: '0.9'
  description: |
    Provides a way to measure how long each loop in a task took, outputting a
    report with an estimated time till the task is done.
@@ -65,6 +79,11 @@ files:
  - LICENSE
  - README.md
  - lib/progressor.rb
+ - lib/progressor/error.rb
+ - lib/progressor/formatting.rb
+ - lib/progressor/limited_sequence.rb
+ - lib/progressor/unlimited_sequence.rb
+ - lib/progressor/version.rb
  homepage: https://github.com/AndrewRadev/progressor
  licenses:
  - MIT