progressor 0.0.1 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: fa7eacefe70e6b186f9a72625d6158db19b7f9b45b327c397b7b4402dc56bc47
4
- data.tar.gz: 5cb7a42fffe3d6650a48454e41c73d8f74e7cb72fdf521fb2fdd6e59bb2cae8c
3
+ metadata.gz: 67733b5ccdcf4efdbe628a511246b38238f5bd368e320ae9251ec34dc1be7cc5
4
+ data.tar.gz: 2c84c065e48fdfab6e9ab4079c7c595be60cbfe16f7378970a451cebe36af326
5
5
  SHA512:
6
- metadata.gz: 2d11d38113127328cb2d0cd6b6d98652f9e6010816aebdbbd0c44cc27b331ba03c9b961308f2ec71e6abb5f0b0694f4952132214a126cdbecb5d062741e7cbe0
7
- data.tar.gz: cba7008ee6ca3151ba3c89159dc7835d8d76b5cb981da4017d3dffe0fdb14a9bf0c3ef9e18a6f532ab220f4abececc487a8145e86e34b77b226b9a81f9473f13
6
+ metadata.gz: 4735274940fcb4bd4b54089dce074bbbb6fe42cc530db202f3a3f4738c396624685d3d6740e8707882d8cfa9c0f4e2053851f5ed1c987c3d0a88c1fb665b2f28
7
+ data.tar.gz: 375e03f4109ea4a74de4483b55aa889611d7d2d0ec8901950363b2e5196f8652fb8995d20bcddb8d8118e842e6ead21d71bb47dec00cd071795962361f505371
data/README.md CHANGED
@@ -1,8 +1,31 @@
1
- A very basic library to measure loops in a long-running task.
1
+ Full documentation for the latest released version can be found at: https://www.rubydoc.info/gems/progressor
2
2
 
3
- *Note: Very incomplete, so mostly for personal usage. Will hopefully flesh it out, write tests, configuration, etc, at some point (PRs welcome). Until then, a similar library can be found here: https://github.com/mkdynamic/ke*
3
+ ## Basic example
4
4
 
5
- Example usage:
5
+ Here's an example long-running task:
6
+
7
+ ``` ruby
8
+ Product.find_each do |product|
9
+ next if product.not_something_we_want_to_process?
10
+ product.calculate_interesting_stats
11
+ end
12
+ ```
13
+
14
+ In order to understand how it's progressing, we might add some print statements:
15
+
16
+ ``` ruby
17
+ Product.find_each do |product|
18
+ if product.not_something_we_want_to_process?
19
+ puts "Skipping product: #{product.id}"
20
+ next
21
+ end
22
+
23
+ puts "Working on product: #{product.id}"
24
+ product.calculate_interesting_stats
25
+ end
26
+ ```
27
+
28
+ This gives us some indication of progress, but no idea how much time is left. We could take a count and maintain a manual index, and then eyeball it based on how fast the numbers are adding up. Progressor automates that process:
6
29
 
7
30
  ``` ruby
8
31
  progressor = Progressor.new(total_count: Product.count)
@@ -20,12 +43,79 @@ Product.find_each do |product|
20
43
  end
21
44
  ```
22
45
 
23
- Example output:
46
+ Each invocation of `run` measures how long its block took and records it. The yielded `progress` parameter is an object that can be `to_s`-ed to provide progress information.
47
+
48
+ The output might look like this:
24
49
 
25
50
  ```
26
51
  ...
27
- [0038/1000, (004%), t/i: 0.5s, ETA: 8m:0.27s] Product 38
28
- [0039/1000, (004%), t/i: 0.5s, ETA: 7m:58.47s] Product 39
29
- [0040/1000, (004%), t/i: 0.5s, ETA: 7m:57.08s] Product 40
52
+ [0038/1000, 004%, t/i: 500.00ms, ETA: 08m:00s] Product 38
53
+ [0039/1000, 004%, t/i: 500.00ms, ETA: 07m:58s] Product 39
54
+ [0040/1000, 004%, t/i: 500.00ms, ETA: 07m:57s] Product 40
30
55
  ...
31
56
  ```
57
+
58
+ You can check the documentation for the [Progressor](https://www.rubydoc.info/gems/progressor/Progressor) class for details on the methods you can call to get the individual pieces of data shown in the report.
59
+
60
+ ## Limited and unlimited sequences
61
+
62
+ Initializing a `Progressor` with a provided `total_count:` parameter gives you a limited sequence, which can give you not only a progress report, but an estimation of when it'll be done:
63
+
64
+ ```
65
+ [<current loop>/<total count>, <progress>%, t/i: <time per iteration>, ETA: <time until it's done>]
66
+ ```
67
+
68
+ The calculation is done by maintaining a list of measurements with a limited size, and a list of averages of those measurements. The average of averages is the "time per iteration" and it's multiplied by the remaining count to produce the estimation.
69
+
70
+ I can't really say how reliable this is, but it seems to provide smoothly changing estimations that seem more or less correct to me, for similarly-sized chunks of work per iteration.
71
+
72
+ **Not** providing a `total_count:` parameter leads to less available information:
73
+
74
+ ``` ruby
75
+ progressor = Progressor.new
76
+
77
+ (1..100).each do |i|
78
+ progressor.run do |progress|
79
+ sleep rand
80
+ puts progress
81
+ end
82
+ end
83
+ ```
84
+
85
+ A sample of output might look like this:
86
+
87
+ ```
88
+ ...
89
+ 11, t: 5.32s, t/i: 442.39ms
90
+ 12, t: 5.58s, t/i: 446.11ms
91
+ ...
92
+ ```
93
+
94
+ The format is:
95
+
96
+ ```
97
+ <current>, t: <time from start>, t/i: <time per iteration>
98
+ ```
99
+
100
+ ## Configuration
101
+
102
+ Apart from `total_count`, which is optional and affects the kind of sequence that will be stored, you can provide `min_samples` and `max_samples`. You can also provide a custom formatter:
103
+
104
+ ``` ruby
105
+ progressor = Progressor.new({
106
+ total_count: 1000,
107
+ min_samples: 5,
108
+ max_samples: 10,
109
+ formatter: -> (p) { p.eta }
110
+ })
111
+ ```
112
+
113
+ The option `min_samples` determines how many loops the tool will wait before trying to produce an estimation. A higher number means no information in the beginning, but no wild fluctuations, either. It needs to be at least 1, which is also the default.
114
+
115
+ The option `max_samples` is how many measurements will be retained. Those measurements are averaged, and the resulting averages are averaged again to get a time-per-iteration estimate. A smaller number gives more weight to recent events, while a larger one averages over a larger number of samples. The default is 100.
116
+
117
+ The `formatter` is a callback that gets a progress object as an argument and you can return your own string to output on every loop. Check `LimitedSequence` and `UnlimitedSequence` for the available methods and accessors you can use.
118
+
119
+ ## Related work
120
+
121
+ A very similar tool is the gem [ke](https://github.com/mkdynamic/ke). It provides its estimation by maintaining the median quartile range of the stored measurements, removing outliers. It also automates the output of the progress report, only printing it every N loops. Depending on your needs and preferences, it might be better for your use case.
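The estimation described in the README above (a bounded list of measurements, a list of their running averages, and an average of those averages) can be sketched as follows. This is a minimal illustration, not the gem's code: `RollingEstimate` and its `eta(remaining)` signature are hypothetical names chosen for the example.

```ruby
# Hypothetical stand-in for the estimator described in the README.
class RollingEstimate
  def initialize(max_samples: 100)
    @max_samples  = max_samples
    @measurements = []
    @averages     = []
  end

  # Record one duration (in seconds) plus the current average of the window.
  def push(duration)
    @measurements << duration
    @measurements.shift if @measurements.size > @max_samples

    @averages << mean(@measurements)
    @averages.shift if @averages.size > @max_samples
  end

  # "Time per iteration": the average of the stored averages.
  def per_iteration
    return nil if @averages.empty?

    mean(@averages)
  end

  # Estimated time for `remaining` iterations, or nil without samples.
  def eta(remaining)
    per_iteration && per_iteration * remaining
  end

  private

  def mean(values)
    values.sum / values.size.to_f
  end
end

estimate = RollingEstimate.new(max_samples: 3)
[1.0, 3.0].each { |duration| estimate.push(duration) }

# The stored averages are [1.0, 2.0], so the smoothed value is 1.5.
puts estimate.per_iteration # prints 1.5
puts estimate.eta(10)       # prints 15.0
```

Averaging the averages makes a single slow iteration move the estimate less than a plain mean of the raw measurements would, which fits the README's remark about smoothly changing estimations.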
data/lib/progressor.rb CHANGED
@@ -1,7 +1,13 @@
1
+ require 'progressor/version'
2
+ require 'progressor/error'
3
+ require 'progressor/formatting'
4
+ require 'progressor/limited_sequence'
5
+ require 'progressor/unlimited_sequence'
6
+
1
7
  require 'benchmark'
2
8
 
3
9
  # Used to measure the running time of parts of a long-running task and output
4
- # an estimation based on the average of the last 10-100 measurements.
10
+ # an estimation based on the average of the last 1-100 measurements.
5
11
  #
6
12
  # Example usage:
7
13
  #
@@ -22,21 +28,23 @@ require 'benchmark'
22
28
  # Example output:
23
29
  #
24
30
  # ...
25
- # [0038/1000, (004%), t/i: 0.5s, ETA: 8m:0.27s] Product 38
26
- # [0039/1000, (004%), t/i: 0.5s, ETA: 7m:58.47s] Product 39
27
- # [0040/1000, (004%), t/i: 0.5s, ETA: 7m:57.08s] Product 40
31
+ # [0038/1000, 004%, t/i: 500.00ms, ETA: 08m:00s] Product 38
32
+ # [0039/1000, 004%, t/i: 500.00ms, ETA: 07m:58s] Product 39
33
+ # [0040/1000, 004%, t/i: 500.00ms, ETA: 07m:57s] Product 40
28
34
  # ...
29
35
  #
30
36
  class Progressor
31
- VERSION = '0.0.1'
37
+ include Formatting
32
38
 
33
39
  # Utility method to print a message with the time it took to run the contents
34
40
  # of the block.
35
41
  #
36
- # > Progressor.puts("Working on a thing") { thing_work }
42
+ # Progressor.puts("Working on a thing") { thing_work }
43
+ #
44
+ # Output:
37
45
  #
38
- # Working on a thing...
39
- # Working on a thing DONE: 2.1s
46
+ # Working on a thing...
47
+ # Working on a thing DONE: 2.1s
40
48
  #
41
49
  def self.puts(message, &block)
42
50
  Kernel.puts "#{message}..."
@@ -44,76 +52,49 @@ class Progressor
44
52
  Kernel.puts "#{message} DONE: #{format_time(measurement.real)}"
45
53
  end
46
54
 
47
- def initialize(total_count:)
48
- @total_count = total_count
49
- @total_count_digits = total_count.to_s.length
50
- @current = 0
51
- @measurements = []
52
- @averages = []
55
+ # Set up a new Progressor instance. Optional parameters:
56
+ #
57
+ # - total_count: If given, the tool will be able to provide an ETA.
58
+ #
59
+ # - min_samples: The number of samples to collect before attempting to
60
+ # calculate a time per iteration. Default: 1
61
+ #
62
+ # - max_samples: The maximum number of measurements to collect and average.
63
+ # Default: 100.
64
+ #
65
+ # - formatter: A callable that accepts a progress object and returns a
66
+ # custom formatted string.
67
+ #
68
+ def initialize(total_count: nil, min_samples: 1, max_samples: 100, formatter: nil)
69
+ params = {
70
+ min_samples: min_samples,
71
+ max_samples: max_samples,
72
+ formatter: formatter,
73
+ }
74
+
75
+ if total_count
76
+ @sequence = LimitedSequence.new(total_count: total_count, **params)
77
+ else
78
+ @sequence = UnlimitedSequence.new(**params)
79
+ end
53
80
  end
54
81
 
82
+ # Run the given block of code, yielding a sequence object that holds progress
83
+ # information.
84
+ #
85
+ # Example usage:
86
+ #
87
+ # progressor.run { |progress| puts progress; long_running_task() }
88
+ #
55
89
  def run
56
- @current += 1
57
-
58
- measurement = Benchmark.measure { yield self }
59
-
60
- @measurements << measurement.real
61
- # only keep last 1000
62
- @measurements.shift if @measurements.count > 1000
63
-
64
- @averages << average(@measurements)
65
- @averages = @averages.compact
66
- # only keep last 100
67
- @averages.shift if @averages.count > 100
90
+ measurement = Benchmark.measure { yield @sequence }
91
+ @sequence.push(measurement.real)
68
92
  end
69
93
 
94
+ # Skips the given number of loops (will likely be 1), updating the
95
+ # estimations appropriately.
96
+ #
70
97
  def skip(n)
71
- @total_count -= n
72
- end
73
-
74
- def to_s
75
- [
76
- "#{@current.to_s.rjust(@total_count_digits, '0')}/#{@total_count}",
77
- "(#{((@current / @total_count.to_f) * 100).round.to_s.rjust(3, '0')}%)",
78
- "t/i: #{self.class.format_time(per_iteration)}",
79
- "ETA: #{self.class.format_time(eta)}",
80
- ].join(', ')
81
- end
82
-
83
- def per_iteration
84
- return nil if @measurements.count < 10
85
- average(@averages)
86
- end
87
-
88
- def eta
89
- return nil if @measurements.count < 10
90
-
91
- remaining_time = per_iteration * (@total_count - @current)
92
- remaining_time.round(2)
93
- end
94
-
95
- private
96
-
97
- def self.format_time(time)
98
- return "?s" if time.nil?
99
-
100
- if time < 0.1
101
- "#{(time * 1000).round(2)}ms"
102
- elsif time < 60
103
- "#{time.round(2)}s"
104
- elsif time < 3600
105
- minutes = time.to_i / 60
106
- seconds = (time - minutes * 60).round(2)
107
- "#{minutes}m:#{seconds}s"
108
- else
109
- hours = time.to_i / 3600
110
- minutes = (time.to_i % 3600) / 60
111
- seconds = (time - (hours * 3600 + minutes * 60)).round(2)
112
- "#{hours}h:#{minutes}m:#{seconds}s"
113
- end
114
- end
115
-
116
- def average(collection)
117
- collection.inject(&:+) / collection.count.to_f
98
+ @sequence.skip(n)
118
99
  end
119
100
  end
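One consequence of the new `run` is that the block is yielded the sequence before the measurement is pushed, so inside the block the counter still holds the number of completed iterations. The sketch below reproduces only that ordering; `RecordingSequence` and `TinyRunner` are hypothetical stand-ins, not classes from the gem.

```ruby
require 'benchmark'

# Hypothetical stand-in for a sequence: counts pushes and stores durations.
class RecordingSequence
  attr_reader :current, :durations

  def initialize
    @current   = 0
    @durations = []
  end

  def push(duration)
    @current += 1
    @durations << duration
  end
end

# Mirrors the shape of `run` in the diff: time the block, then record it.
class TinyRunner
  attr_reader :sequence

  def initialize(sequence)
    @sequence = sequence
  end

  def run
    measurement = Benchmark.measure { yield @sequence }
    @sequence.push(measurement.real)
  end
end

runner = TinyRunner.new(RecordingSequence.new)
seen = []

3.times do
  runner.run { |progress| seen << progress.current }
end

p seen                    # prints [0, 1, 2]
p runner.sequence.current # prints 3
```

This ordering is presumably why `UnlimitedSequence#to_s` prints `@current + 1`, while `LimitedSequence#to_s` prints `@current` as is.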
data/lib/progressor/error.rb ADDED
@@ -0,0 +1,7 @@
1
+ class Progressor
2
+ # A custom error class for targeted catching. All Progressor errors will be
3
+ # wrapped in a Progressor::Error.
4
+ #
5
+ class Error < RuntimeError
6
+ end
7
+ end
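The error class is raised by the sample-count checks in the two sequence constructors. The following sketch repeats those two checks in isolation to show which inputs are rejected; `DemoError` and `validate_samples!` are hypothetical names standing in for `Progressor::Error` and the constructor logic.

```ruby
# Hypothetical stand-in for Progressor::Error.
class DemoError < RuntimeError; end

# Same two conditions and messages as the sequence constructors in the diff.
def validate_samples!(min_samples:, max_samples:)
  raise DemoError, "min_samples needs to be a positive number" if min_samples <= 0
  raise DemoError, "max_samples needs to be larger than min_samples" if max_samples <= min_samples

  true
end

results = [[1, 100], [0, 100], [5, 5]].map do |min, max|
  begin
    validate_samples!(min_samples: min, max_samples: max)
    "ok"
  rescue DemoError => e
    e.message
  end
end

puts results
```

Note that equal values are rejected as well, since the comparison is `<=`: `max_samples` has to be strictly larger than `min_samples`.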
data/lib/progressor/formatting.rb ADDED
@@ -0,0 +1,39 @@
1
+ class Progressor
2
+ module Formatting
3
+ # Formats the given time in seconds to something human readable. Examples:
4
+ #
5
+ # - 1 second: 1.00s
6
+ # - 0.123 seconds: 123.00ms
7
+ # - 100 seconds: 01m:40s
8
+ # - 101.5 seconds: 01m:41s
9
+ # - 3661 seconds: 01h:01m:01s
10
+ def format_time(time)
11
+ return "?s" if time.nil?
12
+
13
+ if time < 1
14
+ "#{format_float((time * 1000).round(2))}ms"
15
+ elsif time < 60
16
+ "#{format_float(time.round(2))}s"
17
+ elsif time < 3600
18
+ minutes = time.to_i / 60
19
+ seconds = (time - minutes * 60).round(2)
20
+ "#{format_int(minutes)}m:#{format_int(seconds)}s"
21
+ else
22
+ hours = time.to_i / 3600
23
+ minutes = (time.to_i % 3600) / 60
24
+ seconds = (time - (hours * 3600 + minutes * 60)).round(2)
25
+ "#{format_int(hours)}h:#{format_int(minutes)}m:#{format_int(seconds)}s"
26
+ end
27
+ end
28
+
29
+ # :nodoc:
30
+ def format_int(value)
31
+ sprintf("%02d", value)
32
+ end
33
+
34
+ # :nodoc:
35
+ def format_float(value)
36
+ sprintf("%0.2f", value)
37
+ end
38
+ end
39
+ end
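To check the examples in the comment against the branches, the formatting rules can be run on their own. The block below is a standalone copy of the methods above, wrapped in a hypothetical `TimeFormat` module (with `module_function`) so that it runs without the gem installed.

```ruby
# Standalone copy of the formatting rules from the diff.
# TimeFormat is a hypothetical wrapper, not a module from the gem.
module TimeFormat
  module_function

  def format_time(time)
    return "?s" if time.nil?

    if time < 1
      "#{format_float((time * 1000).round(2))}ms"
    elsif time < 60
      "#{format_float(time.round(2))}s"
    elsif time < 3600
      minutes = time.to_i / 60
      seconds = (time - minutes * 60).round(2)
      "#{format_int(minutes)}m:#{format_int(seconds)}s"
    else
      hours   = time.to_i / 3600
      minutes = (time.to_i % 3600) / 60
      seconds = (time - (hours * 3600 + minutes * 60)).round(2)
      "#{format_int(hours)}h:#{format_int(minutes)}m:#{format_int(seconds)}s"
    end
  end

  # Zero-padded, two-digit integer; fractional seconds are dropped.
  def format_int(value)
    sprintf("%02d", value)
  end

  # Always two decimal places.
  def format_float(value)
    sprintf("%0.2f", value)
  end
end

[nil, 0.123, 0.5, 1, 100, 101.5, 3661].each do |seconds|
  puts "#{seconds.inspect} -> #{TimeFormat.format_time(seconds)}"
end
```

Compared with 0.0.1, the millisecond branch now covers everything under one second (previously under 0.1 seconds), so half a second is rendered as `500.00ms`.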
data/lib/progressor/limited_sequence.rb ADDED
@@ -0,0 +1,126 @@
1
+ class Progressor
2
+ class LimitedSequence
3
+ include Formatting
4
+
5
+ attr_reader :total_count, :min_samples, :max_samples
6
+
7
+ # The number of completed iterations, starts at 0
8
+ attr_reader :current
9
+
10
+ # The time the object was created
11
+ attr_reader :start_time
12
+
13
+ # Creates a new LimitedSequence with the given parameters:
14
+ #
15
+ # - total_count: The expected number of loops.
16
+ #
17
+ # - min_samples: The number of samples to collect before attempting to
18
+ # calculate a time per iteration. Default: 1
19
+ #
20
+ # - max_samples: The maximum number of measurements to collect and average.
21
+ # Default: 100.
22
+ #
23
+ # - formatter: A callable that accepts the sequence object and returns a
24
+ # custom formatted string.
25
+ #
26
+ def initialize(total_count:, min_samples: 1, max_samples: 100, formatter: nil)
27
+ @total_count = total_count
28
+ @min_samples = min_samples
29
+ @max_samples = [max_samples, total_count].min
30
+ @formatter = formatter
31
+
32
+ raise Error.new("min_samples needs to be a positive number") if min_samples <= 0
33
+ raise Error.new("max_samples needs to be larger than min_samples") if max_samples <= min_samples
34
+
35
+ @start_time = Time.now
36
+ @total_count_digits = total_count.to_s.length
37
+ @current = 0
38
+ @measurements = []
39
+ @averages = []
40
+ end
41
+
42
+ # Adds a duration in seconds to the internal storage of samples. Updates
43
+ # averages accordingly.
44
+ #
45
+ def push(duration)
46
+ @current += 1
47
+ @measurements << duration
48
+ # only keep last `max_samples`
49
+ @measurements.shift if @measurements.count > max_samples
50
+
51
+ @averages << average(@measurements)
52
+ @averages = @averages.compact
53
+ # only keep last `max_samples`
54
+ @averages.shift if @averages.count > max_samples
55
+ end
56
+
57
+ # Skips `n` iterations, lowering the total count and, with it, the ETA
58
+ #
59
+ def skip(n)
60
+ @total_count -= n
61
+ end
62
+
63
+ # Outputs a textual representation of the current state of the
64
+ # LimitedSequence. Shows:
65
+ #
66
+ # - the current number of iterations and the total count
67
+ # - completion level in percentage
68
+ # - how long a single iteration takes
69
+ # - estimated time of arrival (ETA) -- time until it's done
70
+ #
71
+ # A custom `formatter` provided at construction time overrides this default
72
+ # output.
73
+ #
74
+ # If the "current" number of iterations goes over the total count, an ETA
75
+ # can't be shown anymore, so it'll just be the current number over the
76
+ # expected one, and the time per iteration.
77
+ #
78
+ def to_s
79
+ return @formatter.call(self).to_s if @formatter
80
+
81
+ if @current > @total_count
82
+ return [
83
+ "#{@current} (expected #{@total_count})",
84
+ "t/i: #{format_time(per_iteration)}",
85
+ "ETA: ???",
86
+ ].join(', ')
87
+ end
88
+
89
+ [
90
+ "#{@current.to_s.rjust(@total_count_digits, '0')}/#{@total_count}",
91
+ "#{((@current / @total_count.to_f) * 100).round.to_s.rjust(3, '0')}%",
92
+ "t/i: #{format_time(per_iteration)}",
93
+ "ETA: #{format_time(eta)}",
94
+ ].join(', ')
95
+ end
96
+
97
+ # Returns an estimation for the time per single iteration. Implemented as
98
+ # an average of averages to provide a smoother gradient from loop to loop.
99
+ #
100
+ # Returns nil if not enough samples have been collected yet.
101
+ #
102
+ def per_iteration
103
+ return nil if @measurements.count < min_samples
104
+ average(@averages)
105
+ end
106
+
107
+ # Returns an estimation for the Estimated Time of Arrival (time until
108
+ # done).
109
+ #
110
+ # Calculated by multiplying the average time per iteration with the
111
+ # remaining number of loops.
112
+ #
113
+ def eta
114
+ return nil if @measurements.count < min_samples
115
+
116
+ remaining_time = per_iteration * (@total_count - @current)
117
+ remaining_time.round(2)
118
+ end
119
+
120
+ private
121
+
122
+ def average(collection)
123
+ collection.inject(&:+) / collection.count.to_f
124
+ end
125
+ end
126
+ end
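The ETA arithmetic and the effect of `skip` can be worked through with plain numbers. This is a sketch of the calculation only; `eta_seconds` is a hypothetical helper, and the values are chosen for illustration.

```ruby
# Hypothetical helper mirroring the ETA arithmetic in the diff:
# time per iteration multiplied by the remaining number of loops.
def eta_seconds(per_iteration:, total_count:, current:)
  return nil if per_iteration.nil?

  (per_iteration * (total_count - current)).round(2)
end

total   = 1000
current = 40

before = eta_seconds(per_iteration: 0.5, total_count: total, current: current)

# Skipping 100 items lowers the expected total, as `skip(n)` does above.
total -= 100
after = eta_seconds(per_iteration: 0.5, total_count: total, current: current)

puts before # prints 480.0
puts after  # prints 430.0
```

Because skipped items reduce the total rather than advance the counter, they shorten the ETA without adding a near-zero measurement that would pull the time per iteration down.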
data/lib/progressor/unlimited_sequence.rb ADDED
@@ -0,0 +1,104 @@
1
+ class Progressor
2
+ class UnlimitedSequence
3
+ include Formatting
4
+
5
+ attr_reader :min_samples, :max_samples
6
+
7
+ # The number of completed iterations, starts at 0
8
+ attr_reader :current
9
+
10
+ # The time the object was created
11
+ attr_reader :start_time
12
+
13
+ # Creates a new UnlimitedSequence with the given parameters:
14
+ #
15
+ # - min_samples: The number of samples to collect before attempting to
16
+ # calculate a time per iteration. Default: 1
17
+ #
18
+ # - max_samples: The maximum number of measurements to collect and average.
19
+ # Default: 100.
20
+ #
21
+ # - formatter: A callable that accepts the sequence object and returns a
22
+ # custom formatted string.
23
+ #
24
+ def initialize(min_samples: 1, max_samples: 100, formatter: nil)
25
+ @min_samples = min_samples
26
+ @max_samples = max_samples
27
+ @formatter = formatter
28
+
29
+ raise Error.new("min_samples needs to be a positive number") if min_samples <= 0
30
+ raise Error.new("max_samples needs to be larger than min_samples") if max_samples <= min_samples
31
+
32
+ @start_time = Time.now
33
+ @current = 0
34
+ @measurements = []
35
+ @averages = []
36
+ end
37
+
38
+ # Adds a duration in seconds to the internal storage of samples. Updates
39
+ # averages accordingly.
40
+ #
41
+ def push(duration)
42
+ @current += 1
43
+ @measurements << duration
44
+ # only keep last `max_samples`
45
+ @measurements.shift if @measurements.count > max_samples
46
+
47
+ @averages << average(@measurements)
48
+ @averages = @averages.compact
49
+ # only keep last `max_samples`
50
+ @averages.shift if @averages.count > max_samples
51
+ end
52
+
53
+ # "Skips" an iteration, which, in the context of an UnlimitedSequence is a no-op.
54
+ #
55
+ def skip(_n)
56
+ # Nothing to do
57
+ end
58
+
59
+ # Outputs a textual representation of the current state of the
60
+ # UnlimitedSequence. Shows:
61
+ #
62
+ # - the current (1-indexed) number of iterations
63
+ # - how long since the start time
64
+ # - how long a single iteration takes
65
+ #
66
+ # A custom `formatter` provided at construction time overrides this default
67
+ # output.
68
+ #
69
+ def to_s
70
+ return @formatter.call(self).to_s if @formatter
71
+
72
+ [
73
+ "#{@current + 1}",
74
+ "t: #{format_time(Time.now - @start_time)}",
75
+ "t/i: #{format_time(per_iteration)}",
76
+ ].join(', ')
77
+ end
78
+
79
+ # Returns an estimation for the time per single iteration. Implemented as
80
+ # an average of averages to provide a smoother gradient from loop to loop.
81
+ #
82
+ # Returns nil if not enough samples have been collected yet.
83
+ #
84
+ def per_iteration
85
+ return nil if @measurements.count < min_samples
86
+ average(@averages)
87
+ end
88
+
89
+ # Is supposed to return an estimation for the Estimated Time of Arrival
90
+ # (time until done).
91
+ #
92
+ # For an UnlimitedSequence, this always returns nil.
93
+ #
94
+ def eta
95
+ # No estimation possible
96
+ end
97
+
98
+ private
99
+
100
+ def average(collection)
101
+ collection.inject(&:+) / collection.count.to_f
102
+ end
103
+ end
104
+ end
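Since `eta` is always nil for an unlimited sequence, and `per_iteration` is nil until enough samples exist, a custom `formatter` shared between both kinds of sequence has to tolerate nil values. A minimal sketch, assuming only the `current`, `per_iteration` and `eta` readers shown in the diff; `FakeProgress` is a hypothetical stand-in for the real sequence objects.

```ruby
# Hypothetical stand-in for a LimitedSequence or UnlimitedSequence.
FakeProgress = Struct.new(:current, :per_iteration, :eta)

# A formatter is any callable that takes the progress object
# and returns a string.
formatter = lambda do |progress|
  rate = progress.per_iteration ? format('%.2fs', progress.per_iteration) : '?s'
  left = progress.eta ? "#{progress.eta.round}s left" : 'no ETA'

  "loop #{progress.current} (#{rate}/iter, #{left})"
end

unlimited = formatter.call(FakeProgress.new(12, 0.446, nil))
limited   = formatter.call(FakeProgress.new(40, 0.5, 480.0))

puts unlimited # prints: loop 12 (0.45s/iter, no ETA)
puts limited   # prints: loop 40 (0.50s/iter, 480s left)
```

With the gem itself, a lambda like this would be passed as the `formatter:` option to `Progressor.new`.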
data/lib/progressor/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ class Progressor
2
+ VERSION = '0.1.0'
3
+ end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: progressor
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - Andrew Radev
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2019-02-20 00:00:00.000000000 Z
11
+ date: 2019-03-15 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -52,6 +52,20 @@ dependencies:
52
52
  - - "~>"
53
53
  - !ruby/object:Gem::Version
54
54
  version: '3.0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: timecop
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '0.9'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '0.9'
55
69
  description: |
56
70
  Provides a way to measure how long each loop in a task took, outputting a
57
71
  report with an estimated time till the task is done.
@@ -65,6 +79,11 @@ files:
65
79
  - LICENSE
66
80
  - README.md
67
81
  - lib/progressor.rb
82
+ - lib/progressor/error.rb
83
+ - lib/progressor/formatting.rb
84
+ - lib/progressor/limited_sequence.rb
85
+ - lib/progressor/unlimited_sequence.rb
86
+ - lib/progressor/version.rb
68
87
  homepage: https://github.com/AndrewRadev/progressor
69
88
  licenses:
70
89
  - MIT