rworkflow 0.6.5 → 0.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 338ee9efd55c80b7aead2a33138d200a60ab6d9f
4
- data.tar.gz: fb8f2820425891429247f155b66006420571038a
3
+ metadata.gz: 200f47a61942919470a013a36c8cc25b874bf445
4
+ data.tar.gz: 2764ce1f1c296d2499723be7a995e618b348b543
5
5
  SHA512:
6
- metadata.gz: 94ff46a524a656091ae7971d0878221cae8b0b239b91e10d8199ab14debdc4f1fb32141c07a57e0a816c0fd8c721c335a9ac55226927f5456a3d6201ce908c4f
7
- data.tar.gz: f8a1c8f480275ef5b895cc96fda83a3f9c4bcd82211a178b2764a7aba03defd50b9f2c78e6860f73f1958901ff66f52621c6f6fb097c06c231275a10b1f311c0
6
+ metadata.gz: 53f8c990a5557b835ae4acdb2a7cf63e1c1dcf8202d626a8d523af6fca4578841e57cdbba6c41c7fa1912543c415e337ebb31a8b99a82ed5c634e4d18e7ab0a7
7
+ data.tar.gz: d8547f66c05957d1ea952039bbfc8c3ce50e0c0d4bc15ec520cd1bde76d6d52998e6f5ed14336d5d829a0b0acafe12663083e1a2de6668b08383a737f1847e11
@@ -0,0 +1,110 @@
1
+ # Rworkflow
2
+
3
+ [![GitHub release](https://img.shields.io/badge/release-0.6.5-blue.png)](https://github.com/barcoo/rworkflow/releases/tag/0.6.5)
4
+ [![Build Status](https://travis-ci.org/barcoo/rworkflow.svg?branch=master&cache=busted)](https://travis-ci.org/barcoo/rworkflow)
5
+ [![Coverage Status](https://coveralls.io/repos/github/barcoo/rworkflow/badge.svg?branch=master)](https://coveralls.io/github/barcoo/rworkflow?branch=master)
6
+
7
+ The rworkflow framework removes many headaches when it comes to asynchronous tasks that need to run one after the other, depending on their previous state.
8
+
9
+ A workflow is basically a set of defined states with transitions between them (a state machine or lifecycle) in which every state can contain a set of jobs (a job can be any serializable object). A set of jobs can be pushed to the workflow on the initial state, and the jobs then move between the states following the defined transitions. Moreover, the jobs can also be transformed (e.g. a state receives 100 jobs, groups them into one unique job and then pushes this unique job to the next state with a transition).
10
+
11
+ A simple flow (Flow class) only implements this model, but there is a subclass of the simple flow (SidekiqFlow) which interprets every state of the lifecycle as a sidekiq job. Thus, whenever some jobs are pushed to the workflow, this dynamically creates the needed sidekiq workers to complete the workflow.
12
+
13
+
14
+ ## Define a lifecycle
15
+
16
+ The lifecycle is the definition of the state machine that every job pushed to the workflow will transit:
17
+
18
+ ```ruby
19
+ lifecycle = Workflow::Lifecycle.new do |lc|
20
+ lc.state("Floating", {cardinality: 10}) do |state|
21
+ state.transition :rescued, 'Lifeboat'
22
+ state.transition :drowned, Flow::STATE_FAILED
23
+ end
24
+
25
+ lc.state("Lifeboat", {cardinality: 2}) do |state|
26
+ state.transition :landed, 'Land'
27
+ state.transition :starved, Flow::STATE_FAILED
28
+ end
29
+
30
+ lc.state("Land") do |state|
31
+ state.transition :rescued, Flow::STATE_SUCCESSFUL
32
+ state.transition :died, Flow::STATE_FAILED
33
+ end
34
+
35
+ lc.initial = "Floating"
36
+ end
37
+ ```
38
+
39
+ Notes:
40
+
41
+ - For SidekiqFlow workflows the lifecycle state names need to be the same as an existing class that derives from Workflow::Worker (which implements a SidekiqWork)
42
+ - The transition names (e.g. :rescued, :drowned) are arbitrary; the Worker needs to call them later. A state can define more than two.
43
+ - There are some predefined final states (Flow::STATE_FAILED, Flow::STATE_SUCCESSFUL). When all jobs are pushed via transitions to one of these states, the workflow is then finished.
44
+ - The state cardinality indicates the maximum number of jobs served to a state at once (one by default)
45
+
46
+ ## Create Workers
47
+
48
+ For SidekiqFlow create a subclass of Workflow::Worker for each state defined on the lifecycle (except for predefined final states)
49
+
50
+ ```ruby
51
+ class Floating < Workflow::Worker
52
+ def process(objects)
53
+ # The size of objects will be at the most the cardinality defined on the lifecycle
54
+ rescued, drowned = objects.partition { |object| object.even? }
55
+
56
+ transition(:rescued, rescued)
57
+ transition(:drowned, drowned)
58
+ end
59
+ end
60
+
61
+ class Lifeboat < Workflow::Worker
62
+ def process(objects)
63
+ landed, starved = objects.partition { |object| object < 4 }
64
+
65
+ transition(:landed, landed)
66
+ transition(:starved, starved)
67
+ end
68
+ end
69
+
70
+ class Land < Workflow::Worker
71
+ def process(objects)
72
+ rescued, died = objects.partition { |object| object == 0 }
73
+
74
+ transition(:rescued, rescued)
75
+ transition(:died, died)
76
+ end
77
+ end
78
+ ```
79
+
80
+ Notes:
81
+
82
+ - Create a class with the exact name that you defined above in the lifecycle definition
83
+ - You will be given an array of objects whose size is at most the cardinality defined for the state. The default is 1.
84
+ - The worker is responsible for the jobs it receives: it has to call a transition for each of them, otherwise they drop out of the workflow.
85
+
86
+ ## Create and execute the Workflow
87
+
88
+ ```ruby
89
+ options = {}
90
+ workflow = Workflow::SidekiqFlow.create(lifecycle, 'SafeBoatWorkflow', options)
91
+ initial_jobs = [1,2,3,45,6,7,8,9,10]
92
+ workflow.start(initial_jobs)
93
+ ```
94
+
95
+ Notes:
96
+
97
+ - Create a new Sidekiq flow using the lifecycle object defined in the first step above
98
+ - Run workflow.start, passing in an array of objects
99
+ - The objects need to be serializable
100
+ - _options_ can contain several properties for the workflow (TODO: complete/expand)
101
+
102
+ # Roadmap
103
+
104
+ 1. Decouple persistence layer (for now rworkflow depends on redis_rds which, in turn, depends on redis)
105
+ 2. See [State and Transition Policies](doc/states_and_transitions_policies.rdoc).
106
+ 3. Test Helper (simplify tests)
107
+ 4. Improve logging
108
+ 5. Use a separated Redis instance/db instead of a namespace?
109
+ 6. sidekiq and rails dependencies should be optional
110
+ 7. Move Web UI from CimRails to here
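
The model the README describes (states, named transitions, per-state cardinality) can be sketched in plain Ruby without Sidekiq or Redis. This is only an illustration under stated assumptions: `TinyFlow`, `build_lifeboat_flow`, and the `:successful`/`:failed` symbols are hypothetical names invented here, not rworkflow's API, and everything runs synchronously in memory.

```ruby
# Hypothetical stand-in for the README's model; TinyFlow is not part of rworkflow.
class TinyFlow
  TERMINAL = %i[successful failed].freeze

  def initialize(initial)
    @initial = initial
    @states = {} # name => { cardinality:, transitions:, handler: }
    @queues = Hash.new { |hash, key| hash[key] = [] }
  end

  def state(name, transitions:, cardinality: 1, &handler)
    @states[name] = { cardinality: cardinality, transitions: transitions, handler: handler }
  end

  # Pushes jobs to the initial state, then drains every non-terminal state.
  def run(jobs)
    @queues[@initial].concat(jobs)
    while (name = @states.keys.find { |key| !@queues[key].empty? })
      config = @states[name]
      batch = @queues[name].shift(config[:cardinality]) # at most `cardinality` jobs per call
      config[:handler].call(batch).each do |transition, objects|
        @queues[config[:transitions].fetch(transition)].concat(objects)
      end
    end
    TERMINAL.map { |key| [key, @queues[key]] }.to_h
  end
end

# Mirrors the Floating / Lifeboat / Land example from the README.
def build_lifeboat_flow
  flow = TinyFlow.new(:floating)
  flow.state(:floating, transitions: { rescued: :lifeboat, drowned: :failed }, cardinality: 10) do |objects|
    rescued, drowned = objects.partition(&:even?)
    { rescued: rescued, drowned: drowned }
  end
  flow.state(:lifeboat, transitions: { landed: :land, starved: :failed }, cardinality: 2) do |objects|
    landed, starved = objects.partition { |object| object < 4 }
    { landed: landed, starved: starved }
  end
  flow.state(:land, transitions: { rescued: :successful, died: :failed }) do |objects|
    rescued, died = objects.partition { |object| object == 0 }
    { rescued: rescued, died: died }
  end
  flow
end

p build_lifeboat_flow.run([1, 2, 3, 45, 6, 7, 8, 9, 10])
```

With the README's initial jobs, every object ends in the failed queue, because `Land` only rescues objects equal to 0 and none of the inputs is 0.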
data/Rakefile CHANGED
@@ -4,16 +4,6 @@ rescue LoadError
4
4
  puts 'You must `gem install bundler` and `bundle install` to run rake tasks'
5
5
  end
6
6
 
7
- require 'rdoc/task'
8
-
9
- RDoc::Task.new(:rdoc) do |rdoc|
10
- rdoc.rdoc_dir = 'rdoc'
11
- rdoc.title = 'Rworkflow'
12
- rdoc.options << '--line-numbers'
13
- rdoc.rdoc_files.include('README.rdoc')
14
- rdoc.rdoc_files.include('lib/**/*.rb')
15
- end
16
-
17
7
  Bundler::GemHelper.install_tasks
18
8
 
19
9
  require 'rake/testtask'
@@ -26,116 +16,3 @@ Rake::TestTask.new(:test) do |t|
26
16
  end
27
17
 
28
18
  task default: :test
29
-
30
- namespace :cim do
31
- desc 'Tags, updates README, and CHANGELOG and pushes to Github. Requires ruby-git'
32
- task :release do
33
- tasks = ['cim:assert_clean_repo', 'cim:git_fetch', 'cim:set_new_version', 'cim:update_readme', 'cim:update_changelog', 'cim:commit_changes', 'cim:tag']
34
- begin
35
- tasks.each { |task| Rake::Task[task].invoke }
36
- `git push && git push origin '#{Rworkflow::VERSION}'`
37
- rescue => error
38
- puts ">>> ERROR: #{error}; might want to reset your repository"
39
- end
40
- end
41
-
42
- desc 'Fails if the current repository is not clean'
43
- task :assert_clean_repo do
44
- status = `git status -s`.chomp.strip
45
- if status.strip.empty?
46
- status = `git log origin/master..HEAD`.chomp.strip # check if we have unpushed commits
47
- if status.strip.empty?
48
- puts '>>> Repository is clean!'
49
- else
50
- puts '>>> Please push your committed changes before releasing!'
51
- exit(-1)
52
- end
53
- else
54
- puts '>>> Please stash or commit your changes before releasing!'
55
- exit(-1)
56
- end
57
- end
58
-
59
- desc 'Fetches latest tags/commits'
60
- task :git_fetch do
61
- puts '>>> Fetching latest git refs'
62
- `git fetch --tags`
63
- end
64
-
65
- desc 'Requests the new version number'
66
- task :set_new_version do
67
- STDOUT.print(">>> New version number (current: #{Rworkflow::VERSION}; leave blank if already updated): ")
68
- input = STDIN.gets.strip.tr("'", "\'")
69
-
70
- current = if input.empty?
71
- Rworkflow::VERSION
72
- else
73
- unless input =~ /[0-9]+\.[0-9]+\.[0-9]+/
74
- puts '>>> Please use semantic versioning!'
75
- exit(-1)
76
- end
77
-
78
- input
79
- end
80
-
81
- latest = `git describe --abbrev=0`.chomp.strip
82
- unless Gem::Version.new(current) > Gem::Version.new(latest)
83
- puts ">>> Latest tagged version is #{latest}; make sure gem version (#{current}) is greater!"
84
- exit(-1)
85
- end
86
-
87
- if !input.empty?
88
- `sed -i -u "s@VERSION = '#{Rworkflow::VERSION}'@VERSION = '#{input}'@" #{File.expand_path('../lib/rworkflow/version.rb', __FILE__)}`
89
- $VERBOSE = nil
90
- Rworkflow.const_set('VERSION', input)
91
- $VERBOSE = false
92
-
93
- `bundle check` # force updating version
94
- end
95
- end
96
-
97
- desc 'Updates README with latest version'
98
- task :update_readme do
99
- puts '>>> Updating README.md'
100
- replace = %([![GitHub release](https://img.shields.io/badge/release-#{Rworkflow::VERSION}-blue.png)](https://github.com/barcoo/rworkflow/releases/tag/#{Rworkflow::VERSION}))
101
-
102
- `sed -i -u 's@^\\[\\!\\[GitHub release\\].*$@#{replace}@' README.md`
103
- end
104
-
105
- desc 'Updates CHANGELOG with commit log from last tag to this one'
106
- task :update_changelog do
107
- puts '>>> Updating CHANGELOG.md'
108
- latest = `git describe --abbrev=0`.chomp.strip
109
- log = `git log --pretty=format:'- [%h](https://github.com/barcoo/rworkflow/commit/%h) *%ad* __%s__ (%an)' --date=short '#{latest}'..HEAD`.chomp
110
-
111
- changelog = File.open('.CHANGELOG.md', 'w')
112
- changelog.write("# Changelog\n\n###{Rworkflow::VERSION}\n\n#{log}\n\n")
113
- File.open('CHANGELOG.md', 'r') do |file|
114
- begin
115
- file.readline # skip first two lines
116
- file.readline
117
- while buffer = file.read(2048)
118
- changelog.write(buffer)
119
- end
120
- rescue => error
121
- end
122
- end
123
-
124
- changelog.close
125
- `mv '.CHANGELOG.md' 'CHANGELOG.md'`
126
- end
127
-
128
- desc 'Commits the README/CHANGELOG changes'
129
- task :commit_changes do
130
- puts '>>> Committing updates to README/CHANGELOG'
131
- `git commit -am'Updated README.md and CHANGELOG.md on new release'`
132
- end
133
-
134
- desc 'Creates and pushes the tag to git'
135
- task :tag do
136
- puts '>>> Tagging'
137
- STDOUT.print('>>> Please enter a tag message: ')
138
- input = STDIN.gets.strip.tr("'", "\'")
139
- `git tag -a '#{Rworkflow::VERSION}' -m '#{input}'`
140
- end
141
- end
@@ -40,7 +40,7 @@ module Rworkflow
40
40
 
41
41
  def finished?
42
42
  return false unless started?
43
- total = get_counters.reduce(0) do |sum, pair|
43
+ total = self.counters.reduce(0) do |sum, pair|
44
44
  self.class.terminal?(pair[0]) ? sum : (sum + pair[1].to_i)
45
45
  end
46
46
 
@@ -49,13 +49,13 @@ module Rworkflow
49
49
 
50
50
  def status
51
51
  status = 'Running'
52
- status = (successful?) ? 'Finished' : 'Failed' if finished?
52
+ status = successful? ? 'Finished' : 'Failed' if finished?
53
53
 
54
54
  return status
55
55
  end
56
56
 
57
57
  def created_at
58
- return @created_at ||= begin Time.at(get(:created_at, 0)) end
58
+ return @created_at ||= begin Time.zone.at(get(:created_at, 0)) end
59
59
  end
60
60
 
61
61
  def started?
@@ -71,11 +71,11 @@ module Rworkflow
71
71
  end
72
72
 
73
73
  def start_time
74
- return Time.at(get(:start_time, 0))
74
+ return Time.zone.at(get(:start_time, 0))
75
75
  end
76
76
 
77
77
  def finish_time
78
- return Time.at(get(:finish_time, 0))
78
+ return Time.zone.at(get(:finish_time, 0))
79
79
  end
80
80
 
81
81
  def expected_duration
@@ -90,22 +90,22 @@ module Rworkflow
90
90
  return get_list(state).size
91
91
  end
92
92
 
93
- def get_counters
94
- counters = @storage.get(:counters)
95
- if !counters.nil?
96
- counters = begin
97
- self.class.serializer.load(counters)
93
+ def counters
94
+ the_counters = @storage.get(:counters)
95
+ if !the_counters.nil?
96
+ the_counters = begin
97
+ self.class.serializer.load(the_counters)
98
98
  rescue => e
99
99
  Rails.logger.error("Error loading stored flow counters: #{e.message}")
100
100
  nil
101
101
  end
102
102
  end
103
- return counters || get_counters!
103
+ return the_counters || counters!
104
104
  end
105
105
 
106
106
  # fetches counters atomically
107
- def get_counters!
108
- counters = { processing: 0 }
107
+ def counters!
108
+ the_counters = { processing: 0 }
109
109
 
110
110
  names = @lifecycle.states.keys
111
111
  results = RedisRds::Object.connection.multi do
@@ -115,14 +115,14 @@ module Rworkflow
115
115
  end
116
116
 
117
117
  (self.class::STATES_TERMINAL + names).each do |name|
118
- counters[name] = results.shift.to_i
118
+ the_counters[name] = results.shift.to_i
119
119
  end
120
120
 
121
- counters[:processing] = results.shift.reduce(0) { |sum, pair| sum + pair.last.to_i }
121
+ the_counters[:processing] = results.shift.reduce(0) { |sum, pair| sum + pair.last.to_i }
122
122
 
123
- return counters
123
+ return the_counters
124
124
  end
125
- private :get_counters!
125
+ private :counters!
126
126
 
127
127
  def fetch(fetcher_id, state_name)
128
128
  @processing.set(fetcher_id, 1)
@@ -159,7 +159,7 @@ module Rworkflow
159
159
 
160
160
  def list_objects(state_name, limit = -1)
161
161
  list = get_list(state_name)
162
- return list.get(0, limit).map {|object| self.class.serializer.load(object)}
162
+ return list.get(0, limit).map { |object| self.class.serializer.load(object) }
163
163
  end
164
164
 
165
165
  def get_state_list(state_name)
@@ -183,9 +183,9 @@ module Rworkflow
183
183
  post_process
184
184
 
185
185
  if self.public?
186
- counters = get_counters!
187
- counters[:processing] = 0 # Some worker might have increased the processing flag at that time even if there is no more jobs to be done
188
- @storage.setnx(:counters, self.class.serializer.dump(counters))
186
+ the_counters = self.counters!
187
+ the_counters[:processing] = 0 # Some worker might have increased the processing flag at that time even if there is no more jobs to be done
188
+ @storage.setnx(:counters, self.class.serializer.dump(the_counters))
189
189
  states_cleanup
190
190
  else
191
191
  self.cleanup
@@ -194,8 +194,7 @@ module Rworkflow
194
194
  end
195
195
  end
196
196
 
197
- def post_process
198
- end
197
+ def post_process; end
199
198
  protected :post_process
200
199
 
201
200
  def metadata_string
@@ -268,7 +267,7 @@ module Rworkflow
268
267
 
269
268
  def get(key, default = nil)
270
269
  value = @flow_data.get(key)
271
- value = if value.nil? then default else self.class.serializer.load(value) end
270
+ value = value.nil? ? default : self.class.serializer.load(value)
272
271
 
273
272
  return value
274
273
  end
@@ -319,7 +318,7 @@ module Rworkflow
319
318
  end
320
319
 
321
320
  def total_objects_processed(counters = nil)
322
- return (counters || get_counters).reduce(0) do |sum, pair|
321
+ return (counters || self.counters).reduce(0) do |sum, pair|
323
322
  if self.class.terminal?(pair[0])
324
323
  sum + pair[1]
325
324
  else
@@ -329,11 +328,11 @@ module Rworkflow
329
328
  end
330
329
 
331
330
  def total_objects(counters = nil)
332
- return (counters || get_counters).reduce(0) { |sum, pair| sum + pair[1] }
331
+ return (counters || self.counters).reduce(0) { |sum, pair| sum + pair[1] }
333
332
  end
334
333
 
335
334
  def total_objects_failed(counters = nil)
336
- return (counters || get_counters).reduce(0) do |sum, pair|
335
+ return (counters || self.counters).reduce(0) do |sum, pair|
337
336
  if self.class.failure?(pair[0])
338
337
  sum + pair[1]
339
338
  else
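
A likely reason these hunks call `self.counters` rather than a bare `counters`: once `get_counters` is renamed to `counters`, the `counters = nil` parameter shadows the method of the same name. The sketch below shows the effect in isolation; `CounterExample` and its fixed counter values are hypothetical and only stand in for the flow class.

```ruby
# Hypothetical class illustrating parameter/method shadowing; not rworkflow code.
class CounterExample
  def counters
    { successful: 2, failed: 1 }
  end

  # Bare `counters` refers to the parameter, so with no argument
  # this evaluates `nil || nil` and then calls reduce on nil.
  def total_shadowed(counters = nil)
    (counters || counters).reduce(0) { |sum, pair| sum + pair[1] }
  end

  # An explicit receiver forces the method call and bypasses the local.
  def total(counters = nil)
    (counters || self.counters).reduce(0) { |sum, pair| sum + pair[1] }
  end
end

example = CounterExample.new
puts example.total                 # falls back to the method
puts example.total({ queued: 5 })  # uses the argument
```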
@@ -406,12 +405,14 @@ module Rworkflow
406
405
  def read_flow_class(id)
407
406
  klass = nil
408
407
  raw_class = id.split('__').first
409
- klass = begin
410
- raw_class.constantize
411
- rescue NameError => _
412
- Rails.logger.warn("Unknown flow class for workflow id #{id}")
413
- nil
414
- end if !raw_class.nil?
408
+ if !raw_class.nil?
409
+ klass = begin
410
+ raw_class.constantize
411
+ rescue NameError => _
412
+ Rails.logger.warn("Unknown flow class for workflow id #{id}")
413
+ nil
414
+ end
415
+ end
415
416
 
416
417
  return klass
417
418
  end
@@ -5,11 +5,12 @@ module Rworkflow
5
5
 
6
6
  CARDINALITY_ALL_STARTED = :all_started # Indicates a cardinality equal to the jobs pushed at the start of the workflow
7
7
 
8
+ DEFAULT_CARDINALITY = State::DEFAULT_CARDINALITY
9
+ STATE_POLICY_NO_WAIT = State::STATE_POLICY_NO_WAIT
8
10
  DEFAULT_STATE_OPTIONS = {
9
- cardinality: State::DEFAULT_CARDINALITY,
10
- priority: State::DEFAULT_PRIORITY,
11
- policy: State::STATE_POLICY_NO_WAIT
12
- }
11
+ cardinality: self::DEFAULT_CARDINALITY,
12
+ policy: self::STATE_POLICY_NO_WAIT
13
+ }.freeze
13
14
 
14
15
  def initialize(state_class: State, state_options: {})
15
16
  @state_options = DEFAULT_STATE_OPTIONS.merge(state_options)
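
Freezing `DEFAULT_STATE_OPTIONS` does not interfere with the `merge` call in `initialize`, because `Hash#merge` returns a new, unfrozen hash and leaves the receiver untouched. A minimal sketch, assuming placeholder values (`SKETCH_DEFAULTS`, `1`, and `:no_wait` are illustrative; the real values come from `State` and are not shown in this diff):

```ruby
# Placeholder defaults; the actual values live in Rworkflow::State.
SKETCH_DEFAULTS = { cardinality: 1, policy: :no_wait }.freeze

# merge copies into a fresh hash, so per-lifecycle overrides still work.
def sketch_state_options(overrides = {})
  SKETCH_DEFAULTS.merge(overrides)
end

options = sketch_state_options(cardinality: 10)
puts options.frozen?          # the merged copy can still be modified
puts SKETCH_DEFAULTS.frozen?  # the shared default cannot
```

Mutating the constant in place (for example `SKETCH_DEFAULTS[:cardinality] = 5`) raises a `FrozenError`, which is the accidental modification the `.freeze` guards against.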
@@ -30,7 +31,7 @@ module Rworkflow
30
31
 
31
32
  def transition(from, name)
32
33
  from_state = @states[from]
33
- fail(StateError, from) if from_state.nil?
34
+ raise(StateError, from) if from_state.nil?
34
35
 
35
36
  return from_state.perform(name, @default)
36
37
  end