pwrake 2.2.6 → 2.2.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: d2c9f51a1a5bdbb546c288239917a1284a7d831f4332c339474d5eb4e368a0ac
-  data.tar.gz: dda35b31e72bd380b0e6ee5e1e2d6cdecb66698bfd3167ce792e3b4f95146f4c
+  metadata.gz: 20590e6007af5bc0c567fe8266a82f40aaeec0bbb853bdd87e180f0f1c53bc31
+  data.tar.gz: 4ec4760f2428d854019675a2f51e49f791102f7eab6b3141e258cfbc25a8842a
 SHA512:
-  metadata.gz: 5999d1c3ac6217ebab9d5e1915f6f9cf6c8c2023fa3fe973a3a3de7dfe751f4b89e5c8b1a31d4a0a9e972a8fb1be8233fb2c357b189f339925194bfde2352e52
-  data.tar.gz: 366a8e85bf26d3af24b9f80f4521c8b222c56985b50c644f337d4e996eb232f8c077df6fbe4a314b5794eb91d840639ce8c13a3e508fc695ea12bf3c4cbd3fad
+  metadata.gz: 391db5799b9091fbbc0a242f78729fe265087581b6b88dbaf9bf01f188e8d4412c749683075c38dfcea4803589289205f85c42320913727fe51628eb7fdaa2f8
+  data.tar.gz: fc7a41165bdce7fa685e3d922d5b24f6146524deaa52a31049c459d8fa39430b3948b4d02943ec143014181f2f916e3ab6d06f9596b61c05d397243a02770fe4
data/README.md CHANGED
@@ -23,6 +23,14 @@ Parallel Workflow extension for Rake, runs on multicores, clusters, clouds.
 * Pwrake schedules a compute node to execute a task, to a node where input files are stored.
 * Other supports for Gfarm: Automatic mount of the Gfarm file system, etc.
 
+## Requirement
+
+* Ruby version 2.2.3 or later
+* UNIX-like OS
+* For distributed processing using multiple computers:
+    * SSH command
+    * distributed file system (NFS, Gfarm, etc.)
+
 ## Installation
 
 Install with RubyGems:
@@ -33,8 +41,6 @@ Or download source tgz/zip and expand, cd to subdirectory and install:
 
     $ ruby setup.rb
 
-In the latter case, you need install [Parallel](https://github.com/grosser/parallel) manually. It is required by Pwrake for processor count.
-
 If you use rbenv, your system may fail to find pwrake command after installation:
 
     -bash: pwrake: command not found
@@ -70,7 +76,7 @@ In this case, you need the rehash of command paths:
 
     $ pwrake -F hosts
 
-### Use MPI to start remote worker
+### Sustitute MPI for SSH to start remote worker (Experimental)
 
 1. Setup MPI on your cluster.
 2. Install [MPipe gem](https://rubygems.org/gems/mpipe). (requires `mpicc`)
@@ -179,10 +185,9 @@ Properties (The leftmost item is default):
 
 ## Note for Gfarm
 
-* `gfwhere-pipe` script (included in Pwrake) is used for file-affinity scheduling.
-  This script requires Ruby/FFI (https://github.com/ffi/ffi). Install FFI by
-
-      gem install ffi
+* Gfarm file-affinity scheduling is achieved by `gfwhere-pipe` script bundled in the Pwrake package.
+  This script accesses `libgfarm.so.1` through Fiddle (a Ruby's standard module) since Pwrake ver.2.2.7.
+  Please set the environment variable `LD_LIBRARY_PATH` correctly to find `libgfarm.so.1`.
 
 ## Scheduling with Graph Partitioning
 
@@ -200,16 +205,7 @@ Properties (The leftmost item is default):
 
 * See publication: [M. Tanaka and O. Tatebe, “Workflow Scheduling to Minimize Data Movement Using Multi-constraint Graph Partitioning,” in CCGrid 2012](http://ieeexplore.ieee.org/abstract/document/6217406/)
 
-## Current version
-
-* Pwrake version 2.2.3
-
-## Tested Platform
-
-
-* Ruby 2.4.1
-* Rake 12.0.0
-* CentOS 7.3
+## [Publications](https://github.com/masa16/pwrake/wiki/Publications)
 
 ## Acknowledgment
 
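Since the README change makes `gfwhere-pipe` depend on the dynamic loader finding `libgfarm.so.1`, it may help to check loadability before running a workflow. The following is a minimal sketch, not part of Pwrake: `library_loadable?` is a hypothetical helper, and it only reports whether `Fiddle::Handle.new` can open the given name under the current `LD_LIBRARY_PATH`.

```ruby
require 'fiddle'

# Hypothetical helper (not part of Pwrake): returns true if the dynamic
# loader can open the named shared library in the current environment.
def library_loadable?(name)
  handle = Fiddle::Handle.new(name)
  handle.close
  true
rescue Fiddle::DLError
  false
end

if library_loadable?('libgfarm.so.1')
  puts 'libgfarm.so.1 found'
else
  puts 'libgfarm.so.1 not found; check LD_LIBRARY_PATH'
end
```

On a machine without Gfarm installed, the second branch is the expected result.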
data/bin/gfwhere-pipe CHANGED
@@ -1,15 +1,18 @@
 #! /usr/bin/env ruby
 
-require 'ffi'
-require 'singleton'
+require 'fiddle'
 
 module Gfarm
 
   class GfarmError < StandardError
   end
+
   GFARM_ERR_NO_ERROR = 0
 
-  module FFI
+  module LibGfarm
+
+    module_function
+
     def find_executable(name)
       path = "/usr/local/bin:/usr/ucb:/usr/bin:/bin"
       begin
@@ -24,171 +27,180 @@ module Gfarm
         begin
           stat = File.stat(file)
         rescue SystemCallError
-        else
+        else
           return file if stat.file? and stat.executable?
         end
       end
       nil
     end
-    module_function :find_executable
-
-    if LIBGFARM_PATH = ENV['LIBGFARM_PATH']
-      dirs = LIBGFARM_PATH.split(":")
-    elsif d = find_executable('gfwhere')
-      d = File.dirname(File.dirname(d))
-      dirs = %w[lib64 lib].map{|l| File.join(d,l)}
-    else
-      raise StandardError, "cannot find libgfarm path"
-    end
-    path = nil
-    dirs.each do |d|
-      f = File.join(d,"libgfarm.so*")
-      g = Dir.glob(f)
-      if !g.empty?
-        path = g[0]
-        break
+
+    begin
+      HANDLE = Fiddle::Handle.new('libgfarm.so.1')
+    rescue
+      if d = find_executable('gfwhere')
+        d = File.dirname(File.dirname(d))
+        dirs = %w[lib64 lib].map{|l| File.join(d,l)}
+      else
+        raise StandardError, "cannot find libgfarm path"
+      end
+      path = nil
+      dirs.each do |d|
+        f = File.join(d,"libgfarm.so*")
+        g = Dir.glob(f)
+        if !g.empty?
+          path = g[0]
+          break
+        end
+      end
+      if !(path && File.exist?(path))
+        raise StandardError, "cannot find libgfarm"
       end
+      HANDLE = Fiddle::Handle.new(path)
     end
-    if !(path && File.exist?(path))
-      raise StandardError, "cannot find libgfarm"
+
+    FUNC = {}
+
+    def def_cfunc(name,argtypes,rettype=Fiddle::TYPE_INT)
+      FUNC[name] = func =
+        Fiddle::Function.new(HANDLE.sym(name),argtypes,rettype,name:name)
+      define_singleton_method(name){|*a| func.call(*a)}
     end
-    extend ::FFI::Library
-    ffi_lib path
-    attach_function :gfarm_initialize, [:pointer, :pointer], :int
-    attach_function :gfarm_terminate, [], :int
-    attach_function :gfarm_realpath_by_gfarm2fs, [:string, :pointer], :int
-    attach_function :gfarm_error_string, [:int], :string
-    attach_function :gfs_replica_info_by_name, [:string, :int, :pointer], :int
-    attach_function :gfs_replica_info_number, [:pointer], :int
-    attach_function :gfs_replica_info_free, [:pointer], :void
-    attach_function :gfs_replica_info_nth_host, [:pointer, :int], :string
-  end
 
+    # gfarm_error_t gfarm_initialize(int *argcp, char *** argvp);
+    def_cfunc 'gfarm_initialize',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_VOIDP]
 
-  class Connection
-    include Singleton
+    # gfarm_error_t gfarm_terminate(void);
+    def_cfunc 'gfarm_terminate',[]
 
-    def self.callback
-      proc{ FFI.gfarm_terminate }
-    end
+    # const char *gfarm_error_string(gfarm_error_t);
+    def_cfunc 'gfarm_error_string',[Fiddle::TYPE_INT],Fiddle::TYPE_VOIDP
+
+    # gfarm_error_t gfarm_realpath_by_gfarm2fs(const char *, char **);
+    def_cfunc 'gfarm_realpath_by_gfarm2fs',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_VOIDP]
+
+    # gfarm_error_t gfs_replica_info_by_name(
+    #   const char *, int, struct gfs_replica_info **);
+    def_cfunc 'gfs_replica_info_by_name',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_INT,Fiddle::TYPE_VOIDP]
+
+    # void gfs_replica_info_free(struct gfs_replica_info *);
+    def_cfunc 'gfs_replica_info_free',[Fiddle::TYPE_VOIDP],Fiddle::TYPE_VOID
+
+    # int gfs_replica_info_number(struct gfs_replica_info *);
+    def_cfunc 'gfs_replica_info_number',[Fiddle::TYPE_VOIDP]
+
+    # const char *gfs_replica_info_nth_host(struct gfs_replica_info *, int);
+    def_cfunc 'gfs_replica_info_nth_host',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_INT],Fiddle::TYPE_VOIDP
 
-    def self.set_args(args)
-      @@args = args
-    end
 
-    def initialize
-      args = @@args || []
-      argc = ::FFI::MemoryPointer.new(:int, 1)
-      argc.write_int(args.size)
-      ary = args.map do |s|
-        str = ::FFI::MemoryPointer.new(:string, s.size)
-        str.write_string(s)
-        str
+    @@initialized = false
+
+    def initialize(*argv)
+      if @@initialized
+        warn "gfarm_initialize: already initialized"
+        return
+      end
+      argc_buf = [argv.size].pack('i')
+      if argv.empty?
+        argv_ary = [0].pack('J')
+      else
+        argv_ary = ARGV.map{|a| Fiddle::Pointer[a]}.pack('J*')
       end
-      ptr = ::FFI::MemoryPointer.new(:pointer, args.size)
-      ptr.write_array_of_pointer(ary)
-      argv = ::FFI::MemoryPointer.new(:pointer, 1)
-      argv.write_pointer(ptr)
-      e = FFI.gfarm_initialize(argc, argv)
+      argv_buf = [Fiddle::Pointer[argv_ary]].pack('J')
+      e = LibGfarm.gfarm_initialize(argc_buf,argv_buf)
+      # size = argc_buf.unpack('i').first
       if e != GFARM_ERR_NO_ERROR
-        raise GfarmError, FFI.gfarm_error_string(e)
+        raise GfarmError,error_string(e)
       end
-      ObjectSpace.define_finalizer(self, self.class.callback)
+      @@initialized = true
+      at_exit{ gfarm_terminate() }
+    end
+
+    def error_string(i)
+      gfarm_error_string(i).to_s
     end
 
     def realpath_by_gfarm2fs(path)
-      ptr = ::FFI::MemoryPointer.new(:pointer, 1)
-      e = FFI.gfarm_realpath_by_gfarm2fs(path, ptr)
+      ptr_buf = [0].pack('J')
+      e = gfarm_realpath_by_gfarm2fs(path,ptr_buf)
       if e != GFARM_ERR_NO_ERROR
-        raise GfarmError, FFI.gfarm_error_string(e)
+        raise GfarmError,error_string(e)
       end
-      ptr.read_pointer().read_string()
+      Fiddle::Pointer[ptr_buf.unpack('J').first].to_s
     end
 
-    def replica_info_by_name(name)
-      ReplicaInfo.new(self,name)
-    end
   end
 
 
-  class ReplicaInfo < ::FFI::AutoPointer
+  class ReplicaInfo
 
-    def self.release(ptr)
-      FFI.gfs_replica_info_free(ptr)
-    end
+    @@flags = 0
+    INCLUDING_DEAD_HOST = 1
+    INCLUDING_INCOMPLETE_COPY = 2
+    INCLUDING_DEAD_COPY = 4
 
-    def self.set_opts(opts)
-      @@opts = opts
+    def self.set_opts(argv)
+      @@flags = 0
+      args = []
+      argv.each do |x|
+        case x
+        when "-i"
+          @@flags |= INCLUDING_INCOMPLETE_COPY
+        else
+          args << x
+        end
+      end
+      args
     end
 
-    def initialize(gfarm, path)
-      @gfarm = gfarm
-      @realpath = @gfarm.realpath_by_gfarm2fs(path)
-      flag = @@opts.flags
-      ptr = ::FFI::MemoryPointer.new(:pointer, 1)
-      e = FFI.gfs_replica_info_by_name(@realpath, flag, ptr)
+    def initialize(path)
+      while File.symlink?(path)
+        link = File.readlink(path)
+        path = File.expand_path(link, File.dirname(path))
+      end
+      @realpath = LibGfarm.realpath_by_gfarm2fs(path)
+      ptr_buf = [0].pack('J')
+      e = LibGfarm.gfs_replica_info_by_name(@realpath,@@flags,ptr_buf)
       if e != GFARM_ERR_NO_ERROR
-        raise GfarmError, @realpath+" "+FFI.gfarm_error_string(e)
+        raise GfarmError,@realpath+" : "+LibGfarm.error_string(e)
       end
-      @ri = ptr.read_pointer()
-      super @ri
+      @ptr = Fiddle::Pointer[ptr_buf.unpack('J').first]
+      @ptr.free = LibGfarm::FUNC['gfs_replica_info_free']
     end
+
     attr_reader :realpath
 
     def number
-      FFI.gfs_replica_info_number(@ri)
+      LibGfarm.gfs_replica_info_number(@ptr)
     end
 
     def nth_host(i)
-      FFI.gfs_replica_info_nth_host(@ri,i)
+      LibGfarm.gfs_replica_info_nth_host(@ptr,i).to_s
     end
   end
 
-  class Options
-    INCLUDING_DEAD_HOST = 1
-    INCLUDING_INCOMPLETE_COPY = 2
-    INCLUDING_DEAD_COPY = 4
+end
 
-    def initialize(argv)
-      @args = []
-      @flags = 0
-      argv.each do |x|
-        case x
-        when "-i"
-          @including_incomplete_copy = true
-          @flags |= INCLUDING_INCOMPLETE_COPY
-        else
-          @args << x
-        end
-      end
-    end
+if $0 == __FILE__
 
-    attr_reader :args
-    attr_reader :flags
-    attr_reader :including_incomplete_copy
+  [:PIPE,:TERM,:INT].each do |sig|
+    Signal.trap(sig, "EXIT")
   end
 
-end
-
-[:PIPE,:TERM,:INT].each do |sig|
-  Signal.trap(sig, "EXIT")
-end
-
-opts = Gfarm::Options.new(ARGV)
-Gfarm::ReplicaInfo.set_opts(opts)
-Gfarm::Connection.set_args(opts.args)
-gfarm = Gfarm::Connection.instance
-
-while path=$stdin.gets
-  path.chomp!
-  $stdout.print path+"\n"
-  $stdout.flush
-  begin
-    ri = gfarm.replica_info_by_name(path)
-    hosts = ri.number.times.map{|i| ri.nth_host(i) }
-    $stdout.print ri.realpath+":\n"+hosts.join(" ")+"\n"
-  rescue
-    $stdout.print "Error: "+path+"\n"
+  argv = Gfarm::ReplicaInfo.set_opts(ARGV)
+  Gfarm::LibGfarm.initialize(*argv)
+
+  while path=$stdin.gets
+    path.chomp!
+    $stdout.print path+"\n"
+    $stdout.flush
+    begin
+      rep_info = Gfarm::ReplicaInfo.new(path)
+      hosts = rep_info.number.times.map{|i| rep_info.nth_host(i) }
+      $stdout.print rep_info.realpath+":\n"+hosts.join(" ")+"\n"
+    rescue
+      $stdout.print "Error: "+path+"\n"
+    end
+    $stdout.flush
   end
-  $stdout.flush
+
 end
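The out-parameter handling introduced in this file (a pointer-sized buffer from `[0].pack('J')`, then `Fiddle::Pointer[...]` on the unpacked address) can be tried without Gfarm. The following is a minimal sketch, assuming a Unix-like system whose C library exports `strtol` through `Fiddle::Handle::DEFAULT`. The `LibC` module and `parse_long` are hypothetical names for illustration only; they stand in for `LibGfarm` and `realpath_by_gfarm2fs` and are not part of Pwrake.

```ruby
require 'fiddle'

# Hypothetical stand-in for LibGfarm: binds a libc function instead of libgfarm.
module LibC
  module_function

  FUNC = {}

  # Look up a C symbol and expose it as a singleton method of this module.
  def def_cfunc(name, argtypes, rettype = Fiddle::TYPE_INT)
    func = Fiddle::Function.new(Fiddle::Handle::DEFAULT[name], argtypes, rettype)
    FUNC[name] = func
    define_singleton_method(name) { |*a| func.call(*a) }
  end

  # long strtol(const char *nptr, char **endptr, int base);
  def_cfunc 'strtol',
            [Fiddle::TYPE_VOIDP, Fiddle::TYPE_VOIDP, Fiddle::TYPE_INT],
            Fiddle::TYPE_LONG

  # Returns [parsed value, unparsed remainder of text].
  def parse_long(text)
    endptr_buf = [0].pack('J')            # pointer-sized buffer for char **endptr
    value = strtol(text, endptr_buf, 10)  # C writes the end address into the buffer
    rest = Fiddle::Pointer[endptr_buf.unpack('J').first].to_s
    [value, rest]
  end
end

value, rest = LibC.parse_long('42 hosts')
puts "#{value.inspect} #{rest.inspect}"
```

The packed string serves as writable memory for the C function, so no explicit allocation is needed for a single pointer; the input string has to stay referenced while the returned address is read.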
@@ -22,7 +22,6 @@ module Pwrake
       standard_exception_handling do
         init("pwrake") # <- parse options here
         @role = @master = Master.new
-        t = Time.now
         t = Pwrake.clock
         @master.setup_branches
         load_rakefile
@@ -1,4 +1,3 @@
-require "parallel/processor_count.rb"
 require "pwrake/nbio"
 require "pwrake/branch/fiber_queue"
 require "pwrake/worker/writer"
@@ -1,7 +1,6 @@
 require "pathname"
 require "yaml"
 require "socket"
-require "parallel"
 require "pwrake/option/host_map"
 
 module Pwrake
@@ -10,7 +10,6 @@ module Pwrake
 
     def set_filesystem_option
       @worker_progs = %w[
-        parallel/processor_count.rb
         pwrake/nbio
         pwrake/branch/fiber_queue
         pwrake/worker/writer
@@ -39,7 +39,6 @@ module Pwrake
         :single_mp => self['GFARM_SINGLE_MP']
       }
       @worker_progs = %w[
-        parallel/processor_count.rb
         pwrake/nbio
         pwrake/branch/fiber_queue
         pwrake/worker/writer
@@ -1,3 +1,3 @@
 module Pwrake
-  VERSION = "2.2.6"
+  VERSION = "2.2.7"
 end
@@ -1,14 +1,13 @@
 require "socket"
+require "etc"
 
 module Pwrake
 
   class Invoker
-    begin
-      # use Michael Grosser's Parallel module
-      # https://github.com/grosser/parallel
-      include Parallel::ProcessorCount
-    rescue
-      def processor_count
+    def processor_count
+      begin
+        Etc.nprocessors
+      rescue
         # only for Linux
         IO.read("/proc/cpuinfo").scan(/^processor/).size
       end
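The fallback order in the hunk above (standard-library `Etc.nprocessors` first, `/proc/cpuinfo` second) can be exercised on its own. The following is a minimal sketch, not the `Invoker` method itself: `detect_processor_count` and its `cpuinfo_path` parameter are hypothetical names added for illustration.

```ruby
require 'etc'

# Hypothetical standalone helper mirroring the fallback order in the diff.
def detect_processor_count(cpuinfo_path = '/proc/cpuinfo')
  Etc.nprocessors
rescue StandardError
  # Linux-only fallback: count the "processor" entries.
  File.read(cpuinfo_path).scan(/^processor/).size
end

puts detect_processor_count
```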
data/pwrake.gemspec CHANGED
@@ -8,8 +8,8 @@ Gem::Specification.new do |gem|
   gem.version = Pwrake::VERSION
   gem.authors = ["Masahiro TANAKA"]
   gem.email = ["masa16.tanaka@gmail.com"]
-  gem.summary = %q{Parallel Workflow engine based on Rake}
-  gem.description = %q{Parallel Workflow engine based on Rake, runs on multicores, clusters, clouds}
+  gem.summary = %q{Parallel and distributed Rake for workflow execution on multicores, clusters, clouds.}
+  gem.description = %q{Parallel and distributed Rake for workflow execution on multicores, clusters, clouds using SSH. It has locality-aware scheduling designed for Gfarm file system.}
   gem.homepage = "http://masa16.github.com/pwrake"
   gem.license = 'MIT'
 
@@ -17,5 +17,5 @@ Gem::Specification.new do |gem|
   gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
   gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
   gem.require_paths = ["lib"]
-  gem.add_runtime_dependency 'parallel', '>= 1.2.4'
+  gem.required_ruby_version = '>= 2.2.3'
 end
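The new `required_ruby_version` constraint is evaluated by RubyGems at install time. As a rough illustration of what `'>= 2.2.3'` accepts, the sketch below checks version strings with `Gem::Requirement`; `ruby_supported?` is a hypothetical helper, not something defined by Pwrake or RubyGems.

```ruby
require 'rubygems'

# The constraint string is taken from the gemspec change above.
requirement = Gem::Requirement.new('>= 2.2.3')

# Hypothetical helper for illustration: does the given Ruby version
# (default: the running interpreter) satisfy the requirement?
def ruby_supported?(requirement, version_string = RUBY_VERSION)
  requirement.satisfied_by?(Gem::Version.new(version_string))
end

puts ruby_supported?(requirement)
```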
metadata CHANGED
@@ -1,31 +1,17 @@
 --- !ruby/object:Gem::Specification
 name: pwrake
 version: !ruby/object:Gem::Version
-  version: 2.2.6
+  version: 2.2.7
 platform: ruby
 authors:
 - Masahiro TANAKA
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2018-04-15 00:00:00.000000000 Z
-dependencies:
-- !ruby/object:Gem::Dependency
-  name: parallel
-  requirement: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.2.4
-  type: :runtime
-  prerelease: false
-  version_requirements: !ruby/object:Gem::Requirement
-    requirements:
-    - - ">="
-      - !ruby/object:Gem::Version
-        version: 1.2.4
-description: Parallel Workflow engine based on Rake, runs on multicores, clusters,
-  clouds
+date: 2018-11-30 00:00:00.000000000 Z
+dependencies: []
+description: Parallel and distributed Rake for workflow execution on multicores, clusters,
+  clouds using SSH. It has locality-aware scheduling designed for Gfarm file system.
 email:
 - masa16.tanaka@gmail.com
 executables:
@@ -128,7 +114,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
     - !ruby/object:Gem::Version
-      version: '0'
+      version: 2.2.3
 required_rubygems_version: !ruby/object:Gem::Requirement
   requirements:
   - - ">="
@@ -136,10 +122,11 @@ required_rubygems_version: !ruby/object:Gem::Requirement
       version: '0'
 requirements: []
 rubyforge_project:
-rubygems_version: 2.7.3
+rubygems_version: 2.7.6
 signing_key:
 specification_version: 4
-summary: Parallel Workflow engine based on Rake
+summary: Parallel and distributed Rake for workflow execution on multicores, clusters,
+  clouds.
 test_files:
 - spec/001/Rakefile
 - spec/002/Rakefile