pwrake 2.2.6 → 2.2.7

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: d2c9f51a1a5bdbb546c288239917a1284a7d831f4332c339474d5eb4e368a0ac
4
- data.tar.gz: dda35b31e72bd380b0e6ee5e1e2d6cdecb66698bfd3167ce792e3b4f95146f4c
3
+ metadata.gz: 20590e6007af5bc0c567fe8266a82f40aaeec0bbb853bdd87e180f0f1c53bc31
4
+ data.tar.gz: 4ec4760f2428d854019675a2f51e49f791102f7eab6b3141e258cfbc25a8842a
5
5
  SHA512:
6
- metadata.gz: 5999d1c3ac6217ebab9d5e1915f6f9cf6c8c2023fa3fe973a3a3de7dfe751f4b89e5c8b1a31d4a0a9e972a8fb1be8233fb2c357b189f339925194bfde2352e52
7
- data.tar.gz: 366a8e85bf26d3af24b9f80f4521c8b222c56985b50c644f337d4e996eb232f8c077df6fbe4a314b5794eb91d840639ce8c13a3e508fc695ea12bf3c4cbd3fad
6
+ metadata.gz: 391db5799b9091fbbc0a242f78729fe265087581b6b88dbaf9bf01f188e8d4412c749683075c38dfcea4803589289205f85c42320913727fe51628eb7fdaa2f8
7
+ data.tar.gz: fc7a41165bdce7fa685e3d922d5b24f6146524deaa52a31049c459d8fa39430b3948b4d02943ec143014181f2f916e3ab6d06f9596b61c05d397243a02770fe4
data/README.md CHANGED
@@ -23,6 +23,14 @@ Parallel Workflow extension for Rake, runs on multicores, clusters, clouds.
23
23
  * Pwrake schedules a compute node to execute a task, to a node where input files are stored.
24
24
  * Other supports for Gfarm: Automatic mount of the Gfarm file system, etc.
25
25
 
26
+ ## Requirement
27
+
28
+ * Ruby version 2.2.3 or later
29
+ * UNIX-like OS
30
+ * For distributed processing using multiple computers:
31
+ * SSH command
32
+ * distributed file system (NFS, Gfarm, etc.)
33
+
26
34
  ## Installation
27
35
 
28
36
  Install with RubyGems:
@@ -33,8 +41,6 @@ Or download source tgz/zip and expand, cd to subdirectory and install:
33
41
 
34
42
  $ ruby setup.rb
35
43
 
36
- In the latter case, you need install [Parallel](https://github.com/grosser/parallel) manually. It is required by Pwrake for processor count.
37
-
38
44
  If you use rbenv, your system may fail to find pwrake command after installation:
39
45
 
40
46
  -bash: pwrake: command not found
@@ -70,7 +76,7 @@ In this case, you need the rehash of command paths:
70
76
 
71
77
  $ pwrake -F hosts
72
78
 
73
- ### Use MPI to start remote worker
79
+ ### Sustitute MPI for SSH to start remote worker (Experimental)
74
80
 
75
81
  1. Setup MPI on your cluster.
76
82
  2. Install [MPipe gem](https://rubygems.org/gems/mpipe). (requires `mpicc`)
@@ -179,10 +185,9 @@ Properties (The leftmost item is default):
179
185
 
180
186
  ## Note for Gfarm
181
187
 
182
- * `gfwhere-pipe` script (included in Pwrake) is used for file-affinity scheduling.
183
- This script requires Ruby/FFI (https://github.com/ffi/ffi). Install FFI by
184
-
185
- gem install ffi
188
+ * Gfarm file-affinity scheduling is achieved by `gfwhere-pipe` script bundled in the Pwrake package.
189
+ This script accesses `libgfarm.so.1` through Fiddle (a Ruby's standard module) since Pwrake ver.2.2.7.
190
+ Please set the environment variable `LD_LIBRARY_PATH` correctly to find `libgfarm.so.1`.
186
191
 
187
192
  ## Scheduling with Graph Partitioning
188
193
 
@@ -200,16 +205,7 @@ Properties (The leftmost item is default):
200
205
 
201
206
  * See publication: [M. Tanaka and O. Tatebe, “Workflow Scheduling to Minimize Data Movement Using Multi-constraint Graph Partitioning,” in CCGrid 2012](http://ieeexplore.ieee.org/abstract/document/6217406/)
202
207
 
203
- ## Current version
204
-
205
- * Pwrake version 2.2.3
206
-
207
- ## Tested Platform
208
-
209
-
210
- * Ruby 2.4.1
211
- * Rake 12.0.0
212
- * CentOS 7.3
208
+ ## [Publications](https://github.com/masa16/pwrake/wiki/Publications)
213
209
 
214
210
  ## Acknowledgment
215
211
 
data/bin/gfwhere-pipe CHANGED
@@ -1,15 +1,18 @@
1
1
  #! /usr/bin/env ruby
2
2
 
3
- require 'ffi'
4
- require 'singleton'
3
+ require 'fiddle'
5
4
 
6
5
  module Gfarm
7
6
 
8
7
  class GfarmError < StandardError
9
8
  end
9
+
10
10
  GFARM_ERR_NO_ERROR = 0
11
11
 
12
- module FFI
12
+ module LibGfarm
13
+
14
+ module_function
15
+
13
16
  def find_executable(name)
14
17
  path = "/usr/local/bin:/usr/ucb:/usr/bin:/bin"
15
18
  begin
@@ -24,171 +27,180 @@ module Gfarm
24
27
  begin
25
28
  stat = File.stat(file)
26
29
  rescue SystemCallError
27
- else
30
+ else
28
31
  return file if stat.file? and stat.executable?
29
32
  end
30
33
  end
31
34
  nil
32
35
  end
33
- module_function :find_executable
34
-
35
- if LIBGFARM_PATH = ENV['LIBGFARM_PATH']
36
- dirs = LIBGFARM_PATH.split(":")
37
- elsif d = find_executable('gfwhere')
38
- d = File.dirname(File.dirname(d))
39
- dirs = %w[lib64 lib].map{|l| File.join(d,l)}
40
- else
41
- raise StandardError, "cannot find libgfarm path"
42
- end
43
- path = nil
44
- dirs.each do |d|
45
- f = File.join(d,"libgfarm.so*")
46
- g = Dir.glob(f)
47
- if !g.empty?
48
- path = g[0]
49
- break
36
+
37
+ begin
38
+ HANDLE = Fiddle::Handle.new('libgfarm.so.1')
39
+ rescue
40
+ if d = find_executable('gfwhere')
41
+ d = File.dirname(File.dirname(d))
42
+ dirs = %w[lib64 lib].map{|l| File.join(d,l)}
43
+ else
44
+ raise StandardError, "cannot find libgfarm path"
45
+ end
46
+ path = nil
47
+ dirs.each do |d|
48
+ f = File.join(d,"libgfarm.so*")
49
+ g = Dir.glob(f)
50
+ if !g.empty?
51
+ path = g[0]
52
+ break
53
+ end
54
+ end
55
+ if !(path && File.exist?(path))
56
+ raise StandardError, "cannot find libgfarm"
50
57
  end
58
+ HANDLE = Fiddle::Handle.new(path)
51
59
  end
52
- if !(path && File.exist?(path))
53
- raise StandardError, "cannot find libgfarm"
60
+
61
+ FUNC = {}
62
+
63
+ def def_cfunc(name,argtypes,rettype=Fiddle::TYPE_INT)
64
+ FUNC[name] = func =
65
+ Fiddle::Function.new(HANDLE.sym(name),argtypes,rettype,name:name)
66
+ define_singleton_method(name){|*a| func.call(*a)}
54
67
  end
55
- extend ::FFI::Library
56
- ffi_lib path
57
- attach_function :gfarm_initialize, [:pointer, :pointer], :int
58
- attach_function :gfarm_terminate, [], :int
59
- attach_function :gfarm_realpath_by_gfarm2fs, [:string, :pointer], :int
60
- attach_function :gfarm_error_string, [:int], :string
61
- attach_function :gfs_replica_info_by_name, [:string, :int, :pointer], :int
62
- attach_function :gfs_replica_info_number, [:pointer], :int
63
- attach_function :gfs_replica_info_free, [:pointer], :void
64
- attach_function :gfs_replica_info_nth_host, [:pointer, :int], :string
65
- end
66
68
 
69
+ # gfarm_error_t gfarm_initialize(int *argcp, char *** argvp);
70
+ def_cfunc 'gfarm_initialize',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_VOIDP]
67
71
 
68
- class Connection
69
- include Singleton
72
+ # gfarm_error_t gfarm_terminate(void);
73
+ def_cfunc 'gfarm_terminate',[]
70
74
 
71
- def self.callback
72
- proc{ FFI.gfarm_terminate }
73
- end
75
+ # const char *gfarm_error_string(gfarm_error_t);
76
+ def_cfunc 'gfarm_error_string',[Fiddle::TYPE_INT],Fiddle::TYPE_VOIDP
77
+
78
+ # gfarm_error_t gfarm_realpath_by_gfarm2fs(const char *, char **);
79
+ def_cfunc 'gfarm_realpath_by_gfarm2fs',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_VOIDP]
80
+
81
+ # gfarm_error_t gfs_replica_info_by_name(
82
+ # const char *, int, struct gfs_replica_info **);
83
+ def_cfunc 'gfs_replica_info_by_name',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_INT,Fiddle::TYPE_VOIDP]
84
+
85
+ # void gfs_replica_info_free(struct gfs_replica_info *);
86
+ def_cfunc 'gfs_replica_info_free',[Fiddle::TYPE_VOIDP],Fiddle::TYPE_VOID
87
+
88
+ # int gfs_replica_info_number(struct gfs_replica_info *);
89
+ def_cfunc 'gfs_replica_info_number',[Fiddle::TYPE_VOIDP]
90
+
91
+ # const char *gfs_replica_info_nth_host(struct gfs_replica_info *, int);
92
+ def_cfunc 'gfs_replica_info_nth_host',[Fiddle::TYPE_VOIDP,Fiddle::TYPE_INT],Fiddle::TYPE_VOIDP
74
93
 
75
- def self.set_args(args)
76
- @@args = args
77
- end
78
94
 
79
- def initialize
80
- args = @@args || []
81
- argc = ::FFI::MemoryPointer.new(:int, 1)
82
- argc.write_int(args.size)
83
- ary = args.map do |s|
84
- str = ::FFI::MemoryPointer.new(:string, s.size)
85
- str.write_string(s)
86
- str
95
+ @@initialized = false
96
+
97
+ def initialize(*argv)
98
+ if @@initialized
99
+ warn "gfarm_initialize: already initialized"
100
+ return
101
+ end
102
+ argc_buf = [argv.size].pack('i')
103
+ if argv.empty?
104
+ argv_ary = [0].pack('J')
105
+ else
106
+ argv_ary = ARGV.map{|a| Fiddle::Pointer[a]}.pack('J*')
87
107
  end
88
- ptr = ::FFI::MemoryPointer.new(:pointer, args.size)
89
- ptr.write_array_of_pointer(ary)
90
- argv = ::FFI::MemoryPointer.new(:pointer, 1)
91
- argv.write_pointer(ptr)
92
- e = FFI.gfarm_initialize(argc, argv)
108
+ argv_buf = [Fiddle::Pointer[argv_ary]].pack('J')
109
+ e = LibGfarm.gfarm_initialize(argc_buf,argv_buf)
110
+ # size = argc_buf.unpack('i').first
93
111
  if e != GFARM_ERR_NO_ERROR
94
- raise GfarmError, FFI.gfarm_error_string(e)
112
+ raise GfarmError,error_string(e)
95
113
  end
96
- ObjectSpace.define_finalizer(self, self.class.callback)
114
+ @@initialized = true
115
+ at_exit{ gfarm_terminate() }
116
+ end
117
+
118
+ def error_string(i)
119
+ gfarm_error_string(i).to_s
97
120
  end
98
121
 
99
122
  def realpath_by_gfarm2fs(path)
100
- ptr = ::FFI::MemoryPointer.new(:pointer, 1)
101
- e = FFI.gfarm_realpath_by_gfarm2fs(path, ptr)
123
+ ptr_buf = [0].pack('J')
124
+ e = gfarm_realpath_by_gfarm2fs(path,ptr_buf)
102
125
  if e != GFARM_ERR_NO_ERROR
103
- raise GfarmError, FFI.gfarm_error_string(e)
126
+ raise GfarmError,error_string(e)
104
127
  end
105
- ptr.read_pointer().read_string()
128
+ Fiddle::Pointer[ptr_buf.unpack('J').first].to_s
106
129
  end
107
130
 
108
- def replica_info_by_name(name)
109
- ReplicaInfo.new(self,name)
110
- end
111
131
  end
112
132
 
113
133
 
114
- class ReplicaInfo < ::FFI::AutoPointer
134
+ class ReplicaInfo
115
135
 
116
- def self.release(ptr)
117
- FFI.gfs_replica_info_free(ptr)
118
- end
136
+ @@flags = 0
137
+ INCLUDING_DEAD_HOST = 1
138
+ INCLUDING_INCOMPLETE_COPY = 2
139
+ INCLUDING_DEAD_COPY = 4
119
140
 
120
- def self.set_opts(opts)
121
- @@opts = opts
141
+ def self.set_opts(argv)
142
+ @@flags = 0
143
+ args = []
144
+ argv.each do |x|
145
+ case x
146
+ when "-i"
147
+ @@flags |= INCLUDING_INCOMPLETE_COPY
148
+ else
149
+ args << x
150
+ end
151
+ end
152
+ args
122
153
  end
123
154
 
124
- def initialize(gfarm, path)
125
- @gfarm = gfarm
126
- @realpath = @gfarm.realpath_by_gfarm2fs(path)
127
- flag = @@opts.flags
128
- ptr = ::FFI::MemoryPointer.new(:pointer, 1)
129
- e = FFI.gfs_replica_info_by_name(@realpath, flag, ptr)
155
+ def initialize(path)
156
+ while File.symlink?(path)
157
+ link = File.readlink(path)
158
+ path = File.expand_path(link, File.dirname(path))
159
+ end
160
+ @realpath = LibGfarm.realpath_by_gfarm2fs(path)
161
+ ptr_buf = [0].pack('J')
162
+ e = LibGfarm.gfs_replica_info_by_name(@realpath,@@flags,ptr_buf)
130
163
  if e != GFARM_ERR_NO_ERROR
131
- raise GfarmError, @realpath+" "+FFI.gfarm_error_string(e)
164
+ raise GfarmError,@realpath+" : "+LibGfarm.error_string(e)
132
165
  end
133
- @ri = ptr.read_pointer()
134
- super @ri
166
+ @ptr = Fiddle::Pointer[ptr_buf.unpack('J').first]
167
+ @ptr.free = LibGfarm::FUNC['gfs_replica_info_free']
135
168
  end
169
+
136
170
  attr_reader :realpath
137
171
 
138
172
  def number
139
- FFI.gfs_replica_info_number(@ri)
173
+ LibGfarm.gfs_replica_info_number(@ptr)
140
174
  end
141
175
 
142
176
  def nth_host(i)
143
- FFI.gfs_replica_info_nth_host(@ri,i)
177
+ LibGfarm.gfs_replica_info_nth_host(@ptr,i).to_s
144
178
  end
145
179
  end
146
180
 
147
- class Options
148
- INCLUDING_DEAD_HOST = 1
149
- INCLUDING_INCOMPLETE_COPY = 2
150
- INCLUDING_DEAD_COPY = 4
181
+ end
151
182
 
152
- def initialize(argv)
153
- @args = []
154
- @flags = 0
155
- argv.each do |x|
156
- case x
157
- when "-i"
158
- @including_incomplete_copy = true
159
- @flags |= INCLUDING_INCOMPLETE_COPY
160
- else
161
- @args << x
162
- end
163
- end
164
- end
183
+ if $0 == __FILE__
165
184
 
166
- attr_reader :args
167
- attr_reader :flags
168
- attr_reader :including_incomplete_copy
185
+ [:PIPE,:TERM,:INT].each do |sig|
186
+ Signal.trap(sig, "EXIT")
169
187
  end
170
188
 
171
- end
172
-
173
- [:PIPE,:TERM,:INT].each do |sig|
174
- Signal.trap(sig, "EXIT")
175
- end
176
-
177
- opts = Gfarm::Options.new(ARGV)
178
- Gfarm::ReplicaInfo.set_opts(opts)
179
- Gfarm::Connection.set_args(opts.args)
180
- gfarm = Gfarm::Connection.instance
181
-
182
- while path=$stdin.gets
183
- path.chomp!
184
- $stdout.print path+"\n"
185
- $stdout.flush
186
- begin
187
- ri = gfarm.replica_info_by_name(path)
188
- hosts = ri.number.times.map{|i| ri.nth_host(i) }
189
- $stdout.print ri.realpath+":\n"+hosts.join(" ")+"\n"
190
- rescue
191
- $stdout.print "Error: "+path+"\n"
189
+ argv = Gfarm::ReplicaInfo.set_opts(ARGV)
190
+ Gfarm::LibGfarm.initialize(*argv)
191
+
192
+ while path=$stdin.gets
193
+ path.chomp!
194
+ $stdout.print path+"\n"
195
+ $stdout.flush
196
+ begin
197
+ rep_info = Gfarm::ReplicaInfo.new(path)
198
+ hosts = rep_info.number.times.map{|i| rep_info.nth_host(i) }
199
+ $stdout.print rep_info.realpath+":\n"+hosts.join(" ")+"\n"
200
+ rescue
201
+ $stdout.print "Error: "+path+"\n"
202
+ end
203
+ $stdout.flush
192
204
  end
193
- $stdout.flush
205
+
194
206
  end
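The rewritten `gfwhere-pipe` above binds C functions through Fiddle and receives pointer results through a pointer-sized buffer built with `pack('J')`. The sketch below shows the same out-parameter pattern against `strtol` from the C library instead of `libgfarm.so.1`, so that it can run without Gfarm installed. The helper name `parse_leading_int`, the constants `LIBC` and `STRTOL`, and the use of `Fiddle::Handle::DEFAULT` are assumptions made for illustration and are not part of Pwrake.

```ruby
require "fiddle"

# Assumption: the C library is already loaded into the Ruby process, so its
# symbols can be looked up through the default handle (typical on UNIX-like OS).
LIBC = Fiddle::Handle::DEFAULT

# long strtol(const char *nptr, char **endptr, int base);
STRTOL = Fiddle::Function.new(
  LIBC["strtol"],
  [Fiddle::TYPE_VOIDP, Fiddle::TYPE_VOIDP, Fiddle::TYPE_INT],
  Fiddle::TYPE_LONG
)

# Hypothetical helper: parse a leading integer and return [value, rest].
def parse_leading_int(text, base = 10)
  # Pointer-sized, zero-filled buffer that stands in for `char *endptr`.
  endptr_buf = [0].pack("J")
  value = STRTOL.call(text, endptr_buf, base)
  # Read the address the C function stored, then read the C string it points to.
  rest = Fiddle::Pointer[endptr_buf.unpack("J").first].to_s
  [value, rest]
end

value, rest = parse_leading_int("123abc")
puts "value=#{value} rest=#{rest.inspect}"
```

The `J` directive packs a pointer-width unsigned integer, which is why the same buffer trick appears for `gfarm_realpath_by_gfarm2fs` and `gfs_replica_info_by_name` in the diff. The input string must stay referenced while the returned pointer is read, because the pointer refers to the string's own buffer.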
@@ -22,7 +22,6 @@ module Pwrake
22
22
  standard_exception_handling do
23
23
  init("pwrake") # <- parse options here
24
24
  @role = @master = Master.new
25
- t = Time.now
26
25
  t = Pwrake.clock
27
26
  @master.setup_branches
28
27
  load_rakefile
@@ -1,4 +1,3 @@
1
- require "parallel/processor_count.rb"
2
1
  require "pwrake/nbio"
3
2
  require "pwrake/branch/fiber_queue"
4
3
  require "pwrake/worker/writer"
@@ -1,7 +1,6 @@
1
1
  require "pathname"
2
2
  require "yaml"
3
3
  require "socket"
4
- require "parallel"
5
4
  require "pwrake/option/host_map"
6
5
 
7
6
  module Pwrake
@@ -10,7 +10,6 @@ module Pwrake
10
10
 
11
11
  def set_filesystem_option
12
12
  @worker_progs = %w[
13
- parallel/processor_count.rb
14
13
  pwrake/nbio
15
14
  pwrake/branch/fiber_queue
16
15
  pwrake/worker/writer
@@ -39,7 +39,6 @@ module Pwrake
39
39
  :single_mp => self['GFARM_SINGLE_MP']
40
40
  }
41
41
  @worker_progs = %w[
42
- parallel/processor_count.rb
43
42
  pwrake/nbio
44
43
  pwrake/branch/fiber_queue
45
44
  pwrake/worker/writer
@@ -1,3 +1,3 @@
1
1
  module Pwrake
2
- VERSION = "2.2.6"
2
+ VERSION = "2.2.7"
3
3
  end
@@ -1,14 +1,13 @@
1
1
  require "socket"
2
+ require "etc"
2
3
 
3
4
  module Pwrake
4
5
 
5
6
  class Invoker
6
- begin
7
- # use Michael Grosser's Parallel module
8
- # https://github.com/grosser/parallel
9
- include Parallel::ProcessorCount
10
- rescue
11
- def processor_count
7
+ def processor_count
8
+ begin
9
+ Etc.nprocessors
10
+ rescue
12
11
  # only for Linux
13
12
  IO.read("/proc/cpuinfo").scan(/^processor/).size
14
13
  end
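The hunk above replaces the `parallel` gem with `Etc.nprocessors` from Ruby's standard library and keeps the `/proc/cpuinfo` scan as a fallback. A minimal standalone sketch of that idea follows; the top-level method and the specific exception classes in the `rescue` clause are assumptions for illustration (the diff uses a bare `rescue` inside `Pwrake::Invoker`).

```ruby
require "etc"

# Return the number of processors visible to this Ruby process.
def processor_count
  Etc.nprocessors
rescue NoMethodError, NotImplementedError
  # Fallback for Linux only: count "processor" entries in /proc/cpuinfo.
  IO.read("/proc/cpuinfo").scan(/^processor/).size
end

puts "processors: #{processor_count}"
```

Because `Etc` ships with Ruby, this removes the need for a runtime gem dependency just to count cores.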
data/pwrake.gemspec CHANGED
@@ -8,8 +8,8 @@ Gem::Specification.new do |gem|
8
8
  gem.version = Pwrake::VERSION
9
9
  gem.authors = ["Masahiro TANAKA"]
10
10
  gem.email = ["masa16.tanaka@gmail.com"]
11
- gem.summary = %q{Parallel Workflow engine based on Rake}
12
- gem.description = %q{Parallel Workflow engine based on Rake, runs on multicores, clusters, clouds}
11
+ gem.summary = %q{Parallel and distributed Rake for workflow execution on multicores, clusters, clouds.}
12
+ gem.description = %q{Parallel and distributed Rake for workflow execution on multicores, clusters, clouds using SSH. It has locality-aware scheduling designed for Gfarm file system.}
13
13
  gem.homepage = "http://masa16.github.com/pwrake"
14
14
  gem.license = 'MIT'
15
15
 
@@ -17,5 +17,5 @@ Gem::Specification.new do |gem|
17
17
  gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
18
18
  gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
19
19
  gem.require_paths = ["lib"]
20
- gem.add_runtime_dependency 'parallel', '>= 1.2.4'
20
+ gem.required_ruby_version = '>= 2.2.3'
21
21
  end
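The gemspec change above adds `required_ruby_version = '>= 2.2.3'`. As a rough illustration of how such a constraint is evaluated, the sketch below checks version strings against the same requirement using RubyGems' own classes; the helper name `ruby_supported?` is hypothetical and not part of Pwrake.

```ruby
require "rubygems"

# The constraint declared in the gemspec diff.
PWRAKE_RUBY_REQUIREMENT = Gem::Requirement.new(">= 2.2.3")

# Hypothetical helper: true when the given Ruby version satisfies the constraint.
def ruby_supported?(version_string)
  PWRAKE_RUBY_REQUIREMENT.satisfied_by?(Gem::Version.new(version_string))
end

puts "running Ruby #{RUBY_VERSION}: supported=#{ruby_supported?(RUBY_VERSION)}"
```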
metadata CHANGED
@@ -1,31 +1,17 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: pwrake
3
3
  version: !ruby/object:Gem::Version
4
- version: 2.2.6
4
+ version: 2.2.7
5
5
  platform: ruby
6
6
  authors:
7
7
  - Masahiro TANAKA
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2018-04-15 00:00:00.000000000 Z
12
- dependencies:
13
- - !ruby/object:Gem::Dependency
14
- name: parallel
15
- requirement: !ruby/object:Gem::Requirement
16
- requirements:
17
- - - ">="
18
- - !ruby/object:Gem::Version
19
- version: 1.2.4
20
- type: :runtime
21
- prerelease: false
22
- version_requirements: !ruby/object:Gem::Requirement
23
- requirements:
24
- - - ">="
25
- - !ruby/object:Gem::Version
26
- version: 1.2.4
27
- description: Parallel Workflow engine based on Rake, runs on multicores, clusters,
28
- clouds
11
+ date: 2018-11-30 00:00:00.000000000 Z
12
+ dependencies: []
13
+ description: Parallel and distributed Rake for workflow execution on multicores, clusters,
14
+ clouds using SSH. It has locality-aware scheduling designed for Gfarm file system.
29
15
  email:
30
16
  - masa16.tanaka@gmail.com
31
17
  executables:
@@ -128,7 +114,7 @@ required_ruby_version: !ruby/object:Gem::Requirement
128
114
  requirements:
129
115
  - - ">="
130
116
  - !ruby/object:Gem::Version
131
- version: '0'
117
+ version: 2.2.3
132
118
  required_rubygems_version: !ruby/object:Gem::Requirement
133
119
  requirements:
134
120
  - - ">="
@@ -136,10 +122,11 @@ required_rubygems_version: !ruby/object:Gem::Requirement
136
122
  version: '0'
137
123
  requirements: []
138
124
  rubyforge_project:
139
- rubygems_version: 2.7.3
125
+ rubygems_version: 2.7.6
140
126
  signing_key:
141
127
  specification_version: 4
142
- summary: Parallel Workflow engine based on Rake
128
+ summary: Parallel and distributed Rake for workflow execution on multicores, clusters,
129
+ clouds.
143
130
  test_files:
144
131
  - spec/001/Rakefile
145
132
  - spec/002/Rakefile