mobilize-base 1.351 → 1.361
- data/README.md +1 -672
- data/lib/mobilize-base/extensions/array.rb +8 -3
- data/lib/mobilize-base/extensions/google_drive/acl.rb +1 -1
- data/lib/mobilize-base/extensions/google_drive/client_login_fetcher.rb +2 -1
- data/lib/mobilize-base/extensions/string.rb +1 -1
- data/lib/mobilize-base/extensions/time.rb +20 -0
- data/lib/mobilize-base/handlers/gdrive.rb +12 -0
- data/lib/mobilize-base/handlers/gfile.rb +59 -7
- data/lib/mobilize-base/handlers/gsheet.rb +37 -27
- data/lib/mobilize-base/handlers/resque.rb +1 -4
- data/lib/mobilize-base/models/job.rb +84 -41
- data/lib/mobilize-base/models/runner.rb +13 -4
- data/lib/mobilize-base/models/stage.rb +4 -2
- data/lib/mobilize-base/tasks.rb +50 -13
- data/lib/mobilize-base/version.rb +1 -1
- data/lib/mobilize-base.rb +7 -0
- data/lib/samples/gdrive.yml +9 -0
- data/lib/samples/gfile.yml +9 -0
- data/test/fixtures/base1_stage1.in.yml +10 -0
- data/test/fixtures/integration_expected.yml +25 -0
- data/test/fixtures/integration_jobs.yml +12 -0
- data/test/fixtures/is_due.yml +97 -0
- data/test/integration/mobilize-base_test.rb +57 -0
- data/test/test_helper.rb +98 -19
- data/test/unit/mobilize-base_test.rb +33 -0
- metadata +18 -10
- data/test/base_job_rows.yml +0 -15
- data/test/mobilize-base_test.rb +0 -65
- data/test/test_base_1.yml +0 -3
data/README.md
CHANGED
@@ -1,675 +1,4 @@
|
|
1
1
|
Mobilize
|
2
2
|
========
|
3
3
|
|
4
|
-
|
5
|
-
* a Google Spreadsheets UI through [google-drive-ruby][google_drive_ruby];
|
6
|
-
* a queue manager through [Resque][resque];
|
7
|
-
* a persistent caching / database layer through [Mongoid][mongoid];
|
8
|
-
* gems for data transfers to/from Hive, mySQL, and HTTP endpoints
|
9
|
-
(coming soon).
|
10
|
-
|
11
|
-
Mobilize-Base includes all the core scheduling and processing
|
12
|
-
functionality, allowing you to:
|
13
|
-
* put workers on the Mobilize Resque queue.
|
14
|
-
* create [Users](#section_Start_Users_User) and their associated Google Spreadsheet [Runners](#section_Start_Users_Runner);
|
15
|
-
* poll for [Jobs](#section_Job) on Runners (currently gsheet to gsheet only) and add them to Resque;
|
16
|
-
* monitor the status of Jobs on a rolling log.
|
17
|
-
|
18
|
-
Table Of Contents
|
19
|
-
-----------------
|
20
|
-
* [Overview](#section_Overview)
|
21
|
-
* [Install](#section_Install)
|
22
|
-
* [Redis](#section_Install_Redis)
|
23
|
-
* [MongoDB](#section_Install_MongoDB)
|
24
|
-
* [Mobilize-Base](#section_Install_Mobilize-Base)
|
25
|
-
* [Default Folders and Files](#section_Install_Folders_and_Files)
|
26
|
-
* [Configure](#section_Configure)
|
27
|
-
* [Google Drive](#section_Configure_Google_Drive)
|
28
|
-
* [Google Sheets](#section_Configure_Google_Sheets)
|
29
|
-
* [Jobtracker](#section_Configure_Jobtracker)
|
30
|
-
* [Resque](#section_Configure_Resque)
|
31
|
-
* [Resque-Web](#section_Configure_Resque-Web)
|
32
|
-
* [Gridfs](#section_Configure_Gridfs)
|
33
|
-
* [Mongoid](#section_Configure_Mongoid)
|
34
|
-
* [Start](#section_Start)
|
35
|
-
* [Start Resque-Web](#section_Start_Start_Resque-Web)
|
36
|
-
* [Set Environment](#section_Start_Set_Environment)
|
37
|
-
* [Create User](#section_Start_Create_User)
|
38
|
-
* [Start Workers](#section_Start_Start_Workers)
|
39
|
-
* [View Logs](#section_Start_View_Logs)
|
40
|
-
* [Start Jobtracker](#section_Start_Start_Jobtracker)
|
41
|
-
* [Create Job](#section_Start_Create_Job)
|
42
|
-
* [Run Test](#section_Start_Run_Test)
|
43
|
-
* [Add Gbooks and Gsheets](#section_Start_Add_Gbooks_And_Gsheets)
|
44
|
-
* [Meta](#section_Meta)
|
45
|
-
* [Author](#section_Author)
|
46
|
-
* [Special Thanks](#section_Special_Thanks)
|
47
|
-
|
48
|
-
|
49
|
-
<a name='section_Overview'></a>
|
50
|
-
Overview
|
51
|
-
-----------
|
52
|
-
|
53
|
-
* Mobilize is a script deployment and data visualization framework with
|
54
|
-
a Google Spreadsheets UI.
|
55
|
-
* Mobilize uses Resque for parallelization and queuing, MongoDB for caching,
|
56
|
-
and Google Drive for hosting, user input and display.
|
57
|
-
* The [mobilize-ssh][mobilize-ssh] gem allows you to run scripts and
|
58
|
-
copy files between different machines, and have output directed to a
|
59
|
-
spreadsheet for viewing and processing.
|
60
|
-
* The platform is easily extensible: add your own rake tasks and
|
61
|
-
handlers by following a few simple conventions, and you can have your own
|
62
|
-
Mobilize gem up and running in no time.
|
63
|
-
|
64
|
-
<a name='section_Install'></a>
|
65
|
-
Install
|
66
|
-
------------
|
67
|
-
|
68
|
-
Mobilize requires Ruby 1.9.3, and has been tested on OSX and Ubuntu.
|
69
|
-
|
70
|
-
[RVM][rvm] is great for managing your rubies.
|
71
|
-
|
72
|
-
<a name='section_Install_Redis'></a>
|
73
|
-
### Redis
|
74
|
-
|
75
|
-
Redis is a pre-requisite for running Resque.
|
76
|
-
|
77
|
-
Please refer to the [Resque Redis Section][redis] for complete
|
78
|
-
instructions.
|
79
|
-
|
80
|
-
<a name='section_Install_MongoDB'></a>
|
81
|
-
### MongoDB
|
82
|
-
|
83
|
-
MongoDB is used to persist caches between reads and writes, keep track
|
84
|
-
of Users and Jobs, and store Datasets that map to endpoints.
|
85
|
-
|
86
|
-
Please refer to the [MongoDB Quickstart Page][mongodb_quickstart] to get started.
|
87
|
-
|
88
|
-
The settings for database and port are set in config/mongoid.yml
|
89
|
-
and are best left as default. Please refer to [Configure
|
90
|
-
Mongoid](#section_Configure_Mongoid) for details.
|
91
|
-
|
92
|
-
<a name='section_Install_Mobilize-Base'></a>
|
93
|
-
### Mobilize-Base
|
94
|
-
|
95
|
-
Mobilize-Base contains all of the gems it needs to run.
|
96
|
-
|
97
|
-
add this to your Gemfile:
|
98
|
-
|
99
|
-
``` ruby
|
100
|
-
gem "mobilize-base"
|
101
|
-
```
|
102
|
-
|
103
|
-
or do
|
104
|
-
|
105
|
-
$ gem install mobilize-base
|
106
|
-
|
107
|
-
for a ruby-wide install.
|
108
|
-
|
109
|
-
<a name='section_Install_Folders_and_Files'></a>
|
110
|
-
### Folders and Files
|
111
|
-
|
112
|
-
Mobilize requires a config folder and a log folder.
|
113
|
-
|
114
|
-
If you're on Rails, it will use the built-in config and log folders.
|
115
|
-
|
116
|
-
Otherwise, it will use log and config folders in the project folder (the
|
117
|
-
same one that contains your Rakefile)
|
118
|
-
|
119
|
-
### Rakefile
|
120
|
-
|
121
|
-
Inside the Rakefile in your project's root folder, make sure you have:
|
122
|
-
|
123
|
-
``` ruby
|
124
|
-
require 'mobilize-base/tasks'
|
125
|
-
```
|
126
|
-
|
127
|
-
This defines rake tasks essential to run the environment.
|
128
|
-
|
129
|
-
### Config and Log Folders
|
130
|
-
|
131
|
-
run
|
132
|
-
|
133
|
-
$ rake mobilize_base:setup
|
134
|
-
|
135
|
-
Mobilize will create config/mobilize/ and log/ folders at the project root
|
136
|
-
level. (same as the Rakefile).
|
137
|
-
|
138
|
-
(You can override these by passing
|
139
|
-
MOBILIZE_CONFIG_DIR and/or MOBILIZE_LOG_DIR arguments to the command.
|
140
|
-
All directories must end with a '/'.)
|
141
|
-
|
142
|
-
The script will also create samples for all required config files, which are detailed below.
|
143
|
-
|
144
|
-
Resque will create a mobilize-resque-`<environment>`.log in the log folder,
|
145
|
-
and loop over 10 files, 10MB each.
|
146
|
-
|
147
|
-
<a name='section_Configure'></a>
|
148
|
-
Configure
|
149
|
-
------------
|
150
|
-
|
151
|
-
All Mobilize configurations live in files in `config/mobilize/*.yml` by
|
152
|
-
default. Samples can
|
153
|
-
be found below or on github in the [lib/samples][git_samples] folder.
|
154
|
-
|
155
|
-
<a name='section_Configure_Google_Drive'></a>
|
156
|
-
### Configure Google Drive
|
157
|
-
|
158
|
-
gdrive.yml needs:
|
159
|
-
* a domain, which can be gmail.com but may be different depending on
|
160
|
-
your organization. All gdrive accounts should have
|
161
|
-
the same domain, and all Users should have emails in this domain.
|
162
|
-
* an owner name and password. You can set up separate owners
|
163
|
-
for different environments as in the below file, which will keep your
|
164
|
-
mission critical workers from getting rate-limit errors.
|
165
|
-
* one admin_group_name, which the owner and all admins should be added to -- this
|
166
|
-
group will need read permissions to read from and edit permissions to write
|
167
|
-
to files.
|
168
|
-
* one or more admins with email attributes -- these will be for people
|
169
|
-
who should be given write permissions to all Mobilize books in the
|
170
|
-
environment for maintenance purposes.
|
171
|
-
* one worker_group_name, which the owner and all workers should be added to -- this
|
172
|
-
group will need read permissions to read from and edit permissions to write
|
173
|
-
to files.
|
174
|
-
* one or more workers with name and pw attributes -- they will be used
|
175
|
-
to queue up google reads and writes. This can be the same as the owner
|
176
|
-
account for testing purposes or low-volume environments.
|
177
|
-
|
178
|
-
__Mobilize only allows one Resque
|
179
|
-
worker at a time to use a Google drive worker account for
|
180
|
-
reading/writing, which is called a gdrive_slot.__
|
181
|
-
|
182
|
-
Sample gdrive.yml:
|
183
|
-
|
184
|
-
``` yml
|
185
|
-
---
|
186
|
-
development:
|
187
|
-
domain: host.com
|
188
|
-
owner:
|
189
|
-
name: owner_development
|
190
|
-
pw: google_drive_password
|
191
|
-
admin_group_name: admins_development
|
192
|
-
admins:
|
193
|
-
- name: admin
|
194
|
-
worker_group_name: workers_development
|
195
|
-
workers:
|
196
|
-
- name: worker_development001
|
197
|
-
pw: worker001_google_drive_password
|
198
|
-
- name: worker_development002
|
199
|
-
pw: worker002_google_drive_password
|
200
|
-
test:
|
201
|
-
domain: host.com
|
202
|
-
owner:
|
203
|
-
name: owner_test
|
204
|
-
pw: google_drive_password
|
205
|
-
admin_group_name: admins_test
|
206
|
-
admins:
|
207
|
-
- name: admin
|
208
|
-
worker_group_name: workers_test
|
209
|
-
workers:
|
210
|
-
- name: worker_test001
|
211
|
-
pw: worker001_google_drive_password
|
212
|
-
- name: worker_test002
|
213
|
-
pw: worker002_google_drive_password
|
214
|
-
production:
|
215
|
-
domain: host.com
|
216
|
-
owner:
|
217
|
-
name: owner_production
|
218
|
-
pw: google_drive_password
|
219
|
-
admin_group_name: admins_production
|
220
|
-
admins:
|
221
|
-
- name: admin
|
222
|
-
worker_group_name: workers_production
|
223
|
-
workers:
|
224
|
-
- name: worker_production001
|
225
|
-
pw: worker001_google_drive_password
|
226
|
-
- name: worker_production002
|
227
|
-
pw: worker002_google_drive_password
|
228
|
-
```
|
229
|
-
|
230
|
-
<a name='section_Configure_Google_Sheets'></a>
|
231
|
-
### Configure Google Sheets
|
232
|
-
|
233
|
-
gsheet.yml needs:
|
234
|
-
* max_cells, which is the number of cells a sheet is allowed to have
|
235
|
-
written to it at one time. Default is 50k cells, which is about how
|
236
|
-
much you can write before things start breaking.
|
237
|
-
* Because Google Docs ties date formatting to the Locale for the
|
238
|
-
spreadsheet, there are 2 date format parameters:
|
239
|
-
* read_date_format, which is the format that should be read FROM google
|
240
|
-
sheets for date columns.
|
241
|
-
* sheet_date_format, which is the format that the google sheet is in.
|
242
|
-
* A date column is defined as one where the column header = "date" or "Date", or ends with "_date" or "Date".
|
243
|
-
* The defaults are set to US locale for sheet_date_format, because in 'Murica (US) we
|
244
|
-
use %m/%d/%Y for some reason, and to %Y-%m-%d format for
|
245
|
-
reading, which is more standard and sorts well as a string. If your
|
246
|
-
locale is NOT 'Murica you will want to change these.
|
247
|
-
|
248
|
-
Sample gsheet.yml
|
249
|
-
|
250
|
-
``` yml
|
251
|
-
---
|
252
|
-
development:
|
253
|
-
max_cells: 400000
|
254
|
-
read_date_format: "%Y-%m-%d"
|
255
|
-
sheet_date_format: "%m/%d/%Y"
|
256
|
-
test:
|
257
|
-
max_cells: 400000
|
258
|
-
read_date_format: "%Y-%m-%d"
|
259
|
-
sheet_date_format: "%m/%d/%Y"
|
260
|
-
staging:
|
261
|
-
max_cells: 400000
|
262
|
-
read_date_format: "%Y-%m-%d"
|
263
|
-
sheet_date_format: "%m/%d/%Y"
|
264
|
-
production:
|
265
|
-
max_cells: 400000
|
266
|
-
read_date_format: "%Y-%m-%d"
|
267
|
-
sheet_date_format: "%m/%d/%Y"
|
268
|
-
```
|
269
|
-
|
270
|
-
<a name='section_Configure_Jobtracker'></a>
|
271
|
-
### Configure Jobtracker
|
272
|
-
|
273
|
-
The Jobtracker sits on your Resque and does 2 things:
|
274
|
-
* check for Users that are due for polling;
|
275
|
-
* send out notifications when:
|
276
|
-
* there are failed jobs on Resque;
|
277
|
-
* there are jobs on Resque that have run beyond the max run time.
|
278
|
-
|
279
|
-
Emails are sent using ActionMailer, through the owner Google Drive
|
280
|
-
account.
|
281
|
-
* errors are sent to the owner of the job/stage as well as the admin group.
|
282
|
-
* errors not specific to a job/stage are sent to the admin group only, as
|
283
|
-
given in gdrive.yml
|
284
|
-
|
285
|
-
To this end, it needs these parameters, for which there is a sample
|
286
|
-
below and in the [lib/samples][git_samples] folder:
|
287
|
-
|
288
|
-
``` yml
|
289
|
-
---
|
290
|
-
development:
|
291
|
-
cycle_freq: 10 #time between Jobtracker sweeps
|
292
|
-
notification_freq: 3600 #1 hour between failure/timeout notifications
|
293
|
-
runner_read_freq: 300 #5 min between runner reads
|
294
|
-
max_run_time: 14400 # if a job runs for 4h+, notification will be sent
|
295
|
-
extensions: [] #additional Mobilize modules to load workers with
|
296
|
-
test:
|
297
|
-
cycle_freq: 10 #time between Jobtracker sweeps
|
298
|
-
notification_freq: 3600 #1 hour between failure/timeout notifications
|
299
|
-
runner_read_freq: 300 #5 min between runner reads
|
300
|
-
max_run_time: 14400 # if a job runs for 4h+, notification will be sent
|
301
|
-
extensions: [] #additional Mobilize modules to load workers with
|
302
|
-
production:
|
303
|
-
cycle_freq: 10 #time between Jobtracker sweeps
|
304
|
-
notification_freq: 3600 #1 hour between failure/timeout notifications
|
305
|
-
runner_read_freq: 300 #5 min between runner reads
|
306
|
-
max_run_time: 14400 # if a job runs for 4h+, notification will be sent
|
307
|
-
extensions: [] #additional Mobilize modules to load workers with
|
308
|
-
```
|
309
|
-
|
310
|
-
<a name='section_Configure_Resque'></a>
|
311
|
-
### Configure Resque
|
312
|
-
|
313
|
-
Resque keeps track of Jobs, Workers and logging.
|
314
|
-
|
315
|
-
It needs the below parameters, which can be found in the [lib/samples][git_samples] folder.
|
316
|
-
|
317
|
-
* queue_name - the name of the Resque queue where you would like the Jobtracker and Resque Workers to
|
318
|
-
run. Default is mobilize.
|
319
|
-
* max_workers - the total number of simultaneous workers you would like
|
320
|
-
on your queue. Default is 4 for development and test, 36 in
|
321
|
-
production, but feel free to adjust depending on your hardware.
|
322
|
-
* redis_port - you should probably leave this alone, it specifies the
|
323
|
-
default port for dev and prod and a separate one for testing.
|
324
|
-
* web_port - this specifies the port under which resque-web operates
|
325
|
-
|
326
|
-
``` yml
|
327
|
-
---
|
328
|
-
development:
|
329
|
-
queue_name: mobilize
|
330
|
-
max_workers: 4
|
331
|
-
redis_port: 6379
|
332
|
-
web_port: 8282
|
333
|
-
test:
|
334
|
-
queue_name: mobilize
|
335
|
-
max_workers: 4
|
336
|
-
redis_port: 9736
|
337
|
-
web_port: 8282
|
338
|
-
production:
|
339
|
-
queue_name: mobilize
|
340
|
-
max_workers: 36
|
341
|
-
redis_port: 6379
|
342
|
-
web_port: 8282
|
343
|
-
```
|
344
|
-
|
345
|
-
<a name='section_Configure_Resque-Web'></a>
|
346
|
-
### Configure Resque-Web
|
347
|
-
|
348
|
-
Please change your default username and password in the resque_web.rb
|
349
|
-
file in your config folder, reproduced below:
|
350
|
-
|
351
|
-
``` ruby
|
352
|
-
#comment out the below if you want no authentication on your web portal (not recommended)
|
353
|
-
Resque::Server.use(Rack::Auth::Basic) do |user, password|
|
354
|
-
[user, password] == ['admin', 'changeyourpassword']
|
355
|
-
end
|
356
|
-
```
|
357
|
-
|
358
|
-
This file is passed as a config file argument to
|
359
|
-
mobilize_base:resque_web task, as detailed in [Start Resque-Web](#section_Start_Start_Resque-Web).
|
360
|
-
|
361
|
-
<a name='section_Configure_Gridfs'></a>
|
362
|
-
### Configure Gridfs
|
363
|
-
|
364
|
-
Mobilize stores cached data in MongoDB Gridfs.
|
365
|
-
It needs the below parameters, which can be found in the [lib/samples][git_samples] folder.
|
366
|
-
|
367
|
-
* max_compressed_write_size - the amount of compressed data Gridfs will
|
368
|
-
allow. If you try to write more than this, an exception will be thrown.
|
369
|
-
|
370
|
-
``` yml
|
371
|
-
---
|
372
|
-
development:
|
373
|
-
max_compressed_write_size: 1000000000 #~1GB
|
374
|
-
test:
|
375
|
-
max_compressed_write_size: 1000000000 #~1GB
|
376
|
-
production:
|
377
|
-
max_compressed_write_size: 1000000000 #~1GB
|
378
|
-
```
|
379
|
-
|
380
|
-
<a name='section_Configure_Mongoid'></a>
|
381
|
-
### Configure Mongoid
|
382
|
-
|
383
|
-
Mongoid is the abstraction layer on top of MongoDB so we can interact
|
384
|
-
with it in an ActiveRecord-like fashion.
|
385
|
-
|
386
|
-
It needs the below parameters, which can be found in the [lib/samples][git_samples] folder.
|
387
|
-
|
388
|
-
You shouldn't need to change anything in this file.
|
389
|
-
|
390
|
-
``` yml
|
391
|
-
---
|
392
|
-
development:
|
393
|
-
sessions:
|
394
|
-
default:
|
395
|
-
database: mobilize-development
|
396
|
-
persist_in_safe_mode: true
|
397
|
-
hosts:
|
398
|
-
- 127.0.0.1:27017
|
399
|
-
test:
|
400
|
-
sessions:
|
401
|
-
default:
|
402
|
-
database: mobilize-test
|
403
|
-
persist_in_safe_mode: true
|
404
|
-
hosts:
|
405
|
-
- 127.0.0.1:27017
|
406
|
-
production:
|
407
|
-
sessions:
|
408
|
-
default:
|
409
|
-
database: mobilize-production
|
410
|
-
persist_in_safe_mode: true
|
411
|
-
hosts:
|
412
|
-
- 127.0.0.1:27017
|
413
|
-
```
|
414
|
-
|
415
|
-
<a name='section_Start'></a>
|
416
|
-
Start
|
417
|
-
-----
|
418
|
-
|
419
|
-
A Mobilize instance can be considered "started" or "running" when you have:
|
420
|
-
|
421
|
-
1. Resque workers running on the Mobilize queue;
|
422
|
-
2. A Jobtracker running on one of the Resque workers;
|
423
|
-
3. One or more Users created in your MongoDB;
|
424
|
-
4. One or more Jobs created in a User's Runner;
|
425
|
-
|
426
|
-
<a name='section_Start_Start_resque-web'></a>
|
427
|
-
### Start resque-web
|
428
|
-
|
429
|
-
Mobilize ships with its own rake task to start resque web -- you can do
|
430
|
-
the following:
|
431
|
-
|
432
|
-
|
433
|
-
$ MOBILIZE_ENV=<environment> rake mobilize_base:resque_web
|
434
|
-
|
435
|
-
This will start a resque_web instance with the port specified in your
|
436
|
-
resque.yml and the config/auth scheme specified in your resque_web.rb.
|
437
|
-
|
438
|
-
More detail on the
|
439
|
-
[Resque-Web Standalone section][resque-web].
|
440
|
-
|
441
|
-
<a name='section_Start_Set_Environment'></a>
|
442
|
-
### Set Environment
|
443
|
-
|
444
|
-
Mobilize takes the environment from your Rails.env if you're running
|
445
|
-
Rails, or assumes "development." You can specify "development", "test",
|
446
|
-
or "production," as per the yml files.
|
447
|
-
|
448
|
-
Otherwise, it takes it from MOBILIZE_ENV parameter, as in:
|
449
|
-
|
450
|
-
``` ruby
|
451
|
-
> ENV['MOBILIZE_ENV'] = 'production'
|
452
|
-
> require 'mobilize-base'
|
453
|
-
```
|
454
|
-
This affects all parameters as set in the yml files, including the
|
455
|
-
database.
|
456
|
-
|
457
|
-
<a name='section_Start_Create_User'></a>
|
458
|
-
### Create User
|
459
|
-
|
460
|
-
Users are people who use the Mobilize service to move data from one
|
461
|
-
endpoint to another. They each have a Runner, which is a google sheet
|
462
|
-
that contains one or more Jobs.
|
463
|
-
|
464
|
-
To create a requestor, use the User.find_or_create_by_name
|
465
|
-
command (replace the user with your own name, or any name
|
466
|
-
in your domain).
|
467
|
-
|
468
|
-
``` ruby
|
469
|
-
irb> User.find_or_create_by_name("user_name")
|
470
|
-
```
|
471
|
-
|
472
|
-
<a name='section_Start_Start_Workers'></a>
|
473
|
-
### Start Workers
|
474
|
-
|
475
|
-
Workers are rake tasks that load the Mobilize environment and allow the
|
476
|
-
processing of the Jobtracker, Users and Jobs.
|
477
|
-
|
478
|
-
These will start as many workers as are defined in your resque.yml.
|
479
|
-
|
480
|
-
To start workers, do:
|
481
|
-
|
482
|
-
``` ruby
|
483
|
-
> Jobtracker.prep_workers
|
484
|
-
```
|
485
|
-
|
486
|
-
if you have workers already running and would like to kill and refresh
|
487
|
-
them, do:
|
488
|
-
|
489
|
-
``` ruby
|
490
|
-
> Jobtracker.restart_workers!
|
491
|
-
```
|
492
|
-
|
493
|
-
Note that restart will kill any workers on the Mobilize queue.
|
494
|
-
|
495
|
-
<a name='section_Start_View_Logs'></a>
|
496
|
-
### View Logs
|
497
|
-
|
498
|
-
at this point, you'll want to start viewing the logs for the Resque
|
499
|
-
workers -- they will be stored under your log folder, by default log/. You can do:
|
500
|
-
|
501
|
-
$ tail -f log/mobilize-`<environment>`.log
|
502
|
-
|
503
|
-
to view them.
|
504
|
-
|
505
|
-
<a name='section_Start_Start_Jobtracker'></a>
|
506
|
-
### Start Jobtracker
|
507
|
-
|
508
|
-
Once the Resque workers are running, and you have at least one User
|
509
|
-
set up, it's time to start the Jobtracker:
|
510
|
-
|
511
|
-
``` ruby
|
512
|
-
> Jobtracker.start
|
513
|
-
```
|
514
|
-
|
515
|
-
The Jobtracker will automatically enqueue any Users that have not
|
516
|
-
been processed in the requestor_refresh period defined in the
|
517
|
-
jobtracker.yml, and create their Runners if they do not exist. You can
|
518
|
-
see this process on your Resque UI and in the log file.
|
519
|
-
|
520
|
-
<a name='section_Start_Create_Job'></a>
|
521
|
-
### Create Job
|
522
|
-
|
523
|
-
Now it's time to go onto the Runner and add a Job to be processed.
|
524
|
-
|
525
|
-
To do this, you should log into your Google Drive with either the
|
526
|
-
owner's account, an admin account, or the Runner User's account. These
|
527
|
-
will be the accounts with edit permissions to a given Runner.
|
528
|
-
|
529
|
-
Navigate to the Jobs tab on the Runner `(denoted by Runner(<requestor
|
530
|
-
name>))` and enter values under each header:
|
531
|
-
|
532
|
-
* name This is the name of the job you would like to add. Names must be unique across all your jobs, otherwise you will get an error
|
533
|
-
|
534
|
-
* active set this to blank or FALSE if you want to turn off a job
|
535
|
-
|
536
|
-
* trigger This uses human readable syntax to schedule jobs. It accepts the following:
|
537
|
-
* every `<integer>` hour -- fire the job at increments of `<integer>` hours, minimum of 1 hour
|
538
|
-
* every `<integer>` day -- fire the job at increments of `<integer>` days, minimum of 1
|
539
|
-
* every `<integer>` day after <HH:MM> -- fire the job at increments of <integer> days, after HH:MM UTC time
|
540
|
-
* every `<integer>` day_of_week after <HH:MM> -- fire the job on specified day of week, after HH:MM UTC time; 1=Sunday
|
541
|
-
* every `<integer>` day_of_month after <HH:MM> -- fire the job on specified day of month, after HH:MM UTC time
|
542
|
-
* once -- fire the job once if active is set to TRUE, set active to FALSE right after
|
543
|
-
* after `<jobname>` -- fire the job after the job named `<jobname>`
|
544
|
-
|
545
|
-
* status Mobilize writes this field with the last status returned by the job
|
546
|
-
|
547
|
-
* stage1..stage5 List of stages to be performed by the job.
|
548
|
-
* Stages have this syntax: `<handler>.<call> <params>`.
|
549
|
-
* handler specifies the file that should receive the stage
|
550
|
-
* the call specifies the method within the file. The method should
|
551
|
-
be called `"<handler>.<call>_by_stage_path"`
|
552
|
-
* the params the method accepts, which are custom to each
|
553
|
-
stage. These should be of the form `<key1>: <value1>, <key2>: <value2>`, where
|
554
|
-
`<key>` is an unquoted string and `<value>` is a quoted string, an
|
555
|
-
integer, an array (delimited by square braces), or a hash (delimited by
|
556
|
-
curly braces).
|
557
|
-
* For mobilize-base, the following stage is available:
|
558
|
-
* gsheet.write `source: <input_path>`, which reads the sheet.
|
559
|
-
* The input_path should be of the form:
|
560
|
-
* `<gbook_name>/<gsheet_name>` or just `<gsheet_name>` if the target is in
|
561
|
-
the Runner itself.
|
562
|
-
* `gfile://<gfile_name>` if the target is a file.
|
563
|
-
* The file must be owned by the Gdrive owner.
|
564
|
-
* The test uses "gfile://test_base_1.tsv".
|
565
|
-
* The stage_name should be of the form `<stage_column>`. The test uses "stage1" for the first test
|
566
|
-
and "base1.out" for the second test. The first
|
567
|
-
takes the output from the first stage and the second reads it straight
|
568
|
-
from the referenced sheet.
|
569
|
-
* All stages accept retry parameters:
|
570
|
-
* retries: an integer specifying the number of times that the system will try it again before giving up.
|
571
|
-
* delay: an integer specifying the number of seconds between retries.
|
572
|
-
* always_on: if false, turns the job off on stage failures.
|
573
|
-
Otherwise the job will retry from the beginning with the same frequency as the Runner refresh rate.
|
574
|
-
* notify: by default, the stage owner will be notified on failure.
|
575
|
-
* if false, will not notify the stage owner in the event of a failure.
|
576
|
-
* If it's an email address, will email the specified person.
|
577
|
-
* If a stage fails after all retries, it will output its standard error to a tab in the Runner with the name of the job, the name of the stage, and a ".err" extension
|
578
|
-
* The tab will be headed "response" and will contain the exception and backtrace for the error.
|
579
|
-
* The test uses "Requestor_mobilize(test)/base1.out" and
|
580
|
-
"Runner_mobilize(test)/base2.out" for target sheets.
|
581
|
-
|
582
|
-
<a name='section_Start_Run_Test'></a>
|
583
|
-
### Run Test
|
584
|
-
|
585
|
-
To run tests, you will need to
|
586
|
-
|
587
|
-
1) clone the repository
|
588
|
-
|
589
|
-
From the project folder, run
|
590
|
-
|
591
|
-
2) rake mobilize_base:setup
|
592
|
-
|
593
|
-
and populate the "test" environment in the config files with the
|
594
|
-
necessary details.
|
595
|
-
|
596
|
-
3) $ rake test
|
597
|
-
|
598
|
-
This will create a test Runner with a sample job. These will run off a
|
599
|
-
test redis instance which will be killed once the tests finish.
|
600
|
-
|
601
|
-
<a name='section_Start_'></a>
|
602
|
-
### Run Test
|
603
|
-
|
604
|
-
To run tests, you will need to
|
605
|
-
|
606
|
-
1) clone the repository
|
607
|
-
|
608
|
-
From the project folder, run
|
609
|
-
|
610
|
-
2) rake mobilize_base:setup
|
611
|
-
|
612
|
-
and populate the "test" environment in the config files with the
|
613
|
-
necessary details.
|
614
|
-
|
615
|
-
3) $ rake test
|
616
|
-
|
617
|
-
This will create a test Runner with a sample job. These will run off a
|
618
|
-
test redis instance. This instance will be kept alive so you can test
|
619
|
-
additional Mobilize modules. (see [mobilize-ssh][mobilize-ssh] for more)
|
620
|
-
|
621
|
-
<a name='section_Start_Add_Gbooks_And_Gsheets'></a>
|
622
|
-
### Add Gbooks and Gsheets
|
623
|
-
|
624
|
-
A User's Runner should be kept clean, preferably with only the jobs
|
625
|
-
sheet. The test keeps everything in the
|
626
|
-
Runner, but in reality you will want to create lots of different books
|
627
|
-
to share with different people in your organization.
|
628
|
-
|
629
|
-
To add a new Gbook, create one as you normally would, then make sure the
|
630
|
-
Owner is the same user as specified in your gdrive.yml/owner/name value.
|
631
|
-
Mobilize will handle the rest, extending permissions to workers and
|
632
|
-
admins.
|
633
|
-
|
634
|
-
Also make sure any Gsheets you specify for __read__ operations exist
|
635
|
-
prior to calling the job, or there will be an error. __Write__
|
636
|
-
operations will create the book and sheet if it does not already exist,
|
637
|
-
already under ownership of the owner account.
|
638
|
-
|
639
|
-
<a name='section_Meta'></a>
|
640
|
-
Meta
|
641
|
-
----
|
642
|
-
|
643
|
-
* Code: `git clone git://github.com/ngmoco/mobilize-base.git`
|
644
|
-
* Home: <https://github.com/ngmoco/mobilize-base>
|
645
|
-
* Bugs: <https://github.com/ngmoco/mobilize-base/issues>
|
646
|
-
* Gems: <http://rubygems.org/gems/mobilize-base>
|
647
|
-
|
648
|
-
<a name='section_Author'></a>
|
649
|
-
Author
|
650
|
-
------
|
651
|
-
|
652
|
-
Cassio Paes-Leme :: cpaesleme@ngmoco.com :: @cpaesleme
|
653
|
-
|
654
|
-
<a name='section_Special_Thanks'></a>
|
655
|
-
Special Thanks
|
656
|
-
--------------
|
657
|
-
|
658
|
-
* Al Thompson and Sagar Mehta for awesome design advice and discussions
|
659
|
-
* Elliott Clark for enlightening me to the wonders of Resque
|
660
|
-
* Bob Colner for pointing me to google-drive-ruby when I tried to
|
661
|
-
reinvent the wheel
|
662
|
-
* ngmoco:) and DeNA Global for supporting and adopting the Mobilize
|
663
|
-
platform
|
664
|
-
* gimite, defunkt, 10gen, and the countless other github heroes and
|
665
|
-
crewmembers.
|
666
|
-
|
667
|
-
[google_drive_ruby]: https://github.com/gimite/google-drive-ruby
|
668
|
-
[resque]: https://github.com/defunkt/resque
|
669
|
-
[mongoid]: http://mongoid.org/en/mongoid/index.html
|
670
|
-
[resque_redis]: https://github.com/defunkt/resque#section_Installing_Redis
|
671
|
-
[mongodb_quickstart]: http://www.mongodb.org/display/DOCS/Quickstart
|
672
|
-
[git_samples]: https://github.com/ngmoco/mobilize-base/tree/master/lib/samples
|
673
|
-
[rvm]: https://rvm.io/
|
674
|
-
[resque-web]: https://github.com/defunkt/resque#standalone
|
675
|
-
[mobilize-ssh]: https://github.com/ngmoco/mobilize-ssh
|
4
|
+
Please refer to the mobilize-server wiki: https://github.com/DeNA/mobilize-server/wiki
data/lib/mobilize-base/extensions/array.rb
CHANGED
@@ -12,11 +12,16 @@ class Array
     return self.inject{|sum,x| sum + x }
   end
   def hash_array_to_tsv
-
+    ha = self
+    if ha.first.nil? or ha.first.class!=Hash
       return ""
     end
-
-
+    max_row_length = ha.map{|h| h.keys.length}.max
+    header_keys = ha.select{|h| h.keys.length==max_row_length}.first.keys
+    header = header_keys.join("\t")
+    rows = ha.map do |r|
+      header_keys.map{|k| r[k]}.join("\t")
+    end
     ([header] + rows).join("\n")
   end
 end
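As a rough illustration of the `hash_array_to_tsv` change in the hunk above, the sketch below reimplements the added lines as a standalone method. Taking the array as an argument (rather than patching `Array`) and the `sample` data are assumptions made for the example; they are not part of the gem.

```ruby
# Standalone sketch of the added hash_array_to_tsv logic.
# Assumption: written as a free method taking the array as `ha`;
# the gem itself defines the method on Array.
def hash_array_to_tsv(ha)
  # Anything other than a non-empty array of hashes yields an empty string.
  return "" if ha.first.nil? || ha.first.class != Hash

  # Use the keys of the first hash that has the most keys as the header.
  max_row_length = ha.map { |h| h.keys.length }.max
  header_keys = ha.select { |h| h.keys.length == max_row_length }.first.keys
  header = header_keys.join("\t")

  # Render every row in header order; a missing key becomes an empty cell.
  rows = ha.map do |r|
    header_keys.map { |k| r[k] }.join("\t")
  end
  ([header] + rows).join("\n")
end

# Hypothetical sample data: the first row is missing the "count" key.
sample = [
  { "name" => "a" },
  { "name" => "b", "count" => 2 }
]

puts hash_array_to_tsv(sample)
```

Because each row is rendered by looking up `header_keys`, a hash that lacks a key produces an empty cell instead of shifting the later columns.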
data/lib/mobilize-base/extensions/google_drive/acl.rb
CHANGED
@@ -14,7 +14,7 @@ module GoogleDrive
       def push(entry)
         #do not send email notifications
         entry = AclEntry.new(entry) if entry.is_a?(Hash)
-        url_suffix = "?send-notification-emails=false"
+        url_suffix = ((@acls_feed_url.index("?") ? "&" : "?") + "send-notification-emails=false")
         header = {"GData-Version" => "3.0", "Content-Type" => "application/atom+xml"}
         doc = @session.request(:post, "#{@acls_feed_url}#{url_suffix}", :data => entry.to_xml(), :header => header, :auth => :writely)
         entry.params = entry_to_params(doc.root)
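The `url_suffix` change above picks the query-string separator according to whether the ACL feed URL already carries a query. A minimal sketch of that logic in isolation follows; `notification_suffix` and both example URLs are hypothetical names used only for illustration.

```ruby
# Sketch of the separator choice made by the new url_suffix line.
# Assumption: `notification_suffix` is a hypothetical helper, not a gem method.
def notification_suffix(feed_url)
  # String#index returns nil when "?" is absent, so "?" starts the query;
  # otherwise the parameter is appended with "&".
  separator = feed_url.index("?") ? "&" : "?"
  separator + "send-notification-emails=false"
end

# Hypothetical feed URLs, one without and one with an existing query string.
plain_url = "https://example.com/feeds/acl"
query_url = "https://example.com/feeds/acl?v=3"

puts plain_url + notification_suffix(plain_url)
puts query_url + notification_suffix(query_url)
```

With the previous hard-coded `?` prefix, a feed URL that already contained a query string would have ended up with two `?` characters.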