tootsie 0.9.0

@@ -0,0 +1,15 @@
1
+ *~
2
+ .DS_Store
3
+ /.bundle
4
+ /log
5
+ /tmp
6
+ /trash
7
+ /config/development.yml
8
+ /config/production.yml
9
+ .rvmrc
10
+ *.sublime-project
11
+ *.sublime-workspace
12
+ .rspec
13
+ /coverage
14
+ /pkg
15
+ /Gemfile.lock
data/Gemfile ADDED
@@ -0,0 +1,2 @@
1
+ source 'http://rubygems.org'
2
+ gemspec
data/License ADDED
@@ -0,0 +1,7 @@
1
+ Copyright © 2010, 2011 Alexander Staubo
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4
+
5
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6
+
7
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1,256 @@
1
+ Tootsie
2
+ =======
3
+
4
+ Tootsie (formerly called Tranz) is a simple audio/video/image transcoding/modification application written in Ruby. It can transcode audio, video and images between different formats, and also perform basic manipulations such as scaling.
5
+
6
+ Tootsie has the following external dependencies:
7
+
8
+ * FFmpeg for transcoding of video and audio.
9
+ * ImageMagick/GraphicsMagick for image conversion.
10
+ * Amazon S3 for loading and storage of files (optional).
11
+ * Amazon Simple Queue Service for internal task queue management (optional).
12
+
13
+ Overview
14
+ --------
15
+
16
+ Tootsie is divided into multiple independent parts:
17
+
18
+ * Job manager: finds new transcoding jobs and executes them.
19
+ * FFmpeg, ImageMagick: performs the actual transcoding.
20
+ * Queue: currently local file-based queues (for testing) and Amazon Simple Queue Service are supported.
21
+ * Storage: currently web servers and Amazon S3 are supported.
22
+ * Web service: A small RESTful API for managing jobs.
23
+
24
+ The framework is designed to be easily pluggable, and to let you pick the parts you need to build a custom transcoding service. It is also designed to be easily distributed across many nodes.
25
+
26
+ Execution flow
27
+ --------------
28
+
29
+ The task manager pops jobs from a queue and processes them. Each job specifies an input, an output, and transcoding parameters. Optionally the job may also specify a notification URL which is invoked to inform the caller about job progress.
30
+
31
+ Supported inputs at the moment:
32
+
33
+ * HTTP resource. Currently only public (non-authenticated) resources are supported.
34
+ * Amazon S3 bucket resource. S3 buckets must have the appropriate ACLs so that Tootsie can read the files; if the input file is not public, Tootsie must be run with an AWS access key that is granted read access to the file.
35
+
36
+ Supported outputs:
37
+
38
+ * HTTP resource. The encoded file will be `POST`ed to a URL.
39
+ * Amazon S3 bucket resource. Tootsie will need write permissions to any S3 buckets.
40
+
41
+ Each job may have multiple outputs given a single input. Design-wise, the reason for doing this -- as opposed to requiring that the client submit multiple jobs, one for each output -- is twofold:
42
+
43
+ 1. It allows the job to cache the input data locally for the duration of the job, rather than fetching it multiple times. One could suppose that multiple jobs could share the same cached input, but this would be awkward in a distributed setting where each node has its own file system; in such a case, a shared storage mechanism (file system, database or similar) would be needed.
44
+
45
+ 2. It allows the client to be informed when *all* transcoded versions are available, something which may drastically simplify client logic. For example, a web application submitting a job to produce multiple scaled versions of an image may only start showing these images when all versions have been produced. To know whether all versions have been produced, it needs to maintain state somewhere about the progress. Having a single job produce all versions means this state can be reduced to a single boolean value.
46
+
47
+ When using multiple outputs per job one should keep in mind that this reduces job throughput, requiring more concurrent job workers to be deployed.
48
+
49
+ FFmpeg and ImageMagick are invoked for each job to perform the transcoding. These are abstracted behind a set of generic options specifying format, codecs, bit rate and so on.
50
+
51
+ API
52
+ ===
53
+
54
+ To schedule jobs, one uses the web service, a small app that supports job control methods:
55
+
56
+ * POST `/job`: Schedule a job. Returns 201 if the job was created.
57
+ * GET `/status`: Get current processing status as a JSON hash.
58
+
59
+ The job must be posted as a JSON hash with the content type `application/json`. Common to all job scheduling POSTs are these keys:
60
+
61
+ * `type`: Type of job. See sections below for details.
62
+ * `notification_url`: Optional notification URL. Progress (including completion and failure) will be reported using POSTs.
63
+ * `retries`: Maximum number of retries, if any. Defaults to 5.
64
+ * `access_key`: Access key for calculating notification signature. See below.
65
+
66
+ Job-specific parameters are provided in the key `params`.
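
As a rough sketch of how a client might assemble and submit such a request from Ruby: the helper names `build_job_payload` and `submit_job` below are hypothetical and not part of Tootsie, and the base URL is assumed to point at a running web service. Only the key names, the content type and the 201 status come from the description above.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Hypothetical helper: builds the JSON body for a job from the common keys
# described above. Optional keys are only included when given.
def build_job_payload(type, params, options = {})
  payload = {'type' => type, 'params' => params}
  payload['notification_url'] = options[:notification_url] if options[:notification_url]
  payload['retries'] = options[:retries] if options[:retries]
  JSON.generate(payload)
end

# Hypothetical helper: POSTs the body to `/job` with the JSON content type
# and returns true if the service answered 201.
def submit_job(base_url, body)
  uri = URI.join(base_url, '/job')
  request = Net::HTTP::Post.new(uri.path, 'Content-Type' => 'application/json')
  request.body = body
  response = Net::HTTP.start(uri.host, uri.port) { |http| http.request(request) }
  response.code == '201'
end
```

A call such as `submit_job('http://localhost:9090', build_job_payload('image', params))` would then schedule the job, assuming the service listens on that port.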
67
+
68
+ Access key
69
+ ----------
70
+
71
+ (This part has not been written yet. Nor implemented, actually.)
72
+
73
+ Notifications
74
+ -------------
75
+
76
+ If a notification URL is provided, events will be sent to it using `POST` requests as JSON data. These are 'fire and forget' and will currently not be retried on failure, and the response status code is ignored.
77
+
78
+ There are several types of events, indicated by the `event` key:
79
+
80
+ * `started`: The job was started.
81
+ * `complete`: The job was completed. The key `time_taken` will contain the time taken for the job, in seconds. Additional data will be provided that are specific to the type of job.
82
+ * `failed`: The job failed. The key `reason` will contain a textual explanation for the failure.
83
+ * `failed_will_retry`: The job failed, but is being rescheduled for retrying. The key `reason` will contain a textual explanation for the failure.
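
The event types above could be handled on the receiving side roughly as follows. This is a minimal sketch, not Tootsie code: `describe_notification` is a hypothetical name, and it assumes the POST body has already been read as a JSON string by whatever web framework receives it.

```ruby
require 'json'

# Hypothetical receiver-side helper: turns a notification body into a
# short human-readable summary, dispatching on the `event` key.
def describe_notification(body)
  event = JSON.parse(body)
  case event['event']
  when 'started'
    'job started'
  when 'complete'
    "job complete in #{event['time_taken']}s"
  when 'failed'
    "job failed: #{event['reason']}"
  when 'failed_will_retry'
    "job failed, retrying: #{event['reason']}"
  else
    "unknown event: #{event['event'].inspect}"
  end
end
```

Since notifications are not retried, a receiver built along these lines should probably not rely on seeing every event.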
84
+
85
+ Video transcoding jobs
86
+ ----------------------
87
+
88
+ Video jobs have the `type` key set to either `video` or `audio`. Currently, `audio` is simply an alias for `video` and handled by the same pipeline. The key `params` must be set to a hash with these keys:
89
+
90
+ * `input_url`: URL to input file, either an HTTP URL or an S3 URL (see below).
91
+ * `versions`: Either a hash or an array of such hashes, each with the following keys:
92
+ * `target_url`: URL to output resource, either an HTTP URL which accepts POSTs, or an S3 URL.
93
+ * `thumbnail`: If specified, a thumbnail will be generated based on the options in this hash with the following keys:
94
+ * `target_url`: URL to output resource, either an HTTP URL which accepts POSTs, or an S3 URL.
95
+ * `width`: Desired width of thumbnail, defaults to output width.
96
+ * `height`: Desired height of thumbnail, defaults to output height.
97
+ * `at_seconds`: Desired point (in seconds) at which the thumbnail frame should be captured. Defaults to 50% into stream.
98
+ * `at_fraction`: Desired point (in percentage) at which the thumbnail frame should be captured. Defaults to 50% into stream.
99
+ * `force_aspect_ratio`: If `true`, force aspect ratio; otherwise aspect is preserved when computing dimensions.
100
+ * `audio_sample_rate`: Audio sample rate, in hertz.
101
+ * `audio_bitrate`: Audio bitrate, in bits per second.
102
+ * `audio_codec`: Audio codec name, eg. `mp4`.
103
+ * `video_frame_rate`: video frame rate, in hertz.
104
+ * `video_bitrate`: video bitrate, in bits per second.
105
+ * `video_codec`: video codec name, eg. `mp4`.
106
+ * `width`: desired video frame width in pixels.
107
+ * `height`: desired video frame height in pixels.
108
+ * `format`: File format.
109
+ * `content_type`: Content type of resultant file. Tootsie will not be able to guess this at the moment.
110
+
111
+ Completion notification provides the following data:
112
+
113
+ * `outputs` contains an array of results. Each is a hash with the following keys:
114
+ * `url`: the completed file.
115
+ * `metadata`: image metadata as a hash. These are raw EXIF and IPTC data from ImageMagick.
116
+
117
+ Image transcoding jobs
118
+ ----------------------
119
+
120
+ Image jobs have the `type` key set to `image`. The key `params` must be set to a hash with these keys:
121
+
122
+ * `input_url`: URL to input file, either an HTTP URL, a `file:/path` URL or an S3 URL (see below).
123
+ * `versions`: Either a hash or an array of such hashes, each with the following keys:
124
+ * `target_url`: URL to output resource, either an HTTP URL which accepts POSTs, a `file:/path` URL, or an S3 URL.
125
+ * `width`: Optional desired width of output image.
126
+ * `height`: Optional desired height of output image.
127
+ * `scale`: One of the following values:
128
+ * `down` (default): The input image is scaled to fit within the dimensions `width` x `height`. If only `width` or only `height` is specified, then the other component will be computed from the aspect ratio of the input image.
129
+ * `up`: As `down`, but allow scaling to dimensions that are larger than the input image.
130
+ * `fit`: Similar to `down`, but the dimensions are chosen so the output width and height are always met or exceeded. In other words, if you pass in an image that is 100x50, specifying output dimensions as 100x100, then the output image will be 200x100.
131
+ * `none`: Don't scale at all.
132
+ * `crop`: If true, crop the image to the output dimensions.
133
+ * `format`: Either `jpeg`, `png` or `gif`.
134
+ * `quality`: A quality value between 0.0 and 1.0 which will be translated to a compression level depending on the output coding. The default is 1.0.
135
+ * `strip_metadata`: If true, metadata such as EXIF and IPTC will be deleted. For thumbnails, this often reduces the file size considerably.
136
+ * `medium`: If `web`, the image will be optimized for web usage. See below for details.
137
+ * `content_type`: Content type of resultant file. The system will be able to guess basic types such as `image/jpeg`.
138
+
139
+ Note that scaling always preserves the aspect ratio of the original image; in other words, if the original is 100 x 200, then passing the dimensions 100x100 will produce an image that is 50x100. Enabling cropping, however, will force the aspect ratio of the specified dimensions.
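
The arithmetic behind the `scale` modes can be sketched as below. This is an illustration of the rules as described here, not the ImageMagick invocation Tootsie actually performs: `scaled_dimensions` is a hypothetical helper, it assumes both `width` and `height` are given, it rounds to the nearest pixel, and it assumes `fit` is allowed to enlarge the image.

```ruby
# Hypothetical helper: computes output dimensions for an input of
# in_w x in_h and requested dimensions out_w x out_h, always preserving
# the aspect ratio of the input.
def scaled_dimensions(in_w, in_h, out_w, out_h, mode = 'down')
  ratios = [out_w.to_f / in_w, out_h.to_f / in_h]
  factor =
    case mode
    when 'none' then 1.0                    # no scaling at all
    when 'down' then [ratios.min, 1.0].min  # fit within, never enlarge
    when 'up'   then ratios.min             # fit within, may enlarge
    when 'fit'  then ratios.max             # meet or exceed both dimensions
    else raise ArgumentError, "unknown scale mode: #{mode}"
    end
  [(in_w * factor).round, (in_h * factor).round]
end

scaled_dimensions(100, 200, 100, 100)         # => [50, 100]
scaled_dimensions(400, 200, 100, 100, 'fit')  # => [200, 100]
```

With `fit`, the result usually exceeds the requested size in one direction, which is where enabling `crop` would come in.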
140
+
141
+ If the option `medium` specifies `web`, the following additional transformations will be performed:
142
+
143
+ * The image will be automatically rotated based on EXIF orientation metadata, since web browsers don't do this.
144
+ * CMYK images will be converted to RGB, since most web browsers don't seem to display CMYK correctly.
145
+
146
+ Completion notification provides the following data:
147
+
148
+ * `outputs` contains an array of results. Each is a hash with the following keys:
149
+ * `url`: URL for the completed file.
150
+ * `metadata`: image metadata as a hash. These are raw EXIF and IPTC data from ImageMagick.
151
+ * `width`: width, in pixels, of original image.
152
+ * `height`: height, in pixels, of original image.
153
+ * `depth`: depth, in bits, of original image.
154
+
155
+ Note about S3 URLs
156
+ ------------------
157
+
158
+ To specify S3 URLs, we use a custom URI format:
159
+
160
+ s3:<bucketname></path/to/file>[?<options>]
161
+
162
+ The components are:
163
+
164
+ * bucketname: The name of the S3 bucket.
165
+ * /path/to/file: The actual S3 key.
166
+ * options: Optional parameters for storage, a URL query string.
167
+
168
+ The options are:
169
+
170
+ * `acl`: One of `private` (default), `public-read`, `public-read-write` or `authenticated-read`.
171
+ * `storage_class`: Either `standard` (default) or `reduced_redundancy`.
172
+ * `content_type`: Override stored content type.
173
+
174
+ Example S3 URLs:
175
+
176
+ * `s3:myapp/video`
177
+ * `s3:myapp/thumbnails?acl=public-read&storage_class=reduced_redundancy`
178
+ * `s3:myapp/images/12345?content_type=image/jpeg`
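
To make the format concrete, here is one way such a URI could be split into its components. This is a sketch only and not the parser Tootsie uses: `parse_s3_url` and `S3_URL_PATTERN` are hypothetical names, and the key is returned exactly as written in the URI, including its leading slash.

```ruby
require 'uri'

# Hypothetical pattern: bucket name, then a path starting with "/",
# then an optional query string.
S3_URL_PATTERN = %r{\As3:([^/?]+)(/[^?]*)(?:\?(.*))?\z}

# Hypothetical helper: splits an `s3:` URI into bucket, key and options.
def parse_s3_url(url)
  match = S3_URL_PATTERN.match(url)
  raise ArgumentError, "not an S3 URL: #{url}" unless match
  options = {}
  # Decode the query string into key/value pairs, if one was given.
  URI.decode_www_form(match[3] || '').each { |key, value| options[key] = value }
  {:bucket => match[1], :key => match[2], :options => options}
end
```

For `s3:myapp/thumbnails?acl=public-read&storage_class=reduced_redundancy`, this yields the bucket `myapp`, the key `/thumbnails` and a hash with the two options.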
179
+
180
+ Current limitations
181
+ ===================
182
+
183
+ * Daemon supports only one task manager thread at a time.
184
+ * Transcoding options are very basic.
185
+ * No client access control; anyone can submit jobs.
186
+
187
+ Requirements
188
+ ============
189
+
190
+ * Ruby 1.9.1 or later.
191
+ * Bundler.
192
+
193
+ For video jobs:
194
+
195
+ * FFmpeg
196
+
197
+ For image jobs:
198
+
199
+ * ImageMagick
200
+ * Exiv2
201
+ * pngcrush (optional)
202
+
203
+ Installation
204
+ ============
205
+
206
+ * Fetch Git repository: `git clone git@github.com:origo/tootsie.git`.
207
+ * Install Bundler with `gem install bundler`.
208
+ * Install dependencies with `cd tootsie; bundle install`.
209
+
210
+ Running
211
+ =======
212
+
213
+ Create a configuration under `config`, eg. `config/development.yml`:
214
+
215
+ ---
216
+ aws_access_key_id: <your Amazon key>
217
+ aws_secret_access_key: <your Amazon secret>
218
+ sqs_queue_name: tootsie
219
+
220
+ Start the task manager with `bin/tootsie_task_manager start`. You can specify the number of workers with `-n`.
221
+
222
+ To run the web service, you will need a Rack-compatible web server, such as Unicorn or Thin. To start with Thin on port 9090:
223
+
224
+ $ thin --daemonize --rackup config.ru --port 9090 start
225
+
226
+ Jobs may now be posted to the web service API. For example:
227
+
228
+ $ cat << END | curl -H 'Content-Type: application/json' -d @- http://localhost:9090/job
229
+ {
230
+ "type": "video",
231
+ "notification_url": "http://example.com/transcoder_notification",
232
+ "params": {
233
+ "input_url": "http://example.com/test.3gp",
234
+ "versions": {
235
+ "target_url": "s3:mybucket/test.mp4?acl=public-read",
236
+ "audio_sample_rate": 44100,
237
+ "audio_bitrate": 64000,
238
+ "format": "flv",
239
+ "content_type": "video/x-flv"
240
+ }
241
+ }
242
+ }
243
+ END
244
+
245
+ License
246
+ =======
247
+
248
+ This software is licensed under the MIT License.
249
+
250
+ Copyright © 2010, 2011 Alexander Staubo
251
+
252
+ Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
253
+
254
+ The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
255
+
256
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
@@ -0,0 +1 @@
1
+ require "bundler/gem_tasks"
@@ -0,0 +1,36 @@
1
+ # -*- encoding: utf-8 -*-
2
+
3
+ $:.push File.expand_path("../lib", __FILE__)
4
+ require "tootsie/version"
5
+
6
+ Gem::Specification.new do |s|
7
+ s.name = "tootsie"
8
+ s.version = Tootsie::VERSION
9
+ s.authors = ["Alexander Staubo"]
10
+ s.email = ["alex@origo.no"]
11
+ s.homepage = "http://github.com/alexstaubo/tootsie"
12
+ s.summary = s.description = %{Tootsie is a simple audio/video/image transcoding/modification application.}
13
+
14
+ s.rubyforge_project = "tootsie"
15
+
16
+ s.files = `git ls-files`.split("\n")
17
+ s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
18
+ s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
19
+ s.require_paths = ["lib"]
20
+
21
+ s.add_runtime_dependency 'json', ['~> 1.4.6']
22
+ s.add_runtime_dependency 'sinatra', ['~> 1.0']
23
+ s.add_runtime_dependency 'activesupport', ['~>3.0.0']
24
+ s.add_runtime_dependency 'httpclient', ['~>2.2.1']
25
+ s.add_runtime_dependency 'builder', ['~> 2.1.2']
26
+ s.add_runtime_dependency 'mime-types', ['~> 1.16']
27
+ s.add_runtime_dependency 'xml-simple', ['~> 1.0.12']
28
+ s.add_runtime_dependency 'thin', ['~> 1.2.7']
29
+ s.add_runtime_dependency 's3', ['~> 0.3.7']
30
+ s.add_runtime_dependency 'sqs', ['~> 0.1.2']
31
+ s.add_runtime_dependency 'unicorn', ['~> 4.1.1']
32
+ s.add_runtime_dependency 'i18n', ['>= 0.4.2']
33
+ s.add_runtime_dependency 'scashin133-syslog_logger', ['~> 1.7.3']
34
+ s.add_development_dependency "rspec"
35
+ s.add_development_dependency "rake"
36
+ end
@@ -0,0 +1,82 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ ENV['BUNDLE_GEMFILE'] = File.expand_path('../../Gemfile', __FILE__)
4
+
5
+ require 'rubygems'
6
+ begin
7
+ require 'bundler'
8
+ rescue LoadError
9
+ # Ignore this
10
+ else
11
+ Bundler.setup
12
+ end
13
+
14
+ $:.unshift(File.expand_path('../../lib', __FILE__))
15
+ require 'tootsie'
16
+
17
+ environment = ENV['RACK_ENV']
18
+ environment ||= :development
19
+
20
+ num_workers = 4
21
+ logger = nil
22
+
23
+ ARGV.options do |opts|
24
+ opts.banner = "Usage: #{File.basename($0)} [OPTIONS] [start | stop | restart | status]"
25
+ opts.separator ""
26
+ opts.on("-e", "--environment=env", String,
27
+ "Environment to run in (default: #{environment})") do |value|
28
+ environment = value
29
+ end
30
+ opts.on("-n", "--num-workers=WORKERS", Integer,
31
+ "Specify number of workers to fork (defaults to #{num_workers}).") do |value|
32
+ num_workers = [1, value.to_i].max
33
+ end
34
+ opts.on("-l TARGET", "--logger TARGET", String,
35
+ "Log to TARGET, which is either a file name or 'syslog'.") do |value|
36
+ if value == 'syslog'
37
+ require 'syslog_logger'
38
+ logger = SyslogLogger.new('tootsie')
39
+ else
40
+ logger = Logger.new(value)
41
+ end
42
+ end
43
+ opts.on("-h", "--help", "Show this help message.") do
44
+ puts opts
45
+ exit
46
+ end
47
+ opts.parse!
48
+ if ARGV.empty?
49
+ puts "Nothing to do. Run with -h for help."
50
+ exit
51
+ end
52
+ end
53
+
54
+ controller = Tootsie::Daemon.new(
55
+ :root => File.join(File.dirname(__FILE__), "/.."),
56
+ :pid_file => File.join(File.dirname(__FILE__), "/../tmp/task_manager.pid"),
57
+ :logger => logger)
58
+
59
+ spawner = Spawner.new(:num_children => num_workers, :logger => controller.logger)
60
+
61
+ controller.on_spawn do
62
+ $0 = "tootsie: master"
63
+ spawner.on_spawn do
64
+ $0 = "tootsie: worker"
65
+ Signal.trap('TERM') do
66
+ exit(2)
67
+ end
68
+ app = Tootsie::Application.new(
69
+ :environment => environment,
70
+ :logger => controller.logger)
71
+ app.configure!
72
+ begin
73
+ app.task_manager.run!
74
+ rescue SystemExit, Interrupt
75
+ end
76
+ end
77
+ spawner.run
78
+ end
79
+ controller.on_terminate do
80
+ spawner.terminate
81
+ end
82
+ controller.control(ARGV)
@@ -0,0 +1,22 @@
1
+ require 'rubygems'
2
+ require 'bundler'
3
+ Bundler.setup
4
+ Bundler.require
5
+
6
+ $:.unshift(File.join(File.dirname(__FILE__), "/lib"))
7
+ require 'tootsie'
8
+
9
+ environment = ENV['RACK_ENV'] ||= 'development'
10
+
11
+ app = Tootsie::Application.new(
12
+ :environment => environment,
13
+ :logger => ENV["rack.logger"])
14
+ app.configure!
15
+
16
+ if environment == 'development'
17
+ Thread.new do
18
+ Tootsie::Application.get.task_manager.run!
19
+ end
20
+ end
21
+
22
+ run Tootsie::WebService
@@ -0,0 +1,4 @@
1
+ ---
2
+ aws_access_key_id: <your_key>
3
+ aws_secret_access_key: <your_secret>
4
+ sqs_queue_name: tootsie
@@ -0,0 +1,21 @@
1
+ require 'active_support/core_ext/hash'
2
+
3
+ require 'tootsie/application'
4
+ require 'tootsie/client'
5
+ require 'tootsie/configuration'
6
+ require 'tootsie/command_runner'
7
+ require 'tootsie/daemon'
8
+ require 'tootsie/spawner'
9
+ require 'tootsie/ffmpeg_adapter'
10
+ require 'tootsie/image_metadata_extractor'
11
+ require 'tootsie/input'
12
+ require 'tootsie/task_manager'
13
+ require 'tootsie/tasks/job_task'
14
+ require 'tootsie/tasks/notify_task'
15
+ require 'tootsie/output'
16
+ require 'tootsie/web_service'
17
+ require 'tootsie/processors/video_processor'
18
+ require 'tootsie/processors/image_processor'
19
+ require 'tootsie/queues/sqs_queue'
20
+ require 'tootsie/queues/file_system_queue'
21
+ require 'tootsie/s3_utilities'