viddl-rb 0.68 → 0.70

@@ -1,19 +1,38 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- viddl-rb (0.61)
4
+ viddl-rb (0.68)
5
+ jruby-openssl
6
+ mechanize
5
7
  nokogiri
6
8
 
7
9
  GEM
8
10
  remote: http://rubygems.org/
9
11
  specs:
10
- mime-types (1.18)
11
- minitest (2.12.1)
12
- nokogiri (1.5.2)
13
- nokogiri (1.5.2-java)
12
+ bouncy-castle-java (1.5.0146.1)
13
+ domain_name (0.5.3)
14
+ unf (~> 0.0.3)
15
+ jruby-openssl (0.7.7)
16
+ bouncy-castle-java (>= 1.5.0146.1)
17
+ mechanize (2.5.1)
18
+ domain_name (~> 0.5, >= 0.5.1)
19
+ mime-types (~> 1.17, >= 1.17.2)
20
+ net-http-digest_auth (~> 1.1, >= 1.1.1)
21
+ net-http-persistent (~> 2.5, >= 2.5.2)
22
+ nokogiri (~> 1.4)
23
+ ntlm-http (~> 0.1, >= 0.1.1)
24
+ webrobots (~> 0.0, >= 0.0.9)
25
+ mime-types (1.19)
26
+ minitest (3.3.0)
27
+ net-http-digest_auth (1.2.1)
28
+ net-http-persistent (2.7)
29
+ nokogiri (1.5.5-java)
30
+ ntlm-http (0.1.1)
14
31
  rake (0.9.2.2)
15
32
  rest-client (1.6.7)
16
33
  mime-types (>= 1.16)
34
+ unf (0.0.5-java)
35
+ webrobots (0.0.13)
17
36
 
18
37
  PLATFORMS
19
38
  java
data/README.md CHANGED
@@ -1,49 +1,110 @@
1
- __viddl-rb:__
2
- Created by Marc Seeger (@rb2k)
3
- Repo: http://github.com/rb2k/viddl-rb
4
- [![Build Status](https://secure.travis-ci.org/rb2k/viddl-rb.png)](http://travis-ci.org/rb2k/viddl-rb) [![Dependency Status](https://gemnasium.com/rb2k/viddl-rb.png)](https://gemnasium.com/rb2k/viddl-rb)
5
-
6
-
7
-
8
- __Installation:__
9
- gem install viddl-rb
10
-
11
- __Usage:__
12
-
13
- Download a video:
14
- viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4
15
-
16
- Download a video and extract the audio:
17
- viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4 --extract-audio
18
-
19
- In both cases we'll name the output file according to the video title.
20
-
21
- __Youtube plugin specifics:__
22
-
23
- Download all videos on a playlist:
24
- viddl-rb http://www.youtube.com/playlist?list=PL7E8DA0A515924126
25
-
26
- Download all videos from a user:
27
- viddl-rb http://www.youtube.com/user/tedtalksdirector
28
-
29
- Filter videos to download from a user/playlist:
30
- viddl-rb http://www.youtube.com/user/tedtalksdirector --filter=internet/i
31
-
32
- The --filter argument accepts a regular expression and will only download videos where the title matches the regex.
33
- The /i option does a case-insensitive search.
34
-
35
- __Requirements:__
36
-
37
- * curl/wget or the [progress bar](http://github.com/nex3/ruby-progressbar/) gem
38
- * [Nokogiri](http://nokogiri.org/)
39
- * [Mechanize](http://mechanize.rubyforge.org/)
40
- * ffmpeg if you want to extract audio tracks from the videos
41
-
42
-
43
- __Contributors:__
44
-
45
- * [kl](https://github.com/kl): Windows support (who knew!), bug fixes, veoh plugin, metacafe plugin
46
- * [divout](https://github.com/divout) aka Ivan K: blip.tv plugin, bugfixes
47
- * Sniper: bugfixes
48
- * [Serabe](https://github.com/Serabe) aka Sergio Arbeo: packaging viddl as a binary
49
- * [laserlemon](https://github.com/laserlemon): Adding gemnasium images to readme
1
+ __viddl-rb:__
2
+ Initially created by Marc Seeger (@rb2k)
3
+ Repo: http://github.com/rb2k/viddl-rb
4
+ [![Build Status](https://secure.travis-ci.org/rb2k/viddl-rb.png)](http://travis-ci.org/rb2k/viddl-rb) [![Dependency Status](https://gemnasium.com/rb2k/viddl-rb.png)](https://gemnasium.com/rb2k/viddl-rb)
5
+
6
+ __Installation:__
7
+
8
+ gem install viddl-rb
9
+
10
+ __Usage:__
11
+
12
+ Download a video:
13
+ viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4
14
+
15
+ Download a video and extract the audio:
16
+ viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4 --extract-audio
17
+
18
+ In both cases we'll name the output file according to the video title.
19
+
20
+ Setting the video save directory:
21
+ viddl-rb http://vimeo.com/38372260 --save-dir=C:/myvideos
22
+
23
+ The --save-dir option works with both absolute and relative paths (relative based on the directory viddl-rb is run from).
24
+ If you want to save to a folder with spaces in it, you have to quote the path like this: --save-dir="C:/my videos"
25
+
26
+ __Youtube plugin specifics:__
27
+
28
+ Download all videos on a playlist:
29
+ viddl-rb http://www.youtube.com/playlist?list=PL7E8DA0A515924126
30
+
31
+ Download all videos from a user:
32
+ viddl-rb http://www.youtube.com/user/tedtalksdirector
33
+
34
+ Filter videos to download from a user/playlist:
35
+ viddl-rb http://www.youtube.com/user/tedtalksdirector --filter=internet/i
36
+
37
+ The --filter argument accepts a regular expression and will only download videos where the title matches the regex.
38
+ The /i option does a case-insensitive search.
39
+
40
+ __Library Usage:__
41
+
42
+ ```ruby
43
+ require 'viddl-rb'
44
+
45
+ download_urls = ViddlRb.get_urls("http://www.youtube.com/watch?v=QH2-TGUlwu4")
46
+ download_urls.first # => "http://o-o.preferred.arn06s04.v3.lscac ..."
47
+ ```
48
+
49
+ The ViddlRb module has the following public module methods:
50
+
51
+ * __get_urls_names(url)__
52
+ -- Returns an array of one or more hashes that have the keys :url which
53
+ points to the download url and :name which points to the name
54
+ (which is a filename safe version of the video title with a file extension).
55
+ Returns nil if the url is not recognized by any plugins.
56
+
57
+ * __get_urls_exts(url)__
58
+ -- Same as get_urls_names but with just the file extension (for example ".mp4")
59
+ instead of the full filename, and the :name key is replaced with :ext.
60
+ Returns nil if the url is not recognized by any plugins.
61
+
62
+ * __get_urls(url)__
63
+ -- Returns an array of download urls for the specified video url.
64
+ Returns nil if the url is not recognized by any plugins.
65
+
66
+ * __get_names(url)__
67
+ -- Returns an array of filenames for the specified video url.
68
+ Returns nil if the url is not recognized by any plugins.
69
+
70
+ * __io=(io_object)__
71
+ -- By default all plugin output to stdout will be suppressed when the library is used.
72
+ If you are interested in the output of a plugin, you can set an IO object that
73
+ will receive all plugin output using this method. For example:
74
+
75
+ ```ruby
76
+ require 'viddl-rb'
77
+
78
+ ViddlRb.io = $stdout # plugins will now write their output to $stdout
79
+ ```
80
+
81
+ All the __get__ methods in the ViddlRb module will raise either a ViddlRb::PluginError or a ViddlRb::DownloadError if the plugin fails.
82
+ A ViddlRb::PluginError is raised if the plugin fails in an unexpected way, and a ViddlRb::DownloadError is raised if the video could not be downloaded for some reason.
83
+ An example of that is if a Youtube video is not embeddable - then it can't be downloaded.
84
+
85
+ ```ruby
86
+ begin
87
+ ViddlRb.get_urls(video_url)
88
+ rescue ViddlRb::DownloadError => e
89
+ puts "Could not get download url: #{e.message}"
90
+ rescue ViddlRb::PluginError => e
91
+ puts "Plugin blew up! #{e.message}\n" +
92
+ "Backtrace:\n#{e.backtrace.join("\n")}"
93
+ end
94
+ ```
95
+
96
+ __Requirements:__
97
+
98
+ * curl/wget or the [progress bar](http://github.com/nex3/ruby-progressbar/) gem
99
+ * [Nokogiri](http://nokogiri.org/)
100
+ * [Mechanize](http://mechanize.rubyforge.org/)
101
+ * ffmpeg if you want to extract audio tracks from the videos
102
+
103
+ __Co Maintainer:__
104
+ * [kl](https://github.com/kl): Windows support (who knew!), bug fixes, veoh plugin, metacafe plugin, refactoring it into a library, ...
105
+
106
+ __Contributors:__
107
+ * [divout](https://github.com/divout) aka Ivan K: blip.tv plugin, bugfixes
108
+ * Sniper: bugfixes
109
+ * [Serabe](https://github.com/Serabe) aka Sergio Arbeo: packaging viddl as a binary
110
+ * [laserlemon](https://github.com/laserlemon): Adding gemnasium images to readme
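
Reviewer note: the new README section documents the return shape of `get_urls_names` (an array of hashes with `:url` and `:name`, or nil when no plugin recognizes the url). A minimal sketch of consuming that shape is below. It does not call viddl-rb; `describe_downloads` and the sample data are hypothetical and only illustrate the nil check a caller would need.

```ruby
# Hypothetical helper: turns the array-of-hashes shape described for
# get_urls_names into printable lines. It does not call viddl-rb itself.
def describe_downloads(urls_names)
  # The README says nil is returned when no plugin recognizes the url.
  return ["no plugin recognized the url"] if urls_names.nil?

  urls_names.map { |entry| "#{entry[:name]} <- #{entry[:url]}" }
end

# Stand-in data in the documented shape (values are made up for illustration).
sample = [{ :url => "http://example.com/video.mp4", :name => "My_Video.mp4" }]
puts describe_downloads(sample)
puts describe_downloads(nil)
```

In real use the argument would come from `ViddlRb.get_urls_names(url)`, wrapped in the rescue block the README shows.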
data/Rakefile CHANGED
@@ -1,8 +1,18 @@
1
- require 'rubygems'
2
- require 'rake/testtask'
3
-
4
- task :default => [:test]
5
-
6
- Rake::TestTask.new do |t|
7
- t.pattern = "spec/*_spec.rb"
8
- end
1
+ require 'rubygems'
2
+ require 'bundler/setup'
3
+ require 'rake/testtask'
4
+
5
+ task :default => [:test]
6
+
7
+ Rake::TestTask.new(:test) do |t|
8
+ #t.pattern = "spec/*_spec.rb"
9
+ t.test_files = ["spec/lib_spec.rb", "spec/url_extraction_spec.rb"]
10
+ end
11
+
12
+ Rake::TestTask.new(:test_lib) do |t|
13
+ t.test_files = FileList["spec/lib_spec.rb"]
14
+ end
15
+
16
+ Rake::TestTask.new(:test_extract) do |t|
17
+ t.test_files = FileList["spec/url_extraction_spec.rb"]
18
+ end
@@ -0,0 +1,3 @@
1
+ * wrap all classes used by the lib in a module (for namespace reasons)
2
+ * add save_file method to library
3
+
@@ -0,0 +1,25 @@
1
+
2
+ # Downloader iterates over a download queue and downloads and saves each video in the queue.
3
+ class Downloader
4
+ class DownloadFailedError < StandardError; end
5
+
6
+ def download(download_queue, params)
7
+ download_queue.each do |url_name|
8
+ url = url_name[:url]
9
+ name = url_name[:name]
10
+
11
+ result = save_file(url, name, params[:save_dir])
12
+ unless result
13
+ raise DownloadFailedError, "Download for #{name} failed."
14
+ else
15
+ puts "Download for #{name} successful."
16
+ AudioHelper.extract(name) if params[:extract_audio]
17
+ end
18
+ end
19
+ end
20
+
21
+ # TODO save_dir is not used yet
22
+ def save_file(url, name, save_dir)
23
+ ViddlRb::DownloadHelper.save_file(url, name, save_dir)
24
+ end
25
+ end
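
Reviewer note: the `unless ... else` in `download` reads more easily as a guard clause. The sketch below is a stand-alone variant under stated assumptions: `download_all` is a hypothetical name, and the `saver` callable stands in for `ViddlRb::DownloadHelper.save_file`, whose implementation is not part of this diff. Audio extraction is left out.

```ruby
# Raised when a single download reports failure (same name as in the hunk above).
class DownloadFailedError < StandardError; end

# Hypothetical stand-alone version of Downloader#download.
# saver is any callable taking (url, name, save_dir) and returning true or false;
# it stands in for ViddlRb::DownloadHelper.save_file, which is not shown here.
def download_all(queue, save_dir, saver)
  queue.map do |entry|
    saved = saver.call(entry[:url], entry[:name], save_dir)
    # Guard clause: fail fast, so the success path needs no else branch.
    raise DownloadFailedError, "Download for #{entry[:name]} failed." unless saved

    entry[:name] # collect the names that were saved
  end
end

queue = [{ :url => "http://example.com/a.mp4", :name => "a.mp4" }]
always_ok = lambda { |_url, _name, _dir| true }
puts download_all(queue, ".", always_ok).inspect
```

As in the original, the first failure aborts the rest of the queue. Whether that is the desired behavior for playlists is worth a follow-up.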
@@ -0,0 +1,47 @@
1
+
2
+ # The Driver class drives the application logic in the viddl-rb utility.
3
+ # It gets the correct plugin for the given url and passes a download queue
4
+ # (that it gets from the plugin) to the Downloader object which downloads the videos.
5
+ class Driver
6
+
7
+ def initialize(param_hash)
8
+ @params = param_hash
9
+ @downloader = Downloader.new
10
+ end
11
+
12
+ #starts the downloading process or just prints the urls or names.
13
+ def start
14
+ queue = get_download_queue
15
+
16
+ if @params[:url_only]
17
+ queue.each { |url_name| puts url_name[:url] }
18
+ elsif @params[:title_only]
19
+ queue.each { |url_name| puts url_name[:name] }
20
+ else
21
+ @downloader.download(queue, @params)
22
+ end
23
+ end
24
+
25
+ private
26
+
27
+ #finds the right plugins and returns the download queue.
28
+ def get_download_queue
29
+ url = @params[:url]
30
+ plugin = ViddlRb::PluginBase.registered_plugins.find { |p| p.matches_provider?(url) }
31
+ raise "ERROR: No plugin seems to feel responsible for this URL." unless plugin
32
+ puts "Using plugin: #{plugin}"
33
+
34
+ begin
35
+ #we'll end up with an array of hashes with the keys :url and :name
36
+ plugin.get_urls_and_filenames(url, @params)
37
+
38
+ rescue ViddlRb::PluginBase::CouldNotDownloadVideoError => e
39
+ raise "ERROR: The video could not be downloaded.\n" +
40
+ "Reason: #{e.message}"
41
+ rescue StandardError => e
42
+ raise "Error while running the #{plugin.name.inspect} plugin. Maybe it has to be updated?\n" +
43
+ "Error: #{e.message}.\n" +
44
+ "Backtrace:\n#{e.backtrace.join("\n")}"
45
+ end
46
+ end
47
+ end
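
Reviewer note: `get_download_queue` picks the first registered plugin whose `matches_provider?` accepts the url. `ViddlRb::PluginBase` is not part of this diff, so the sketch below uses hypothetical stand-in classes, and the `inherited` hook is only one plausible way to build such a registry, not necessarily how viddl-rb does it.

```ruby
# Hypothetical stand-in for ViddlRb::PluginBase. Registering subclasses
# through the inherited hook is an assumption made for this sketch.
class SketchPluginBase
  @registered_plugins = []

  class << self
    attr_reader :registered_plugins

    def inherited(child)
      super
      SketchPluginBase.registered_plugins << child
    end
  end
end

class SketchYoutube < SketchPluginBase
  def self.matches_provider?(url)
    url.include?("youtube.com")
  end
end

class SketchVimeo < SketchPluginBase
  def self.matches_provider?(url)
    url.include?("vimeo.com")
  end
end

# Same lookup as in Driver#get_download_queue: the first match wins,
# and nil means no plugin feels responsible for the url.
def find_plugin(url)
  SketchPluginBase.registered_plugins.find { |plugin| plugin.matches_provider?(url) }
end

puts find_plugin("http://vimeo.com/38372260").inspect
```

Because `find` stops at the first match, registration order matters if two plugins ever accept the same url.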
@@ -0,0 +1,67 @@
1
+
2
+ # ParameterParser parses the program parameters.
3
+ # If the parameters are not valid in some way an exception is raised.
4
+ class ParameterParser
5
+
6
+ DEFAULT_SAVE_DIR = "."
7
+
8
+ #returns a hash with the parameters in it:
9
+ # :url => the video url
10
+ # :extract_audio => should attempt to extract audio? (true/false)
11
+ # :url_only => do not download, only print the urls to stdout
12
+ # :title_only => do not download, only print the titles to stdout
13
+ # :playlist_filter => a regular expression used to filter youtube playlists
14
+ # :save_dir => the directory where the videos are saved
15
+ def self.parse_app_parameters
16
+ check_valid_parameters!
17
+
18
+ params = {}
19
+ params[:url] = ARGV.first
20
+ params[:extract_audio] = ARGV.include?("--extract-audio")
21
+ params[:url_only] = ARGV.include?("--url-only")
22
+ params[:title_only] = ARGV.include?("--title-only")
23
+ params[:playlist_filter] = get_youtube_filter
24
+ params[:save_dir] = get_save_dir
25
+ params
26
+ end
27
+
28
+ #check if parameters are valid.
29
+ #the exceptions raised by this method are caught by the viddl-rb bin utility.
30
+ def self.check_valid_parameters!
31
+ if ARGV.empty?
32
+ raise "Usage: viddl-rb URL [--extract-audio]"
33
+ elsif !ARGV.first.match(/^http/)
34
+ raise "ERROR: Please include 'http' with your URL e.g. http://www.youtube.com/watch?v=QH2-TGUlwu4"
35
+ elsif ARGV.include?("--extract-audio") && !DownloadHelper.os_has?("ffmpeg")
36
+ raise "ERROR: To extract audio you need to have ffmpeg on your system"
37
+ end
38
+ end
39
+
40
+ #gets the regular expression used to filter youtube playlists.
41
+ def self.get_youtube_filter
42
+ filter = ARGV.find { |arg| arg =~ /--filter=./ } # --filter= and at least one more character
43
+ return nil unless filter
44
+
45
+ ignore_case = filter.include?("/i")
46
+ regex = filter[/--filter=(.*?)(?:\/|$)/, 1] # everything up to the first / (could be an empty string)
47
+ raise "ERROR: '#{regex}' is not a valid regular expression" unless is_valid_regex?(regex)
48
+ Regexp.new(regex, ignore_case)
49
+ end
50
+
51
+ #checks if the string is a valid regex (for example "*****" is not)
52
+ def self.is_valid_regex?(regex)
53
+ Regexp.compile(regex)
54
+ rescue RegexpError
55
+ false
56
+ end
57
+
58
+ #gets the directory used for saving videos in.
59
+ def self.get_save_dir
60
+ save_dir = ARGV.find { |arg| arg =~ /--save-dir=./ }
61
+ return DEFAULT_SAVE_DIR unless save_dir
62
+
63
+ dir = save_dir[/--save-dir=(.+)/, 1]
64
+ raise "ERROR: '#{dir}' is not a valid directory or does not exist" unless File.directory?(dir)
65
+ dir
66
+ end
67
+ end
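
Reviewer note: the `--filter=internet/i` parsing is easier to test when it does not read the global `ARGV`. The sketch below is a hypothetical variant (`parse_filter` is not in the diff) that takes the argument list explicitly and passes `Regexp::IGNORECASE` instead of a boolean. It checks for a trailing `/i` rather than `/i` anywhere in the argument, which is a small deviation from the code above.

```ruby
# Hypothetical variant of ParameterParser.get_youtube_filter that takes the
# argument list as a parameter instead of reading ARGV.
def parse_filter(argv)
  arg = argv.find { |a| a =~ /--filter=./ } # --filter= plus at least one character
  return nil unless arg

  ignore_case = arg.end_with?("/i")
  source = arg[/--filter=(.*?)(?:\/|$)/, 1] # everything up to the first / or the end

  begin
    Regexp.new(source, ignore_case ? Regexp::IGNORECASE : 0)
  rescue RegexpError
    raise ArgumentError, "'#{source}' is not a valid regular expression"
  end
end

filter = parse_filter(["http://www.youtube.com/user/tedtalksdirector", "--filter=internet/i"])
puts filter.inspect
```

Compiling once and rescuing `RegexpError` also avoids building the regex twice, as `is_valid_regex?` followed by `Regexp.new` does.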
@@ -1,118 +1,39 @@
1
- #!/usr/bin/env ruby
2
- $LOAD_PATH << File.join(File.dirname(__FILE__), '..', 'helper')
3
-
4
- require "rubygems"
5
- require "nokogiri"
6
- require "mechanize"
7
- require "cgi"
8
- require "open-uri"
9
- require "open3"
10
- require "download-helper.rb"
11
- require "plugin-helper.rb"
12
-
13
- if ARGV[0].nil?
14
- puts "Usage: viddl-rb URL [--extract-audio]"
15
- exit
16
- end
17
-
18
- puts "Loading Plugins"
19
- Dir[File.join(File.dirname(__FILE__),"../plugins/*.rb")].each do |plugin|
20
- load plugin
21
- end
22
-
23
- puts "Plugins loaded: #{PluginBase.registered_plugins.inspect}"
24
-
25
- url = ARGV[0]
26
- extract_audio = ARGV.include?('--extract-audio')
27
- url_only = ARGV.include?('--url-only')
28
- title_only = ARGV.include?('--title-only')
29
-
30
- puts "Will try to extract audio: #{extract_audio}."
31
-
32
- unless url.match(/^http/)
33
- puts "Please include 'http' with your URL e.g. http://www.youtube.com/watch?v=QH2-TGUlwu4"
34
- exit(1)
35
- end
36
-
37
- puts "Analyzing URL: #{url}"
38
- #Check all plugins for a match
39
- PluginBase.registered_plugins.each do |plugin|
40
- if plugin.matches_provider?(url)
41
- puts "#{plugin}: true"
42
- begin
43
- #we'll end up with an array of hashes with they keys :url and :name
44
- download_queue = plugin.get_urls_and_filenames(url)
45
- rescue StandardError => e
46
- puts "Error while running the #{plugin.name.inspect} plugin. Maybe it has to be updated? Error: #{e.message}."
47
- exit(1)
48
- end
49
-
50
- if url_only
51
- download_queue.each{|url_name| puts url_name[:url]}
52
- exit
53
- elsif title_only
54
- download_queue.each{|url_name| puts url_name[:name]}
55
- exit
56
- end
57
-
58
- download_queue.each do |url_name|
59
- result = DownloadHelper.save_file(url_name[:url], url_name[:name])
60
- if result
61
- puts "Download for #{url_name[:name]} successful."
62
- if extract_audio
63
- puts "Extracting audio for #{url_name[:name]}"
64
- if DownloadHelper.os_has?('ffmpeg')
65
- no_ext_filename = url_name[:name].split('.')[0..-1][0]
66
- #capture stderr because ffmpeg expects an output param and will error out
67
- puts "Gathering information about the downloaded file."
68
- file_info = Open3.popen3("ffmpeg -i #{url_name[:name]}") {|stdin, stdout, stderr, wait_thr| stderr.read }
69
- puts "Done gathering information about the downloaded file."
70
- if !file_info.to_s.empty?
71
- audio_format_matches = file_info.match(/Audio: (\w*)/)
72
- if audio_format_matches
73
- audio_format = audio_format_matches[1]
74
- puts "detected audio format: #{audio_format}"
75
- else
76
- puts "Couldn't find any audio:\n#{file_info.inspect}"
77
- next
78
- end
79
-
80
- extension_mapper = {
81
- 'aac' => 'm4a',
82
- 'mp3' => 'mp3',
83
- 'vorbis' => 'ogg'
84
- }
85
-
86
- if extension_mapper.key?(audio_format)
87
- output_extension = extension_mapper[audio_format]
88
- else
89
- #lame fallback
90
- puts "Unknown audio format: #{audio_format}, using name as extension: '.#{audio_format}'."
91
- output_extension = audio_format
92
- end
93
- output_filename = "#{no_ext_filename}.#{output_extension}"
94
- if File.exist?(output_filename)
95
- puts "Audio file seems to exist already, removing it before extraction."
96
- File.delete(output_filename)
97
- end
98
- Open3.popen3("ffmpeg -i #{url_name[:name]} -vn -acodec copy #{output_filename}") {|stdin, stdout, stderr, wait_thr| stdout.read }
99
- puts "Done extracting audio to #{output_filename}"
100
- else
101
- puts "Error while checking audio track of #{url_name[:name]}"
102
- end
103
- else
104
- puts "Didn't detect ffmpeg on your system, can't extract audio."
105
- end
106
- end
107
- else
108
- puts "Download for #{url_name[:name]} failed."
109
- end
110
- end
111
- #plugin matched and downloaded, we're done
112
- exit
113
- else
114
- puts "#{plugin}: false"
115
- end
116
- end
117
-
118
- puts "No plugin seems to feel responsible for this URL."
1
+ #!/usr/bin/env ruby
2
+ $LOAD_PATH << File.join(File.dirname(__FILE__), '..', 'helper') # general helpers
3
+ $LOAD_PATH << File.join(File.dirname(__FILE__), 'helper') # bin-specific helpers
4
+
5
+ require "rubygems"
6
+ require "nokogiri"
7
+ require "mechanize"
8
+ require "cgi"
9
+ require "open-uri"
10
+ require "open3"
11
+
12
+ require "driver.rb"
13
+ require "downloader.rb"
14
+ require "download-helper.rb"
15
+ require "parameter-parser.rb"
16
+ require "plugin-helper.rb"
17
+ require "audio-helper.rb"
18
+ require "utility-helper.rb"
19
+
20
+ begin
21
+ # params is a hash with keys for each of the parameters passed in.
22
+ # see helper/parameter-parser.rb for what those keys are.
23
+ params = ParameterParser.parse_app_parameters
24
+
25
+ puts "Loading Plugins"
26
+ ViddlRb::UtilityHelper.load_plugins
27
+ puts "Plugins loaded: #{ViddlRb::PluginBase.registered_plugins.inspect}"
28
+
29
+ puts "Will try to extract audio: #{params[:extract_audio] == true}."
30
+ puts "Analyzing URL: #{params[:url]}"
31
+
32
+ app = Driver.new(params)
33
+ app.start # starts the download process
34
+
35
+ rescue StandardError => e
36
+ puts "#{e.message}"
37
+ exit(1)
38
+ end
39
+