viddl-rb 0.7 → 0.8

data/Gemfile.lock CHANGED
@@ -1,41 +1,43 @@
  PATH
    remote: .
    specs:
-     viddl-rb (0.68)
-       jruby-openssl
+     viddl-rb (0.79)
        mechanize
-       nokogiri
+       nokogiri (~> 1.5.0)
+       progressbar
 
  GEM
    remote: http://rubygems.org/
    specs:
-     bouncy-castle-java (1.5.0146.1)
-     domain_name (0.5.3)
-       unf (~> 0.0.3)
-     jruby-openssl (0.7.7)
-       bouncy-castle-java (>= 1.5.0146.1)
-     mechanize (2.5.1)
+     domain_name (0.5.12)
+       unf (>= 0.0.5, < 1.0.0)
+     http-cookie (1.0.1)
+       domain_name (~> 0.5)
+     mechanize (2.7.1)
        domain_name (~> 0.5, >= 0.5.1)
+       http-cookie (~> 1.0.0)
        mime-types (~> 1.17, >= 1.17.2)
        net-http-digest_auth (~> 1.1, >= 1.1.1)
        net-http-persistent (~> 2.5, >= 2.5.2)
        nokogiri (~> 1.4)
        ntlm-http (~> 0.1, >= 0.1.1)
-       webrobots (~> 0.0, >= 0.0.9)
-     mime-types (1.19)
-     minitest (3.3.0)
-     net-http-digest_auth (1.2.1)
-     net-http-persistent (2.7)
-     nokogiri (1.5.5-java)
+       webrobots (>= 0.0.9, < 0.2)
+     mime-types (1.23)
+     minitest (5.0.4)
+     net-http-digest_auth (1.3)
+     net-http-persistent (2.8)
+     nokogiri (1.5.10)
      ntlm-http (0.1.1)
-     rake (0.9.2.2)
+     progressbar (0.20.0)
+     rake (10.0.4)
      rest-client (1.6.7)
        mime-types (>= 1.16)
-     unf (0.0.5-java)
-     webrobots (0.0.13)
+     unf (0.1.1)
+       unf_ext
+     unf_ext (0.0.6)
+     webrobots (0.1.1)
 
  PLATFORMS
-   java
    ruby
 
  DEPENDENCIES
data/README.md CHANGED
@@ -1,41 +1,53 @@
- __viddl-rb:__
+ __viddl-rb:__
  Initially created by Marc Seeger (@rb2k)
  Repo: http://github.com/rb2k/viddl-rb
- [![Build Status](https://secure.travis-ci.org/rb2k/viddl-rb.png)](http://travis-ci.org/rb2k/viddl-rb) [![Dependency Status](https://gemnasium.com/rb2k/viddl-rb.png)](https://gemnasium.com/rb2k/viddl-rb)
+ [![Gem Version](https://badge.fury.io/rb/viddl-rb.png)](http://badge.fury.io/rb/viddl-rb)[![Build Status](https://secure.travis-ci.org/rb2k/viddl-rb.png)](http://travis-ci.org/rb2k/viddl-rb) [![Dependency Status](https://gemnasium.com/rb2k/viddl-rb.png)](https://gemnasium.com/rb2k/viddl-rb)
 
  __Installation:__
 
  gem install viddl-rb
 
- __Usage:__
+ __Usage:__
 
  Download a video:
- viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4
+ ```viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4```
+
+ Viddl-rb supports the following command line options:
+ ```
+ -e, --extract-audio Save video audio to file
+ -u, --url-only Prints url without downloading
+ -t, --title-only Prints title without downloading
+ -f, --filter REGEX Filters a video playlist according to the regex (Youtube only right now)
+ -s, --save-dir DIRECTORY Specifies the directory where videos should be saved
+ -d, --downloader TOOL Specifies the tool to download with. Supports 'wget', 'curl' and 'net-http'
+ -q, --quality QUALITY Specifies the video format and resolution in the following way => resolution:extension (e.g. 720:mp4). Currently only supported by the Youtube plugin.
+ -h, --help Displays the help screen
+ ```
 
  Download a video and extract the audio:
- viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4 --extract-audio
+ ```viddl-rb http://www.youtube.com/watch?v=QH2-TGUlwu4 --extract-audio```
 
  In both cases we'll name the output file according to the video title.
 
  Setting the video save directory:
- viddl-rb http://vimeo.com/38372260 --save-dir=C:/myvideos
+ ```viddl-rb http://vimeo.com/38372260 --save-dir C:/myvideos```
 
  The --save-dir option works with both absolute and relative paths (relative based on the directory viddl-rb is run from).
- If you want to save to a folder with spaces in it, you have to quote the path like this: --save-dir="C:/my videos"
+ If you want to save to a folder with spaces in it, you have to quote the path like this: --save-dir "C:/my videos"
 
  __Youtube plugin specifics:__
 
  Download all videos on a playlist:
- viddl-rb http://www.youtube.com/playlist?list=PL7E8DA0A515924126
+ ```viddl-rb http://www.youtube.com/playlist?list=PL7E8DA0A515924126```
 
  Download all videos from a user:
- viddl-rb http://www.youtube.com/user/tedtalksdirector
+ ```viddl-rb http://www.youtube.com/user/tedtalksdirector```
 
  Filter videos to download from a user/playlist:
- viddl-rb http://www.youtube.com/user/tedtalksdirector --filter=internet/i
+ ```viddl-rb http://www.youtube.com/user/tedtalksdirector --filter /internet/i```
 
  The --filter argument accepts a regular expression and will only download videos where the title matches the regex.
- The /i option does a case-insensitive search.
+ It uses the same syntax as Ruby regular expression literals do.
 
  __Library Usage:__
 
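The README change above says that `--filter` now takes the same syntax as a Ruby regular expression literal. A minimal sketch of how that can work with the standard library's `OptionParser` and its `Regexp` acceptor follows. The `parse_filter` helper and the sample titles are hypothetical and only illustrate the mechanism; they are not taken from viddl-rb.

```ruby
require "optparse"

# Hypothetical helper (not part of viddl-rb): parses only a --filter option.
def parse_filter(args)
  filter = nil
  parser = OptionParser.new do |opts|
    # Passing Regexp as the acceptor lets OptionParser build the Regexp itself,
    # including flags written after the closing slash (e.g. /internet/i).
    opts.on("-f", "--filter REGEX", Regexp) { |regex| filter = regex }
  end
  parser.parse!(args.dup) # dup so the caller's array is left untouched
  filter
end

filter = parse_filter(["--filter", "/internet/i"])
titles = ["The Internet of things", "Cooking basics"] # made-up sample titles
puts titles.grep(filter).inspect
```

Because the trailing `i` is turned into `Regexp::IGNORECASE`, the capitalised "Internet" in the first sample title still matches.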
data/Rakefile CHANGED
@@ -6,7 +6,7 @@ task :default => [:test]
 
  Rake::TestTask.new(:test) do |t|
    #t.pattern = "spec/*_spec.rb"
-   t.test_files = ["spec/lib_spec.rb", "spec/url_extraction_spec.rb"]
+   t.test_files = ["spec/lib_spec.rb", "spec/url_extraction_spec.rb", "spec/integration_spec.rb"]
  end
 
  Rake::TestTask.new(:test_lib) do |t|
@@ -16,3 +16,7 @@ end
  Rake::TestTask.new(:test_extract) do |t|
    t.test_files = FileList["spec/url_extraction_spec.rb"]
  end
+
+ Rake::TestTask.new(:test_integration) do |t|
+   t.test_files = FileList["spec/integration_spec.rb"]
+ end
@@ -8,18 +8,13 @@ class Downloader
        url = url_name[:url]
        name = url_name[:name]
 
-       result = save_file(url, name, params[:save_dir])
+       result = ViddlRb::DownloadHelper.save_file(url, name, :save_dir => params[:save_dir], :tool => params[:tool])
        unless result
          raise DownloadFailedError, "Download for #{name} failed."
        else
          puts "Download for #{name} successful."
-         AudioHelper.extract(name) if params[:extract_audio]
+         ViddlRb::AudioHelper.extract(name, params[:save_dir]) if params[:extract_audio]
        end
      end
    end
-
-   # TODO save_dir is not used yet
-   def save_file(url, name, save_dir)
-     ViddlRb::DownloadHelper.save_file(url, name, save_dir)
-   end
  end
@@ -1,67 +1,109 @@
 
  # ParameterParser parses the program parameters.
  # If the parameters are not valid in some way an exception is raised.
+ # The exceptions raised by this class are handled in the bin program.
+
  class ParameterParser
 
    DEFAULT_SAVE_DIR = "."
 
    #returns a hash with the parameters in it:
-   # :url => the video url
-   # :extract_audio => should attempt to extract audio? (true/false)
-   # :url_only => do not download, only print the urls to stdout
-   # :title_only => do not download, only print the titles to stdout
-   # :youtube_filer => a regular expression used ot fileter youtube playlists
-   # :save_dir => the directory where the videos are saved
-   def self.parse_app_parameters
-     check_valid_parameters!
-
-     params = {}
-     params[:url] = ARGV.first
-     params[:extract_audio] = ARGV.include?("--extract-audio")
-     params[:url_only] = ARGV.include?("--url-only")
-     params[:title_only] = ARGV.include?("--title-only")
-     params[:playlist_filter] = get_youtube_filter
-     params[:save_dir] = get_save_dir
-     params
-   end
+   # :url => the video url
+   # :extract_audio => should attempt to extract audio? (true/false)
+   # :url_only => do not download, only print the urls to stdout
+   # :title_only => do not download, only print the titles to stdout
+   # :playlist_filter => a regular expression used to filter playlists
+   # :save_dir => the directory where the videos are saved
+   # :tool => the download tool (wget, curl, net/http) to use
+   # :quality => the resolution and format to download
+   def self.parse_app_parameters(args)
 
-   #check if parameters are valid.
-   #the exceptions raised by this method are caught by the viddl-rb bin utility.
-   def self.check_valid_parameters!
-     if ARGV.empty?
-       raise "Usage: viddl-rb URL [--extract-audio]"
-     elsif !ARGV.first.match(/^http/)
-       raise "ERROR: Please include 'http' with your URL e.g. http://www.youtube.com/watch?v=QH2-TGUlwu4"
-     elsif ARGV.include?("--extract-audio") && !DownloadHelper.os_has?("ffmpeg")
-       raise "ERROR: To extract audio you need to have ffmpeg on your system"
-     end
-   end
+     # Default option values are set here
+     options = {
+       :extract_audio => false,
+       :url_only => false,
+       :title_only => false,
+       :playlist_filter => nil,
+       :save_dir => DEFAULT_SAVE_DIR,
+       :tool => nil,
+       :quality => nil
+     }
+
+     optparse = OptionParser.new do |opts|
+       opts.banner = "Usage: viddl-rb URL [options]"
+
+       opts.on("-e", "--extract-audio", "Save video audio to file") do
+         if ViddlRb::UtilityHelper.os_has?("ffmpeg")
+           options[:extract_audio] = true
+         else
+           raise OptionParser::ParseError.new("to extract audio you need to have ffmpeg on your PATH")
+         end
+       end
+
+       opts.on("-u", "--url-only", "Prints url without downloading") do
+         options[:url_only] = true
+       end
+
+       opts.on("-t", "--title-only", "Prints title without downloading") do
+         options[:title_only] = true
+       end
 
-   #gets the regular expression used to filter youtube playlists.
-   def self.get_youtube_filter
-     filter = ARGV.find { |arg| arg =~ /--filter=./ } # --filter= and at least one more character
-     return nil unless filter
+       opts.on("-f", "--filter REGEX", Regexp, "Filters a video playlist according to the regex") do |regex|
+         options[:filter] = regex
+       end
 
-     ignore_case = filter.include?("/i")
-     regex = filter[/--filter=(.*?)(?:\/|$)/, 1] # everything up to the first / (could be an empty string)
-     raise "ERROR: '#{regex}' is not a valid regular expression" unless is_valid_regex?(regex)
-     Regexp.new(regex, ignore_case)
+       opts.on("-s", "--save-dir DIRECTORY", "Specifies the directory where videos should be saved") do |dir|
+         if File.directory?(dir)
+           options[:save_dir] = dir
+         else
+           raise OptionParser::InvalidArgument.new("'#{dir}' is not a valid directory")
+         end
+       end
+
+       opts.on("-d", "--downloader TOOL", "Specifies the tool to download with. Supports 'wget', 'curl' and 'net-http'") do |tool|
+         if tool =~ /(^wget$)|(^curl$)|(^net-http$)/
+           options[:tool] = tool
+         else
+           raise OptionParser::InvalidArgument.new("'#{tool}' is not a valid tool.")
+         end
+       end
+
+       opts.on("-q", "--quality QUALITY",
+         "Specifies the video format and resolution in the following way => resolution:extension (e.g. 720:mp4)") do |quality|
+         if match = quality.match(/(\d+):(.*)/)
+           res = match[1]
+           ext = match[2]
+         elsif match = quality.match(/\d+/)
+           res = match[0]
+           ext = nil
+         else
+           raise OptionParser::InvalidArgument.new("#{quality} is not a valid argument.")
+         end
+         options[:quality] = {:extension => ext, :resolution => res}
+       end
+
+       opts.on_tail('-h', '--help', 'Display this screen') do
+         print_help_and_exit(opts)
+       end
+     end
+
+     optparse.parse!(args) # removes all options from args
+     print_help_and_exit(optparse) if args.empty? # exit if no video url
+     url = args.first # the url is the only element left
+     validate_url!(url) # raise exception if invalid url
+     options[:url] = url
+     options
    end
 
-   #checks if the string is a valid regex (for example "*****" is not)
-   def self.is_valid_regex?(regex)
-     Regexp.compile(regex)
-   rescue RegexpError
-     false
+   def self.print_help_and_exit(opts)
+     puts opts
+     exit(0)
    end
-
-   #gets the directory used for saving videos in.
-   def self.get_save_dir
-     save_dir = ARGV.find { |arg| arg =~ /--save-dir=./ }
-     return DEFAULT_SAVE_DIR unless save_dir
-
-     dir = save_dir[/--save-dir=(.+)/, 1]
-     raise "ERROR: '#{dir}' is not a valid directory or does not exist" unless File.directory?(dir)
-     dir
+
+   def self.validate_url!(url)
+     unless url =~ /^http/
+       raise OptionParser::InvalidArgument.new(
+         "please include 'http' with your URL e.g. http://www.youtube.com/watch?v=QH2-TGUlwu4")
+     end
    end
  end
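The hunk above moves argument handling to `OptionParser` and makes the parser take an `args` array instead of reading `ARGV` directly, which makes it easy to exercise without a command line. The sketch below illustrates that idea for the `resolution:extension` quality format only. `parse_quality` and `parse_args` are hypothetical names, and the anchored patterns (`\A`, `\z`) are an assumption of this sketch that is stricter than the patterns shown in the diff.

```ruby
require "optparse"

# Hypothetical stand-alone version of the --quality parsing. The \A and \z
# anchors are an assumption added here so that stray text is rejected.
def parse_quality(value)
  if (match = value.match(/\A(\d+):(\w+)\z/))
    { :resolution => match[1], :extension => match[2] }
  elsif (match = value.match(/\A\d+\z/))
    { :resolution => match[0], :extension => nil }
  else
    raise OptionParser::InvalidArgument.new("#{value} is not a valid argument.")
  end
end

# Hypothetical reduced parser: takes an array, so callers can pass ARGV or a
# hand-built list. parse! removes the options it recognises from args.
def parse_args(args)
  options = { :quality => nil }
  parser = OptionParser.new do |opts|
    opts.on("-q", "--quality QUALITY") { |q| options[:quality] = parse_quality(q) }
  end
  parser.parse!(args)
  options[:url] = args.first # whatever is left after option removal
  options
end

result = parse_args(["--quality", "720:mp4", "http://example.com/video"])
puts result.inspect
```

Raising an `OptionParser::ParseError` subclass from inside the option block keeps all user-input failures in one exception family, so a caller can rescue them separately from unexpected errors.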
data/bin/viddl-rb CHANGED
@@ -8,6 +8,7 @@ require "mechanize"
  require "cgi"
  require "open-uri"
  require "open3"
+ require "optparse"
 
  require "driver.rb"
  require "downloader.rb"
@@ -20,11 +21,11 @@ require "utility-helper.rb"
  begin
    # params is a hash with keys for each of the parameters passed in.
    # see helper/parameter-parser.rb for what those keys are.
-   params = ParameterParser.parse_app_parameters
+   params = ParameterParser.parse_app_parameters(ARGV)
 
    puts "Loading Plugins"
    ViddlRb::UtilityHelper.load_plugins
-   "Plugins loaded: #{ViddlRb::PluginBase.registered_plugins.inspect}"
+   puts "Plugins loaded: #{ViddlRb::PluginBase.registered_plugins.inspect}"
 
    puts "Will try to extract audio: #{params[:extract_audio] == true}."
    puts "Analyzing URL: #{params[:url]}"
@@ -32,8 +33,13 @@ begin
    app = Driver.new(params)
    app.start # starts the download process
 
+ rescue OptionParser::ParseError, ViddlRb::RequirementError => e
+   puts "Error: #{e.message}"
+   exit(1)
+
  rescue StandardError => e
-   puts "#{e.message}"
+   puts "Error: #{e.message}"
+   puts "\nBacktrace:"
+   puts e.backtrace
    exit(1)
  end
-
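The new rescue clauses rely on ordering: Ruby tries `rescue` clauses top to bottom, so the specific clause has to come before `StandardError`, which would otherwise catch `OptionParser::ParseError` as well. A small sketch of that dispatch follows; `classify` and the top-level `RequirementError` are hypothetical stand-ins, and symbols are returned instead of printing and exiting so the behaviour is easy to check.

```ruby
require "optparse"

# Hypothetical stand-in for ViddlRb::RequirementError from the diff.
class RequirementError < StandardError; end

# Runs the given block and reports which rescue clause handled it.
def classify
  yield
  :ok
rescue OptionParser::ParseError, RequirementError
  :user_error        # expected problems: print the message only
rescue StandardError
  :unexpected_error  # anything else: print the message and a backtrace
end

puts classify { raise OptionParser::InvalidArgument, "bad value" }
puts classify { raise ArgumentError, "bug somewhere" }
```

If the two clauses were swapped, the first example would also be reported as an unexpected error, because `OptionParser::ParseError` inherits from `StandardError`.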
@@ -3,7 +3,7 @@ module ViddlRb
    # This class is responsible for extracting audio from video files using ffmpeg.
    class AudioHelper
 
-     def self.extract(file_path)
+     def self.extract(file_path, save_dir)
        no_ext_filename = file_path.split('.')[0..-1][0]
        #capture stderr because ffmpeg expects an output param and will error out
        puts "Gathering information about the downloaded file."
@@ -32,7 +32,7 @@ module ViddlRb
          puts "Unknown audio format: #{audio_format}, using name as extension: '.#{audio_format}'."
          output_extension = audio_format
        end
-       output_filename = "#{no_ext_filename}.#{output_extension}"
+       output_filename = File.join(save_dir, "#{no_ext_filename}.#{output_extension}")
        if File.exist?(output_filename)
          puts "Audio file seems to exist already, removing it before extraction."
          File.delete(output_filename)
@@ -46,4 +46,3 @@ module ViddlRb
    end
 
  end
-
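The change above places the extracted audio file inside `save_dir` by joining the directory and the file name. In the hunk, the name without extension comes from `file_path.split('.')[0..-1][0]`, which keeps only the text before the first dot. The sketch below shows the same path-building step using `File.basename` instead, which is a different technique chosen here for illustration; `audio_output_path` is a hypothetical helper, not viddl-rb code.

```ruby
# Hypothetical helper: build the audio output path inside save_dir.
def audio_output_path(video_file, save_dir, extension)
  # ".*" strips only the last extension and also drops any directory part.
  base = File.basename(video_file, ".*")
  File.join(save_dir, "#{base}.#{extension}")
end

puts audio_output_path("My.Video.mp4", "downloads", "mp3")
```

With a title that contains dots, `File.basename` keeps "My.Video", whereas splitting on the first dot would keep only "My".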
@@ -1,86 +1,94 @@
  module ViddlRb
 
+   class RequirementError < StandardError; end
+
    class DownloadHelper
-     #usually not called directly
-     def self.fetch_file(uri)
-       begin
-         require "progressbar" #http://github.com/nex3/ruby-progressbar
-       rescue LoadError
-         puts "ERROR: You don't seem to have curl or wget on your system. In this case you'll need to install the 'progressbar' gem."
-         exit
-       end
-       progress_bar = nil
-       open(uri, :proxy => nil,
-         :content_length_proc => lambda { |length|
-           if length && 0 < length
-             progress_bar = ProgressBar.new(uri.to_s, length)
-             progress_bar.file_transfer_mode #to show download speed and file size
-           end
-         },
-         :progress_proc => lambda { |progress|
-           progress_bar.set(progress) if progress_bar
-         }) {|file| return file.read}
-     end
-
+
+     #viddl will use the first of these tools it finds on the system to download the video.
+     #if the system does not have any of these tools, net/http is used instead.
+     TOOLS_PRIORITY_LIST = [:wget, :curl]
+
      #simple helper that will save a file from the web and save it with a progress bar
-     def self.save_file(file_uri, file_name, save_dir = ".", amount_of_retries = 6)
+     def self.save_file(file_url, file_name, opts = {})
        trap("SIGINT") { puts "goodbye"; exit }
 
-       file_path = File.absolute_path(File.join(save_dir, file_name))
+       #default options
+       options = {:save_dir => ".",
+                  :amount_of_retries => 6,
+                  :tool => get_tool}
+
+       opts[:tool] = options[:tool] if opts[:tool].nil?
+       options.merge!(opts)
+
+       file_path = File.expand_path(File.join(options[:save_dir], file_name))
+       success = false
+
        #Some providers seem to flake out every now end then
-       amount_of_retries.times do |i|
-         if os_has?("wget")
-           puts "using wget"
-           `wget \"#{file_uri}\" -O #{file_path.inspect}`
-         elsif os_has?("curl")
-           puts "using curl"
-           #require "pry"; binding.pry; exit
+       options[:amount_of_retries].times do |i|
+         case options[:tool].to_sym
+         when :wget
+           puts "Using wget"
+           success = system "wget \"#{file_url}\" -O #{file_path.inspect}"
+         when :curl
+           puts "Using curl"
            #-L means: follow redirects, We set an agent because Vimeo seems to want one
-           `curl -A 'Wget/1.8.1' --retry 10 --retry-delay 5 --retry-max-time 4 -L \"#{file_uri}\" -o #{file_path.inspect}`
+           success = system "curl -A 'Wget/1.8.1' --retry 10 --retry-delay 5 --retry-max-time 4 -L \"#{file_url}\" -o #{file_path.inspect}"
          else
-           puts "using net/http"
-           open(file_path, 'wb') { |file|
-             file.write(fetch_file(file_uri)); puts
-           }
-         end
+           require_progressbar
+           puts "Using net/http"
+           success = download_and_save_file(file_url, file_path)
+         end
          #we were successful, we're outta here
-         if $? == 0
+         if success
            break
          else
-           puts "Download seems to have failed (retrying, attempt #{i+1}/#{amount_of_retries})"
+           puts "Download seems to have failed (retrying, attempt #{i+1}/#{options[:amount_of_retries]})"
            sleep 2
          end
-       end
-       $? == 0
+       end
+       success
      end
 
-     #checks to see whether the os has a certain utility like wget or curl
-     #`` returns the standard output of the process
-     #system returns the exit code of the process
-     def self.os_has?(utility)
-       windows = ENV['OS'] =~ /windows/i
+     def self.get_tool
+       tool = TOOLS_PRIORITY_LIST.find { |tool| ViddlRb::UtilityHelper.os_has?(tool) }
+       tool || :net_http
+     end
 
-       unless windows # if os is not Windows
-         `which #{utility}`.include?(utility)
-       else
-         if has_where?
-           system("where /q #{utility}") #/q is the quiet mode flag
-         else
-           begin #as a fallback we just run the utility itself
-             system(utility)
-           rescue Errno::ENOENT
-             false
+     def self.require_progressbar
+       begin
+         require "progressbar"
+       rescue LoadError
+         raise RequirementError,
+           "you don't seem to have curl or wget on your system. In this case you'll need to install the 'progressbar' gem."
+       end
+     end
+
+     # downloads and saves a file using the net/http streaming api
+     # return true if the download was successful, else returns false
+     def self.download_and_save_file(download_url, full_path)
+       final_url = UtilityHelper.get_final_location(download_url) # follow all redirects
+       uri = URI(final_url)
+       file = File.new(full_path, "wb")
+       file_size = 0
+
+       Net::HTTP.start(uri.host, uri.port) do |http|
+         http.request_get(uri.request_uri) do |res|
+           file_size = res.read_header["content-length"].to_i
+           bar = ProgressBar.new(File.basename(full_path), file_size)
+           bar.file_transfer_mode
+           res.read_body do |segment|
+             bar.inc(segment.size)
+             file.write(segment)
            end
          end
        end
+       file.close
+       print "\n"
+       download_successful?(full_path, file_size) #because Net::HTTP.start does not throw Net exceptions
      end
 
-     #checks if Windows has the where utility (Server 2003 and later)
-     #system only return nil if the command is not found
-     def self.has_where?
-       !system("where /q where").nil?
+     def self.download_successful?(full_file_path, file_size)
+       File.exist?(full_file_path) && File.size(full_file_path) == file_size
      end
    end
-
  end
-
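Two ideas in the hunk above can be shown in isolation: picking the first available tool from a priority list with a fallback, and overlaying caller options on defaults so that an unset `:tool` does not erase the default. The sketch below is an illustration under assumptions: `pick_tool`, `effective_options` and the `available` lambda are hypothetical names, with the lambda standing in for `ViddlRb::UtilityHelper.os_has?`. Unlike the diff, it builds a new hash rather than writing into the caller's `opts`.

```ruby
TOOLS_PRIORITY_LIST = [:wget, :curl]

# available is a hypothetical predicate standing in for UtilityHelper.os_has?
def pick_tool(available)
  # find returns the first tool the predicate accepts, or nil if none match.
  TOOLS_PRIORITY_LIST.find { |tool| available.call(tool) } || :net_http
end

def effective_options(opts, available)
  defaults = { :save_dir => ".", :amount_of_retries => 6, :tool => pick_tool(available) }
  # Drop nil values first so an unset :tool cannot overwrite the default,
  # and return a new hash instead of mutating the caller's.
  defaults.merge(opts.reject { |_, value| value.nil? })
end

only_curl = lambda { |tool| tool == :curl } # pretend only curl is installed
puts effective_options({ :tool => nil, :save_dir => "/tmp" }, only_curl).inspect
```

Injecting the availability check as a lambda keeps the selection logic testable on machines that have neither wget nor curl.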