distribustream 0.1.0

data/README ADDED
@@ -0,0 +1,107 @@
1
+ Welcome to DistribuStream!
2
+
3
+ DistribuStream is a fully open peercasting system which allows on-demand
4
+ or live streaming media to be delivered at a fraction of the normal cost.
5
+
6
+ This README covers the initial public release, known issues, and a general
7
+ development roadmap.
8
+
9
+ --
10
+
11
+ Usage:
12
+
13
+ The DistribuStream gem includes three config files, which can be found in
14
+ the conf directory of the gem:
15
+
16
+ example.yml - A standard example file, configuring for 5kB chunks
17
+ debug.yml - Same as example.yml, but configured for additional debug info
18
+ bigchunk.yml - An example config file, using 250kB chunks
19
+
20
+ Chunk size controls the size of the segments which are exchanged between
21
+ peers. Larger chunk sizes reduce the amount of control traffic the server
22
+ sends, but increase the probability of errors.
23
+
24
+ To begin, copy one of these config files and edit it to your choosing. Be
25
+ sure to set the host address to bind to and optionally the vhost you wish
26
+ to identify as.
27
+
28
+ Next, start the DistribuStream server:
29
+
30
+ distribustream --conf myconfig.yml
31
+
32
+ The DistribuStream server manages traffic on the peer network. It also handles
33
+ the checksumming of files.
34
+
35
+ You can see what's going on through the web interface, which runs one port
36
+ above the port you configured the server to listen on. If the server is
37
+ listening on the default port of 6086, you can reach the web interface at:
38
+
39
+ http://myserver.url:6087/
40
+
41
+ To populate the server with files, you need to attach the DistribuStream
42
+ seed. This works off of the same config file as the server, and must
43
+ be run on the same system:
44
+
45
+ dsseed --conf myconfig.yml
46
+
47
+ At this point your server is ready to go.
48
+
49
+ To test your server, use the DistribuStream client:
50
+
51
+ dsclient --url pdtp://myserver.url/file.ext
52
+
53
+ This will download file.ext from your DistribuStream server.
54
+
55
+ While you can't control the output filename at this point, the client supports
56
+ non-seekable output such as pipes. To play streaming media as it downloads,
57
+ you can:
58
+
59
+ mkfifo file.ext
60
+ dsclient --url pdtp://myserver.url/file.ext &
61
+ mediaplayer file.ext
62
+
63
+ --
64
+
65
+ Known Issues:
66
+
67
+ The client presently stores incoming data in a memory buffer. This causes
68
+ the client to consume massive amounts of memory as the file downloads.
69
+ Subsequent releases will fix this by improving the design of the memory
70
+ buffer, moving to a disk-backed buffer and/or discarding some of the
71
+ downloaded data after it's been played back.
72
+
73
+ The protocol allows clients to keep a moving window of data
74
+ in a stream, so they need not retain data which has already been displayed
75
+ to the user.
76
+
77
+ Seeds are presently not authenticated in any way, thus anyone can attach
78
+ a seed and populate the server with any files of their choosing. However,
79
+ since file checksumming is done by the server itself, only
80
+ seeds running on the same system as the server will actually work.
81
+
82
+ This will be resolved by either incorporating the seed directly into the
83
+ DistribuStream server, or adding both authentication and commands for
84
+ checksumming to the server <-> seed protocol.
85
+
86
+ --
87
+
88
+ Development Roadmap:
89
+
90
+ The immediate goal is to improve the performance of the client, which presently
91
+ consumes far too much RAM for practical use with large media files. Another
92
+ immediate goal is solving the above problems with seeds.
93
+
94
+ DistribuStream uses an assemblage of various tools which do not work together
95
+ particularly well. These include the EventMachine Ruby gem, which provides
96
+ the I/O layer for the DistribuStream server, and the Mongrel web server, which
97
+ runs independently of EventMachine and uses threads.
98
+
99
+ Initial work will focus on converting the existing implementation to a fully
100
+ EventMachine-based approach which eliminates the use of threads.
101
+
102
+ Subsequent work will focus on improving the APIs provided by the various
103
+ components so that the client and server can both
104
+
105
+ Long-term goals include a move to UDP to reduce protocol latency and overhead
106
+ as well as encrypting all traffic to ensure privacy and security of the
107
+ data being transmitted.
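
The chunk-size trade-off described in the README's Usage section can be made concrete with a little arithmetic. The sketch below is not part of the gem: `chunk_count` is a hypothetical helper, and the 100 MB file size is an assumed example. It only counts chunks, on the assumption that the server's control traffic grows with the number of chunks it has to coordinate.

```ruby
# Hypothetical helper (not in the gem): number of chunks needed for a file.
def chunk_count(file_size, chunk_size)
  raise ArgumentError, 'chunk_size must be positive' unless chunk_size > 0
  (file_size + chunk_size - 1) / chunk_size # integer ceiling division
end

file_size = 100_000_000 # assumed example: a 100 MB media file

# Chunk sizes taken from conf/example.yml and conf/bigchunk.yml
{ 'example.yml' => 5_000, 'bigchunk.yml' => 250_000 }.each do |name, size|
  puts "#{name}: #{chunk_count(file_size, size)} chunks"
end
# example.yml: 20000 chunks
# bigchunk.yml: 400 chunks
```

With the larger chunk size the server has 50 times fewer chunks to track for the same file, while each failed chunk costs 50 times more data to fetch again.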
data/Rakefile ADDED
@@ -0,0 +1,44 @@
1
+ require 'rake'
2
+ require 'rake/rdoctask'
3
+ require 'rake/gempackagetask'
4
+ load 'distribustream.gemspec'
5
+
6
+ # Default Rake task
7
+ task :default => :rdoc
8
+
9
+ # RDoc
10
+ Rake::RDocTask.new(:rdoc) do |task|
11
+ task.rdoc_dir = 'doc'
12
+ task.title = 'DistribuStream'
13
+ task.rdoc_files.include('bin/**/*.rb')
14
+ task.rdoc_files.include('lib/**/*.rb')
15
+ task.rdoc_files.include('simulation/**/*.rb')
16
+ task.rdoc_files.include('test/**/*.rb')
17
+ end
18
+
19
+ # Gem
20
+ Rake::GemPackageTask.new(GEMSPEC) do |pkg|
21
+ pkg.need_tar = true
22
+ end
23
+
24
+ # RSpec
25
+ begin
26
+ require 'spec/rake/spectask'
27
+
28
+ Spec::Rake::SpecTask.new(:spec) do |task|
29
+ task.spec_files = FileList['**/*_spec.rb']
30
+ end
31
+
32
+ Spec::Rake::SpecTask.new(:specfs) do |task|
33
+ task.spec_files = FileList['**/*_spec.rb']
34
+ task.spec_opts = "-f s".split
35
+ end
36
+
37
+ Spec::Rake::SpecTask.new(:spec_coverage) do |task|
38
+ task.spec_files = FileList['**/*_spec.rb']
39
+ task.rcov = true
40
+ end
41
+ rescue LoadError
42
+ end
43
+
44
+
@@ -0,0 +1,60 @@
1
+ #!/usr/bin/env ruby
2
+ #--
3
+ # Copyright (C) 2006-07 ClickCaster, Inc. (info@clickcaster.com)
4
+ # All rights reserved. See COPYING for permissions.
5
+ #
6
+ # This source file is distributed as part of the
7
+ # DistribuStream file transfer system.
8
+ #
9
+ # See http://distribustream.rubyforge.org/
10
+ #++
11
+
12
+ require 'rubygems'
13
+ require 'eventmachine'
14
+ require 'optparse'
15
+ require 'logger'
16
+ require 'mongrel'
17
+
18
+ require File.dirname(__FILE__) + '/../lib/pdtp/server'
19
+
20
+ common_init $0
21
+
22
+ server = PDTP::Server.new
23
+ server.file_service = PDTP::Server::FileService.new
24
+ PDTP::Protocol.listener = server
25
+
26
+ #set up the mongrel server for serving the stats page
27
+ class MongrelServerHandler < Mongrel::HttpHandler
28
+ def initialize(server)
29
+ @server = server
30
+ end
31
+
32
+ def process(request,response)
33
+ response.start(200) do |head, out|
34
+ out.write begin
35
+ outstr = @server.generate_html_stats
36
+ rescue Exception=>e
37
+ outstr = "Exception: #{e}\n#{e.backtrace.join("\n")}"
38
+ end
39
+ end
40
+ end
41
+ end
42
+
43
+ #run the mongrel server
44
+ mongrel_server = Mongrel::HttpServer.new '0.0.0.0', @@config[:port] + 1
45
+ @@log.info "Mongrel server listening on port: #{@@config[:port] + 1}"
46
+ mongrel_server.register '/', MongrelServerHandler.new(server)
47
+ mongrel_server.run
48
+
49
+ #set root directory
50
+ server.file_service.root = @@config[:file_root]
51
+ server.file_service.default_chunk_size = @@config[:chunk_size]
52
+
53
+ EventMachine::run do
54
+ host, port = '0.0.0.0', @@config[:port]
55
+ EventMachine::start_server host, port, PDTP::Protocol
56
+ @@log.info "accepting connections with ev=#{EventMachine::VERSION}"
57
+ @@log.info "host=#{host} port=#{port}"
58
+
59
+ EventMachine::add_periodic_timer(2) { server.clear_all_stalled_transfers }
60
+ end
data/bin/dsclient ADDED
@@ -0,0 +1,43 @@
1
+ #!/usr/bin/env ruby
2
+ #--
3
+ # Copyright (C) 2006-07 ClickCaster, Inc. (info@clickcaster.com)
4
+ # All rights reserved. See COPYING for permissions.
5
+ #
6
+ # This source file is distributed as part of the
7
+ # DistribuStream file transfer system.
8
+ #
9
+ # See http://distribustream.rubyforge.org/
10
+ #++
11
+
12
+ require 'rubygems'
13
+ require 'eventmachine'
14
+ require 'mongrel'
15
+ require 'optparse'
16
+ require 'uri'
17
+
18
+ require File.dirname(__FILE__) + '/../lib/pdtp/client'
19
+
20
+ uri = nil
21
+ listen_port = 8000
22
+
23
+ OptionParser.new do |opts|
24
+ opts.banner = "Usage: #{$0} [options]"
25
+ opts.on("--url URL", "Fetch from the specified URL") do |u|
26
+ uri = URI.parse u
27
+ end
28
+ opts.on("--help", "Prints this usage info.") do
29
+ puts opts
30
+ exit
31
+ end
32
+ opts.on("--listen PORT", "Port to listen on") do |l|
33
+ listen_port = l.to_i
34
+ end
35
+ end.parse!
36
+
37
+ raise "Please specify a URL in the form --url URL" unless uri
38
+ raise "Only pdtp:// URLs are supported" unless uri.scheme == 'pdtp'
39
+
40
+ options = { :listen_port => listen_port }
41
+ options[:port] = uri.port unless uri.port.nil?
42
+
43
+ PDTP::Client.get uri.host, uri.path, options
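
The option handling in bin/dsclient can be exercised on its own. The sketch below is an illustration only: `client_options` is a hypothetical helper that mirrors the script's logic without calling `PDTP::Client.get`. It relies on the fact that Ruby's `URI.parse` reports a `nil` port for a scheme it does not know, such as `pdtp`, when the URL does not name one, which is why the script only sets `:port` when a port is present.

```ruby
require 'uri'

# Hypothetical helper mirroring the option handling in bin/dsclient.
# Returns the host, path and options that would be passed to PDTP::Client.get.
def client_options(url, listen_port = 8000)
  uri = URI.parse(url)
  raise ArgumentError, 'Only pdtp:// URLs are supported' unless uri.scheme == 'pdtp'

  options = { :listen_port => listen_port }
  # pdtp has no registered default port, so uri.port is nil unless given
  options[:port] = uri.port unless uri.port.nil?
  [uri.host, uri.path, options]
end

host, path, options = client_options('pdtp://myserver.url/file.ext')
puts host                 # myserver.url
puts path                 # /file.ext
puts options.key?(:port)  # false

_, _, with_port = client_options('pdtp://myserver.url:7000/file.ext')
puts with_port[:port]     # 7000
```

When no port is given, `PDTP::Client.get` falls back to its own default of 6086, as shown in lib/pdtp/client.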
data/bin/dsseed ADDED
@@ -0,0 +1,103 @@
1
+ #!/usr/bin/env ruby
2
+ #--
3
+ # Copyright (C) 2006-07 ClickCaster, Inc. (info@clickcaster.com)
4
+ # All rights reserved. See COPYING for permissions.
5
+ #
6
+ # This source file is distributed as part of the
7
+ # DistribuStream file transfer system.
8
+ #
9
+ # See http://distribustream.rubyforge.org/
10
+ #++
11
+
12
+ require 'optparse'
13
+ require 'rubygems'
14
+ require 'eventmachine'
15
+ require 'mongrel'
16
+
17
+ require File.dirname(__FILE__) + '/../lib/pdtp/client'
18
+
19
+ common_init $0
20
+
21
+ # Find all suitable files in the given path
22
+ def find_files(base_path)
23
+ require 'find'
24
+
25
+ found = []
26
+ excludes = %w{.svn CVS}
27
+ base_full = File.expand_path(base_path)
28
+
29
+ Find.find(base_full) do |path|
30
+ if FileTest.directory?(path)
31
+ next unless excludes.include?(File.basename(path))
32
+ Find.prune
33
+ else
34
+ filename = path[(base_full.size - path.size + 1)..-1] # the file path relative to the expanded base path
35
+ found << filename
36
+ end
37
+ end
38
+
39
+ found
40
+ end
41
+
42
+ # Implements the file service for the pdtp protocol
43
+ class FileServiceProtocol < PDTP::Protocol
44
+ def initialize(*args)
45
+ super
46
+ end
47
+
48
+ # Called after a connection to the server has been established
49
+ def connection_completed
50
+ begin
51
+ listen_port = @@config[:listen_port]
52
+
53
+ #create the client
54
+ client = PDTP::Client.new
55
+ PDTP::Protocol.listener = client
56
+ client.server_connection = self
57
+ client.generate_client_id listen_port
58
+
59
+ # Start a mongrel server on the specified port. If it isn't available, keep trying higher ports
60
+ begin
61
+ mongrel_server=Mongrel::HttpServer.new '0.0.0.0', listen_port
62
+ rescue Exception=>e
63
+ listen_port+=1
64
+ retry
65
+ end
66
+
67
+ @@log.info "listening on port #{listen_port}"
68
+ mongrel_server.register "/", client
69
+ mongrel_server.run
70
+
71
+ # Tell the server a little bit about ourselves
72
+ send_message :client_info, :listen_port => listen_port, :client_id => client.my_id
73
+
74
+ @@log.info 'This client is providing'
75
+ sfs = PDTP::Server::FileService.new
76
+ sfs.root = @@config[:file_root]
77
+ client.file_service = sfs #give this client access to all data
78
+
79
+ hostname = @@config[:vhost]
80
+
81
+ # Provide all the files in the root directory
82
+ files = find_files @@config[:file_root]
83
+ files.each { |file| send_message :provide, :url => "http://#{hostname}/#{file}" }
84
+ rescue Exception => e
85
+ puts "Exception in connection_completed: #{e}"
86
+ puts e.backtrace.join("\n")
87
+ exit
88
+ end
89
+ end
90
+
91
+ def unbind
92
+ super
93
+ puts "Disconnected from PDTP server."
94
+ end
95
+ end
96
+
97
+ # Run the EventMachine reactor loop
98
+ EventMachine::run do
99
+ host, port, listen_port = @@config[:host], @@config[:port], @@config[:listen_port]
100
+ connection = EventMachine::connect host, port, FileServiceProtocol
101
+ @@log.info "connecting with ev=#{EventMachine::VERSION}"
102
+ @@log.info "host=#{host} port=#{port}"
103
+ end
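
The file enumeration in `find_files` above can be tried in isolation. The following is a sketch rather than the gem's code: `relative_files` is a hypothetical stand-alone variant that measures the prefix against the expanded base path, so it also works when a relative directory is passed in, and it sorts the result for predictable output.

```ruby
require 'find'
require 'tmpdir'
require 'fileutils'

# Hypothetical stand-alone variant of find_files from bin/dsseed.
# Returns file paths relative to base_path, skipping version-control dirs.
def relative_files(base_path, excludes = %w[.svn CVS])
  base_full = File.expand_path(base_path)
  found = []

  Find.find(base_full) do |path|
    if File.directory?(path)
      # Find.prune skips the directory and everything beneath it
      Find.prune if excludes.include?(File.basename(path))
    else
      found << path[(base_full.size + 1)..-1] # drop "<base_full>/"
    end
  end

  found.sort
end

# Demo on a throwaway directory tree
Dir.mktmpdir do |root|
  FileUtils.mkdir_p(File.join(root, 'media', '.svn'))
  File.write(File.join(root, 'a.txt'), 'a')
  File.write(File.join(root, 'media', 'b.ogg'), 'b')
  File.write(File.join(root, 'media', '.svn', 'entries'), 'x')
  p relative_files(root) # ["a.txt", "media/b.ogg"]
end
```

Each returned path is what dsseed appends to `http://<vhost>/` when it sends a `:provide` message.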
data/conf/bigchunk.yml ADDED
@@ -0,0 +1,18 @@
1
+ ---
2
+ # Virtual hostname server identifies itself as
3
+ :vhost: example.com
4
+
5
+ # Hostname or IP address to bind to
6
+ :host: 0.0.0.0
7
+
8
+ # Base port the server listens on
9
+ :port: 6086
10
+
11
+ # Run in quiet mode
12
+ :quiet: true
13
+
14
+ # Base directory from which files are served
15
+ :file_root: /Users/tony/dstest/files
16
+
17
+ # Size of segments to be distributed (in bytes)
18
+ :chunk_size: 250000
data/conf/debug.yml ADDED
@@ -0,0 +1,18 @@
1
+ ---
2
+ # Virtual hostname server identifies itself as
3
+ :vhost: example.com
4
+
5
+ # Hostname or IP address to bind to
6
+ :host: 0.0.0.0
7
+
8
+ # Base port the server listens on
9
+ :port: 6086
10
+
11
+ # Run in quiet mode
12
+ :quiet: false
13
+
14
+ # Base directory from which files are served
15
+ :file_root: /Users/tony/dstest/files
16
+
17
+ # Size of segments to be distributed (in bytes)
18
+ :chunk_size: 5000
data/conf/example.yml ADDED
@@ -0,0 +1,18 @@
1
+ ---
2
+ # Virtual hostname server identifies itself as
3
+ :vhost: example.com
4
+
5
+ # Hostname or IP address to bind to
6
+ :host: 0.0.0.0
7
+
8
+ # Base port the server listens on
9
+ :port: 6086
10
+
11
+ # Run in quiet mode
12
+ :quiet: true
13
+
14
+ # Base directory from which files are served
15
+ :file_root: /Users/tony/dstest/files
16
+
17
+ # Size of segments to be distributed (in bytes)
18
+ :chunk_size: 5000
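
The three config files use YAML keys with a leading colon, which Ruby's YAML library reads as symbols. The sketch below shows one way such a file could be loaded with a current Ruby. It is an assumption, not the gem's loader: `common_init` is not part of this diff, and `load_config` is a hypothetical name. On recent Ruby versions symbols have to be permitted explicitly when using `YAML.safe_load`.

```ruby
require 'yaml'

# Contents of conf/example.yml with the comments removed
EXAMPLE_CONFIG = <<~CONF
  ---
  :vhost: example.com
  :host: 0.0.0.0
  :port: 6086
  :quiet: true
  :file_root: /Users/tony/dstest/files
  :chunk_size: 5000
CONF

# Hypothetical loader: parses the YAML text into a Hash with Symbol keys.
def load_config(text)
  YAML.safe_load(text, permitted_classes: [Symbol])
end

config = load_config(EXAMPLE_CONFIG)
puts config[:vhost]                            # example.com
puts "stats page port: #{config[:port] + 1}"   # stats page port: 6087
```

The second line of output matches the README's note that the web interface runs one port above the configured server port.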
@@ -0,0 +1,20 @@
1
+ require 'rubygems'
2
+
3
+ GEMSPEC = Gem::Specification.new do |s|
4
+ s.name = "distribustream"
5
+ s.version = "0.1.0"
6
+ s.date = "2008-10-11"
7
+ s.summary = "DistribuStream is a fully open peercasting system allowing on-demand or live streaming media to be delivered at a fraction of the normal cost"
8
+ s.email = "tony@clickcaster.com"
9
+ s.homepage = "http://distribustream.rubyforge.org"
10
+ s.rubyforge_project = "distribustream"
11
+ s.has_rdoc = true
12
+ s.rdoc_options = ["--exclude", "definitions", "--exclude", "indexes"]
13
+ s.extra_rdoc_files = ["COPYING", "README", "CHANGES"]
14
+ s.authors = ["Tony Arcieri", "Ashvin Mysore", "Galen Pahlke", "James Sanders", "Tom Stapleton"]
15
+ s.files = Dir.glob("{bin,lib,conf}/**/*") + ['Rakefile', 'distribustream.gemspec']
16
+ s.executables = %w{distribustream dsseed dsclient}
17
+ s.add_dependency("eventmachine", ">= 0.9.0")
18
+ s.add_dependency("mongrel", ">= 1.0.1")
19
+ s.add_dependency("json", ">= 1.1.0")
20
+ end
@@ -0,0 +1,195 @@
1
+ #--
2
+ # Copyright (C) 2006-07 ClickCaster, Inc. (info@clickcaster.com)
3
+ # All rights reserved. See COPYING for permissions.
4
+ #
5
+ # This source file is distributed as part of the
6
+ # DistribuStream file transfer system.
7
+ #
8
+ # See http://distribustream.rubyforge.org/
9
+ #++
10
+
11
+ require 'rubygems'
12
+ require 'eventmachine'
13
+ require 'mongrel'
14
+ require 'net/http'
15
+ require 'thread'
16
+ require 'digest/md5'
17
+
18
+ require File.dirname(__FILE__) + '/common/common_init'
19
+ require File.dirname(__FILE__) + '/common/protocol'
20
+ require File.dirname(__FILE__) + '/client/protocol'
21
+ require File.dirname(__FILE__) + '/client/file_service'
22
+ require File.dirname(__FILE__) + '/client/transfer'
23
+ require File.dirname(__FILE__) + '/server/file_service'
24
+
25
+ module PDTP
26
+ # This is the main driver for the client-side implementation
27
+ # of PDTP. It maintains a single connection to a server and
28
+ # all the necessary connections to peers. It is responsible
29
+ # for handling all messages corresponding to these connections.
30
+
31
+ # Client inherits from Mongrel::HttpHandler in order to handle
32
+ # incoming HTTP connections
33
+ class Client < Mongrel::HttpHandler
34
+ # Accessor for a client file service instance
35
+ attr_accessor :file_service
36
+ attr_accessor :server_connection
37
+ attr_accessor :my_id
38
+
39
+ def self.get(host, path, options = {})
40
+ path = '/' + path unless path[0] == ?/
41
+
42
+ opts = {
43
+ :host => host,
44
+ :port => 6086,
45
+ :file_root => '.',
46
+ :quiet => true,
47
+ :listen_port => 8000,
48
+ :request_url => "http://#{host}#{path}"
49
+ }.merge(options)
50
+
51
+ common_init $0, opts
52
+
53
+ # Run the EventMachine reactor loop
54
+ EventMachine::run do
55
+ connection = EventMachine::connect host, opts[:port], Client::Protocol
56
+ @@log.info "connecting with ev=#{EventMachine::VERSION}"
57
+ @@log.info "host=#{host} port=#{opts[:port]}"
58
+ end
59
+ end
60
+
61
+ def initialize
62
+ @transfers = []
63
+ @mutex = Mutex.new
64
+ end
65
+
66
+ # This method is called after a connection to the server
67
+ # has been successfully established.
68
+ def connection_created(connection)
69
+ @@log.debug("[mongrel] Opened connection...")
70
+ end
71
+
72
+ # This method is called when the server connection is destroyed
73
+ def connection_destroyed(connection)
74
+ @@log.debug("[mongrel] Closed connection...")
75
+ end
76
+
77
+ # Returns a transfer object if the given connection is a peer associated with
78
+ # that transfer. Otherwise returns nil.
79
+ def get_transfer(connection)
80
+ @transfers.each { |t| return t if t.peer == connection }
81
+ nil
82
+ end
83
+
84
+ # This method is called when an HTTP request is received. It is called in
85
+ # a separate thread, one for each request.
86
+ def process(request,response)
87
+ begin
88
+ @@log.debug "Creating Transfer::Listener"
89
+ transfer = Transfer::Listener.new(
90
+ request,
91
+ response,
92
+ @server_connection,
93
+ @file_service,
94
+ self
95
+ )
96
+
97
+ #Needs to be locked because multiple threads could attempt to append a transfer at once
98
+ @mutex.synchronize { @transfers << transfer }
99
+ transfer.handle_header
100
+ rescue Exception=>e
101
+ transfer.write_http_exception(e) if transfer # transfer is nil if Listener.new raised
102
+ end
103
+
104
+ transfer.send_completed_message transfer.hash if transfer
105
+ end
106
+
107
+ # Returns true if the given message refers to the given transfer
108
+ def transfer_matches?(transfer, message)
109
+ transfer.peer == message["peer"] and
110
+ transfer.url == message["url"] and
111
+ transfer.byte_range == message["range"] and
112
+ transfer.peer_id == message["peer_id"]
113
+ end
114
+
115
+ # Called when any server message is received. This is the brains of
116
+ # the client's protocol handling.
117
+ def dispatch_message(command, message, connection)
118
+ case command
119
+ when "tell_info" # Receive and store information for this url
120
+ info = FileInfo.new message["url"].split('/').last
121
+ info.file_size = message["size"]
122
+ info.base_chunk_size = message["chunk_size"]
123
+ info.streaming = message["streaming"]
124
+ @file_service.set_info(message["url"], info)
125
+ when "transfer" # Begin a transfer as a connector
126
+ transfer = Transfer::Connector.new(message,@server_connection,@file_service,self)
127
+
128
+ @@log.debug "TRANSFER STARTING"
129
+
130
+ # Run each transfer in its own thread and notify the server upon completion
131
+ Thread.new(transfer) do |t|
132
+ begin
133
+ t.run
134
+ rescue Exception=>e
135
+ @@log.info("Exception in dispatch_message: " + e.message + "\n" + e.backtrace.join("\n"))
136
+ end
137
+ t.send_completed_message(t.hash)
138
+ end
139
+ when "tell_verify"
140
+ # We are a listener, and asked for verification of a transfer from a server.
141
+ # After asking for verification, we stopped running, and must be restarted
142
+ # if verification is successful
143
+
144
+ found=false
145
+ @transfers.each do |t|
146
+ if t.matches_message?(message)
147
+ finished(t)
148
+ t.tell_verify(message["is_authorized"])
149
+ found=true
150
+ break
151
+ end
152
+ end
153
+
154
+ unless found
155
+ puts "BUG: Tell verify sent for an unknown transfer"
156
+ exit!
157
+ end
158
+ when "hash_verify"
159
+ @@log.debug "Hash verified for url=#{message["url"]} range=#{message["range"]} hash_ok=#{message["hash_ok"]}"
160
+ when "protocol_error", "protocol_warn" #ignore
161
+ else raise "Server sent an unknown message type: #{command}"
162
+ end
163
+ end
164
+
165
+ #Prints the number of transfers associated with this client
166
+ def print_stats
167
+ @@log.debug "client: num_transfers=#{@transfers.size}"
168
+ end
169
+
170
+ #Provides a threadsafe mechanism for transfers to report themselves finished
171
+ def finished(transfer)
172
+ @mutex.synchronize do
173
+ @transfers.delete(transfer)
174
+ end
175
+ end
176
+
177
+ # Generate and set the client ID for an instance
178
+ def generate_client_id(port = 0)
179
+ @my_id = Client.generate_client_id port
180
+ end
181
+
182
+ # Client ID generator routine
183
+ def self.generate_client_id(port = 0)
184
+ md5 = Digest::MD5::new
185
+ now = Time::now
186
+ md5.update now.to_s
187
+ md5.update String(now.usec)
188
+ md5.update String(rand(0))
189
+ md5.update String($$)
190
+
191
+ #return md5.hexdigest+":#{port}" # long id
192
+ return md5.hexdigest[0..5] # short id
193
+ end
194
+ end
195
+ end
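
The client ID routine at the end of `PDTP::Client` can be reproduced outside the class to see what it yields. This is a sketch under assumptions: `short_client_id` is a hypothetical stand-alone function that hashes the same kinds of input (current time, microseconds, a random number, the process ID) and keeps the first six hex digits, as the "short id" line does. Note that the `port` argument of the original method is unused while the "long id" line stays commented out.

```ruby
require 'digest/md5'

# Hypothetical stand-alone version of the short client ID generator.
def short_client_id
  md5 = Digest::MD5.new
  now = Time.now
  md5.update now.to_s          # wall-clock time, second resolution
  md5.update now.usec.to_s     # microsecond part of the same timestamp
  md5.update rand.to_s         # random float in [0, 1)
  md5.update Process.pid.to_s  # process ID (the original uses $$)
  md5.hexdigest[0..5]          # keep the first 6 hex digits
end

puts short_client_id
```

Six hex digits give 16,777,216 possible IDs, so collisions become plausible once a network has a few thousand clients; the commented-out long form would keep the full 32-digit digest.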