bwkfanboy 0.0.1

data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ (The MIT License)
2
+
3
+ Copyright (c) 2010 Alexander Gromnitsky.
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ 'Software'), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
19
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
20
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
21
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
22
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.rdoc ADDED
@@ -0,0 +1,88 @@
1
+ = About
2
+
3
+ bwkfanboy is an HTML to Atom feed converter that you can use to watch
4
+ sites that do not provide their own feed.
5
+
6
+ The converter is not a magic tool: you'll need to write a plugin (in
7
+ Ruby) for each site you want to watch. bwkfanboy provides guidelines and
8
+ general assistance.
9
+
10
+ = Architecture
11
+
12
+ == Plugins
13
+
14
+ bwkfanboy comes with 1 example plugin that parses a search page of
15
+ dailyprincetonian.com looking for bwk's articles.
16
+
17
+ The plugin is a Ruby class +Page+ that inherits Bwkfanboy::Parse
18
+ parent, overriding 1 method.
19
+
20
+ The plugins can be in the system
21
+
22
+ `gem env gemdir`/gems/bwkfanboy-x.y.z/lib/bwkfanboy/plugins
23
+
24
+ or user's home
25
+
26
+ ~/.bwkfanboy/plugins
27
+
28
+ directories.
29
+
30
+ == Pipeline
31
+
32
+ The program consists of 4 parts:
33
+
34
+ 0. *bwkfanboy* script that takes 1 parameter: the name of a file in
35
+ plugins directories (without the .rb suffix). So, for example to get
36
+ an atom feed from dailyprincetonian.com you type:
37
+
38
+ % bwkfanboy bwk
39
+
40
+ and it will load
41
+ <tt>/usr/local/lib/ruby/gems/1.9/gems/bwkfanboy-0.0.1/lib/bwkfanboy/plugins/bwk.rb</tt>
42
+ file on my FreeBSD machine, fetch and parse html from
43
+ dailyprincetonian.com and generate the required feed, dumping it to
44
+ stdout.
45
+
46
+ The script is just a convenient wrapper for 3 separate utils.
47
+
48
+ 1. *bwkfanboy_fetch*
49
+
50
+ It reads 1 line from stdin for the URL to fetch from. The result will
51
+ be dumped to stdout.
52
+
53
+ 2. *bwkfanboy_parse*
54
+
55
+ It takes 1 parameter: <em>a full path</em> to a plugin file.
56
+
57
+ This util reads stdin expecting it to be XHTML, parses it and dumps
58
+ the result to stdout as a JSON-formatted object.
59
+
60
+ 3. *bwkfanboy_generate*
61
+
62
+ Reads stdin expecting it to be a proper JSON-formatted object.
63
+
64
+ The result will be an Atom feed dumped to stdout in UTF-8.
65
+
66
+ So, without the wrapper all this together looks like:
67
+
68
+ % echo http://example.org | bwkfanboy_fetch |
69
+ bwkfanboy_parse /path/to/my/plugin.rb | bwkfanboy_generate
70
+
71
+ == Log
72
+
73
+ All utils write to <tt>/tmp/bwkfanboy/USER/log/general.log</tt> file if
74
+ permissions allow it.
75
+
76
+ == HTTP
77
+
78
+ There are 2 methods to get an Atom feed via HTTP:
79
+
80
+ 1. <tt>web/bwkfanboy.cgi</tt> (from the program tarball), which you may
81
+ copy to your Apache cgi directory and run it. This prohibits you from
82
+ using HOME directory for your own plugins. Also the cgi script
83
+ requires some manual editing (setting 1 variable in it) before you
84
+ can even start using it.
85
+
86
+ 2. Small *bwkfanboy_server* HTTP server. It can run from any user and
87
+ thus is able to inherit env variables for discovering your HOME
88
+ directory. Read bin/bwkfanboy_server to know how to operate it.
data/Rakefile ADDED
@@ -0,0 +1,48 @@
1
+ # -*-ruby-*-
2
+
3
+ require 'rake'
4
+ require 'rake/gempackagetask'
5
+ require 'rake/clean'
6
+ require 'rake/rdoctask'
7
+ require 'rake/testtask'
8
+
9
+ spec = Gem::Specification.new() {|i|
10
+ i.name = "bwkfanboy"
11
+ i.summary = 'A converter from HTML to Atom feed that you can use to watch sites that do not provide their own feed.'
12
+ i.version = '0.0.1'
13
+ i.author = 'Alexander Gromnitsky'
14
+ i.email = 'alexander.gromnitsky@gmail.com'
15
+ i.homepage = 'http://github.com/gromnitsky/bwkfanboy'
16
+ i.platform = Gem::Platform::RUBY
17
+ i.required_ruby_version = '>= 1.9'
18
+ i.files = FileList['lib/**/*', 'bin/*', 'doc/*', '[A-Z]*', 'test/**/*']
19
+
20
+ i.executables = FileList['bin/*'].gsub(/^bin\//, '')
21
+ i.default_executable = i.name
22
+
23
+ i.has_rdoc = true
24
+ i.test_files = FileList['test/test_*.rb']
25
+
26
+ i.rdoc_options << '-m' << 'Bwkfanboy' << '-x' << 'plugins'
27
+ i.extra_rdoc_files = FileList['bin/*']
28
+
29
+ i.add_dependency('activesupport', '>= 3.0.0')
30
+ i.add_dependency('nokogiri', '>= 1.4.3')
31
+ i.add_dependency('open4', '>= 1.0.1')
32
+ i.add_dependency('jsonschema', '>= 2.0.0')
33
+ }
34
+
35
+ Rake::GemPackageTask.new(spec).define()
36
+
37
+ task(default: %(repackage))
38
+
39
+ Rake::RDocTask.new('doc') {|i|
40
+ i.main = "Bwkfanboy"
41
+ i.rdoc_files = FileList['doc/*', 'lib/**/*.rb', 'bin/*']
42
+ i.rdoc_files.exclude("lib/**/plugins", "test")
43
+ }
44
+
45
+ Rake::TestTask.new() {|i|
46
+ i.test_files = FileList['test/test_*.rb']
47
+ i.verbose = true
48
+ }
data/TODO ADDED
@@ -0,0 +1,7 @@
1
+ -*-text-*-
2
+
3
+ 0.0.2
4
+ -----
5
+
6
+ - Add plugin listing to bwkfanboy_server.
7
+ - More tests.
data/bin/bwkfanboy ADDED
@@ -0,0 +1,128 @@
1
+ #!/usr/bin/env ruby19
2
+ # -*-ruby-*-
3
+
4
+ # This program is executed by bin/bwkfanboy_server to do all the dirty work:
5
+ # fetch HTML, parse it and generate a pretty Atom feed.
6
+ #
7
+ # It is a wrapper which you can utilize for such common tasks as listing
8
+ # all available plugins.
9
+ #
10
+ # Type:
11
+ #
12
+ # % bwkfanboy -h
13
+ #
14
+ # to get some basic help & read about Bwkfanboy module.
15
+
16
+ require_relative '../lib/bwkfanboy/parser'
17
+
18
+ $conf = {
19
+ prog_name: 'bwkfanboy',
20
+ prog_ver: '0.0.1',
21
+ mode: 'pipe',
22
+ banner: "Usage: #{File.basename($0)} [options] plugin-name"
23
+ }
24
+
25
+ class Plugin # :nodoc: all
26
+ attr_reader :name, :path
27
+
28
+ def initialize(name)
29
+ @name = name
30
+ @path = nil
31
+ end
32
+
33
+ def dirs()
34
+ # try to create user's home plugin directory
35
+ begin
36
+ ['~/.bwkfanboy', '~/.bwkfanboy/plugins'].each {|i|
37
+ Dir.mkdir(File.expand_path(i))
38
+ }
39
+ rescue
40
+ # empty
41
+ end
42
+
43
+ r = []
44
+ dirs = ['~/.bwkfanboy/plugins', "#{Bwkfanboy::Utils.gem_dir_system}/plugins"]
45
+ begin
46
+ # this will fail for user's home directory under Apache CGI
47
+ # environment
48
+ dirs.map! {|i| File.expand_path(i) }
49
+ rescue
50
+ end
51
+ dirs.each {|i|
52
+ if File.readable?(i) then
53
+ r << i
54
+ else
55
+ Bwkfanboy::Utils.warnx("directory #{i} isn't readable");
56
+ end
57
+ }
58
+
59
+ if r.length == 0 then
60
+ Bwkfanboy::Utils.errx(1, "no dirs for plugins found: #{dirs.join(' ')}")
61
+ end
62
+ return r
63
+ end
64
+
65
+ def load()
66
+ abort($conf[:banner]) unless (@name && @name !~ /^\s*$/)
67
+
68
+ dirs.each {|i|
69
+ files = Dir.glob("#{i}/*.rb")
70
+ if (@path = files.index("#{i}/#{@name}.rb")) then
71
+ @path = files[@path]
72
+ break
73
+ end
74
+ }
75
+ Bwkfanboy::Utils.errx(1, "no such plugin '#{@name}'") if ! @path
76
+ Bwkfanboy::Utils.plugin_load(@path, Bwkfanboy::Meta::PLUGIN_CLASS)
77
+
78
+ pn = Page.new()
79
+ pn.check()
80
+ return pn
81
+ end
82
+
83
+ end # class
84
+
85
+ # ----------------------------------------------------------------------
86
+
87
+ o = Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner]) # create OptionParser object
88
+ o.on('-i', 'Show some info about the plugin') { |i| $conf[:mode] = 'info' }
89
+ o.on('-l', 'List all plugins') { |i| $conf[:mode] = 'list' }
90
+ o.on('-p', 'List all plugins paths') { |i| $conf[:mode] = 'path' }
91
+ o.on('-D', '(ignore this) Use URI_DEBUG const instead of URI in plugins') { |i| $conf[:mode] = 'debug' }
92
+ Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], o) # run cl parser
93
+
94
+ plugin = Plugin.new(ARGV[0])
95
+
96
+ case $conf[:mode]
97
+ when 'list'
98
+ plugin.dirs().each {|i|
99
+ puts "#{i}:"
100
+ Dir.glob("#{i}/*.rb").each {|j|
101
+ puts "\t#{File.basename(j, '.rb')}"
102
+ }
103
+ }
104
+ when 'path'
105
+ plugin.dirs().each {|i| puts i}
106
+ when 'info'
107
+ plugin.load().dump_info
108
+ else
109
+ # A pipe mode
110
+ pn = plugin.load()
111
+ cmd = "./bwkfanboy_fetch | ./bwkfanboy_parse '#{plugin.path}' | ./bwkfanboy_generate"
112
+ if Bwkfanboy::Utils.cfg[:verbose] >= 2 then
113
+ puts ($conf[:mode] != 'debug' ? pn.class::Meta::URI : pn.class::Meta::URI_DEBUG)
114
+ puts cmd
115
+ exit 0
116
+ end
117
+
118
+ # go to the directory with current script
119
+ Dir.chdir(File.dirname(File.expand_path($0)))
120
+
121
+ pipe = IO.popen(cmd, 'w+')
122
+ pipe.puts ($conf[:mode] != 'debug' ? pn.class::Meta::URI : pn.class::Meta::URI_DEBUG)
123
+ pipe.close_write
124
+ while line = pipe.gets
125
+ puts line
126
+ end
127
+ pipe.close
128
+ end
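The lookup in `Plugin#load` above walks the plugin directories in order and stops at the first `NAME.rb` it finds, so a plugin in the user's home directory shadows a system plugin of the same name. The search order can be sketched on its own as follows. This is only an illustration: `find_plugin` is a hypothetical helper that is not part of bwkfanboy, and it checks candidate paths with `File.file?` instead of the `Dir.glob` plus `index` combination used by the script.

```ruby
# Hypothetical helper, not part of bwkfanboy: return the path of the first
# NAME.rb found in +dirs+ (searched in the given order), or nil if no
# directory contains such a file.
def find_plugin(name, dirs)
  # Mirror the script's guard against a missing or blank plugin name.
  return nil if name.nil? || name.strip.empty?

  dirs.each do |dir|
    candidate = File.join(dir, "#{name}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end
```

Called as `find_plugin('bwk', [home_dir, system_dir])`, it returns the copy under `home_dir` when both directories contain `bwk.rb`.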
data/bin/bwkfanboy_fetch ADDED
@@ -0,0 +1,30 @@
1
+ #!/usr/bin/env ruby19
2
+ # -*-ruby-*-
3
+
4
+ # Read a URI or a full path to a local file from stdin, download it (or
5
+ # read the local file) and print it to stdout.
6
+
7
+ require 'open-uri'
8
+
9
+ require_relative '../lib/bwkfanboy/utils'
10
+
11
+ $conf = { banner: "Usage: #{File.basename($0)} [options] < uri" }
12
+
13
+ Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], nil, true)
14
+
15
+ uri = gets.chomp()
16
+
17
+ Bwkfanboy::Utils.veputs(1, "fetching #{uri}\n")
18
+
19
+ begin
20
+ open(uri, "User-Agent" => Bwkfanboy::Meta::USER_AGENT) {|f|
21
+ if defined?(f.meta) && f.status[0] != '200' then
22
+ Bwkfanboy::Utils.errx(1, "cannot fetch #{uri} : HTTP response: #{f.status[0]}")
23
+ end
24
+ Bwkfanboy::Utils.veputs(1, "charset = #{f.content_type_parse[1][1]}\n") if defined?(f.meta)
25
+ f.each_line {|i| puts i}
26
+ }
27
+ rescue
28
+ # typically Errno::ENOENT
29
+ Bwkfanboy::Utils.errx(1, "cannot fetch: #{$!}");
30
+ end
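The script above targets Ruby 1.9, where requiring `open-uri` lets `Kernel#open` accept URLs. Since Ruby 3.0 that is no longer the case, and `URI.open` is the call to use for URLs. A minimal sketch of the same "URL or local path" behaviour on a current Ruby might look like this. The helper name `fetch_source` and the user agent string are assumptions made for the example; the real user agent comes from `Bwkfanboy::Meta::USER_AGENT`, which is not shown in this diff.

```ruby
require 'open-uri'

# Assumed placeholder; the script uses Bwkfanboy::Meta::USER_AGENT instead.
SKETCH_USER_AGENT = 'bwkfanboy-sketch/0.0.1'

# Hypothetical helper: return the body of an http(s) URL, or the contents
# of a local file when +source+ does not look like a URL.
def fetch_source(source)
  if source =~ %r{\Ahttps?://}i
    # URI.open raises OpenURI::HTTPError for 4xx/5xx responses.
    URI.open(source, 'User-Agent' => SKETCH_USER_AGENT) { |f| f.read }
  else
    File.read(source)
  end
end
```

With this split, a local path never goes through `open-uri`, so a missing file still surfaces as `Errno::ENOENT`, matching the comment in the script's `rescue` branch.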
data/bin/bwkfanboy_generate ADDED
@@ -0,0 +1,80 @@
1
+ #!/usr/bin/env ruby19
2
+ # -*-ruby-*-
3
+
4
+ # Read JSON from stdin, generate an Atom feed from it and print the
5
+ # result to stdout in UTF-8.
6
+ #
7
+ # One can validate the JSON by providing '--check' command line option
8
+ # (validation is off by default).
9
+
10
+ require 'rss/maker'
11
+ require 'date'
12
+ require 'json'
13
+ require 'jsonschema'
14
+
15
+ require_relative '../lib/bwkfanboy/utils'
16
+
17
+ $conf = {
18
+ banner: "Usage: #{File.basename($0)} [options] < json",
19
+ check: false
20
+ }
21
+
22
+ o = Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner])
23
+ o.on('--check', 'Validate the input (slow!)') { |i| $conf[:check] = true }
24
+ Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], o) # run cl parser
25
+
26
+ begin
27
+ j = JSON.parse(STDIN.read)
28
+ rescue
29
+ Bwkfanboy::Utils.errx(1, "stdin had invalid JSON");
30
+ end
31
+
32
+ # validate the input
33
+ schema = Bwkfanboy::Utils.gem_dir_system() + '/schema.js'
34
+ if $conf[:check] then
35
+ begin
36
+ JSON::Schema.validate(j, JSON.parse(File.read(schema)))
37
+ rescue
38
+ Bwkfanboy::Utils.errx(1, "JSON validation with schema (#{schema}) failed");
39
+ end
40
+ end
41
+
42
+ feed = RSS::Maker.make("atom") { |maker|
43
+ maker.channel.id = j['channel']['id']
44
+ maker.channel.updated = j['channel']['updated']
45
+ maker.channel.author = j['channel']['author']
46
+ maker.channel.title = j['channel']['title']
47
+
48
+ maker.channel.links.new_link {|i|
49
+ i.href = j['channel']['link']
50
+ i.rel = 'alternate'
51
+ i.type = 'text/html' # eh
52
+ }
53
+
54
+ maker.items.do_sort = true
55
+
56
+ j['x_entries'].each { |i|
57
+ maker.items.new_item do |item|
58
+ item.links.new_link {|k|
59
+ k.href = i['link']
60
+ k.rel = 'alternate'
61
+ k.type = 'text/html' # only to make happy crappy pr2nntp gateway
62
+ }
63
+ item.title = i['title']
64
+ item.author = i['author']
65
+ item.updated = i['updated']
66
+ item.content.type = j['channel']['x_entries_content_type']
67
+
68
+ case item.content.type
69
+ when 'text'
70
+ item.content.content = i['content']
71
+ when 'html'
72
+ item.content.content = i['content']
73
+ else
74
+ item.content.xhtml = i['content']
75
+ end
76
+ end
77
+ }
78
+ }
79
+
80
+ puts feed
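The generator above reads a fixed set of keys from its JSON input: `channel` (with `id`, `updated`, `author`, `title`, `link` and `x_entries_content_type`) and a list of `x_entries` (each with `title`, `link`, `updated`, `author` and `content`). A small sketch that builds such an object can make the shape easier to see. All values below are invented, and the ISO 8601 timestamps are an assumption: the authoritative constraints live in `schema.js`, which is not part of this diff.

```ruby
require 'json'
require 'time'

# Build a sample object with the keys the generator reads.
# Every value here is made up for illustration.
def sample_feed_object(now = Time.now)
  stamp = now.getutc.iso8601 # assumed timestamp format, see schema.js
  {
    'channel' => {
      'id' => 'http://example.org/',
      'updated' => stamp,
      'author' => 'John Doe',
      'title' => 'Example feed',
      'link' => 'http://example.org/',
      'x_entries_content_type' => 'text'
    },
    'x_entries' => [
      {
        'title' => 'First article',
        'link' => 'http://example.org/articles/1',
        'updated' => stamp,
        'author' => 'John Doe',
        'content' => 'Plain text summary of the first article.'
      }
    ]
  }
end

puts JSON.pretty_generate(sample_feed_object)
```

Saving the output to a file and feeding it to `bwkfanboy_generate --check` on stdin would be one way to find out whether the guessed value formats satisfy the schema.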
data/bin/bwkfanboy_parse ADDED
@@ -0,0 +1,32 @@
1
+ #!/usr/bin/env ruby19
2
+ # -*-ruby-*-
3
+
4
+ # Take 1 command line parameter: a full path to a plugin.
5
+ #
6
+ # Read HTML from stdin, parse it and print the result to stdout in JSON
7
+ # format. If '-vv' command line parameters were given, output will be in
8
+ # 'key: value' pairs and <em>not</em> in JSON.
9
+
10
+ require_relative '../lib/bwkfanboy/parser'
11
+
12
+ $conf = {
13
+ banner: "Usage: #{File.basename($0)} [options] /path/to/my/plugin.rb < html"
14
+ }
15
+
16
+ Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], nil, true)
17
+
18
+ if ARGV.size == 0 then
19
+ abort($conf[:banner])
20
+ else
21
+ Bwkfanboy::Utils.plugin_load(ARGV[0], Bwkfanboy::Meta::PLUGIN_CLASS)
22
+ end;
23
+
24
+ pn = Page.new()
25
+ pn.check()
26
+ pn.parse()
27
+
28
+ if Bwkfanboy::Utils.cfg[:verbose] >= 2 then
29
+ pn.dump()
30
+ else
31
+ puts pn.to_json()
32
+ end
data/bin/bwkfanboy_server ADDED
@@ -0,0 +1,141 @@
1
+ #!/usr/bin/env ruby19
2
+ # -*-ruby-*-
3
+
4
+ # Start an HTTP server (by default on 127.0.0.1:9042). To get Atom feeds
5
+ # from it, send a GET request to the URI
6
+ #
7
+ # http://localhost:9042/?p=PLUGIN
8
+ #
9
+ # where +PLUGIN+ is a name of a bwkfanboy's plugin (without '.rb' suffix).
10
+ #
11
+ # To list all available plugins, point your browser to
12
+ #
13
+ # http://localhost:9042/list
14
+ #
15
+ # The server is intended to run from a non-root user from
16
+ # <tt>~/.login</tt> file. It can detach from a terminal if you give it
17
+ # '-d' command line option.
18
+ #
19
+ # For other help, type:
20
+ #
21
+ # bwkfanboy_server -h
22
+ #
23
+ # The server maintains 2 logs:
24
+ #
25
+ # /tmp/bwkfanboy/USER/log/bwkfanboy_server.log
26
+ # /tmp/bwkfanboy/USER/log/bwkfanboy_server-access.log
27
+ #
28
+ # The file with a pid:
29
+ #
30
+ # /tmp/bwkfanboy/USER/bwkfanboy_server.pid
31
+
32
+ require 'webrick'
33
+ require_relative '../lib/bwkfanboy/utils'
34
+
35
+ $conf = {
36
+ addr: '127.0.0.1',
37
+ port: 9042,
38
+ converter: "./#{Bwkfanboy::Meta::NAME}",
39
+ banner: "Usage: #{File.basename($0)} [options]",
40
+ server_type: WEBrick::SimpleServer,
41
+ workdir: File.dirname(File.expand_path($0)),
42
+ pidfile: "#{Bwkfanboy::Meta::DIR_TMP}/#{File.basename($0)}.pid",
43
+ log: "#{Bwkfanboy::Meta::DIR_LOG}/#{File.basename($0)}.log",
44
+ alog: "#{Bwkfanboy::Meta::DIR_LOG}/#{File.basename($0)}-access.log",
45
+ mode: 'pipe'
46
+ }
47
+
48
+ o = Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner]) # create OptionParser object
49
+ o.on('-b VAL', 'BindAddress') { |i| $conf[:addr] = i }
50
+ o.on('-p VAL', 'A port number') { |i| $conf[:port] = i }
51
+ o.on('-c VAL', "A path to main #{Bwkfanboy::Meta::NAME} executable") { |i| $conf[:converter] = i }
52
+ o.on('-d', 'Detach from a terminal') {|i| $conf[:server_type] = WEBrick::Daemon }
53
+ o.on('-D', '(ignore this) Use URI_DEBUG const instead of URI in plugins') { |i| $conf[:mode] = 'debug' }
54
+ Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], o) # run cl parser
55
+
56
+ Bwkfanboy::Utils.dir_tmp_create()
57
+
58
+ class FeedServlet < WEBrick::HTTPServlet::AbstractServlet # :nodoc: all
59
+ def do_GET(req, res)
60
+ if req.query['p'] && req.query['p'] =~ Bwkfanboy::Meta::PLUGIN_NAME
61
+ res['Content-Type'] = 'application/atom+xml; charset=UTF-8'
62
+ res['Content-Disposition'] = "inline; filename=\"#{Bwkfanboy::Meta::NAME}-#{req.query['p']}.xml\""
63
+
64
+ cmd = "#{$conf[:converter]} #{$conf[:mode] == 'debug' ? '-D' : ''} #{req.query['p']}"
65
+ r = Bwkfanboy::Utils.cmd_run(cmd)
66
+ if r[0] != 0 then
67
+ raise WEBrick::HTTPStatus::InternalServerError.new("Errors in the pipeline:\n\n #{r[1]}")
68
+ end
69
+
70
+ res.body = r[2]
71
+ else
72
+ raise WEBrick::HTTPStatus::InternalServerError.new("Parameter 'p' required")
73
+ end
74
+ end
75
+ end
76
+
77
+ class FeedListServlet < WEBrick::HTTPServlet::AbstractServlet # :nodoc: all
78
+ def do_GET(req, res)
79
+ cmd = "#{$conf[:converter]} -l"
80
+ r = Bwkfanboy::Utils.cmd_run(cmd)
81
+ if r[0] != 0 then
82
+ raise WEBrick::HTTPStatus::InternalServerError.new("Errors:\n\n #{r[1]}")
83
+ end
84
+
85
+ res.body = r[2]
86
+ end
87
+ end
88
+
89
+ # create temporary files
90
+ def start_callback()
91
+ Dir.chdir($conf[:workdir])
92
+ if ! File.executable?($conf[:converter]) then
93
+ Bwkfanboy::Utils.errx(1, "Missing executable file '#{$conf[:converter]}'")
94
+ end
95
+
96
+ begin
97
+ File.open($conf[:pidfile], "w+") {|i| i.puts $$ }
98
+ rescue
99
+ Bwkfanboy::Utils.warnx("unable to create a pidfile " + $conf[:pidfile])
100
+ end
101
+ end
102
+
103
+ # remove temporary files
104
+ def stop_callback()
105
+ begin
106
+ File.unlink $conf[:pidfile]
107
+ rescue
108
+ # ignore errors
109
+ end
110
+ end
111
+
112
+ def log_create(f)
113
+ begin
114
+ log = Logger.new(f, 2, Bwkfanboy::Meta::LOG_MAXSIZE)
115
+ rescue
116
+ Bwkfanboy::Utils.warnx("cannot open log #{f}");
117
+ return nil
118
+ end
119
+ log.datetime_format = "%H:%M:%S"
120
+ log
121
+ end
122
+
123
+ # ----------------------------------------------------------------------
124
+
125
+ server_log = log_create($conf[:log])
126
+ access_log = [[ log_create($conf[:alog]), WEBrick::AccessLog::COMBINED_LOG_FORMAT ]]
127
+
128
+ s = WEBrick::HTTPServer.new(Port: $conf[:port],
129
+ BindAddress: $conf[:addr],
130
+ ServerType: $conf[:server_type],
131
+ StartCallback: -> {start_callback},
132
+ StopCallback: -> {stop_callback},
133
+ Logger: server_log,
134
+ AccessLog: access_log
135
+ )
136
+ s.mount("/", FeedServlet)
137
+ s.mount("/list", FeedListServlet)
138
+ ['TERM', 'INT'].each {|i|
139
+ trap(i) { s.shutdown }
140
+ }
141
+ s.start
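The header comment of the server documents its URL scheme: `/?p=PLUGIN` on `localhost:9042` by default. A client can be sketched with the standard library alone. The helper names `feed_uri` and `fetch_feed` are hypothetical; only the host, port and `p` parameter come from the script above, and `fetch_feed` obviously needs a running bwkfanboy_server to return anything.

```ruby
require 'uri'
require 'net/http'

# Hypothetical helper: build the feed URL for a plugin name, using the
# server's documented defaults for host and port.
def feed_uri(plugin, host: 'localhost', port: 9042)
  URI::HTTP.build(host: host, port: port, path: '/',
                  query: URI.encode_www_form(p: plugin))
end

# Hypothetical helper: fetch the Atom feed body as a string.
# Requires a running bwkfanboy_server at the given host and port.
def fetch_feed(plugin, **opts)
  Net::HTTP.get(feed_uri(plugin, **opts))
end
```

Building the query with `URI.encode_www_form` keeps the plugin name properly escaped, although the server only accepts names that match `Bwkfanboy::Meta::PLUGIN_NAME`.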
data/doc/README.rdoc ADDED
@@ -0,0 +1,88 @@
1
+ = About
2
+
3
+ bwkfanboy is an HTML to Atom feed converter that you can use to watch
4
+ sites that do not provide their own feed.
5
+
6
+ The converter is not a magic tool: you'll need to write a plugin (in
7
+ Ruby) for each site you want to watch. bwkfanboy provides guidelines and
8
+ general assistance.
9
+
10
+ = Architecture
11
+
12
+ == Plugins
13
+
14
+ bwkfanboy comes with 1 example plugin that parses a search page of
15
+ dailyprincetonian.com looking for bwk's articles.
16
+
17
+ The plugin is a Ruby class +Page+ that inherits Bwkfanboy::Parse
18
+ parent, overriding 1 method.
19
+
20
+ The plugins can be in the system
21
+
22
+ `gem env gemdir`/gems/bwkfanboy-x.y.z/lib/bwkfanboy/plugins
23
+
24
+ or user's home
25
+
26
+ ~/.bwkfanboy/plugins
27
+
28
+ directories.
29
+
30
+ == Pipeline
31
+
32
+ The program consists of 4 parts:
33
+
34
+ 0. *bwkfanboy* script that takes 1 parameter: the name of a file in
35
+ plugins directories (without the .rb suffix). So, for example to get
36
+ an atom feed from dailyprincetonian.com you type:
37
+
38
+ % bwkfanboy bwk
39
+
40
+ and it will load
41
+ <tt>/usr/local/lib/ruby/gems/1.9/gems/bwkfanboy-0.0.1/lib/bwkfanboy/plugins/bwk.rb</tt>
42
+ file on my FreeBSD machine, fetch and parse html from
43
+ dailyprincetonian.com and generate the required feed, dumping it to
44
+ stdout.
45
+
46
+ The script is just a convenient wrapper for 3 separate utils.
47
+
48
+ 1. *bwkfanboy_fetch*
49
+
50
+ It reads 1 line from stdin for the URL to fetch from. The result will
51
+ be dumped to stdout.
52
+
53
+ 2. *bwkfanboy_parse*
54
+
55
+ It takes 1 parameter: <em>a full path</em> to a plugin file.
56
+
57
+ This util reads stdin expecting it to be XHTML, parses it and dumps
58
+ the result to stdout as a JSON-formatted object.
59
+
60
+ 3. *bwkfanboy_generate*
61
+
62
+ Reads stdin expecting it to be a proper JSON-formatted object.
63
+
64
+ The result will be an Atom feed dumped to stdout in UTF-8.
65
+
66
+ So, without the wrapper all this together looks like:
67
+
68
+ % echo http://example.org | bwkfanboy_fetch |
69
+ bwkfanboy_parse /path/to/my/plugin.rb | bwkfanboy_generate
70
+
71
+ == Log
72
+
73
+ All utils write to <tt>/tmp/bwkfanboy/USER/log/general.log</tt> file if
74
+ permissions allow it.
75
+
76
+ == HTTP
77
+
78
+ There are 2 methods to get an Atom feed via HTTP:
79
+
80
+ 1. <tt>web/bwkfanboy.cgi</tt> (from the program tarball), which you may
81
+ copy to your Apache cgi directory and run it. This prohibits you from
82
+ using HOME directory for your own plugins. Also the cgi script
83
+ requires some manual editing (setting 1 variable in it) before you
84
+ can even start using it.
85
+
86
+ 2. Small *bwkfanboy_server* HTTP server. It can run from any user and
87
+ thus is able to inherit env variables for discovering your HOME
88
+ directory. Read bin/bwkfanboy_server to know how to operate it.