bwkfanboy 0.0.1
- data/LICENSE +22 -0
- data/README.rdoc +88 -0
- data/Rakefile +48 -0
- data/TODO +7 -0
- data/bin/bwkfanboy +128 -0
- data/bin/bwkfanboy_fetch +30 -0
- data/bin/bwkfanboy_generate +80 -0
- data/bin/bwkfanboy_parse +32 -0
- data/bin/bwkfanboy_server +141 -0
- data/doc/README.rdoc +88 -0
- data/doc/plugin.rdoc +118 -0
- data/lib/bwkfanboy/parser.rb +143 -0
- data/lib/bwkfanboy/plugins/bwk.rb +33 -0
- data/lib/bwkfanboy/plugins/freebsd-ports-update.rb +76 -0
- data/lib/bwkfanboy/schema.js +39 -0
- data/lib/bwkfanboy/utils.rb +134 -0
- data/test/plugins/bwk.rb +29 -0
- data/test/plugins/empty.rb +0 -0
- data/test/popen4.sh +4 -0
- data/test/semis/bwk.html +398 -0
- data/test/semis/bwk.json +82 -0
- data/test/test_fetch.rb +34 -0
- data/test/test_generate.rb +30 -0
- data/test/test_parse.rb +32 -0
- data/test/test_server.rb +39 -0
- data/test/ts_utils.rb +21 -0
- data/test/xml-clean.sh +8 -0
- metadata +158 -0
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
+(The MIT License)
+
+Copyright (c) 2010 Alexander Gromnitsky.
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+'Software'), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.rdoc
ADDED
@@ -0,0 +1,88 @@
+= About
+
+bwkfanboy is an HTML-to-Atom feed converter that you can use to watch
+sites that do not provide their own feeds.
+
+The converter is not a magic tool: you'll need to write a plugin (in
+Ruby) for each site you want to watch. bwkfanboy provides guidelines and
+general assistance.
+
+= Architecture
+
+== Plugins
+
+bwkfanboy comes with 1 example plugin that parses a search page of
+dailyprincetonian.com looking for bwk's articles.
+
+The plugin is a Ruby class +Page+ that inherits from the Bwkfanboy::Parse
+parent class, overriding 1 method.
+
+Plugins can live in the system
+
+  `gem env gemdir`/gems/bwkfanboy-x.y.z/lib/bwkfanboy/plugins
+
+or the user's home
+
+  ~/.bwkfanboy/plugins
+
+directories.
+
+== Pipeline
+
+The program consists of 4 parts:
+
+0. The *bwkfanboy* script, which takes 1 parameter: the name of a file in
+the plugin directories (without the .rb suffix). For example, to get
+an Atom feed from dailyprincetonian.com you type:
+
+  % bwkfanboy bwk
+
+and it will load the
+<tt>/usr/local/lib/ruby/gems/1.9/gems/bwkfanboy-0.0.1/lib/bwkfanboy/plugins/bwk.rb</tt>
+file (on my FreeBSD machine), fetch and parse HTML from
+dailyprincetonian.com, and generate the required feed, dumping it to
+stdout.
+
+The script is just a convenient wrapper for 3 separate utils.
+
+1. *bwkfanboy_fetch*
+
+It reads 1 line from stdin for the URL to fetch. The result is
+dumped to stdout.
+
+2. *bwkfanboy_parse*
+
+It takes 1 parameter: <em>a full path</em> to a plugin file.
+
+This util reads stdin, expecting XHTML, parses it and dumps
+the result to stdout as a JSON-formatted object.
+
+3. *bwkfanboy_generate*
+
+It reads stdin, expecting a proper JSON-formatted object.
+
+The result is an Atom feed dumped to stdout in UTF-8.
+
+So, without the wrapper, all this together looks like:
+
+  % echo http://example.org | bwkfanboy_fetch |
+    bwkfanboy_parse /path/to/my/plugin.rb | bwkfanboy_generate
+
+== Log
+
+All utils write to the <tt>/tmp/bwkfanboy/USER/log/general.log</tt> file if
+permissions allow it.
+
+== HTTP
+
+There are 2 methods to get an Atom feed via HTTP:
+
+1. <tt>web/bwkfanboy.cgi</tt> (from the program tarball), which you may
+copy to your Apache cgi directory and run. This prohibits you from
+using the HOME directory for your own plugins. Also, the cgi script
+requires some manual editing (setting 1 variable in it) before you
+can even start using it.
+
+2. The small *bwkfanboy_server* HTTP server. It can run as any user and
+thus is able to inherit env variables for discovering your HOME
+directory. Read bin/bwkfanboy_server to learn how to operate it.
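Note on the README's pipeline: the three utilities only agree on "read stdin, write stdout", so the wiring can be sketched with Ruby's standard Open3.pipeline_rw. This is a minimal sketch, not the wrapper's actual code. The FETCH, PARSE and GENERATE commands and the run_pipeline helper are hypothetical stand-ins (inline Ruby one-liners) for bwkfanboy_fetch, bwkfanboy_parse and bwkfanboy_generate, which are assumed not to be installed.

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical stand-ins for the three utilities. Each one reads stdin and
# writes stdout, which is the only contract the README describes.
RUBY     = RbConfig.ruby
FETCH    = [RUBY, '-e', 'puts "<html>#{STDIN.gets.chomp}</html>"']
PARSE    = [RUBY, '-e', 'puts STDIN.read.gsub(/<[^>]+>/, "")']
GENERATE = [RUBY, '-e', 'puts "<feed>#{STDIN.read.chomp}</feed>"']

# Feed one line (the URL) into the first stage and return whatever the
# last stage prints, like `echo URL | fetch | parse | generate`.
def run_pipeline(url, *stages)
  Open3.pipeline_rw(*stages) do |first_stdin, last_stdout, _threads|
    first_stdin.puts(url)
    first_stdin.close          # signal EOF so the stages can finish
    last_stdout.read
  end
end

puts run_pipeline('http://example.org', FETCH, PARSE, GENERATE)
```

With the real utilities on PATH, the stand-in arrays could presumably be swapped for the actual command lines; only the stdin/stdout wiring is the point here.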
data/Rakefile
ADDED
@@ -0,0 +1,48 @@
+# -*-ruby-*-
+
+require 'rake'
+require 'rake/gempackagetask'
+require 'rake/clean'
+require 'rake/rdoctask'
+require 'rake/testtask'
+
+spec = Gem::Specification.new() {|i|
+  i.name = "bwkfanboy"
+  i.summary = 'A converter from HTML to Atom feed that you can use to watch sites that do not provide its own feed.'
+  i.version = '0.0.1'
+  i.author = 'Alexander Gromnitsky'
+  i.email = 'alexander.gromnitsky@gmail.com'
+  i.homepage = 'http://github.com/gromnitsky/bwkfanboy'
+  i.platform = Gem::Platform::RUBY
+  i.required_ruby_version = '>= 1.9'
+  i.files = FileList['lib/**/*', 'bin/*', 'doc/*', '[A-Z]*', 'test/**/*']
+
+  i.executables = FileList['bin/*'].gsub(/^bin\//, '')
+  i.default_executable = i.name
+
+  i.has_rdoc = true
+  i.test_files = FileList['test/test_*.rb']
+
+  i.rdoc_options << '-m' << 'Bwkfanboy' << '-x' << 'plugins'
+  i.extra_rdoc_files = FileList['bin/*']
+
+  i.add_dependency('activesupport', '>= 3.0.0')
+  i.add_dependency('nokogiri', '>= 1.4.3')
+  i.add_dependency('open4', '>= 1.0.1')
+  i.add_dependency('jsonschema', '>= 2.0.0')
+}
+
+Rake::GemPackageTask.new(spec).define()
+
+task(default: %(repackage))
+
+Rake::RDocTask.new('doc') {|i|
+  i.main = "Bwkfanboy"
+  i.rdoc_files = FileList['doc/*', 'lib/**/*.rb', 'bin/*']
+  i.rdoc_files.exclude("lib/**/plugins", "test")
+}
+
+Rake::TestTask.new() {|i|
+  i.test_files = FileList['test/test_*.rb']
+  i.verbose = true
+}
data/TODO
ADDED
data/bin/bwkfanboy
ADDED
@@ -0,0 +1,128 @@
+#!/usr/bin/env ruby19
+# -*-ruby-*-
+
+# This program is executed by bin/bwkfanboy_server to do all dirty work:
+# fetch HTML, parse it and generate a pretty Atom feed.
+#
+# It is a wrapper which you can utilize for such common tasks as listing
+# all available plugins.
+#
+# Type:
+#
+# % bwkfanboy -h
+#
+# to get some basic help & read about Bwkfanboy module.
+
+require_relative '../lib/bwkfanboy/parser'
+
+$conf = {
+  prog_name: 'bwkfanboy',
+  prog_ver: '0.0.1',
+  mode: 'pipe',
+  banner: "Usage: #{File.basename($0)} [options] plugin-name"
+}
+
+class Plugin # :nodoc: all
+  attr_reader :name, :path
+
+  def initialize(name)
+    @name = name
+    @path = nil
+  end
+
+  def dirs()
+    # try to create user's home plugin directory
+    begin
+      ['~/.bwkfanboy', '~/.bwkfanboy/plugins'].each {|i|
+        Dir.mkdir(File.expand_path(i))
+      }
+    rescue
+      # empty
+    end
+
+    r = []
+    dirs = ['~/.bwkfanboy/plugins', "#{Bwkfanboy::Utils.gem_dir_system}/plugins"]
+    begin
+      # this will fail for user's home directory under Apache CGI
+      # environment
+      dirs.map! {|i| File.expand_path(i) }
+    rescue
+    end
+    dirs.each {|i|
+      if File.readable?(i) then
+        r << i
+      else
+        Bwkfanboy::Utils.warnx("directory #{i} isn't readable");
+      end
+    }
+
+    if r.length == 0 then
+      Bwkfanboy::Utils.errx(1, "no dirs for plugins found: #{dirs.join(' ')}")
+    end
+    return r
+  end
+
+  def load()
+    abort($conf[:banner]) unless (@name && @name !~ /^\s*$/)
+
+    dirs.each {|i|
+      files = Dir.glob("#{i}/*.rb")
+      if (@path = files.index("#{i}/#{@name}.rb")) then
+        @path = files[@path]
+        break
+      end
+    }
+    Bwkfanboy::Utils.errx(1, "no such plugin '#{@name}'") if ! @path
+    Bwkfanboy::Utils.plugin_load(@path, Bwkfanboy::Meta::PLUGIN_CLASS)
+
+    pn = Page.new()
+    pn.check()
+    return pn
+  end
+
+end # class
+
+# ----------------------------------------------------------------------
+
+o = Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner]) # create OptionParser object
+o.on('-i', 'Show some info about the plugin') { |i| $conf[:mode] = 'info' }
+o.on('-l', 'List all plugins') { |i| $conf[:mode] = 'list' }
+o.on('-p', 'List all plugins paths') { |i| $conf[:mode] = 'path' }
+o.on('-D', '(ignore this) Use URI_DEBUG const instead URI in plugins') { |i| $conf[:mode] = 'debug' }
+Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], o) # run cl parser
+
+plugin = Plugin.new(ARGV[0])
+
+case $conf[:mode]
+when 'list'
+  plugin.dirs().each {|i|
+    puts "#{i}:"
+    Dir.glob("#{i}/*.rb").each {|j|
+      puts "\t#{File.basename(j, '.rb')}"
+    }
+  }
+when 'path'
+  plugin.dirs().each {|i| puts i}
+when 'info'
+  plugin.load().dump_info
+else
+  # A pipe mode
+  pn = plugin.load()
+  cmd = "./bwkfanboy_fetch | ./bwkfanboy_parse '#{plugin.path}' | ./bwkfanboy_generate"
+  if Bwkfanboy::Utils.cfg[:verbose] >= 2 then
+    puts ($conf[:mode] != 'debug' ? pn.class::Meta::URI : pn.class::Meta::URI_DEBUG)
+    puts cmd
+    exit 0
+  end
+
+  # go to the directory with current script
+  Dir.chdir(File.dirname(File.expand_path($0)))
+
+  pipe = IO.popen(cmd, 'w+')
+  pipe.puts ($conf[:mode] != 'debug' ? pn.class::Meta::URI : pn.class::Meta::URI_DEBUG)
+  pipe.close_write
+  while line = pipe.gets
+    puts line
+  end
+  pipe.close
+end
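Note on plugin lookup: Plugin#load above walks the plugin directories in order (the user's home directory first, then the system gem directory) and stops at the first one that contains NAME.rb. A minimal standalone sketch of that search order follows. The find_plugin helper and the temporary directory names are hypothetical and exist only for this illustration; they are not part of the package.

```ruby
require 'tmpdir'
require 'fileutils'

# Return the path of NAME.rb in the first directory that contains it,
# or nil if no directory does. Directories are searched in the order given.
def find_plugin(name, dirs)
  dirs.each do |dir|
    candidate = File.join(dir, "#{name}.rb")
    return candidate if File.file?(candidate)
  end
  nil
end

Dir.mktmpdir do |root|
  # Stand-ins for ~/.bwkfanboy/plugins and the gem's plugins directory.
  home_dir   = File.join(root, 'home_plugins')
  system_dir = File.join(root, 'system_plugins')
  FileUtils.mkdir_p([home_dir, system_dir])
  FileUtils.touch(File.join(system_dir, 'bwk.rb'))

  p find_plugin('bwk', [home_dir, system_dir])     # path inside system_dir
  p find_plugin('nosuch', [home_dir, system_dir])  # nil
end
```

Because the home directory is searched first, a user's plugin with the same name as a bundled one would shadow it in this sketch, which matches the directory order used in Plugin#dirs.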
data/bin/bwkfanboy_fetch
ADDED
@@ -0,0 +1,30 @@
+#!/usr/bin/env ruby19
+# -*-ruby-*-
+
+# Read stdin for a URI or a full path to a local file, download it (or
+# read the local file) and print it to stdout.
+
+require 'open-uri'
+
+require_relative '../lib/bwkfanboy/utils'
+
+$conf = { banner: "Usage: #{File.basename($0)} [options] < uri" }
+
+Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], nil, true)
+
+uri = gets.chomp()
+
+Bwkfanboy::Utils.veputs(1, "fetching #{uri}\n")
+
+begin
+  open(uri, "User-Agent" => Bwkfanboy::Meta::USER_AGENT) {|f|
+    if defined?(f.meta) && f.status[0] != '200' then
+      Bwkfanboy::Utils.errx(1, "cannot fetch #{uri} : HTTP response: #{f.status[0]}")
+    end
+    Bwkfanboy::Utils.veputs(1, "charset = #{f.content_type_parse[1][1]}\n") if defined?(f.meta)
+    f.each_line {|i| puts i}
+  }
+rescue
+  # typically Errno::ENOENT
+  Bwkfanboy::Utils.errx(1, "cannot fetch: #{$!}");
+end
data/bin/bwkfanboy_generate
ADDED
@@ -0,0 +1,80 @@
+#!/usr/bin/env ruby19
+# -*-ruby-*-
+
+# Read stdin for JSON, generate from it an Atom feed and print the
+# result to stdout in UTF-8.
+#
+# One can validate the JSON by providing '--check' command line option
+# (by default the validating is off).
+
+require 'rss/maker'
+require 'date'
+require 'json'
+require 'jsonschema'
+
+require_relative '../lib/bwkfanboy/utils'
+
+$conf = {
+  banner: "Usage: #{File.basename($0)} [options] < json",
+  check: false
+}
+
+o = Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner])
+o.on('--check', 'Validate the input (slow!)') { |i| $conf[:check] = true }
+Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], o) # run cl parser
+
+begin
+  j = JSON.parse(STDIN.read)
+rescue
+  Bwkfanboy::Utils.errx(1, "stdin had invalid JSON");
+end
+
+# validate the input
+schema = Bwkfanboy::Utils.gem_dir_system() + '/schema.js'
+if $conf[:check] then
+  begin
+    JSON::Schema.validate(j, JSON.parse(File.read(schema)))
+  rescue
+    Bwkfanboy::Utils.errx(1, "JSON validation with schema (#{schema}) failed");
+  end
+end
+
+feed = RSS::Maker.make("atom") { |maker|
+  maker.channel.id = j['channel']['id']
+  maker.channel.updated = j['channel']['updated']
+  maker.channel.author = j['channel']['author']
+  maker.channel.title = j['channel']['title']
+
+  maker.channel.links.new_link {|i|
+    i.href = j['channel']['link']
+    i.rel = 'alternate'
+    i.type = 'text/html' # eh
+  }
+
+  maker.items.do_sort = true
+
+  j['x_entries'].each { |i|
+    maker.items.new_item do |item|
+      item.links.new_link {|k|
+        k.href = i['link']
+        k.rel = 'alternate'
+        k.type = 'text/html' # only to make happy crappy pr2nntp gateway
+      }
+      item.title = i['title']
+      item.author = i['author']
+      item.updated = i['updated']
+      item.content.type = j['channel']['x_entries_content_type']
+
+      case item.content.type
+      when 'text'
+        item.content.content = i['content']
+      when 'html'
+        item.content.content = i['content']
+      else
+        item.content.xhtml = i['content']
+      end
+    end
+  }
+}
+
+puts feed
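Note on the intermediate JSON: bwkfanboy_generate reads a "channel" object and an "x_entries" array from the parsed input. The sketch below lists only the keys that the script above actually reads and checks a sample document against them. It is an illustration, not the package's schema.js (whose contents are not shown in this part of the diff). The missing_keys helper and all values in SAMPLE are made up for the example.

```ruby
require 'json'

# Keys that bwkfanboy_generate reads from the input, taken from the script.
CHANNEL_KEYS = %w[id updated author title link x_entries_content_type].freeze
ENTRY_KEYS   = %w[link title author updated content].freeze

# Return a list of dotted names for every key the generator would look up
# but the document lacks. An empty list means the shape is complete.
def missing_keys(doc)
  channel = doc.fetch('channel', {})
  missing = CHANNEL_KEYS.reject { |k| channel.key?(k) }.map { |k| "channel.#{k}" }
  doc.fetch('x_entries', []).each_with_index do |entry, n|
    ENTRY_KEYS.each { |k| missing << "x_entries[#{n}].#{k}" unless entry.key?(k) }
  end
  missing
end

# Illustrative values only; a real plugin decides what goes here.
SAMPLE = {
  'channel' => {
    'id' => 'http://example.org/', 'updated' => '2010-01-01T00:00:00Z',
    'author' => 'nobody', 'title' => 'Example feed',
    'link' => 'http://example.org/', 'x_entries_content_type' => 'text'
  },
  'x_entries' => [
    { 'link' => 'http://example.org/1', 'title' => 'First',
      'author' => 'nobody', 'updated' => '2010-01-01T00:00:00Z',
      'content' => 'Hello' }
  ]
}.freeze

doc = JSON.parse(JSON.generate(SAMPLE))  # round-trip, as a pipe would
p missing_keys(doc)
```

A quick key check like this is cheaper than full schema validation, which the script itself marks as slow, but it verifies only that keys are present, not their types or formats.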
data/bin/bwkfanboy_parse
ADDED
@@ -0,0 +1,32 @@
+#!/usr/bin/env ruby19
+# -*-ruby-*-
+
+# Take 1 command line parameter: a full path to a plugin.
+#
+# Read stdin for a HTML, parse it and print the result to stdout in JSON
+# format. If '-vv' command line parameters were given, output will be in
+# 'key: value' pairs and <em>not</em> in JSON.
+
+require_relative '../lib/bwkfanboy/parser'
+
+$conf = {
+  banner: "Usage: #{File.basename($0)} [options] /path/to/my/plugin.rb < html"
+}
+
+Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], nil, true)
+
+if ARGV.size == 0 then
+  abort($conf[:banner])
+else
+  Bwkfanboy::Utils.plugin_load(ARGV[0], Bwkfanboy::Meta::PLUGIN_CLASS)
+end;
+
+pn = Page.new()
+pn.check()
+pn.parse()
+
+if Bwkfanboy::Utils.cfg[:verbose] >= 2 then
+  pn.dump()
+else
+  puts pn.to_json()
+end
data/bin/bwkfanboy_server
ADDED
@@ -0,0 +1,141 @@
+#!/usr/bin/env ruby19
+# -*-ruby-*-
+
+# Start a HTTP server (by default on 127.0.0.1:9042). To get Atom feeds
+# from it, initiate GET request with URI
+#
+# http://localhost:9042/?p=PLUGIN
+#
+# where +PLUGIN+ is a name of a bwkfanboy's plugin (without '.rb' suffix).
+#
+# To list all available plugins, point your browser to
+#
+# http://localhost:9042/list
+#
+# The server is intended to run from a non-root user from
+# <tt>~/.login</tt> file. It can detach from a terminal if you give it
+# '-d' command line option.
+#
+# For other help, type:
+#
+# bwkfanboy_server -h
+#
+# The server maintains 2 logs:
+#
+# /tmp/bwkfanboy/USER/log/bwkfanboy_server.log
+# /tmp/bwkfanboy/USER/log/bwkfanboy_server-access.log
+#
+# The file with a pid:
+#
+# /tmp/bwkfanboy/USER/bwkfanboy_server.pid
+
+require 'webrick'
+require_relative '../lib/bwkfanboy/utils'
+
+$conf = {
+  addr: '127.0.0.1',
+  port: 9042,
+  converter: "./#{Bwkfanboy::Meta::NAME}",
+  banner: "Usage: #{File.basename($0)} [options]",
+  server_type: WEBrick::SimpleServer,
+  workdir: File.dirname(File.expand_path($0)),
+  pidfile: "#{Bwkfanboy::Meta::DIR_TMP}/#{File.basename($0)}.pid",
+  log: "#{Bwkfanboy::Meta::DIR_LOG}/#{File.basename($0)}.log",
+  alog: "#{Bwkfanboy::Meta::DIR_LOG}/#{File.basename($0)}-access.log",
+  mode: 'pipe'
+}
+
+o = Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner]) # create OptionParser object
+o.on('-b VAL', 'BindAddress') { |i| $conf[:addr] = i }
+o.on('-p VAL', 'A port number') { |i| $conf[:port] = i }
+o.on('-c VAL', "A path to main #{Bwkfanboy::Meta::NAME} executable") { |i| $conf[:converter] = i }
+o.on('-d', 'Detach from a terminal') {|i| $conf[:server_type] = WEBrick::Daemon }
+o.on('-D', '(ignore this) Use URI_DEBUG const instead URI in plugins') { |i| $conf[:mode] = 'debug' }
+Bwkfanboy::Utils.cl_parse(ARGV, $conf[:banner], o) # run cl parser
+
+Bwkfanboy::Utils.dir_tmp_create()
+
+class FeedServlet < WEBrick::HTTPServlet::AbstractServlet # :nodoc: all
+  def do_GET(req, res)
+    if req.query['p'] && req.query['p'] =~ Bwkfanboy::Meta::PLUGIN_NAME
+      res['Content-Type'] = 'application/atom+xml; charset=UTF-8'
+      res['Content-Disposition'] = "inline; filename=\"#{Bwkfanboy::Meta::NAME}-#{req.query['p']}.xml\""
+
+      cmd = "#{$conf[:converter]} #{$conf[:mode] == 'debug' ? '-D' : ''} #{req.query['p']}"
+      r = Bwkfanboy::Utils.cmd_run(cmd)
+      if r[0] != 0 then
+        raise WEBrick::HTTPStatus::InternalServerError.new("Errors in the pipeline:\n\n #{r[1]}")
+      end
+
+      res.body = r[2]
+    else
+      raise WEBrick::HTTPStatus::InternalServerError.new("Parameter 'p' required")
+    end
+  end
+end
+
+class FeedListServlet < WEBrick::HTTPServlet::AbstractServlet # :nodoc: all
+  def do_GET(req, res)
+    cmd = "#{$conf[:converter]} -l"
+    r = Bwkfanboy::Utils.cmd_run(cmd)
+    if r[0] != 0 then
+      raise WEBrick::HTTPStatus::InternalServerError.new("Errors:\n\n #{r[1]}")
+    end
+
+    res.body = r[2]
+  end
+end
+
+# create temporary files
+def start_callback()
+  Dir.chdir($conf[:workdir])
+  if ! File.executable?($conf[:converter]) then
+    Bwkfanboy::Utils.errx(1, "Missing executable file '#{$conf[:converter]}'")
+  end
+
+  begin
+    File.open($conf[:pidfile], "w+") {|i| i.puts $$ }
+  rescue
+    Bwkfanboy::Utils.warnx("unable to create a pidfile " + $conf[:pidfile])
+  end
+end
+
+# remove temporary files
+def stop_callback()
+  begin
+    File.unlink $conf[:pidfile]
+  rescue
+    # ignore errors
+  end
+end
+
+def log_create(f)
+  begin
+    log = Logger.new(f, 2, Bwkfanboy::Meta::LOG_MAXSIZE)
+  rescue
+    Bwkfanboy::Utils.warnx("cannot open log #{f}");
+    return nil
+  end
+  log.datetime_format = "%H:%M:%S"
+  log
+end
+
+# ----------------------------------------------------------------------
+
+server_log = log_create($conf[:log])
+access_log = [[ log_create($conf[:alog]), WEBrick::AccessLog::COMBINED_LOG_FORMAT ]]
+
+s = WEBrick::HTTPServer.new(Port: $conf[:port],
+                            BindAddress: $conf[:addr],
+                            ServerType: $conf[:server_type],
+                            StartCallback: -> {start_callback},
+                            StopCallback: -> {stop_callback},
+                            Logger: server_log,
+                            AccessLog: access_log
+                            )
+s.mount("/", FeedServlet)
+s.mount("/list", FeedListServlet)
+['TERM', 'INT'].each {|i|
+  trap(i) { s.shutdown }
+}
+s.start
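Note on talking to the server: the header comment above documents two endpoints, "/?p=PLUGIN" for a feed and "/list" for the plugin listing, on port 9042 by default. A small client-side sketch of building those URIs with Ruby's standard uri library follows. The feed_uri and list_uri helpers are hypothetical names introduced here; the host and port defaults come from the script's comments.

```ruby
require 'uri'

# Defaults taken from the bwkfanboy_server header comment.
DEFAULT_HOST = 'localhost'
DEFAULT_PORT = 9042

# URI of the Atom feed produced by the given plugin name.
def feed_uri(plugin, host: DEFAULT_HOST, port: DEFAULT_PORT)
  URI::HTTP.build(host: host, port: port, path: '/',
                  query: URI.encode_www_form(p: plugin))
end

# URI of the plain-text plugin listing.
def list_uri(host: DEFAULT_HOST, port: DEFAULT_PORT)
  URI::HTTP.build(host: host, port: port, path: '/list')
end

puts feed_uri('bwk')
puts list_uri
```

With a server actually running, the result could be fetched with something like Net::HTTP.get(feed_uri('bwk')); that call is left out of the sketch because it needs a live server.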
data/doc/README.rdoc
ADDED
@@ -0,0 +1,88 @@
+= About
+
+bwkfanboy is an HTML-to-Atom feed converter that you can use to watch
+sites that do not provide their own feeds.
+
+The converter is not a magic tool: you'll need to write a plugin (in
+Ruby) for each site you want to watch. bwkfanboy provides guidelines and
+general assistance.
+
+= Architecture
+
+== Plugins
+
+bwkfanboy comes with 1 example plugin that parses a search page of
+dailyprincetonian.com looking for bwk's articles.
+
+The plugin is a Ruby class +Page+ that inherits from the Bwkfanboy::Parse
+parent class, overriding 1 method.
+
+Plugins can live in the system
+
+  `gem env gemdir`/gems/bwkfanboy-x.y.z/lib/bwkfanboy/plugins
+
+or the user's home
+
+  ~/.bwkfanboy/plugins
+
+directories.
+
+== Pipeline
+
+The program consists of 4 parts:
+
+0. The *bwkfanboy* script, which takes 1 parameter: the name of a file in
+the plugin directories (without the .rb suffix). For example, to get
+an Atom feed from dailyprincetonian.com you type:
+
+  % bwkfanboy bwk
+
+and it will load the
+<tt>/usr/local/lib/ruby/gems/1.9/gems/bwkfanboy-0.0.1/lib/bwkfanboy/plugins/bwk.rb</tt>
+file (on my FreeBSD machine), fetch and parse HTML from
+dailyprincetonian.com, and generate the required feed, dumping it to
+stdout.
+
+The script is just a convenient wrapper for 3 separate utils.
+
+1. *bwkfanboy_fetch*
+
+It reads 1 line from stdin for the URL to fetch. The result is
+dumped to stdout.
+
+2. *bwkfanboy_parse*
+
+It takes 1 parameter: <em>a full path</em> to a plugin file.
+
+This util reads stdin, expecting XHTML, parses it and dumps
+the result to stdout as a JSON-formatted object.
+
+3. *bwkfanboy_generate*
+
+It reads stdin, expecting a proper JSON-formatted object.
+
+The result is an Atom feed dumped to stdout in UTF-8.
+
+So, without the wrapper, all this together looks like:
+
+  % echo http://example.org | bwkfanboy_fetch |
+    bwkfanboy_parse /path/to/my/plugin.rb | bwkfanboy_generate
+
+== Log
+
+All utils write to the <tt>/tmp/bwkfanboy/USER/log/general.log</tt> file if
+permissions allow it.
+
+== HTTP
+
+There are 2 methods to get an Atom feed via HTTP:
+
+1. <tt>web/bwkfanboy.cgi</tt> (from the program tarball), which you may
+copy to your Apache cgi directory and run. This prohibits you from
+using the HOME directory for your own plugins. Also, the cgi script
+requires some manual editing (setting 1 variable in it) before you
+can even start using it.
+
+2. The small *bwkfanboy_server* HTTP server. It can run as any user and
+thus is able to inherit env variables for discovering your HOME
+directory. Read bin/bwkfanboy_server to learn how to operate it.