twitterize 1.0.0

data/History.txt ADDED
@@ -0,0 +1,5 @@
+ == 1.0.0 / 2007-03-17
+
+ * First release
+ * Totally hacktacular!
+
data/Manifest.txt ADDED
@@ -0,0 +1,7 @@
+ History.txt
+ Manifest.txt
+ README.txt
+ Rakefile
+ bin/twitterize
+ lib/twitterize.rb
+ test/test_twitterize.rb
data/README.txt ADDED
@@ -0,0 +1,108 @@
+ twitterize
+ by Jacob Harris
+ http://www.nycruby.org/
+
+ == DESCRIPTION:
+
+ Twitterize is a quick and dirty hack I did in a few hours to play with the Twitter API (seriously, there are no tests and I'm sure there is code crude enough in there to make you recant any friendship we might have). It allows you to take any number of RSS feeds and post them to one or more Twitter accounts. An example of this is how various RSS feeds from the New York Times are sent to twitter accounts nytimes, nyt_arts, nyt_biz, etc. This is accomplished via a command-line script that requires a separate configuration file (see below). Since Twitter is a rapidly growing (read somewhat flaky) service, twitterize also uses a database to store twitters to be posted and recover later if twitter is down. This also allows the app to retain feed GUIDs and avoid duplicate posts.
+
+ Twitterize has two execution stages. In the first stage, it checks if any feeds are ready to be refreshed, downloads new articles, and saves outgoing twitter messages in the database. In the second stage, it posts any new twitters up to their corresponding accounts. Both stages are independent and can be thought of as a producer-consumer model. When twitterize finishes stage two, it exits. Rather than run continuously as a daemon, I think it's much better if you execute twitterize every 5-30 minutes via a cronjob.
+
+ == FEATURES/PROBLEMS:
+
+ * Did I mention before this is hackish software? I didn't even write unit tests (I know, lame!) so you might encounter bugs.
+ * Although twitterize runs with a database store (any ActiveRecord-supported DB will do), the list of feeds is updated on startup from feeds specified in the config file. Adding a new feed is as easy as writing some YAML instead of wrangling some SQL (or me coding some more app logic).
+ * However, twitterize does NOT remove feeds from the DB if you remove them from the config file. Renaming feeds is not a good idea either.
+ * Twitterize doesn't set up the DB or config file for you either. Both of these are possible, but I'm too lazy to do that right now.
+ * RSS/Atom/RDF feeds should all work just fine (thanks to Feed Tools)
+ * Feed reading is wholesale and crude. I don't support if-modified-since or ETags yet, so please be gentle.
+ * I think I've nailed down all the ISO-8859-1/UTF-8 issues, but it's possible you might still get stung.
+ * Twitterize does not purge old records from the database. That's up to you, but be careful about being too aggressive with infrequently updated blogs.
+ * Twitterize is but a cold logical problem. Even if its twitter accounts are your friends, it still does not love you.
+
+ == SYNOPSIS:
+
+   twitterize --config-file ~/twitter.yml --verbose --lookback 12h
+
+ This illustrates some of the salient operational points of twitterize. It takes a config file (see below for format). You can turn on some manner of useless chattering with the --verbose option. The lookback option is actually quite useful: specifying it tells twitterize to log feed items older than the lookback window, but not post them to Twitter (the h, m, and d modifiers are for hours, minutes, and days; a bare number is seconds). This is good if you add a bunch of new feeds and don't want to send stories in your blog feed from 2 months ago to Twitter (keep it fresh). I suppose this could be a config-file option, but it was useful for me to make it a command-line option when testing out the NY Times feeds.
+
+ == REQUIREMENTS:
+
+ * Twitterize itself is dependent on the following gems:
+   * feedtools
+   * htmlentities
+   * activerecord
+   * shorturl (= 0.8.2)
+     * There actually is a bug with shorturl-0.8.3 and posting to tinyurl, so don't use it (the writer changed the default protocol from post to get)
+ * Of course, you also need a database to store twitterize's data.
+ * Finally, the tedious and slow process of setting up Twitter accounts is still manual and each one needs a distinct email address (the username+blah@foo.com trick seems to work for some mailer daemons)
+
+ == INSTALL:
+
+ * sudo gem install twitterize
+ * Set up the database. Here is the schema for MySQL, for instance:
+
+   CREATE TABLE `feeds` (
+     `id` int(11) NOT NULL auto_increment,
+     `name` varchar(255) default NULL,
+     `url` text,
+     `user` varchar(255) default NULL,
+     `password` varchar(255) default NULL,
+     `last_check` datetime default NULL,
+     `next_check` datetime default NULL,
+     `interval` int(11) default NULL,
+     PRIMARY KEY (`id`)
+   ) DEFAULT CHARSET=utf8;
+
+   CREATE TABLE `items` (
+     `id` int(11) NOT NULL auto_increment,
+     `feed_id` int(11) default NULL,
+     `title` varchar(255) default NULL,
+     `guid` varchar(255) default NULL,
+     `link` text,
+     `twitter` varchar(255) default NULL,
+     `published_at` datetime default NULL,
+     `posted` tinyint(4) default '0',
+     `posted_at` datetime default NULL,
+     PRIMARY KEY (`id`)
+   ) DEFAULT CHARSET=utf8;
+
+ * Create a twitter.yml config file somewhere. It looks like the following:
+
+   database:
+     active record settings
+
+   feeds:
+     name1:
+       url: the url of the feed
+       user: the twitter account to post to
+       password: the twitter password
+       interval: (secs, optional) to force updates, despite ttl (default: 30 mins)
+     name2, etc.
+
+ * You can add more feeds to the config file at a later time and they will be added to the internal database when twitterize runs next.
+
+ == LICENSE:
+
+ (The MIT License)
+
+ Copyright (c) 2007 FIX
+
+ Permission is hereby granted, free of charge, to any person obtaining
+ a copy of this software and associated documentation files (the
+ 'Software'), to deal in the Software without restriction, including
+ without limitation the rights to use, copy, modify, merge, publish,
+ distribute, sublicense, and/or sell copies of the Software, and to
+ permit persons to whom the Software is furnished to do so, subject to
+ the following conditions:
+
+ The above copyright notice and this permission notice shall be
+ included in all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED 'AS IS', WITHOUT WARRANTY OF ANY KIND,
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
+ IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
+ CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
+ TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
+ SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
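Since the README says twitterize will not create or validate the config file for you, it may help to sanity-check a twitter.yml before the first run. The following is only a minimal sketch: the key names (`database`, `feeds`, `url`, `user`, `password`, `interval`) come from the README, but `check_config` is a hypothetical helper that is not part of twitterize, and the feed name, account, and database settings in the sample are made up.

```ruby
require 'yaml'

# Sample config in the layout the README describes; all values are invented.
SAMPLE_CONFIG = <<~YAML
  database:
    adapter: sqlite3
    database: twitterize.db
  feeds:
    example_feed:
      url: http://example.com/rss.xml
      user: example_account
      password: secret
      interval: 900
YAML

# Keys every feed entry needs; interval is optional per the README.
REQUIRED_FEED_KEYS = %w[url user password].freeze

# Hypothetical helper: returns a list of human-readable problems (empty if none).
def check_config(config)
  problems = []
  problems << "missing 'database' section" unless config['database'].is_a?(Hash)

  feeds = config['feeds']
  if !feeds.is_a?(Hash) || feeds.empty?
    problems << "no feeds defined"
  else
    feeds.each do |name, settings|
      settings = {} unless settings.is_a?(Hash)
      REQUIRED_FEED_KEYS.each do |key|
        problems << "feed '#{name}' is missing '#{key}'" if settings[key].to_s.empty?
      end
    end
  end
  problems
end

problems = check_config(YAML.safe_load(SAMPLE_CONFIG))
puts problems.empty? ? "config looks usable" : problems
```

Returning a list of problems rather than raising on the first one means a cron run can report every broken feed entry at once.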
data/Rakefile ADDED
@@ -0,0 +1,21 @@
+ # -*- ruby -*-
+
+ require 'rubygems'
+ require 'hoe'
+ require './lib/twitterize.rb'
+
+ Hoe.new('twitterize', Twitterize::VERSION) do |p|
+   p.rubyforge_name = 'nycrb'
+   p.author = 'Jacob Harris'
+   p.email = 'harrisj@schizopolis.net'
+   p.summary = 'A command-line program for posting multiple feeds to twitter accounts'
+   p.description = p.paragraphs_of('README.txt', 2).join("\n")
+   # p.url = p.paragraphs_of('README.txt', 0).first.split(/\n/)[1..-1]
+   p.changes = p.paragraphs_of('History.txt', 0..1).join("\n\n")
+   p.extra_deps << ['activerecord', '>= 1.15.0']
+   p.extra_deps << ['feedtools', '>= 0.2.26']
+   p.extra_deps << ['htmlentities', '>= 4.0.0']
+   p.extra_deps << ['shorturl', '= 0.8.2']
+ end
+
+ # vim: syntax=Ruby
data/bin/twitterize ADDED
@@ -0,0 +1,120 @@
+ #!/usr/bin/env ruby
+ require 'rubygems'
+ require 'twitterize'
+ require 'optparse'
+ require 'ostruct'
+ require 'shorturl'
+
+ include Twitterize
+
+ options = OpenStruct.new
+ options.config_file = nil
+ options.interval = nil
+ options.verbose = false
+ options.shorturl = :tinyurl
+
+ opts = OptionParser.new do |opts|
+   opts.banner = "Usage: twitterize [options]"
+   opts.separator ""
+   opts.separator "Specific options:"
+
+   # Mandatory argument.
+   opts.on("--config-file FILE", "The config file to run twitterize against") do |conf|
+     options.config_file = conf
+   end
+
+   opts.on("--lookback INTERVAL", "Only post feed items published in the last INTERVAL seconds", "Can be a straight number or followed by h, m, or d for units") do |interval|
+     options.lookback = case interval
+       when /^\d+$/ then interval.to_i
+       when /^\d+m$/ then interval.to_i * 60
+       when /^\d+h$/ then interval.to_i * 60 * 60
+       when /^\d+d$/ then interval.to_i * 60 * 60 * 24
+       else raise "Interval must be in the form NUMBER[d|m|h]"
+     end
+   end
+
+   # Boolean switch.
+   opts.on("-v", "--[no-]verbose", "Run verbosely") do |v|
+     options.verbose = v
+   end
+
+   opts.on("--shorturl SERVICE", WWW::ShortURL.valid_services, "Shorturl service to use (defaults to tinyurl)", "Allowed options:\n\t\t\t\t\t", WWW::ShortURL.valid_services.map { |s| s.to_s }.sort.join("\n\t\t\t\t\t")) do |shorturl|
+     options.shorturl = shorturl.to_sym
+   end
+
+   opts.separator ""
+   opts.separator "Common options:"
+
+   # No argument, shows at tail. This will print an options summary.
+   # Try it and see!
+   opts.on_tail("-h", "--help", "Show this message") do
+     puts opts
+     exit
+   end
+
+   # Another typical switch to print the version.
+   opts.on_tail("--version", "Show version") do
+     puts Twitterize::VERSION
+     exit
+   end
+ end
+
+ opts.parse!
+ puts options.inspect
+
+ if options.verbose
+   def log(string); puts string; end
+ else
+   def log(string); end
+ end
+
+ if options.config_file.nil?
+   puts opts
+   exit
+ end
+
+ db = TwitterDB.new(options.config_file)
+ db.update_feeds
+
+ log "Loading new feed items..."
+ feeds = Feed.find_refreshable
+
+ feeds.each do |feed|
+   log "Read feed #{feed.name}"
+   rss_feed = FeedTools::Feed.open(feed.url)
+   rss_feed.items.each do |i|
+     res = feed.save_rss_item(i, options.shorturl)
+     log " #{res.title}" if res
+   end
+
+   feed.mark_read(rss_feed.ttl)
+ end
+
+ log "Sending new items up to twitter"
+
+ post_items = Item.find_postable
+ begin
+   post_items.each do |item|
+     feed = item.feed
+     #puts "U: #{feed.user} P: #{feed.password} #{item.twitter}"
+     item.published_at = Time.now if item.published_at.nil?
+
+     if options.lookback and item.published_at < Time.now - options.lookback
+       log "Skipping (too old): #{item.twitter}"
+       item.posted = 2
+     else
+       log "Posting to #{feed.user}: #{item.twitter}"
+       item.posted = 1
+       TwitterAPI.post(feed.user, feed.password, item.twitter)
+       feed = item.feed
+     end
+
+     item.posted_at = Time.now
+     item.save
+     #sleep 10
+   end
+ rescue TwitterException => err
+   puts "Error posting to Twitter, will try next time around\n #{err.http_response.inspect.to_s}"
+ end
+
+ exit 0
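The `--lookback` handling in bin/twitterize above converts a string such as `12h` into seconds and then skips items older than that window. As a rough standalone sketch of the same rules, assuming nothing beyond the Ruby standard library: `parse_lookback`, `too_old?`, and `UNIT_SECONDS` are hypothetical names introduced here for illustration, not functions the script defines.

```ruby
# Seconds per unit suffix; a bare number (no suffix) is taken as seconds.
UNIT_SECONDS = { nil => 1, 'm' => 60, 'h' => 3600, 'd' => 86_400 }.freeze

# Hypothetical helper: turn "90", "30m", "12h" or "2d" into a number of seconds.
def parse_lookback(text)
  match = /\A(\d+)([mhd])?\z/.match(text)
  raise ArgumentError, "Interval must be in the form NUMBER[d|m|h]" unless match
  match[1].to_i * UNIT_SECONDS[match[2]]
end

# Hypothetical helper: true when an item falls outside the lookback window.
# A nil lookback means "no window", so nothing is considered too old.
def too_old?(published_at, lookback, now = Time.now)
  !lookback.nil? && published_at < now - lookback
end

puts parse_lookback('12h')
```

Keeping the parsing in one small function makes the unit rules easy to check in isolation, which matters because an item judged too old is marked as handled and never posted.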
data/lib/twitterize.rb ADDED
@@ -0,0 +1,141 @@
+ require 'rubygems'
+ require 'active_record'
+ gem 'shorturl', '= 0.8.2'
+ require 'feed_tools'
+ require 'net/https'
+ require 'htmlentities'
+ require 'uri'
+
+ $KCODE='UTF-8'
+
+ module Twitterize
+   VERSION = '1.0.0'
+   MAX_TWITTER_CHARS = 140
+
+   class Item < ActiveRecord::Base
+     belongs_to :feed
+
+     def Item.find_postable
+       Item.find(:all, :conditions => ["posted = ?", 0], :order => 'published_at ASC')
+     end
+
+     def Item.save_rss_item(feed, rss_item, shorturl)
+       item = Item.find(:first, :conditions => ["guid = ? and feed_id = ?", rss_item.guid, feed.id])
+       if item.nil? # not found
+       then
+         #puts "#{rss_item.guid} NEW"
+
+         tinyurl = WWW::ShortURL.shorten(rss_item.link, shorturl)
+         html_decoder = HTMLEntities.new
+
+         max_title_len = MAX_TWITTER_CHARS - (tinyurl.size + 4)
+         title = html_decoder.decode rss_item.title
+         twitter = title[0, max_title_len]
+         twitter += "..." if title.size > max_title_len
+         twitter += " " + tinyurl
+
+         item = Item.new
+         item.link = rss_item.link
+         item.guid = rss_item.guid
+         item.title = title
+         item.published_at = rss_item.published
+         item.twitter = twitter
+         item.posted = 0
+         feed.items << item
+         item.save
+         item
+       else
+         #puts "#{rss_item.guid} ALREADY IN DB"
+         nil
+       end
+     end
+   end
+
+   class Feed < ActiveRecord::Base
+     has_many :items
+
+     def Feed.find_refreshable
+       Feed.find(:all, :conditions => ["next_check IS NULL or next_check < ?", Time.now])
+     end
+
+     def save_rss_item(item, shorturl)
+       Item.save_rss_item(self, item, shorturl)
+     end
+
+     def Feed.update_feeds(feeds)
+       raise RuntimeError, "No feeds found" if feeds.size == 0
+
+       feeds.each do |f|
+         dbfeed = Feed.find_by_name(f[0])
+         dbfeed = Feed.new if dbfeed.nil?
+
+         dbfeed.name = f[0]
+         dbfeed.url = f[1]['url']
+         dbfeed.user = f[1]['user']
+         dbfeed.password = f[1]['password']
+         dbfeed.interval = f[1]['interval'] || nil
+         dbfeed.save
+       end
+     end
+
+     def mark_read(ttl=nil)
+       if interval.nil?
+         increment = ttl
+         increment = 30.minutes if increment.nil?
+       else
+         increment = interval
+       end
+
+       self.next_check = Time.now + increment
+       save
+     end
+   end
+
+
+   class TwitterDB
+     def initialize(config_file)
+       @config = YAML.load_file(config_file)
+       connect
+     end
+
+     def connect
+       ActiveRecord::Base.establish_connection(@config['database'])
+     end
+
+     def update_feeds
+       Feed.update_feeds(@config['feeds'])
+     end
+   end
+
+   class TwitterException < RuntimeError
+     attr_reader :http_response
+     def initialize(response)
+       @http_response = response
+     end
+   end
+
+   class TwitterAPI
+     def self.post(user, password, status)
+       post_args = {
+         'status' => status,
+         'source' => 'twitterize'
+       }
+
+       url = URI.parse('http://twitter.com/statuses/update.xml')
+       url.user = user
+       url.password = password
+
+       response = Net::HTTP::post_form url, post_args
+       puts response.inspect
+
+       case response
+       when Net::HTTPSuccess then
+         puts "POSTING #{status}"
+       when Net::HTTPFound then
+         puts "POSTING #{status}"
+       else
+         raise TwitterException, response
+       end
+     end
+   end
+ end
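`Item.save_rss_item` above fits a headline and a shortened link into 140 characters by reserving room for the URL, a separating space, and a three-character ellipsis. The arithmetic can be tried on its own with a sketch like the one below. It assumes the short URL is already known (no shortening service is called), and `compose_status` is a hypothetical name, not a method of the gem.

```ruby
MAX_TWITTER_CHARS = 140

# Hypothetical helper: build "<title> <short_url>", trimming the title and
# appending "..." only when the title does not fit.
def compose_status(title, short_url, max_chars = MAX_TWITTER_CHARS)
  # Reserve the URL length plus 4: three for "..." and one for the space.
  max_title_len = max_chars - (short_url.size + 4)
  raise ArgumentError, "short URL leaves no room for a title" if max_title_len < 1

  text = title.size > max_title_len ? title[0, max_title_len] + "..." : title
  "#{text} #{short_url}"
end

puts compose_status("A short headline", "http://tinyurl.com/abc")
```

With a 22-character short URL this leaves 114 characters for the title, so a truncated status comes out at exactly 140 characters.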
File without changes
metadata ADDED
@@ -0,0 +1,96 @@
+ --- !ruby/object:Gem::Specification
+ rubygems_version: 0.9.2
+ specification_version: 1
+ name: twitterize
+ version: !ruby/object:Gem::Version
+   version: 1.0.0
+ date: 2007-03-22 00:00:00 -04:00
+ summary: A command-line program for posting multiple feeds to twitter accounts
+ require_paths:
+ - lib
+ email: harrisj@schizopolis.net
+ homepage: http://www.zenspider.com/ZSS/Products/twitterize/
+ rubyforge_project: nycrb
+ description: Twitterize is a quick and dirty hack I did in a few hours to play with the Twitter API (seriously, there are no tests and I'm sure there is code crude enough in there to make you recant any friendship we might have). It allows you to take any number of RSS feeds and post them to one or more Twitter accounts. An example of this is how various RSS feeds from the New York Times are sent to twitter accounts nytimes, nyt_arts, nyt_biz, etc. This is accomplished via a command-line script that requires a separate configuration file (see below). Since Twitter is a rapidly growing (read somewhat flaky) service, twitterize also uses a database to store twitters to be posted and recover later if twitter is down. This also allows the app to retain feed GUIDs and avoid duplicate posts.
+ autorequire:
+ default_executable:
+ bindir: bin
+ has_rdoc: true
+ required_ruby_version: !ruby/object:Gem::Version::Requirement
+   requirements:
+   - - ">"
+     - !ruby/object:Gem::Version
+       version: 0.0.0
+   version:
+ platform: ruby
+ signing_key:
+ cert_chain:
+ post_install_message:
+ authors:
+ - Jacob Harris
+ files:
+ - History.txt
+ - Manifest.txt
+ - README.txt
+ - Rakefile
+ - bin/twitterize
+ - lib/twitterize.rb
+ - test/test_twitterize.rb
+ test_files:
+ - test/test_twitterize.rb
+ rdoc_options: []
+
+ extra_rdoc_files: []
+
+ executables:
+ - twitterize
+ extensions: []
+
+ requirements: []
+
+ dependencies:
+ - !ruby/object:Gem::Dependency
+   name: activerecord
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Version::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.15.0
+     version:
+ - !ruby/object:Gem::Dependency
+   name: feedtools
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Version::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 0.2.26
+     version:
+ - !ruby/object:Gem::Dependency
+   name: htmlentities
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Version::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 4.0.0
+     version:
+ - !ruby/object:Gem::Dependency
+   name: shorturl
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Version::Requirement
+     requirements:
+     - - "="
+       - !ruby/object:Gem::Version
+         version: 0.8.2
+     version:
+ - !ruby/object:Gem::Dependency
+   name: hoe
+   version_requirement:
+   version_requirements: !ruby/object:Gem::Version::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: 1.2.0
+     version: