DRMacIver-reddilicious 0.0.1
- data/LICENSE +25 -0
- data/README.markdown +26 -0
- data/Rakefile +13 -0
- data/VERSION +1 -0
- data/bin/reddilicious +51 -0
- data/lib/blacklist.rb +44 -0
- data/lib/delicious.rb +15 -0
- data/lib/post.rb +90 -0
- data/lib/reddilicious.rb +195 -0
- data/lib/reddit.rb +60 -0
- data/lib/site.rb +90 -0
- data/lib/stumbleupon.rb +106 -0
- data/lib/twitter.rb +96 -0
- data/reddilicious.gemspec +48 -0
- metadata +67 -0
data/LICENSE
ADDED
@@ -0,0 +1,25 @@
Copyright (c) 2009, David R. MacIver
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:
    * Redistributions of source code must retain the above copyright
      notice, this list of conditions and the following disclaimer.
    * Redistributions in binary form must reproduce the above copyright
      notice, this list of conditions and the following disclaimer in the
      documentation and/or other materials provided with the distribution.
    * Neither the name of reddilicious nor the
      names of its contributors may be used to endorse or promote products
      derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY David R. MacIver 'AS IS' AND ANY
EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL David R. MacIver BE LIABLE FOR ANY
DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
data/README.markdown
ADDED
@@ -0,0 +1,26 @@
## Reddilicious

Reddilicious automatically imports links you've upvoted on reddit and other social bookmarking sites into delicious. You simply provide it with your account details on each site and set up a cron job to run it regularly. It takes care of the rest.

### Notes on behaviour:

* Despite the name, reddilicious actually imports from a bunch of different sites. Currently only twitter, reddit and stumbleupon, but that's purely because those are the ones I use at the moment.
* Links are tagged with via:source, plus any tags that can be obtained from the source site.
* The date on the link is set to the time the URL was originally posted on the source site, not the time you imported it (which plays badly with historical data) or the time you upvoted it (which isn't available information everywhere).
* It will import your entire history from each site, so the initial run will take a while.
* It is very slow. There's a mixture of reasons for this: it's written in Ruby, it generates a fair bit of HTTP traffic, and it deliberately rate limits itself in a lot of cases. The single biggest reason, though, is that I don't particularly care and haven't optimised it. It's intended to be run a couple of times an hour by an automated task, and you have to hit pretty damn heavy traffic before it's too slow for that.
* Twitter support pulls in any URLs mentioned on your friends timeline, automatically tagging them based on delicious suggestions and with information about who the tweet was to and from.

### Some general comments:

* The code is currently a bit grim in places. Some of this is inevitable - site scraping is never going to look pretty - and some of it will probably be cleaned up at various points.
* Patches are *exceedingly* welcome. There are a pile of sites this could reasonably import from, and I don't use so much as a tenth of them. If you want this to handle your favourite social bookmarking or similar site, please feel free to submit a patch.
* I'm currently changing the internal format on a semi-regular basis as I figure things out. Once I've got an actual release out I'll have a proper versioning system for upgrades etc., but I don't yet.

### Dependencies

* [json](http://json.rubyforge.org/)
* [nokogiri](http://github.com/tenderlove/nokogiri/tree/master)
* [httparty](http://github.com/jnunemaker/httparty/tree/master)
* [mechanize](http://mechanize.rubyforge.org/mechanize/) (for stumbleupon)
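The README says to drive the importer from cron. A minimal crontab entry for that might look like the following; the twice-hourly schedule and the assumption that the `reddilicious` executable is on cron's PATH are mine, not part of the gem:

```shell
# Hypothetical crontab entry: run the importer at :05 and :35 every hour.
# Output goes to ~/.reddilicious/reddilicious.log, so cron output can be discarded.
5,35 * * * * reddilicious update >/dev/null 2>&1
```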
data/Rakefile
ADDED
@@ -0,0 +1,13 @@
require 'rubygems'
require 'rake'

require 'jeweler'
Jeweler::Tasks.new do |gem|
  gem.name = "reddilicious"
  gem.summary = "reddilicious is a tool for automatically importing links into delicious"
  gem.email = "david.maciver@gmail.com"
  gem.homepage = "http://github.com/DRMacIver/reddilicious"
  gem.authors = ["David R. MacIver"]

  # gem is a Gem::Specification... see http://www.rubygems.org/read/chapter/20 for additional settings
end
data/VERSION
ADDED
@@ -0,0 +1 @@
0.0.2
data/bin/reddilicious
ADDED
@@ -0,0 +1,51 @@
#!/usr/bin/env ruby
$: << File.join(File.dirname(__FILE__), "../lib")

require "reddilicious.rb"

def usage
  STDERR.puts <<-USAGE
Usage:
  reddilicious update       # Update a reddilicious instance, posting the new bookmarks to delicious
  reddilicious undo [site]  # Delete all imported posts (from a specific site, or all if not specified)
  USAGE
  Reddilicious.site_names.each do |site|
    STDERR.puts "  reddilicious #{site} # set your #{site} user"
  end

  exit(1)
end

dir = ENV["REDDILICIOUS_HOME"] || File.join(ENV["HOME"], ".reddilicious")
reddilicious = Reddilicious.new(dir)

if !File.directory?(dir)
  puts "no such directory #{dir}. Creating..."
  Dir.mkdir(dir)

  puts "Delicious user name:"
  delicious = STDIN.gets.strip
  delicious = nil if delicious == ""
  puts "Delicious password:"
  delicious_password = STDIN.gets.strip
  reddilicious.create!(delicious, delicious_password)
end

case ARGV[0]
when "update"
  File.open("#{dir}/reddilicious.log", "a") do |log|
    log.sync = true
    $stdout = log
    $stderr = log
    reddilicious.transfer_to_delicious
  end
when *Reddilicious.site_names
  reddilicious.site_for(ARGV[0]).ask_for_credentials
when "undo"
  sites = ARGV[1..-1].empty? ? Reddilicious.site_names : ARGV[1..-1]
  puts "undo import for sites #{sites.inspect}: are you sure? (y/n)"
  if STDIN.gets.strip.downcase == 'y'
    sites.each { |s| reddilicious.site_for(s).undo_import! }
  end
else usage
end
data/lib/blacklist.rb
ADDED
@@ -0,0 +1,44 @@
class Blacklist
  def initialize(blacklist)
    @blacklist = Hash.new{|h, k| h[k] = [] }

    blacklist.each { |list|
      list.each { |tag|
        @blacklist[tag] << list
      }
    }
  end

  def self.from_file(file)
    return nil if !File.exists?(file)
    Blacklist.new(IO.read(file).split("\n").map{|l| l.split})
  end

  def blacklisted?(tags)
    if tags.is_a?(String)
      tags = tags.split
    end

    if !tags.is_a?(Array)
      raise "unrecognised argument #{tags.inspect}"
    end

    shallow_flatten(tags.
      map{|t| @blacklist[t]}.
      compact).
      any?{|set|
        !set.empty? && set.all?{|t|
          tags.include?(t)
        }
      }
  end

  private

  def shallow_flatten(enum)
    it = []
    enum.each{|x| it += x }
    it
  end

end
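The matching rule in Blacklist above is: a post is blacklisted when *every* tag on some line of the blacklist file appears among the post's tags. A standalone sketch mirroring that logic (this does not use the class above, it just restates the rule):

```ruby
# Each inner array is one line of the blacklist file, already split on spaces.
blacklist = [%w[nsfw], %w[politics us]]

# A post is skipped when ALL tags on some blacklist line occur in its tags.
def blacklisted?(blacklist, tags)
  tags = tags.split if tags.is_a?(String)
  blacklist.any? { |line| !line.empty? && line.all? { |t| tags.include?(t) } }
end

blacklisted?(blacklist, "nsfw via:reddit")     # => true  (matches the "nsfw" line)
blacklisted?(blacklist, "politics via:reddit") # => false (the "politics us" line needs both tags)
```

So a line with several tags acts as an AND, and separate lines act as an OR.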
data/lib/delicious.rb
ADDED
@@ -0,0 +1,15 @@
require "rubygems"
require "httparty"

module Delicious
  include HTTParty
  base_uri "https://api.del.icio.us/v1"

  # Please set your User-Agent to something identifiable.
  # The default identifiers like "Java/1.4.3" or "lwp-perl" etc tend to get banned from time to time.
  headers 'User-Agent' => 'reddilicious (0.1)'

  format :xml
end
data/lib/post.rb
ADDED
@@ -0,0 +1,90 @@
require "ostruct"
require "set"
require "rubygems"
require "nokogiri"
require "open-uri"

class Post
  NEW_MARKER = "imported_by:reddilicious"

  Data = [:url, :dt, :description, :extended, :tags]

  attr_accessor *Data

  def initialize(hash=nil)
    yield self if block_given?
    if hash
      hash.each do |key, value|
        instance_variable_set("@" + key, value)
      end
    end
    raise "all posts must have a URL" if !self.url
  end

  def to_h
    it = {}
    Data.each{|d| it[d] = instance_variable_get("@" + d.to_s)}
    it
  end

  def auto_imported?
    tag_set.include?(NEW_MARKER)
  end

  def tag_set
    if !self.tags then Set.new else Set[*self.tags.split] end
  end

  def fetch_metadata!(suggest_tags=true)
    self.description ||= begin
      Nokogiri::HTML(open(url)).xpath("//title").text.gsub("\n", " ").gsub(/ +/, " ").strip
    rescue Exception => e
      puts "WARNING: #{e}"
      url
    end

    self.description = url if description.empty?

    if suggest_tags
      suggest = Delicious.get("/posts/suggest", :query => {:url => url} )
      if suggest['suggest']
        suggested_tags = suggest["suggest"]["popular"] || []
        self.tags = suggested_tags.is_a?(Array) ? suggested_tags.join(" ") : suggested_tags
      end
      sleep 1
    end
  end

  def merge(that)
    return self if !that
    raise "cannot merge posts with different URLs: #{self.url} != #{that.url}" if self.url != that.url

    result = Post.new{|p|
      p.url = self.url
      p.description = that.description

      if self.extended && !that.extended
        p.extended = self.extended
      else
        p.extended = that.extended
      end

      old_tags = that.tag_set
      new_tags = self.tag_set - old_tags

      # remove the new marker unless it was already present in the old post
      new_tags -= [NEW_MARKER] unless that.auto_imported?

      p.tags = if !new_tags.empty?
        new_tags.to_a.join(" ") + " " + that.tags
      else
        that.tags
      end

      p.dt = [self.dt, that.dt].compact.min
    }

    result = nil if result == that # FIXME
    result
  end
end
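The tag handling inside Post#merge above can be isolated: genuinely new tags are prepended to the existing delicious tags, and the imported_by:reddilicious marker is dropped unless the existing post already carried it (so a bookmark the user saved by hand never becomes "auto-imported"). A standalone sketch of just that rule:

```ruby
require "set"

NEW_MARKER = "imported_by:reddilicious"

# Mirrors the tag logic of Post#merge. `new_tags`/`old_tags` are
# space-separated tag strings; returns the merged tag string.
def merged_tags(new_tags, old_tags)
  old = Set[*old_tags.split]
  fresh = Set[*new_tags.split] - old
  # don't mark a bookmark as auto-imported if the user already had it
  fresh -= [NEW_MARKER] unless old.include?(NEW_MARKER)
  fresh.empty? ? old_tags : fresh.to_a.join(" ") + " " + old_tags
end
```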
data/lib/reddilicious.rb
ADDED
@@ -0,0 +1,195 @@
require "site"
require "httparty"
require "delicious"
require "json"
require 'net/http'
require 'uri'
require "blacklist"

class Reddilicious
  # lambdas to get lazy loading
  SitesToClasses = {
    "reddit" => lambda{
      require "reddit"
      Reddit::Liked
    },

    "stumbleupon" => lambda{
      require "stumbleupon"
      StumbleUpon::Favourites
    },
    "twitter" => lambda{
      require "twitter"
      Twitter::FriendsTimeline
    }
  }

  attr_accessor :dir
  def initialize(dir)
    @dir = dir
    @blacklist = Blacklist.from_file(File.join(dir, "blacklist"))
    @untiny_cache = if File.exists?(untiny_cache_file)
      JSON.parse(IO.read(untiny_cache_file))
    else
      {}
    end

    if File.exists?(details_file)
      Delicious.basic_auth(details["delicious_user"], details["delicious_password"])
    end
  end

  def sites
    @sites ||= Dir["#{@dir}/*"].map{|x| site_for(File.basename(x))}.compact
  end

  def site_for(x)
    c = SitesToClasses[x]
    c && c.call.new(self)
  end

  def create!(delicious, delicious_password)
    File.open(details_file, "w"){|o|
      o.puts({:delicious_user => delicious, :delicious_password => delicious_password}.to_json)
    }
  end

  def details
    @details ||= JSON.parse(IO.read(details_file))
  end

  def update_time
    puts "Checking the server for last update time"
    sleep 1
    time = Delicious.get("/posts/update")["update"]["time"]
    puts "last updated at #{time}"
    time
  end

  def update_untiny_cache
    File.open(untiny_cache_file, 'w') { |f| f << JSON.pretty_generate(@untiny_cache) }
  end

  def untiny_url(url, n=0)
    @untiny_cache[url] ||= begin
      puts "untiny #{url}"
      resp = Net::HTTP.get_response(URI.parse(url))
      url = resp['location'] || url
      if [301, 302].include?(resp.code.to_i) && n <= 3 && resp['location']
        untiny_url(resp['location'], n + 1)
      else
        url
      end
    rescue Exception => e
      puts "WARNING: #{e}"
      url
    end
  end

  def bookmark_for(url, suggest_tags=true)
    url = untiny_url(url)
    Post.new do |post|
      post.url = url
    end
  end

  def delicious_posts
    puts "Reading existing delicious bookmarks"
    delicious_post_file = File.join(@dir, "bookmarks.json")

    @existing_posts = nil

    if File.exist? delicious_post_file
      @existing_posts ||= JSON.parse(IO.read(delicious_post_file))
    end

    if !@existing_posts || (update_time > @existing_posts["updated"])
      puts "Bookmarks out of date. Fetching from server"
      posts = Delicious.get("/posts/all")["posts"]
      raise "error fetching all posts" unless posts

      @existing_posts = {"updated" => posts["update"], "posts" => posts["post"] || []}
      File.open(delicious_post_file, "w") do |o|
        o.puts JSON.pretty_generate(@existing_posts)
      end
    else
      puts "Nothing to do here: our existing bookmarks are up to date"
    end

    posts = (@existing_posts["posts"] || []).map{|x| Post.new(x){|p| p.url = x["href"]; p.tags = x["tag"]}}
    puts "found #{posts.length} existing bookmarks"
    posts
  end

  def transfer_to_delicious
    puts "Beginning import at #{Time.now}"
    puts "Found importers for #{sites.join(", ")}"
    new_updates = sites.map{|x| x.update!}.flatten
    update_untiny_cache

    puts "#{new_updates.length} urls to import"

    return if new_updates.empty?

    urls_to_posts = {}

    new_updates.each do |post|
      urls_to_posts[post.url] = post.merge(urls_to_posts[post.url])
    end

    puts "checking for existing bookmarks"
    delicious_posts.each do |post|
      update = urls_to_posts[post.url]
      if update
        puts "merging existing post for #{post.url}"
        urls_to_posts[post.url] = update.merge post
      end
    end

    new_updates = urls_to_posts.values.compact

    puts "#{new_updates.length} urls after merging"

    new_updates.each do |update|

      blacklisted = @blacklist && @blacklist.blacklisted?(update.tags)

      if blacklisted
        puts "ignoring #{update.description} (#{update.url}) because its tags are blacklisted (tags #{update.tags})"
      else
        puts "importing #{update.description} (#{update.url})"

        update.fetch_metadata!(false)

        res = Delicious.post("/posts/add", :query => update.to_h)
        if !res['result'] || res['result']['code'] != 'done'
          puts "error importing post: #{res.inspect}"
        end

        sleep(1)
      end
    end
    puts "Saving data to storage"
    sites.each{|x| x.save!}

    puts "Import complete"

    nil
  end

  def self.site_names
    SitesToClasses.keys
  end

  private

  def details_file
    File.join(dir, "details.json")
  end

  def untiny_cache_file
    File.join(dir, "untiny.json")
  end

end
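Putting the paths above together, the state directory (~/.reddilicious by default, or $REDDILICIOUS_HOME) ends up laid out roughly like this; the per-site subdirectories and their files come from lib/site.rb, and the `twitter/` name is just one example of a configured site:

```
~/.reddilicious/
├── details.json        # delicious credentials, written by create!
├── blacklist           # optional; one space-separated tag set per line
├── untiny.json         # cache of resolved short-URL redirects
├── bookmarks.json      # cached copy of your delicious bookmarks
├── reddilicious.log    # output of `reddilicious update`
└── twitter/            # one directory per configured site
    ├── credentials.json
    └── posts.json
```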
data/lib/reddit.rb
ADDED
@@ -0,0 +1,60 @@
require "site"
require "httparty"

# Fetches the "liked" history of a reddit user as json

module Reddit
  include HTTParty
  base_uri "http://reddit.com"
  format :json

  class Liked < Site
    def name
      "reddit"
    end

    def update!
      puts "Updating Reddit"
      balance
      results = []
      new_results = nil
      after = nil
      i = 0
      while !(new_results = merge_results(Reddit.get("/user/#{credentials["user"].strip}/liked/.json", :query => {"after" => after})["data"]["children"].map{|x| x["data"]})).empty?
        puts "fetching reddit page #{i}"
        results += new_results
        after = new_results[-1]["name"]
        i += 1
      end

      results.map{ |update| to_post(update) }
    end

    def to_post(data)
      Post.new({
        "url" => data["url"].gsub("&amp;", "&"), # TODO: Better unescaping
        "description" => data["title"],
        "tags" => ["via:reddit", data["subreddit"]].join(" "),
        "replace" => "yes",
        "dt" => Time.at(data["created_utc"]).strftime("%Y-%m-%dT%H:%M:%SZ")
      })
    end

    def date(post)
      post["created_utc"]
    end

    private

    def merge_results(results)
      return [] if !results
      new_results = results.select do |x|
        !@ids.include?(identifier(x))
      end
      @posts += new_results
      @ids.merge(results.map{|x| x["name"]})
      new_results
    end
  end
end
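The update! loop above pages through reddit's listing API by passing the "name" (fullname) of the last item seen as the `after` cursor on the next request. A standalone sketch of that cursor loop, with a hypothetical in-memory stand-in for the HTTP call:

```ruby
# Hypothetical stand-in for Reddit.get(...): pages keyed by "after" cursor,
# where each item's "name" is its reddit fullname (e.g. "t3_abc").
PAGES = {
  nil    => [{"name" => "t3_a"}, {"name" => "t3_b"}],
  "t3_b" => [{"name" => "t3_c"}],
  "t3_c" => []
}

results = []
after = nil
# Same shape as the while loop in Reddit::Liked#update!:
# keep fetching until a page comes back empty.
until (page = PAGES[after]).empty?
  results += page
  after = page[-1]["name"]  # cursor for the next request
end
```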
data/lib/site.rb
ADDED
@@ -0,0 +1,90 @@
require "post"
require "rubygems"
require "json"
require "set"

class Site
  attr_accessor :posts

  def initialize(reddilicious)
    @reddilicious = reddilicious
    @dir = File.join(reddilicious.dir, name)
    @ids = Set.new
    Dir.mkdir(@dir) if !File.exists? @dir
    yield(self) if block_given?
  end

  def credentials=(credentials)
    @credentials = credentials
    File.open(credentials_file, "w"){|o| o.puts JSON.pretty_generate(@credentials)}
  end

  def credentials
    @credentials ||= if File.exists? credentials_file then JSON.parse(IO.read(credentials_file)) else {} end
  end

  def posts
    @posts ||= if File.exists? posts_file then JSON.parse(IO.read(posts_file)) else [] end
  end

  def balance
    @ids.merge(posts.map{|p| identifier(p)})
    posts.sort!{|x, y| date(y) <=> date(x)}
  end

  def save!
    balance
    File.open(posts_file, "w"){|o| o.puts JSON.pretty_generate(@posts)} if @posts
  end

  def to_post(data)
    Post.new(data)
  end

  def identifier(post)
    post["url"]
  end

  # Not required to return an actual date, only something which compares
  # in the correct order to be read as such
  def date(post)
    post["dt"]
  end

  def to_s
    name
  end

  def imported_links
    Delicious.get("/posts/all", :query=>{:tag=>["via:#{name}", Post::NEW_MARKER].join(' ')})['posts']['post'] || []
  end

  def undo_import!
    posts = imported_links
    posts = [posts] if posts.is_a?(Hash)
    puts "undo imports for #{self}" unless posts.empty?

    posts.each do |p|
      puts "deleting #{p['description']}"
      res = Delicious.delete('/posts/delete', :query=>{:url=>p['href']})
      if !res['result'] || res['result']['code'] != 'done'
        puts "delete failed: #{res.inspect}"
      end
    end
  end

  def ask_for_credentials
    puts "#{name} user name:"
    self.credentials = {"username" => STDIN.gets.strip}
  end

  private

  def posts_file
    File.join(@dir, "posts.json")
  end

  def credentials_file
    File.join(@dir, "credentials.json")
  end
end
data/lib/stumbleupon.rb
ADDED
@@ -0,0 +1,106 @@
require "rubygems"
require "mechanize"
require "json"
require "set"
require "site"

module StumbleUpon

  class Favourites < Site

    def name
      "stumbleupon"
    end

    def update!
      puts "Updating Stumbleupon"
      balance
      i = 0

      results = []
      new_results = nil

      while !new_results || !new_results.empty?
        puts "fetching stumbleupon page #{i}"
        new_results = StumbleUpon.fetch_page(credentials["user"], i).select{|x| !@ids.include? identifier(x)}

        results += new_results
        i += 1
      end

      @posts += results
      balance
      results.map{|u| to_post(u)}
    end
  end

  def self.fetch_page(user, page_number=nil)
    url = "http://www.stumbleupon.com/stumbler/#{user}/favorites/"
    if page_number && (page_number > 0)
      url += (page_number * 10).to_s
      url += "/"
    end

    mechanize = WWW::Mechanize.new{|agent| agent.user_agent_alias = "Linux Mozilla"}

    list_view = mechanize.get(url).search("a").select{|x| x["href"] =~ /viewmode=list/}[0]

    mechanize.click(list_view) if list_view

    mechanize.page.search("dl.dlBlog").map{|post| parse_review(post)}.compact
  end

  private

  def self.parse_review(review)
    title_elem = review.search("dt")[0]
    title = title_elem.text

    href = nil
    # stumbleupon seems to be doing some mangling which means
    # I'm seeing different results here than in firefox.
    # For the moment the following is our "best guess" as to
    # what the URL should be.

    urls = review.search("a").map{|x| x["href"]}

    urls.reject!{|x| x !~ /^http:\/\//} # remove javascript-only and relative links
    urls.reject!{|x| x =~ /^http:\/\/www.stumbleupon.com/} # remove internal links

    if urls.length == 1
      href = urls[0]
    else
      STDERR.puts "I didn't know what to do with the post #{title}. Its URLs made no sense"
      return
    end

    tags = review.search("a").map{|x| x["href"]}.grep(/\/tag\//).map{|h| h.gsub("/tag/", "").gsub("/", "")}

    tags << "via:stumbleupon"

    contents = review.search("dd").select{|x| x["id"] =~ /blog_contents/}
    if !contents.empty?
      contents = contents[0].children.select{|x| x.text?}.join("\n")
    else
      contents = nil
    end

    datetime_string = review.search(".stats")[0].text.gsub(/(am|pm).*$/){$1}.strip.gsub(",", " ").gsub(/ +/, " ")

    # Try and parse something useful out of the SU date string.
    time_re = /([0-9:]+(?:am|pm))/
    date_string = datetime_string.gsub(time_re, "").strip
    date = (if date_string == "" then DateTime.now else DateTime.parse(date_string) end)
    time = DateTime.parse(datetime_string.scan(time_re)[0][0])
    dt = Time.utc(date.year, date.month, date.day, time.hour, time.min)

    it = {"url" => href, "description" => title, "tags" => tags.join(" "), "dt" => dt && dt.strftime("%Y-%m-%dT%H:%M:%SZ")}

    it["extended"] = contents if contents

    it
  end
end
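The date handling at the end of parse_review above splits the scraped string into a date part and a time part, parses each separately, and recombines them as UTC. Extracted on its own (the sample string is made up; real .stats text may differ):

```ruby
require "date"

time_re = /([0-9:]+(?:am|pm))/
datetime_string = "July 14 2009 3:45pm"  # hypothetical cleaned-up .stats text

# Strip the time to parse the date, then parse the time on its own.
date_string = datetime_string.gsub(time_re, "").strip
date = date_string == "" ? DateTime.now : DateTime.parse(date_string)
time = DateTime.parse(datetime_string.scan(time_re)[0][0])

# Recombine the two halves into a single UTC timestamp.
dt = Time.utc(date.year, date.month, date.day, time.hour, time.min)
dt.strftime("%Y-%m-%dT%H:%M:%SZ")  # => "2009-07-14T15:45:00Z"
```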
data/lib/twitter.rb
ADDED
@@ -0,0 +1,96 @@
require "rubygems"
require "httparty"
require "reddilicious"

module Twitter
  include HTTParty
  base_uri "http://twitter.com"
  format :json

  class FriendsTimeline < Site
    def name
      "twitter"
    end

    def update!
      puts "Updating twitter"
      balance

      last_post_id = posts[0] && posts[0]["id"]

      results = []

      new_tweets = nil

      query = {:count => 200}

      query[:since_id] = last_post_id if last_post_id

      while !(new_tweets = get_tweets(query)).empty?
        results += new_tweets
        puts "importing twitter page #{query[:page] || 0}"
        query[:page] = (query[:page] || 0) + 1
      end

      @posts += results

      results.map do |res|
        urls = res["text"].scan(/(http:\/\/[^,()" ]+)/).flatten
        ats = res["text"].scan(/@([[:alnum:]]+)/).flatten
        hashtags = res["text"].scan(/#([[:alnum:]]+)/).flatten
        retweet = res["text"] =~ /RT[^a-zA-Z]/ || res["text"] =~ /\(via @[^)]+\)/

        urls.map do |url|
          post = @reddilicious.bookmark_for(url)
          post.tags = [
            "via:twitter",
            Post::NEW_MARKER,
            ats.map{|a| "to:" + a}.sort,
            "from:#{res["user"]["screen_name"]}",
            hashtags,
            ("retweet" if retweet),
            post.tags
          ].compact.flatten.join(" ").strip

          post.extended = "Imported from http://twitter.com/#{res["user"]["screen_name"]}/status/#{res["id"]}\n\n#{res["text"]}"
          post.dt = date(res).strftime("%Y-%m-%dT%H:%M:%SZ")

          post
        end
      end.flatten
    end

    def get_tweets(query)
      begin
        res = Twitter.get("/statuses/friends_timeline.json", :query => query, :basic_auth => {:username => credentials["username"], :password => credentials["password"]})
        raise "Error fetching timeline: '#{res['error']}'" if res.is_a?(Hash) && res['error']
        res
      rescue Crack::ParseError
        raise # TODO
      end
    end

    def identifier(post)
      post["id"]
    end

    def date(post)
      DateTime.parse(post["created_at"])
    end

    def ask_for_credentials
      puts "#{name} user name:"
      user = STDIN.gets.strip
      puts "#{name} password:"
      pass = STDIN.gets.strip
      self.credentials = {"username" => user, "password" => pass}
    end
  end

end
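The scans in update! above pull URLs, @-mentions, hashtags and retweet markers out of the raw tweet text. Run against a made-up tweet, the same regexes behave like this:

```ruby
text = "RT @bob: great post http://example.com/x #ruby (via @alice)"  # hypothetical tweet

# Same patterns as in Twitter::FriendsTimeline#update!.
urls     = text.scan(/(http:\/\/[^,()" ]+)/).flatten
ats      = text.scan(/@([[:alnum:]]+)/).flatten
hashtags = text.scan(/#([[:alnum:]]+)/).flatten
retweet  = text =~ /RT[^a-zA-Z]/ || text =~ /\(via @[^)]+\)/

urls      # => ["http://example.com/x"]
ats       # => ["bob", "alice"]
hashtags  # => ["ruby"]
```

Note the URL pattern deliberately stops at commas, parens, quotes and spaces, so a link at the end of a parenthesised aside isn't swallowed.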
data/reddilicious.gemspec
ADDED
@@ -0,0 +1,48 @@
# -*- encoding: utf-8 -*-

Gem::Specification.new do |s|
  s.name = %q{reddilicious}
  s.version = "0.0.1"

  s.required_rubygems_version = Gem::Requirement.new(">= 0") if s.respond_to? :required_rubygems_version=
  s.authors = ["David R. MacIver"]
  s.date = %q{2009-07-16}
  s.default_executable = %q{reddilicious}
  s.email = %q{david.maciver@gmail.com}
  s.executables = ["reddilicious"]
  s.extra_rdoc_files = [
    "LICENSE",
    "README.markdown"
  ]
  s.files = [
    "LICENSE",
    "README.markdown",
    "Rakefile",
    "VERSION",
    "bin/reddilicious",
    "lib/blacklist.rb",
    "lib/delicious.rb",
    "lib/post.rb",
    "lib/reddilicious.rb",
    "lib/reddit.rb",
    "lib/site.rb",
    "lib/stumbleupon.rb",
    "lib/twitter.rb",
    "reddilicious.gemspec"
  ]
  s.homepage = %q{http://github.com/DRMacIver/reddilicious}
  s.rdoc_options = ["--charset=UTF-8"]
  s.require_paths = ["lib"]
  s.rubygems_version = %q{1.3.4}
  s.summary = %q{reddilicious is a tool for automatically importing links into delicious}

  if s.respond_to? :specification_version then
    current_version = Gem::Specification::CURRENT_SPECIFICATION_VERSION
    s.specification_version = 3

    if Gem::Version.new(Gem::RubyGemsVersion) >= Gem::Version.new('1.2.0') then
    else
    end
  else
  end
end
metadata
ADDED
@@ -0,0 +1,67 @@
--- !ruby/object:Gem::Specification
name: DRMacIver-reddilicious
version: !ruby/object:Gem::Version
  version: 0.0.1
platform: ruby
authors:
- David R. MacIver
autorequire:
bindir: bin
cert_chain: []

date: 2009-07-16 00:00:00 -07:00
default_executable: reddilicious
dependencies: []

description:
email: david.maciver@gmail.com
executables:
- reddilicious
extensions: []

extra_rdoc_files:
- LICENSE
- README.markdown
files:
- LICENSE
- README.markdown
- Rakefile
- VERSION
- bin/reddilicious
- lib/blacklist.rb
- lib/delicious.rb
- lib/post.rb
- lib/reddilicious.rb
- lib/reddit.rb
- lib/site.rb
- lib/stumbleupon.rb
- lib/twitter.rb
- reddilicious.gemspec
has_rdoc: false
homepage: http://github.com/DRMacIver/reddilicious
post_install_message:
rdoc_options:
- --charset=UTF-8
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
  version:
required_rubygems_version: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
    - !ruby/object:Gem::Version
      version: "0"
  version:
requirements: []

rubyforge_project:
rubygems_version: 1.2.0
signing_key:
specification_version: 3
summary: reddilicious is a tool for automatically importing links into delicious
test_files: []