tweetabout 0.0.1
- data/.gitignore +17 -0
- data/Gemfile +4 -0
- data/LICENSE +22 -0
- data/README.md +84 -0
- data/Rakefile +2 -0
- data/lib/tweetabout/version.rb +3 -0
- data/lib/tweetabout.rb +67 -0
- data/tweetabout.gemspec +19 -0
- metadata +70 -0
data/.gitignore
ADDED
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
Copyright (c) 2012 Dylan Jhaveri

MIT License

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,84 @@
== Tweets About Gem

This gem takes a Twitter username and returns a list of words.

The words are an ordered list of the most frequently tweeted words, based on the user's last 1,000 tweets.
* Retweets are included.
* The casing of words doesn't matter (the = The = THE). Output is downcased.
* URLs are removed, because that makes sense.


== Installation

`gem install tweetabout`

Gemfile

`gem 'tweetabout'`

==Twitter API

The Twitter API imposes a restriction on requests for users' timelines. Each request can receive a maximum of 200 tweets, so getting 1,000 tweets means making 5 round trips to the API server. Let's see how long these requests take. These are the measured response times of the GET request for 5 different Twitter usernames:

ex: GET `http://api.twitter.com/1/statuses/user_timeline.json?screen_name=#{user}&include_rts=true&count=200`
(time in ms)

            trial1    trial2    trial3    trial4    trial5
request1    806.521   740.214   1363.33   490.090   331.253
request2    720.767   537.374   673.249   532.168   478.668
request3    492.608   733.955   560.918   547.887   380.81
request4    480.757   945.29    645.733   605.972   340.256
request5    575.621   469.731   707.737   826.423   169.244

Based on this small test, we can see that response times from the API vary from a few hundred milliseconds to more than a second. Of course this is all influenced by time of day, network connection, and a variety of other factors, but it's good to know that if we have to make 5 trips in a row to the Twitter API server, we can't count on it being very fast. In fact, 5 requests could easily take 2.5 to 3 seconds to complete. No doubt this is the slowest part of this app.

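As a rough cross-check of the 2.5 to 3 second estimate above, the following Ruby sketch totals each trial column of the table, treating a column as five sequential round trips. The constant and variable names are illustrative and are not part of the gem; the numbers are copied from the table.

```ruby
# Response times in ms, copied from the table above.
# Rows are request1..request5, columns are trial1..trial5.
TIMINGS_MS = [
  [806.521, 740.214, 1363.33, 490.090, 331.253],
  [720.767, 537.374, 673.249, 532.168, 478.668],
  [492.608, 733.955, 560.918, 547.887, 380.81],
  [480.757, 945.29,  645.733, 605.972, 340.256],
  [575.621, 469.731, 707.737, 826.423, 169.244]
].freeze

# Total per trial: the sum of one column, i.e. five requests made in a row.
trial_totals = TIMINGS_MS.transpose.map { |column| column.sum.round(3) }

# Mean time of a single request across all 25 measurements.
all_times = TIMINGS_MS.flatten
mean_ms = (all_times.sum / all_times.size).round(1)

trial_totals.each_with_index do |total, i|
  puts format("trial%d: %.3f ms for five requests", i + 1, total)
end
puts "mean single request: #{mean_ms} ms"
```

With these figures the per-trial totals fall between about 1.7 and 4.0 seconds, which brackets the README's estimate.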
==Speed
All these measurements are for processing the maximum of 1,000 tweets. If the user has fewer than 1,000 tweets, these processes will be faster.

To see the speeds yourself, check out the speed branch and watch the server output.

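The speed branch itself is not shown here. As a general illustration of how per-step timings like the ones below can be taken, here is a minimal sketch using Ruby's monotonic clock; `time_ms` is a hypothetical helper and does not come from the gem or its speed branch.

```ruby
# Hypothetical helper, not taken from the gem or its speed branch:
# runs the block and returns [block result, elapsed milliseconds].
def time_ms
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - started) * 1000
  [result, elapsed.round(3)]
end

# Example: time a small sort and report the duration.
sorted, elapsed = time_ms { %w[b a c a].sort }
puts "sorted #{sorted.size} words in #{elapsed} ms"
```

A monotonic clock is used so that the measurement is not affected by system clock adjustments during the run.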
===get_tweets method
`application_controller.rb: 29`
This method makes 5 GET requests to the Twitter API and stores the tweets in `@responses`. These are the measured times:

4985.406 ms
3566.376 ms
4071.794 ms
7759.329 ms
3656.680 ms
4510.602 ms

===Processing responses
`@responses.each do |tweet|` block in `application_controller.rb:11`
This block takes the `@responses` variable, which holds all the tweets, splits the words apart, removes punctuation, and builds a hash in which the keys are words and the values are the number of times each word has appeared. (The `bad_key` method below is part of this block.)

107.844 ms
117.764 ms
78.989 ms
87.12 ms
137.528 ms
134.256 ms

====bad_key method:

0.006 ms to 0.013 ms per word

====Sorting
`application_controller.rb:21`

4.245 ms
4.016 ms
4.098 ms
5.509 ms
3.994 ms
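The processing and sorting steps described in the README above (split each tweet into words, strip punctuation, downcase, skip empty tokens and URLs, count, then order by frequency) can be sketched compactly as follows. This is an illustrative rewrite under those stated rules, not the gem's own code; `word_frequencies` is a hypothetical name, and the alphabetical tie-break is an added assumption to make the output deterministic.

```ruby
# Hypothetical helper, not part of the gem: counts words across tweet texts
# and returns them ordered from most to least frequent.
def word_frequencies(tweets)
  counts = Hash.new(0) # missing keys start at 0, so no has_key? check is needed
  tweets.each do |tweet|
    tweet.split(" ").each do |token|
      word = token.gsub(/\W/, "").downcase
      # Same filter as the gem's bad_key: skip empty strings and URL remnants.
      next if word.empty? || word.start_with?("http")
      counts[word] += 1
    end
  end
  # Descending count; ties broken alphabetically (an assumption of this sketch).
  counts.sort_by { |word, count| [-count, word] }.map(&:first)
end

tweets = ["The cat sat.", "the CAT! http://example.com/x", "A cat"]
p word_frequencies(tweets)
```

Using `Hash.new(0)` removes the separate "first time seen" branch, and sorting on a negated count avoids the extra `reverse` pass.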
data/Rakefile
ADDED
data/lib/tweetabout.rb
ADDED
@@ -0,0 +1,67 @@
require "httparty"
require "tweetabout/version"

module Tweetabout
  # Returns the user's words, ordered from most to least frequently tweeted.
  def self.tweetabout(user)
    get_tweets(user)

    hash = {}
    @responses.each do |tweet|
      tweet.split(" ").each do |key|
        # Strip punctuation and normalize case before counting.
        key = key.gsub(/\W/, "").downcase
        if hash.has_key?(key)
          hash["#{key}"] += 1
        else
          hash.merge!({"#{key}" => 1}) unless bad_key(key)
        end
      end
    end
    @sorted_array = hash.sort_by { |keyword, frequency| frequency }.reverse
    @sorted_array.map { |keyword| keyword[0] }
  end

  # True for tokens that should not be counted: empty strings and URLs.
  def self.bad_key(key)
    return true if key.empty?
    return true if key.start_with?('http')
  end

  # Fetches up to 5 pages of 200 tweets each and stores their text in @responses.
  def self.get_tweets(user)
    count = 200
    base_url = "http://api.twitter.com/1/statuses/user_timeline.json?screen_name=#{user}&include_rts=true&count=#{count}"

    @responses = []
    response1 = HTTParty.get("#{base_url}")
    return if response1.code == 404
    start_at_1 = response1.last["id"]
    response1.each do |object|
      @responses << object["text"]
    end

    # Each later request uses max_id to ask only for tweets older than the last one seen.
    response2 = HTTParty.get("#{base_url}&max_id=#{start_at_1-1}")
    return if response2.count == 0
    start_at_2 = response2.last["id"]
    response2.each do |object|
      @responses << object["text"]
    end

    response3 = HTTParty.get("#{base_url}&max_id=#{start_at_2-1}")
    return if response3.count == 0
    start_at_3 = response3.last["id"]
    response3.each do |object|
      @responses << object["text"]
    end

    response4 = HTTParty.get("#{base_url}&max_id=#{start_at_3-1}")
    return if response4.count == 0
    start_at_4 = response4.last["id"]
    response4.each do |object|
      @responses << object["text"]
    end

    response5 = HTTParty.get("#{base_url}&max_id=#{start_at_4-1}")
    return if response5.count == 0
    start_at_5 = response5.last["id"]
    response5.each do |object|
      @responses << object["text"]
    end
  end
end
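The five near-identical request blocks in `get_tweets` follow one pattern: fetch a page, stop if it is empty, collect the text, and set the next `max_id` to one less than the last tweet's id. A minimal sketch of that pattern as a loop is shown below. It is a hypothetical refactoring, not the gem's code: `collect_tweet_texts` and `fetch_page` are assumed names, and the HTTP call is replaced by an in-memory stand-in so the example runs without network access.

```ruby
# Hypothetical refactoring sketch, not the gem's code. `fetch_page` is an
# assumed callable that takes a max_id (or nil for the first page) and returns
# an array of tweet hashes with "id" and "text" keys, newest first.
def collect_tweet_texts(fetch_page, pages: 5)
  texts = []
  max_id = nil
  pages.times do
    page = fetch_page.call(max_id)
    break if page.nil? || page.empty?
    texts.concat(page.map { |tweet| tweet["text"] })
    # Ask the next request only for tweets strictly older than the last one seen.
    max_id = page.last["id"] - 1
  end
  texts
end

# Stand-in for the HTTP call: serves 450 fake tweets, at most 200 per page.
FAKE_TIMELINE = (1..450).map { |id| { "id" => id, "text" => "tweet #{id}" } }.reverse
fake_fetch = lambda do |max_id|
  FAKE_TIMELINE.select { |t| max_id.nil? || t["id"] <= max_id }.first(200)
end

texts = collect_tweet_texts(fake_fetch)
puts texts.size
```

In the gem, the callable would presumably wrap the `HTTParty.get` request built from `base_url` and `max_id`; passing it in as an argument also makes the paging logic testable without hitting the API.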
data/tweetabout.gemspec
ADDED
@@ -0,0 +1,19 @@
# -*- encoding: utf-8 -*-
require File.expand_path('../lib/tweetabout/version', __FILE__)

Gem::Specification.new do |gem|
  gem.authors       = ["Dylan Jhaveri"]
  gem.email         = ["dylanjhaveri@gmail.com"]
  gem.description   = %q{Returns a list of frequently tweeted words}
  gem.summary       = %q{Takes a twitter username and outputs the most frequently tweeted words in their last 1,000 tweets. Includes Re-tweets and excludes links.}
  gem.homepage      = ""

  gem.add_dependency "httparty"

  gem.files         = `git ls-files`.split($\)
  gem.executables   = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
  gem.test_files    = gem.files.grep(%r{^(test|spec|features)/})
  gem.name          = "tweetabout"
  gem.require_paths = ["lib"]
  gem.version       = Tweetabout::VERSION
end
metadata
ADDED
@@ -0,0 +1,70 @@
--- !ruby/object:Gem::Specification
name: tweetabout
version: !ruby/object:Gem::Version
  version: 0.0.1
  prerelease:
platform: ruby
authors:
- Dylan Jhaveri
autorequire:
bindir: bin
cert_chain: []
date: 2012-08-20 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: httparty
  requirement: !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ! '>='
      - !ruby/object:Gem::Version
        version: '0'
  type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ! '>='
      - !ruby/object:Gem::Version
        version: '0'
description: Returns a list of frequently tweeted words
email:
- dylanjhaveri@gmail.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- .gitignore
- Gemfile
- LICENSE
- README.md
- Rakefile
- lib/tweetabout.rb
- lib/tweetabout/version.rb
- tweetabout.gemspec
homepage: ''
licenses: []
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
requirements: []
rubyforge_project:
rubygems_version: 1.8.21
signing_key:
specification_version: 3
summary: Takes a twitter username and outputs the most frequently tweeted words in
  their last 1,000 tweets. Includes Re-tweets and excludes links.
test_files: []