tweetabout 0.0.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
data/.gitignore ADDED
@@ -0,0 +1,17 @@
1
+ *.gem
2
+ *.rbc
3
+ .bundle
4
+ .config
5
+ .yardoc
6
+ Gemfile.lock
7
+ InstalledFiles
8
+ _yardoc
9
+ coverage
10
+ doc/
11
+ lib/bundler/man
12
+ pkg
13
+ rdoc
14
+ spec/reports
15
+ test/tmp
16
+ test/version_tmp
17
+ tmp
data/Gemfile ADDED
@@ -0,0 +1,4 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in tweetabout.gemspec
4
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,22 @@
1
+ Copyright (c) 2012 Dylan Jhaveri
2
+
3
+ MIT License
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining
6
+ a copy of this software and associated documentation files (the
7
+ "Software"), to deal in the Software without restriction, including
8
+ without limitation the rights to use, copy, modify, merge, publish,
9
+ distribute, sublicense, and/or sell copies of the Software, and to
10
+ permit persons to whom the Software is furnished to do so, subject to
11
+ the following conditions:
12
+
13
+ The above copyright notice and this permission notice shall be
14
+ included in all copies or substantial portions of the Software.
15
+
16
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
17
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
18
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
19
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
20
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
21
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
22
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,84 @@
1
+ == Tweets About Gem
2
+
3
+ This gem takes a twitter username and returns a list of words.
4
+
5
+ The words are an ordered list of the most frequently tweeted words, based on the user's last 1,000 tweets.
6
+ * Retweets are included
7
+ * The casing of words doesn't matter (the = The = THE). Output is downcased.
8
+ * URLs are removed, because that makes sense.
9
+
10
+
11
+ == Installation
12
+
13
+ `gem install tweetabout`
14
+
15
+ Gemfile
16
+
17
+ `gem 'tweetabout'`
18
+
19
+ == Twitter API
20
+
21
+ The Twitter API imposes a restriction on requests for users' timelines: each request can return a maximum of 200 tweets. To get 1,000 tweets, we have to make 5 round trips to the API server. Let's see how long these requests take. This is the measured response time of the GET request for 5 different Twitter usernames:
22
+
23
+ ex: GET `http://api.twitter.com/1/statuses/user_timeline.json?screen_name=#{user}&include_rts=true&count=200`
24
+ (time in ms)
25
+
26
+           trial1    trial2    trial3    trial4    trial5
27
+ request1  806.521   740.214   1363.33   490.090   331.253
28
+ request2  720.767   537.374   673.249   532.168   478.668
29
+ request3  492.608   733.955   560.918   547.887   380.81
30
+ request4  480.757   945.29    645.733   605.972   340.256
31
+ request5  575.621   469.731   707.737   826.423   169.244
32
+
33
+ Based on this small test, we can see that response times from the API vary from a few hundred milliseconds to over a second. Of course, this is influenced by time of day, network connection, and a variety of other factors, but it's good to know that if we have to make 5 trips in a row to the Twitter API server, we can't count on it being fast. In fact, 5 requests could easily take 2.5 to 3 seconds to complete. No doubt this is the slowest part of this app.
34
+
35
+ == Speed
36
+ All these measurements are for processing the maximum of 1,000 tweets. If the user has fewer than 1,000 tweets, these steps will be faster.
37
+
38
+ To see the speeds yourself, check out the speed branch and watch the server output.
39
+
40
+ === get_tweets method
41
+ `application_controller.rb: 29`
42
+ This method makes 5 GET requests to the Twitter API and stores the results in `@responses`. These are the measured times:
43
+
44
+ 4985.406 ms
45
+ 3566.376
46
+ 4071.794
47
+ 7759.329
48
+ 3656.680
49
+ 4510.602
50
+
51
+ === Processing responses
52
+ `@responses.each do |tweet|` block in `application_controller.rb:11`
53
+ This block takes the `@responses` variable, which holds all the tweets, splits each tweet into words, removes punctuation, and builds a hash whose keys are words and whose values are the number of times each word has appeared. (The `bad_key` method below is part of this block.)
54
+
55
+ 107.844 ms
56
+ 117.764
57
+ 78.989
58
+ 87.12
59
+ 137.528
60
+ 134.256
61
+
62
+ ==== bad_key method:
63
+
64
+ 0.006 ms to 0.013 ms per word
65
+
66
+ ==== Sorting
67
+ `application_controller.rb:21`
68
+
69
+ 4.245 ms
70
+ 4.016
71
+ 4.098
72
+ 5.509
73
+ 3.994
74
+
75
+
76
+
77
+
78
+
79
+
80
+
81
+
82
+
83
+
84
+
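The README's request-time table can be checked against its "2.5 to 3 seconds" estimate by summing each trial column. The following is a minimal sketch of that arithmetic; the constant and method names are illustrative and are not part of the gem.

```ruby
# Response times in ms, copied from the README table
# (rows = request1..request5, columns = trial1..trial5).
RESPONSE_TIMES_MS = [
  [806.521, 740.214, 1363.33, 490.090, 331.253],
  [720.767, 537.374, 673.249, 532.168, 478.668],
  [492.608, 733.955, 560.918, 547.887, 380.81],
  [480.757, 945.29,  645.733, 605.972, 340.256],
  [575.621, 469.731, 707.737, 826.423, 169.244]
].freeze

# Sum each column to get the total time of five sequential requests per trial.
def trial_totals(rows)
  rows.transpose.map { |column| column.sum.round(3) }
end

trial_totals(RESPONSE_TIMES_MS).each_with_index do |total, index|
  puts format("trial%d: %.3f ms", index + 1, total)
end
```

With the figures above, the per-trial totals range from roughly 1.7 seconds to roughly 4.0 seconds, so the README's estimate sits inside the measured spread rather than at its upper end.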
data/Rakefile ADDED
@@ -0,0 +1,2 @@
1
+ #!/usr/bin/env rake
2
+ require "bundler/gem_tasks"
data/lib/tweetabout/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module Tweetabout
2
+ VERSION = "0.0.1"
3
+ end
data/lib/tweetabout.rb ADDED
@@ -0,0 +1,67 @@
1
+ require "tweetabout/version"
2
+ require "httparty"
3
+ module Tweetabout
4
+ def self.tweetabout(user)
5
+ get_tweets(user)
6
+
7
+ hash = {}
8
+ @responses.each do |tweet|
9
+ tweet.split(" ").each do |key|
10
+ key = key.gsub(/\W/, "").downcase
11
+ if hash.has_key?(key)
12
+ hash[key] += 1
13
+ else
14
+ hash[key] = 1 unless bad_key(key)
15
+ end
16
+ end
17
+ end
18
+ @sorted_array = hash.sort_by { |keyword, frequency| frequency }.reverse
19
+ @sorted_array.map { |keyword| keyword[0] }
20
+ end
21
+
22
+ def self.bad_key(key)
23
+ return true if key.empty?
24
+ return true if key.start_with?('http')
25
+ end
26
+
27
+ def self.get_tweets(user)
28
+ count = 200
29
+ base_url = "http://api.twitter.com/1/statuses/user_timeline.json?screen_name=#{user}&include_rts=true&count=#{count}"
30
+
31
+ @responses = []
32
+ response1 = HTTParty.get(base_url)
33
+ return if response1.code == 404 || response1.count == 0
34
+ start_at_1 = response1.last["id"]
35
+ response1.each do |object|
36
+ @responses << object["text"]
37
+ end
38
+
39
+ response2 = HTTParty.get("#{base_url}&max_id=#{start_at_1-1}")
40
+ return if response2.count == 0
41
+ start_at_2 = response2.last["id"]
42
+ response2.each do |object|
43
+ @responses << object["text"]
44
+ end
45
+
46
+ response3 = HTTParty.get("#{base_url}&max_id=#{start_at_2-1}")
47
+ return if response3.count == 0
48
+ start_at_3 = response3.last["id"]
49
+ response3.each do |object|
50
+ @responses << object["text"]
51
+ end
52
+
53
+ response4 = HTTParty.get("#{base_url}&max_id=#{start_at_3-1}")
54
+ return if response4.count == 0
55
+ start_at_4 = response4.last["id"]
56
+ response4.each do |object|
57
+ @responses << object["text"]
58
+ end
59
+
60
+ response5 = HTTParty.get("#{base_url}&max_id=#{start_at_4-1}")
61
+ return if response5.count == 0
62
+ start_at_5 = response5.last["id"]
63
+ response5.each do |object|
64
+ @responses << object["text"]
65
+ end
66
+ end
67
+ end
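The five near-identical request blocks and the counting loop in `lib/tweetabout.rb` follow one pattern: fetch a page, collect the text, continue from just below the oldest id seen, then count words. The sketch below shows that flow as a loop. It is not the gem's implementation: `collect_tweet_texts`, `frequent_words`, `FAKE_TIMELINE`, and `FAKE_FETCH` are hypothetical names, and the HTTP call is replaced by an injected callable so the example runs offline.

```ruby
# Hypothetical helper: pages through a timeline using an injected fetcher.
# `fetch_page` is any callable that takes a max_id (or nil) and returns an
# array of hashes with "id" and "text" keys, like the parsed API response.
def collect_tweet_texts(fetch_page, max_pages = 5)
  texts = []
  max_id = nil
  max_pages.times do
    page = fetch_page.call(max_id)
    break if page.nil? || page.empty?
    texts.concat(page.map { |tweet| tweet["text"] })
    max_id = page.last["id"] - 1 # next page starts just below the oldest id seen
  end
  texts
end

# Counts words case-insensitively, skipping empty tokens and links, and
# returns the words ordered from most to least frequent.
def frequent_words(texts)
  counts = Hash.new(0)
  texts.each do |text|
    text.split(" ").each do |token|
      word = token.gsub(/\W/, "").downcase
      next if word.empty? || word.start_with?("http")
      counts[word] += 1
    end
  end
  counts.sort_by { |_word, frequency| -frequency }.map(&:first)
end

# Stand-in data and fetcher (assumptions for illustration): two tweets per page.
FAKE_TIMELINE = [
  { "id" => 30, "text" => "The cat sat" },
  { "id" => 20, "text" => "the CAT ran http://example.com" },
  { "id" => 10, "text" => "THE end." }
].freeze

FAKE_FETCH = lambda do |max_id|
  FAKE_TIMELINE.select { |tweet| max_id.nil? || tweet["id"] <= max_id }.first(2)
end

puts frequent_words(collect_tweet_texts(FAKE_FETCH)).inspect
```

In real use, the callable would presumably wrap the `HTTParty.get` request and append `&max_id=...` when a value is given. Words with equal counts may come out in any order, since `sort_by` does not guarantee a stable sort.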
data/tweetabout.gemspec ADDED
@@ -0,0 +1,19 @@
1
+ # -*- encoding: utf-8 -*-
2
+ require File.expand_path('../lib/tweetabout/version', __FILE__)
3
+
4
+ Gem::Specification.new do |gem|
5
+ gem.authors = ["Dylan Jhaveri"]
6
+ gem.email = ["dylanjhaveri@gmail.com"]
7
+ gem.description = %q{Returns a list of frequently tweeted words}
8
+ gem.summary = %q{Takes a twitter username and outputs the most frequently tweeted words in their last 1,000 tweets. Includes Re-tweets and excludes links.}
9
+ gem.homepage = ""
10
+
11
+ gem.add_dependency "httparty"
12
+
13
+ gem.files = `git ls-files`.split($\)
14
+ gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
15
+ gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
16
+ gem.name = "tweetabout"
17
+ gem.require_paths = ["lib"]
18
+ gem.version = Tweetabout::VERSION
19
+ end
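The gemspec above does not declare a license, so the generated metadata lists `licenses: []` even though the package ships an MIT LICENSE file. As a hedged sketch, a spec could declare it as shown below; `EXAMPLE_SPEC` is an illustrative name, and the field values are copied from this release rather than from any later version.

```ruby
require "rubygems"

# Illustrative spec only; it mirrors fields from the 0.0.1 gemspec and adds
# a license declaration that matches the bundled LICENSE file.
EXAMPLE_SPEC = Gem::Specification.new do |gem|
  gem.name    = "tweetabout"
  gem.version = "0.0.1"
  gem.authors = ["Dylan Jhaveri"]
  gem.summary = "Returns a list of frequently tweeted words"
  gem.license = "MIT" # assumption: same license as the LICENSE file in the package
  gem.files   = []
  gem.add_dependency "httparty"
end

puts EXAMPLE_SPEC.licenses.inspect
```

Setting `license` populates the `licenses` list that RubyGems writes into the packaged metadata.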
metadata ADDED
@@ -0,0 +1,70 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: tweetabout
3
+ version: !ruby/object:Gem::Version
4
+ version: 0.0.1
5
+ prerelease:
6
+ platform: ruby
7
+ authors:
8
+ - Dylan Jhaveri
9
+ autorequire:
10
+ bindir: bin
11
+ cert_chain: []
12
+ date: 2012-08-20 00:00:00.000000000 Z
13
+ dependencies:
14
+ - !ruby/object:Gem::Dependency
15
+ name: httparty
16
+ requirement: !ruby/object:Gem::Requirement
17
+ none: false
18
+ requirements:
19
+ - - ! '>='
20
+ - !ruby/object:Gem::Version
21
+ version: '0'
22
+ type: :runtime
23
+ prerelease: false
24
+ version_requirements: !ruby/object:Gem::Requirement
25
+ none: false
26
+ requirements:
27
+ - - ! '>='
28
+ - !ruby/object:Gem::Version
29
+ version: '0'
30
+ description: Returns a list of frequently tweeted words
31
+ email:
32
+ - dylanjhaveri@gmail.com
33
+ executables: []
34
+ extensions: []
35
+ extra_rdoc_files: []
36
+ files:
37
+ - .gitignore
38
+ - Gemfile
39
+ - LICENSE
40
+ - README.md
41
+ - Rakefile
42
+ - lib/tweetabout.rb
43
+ - lib/tweetabout/version.rb
44
+ - tweetabout.gemspec
45
+ homepage: ''
46
+ licenses: []
47
+ post_install_message:
48
+ rdoc_options: []
49
+ require_paths:
50
+ - lib
51
+ required_ruby_version: !ruby/object:Gem::Requirement
52
+ none: false
53
+ requirements:
54
+ - - ! '>='
55
+ - !ruby/object:Gem::Version
56
+ version: '0'
57
+ required_rubygems_version: !ruby/object:Gem::Requirement
58
+ none: false
59
+ requirements:
60
+ - - ! '>='
61
+ - !ruby/object:Gem::Version
62
+ version: '0'
63
+ requirements: []
64
+ rubyforge_project:
65
+ rubygems_version: 1.8.21
66
+ signing_key:
67
+ specification_version: 3
68
+ summary: Takes a twitter username and outputs the most frequently tweeted words in
69
+ their last 1,000 tweets. Includes Re-tweets and excludes links.
70
+ test_files: []