tweetabout 0.0.2 → 0.0.3
- data/README.md +21 -13
- data/lib/tweetabout/version.rb +1 -1
- data/lib/tweetabout.rb +9 -4
- metadata +1 -1
data/README.md
CHANGED
@@ -1,4 +1,4 @@
-
+## Tweets About Gem
 
 This gem takes a twitter username and returns a list of words.
 
@@ -8,15 +8,25 @@ The words are are ordered list of the most frequently tweeted words based on the
 * URLs are removed, because that makes sense.
 
 
-
+## Installation
 
 `gem install tweetabout`
 
-Gemfile
+### Gemfile
 
 `gem 'tweetabout'`
 
-
+### Dependencies
+
+`httparty` because httparty > NET::HTTP
+
+## TODO
+
+* Better test coverage. Actually this a reqiorement before using this gem in production. And I know I am a bad Rubyist for this but I promise to get better.
+
+* Make this Gem useful. Right now it returns a list of all words, it might be interesting to strip out articles and pronouns from the list, that might actually give you an interesting insight into a person's interests.
+
+##Twitter API
 
 The Twitter API imposes a restriction on requests for users' timelines. Each request can only receive a maximum of 200 tweets. To get 1,000 tweets, that means we have to make 5 round trips to the api server. Let's see how longs these requests take. This is the measured response time of the GET request for 5 different twitter usernames:
 
@@ -30,15 +40,14 @@ ex: GET `http://api.twitter.com/1/statuses/user_timeline.json?screen_name=#{user
 request4 480.757 945.29 645.733 605.972 340.256
 request5 575.621 469.731 707.737 826.423 169.244
 
-Based on this small test we can see that response times from the api vary from a few hundred milliseconds up to a full second. Of course this is all influnced by time of day, network connection, and a variety of factors but it's good to know that if we have to make 5 trips in a row to the twitter api server we can't really count on it being very fast. In fact, 5 requests could easily take 2.5 to 3 seconds to complete. No doubt this is the slowest part of this
+Based on this small test we can see that response times from the api vary from a few hundred milliseconds up to a full second. Of course this is all influnced by time of day, network connection, and a variety of factors but it's good to know that if we have to make 5 trips in a row to the twitter api server we can't really count on it being very fast. In fact, 5 requests could easily take 2.5 to 3 seconds to complete. No doubt this is the slowest part of this gem.
 
-
+##Speed
 All these measurments are for processing the maximum of 1,000 tweets. If the user has less than 1,000 tweets, obviously these processes will be faster.
 
 To see speeds yourself, checkout the speed branch and watch the server output.
 
-
-`application_controller.rb: 29`
+### get_tweets method
 This method does 5 GET requests to the Twitter api and stores them all in `@responses` These are the different time measurements:
 
 4985.406 ms
@@ -48,8 +57,8 @@ This method does 5 GET requests to the Twitter api and stores them all in `@resp
 3656.680
 4510.602
 
-
-`@responses.each do |tweet|` block
+### Processing responses (tweets)
+`@responses.each do |tweet|` block
 This method essentially takes the @responses variable, which is all the tweets, splits the words apart, removes punctuation and creates a hash of keys and values, keys are words, values are the number of times that word has shown up. (the `bad_key` method below is part of this block.
 
 107.844 ms
@@ -59,12 +68,11 @@ This method essentially takes the @responses variable, which is all the tweets,
 137.528
 134.256
 
-
+#### bad_key method:
 
 .006ms - .013ms each word
 
-
-`application_controller.rb:21`
+#### Sorting
 
 4.245 ms
 4.016
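The README text in this diff describes the processing step as splitting tweets into words, removing URLs and punctuation, counting each word in a hash, and sorting by frequency. The following is a minimal Ruby sketch of that pipeline, not the gem's actual implementation. The method name `word_frequencies`, the regular expressions, and the lowercasing step are assumptions made for illustration.

```ruby
# Hypothetical helper, not part of tweetabout: counts words across tweet texts.
def word_frequencies(tweets)
  counts = Hash.new(0)
  tweets.each do |text|
    text.split.each do |token|
      next if token =~ %r{\Ahttps?://}i        # skip URLs before stripping punctuation
      # Assumption: fold case and keep only letters, digits, and a few symbols.
      word = token.downcase.gsub(/[^a-z0-9'@#_]/, "")
      next if word.empty?
      counts[word] += 1
    end
  end
  # Most frequent first; ties are broken alphabetically so output is stable.
  counts.sort_by { |word, count| [-count, word] }
end

tweets = ["Ruby is fun!", "I love Ruby http://example.com", "fun, fun, FUN"]
p word_frequencies(tweets)
```

The TODO item about dropping articles and pronouns could be layered on top of this by skipping tokens found in a stop-word list before incrementing the count.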
data/lib/tweetabout/version.rb
CHANGED
data/lib/tweetabout.rb
CHANGED
@@ -32,34 +32,39 @@ module Tweetabout
 @responses = []
 response1 = HTTParty.get("#{base_url}")
 return if response1.code == 404
+return if response1.code == 400
 start_at_1 = response1.last["id"]
 response1.each do |object|
 @responses << object["text"]
 end
 
 response2 = HTTParty.get("#{base_url}&max_id=#{start_at_1-1}")
-return if response2.count ==
+return if response2.count == 404
+return if response2.code == 400
 start_at_2 = response2.last["id"]
 response2.each do |object|
 @responses << object["text"]
 end
 
 response3 = HTTParty.get("#{base_url}&max_id=#{start_at_2-1}")
-return if response3.count ==
+return if response3.count == 404
+return if response3.code == 400
 start_at_3 = response3.last["id"]
 response3.each do |object|
 @responses << object["text"]
 end
 
 response4 = HTTParty.get("#{base_url}&max_id=#{start_at_3-1}")
-return if response4.count ==
+return if response4.count == 404
+return if response4.code == 400
 start_at_4 = response4.last["id"]
 response4.each do |object|
 @responses << object["text"]
 end
 
 response5 = HTTParty.get("#{base_url}&max_id=#{start_at_4-1}")
-return if response5.count ==
+return if response5.count == 404
+return if response5.code == 400
 start_at_5 = response4.last["id"]
 response5.each do |object|
 @responses << object["text"]
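The hunk above repeats one pattern five times: fetch a page, bail out on an error status, remember the last tweet id, collect the texts, then request the next page with `max_id` set to that id minus one. A hedged sketch of the same pagination expressed as a loop follows. It is not code from the package: `collect_tweet_texts`, `fetch_page`, and the stub timeline are hypothetical names, and the HTTP call is replaced by a callable so the example runs offline.

```ruby
# Hypothetical sketch, not the gem's code: pages backwards through a timeline.
# fetch_page is an assumed callable that takes max_id (or nil for the first
# page) and returns [status_code, array_of_tweet_hashes].
def collect_tweet_texts(fetch_page, max_pages: 5)
  texts = []
  max_id = nil
  max_pages.times do
    code, tweets = fetch_page.call(max_id)
    break unless code == 200                 # stop on 400, 404, or any other non-OK status
    break if tweets.nil? || tweets.empty?    # no older tweets left
    tweets.each { |tweet| texts << tweet["text"] }
    max_id = tweets.last["id"] - 1           # next page: strictly older tweets
  end
  texts
end

# Stub standing in for the HTTP request: 7 tweets, newest first, 3 per page.
FAKE_TIMELINE = (1..7).map { |i| { "id" => i, "text" => "tweet #{i}" } }.reverse
fake_fetch = lambda do |max_id|
  older = FAKE_TIMELINE.select { |t| max_id.nil? || t["id"] <= max_id }
  [200, older.first(3)]
end

p collect_tweet_texts(fake_fetch)
```

One design difference to be aware of: the diffed code returns early with nothing on an error status, while this sketch keeps whatever pages were already collected. Deriving `max_id` from the page just fetched also removes the need for per-page variables such as `start_at_5`. With HTTParty, the callable could plausibly wrap `HTTParty.get` and return the response's `code` and `parsed_response`, but that wiring is an assumption and is not shown in the diff.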