tweetly 0.0.1
- data/.gitignore +19 -0
- data/.rspec +2 -0
- data/Gemfile +4 -0
- data/LICENSE +22 -0
- data/README.md +105 -0
- data/Rakefile +2 -0
- data/lib/tweetly.rb +7 -0
- data/lib/tweetly/user.rb +142 -0
- data/lib/tweetly/version.rb +3 -0
- data/spec/spec_helper.rb +14 -0
- data/spec/user_spec.rb +50 -0
- data/tweetly.gemspec +22 -0
- metadata +98 -0
data/.gitignore
ADDED
data/.rspec
ADDED
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
Copyright (c) 2012 Jico Baligod

MIT License

Permission is hereby granted, free of charge, to any person obtaining
a copy of this software and associated documentation files (the
"Software"), to deal in the Software without restriction, including
without limitation the rights to use, copy, modify, merge, publish,
distribute, sublicense, and/or sell copies of the Software, and to
permit persons to whom the Software is furnished to do so, subject to
the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,105 @@
# Tweetly

Generate nifty word frequency distributions from your Twitter statuses.

Written and tested in __Ruby 1.9.3__.

## Installation

Add this line to your application's Gemfile:

    gem 'tweetly'

And then execute:

    $ bundle

Or install it yourself as:

    $ gem install tweetly

## Usage

First, let's create a new user.

```ruby
require 'tweetly'
user = Tweetly::User.new('jicooo')
```

Now we can print a word frequency distribution list from the last _tweets_ tweets and limit it to the top _count_ words like so:

```ruby
user.print_word_freq(tweets: 20, count: 5)
```

which prints:

    for (5)
    $10 (3)
    @julioz2 (3)
    my (3)
    post: (2)

By default, Tweetly pulls the last 1000 tweets from the user's timeline and displays the entire list of words.

```ruby
user.print_word_freq
```

which prints:

    the (262)
    to (179)
    a (165)
    @Sualehh (161)
    my (112)
    RT (107)
    I (105)
    of (100)
    # and so on...

Well, that's not a very interesting list. Let's ignore retweets as well as a few common words. Additionally, let's restrict the list to words at least 5 characters long, keep only word characters (i.e. letters, digits, underscores), and make the count case-insensitive:

```ruby
options = {
  include_rts: false,
  ignore: ['the', 'to', 'a', 'I'],
  min_length: 5,
  words_only: true,
  case_sensitive: false
}
user.print_word_freq(options)
```

The above prints out:

    sualehh (206)
    foursquare (31)
    julioz2 (30)
    mlacitation (24)
    compywiz (24)
    right (21)
    think (20)
    pretty (20)

Awesome! That's a great looking list.

If you want only the words printed, without the frequency counts, pass in the option `print_count: false`. For the above, we would just have:

    sualehh
    foursquare
    julioz2
    mlacitation
    compywiz
    right
    think
    pretty

## Contributing

1. Fork it
2. Create your feature branch (`git checkout -b my-new-feature`)
3. Commit your changes (`git commit -am 'Added some feature'`)
4. Push to the branch (`git push origin my-new-feature`)
5. Create new Pull Request
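The filtering options described in the README can be tried without network access. The following is a minimal sketch, assuming plain strings stand in for tweet text. The `word_frequencies` helper and the sample strings are hypothetical and not part of the gem. It applies the same option names in the same order as `Tweetly::User#word_freq`.

```ruby
# Hypothetical helper, not part of tweetly: counts words in plain strings
# using the option names described in the README.
def word_frequencies(texts, options = {})
  params = {
    words_only: false,
    case_sensitive: true,
    min_length: nil,
    ignore: []
  }.merge(options)

  counts = Hash.new(0)
  texts.each do |text|
    text.split.each do |word|
      # :ignore is checked against the word exactly as it appears
      next if params[:ignore].include?(word)

      word = word.downcase unless params[:case_sensitive]

      # :words_only keeps the first run of letters, digits and underscores
      if params[:words_only]
        match = word.match(/\w+/)
        next unless match
        word = match[0]
      end

      next if params[:min_length] && word.length < params[:min_length]
      counts[word] += 1
    end
  end

  # Highest count first; ties are broken alphabetically so the order is stable.
  counts.sort_by { |word, count| [-count, word] }
end

# Made-up sample data for illustration only.
sample = [
  "Checking in on Foursquare again",
  "I think @Sualehh is right about foursquare",
  "@sualehh pretty sure the answer is 42"
]

result = word_frequencies(sample, ignore: ['the', 'I'], min_length: 5,
                          words_only: true, case_sensitive: false)
result.each { |word, count| puts "#{word} (#{count})" }
```

Because `:ignore` is applied before lowercasing, an entry such as `'I'` only matches the capitalized form, which is also how the gem's own loop behaves.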
data/Rakefile
ADDED
data/lib/tweetly.rb
ADDED
data/lib/tweetly/user.rb
ADDED
@@ -0,0 +1,142 @@
module Tweetly

  class User
    attr_accessor :name, :screen_name, :protected, :statuses_count, :timeline

    def initialize(name)
      user = Twitter.user(name)
      # Attributes stay unset (nil) when the account is protected.
      unless user.protected?
        @name = user.name
        @screen_name = user.screen_name
        @protected = user.protected?
        @statuses_count = user.statuses_count
        @timeline = []
      end
    end

    # Builds a word frequency array from recent tweets.
    #
    # @param [Hash] options frequency distribution options
    # @option [Integer] :tweets number of tweets to consider (default: 1000, max: 3200)
    # @option [Boolean] :words_only consider only word characters
    #   (i.e. letters, digits, underscores)
    # @option [Boolean] :case_sensitive whether to consider word case
    # @option [Boolean] :include_rts whether to consider retweeted statuses
    # @option [Integer] :min_length minimum length of words to count
    # @option [Array<String>] :ignore list of words to ignore
    # @return [Array<Array<String, Integer>>] list of word frequencies in descending order
    def word_freq(options={})
      # Default parameters
      params = {
        tweets: 1000,
        words_only: false,
        case_sensitive: true,
        include_rts: true,
        min_length: nil,
        ignore: []
      }
      params.merge!(options)

      # Fetch Tweets if necessary
      fetch_timeline(params[:tweets]) if @timeline.count < params[:tweets]

      freq_dist = Hash.new(0)

      # Consider only the requested number of tweets, even if more are cached
      @timeline.first(params[:tweets]).each do |tweet|
        # :include_rts option
        next if !params[:include_rts] && tweet.retweeted_status

        tweet.text.split.each do |word|
          # :ignore option (matched before any case folding)
          next if params[:ignore].include? word

          # :case_sensitive option
          word = word.downcase unless params[:case_sensitive]

          # :words_only option
          if params[:words_only]
            word = word.match(/\w+/)
            next unless word
            word = word[0]
          end

          # :min_length option
          next if params[:min_length] && word.length < params[:min_length]

          freq_dist[word] += 1
        end
      end

      freq_dist.sort_by { |k,v| -v }
    end

    # Prints word frequency distribution list.
    #
    # @param [Hash] options printing options
    # @option [Integer] :tweets number of tweets to consider (default: 1000, max: 3200)
    # @option [Integer] :count prints top count of frequency distribution list
    # @option [Boolean] :print_count whether to print count next to each word
    def print_word_freq(options={})
      opts = {
        tweets: 1000,
        print_count: true
      }
      opts.merge!(options)
      dist = word_freq(opts)
      opts[:count] ||= dist.count
      dist[0,opts[:count]].each do |k,v|
        line = "#{k}"
        line += " (#{v})" if opts[:print_count]
        puts line
      end
      return
    end

    # Fetches user timeline and caches it in the @timeline field.
    #
    # @todo Duplicate tweets due to max_id
    #
    # @param [Integer] count the number of Tweets to fetch
    # @return [Boolean] true if a fetch was attempted, false if at least
    #   count tweets were already cached
    def fetch_timeline(count=1000)
      # Twitter API limits to 3200 tweets
      count = [count, 3200].min

      if @timeline.count < count
        opts = {
          trim_user: 1,
          include_rts: 1
        }

        while @timeline.length < count
          # Fetch only what we don't have
          opts[:max_id] = @timeline.last.id unless @timeline.empty?

          # Max step count is 200 by API
          # We may need less depending on what's already fetched
          opts[:count] = [200, count - @timeline.count].min

          # Fetch an extra since max_id is included
          opts[:count] += 1 if opts[:max_id]

          resp = Twitter.user_timeline(@screen_name, opts)

          # Avoid reinserting max_id tweet
          resp.delete_at(0) if opts[:max_id]

          # Stop when the user has no older tweets; otherwise this loop never ends
          break if resp.empty?

          resp.each { |tweet| @timeline << tweet }
        end

        return true
      else
        # We've pulled this amount or more already
        return false
      end
    end

  end
end
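The `max_id` paging in `fetch_timeline` can be exercised offline. Below is a minimal sketch, assuming an in-memory stand-in for the Twitter client. `FakeTweet`, `FakeClient` and `fetch_all` are hypothetical names introduced here, and the stand-in only imitates the inclusive `max_id` behaviour the code comments describe. It shows why each later page asks for one extra tweet, drops the repeated one, and stops when nothing older comes back.

```ruby
# Hypothetical stand-ins for illustration; not the twitter gem's classes.
FakeTweet = Struct.new(:id)

class FakeClient
  # Holds `total` tweets, newest (highest id) first.
  def initialize(total)
    @tweets = total.downto(1).map { |id| FakeTweet.new(id) }
  end

  # Imitates inclusive max_id paging: returns up to :count tweets whose id
  # is less than or equal to :max_id, so the max_id tweet itself comes back.
  def user_timeline(opts = {})
    pool = opts[:max_id] ? @tweets.select { |t| t.id <= opts[:max_id] } : @tweets
    pool.first(opts[:count])
  end
end

def fetch_all(client, count, page_size = 200)
  timeline = []
  while timeline.length < count
    opts = { count: [page_size, count - timeline.length].min }
    unless timeline.empty?
      opts[:max_id] = timeline.last.id
      opts[:count] += 1 # the max_id tweet is returned again
    end

    page = client.user_timeline(opts)
    page = page.drop(1) if opts[:max_id] # discard the repeated tweet
    break if page.empty?                 # no older tweets are available

    timeline.concat(page)
  end
  timeline
end

# Ask for more tweets than the fake account has.
tweets = fetch_all(FakeClient.new(450), 1000)
puts tweets.length
puts tweets.map(&:id).uniq.length
```

Without the `break`, a request for more tweets than the account holds would keep receiving only the repeated `max_id` tweet and loop indefinitely.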
data/spec/spec_helper.rb
ADDED
@@ -0,0 +1,14 @@
# This file was generated by the `rspec --init` command. Conventionally, all
# specs live under a `spec` directory, which RSpec adds to the `$LOAD_PATH`.
# Require this file using `require "spec_helper.rb"` to ensure that it is only
# loaded once.
#
# See http://rubydoc.info/gems/rspec-core/RSpec/Core/Configuration

require_relative "../lib/tweetly"

RSpec.configure do |config|
  config.treat_symbols_as_metadata_keys_with_true_values = true
  config.run_all_when_everything_filtered = true
  config.filter_run :focus
end
data/spec/user_spec.rb
ADDED
@@ -0,0 +1,50 @@
require 'spec_helper'

describe Tweetly::User do
  context "Initializing a new user" do
    before do
      @user = Tweetly::User.new('jicooo')
    end

    it "should be created successfully" do
      @user.should be_an_instance_of Tweetly::User
    end
  end

  context "Fetching a user timeline" do
    before do
      @user = Tweetly::User.new('jicooo')
    end

    it "should fetch 1000 tweets by default" do
      @user.fetch_timeline
      @user.timeline.count.should == 1000
    end

    it "should be able to fetch less than 1000" do
      @user.fetch_timeline(10)
      @user.timeline.count.should == 10
    end

    it "should be able to fetch more than 1000" do
      @user.fetch_timeline(1001)
      @user.timeline.count.should == 1001
    end

    it "should not fetch duplicate tweets" do
      # Populate the timeline first; a fresh user has no tweets to compare
      @user.fetch_timeline
      seen = {}
      duplicate = false
      @user.timeline.each do |t|
        if seen.has_key? t.id
          duplicate = true
          break
        else
          seen[t.id] = 1
        end
      end
      duplicate.should be_false
    end
  end

end
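The duplicate check in the last spec can also be written more compactly. This is a minimal sketch, assuming tweet ids are available as a plain array. `duplicate_ids?` is a hypothetical helper, not something the gem or RSpec provides.

```ruby
# Hypothetical helper: true when any id appears more than once.
def duplicate_ids?(ids)
  ids.uniq.length != ids.length
end

puts duplicate_ids?([3, 2, 1])
puts duplicate_ids?([3, 2, 2, 1])
```

In the spec this would be called with `@user.timeline.map(&:id)` after the timeline has been fetched.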
data/tweetly.gemspec
ADDED
@@ -0,0 +1,22 @@
# -*- encoding: utf-8 -*-
require File.expand_path('../lib/tweetly/version', __FILE__)

Gem::Specification.new do |gem|
  gem.authors = ["Jico Baligod"]
  gem.email = ["jico@baligod.com"]
  gem.description = %q{Twitter user timeline stats.}
  gem.summary = %q{Generate word frequency distributions from your Twitter statuses.}
  gem.homepage = "https://github.com/jico/tweetly"

  gem.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
  gem.files = `git ls-files`.split("\n")
  gem.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
  gem.name = "tweetly"
  gem.require_paths = ["lib"]
  gem.version = Tweetly::VERSION

  gem.add_development_dependency "rake"
  gem.add_development_dependency "rspec", "~> 2.9.0"

  gem.add_dependency "twitter"
end
metadata
ADDED
@@ -0,0 +1,98 @@
--- !ruby/object:Gem::Specification
name: tweetly
version: !ruby/object:Gem::Version
  version: 0.0.1
  prerelease:
platform: ruby
authors:
- Jico Baligod
autorequire:
bindir: bin
cert_chain: []
date: 2012-05-15 00:00:00.000000000 Z
dependencies:
- !ruby/object:Gem::Dependency
  name: rake
  requirement: &70128994114340 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ! '>='
      - !ruby/object:Gem::Version
        version: '0'
  type: :development
  prerelease: false
  version_requirements: *70128994114340
- !ruby/object:Gem::Dependency
  name: rspec
  requirement: &70128994113540 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ~>
      - !ruby/object:Gem::Version
        version: 2.9.0
  type: :development
  prerelease: false
  version_requirements: *70128994113540
- !ruby/object:Gem::Dependency
  name: twitter
  requirement: &70128994113060 !ruby/object:Gem::Requirement
    none: false
    requirements:
    - - ! '>='
      - !ruby/object:Gem::Version
        version: '0'
  type: :runtime
  prerelease: false
  version_requirements: *70128994113060
description: Twitter user timeline stats.
email:
- jico@baligod.com
executables: []
extensions: []
extra_rdoc_files: []
files:
- .gitignore
- .rspec
- Gemfile
- LICENSE
- README.md
- Rakefile
- lib/tweetly.rb
- lib/tweetly/user.rb
- lib/tweetly/version.rb
- spec/spec_helper.rb
- spec/user_spec.rb
- tweetly.gemspec
homepage: https://github.com/jico/tweetly
licenses: []
post_install_message:
rdoc_options: []
require_paths:
- lib
required_ruby_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
      segments:
      - 0
      hash: -4256241259014341675
required_rubygems_version: !ruby/object:Gem::Requirement
  none: false
  requirements:
  - - ! '>='
    - !ruby/object:Gem::Version
      version: '0'
      segments:
      - 0
      hash: -4256241259014341675
requirements: []
rubyforge_project:
rubygems_version: 1.8.17
signing_key:
specification_version: 3
summary: Generate word frequency distributions from your Twitter statuses.
test_files:
- spec/spec_helper.rb
- spec/user_spec.rb