tweetly 0.0.1
- data/.gitignore +19 -0
- data/.rspec +2 -0
- data/Gemfile +4 -0
- data/LICENSE +22 -0
- data/README.md +105 -0
- data/Rakefile +2 -0
- data/lib/tweetly.rb +7 -0
- data/lib/tweetly/user.rb +142 -0
- data/lib/tweetly/version.rb +3 -0
- data/spec/spec_helper.rb +14 -0
- data/spec/user_spec.rb +50 -0
- data/tweetly.gemspec +22 -0
- metadata +98 -0
data/.gitignore
ADDED
data/.rspec
ADDED
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,22 @@
+Copyright (c) 2012 Jico Baligod
+
+MIT License
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,105 @@
+# Tweetly
+
+Generate nifty word frequency distributions from your Twitter statuses.
+
+Written and tested in __Ruby 1.9.3__.
+
+## Installation
+
+Add this line to your application's Gemfile:
+
+    gem 'tweetly'
+
+And then execute:
+
+    $ bundle
+
+Or install it yourself as:
+
+    $ gem install tweetly
+
+## Usage
+
+First, let's create a new user.
+
+```ruby
+require 'tweetly'
+user = Tweetly::User.new('jicooo')
+```
+
+Now we can print a word frequency distribution list from the last _tweets_ tweets and limit it to _count_ words like so:
+
+```ruby
+user.print_word_freq(tweets: 20, count: 5)
+```
+
+which prints:
+
+    for (5)
+    $10 (3)
+    @julioz2 (3)
+    my (3)
+    post: (2)
+
+By default, Tweetly pulls the last 1000 tweets from the user's timeline and displays the entire list of words.
+
+```ruby
+user.print_word_freq
+```
+
+which prints:
+
+    the (262)
+    to (179)
+    a (165)
+    @Sualehh (161)
+    my (112)
+    RT (107)
+    I (105)
+    of (100)
+    # and so on...
+
+Well, that's not a very interesting list. Let's ignore retweets as well as a few common words. Additionally, let's constrain our list to words at least 5 characters long, made up only of word characters (i.e. letters, digits, underscores), and counted case-insensitively:
+
+```ruby
+options = {
+  include_rts: false,
+  ignore: ['the', 'to', 'a', 'I'],
+  min_length: 5,
+  words_only: true,
+  case_sensitive: false
+}
+user.print_word_freq(options)
+```
+
+The above prints out:
+
+    sualehh (206)
+    foursquare (31)
+    julioz2 (30)
+    mlacitation (24)
+    compywiz (24)
+    right (21)
+    think (20)
+    pretty (20)
+
+Awesome! That's a great looking list.
+
+If you strictly want the words and not the frequency counts printed, you can pass in the option `print_count: false`. For the above, we would just have:
+
+    sualehh
+    foursquare
+    julioz2
+    mlacitation
+    compywiz
+    right
+    think
+    pretty
+
+## Contributing
+
+1. Fork it
+2. Create your feature branch (`git checkout -b my-new-feature`)
+3. Commit your changes (`git commit -am 'Added some feature'`)
+4. Push to the branch (`git push origin my-new-feature`)
+5. Create new Pull Request
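The filtering options described in the README (ignore list, minimum length, word characters only, case folding) can be illustrated without the Twitter API. The following is a minimal sketch, assuming tweets are plain strings; `word_frequencies` and its keyword arguments are hypothetical names chosen for illustration and are not part of tweetly.

```ruby
# Hypothetical helper (not part of tweetly): counts words in plain strings,
# applying the same kinds of filters the README options describe.
def word_frequencies(texts, ignore: [], min_length: nil, words_only: false, case_sensitive: true)
  counts = Hash.new(0)
  texts.each do |text|
    text.split.each do |raw|
      word = case_sensitive ? raw : raw.downcase
      # Keep only the first run of word characters (letters, digits, underscores).
      word = word[/\w+/] if words_only
      next if word.nil? || word.empty?
      next if ignore.include?(word)
      next if min_length && word.length < min_length
      counts[word] += 1
    end
  end
  # Highest count first; ties are broken alphabetically for stable output.
  counts.sort_by { |word, count| [-count, word] }
end

sample = ["Think about Foursquare", "RT think again, foursquare!", "a pretty day"]
word_frequencies(sample, ignore: ["again"], min_length: 5,
                 words_only: true, case_sensitive: false)
  .each { |word, count| puts "#{word} (#{count})" }
# prints:
#   foursquare (2)
#   think (2)
#   about (1)
#   pretty (1)
```

Unlike `word_freq` in lib/tweetly/user.rb, this sketch applies the ignore list after case folding and punctuation stripping, so an entry such as `'again'` also drops `Again,`.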
data/Rakefile
ADDED
data/lib/tweetly.rb
ADDED
data/lib/tweetly/user.rb
ADDED
@@ -0,0 +1,142 @@
+module Tweetly
+
+  class User
+    attr_accessor :name, :screen_name, :protected, :statuses_count, :timeline
+
+    def initialize(name)
+      user = Twitter.user(name)
+      unless user.protected?
+        @name = user.name
+        @screen_name = user.screen_name
+        @protected = user.protected?
+        @statuses_count = user.statuses_count
+        @timeline = []
+      end
+    end
+
+    # Builds a word frequency array from recent tweets.
+    #
+    # @param [Hash] options frequency distribution options
+    # @option [Integer] :tweets number of tweets to consider (max 3200)
+    # @option [Boolean] :words_only consider only word characters
+    # i.e. (letters, digits, underscores)
+    # @option [Boolean] :case_sensitive whether to consider word case
+    # @option [Boolean] :include_rts whether to consider retweeted statuses
+    # @option [Array<String>] :ignore list of words to ignore
+    # @return [Array<Array<String, Integer>>] list of word frequencies in descending order
+    def word_freq(options={})
+      # Default parameters
+      params = {
+        tweets: 1000,
+        words_only: false,
+        case_sensitive: true,
+        include_rts: true,
+        min_length: nil,
+        ignore: []
+      }
+      params.merge!(options)
+
+      # Fetch Tweets if necessary
+      fetch_timeline(params[:tweets]) if @timeline.count < params[:tweets]
+
+      freqDist = {}
+
+      @timeline.each_with_index do |tweet, i|
+        # :include_rts option
+        next if !params[:include_rts] && tweet.retweeted_status
+
+        tweet.text.split.each do |word|
+          # :ignore option
+          next if params[:ignore].include? word
+
+          # :case_sensitive option
+          word.downcase! if !params[:case_sensitive]
+
+          # :words_only option
+          if params[:words_only]
+            word = word.match(/\w+/)
+            next unless word
+            word = word[0]
+          end
+
+          # :min_length option
+          next if params[:min_length] && word.length < params[:min_length]
+
+          if freqDist.has_key? word
+            freqDist[word] += 1
+          else
+            freqDist[word] = 1
+          end
+        end
+
+        break if i + 1 >= params[:tweets]
+      end
+      freqDist.sort_by { |k,v| -v }
+    end
+
+    # Prints word frequency distribution list.
+    #
+    # @param [Hash] options printing options
+    # @option [Integer] :tweets number of tweets to consider (default: 1000, max: 3200)
+    # @option [Integer] :count prints top count of frequency distribution list
+    # @option [Boolean] :print_count whether to print count next to each word
+    def print_word_freq(options={})
+      opts = {
+        tweets: 1000,
+        print_count: true
+      }
+      opts.merge!(options)
+      dist = word_freq(opts)
+      opts[:count] ||= dist.count
+      dist[0,opts[:count]].each do |k,v|
+        line = "#{k}"
+        line += " (#{v})" if opts[:print_count]
+        puts line
+      end
+      return
+    end
+
+    # Fetches user timeline and caches it in the @timeline field.
+    #
+    # @todo Duplicate tweets due to max_id
+    #
+    # @param [Integer] count the number of Tweets to fetch
+    # @return [Boolean] Success or failure (amount already cached)
+    def fetch_timeline(count=1000)
+      # Twitter API limits to 3200 tweets
+      count = [count, 3200].min
+
+      if @timeline.count < count
+        opts = {
+          trim_user: 1,
+          include_rts: 1
+        }
+
+        while @timeline.length < count
+          # Fetch only what we don't have
+          opts[:max_id] = @timeline.last.id unless @timeline.empty?
+
+          # Max step count is 200 by API
+          # We may need less depending on what's already fetched
+          opts[:count] = [200, count - @timeline.count].min
+
+          # Fetch an extra since max_id is included
+          opts[:count] += 1 if opts[:max_id]
+
+          resp = Twitter.user_timeline(@screen_name, opts)
+
+          # Avoid reinserting max_id tweet
+          resp.delete_at(0) if opts[:max_id]
+
+          resp.each { |tweet| @timeline << tweet }
+        end
+
+        return true
+      else
+        # We've pulled this amount or more already
+        return false
+      end
+    end
+
+  end
+end
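The paging loop in `fetch_timeline` relies on `max_id` being inclusive: each follow-up request returns the oldest tweet already held, so one extra tweet is requested and the first result is dropped. Below is a minimal sketch of that pattern against simulated data. `fetch_page`, `fetch_all`, and the hash-based tweets are hypothetical stand-ins used only to make the example runnable; they are not the twitter gem's API. The early `break` on an empty page is an addition of this sketch, assumed here so the loop ends when fewer tweets exist than were requested.

```ruby
# Hypothetical stand-in for a timeline API call: returns up to `count` tweets,
# newest first, with id <= max_id (inclusive) when max_id is given.
def fetch_page(store, count, max_id = nil)
  pool = max_id ? store.select { |t| t[:id] <= max_id } : store
  pool.first(count)
end

# Collects up to `total` tweets by paging backwards with max_id.
def fetch_all(store, total, page_size = 5)
  timeline = []
  while timeline.length < total
    max_id = timeline.empty? ? nil : timeline.last[:id]
    want = [page_size, total - timeline.length].min
    # Ask for one extra when max_id is set, since that tweet comes back again.
    want += 1 if max_id
    page = fetch_page(store, want, max_id)
    page = page.drop(1) if max_id  # drop the tweet we already hold
    break if page.empty?           # nothing older is left
    timeline.concat(page)
  end
  timeline
end

store = (1..12).map { |id| { id: id } }.reverse  # simulated tweets, newest first
puts fetch_all(store, 8).map { |t| t[:id] }.inspect
# prints: [12, 11, 10, 9, 8, 7, 6, 5]
```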
data/spec/spec_helper.rb
ADDED
@@ -0,0 +1,14 @@
+# This file was generated by the `rspec --init` command. Conventionally, all
+# specs live under a `spec` directory, which RSpec adds to the `$LOAD_PATH`.
+# Require this file using `require "spec_helper.rb"` to ensure that it is only
+# loaded once.
+#
+# See http://rubydoc.info/gems/rspec-core/RSpec/Core/Configuration
+
+require_relative "../lib/tweetly"
+
+RSpec.configure do |config|
+  config.treat_symbols_as_metadata_keys_with_true_values = true
+  config.run_all_when_everything_filtered = true
+  config.filter_run :focus
+end
data/spec/user_spec.rb
ADDED
@@ -0,0 +1,50 @@
+require 'spec_helper'
+
+describe Tweetly::User do
+  context "Initializing a new user" do
+    before do
+      @user = Tweetly::User.new('jicooo')
+    end
+
+    it "should be created successfully" do
+      @user.should be_an_instance_of Tweetly::User
+    end
+  end
+
+  context "Fetching a user timeline" do
+    before do
+      @user = Tweetly::User.new('jicooo')
+    end
+
+    it "should fetch 1000 tweets by default" do
+      @user.fetch_timeline
+      @user.timeline.count.should == 1000
+    end
+
+    it "should be able to fetch less than 1000" do
+      @user.fetch_timeline(10)
+      @user.timeline.count.should == 10
+    end
+
+    it "should be able to fetch more than 1000" do
+      @user.fetch_timeline(1001)
+      @user.timeline.count.should == 1001
+    end
+
+    it "should not fetch duplicate tweets" do
+      seen = {}
+      duplicate = false
+      @user.timeline.each do |t|
+        if seen.has_key? t.id
+          duplicate = true
+          break
+        else
+          seen[t.id] = 1
+        end
+      end
+      duplicate.should be_false
+    end
+  end
+
+
+end
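The duplicate check in the last example tracks seen ids in a hash and stops at the first repeat. A more compact way to express the same test is to compare the number of distinct ids with the total. This is a minimal sketch; `duplicate_ids?` is a hypothetical helper and is not part of the spec or of tweetly.

```ruby
# Hypothetical helper: true when any id occurs more than once.
def duplicate_ids?(ids)
  ids.uniq.length != ids.length
end

puts duplicate_ids?([3, 2, 1])  # prints: false
puts duplicate_ids?([3, 2, 2])  # prints: true
```

An empty list reports no duplicates, so a check like this is only meaningful once the timeline has been fetched.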
data/tweetly.gemspec
ADDED
@@ -0,0 +1,22 @@
+# -*- encoding: utf-8 -*-
+require File.expand_path('../lib/tweetly/version', __FILE__)
+
+Gem::Specification.new do |gem|
+  gem.authors = ["Jico Baligod"]
+  gem.email = ["jico@baligod.com"]
+  gem.description = %q{Twitter user timeline stats.}
+  gem.summary = %q{Generate word frequency distributions from your Twitter statuses.}
+  gem.homepage = "https://github.com/jico/tweetly"
+
+  gem.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
+  gem.files = `git ls-files`.split("\n")
+  gem.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
+  gem.name = "tweetly"
+  gem.require_paths = ["lib"]
+  gem.version = Tweetly::VERSION
+
+  gem.add_development_dependency "rake"
+  gem.add_development_dependency "rspec", "~> 2.9.0"
+
+  gem.add_dependency "twitter"
+end
metadata
ADDED
@@ -0,0 +1,98 @@
+--- !ruby/object:Gem::Specification
+name: tweetly
+version: !ruby/object:Gem::Version
+  version: 0.0.1
+  prerelease:
+platform: ruby
+authors:
+- Jico Baligod
+autorequire:
+bindir: bin
+cert_chain: []
+date: 2012-05-15 00:00:00.000000000 Z
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: rake
+  requirement: &70128994114340 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :development
+  prerelease: false
+  version_requirements: *70128994114340
+- !ruby/object:Gem::Dependency
+  name: rspec
+  requirement: &70128994113540 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        version: 2.9.0
+  type: :development
+  prerelease: false
+  version_requirements: *70128994113540
+- !ruby/object:Gem::Dependency
+  name: twitter
+  requirement: &70128994113060 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ! '>='
+      - !ruby/object:Gem::Version
+        version: '0'
+  type: :runtime
+  prerelease: false
+  version_requirements: *70128994113060
+description: Twitter user timeline stats.
+email:
+- jico@baligod.com
+executables: []
+extensions: []
+extra_rdoc_files: []
+files:
+- .gitignore
+- .rspec
+- Gemfile
+- LICENSE
+- README.md
+- Rakefile
+- lib/tweetly.rb
+- lib/tweetly/user.rb
+- lib/tweetly/version.rb
+- spec/spec_helper.rb
+- spec/user_spec.rb
+- tweetly.gemspec
+homepage: https://github.com/jico/tweetly
+licenses: []
+post_install_message:
+rdoc_options: []
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: -4256241259014341675
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ! '>='
+    - !ruby/object:Gem::Version
+      version: '0'
+      segments:
+      - 0
+      hash: -4256241259014341675
+requirements: []
+rubyforge_project:
+rubygems_version: 1.8.17
+signing_key:
+specification_version: 3
+summary: Generate word frequency distributions from your Twitter statuses.
+test_files:
+- spec/spec_helper.rb
+- spec/user_spec.rb