twitterscraper-ruby 0.15.0

@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: a950fb24329aaa1020441e258a8a2144100d732142b6c227bb9b026b8bb73996
+   data.tar.gz: 1f64f31e43189e2ee439f5ef6f6d54bc6ea58895adbed67cb8ddbe91af07681a
+ SHA512:
+   metadata.gz: 8573affbc9a5faa05e5e489364bb2ba0da1aa4f12af35445e5de8b1f8c399eb0575cc9f408b2ba96c3d7fd8b2a74b7dd703229053a33c1f8a883856818033cb9
+   data.tar.gz: 2b2b3ad0b2dd9d089a7b6127ed1b0db21e7f4fa5f0c31e6b366d9b5ae444e2244d4200c813b7a3257f43702d2caa9f264515e701602c24f4482a746b89d41328
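The digests above cover the two archives packed inside the `.gem` file. A minimal sketch of how such a digest could be checked with Ruby's standard `digest` library follows; the helper name `checksum_matches?` and the file path in the commented call are hypothetical and not part of the gem.

```ruby
require 'digest'

# Hypothetical helper: true when the SHA256 hex digest of the file at `path`
# equals `expected_hex` (the expected value is compared case-insensitively).
def checksum_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Example call, assuming metadata.gz has already been extracted from the .gem archive:
# checksum_matches?('metadata.gz',
#                   'a950fb24329aaa1020441e258a8a2144100d732142b6c227bb9b026b8bb73996')
```

The same pattern should work for the SHA512 entries by swapping in `Digest::SHA512`.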
@@ -0,0 +1,31 @@
+ version: 2.1
+ orbs:
+   ruby: circleci/ruby@0.1.2
+
+ jobs:
+   build:
+     docker:
+       - image: circleci/ruby:2.6.4-stretch-node
+         environment:
+           BUNDLER_VERSION: 2.1.4
+     executor: ruby/default
+     steps:
+       - checkout
+       - run:
+           name: Update bundler
+           command: gem update bundler
+       - run:
+           name: Which bundler?
+           command: bundle -v
+       - restore_cache:
+           keys:
+             - gem-cache-v1-{{ arch }}-{{ .Branch }}-{{ checksum "Gemfile.lock" }}
+             - gem-cache-v1-{{ arch }}-{{ .Branch }}
+             - gem-cache-v1
+       - run: bundle install --path vendor/bundle
+       - run: bundle clean
+       - save_cache:
+           key: gem-cache-v1-{{ arch }}-{{ .Branch }}-{{ checksum "Gemfile.lock" }}
+           paths:
+             - vendor/bundle
+       - run: bundle exec rspec
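The three `restore_cache` keys in this config run from most to least specific. CircleCI tries them in order and treats each one as a prefix, restoring the most recently saved cache that matches. The sketch below simulates that lookup in plain Ruby; `pick_cache` and the rendered key strings are hypothetical stand-ins, not CircleCI code.

```ruby
# Returns the saved cache key a restore step would pick, or nil if none match.
# `saved_keys` is ordered oldest to newest; `lookup_keys` mirrors the `keys:` list.
def pick_cache(saved_keys, lookup_keys)
  lookup_keys.each do |prefix|
    # The newest saved key that starts with this lookup key wins.
    match = saved_keys.reverse.find { |key| key.start_with?(prefix) }
    return match if match
  end
  nil
end

# Hypothetical rendered keys from earlier builds, oldest first.
saved = [
  'gem-cache-v1-amd64-master-aaa111',
  'gem-cache-v1-amd64-feature-bbb222'
]

# Keys looked up by a build on master whose Gemfile.lock checksum has changed.
lookup = [
  'gem-cache-v1-amd64-master-ccc333',
  'gem-cache-v1-amd64-master',
  'gem-cache-v1'
]

puts pick_cache(saved, lookup)
```

Here the exact key misses, so the branch-level prefix falls back to `gem-cache-v1-amd64-master-aaa111`, and `bundle install` only has to fetch what changed.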
@@ -0,0 +1,10 @@
+ /.bundle/
+ /.yardoc
+ /_yardoc/
+ /coverage/
+ /doc/
+ /pkg/
+ /spec/reports/
+ /tmp/
+ /cache
+ /.idea
data/.irbrc ADDED
@@ -0,0 +1,7 @@
+ require 'irb/completion'
+ require 'irb/ext/save-history'
+
+ IRB.conf[:SAVE_HISTORY] = 10000
+ IRB.conf[:HISTORY_FILE] = File.expand_path('~/.irb_history')
+
+ require 'twitterscraper'
data/.rspec ADDED
@@ -0,0 +1,2 @@
+ -fd
+ --require spec_helper
@@ -0,0 +1 @@
+ 2.6.4
@@ -0,0 +1,6 @@
+ ---
+ language: ruby
+ cache: bundler
+ rvm:
+   - 2.5.3
+ before_install: gem install bundler -v 2.1.4
@@ -0,0 +1,74 @@
+ # Contributor Covenant Code of Conduct
+
+ ## Our Pledge
+
+ In the interest of fostering an open and welcoming environment, we as
+ contributors and maintainers pledge to making participation in our project and
+ our community a harassment-free experience for everyone, regardless of age, body
+ size, disability, ethnicity, gender identity and expression, level of experience,
+ nationality, personal appearance, race, religion, or sexual identity and
+ orientation.
+
+ ## Our Standards
+
+ Examples of behavior that contributes to creating a positive environment
+ include:
+
+ * Using welcoming and inclusive language
+ * Being respectful of differing viewpoints and experiences
+ * Gracefully accepting constructive criticism
+ * Focusing on what is best for the community
+ * Showing empathy towards other community members
+
+ Examples of unacceptable behavior by participants include:
+
+ * The use of sexualized language or imagery and unwelcome sexual attention or
+   advances
+ * Trolling, insulting/derogatory comments, and personal or political attacks
+ * Public or private harassment
+ * Publishing others' private information, such as a physical or electronic
+   address, without explicit permission
+ * Other conduct which could reasonably be considered inappropriate in a
+   professional setting
+
+ ## Our Responsibilities
+
+ Project maintainers are responsible for clarifying the standards of acceptable
+ behavior and are expected to take appropriate and fair corrective action in
+ response to any instances of unacceptable behavior.
+
+ Project maintainers have the right and responsibility to remove, edit, or
+ reject comments, commits, code, wiki edits, issues, and other contributions
+ that are not aligned to this Code of Conduct, or to ban temporarily or
+ permanently any contributor for other behaviors that they deem inappropriate,
+ threatening, offensive, or harmful.
+
+ ## Scope
+
+ This Code of Conduct applies both within project spaces and in public spaces
+ when an individual is representing the project or its community. Examples of
+ representing a project or community include using an official project e-mail
+ address, posting via an official social media account, or acting as an appointed
+ representative at an online or offline event. Representation of a project may be
+ further defined and clarified by project maintainers.
+
+ ## Enforcement
+
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
+ reported by contacting the project team at ts_3156@yahoo.co.jp. All
+ complaints will be reviewed and investigated and will result in a response that
+ is deemed necessary and appropriate to the circumstances. The project team is
+ obligated to maintain confidentiality with regard to the reporter of an incident.
+ Further details of specific enforcement policies may be posted separately.
+
+ Project maintainers who do not follow or enforce the Code of Conduct in good
+ faith may face temporary or permanent repercussions as determined by other
+ members of the project's leadership.
+
+ ## Attribution
+
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
+ available at [https://contributor-covenant.org/version/1/4][version]
+
+ [homepage]: https://contributor-covenant.org
+ [version]: https://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,8 @@
+ source "https://rubygems.org"
+
+ # Specify your gem's dependencies in twitterscraper-ruby.gemspec
+ gemspec
+
+ gem "rake", "~> 12.0"
+ gem "minitest", "~> 5.0"
+ gem "rspec"
@@ -0,0 +1,42 @@
+ PATH
+   remote: .
+   specs:
+     twitterscraper-ruby (0.15.0)
+       nokogiri
+       parallel
+
+ GEM
+   remote: https://rubygems.org/
+   specs:
+     diff-lcs (1.4.4)
+     mini_portile2 (2.4.0)
+     minitest (5.14.1)
+     nokogiri (1.10.10)
+       mini_portile2 (~> 2.4.0)
+     parallel (1.19.2)
+     rake (12.3.3)
+     rspec (3.9.0)
+       rspec-core (~> 3.9.0)
+       rspec-expectations (~> 3.9.0)
+       rspec-mocks (~> 3.9.0)
+     rspec-core (3.9.2)
+       rspec-support (~> 3.9.3)
+     rspec-expectations (3.9.2)
+       diff-lcs (>= 1.2.0, < 2.0)
+       rspec-support (~> 3.9.0)
+     rspec-mocks (3.9.1)
+       diff-lcs (>= 1.2.0, < 2.0)
+       rspec-support (~> 3.9.0)
+     rspec-support (3.9.3)
+
+ PLATFORMS
+   ruby
+
+ DEPENDENCIES
+   minitest (~> 5.0)
+   rake (~> 12.0)
+   rspec
+   twitterscraper-ruby!
+
+ BUNDLED WITH
+    2.1.4
@@ -0,0 +1,21 @@
+ The MIT License (MIT)
+
+ Copyright (c) 2020 ts-3156
+
+ Permission is hereby granted, free of charge, to any person obtaining a copy
+ of this software and associated documentation files (the "Software"), to deal
+ in the Software without restriction, including without limitation the rights
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ copies of the Software, and to permit persons to whom the Software is
+ furnished to do so, subject to the following conditions:
+
+ The above copyright notice and this permission notice shall be included in
+ all copies or substantial portions of the Software.
+
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ THE SOFTWARE.
@@ -0,0 +1,174 @@
+ # twitterscraper-ruby
+
+ [![Build Status](https://circleci.com/gh/ts-3156/twitterscraper-ruby.svg?style=svg)](https://circleci.com/gh/ts-3156/twitterscraper-ruby)
+ [![Gem Version](https://badge.fury.io/rb/twitterscraper-ruby.svg)](https://badge.fury.io/rb/twitterscraper-ruby)
+
+ A gem to scrape https://twitter.com/search. This gem is inspired by [taspinar/twitterscraper](https://github.com/taspinar/twitterscraper).
+
+
+ ## Twitter Search API vs. twitterscraper-ruby
+
+ ### Twitter Search API
+
+ - The number of tweets: 180 - 450 requests/15 minutes (18,000 - 45,000 tweets/15 minutes)
+ - The time window: the past 7 days
+
+ ### twitterscraper-ruby
+
+ - The number of tweets: Unlimited
+ - The time window: from 2006-03-21 to today
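The tweet counts quoted for the Search API follow directly from the request limits. The short worked example below assumes that one search request returns at most 100 tweets, which is the figure implied by the numbers above.

```ruby
# Assumption: one Search API request returns at most 100 tweets.
tweets_per_request = 100

# Request limits per 15-minute window, as quoted in the README.
request_limits = [180, 450]

# Upper bound on tweets retrievable per window for each limit.
tweets_per_window = request_limits.map { |requests| requests * tweets_per_request }

puts tweets_per_window.inspect
```

This yields 18,000 and 45,000 tweets per 15 minutes, matching the range in the comparison.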
+
+
+ ## Installation
+
+ First install the library:
+
+ ```shell script
+ $ gem install twitterscraper-ruby
+ ```
+
+
+ ## Usage
+
+ Command-line interface:
+
+ ```shell script
+ # Returns a collection of relevant tweets matching a specified query.
+ $ twitterscraper --type search --query KEYWORD --start_date 2020-06-01 --end_date 2020-06-30 --lang ja \
+     --limit 100 --threads 10 --output tweets.json
+ ```
+
+ ```shell script
+ # Returns a collection of the most recent tweets posted by the user indicated by the screen_name.
+ $ twitterscraper --type user --query SCREEN_NAME --limit 100 --output tweets.json
+ ```
+
+ From within Ruby:
+
+ ```ruby
+ require 'twitterscraper'
+ client = Twitterscraper::Client.new(cache: true, proxy: true)
+ ```
+
+ ```ruby
+ # Returns a collection of relevant tweets matching a specified query.
+ tweets = client.search(KEYWORD, start_date: '2020-06-01', end_date: '2020-06-30', lang: 'ja', limit: 100, threads: 10)
+ ```
+
+ ```ruby
+ # Returns a collection of the most recent tweets posted by the user indicated by the screen_name.
+ tweets = client.user_timeline(SCREEN_NAME, limit: 100)
+ ```
+
+
+ ## Attributes
+
+ ### Tweet
+
+ ```ruby
+ tweets.each do |tweet|
+   puts tweet.tweet_id
+   puts tweet.text
+   puts tweet.tweet_url
+   puts tweet.created_at
+
+   hash = tweet.attrs
+   puts hash.keys
+ end
+ ```
+
+ - screen_name
+ - name
+ - user_id
+ - tweet_id
+ - text
+ - links
+ - hashtags
+ - image_urls
+ - video_url
+ - has_media
+ - likes
+ - retweets
+ - replies
+ - is_replied
+ - is_reply_to
+ - parent_tweet_id
+ - reply_to_users
+ - tweet_url
+ - created_at
+
+
+ ## Search operators
+
+ | Operator | Finds Tweets... |
+ | ------------- | ------------- |
+ | watching now | containing both "watching" and "now". This is the default operator. |
+ | "happy hour" | containing the exact phrase "happy hour". |
+ | love OR hate | containing either "love" or "hate" (or both). |
+ | beer -root | containing "beer" but not "root". |
+ | #haiku | containing the hashtag "haiku". |
+ | from:interior | sent from Twitter account "interior". |
+ | to:NASA | a Tweet authored in reply to Twitter account "NASA". |
+ | @NASA | mentioning Twitter account "NASA". |
+ | puppy filter:media | containing "puppy" and an image or video. |
+ | puppy -filter:retweets | containing "puppy", filtering out retweets. |
+ | superhero since:2015-12-21 | containing "superhero" and sent since date "2015-12-21" (year-month-day). |
+ | puppy until:2015-12-21 | containing "puppy" and sent before the date "2015-12-21". |
+
+ Search operators documentation is in [Standard search operators](https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators).
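Operators can be combined in a single query string. The sketch below uses only plain string handling to show how a keyword and a date range might be joined into one query. `build_query` is a hypothetical helper for illustration and is not part of twitterscraper-ruby.

```ruby
require 'date'

# Hypothetical helper: joins a keyword with optional since:/until: operators.
# Dates are validated with Date.iso8601, which raises on malformed input.
def build_query(keyword, since_date: nil, until_date: nil)
  parts = [keyword]
  parts << "since:#{Date.iso8601(since_date)}" if since_date
  parts << "until:#{Date.iso8601(until_date)}" if until_date
  parts.join(' ')
end

puts build_query('superhero', since_date: '2015-12-21')
puts build_query('puppy -filter:retweets', since_date: '2020-06-01', until_date: '2020-06-30')
```

The resulting string could then be passed as the query argument on the command line or to the client.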
+
+
+ ## Examples
+
+ ```shell script
+ $ twitterscraper --query twitter --limit 1000
+ $ cat tweets.json | jq . | less
+ ```
+
+ ```json
+ [
+   {
+     "screen_name": "@screenname",
+     "name": "name",
+     "user_id": 1194529546483000000,
+     "tweet_id": 1282659891992000000,
+     "tweet_url": "https://twitter.com/screenname/status/1282659891992000000",
+     "created_at": "2020-07-13 12:00:00 +0000",
+     "text": "Thanks Twitter!"
+   }
+ ]
+ ```
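The output file is ordinary JSON, so it can be post-processed with Ruby's standard `json` library. The small sketch below assumes the file has the shape shown above; the inline `sample` string stands in for the contents of `tweets.json`.

```ruby
require 'json'

# Stand-in for File.read('tweets.json'); same shape as the sample output.
sample = <<~JSON_TEXT
  [
    {
      "screen_name": "@screenname",
      "name": "name",
      "user_id": 1194529546483000000,
      "tweet_id": 1282659891992000000,
      "tweet_url": "https://twitter.com/screenname/status/1282659891992000000",
      "created_at": "2020-07-13 12:00:00 +0000",
      "text": "Thanks Twitter!"
    }
  ]
JSON_TEXT

tweets = JSON.parse(sample)

# Print one line per tweet: id, author, and text.
tweets.each do |tweet|
  puts "#{tweet['tweet_id']} #{tweet['screen_name']}: #{tweet['text']}"
end
```

Keys come back as strings here; passing `symbolize_names: true` to `JSON.parse` would return symbol keys instead.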
+
+ ## CLI Options
+
+ | Option | Description | Default |
+ | ------------- | ------------- | ------------- |
+ | `-h`, `--help` | This option displays a summary of twitterscraper. | |
+ | `--type` | Specify a search type. | search |
+ | `--query` | Specify a keyword used during the search. | |
+ | `--start_date` | Used as "since:yyyy-mm-dd" for your query. This means "since the date". | |
+ | `--end_date` | Used as "until:yyyy-mm-dd" for your query. This means "before the date". | |
+ | `--lang` | Retrieve tweets written in a specific language. | |
+ | `--limit` | Stop scraping when *at least* the number of tweets indicated with --limit is scraped. | 100 |
+ | `--order` | Sort order of the results. | desc |
+ | `--threads` | Set the number of threads twitterscraper-ruby should initiate while scraping for your query. | 2 |
+ | `--proxy` | Scrape https://twitter.com/search via proxies. | true |
+ | `--cache` | Enable caching. | true |
+ | `--format` | The format of the output. | json |
+ | `--output` | The name of the output file. | tweets.json |
+ | `--verbose` | Print debug messages. | |
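To illustrate how options and defaults like these fit together, here is a minimal sketch built on Ruby's standard `optparse` library. It is not the gem's own parser (the contents of `lib/twitterscraper/cli` are not shown here); `parse_options` is a hypothetical name and only a subset of the options is handled.

```ruby
require 'optparse'

# Hypothetical parser covering a few of the options in the table above.
# Defaults are copied from the "Default" column.
def parse_options(argv)
  options = { type: 'search', limit: 100, threads: 2, format: 'json', output: 'tweets.json' }

  OptionParser.new do |opts|
    opts.on('--type TYPE') { |v| options[:type] = v }
    opts.on('--query QUERY') { |v| options[:query] = v }
    # Integer makes OptionParser convert the value and reject non-numbers.
    opts.on('--limit N', Integer) { |v| options[:limit] = v }
    opts.on('--threads N', Integer) { |v| options[:threads] = v }
    opts.on('--output FILE') { |v| options[:output] = v }
  end.parse(argv)

  options
end

options = parse_options(%w[--query twitter --limit 1000])
puts options[:query]
puts options[:limit]
```

Options that are not passed keep their defaults, so `options[:threads]` stays at 2 and `options[:output]` at `tweets.json` in this call.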
+
+
+ ## Contributing
+
+ Bug reports and pull requests are welcome on GitHub at https://github.com/ts-3156/twitterscraper-ruby. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [code of conduct](https://github.com/ts-3156/twitterscraper-ruby/blob/master/CODE_OF_CONDUCT.md).
+
+
+ ## License
+
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+
+
+ ## Code of Conduct
+
+ Everyone interacting in the twitterscraper-ruby project's codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/ts-3156/twitterscraper-ruby/blob/master/CODE_OF_CONDUCT.md).
@@ -0,0 +1,10 @@
+ require "bundler/gem_tasks"
+ require "rake/testtask"
+
+ Rake::TestTask.new(:test) do |t|
+   t.libs << "test"
+   t.libs << "lib"
+   t.test_files = FileList["test/**/*_test.rb"]
+ end
+
+ task :default => :test
@@ -0,0 +1,14 @@
+ #!/usr/bin/env ruby
+
+ require "bundler/setup"
+ require "twitterscraper/ruby"
+
+ # You can add fixtures and/or initialization code here to make experimenting
+ # with your gem easier. You can also use a different console, if you like.
+
+ # (If you use this, don't forget to add pry to your Gemfile!)
+ # require "pry"
+ # Pry.start
+
+ require "irb"
+ IRB.start(__FILE__)
@@ -0,0 +1,8 @@
+ #!/usr/bin/env bash
+ set -euo pipefail
+ IFS=$'\n\t'
+ set -vx
+
+ bundle install
+
+ # Do any other automated setup that you need to do here
@@ -0,0 +1,13 @@
+ #!/usr/bin/env ruby
+
+ require_relative '../lib/twitterscraper/cli'
+
+ begin
+   cli = Twitterscraper::Cli.new
+   cli.parse
+   cli.run
+ rescue => e
+   STDERR.puts e.inspect
+   STDERR.puts e.backtrace.join("\n")
+   exit 1
+ end