RubyGems - twitterscraper-ruby - Versions diffs - 0.15.0 → 0.15.1 - Mend

twitterscraper-ruby 0.15.0 → 0.15.1

Files changed (6)

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA256:
-  metadata.gz: a950fb24329aaa1020441e258a8a2144100d732142b6c227bb9b026b8bb73996
-  data.tar.gz: 1f64f31e43189e2ee439f5ef6f6d54bc6ea58895adbed67cb8ddbe91af07681a
+  metadata.gz: 7f04cb0ba394884918271b5485b596c07203b7a6e9f4fec42d074ef4f02b6a0a
+  data.tar.gz: a4f618df53d1e8b54954619e87d383e43dbe5a63bbf83b33ee38f975998f2678
 SHA512:
-  metadata.gz: 8573affbc9a5faa05e5e489364bb2ba0da1aa4f12af35445e5de8b1f8c399eb0575cc9f408b2ba96c3d7fd8b2a74b7dd703229053a33c1f8a883856818033cb9
-  data.tar.gz: 2b2b3ad0b2dd9d089a7b6127ed1b0db21e7f4fa5f0c31e6b366d9b5ae444e2244d4200c813b7a3257f43702d2caa9f264515e701602c24f4482a746b89d41328
+  metadata.gz: fa9f02cf3ef0bf280f45b18ebacaec0b06dbd610477355602fcc59d382b5590c990695297e1e793457fdcff4cb7dd037f076c1f0fa4706eb69c67c3a165243e4
+  data.tar.gz: 9c08d9e4d1ee56fa133675bc73a50f502040cc9a2844d9a46a39c38ccdffdf43c15b17c2e4a8b74561f523493ccbc4a055f0add239574d2f5129ee4abe1f5ed9
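The entries above are hex digests of the `metadata.gz` and `data.tar.gz` archives packed inside the `.gem` file. As a minimal sketch of how one such digest could be checked with Ruby's standard `digest` library: the `checksum_matches?` helper and the in-memory sample payload are hypothetical and are not part of the gem.

```ruby
require 'digest'

# Hypothetical helper (not part of twitterscraper-ruby): returns true when the
# SHA256 hex digest of +data+ equals +expected_hex+, ignoring letter case.
def checksum_matches?(data, expected_hex)
  Digest::SHA256.hexdigest(data) == expected_hex.downcase
end

# Illustrative payload standing in for the bytes of data.tar.gz.
sample = 'hello'
expected = Digest::SHA256.hexdigest(sample)

puts checksum_matches?(sample, expected)       # digest of unchanged content
puts checksum_matches?(sample + '!', expected) # digest of altered content
```

For a real archive, the bytes would come from something like `File.binread` on the extracted `data.tar.gz`, and the expected value from the `SHA256` entry in `checksums.yaml`.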

data/Gemfile.lock CHANGED

@@ -1,7 +1,7 @@
 PATH
   remote: .
   specs:
-    twitterscraper-ruby (0.15.0)
+    twitterscraper-ruby (0.15.1)
       nokogiri
       parallel

data/README.md CHANGED

@@ -5,15 +5,17 @@
 A gem to scrape https://twitter.com/search. This gem is inspired by [taspinar/twitterscraper](https://github.com/taspinar/twitterscraper).
+Please feel free to ask [@ts_3156](https://twitter.com/ts_3156) if you have any questions.
 ## Twitter Search API vs. twitterscraper-ruby
-### Twitter Search API
+#### Twitter Search API
 - The number of tweets: 180 - 450 requests/15 minutes (18,000 - 45,000 tweets/15 minutes)
 - The time window: the past 7 days
-### twitterscraper-ruby
+#### twitterscraper-ruby
 - The number of tweets: Unlimited
 - The time window: from 2006-3-21 to today
@@ -30,37 +32,49 @@ $ gem install twitterscraper-ruby
 ## Usage
-Command-line interface:
+#### Command-line interface:
+Returns a collection of relevant tweets matching a specified query.
 ```shell script
-# Returns a collection of relevant tweets matching a specified query.
 $ twitterscraper --type search --query KEYWORD --start_date 2020-06-01 --end_date 2020-06-30 --lang ja \
       --limit 100 --threads 10 --output tweets.json
 ```
+Returns a collection of the most recent tweets posted by the user indicated by the screen_name.
 ```shell script
-# Returns a collection of the most recent tweets posted by the user indicated by the screen_name
 $ twitterscraper --type user --query SCREEN_NAME --limit 100 --output tweets.json
 ```
-From Within Ruby:
+#### From Within Ruby:
 ```ruby
 require 'twitterscraper'
 client = Twitterscraper::Client.new(cache: true, proxy: true)
 ```
+Returns a collection of relevant tweets matching a specified query.
 ```ruby
-# Returns a collection of relevant tweets matching a specified query.
 tweets = client.search(KEYWORD, start_date: '2020-06-01', end_date: '2020-06-30', lang: 'ja', limit: 100, threads: 10)
 ```
+Returns a collection of the most recent tweets posted by the user indicated by the screen_name.
 ```ruby
-# Returns a collection of the most recent tweets posted by the user indicated by the screen_name
 tweets = client.user_timeline(SCREEN_NAME, limit: 100)
 ```
+## Examples
+```shell script
+$ twitterscraper --query twitter --limit 1000
+$ cat tweets.json | jq . | less
+```
 ## Attributes
 ### Tweet
@@ -72,11 +86,39 @@ tweets.each do |tweet|
   puts tweet.tweet_url
   puts tweet.created_at
   hash = tweet.attrs
-  puts hash.keys
+  attr_names = hash.keys
+  json = tweet.to_json
 end
 ```
+```json
+[
+  {
+    "screen_name": "@name",
+    "name": "Name",
+    "user_id": 12340000,
+    "tweet_id": 1234000000000000,
+    "text": "Thanks Twitter!",
+    "links": [],
+    "hashtags": [],
+    "image_urls": [],
+    "video_url": null,
+    "has_media": null,
+    "likes": 10,
+    "retweets": 20,
+    "replies": 0,
+    "is_replied": false,
+    "is_reply_to": false,
+    "parent_tweet_id": null,
+    "reply_to_users": [],
+    "tweet_url": "https://twitter.com/name/status/1234000000000000",
+    "timestamp": 1594793000,
+    "created_at": "2020-07-15 00:00:00 +0000"
+  }
+]
+```
 - screen_name
 - name
 - user_id
@@ -118,45 +160,24 @@ end
 Search operators documentation is in [Standard search operators](https://developer.twitter.com/en/docs/tweets/rules-and-filtering/overview/standard-operators).
-## Examples
-```shell script
-$ twitterscraper --query twitter --limit 1000
-$ cat tweets.json | jq . | less
-```
-```json
-[
-  {
-    "screen_name": "@screenname",
-    "name": "name",
-    "user_id": 1194529546483000000,
-    "tweet_id": 1282659891992000000,
-    "tweet_url": "https://twitter.com/screenname/status/1282659891992000000",
-    "created_at": "2020-07-13 12:00:00 +0000",
-    "text": "Thanks Twitter!"
-  }
-]
-```
 ## CLI Options
-| Option | Description | Default |
-| ------------- | ------------- | ------------- |
-| `-h`, `--help` | This option displays a summary of twitterscraper. | |
-| `--type` | Specify a search type. | search |
-| `--query` | Specify a keyword used during the search. | |
-| `--start_date` | Used as "since:yyyy-mm-dd for your query. This means "since the date". | |
-| `--end_date` | Used as "until:yyyy-mm-dd for your query. This means "before the date". | |
-| `--lang` | Retrieve tweets written in a specific language. | |
-| `--limit` | Stop scraping when *at least* the number of tweets indicated with --limit is scraped. | 100 |
-| `--order` | Sort order of the results. | desc |
-| `--threads` | Set the number of threads twitterscraper-ruby should initiate while scraping for your query. | 2 |
-| `--proxy` | Scrape https://twitter.com/search via proxies. | true |
-| `--cache` | Enable caching. | true |
-| `--format` | The format of the output. | json |
-| `--output` | The name of the output file. | tweets.json |
-| `--verbose` | Print debug messages. | tweets.json |
+| Option | Type | Description | Value |
+| ------------- | ------------- | ------------- | ------------- |
+| `--help`       | string  | This option displays a summary of twitterscraper. | |
+| `--type`       | string  | Specify a search type. | search(default) or user |
+| `--query`      | string  | Specify a keyword used during the search. | |
+| `--start_date` | string  | Used as "since:yyyy-mm-dd" for your query. This means "since the date". | |
+| `--end_date`   | string  | Used as "until:yyyy-mm-dd" for your query. This means "before the date". | |
+| `--lang`       | string  | Retrieve tweets written in a specific language. | |
+| `--limit`      | integer | Stop scraping when *at least* the number of tweets indicated with --limit is scraped. | 100 |
+| `--order`      | string  | Sort order of the results. | desc(default) or asc |
+| `--threads`    | integer | Set the number of threads twitterscraper-ruby should initiate while scraping for your query. | 2 |
+| `--proxy`      | boolean | Scrape https://twitter.com/search via proxies. | true(default) or false |
+| `--cache`      | boolean | Enable caching. | true(default) or false |
+| `--format`     | string  | The format of the output. | json(default) or html |
+| `--output`     | string  | The name of the output file. | tweets.json |
+| `--verbose`    |         | Print debug messages. | |
 ## Contributing
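The revised CLI options table lists a type and a default or allowed value for each flag. As a rough illustration of how such a table maps onto Ruby's standard `optparse` library, here is a sketch covering a subset of the flags. The `parse_cli_options` helper is hypothetical, the `false` default for `--verbose` is an assumption (the table leaves it blank), and the gem's own parser may be written quite differently.

```ruby
require 'optparse'

# Hypothetical helper (not the gem's actual parser): parses a subset of the
# flags from the table and fills in the defaults the table documents.
def parse_cli_options(argv)
  options = {
    type: 'search', limit: 100, order: 'desc', threads: 2,
    format: 'json', output: 'tweets.json',
    verbose: false # assumption: the table gives no default for --verbose
  }

  OptionParser.new do |opts|
    # An Array argument restricts the flag to the listed values.
    opts.on('--type TYPE', %w[search user])    { |v| options[:type] = v }
    opts.on('--query QUERY', String)           { |v| options[:query] = v }
    opts.on('--limit N', Integer)              { |v| options[:limit] = v }
    opts.on('--order ORDER', %w[desc asc])     { |v| options[:order] = v }
    opts.on('--threads N', Integer)            { |v| options[:threads] = v }
    opts.on('--format FORMAT', %w[json html])  { |v| options[:format] = v }
    opts.on('--output FILE', String)           { |v| options[:output] = v }
    opts.on('--verbose')                       { options[:verbose] = true }
  end.parse(argv) # non-destructive: argv itself is left untouched

  options
end

p parse_cli_options(%w[--type user --query ts_3156 --limit 50])
```

A value outside an enumerated list, such as `--order sideways`, raises `OptionParser::InvalidArgument`, which is one way to enforce the "desc(default) or asc" style entries in the table.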

data/lib/twitterscraper/query.rb CHANGED

@@ -27,8 +27,8 @@ module Twitterscraper
         'include_available_features=1&include_entities=1&' +
         'max_position=__POS__&reset_error_state=false'
-    def build_query_url(query, lang, from_user, pos)
-      if from_user
+    def build_query_url(query, lang, type, pos)
+      if type == 'user'
         if pos
           RELOAD_URL_USER.sub('__USER__', query).sub('__POS__', pos.to_s)
         else
@@ -51,7 +51,7 @@ module Twitterscraper
       end
       Http.get(url, headers, proxy, timeout)
     rescue => e
-      logger.debug "query_single_page: #{e.inspect}"
+      logger.debug "get_single_page: #{e.inspect}"
       if (retries -= 1) > 0
         logger.info "Retrying... (Attempts left: #{retries - 1})"
         retry
@@ -79,7 +79,7 @@ module Twitterscraper
       logger.info "Querying #{query}"
       query = ERB::Util.url_encode(query)
-      url = build_query_url(query, lang, type == 'user', pos)
+      url = build_query_url(query, lang, type, pos)
       http_request = lambda do
         logger.debug "Scraping tweets from #{url}"
         get_single_page(url, headers, proxies)
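The hunks above replace the boolean `from_user` parameter with the `type` string, so the caller no longer pre-computes `type == 'user'`. A minimal sketch of the resulting dispatch follows. Only the `__USER__` and `__POS__` placeholders and the `RELOAD_URL_USER` name come from the diff; the `UrlSketch` module, the other constant names, the `__QUERY__` and `__LANG__` placeholders, and the `example.com` templates are hypothetical stand-ins.

```ruby
# Hypothetical stand-in for the relevant part of Twitterscraper::Query.
module UrlSketch
  # Placeholder templates; the gem's real URLs are not reproduced here.
  INIT_URL_USER   = 'https://example.com/__USER__'
  RELOAD_URL_USER = 'https://example.com/__USER__?max_position=__POS__'
  INIT_URL        = 'https://example.com/search?q=__QUERY__&l=__LANG__'
  RELOAD_URL      = 'https://example.com/search?q=__QUERY__&l=__LANG__&max_position=__POS__'

  module_function

  # Picks a template from the search type and from whether a paging
  # position is known, then fills in the placeholders.
  def build_query_url(query, lang, type, pos)
    if type == 'user'
      if pos
        RELOAD_URL_USER.sub('__USER__', query).sub('__POS__', pos.to_s)
      else
        INIT_URL_USER.sub('__USER__', query)
      end
    else
      template = pos ? RELOAD_URL : INIT_URL
      url = template.sub('__QUERY__', query).sub('__LANG__', lang.to_s)
      url = url.sub('__POS__', pos.to_s) if pos
      url
    end
  end
end

puts UrlSketch.build_query_url('ts_3156', nil, 'user', nil)
puts UrlSketch.build_query_url('ruby', 'ja', 'search', 7)
```

Passing the string through keeps the call site and the method signature aligned with the CLI's `--type` values, and leaves room for further types without adding more boolean parameters.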

data/lib/version.rb CHANGED

@@ -1,3 +1,3 @@
 module Twitterscraper
-  VERSION = '0.15.0'
+  VERSION = '0.15.1'
 end

metadata CHANGED

@@ -1,7 +1,7 @@
 --- !ruby/object:Gem::Specification
 name: twitterscraper-ruby
 version: !ruby/object:Gem::Version
-  version: 0.15.0
+  version: 0.15.1
 platform: ruby
 authors:
 - ts-3156