tanakai 1.5.0 → 1.5.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: '0363335680ba18ca855d2413e4efdce1957decab5f31c954e2b04f4f91660ac6'
-   data.tar.gz: 3fee8a56e284ef3bae724d1ffe3cc7f1614ad374efd59d60a002a7f71353cf06
+   metadata.gz: aefc92adbda49240ac69ad9d3163ec3bb54930d8288705943d269ccbba41c64a
+   data.tar.gz: c1c361906354aba4edb1dcef9e2737b4e827c794c5f61b4104ec309a67e3ca80
  SHA512:
-   metadata.gz: fabeeb2270349d0961294de34abe055906c38477cd4f744da9e033c626939e2672b86b30053a3d8c89bea1889f6392370e1b58296a0327f1a891d4915132478c
-   data.tar.gz: 8e34927825ef45893de6e00c676621823b4d3ca7c28210c73720409000c04bb228b1ea0dd75293144afd1dbff6f4f370ca0312847d781895434896374bb77f7b
+   metadata.gz: c744ae590cbb25e9dde914174be9fb003399bdc25408c9901c3898c296d4ff5428b97d3e4942c952979586946c75b4c7027edd594d438dc387dffb6afedda3bb
+   data.tar.gz: ac51ac208e71cd928aa402d62f51e465b11620b742f5ab8af208323e1f636503345ae910a4a6c0fe21f3d415bcff7bba600b6022bf879877695ba711463e97c0
data/CHANGELOG.md CHANGED
@@ -1,5 +1,9 @@
  # CHANGELOG

+ ## 1.5.1
+ ### New
+ * Add `response_type` to `in_parallel`
+
  ## 1.5.0
  ### New
  * First release as Tanakai
data/README.md CHANGED
@@ -626,7 +626,7 @@ Check out **Capybara cheat sheets** where you can see all available methods **to

  ### `request_to` method

- For making requests to a particular method there is `request_to`. It requires minimum two arguments: `:method_name` and `url:`. An optional argument is `data:` (see above what for is it). Example:
+ For making requests to a particular method there is `request_to`. It requires minimum two arguments: `:method_name` and `url:`. An optional argument is `data:` (see above what for is it) and `response_type` (defaults to `:html`). Example:

  ```ruby
  class Spider < Tanakai::Base
@@ -635,11 +635,12 @@ class Spider < Tanakai::Base

    def parse(response, url:, data: {})
      # Process request to `parse_product` method with `https://example.com/some_product` url:
-     request_to :parse_product, url: "https://example.com/some_product"
+     request_to :parse_product, url: "https://example.com/some_product.json", response_type: :json
    end

    def parse_product(response, url:, data: {})
-     puts "From page https://example.com/some_product !"
+     puts "JSON parsed from page https://example.com/some_product.json"
+     puts response
    end
  end
  ```
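The README change above suggests that `response_type:` selects the form in which a fetched body reaches the handler. The following is a minimal, dependency-free sketch of that idea, not Tanakai's implementation: `parse_body` is a hypothetical helper, and its `:html` branch returns the raw string where a real spider would presumably hand over a parsed document.

```ruby
require "json"

# Hypothetical helper (not part of Tanakai): turn a raw response body into
# the object a handler would receive, depending on response_type.
def parse_body(body, response_type: :html)
  case response_type
  when :json
    JSON.parse(body) # Hash or Array built from the JSON text
  when :html
    body             # stand-in only: a real spider would parse the HTML here
  else
    raise ArgumentError, "unsupported response_type: #{response_type.inspect}"
  end
end

product = parse_body('{"name": "Widget", "price": 9.5}', response_type: :json)
puts product["name"]

page = parse_body("<h1>Widget</h1>") # response_type defaults to :html
puts page
```

Going by the `in_parallel` signature in `base.rb`, the same option would be passed to a batch of URLs as `in_parallel(:parse_product, urls, threads: 3, response_type: :json)`.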
@@ -1194,6 +1195,7 @@ I, [2018-08-22 14:49:12 +0400#13033] [M: 46982297486840] INFO -- amazon_spider:
  * `delay:` set delay between requests: `in_parallel(:method, urls, threads: 3, delay: 2)`. Delay can be `Integer`, `Float` or `Range` (`2..5`). In case of a Range, delay number will be chosen randomly for each request: `rand (2..5) # => 3`
  * `engine:` set custom engine than a default one: `in_parallel(:method, urls, threads: 3, engine: :poltergeist_phantomjs)`
  * `config:` pass custom options to config (see [config section](#crawler-config))
+ * `response_type:` response should be returned as `:html` or `:json`, defaults to `:html`

  ### Active Support included

data/lib/tanakai/base.rb CHANGED
@@ -286,7 +286,7 @@ module Tanakai
      end
    end

-   def in_parallel(handler, urls, threads:, data: {}, delay: nil, engine: @engine, config: {})
+   def in_parallel(handler, urls, threads:, data: {}, delay: nil, engine: @engine, config: {}, response_type: :html)
      parts = urls.in_sorted_groups(threads, false)
      urls_count = urls.size

@@ -304,12 +304,12 @@ module Tanakai
    part.each do |url_data|
      if url_data.class == Hash
        if url_data[:url].present? && url_data[:data].present?
-         spider.request_to(handler, delay, url_data)
+         spider.request_to(handler, delay, url_data, response_type: response_type)
        else
          spider.public_send(handler, url_data)
        end
      else
-       spider.request_to(handler, delay, url: url_data, data: data)
+       spider.request_to(handler, delay, url: url_data, data: data, response_type: response_type)
      end
    end
  ensure
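One detail in the hunk above may be worth checking: in the first branch, `url_data` is passed as a positional Hash next to an explicit `response_type:` keyword. Whether that call works depends on how `request_to` is declared, which this diff does not show. The sketch below assumes a signature that takes `url:` and `data:` as keywords; the `request_to` defined here is a hypothetical stand-in, used only to show how Ruby treats the two call styles.

```ruby
# Hypothetical stand-in for request_to; the real signature is not part of this diff.
def request_to(handler, delay = nil, url:, data: {}, response_type: :html)
  { handler: handler, delay: delay, url: url, data: data, response_type: response_type }
end

url_data = { url: "https://example.com/some_product.json", data: { id: 1 } }

# Positional Hash plus an explicit keyword: the Hash stays positional,
# so the call has one positional argument too many and `url:` is never set.
begin
  request_to(:parse_product, nil, url_data, response_type: :json)
rescue ArgumentError => e
  puts "positional hash: #{e.class}"
end

# A double splat merges the Hash into the keyword arguments instead.
result = request_to(:parse_product, nil, **url_data, response_type: :json)
puts result[:url]
puts result[:response_type]
```

If the real method does use keyword arguments in this way, `**url_data` would be one way to keep the Hash branch working; if it accepts an options Hash instead, the original call may be fine as written.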
@@ -4,7 +4,7 @@ git_source(:github) { |repo| "https://github.com/#{repo}.git" }
  ruby '>= 2.5'

  # Framework
- gem 'tanakai'
+ gem 'tanakai', '~> 1.5'

  # Require files in directory and child directories recursively
  gem 'require_all'
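The new `'~> 1.5'` constraint uses RubyGems' pessimistic operator: it accepts 1.5 and later 1.x releases but not 2.0. A quick way to see which versions it admits is `Gem::Requirement` from RubyGems; apart from 1.5.0 and 1.5.1, the version numbers below are made up for illustration.

```ruby
require "rubygems"

requirement = Gem::Requirement.new("~> 1.5")

# Check a few candidate versions against the constraint.
%w[1.4.9 1.5.0 1.5.1 1.9.0 2.0.0].each do |v|
  ok = requirement.satisfied_by?(Gem::Version.new(v))
  puts "#{v}: #{ok ? 'allowed' : 'rejected'}"
end
```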
@@ -1,3 +1,3 @@
  module Tanakai
-   VERSION = "1.5.0"
+   VERSION = "1.5.1"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: tanakai
  version: !ruby/object:Gem::Version
-   version: 1.5.0
+   version: 1.5.1
  platform: ruby
  authors:
  - Victor Afanasev