tanakai 1.6.0 → 1.7.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: 7ea3cd20cfaedaebf473e853b66ebe58958e89b7525246444e3c8aeef46a4bf0
- data.tar.gz: a2c51b86487d6392a58b533237731996639fe0037c9aca22a6140c3c968eaf7d
+ metadata.gz: a544dd8c9d448beccd646a4a536a5597bd9e6348717c4f8103f708798fbc21c5
+ data.tar.gz: '093f88b3185e3999f4bed6b6c9df472a1d7c32c4f1e95e044bd313ec4e444f50'
  SHA512:
- metadata.gz: 52d9a730a0a9e08c0a49ee4177a0370f5ed2a12ac9e3925f0a83b0c232dcedb1645d1b6860cb19c8453bbc5777cec02403654e2282e57ad75c5c2cb898b6dc1b
- data.tar.gz: '0969ee651ec787b9fa1e47b8d776571b6f4751c29d3dd15bb0c696181ceab8bc826db6f54486df6426151f25f626f4725bf79c51ec8cc8cebebbe6cfa057bfa3'
+ metadata.gz: 0bf1db16739720f015902830f588a3a463331b6d36d241fdd0d397bd1d8cb2e8ef921bf8d10a552016b7b2e43e415f938b974001e63a347a8feec9bb8b9b270f
+ data.tar.gz: c080531acf9b98fb00c94ca994c3e32281c5eada7a38d15175a481f55e4b8cbe11b707950a9674fdfd09909b6ded2ece608f753b5fb5fb919ae79b6810c97547
data/.gitignore CHANGED
@@ -11,3 +11,4 @@ Gemfile.lock
  *.retry
  .tags*
  *.gem
+ .DS_Store
data/CHANGELOG.md CHANGED
@@ -1,18 +1,28 @@
  # CHANGELOG

+ ## Next
+ * Your contribution here
+
+ ## 1.7.0
+ ### New
+ * Allow passing `data:` to `crawl!` - [glaucocustodio](https://github.com/glaucocustodio)
+
+ ### Fixes
+ * [#4](https://github.com/glaucocustodio/tanakai/pull/4): Fix keyword args on `crawl!` - [milk1000cc](https://github.com/milk1000cc)
+
  ## 1.6.0
  ### New
- * Add support to Ruby 3
+ * Add support to Ruby 3 - [glaucocustodio](https://github.com/glaucocustodio)

  ## 1.5.1
  ### New
- * Add `response_type` to `in_parallel`
+ * Add `response_type` to `in_parallel` - [glaucocustodio](https://github.com/glaucocustodio)

  ## 1.5.0
  ### New
- * First release as Tanakai
- * Add support to [Apparition](https://github.com/twalpole/apparition)
- * Add support to [Cuprite](https://github.com/rubycdp/cuprite)
+ * First release as Tanakai - [glaucocustodio](https://github.com/glaucocustodio)
+ * Add support to [Apparition](https://github.com/twalpole/apparition) - [glaucocustodio](https://github.com/glaucocustodio)
+ * Add support to [Cuprite](https://github.com/rubycdp/cuprite) - [glaucocustodio](https://github.com/glaucocustodio)

  ## 1.4.0
  ### New
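
Reviewer note: both 1.7.0 changelog entries concern how `crawl!` forwards keyword arguments. The sketch below is not Tanakai's code. `MiniSpider` and its methods are hypothetical stand-ins, written only to illustrate why a request hash has to be splatted with `**` under Ruby 3, and how a `data:` value can travel from `crawl!` down to `parse`.

```ruby
# Hypothetical stand-in for a spider class; not part of Tanakai.
class MiniSpider
  # Class-level entry point that forwards `data:` to the instance.
  def self.crawl!(data: {})
    new.request_to(:parse, url: "https://example.com", data: data)
  end

  # Accepts keyword arguments only and dispatches to the named handler.
  def request_to(handler, url:, data: {})
    public_send(handler, url: url, data: data)
  end

  # Returns what it received so the forwarding can be inspected.
  def parse(url:, data: {})
    { url: url, data: data }
  end
end

spider  = MiniSpider.new
request = { url: "https://example.com", data: { foo: "bar" } }

# Ruby 3 no longer converts a trailing positional hash into keywords,
# so passing the hash positionally raises ArgumentError there.
positional_error =
  begin
    spider.request_to(:parse, request)
    nil
  rescue ArgumentError => e
    e
  end

# Splatting the hash with ** passes its pairs as keyword arguments.
splatted = spider.request_to(:parse, **request)

# A `data:` value given to crawl! reaches parse unchanged.
crawled = MiniSpider.crawl!(data: { foo: "bar" })
```

On Ruby 2.7 the positional call is still accepted, so `positional_error` stays `nil` there; the difference only becomes an error from Ruby 3.0 onward.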
data/README.md CHANGED
@@ -1355,6 +1355,12 @@ end # =>
  # {:spider_name=>"example_spider", :status=>:completed, :environment=>"development", :start_time=>2018-08-22 18:49:22 +0400, :stop_time=>2018-08-22 18:49:23 +0400, :running_time=>0.801, :visits=>{:requests=>1, :responses=>1}, :items=>{:sent=>0, :processed=>0}, :error=>nil}
  ```

+ You can also pass `data` to `crawl!`:
+
+ ```ruby
+ ExampleSpider.crawl!(data: { foo: "bar" })
+ ```
+
  So what if you're don't care about stats and just want to process request to a particular spider method and get the returning value from this method? Use `.parse!` instead:

  #### `.parse!(:method_name, url:)` method
data/lib/tanakai/base.rb CHANGED
@@ -100,7 +100,7 @@ module Tanakai
  end
  end

- def self.crawl!(exception_on_fail: true)
+ def self.crawl!(exception_on_fail: true, data: {})
  logger.error "Spider: already running: #{name}" and return false if running?

  @storage = Storage.new
@@ -124,13 +124,13 @@ module Tanakai
  if start_urls
  start_urls.each do |start_url|
  if start_url.class == Hash
- spider.request_to(:parse, start_url)
+ spider.request_to(:parse, url: start_url, data: data)
  else
- spider.request_to(:parse, url: start_url)
+ spider.request_to(:parse, url: start_url, data: data)
  end
  end
  else
- spider.parse
+ spider.parse(data: data)
  end
  rescue StandardError, SignalException, SystemExit => e
  @run_info.merge!(status: :failed, error: e.inspect)
@@ -160,7 +160,7 @@ module Tanakai
  if args.present?
  spider.public_send(handler, *args)
  elsif request.present?
- spider.request_to(handler, request)
+ spider.request_to(handler, **request)
  else
  spider.public_send(handler)
  end
@@ -1,3 +1,3 @@
  module Tanakai
- VERSION = "1.6.0"
+ VERSION = "1.7.0"
  end
metadata CHANGED
@@ -1,7 +1,7 @@
  --- !ruby/object:Gem::Specification
  name: tanakai
  version: !ruby/object:Gem::Version
- version: 1.6.0
+ version: 1.7.0
  platform: ruby
  authors:
  - Victor Afanasev
@@ -9,7 +9,7 @@ authors:
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2023-02-16 00:00:00.000000000 Z
+ date: 2023-10-25 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: thor