kimurai 1.0.0 → 1.0.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
- metadata.gz: d108c41e5da08b22c21cc6c71cc3ac7056ddd1af32054c22a22f0c59658bfcb4
- data.tar.gz: 8a8d32b7b8646eb50bd9f71d8986edc2ac78efc0e2e6a437b3280cff4418c5dd
+ metadata.gz: 7f2185614ca5aa8486c17e0c43b3b035cf22cd18d51617430f556a12af3dc7c8
+ data.tar.gz: 9e5c296feb5d020aa13bcfaa7f6f4c77d839ff373e8aa9d3e0abcc953aaa89de
  SHA512:
- metadata.gz: 4c82647cbe276980ef0a246693c7e68c08651351a549f99fbc6618bc9836c4a4ba83b4d09e1e29d06abcfa0d4f70443fb88682f57c544c0218b22940834a48b1
- data.tar.gz: 845f04c77fbb5e53b24d048e60f23e2c0f9fdeb4d2fde7dcaaa04bebfebc4454777ade03cae895e444583aafb6c8e56038d0d722589fde10076091903646fdf7
+ metadata.gz: 07d92edd8719cbfc701ac7d82975d4c06f5ba9f6adb0bdbbc6731f81655d70d077d140efa54b473b462058042078abab9218f5f00dab244f7478f91c62c8e24b
+ data.tar.gz: 5dc6a70b6379a46c58c917455a7eace96c1093944125888cbc2f9b2af93cf065de0ca00e9d98e786d9a2fbc3a53a1ea3dbf1712e703984d063dfc937ad5e0c71
data/CHANGELOG.md ADDED
@@ -0,0 +1,6 @@
+ # CHANGELOG
+ ## HEAD
+
+ ## 1.0.1
+ * Add missing `logger` method to pipeline
+ * Fix `set_proxy` in Mechanize and Poltergeist builders
data/README.md CHANGED
@@ -6,6 +6,18 @@
  <h1>Kimura Framework</h1>
  </div>

+ > **Note about v1.0.0 version:**
+ > * The code was massively refactored for a [support](#using-kimurai-inside-existing-ruby-application) to run spiders multiple times from inside a single process. Now it's possible to run Kimurai spiders using background jobs like Sidekiq.
+ > * `require 'kimurai'` doesn't require any gems except Active Support. Only when a particular spider [starts](#crawl-method), Capybara will be required with a specific driver.
+ > * Although Kimurai [extends](lib/kimurai/capybara_ext) Capybara (all the magic happens inside [extended](lib/kimurai/capybara_ext/session.rb) `Capybara::Session#visit` method), session instances which were created manually will behave normally.
+ > * No spaghetti code with `case/when/end` blocks anymore. All drivers [were extended](lib/kimurai/capybara_ext) to support unified methods for cookies, proxies, headers, etc.
+ > * `selenium_url_to_set_cookies` @config option don't need anymore if you're use Selenium-like engine with custom cookies setting.
+ > * Small changes in design (check the readme again to see what was changed)
+ > * Stats database with a web dashboard were removed
+ > * Again, massive refactor. Code now looks much better than it was before.
+
+ <br>
+
  Kimurai is a modern web scraping framework written in Ruby which **works out of box with Headless Chromium/Firefox, PhantomJS**, or simple HTTP requests and **allows to scrape and interact with JavaScript rendered websites.**

  Kimurai based on well-known [Capybara](https://github.com/teamcapybara/capybara) and [Nokogiri](https://github.com/sparklemotion/nokogiri) gems, so you don't have to learn anything new. Lets see:
@@ -217,7 +229,6 @@ I, [2018-08-22 13:33:30 +0400#23356] [M: 47375890851320] INFO -- infinite_scrol
  * [Kimurai](#kimurai)
  * [Features](#features)
  * [Table of Contents](#table-of-contents)
- * [Note about v1.0.0 version](#note-about-v1-0-0-version)
  * [Installation](#installation)
  * [Getting to Know](#getting-to-know)
  * [Interactive console](#interactive-console)
@@ -255,12 +266,6 @@ I, [2018-08-22 13:33:30 +0400#23356] [M: 47375890851320] INFO -- infinite_scrol
  * [Chat Support and Feedback](#chat-support-and-feedback)
  * [License](#license)

- ## Note about v1.0.0 version
- * The code was massively refactored for a [support](#using-kimurai-inside-existing-ruby-application) to run spiders multiple times from inside a single process. Now it's possible to run Kimurai spiders using background jobs like Sidekiq.
- * `require 'kimurai'` doesn't require any gems except Active Support. Only when a particular spider [starts](#crawl-method), Capybara will be required with a specific driver.
- * Although Kimurai [extends](lib/kimurai/capybara_ext) Capybara (all the magic happens inside [extended](lib/kimurai/capybara_ext/session.rb) `Capybara::Session#visit` method), session instances which were created manually will behave normally.
- * Small changes in design (check the readme again to see what was changed)
- * Again, massive refactor. Code now looks much better than it was before.

  ## Installation
  Kimurai requires Ruby version `>= 2.5.0`. Supported platforms: `Linux` and `Mac OS X`.
@@ -1604,7 +1609,7 @@ To generate a new spider in the project, run:

  ```bash
  $ kimurai generate spider example_spider
- create crawlers/example_spider.rb
+ create spiders/example_spider.rb
  ```

  Command will generate a new spider class inherited from `ApplicationSpider`:
@@ -37,7 +37,7 @@ module Kimurai
    if type == "socks5"
      logger.error "BrowserBuilder (mechanize): can't set socks5 proxy (not supported), skipped"
    else
-     @browser.set_proxy(*proxy_string.split(":"))
+     @browser.driver.set_proxy(*proxy_string.split(":"))
      logger.debug "BrowserBuilder (mechanize): enabled #{type} proxy, ip: #{ip}, port: #{port}"
    end
  end
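
The hunk above moves the `set_proxy` call from the browser object to `@browser.driver` and passes the parts of a colon-separated proxy string as positional arguments. A minimal standalone sketch of that call shape follows. `DemoDriver`, `DemoSession`, the `set_proxy` signature shown here, and the proxy value are all hypothetical stand-ins for illustration, not Kimurai's or Capybara's actual classes.

```ruby
# Hypothetical driver stand-in: it only records the arguments it receives.
class DemoDriver
  attr_reader :proxy

  def set_proxy(ip, port, type = nil, user = nil, password = nil)
    @proxy = { ip: ip, port: port, type: type, user: user, password: password }
  end
end

# Hypothetical session stand-in: it exposes a driver but has no set_proxy itself.
DemoSession = Struct.new(:driver)

browser = DemoSession.new(DemoDriver.new)
proxy_string = "127.0.0.1:8080:http" # made-up example value

# Same call shape as the changed line: split on ":" and splat the parts
# into positional arguments on the driver, not on the session.
browser.driver.set_proxy(*proxy_string.split(":"))
```

In this sketch every part arrives as a string (the port is `"8080"`, not an integer), and calling `set_proxy` on the session stand-in directly would raise `NoMethodError`, which is the kind of failure the change to `@browser.driver` would avoid.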
@@ -84,7 +84,7 @@ module Kimurai

  # restart_if
  if @config.dig(:browser, :restart_if).present?
-   logger.error "BrowserBuilder (mechanize): `browser restart_if` options not supported by Mechanize, skipped"
+   logger.warn "BrowserBuilder (mechanize): `browser restart_if` options not supported by Mechanize, skipped"
  end

  # before_request clear_cookies
@@ -59,7 +59,7 @@ module Kimurai
    proxy_string = (proxy.class == Proc ? proxy.call : proxy).strip
    ip, port, type = proxy_string.split(":")

-   @browser.set_proxy(*proxy_string.split(":"))
+   @browser.driver.set_proxy(*proxy_string.split(":"))
    logger.debug "BrowserBuilder (poltergeist_phantomjs): enabled #{type} proxy, ip: #{ip}, port: #{port}"
  end

@@ -21,5 +21,9 @@ module Kimurai
      def save_to(path, item, format:, position: true)
        spider.save_to(path, item, format: format, position: position)
      end
+
+     def logger
+       spider.logger
+     end
    end
  end
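
The hunk above adds a `logger` method that forwards to `spider.logger`, next to the existing `save_to` forwarder. A minimal standalone sketch of that delegation pattern follows. `DemoSpider`, `DemoPipeline`, and `process_item` as written here are hypothetical stand-ins for illustration, not Kimurai's real classes.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for a spider that owns a logger.
class DemoSpider
  attr_reader :logger, :log_output

  def initialize
    @log_output = StringIO.new # capture log lines in memory for the demo
    @logger = Logger.new(@log_output)
  end
end

# Hypothetical stand-in for a pipeline that is attached to a spider.
class DemoPipeline
  attr_accessor :spider

  # Forward to the spider's logger, mirroring the method added in the diff.
  def logger
    spider.logger
  end

  def process_item(item)
    logger.info("processing #{item[:title]}")
    item
  end
end

spider = DemoSpider.new
pipeline = DemoPipeline.new
pipeline.spider = spider
pipeline.process_item(title: "example")
```

With the forwarder in place, code inside the pipeline can call `logger` directly, and its messages go to the same logger object the spider uses.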
@@ -1,3 +1,3 @@
  module Kimurai
-   VERSION = "1.0.0"
+   VERSION = "1.0.1"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: kimurai
  version: !ruby/object:Gem::Version
- version: 1.0.0
+ version: 1.0.1
  platform: ruby
  authors:
  - Victor Afanasev
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2018-08-23 00:00:00.000000000 Z
+ date: 2018-08-27 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: thor
@@ -264,6 +264,7 @@ extra_rdoc_files: []
  files:
  - ".gitignore"
  - ".travis.yml"
+ - CHANGELOG.md
  - CODE_OF_CONDUCT.md
  - Gemfile
  - LICENSE.txt