docapurl 0.1.0 → 0.2.1

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 99de94810d60e3d3c5e934c7dff632f7b6d23c841d02d1359590191f091e6c16
4
- data.tar.gz: 46c12c213e11dce773bc5982cb7327ec985830893007f8b6e4277cf681455611
3
+ metadata.gz: c175e73c80948788100dfec7dab654d13cd0ab9c2f1e1f195cb27be55c858b23
4
+ data.tar.gz: e8c774fd4072aa0cd047afbc7ec7345e9f9a145717ba3c058342502cae6ff6ba
5
5
  SHA512:
6
- metadata.gz: 3058e5f01ee00295ad7d5c9442799e2a41d65507cb8cc189e674f21447d524b5ad14b99f215fab12d9924baf09f2c035c81d7cc9c5ea9318d40425399448bc93
7
- data.tar.gz: e031c7ab870d9c5cfb53ef1c90231c4a1b7250b32fc1507a4a4fbfc0ca07d6d4c6f69debc94379f921ba68e4c67b749eddfb09e1caed3770f38a3411a5bfc040
6
+ metadata.gz: 1aa1857f6b4309ca714ddf6eb53c36bfdde2191db93efc07d2e23df4209a507ade2d48a8e0959710dd229c560642261d2c204dad99262b57d5878cbaa9321c65
7
+ data.tar.gz: 8478697b36cc173f4efb4e02f01ea42e34d3c31300df6e1300599c19d2bf1b36dc159ed3389238780ffa46847ef9dc67c3d3b0a28802a2d15378a9f96b916835
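The digests in `checksums.yaml` can be used to check that a downloaded archive matches the published one. The following is a minimal sketch, assuming the archive has already been extracted to a local file; the helper name `sha256_matches?` is hypothetical, and the demonstration runs on a throwaway file rather than on the real `data.tar.gz`.

```ruby
require "digest"
require "tempfile"

# Hypothetical helper (not part of docapurl or RubyGems): returns true when
# the file's SHA256 hex digest equals the value listed in checksums.yaml.
def sha256_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex.downcase
end

# Demonstration on a temporary file standing in for data.tar.gz.
sample = Tempfile.new("data.tar.gz")
sample.write("example payload")
sample.flush

expected = Digest::SHA256.hexdigest("example payload")
puts sha256_matches?(sample.path, expected)  # digest of the same bytes
puts sha256_matches?(sample.path, "0" * 64)  # deliberately wrong digest
sample.close!
```

For a real check, the second argument would be the `data.tar.gz` value under `SHA256:` for the version being installed.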
data/CH_README.md ADDED
@@ -0,0 +1,77 @@
1
+ # Docapurl
2
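+ A Ruby command-line tool for taking screenshots of web pages.
3
+
4
+ ## Online web page capture service
5
+
6
+ [https://www.urlprint.com](https://www.urlprint.com) provides an online service that captures web pages as images and supports REST API calls. Everyone is welcome to try it.
7
+
8
+ ## Installation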
9
+
10
+ gem install docapurl
11
+
12
+
13
+ ## Prerequisites
14
+
15
+ - Ruby environment
16
+ - Chrome
17
+
18
+ The Chrome browser must be installed; adding the Chrome path to the `PATH` or `BROWSER_PATH` environment variable is recommended.
19
+
20
+ ## Usage
21
+
22
+ On the terminal:
23
+
24
+ `docapurl cap [url] [image_path]`
25
+
26
+ Use `docapurl help cap` for more help.
27
+
28
+ ## Example
29
+
30
+ ```
31
+ docapurl cap https://www.bilibili.com 1.jpg --pagedown-to-bottom
32
+ ```
33
+
34
+ ## FAQ
35
+
36
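+ - Why does docapurl need Chrome, and where can I download it?
37
+
38
+ Because docapurl wraps some of the screenshot features of the ferrum gem, and ferrum depends on headless Chrome.
39
+ Chrome packages on Linux are generally not very reliable, so download the official build here: https://www.chromium.org/getting-involved/download-chromium
40
+
41
+ Windows and Mac users can download it from https://www.google.com/chrome/.
42
+
43
+ After installing Chrome, adding the browser path to the `PATH` variable is strongly recommended.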
44
+
45
+ - Can I skip adding the browser path to the `PATH` environment variable?
46
+
47
+ Yes, but you then need to specify the browser path when taking a screenshot: `--browser-path=you-path/to/chrome`
48
+
49
+
50
+ - Can I use it on EC2 or ECS?
51
+
52
+ Yes, but Chrome must be installed on the machine first.
53
+
54
+ Note: AWS EC2 has no Chinese fonts, so install Chinese fonts if you are capturing Chinese web pages.
55
+
56
+
57
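+ - Why are some images near the bottom of the page missing from the screenshot?
58
+
59
+ Because the page may lazy-load its images, so an image loads only once it is inside the browser viewport. `cap` presses PageDown 5 times by default; to make sure everything loads, use the `--pagedown-to-bottom` option.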
60
+
61
+ ## Development
62
+
63
+ After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
64
+
65
+ To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
66
+
67
+ ## Contributing
68
+
69
+ Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/docapurl. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
70
+
71
+ ## License
72
+
73
+ The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
74
+
75
+ ## Code of Conduct
76
+
77
+ Everyone interacting in the Docapurl project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/docapurl/blob/master/CODE_OF_CONDUCT.md).
data/Gemfile.lock CHANGED
@@ -1,9 +1,9 @@
1
1
  PATH
2
2
  remote: .
3
3
  specs:
4
- docapurl (0.1.0)
4
+ docapurl (0.2.1)
5
5
  ferrum (~> 0.11)
6
- thor (~> 0.20)
6
+ thor
7
7
 
8
8
  GEM
9
9
  remote: https://rubygems.org/
@@ -19,7 +19,7 @@ GEM
19
19
  concurrent-ruby (~> 1.1)
20
20
  websocket-driver (>= 0.6, < 0.8)
21
21
  public_suffix (4.0.6)
22
- rake (10.5.0)
22
+ rake (13.0.3)
23
23
  rspec (3.10.0)
24
24
  rspec-core (~> 3.10.0)
25
25
  rspec-expectations (~> 3.10.0)
@@ -33,7 +33,7 @@ GEM
33
33
  diff-lcs (>= 1.2.0, < 2.0)
34
34
  rspec-support (~> 3.10.0)
35
35
  rspec-support (3.10.2)
36
- thor (0.20.3)
36
+ thor (1.1.0)
37
37
  websocket-driver (0.7.3)
38
38
  websocket-extensions (>= 0.1.0)
39
39
  websocket-extensions (0.1.5)
@@ -44,7 +44,7 @@ PLATFORMS
44
44
  DEPENDENCIES
45
45
  bundler (~> 2.0)
46
46
  docapurl!
47
- rake (~> 10.0)
47
+ rake (>= 12.3.3)
48
48
  rspec (~> 3.0)
49
49
 
50
50
  BUNDLED WITH
data/README.md CHANGED
@@ -1,8 +1,14 @@
1
1
  # Docapurl
2
2
 
3
- Welcome to your new gem! In this directory, you'll find the files you need to be able to package up your Ruby library into a gem. Put your Ruby code in the file `lib/docapurl`. To experiment with that code, run `bin/console` for an interactive prompt.
3
+ A tool for taking screenshots of web pages from the terminal.
4
+
5
+ [chinese_readme 中文说明](https://github.com/jicheng1014/docapurl/blob/master/CH_README.md)
6
+
7
+ ## Capture URL as a Service
8
+
9
+ [https://www.urlprint.com](https://www.urlprint.com) provides a URL capture service with REST API support.
10
+
4
11
 
5
- TODO: Delete this and the text above, and describe your gem
6
12
 
7
13
  ## Installation
8
14
 
@@ -19,10 +25,59 @@ And then execute:
19
25
  Or install it yourself as:
20
26
 
21
27
  $ gem install docapurl
28
+ ## Prerequisites
29
+
30
+ The Chrome browser is required.
31
+ By default, docapurl invokes the Chrome found via the `PATH` or `BROWSER_PATH` environment variable.
22
32
 
23
33
  ## Usage
24
34
 
25
- TODO: Write usage instructions here
35
+ On the terminal:
36
+
37
+ `docapurl cap [url] [image_path]`
38
+
39
+ Use `docapurl help cap` for more details.
40
+
41
+ ## Example
42
+
43
+ ```
44
+ docapurl cap https://www.bilibili.com 1.jpg --pagedown-to-bottom
45
+ ```
46
+
47
+ Or, to see in more detail what happens during the screenshot, turn off headless mode:
48
+
49
+ ```
50
+ docapurl cap https://www.bilibili.com 1.jpg --pagedown-to-bottom --no-headless
51
+ ```
52
+
53
+ ## FAQ
54
+
55
+ - Why does docapurl need Chrome, and where can I download it?
56
+
57
+ Because docapurl just wraps functions from the ferrum gem, and ferrum depends on headless Chrome.
58
+ There is no official Chrome or Chromium package for Linux. Don't install it from a package, because it is either outdated or unofficial, and both are bad. Download it from the official site: https://www.chromium.org/getting-involved/download-chromium
59
+
60
+ For Mac and Windows, you can download Chrome from https://www.google.com/chrome/.
61
+
62
+
63
+
64
+ - Can I skip adding the browser path to the `PATH` environment variable?
65
+
66
+ Yes, use the option `--browser-path=you-path/to/chrome`.
67
+
68
+
69
+ - Can I use it on AWS EC2 or Aliyun ECS servers?
70
+
71
+ Yes, but you should install Chrome on the server first.
72
+
73
+
74
+ - Why are some images on the page missing from the screenshot?
75
+
76
+ Because the website may lazy-load its images, so an image loads only once it is inside the browser viewport.
77
+ docapurl presses the keyboard PageDown key 5 times by default. Use the option `--pagedown-to-bottom` to ensure that all images load.
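The README says the browser is found via `PATH` or `BROWSER_PATH`, or passed explicitly with `--browser-path`. The sketch below shows one plausible lookup order for those three sources. It is an illustration only, not code from docapurl or ferrum: the helper `resolve_browser_path`, the precedence order, and the list of candidate executable names are all assumptions.

```ruby
# Assumed executable names to look for on PATH (illustrative, not exhaustive).
CANDIDATE_NAMES = %w[chrome google-chrome chromium chromium-browser].freeze

# Hypothetical helper: explicit option first, then BROWSER_PATH, then a PATH search.
# `env` is injectable so the lookup can be exercised without touching the real ENV.
def resolve_browser_path(option: nil, env: ENV)
  return option if option && !option.empty?

  from_env = env["BROWSER_PATH"]
  return from_env if from_env && !from_env.empty?

  env.fetch("PATH", "").split(File::PATH_SEPARATOR).each do |dir|
    CANDIDATE_NAMES.each do |name|
      candidate = File.join(dir, name)
      return candidate if File.file?(candidate) && File.executable?(candidate)
    end
  end

  nil # nothing found; the caller would have to report a missing browser
end

puts resolve_browser_path(option: "/opt/chrome/chrome", env: { "BROWSER_PATH" => "/usr/bin/chromium" })
```

Under these assumptions, skipping the `PATH` setup simply means that the first branch (the explicit option) is the one that supplies the path.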
78
+
79
+
80
+
26
81
 
27
82
  ## Development
28
83
 
data/docapurl.gemspec CHANGED
@@ -29,9 +29,9 @@ Gem::Specification.new do |spec|
29
29
  spec.require_paths = ["lib"]
30
30
 
31
31
  spec.add_development_dependency "bundler", "~> 2.0"
32
- spec.add_development_dependency "rake", "~> 10.0"
32
+ spec.add_development_dependency "rake", ">= 12.3.3"
33
33
  spec.add_development_dependency "rspec", "~> 3.0"
34
34
  spec.add_dependency 'ferrum', '~>0.11'
35
- spec.add_dependency "thor", "~> 0.20"
35
+ spec.add_dependency "thor"
36
36
 
37
37
  end
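The effect of the relaxed constraints in the gemspec can be checked with RubyGems' own `Gem::Requirement` class. This is a small sketch; the helper name `allowed?` is hypothetical, while the constraint strings and version numbers are the ones that appear in this diff.

```ruby
require "rubygems"

# Hypothetical helper: does `version` satisfy the requirement string `constraint`?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# thor: the old pessimistic constraint vs. the new unconstrained dependency,
# which the gem metadata records as ">= 0".
puts allowed?("~> 0.20", "0.20.3")   # true
puts allowed?("~> 0.20", "1.1.0")    # false
puts allowed?(">= 0", "1.1.0")       # true

# rake: the old and new development constraints against the locked 13.0.3.
puts allowed?("~> 10.0", "13.0.3")   # false
puts allowed?(">= 12.3.3", "13.0.3") # true
```

Dropping the upper bound is what lets the lockfile move to thor 1.1.0 and rake 13.0.3; the trade-off is that a future incompatible major release would also satisfy the requirement.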
data/lib/docapurl.rb CHANGED
@@ -1,9 +1,5 @@
1
1
  require_relative './docapurl/version'
2
2
  require_relative './docapurl/browser'
3
- begin
4
- require 'byebug'
5
- rescue
6
- end
7
3
  require 'thor'
8
4
 
9
5
  module Docapurl
@@ -24,17 +24,25 @@ module Docapurl
24
24
  options[:path] ||= "screenshot-#{host.to_s == '' ? '' : "#{host}-"}#{Time.now.strftime('%F-%T')}.jpg"
25
25
  logger.info "browser begin to visit url #{url}"
26
26
 
27
+ set_callback("before_visit_func", options)
27
28
  browser.go_to(url)
29
+ set_callback("after_visit_func", options)
30
+
31
+
28
32
  logger.info 'visited'
29
33
  max_pagedown = options[:max_pagedown] || 5
30
34
  pagedown_to_bottom = options.delete :pagedown_to_bottom
31
35
  visit_whole_page(browser, max_pagedown: max_pagedown, pagedown_to_bottom: pagedown_to_bottom)
32
36
 
33
37
  sleep_before_screen = options.delete :sleep_before_screen
34
- logger.info "sleep #{sleep_before_screen.to_i} second before screen"
38
+ logger.info "sleep #{sleep_before_screen.to_i} second before screenshot"
35
39
  sleep(sleep_before_screen.to_i)
36
40
 
41
+
42
+ set_callback("before_screenshot_func", options)
37
43
  browser.screenshot(**options)
44
+ set_callback("after_screenshot_func", options)
45
+
38
46
  logger.info "screenshot ended, path = #{options[:path]}"
39
47
  end
40
48
 
@@ -67,6 +75,18 @@ module Docapurl
67
75
  browser.keyboard.type(:Home)
68
76
  end
69
77
 
78
+ private
79
+
80
+ def set_callback(name, options)
81
+ the_function = options.delete name
82
+ if the_function.nil?
83
+ the_function = options.delete name.to_sym
84
+ end
85
+
86
+ the_function.call(self) unless the_function.nil?
87
+ end
88
+
89
+
70
90
  class << self
71
91
  def cap(url, path = nil, browser_options = {}, cap_options = {})
72
92
  browser = new(browser_options)
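The four hooks added in this hunk (`before_visit_func`, `after_visit_func`, `before_screenshot_func`, `after_screenshot_func`) are looked up in the options hash under a string key first and a symbol key second, removed from the hash, and called with the browser object. The sketch below reproduces only that plumbing in a stand-in class so that it runs without Chrome or ferrum. `HookDemo`, `run_hook`, `events`, and `screenshot_option_keys` are hypothetical names; the real class and its screenshot call are not modelled.

```ruby
# Stand-in for the real browser wrapper; only the hook plumbing is modelled.
class HookDemo
  attr_reader :events, :screenshot_option_keys

  def initialize
    @events = []
  end

  # Same ordering as the diff: hooks fire around the visit and around the screenshot.
  def cap(url, options = {})
    run_hook("before_visit_func", options)
    @events << "visit #{url}"
    run_hook("after_visit_func", options)

    run_hook("before_screenshot_func", options)
    @screenshot_option_keys = options.keys # what a screenshot call would receive
    @events << "screenshot"
    run_hook("after_screenshot_func", options)
    @events
  end

  private

  # Same lookup as set_callback in the diff: string key first, then symbol key.
  def run_hook(name, options)
    hook = options.delete(name)
    hook = options.delete(name.to_sym) if hook.nil?
    hook&.call(self)
  end
end

demo = HookDemo.new
demo.cap("https://example.com", {
  before_visit_func: ->(b) { b.events << "before visit" },
  "after_screenshot_func" => ->(b) { b.events << "after screenshot" },
  path: "1.jpg"
})
puts demo.events
p demo.screenshot_option_keys
```

One thing the sketch makes visible: because each hook is removed from the hash only when it runs, an `after_screenshot_func` entry is still present in the options at the moment the screenshot is taken. In the diff this means it is still in `options` when `browser.screenshot(**options)` is called; how ferrum treats such an extra key is not shown here.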
@@ -1,3 +1,3 @@
1
1
  module Docapurl
2
- VERSION = "0.1.0"
2
+ VERSION = "0.2.1"
3
3
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: docapurl
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.1.0
4
+ version: 0.2.1
5
5
  platform: ruby
6
6
  authors:
7
7
  - atpking
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2021-04-24 00:00:00.000000000 Z
11
+ date: 2021-05-11 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -28,16 +28,16 @@ dependencies:
28
28
  name: rake
29
29
  requirement: !ruby/object:Gem::Requirement
30
30
  requirements:
31
- - - "~>"
31
+ - - ">="
32
32
  - !ruby/object:Gem::Version
33
- version: '10.0'
33
+ version: 12.3.3
34
34
  type: :development
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
- - - "~>"
38
+ - - ">="
39
39
  - !ruby/object:Gem::Version
40
- version: '10.0'
40
+ version: 12.3.3
41
41
  - !ruby/object:Gem::Dependency
42
42
  name: rspec
43
43
  requirement: !ruby/object:Gem::Requirement
@@ -70,16 +70,16 @@ dependencies:
70
70
  name: thor
71
71
  requirement: !ruby/object:Gem::Requirement
72
72
  requirements:
73
- - - "~>"
73
+ - - ">="
74
74
  - !ruby/object:Gem::Version
75
- version: '0.20'
75
+ version: '0'
76
76
  type: :runtime
77
77
  prerelease: false
78
78
  version_requirements: !ruby/object:Gem::Requirement
79
79
  requirements:
80
- - - "~>"
80
+ - - ">="
81
81
  - !ruby/object:Gem::Version
82
- version: '0.20'
82
+ version: '0'
83
83
  description:
84
84
  email:
85
85
  - atpking@gmail.com
@@ -91,6 +91,7 @@ files:
91
91
  - ".gitignore"
92
92
  - ".rspec"
93
93
  - ".travis.yml"
94
+ - CH_README.md
94
95
  - CODE_OF_CONDUCT.md
95
96
  - Gemfile
96
97
  - Gemfile.lock