docapurl 0.1.0 → 0.2.1
- checksums.yaml +4 -4
- data/CH_README.md +77 -0
- data/Gemfile.lock +5 -5
- data/README.md +58 -3
- data/docapurl.gemspec +2 -2
- data/lib/docapurl.rb +0 -4
- data/lib/docapurl/browser.rb +21 -1
- data/lib/docapurl/version.rb +1 -1
- metadata +11 -10
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: c175e73c80948788100dfec7dab654d13cd0ab9c2f1e1f195cb27be55c858b23
|
4
|
+
data.tar.gz: e8c774fd4072aa0cd047afbc7ec7345e9f9a145717ba3c058342502cae6ff6ba
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 1aa1857f6b4309ca714ddf6eb53c36bfdde2191db93efc07d2e23df4209a507ade2d48a8e0959710dd229c560642261d2c204dad99262b57d5878cbaa9321c65
|
7
|
+
data.tar.gz: 8478697b36cc173f4efb4e02f01ea42e34d3c31300df6e1300599c19d2bf1b36dc159ed3389238780ffa46847ef9dc67c3d3b0a28802a2d15378a9f96b916835
|
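The new `checksums.yaml` entries can be checked against the files they name. A minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted from the `.gem` archive; the helper name `checksums_match?` is hypothetical and not part of RubyGems:

```ruby
require "digest/sha2"

# Hypothetical helper: returns true only when both the SHA256 and SHA512
# hex digests of the file at `path` equal the expected values.
def checksums_match?(path, sha256:, sha512:)
  Digest::SHA256.file(path).hexdigest == sha256 &&
    Digest::SHA512.file(path).hexdigest == sha512
end
```

Called with the path to an extracted `data.tar.gz` and the two hex strings listed under `SHA256:` and `SHA512:`, it should return `true` for an untampered package.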
data/CH_README.md
ADDED
@@ -0,0 +1,77 @@
|
|
1
|
+
# Docapurl
|
2
|
+
A Ruby command-line tool for taking screenshots of web pages.
|
3
|
+
|
4
|
+
## Online web page capture service
|
5
|
+
|
6
|
+
[https://www.urlprint.com](https://www.urlprint.com) provides an online service that captures web pages as images and supports REST API calls. You are welcome to try it.
|
7
|
+
|
8
|
+
## Installation
|
9
|
+
|
10
|
+
gem install docapurl
|
11
|
+
|
12
|
+
|
13
|
+
## Prerequisites
|
14
|
+
|
15
|
+
- Ruby environment
|
16
|
+
- chrome
|
17
|
+
|
18
|
+
The Chrome browser must be installed, and it is recommended to add Chrome's path to the `PATH` or `BROWSER_PATH` environment variable.
|
19
|
+
|
20
|
+
## Usage
|
21
|
+
|
22
|
+
In a terminal:
|
23
|
+
|
24
|
+
`docapurl cap [url] [image_path]`
|
25
|
+
|
26
|
+
Run `docapurl help cap` for more help.
|
27
|
+
|
28
|
+
## Example
|
29
|
+
|
30
|
+
```
|
31
|
+
docapurl cap https://www.bilibili.com 1.jpg --pagedown-to-bottom
|
32
|
+
```
|
33
|
+
|
34
|
+
## FAQ
|
35
|
+
|
36
|
+
- Why does docapurl need Chrome, and where can I download it?
|
37
|
+
|
38
|
+
Because docapurl wraps some of the screenshot features of the ferrum gem, and ferrum depends on headless Chrome.
|
39
|
+
Chrome packages on Linux are generally unreliable, so download the official build here: https://www.chromium.org/getting-involved/download-chromium
|
40
|
+
|
41
|
+
Windows and Mac users can download Chrome here: https://www.google.com/chrome/.
|
42
|
+
|
43
|
+
After Chrome is downloaded and installed, adding the browser path to the `PATH` variable is strongly recommended.
|
44
|
+
|
45
|
+
- Can I skip adding the browser path to the `PATH` environment variable?
|
46
|
+
|
47
|
+
Yes, but you then need to specify the browser path when taking a screenshot: `--browser-path=you-path/to/chrome`
|
48
|
+
|
49
|
+
|
50
|
+
- Can it be used on EC2 or ECS?
|
51
|
+
|
52
|
+
Yes, but you need to install Chrome on the machine first.
|
53
|
+
|
54
|
+
Note that AWS EC2 has no Chinese fonts installed, so you need to install them if you capture Chinese web pages.
|
55
|
+
|
56
|
+
|
57
|
+
- Why are some images near the bottom of the page missing from the screenshot?
|
58
|
+
|
59
|
+
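Because the page may lazy-load its images, loading them only once they are inside the browser viewport. `cap` presses PageDown 5 times by default; to make sure everything loads, use the `--pagedown-to-bottom` option.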
|
60
|
+
|
61
|
+
## Development
|
62
|
+
|
63
|
+
After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
|
64
|
+
|
65
|
+
To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
|
66
|
+
|
67
|
+
## Contributing
|
68
|
+
|
69
|
+
Bug reports and pull requests are welcome on GitHub at https://github.com/[USERNAME]/docapurl. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
|
70
|
+
|
71
|
+
## License
|
72
|
+
|
73
|
+
The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
|
74
|
+
|
75
|
+
## Code of Conduct
|
76
|
+
|
77
|
+
Everyone interacting in the Docapurl project’s codebases, issue trackers, chat rooms and mailing lists is expected to follow the [code of conduct](https://github.com/[USERNAME]/docapurl/blob/master/CODE_OF_CONDUCT.md).
|
data/Gemfile.lock
CHANGED
@@ -1,9 +1,9 @@
|
|
1
1
|
PATH
|
2
2
|
remote: .
|
3
3
|
specs:
|
4
|
-
docapurl (0.1
|
4
|
+
docapurl (0.2.1)
|
5
5
|
ferrum (~> 0.11)
|
6
|
-
thor
|
6
|
+
thor
|
7
7
|
|
8
8
|
GEM
|
9
9
|
remote: https://rubygems.org/
|
@@ -19,7 +19,7 @@ GEM
|
|
19
19
|
concurrent-ruby (~> 1.1)
|
20
20
|
websocket-driver (>= 0.6, < 0.8)
|
21
21
|
public_suffix (4.0.6)
|
22
|
-
rake (
|
22
|
+
rake (13.0.3)
|
23
23
|
rspec (3.10.0)
|
24
24
|
rspec-core (~> 3.10.0)
|
25
25
|
rspec-expectations (~> 3.10.0)
|
@@ -33,7 +33,7 @@ GEM
|
|
33
33
|
diff-lcs (>= 1.2.0, < 2.0)
|
34
34
|
rspec-support (~> 3.10.0)
|
35
35
|
rspec-support (3.10.2)
|
36
|
-
thor (
|
36
|
+
thor (1.1.0)
|
37
37
|
websocket-driver (0.7.3)
|
38
38
|
websocket-extensions (>= 0.1.0)
|
39
39
|
websocket-extensions (0.1.5)
|
@@ -44,7 +44,7 @@ PLATFORMS
|
|
44
44
|
DEPENDENCIES
|
45
45
|
bundler (~> 2.0)
|
46
46
|
docapurl!
|
47
|
-
rake (
|
47
|
+
rake (>= 12.3.3)
|
48
48
|
rspec (~> 3.0)
|
49
49
|
|
50
50
|
BUNDLED WITH
|
data/README.md
CHANGED
@@ -1,8 +1,14 @@
|
|
1
1
|
# Docapurl
|
2
2
|
|
3
|
-
|
3
|
+
A tool for taking screenshots of web pages from the terminal.
|
4
|
+
|
5
|
+
[chinese_readme 中文说明](https://github.com/jicheng1014/docapurl/blob/master/CH_README.md)
|
6
|
+
|
7
|
+
## Capture URL as a Service
|
8
|
+
|
9
|
+
[https://www.urlprint.com](https://www.urlprint.com) provides URL capture as a hosted service, with REST API support.
|
10
|
+
|
4
11
|
|
5
|
-
TODO: Delete this and the text above, and describe your gem
|
6
12
|
|
7
13
|
## Installation
|
8
14
|
|
@@ -19,10 +25,59 @@ And then execute:
|
|
19
25
|
Or install it yourself as:
|
20
26
|
|
21
27
|
$ gem install docapurl
|
28
|
+
## Prerequisites
|
29
|
+
|
30
|
+
The Chrome browser is required.
|
31
|
+
By default, docapurl invokes the Chrome found through the `PATH` or `BROWSER_PATH` environment variable.
|
22
32
|
|
23
33
|
## Usage
|
24
34
|
|
25
|
-
|
35
|
+
In a terminal:
|
36
|
+
|
37
|
+
`docapurl cap [url] [image_path]`
|
38
|
+
|
39
|
+
Run `docapurl help cap` for more details.
|
40
|
+
|
41
|
+
## Example
|
42
|
+
|
43
|
+
```
|
44
|
+
docapurl cap https://www.bilibili.com 1.jpg --pagedown-to-bottom
|
45
|
+
```
|
46
|
+
|
47
|
+
Or, to see more of what happens while the screenshot is taken, turn off headless mode:
|
48
|
+
|
49
|
+
```
|
50
|
+
docapurl cap https://www.bilibili.com 1.jpg --pagedown-to-bottom --no-headless
|
51
|
+
```
|
52
|
+
|
53
|
+
## FAQ
|
54
|
+
|
55
|
+
- Why does docapurl need Chrome, and where can I download it?
|
56
|
+
|
57
|
+
Because docapurl simply wraps functions from the ferrum gem, and ferrum depends on headless Chrome.
|
58
|
+
There is no official Chrome or Chromium package for Linux. Avoid installing a packaged build, because it is either outdated or unofficial. Download it from the official site instead: https://www.chromium.org/getting-involved/download-chromium
|
59
|
+
|
60
|
+
On macOS and Windows, you can download Chrome from https://www.google.com/chrome/.
|
61
|
+
|
62
|
+
|
63
|
+
|
64
|
+
- Can I skip adding the browser path to the `PATH` environment variable?
|
65
|
+
|
66
|
+
Yes, pass the `--browser-path=you-path/to/chrome` option instead.
|
67
|
+
|
68
|
+
|
69
|
+
- Can I use it on AWS EC2 or Aliyun ECS servers?
|
70
|
+
|
71
|
+
Yes, but you must install Chrome on the server first.
|
72
|
+
|
73
|
+
|
74
|
+
- Why are some images on the page missing from the screenshot?
|
75
|
+
|
76
|
+
Because the website may lazy-load its images, so an image loads only once it is inside the browser viewport.
|
77
|
+
docapurl presses the keyboard PageDown key 5 times by default. Pass `--pagedown-to-bottom` to make sure all images are loaded.
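The difference between the default of 5 PageDown presses and `--pagedown-to-bottom` can be illustrated with a small simulation. This is only a sketch of the idea, not docapurl's actual `visit_whole_page` code: `FakePage` and `scroll_page` are hypothetical names, and the page is modelled as a plain height and viewport size instead of a real browser.

```ruby
# Hypothetical stand-in for a browser page: it only tracks the scroll offset.
class FakePage
  attr_reader :scroll_y, :presses

  def initialize(height:, viewport:)
    @height = height
    @viewport = viewport
    @scroll_y = 0
    @presses = 0
  end

  # Simulates one PageDown key press: scroll by one viewport, clamped to the bottom.
  def page_down
    @presses += 1
    @scroll_y = [@scroll_y + @viewport, @height - @viewport].min
  end

  def at_bottom?
    @scroll_y >= @height - @viewport
  end
end

# Presses PageDown a fixed number of times, or keeps going until the bottom.
# Returns the number of presses that were made.
def scroll_page(page, max_pagedown: 5, pagedown_to_bottom: false)
  if pagedown_to_bottom
    page.page_down until page.at_bottom?
  else
    max_pagedown.times { page.page_down }
  end
  page.presses
end
```

On a simulated page ten viewports tall, five presses stop halfway down, so lazily loaded images further down would never enter the viewport. Scrolling to the bottom covers the whole page.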
|
78
|
+
|
79
|
+
|
80
|
+
|
26
81
|
|
27
82
|
## Development
|
28
83
|
|
data/docapurl.gemspec
CHANGED
@@ -29,9 +29,9 @@ Gem::Specification.new do |spec|
|
|
29
29
|
spec.require_paths = ["lib"]
|
30
30
|
|
31
31
|
spec.add_development_dependency "bundler", "~> 2.0"
|
32
|
-
spec.add_development_dependency "rake", "
|
32
|
+
spec.add_development_dependency "rake", ">= 12.3.3"
|
33
33
|
spec.add_development_dependency "rspec", "~> 3.0"
|
34
34
|
spec.add_dependency 'ferrum', '~>0.11'
|
35
|
-
spec.add_dependency "thor"
|
35
|
+
spec.add_dependency "thor"
|
36
36
|
|
37
37
|
end
|
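The version constraints in this hunk can be checked with RubyGems' own `Gem::Requirement` class. A minimal sketch follows; the helper name `allows?` is hypothetical, and the versions tested are the ones that appear in `Gemfile.lock` plus two ferrum versions chosen purely for illustration:

```ruby
require "rubygems"

# Hypothetical helper: does `version` satisfy the requirement string?
def allows?(requirement, version)
  Gem::Requirement.new(requirement).satisfied_by?(Gem::Version.new(version))
end

puts allows?(">= 12.3.3", "13.0.3") # rake: the locked version satisfies the new constraint
puts allows?(">= 0", "1.1.0")       # thor: no constraint in the gemspec is recorded as ">= 0"
puts allows?("~> 0.11", "0.11.6")   # ferrum: illustrative version inside the pessimistic range
puts allows?("~> 0.11", "1.0.0")    # ferrum: illustrative version outside it
```

The first three lines print `true` and the last prints `false`, because `~> 0.11` means at least 0.11 and below 1.0.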
data/lib/docapurl.rb
CHANGED
data/lib/docapurl/browser.rb
CHANGED
@@ -24,17 +24,25 @@ module Docapurl
|
|
24
24
|
options[:path] ||= "screenshot-#{host.to_s == '' ? '' : "#{host}-"}#{Time.now.strftime('%F-%T')}.jpg"
|
25
25
|
logger.info "browser begin to visit url #{url}"
|
26
26
|
|
27
|
+
set_callback("before_visit_func", options)
|
27
28
|
browser.go_to(url)
|
29
|
+
set_callback("after_visit_func", options)
|
30
|
+
|
31
|
+
|
28
32
|
logger.info 'visited'
|
29
33
|
max_pagedown = options[:max_pagedown] || 5
|
30
34
|
pagedown_to_bottom = options.delete :pagedown_to_bottom
|
31
35
|
visit_whole_page(browser, max_pagedown: max_pagedown, pagedown_to_bottom: pagedown_to_bottom)
|
32
36
|
|
33
37
|
sleep_before_screen = options.delete :sleep_before_screen
|
34
|
-
logger.info "sleep #{sleep_before_screen.to_i} second before
|
38
|
+
logger.info "sleep #{sleep_before_screen.to_i} second before screenshot"
|
35
39
|
sleep(sleep_before_screen.to_i)
|
36
40
|
|
41
|
+
|
42
|
+
set_callback("before_screenshot_func", options)
|
37
43
|
browser.screenshot(**options)
|
44
|
+
set_callback("after_screenshot_func", options)
|
45
|
+
|
38
46
|
logger.info "screenshot ended, path = #{options[:path]}"
|
39
47
|
end
|
40
48
|
|
@@ -67,6 +75,18 @@ module Docapurl
|
|
67
75
|
browser.keyboard.type(:Home)
|
68
76
|
end
|
69
77
|
|
78
|
+
private
|
79
|
+
|
80
|
+
def set_callback(name, options)
|
81
|
+
the_function = options.delete name
|
82
|
+
if the_function.nil?
|
83
|
+
the_function = options.delete name.to_sym
|
84
|
+
end
|
85
|
+
|
86
|
+
the_function.call(self) unless the_function.nil?
|
87
|
+
end
|
88
|
+
|
89
|
+
|
70
90
|
class << self
|
71
91
|
def cap(url, path = nil, browser_options = {}, cap_options = {})
|
72
92
|
browser = new(browser_options)
|
data/lib/docapurl/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: docapurl
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.1
|
4
|
+
version: 0.2.1
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- atpking
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date: 2021-
|
11
|
+
date: 2021-05-11 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|
@@ -28,16 +28,16 @@ dependencies:
|
|
28
28
|
name: rake
|
29
29
|
requirement: !ruby/object:Gem::Requirement
|
30
30
|
requirements:
|
31
|
-
- - "
|
31
|
+
- - ">="
|
32
32
|
- !ruby/object:Gem::Version
|
33
|
-
version:
|
33
|
+
version: 12.3.3
|
34
34
|
type: :development
|
35
35
|
prerelease: false
|
36
36
|
version_requirements: !ruby/object:Gem::Requirement
|
37
37
|
requirements:
|
38
|
-
- - "
|
38
|
+
- - ">="
|
39
39
|
- !ruby/object:Gem::Version
|
40
|
-
version:
|
40
|
+
version: 12.3.3
|
41
41
|
- !ruby/object:Gem::Dependency
|
42
42
|
name: rspec
|
43
43
|
requirement: !ruby/object:Gem::Requirement
|
@@ -70,16 +70,16 @@ dependencies:
|
|
70
70
|
name: thor
|
71
71
|
requirement: !ruby/object:Gem::Requirement
|
72
72
|
requirements:
|
73
|
-
- - "
|
73
|
+
- - ">="
|
74
74
|
- !ruby/object:Gem::Version
|
75
|
-
version: '0
|
75
|
+
version: '0'
|
76
76
|
type: :runtime
|
77
77
|
prerelease: false
|
78
78
|
version_requirements: !ruby/object:Gem::Requirement
|
79
79
|
requirements:
|
80
|
-
- - "
|
80
|
+
- - ">="
|
81
81
|
- !ruby/object:Gem::Version
|
82
|
-
version: '0
|
82
|
+
version: '0'
|
83
83
|
description:
|
84
84
|
email:
|
85
85
|
- atpking@gmail.com
|
@@ -91,6 +91,7 @@ files:
|
|
91
91
|
- ".gitignore"
|
92
92
|
- ".rspec"
|
93
93
|
- ".travis.yml"
|
94
|
+
- CH_README.md
|
94
95
|
- CODE_OF_CONDUCT.md
|
95
96
|
- Gemfile
|
96
97
|
- Gemfile.lock
|