parklife 0.2.0 → 0.4.0
- checksums.yaml +4 -4
- data/.github/workflows/tests.yml +7 -2
- data/CHANGELOG.md +17 -0
- data/README.md +40 -5
- data/examples/rack/.gitignore +1 -0
- data/examples/rack/Parkfile +1 -3
- data/examples/rails/.gitignore +2 -0
- data/examples/rails/Gemfile +1 -1
- data/examples/rails/app/assets/images/.keep +0 -0
- data/examples/rails/parklife-build +1 -0
- data/examples/sinatra/.gitignore +1 -0
- data/examples/sinatra/Parkfile +1 -3
- data/lib/parklife/application.rb +11 -3
- data/lib/parklife/browser.rb +22 -0
- data/lib/parklife/cli.rb +18 -12
- data/lib/parklife/config.rb +15 -2
- data/lib/parklife/crawler.rb +26 -44
- data/lib/parklife/errors.rb +12 -5
- data/lib/parklife/rails.rb +1 -6
- data/lib/parklife/utils.rb +37 -6
- data/lib/parklife/version.rb +1 -1
- data/parklife.gemspec +1 -1
- metadata +6 -7
- data/examples/rack/Gemfile.lock +0 -47
- data/examples/rails/Gemfile.lock +0 -150
- data/examples/sinatra/Gemfile.lock +0 -56
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 32db274b9f5ec6ce7c56203a391c30c2d541bb8f319bc1b4130b22a2ed85bcf3
|
4
|
+
data.tar.gz: dce6ff9a4911863acb4fdb19c134f822d729a4ced0df62799fc69853154e412f
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 434a2acf5bd27046330a6f4c954dcf7908e3672db1e9ce08e985b833f299273306b4535115093f8f18517210605da098e347bcd03a3481106e3c35958da5df5a
|
7
|
+
data.tar.gz: 43249f1f24c7f932da0fd3d949b9d2590274ae7360d86a4f70f268c2a566dde1cf946aa6ba665a89c9bd5bbb5d5b6ee3ef714edbb7f4acbd34000045ac72dff5
|
data/.github/workflows/tests.yml
CHANGED
@@ -15,6 +15,8 @@ jobs:
|
|
15
15
|
working-directory: examples/rack
|
16
16
|
- run: bundle exec parklife build
|
17
17
|
working-directory: examples/rack
|
18
|
+
- run: test -f build/index.html
|
19
|
+
working-directory: examples/rack
|
18
20
|
|
19
21
|
example_rails:
|
20
22
|
runs-on: ubuntu-latest
|
@@ -28,8 +30,8 @@ jobs:
|
|
28
30
|
working-directory: examples/rails
|
29
31
|
- run: ./parklife-build
|
30
32
|
working-directory: examples/rails
|
31
|
-
|
32
|
-
|
33
|
+
- run: test -f build/index.html
|
34
|
+
working-directory: examples/rails
|
33
35
|
|
34
36
|
example_sinatra:
|
35
37
|
runs-on: ubuntu-latest
|
@@ -43,6 +45,8 @@ jobs:
|
|
43
45
|
working-directory: examples/sinatra
|
44
46
|
- run: bundle exec parklife build
|
45
47
|
working-directory: examples/sinatra
|
48
|
+
- run: test -f build/index.html
|
49
|
+
working-directory: examples/sinatra
|
46
50
|
|
47
51
|
rspec:
|
48
52
|
runs-on: ubuntu-latest
|
@@ -52,6 +56,7 @@ jobs:
|
|
52
56
|
- '2.7'
|
53
57
|
- '3.0'
|
54
58
|
- '3.1'
|
59
|
+
- '3.2'
|
55
60
|
name: Ruby ${{ matrix.ruby }} RSpec
|
56
61
|
steps:
|
57
62
|
- uses: actions/checkout@v3
|
data/CHANGELOG.md
CHANGED
@@ -1,3 +1,20 @@
|
|
1
|
+
## Version 0.4.0 - 2023-03-01
|
2
|
+
|
3
|
+
- Add a `parklife --version` command.
|
4
|
+
- No need to `require parklife` from the Parkfile.
|
5
|
+
|
6
|
+
## Version 0.3.0 - 2023-02-26
|
7
|
+
|
8
|
+
- Allow overriding `config.base` from the CLI build command with the `--base` option.
|
9
|
+
- Support mounting the app at a path.
|
10
|
+
- Remove Capybara and use Rack::Test directly.
|
11
|
+
- Rename `config.rack_app` to `config.app`.
|
12
|
+
- Don't save the response when `on_404=:skip`.
|
13
|
+
- More accurate progress dots.
|
14
|
+
- Default `build_dir` to `build`.
|
15
|
+
- Fix build paths when `build_dir` isn't a full path.
|
16
|
+
- Ignore pathless links - for instance #fragments and mailto.
|
17
|
+
|
1
18
|
## Version 0.2.0 - 2023-02-21
|
2
19
|
|
3
20
|
- First official version hosted on [RubyGems.org](https://rubygems.org/gems/parklife).
|
data/README.md
CHANGED
@@ -4,6 +4,14 @@
|
|
4
4
|
|
5
5
|
[Parklife](https://github.com/benpickles/parklife) is a Ruby library to render a Rack app (Rails/Sinatra/etc) to a static site so it can be served by [Netlify](https://www.netlify.com), [Now](https://zeit.co/now), [GitHub Pages](https://pages.github.com), S3, or another static server.
|
6
6
|
|
7
|
+
## Installation
|
8
|
+
|
9
|
+
Add Parklife to your application's Gemfile and run bundle install.
|
10
|
+
|
11
|
+
```ruby
|
12
|
+
gem 'parklife'
|
13
|
+
```
|
14
|
+
|
7
15
|
## How to use Parklife with Rails
|
8
16
|
|
9
17
|
Parklife is configured with a file called `Parkfile` in the root of your project, here's an example `Parkfile` for an imaginary Rails app:
|
@@ -27,6 +35,8 @@ Parkfile.application.routes do
|
|
27
35
|
|
28
36
|
# A couple more hidden pages.
|
29
37
|
get easter_egg_path, crawl: true
|
38
|
+
|
39
|
+
# Services typically allow a custom 404 page.
|
30
40
|
get '404.html'
|
31
41
|
end
|
32
42
|
```
|
@@ -42,7 +52,23 @@ $ bundle exec parklife routes
|
|
42
52
|
/404.html
|
43
53
|
```
|
44
54
|
|
45
|
-
Now you can run `parklife build` which will fetch all the routes and save them to the `build` directory ready to be served as a static site.
|
55
|
+
Now you can run `parklife build` which will fetch all the routes and save them to the `build` directory ready to be served as a static site. Inspecting the build directory might look like this:
|
56
|
+
|
57
|
+
```
|
58
|
+
$ find build -type f
|
59
|
+
build/404.html
|
60
|
+
build/about/index.html
|
61
|
+
build/blog/index.html
|
62
|
+
build/blog/2019/03/07/developers-developers-developers/index.html
|
63
|
+
build/blog/2019/04/21/modern-life-is-rubbish/index.html
|
64
|
+
build/blog/2019/05/15/introducing-parklife/index.html
|
65
|
+
build/easter_egg/index.html
|
66
|
+
build/easter_egg/surprise/index.html
|
67
|
+
build/index.html
|
68
|
+
build/location/index.html
|
69
|
+
build/feed.atom
|
70
|
+
build/sitemap.xml
|
71
|
+
```
|
46
72
|
|
47
73
|
Parklife doesn't know about assets (images, CSS, etc) so you likely also need to generate those and copy them to the build directory, see the [Rails example's full build script](examples/rails/parklife-build) for how you might do this.
|
48
74
|
|
@@ -60,11 +86,17 @@ Sometimes you need to point to a link's full URL - maybe for a feed or a social
|
|
60
86
|
Parklife.application.config.base = 'https://foo.example.com'
|
61
87
|
```
|
62
88
|
|
89
|
+
The base URL can also be passed at build-time which will override the Parkfile setting:
|
90
|
+
|
91
|
+
```
|
92
|
+
$ bundle exec parklife build --base https://benpickles.github.io/parklife
|
93
|
+
```
|
94
|
+
|
63
95
|
### Dealing with trailing slashes <small>(turning off nested `index.html`)</small>
|
64
96
|
|
65
97
|
By default Parklife stores files in an `index.html` file nested in directory with the same name as the path - so the route `/my/nested/route` is stored in `/my/nested/route/index.html`. This is to make sure links within the app work without modification making it easier for any static server to host the build.
|
66
98
|
|
67
|
-
However, it's possible to turn this off so that `/my/nested/route` is stored in `/my/nested/route.html`. This allows you to serve trailing slash-less URLs by using [
|
99
|
+
However, it's possible to turn this off so that `/my/nested/route` is stored in `/my/nested/route.html`. This allows you to serve trailing slash-less URLs with GitHub Pages or with Netlify by using their [Pretty URLs feature](https://www.netlify.com/docs/redirects/#trailing-slash) or with some custom nginx config.
|
68
100
|
|
69
101
|
```ruby
|
70
102
|
Parklife.application.config.nested_index = false
|
@@ -72,7 +104,7 @@ Parklife.application.config.nested_index = false
|
|
72
104
|
|
73
105
|
### Changing the build output directory
|
74
106
|
|
75
|
-
The build directory shouldn't exist and is destroyed and recreated before each build.
|
107
|
+
The build directory shouldn't exist and is destroyed and recreated before each build. Defaults to `build`.
|
76
108
|
|
77
109
|
```ruby
|
78
110
|
Parklife.application.config.build_dir = 'my/build/dir'
|
@@ -80,7 +112,10 @@ Parklife.application.config.build_dir = 'my/build/dir'
|
|
80
112
|
|
81
113
|
### Handling a 404
|
82
114
|
|
83
|
-
By default if Parklife encounters a 404 response when fetching a route it will raise an exception (the `:error` setting)
|
115
|
+
By default if Parklife encounters a 404 response when fetching a route it will raise an exception (the `:error` setting) and stop the build. Other values are:
|
116
|
+
|
117
|
+
- `:warn` - output a message to `stderr`, save the response, and continue processing.
|
118
|
+
- `:skip` - silently ignore and not save the response, and continue processing.
|
84
119
|
|
85
120
|
```ruby
|
86
121
|
Parklife.application.config.on_404 = :warn
|
@@ -91,7 +126,7 @@ Parklife.application.config.on_404 = :warn
|
|
91
126
|
If you're not using the Rails configuration you'll need to define this yourself, see the [examples](examples).
|
92
127
|
|
93
128
|
```ruby
|
94
|
-
Parklife.application.config.
|
129
|
+
Parklife.application.config.app
|
95
130
|
```
|
96
131
|
|
97
132
|
## License
|
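The three `on_404` settings described in the README hunk above can be sketched as a small decision function. This is an illustration only: `handle_404` and its `:save`/`:skip` return values are hypothetical names that do not exist in Parklife, and the real crawler raises `Parklife::HTTPError` rather than a plain `RuntimeError`.

```ruby
# Hypothetical helper mirroring the documented on_404 behaviour.
# Returns :save when the response should still be written to the build
# directory and :skip when it should be ignored. Any other setting
# (including the default :error) raises and so stops the build.
def handle_404(path, on_404, err: $stderr)
  message = %Q(404 response from path "#{path}")

  case on_404
  when :warn
    err.puts message # report the problem but keep the response
    :save
  when :skip
    :skip            # say nothing and do not save the response
  else
    raise message    # :error stops processing
  end
end

handle_404('/missing', :skip) # => :skip
```

Treating every unrecognised value as `:error` matches the `else` branch in the new crawler code, so a mistyped setting fails loudly instead of being silently ignored.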
data/examples/rack/.gitignore
CHANGED
data/examples/rack/Parkfile
CHANGED
@@ -1,4 +1,3 @@
|
|
1
|
-
require 'parklife'
|
2
1
|
require 'rack'
|
3
2
|
|
4
3
|
app = Proc.new { |env|
|
@@ -12,8 +11,7 @@ app = Proc.new { |env|
|
|
12
11
|
}
|
13
12
|
|
14
13
|
Parklife.application.configure do |config|
|
15
|
-
config.
|
16
|
-
config.rack_app = app
|
14
|
+
config.app = app
|
17
15
|
end
|
18
16
|
|
19
17
|
Parklife.application.routes do
|
data/examples/rails/.gitignore
CHANGED
data/examples/rails/Gemfile
CHANGED
File without changes
|
data/examples/sinatra/.gitignore
CHANGED
data/examples/sinatra/Parkfile
CHANGED
@@ -1,4 +1,3 @@
|
|
1
|
-
require 'parklife'
|
2
1
|
require 'sinatra'
|
3
2
|
|
4
3
|
get '/' do
|
@@ -10,8 +9,7 @@ get '/hello/:name' do
|
|
10
9
|
end
|
11
10
|
|
12
11
|
Parklife.application.configure do |config|
|
13
|
-
config.
|
14
|
-
config.rack_app = Sinatra::Application
|
12
|
+
config.app = Sinatra::Application
|
15
13
|
end
|
16
14
|
|
17
15
|
Parklife.application.routes do
|
data/lib/parklife/application.rb
CHANGED
@@ -6,17 +6,16 @@ require 'parklife/route_set'
|
|
6
6
|
|
7
7
|
module Parklife
|
8
8
|
class Application
|
9
|
-
attr_reader :config
|
9
|
+
attr_reader :config
|
10
10
|
|
11
11
|
def initialize
|
12
12
|
@config = Config.new
|
13
13
|
@route_set = RouteSet.new
|
14
|
-
@crawler = Crawler.new(config, @route_set)
|
15
14
|
end
|
16
15
|
|
17
16
|
def build
|
18
17
|
raise BuildDirNotDefinedError if config.build_dir.nil?
|
19
|
-
raise RackAppNotDefinedError if config.
|
18
|
+
raise RackAppNotDefinedError if config.app.nil?
|
20
19
|
|
21
20
|
FileUtils.rm_rf(config.build_dir)
|
22
21
|
Dir.mkdir(config.build_dir)
|
@@ -28,6 +27,15 @@ module Parklife
|
|
28
27
|
yield config
|
29
28
|
end
|
30
29
|
|
30
|
+
def crawler
|
31
|
+
@crawler ||= Crawler.new(config, @route_set)
|
32
|
+
end
|
33
|
+
|
34
|
+
def load_Parkfile(path)
|
35
|
+
raise ParkfileLoadError.new(path) unless File.exist?(path)
|
36
|
+
load path
|
37
|
+
end
|
38
|
+
|
31
39
|
def routes(&block)
|
32
40
|
if block_given?
|
33
41
|
@route_set.instance_eval(&block)
|
data/lib/parklife/browser.rb
ADDED
@@ -0,0 +1,22 @@
|
|
1
|
+
require 'rack/test'
|
2
|
+
|
3
|
+
module Parklife
|
4
|
+
class Browser
|
5
|
+
attr_reader :app, :base, :env, :session
|
6
|
+
|
7
|
+
def initialize(app, base)
|
8
|
+
@app = app
|
9
|
+
@base = base
|
10
|
+
@env = {
|
11
|
+
'HTTP_HOST' => base.host,
|
12
|
+
'HTTPS' => base.scheme == 'https' ? 'on' : 'off',
|
13
|
+
script_name: base.path.chomp('/'),
|
14
|
+
}
|
15
|
+
@session = Rack::Test::Session.new(app)
|
16
|
+
end
|
17
|
+
|
18
|
+
def get(path)
|
19
|
+
session.get(path, nil, env)
|
20
|
+
end
|
21
|
+
end
|
22
|
+
end
|
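To see what the new `Browser` sends with each request, the environment hash can be derived on its own from a base URL. The sketch below uses only the standard library; `env_for_base` is a hypothetical helper name, and the real class passes this hash to `Rack::Test::Session#get` instead of printing it.

```ruby
require 'uri'

# Hypothetical helper: build the same request environment that the Browser
# constructor in the diff derives from config.base.
def env_for_base(base_url)
  base = URI.parse(base_url)

  {
    'HTTP_HOST' => base.host,
    'HTTPS' => base.scheme == 'https' ? 'on' : 'off',
    # The mount path without a trailing slash ('' when mounted at the root).
    script_name: base.path.chomp('/'),
  }
end

env = env_for_base('https://benpickles.github.io/parklife/')
puts env['HTTP_HOST']  # benpickles.github.io
puts env['HTTPS']      # on
puts env[:script_name] # /parklife
```

The symbol-keyed `script_name` entry appears to be consumed as an option by Rack's mock request builder, which is how the app learns it is mounted at a path.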
data/lib/parklife/cli.rb
CHANGED
@@ -1,9 +1,14 @@
|
|
1
|
+
require 'parklife'
|
1
2
|
require 'thor'
|
2
3
|
|
3
4
|
module Parklife
|
4
5
|
class CLI < Thor
|
5
6
|
desc 'build', 'create a production build'
|
7
|
+
option :base, desc: 'set config.base at build-time - overrides the Parkfile setting'
|
6
8
|
def build
|
9
|
+
# Parkfile config overrides.
|
10
|
+
application.config.base = options[:base] if options[:base]
|
11
|
+
|
7
12
|
application.build
|
8
13
|
end
|
9
14
|
|
@@ -16,21 +21,22 @@ module Parklife
|
|
16
21
|
end
|
17
22
|
end
|
18
23
|
|
24
|
+
map '--version' => :version
|
25
|
+
desc 'version', 'output the current version of Parklife'
|
26
|
+
def version
|
27
|
+
puts Parklife::VERSION
|
28
|
+
end
|
29
|
+
|
19
30
|
private
|
20
31
|
def application
|
21
|
-
@application ||=
|
22
|
-
#
|
23
|
-
|
24
|
-
# Parklife::Application is defined.
|
25
|
-
load discover_Parkfile(Dir.pwd)
|
26
|
-
|
27
|
-
Parklife.application.config.reporter = $stdout
|
28
|
-
Parklife.application
|
29
|
-
end
|
30
|
-
end
|
32
|
+
@application ||= Parklife.application.tap { |app|
|
33
|
+
# Default output to stdout (can be overridden in the Parkfile).
|
34
|
+
app.config.reporter = $stdout
|
31
35
|
|
32
|
-
|
33
|
-
|
36
|
+
# Reach inside the consuming app's directory to apply its Parklife
|
37
|
+
# config.
|
38
|
+
app.load_Parkfile(File.join(Dir.pwd, 'Parkfile'))
|
39
|
+
}
|
34
40
|
end
|
35
41
|
end
|
36
42
|
end
|
data/lib/parklife/config.rb
CHANGED
@@ -1,14 +1,27 @@
|
|
1
1
|
require 'stringio'
|
2
|
+
require 'uri'
|
2
3
|
|
3
4
|
module Parklife
|
4
5
|
class Config
|
5
|
-
|
6
|
-
|
6
|
+
DEFAULT_HOST = 'example.com'
|
7
|
+
DEFAULT_SCHEME = 'http'
|
8
|
+
|
9
|
+
attr_accessor :app, :build_dir, :nested_index, :on_404, :reporter
|
10
|
+
attr_reader :base
|
7
11
|
|
8
12
|
def initialize
|
13
|
+
self.base = nil
|
14
|
+
self.build_dir = 'build'
|
9
15
|
self.nested_index = true
|
10
16
|
self.on_404 = :error
|
11
17
|
self.reporter = StringIO.new
|
12
18
|
end
|
19
|
+
|
20
|
+
def base=(value)
|
21
|
+
uri = URI.parse(value || '')
|
22
|
+
uri.host ||= DEFAULT_HOST
|
23
|
+
uri.scheme ||= DEFAULT_SCHEME
|
24
|
+
@base = uri
|
25
|
+
end
|
13
26
|
end
|
14
27
|
end
|
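The new `base=` setter always stores a `URI`, filling in a host and scheme when they are missing. A standalone sketch of that normalisation follows; `normalize_base` is a hypothetical name used here so the logic can run outside `Parklife::Config`.

```ruby
require 'uri'

DEFAULT_HOST = 'example.com'
DEFAULT_SCHEME = 'http'

# Same steps as Config#base= in the diff, written as a plain function.
def normalize_base(value)
  uri = URI.parse(value || '')
  uri.host ||= DEFAULT_HOST     # only set when the value had no host
  uri.scheme ||= DEFAULT_SCHEME # only set when the value had no scheme
  uri
end

puts normalize_base(nil)                       # http://example.com
puts normalize_base('/parklife')               # http://example.com/parklife
puts normalize_base('https://foo.example.com') # https://foo.example.com
```

Because the result is always a `URI`, callers such as the crawler can read `base.path` without checking for `nil` first.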
data/lib/parklife/crawler.rb
CHANGED
@@ -1,32 +1,25 @@
|
|
1
|
-
require '
|
2
|
-
require 'nokogiri'
|
1
|
+
require 'parklife/browser'
|
3
2
|
require 'parklife/route'
|
4
3
|
require 'parklife/utils'
|
5
4
|
require 'set'
|
6
5
|
|
7
6
|
module Parklife
|
8
7
|
class Crawler
|
9
|
-
attr_reader :config, :route_set
|
8
|
+
attr_reader :browser, :config, :route_set
|
10
9
|
|
11
10
|
def initialize(config, route_set)
|
12
11
|
@config = config
|
13
12
|
@route_set = route_set
|
14
|
-
|
15
|
-
Capybara.register_driver :parklife do |app|
|
16
|
-
Capybara::RackTest::Driver.new(app, follow_redirects: false)
|
17
|
-
end
|
13
|
+
@browser = Browser.new(config.app, config.base)
|
18
14
|
end
|
19
15
|
|
20
16
|
def start
|
21
|
-
Capybara.app_host = config.base if config.base
|
22
|
-
Capybara.save_path = config.build_dir
|
23
|
-
|
24
17
|
@routes = route_set.to_a
|
25
18
|
@visited = Set.new
|
26
19
|
|
27
20
|
while route = @routes.shift
|
28
|
-
process_route(route)
|
29
|
-
config.reporter.print
|
21
|
+
processed = process_route(route)
|
22
|
+
config.reporter.print('.') if processed
|
30
23
|
end
|
31
24
|
|
32
25
|
config.reporter.puts
|
@@ -45,60 +38,49 @@ module Parklife
|
|
45
38
|
@visited.include?(route) || @visited.include?(crawled_route)
|
46
39
|
end
|
47
40
|
|
48
|
-
return if already_processed
|
41
|
+
return false if already_processed
|
49
42
|
|
50
|
-
|
43
|
+
response = browser.get(route.path)
|
51
44
|
|
52
|
-
case
|
45
|
+
case response.status
|
53
46
|
when 200
|
54
47
|
# Continue processing the route.
|
55
48
|
when 404
|
56
49
|
case config.on_404
|
57
|
-
when :error
|
58
|
-
raise HTTPError.new(path: route.path, status: 404)
|
59
50
|
when :warn
|
60
|
-
$stderr.puts HTTPError.new(
|
51
|
+
$stderr.puts HTTPError.new(404, route.path).message
|
52
|
+
when :skip
|
53
|
+
return false
|
54
|
+
else
|
55
|
+
raise HTTPError.new(404, route.path)
|
61
56
|
end
|
62
57
|
else
|
63
|
-
raise HTTPError.new(
|
58
|
+
raise HTTPError.new(response.status, route.path)
|
64
59
|
end
|
65
60
|
|
66
|
-
|
67
|
-
Utils::build_path_for(
|
68
|
-
dir: config.build_dir,
|
69
|
-
path: route.path,
|
70
|
-
index: config.nested_index,
|
71
|
-
)
|
72
|
-
)
|
61
|
+
Utils.save_page(route.path, response.body, config)
|
73
62
|
|
74
63
|
@visited << route
|
75
64
|
|
76
65
|
if route.crawl
|
77
|
-
scan_for_links(
|
66
|
+
Utils.scan_for_links(response.body) do |path|
|
67
|
+
# When an app is mounted at a path it responds to URLs that must
|
68
|
+
# exclude the mount path but it generates links that include it (if
|
69
|
+
# it is correctly configured). This prefix must therefore be
|
70
|
+
# stripped from links discovered via crawling.
|
71
|
+
baseless_path = path.delete_prefix(config.base.path)
|
72
|
+
|
73
|
+
route = Route.new(baseless_path, crawl: true)
|
74
|
+
|
78
75
|
# Don't revisit the route if it has already been visited with
|
79
|
-
# crawl=true but do revisit if it wasn't crawled
|
80
|
-
# will always have crawl=true).
|
76
|
+
# crawl=true but do revisit if it wasn't crawled.
|
81
77
|
next if @visited.include?(route)
|
82
78
|
|
83
79
|
@routes << route
|
84
80
|
end
|
85
81
|
end
|
86
|
-
end
|
87
|
-
|
88
|
-
def scan_for_links(html)
|
89
|
-
doc = Nokogiri::HTML.parse(html)
|
90
|
-
doc.css('a').each do |a|
|
91
|
-
uri = URI.parse(a[:href])
|
92
|
-
|
93
|
-
# Don't visit a page that belongs to a different domain.
|
94
|
-
next if uri.host
|
95
|
-
|
96
|
-
yield Route.new(uri.path, crawl: true)
|
97
|
-
end
|
98
|
-
end
|
99
82
|
|
100
|
-
|
101
|
-
@session ||= Capybara::Session.new(:parklife, config.rack_app)
|
83
|
+
true
|
102
84
|
end
|
103
85
|
end
|
104
86
|
end
|
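The path a crawled link takes from `href` to route can be sketched with the standard library alone: first the filtering that `Utils.scan_for_links` performs in this release, then the mount-path stripping added to the crawler. `crawlable_path` is a hypothetical helper that combines both steps for illustration; Parklife itself keeps them in separate places.

```ruby
require 'uri'

# Hypothetical helper: returns the path to crawl for a link, or nil when the
# link should be ignored.
def crawlable_path(href, base_path)
  uri = URI.parse(href)

  # A link with a host is assumed to point at another site.
  return nil if uri.host

  # Path-less links (a #fragment or a mailto: address) have nothing to fetch.
  return nil if uri.path.nil? || uri.path.empty?

  # The app generates links that include its mount path but responds to
  # requests that exclude it, so the prefix is removed.
  uri.path.delete_prefix(base_path)
end

p crawlable_path('/parklife/about', '/parklife')       # "/about"
p crawlable_path('/about', '')                         # "/about"
p crawlable_path('#top', '/parklife')                  # nil
p crawlable_path('mailto:hi@example.com', '/parklife') # nil
p crawlable_path('https://other.example.com/x', '')    # nil
```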
data/lib/parklife/errors.rb
CHANGED
@@ -4,17 +4,24 @@ module Parklife
|
|
4
4
|
RackAppNotDefinedError = Class.new(Error)
|
5
5
|
|
6
6
|
class HTTPError < Error
|
7
|
-
def initialize(path
|
8
|
-
@path = path
|
7
|
+
def initialize(status, path)
|
9
8
|
@status = status
|
9
|
+
@path = path
|
10
10
|
end
|
11
11
|
|
12
12
|
def message
|
13
|
-
%Q(#{status} response from path "#{path}")
|
13
|
+
%Q(#{@status} response from path "#{@path}")
|
14
|
+
end
|
15
|
+
end
|
16
|
+
|
17
|
+
class ParkfileLoadError < Error
|
18
|
+
def initialize(path)
|
19
|
+
@path = path
|
14
20
|
end
|
15
21
|
|
16
|
-
|
17
|
-
|
22
|
+
def message
|
23
|
+
%Q(Cannot load Parkfile "#{@path}")
|
24
|
+
end
|
18
25
|
end
|
19
26
|
|
20
27
|
class RailsNotDefinedError < Error
|
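`HTTPError` now takes positional arguments (`status`, then `path`) where 0.2.0 used keywords, which matters to any code that constructs it directly. The sketch below reproduces the new class inside a hypothetical `Sketch` module so it runs without the gem; the `Error` base class definition is an assumption, since the hunk does not show it.

```ruby
module Sketch
  # Assumed base class; the diff only shows that HTTPError inherits from Error.
  Error = Class.new(StandardError)

  class HTTPError < Error
    def initialize(status, path)
      @status = status
      @path = path
    end

    def message
      %Q(#{@status} response from path "#{@path}")
    end
  end
end

begin
  raise Sketch::HTTPError.new(404, '/missing')
rescue Sketch::Error => e
  puts e.message # 404 response from path "/missing"
end
```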
data/lib/parklife/rails.rb
CHANGED
@@ -1,12 +1,7 @@
|
|
1
|
-
require 'parklife/errors'
|
2
|
-
|
3
1
|
raise Parklife::RailsNotDefinedError unless defined?(Rails)
|
4
2
|
|
5
|
-
require 'parklife'
|
6
|
-
|
7
3
|
# Allow use of the consuming Rails application's route helpers from within the
|
8
4
|
# block when defining Parklife routes.
|
9
5
|
Parklife::RouteSet.include(Rails.application.routes.url_helpers)
|
10
6
|
|
11
|
-
Parklife.application.config.
|
12
|
-
Parklife.application.config.rack_app = Rails.application
|
7
|
+
Parklife.application.config.app = Rails.application
|
data/lib/parklife/utils.rb
CHANGED
@@ -1,19 +1,50 @@
|
|
1
|
+
require 'fileutils'
|
2
|
+
require 'nokogiri'
|
3
|
+
|
1
4
|
module Parklife
|
2
5
|
module Utils
|
3
6
|
extend self
|
4
7
|
|
5
|
-
def build_path_for(
|
8
|
+
def build_path_for(path, index: true)
|
6
9
|
path = path.gsub(/^\/|\/$/, '')
|
7
10
|
|
8
11
|
if File.extname(path).empty?
|
9
|
-
if
|
10
|
-
|
12
|
+
if path.empty?
|
13
|
+
'index.html'
|
14
|
+
elsif index
|
15
|
+
File.join(path, 'index.html')
|
11
16
|
else
|
12
|
-
|
13
|
-
File.join(dir, name)
|
17
|
+
"#{path}.html"
|
14
18
|
end
|
15
19
|
else
|
16
|
-
|
20
|
+
path
|
21
|
+
end
|
22
|
+
end
|
23
|
+
|
24
|
+
def save_page(path, content, config)
|
25
|
+
build_path = File.join(
|
26
|
+
config.build_dir,
|
27
|
+
build_path_for(path, index: config.nested_index)
|
28
|
+
)
|
29
|
+
FileUtils.mkdir_p(File.dirname(build_path))
|
30
|
+
File.write(build_path, content)
|
31
|
+
end
|
32
|
+
|
33
|
+
def scan_for_links(html)
|
34
|
+
doc = Nokogiri::HTML.parse(html)
|
35
|
+
doc.css('a').each do |a|
|
36
|
+
uri = URI.parse(a[:href])
|
37
|
+
|
38
|
+
# Don't visit a URL that belongs to a different domain - for now this is
|
39
|
+
# a guess that it's not an internal link but it also covers mailto/ftp
|
40
|
+
# links.
|
41
|
+
next if uri.host
|
42
|
+
|
43
|
+
# Don't visit a path-less URL - this will be the case for a #fragment
|
44
|
+
# for example.
|
45
|
+
next if uri.path.nil? || uri.path.empty?
|
46
|
+
|
47
|
+
yield uri.path
|
17
48
|
end
|
18
49
|
end
|
19
50
|
end
|
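The reworked `build_path_for` now returns a path relative to the build directory, and `save_page` joins it with `config.build_dir`. Copying the new method body out of the diff into a standalone function (outside the `Parklife::Utils` module, purely for illustration) shows how routes map to files:

```ruby
# Standalone copy of the new Utils#build_path_for from the diff above.
def build_path_for(path, index: true)
  # Drop a leading and/or trailing slash.
  path = path.gsub(/^\/|\/$/, '')

  if File.extname(path).empty?
    if path.empty?
      'index.html'                  # the root route
    elsif index
      File.join(path, 'index.html') # nested index (the default)
    else
      "#{path}.html"                # nested_index = false
    end
  else
    path                            # already has an extension, keep as-is
  end
end

puts build_path_for('/')                              # index.html
puts build_path_for('/my/nested/route')               # my/nested/route/index.html
puts build_path_for('/my/nested/route', index: false) # my/nested/route.html
puts build_path_for('/feed.atom')                     # feed.atom
```

The explicit `path.empty?` branch means the root route is always written to `index.html`, whichever `nested_index` setting is in use.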
data/lib/parklife/version.rb
CHANGED
data/parklife.gemspec
CHANGED
@@ -28,8 +28,8 @@ Gem::Specification.new do |spec|
|
|
28
28
|
spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
|
29
29
|
spec.require_paths = ['lib']
|
30
30
|
|
31
|
-
spec.add_dependency 'capybara'
|
32
31
|
spec.add_dependency 'nokogiri'
|
32
|
+
spec.add_dependency 'rack-test'
|
33
33
|
spec.add_dependency 'thor'
|
34
34
|
|
35
35
|
spec.add_development_dependency 'bundler'
|
metadata
CHANGED
@@ -1,17 +1,17 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: parklife
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.4.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- Ben Pickles
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date: 2023-
|
11
|
+
date: 2023-03-01 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
|
-
name:
|
14
|
+
name: nokogiri
|
15
15
|
requirement: !ruby/object:Gem::Requirement
|
16
16
|
requirements:
|
17
17
|
- - ">="
|
@@ -25,7 +25,7 @@ dependencies:
|
|
25
25
|
- !ruby/object:Gem::Version
|
26
26
|
version: '0'
|
27
27
|
- !ruby/object:Gem::Dependency
|
28
|
-
name:
|
28
|
+
name: rack-test
|
29
29
|
requirement: !ruby/object:Gem::Requirement
|
30
30
|
requirements:
|
31
31
|
- - ">="
|
@@ -115,14 +115,13 @@ files:
|
|
115
115
|
- bin/setup
|
116
116
|
- examples/rack/.gitignore
|
117
117
|
- examples/rack/Gemfile
|
118
|
-
- examples/rack/Gemfile.lock
|
119
118
|
- examples/rack/Parkfile
|
120
119
|
- examples/rails/.gitignore
|
121
120
|
- examples/rails/Gemfile
|
122
|
-
- examples/rails/Gemfile.lock
|
123
121
|
- examples/rails/Parkfile
|
124
122
|
- examples/rails/Rakefile
|
125
123
|
- examples/rails/app/assets/config/manifest.js
|
124
|
+
- examples/rails/app/assets/images/.keep
|
126
125
|
- examples/rails/app/assets/stylesheets/application.css
|
127
126
|
- examples/rails/app/assets/stylesheets/global.css
|
128
127
|
- examples/rails/app/controllers/application_controller.rb
|
@@ -174,11 +173,11 @@ files:
|
|
174
173
|
- examples/rails/public/robots.txt
|
175
174
|
- examples/sinatra/.gitignore
|
176
175
|
- examples/sinatra/Gemfile
|
177
|
-
- examples/sinatra/Gemfile.lock
|
178
176
|
- examples/sinatra/Parkfile
|
179
177
|
- exe/parklife
|
180
178
|
- lib/parklife.rb
|
181
179
|
- lib/parklife/application.rb
|
180
|
+
- lib/parklife/browser.rb
|
182
181
|
- lib/parklife/cli.rb
|
183
182
|
- lib/parklife/config.rb
|
184
183
|
- lib/parklife/crawler.rb
|
data/examples/rack/Gemfile.lock
DELETED
@@ -1,47 +0,0 @@
|
|
1
|
-
PATH
|
2
|
-
remote: ../..
|
3
|
-
specs:
|
4
|
-
parklife (0.1.0)
|
5
|
-
capybara
|
6
|
-
nokogiri
|
7
|
-
thor
|
8
|
-
|
9
|
-
GEM
|
10
|
-
remote: https://rubygems.org/
|
11
|
-
specs:
|
12
|
-
addressable (2.8.1)
|
13
|
-
public_suffix (>= 2.0.2, < 6.0)
|
14
|
-
capybara (3.38.0)
|
15
|
-
addressable
|
16
|
-
matrix
|
17
|
-
mini_mime (>= 0.1.3)
|
18
|
-
nokogiri (~> 1.8)
|
19
|
-
rack (>= 1.6.0)
|
20
|
-
rack-test (>= 0.6.3)
|
21
|
-
regexp_parser (>= 1.5, < 3.0)
|
22
|
-
xpath (~> 3.2)
|
23
|
-
matrix (0.4.2)
|
24
|
-
mini_mime (1.1.2)
|
25
|
-
mini_portile2 (2.8.1)
|
26
|
-
nokogiri (1.14.2)
|
27
|
-
mini_portile2 (~> 2.8.0)
|
28
|
-
racc (~> 1.4)
|
29
|
-
public_suffix (5.0.1)
|
30
|
-
racc (1.6.2)
|
31
|
-
rack (2.2.3)
|
32
|
-
rack-test (2.0.2)
|
33
|
-
rack (>= 1.3)
|
34
|
-
regexp_parser (2.7.0)
|
35
|
-
thor (1.2.1)
|
36
|
-
xpath (3.2.0)
|
37
|
-
nokogiri (~> 1.8)
|
38
|
-
|
39
|
-
PLATFORMS
|
40
|
-
ruby
|
41
|
-
|
42
|
-
DEPENDENCIES
|
43
|
-
parklife!
|
44
|
-
rack
|
45
|
-
|
46
|
-
BUNDLED WITH
|
47
|
-
1.17.2
|
data/examples/rails/Gemfile.lock
DELETED
@@ -1,150 +0,0 @@
|
|
1
|
-
PATH
|
2
|
-
remote: ../..
|
3
|
-
specs:
|
4
|
-
parklife (0.1.0)
|
5
|
-
capybara
|
6
|
-
nokogiri
|
7
|
-
thor
|
8
|
-
|
9
|
-
GEM
|
10
|
-
remote: https://rubygems.org/
|
11
|
-
specs:
|
12
|
-
actioncable (5.2.3)
|
13
|
-
actionpack (= 5.2.3)
|
14
|
-
nio4r (~> 2.0)
|
15
|
-
websocket-driver (>= 0.6.1)
|
16
|
-
actionmailer (5.2.3)
|
17
|
-
actionpack (= 5.2.3)
|
18
|
-
actionview (= 5.2.3)
|
19
|
-
activejob (= 5.2.3)
|
20
|
-
mail (~> 2.5, >= 2.5.4)
|
21
|
-
rails-dom-testing (~> 2.0)
|
22
|
-
actionpack (5.2.3)
|
23
|
-
actionview (= 5.2.3)
|
24
|
-
activesupport (= 5.2.3)
|
25
|
-
rack (~> 2.0)
|
26
|
-
rack-test (>= 0.6.3)
|
27
|
-
rails-dom-testing (~> 2.0)
|
28
|
-
rails-html-sanitizer (~> 1.0, >= 1.0.2)
|
29
|
-
actionview (5.2.3)
|
30
|
-
activesupport (= 5.2.3)
|
31
|
-
builder (~> 3.1)
|
32
|
-
erubi (~> 1.4)
|
33
|
-
rails-dom-testing (~> 2.0)
|
34
|
-
rails-html-sanitizer (~> 1.0, >= 1.0.3)
|
35
|
-
activejob (5.2.3)
|
36
|
-
activesupport (= 5.2.3)
|
37
|
-
globalid (>= 0.3.6)
|
38
|
-
activemodel (5.2.3)
|
39
|
-
activesupport (= 5.2.3)
|
40
|
-
activerecord (5.2.3)
|
41
|
-
activemodel (= 5.2.3)
|
42
|
-
activesupport (= 5.2.3)
|
43
|
-
arel (>= 9.0)
|
44
|
-
activestorage (5.2.3)
|
45
|
-
actionpack (= 5.2.3)
|
46
|
-
activerecord (= 5.2.3)
|
47
|
-
marcel (~> 0.3.1)
|
48
|
-
activesupport (5.2.3)
|
49
|
-
concurrent-ruby (~> 1.0, >= 1.0.2)
|
50
|
-
i18n (>= 0.7, < 2)
|
51
|
-
minitest (~> 5.1)
|
52
|
-
tzinfo (~> 1.1)
|
53
|
-
addressable (2.8.1)
|
54
|
-
public_suffix (>= 2.0.2, < 6.0)
|
55
|
-
arel (9.0.0)
|
56
|
-
builder (3.2.3)
|
57
|
-
capybara (3.38.0)
|
58
|
-
addressable
|
59
|
-
matrix
|
60
|
-
mini_mime (>= 0.1.3)
|
61
|
-
nokogiri (~> 1.8)
|
62
|
-
rack (>= 1.6.0)
|
63
|
-
rack-test (>= 0.6.3)
|
64
|
-
regexp_parser (>= 1.5, < 3.0)
|
65
|
-
xpath (~> 3.2)
|
66
|
-
concurrent-ruby (1.1.5)
|
67
|
-
crass (1.0.5)
|
68
|
-
erubi (1.8.0)
|
69
|
-
globalid (0.4.2)
|
70
|
-
activesupport (>= 4.2.0)
|
71
|
-
i18n (1.6.0)
|
72
|
-
concurrent-ruby (~> 1.0)
|
73
|
-
loofah (2.3.1)
|
74
|
-
crass (~> 1.0.2)
|
75
|
-
nokogiri (>= 1.5.9)
|
76
|
-
mail (2.7.1)
|
77
|
-
mini_mime (>= 0.1.1)
|
78
|
-
marcel (0.3.3)
|
79
|
-
mimemagic (~> 0.3.2)
|
80
|
-
matrix (0.4.2)
|
81
|
-
method_source (0.9.2)
|
82
|
-
mimemagic (0.3.10)
|
83
|
-
nokogiri (~> 1)
|
84
|
-
rake
|
85
|
-
mini_mime (1.0.1)
|
86
|
-
mini_portile2 (2.5.1)
|
87
|
-
minitest (5.11.3)
|
88
|
-
nio4r (2.3.1)
|
89
|
-
nokogiri (1.11.5)
|
90
|
-
mini_portile2 (~> 2.5.0)
|
91
|
-
racc (~> 1.4)
|
92
|
-
public_suffix (5.0.1)
|
93
|
-
racc (1.5.2)
|
94
|
-
rack (2.2.3)
|
95
|
-
rack-test (1.1.0)
|
96
|
-
rack (>= 1.0, < 3)
|
97
|
-
rails (5.2.3)
|
98
|
-
actioncable (= 5.2.3)
|
99
|
-
actionmailer (= 5.2.3)
|
100
|
-
actionpack (= 5.2.3)
|
101
|
-
actionview (= 5.2.3)
|
102
|
-
activejob (= 5.2.3)
|
103
|
-
activemodel (= 5.2.3)
|
104
|
-
activerecord (= 5.2.3)
|
105
|
-
activestorage (= 5.2.3)
|
106
|
-
activesupport (= 5.2.3)
|
107
|
-
bundler (>= 1.3.0)
|
108
|
-
railties (= 5.2.3)
|
109
|
-
sprockets-rails (>= 2.0.0)
|
110
|
-
rails-dom-testing (2.0.3)
|
111
|
-
activesupport (>= 4.2.0)
|
112
|
-
nokogiri (>= 1.6)
|
113
|
-
rails-html-sanitizer (1.0.4)
|
114
|
-
loofah (~> 2.2, >= 2.2.2)
|
115
|
-
railties (5.2.3)
|
116
|
-
actionpack (= 5.2.3)
|
117
|
-
activesupport (= 5.2.3)
|
118
|
-
method_source
|
119
|
-
rake (>= 0.8.7)
|
120
|
-
thor (>= 0.19.0, < 2.0)
|
121
|
-
rake (13.0.1)
|
122
|
-
regexp_parser (2.7.0)
|
123
|
-
sprockets (3.7.2)
|
124
|
-
concurrent-ruby (~> 1.0)
|
125
|
-
rack (> 1, < 3)
|
126
|
-
sprockets-rails (3.2.1)
|
127
|
-
actionpack (>= 4.0)
|
128
|
-
activesupport (>= 4.0)
|
129
|
-
sprockets (>= 3.0.0)
|
130
|
-
sqlite3 (1.4.1)
|
131
|
-
thor (0.20.3)
|
132
|
-
thread_safe (0.3.6)
|
133
|
-
tzinfo (1.2.5)
|
134
|
-
thread_safe (~> 0.1)
|
135
|
-
websocket-driver (0.7.0)
|
136
|
-
websocket-extensions (>= 0.1.0)
|
137
|
-
websocket-extensions (0.1.5)
|
138
|
-
xpath (3.2.0)
|
139
|
-
nokogiri (~> 1.8)
|
140
|
-
|
141
|
-
PLATFORMS
|
142
|
-
ruby
|
143
|
-
|
144
|
-
DEPENDENCIES
|
145
|
-
parklife!
|
146
|
-
rails
|
147
|
-
sqlite3
|
148
|
-
|
149
|
-
BUNDLED WITH
|
150
|
-
1.17.2
|
data/examples/sinatra/Gemfile.lock
DELETED
@@ -1,56 +0,0 @@
|
|
1
|
-
PATH
|
2
|
-
remote: ../..
|
3
|
-
specs:
|
4
|
-
parklife (0.1.0)
|
5
|
-
capybara
|
6
|
-
nokogiri
|
7
|
-
thor
|
8
|
-
|
9
|
-
GEM
|
10
|
-
remote: https://rubygems.org/
|
11
|
-
specs:
|
12
|
-
addressable (2.8.1)
|
13
|
-
public_suffix (>= 2.0.2, < 6.0)
|
14
|
-
capybara (3.38.0)
|
15
|
-
addressable
|
16
|
-
matrix
|
17
|
-
mini_mime (>= 0.1.3)
|
18
|
-
nokogiri (~> 1.8)
|
19
|
-
rack (>= 1.6.0)
|
20
|
-
rack-test (>= 0.6.3)
|
21
|
-
regexp_parser (>= 1.5, < 3.0)
|
22
|
-
xpath (~> 3.2)
|
23
|
-
matrix (0.4.2)
|
24
|
-
mini_mime (1.1.2)
|
25
|
-
mini_portile2 (2.8.1)
|
26
|
-
mustermann (1.0.3)
|
27
|
-
nokogiri (1.14.2)
|
28
|
-
mini_portile2 (~> 2.8.0)
|
29
|
-
racc (~> 1.4)
|
30
|
-
public_suffix (5.0.1)
|
31
|
-
racc (1.6.2)
|
32
|
-
rack (2.2.3)
|
33
|
-
rack-protection (2.0.5)
|
34
|
-
rack
|
35
|
-
rack-test (2.0.2)
|
36
|
-
rack (>= 1.3)
|
37
|
-
regexp_parser (2.7.0)
|
38
|
-
sinatra (2.0.5)
|
39
|
-
mustermann (~> 1.0)
|
40
|
-
rack (~> 2.0)
|
41
|
-
rack-protection (= 2.0.5)
|
42
|
-
tilt (~> 2.0)
|
43
|
-
thor (1.2.1)
|
44
|
-
tilt (2.0.9)
|
45
|
-
xpath (3.2.0)
|
46
|
-
nokogiri (~> 1.8)
|
47
|
-
|
48
|
-
PLATFORMS
|
49
|
-
ruby
|
50
|
-
|
51
|
-
DEPENDENCIES
|
52
|
-
parklife!
|
53
|
-
sinatra
|
54
|
-
|
55
|
-
BUNDLED WITH
|
56
|
-
1.17.2
|