jekyll-embed-urls 0.2.0 → 0.4.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +34 -0
- data/README.md +117 -2
- data/_includes/fallback.html +17 -0
- data/_includes/ogp.html +23 -0
- data/lib/jekyll-embed-urls.rb +6 -59
- data/lib/jekyll/embed.rb +329 -0
- data/lib/jekyll/embed/cache.rb +18 -0
- data/lib/jekyll/embed/content.rb +28 -0
- data/lib/jekyll/embed/filter.rb +35 -0
- data/lib/jekyll/embed/oembed.rb +19 -0
- metadata +98 -7
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 30c09af02fb3b49b49ffae328a7c474377badc49f4a7d10e37c5077411b93e55
|
4
|
+
data.tar.gz: c66c5bf04fad8b11c7df5bef04e01a912a70a0b371dbb93cf14107bfb9de39c5
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: d4a9316874d7dfd7da173144c7eb608697b06432a5c10f2e9fa4d79ec61474a5bfc4d90ce214316a2659ee5a80147bfeee4d39851e6c88c18b0a9def4e6bd9f1
|
7
|
+
data.tar.gz: 20964fff6fcdaac00ceb536fbfd858e07984ed2f00e7eb59a5537a72a090700844de9133085e9509416de0386f18ac4620015f11be7979c6e347c44126153f30
|
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,39 @@
|
|
1
1
|
# Changelog
|
2
2
|
|
3
|
+
## v0.4.0
|
4
|
+
|
5
|
+
* Almost a complete rewrite.
|
6
|
+
* Does its best to prevent visitor tracking.
|
7
|
+
* Embed URLs with OEmbed, OGP and fallbacks to discover title and other
|
8
|
+
stuff.
|
9
|
+
* If using Jekyll >= 4.2.0, finds URLs in content and replaces them with an HTML
|
10
|
+
embed.
|
11
|
+
* Customizable templates.
|
12
|
+
* Uses [UrlPrivacy](https://0xacab.org/sutty/url-privacy) to prevent
|
13
|
+
tracking.
|
14
|
+
|
15
|
+
## v0.3.3
|
16
|
+
|
17
|
+
* Add `allow-popups` to sandbox so you can open links in a new window.
|
18
|
+
|
19
|
+
## v0.3.2
|
20
|
+
|
21
|
+
* Rescue `OEmbed::Error`
|
22
|
+
|
23
|
+
## v0.3.1
|
24
|
+
|
25
|
+
* Put link inside a paragraph so markdown ignores the HTML
|
26
|
+
|
27
|
+
## v0.3.0
|
28
|
+
|
29
|
+
* Reuse the iframe and sandbox it if the embed code contains one
|
30
|
+
|
31
|
+
* Use a Referrer-Policy
|
32
|
+
|
33
|
+
https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy
|
34
|
+
|
35
|
+
https://web.dev/referrer-best-practices/
|
36
|
+
|
3
37
|
## v0.2.0
|
4
38
|
|
5
39
|
* Use a sandboxed iframe
|
data/README.md
CHANGED
@@ -1,8 +1,14 @@
|
|
1
1
|
# jekyll-embed-urls
|
2
2
|
|
3
|
-
This plugin
|
4
|
-
|
3
|
+
This plugin converts URLs to their previsualization by using
|
4
|
+
[OEmbed](https://oembed.com/) or [OGP](http://ogp.me/). It falls back to
|
5
|
+
showing a card with basic information.
|
5
6
|
|
7
|
+
While developing this plugin, we found out that OEmbed providers tend to
|
8
|
+
inject JavaScript and other ways of tracking users, so this plugin does
|
9
|
+
its best to prevent it.
|
10
|
+
|
11
|
+
For OGP and fallback, you can modify the templates.
|
6
12
|
|
7
13
|
## Installation
|
8
14
|
|
@@ -27,6 +33,37 @@ Add the plugin to your `_config.yml`:
|
|
27
33
|
```yaml
|
28
34
|
plugins:
|
29
35
|
- jekyll-embed-urls
|
36
|
+
embed:
|
37
|
+
# Extra elements to remove
|
38
|
+
scrub:
|
39
|
+
- form
|
40
|
+
- input
|
41
|
+
- textarea
|
42
|
+
- button
|
43
|
+
- fieldset
|
44
|
+
- select
|
45
|
+
- option
|
46
|
+
- optgroup
|
47
|
+
- canvas
|
48
|
+
- area
|
49
|
+
- map
|
50
|
+
# Attribute values can be strings or array of strings
|
51
|
+
attributes:
|
52
|
+
referrerpolicy: strict-origin-when-cross-origin
|
53
|
+
sandbox:
|
54
|
+
- allow-scripts
|
55
|
+
- allow-popups
|
56
|
+
allow:
|
57
|
+
- fullscreen
|
58
|
+
- gyroscope
|
59
|
+
- picture-in-picture
|
60
|
+
- clipboard-write
|
61
|
+
loading: 'lazy'
|
62
|
+
controls: true
|
63
|
+
rel:
|
64
|
+
- noopener
|
65
|
+
- noreferrer
|
66
|
+
target: _blank
|
30
67
|
```
|
31
68
|
|
32
69
|
Then, when you want to embed an URL (like a video) in a post, simply
|
@@ -49,6 +86,81 @@ paragraphs but it needs to be in its own block of text.
|
|
49
86
|
**Another note:** [Invidious doesn't support OEmbed
|
50
87
|
yet](https://github.com/omarroth/invidious/issues/1222) :P
|
51
88
|
|
89
|
+
## Themes
|
90
|
+
|
91
|
+
You can also use it as a Liquid filter, for instance:
|
92
|
+
|
93
|
+
```html
|
94
|
+
{{ page.embed_url | embed }}
|
95
|
+
```
|
96
|
+
|
97
|
+
The `embed` filter takes a URL and replaces it with its HTML. Other
|
98
|
+
filters are `oembed`, `ogp` and `fallback`.
|
99
|
+
|
100
|
+
### Templates
|
101
|
+
|
102
|
+
You can modify the templates by providing your own include files,
|
103
|
+
`_includes/ogp.html` and `_includes/fallback.html`. We don't add any
|
104
|
+
CSS so you can develop your own.
|
105
|
+
|
106
|
+
To access default includes, run `bundle show jekyll-embed-urls` and copy
|
107
|
+
the files from the `_includes` directory to your site.
|
108
|
+
|
109
|
+
## Facebook and Instagram
|
110
|
+
|
111
|
+
Facebook deprecated their OEmbed API and now a token is required for
|
112
|
+
embedding Facebook and Instagram URLs. Set it as an environment
|
113
|
+
variable named `OEMBED_FACEBOOK_TOKEN`.
|
114
|
+
|
115
|
+
If you don't have it, this plugin makes a best-effort attempt. Instagram
|
116
|
+
will be available through OGP, but their image URLs expire after
|
117
|
+
a certain time, so your site may appear broken after a while. We could
|
118
|
+
download them but we decided not to because it may infringe on
|
119
|
+
intellectual property laws and personal rights such as privacy, and
|
120
|
+
consequently put our service at risk.
|
121
|
+
|
122
|
+
It's our position that there're legitimate uses for downloading remote
|
123
|
+
media, such as for archiving collective memory (police brutality, public
|
124
|
+
figures speeches, etc.) that may be removed without notice.
|
125
|
+
|
126
|
+
In these cases our recommendation is always not to host with corporate
|
127
|
+
services, since they don't share our politics and actively work against
|
128
|
+
us.
|
129
|
+
|
130
|
+
We're hotlinking and copying text though, assuming that falls under fair
|
131
|
+
use rights.
|
132
|
+
|
133
|
+
## Tracking prevention
|
134
|
+
|
135
|
+
Anti-tracking techniques implemented are:
|
136
|
+
|
137
|
+
* `<script>` and other tags are removed. No external JS is loaded in
|
138
|
+
a local context.
|
139
|
+
|
140
|
+
* `<form>`s and their elements are removed.
|
141
|
+
|
142
|
+
* `<canvas>`, `<area>`, `<map>` are removed.
|
143
|
+
|
144
|
+
* `<iframe>`s are
|
145
|
+
[sandboxed](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox).
|
146
|
+
|
147
|
+
* `<img>`s are lazy loaded. This is not strictly anti-tracking but
|
148
|
+
images are loaded when needed.
|
149
|
+
|
150
|
+
* All URLs get their tracking params removed by
|
151
|
+
[UrlPrivacy](https://0xacab.org/sutty/url-privacy)
|
152
|
+
|
153
|
+
* [Referrer
|
154
|
+
Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy)
|
155
|
+
is implemented for supported elements. Strangely, `<video>` and
|
156
|
+
`<audio>` don't seem to support it.
|
157
|
+
|
158
|
+
* External links open in a new tab and have `rel="noopener noreferrer"`
|
159
|
+
to prevent [reverse
|
160
|
+
tabnabbing](https://owasp.org/www-community/attacks/Reverse_Tabnabbing).
|
161
|
+
|
162
|
+
If you find more useful techniques, please [open an issue
|
163
|
+
report](https://0xacab.org/sutty/jekyll/jekyll-embed-urls/-/issues).
|
52
164
|
|
53
165
|
## Contributing
|
54
166
|
|
@@ -58,6 +170,9 @@ intended to be a safe, welcoming space for collaboration, and
|
|
58
170
|
contributors are expected to adhere to the [Sutty code of
|
59
171
|
conduct](https://sutty.nl/en/code-of-conduct/).
|
60
172
|
|
173
|
+
If you like our plugins, [please consider
|
174
|
+
donating](https://donaciones.sutty.nl/en/)!
|
175
|
+
|
61
176
|
## License
|
62
177
|
|
63
178
|
The gem is available as free software under the terms of the GPL3
|
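The `embed` settings shown in the README hunk above are merged over the plugin defaults (the diff of `lib/jekyll/embed.rb` does this with `Jekyll::Utils.deep_merge_hashes`). The sketch below illustrates the idea with a simplified stand-in; `deep_merge`, `DEFAULTS` and `user_config` are hypothetical names, and Jekyll's real helper may differ in details.

```ruby
# Simplified stand-in for a recursive hash merge. This is NOT Jekyll's
# own helper, only an illustration of the idea.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, old_value, new_value|
    if old_value.is_a?(Hash) && new_value.is_a?(Hash)
      # Nested hashes are merged key by key.
      deep_merge(old_value, new_value)
    else
      # Scalars and arrays from the user configuration win outright.
      new_value
    end
  end
end

# A shortened copy of the defaults listed in the README.
DEFAULTS = {
  'scrub' => %w[form input],
  'attributes' => {
    'referrerpolicy' => 'strict-origin-when-cross-origin',
    'sandbox' => %w[allow-scripts allow-popups],
    'target' => '_blank'
  }
}.freeze

# Hypothetical user settings, as they might be parsed from `_config.yml`.
user_config = {
  'attributes' => {
    'referrerpolicy' => 'no-referrer',
    'sandbox' => %w[allow-scripts]
  }
}

merged = deep_merge(DEFAULTS, user_config)
puts merged['attributes']['referrerpolicy']
puts merged['attributes']['target']
```

In this sketch, keys the user leaves out (such as `target` or `scrub`) keep their default values, while an overridden array replaces the default array rather than being appended to it.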
data/_includes/fallback.html
ADDED
@@ -0,0 +1,17 @@
|
|
1
|
+
<article class="fallback">
|
2
|
+
{%- if page.image -%}
|
3
|
+
<img referrerpolicy="{{ embed.referrerpolicy }}" loading="{{ embed.loading }}" src="{{ page.image }}" class="img-fluid" />
|
4
|
+
{%- endif -%}
|
5
|
+
|
6
|
+
<h1>{{ page.title }}</h1>
|
7
|
+
<p class="lead">{{ page.description }}</p>
|
8
|
+
<p><small>
|
9
|
+
<a
|
10
|
+
href="{{ page.url }}"
|
11
|
+
referrerpolicy="{{ embed.referrerpolicy }}"
|
12
|
+
target="{{ embed.target }}"
|
13
|
+
rel="{{ embed.rel }}">
|
14
|
+
{{ page.url }}
|
15
|
+
</a>
|
16
|
+
</small></p>
|
17
|
+
</article>
|
data/_includes/ogp.html
ADDED
@@ -0,0 +1,23 @@
|
|
1
|
+
<article class="ogp" lang="{{ page.locale }}">
|
2
|
+
{%- if page.video -%}
|
3
|
+
<video poster="{{ page.image }}" class="img-fluid" {{ embed.controls }} src="{{ page.video }}"/>
|
4
|
+
{%- elsif page.image -%}
|
5
|
+
<img referrerpolicy="{{ embed.referrerpolicy }}" loading="{{ embed.loading }}" src="{{ page.image }}" class="img-fluid" />
|
6
|
+
{%- endif -%}
|
7
|
+
|
8
|
+
{%- if page.audio -%}
|
9
|
+
<audio class="img-fluid" {{ embed.controls }} src="{{ page.audio }}"/>
|
10
|
+
{%- endif -%}
|
11
|
+
|
12
|
+
<h1>{{ page.title }}</h1>
|
13
|
+
<p class="lead">{{ page.description }}</p>
|
14
|
+
<p><small>
|
15
|
+
<a
|
16
|
+
href="{{ page.url }}"
|
17
|
+
referrerpolicy="{{ embed.referrerpolicy }}"
|
18
|
+
target="{{ embed.target }}"
|
19
|
+
rel="{{ embed.rel }}">
|
20
|
+
{{ page.url }}
|
21
|
+
</a>
|
22
|
+
</small></p>
|
23
|
+
</article>
|
data/lib/jekyll-embed-urls.rb
CHANGED
@@ -1,62 +1,9 @@
|
|
1
|
-
|
2
|
-
require 'cgi'
|
1
|
+
# frozen_string_literal: true
|
3
2
|
|
3
|
+
require_relative 'jekyll/embed'
|
4
|
+
require_relative 'jekyll/embed/filter'
|
4
5
|
|
5
|
-
#
|
6
|
-
|
7
|
-
Jekyll::
|
8
|
-
# Cache results
|
9
|
-
cache ||= Jekyll::Cache.new('Jekyll::OEmbed::Urls')
|
10
|
-
|
11
|
-
# Only modify documents to be written
|
12
|
-
site.docs_to_write.each do |doc|
|
13
|
-
# Skip text paragraphs
|
14
|
-
next unless %r{\n\nhttps?://} =~ doc.content
|
15
|
-
|
16
|
-
# Split texts by markdown blocks
|
17
|
-
doc.content = doc.content.split("\n\n").map do |p|
|
18
|
-
# Only process lines with URLs
|
19
|
-
if %r{\Ahttps?://} =~ p
|
20
|
-
# Remove empty characters
|
21
|
-
p.strip!
|
22
|
-
|
23
|
-
# @see {https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox}
|
24
|
-
same_origin = p.start_with? site.config['url']
|
25
|
-
|
26
|
-
Jekyll.logger.debug "Finding OEmbed content for #{p}"
|
27
|
-
# Cache the results
|
28
|
-
cache.getset(p) do
|
29
|
-
Jekyll.logger.debug "=> Not cached, obtaining..."
|
30
|
-
|
31
|
-
result = OEmbed::Providers.get(p)
|
32
|
-
|
33
|
-
# Return a sandboxed iframe with the size of the HTML. We
|
34
|
-
# only allow scripts to run inside the iframe and nothing
|
35
|
-
# else.
|
36
|
-
<<~IFRAME
|
37
|
-
<iframe
|
38
|
-
referrerpolicy="no-referrer"
|
39
|
-
sandbox="allow-scripts #{same_origin ? '' : 'allow-same-origin'}"
|
40
|
-
style="min-width:#{result.width}px;min-height:#{result.height || 0}px"
|
41
|
-
srcdoc="#{CGI.escape_html result.html}"
|
42
|
-
></iframe>
|
43
|
-
IFRAME
|
44
|
-
|
45
|
-
result.html
|
46
|
-
rescue OEmbed::NotFound => e
|
47
|
-
# If the URL doesn't support OEmbed just return an external
|
48
|
-
# link.
|
49
|
-
#
|
50
|
-
# TODO: Fetch information with OGP and render a template.
|
51
|
-
Jekyll.logger.warn "#{p} is not oembeddable or URL can't be fetched, showing as URL"
|
52
|
-
|
53
|
-
"<a href=\"#{p}\" target=\"_blank\">#{p}</a>"
|
54
|
-
end
|
55
|
-
else
|
56
|
-
# Otherwise return the original block
|
57
|
-
p
|
58
|
-
end
|
59
|
-
# Rebuild the content
|
60
|
-
end.join("\n\n")
|
61
|
-
end
|
6
|
+
# Configure Embed
|
7
|
+
Jekyll::Hooks.register :site, :after_init do |site|
|
8
|
+
Jekyll::Embed.site = site
|
62
9
|
end
|
data/lib/jekyll/embed.rb
ADDED
@@ -0,0 +1,329 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require 'faraday'
|
4
|
+
require 'faraday-http-cache'
|
5
|
+
require 'faraday_middleware/response/follow_redirects'
|
6
|
+
require 'loofah'
|
7
|
+
require 'ogp'
|
8
|
+
require 'url_privacy'
|
9
|
+
|
10
|
+
require_relative 'embed/oembed'
|
11
|
+
require_relative 'embed/cache'
|
12
|
+
|
13
|
+
if Gem::Version.new(Jekyll::VERSION) >= Gem::Version.new('4.2.0')
|
14
|
+
require_relative 'embed/content'
|
15
|
+
else
|
16
|
+
Jekyll.logger.warn "Upgrade to Jekyll >= 4.2.0 to embed URLs in content"
|
17
|
+
end
|
18
|
+
|
19
|
+
OEmbed::Providers.register_all
|
20
|
+
OEmbed::Providers.register_fallback(OEmbed::ProviderDiscovery,
|
21
|
+
OEmbed::Providers::Noembed)
|
22
|
+
|
23
|
+
module Jekyll
|
24
|
+
# The idea with this class is to find the best safe representation of
|
25
|
+
# a link. For a YouTube video it could be the sandboxed iframe. This
|
26
|
+
# loads the video and allows you to reproduce it while preventing YT
|
27
|
+
# to call home and send data about your users. But other social networks
|
28
|
+
# will try to take control of their containers by modifying the page.
|
29
|
+
# They resist sandboxing and don't work correctly. For them, we
|
30
|
+
# cleanup unwanted HTML tags such as <script>, and return the HTML,
|
31
|
+
# which you can style using CSS. Twitter does this.
|
32
|
+
#
|
33
|
+
# Others are only available through OGP, so we retrieve the metadata
|
34
|
+
# and render a template, which you can provide in your own theme too.
|
35
|
+
#
|
36
|
+
# We also try for microformats and we would look at Schema.org too but
|
37
|
+
# there doesn't seem to be a gem for it yet.
|
38
|
+
#
|
39
|
+
# If the URL doesn't provide anything at all we get the URL, title and
|
40
|
+
# date of last visit.
|
41
|
+
#
|
42
|
+
# Isn't it nice that the corporations that require us to use OEmbed,
|
43
|
+
# OGP, Twitter Cards, Schema.org and other metadata, don't use them
|
44
|
+
# themselves?
|
45
|
+
#
|
46
|
+
# Also we're going to use heavy caching so we don't hit rate limits or
|
47
|
+
# lose the representation if the service is down or the URL is
|
48
|
+
# removed. We may be tempted to store the resources locally (images,
|
49
|
+
# videos, audio) but we have to take into account that people have
|
50
|
+
# legitimate reasons to remove media from the Internet.
|
51
|
+
class Embed
|
52
|
+
# Attributes to apply by HTMLElement
|
53
|
+
IFRAME_ATTRIBUTES = %w[allow sandbox referrerpolicy loading].freeze
|
54
|
+
IMAGE_ATTRIBUTES = %w[referrerpolicy loading].freeze
|
55
|
+
MEDIA_ATTRIBUTES = %w[controls].freeze
|
56
|
+
A_ATTRIBUTES = %w[referrerpolicy rel target].freeze
|
57
|
+
|
58
|
+
# Templates
|
59
|
+
INCLUDE_OGP = '{% include ogp.html site=site page=page %}'
|
60
|
+
INCLUDE_FALLBACK = '{% include fallback.html site=site page=page %}'
|
61
|
+
|
62
|
+
# The default referrer policy only sends the origin URL (not the
|
63
|
+
# full URL, only the protocol/scheme and domain part) if the remote
|
64
|
+
# URL is HTTPS.
|
65
|
+
#
|
66
|
+
# @see {https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy}
|
67
|
+
#
|
68
|
+
# The default sandbox restrictions only allow scripts in the context
|
69
|
+
# of the iframe and opening new tabs.
|
70
|
+
#
|
71
|
+
# @see {https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox}
|
72
|
+
DEFAULT_CONFIG = {
|
73
|
+
'scrub' => %w[form input textarea button fieldset select option optgroup canvas area map],
|
74
|
+
'attributes' => {
|
75
|
+
'referrerpolicy' => 'strict-origin-when-cross-origin',
|
76
|
+
'sandbox' => %w[allow-scripts allow-popups],
|
77
|
+
'allow' => %w[fullscreen gyroscope picture-in-picture clipboard-write],
|
78
|
+
'loading' => 'lazy',
|
79
|
+
'controls' => true,
|
80
|
+
'rel' => %w[noopener noreferrer],
|
81
|
+
'target' => '_blank'
|
82
|
+
}
|
83
|
+
}
|
84
|
+
|
85
|
+
class << self
|
86
|
+
def site
|
87
|
+
unless @site
|
88
|
+
raise Jekyll::Errors::InvalidConfigurationError,
|
89
|
+
"Site is missing, configure with `Jekyll::Embed.site = site`"
|
90
|
+
end
|
91
|
+
|
92
|
+
@site
|
93
|
+
end
|
94
|
+
|
95
|
+
# This is an initializer of sorts
|
96
|
+
#
|
97
|
+
# @param [Jekyll::Site]
|
98
|
+
# @return [Jekyll::Site]
|
99
|
+
def site=(site)
|
100
|
+
raise ArgumentError, "Site must be a Jekyll::Site" unless site.is_a? Jekyll::Site
|
101
|
+
|
102
|
+
@site = site
|
103
|
+
|
104
|
+
# Add the _includes dir so we can provide default templates that
|
105
|
+
# can be overridden locally or by the theme.
|
106
|
+
site.includes_load_paths << File.expand_path(File.join(__dir__, '..', '..', '_includes'))
|
107
|
+
# Since we're embedding, we're allowing iframes
|
108
|
+
Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2 << 'iframe'
|
109
|
+
|
110
|
+
# Other elements that are disallowed
|
111
|
+
config['scrub']&.each do |scrub|
|
112
|
+
Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2.delete(scrub)
|
113
|
+
end
|
114
|
+
|
115
|
+
payload['embed'] = config['attributes']
|
116
|
+
|
117
|
+
site
|
118
|
+
end
|
119
|
+
|
120
|
+
# Render the URL as HTML
|
121
|
+
#
|
122
|
+
# 1. Try oembed for video and image
|
123
|
+
# 2. If rich oembed, cleanup
|
124
|
+
# 3. If OGP, render templates
|
125
|
+
# 4. Else, render fallback template
|
126
|
+
#
|
127
|
+
# @param [String] URL
|
128
|
+
# @return [String] HTML
|
129
|
+
def embed(url)
|
130
|
+
url.strip!
|
131
|
+
|
132
|
+
# Quick check
|
133
|
+
raise URI::Error unless url.start_with? 'http'
|
134
|
+
|
135
|
+
# Just to verify the URL is valid
|
136
|
+
URI.parse url
|
137
|
+
|
138
|
+
oembed(url) || ogp(url) || fallback(url)
|
139
|
+
rescue URI::Error
|
140
|
+
Jekyll.logger.warn "#{url.inspect} is not a valid URL"
|
141
|
+
|
142
|
+
url
|
143
|
+
end
|
144
|
+
|
145
|
+
# @return [Hash]
|
146
|
+
def config
|
147
|
+
@config ||= Jekyll::Utils.deep_merge_hashes(DEFAULT_CONFIG, (site.config['embed'] || {}))
|
148
|
+
end
|
149
|
+
|
150
|
+
# Try for OEmbed
|
151
|
+
#
|
152
|
+
# @param [String] URL
|
153
|
+
# @return [String,NilClass] Sanitized HTML or nil
|
154
|
+
def oembed(url)
|
155
|
+
cache.getset(url) do
|
156
|
+
oembed = OEmbed::Providers.get url
|
157
|
+
|
158
|
+
# Prevent caching of nil?
|
159
|
+
raise OEmbed::Error unless oembed.respond_to? :html
|
160
|
+
|
161
|
+
# Cleanup. We don't allow running remote scripts locally,
|
162
|
+
# period.
|
163
|
+
cleanup(Loofah.fragment(oembed.html).scrub!(:prune), url).to_s
|
164
|
+
end
|
165
|
+
rescue OEmbed::Error
|
166
|
+
nil
|
167
|
+
end
|
168
|
+
|
169
|
+
# Try for OGP.
|
170
|
+
# @param [String] URL
|
171
|
+
# @return [String,NilClass]
|
172
|
+
def ogp(url)
|
173
|
+
cache.getset(url) do
|
174
|
+
ogp = OGP::OpenGraph.new get(url).body
|
175
|
+
context = info.dup
|
176
|
+
context[:registers][:page] = payload['page'] = ogp.data
|
177
|
+
|
178
|
+
ogp_template.render! payload, context
|
179
|
+
end
|
180
|
+
rescue OGP::MalformedSourceError, OGP::MissingAttributeError, Faraday::Error
|
181
|
+
nil
|
182
|
+
end
|
183
|
+
|
184
|
+
# Try something
|
185
|
+
def fallback(url)
|
186
|
+
cache.getset(url) do
|
187
|
+
html = Nokogiri::HTML.fragment get(url).body
|
188
|
+
element = html.css('article').first
|
189
|
+
element ||= html.css('section').first
|
190
|
+
element ||= html.css('main').first
|
191
|
+
element ||= html.css('body').first
|
192
|
+
title = html.css('title').first
|
193
|
+
description = html.css('meta[name="description"]').first
|
194
|
+
|
195
|
+
context = info.dup
|
196
|
+
context[:registers][:page] = payload['page'] = {
|
197
|
+
'title' => text(title),
|
198
|
+
'description' => text(description),
|
199
|
+
'url' => url,
|
200
|
+
'image' => element.css('img').first&.public_send(:[], 'src')
|
201
|
+
}
|
202
|
+
|
203
|
+
fallback_template.render! payload, context
|
204
|
+
end
|
205
|
+
rescue Faraday::Error, Nokogiri::SyntaxError
|
206
|
+
nil
|
207
|
+
end
|
208
|
+
|
209
|
+
# @param [String] URL
|
210
|
+
# @return [Faraday::Response]
|
211
|
+
def get(url)
|
212
|
+
@get_cache ||= {}
|
213
|
+
@get_cache[url] ||= http_client.get url
|
214
|
+
end
|
215
|
+
|
216
|
+
# @return [Jekyll::Embed::Cache]
|
217
|
+
def cache
|
218
|
+
@cache ||= Jekyll::Embed::Cache.new('Jekyll::Embed')
|
219
|
+
end
|
220
|
+
|
221
|
+
# @return [Faraday::Connection]
|
222
|
+
def http_client
|
223
|
+
@http_client ||= Faraday.new do |builder|
|
224
|
+
builder.use FaradayMiddleware::FollowRedirects
|
225
|
+
builder.use :http_cache, shared_cache: false, store: cache, serializer: Marshal
|
226
|
+
end
|
227
|
+
end
|
228
|
+
|
229
|
+
def cleanup(html, url)
|
230
|
+
# Add our own attributes
|
231
|
+
html.css('iframe').each do |iframe|
|
232
|
+
IFRAME_ATTRIBUTES.each do |attr|
|
233
|
+
iframe[attr] = value_for_attr(attr)
|
234
|
+
end
|
235
|
+
|
236
|
+
# Add allow-same-origin unless the site is embedding itself
|
237
|
+
iframe['sandbox'] += allow_same_origin(url)
|
238
|
+
end
|
239
|
+
|
240
|
+
html.css('audio, video').each do |media|
|
241
|
+
MEDIA_ATTRIBUTES.each do |attr|
|
242
|
+
media[attr] = value_for_attr(attr)
|
243
|
+
end
|
244
|
+
|
245
|
+
media['src'] = UrlPrivacy.clean media['src']
|
246
|
+
end
|
247
|
+
|
248
|
+
html.css('img').each do |img|
|
249
|
+
IMAGE_ATTRIBUTES.each do |attr|
|
250
|
+
img[attr] = value_for_attr(attr)
|
251
|
+
end
|
252
|
+
end
|
253
|
+
|
254
|
+
html.css('a').each do |a|
|
255
|
+
A_ATTRIBUTES.each do |attr|
|
256
|
+
a[attr] = value_for_attr(attr)
|
257
|
+
end
|
258
|
+
end
|
259
|
+
|
260
|
+
html.css('[src]').each do |element|
|
261
|
+
element['src'] = CGI.escapeHTML(UrlPrivacy.clean(CGI.unescapeHTML(element['src'])))
|
262
|
+
end
|
263
|
+
|
264
|
+
html.css('[href]').each do |element|
|
265
|
+
element['href'] = CGI.escapeHTML(UrlPrivacy.clean(CGI.unescapeHTML(element['href'])))
|
266
|
+
end
|
267
|
+
|
268
|
+
# Return the cleaned up HTML
|
269
|
+
html
|
270
|
+
end
|
271
|
+
|
272
|
+
def text(node)
|
273
|
+
node&.text&.tr("\n", '')&.tr("\r", '')&.strip&.squeeze(' ')
|
274
|
+
end
|
275
|
+
|
276
|
+
private
|
277
|
+
|
278
|
+
def fallback_template
|
279
|
+
@fallback_template ||= site.liquid_renderer.file('fallback.html').parse(INCLUDE_FALLBACK)
|
280
|
+
end
|
281
|
+
|
282
|
+
def ogp_template
|
283
|
+
@ogp_template ||= site.liquid_renderer.file('ogp.html').parse(INCLUDE_OGP)
|
284
|
+
end
|
285
|
+
|
286
|
+
def info
|
287
|
+
@info ||= {
|
288
|
+
registers: { site: site },
|
289
|
+
strict_filters: site.config.dig('liquid', 'strict_filters'),
|
290
|
+
strict_variables: site.config.dig('liquid', 'strict_variables')
|
291
|
+
}
|
292
|
+
end
|
293
|
+
|
294
|
+
# @param [String]
|
295
|
+
# @return [String]
|
296
|
+
def value_for_attr(attr)
|
297
|
+
@value_for_attr ||= {}
|
298
|
+
@value_for_attr[attr] ||=
|
299
|
+
case (value = config.dig('attributes', attr))
|
300
|
+
when String then value
|
301
|
+
when Array then value.join(' ')
|
302
|
+
end
|
303
|
+
end
|
304
|
+
|
305
|
+
# If the URL belongs to another site, add allow-same-origin to the
|
306
|
+
# sandbox. URLs from this same site don't get it.
|
307
|
+
#
|
308
|
+
# @param [String] URL
|
309
|
+
# @return [String]
|
310
|
+
def allow_same_origin(url)
|
311
|
+
unless site.config['url']
|
312
|
+
Jekyll.logger.warn "Add url to _config.yml to determine if the site can embed itself"
|
313
|
+
return ' allow-same-origin'
|
314
|
+
end
|
315
|
+
|
316
|
+
@allow_same_origin ||= {}
|
317
|
+
@allow_same_origin[url] ||= url.start_with?(site.config['url']) ? '' : ' allow-same-origin'
|
318
|
+
end
|
319
|
+
|
320
|
+
# Caches it because Jekyll::Site#site_payload returns a new object
|
321
|
+
# every time.
|
322
|
+
#
|
323
|
+
# @return [Jekyll::Drops::UnifiedPayloadDrop]
|
324
|
+
def payload
|
325
|
+
@payload ||= site.site_payload
|
326
|
+
end
|
327
|
+
end
|
328
|
+
end
|
329
|
+
end
|
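Two small helpers in the file above decide which attribute values end up on the embedded elements: `value_for_attr` turns configured values into attribute strings, and `allow_same_origin` extends the sandbox for URLs from other sites. The following is a minimal sketch of that logic outside Jekyll; `ATTRIBUTES`, `attribute_value`, `sandbox_for` and the `site_url` parameter (standing in for `site.config['url']`) are names made up for this illustration.

```ruby
# A shortened copy of the default attributes from the diff.
ATTRIBUTES = {
  'referrerpolicy' => 'strict-origin-when-cross-origin',
  'sandbox' => %w[allow-scripts allow-popups],
  'controls' => true
}.freeze

# Strings pass through and arrays are joined with spaces. Any other
# value (such as `true`) matches no branch, so the result is nil.
def attribute_value(attributes, name)
  case (value = attributes[name])
  when String then value
  when Array then value.join(' ')
  end
end

# `site_url` stands in for `site.config['url']`. The extra token is only
# appended when the embedded URL does not start with the site's own URL.
def sandbox_for(attributes, embedded_url, site_url)
  extra = embedded_url.start_with?(site_url) ? '' : ' allow-same-origin'
  attribute_value(attributes, 'sandbox') + extra
end

puts sandbox_for(ATTRIBUTES, 'https://video.example/embed/1', 'https://my.site')
puts sandbox_for(ATTRIBUTES, 'https://my.site/post/', 'https://my.site')
```

The `case` expression has no `else` branch, so a boolean such as `controls: true` yields `nil` in this sketch.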
data/lib/jekyll/embed/cache.rb
ADDED
@@ -0,0 +1,18 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Jekyll
|
4
|
+
class Embed
|
5
|
+
# Jekyll cache that behaves like ActiveSupport::Cache
|
6
|
+
class Cache < Jekyll::Cache
|
7
|
+
def write(key, value)
|
8
|
+
self[key] = value
|
9
|
+
end
|
10
|
+
|
11
|
+
def read(key)
|
12
|
+
self[key]
|
13
|
+
rescue
|
14
|
+
nil
|
15
|
+
end
|
16
|
+
end
|
17
|
+
end
|
18
|
+
end
|
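The class above adds `read` and `write` on top of the bracket accessors, the interface the HTTP cache middleware is handed through `store: cache` in `embed.rb`. Below is a minimal sketch of the same adapter over an in-memory store. `MemoryCache` and `ReadWriteCache` are hypothetical stand-ins, and the sketch assumes, as the `rescue` in `read` suggests, that looking up a missing key raises.

```ruby
# In-memory stand-in for the underlying cache: `[]` raises on a missing
# key, which is the behaviour this sketch assumes.
class MemoryCache
  def initialize
    @store = {}
  end

  def [](key)
    @store.fetch(key) # raises KeyError when the key is absent
  end

  def []=(key, value)
    @store[key] = value
  end
end

# Adapter exposing a read/write pair on top of the bracket accessors.
class ReadWriteCache < MemoryCache
  def write(key, value)
    self[key] = value
  end

  # A cache miss becomes nil instead of an exception.
  def read(key)
    self[key]
  rescue KeyError
    nil
  end
end

cache = ReadWriteCache.new
cache.write('https://example.org/', '<html></html>')
puts cache.read('https://example.org/')
puts cache.read('https://example.org/missing').inspect
```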
data/lib/jekyll/embed/content.rb
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require 'cgi'
|
4
|
+
|
5
|
+
module Jekyll
|
6
|
+
class Embed
|
7
|
+
class Content
|
8
|
+
URL_RE = /<p[^>]*>[\s\n]*(?<url>https?:\/\/[^<\s\n]+)[\s\n]*<\/p>/m.freeze
|
9
|
+
|
10
|
+
class << self
|
11
|
+
# Find URLs on paragraphs. We do it after rendering because
|
12
|
+
# sometimes we use HTML instead of pure Markdown and this way we
|
13
|
+
# catch both.
|
14
|
+
def embed!(content)
|
15
|
+
URL_RE.match(content) do |match|
|
16
|
+
embed = Jekyll::Embed.embed CGI.unescapeHTML(match[:url])
|
17
|
+
|
18
|
+
content.sub! URL_RE, embed
|
19
|
+
end
|
20
|
+
end
|
21
|
+
end
|
22
|
+
end
|
23
|
+
end
|
24
|
+
end
|
25
|
+
|
26
|
+
Jekyll::Hooks.register :posts, :post_convert do |post|
|
27
|
+
Jekyll::Embed::Content.embed! post.content
|
28
|
+
end
|
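`URL_RE` above only matches a paragraph whose entire content is a single URL, which is why a link in the middle of a sentence is left alone. The sketch below exercises an equivalent pattern with a stubbed renderer; `URL_PARAGRAPH`, `render_embed` and `embed_all` are hypothetical names, and `render_embed` merely stands in for `Jekyll::Embed.embed`. It uses `gsub` with a block so every matching paragraph is replaced, whereas the `match`/`sub!` pair in the diff appears to handle one match per call.

```ruby
require 'cgi'

# Same idea as URL_RE: a <p> that contains nothing but one URL.
URL_PARAGRAPH = %r{<p[^>]*>\s*(?<url>https?://[^<\s]+)\s*</p>}m.freeze

# Hypothetical stand-in for Jekyll::Embed.embed.
def render_embed(url)
  %(<iframe src="#{CGI.escapeHTML(url)}"></iframe>)
end

# Replace every URL-only paragraph with its rendered embed.
def embed_all(html)
  html.gsub(URL_PARAGRAPH) do
    # The rendered HTML carries escaped entities, so unescape first.
    render_embed(CGI.unescapeHTML(Regexp.last_match[:url]))
  end
end

html = <<~HTML
  <p>Some text with https://example.org/inline inside.</p>
  <p>
    https://example.org/watch?v=1&amp;t=2
  </p>
HTML

puts embed_all(html)
```

Passing the replacement through a block also avoids the backslash interpretation that `sub!` applies to a replacement string.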
data/lib/jekyll/embed/filter.rb
ADDED
@@ -0,0 +1,35 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Jekyll
|
4
|
+
class Embed
|
5
|
+
module Filter
|
6
|
+
# This filter takes the URL passed as input and returns its HTML
|
7
|
+
# representation. Embed takes care of everything else.
|
8
|
+
def embed(url)
|
9
|
+
return url unless url.is_a? String
|
10
|
+
|
11
|
+
Embed.embed url
|
12
|
+
end
|
13
|
+
|
14
|
+
def oembed(url)
|
15
|
+
return url unless url.is_a? String
|
16
|
+
|
17
|
+
Embed.oembed url
|
18
|
+
end
|
19
|
+
|
20
|
+
def ogp(url)
|
21
|
+
return url unless url.is_a? String
|
22
|
+
|
23
|
+
Embed.ogp url
|
24
|
+
end
|
25
|
+
|
26
|
+
def fallback(url)
|
27
|
+
return url unless url.is_a? String
|
28
|
+
|
29
|
+
Embed.fallback url
|
30
|
+
end
|
31
|
+
end
|
32
|
+
end
|
33
|
+
end
|
34
|
+
|
35
|
+
Liquid::Template.register_filter(Jekyll::Embed::Filter)
|
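Each filter above returns non-string input unchanged and otherwise delegates to `Embed`, whose `embed` method tries OEmbed, then OGP, then the fallback template. A rough sketch of that "first strategy that returns something wins" chain follows. The `STRATEGIES` lambdas are invented placeholders that decide by looking at the URL text; the real methods fetch remote data.

```ruby
# Placeholder strategies: each returns HTML or nil. They are made up for
# this sketch and do not perform any network requests.
STRATEGIES = {
  oembed: ->(url) { url.include?('video') ? %(<iframe src="#{url}"></iframe>) : nil },
  ogp: ->(url) { url.include?('article') ? %(<article class="ogp">#{url}</article>) : nil },
  fallback: ->(url) { %(<article class="fallback">#{url}</article>) }
}.freeze

def embed(input)
  # Same guard as the Liquid filters: leave non-strings alone.
  return input unless input.is_a?(String)

  url = input.strip
  # Quick check, mirroring the `start_with? 'http'` test in the diff.
  return url unless url.start_with?('http')

  # Walk the strategies in order and stop at the first non-nil result.
  STRATEGIES.each_value do |strategy|
    html = strategy.call(url)
    return html if html
  end

  url
end

puts embed(' https://video.example/1 ')
puts embed('https://plain.example/')
```

The sketch strips a copy of the input instead of calling `strip!`, so it also accepts frozen strings.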
data/lib/jekyll/embed/oembed.rb
ADDED
@@ -0,0 +1,19 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require 'oembed'
|
4
|
+
|
5
|
+
module OEmbed
|
6
|
+
module ProviderDecorator
|
7
|
+
def self.included(base)
|
8
|
+
base.class_eval do
|
9
|
+
def http_get(url, _)
|
10
|
+
Jekyll::Embed.get(url.to_s).body
|
11
|
+
rescue Faraday::Error
|
12
|
+
raise OEmbed::UnknownResponse
|
13
|
+
end
|
14
|
+
end
|
15
|
+
end
|
16
|
+
end
|
17
|
+
end
|
18
|
+
|
19
|
+
OEmbed::Provider.include OEmbed::ProviderDecorator
|
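The decorator above redefines `http_get` from inside the `included` hook with `class_eval`, so the library's requests go through `Jekyll::Embed.get` and its cache. The sketch below shows the same mechanism on stand-in classes; `Provider` and `ProviderDecorator` here are simplified placeholders, not the real `OEmbed` classes.

```ruby
# Stand-in for a library class whose HTTP method we want to reroute.
class Provider
  def http_get(url, _options)
    "fetched #{url} directly"
  end

  def get(url)
    http_get(url, {})
  end
end

module ProviderDecorator
  # Runs when the module is included. `class_eval` reopens the including
  # class, so the method below replaces the class's own definition.
  def self.included(base)
    base.class_eval do
      def http_get(url, _options)
        "fetched #{url} through the shared client"
      end
    end
  end
end

Provider.include ProviderDecorator

puts Provider.new.get('https://example.org')
```

Defining `http_get` directly in the module body would not have this effect: an included module sits behind the class in method lookup, so the class's original method would still be called.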
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: jekyll-embed-urls
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.4.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- f
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2021-02-01 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: jekyll
|
@@ -30,14 +30,98 @@ dependencies:
|
|
30
30
|
requirements:
|
31
31
|
- - "~>"
|
32
32
|
- !ruby/object:Gem::Version
|
33
|
-
version: '0.
|
33
|
+
version: '0.15'
|
34
34
|
type: :runtime
|
35
35
|
prerelease: false
|
36
36
|
version_requirements: !ruby/object:Gem::Requirement
|
37
37
|
requirements:
|
38
38
|
- - "~>"
|
39
39
|
- !ruby/object:Gem::Version
|
40
|
-
version: '0.
|
40
|
+
version: '0.15'
|
41
|
+
- !ruby/object:Gem::Dependency
|
42
|
+
name: loofah
|
43
|
+
requirement: !ruby/object:Gem::Requirement
|
44
|
+
requirements:
|
45
|
+
- - "~>"
|
46
|
+
- !ruby/object:Gem::Version
|
47
|
+
version: '2.9'
|
48
|
+
type: :runtime
|
49
|
+
prerelease: false
|
50
|
+
version_requirements: !ruby/object:Gem::Requirement
|
51
|
+
requirements:
|
52
|
+
- - "~>"
|
53
|
+
- !ruby/object:Gem::Version
|
54
|
+
version: '2.9'
|
55
|
+
- !ruby/object:Gem::Dependency
|
56
|
+
name: ogp
|
57
|
+
requirement: !ruby/object:Gem::Requirement
|
58
|
+
requirements:
|
59
|
+
- - "~>"
|
60
|
+
- !ruby/object:Gem::Version
|
61
|
+
version: '0.4'
|
62
|
+
type: :runtime
|
63
|
+
prerelease: false
|
64
|
+
version_requirements: !ruby/object:Gem::Requirement
|
65
|
+
requirements:
|
66
|
+
- - "~>"
|
67
|
+
- !ruby/object:Gem::Version
|
68
|
+
version: '0.4'
|
69
|
+
- !ruby/object:Gem::Dependency
|
70
|
+
name: faraday
|
71
|
+
requirement: !ruby/object:Gem::Requirement
|
72
|
+
requirements:
|
73
|
+
- - "~>"
|
74
|
+
- !ruby/object:Gem::Version
|
75
|
+
version: '1.3'
|
76
|
+
type: :runtime
|
77
|
+
prerelease: false
|
78
|
+
version_requirements: !ruby/object:Gem::Requirement
|
79
|
+
requirements:
|
80
|
+
- - "~>"
|
81
|
+
- !ruby/object:Gem::Version
|
82
|
+
version: '1.3'
|
83
|
+
- !ruby/object:Gem::Dependency
|
84
|
+
name: faraday-http-cache
|
85
|
+
requirement: !ruby/object:Gem::Requirement
|
86
|
+
requirements:
|
87
|
+
- - "~>"
|
88
|
+
- !ruby/object:Gem::Version
|
89
|
+
version: '2.2'
|
90
|
+
type: :runtime
|
91
|
+
prerelease: false
|
92
|
+
version_requirements: !ruby/object:Gem::Requirement
|
93
|
+
requirements:
|
94
|
+
- - "~>"
|
95
|
+
- !ruby/object:Gem::Version
|
96
|
+
version: '2.2'
|
97
|
+
- !ruby/object:Gem::Dependency
|
98
|
+
name: faraday_middleware
|
99
|
+
requirement: !ruby/object:Gem::Requirement
|
100
|
+
requirements:
|
101
|
+
- - "~>"
|
102
|
+
- !ruby/object:Gem::Version
|
103
|
+
version: '1'
|
104
|
+
type: :runtime
|
105
|
+
prerelease: false
|
106
|
+
version_requirements: !ruby/object:Gem::Requirement
|
107
|
+
requirements:
|
108
|
+
- - "~>"
|
109
|
+
- !ruby/object:Gem::Version
|
110
|
+
version: '1'
|
111
|
+
- !ruby/object:Gem::Dependency
|
112
|
+
name: url-privacy
|
113
|
+
requirement: !ruby/object:Gem::Requirement
|
114
|
+
requirements:
|
115
|
+
- - "~>"
|
116
|
+
- !ruby/object:Gem::Version
|
117
|
+
version: '0'
|
118
|
+
type: :runtime
|
119
|
+
prerelease: false
|
120
|
+
version_requirements: !ruby/object:Gem::Requirement
|
121
|
+
requirements:
|
122
|
+
- - "~>"
|
123
|
+
- !ruby/object:Gem::Version
|
124
|
+
version: '0'
|
41
125
|
description: Replaces URLs for their previsualization in Jekyll posts
|
42
126
|
email:
|
43
127
|
- f@sutty.nl
|
@@ -51,7 +135,14 @@ files:
|
|
51
135
|
- CHANGELOG.md
|
52
136
|
- LICENSE.txt
|
53
137
|
- README.md
|
138
|
+
- _includes/fallback.html
|
139
|
+
- _includes/ogp.html
|
54
140
|
- lib/jekyll-embed-urls.rb
|
141
|
+
- lib/jekyll/embed.rb
|
142
|
+
- lib/jekyll/embed/cache.rb
|
143
|
+
- lib/jekyll/embed/content.rb
|
144
|
+
- lib/jekyll/embed/filter.rb
|
145
|
+
- lib/jekyll/embed/oembed.rb
|
55
146
|
homepage: https://0xacab.org/sutty/jekyll/jekyll-embed-urls
|
56
147
|
licenses:
|
57
148
|
- GPL-3.0
|
@@ -74,16 +165,16 @@ require_paths:
|
|
74
165
|
- lib
|
75
166
|
required_ruby_version: !ruby/object:Gem::Requirement
|
76
167
|
requirements:
|
77
|
-
- - "
|
168
|
+
- - ">="
|
78
169
|
- !ruby/object:Gem::Version
|
79
|
-
version:
|
170
|
+
version: 2.6.0
|
80
171
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
81
172
|
requirements:
|
82
173
|
- - ">="
|
83
174
|
- !ruby/object:Gem::Version
|
84
175
|
version: '0'
|
85
176
|
requirements: []
|
86
|
-
rubygems_version: 3.
|
177
|
+
rubygems_version: 3.1.2
|
87
178
|
signing_key:
|
88
179
|
specification_version: 4
|
89
180
|
summary: Embed URL previsualization in Jekyll posts
|