jekyll-embed-urls 0.3.3 → 0.4.0
- checksums.yaml +4 -4
- data/CHANGELOG.md +12 -0
- data/README.md +114 -2
- data/_includes/fallback.html +17 -0
- data/_includes/ogp.html +23 -0
- data/lib/jekyll-embed-urls.rb +6 -87
- data/lib/jekyll/embed.rb +329 -0
- data/lib/jekyll/embed/cache.rb +18 -0
- data/lib/jekyll/embed/content.rb +28 -0
- data/lib/jekyll/embed/filter.rb +35 -0
- data/lib/jekyll/embed/oembed.rb +19 -0
- metadata +86 -9
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA256:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 30c09af02fb3b49b49ffae328a7c474377badc49f4a7d10e37c5077411b93e55
|
4
|
+
data.tar.gz: c66c5bf04fad8b11c7df5bef04e01a912a70a0b371dbb93cf14107bfb9de39c5
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: d4a9316874d7dfd7da173144c7eb608697b06432a5c10f2e9fa4d79ec61474a5bfc4d90ce214316a2659ee5a80147bfeee4d39851e6c88c18b0a9def4e6bd9f1
|
7
|
+
data.tar.gz: 20964fff6fcdaac00ceb536fbfd858e07984ed2f00e7eb59a5537a72a090700844de9133085e9509416de0386f18ac4620015f11be7979c6e347c44126153f30
|
data/CHANGELOG.md
CHANGED
@@ -1,5 +1,17 @@
|
|
1
1
|
# Changelog
|
2
2
|
|
3
|
+
## v0.4.0
|
4
|
+
|
5
|
+
* Almost a complete rewrite.
|
6
|
+
* Does its best to prevent visitor tracking.
|
7
|
+
* Embed URLs with OEmbed, OGP and fallbacks to discover title and other
|
8
|
+
stuff.
|
9
|
+
* If using Jekyll >= 4.2.0, finds URLs in content and replaces for HTML
|
10
|
+
embed.
|
11
|
+
* Customizable templates.
|
12
|
+
* Uses [UrlPrivacy](https://0xacab.org/sutty/url-privacy) to prevent
|
13
|
+
tracking.
|
14
|
+
|
3
15
|
## v0.3.3
|
4
16
|
|
5
17
|
* Add `allow-popups` to sandbox so you can open links in a new window.
|
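The Jekyll >= 4.2.0 condition mentioned in the v0.4.0 entry is checked in `lib/jekyll/embed.rb` with `Gem::Version`. Below is a minimal sketch of why a version object is compared rather than a plain string; the `supports_content_embed?` helper and the `MINIMUM` constant are hypothetical names for this example and are not part of the plugin.

```ruby
require 'rubygems' # Gem::Version ships with Ruby; loading it again is harmless

MINIMUM = Gem::Version.new('4.2.0')

# Hypothetical helper: true when the given Jekyll version string
# is at least 4.2.0.
def supports_content_embed?(version)
  Gem::Version.new(version) >= MINIMUM
end

puts supports_content_embed?('4.1.1')  # false
puts supports_content_embed?('4.2.0')  # true
puts supports_content_embed?('4.10.0') # true, although '4.10.0' < '4.2.0' as plain strings
```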
data/README.md
CHANGED
@@ -1,8 +1,14 @@
|
|
1
1
|
# jekyll-embed-urls
|
2
2
|
|
3
|
-
This plugin
|
4
|
-
|
3
|
+
This plugin converts URLs to their previsualization by using
|
4
|
+
[OEmbed](https://oembed.com/) or [OGP](http://ogp.me/). It falls back to
|
5
|
+
showing a card with basic information.
|
5
6
|
|
7
|
+
While developing this plugin, we found out that OEmbed providers tend to
|
8
|
+
inject JavaScript and other ways of tracking users, so this plugin does
|
9
|
+
its best to prevent it.
|
10
|
+
|
11
|
+
For OGP and fallback, you can modify the templates.
|
6
12
|
|
7
13
|
## Installation
|
8
14
|
|
@@ -27,6 +33,37 @@ Add the plugin to your `_config.yml`:
|
|
27
33
|
```yaml
|
28
34
|
plugins:
|
29
35
|
- jekyll-embed-urls
|
36
|
+
embed:
|
37
|
+
# Extra elements to remove
|
38
|
+
scrub:
|
39
|
+
- form
|
40
|
+
- input
|
41
|
+
- textarea
|
42
|
+
- button
|
43
|
+
- fieldset
|
44
|
+
- select
|
45
|
+
- option
|
46
|
+
- optgroup
|
47
|
+
- canvas
|
48
|
+
- area
|
49
|
+
- map
|
50
|
+
# Attribute values can be strings or array of strings
|
51
|
+
attributes:
|
52
|
+
referrerpolicy: strict-origin-when-cross-origin
|
53
|
+
sandbox:
|
54
|
+
- allow-scripts
|
55
|
+
- allow-popups
|
56
|
+
allow:
|
57
|
+
- fullscreen
|
58
|
+
- gyroscope
|
59
|
+
- picture-in-picture
|
60
|
+
- clipboard-write
|
61
|
+
loading: 'lazy'
|
62
|
+
controls: true
|
63
|
+
rel:
|
64
|
+
- noopener
|
65
|
+
- noreferrer
|
66
|
+
target: _blank
|
30
67
|
```
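According to `lib/jekyll/embed.rb`, these settings are merged over the plugin's `DEFAULT_CONFIG` with `Jekyll::Utils.deep_merge_hashes`. The following is a minimal sketch of that idea in plain Ruby. `deep_merge`, `DEFAULTS`, `SITE_CONFIG` and `CONFIG` are hypothetical names written for illustration, and Jekyll's own helper may differ in details such as how it treats `nil` values.

```ruby
# Trimmed-down copy of the defaults, for illustration only.
DEFAULTS = {
  'scrub' => %w[form input],
  'attributes' => {
    'referrerpolicy' => 'strict-origin-when-cross-origin',
    'sandbox' => %w[allow-scripts allow-popups],
    'loading' => 'lazy'
  }
}.freeze

# Hypothetical stand-in for a deep merge: nested hashes are merged
# key by key, any other value (including arrays) is replaced.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# What a site might put under `embed:` in _config.yml.
SITE_CONFIG = { 'attributes' => { 'sandbox' => %w[allow-scripts] } }.freeze

CONFIG = deep_merge(DEFAULTS, SITE_CONFIG)

puts CONFIG['attributes']['sandbox'].inspect # ["allow-scripts"]
puts CONFIG['attributes']['referrerpolicy']  # strict-origin-when-cross-origin
```

Under this sketch, overriding one attribute leaves the other defaults in place, while a list such as `sandbox` is replaced rather than appended to.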
|
31
68
|
|
32
69
|
Then, when you want to embed an URL (like a video) in a post, simply
|
@@ -49,6 +86,81 @@ paragraphs but it needs to be in its own block of text.
|
|
49
86
|
**Another note:** [Invidious doesn't support OEmbed
|
50
87
|
yet](https://github.com/omarroth/invidious/issues/1222) :P
|
51
88
|
|
89
|
+
## Themes
|
90
|
+
|
91
|
+
You can also use it as a Liquid filter, for instance:
|
92
|
+
|
93
|
+
```html
|
94
|
+
{{ page.embed_url | embed }}
|
95
|
+
```
|
96
|
+
|
97
|
+
The `embed` filter takes an URL and replaces it for the HTML. Other
|
98
|
+
filters are `oembed`, `ogp` and `fallback`.
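In `lib/jekyll/embed.rb`, `embed` tries the three strategies in order and keeps the first one that returns something other than `nil`. A minimal sketch of that chain follows; the `try_oembed`, `try_ogp`, `render_fallback` and `embed_html` methods are hypothetical stand-ins for the real, network-backed methods.

```ruby
# Hypothetical stand-ins: each strategy returns HTML or nil.
def try_oembed(url)
  %(<iframe src="#{url}"></iframe>) if url.include?('video.example')
end

def try_ogp(url)
  %(<article class="ogp">#{url}</article>) if url.include?('blog.example')
end

# The last strategy always produces something.
def render_fallback(url)
  %(<article class="fallback">#{url}</article>)
end

# Same shape as `oembed(url) || ogp(url) || fallback(url)` in the source.
def embed_html(url)
  try_oembed(url) || try_ogp(url) || render_fallback(url)
end

puts embed_html('https://video.example/watch/1')
puts embed_html('https://blog.example/post')
puts embed_html('https://plain.example/')
```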
|
99
|
+
|
100
|
+
### Templates
|
101
|
+
|
102
|
+
You can modify the templates by providing your own include files,
|
103
|
+
`_includes/ogp.html` and `_includes/fallback.html`. We don't add any
|
104
|
+
CSS so you can develop your own.
|
105
|
+
|
106
|
+
To access default includes, run `bundle show jekyll-embed-urls` and copy
|
107
|
+
the files from the `_includes` directory to your site.
|
108
|
+
|
109
|
+
## Facebook and Instagram
|
110
|
+
|
111
|
+
Facebook deprecated their OEmbed API and now a token is required for
|
112
|
+
embedding Facebook and Instagram URLs. Set it as an environment
|
113
|
+
variable named `OEMBED_FACEBOOK_TOKEN`.
|
114
|
+
|
115
|
+
If you don't have it, this plugin makes a best-effort attempt. Instagram
|
116
|
+
will be available through OGP, but their image URLs expire after
|
117
|
+
a certain time, so your site may appear broken after a while. We could
|
118
|
+
download them but we decided not to because it may infringe on
|
119
|
+
intellectual property laws and personal rights such as privacy, and
|
120
|
+
consequently put our service at risk.
|
121
|
+
|
122
|
+
It's our position that there are legitimate uses for downloading remote
|
123
|
+
media, such as for archiving collective memory (police brutality, public
|
124
|
+
figures' speeches, etc.) that may be removed without notice.
|
125
|
+
|
126
|
+
In these cases our recommendation is always not to host with corporate
|
127
|
+
services, since they don't share our politics and actively work against
|
128
|
+
us.
|
129
|
+
|
130
|
+
We're hotlinking and copying text though, assuming that falls under fair
|
131
|
+
use rights.
|
132
|
+
|
133
|
+
## Tracking prevention
|
134
|
+
|
135
|
+
Anti-tracking techniques implemented are:
|
136
|
+
|
137
|
+
* `<script>` and other tags are removed. No external JS is loaded in
|
138
|
+
a local context.
|
139
|
+
|
140
|
+
* `<form>`s and their elements are removed.
|
141
|
+
|
142
|
+
* `<canvas>`, `<area>`, `<map>` are removed.
|
143
|
+
|
144
|
+
* `<iframe>`s are
|
145
|
+
[sandboxed](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox).
|
146
|
+
|
147
|
+
* `<img>`s are lazy loaded. This is not strictly anti-tracking but
|
148
|
+
images are loaded when needed.
|
149
|
+
|
150
|
+
* All URLs get their tracking params removed by
|
151
|
+
[UrlPrivacy](https://0xacab.org/sutty/url-privacy)
|
152
|
+
|
153
|
+
* [Referrer
|
154
|
+
Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy)
|
155
|
+
is implemented for supported elements. Strangely, `<video>` and
|
156
|
+
`<audio>` don't seem to support it.
|
157
|
+
|
158
|
+
* External links open in a new tab and have `rel="noopener noreferrer"`
|
159
|
+
to prevent [reverse
|
160
|
+
tabnabbing](https://owasp.org/www-community/attacks/Reverse_Tabnabbing).
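The plugin delegates URL cleaning to the UrlPrivacy gem through `UrlPrivacy.clean`. To illustrate the general idea only, here is a sketch using Ruby's standard `uri` library. `strip_tracking` and the `TRACKING_PREFIXES` list are assumptions made for this example and do not reflect UrlPrivacy's actual rules.

```ruby
require 'uri'

# Assumed list of tracking parameter prefixes, for illustration only.
TRACKING_PREFIXES = %w[utm_ fbclid gclid].freeze

# Hypothetical helper: drop query parameters whose name starts with
# one of the prefixes above and keep everything else.
def strip_tracking(url)
  uri = URI.parse(url)
  return url unless uri.query

  kept = URI.decode_www_form(uri.query).reject do |name, _value|
    TRACKING_PREFIXES.any? { |prefix| name.start_with?(prefix) }
  end

  uri.query = kept.empty? ? nil : URI.encode_www_form(kept)
  uri.to_s
end

puts strip_tracking('https://example.org/watch?v=abc&utm_source=feed&fbclid=123')
# https://example.org/watch?v=abc
```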
|
161
|
+
|
162
|
+
If you find more useful techniques, please [open an issue
|
163
|
+
report](https://0xacab.org/sutty/jekyll/jekyll-embed-urls/-/issues).
|
52
164
|
|
53
165
|
## Contributing
|
54
166
|
|
data/_includes/fallback.html
ADDED
@@ -0,0 +1,17 @@
|
|
1
|
+
<article class="fallback">
|
2
|
+
{%- if page.image -%}
|
3
|
+
<img referrerpolicy="{{ embed.referrerpolicy }}" loading="{{ embed.loading }}" src="{{ page.image }}" class="img-fluid" />
|
4
|
+
{%- endif -%}
|
5
|
+
|
6
|
+
<h1>{{ page.title }}</h1>
|
7
|
+
<p class="lead">{{ page.description }}</p>
|
8
|
+
<p><small>
|
9
|
+
<a
|
10
|
+
href="{{ page.url }}"
|
11
|
+
referrerpolicy="{{ embed.referrerpolicy }}"
|
12
|
+
target="{{ embed.target }}"
|
13
|
+
rel="{{ embed.rel }}">
|
14
|
+
{{ page.url }}
|
15
|
+
</a>
|
16
|
+
</small></p>
|
17
|
+
</article>
|
data/_includes/ogp.html
ADDED
@@ -0,0 +1,23 @@
|
|
1
|
+
<article class="ogp" lang="{{ page.locale }}">
|
2
|
+
{%- if page.video -%}
|
3
|
+
<video poster="{{ page.image }}" class="img-fluid" {{ embed.controls }} src="{{ page.video }}"/>
|
4
|
+
{%- elsif page.image -%}
|
5
|
+
<img referrerpolicy="{{ embed.referrerpolicy }}" loading="{{ embed.loading }}" src="{{ page.image }}" class="img-fluid" />
|
6
|
+
{%- endif -%}
|
7
|
+
|
8
|
+
{%- if page.audio -%}
|
9
|
+
<audio class="img-fluid" {{ embed.controls }} src="{{ page.audio }}"/>
|
10
|
+
{%- endif -%}
|
11
|
+
|
12
|
+
<h1>{{ page.title }}</h1>
|
13
|
+
<p class="lead">{{ page.description }}</p>
|
14
|
+
<p><small>
|
15
|
+
<a
|
16
|
+
href="{{ page.url }}"
|
17
|
+
referrerpolicy="{{ embed.referrerpolicy }}"
|
18
|
+
target="{{ embed.target }}"
|
19
|
+
rel="{{ embed.rel }}">
|
20
|
+
{{ page.url }}
|
21
|
+
</a>
|
22
|
+
</small></p>
|
23
|
+
</article>
|
data/lib/jekyll-embed-urls.rb
CHANGED
@@ -1,90 +1,9 @@
|
|
1
|
-
|
2
|
-
require 'cgi'
|
3
|
-
require 'oga'
|
1
|
+
# frozen_string_literal: true
|
4
2
|
|
5
|
-
|
6
|
-
|
7
|
-
# to set their own size, also send metrics. So they won't work on a
|
8
|
-
# sandboxed iframe, which we were expecting, but they also won't be
|
9
|
-
# comfortable for visitors to use. We're planning on using OGP and
|
10
|
-
# render our own partials (configurable) for this. This way everything
|
11
|
-
# is safer and the embedded content even adapts to the site's design.
|
12
|
-
#
|
13
|
-
# So, expect a major refactoring!
|
3
|
+
require_relative 'jekyll/embed'
|
4
|
+
require_relative 'jekyll/embed/filter'
|
14
5
|
|
15
|
-
|
16
|
-
|
17
|
-
|
18
|
-
|
19
|
-
# Process the content of documents before rendering them to find URLs in
|
20
|
-
# a block.
|
21
|
-
Jekyll::Hooks.register :site, :pre_render do |site|
|
22
|
-
# Cache results
|
23
|
-
cache ||= Jekyll::Cache.new('Jekyll::OEmbed::Urls')
|
24
|
-
# TODO: Make configurable
|
25
|
-
referrer_policy = 'strict-origin-when-cross-origin'
|
26
|
-
|
27
|
-
# Only modify documents to be written
|
28
|
-
site.docs_to_write.each do |doc|
|
29
|
-
# Skip text paragraphs
|
30
|
-
# XXX: Find link in first line
|
31
|
-
next unless %r{\n\n\s*<?https?://} =~ doc.content
|
32
|
-
|
33
|
-
# Split texts by markdown blocks
|
34
|
-
doc.content = doc.content.split("\n\n").map do |p|
|
35
|
-
# Only process lines with URLs
|
36
|
-
next p unless %r{\A\s*<?https?://} =~ p
|
37
|
-
# Remove empty characters and markdown autolinks
|
38
|
-
p = p.strip.tr('<', '').tr('>', '')
|
39
|
-
|
40
|
-
# @see {https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox}
|
41
|
-
same_origin = p.start_with? site.config['url']
|
42
|
-
|
43
|
-
Jekyll.logger.debug "Finding OEmbed content for #{p}"
|
44
|
-
# Cache the results
|
45
|
-
cache.getset(p) do
|
46
|
-
Jekyll.logger.debug "=> Not cached, obtaining..."
|
47
|
-
|
48
|
-
result = OEmbed::Providers.get(p)
|
49
|
-
sandbox = "allow-scripts allow-popups #{same_origin ? '' : 'allow-same-origin'}"
|
50
|
-
|
51
|
-
# If the embed HTML contains an iframe, make sure it has the
|
52
|
-
# correct attributes.
|
53
|
-
if %r{<iframe } =~ result.html
|
54
|
-
html = Oga.parse_html result.html
|
55
|
-
|
56
|
-
html.css('iframe').each do |iframe|
|
57
|
-
iframe.attributes.delete_if do |attr|
|
58
|
-
%w[width height].include? attr.name
|
59
|
-
end
|
60
|
-
|
61
|
-
iframe.attributes << Oga::XML::Attribute.new(name: 'sandbox', value: sandbox)
|
62
|
-
iframe.attributes << Oga::XML::Attribute.new(name: 'referrerpolicy', value: referrer_policy)
|
63
|
-
end
|
64
|
-
|
65
|
-
html.to_xml
|
66
|
-
else
|
67
|
-
# Return a sandboxed iframe with the size of the HTML. We
|
68
|
-
# only allow scripts to run inside the iframe and nothing
|
69
|
-
# else.
|
70
|
-
<<~IFRAME
|
71
|
-
<iframe
|
72
|
-
referrerpolicy="#{referrer_policy}"
|
73
|
-
sandbox="#{sandbox}"
|
74
|
-
style="min-width:#{result.width}px;min-height:#{result.height || 0}px"
|
75
|
-
srcdoc="#{CGI.escape_html result.html}"></iframe>
|
76
|
-
IFRAME
|
77
|
-
end
|
78
|
-
rescue OEmbed::Error => e
|
79
|
-
# If the URL doesn't support OEmbed just return an external
|
80
|
-
# link.
|
81
|
-
#
|
82
|
-
# TODO: Fetch information with OGP and render a template.
|
83
|
-
Jekyll.logger.warn "#{p} is not oembeddable or URL can't be fetched, showing as URL: #{e}"
|
84
|
-
|
85
|
-
%(<p><a href="#{p}" target="_blank" referrerpolicy="#{referrer_policy}">#{p}</a></p>)
|
86
|
-
end
|
87
|
-
# Rebuild the content
|
88
|
-
end.join("\n\n")
|
89
|
-
end
|
6
|
+
# Configure Embed
|
7
|
+
Jekyll::Hooks.register :site, :after_init do |site|
|
8
|
+
Jekyll::Embed.site = site
|
90
9
|
end
|
data/lib/jekyll/embed.rb
ADDED
@@ -0,0 +1,329 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require 'faraday'
|
4
|
+
require 'faraday-http-cache'
|
5
|
+
require 'faraday_middleware/response/follow_redirects'
|
6
|
+
require 'loofah'
|
7
|
+
require 'ogp'
|
8
|
+
require 'url_privacy'
|
9
|
+
|
10
|
+
require_relative 'embed/oembed'
|
11
|
+
require_relative 'embed/cache'
|
12
|
+
|
13
|
+
if Gem::Version.new(Jekyll::VERSION) >= Gem::Version.new('4.2.0')
|
14
|
+
require_relative 'embed/content'
|
15
|
+
else
|
16
|
+
Jekyll.logger.warn "Upgrade to Jekyll >= 4.2.0 to embed URLs in content"
|
17
|
+
end
|
18
|
+
|
19
|
+
OEmbed::Providers.register_all
|
20
|
+
OEmbed::Providers.register_fallback(OEmbed::ProviderDiscovery,
|
21
|
+
OEmbed::Providers::Noembed)
|
22
|
+
|
23
|
+
module Jekyll
|
24
|
+
# The idea with this class is to find the best safe representation of
|
25
|
+
# a link. For a YouTube video it could be the sandboxed iframe. This
|
26
|
+
# loads the video and allows you to play it while preventing YT
|
27
|
+
# from calling home and sending data about your users. But other social networks
|
28
|
+
# will try to take control of their containers by modifying the page.
|
29
|
+
# They resist sandboxing and don't work correctly. For them, we
|
30
|
+
# cleanup unwanted HTML tags such as <script>, and return the HTML,
|
31
|
+
# which you can style using CSS. Twitter does this.
|
32
|
+
#
|
33
|
+
# Others are only available through OGP, so we retrieve the metadata
|
34
|
+
# and render a template, which you can provide in your own theme too.
|
35
|
+
#
|
36
|
+
# We also try for microformats and we would look at Schema.org too but
|
37
|
+
# there doesn't seem to be a gem for it yet.
|
38
|
+
#
|
39
|
+
# If the URL doesn't provide anything at all we get the URL, title and
|
40
|
+
# date of last visit.
|
41
|
+
#
|
42
|
+
# Isn't it nice that the corporations that require us to use OEmbed,
|
43
|
+
# OGP, Twitter Cards, Schema.org and other metadata, don't use them
|
44
|
+
# themselves?
|
45
|
+
#
|
46
|
+
# Also we're going to use heavy caching so we don't hit rate limits or
|
47
|
+
# lose the representation if the service is down or the URL is
|
48
|
+
# removed. We may be tempted to store the resources locally (images,
|
49
|
+
# videos, audio) but we have to take into account that people have
|
50
|
+
# legitimate reasons to remove media from the Internet.
|
51
|
+
class Embed
|
52
|
+
# Attributes to apply by HTMLElement
|
53
|
+
IFRAME_ATTRIBUTES = %w[allow sandbox referrerpolicy loading].freeze
|
54
|
+
IMAGE_ATTRIBUTES = %w[referrerpolicy loading].freeze
|
55
|
+
MEDIA_ATTRIBUTES = %w[controls].freeze
|
56
|
+
A_ATTRIBUTES = %w[referrerpolicy rel target].freeze
|
57
|
+
|
58
|
+
# Templates
|
59
|
+
INCLUDE_OGP = '{% include ogp.html site=site page=page %}'
|
60
|
+
INCLUDE_FALLBACK = '{% include fallback.html site=site page=page %}'
|
61
|
+
|
62
|
+
# The default referrer policy only sends the origin URL (not the
|
63
|
+
# full URL, only the protocol/scheme and domain part) if the remote
|
64
|
+
# URL is HTTPS.
|
65
|
+
#
|
66
|
+
# @see {https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy}
|
67
|
+
#
|
68
|
+
# The default sandbox restrictions only allow scripts in the context
|
69
|
+
# of the iframe and opening new tabs.
|
70
|
+
#
|
71
|
+
# @see {https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox}
|
72
|
+
DEFAULT_CONFIG = {
|
73
|
+
'scrub' => %w[form input textarea button fieldset select option optgroup canvas area map],
|
74
|
+
'attributes' => {
|
75
|
+
'referrerpolicy' => 'strict-origin-when-cross-origin',
|
76
|
+
'sandbox' => %w[allow-scripts allow-popups],
|
77
|
+
'allow' => %w[fullscreen gyroscope picture-in-picture clipboard-write],
|
78
|
+
'loading' => 'lazy',
|
79
|
+
'controls' => true,
|
80
|
+
'rel' => %w[noopener noreferrer],
|
81
|
+
'target' => '_blank'
|
82
|
+
}
|
83
|
+
}
|
84
|
+
|
85
|
+
class << self
|
86
|
+
def site
|
87
|
+
unless @site
|
88
|
+
raise Jekyll::Errors::InvalidConfigurationError,
|
89
|
+
"Site is missing, configure with `Jekyll::Embed.site = site`"
|
90
|
+
end
|
91
|
+
|
92
|
+
@site
|
93
|
+
end
|
94
|
+
|
95
|
+
# This is an initializer of sorts
|
96
|
+
#
|
97
|
+
# @param [Jekyll::Site]
|
98
|
+
# @return [Jekyll::Site]
|
99
|
+
def site=(site)
|
100
|
+
raise ArgumentError, "Site must be a Jekyll::Site" unless site.is_a? Jekyll::Site
|
101
|
+
|
102
|
+
@site = site
|
103
|
+
|
104
|
+
# Add the _includes dir so we can provide default templates that
|
105
|
+
# can be overridden locally or by the theme.
|
106
|
+
site.includes_load_paths << File.expand_path(File.join(__dir__, '..', '..', '_includes'))
|
107
|
+
# Since we're embedding, we're allowing iframes
|
108
|
+
Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2 << 'iframe'
|
109
|
+
|
110
|
+
# Other elements that are disallowed
|
111
|
+
config['scrub']&.each do |scrub|
|
112
|
+
Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2.delete(scrub)
|
113
|
+
end
|
114
|
+
|
115
|
+
payload['embed'] = config['attributes']
|
116
|
+
|
117
|
+
site
|
118
|
+
end
|
119
|
+
|
120
|
+
# Render the URL as HTML
|
121
|
+
#
|
122
|
+
# 1. Try oembed for video and image
|
123
|
+
# 2. If rich oembed, cleanup
|
124
|
+
# 3. If OGP, render templates
|
125
|
+
# 4. Else, render fallback template
|
126
|
+
#
|
127
|
+
# @param [String] URL
|
128
|
+
# @return [String] HTML
|
129
|
+
def embed(url)
|
130
|
+
url.strip!
|
131
|
+
|
132
|
+
# Quick check
|
133
|
+
raise URI::Error unless url.start_with? 'http'
|
134
|
+
|
135
|
+
# Just to verify the URL is valid
|
136
|
+
URI.parse url
|
137
|
+
|
138
|
+
oembed(url) || ogp(url) || fallback(url)
|
139
|
+
rescue URI::Error
|
140
|
+
Jekyll.logger.warn "#{url.inspect} is not a valid URL"
|
141
|
+
|
142
|
+
url
|
143
|
+
end
|
144
|
+
|
145
|
+
# @return [Hash]
|
146
|
+
def config
|
147
|
+
@config ||= Jekyll::Utils.deep_merge_hashes(DEFAULT_CONFIG, (site.config['embed'] || {}))
|
148
|
+
end
|
149
|
+
|
150
|
+
# Try for OEmbed
|
151
|
+
#
|
152
|
+
# @param [String] URL
|
153
|
+
# @return [String,NilClass] Sanitized HTML or nil
|
154
|
+
def oembed(url)
|
155
|
+
cache.getset(url) do
|
156
|
+
oembed = OEmbed::Providers.get url
|
157
|
+
|
158
|
+
# Prevent caching of nil?
|
159
|
+
raise OEmbed::Error unless oembed.respond_to? :html
|
160
|
+
|
161
|
+
# Cleanup. We don't allow running remote scripts locally,
|
162
|
+
# period.
|
163
|
+
cleanup(Loofah.fragment(oembed.html).scrub!(:prune), url).to_s
|
164
|
+
end
|
165
|
+
rescue OEmbed::Error
|
166
|
+
nil
|
167
|
+
end
|
168
|
+
|
169
|
+
# Try for OGP.
|
170
|
+
# @param [String] URL
|
171
|
+
# @return [String,NilClass]
|
172
|
+
def ogp(url)
|
173
|
+
cache.getset(url) do
|
174
|
+
ogp = OGP::OpenGraph.new get(url).body
|
175
|
+
context = info.dup
|
176
|
+
context[:registers][:page] = payload['page'] = ogp.data
|
177
|
+
|
178
|
+
ogp_template.render! payload, context
|
179
|
+
end
|
180
|
+
rescue OGP::MalformedSourceError, OGP::MissingAttributeError, Faraday::Error
|
181
|
+
nil
|
182
|
+
end
|
183
|
+
|
184
|
+
# Try something
|
185
|
+
def fallback(url)
|
186
|
+
cache.getset(url) do
|
187
|
+
html = Nokogiri::HTML.fragment get(url).body
|
188
|
+
element = html.css('article').first
|
189
|
+
element ||= html.css('section').first
|
190
|
+
element ||= html.css('main').first
|
191
|
+
element ||= html.css('body').first
|
192
|
+
title = html.css('title').first
|
193
|
+
description = html.css('meta[name="description"]').first
|
194
|
+
|
195
|
+
context = info.dup
|
196
|
+
context[:registers][:page] = payload['page'] = {
|
197
|
+
'title' => text(title),
|
198
|
+
'description' => text(description),
|
199
|
+
'url' => url,
|
200
|
+
'image' => element.css('img').first&.public_send(:[], 'src')
|
201
|
+
}
|
202
|
+
|
203
|
+
fallback_template.render! payload, context
|
204
|
+
end
|
205
|
+
rescue Faraday::Error, Nokogiri::SyntaxError
|
206
|
+
nil
|
207
|
+
end
|
208
|
+
|
209
|
+
# @param [String] URL
|
210
|
+
# @return [Faraday::Response]
|
211
|
+
def get(url)
|
212
|
+
@get_cache ||= {}
|
213
|
+
@get_cache[url] ||= http_client.get url
|
214
|
+
end
|
215
|
+
|
216
|
+
# @return [Jekyll::Embed::Cache]
|
217
|
+
def cache
|
218
|
+
@cache ||= Jekyll::Embed::Cache.new('Jekyll::Embed')
|
219
|
+
end
|
220
|
+
|
221
|
+
# @return [Faraday::Connection]
|
222
|
+
def http_client
|
223
|
+
@http_client ||= Faraday.new do |builder|
|
224
|
+
builder.use FaradayMiddleware::FollowRedirects
|
225
|
+
builder.use :http_cache, shared_cache: false, store: cache, serializer: Marshal
|
226
|
+
end
|
227
|
+
end
|
228
|
+
|
229
|
+
def cleanup(html, url)
|
230
|
+
# Add our own attributes
|
231
|
+
html.css('iframe').each do |iframe|
|
232
|
+
IFRAME_ATTRIBUTES.each do |attr|
|
233
|
+
iframe[attr] = value_for_attr(attr)
|
234
|
+
end
|
235
|
+
|
236
|
+
# Embedding itself requires allow-same-origin
|
237
|
+
iframe['sandbox'] += allow_same_origin(url)
|
238
|
+
end
|
239
|
+
|
240
|
+
html.css('audio, video').each do |media|
|
241
|
+
MEDIA_ATTRIBUTES.each do |attr|
|
242
|
+
media[attr] = value_for_attr(attr)
|
243
|
+
end
|
244
|
+
|
245
|
+
media['src'] = UrlPrivacy.clean media['src']
|
246
|
+
end
|
247
|
+
|
248
|
+
html.css('img').each do |img|
|
249
|
+
IMAGE_ATTRIBUTES.each do |attr|
|
250
|
+
img[attr] = value_for_attr(attr)
|
251
|
+
end
|
252
|
+
end
|
253
|
+
|
254
|
+
html.css('a').each do |a|
|
255
|
+
A_ATTRIBUTES.each do |attr|
|
256
|
+
a[attr] = value_for_attr(attr)
|
257
|
+
end
|
258
|
+
end
|
259
|
+
|
260
|
+
html.css('[src]').each do |element|
|
261
|
+
element['src'] = CGI.escapeHTML(UrlPrivacy.clean(CGI.unescapeHTML(element['src'])))
|
262
|
+
end
|
263
|
+
|
264
|
+
html.css('[href]').each do |element|
|
265
|
+
element['href'] = CGI.escapeHTML(UrlPrivacy.clean(CGI.unescapeHTML(element['href'])))
|
266
|
+
end
|
267
|
+
|
268
|
+
# Return the cleaned up HTML
|
269
|
+
html
|
270
|
+
end
|
271
|
+
|
272
|
+
def text(node)
|
273
|
+
node&.text&.tr("\n", '')&.tr("\r", '')&.strip&.squeeze(' ')
|
274
|
+
end
|
275
|
+
|
276
|
+
private
|
277
|
+
|
278
|
+
def fallback_template
|
279
|
+
@fallback_template ||= site.liquid_renderer.file('fallback.html').parse(INCLUDE_FALLBACK)
|
280
|
+
end
|
281
|
+
|
282
|
+
def ogp_template
|
283
|
+
@ogp_template ||= site.liquid_renderer.file('ogp.html').parse(INCLUDE_OGP)
|
284
|
+
end
|
285
|
+
|
286
|
+
def info
|
287
|
+
@info ||= {
|
288
|
+
registers: { site: site },
|
289
|
+
strict_filters: site.config.dig('liquid', 'strict_filters'),
|
290
|
+
strict_variables: site.config.dig('liquid', 'strict_variables')
|
291
|
+
}
|
292
|
+
end
|
293
|
+
|
294
|
+
# @param [String]
|
295
|
+
# @return [String]
|
296
|
+
def value_for_attr(attr)
|
297
|
+
@value_for_attr ||= {}
|
298
|
+
@value_for_attr[attr] ||=
|
299
|
+
case (value = config.dig('attributes', attr))
|
300
|
+
when String then value
|
301
|
+
when Array then value.join(' ')
|
302
|
+
end
|
303
|
+
end
|
304
|
+
|
305
|
+
# If the iframe comes from the same site, we can allow the same
|
306
|
+
# origin policy on the sandbox.
|
307
|
+
#
|
308
|
+
# @param [String] URL
|
309
|
+
# @return [String]
|
310
|
+
def allow_same_origin(url)
|
311
|
+
unless site.config['url']
|
312
|
+
Jekyll.logger.warn "Add url to _config.yml to determine if the site can embed itself"
|
313
|
+
return ' allow-same-origin'
|
314
|
+
end
|
315
|
+
|
316
|
+
@allow_same_origin ||= {}
|
317
|
+
@allow_same_origin[url] ||= url.start_with?(site.config['url']) ? '' : ' allow-same-origin'
|
318
|
+
end
|
319
|
+
|
320
|
+
# Caches it because Jekyll::Site#site_payload returns a new object
|
321
|
+
# every time.
|
322
|
+
#
|
323
|
+
# @return [Jekyll::Drops::UnifiedPayloadDrop]
|
324
|
+
def payload
|
325
|
+
@payload ||= site.site_payload
|
326
|
+
end
|
327
|
+
end
|
328
|
+
end
|
329
|
+
end
|
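As a reading aid for `value_for_attr` and `allow_same_origin` above, this sketch shows how the `sandbox` value for an iframe ends up being composed: configured values are joined with spaces, and `allow-same-origin` is appended only when the embedded URL does not start with the site's own URL. `SITE_URL`, `ATTRIBUTES`, `attr_value` and `sandbox_for` are hypothetical names for this example.

```ruby
SITE_URL = 'https://mysite.example'

ATTRIBUTES = {
  'referrerpolicy' => 'strict-origin-when-cross-origin',
  'sandbox' => %w[allow-scripts allow-popups]
}.freeze

# Strings pass through, arrays are joined with spaces; anything else
# falls through the case and yields nil.
def attr_value(name)
  case (value = ATTRIBUTES[name])
  when String then value
  when Array then value.join(' ')
  end
end

# Append allow-same-origin only for URLs outside the site itself.
def sandbox_for(url)
  extra = url.start_with?(SITE_URL) ? '' : ' allow-same-origin'
  attr_value('sandbox') + extra
end

puts sandbox_for('https://video.example/embed/1')
# allow-scripts allow-popups allow-same-origin
puts sandbox_for('https://mysite.example/about/')
# allow-scripts allow-popups
```

Note that with a `case` of this shape, a value that is neither a string nor an array, such as `true`, yields `nil`.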
data/lib/jekyll/embed/cache.rb
ADDED
@@ -0,0 +1,18 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Jekyll
|
4
|
+
class Embed
|
5
|
+
# Jekyll cache that behaves like ActiveSupport::Cache
|
6
|
+
class Cache < Jekyll::Cache
|
7
|
+
def write(key, value)
|
8
|
+
self[key] = value
|
9
|
+
end
|
10
|
+
|
11
|
+
def read(key)
|
12
|
+
self[key]
|
13
|
+
rescue
|
14
|
+
nil
|
15
|
+
end
|
16
|
+
end
|
17
|
+
end
|
18
|
+
end
|
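The `Cache` subclass above adds `read` and `write` on top of `Jekyll::Cache`, and `lib/jekyll/embed.rb` passes an instance of it to the HTTP cache middleware as `store:`. A self-contained sketch of the same adapter idea follows, using a hypothetical `RaisingStore` in place of `Jekyll::Cache` (assumed here to raise when a key is missing).

```ruby
# Hypothetical backing store that raises on a missing key.
class RaisingStore
  def initialize
    @data = {}
  end

  def [](key)
    @data.fetch(key) # raises KeyError when the key is absent
  end

  def []=(key, value)
    @data[key] = value
  end
end

# Adapter exposing a read/write interface on top of [] and []=.
class ReadWriteStore < RaisingStore
  def write(key, value)
    self[key] = value
  end

  # A missing key becomes nil instead of an exception.
  def read(key)
    self[key]
  rescue KeyError
    nil
  end
end

STORE = ReadWriteStore.new
STORE.write('https://example.org/', '<html></html>')
puts STORE.read('https://example.org/')             # <html></html>
puts STORE.read('https://missing.example/').inspect # nil
```

The sketch rescues `KeyError` specifically, whereas the source uses a bare `rescue`, which catches any `StandardError`.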
data/lib/jekyll/embed/content.rb
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require 'cgi'
|
4
|
+
|
5
|
+
module Jekyll
|
6
|
+
class Embed
|
7
|
+
class Content
|
8
|
+
URL_RE = /<p[^>]*>[\s\n]*(?<url>https?:\/\/[^<\s\n]+)[\s\n]*<\/p>/m.freeze
|
9
|
+
|
10
|
+
class << self
|
11
|
+
# Find URLs on paragraphs. We do it after rendering because
|
12
|
+
# sometimes we use HTML instead of pure Markdown and this way we
|
13
|
+
# catch both.
|
14
|
+
def embed!(content)
|
15
|
+
URL_RE.match(content) do |match|
|
16
|
+
embed = Jekyll::Embed.embed CGI.unescapeHTML(match[:url])
|
17
|
+
|
18
|
+
content.sub! URL_RE, embed
|
19
|
+
end
|
20
|
+
end
|
21
|
+
end
|
22
|
+
end
|
23
|
+
end
|
24
|
+
end
|
25
|
+
|
26
|
+
Jekyll::Hooks.register :posts, :post_convert do |post|
|
27
|
+
Jekyll::Embed::Content.embed! post.content
|
28
|
+
end
|
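To make the behaviour of `URL_RE` concrete, the sketch below runs the same pattern over rendered HTML with a stub in place of `Jekyll::Embed.embed`. It uses `gsub` with a block so that every URL-only paragraph is replaced; whether the single `match`/`sub!` in the source is meant to handle only the first one is not stated, so treat this as an illustrative variant. `embed_all`, `STUB` and `RENDERED` are hypothetical names.

```ruby
require 'cgi'

URL_RE = %r{<p[^>]*>[\s\n]*(?<url>https?://[^<\s\n]+)[\s\n]*</p>}m.freeze

# Hypothetical variant: replace every paragraph that contains only a URL.
# The block form inserts the result literally, so backslashes in the
# embed HTML are not treated as back-references.
def embed_all(content, embedder)
  content.gsub(URL_RE) do
    embedder.call(CGI.unescapeHTML(Regexp.last_match[:url]))
  end
end

# Stub embedder standing in for the real lookup.
STUB = ->(url) { %(<article class="fallback">#{url}</article>) }

RENDERED = <<~CONTENT
  <p>Intro with a link to https://example.org/inline inside text.</p>
  <p>https://example.org/a?x=1&amp;y=2</p>
  <p>
    https://example.org/b
  </p>
CONTENT

puts embed_all(RENDERED, STUB)
```

Only the second and third paragraphs match, because the pattern requires the URL to be the sole content of the `<p>` element.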
data/lib/jekyll/embed/filter.rb
ADDED
@@ -0,0 +1,35 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
module Jekyll
|
4
|
+
class Embed
|
5
|
+
module Filter
|
6
|
+
# This filter takes the URL passed as input and returns its HTML
|
7
|
+
# representation. Embed takes care of everything else.
|
8
|
+
def embed(url)
|
9
|
+
return url unless url.is_a? String
|
10
|
+
|
11
|
+
Embed.embed url
|
12
|
+
end
|
13
|
+
|
14
|
+
def oembed(url)
|
15
|
+
return url unless url.is_a? String
|
16
|
+
|
17
|
+
Embed.oembed url
|
18
|
+
end
|
19
|
+
|
20
|
+
def ogp(url)
|
21
|
+
return url unless url.is_a? String
|
22
|
+
|
23
|
+
Embed.ogp url
|
24
|
+
end
|
25
|
+
|
26
|
+
def fallback(url)
|
27
|
+
return url unless url.is_a? String
|
28
|
+
|
29
|
+
Embed.fallback url
|
30
|
+
end
|
31
|
+
end
|
32
|
+
end
|
33
|
+
end
|
34
|
+
|
35
|
+
Liquid::Template.register_filter(Jekyll::Embed::Filter)
|
data/lib/jekyll/embed/oembed.rb
ADDED
@@ -0,0 +1,19 @@
|
|
1
|
+
# frozen_string_literal: true
|
2
|
+
|
3
|
+
require 'oembed'
|
4
|
+
|
5
|
+
module OEmbed
|
6
|
+
module ProviderDecorator
|
7
|
+
def self.included(base)
|
8
|
+
base.class_eval do
|
9
|
+
def http_get(url, _)
|
10
|
+
Jekyll::Embed.get(url.to_s).body
|
11
|
+
rescue Faraday::Error
|
12
|
+
raise OEmbed::UnknownResponse
|
13
|
+
end
|
14
|
+
end
|
15
|
+
end
|
16
|
+
end
|
17
|
+
end
|
18
|
+
|
19
|
+
OEmbed::Provider.include OEmbed::ProviderDecorator
|
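The decorator above replaces the provider's `http_get` so that OEmbed requests go through `Jekyll::Embed.get`. The mechanism, redefining a method from an `included` hook, can be sketched in plain Ruby; `Provider`, `CachedFetch` and `FETCHER` here are hypothetical stand-ins, not classes from the oembed library.

```ruby
# Hypothetical stand-in for a library class with its own HTTP method.
class Provider
  def http_get(url, _options)
    "direct:#{url}"
  end

  def get(url)
    http_get(url, {})
  end
end

# Hypothetical shared fetcher playing the role of a cached client.
FETCHER = ->(url) { "cached:#{url}" }

module CachedFetch
  def self.included(base)
    # Redefine the method on the including class itself; a plain
    # module method would sit behind the class's own definition.
    base.class_eval do
      def http_get(url, _options)
        FETCHER.call(url.to_s)
      end
    end
  end
end

Provider.include CachedFetch

puts Provider.new.get('https://example.org/') # cached:https://example.org/
```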
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: jekyll-embed-urls
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.
|
4
|
+
version: 0.4.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- f
|
8
8
|
autorequire:
|
9
9
|
bindir: bin
|
10
10
|
cert_chain: []
|
11
|
-
date:
|
11
|
+
date: 2021-02-01 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: jekyll
|
@@ -30,28 +30,98 @@ dependencies:
|
|
30
30
|
requirements:
|
31
31
|
- - "~>"
|
32
32
|
- !ruby/object:Gem::Version
|
33
|
-
version: '0.
|
33
|
+
version: '0.15'
|
34
34
|
type: :runtime
|
35
35
|
prerelease: false
|
36
36
|
version_requirements: !ruby/object:Gem::Requirement
|
37
37
|
requirements:
|
38
38
|
- - "~>"
|
39
39
|
- !ruby/object:Gem::Version
|
40
|
-
version: '0.
|
40
|
+
version: '0.15'
|
41
41
|
- !ruby/object:Gem::Dependency
|
42
|
-
name:
|
42
|
+
name: loofah
|
43
43
|
requirement: !ruby/object:Gem::Requirement
|
44
44
|
requirements:
|
45
45
|
- - "~>"
|
46
46
|
- !ruby/object:Gem::Version
|
47
|
-
version: '2.
|
47
|
+
version: '2.9'
|
48
48
|
type: :runtime
|
49
49
|
prerelease: false
|
50
50
|
version_requirements: !ruby/object:Gem::Requirement
|
51
51
|
requirements:
|
52
52
|
- - "~>"
|
53
53
|
- !ruby/object:Gem::Version
|
54
|
-
version: '2.
|
54
|
+
version: '2.9'
|
55
|
+
- !ruby/object:Gem::Dependency
|
56
|
+
name: ogp
|
57
|
+
requirement: !ruby/object:Gem::Requirement
|
58
|
+
requirements:
|
59
|
+
- - "~>"
|
60
|
+
- !ruby/object:Gem::Version
|
61
|
+
version: '0.4'
|
62
|
+
type: :runtime
|
63
|
+
prerelease: false
|
64
|
+
version_requirements: !ruby/object:Gem::Requirement
|
65
|
+
requirements:
|
66
|
+
- - "~>"
|
67
|
+
- !ruby/object:Gem::Version
|
68
|
+
version: '0.4'
|
69
|
+
- !ruby/object:Gem::Dependency
|
70
|
+
name: faraday
|
71
|
+
requirement: !ruby/object:Gem::Requirement
|
72
|
+
requirements:
|
73
|
+
- - "~>"
|
74
|
+
- !ruby/object:Gem::Version
|
75
|
+
version: '1.3'
|
76
|
+
type: :runtime
|
77
|
+
prerelease: false
|
78
|
+
version_requirements: !ruby/object:Gem::Requirement
|
79
|
+
requirements:
|
80
|
+
- - "~>"
|
81
|
+
- !ruby/object:Gem::Version
|
82
|
+
version: '1.3'
|
83
|
+
- !ruby/object:Gem::Dependency
|
84
|
+
name: faraday-http-cache
|
85
|
+
requirement: !ruby/object:Gem::Requirement
|
86
|
+
requirements:
|
87
|
+
- - "~>"
|
88
|
+
- !ruby/object:Gem::Version
|
89
|
+
version: '2.2'
|
90
|
+
type: :runtime
|
91
|
+
prerelease: false
|
92
|
+
version_requirements: !ruby/object:Gem::Requirement
|
93
|
+
requirements:
|
94
|
+
- - "~>"
|
95
|
+
- !ruby/object:Gem::Version
|
96
|
+
version: '2.2'
|
97
|
+
- !ruby/object:Gem::Dependency
|
98
|
+
name: faraday_middleware
|
99
|
+
requirement: !ruby/object:Gem::Requirement
|
100
|
+
requirements:
|
101
|
+
- - "~>"
|
102
|
+
- !ruby/object:Gem::Version
|
103
|
+
version: '1'
|
104
|
+
type: :runtime
|
105
|
+
prerelease: false
|
106
|
+
version_requirements: !ruby/object:Gem::Requirement
|
107
|
+
requirements:
|
108
|
+
- - "~>"
|
109
|
+
- !ruby/object:Gem::Version
|
110
|
+
version: '1'
|
111
|
+
- !ruby/object:Gem::Dependency
|
112
|
+
name: url-privacy
|
113
|
+
requirement: !ruby/object:Gem::Requirement
|
114
|
+
requirements:
|
115
|
+
- - "~>"
|
116
|
+
- !ruby/object:Gem::Version
|
117
|
+
version: '0'
|
118
|
+
type: :runtime
|
119
|
+
prerelease: false
|
120
|
+
version_requirements: !ruby/object:Gem::Requirement
|
121
|
+
requirements:
|
122
|
+
- - "~>"
|
123
|
+
- !ruby/object:Gem::Version
|
124
|
+
version: '0'
|
55
125
|
description: Replaces URLs for their previsualization in Jekyll posts
|
56
126
|
email:
|
57
127
|
- f@sutty.nl
|
@@ -65,7 +135,14 @@ files:
|
|
65
135
|
- CHANGELOG.md
|
66
136
|
- LICENSE.txt
|
67
137
|
- README.md
|
138
|
+
- _includes/fallback.html
|
139
|
+
- _includes/ogp.html
|
68
140
|
- lib/jekyll-embed-urls.rb
|
141
|
+
- lib/jekyll/embed.rb
|
142
|
+
- lib/jekyll/embed/cache.rb
|
143
|
+
- lib/jekyll/embed/content.rb
|
144
|
+
- lib/jekyll/embed/filter.rb
|
145
|
+
- lib/jekyll/embed/oembed.rb
|
69
146
|
homepage: https://0xacab.org/sutty/jekyll/jekyll-embed-urls
|
70
147
|
licenses:
|
71
148
|
- GPL-3.0
|
@@ -88,9 +165,9 @@ require_paths:
|
|
88
165
|
- lib
|
89
166
|
required_ruby_version: !ruby/object:Gem::Requirement
|
90
167
|
requirements:
|
91
|
-
- - "
|
168
|
+
- - ">="
|
92
169
|
- !ruby/object:Gem::Version
|
93
|
-
version:
|
170
|
+
version: 2.6.0
|
94
171
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
95
172
|
requirements:
|
96
173
|
- - ">="
|