jekyll-embed-urls 0.3.3 → 0.4.3

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA256:
3
- metadata.gz: 5e8b338448bae54c12ee6c9119852b2d93859ddd032c242ace46c0515eb2a4eb
4
- data.tar.gz: fc960acd1bec7c2f189e7c46cf20739026395700827dd7b5f79c6937178c5c0a
3
+ metadata.gz: 938576c9cdee4a9b13de0e7d2d17983db2704f7882352d5a88010c66c8ddb122
4
+ data.tar.gz: e34c7de8e8d9b4b36d3b995888015362fd1a561db17cd3b4304bca252f12168c
5
5
  SHA512:
6
- metadata.gz: db3a88054e7716581d5b87dbf302138331e7e8b11571946d0389eb93c9e2cdc5d9b3e5212cfe7af941d7d9346893e69b80c26d5aabec717d12b0ff4cd94a69b9
7
- data.tar.gz: f5ef11ac9fa6b6ab8d0826cb0d024ff4a01de43645a7123e4b28832f29a7a61eef332004646618834fa80f2f7f2bb870efe074cd32200112364da3cc143c3adc
6
+ metadata.gz: 3c4cb82fb1f9accb6250dbe4b0460cde02867b7bf4b1befb93d2ca8c2f1c4bdb7204c9ceb30847bea1b1f7283be4d72f33137fcbc01c08bfce2ae93cda593cb7
7
+ data.tar.gz: 0e9f02488eb339c43c33c7a90f0192af3c2e069f038b2b399f6eba991423853589277e88369c0d920e03003957e2d26f3106b487db8b0b300e414abf02394625
data/CHANGELOG.md CHANGED
@@ -1,5 +1,29 @@
1
1
  # Changelog
2
2
 
3
+ ## v0.4.3
4
+
5
+ * Correctly use Feature Policy
6
+
7
+ ## v0.4.2
8
+
9
+ * Fix on v0.4.1
10
+
11
+ ## v0.4.1
12
+
13
+ * Don't fail if remote URL returns an empty body
14
+
15
+ ## v0.4.0
16
+
17
+ * Almost a complete rewrite.
18
+ * Does its best to prevent visitor tracking.
19
+ * Embed URLs with OEmbed, OGP and fallbacks to discover title and other
20
+ stuff.
21
+ * If using Jekyll >= 4.2.0, finds URLs in content and replaces for HTML
22
+ embed.
23
+ * Customizable templates.
24
+ * Uses [UrlPrivacy](https://0xacab.org/sutty/url-privacy) to prevent
25
+ tracking.
26
+
3
27
  ## v0.3.3
4
28
 
5
29
  * Add `allow-popups` to sandbox so you can open links in a new window.
data/README.md CHANGED
@@ -1,8 +1,14 @@
1
1
  # jekyll-embed-urls
2
2
 
3
- This plugin finds the URL previsualization and replaces them in posts by
4
- using [OEmbed](https://oembed.com/) and other techniques.
3
+ This plugin converts URLs to their previsualization by using
4
+ [OEmbed](https://oembed.com/) or [OGP](http://ogp.me/). It falls back to
5
+ showing a card with basic information.
5
6
 
7
+ While developing this plugin, we found out that OEmbed providers tend to
8
+ inject JavaScript and other ways of tracking users, so this plugin does
9
+ its best to prevent it.
10
+
11
+ For OGP and fallback, you can modify the templates.
6
12
 
7
13
  ## Installation
8
14
 
@@ -27,6 +33,37 @@ Add the plugin to your `_config.yml`:
27
33
  ```yaml
28
34
  plugins:
29
35
  - jekyll-embed-urls
36
+ embed:
37
+ # Extra elements to remove
38
+ scrub:
39
+ - form
40
+ - input
41
+ - textarea
42
+ - button
43
+ - fieldset
44
+ - select
45
+ - option
46
+ - optgroup
47
+ - canvas
48
+ - area
49
+ - map
50
+ # Attribute values can be strings or array of strings
51
+ attributes:
52
+ referrerpolicy: strict-origin-when-cross-origin
53
+ sandbox:
54
+ - allow-scripts
55
+ - allow-popups
56
+ allow:
57
+ - fullscreen;
58
+ - gyroscope;
59
+ - picture-in-picture;
60
+ - clipboard-write;
61
+ loading: 'lazy'
62
+ controls: true
63
+ rel:
64
+ - noopener
65
+ - noreferrer
66
+ target: _blank
30
67
  ```
31
68
 
32
69
  Then, when you want to embed an URL (like a video) in a post, simply
@@ -49,6 +86,90 @@ paragraphs but it needs to be in its own block of text.
49
86
  **Another note:** [Invidious doesn't support OEmbed
50
87
  yet](https://github.com/omarroth/invidious/issues/1222) :P
51
88
 
89
+ ## Themes
90
+
91
+ You can also use it as a Liquid filter, for instance:
92
+
93
+ ```html
94
+ {{ page.embed_url | embed }}
95
+ ```
96
+
97
+ The `embed` filter takes a URL and replaces it with its HTML embed. Other
98
+ filters are `oembed`, `ogp` and `fallback`.
99
+
100
+ ### Templates
101
+
102
+ You can modify the templates by providing your own include files,
103
+ `_includes/ogp.html` and `_includes/fallback.html`. We don't add any
104
+ CSS so you can develop your own.
105
+
106
+ To access default includes, run `bundle show jekyll-embed-urls` and copy
107
+ the files from the `_includes` directory to your site.
108
+
109
+ ## Facebook and Instagram
110
+
111
+ Facebook deprecated their OEmbed API and now a token is required for
112
+ embedding Facebook and Instagram URLs. Set it as an environment
113
+ variable named `OEMBED_FACEBOOK_TOKEN`.
114
+
115
+ If you don't have it, this plugin makes a best-effort attempt. Instagram
116
+ will be available through OGP, but their image URLs expire after
117
+ a certain time, so your site may appear broken after a while. We could
118
+ download them but we decided not to because it may infringe on
119
+ intellectual property laws and personal rights such as privacy, and
120
+ consequently put our service at risk.
121
+
122
+ It's our position that there're legitimate uses for downloading remote
123
+ media, such as for archiving collective memory (police brutality, public
124
+ figures speeches, etc.) that may be removed without notice.
125
+
126
+ In these cases our recommendation is always not to host with corporate
127
+ services, since they don't share our politics and actively work against
128
+ us.
129
+
130
+ We're hotlinking and copying text though, assuming that falls under fair
131
+ use rights.
132
+
133
+ ## Tracking prevention
134
+
135
+ Anti-tracking techniques implemented are:
136
+
137
+ * `<script>` and other tags are removed. No external JS is loaded in
138
+ a local context.
139
+
140
+ * `<form>`s and their elements are removed.
141
+
142
+ * `<canvas>`, `<area>`, `<map>` are removed.
143
+
144
+ * `<iframe>`s are
145
+ [sandboxed](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox).
146
+
147
+ * `<img>`s are lazy loaded. This is not strictly anti-tracking, but
148
+ images are loaded when needed.
149
+
150
+ * All URLs get their tracking params removed by
151
+ [UrlPrivacy](https://0xacab.org/sutty/url-privacy)
152
+
153
+ * [Referrer
154
+ Policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy)
155
+ is implemented for supported elements. Strangely, `<video>` and
156
+ `<audio>` don't seem to support it.
157
+
158
+ * External links open in a new tab and have `rel="noopener noreferrer"`
159
+ to prevent [reverse
160
+ tabnabbing](https://owasp.org/www-community/attacks/Reverse_Tabnabbing).
161
+
162
+ If you find more useful techniques, please [open an issue
163
+ report](https://0xacab.org/sutty/jekyll/jekyll-embed-urls/-/issues).
164
+
165
+ ## Feature policy
166
+
167
+ [Feature
168
+ policy](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Feature-Policy)
169
+ is a list of directives for allowing or denying features.
170
+
171
+ The directives are separated by semicolons. Any directive not mentioned
172
+ in the configuration is assumed to have a "none" policy by this plugin.
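The rule described above (directives the configuration does not mention are denied) can be illustrated with a minimal Ruby sketch. It is not the plugin's code: the directive list is shortened and `allow_attribute` is a hypothetical helper name used only for this example.

```ruby
# Shortened, illustrative directive list (assumption); the plugin ships a longer one.
DIRECTIVES = %w[autoplay camera fullscreen geolocation gyroscope microphone].freeze

# Build the value of an iframe `allow` attribute from the configured
# directives, denying every known directive that was not mentioned.
def allow_attribute(configured)
  # "fullscreen;" or "camera 'self';" both reduce to the bare directive name.
  mentioned = configured.map { |directive| directive.delete_suffix(';').split.first }
  denied = (DIRECTIVES - mentioned).map { |directive| "#{directive} 'none';" }

  (configured + denied).join(' ')
end

puts allow_attribute(%w[fullscreen; gyroscope;])
```

Every directive in the result ends with a semicolon, so the configured entries and the generated `'none'` entries can be joined with plain spaces.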
52
173
 
53
174
  ## Contributing
54
175
 
@@ -0,0 +1,17 @@
1
+ <article class="fallback">
2
+ {%- if page.image -%}
3
+ <img referrerpolicy="{{ embed.referrerpolicy }}" loading="{{ embed.loading }}" src="{{ page.image }}" class="img-fluid" />
4
+ {%- endif -%}
5
+
6
+ <h1>{{ page.title }}</h1>
7
+ <p class="lead">{{ page.description }}</p>
8
+ <p><small>
9
+ <a
10
+ href="{{ page.url }}"
11
+ referrerpolicy="{{ embed.referrerpolicy }}"
12
+ target="{{ embed.target }}"
13
+ rel="{{ embed.rel }}">
14
+ {{ page.url }}
15
+ </a>
16
+ </small></p>
17
+ </article>
@@ -0,0 +1,23 @@
1
+ <article class="ogp" lang="{{ page.locale }}">
2
+ {%- if page.video -%}
3
+ <video poster="{{ page.image }}" class="img-fluid" {{ embed.controls }} src="{{ page.video }}"/>
4
+ {%- elsif page.image -%}
5
+ <img referrerpolicy="{{ embed.referrerpolicy }}" loading="{{ embed.loading }}" src="{{ page.image }}" class="img-fluid" />
6
+ {%- endif -%}
7
+
8
+ {%- if page.audio -%}
9
+ <audio class="img-fluid" {{ embed.controls }} src="{{ page.audio }}"/>
10
+ {%- endif -%}
11
+
12
+ <h1>{{ page.title }}</h1>
13
+ <p class="lead">{{ page.description }}</p>
14
+ <p><small>
15
+ <a
16
+ href="{{ page.url }}"
17
+ referrerpolicy="{{ embed.referrerpolicy }}"
18
+ target="{{ embed.target }}"
19
+ rel="{{ embed.rel }}">
20
+ {{ page.url }}
21
+ </a>
22
+ </small></p>
23
+ </article>
@@ -0,0 +1,18 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Jekyll
4
+ class Embed
5
+ # Jekyll cache that behaves like ActiveSupport::Cache
6
+ class Cache < Jekyll::Cache
7
+ def write(key, value)
8
+ self[key] = value
9
+ end
10
+
11
+ def read(key)
12
+ self[key]
13
+ rescue
14
+ nil
15
+ end
16
+ end
17
+ end
18
+ end
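To show the `read`/`write` shape this class adds, here is a small Ruby sketch that runs without Jekyll. `TinyCache` is a hypothetical in-memory stand-in, and it assumes the underlying store raises on a missing key, which is why `read` rescues and returns `nil`.

```ruby
# Hypothetical in-memory store; Jekyll::Cache itself is not used here.
class TinyCache
  def initialize
    @store = {}
  end

  # Raises KeyError on a miss (assumption standing in for the parent
  # class raising when a key is not cached).
  def [](key)
    @store.fetch(key)
  end

  def []=(key, value)
    @store[key] = value
  end

  # ActiveSupport::Cache-style writer.
  def write(key, value)
    self[key] = value
  end

  # ActiveSupport::Cache-style reader: a miss becomes nil instead of an error.
  def read(key)
    self[key]
  rescue KeyError
    nil
  end
end

cache = TinyCache.new
cache.write('https://example.org', '<article>cached</article>')
puts cache.read('https://example.org')
p cache.read('missing')
```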
@@ -0,0 +1,28 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'cgi'
4
+
5
+ module Jekyll
6
+ class Embed
7
+ class Content
8
+ URL_RE = /<p[^>]*>[\s\n]*(?<url>https?:\/\/[^<\s\n]+)[\s\n]*<\/p>/m.freeze
9
+
10
+ class << self
11
+ # Find URLs on paragraphs. We do it after rendering because
12
+ # sometimes we use HTML instead of pure Markdown and this way we
13
+ # catch both.
14
+ def embed!(content)
15
+ URL_RE.match(content) do |match|
16
+ embed = Jekyll::Embed.embed CGI.unescapeHTML(match[:url])
17
+
18
+ content.sub! URL_RE, embed
19
+ end
20
+ end
21
+ end
22
+ end
23
+ end
24
+ end
25
+
26
+ Jekyll::Hooks.register :posts, :post_convert do |post|
27
+ Jekyll::Embed::Content.embed! post.content
28
+ end
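The following Ruby sketch shows which paragraphs the pattern above matches. `fake_embed` is a hypothetical stand-in for `Jekyll::Embed.embed` so the example runs without Jekyll, and the sketch uses `gsub` to replace every match, which is a choice made here; the hook above uses `match` and `sub!`.

```ruby
require 'cgi'

# Same pattern as URL_RE above: a paragraph that contains only a URL.
URL_RE = /<p[^>]*>[\s\n]*(?<url>https?:\/\/[^<\s\n]+)[\s\n]*<\/p>/m.freeze

# Hypothetical stand-in for Jekyll::Embed.embed.
def fake_embed(url)
  %(<iframe src="#{CGI.escapeHTML(url)}"></iframe>)
end

html = <<~HTML
  <p>Some text with https://example.org/inline inside.</p>
  <p>
    https://example.org/watch?v=1&amp;t=2
  </p>
HTML

# Rendered HTML escapes "&" as "&amp;", so unescape before handing the
# URL to the embedder.
result = html.gsub(URL_RE) do
  fake_embed(CGI.unescapeHTML(Regexp.last_match[:url]))
end

puts result
```

Only the second paragraph is replaced: a URL inside running text is left alone because the pattern requires the URL to be the paragraph's whole content.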
@@ -0,0 +1,35 @@
1
+ # frozen_string_literal: true
2
+
3
+ module Jekyll
4
+ class Embed
5
+ module Filter
6
+ # This filter takes the URL passed as input and returns its HTML
7
+ # representation. Embed takes care of everything else.
8
+ def embed(url)
9
+ return url unless url.is_a? String
10
+
11
+ Embed.embed url
12
+ end
13
+
14
+ def oembed(url)
15
+ return url unless url.is_a? String
16
+
17
+ Embed.oembed url
18
+ end
19
+
20
+ def ogp(url)
21
+ return url unless url.is_a? String
22
+
23
+ Embed.ogp url
24
+ end
25
+
26
+ def fallback(url)
27
+ return url unless url.is_a? String
28
+
29
+ Embed.fallback url
30
+ end
31
+ end
32
+ end
33
+ end
34
+
35
+ Liquid::Template.register_filter(Jekyll::Embed::Filter)
@@ -0,0 +1,19 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'oembed'
4
+
5
+ module OEmbed
6
+ module ProviderDecorator
7
+ def self.included(base)
8
+ base.class_eval do
9
+ def http_get(url, _)
10
+ Jekyll::Embed.get(url.to_s).body
11
+ rescue Faraday::Error
12
+ raise OEmbed::UnknownResponse
13
+ end
14
+ end
15
+ end
16
+ end
17
+ end
18
+
19
+ OEmbed::Provider.include OEmbed::ProviderDecorator
@@ -0,0 +1,335 @@
1
+ # frozen_string_literal: true
2
+
3
+ require 'faraday'
4
+ require 'faraday-http-cache'
5
+ require 'faraday_middleware/response/follow_redirects'
6
+ require 'loofah'
7
+ require 'ogp'
8
+ require 'url_privacy'
9
+
10
+ require_relative 'embed/oembed'
11
+ require_relative 'embed/cache'
12
+
13
+ if Gem::Version.new(Jekyll::VERSION) >= Gem::Version.new('4.2.0')
14
+ require_relative 'embed/content'
15
+ else
16
+ Jekyll.logger.warn "Upgrade to Jekyll >= 4.2.0 to embed URLs in content"
17
+ end
18
+
19
+ OEmbed::Providers.register_all
20
+ OEmbed::Providers.register_fallback(OEmbed::ProviderDiscovery,
21
+ OEmbed::Providers::Noembed)
22
+
23
+ module Jekyll
24
+ # The idea with this class is to find the best safe representation of
25
+ # a link. For a YouTube video it could be the sandboxed iframe. This
26
+ # loads the video and allows you to play it while preventing YT
27
+ # from calling home and sending data about your users. But other social networks
28
+ # will try to take control of their containers by modifying the page.
29
+ # They resist sandboxing and don't work correctly. For them, we
30
+ # cleanup unwanted HTML tags such as <script>, and return the HTML,
31
+ # which you can style using CSS. Twitter does this.
32
+ #
33
+ # Others are only available through OGP, so we retrieve the metadata
34
+ # and render a template, which you can provide in your own theme too.
35
+ #
36
+ # We also try for microformats and we would look at Schema.org too but
37
+ # there doesn't seem to be a gem for it yet.
38
+ #
39
+ # If the URL doesn't provide anything at all we get the URL, title and
40
+ # date of last visit.
41
+ #
42
+ # Isn't it nice that the corporations that require us to use OEmbed,
43
+ # OGP, Twitter Cards, Schema.org and other metadata don't use them
44
+ # themselves?
45
+ #
46
+ # Also we're going to use heavy caching so we don't hit rate limits or
47
+ # lose the representation if the service is down or the URL is
48
+ # removed. We may be tempted to store the resources locally (images,
49
+ # videos, audio) but we have to take into account that people have
50
+ # legitimate reasons to remove media from the Internet.
51
+ class Embed
52
+ # Attributes to apply by HTMLElement
53
+ IFRAME_ATTRIBUTES = %w[allow sandbox referrerpolicy loading].freeze
54
+ IMAGE_ATTRIBUTES = %w[referrerpolicy loading].freeze
55
+ MEDIA_ATTRIBUTES = %w[controls].freeze
56
+ A_ATTRIBUTES = %w[referrerpolicy rel target].freeze
57
+
58
+ # Directive from Feature Policy
59
+ # @see {https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Feature-Policy#directives}
60
+ DIRECTIVES = %w[accelerometer ambient-light-sensor autoplay battery camera display-capture document-domain encrypted-media execution-while-not-rendered execution-while-out-of-viewport fullscreen gamepad geolocation gyroscope layout-animations legacy-image-formats magnetometer microphone midi navigation-override oversized-images payment picture-in-picture publickey-credentials-get speaker-selection sync-xhr usb screen-wake-lock web-share xr-spatial-tracking].freeze
61
+
62
+ # Templates
63
+ INCLUDE_OGP = '{% include ogp.html site=site page=page %}'
64
+ INCLUDE_FALLBACK = '{% include fallback.html site=site page=page %}'
65
+
66
+ # The default referrer policy only sends the origin URL (not the
67
+ # full URL, only the protocol/scheme and domain part) if the remote
68
+ # URL is HTTPS.
69
+ #
70
+ # @see {https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Referrer-Policy}
71
+ #
72
+ # The default sandbox restrictions only allow scripts in the context
73
+ # of the iframe and opening new tabs.
74
+ #
75
+ # @see {https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox}
76
+ DEFAULT_CONFIG = {
77
+ 'scrub' => %w[form input textarea button fieldset select option optgroup canvas area map],
78
+ 'attributes' => {
79
+ 'referrerpolicy' => 'strict-origin-when-cross-origin',
80
+ 'sandbox' => %w[allow-scripts allow-popups],
81
+ 'allow' => %w[fullscreen; gyroscope; picture-in-picture; clipboard-write;],
82
+ 'loading' => 'lazy',
83
+ 'controls' => true,
84
+ 'rel' => %w[noopener noreferrer],
85
+ 'target' => '_blank'
86
+ }
87
+ }
88
+
89
+ class << self
90
+ def site
91
+ unless @site
92
+ raise Jekyll::Errors::InvalidConfigurationError,
93
+ "Site is missing, configure with `Jekyll::Embed.site = site`"
94
+ end
95
+
96
+ @site
97
+ end
98
+
99
+ # This is an initializer of sorts
100
+ #
101
+ # @param [Jekyll::Site]
102
+ # @return [Jekyll::Site]
103
+ def site=(site)
104
+ raise ArgumentError, "Site must be a Jekyll::Site" unless site.is_a? Jekyll::Site
105
+
106
+ @site = site
107
+
108
+ # Add the _includes dir so we can provide default templates that
109
+ # can be overridden locally or by the theme.
110
+ site.includes_load_paths << File.expand_path(File.join(__dir__, '..', '..', '_includes'))
111
+ # Since we're embedding, we're allowing iframes
112
+ Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2 << 'iframe'
113
+
114
+ # Other elements that are disallowed
115
+ config['scrub']&.each do |scrub|
116
+ Loofah::HTML5::SafeList::ALLOWED_ELEMENTS_WITH_LIBXML2.delete(scrub)
117
+ end
118
+
119
+ payload['embed'] = config['attributes']
120
+
121
+ site
122
+ end
123
+
124
+ # Render the URL as HTML
125
+ #
126
+ # 1. Try oembed for video and image
127
+ # 2. If rich oembed, cleanup
128
+ # 3. If OGP, render templates
129
+ # 4. Else, render fallback template
130
+ #
131
+ # @param [String] URL
132
+ # @return [String] HTML
133
+ def embed(url)
134
+ url.strip!
135
+
136
+ # Quick check
137
+ raise URI::Error unless url.start_with? 'http'
138
+
139
+ # Just to verify the URL is valid
140
+ URI.parse url
141
+
142
+ oembed(url) || ogp(url) || fallback(url)
143
+ rescue URI::Error
144
+ Jekyll.logger.warn "#{url.inspect} is not a valid URL"
145
+
146
+ url
147
+ end
148
+
149
+ # @return [Hash]
150
+ def config
151
+ @config ||= Jekyll::Utils.deep_merge_hashes(DEFAULT_CONFIG, (site.config['embed'] || {})).tap do |c|
152
+ c['attributes']['allow'].concat((DIRECTIVES - c.dig('attributes', 'allow').join.split(';').map { |s| s.split(' ').first }).map { |d| "#{d} 'none';" })
153
+ end
154
+ end
155
+
156
+ # Try for OEmbed
157
+ #
158
+ # @param [String] URL
159
+ # @return [String,NilClass] Sanitized HTML or nil
160
+ def oembed(url)
161
+ cache.getset(url) do
162
+ oembed = OEmbed::Providers.get url
163
+
164
+ # Prevent caching of nil?
165
+ raise OEmbed::Error unless oembed.respond_to? :html
166
+
167
+ # Cleanup. We don't allow running remote scripts locally,
168
+ # period.
169
+ cleanup(Loofah.fragment(oembed.html).scrub!(:prune), url).to_s
170
+ end
171
+ rescue OEmbed::Error
172
+ nil
173
+ end
174
+
175
+ # Try for OGP.
176
+ # @param [String] URL
177
+ # @return [String,NilClass]
178
+ def ogp(url)
179
+ cache.getset(url) do
180
+ ogp = OGP::OpenGraph.new get(url).body
181
+ context = info.dup
182
+ context[:registers][:page] = payload['page'] = ogp.data
183
+
184
+ ogp_template.render! payload, context
185
+ end
186
+ rescue OGP::MalformedSourceError, OGP::MissingAttributeError, Faraday::Error
187
+ nil
188
+ end
189
+
190
+ # Last resort: render a card from the page title, description and first image
191
+ def fallback(url)
192
+ cache.getset(url) do
193
+ html = Nokogiri::HTML.fragment get(url).body
194
+ element = html.css('article').first
195
+ element ||= html.css('section').first
196
+ element ||= html.css('main').first
197
+ element ||= html.css('body').first
198
+ title = html.css('title').first
199
+ description = html.css('meta[name="description"]').first
200
+
201
+ context = info.dup
202
+ context[:registers][:page] = payload['page'] = {
203
+ 'title' => text(title),
204
+ 'description' => text(description),
205
+ 'url' => url,
206
+ 'image' => element&.css('img')&.first&.public_send(:[], 'src')
207
+ }
208
+
209
+ fallback_template.render! payload, context
210
+ end
211
+ rescue Faraday::Error, Nokogiri::SyntaxError
212
+ nil
213
+ end
214
+
215
+ # @param [String] URL
216
+ # @return [Faraday::Response]
217
+ def get(url)
218
+ @get_cache ||= {}
219
+ @get_cache[url] ||= http_client.get url
220
+ end
221
+
222
+ # @return [Jekyll::Embed::Cache]
223
+ def cache
224
+ @cache ||= Jekyll::Embed::Cache.new('Jekyll::Embed')
225
+ end
226
+
227
+ # @return [Faraday::Connection]
228
+ def http_client
229
+ @http_client ||= Faraday.new do |builder|
230
+ builder.use FaradayMiddleware::FollowRedirects
231
+ builder.use :http_cache, shared_cache: false, store: cache, serializer: Marshal
232
+ end
233
+ end
234
+
235
+ def cleanup(html, url)
236
+ # Add our own attributes
237
+ html.css('iframe').each do |iframe|
238
+ IFRAME_ATTRIBUTES.each do |attr|
239
+ iframe[attr] = value_for_attr(attr)
240
+ end
241
+
242
+ # Embedding itself requires allow-same-origin
243
+ iframe['sandbox'] += allow_same_origin(url)
244
+ end
245
+
246
+ html.css('audio, video').each do |media|
247
+ MEDIA_ATTRIBUTES.each do |attr|
248
+ media[attr] = value_for_attr(attr)
249
+ end
250
+
251
+ media['src'] = UrlPrivacy.clean media['src']
252
+ end
253
+
254
+ html.css('img').each do |img|
255
+ IMAGE_ATTRIBUTES.each do |attr|
256
+ img[attr] = value_for_attr(attr)
257
+ end
258
+ end
259
+
260
+ html.css('a').each do |a|
261
+ A_ATTRIBUTES.each do |attr|
262
+ a[attr] = value_for_attr(attr)
263
+ end
264
+ end
265
+
266
+ html.css('[src]').each do |element|
267
+ element['src'] = CGI.escapeHTML(UrlPrivacy.clean(CGI.unescapeHTML(element['src'])))
268
+ end
269
+
270
+ html.css('[href]').each do |element|
271
+ element['href'] = CGI.escapeHTML(UrlPrivacy.clean(CGI.unescapeHTML(element['href'])))
272
+ end
273
+
274
+ # Return the cleaned up HTML
275
+ html
276
+ end
277
+
278
+ def text(node)
279
+ node&.text&.tr("\n", '')&.tr("\r", '')&.strip&.squeeze(' ')
280
+ end
281
+
282
+ private
283
+
284
+ def fallback_template
285
+ @fallback_template ||= site.liquid_renderer.file('fallback.html').parse(INCLUDE_FALLBACK)
286
+ end
287
+
288
+ def ogp_template
289
+ @ogp_template ||= site.liquid_renderer.file('ogp.html').parse(INCLUDE_OGP)
290
+ end
291
+
292
+ def info
293
+ @info ||= {
294
+ registers: { site: site },
295
+ strict_filters: site.config.dig('liquid', 'strict_filters'),
296
+ strict_variables: site.config.dig('liquid', 'strict_variables')
297
+ }
298
+ end
299
+
300
+ # @param [String]
301
+ # @return [String]
302
+ def value_for_attr(attr)
303
+ @value_for_attr ||= {}
304
+ @value_for_attr[attr] ||=
305
+ case (value = config.dig('attributes', attr))
306
+ when String then value
307
+ when Array then value.join(' ')
308
+ end
309
+ end
310
+
311
+ # Only iframes from other sites get allow-same-origin. Combined with
311
+ # allow-scripts, a same-site iframe could remove its own sandbox.
313
+ #
314
+ # @param [String] URL
315
+ # @return [String]
316
+ def allow_same_origin(url)
317
+ unless site.config['url']
318
+ Jekyll.logger.warn "Add url to _config.yml to determine if the site can embed itself"
319
+ return ' allow-same-origin'
320
+ end
321
+
322
+ @allow_same_origin ||= {}
323
+ @allow_same_origin[url] ||= url.start_with?(site.config['url']) ? '' : ' allow-same-origin'
324
+ end
325
+
326
+ # Caches it because Jekyll::Site#site_payload returns a new object
327
+ # every time.
328
+ #
329
+ # @return [Jekyll::Drops::UnifiedPayloadDrop]
330
+ def payload
331
+ @payload ||= site.site_payload
332
+ end
333
+ end
334
+ end
335
+ end
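The lookup order in `embed` (OEmbed, then OGP, then the fallback card, each returning `nil` when it cannot handle the URL) can be sketched in plain Ruby. The three lambdas are hypothetical stand-ins for the real strategies and `render` is an illustrative name; unlike the method above, this sketch uses `strip` rather than `strip!` so the caller's string is not modified.

```ruby
require 'uri'

# Hypothetical strategies, tried in order; each returns HTML or nil.
STRATEGIES = [
  ->(url) { %(<iframe src="#{url}"></iframe>) if url.include?('video') },
  ->(url) { %(<article class="ogp">#{url}</article>) if url.include?('article') },
  ->(url) { %(<article class="fallback">#{url}</article>) }
].freeze

def render(url)
  url = url.strip

  # Quick check, then a full parse just to validate the URL.
  raise URI::Error unless url.start_with?('http')

  URI.parse(url)

  # The first strategy that returns something wins.
  STRATEGIES.each do |strategy|
    html = strategy.call(url)
    return html if html
  end

  nil
rescue URI::Error
  # Invalid input is returned as is.
  url
end

puts render(' https://example.org/video/1 ')
puts render('https://example.org/article/2')
puts render('https://example.org/')
puts render('not a url')
```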
@@ -1,90 +1,9 @@
1
- require 'oembed'
2
- require 'cgi'
3
- require 'oga'
1
+ # frozen_string_literal: true
4
2
 
5
- # TODO: We tested several of the mainstream embedable contents (YT, IG,
6
- # Twitter) and specially IG and Twitter just want to take over the page
7
- # to set their own size, also send metrics. So they won't work on a
8
- # sandboxed iframe, which we were expecting, but they also won't be
9
- # comfortable for visitors to use. We're planning on using OGP and
10
- # render our own partials (configurable) for this. This way everything
11
- # is safer and the embedded content even adapts to the site's design.
12
- #
13
- # So, expect a major refactoring!
3
+ require_relative 'jekyll/embed'
4
+ require_relative 'jekyll/embed/filter'
14
5
 
15
- OEmbed::Providers.register_all
16
- OEmbed::Providers.register_fallback(OEmbed::ProviderDiscovery,
17
- OEmbed::Providers::Noembed)
18
-
19
- # Process the content of documents before rendering them to find URLs in
20
- # a block.
21
- Jekyll::Hooks.register :site, :pre_render do |site|
22
- # Cache results
23
- cache ||= Jekyll::Cache.new('Jekyll::OEmbed::Urls')
24
- # TODO: Make configurable
25
- referrer_policy = 'strict-origin-when-cross-origin'
26
-
27
- # Only modify documents to be written
28
- site.docs_to_write.each do |doc|
29
- # Skip text paragraphs
30
- # XXX: Find link in first line
31
- next unless %r{\n\n\s*<?https?://} =~ doc.content
32
-
33
- # Split texts by markdown blocks
34
- doc.content = doc.content.split("\n\n").map do |p|
35
- # Only process lines with URLs
36
- next p unless %r{\A\s*<?https?://} =~ p
37
- # Remove empty characters and markdown autolinks
38
- p = p.strip.tr('<', '').tr('>', '')
39
-
40
- # @see {https://developer.mozilla.org/en-US/docs/Web/HTML/Element/iframe#attr-sandbox}
41
- same_origin = p.start_with? site.config['url']
42
-
43
- Jekyll.logger.debug "Finding OEmbed content for #{p}"
44
- # Cache the results
45
- cache.getset(p) do
46
- Jekyll.logger.debug "=> Not cached, obtaining..."
47
-
48
- result = OEmbed::Providers.get(p)
49
- sandbox = "allow-scripts allow-popups #{same_origin ? '' : 'allow-same-origin'}"
50
-
51
- # If the embed HTML contains an iframe, make sure it has the
52
- # correct attributes.
53
- if %r{<iframe } =~ result.html
54
- html = Oga.parse_html result.html
55
-
56
- html.css('iframe').each do |iframe|
57
- iframe.attributes.delete_if do |attr|
58
- %w[width height].include? attr.name
59
- end
60
-
61
- iframe.attributes << Oga::XML::Attribute.new(name: 'sandbox', value: sandbox)
62
- iframe.attributes << Oga::XML::Attribute.new(name: 'referrerpolicy', value: referrer_policy)
63
- end
64
-
65
- html.to_xml
66
- else
67
- # Return a sandboxed iframe with the size of the HTML. We
68
- # only allow scripts to run inside the iframe and nothing
69
- # else.
70
- <<~IFRAME
71
- <iframe
72
- referrerpolicy="#{referrer_policy}"
73
- sandbox="#{sandbox}"
74
- style="min-width:#{result.width}px;min-height:#{result.height || 0}px"
75
- srcdoc="#{CGI.escape_html result.html}"></iframe>
76
- IFRAME
77
- end
78
- rescue OEmbed::Error => e
79
- # If the URL doesn't support OEmbed just return an external
80
- # link.
81
- #
82
- # TODO: Fetch information with OGP and render a template.
83
- Jekyll.logger.warn "#{p} is not oembeddable or URL can't be fetched, showing as URL: #{e}"
84
-
85
- %(<p><a href="#{p}" target="_blank" referrerpolicy="#{referrer_policy}">#{p}</a></p>)
86
- end
87
- # Rebuild the content
88
- end.join("\n\n")
89
- end
6
+ # Configure Embed
7
+ Jekyll::Hooks.register :site, :after_init do |site|
8
+ Jekyll::Embed.site = site
90
9
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: jekyll-embed-urls
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.3.3
4
+ version: 0.4.3
5
5
  platform: ruby
6
6
  authors:
7
7
  - f
8
8
  autorequire:
9
9
  bindir: bin
10
10
  cert_chain: []
11
- date: 2020-10-30 00:00:00.000000000 Z
11
+ date: 2021-09-27 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: jekyll
@@ -30,28 +30,98 @@ dependencies:
30
30
  requirements:
31
31
  - - "~>"
32
32
  - !ruby/object:Gem::Version
33
- version: '0.13'
33
+ version: '0.15'
34
34
  type: :runtime
35
35
  prerelease: false
36
36
  version_requirements: !ruby/object:Gem::Requirement
37
37
  requirements:
38
38
  - - "~>"
39
39
  - !ruby/object:Gem::Version
40
- version: '0.13'
40
+ version: '0.15'
41
41
  - !ruby/object:Gem::Dependency
42
- name: oga
42
+ name: loofah
43
43
  requirement: !ruby/object:Gem::Requirement
44
44
  requirements:
45
45
  - - "~>"
46
46
  - !ruby/object:Gem::Version
47
- version: '2.15'
47
+ version: '2.9'
48
48
  type: :runtime
49
49
  prerelease: false
50
50
  version_requirements: !ruby/object:Gem::Requirement
51
51
  requirements:
52
52
  - - "~>"
53
53
  - !ruby/object:Gem::Version
54
- version: '2.15'
54
+ version: '2.9'
55
+ - !ruby/object:Gem::Dependency
56
+ name: ogp
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '0.4'
62
+ type: :runtime
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '0.4'
69
+ - !ruby/object:Gem::Dependency
70
+ name: faraday
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
74
+ - !ruby/object:Gem::Version
75
+ version: '1.3'
76
+ type: :runtime
77
+ prerelease: false
78
+ version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: '1.3'
83
+ - !ruby/object:Gem::Dependency
84
+ name: faraday-http-cache
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - "~>"
88
+ - !ruby/object:Gem::Version
89
+ version: '2.2'
90
+ type: :runtime
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - "~>"
95
+ - !ruby/object:Gem::Version
96
+ version: '2.2'
97
+ - !ruby/object:Gem::Dependency
98
+ name: faraday_middleware
99
+ requirement: !ruby/object:Gem::Requirement
100
+ requirements:
101
+ - - "~>"
102
+ - !ruby/object:Gem::Version
103
+ version: '1'
104
+ type: :runtime
105
+ prerelease: false
106
+ version_requirements: !ruby/object:Gem::Requirement
107
+ requirements:
108
+ - - "~>"
109
+ - !ruby/object:Gem::Version
110
+ version: '1'
111
+ - !ruby/object:Gem::Dependency
112
+ name: url-privacy
113
+ requirement: !ruby/object:Gem::Requirement
114
+ requirements:
115
+ - - "~>"
116
+ - !ruby/object:Gem::Version
117
+ version: '0'
118
+ type: :runtime
119
+ prerelease: false
120
+ version_requirements: !ruby/object:Gem::Requirement
121
+ requirements:
122
+ - - "~>"
123
+ - !ruby/object:Gem::Version
124
+ version: '0'
55
125
  description: Replaces URLs for their previsualization in Jekyll posts
56
126
  email:
57
127
  - f@sutty.nl
@@ -65,7 +135,14 @@ files:
65
135
  - CHANGELOG.md
66
136
  - LICENSE.txt
67
137
  - README.md
138
+ - _includes/fallback.html
139
+ - _includes/ogp.html
68
140
  - lib/jekyll-embed-urls.rb
141
+ - lib/jekyll/embed.rb
142
+ - lib/jekyll/embed/cache.rb
143
+ - lib/jekyll/embed/content.rb
144
+ - lib/jekyll/embed/filter.rb
145
+ - lib/jekyll/embed/oembed.rb
69
146
  homepage: https://0xacab.org/sutty/jekyll/jekyll-embed-urls
70
147
  licenses:
71
148
  - GPL-3.0
@@ -88,9 +165,9 @@ require_paths:
88
165
  - lib
89
166
  required_ruby_version: !ruby/object:Gem::Requirement
90
167
  requirements:
91
- - - "~>"
168
+ - - ">="
92
169
  - !ruby/object:Gem::Version
93
- version: '2.6'
170
+ version: 2.6.0
94
171
  required_rubygems_version: !ruby/object:Gem::Requirement
95
172
  requirements:
96
173
  - - ">="