html-pipeline 2.14.3 → 3.0.0.pre1
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- checksums.yaml +4 -4
- data/.github/FUNDING.yml +11 -3
- data/.github/dependabot.yml +20 -0
- data/.github/workflows/automerge.yml +34 -0
- data/.github/workflows/lint.yml +23 -0
- data/.github/workflows/tag_and_release.yml +70 -0
- data/.github/workflows/test.yml +33 -0
- data/.rubocop.yml +17 -0
- data/CHANGELOG.md +28 -2
- data/Gemfile +29 -15
- data/{LICENSE → LICENSE.txt} +2 -2
- data/README.md +209 -218
- data/Rakefile +14 -7
- data/UPGRADING.md +35 -0
- data/html-pipeline.gemspec +31 -21
- data/lib/html-pipeline.rb +3 -0
- data/lib/html_pipeline/convert_filter/markdown_filter.rb +26 -0
- data/lib/html_pipeline/convert_filter.rb +17 -0
- data/lib/html_pipeline/filter.rb +89 -0
- data/lib/{html/pipeline → html_pipeline/node_filter}/absolute_source_filter.rb +23 -21
- data/lib/{html/pipeline → html_pipeline/node_filter}/emoji_filter.rb +58 -54
- data/lib/html_pipeline/node_filter/https_filter.rb +22 -0
- data/lib/html_pipeline/node_filter/image_max_width_filter.rb +40 -0
- data/lib/{html/pipeline/@mention_filter.rb → html_pipeline/node_filter/mention_filter.rb} +55 -69
- data/lib/html_pipeline/node_filter/table_of_contents_filter.rb +68 -0
- data/lib/html_pipeline/node_filter/team_mention_filter.rb +105 -0
- data/lib/html_pipeline/node_filter.rb +31 -0
- data/lib/html_pipeline/sanitization_filter.rb +65 -0
- data/lib/{html/pipeline → html_pipeline/text_filter}/image_filter.rb +3 -3
- data/lib/{html/pipeline → html_pipeline/text_filter}/plain_text_input_filter.rb +3 -5
- data/lib/html_pipeline/text_filter.rb +21 -0
- data/lib/html_pipeline/version.rb +5 -0
- data/lib/html_pipeline.rb +252 -0
- metadata +52 -54
- data/.travis.yml +0 -43
- data/Appraisals +0 -19
- data/CONTRIBUTING.md +0 -60
- data/bin/html-pipeline +0 -78
- data/lib/html/pipeline/@team_mention_filter.rb +0 -99
- data/lib/html/pipeline/autolink_filter.rb +0 -34
- data/lib/html/pipeline/body_content.rb +0 -44
- data/lib/html/pipeline/camo_filter.rb +0 -105
- data/lib/html/pipeline/email_reply_filter.rb +0 -69
- data/lib/html/pipeline/filter.rb +0 -165
- data/lib/html/pipeline/https_filter.rb +0 -29
- data/lib/html/pipeline/image_max_width_filter.rb +0 -37
- data/lib/html/pipeline/markdown_filter.rb +0 -56
- data/lib/html/pipeline/sanitization_filter.rb +0 -144
- data/lib/html/pipeline/syntax_highlight_filter.rb +0 -50
- data/lib/html/pipeline/text_filter.rb +0 -16
- data/lib/html/pipeline/textile_filter.rb +0 -25
- data/lib/html/pipeline/toc_filter.rb +0 -69
- data/lib/html/pipeline/version.rb +0 -7
- data/lib/html/pipeline.rb +0 -210
data/README.md
CHANGED
@@ -1,23 +1,23 @@
|
|
1
|
-
#
|
1
|
+
# HTMLPipeline
|
2
2
|
|
3
|
-
|
4
|
-
|
3
|
+
> **Note**
|
4
|
+
> This README refers to the behavior in the new 3.0.0.pre1 gem.
|
5
|
+
|
6
|
+
HTML processing filters and utilities. This module is a small
|
7
|
+
framework for defining CSS-based content filters and applying them to user
|
5
8
|
provided content.
|
6
9
|
|
7
|
-
[
|
10
|
+
[Although this project was started at GitHub](https://github.com/blog/1311-html-pipeline-chainable-content-filters), they no longer use it. This gem must be considered standalone and independent from GitHub.
|
8
11
|
|
9
12
|
- [Installation](#installation)
|
10
13
|
- [Usage](#usage)
|
11
|
-
- [Examples](#examples)
|
14
|
+
- [More Examples](#more-examples)
|
12
15
|
- [Filters](#filters)
|
13
16
|
- [Dependencies](#dependencies)
|
14
17
|
- [Documentation](#documentation)
|
15
|
-
- [Extending](#extending)
|
16
|
-
- [3rd Party Extensions](#3rd-party-extensions)
|
17
18
|
- [Instrumenting](#instrumenting)
|
18
|
-
- [
|
19
|
-
|
20
|
-
- [Releasing A New Version](#releasing-a-new-version)
|
19
|
+
- [Third Party Extensions](#third-party-extensions)
|
20
|
+
- [FAQ](#faq)
|
21
21
|
|
22
22
|
## Installation
|
23
23
|
|
@@ -42,220 +42,216 @@ $ gem install html-pipeline
|
|
42
42
|
## Usage
|
43
43
|
|
44
44
|
This library provides a handful of chainable HTML filters to transform user
|
45
|
-
content into markup.
|
46
|
-
|
47
|
-
outputs the result.
|
45
|
+
content into HTML markup. Each filter does some work, and then hands off the
|
46
|
+
results to the next filter. A pipeline has several kinds of filters available to use:
|
48
47
|
|
49
|
-
|
48
|
+
- Multiple `TextFilter`s, which operate on a UTF-8 string
|
49
|
+
- A `ConvertFilter`, which turns text into HTML (e.g., Commonmark/Asciidoc -> HTML)
|
50
|
+
- A `SanitizationFilter`, which removes dangerous/unwanted HTML elements and attributes
|
51
|
+
- Multiple `NodeFilter`s, which operate on a UTF-8 HTML document
|
50
52
|
|
51
|
-
|
52
|
-
require 'html/pipeline'
|
53
|
+
You can assemble each sequence into a single pipeline, or choose to call each filter individually.
|
53
54
|
|
54
|
-
|
55
|
-
|
56
|
-
|
55
|
+
As an example, suppose we want to transform Commonmark source text into Markdown HTML. With the content, we also want to:
|
56
|
+
|
57
|
+
- change every instance of `$NAME` to `"Johnny"`
|
58
|
+
- strip undesired HTML
|
59
|
+
- linkify @mention
|
57
60
|
|
58
|
-
|
59
|
-
output to the next filter's input. So if you wanted to have content be
|
60
|
-
filtered through Markdown and be syntax highlighted, you can create the
|
61
|
-
following pipeline:
|
61
|
+
We can construct a pipeline to do all that like this:
|
62
62
|
|
63
63
|
```ruby
|
64
|
-
|
65
|
-
HTML::Pipeline::MarkdownFilter,
|
66
|
-
HTML::Pipeline::SyntaxHighlightFilter
|
67
|
-
]
|
68
|
-
result = pipeline.call <<-CODE
|
69
|
-
This is *great*:
|
64
|
+
require 'html_pipeline'
|
70
65
|
|
71
|
-
|
66
|
+
class HelloJohnnyFilter < HTMLPipeline::TextFilter
|
67
|
+
def call
|
68
|
+
text.gsub("$NAME", "Johnny")
|
69
|
+
end
|
70
|
+
end
|
72
71
|
|
73
|
-
|
74
|
-
|
72
|
+
pipeline = HTMLPipeline.new(
|
73
|
+
text_filters: [HelloJohnnyFilter.new],
|
74
|
+
convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
|
75
|
+
# note: next line is not needed as sanitization occurs by default;
|
76
|
+
# see below for more info
|
77
|
+
sanitization_config: HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG,
|
78
|
+
node_filters: [HTMLPipeline::NodeFilter::MentionFilter.new]
|
79
|
+
)
|
80
|
+
pipeline.call(user_supplied_text) # recommended: can call pipeline over and over
|
75
81
|
```
|
76
82
|
|
77
|
-
|
78
|
-
|
79
|
-
```html
|
80
|
-
<p>This is <em>great</em>:</p>
|
83
|
+
Filters can be custom ones you create (like `HelloJohnnyFilter`), and `HTMLPipeline` additionally provides several helpful ones (detailed below). If you only need a single filter, you can call one individually, too:
|
81
84
|
|
82
|
-
|
83
|
-
|
85
|
+
```ruby
|
86
|
+
filter = HTMLPipeline::ConvertFilter::MarkdownFilter.new(text)
|
87
|
+
filter.call
|
84
88
|
```
|
85
89
|
|
86
|
-
|
90
|
+
Filters combine into a sequential pipeline, and each filter hands its
|
91
|
+
output to the next filter's input. Text filters are
|
92
|
+
processed first, then the convert filter, sanitization filter, and finally, the node filters.
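
The stage order described above can be sketched in plain Ruby without the gem. This is only an illustration of the ordering; the lambdas and the `run_stages` helper are hypothetical stand-ins, not HTMLPipeline's classes or its actual implementation, and the regex "sanitizer" is a toy.

```ruby
# Hypothetical stand-ins for each stage; none of these are HTMLPipeline APIs.
text_filters   = [->(text) { text.gsub("$NAME", "Johnny") }]       # string -> string
convert_filter = ->(text) { "<p>#{text}</p>" }                      # text -> HTML
sanitizer      = ->(html) { html.gsub(%r{<script.*?</script>}m, "") } # toy sanitizer
node_filters   = [->(html) { html.gsub("<p>", '<p class="msg">') }] # HTML -> HTML

# Runs the stages in the documented order:
# text filters, then convert, then sanitize, then node filters.
def run_stages(input, text_filters, convert_filter, sanitizer, node_filters)
  text = text_filters.reduce(input) { |acc, filter| filter.call(acc) }
  html = convert_filter.call(text)
  html = sanitizer.call(html)
  node_filters.reduce(html) { |acc, filter| filter.call(acc) }
end

output = run_stages(
  "Hi $NAME<script>alert(1)</script>",
  text_filters, convert_filter, sanitizer, node_filters
)
puts output
```

Because sanitization runs before the node filters, anything a node filter adds (here, the `class` attribute) is not subject to the sanitizer in this sketch.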
|
87
93
|
|
88
|
-
Some filters take
|
94
|
+
Some filters take optional `context` and/or `result` hash(es). These are
|
89
95
|
used to pass around arguments and metadata between filters in a pipeline. For
|
90
|
-
example, if you
|
91
|
-
option in the context hash:
|
96
|
+
example, if you want to disable footnotes in the `MarkdownFilter`, you can pass an option in the context hash:
|
92
97
|
|
93
98
|
```ruby
|
94
|
-
|
99
|
+
context = { markdown: { extensions: { footnotes: false } } }
|
100
|
+
filter = HTMLPipeline::ConvertFilter::MarkdownFilter.new("Hi **world**!", context: context)
|
95
101
|
filter.call
|
96
102
|
```
|
97
103
|
|
98
|
-
|
104
|
+
Please refer to the documentation for each filter to understand what configuration options are available.
|
105
|
+
|
106
|
+
### More Examples
|
99
107
|
|
100
|
-
|
108
|
+
Different pipelines can be defined for different parts of an app. Here are a few
|
101
109
|
paraphrased snippets to get you started:
|
102
110
|
|
103
111
|
```ruby
|
104
112
|
# The context hash is how you pass options between different filters.
|
105
113
|
# See individual filter source for explanation of options.
|
106
114
|
context = {
|
107
|
-
:
|
108
|
-
:
|
115
|
+
asset_root: "http://your-domain.com/where/your/images/live/icons",
|
116
|
+
base_url: "http://your-domain.com"
|
109
117
|
}
|
110
118
|
|
111
|
-
# Pipeline providing sanitization and image hijacking but no mention
|
112
|
-
# related features.
|
113
|
-
SimplePipeline = Pipeline.new [
|
114
|
-
SanitizationFilter,
|
115
|
-
TableOfContentsFilter, # add 'name' anchors to all headers and generate toc list
|
116
|
-
CamoFilter,
|
117
|
-
ImageMaxWidthFilter,
|
118
|
-
SyntaxHighlightFilter,
|
119
|
-
EmojiFilter,
|
120
|
-
AutolinkFilter
|
121
|
-
], context
|
122
|
-
|
123
119
|
# Pipeline used for user provided content on the web
|
124
|
-
MarkdownPipeline =
|
125
|
-
|
126
|
-
|
127
|
-
|
128
|
-
|
129
|
-
|
130
|
-
MentionFilter,
|
131
|
-
EmojiFilter,
|
132
|
-
SyntaxHighlightFilter
|
133
|
-
], context.merge(:gfm => true) # enable github formatted markdown
|
134
|
-
|
135
|
-
|
136
|
-
# Define a pipeline based on another pipeline's filters
|
137
|
-
NonGFMMarkdownPipeline = Pipeline.new(MarkdownPipeline.filters,
|
138
|
-
context.merge(:gfm => false))
|
120
|
+
MarkdownPipeline = HTMLPipeline.new(
|
121
|
+
text_filters: [HTMLPipeline::TextFilter::ImageFilter.new],
|
122
|
+
convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
|
123
|
+
node_filters: [
|
124
|
+
HTMLPipeline::NodeFilter::HttpsFilter.new, HTMLPipeline::NodeFilter::MentionFilter.new,
|
125
|
+
], context: context)
|
139
126
|
|
140
127
|
# Pipelines aren't limited to the web. You can use them for email
|
141
128
|
# processing also.
|
142
|
-
HtmlEmailPipeline =
|
143
|
-
|
144
|
-
|
145
|
-
|
146
|
-
|
147
|
-
# Just emoji.
|
148
|
-
EmojiPipeline = Pipeline.new [
|
149
|
-
PlainTextInputFilter,
|
150
|
-
EmojiFilter
|
151
|
-
], context
|
129
|
+
HtmlEmailPipeline = HTMLPipeline.new(
|
130
|
+
text_filters: [
|
131
|
+
PlainTextInputFilter.new,
|
132
|
+
ImageFilter.new
|
133
|
+
])
|
152
134
|
```
|
153
135
|
|
154
136
|
## Filters
|
155
137
|
|
156
|
-
|
157
|
-
* `TeamMentionFilter` - replace `@org/team` mentions with links
|
158
|
-
* `AbsoluteSourceFilter` - replace relative image urls with fully qualified versions
|
159
|
-
* `AutolinkFilter` - auto_linking urls in HTML
|
160
|
-
* `CamoFilter` - replace http image urls with [camo-fied](https://github.com/atmos/camo) https versions
|
161
|
-
* `EmailReplyFilter` - util filter for working with emails
|
162
|
-
* `EmojiFilter` - everyone loves [emoji](http://www.emoji-cheat-sheet.com/)!
|
163
|
-
* `HttpsFilter` - HTML Filter for replacing http github urls with https versions.
|
164
|
-
* `ImageMaxWidthFilter` - link to full size image for large images
|
165
|
-
* `MarkdownFilter` - convert markdown to html
|
166
|
-
* `PlainTextInputFilter` - html escape text and wrap the result in a div
|
167
|
-
* `SanitizationFilter` - allow sanitize user markup
|
168
|
-
* `SyntaxHighlightFilter` - code syntax highlighter
|
169
|
-
* `TextileFilter` - convert textile to html
|
170
|
-
* `TableOfContentsFilter` - anchor headings with name attributes and generate Table of Contents html unordered list linking headings
|
138
|
+
### TextFilters
|
171
139
|
|
172
|
-
|
140
|
+
`TextFilter`s must define a method named `call` which is called on the text. `@text`, `@config`, and `@result` are available to use, and any changes made to these ivars are passed on to the next filter.
|
141
|
+
|
142
|
+
- `ImageFilter` - converts image `url` into `<img>` tag
|
143
|
+
- `PlainTextInputFilter` - html escape text and wrap the result in a `<div>`
|
173
144
|
|
174
|
-
|
175
|
-
|
176
|
-
`
|
177
|
-
|
178
|
-
|
145
|
+
### ConvertFilter
|
146
|
+
|
147
|
+
The `ConvertFilter` takes text and turns it into HTML. `@text`, `@config`, and `@result` are available to use. `ConvertFilter` must define a method named `call`, taking one argument, `text`. `call` must return a string representing the new HTML document.
|
148
|
+
|
149
|
+
- `MarkdownFilter` - creates HTML from text using [Commonmarker](https://www.github.com/gjtorikian/commonmarker)
|
150
|
+
|
151
|
+
### Sanitization
|
152
|
+
|
153
|
+
Because the web can be a scary place, HTML is automatically sanitized after the `ConvertFilter` runs and before the `NodeFilter`s are processed. This is to prevent malicious or unexpected input from entering the pipeline.
|
154
|
+
|
155
|
+
The sanitization process takes a hash configuration of settings. See the [Selma](https://www.github.com/gjtorikian/selma) documentation for more information on how to configure these settings.
|
156
|
+
|
157
|
+
A default sanitization config is provided by this library (`HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG`). A sample custom sanitization allowlist might look like this:
|
179
158
|
|
180
159
|
```ruby
|
181
|
-
|
160
|
+
ALLOWLIST = {
|
161
|
+
elements: ["p", "pre", "code"]
|
162
|
+
}
|
163
|
+
|
164
|
+
pipeline = HTMLPipeline.new \
|
165
|
+
text_filters: [
|
166
|
+
HTMLPipeline::MarkdownFilter,
|
167
|
+
],
|
168
|
+
convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
|
169
|
+
sanitization_config: ALLOWLIST
|
170
|
+
|
171
|
+
result = pipeline.call <<-CODE
|
172
|
+
This is *great*:
|
173
|
+
|
174
|
+
some_code(:first)
|
175
|
+
|
176
|
+
CODE
|
177
|
+
result[:output].to_s
|
182
178
|
```
|
183
179
|
|
184
|
-
|
185
|
-
* `EmailReplyFilter` - `escape_utils`, `email_reply_parser`
|
186
|
-
* `EmojiFilter` - `gemoji`
|
187
|
-
* `MarkdownFilter` - `commonmarker`
|
188
|
-
* `PlainTextInputFilter` - `escape_utils`
|
189
|
-
* `SanitizationFilter` - `sanitize`
|
190
|
-
* `SyntaxHighlightFilter` - `rouge`
|
191
|
-
* `TableOfContentsFilter` - `escape_utils`
|
192
|
-
* `TextileFilter` - `RedCloth`
|
180
|
+
This would print:
|
193
181
|
|
194
|
-
|
182
|
+
```html
|
183
|
+
<p>This is great:</p>
|
184
|
+
<pre><code>some_code(:first)
|
185
|
+
</code></pre>
|
186
|
+
```
|
195
187
|
|
196
|
-
|
188
|
+
Sanitization can be disabled if and only if `nil` is explicitly passed as
|
189
|
+
the config:
|
197
190
|
|
198
|
-
|
191
|
+
```ruby
|
192
|
+
pipeline = HTMLPipeline.new \
|
193
|
+
text_filters: [
|
194
|
+
HTMLPipeline::MarkdownFilter,
|
195
|
+
],
|
196
|
+
convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
|
197
|
+
sanitization_config: nil
|
198
|
+
```
|
199
|
+
|
200
|
+
For more examples of customizing the sanitization process to include the tags you want, check out [the tests](test/sanitization_filter_test.rb) and [the FAQ](#faq).
|
199
201
|
|
200
|
-
|
201
|
-
To write a custom filter, you need a class with a `call` method that inherits
|
202
|
-
from `HTML::Pipeline::Filter`.
|
202
|
+
### NodeFilters
|
203
203
|
|
204
|
-
|
204
|
+
`NodeFilter`s can operate either on HTML elements or text nodes using CSS selectors. Each `NodeFilter` must define a method named `selector` which provides an instance of `Selma::Selector`. If elements are being manipulated, `handle_element` must be defined, taking one argument, `element`; if text nodes are being manipulated, `handle_text_chunk` must be defined, taking one argument, `text_chunk`. `@config` and `@result` are available to use, and any changes made to these ivars are passed on to the next filter.
|
205
|
+
|
206
|
+
`NodeFilter` also has an optional method, `after_initialize`, which is run after the filter initializes. This can be useful in setting up a custom state for `result` to take advantage of.
|
207
|
+
|
208
|
+
Here's an example `NodeFilter` that adds a base url to images that are root relative:
|
205
209
|
|
206
210
|
```ruby
|
207
211
|
require 'uri'
|
208
212
|
|
209
|
-
class RootRelativeFilter <
|
213
|
+
class RootRelativeFilter < HTMLPipeline::NodeFilter
|
210
214
|
|
211
|
-
|
212
|
-
|
213
|
-
|
214
|
-
|
215
|
-
if src.start_with? '/'
|
216
|
-
img["src"] = URI.join(context[:base_url], src).to_s
|
217
|
-
end
|
218
|
-
end
|
219
|
-
doc
|
215
|
+
SELECTOR = Selma::Selector.new(match_element: "img")
|
216
|
+
|
217
|
+
def selector
|
218
|
+
SELECTOR
|
220
219
|
end
|
221
220
|
|
221
|
+
def handle_element(img)
|
222
|
+
return if img['src'].nil?
|
223
|
+
src = img['src'].strip
|
224
|
+
if src.start_with? '/'
|
225
|
+
img["src"] = URI.join(context[:base_url], src).to_s
|
226
|
+
end
|
227
|
+
end
|
222
228
|
end
|
223
229
|
```
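
The URL handling in that filter can be tried on its own with only the standard library. The sketch below isolates the `URI.join` step; `absolutize` is a hypothetical helper name, and the `base_url` value is assumed to mirror the context hash used earlier in this README.

```ruby
require "uri"

# Assumed value, mirroring the `base_url` entry of the context hash.
base_url = "http://your-domain.com"

# Hypothetical helper: returns an absolute URL for root-relative paths
# and leaves every other value untouched (apart from stripping whitespace).
def absolutize(base_url, src)
  src = src.strip
  return src unless src.start_with?("/")

  URI.join(base_url, src).to_s
end

relative = absolutize(base_url, " /images/logo.png ")
absolute = absolutize(base_url, "https://cdn.example.com/a.png")
puts relative
puts absolute
```

Only values starting with `/` are rewritten, so already-absolute URLs pass through unchanged.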
|
224
230
|
|
225
|
-
|
231
|
+
For more information on how to write effective `NodeFilter`s, refer to the provided filters, and see the underlying lib, [Selma](https://www.github.com/gjtorikian/selma) for more information.
|
226
232
|
|
227
|
-
|
228
|
-
|
229
|
-
|
233
|
+
- `AbsoluteSourceFilter` - replace relative image urls with fully qualified versions
|
234
|
+
- `EmojiFilter` - converts `:<emoji>:` to [emoji](http://www.emoji-cheat-sheet.com/)!
|
235
|
+
- `HttpsFilter` - replaces http urls with https versions
|
236
|
+
- `ImageMaxWidthFilter` - link to full size image for large images
|
237
|
+
- `MentionFilter` - replace `@user` mentions with links
|
238
|
+
- `SanitizationFilter` - sanitizes user markup
|
239
|
+
- `TableOfContentsFilter` - anchor headings with name attributes and generate Table of Contents html unordered list linking headings
|
240
|
+
- `TeamMentionFilter` - replace `@org/team` mentions with links
|
230
241
|
|
231
|
-
|
242
|
+
## Dependencies
|
232
243
|
|
233
|
-
|
234
|
-
|
235
|
-
whether the filter is a common enough use case to belong in this gem, or should be
|
236
|
-
built as an external gem.
|
244
|
+
Since filters can be customized to your heart's content, gem dependencies are _not_ bundled; this project doesn't know which of the default filters you might use, and as such, you must bundle each filter's gem
|
245
|
+
dependencies yourself.
|
237
246
|
|
238
|
-
|
247
|
+
> **Note**
|
248
|
+
> See the [Gemfile](/Gemfile) `:test` group for any version requirements.
|
239
249
|
|
240
|
-
|
241
|
-
* [jekyll-html-pipeline](https://github.com/gjtorikian/jekyll-html-pipeline)
|
242
|
-
* [nanoc-html-pipeline](https://github.com/burnto/nanoc-html-pipeline)
|
243
|
-
* [html-pipeline-bitly](https://github.com/dewski/html-pipeline-bitly)
|
244
|
-
* [html-pipeline-cite](https://github.com/lifted-studios/html-pipeline-cite)
|
245
|
-
* [tilt-html-pipeline](https://github.com/bradgessler/tilt-html-pipeline)
|
246
|
-
* [html-pipeline-wiki-link'](https://github.com/lifted-studios/html-pipeline-wiki-link) - WikiMedia-style wiki links
|
247
|
-
* [task_list](https://github.com/github/task_list) - GitHub flavor Markdown Task List
|
248
|
-
* [html-pipeline-nico_link](https://github.com/rutan/html-pipeline-nico_link) - An HTML::Pipeline filter for [niconico](http://www.nicovideo.jp) description links
|
249
|
-
* [html-pipeline-gitlab](https://gitlab.com/gitlab-org/html-pipeline-gitlab) - This gem implements various filters for html-pipeline used by GitLab
|
250
|
-
* [html-pipeline-youtube](https://github.com/st0012/html-pipeline-youtube) - An HTML::Pipeline filter for YouTube links
|
251
|
-
* [html-pipeline-flickr](https://github.com/st0012/html-pipeline-flickr) - An HTML::Pipeline filter for Flickr links
|
252
|
-
* [html-pipeline-vimeo](https://github.com/dlackty/html-pipeline-vimeo) - An HTML::Pipeline filter for Vimeo links
|
253
|
-
* [html-pipeline-hashtag](https://github.com/mr-dxdy/html-pipeline-hashtag) - An HTML::Pipeline filter for hashtags
|
254
|
-
* [html-pipeline-linkify_github](https://github.com/jollygoodcode/html-pipeline-linkify_github) - An HTML::Pipeline filter to autolink GitHub urls
|
255
|
-
* [html-pipeline-redcarpet_filter](https://github.com/bmikol/html-pipeline-redcarpet_filter) - Render Markdown source text into Markdown HTML using Redcarpet
|
256
|
-
* [html-pipeline-typogruby_filter](https://github.com/bmikol/html-pipeline-typogruby_filter) - Add Typogruby text filters to your HTML::Pipeline
|
257
|
-
* [korgi](https://github.com/jodeci/korgi) - HTML::Pipeline filters for links to Rails resources
|
250
|
+
When developing a custom filter, call `HTMLPipeline.require_dependency` at the start to ensure that the local machine has the necessary dependency. You can also use `HTMLPipeline.require_dependencies` to provide a list of dependencies to check.
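
As a rough sketch of the idea behind such a check (not the gem's implementation, whose signature and error type may differ), a helper can attempt the `require` and re-raise with a friendlier message. `require_optional_dependency` and the filter names are hypothetical.

```ruby
# Hypothetical helper; HTMLPipeline's own method may behave differently.
def require_optional_dependency(name, requirer)
  require name
  true
rescue LoadError => e
  # Re-raise with a message that tells the user which filter needs the gem.
  raise LoadError, "Missing dependency '#{name}' for #{requirer}. " \
                   "Add it to your Gemfile. (#{e.message})"
end

# "uri" ships with Ruby, so this call succeeds.
loaded = require_optional_dependency("uri", "MyUriFilter")

# A gem that is not installed produces the friendlier error.
begin
  require_optional_dependency("definitely_not_a_real_gem_xyz", "MyFilter")
rescue LoadError => e
  message = e.message
end
puts message
```

Failing at load time, rather than on the first `call`, surfaces a missing gem as soon as the filter file is required.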
|
258
251
|
|
252
|
+
## Documentation
|
253
|
+
|
254
|
+
Full reference documentation can be [found here](http://rubydoc.info/gems/html-pipeline/frames).
|
259
255
|
|
260
256
|
## Instrumenting
|
261
257
|
|
@@ -263,107 +259,102 @@ Filters and Pipelines can be set up to be instrumented when called. The pipeline
|
|
263
259
|
must be setup with an
|
264
260
|
[ActiveSupport::Notifications](http://api.rubyonrails.org/classes/ActiveSupport/Notifications.html)
|
265
261
|
compatible service object and a name. New pipeline objects will default to the
|
266
|
-
`
|
262
|
+
`HTMLPipeline.default_instrumentation_service` object.
|
267
263
|
|
268
|
-
```
|
264
|
+
```ruby
|
269
265
|
# the AS::Notifications-compatible service object
|
270
266
|
service = ActiveSupport::Notifications
|
271
267
|
|
272
268
|
# instrument a specific pipeline
|
273
|
-
pipeline =
|
269
|
+
pipeline = HTMLPipeline.new [MarkdownFilter], context
|
274
270
|
pipeline.setup_instrumentation "MarkdownPipeline", service
|
275
271
|
|
276
272
|
# or set default instrumentation service for all new pipelines
|
277
|
-
|
278
|
-
pipeline =
|
273
|
+
HTMLPipeline.default_instrumentation_service = service
|
274
|
+
pipeline = HTMLPipeline.new [MarkdownFilter], context
|
279
275
|
pipeline.setup_instrumentation "MarkdownPipeline"
|
280
276
|
```
|
281
277
|
|
282
278
|
Filters are instrumented when they are run through the pipeline. A
|
283
|
-
`call_filter.html_pipeline` event is published once
|
284
|
-
`
|
279
|
+
`call_filter.html_pipeline` event is published once any filter finishes; `call_text_filters`
|
280
|
+
and `call_node_filters` are published when all of the text and node filters have finished, respectively.
|
281
|
+
The `payload` should include the `filter` name. Each filter will trigger its own
|
285
282
|
instrumentation call.
|
286
283
|
|
287
|
-
```
|
284
|
+
```ruby
|
288
285
|
service.subscribe "call_filter.html_pipeline" do |event, start, ending, transaction_id, payload|
|
289
286
|
payload[:pipeline] #=> "MarkdownPipeline", set with `setup_instrumentation`
|
290
287
|
payload[:filter] #=> "MarkdownFilter"
|
291
288
|
payload[:context] #=> context Hash
|
292
289
|
payload[:result] #=> instance of result class
|
293
|
-
payload[:result][:output] #=> output HTML String
|
290
|
+
payload[:result][:output] #=> output HTML String
|
294
291
|
end
|
295
292
|
```
|
296
293
|
|
297
294
|
The full pipeline is also instrumented:
|
298
295
|
|
299
|
-
```
|
300
|
-
service.subscribe "
|
296
|
+
```ruby
|
297
|
+
service.subscribe "call_text_filters.html_pipeline" do |event, start, ending, transaction_id, payload|
|
301
298
|
payload[:pipeline] #=> "MarkdownPipeline", set with `setup_instrumentation`
|
302
299
|
payload[:filters] #=> ["MarkdownFilter"]
|
303
|
-
payload[:doc] #=> HTML String
|
300
|
+
payload[:doc] #=> HTML String
|
304
301
|
payload[:context] #=> context Hash
|
305
302
|
payload[:result] #=> instance of result class
|
306
|
-
payload[:result][:output] #=> output HTML String
|
303
|
+
payload[:result][:output] #=> output HTML String
|
307
304
|
end
|
308
305
|
```
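
To experiment with these events without loading ActiveSupport, a small stand-in service can be used. The sketch below assumes that a "compatible" service only needs `instrument` and `subscribe` with the shapes shown; `TinyNotifications` is a hypothetical class, and whether it covers everything HTMLPipeline calls on the service is an assumption to verify.

```ruby
require "securerandom"

# Hypothetical, minimal stand-in for an ActiveSupport::Notifications-like service.
class TinyNotifications
  def initialize
    @subscribers = Hash.new { |hash, key| hash[key] = [] }
  end

  # Registers a block to be called whenever `name` is instrumented.
  def subscribe(name, &block)
    @subscribers[name] << block
  end

  # Runs the given block, then notifies subscribers with timing and payload.
  def instrument(name, payload = {})
    start = Time.now
    result = yield(payload) if block_given?
    ending = Time.now
    transaction_id = SecureRandom.hex(10)
    @subscribers[name].each do |subscriber|
      subscriber.call(name, start, ending, transaction_id, payload)
    end
    result
  end
end

service = TinyNotifications.new
seen = []
service.subscribe("call_filter.html_pipeline") do |event, _start, _ending, _id, payload|
  seen << [event, payload[:filter]]
end

# Simulates what a pipeline might publish after one filter finishes.
service.instrument("call_filter.html_pipeline", { filter: "MarkdownFilter" }) do |payload|
  payload[:result] = { output: "<p>hi</p>" }
end
puts seen.inspect
```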
|
309
306
|
|
307
|
+
## Third Party Extensions
|
308
|
+
|
309
|
+
If you have an idea for a filter, propose it as
|
310
|
+
[an issue](https://github.com/gjtorikian/html-pipeline/issues) first. This allows us to discuss
|
311
|
+
whether the filter is a common enough use case to belong in this gem, or should be
|
312
|
+
built as an external gem.
|
313
|
+
|
314
|
+
Here are some extensions people have built:
|
315
|
+
|
316
|
+
- [html-pipeline-asciidoc_filter](https://github.com/asciidoctor/html-pipeline-asciidoc_filter)
|
317
|
+
- [jekyll-html-pipeline](https://github.com/gjtorikian/jekyll-html-pipeline)
|
318
|
+
- [nanoc-html-pipeline](https://github.com/burnto/nanoc-html-pipeline)
|
319
|
+
- [html-pipeline-bitly](https://github.com/dewski/html-pipeline-bitly)
|
320
|
+
- [html-pipeline-cite](https://github.com/lifted-studios/html-pipeline-cite)
|
321
|
+
- [tilt-html-pipeline](https://github.com/bradgessler/tilt-html-pipeline)
|
322
|
+
- [html-pipeline-wiki-link](https://github.com/lifted-studios/html-pipeline-wiki-link) - WikiMedia-style wiki links
|
323
|
+
- [task_list](https://github.com/github/task_list) - GitHub flavor Markdown Task List
|
324
|
+
- [html-pipeline-nico_link](https://github.com/rutan/html-pipeline-nico_link) - An HTMLPipeline filter for [niconico](http://www.nicovideo.jp) description links
|
325
|
+
- [html-pipeline-gitlab](https://gitlab.com/gitlab-org/html-pipeline-gitlab) - This gem implements various filters for html-pipeline used by GitLab
|
326
|
+
- [html-pipeline-youtube](https://github.com/st0012/html-pipeline-youtube) - An HTMLPipeline filter for YouTube links
|
327
|
+
- [html-pipeline-flickr](https://github.com/st0012/html-pipeline-flickr) - An HTMLPipeline filter for Flickr links
|
328
|
+
- [html-pipeline-vimeo](https://github.com/dlackty/html-pipeline-vimeo) - An HTMLPipeline filter for Vimeo links
|
329
|
+
- [html-pipeline-hashtag](https://github.com/mr-dxdy/html-pipeline-hashtag) - An HTMLPipeline filter for hashtags
|
330
|
+
- [html-pipeline-linkify_github](https://github.com/jollygoodcode/html-pipeline-linkify_github) - An HTMLPipeline filter to autolink GitHub urls
|
331
|
+
- [html-pipeline-redcarpet_filter](https://github.com/bmikol/html-pipeline-redcarpet_filter) - Render Markdown source text into Markdown HTML using Redcarpet
|
332
|
+
- [html-pipeline-typogruby_filter](https://github.com/bmikol/html-pipeline-typogruby_filter) - Add Typogruby text filters to your HTMLPipeline
|
333
|
+
- [korgi](https://github.com/jodeci/korgi) - HTMLPipeline filters for links to Rails resources
|
334
|
+
|
310
335
|
## FAQ
|
311
336
|
|
312
337
|
### 1. Why doesn't my pipeline work when there's no root element in the document?
|
313
338
|
|
314
339
|
To make a pipeline work on a plain text document, put the `PlainTextInputFilter`
|
315
|
-
at the
|
316
|
-
filters have a root element to work with. If you're passing in an HTML fragment,
|
340
|
+
at the end of your `text_filters` config. This will wrap the content in a `div` so the filters have a root element to work with. If you're passing in an HTML fragment,
|
317
341
|
but it doesn't have a root element, you can wrap the content in a `div`
|
318
|
-
yourself.
|
319
|
-
|
320
|
-
```ruby
|
321
|
-
EmojiPipeline = Pipeline.new [
|
322
|
-
PlainTextInputFilter, # <- Wraps input in a div and escapes html tags
|
323
|
-
EmojiFilter
|
324
|
-
], context
|
325
|
-
|
326
|
-
plain_text = "Gutentag! :wave:"
|
327
|
-
EmojiPipeline.call(plain_text)
|
328
|
-
|
329
|
-
html_fragment = "This is outside of an html element, but <strong>this isn't. :+1:</strong>"
|
330
|
-
EmojiPipeline.call("<div>#{html_fragment}</div>") # <- Wrap your own html fragments to avoid escaping
|
331
|
-
```
|
342
|
+
yourself.
|
332
343
|
|
333
344
|
### 2. How do I customize an allowlist for `SanitizationFilter`s?
|
334
345
|
|
335
|
-
`SanitizationFilter::ALLOWLIST` is the default allowlist used if no
|
336
|
-
argument is given
|
346
|
+
`HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG` is the default allowlist used if no `sanitization_config`
|
347
|
+
argument is given. The default is a good starting template for
|
337
348
|
you to add additional elements. You can either modify the constant's value, or
|
338
|
-
re-define your own
|
339
|
-
|
340
|
-
## Contributing
|
349
|
+
re-define your own config and pass that in, such as:
|
341
350
|
|
342
|
-
|
343
|
-
|
344
|
-
|
345
|
-
|
346
|
-
3. Commit your changes (`git commit -am 'Added some feature'`)
|
347
|
-
4. Push to the branch (`git push origin my-new-feature`)
|
348
|
-
5. Create new [Pull Request](https://help.github.com/articles/using-pull-requests)
|
349
|
-
|
350
|
-
To see what has changed in recent versions, see the [CHANGELOG](https://github.com/jch/html-pipeline/blob/master/CHANGELOG.md).
|
351
|
+
```ruby
|
352
|
+
config = HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG.dup
|
353
|
+
config[:elements] << "iframe" # sure, whatever you want
|
354
|
+
```
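
One caveat when customizing a nested config: `Hash#dup` is a shallow copy, so nested arrays are still shared with the original. The sketch below shows a deep copy in plain Ruby; `EXAMPLE_DEFAULT_CONFIG` is a made-up stand-in, not the gem's real default config, and `deep_copy` is a hypothetical helper.

```ruby
# Stand-in for a nested config hash; the real default config's contents differ.
EXAMPLE_DEFAULT_CONFIG = {
  elements: ["p", "pre", "code"],
  attributes: { "a" => ["href"] }
}

# Recursively copies hashes and arrays so the copy shares no containers
# with the original.
def deep_copy(value)
  case value
  when Hash  then value.each_with_object({}) { |(k, v), copy| copy[k] = deep_copy(v) }
  when Array then value.map { |v| deep_copy(v) }
  else value
  end
end

config = deep_copy(EXAMPLE_DEFAULT_CONFIG)
config[:elements] << "iframe"

puts config[:elements].inspect
puts EXAMPLE_DEFAULT_CONFIG[:elements].inspect
```

With a deep copy, appending to `config[:elements]` leaves the original list untouched.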
|
351
355
|
|
352
356
|
### Contributors
|
353
357
|
|
354
|
-
Thanks to all of [these contributors](https://github.com/
|
355
|
-
|
356
|
-
Project is a member of the [OSS Manifesto](http://ossmanifesto.org/).
|
357
|
-
|
358
|
-
The current maintainer is @gjtorikian
|
359
|
-
|
360
|
-
### Releasing A New Version
|
361
|
-
|
362
|
-
This section is for gem maintainers to cut a new version of the gem.
|
358
|
+
Thanks to all of [these contributors](https://github.com/gjtorikian/html-pipeline/graphs/contributors).
|
363
359
|
|
364
|
-
|
365
|
-
* update lib/html/pipeline/version.rb to next version number X.X.X
|
366
|
-
* update CHANGELOG.md. Prepare a draft with `script/changelog`
|
367
|
-
* push branch and create a new pull request
|
368
|
-
* after tests are green, merge to master
|
369
|
-
* on the master branch, run `script/release`
|
360
|
+
This project is a member of the [OSS Manifesto](http://ossmanifesto.org/).
|
data/Rakefile
CHANGED
@@ -1,17 +1,24 @@
|
|
1
1
|
#!/usr/bin/env rake
|
2
2
|
# frozen_string_literal: true
|
3
3
|
|
4
|
-
require
|
5
|
-
require
|
6
|
-
|
7
|
-
require 'bundler/gem_tasks'
|
8
|
-
require 'rake/testtask'
|
4
|
+
require "bundler/gem_tasks"
|
5
|
+
require "rubygems/package_task"
|
6
|
+
require "rake/testtask"
|
9
7
|
|
10
8
|
Rake::TestTask.new do |t|
|
11
|
-
t.libs <<
|
12
|
-
t.test_files = FileList[
|
9
|
+
t.libs << "test"
|
10
|
+
t.test_files = FileList["test/**/*_test.rb"]
|
13
11
|
t.verbose = true
|
14
12
|
t.warning = false
|
15
13
|
end
|
16
14
|
|
17
15
|
task default: :test
|
16
|
+
|
17
|
+
require "rubocop/rake_task"
|
18
|
+
|
19
|
+
RuboCop::RakeTask.new(:rubocop)
|
20
|
+
|
21
|
+
GEMSPEC = Bundler.load_gemspec("html-pipeline.gemspec")
|
22
|
+
gem_path = Gem::PackageTask.new(GEMSPEC).define
|
23
|
+
desc "Package the ruby gem"
|
24
|
+
task "package" => [gem_path]
|
data/UPGRADING.md
ADDED
@@ -0,0 +1,35 @@
|
|
1
|
+
# Upgrade Guide
|
2
|
+
|
3
|
+
## From v2 to v3
|
4
|
+
|
5
|
+
HTMLPipeline v3 is a massive improvement over this still much loved (and woefully under-maintained) project. This section will attempt to list all of the breaking changes between the two versions and provide suggestions on how to upgrade.
|
6
|
+
|
7
|
+
### Changed namespace
|
8
|
+
|
9
|
+
This project is now under a module called `HTMLPipeline`, not `HTML::Pipeline`.
|
10
|
+
|
11
|
+
### Removed filters
|
12
|
+
|
13
|
+
The following filters were removed:
|
14
|
+
|
15
|
+
- `AutolinkFilter`: this is handled by [Commonmarker](https://www.github.com/gjtorikian/commonmarker) and can be disabled/enabled through the `MarkdownFilter`'s `context` hash
|
16
|
+
- `SyntaxHighlightFilter`: this is handled by [Commonmarker](https://www.github.com/gjtorikian/commonmarker) and can be disabled/enabled through the `MarkdownFilter`'s `context` hash
|
17
|
+
- `SanitizationFilter`: this is handled by [Selma](https://www.github.com/gjtorikian/selma); configuration can be done through the `sanitization_config` hash
|
18
|
+
|
19
|
+
- `EmailReplyFilter`
|
20
|
+
- `CamoFilter`
|
21
|
+
- `TextFilter`
|
22
|
+
|
23
|
+
### Changed API
|
24
|
+
|
25
|
+
The new way to call this project is as follows:
|
26
|
+
|
27
|
+
```ruby
|
28
|
+
HTMLPipeline.new(
|
29
|
+
text_filters: [], # array of instantiated (`.new`ed) `HTMLPipeline::TextFilter`
|
30
|
+
convert_filter:, # a filter that runs to turn text into HTML
|
31
|
+
sanitization_config: {}, # an allowlist of elements/attributes/protocols to keep
|
32
|
+
node_filters: []) # array of instantiated (`.new`ed) `HTMLPipeline::NodeFilter`
|
33
|
+
```
|
34
|
+
|
35
|
+
Please refer to the README for more information on constructing filters. In most cases, the underlying filter needs only a few changes, primarily to make use of [Selma](https://www.github.com/gjtorikian/selma) rather than Nokogiri.
|