html-pipeline 2.14.3 → 3.0.0.pre1

Files changed (54)
  1. checksums.yaml +4 -4
  2. data/.github/FUNDING.yml +11 -3
  3. data/.github/dependabot.yml +20 -0
  4. data/.github/workflows/automerge.yml +34 -0
  5. data/.github/workflows/lint.yml +23 -0
  6. data/.github/workflows/tag_and_release.yml +70 -0
  7. data/.github/workflows/test.yml +33 -0
  8. data/.rubocop.yml +17 -0
  9. data/CHANGELOG.md +28 -2
  10. data/Gemfile +29 -15
  11. data/{LICENSE → LICENSE.txt} +2 -2
  12. data/README.md +209 -218
  13. data/Rakefile +14 -7
  14. data/UPGRADING.md +35 -0
  15. data/html-pipeline.gemspec +31 -21
  16. data/lib/html-pipeline.rb +3 -0
  17. data/lib/html_pipeline/convert_filter/markdown_filter.rb +26 -0
  18. data/lib/html_pipeline/convert_filter.rb +17 -0
  19. data/lib/html_pipeline/filter.rb +89 -0
  20. data/lib/{html/pipeline → html_pipeline/node_filter}/absolute_source_filter.rb +23 -21
  21. data/lib/{html/pipeline → html_pipeline/node_filter}/emoji_filter.rb +58 -54
  22. data/lib/html_pipeline/node_filter/https_filter.rb +22 -0
  23. data/lib/html_pipeline/node_filter/image_max_width_filter.rb +40 -0
  24. data/lib/{html/pipeline/@mention_filter.rb → html_pipeline/node_filter/mention_filter.rb} +55 -69
  25. data/lib/html_pipeline/node_filter/table_of_contents_filter.rb +68 -0
  26. data/lib/html_pipeline/node_filter/team_mention_filter.rb +105 -0
  27. data/lib/html_pipeline/node_filter.rb +31 -0
  28. data/lib/html_pipeline/sanitization_filter.rb +65 -0
  29. data/lib/{html/pipeline → html_pipeline/text_filter}/image_filter.rb +3 -3
  30. data/lib/{html/pipeline → html_pipeline/text_filter}/plain_text_input_filter.rb +3 -5
  31. data/lib/html_pipeline/text_filter.rb +21 -0
  32. data/lib/html_pipeline/version.rb +5 -0
  33. data/lib/html_pipeline.rb +252 -0
  34. metadata +52 -54
  35. data/.travis.yml +0 -43
  36. data/Appraisals +0 -19
  37. data/CONTRIBUTING.md +0 -60
  38. data/bin/html-pipeline +0 -78
  39. data/lib/html/pipeline/@team_mention_filter.rb +0 -99
  40. data/lib/html/pipeline/autolink_filter.rb +0 -34
  41. data/lib/html/pipeline/body_content.rb +0 -44
  42. data/lib/html/pipeline/camo_filter.rb +0 -105
  43. data/lib/html/pipeline/email_reply_filter.rb +0 -69
  44. data/lib/html/pipeline/filter.rb +0 -165
  45. data/lib/html/pipeline/https_filter.rb +0 -29
  46. data/lib/html/pipeline/image_max_width_filter.rb +0 -37
  47. data/lib/html/pipeline/markdown_filter.rb +0 -56
  48. data/lib/html/pipeline/sanitization_filter.rb +0 -144
  49. data/lib/html/pipeline/syntax_highlight_filter.rb +0 -50
  50. data/lib/html/pipeline/text_filter.rb +0 -16
  51. data/lib/html/pipeline/textile_filter.rb +0 -25
  52. data/lib/html/pipeline/toc_filter.rb +0 -69
  53. data/lib/html/pipeline/version.rb +0 -7
  54. data/lib/html/pipeline.rb +0 -210
data/README.md CHANGED
@@ -1,23 +1,23 @@
1
- # HTML::Pipeline [![Build Status](https://travis-ci.org/jch/html-pipeline.svg?branch=master)](https://travis-ci.org/jch/html-pipeline)
1
+ # HTMLPipeline
2
2
 
3
- HTML processing filters and utilities. This module includes a small
4
- framework for defining DOM based content filters and applying them to user
3
+ > **Note**
4
+ > This README refers to the behavior in the new 3.0.0.pre1 gem.
5
+
6
+ HTML processing filters and utilities. This module is a small
7
+ framework for defining CSS-based content filters and applying them to user
5
8
  provided content.
6
9
 
7
- [This project was started at GitHub](https://github.com/blog/1311-html-pipeline-chainable-content-filters). While GitHub still uses a similar design and pattern for rendering content, this gem should be considered standalone and independent from GitHub.
10
+ [This project was started at GitHub](https://github.com/blog/1311-html-pipeline-chainable-content-filters), but GitHub no longer uses it. This gem must be considered standalone and independent from GitHub.
8
11
 
9
12
  - [Installation](#installation)
10
13
  - [Usage](#usage)
11
- - [Examples](#examples)
14
+ - [More Examples](#more-examples)
12
15
  - [Filters](#filters)
13
16
  - [Dependencies](#dependencies)
14
17
  - [Documentation](#documentation)
15
- - [Extending](#extending)
16
- - [3rd Party Extensions](#3rd-party-extensions)
17
18
  - [Instrumenting](#instrumenting)
18
- - [Contributing](#contributing)
19
- - [Contributors](#contributors)
20
- - [Releasing A New Version](#releasing-a-new-version)
19
+ - [Third Party Extensions](#third-party-extensions)
20
+ - [FAQ](#faq)
21
21
 
22
22
  ## Installation
23
23
 
@@ -42,220 +42,216 @@ $ gem install html-pipeline
42
42
  ## Usage
43
43
 
44
44
  This library provides a handful of chainable HTML filters to transform user
45
- content into markup. A filter takes an HTML string or
46
- `Nokogiri::HTML::DocumentFragment`, optionally manipulates it, and then
47
- outputs the result.
45
+ content into HTML markup. Each filter does some work, and then hands off the
46
+ results to the next filter. A pipeline has several kinds of filters available to use:
48
47
 
49
- For example, to transform Markdown source into Markdown HTML:
48
+ - Multiple `TextFilter`s, which operate on a UTF-8 string
49
+ - A `ConvertFilter`, which turns text into HTML (e.g., CommonMark/AsciiDoc -> HTML)
50
+ - A `SanitizationFilter`, which removes dangerous/unwanted HTML elements and attributes
51
+ - Multiple `NodeFilter`s, which operate on a UTF-8 HTML document
50
52
 
51
- ```ruby
52
- require 'html/pipeline'
53
+ You can assemble each sequence into a single pipeline, or choose to call each filter individually.
53
54
 
54
- filter = HTML::Pipeline::MarkdownFilter.new("Hi **world**!")
55
- filter.call
56
- ```
55
+ As an example, suppose we want to transform Commonmark source text into Markdown HTML. With the content, we also want to:
56
+
57
+ - change every instance of `$NAME` to "Johnny"
58
+ - strip undesired HTML
59
+ - linkify @mentions
57
60
 
58
- Filters can be combined into a pipeline which causes each filter to hand its
59
- output to the next filter's input. So if you wanted to have content be
60
- filtered through Markdown and be syntax highlighted, you can create the
61
- following pipeline:
61
+ We can construct a pipeline to do all that like this:
62
62
 
63
63
  ```ruby
64
- pipeline = HTML::Pipeline.new [
65
- HTML::Pipeline::MarkdownFilter,
66
- HTML::Pipeline::SyntaxHighlightFilter
67
- ]
68
- result = pipeline.call <<-CODE
69
- This is *great*:
64
+ require 'html_pipeline'
70
65
 
71
- some_code(:first)
66
+ class HelloJohnnyFilter < HTMLPipeline::TextFilter
67
+ def call
68
+ text.gsub("$NAME", "Johnny")
69
+ end
70
+ end
72
71
 
73
- CODE
74
- result[:output].to_s
72
+ pipeline = HTMLPipeline.new(
73
+ text_filters: [HelloJohnnyFilter.new],
74
+ convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
75
+ # note: next line is not needed as sanitization occurs by default;
76
+ # see below for more info
77
+ sanitization_config: HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG,
78
+ node_filters: [HTMLPipeline::NodeFilter::MentionFilter.new]
79
+ )
80
+ pipeline.call(user_supplied_text) # recommended: can call pipeline over and over
75
81
  ```
76
82
 
77
- Prints:
78
-
79
- ```html
80
- <p>This is <em>great</em>:</p>
83
+ Filters can be custom ones you create (like `HelloJohnnyFilter`), and `HTMLPipeline` additionally provides several helpful ones (detailed below). If you only need a single filter, you can call one individually, too:
81
84
 
82
- <pre><code>some_code(:first)
83
- </code></pre>
85
+ ```ruby
86
+ filter = HTMLPipeline::ConvertFilter::MarkdownFilter.new(text)
87
+ filter.call
84
88
  ```
85
89
 
86
- To generate CSS for HTML formatted code, use the [Rouge CSS Theme](https://github.com/rouge-ruby/rouge#css-options) `#css` method. `rouge` is a dependency of the `SyntaxHighlightFilter`.
90
+ Filters combine into a sequential pipeline, and each filter hands its
91
+ output to the next filter's input. Text filters are
92
+ processed first, then the convert filter, sanitization filter, and finally, the node filters.
87
93
 
88
- Some filters take an optional **context** and/or **result** hash. These are
94
+ Some filters take optional `context` and/or `result` hash(es). These are
89
95
  used to pass around arguments and metadata between filters in a pipeline. For
90
- example, if you don't want to use GitHub formatted Markdown, you can pass an
91
- option in the context hash:
96
+ example, if you want to disable footnotes in the `MarkdownFilter`, you can pass an option in the context hash:
92
97
 
93
98
  ```ruby
94
- filter = HTML::Pipeline::MarkdownFilter.new("Hi **world**!", :gfm => false)
99
+ context = { markdown: { extensions: { footnotes: false } } }
100
+ filter = HTMLPipeline::ConvertFilter::MarkdownFilter.new("Hi **world**!", context: context)
95
101
  filter.call
96
102
  ```
97
103
 
98
- ### Examples
104
+ Please refer to the documentation for each filter to understand what configuration options are available.
105
+
106
+ ### More Examples
99
107
 
100
- We define different pipelines for different parts of our app. Here are a few
108
+ Different pipelines can be defined for different parts of an app. Here are a few
101
109
  paraphrased snippets to get you started:
102
110
 
103
111
  ```ruby
104
112
  # The context hash is how you pass options between different filters.
105
113
  # See individual filter source for explanation of options.
106
114
  context = {
107
- :asset_root => "http://your-domain.com/where/your/images/live/icons",
108
- :base_url => "http://your-domain.com"
115
+ asset_root: "http://your-domain.com/where/your/images/live/icons",
116
+ base_url: "http://your-domain.com"
109
117
  }
110
118
 
111
- # Pipeline providing sanitization and image hijacking but no mention
112
- # related features.
113
- SimplePipeline = Pipeline.new [
114
- SanitizationFilter,
115
- TableOfContentsFilter, # add 'name' anchors to all headers and generate toc list
116
- CamoFilter,
117
- ImageMaxWidthFilter,
118
- SyntaxHighlightFilter,
119
- EmojiFilter,
120
- AutolinkFilter
121
- ], context
122
-
123
119
  # Pipeline used for user provided content on the web
124
- MarkdownPipeline = Pipeline.new [
125
- MarkdownFilter,
126
- SanitizationFilter,
127
- CamoFilter,
128
- ImageMaxWidthFilter,
129
- HttpsFilter,
130
- MentionFilter,
131
- EmojiFilter,
132
- SyntaxHighlightFilter
133
- ], context.merge(:gfm => true) # enable github formatted markdown
134
-
135
-
136
- # Define a pipeline based on another pipeline's filters
137
- NonGFMMarkdownPipeline = Pipeline.new(MarkdownPipeline.filters,
138
- context.merge(:gfm => false))
120
+ MarkdownPipeline = HTMLPipeline.new(
121
+ text_filters: [HTMLPipeline::TextFilter::ImageMaxWidthFilter.new],
122
+ convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
123
+ node_filters: [
124
+ HTMLPipeline::NodeFilter::HttpsFilter.new, HTMLPipeline::NodeFilter::MentionFilter.new,
125
+ ], context: context)
139
126
 
140
127
  # Pipelines aren't limited to the web. You can use them for email
141
128
  # processing also.
142
- HtmlEmailPipeline = Pipeline.new [
143
- PlainTextInputFilter,
144
- ImageMaxWidthFilter
145
- ], {}
146
-
147
- # Just emoji.
148
- EmojiPipeline = Pipeline.new [
149
- PlainTextInputFilter,
150
- EmojiFilter
151
- ], context
129
+ HtmlEmailPipeline = HTMLPipeline.new(
130
+ text_filters: [
131
+ PlainTextInputFilter.new,
132
+ ImageMaxWidthFilter.new
133
+ ])
152
134
  ```
153
135
 
154
136
  ## Filters
155
137
 
156
- * `MentionFilter` - replace `@user` mentions with links
157
- * `TeamMentionFilter` - replace `@org/team` mentions with links
158
- * `AbsoluteSourceFilter` - replace relative image urls with fully qualified versions
159
- * `AutolinkFilter` - auto_linking urls in HTML
160
- * `CamoFilter` - replace http image urls with [camo-fied](https://github.com/atmos/camo) https versions
161
- * `EmailReplyFilter` - util filter for working with emails
162
- * `EmojiFilter` - everyone loves [emoji](http://www.emoji-cheat-sheet.com/)!
163
- * `HttpsFilter` - HTML Filter for replacing http github urls with https versions.
164
- * `ImageMaxWidthFilter` - link to full size image for large images
165
- * `MarkdownFilter` - convert markdown to html
166
- * `PlainTextInputFilter` - html escape text and wrap the result in a div
167
- * `SanitizationFilter` - allow sanitize user markup
168
- * `SyntaxHighlightFilter` - code syntax highlighter
169
- * `TextileFilter` - convert textile to html
170
- * `TableOfContentsFilter` - anchor headings with name attributes and generate Table of Contents html unordered list linking headings
138
+ ### TextFilters
171
139
 
172
- ## Dependencies
140
+ `TextFilter`s must define a method named `call` which is called on the text. `@text`, `@config`, and `@result` are available to use, and any changes made to these ivars are passed on to the next filter.
141
+
142
+ - `ImageFilter` - converts image `url` into `<img>` tag
143
+ - `PlainTextInputFilter` - html escape text and wrap the result in a `<div>`
173
144
 
174
- Filter gem dependencies are not bundled; you must bundle the filter's gem
175
- dependencies. The below list details filters with dependencies. For example,
176
- `SyntaxHighlightFilter` uses [rouge](https://github.com/jneen/rouge)
177
- to detect and highlight languages. For example, to use the `SyntaxHighlightFilter`,
178
- add the following to your Gemfile:
145
+ ### ConvertFilter
146
+
147
+ The `ConvertFilter` takes text and turns it into HTML. `@text`, `@config`, and `@result` are available to use. `ConvertFilter` must define a method named `call`, taking one argument, `text`. `call` must return a string representing the new HTML document.
148
+
149
+ - `MarkdownFilter` - creates HTML from text using [Commonmarker](https://www.github.com/gjtorikian/commonmarker)
150
+
151
+ ### Sanitization
152
+
153
+ Because the web can be a scary place, HTML is automatically sanitized after the `ConvertFilter` runs and before the `NodeFilter`s are processed. This is to prevent malicious or unexpected input from entering the pipeline.
154
+
155
+ The sanitization process takes a hash configuration of settings. See the [Selma](https://www.github.com/gjtorikian/selma) documentation for more information on how to configure these settings.
156
+
157
+ A default sanitization config is provided by this library (`HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG`). A sample custom sanitization allowlist might look like this:
179
158
 
180
159
  ```ruby
181
- gem 'rouge'
160
+ ALLOWLIST = {
161
+ elements: ["p", "pre", "code"]
162
+ }
163
+
164
+ pipeline = HTMLPipeline.new \
165
+ text_filters: [
166
+ HTMLPipeline::MarkdownFilter,
167
+ ],
168
+ convert_filter: HTMLPipeline::ConvertFilter::MarkdownFilter.new,
169
+ sanitization_config: ALLOWLIST
170
+
171
+ result = pipeline.call <<-CODE
172
+ This is *great*:
173
+
174
+ some_code(:first)
175
+
176
+ CODE
177
+ result[:output].to_s
182
178
  ```
183
179
 
184
- * `AutolinkFilter` - `rinku`
185
- * `EmailReplyFilter` - `escape_utils`, `email_reply_parser`
186
- * `EmojiFilter` - `gemoji`
187
- * `MarkdownFilter` - `commonmarker`
188
- * `PlainTextInputFilter` - `escape_utils`
189
- * `SanitizationFilter` - `sanitize`
190
- * `SyntaxHighlightFilter` - `rouge`
191
- * `TableOfContentsFilter` - `escape_utils`
192
- * `TextileFilter` - `RedCloth`
180
+ This would print:
193
181
 
194
- _Note:_ See [Gemfile](/Gemfile) `:test` block for version requirements.
182
+ ```html
183
+ <p>This is great:</p>
184
+ <pre><code>some_code(:first)
185
+ </code></pre>
186
+ ```
195
187
 
196
- ## Documentation
188
+ Sanitization can be disabled if and only if `nil` is explicitly passed as
189
+ the config:
197
190
 
198
- Full reference documentation can be [found here](http://rubydoc.info/gems/html-pipeline/frames).
191
+ ```ruby
192
+ pipeline = HTMLPipeline.new \
193
+ text_filters: [
194
+ HTMLPipeline::MarkdownFilter,
195
+ ],
196
+ convert_filter: [HTMLPipeline::ConvertFilter::MarkdownFilter.new],
197
+ sanitization_config: nil
198
+ ```
199
+
200
+ For more examples of customizing the sanitization process to include the tags you want, check out [the tests](test/sanitization_filter_test.rb) and [the FAQ](#faq).
199
201
 
200
- ## Extending
201
- To write a custom filter, you need a class with a `call` method that inherits
202
- from `HTML::Pipeline::Filter`.
202
+ ### NodeFilters
203
203
 
204
- For example this filter adds a base url to images that are root relative:
204
+ `NodeFilter`s can operate either on HTML elements or text nodes using CSS selectors. Each `NodeFilter` must define a method named `selector` which provides an instance of `Selma::Selector`. If elements are being manipulated, `handle_element` must be defined, taking one argument, `element`; if text nodes are being manipulated, `handle_text_chunk` must be defined, taking one argument, `text_chunk`. `@config` and `@result` are available to use, and any changes made to these ivars are passed on to the next filter.
205
+
206
+ `NodeFilter` also has an optional method, `after_initialize`, which is run after the filter initializes. This can be useful in setting up a custom state for `result` to take advantage of.
207
+
208
+ Here's an example `NodeFilter` that adds a base url to images that are root relative:
205
209
 
206
210
  ```ruby
207
211
  require 'uri'
208
212
 
209
- class RootRelativeFilter < HTML::Pipeline::Filter
213
+ class RootRelativeFilter < HTMLPipeline::NodeFilter
210
214
 
211
- def call
212
- doc.search("img").each do |img|
213
- next if img['src'].nil?
214
- src = img['src'].strip
215
- if src.start_with? '/'
216
- img["src"] = URI.join(context[:base_url], src).to_s
217
- end
218
- end
219
- doc
215
+ SELECTOR = Selma::Selector.new(match_element: "img")
216
+
217
+ def selector
218
+ SELECTOR
220
219
  end
221
220
 
221
+ def handle_element(img)
222
+ return if img['src'].nil?
223
+ src = img['src'].strip
224
+ if src.start_with? '/'
225
+ img["src"] = URI.join(context[:base_url], src).to_s
226
+ end
227
+ end
222
228
  end
223
229
  ```
224
230
 
225
- Now this filter can be used in a pipeline:
231
+ To learn how to write effective `NodeFilter`s, refer to the provided filters and to the underlying library, [Selma](https://www.github.com/gjtorikian/selma).
226
232
 
227
- ```ruby
228
- Pipeline.new [ RootRelativeFilter ], { :base_url => 'http://somehost.com' }
229
- ```
233
+ - `AbsoluteSourceFilter` - replace relative image urls with fully qualified versions
234
+ - `EmojiFilter` - converts `:<emoji>:` to [emoji](http://www.emoji-cheat-sheet.com/)!
235
+ - `HttpsFilter` - replace http urls with https versions
236
+ - `ImageMaxWidthFilter` - link to full size image for large images
237
+ - `MentionFilter` - replace `@user` mentions with links
238
+ - `SanitizationFilter` - sanitize user markup
239
+ - `TableOfContentsFilter` - anchor headings with name attributes and generate Table of Contents html unordered list linking headings
240
+ - `TeamMentionFilter` - replace `@org/team` mentions with links
230
241
 
231
- ### 3rd Party Extensions
242
+ ## Dependencies
232
243
 
233
- If you have an idea for a filter, propose it as
234
- [an issue](https://github.com/jch/html-pipeline/issues) first. This allows us discuss
235
- whether the filter is a common enough use case to belong in this gem, or should be
236
- built as an external gem.
244
+ Since filters can be customized to your heart's content, gem dependencies are _not_ bundled; this project doesn't know which of the default filters you might use, and as such, you must bundle each filter's gem
245
+ dependencies yourself.
237
246
 
238
- Here are some extensions people have built:
247
+ > **Note**
248
+ > See the [Gemfile](/Gemfile) `:test` group for any version requirements.
239
249
 
240
- * [html-pipeline-asciidoc_filter](https://github.com/asciidoctor/html-pipeline-asciidoc_filter)
241
- * [jekyll-html-pipeline](https://github.com/gjtorikian/jekyll-html-pipeline)
242
- * [nanoc-html-pipeline](https://github.com/burnto/nanoc-html-pipeline)
243
- * [html-pipeline-bitly](https://github.com/dewski/html-pipeline-bitly)
244
- * [html-pipeline-cite](https://github.com/lifted-studios/html-pipeline-cite)
245
- * [tilt-html-pipeline](https://github.com/bradgessler/tilt-html-pipeline)
246
- * [html-pipeline-wiki-link'](https://github.com/lifted-studios/html-pipeline-wiki-link) - WikiMedia-style wiki links
247
- * [task_list](https://github.com/github/task_list) - GitHub flavor Markdown Task List
248
- * [html-pipeline-nico_link](https://github.com/rutan/html-pipeline-nico_link) - An HTML::Pipeline filter for [niconico](http://www.nicovideo.jp) description links
249
- * [html-pipeline-gitlab](https://gitlab.com/gitlab-org/html-pipeline-gitlab) - This gem implements various filters for html-pipeline used by GitLab
250
- * [html-pipeline-youtube](https://github.com/st0012/html-pipeline-youtube) - An HTML::Pipeline filter for YouTube links
251
- * [html-pipeline-flickr](https://github.com/st0012/html-pipeline-flickr) - An HTML::Pipeline filter for Flickr links
252
- * [html-pipeline-vimeo](https://github.com/dlackty/html-pipeline-vimeo) - An HTML::Pipeline filter for Vimeo links
253
- * [html-pipeline-hashtag](https://github.com/mr-dxdy/html-pipeline-hashtag) - An HTML::Pipeline filter for hashtags
254
- * [html-pipeline-linkify_github](https://github.com/jollygoodcode/html-pipeline-linkify_github) - An HTML::Pipeline filter to autolink GitHub urls
255
- * [html-pipeline-redcarpet_filter](https://github.com/bmikol/html-pipeline-redcarpet_filter) - Render Markdown source text into Markdown HTML using Redcarpet
256
- * [html-pipeline-typogruby_filter](https://github.com/bmikol/html-pipeline-typogruby_filter) - Add Typogruby text filters to your HTML::Pipeline
257
- * [korgi](https://github.com/jodeci/korgi) - HTML::Pipeline filters for links to Rails resources
250
+ When developing a custom filter, call `HTMLPipeline.require_dependency` at the start to ensure that the local machine has the necessary dependency. You can also use `HTMLPipeline.require_dependencies` to provide a list of dependencies to check.
258
251
 
252
+ ## Documentation
253
+
254
+ Full reference documentation can be [found here](http://rubydoc.info/gems/html-pipeline/frames).
259
255
 
260
256
  ## Instrumenting
261
257
 
@@ -263,107 +259,102 @@ Filters and Pipelines can be set up to be instrumented when called. The pipeline
263
259
  must be setup with an
264
260
  [ActiveSupport::Notifications](http://api.rubyonrails.org/classes/ActiveSupport/Notifications.html)
265
261
  compatible service object and a name. New pipeline objects will default to the
266
- `HTML::Pipeline.default_instrumentation_service` object.
262
+ `HTMLPipeline.default_instrumentation_service` object.
267
263
 
268
- ``` ruby
264
+ ```ruby
269
265
  # the AS::Notifications-compatible service object
270
266
  service = ActiveSupport::Notifications
271
267
 
272
268
  # instrument a specific pipeline
273
- pipeline = HTML::Pipeline.new [MarkdownFilter], context
269
+ pipeline = HTMLPipeline.new [MarkdownFilter], context
274
270
  pipeline.setup_instrumentation "MarkdownPipeline", service
275
271
 
276
272
  # or set default instrumentation service for all new pipelines
277
- HTML::Pipeline.default_instrumentation_service = service
278
- pipeline = HTML::Pipeline.new [MarkdownFilter], context
273
+ HTMLPipeline.default_instrumentation_service = service
274
+ pipeline = HTMLPipeline.new [MarkdownFilter], context
279
275
  pipeline.setup_instrumentation "MarkdownPipeline"
280
276
  ```
281
277
 
282
278
  Filters are instrumented when they are run through the pipeline. A
283
- `call_filter.html_pipeline` event is published once the filter finishes. The
284
- `payload` should include the `filter` name. Each filter will trigger its own
279
+ `call_filter.html_pipeline` event is published once any filter finishes; `call_text_filters`
280
+ and `call_node_filters` are published when all of the text and node filters are finished, respectively.
281
+ The `payload` should include the `filter` name. Each filter will trigger its own
285
282
  instrumentation call.
286
283
 
287
- ``` ruby
284
+ ```ruby
288
285
  service.subscribe "call_filter.html_pipeline" do |event, start, ending, transaction_id, payload|
289
286
  payload[:pipeline] #=> "MarkdownPipeline", set with `setup_instrumentation`
290
287
  payload[:filter] #=> "MarkdownFilter"
291
288
  payload[:context] #=> context Hash
292
289
  payload[:result] #=> instance of result class
293
- payload[:result][:output] #=> output HTML String or Nokogiri::DocumentFragment
290
+ payload[:result][:output] #=> output HTML String
294
291
  end
295
292
  ```
296
293
 
297
294
  The full pipeline is also instrumented:
298
295
 
299
- ``` ruby
300
- service.subscribe "call_pipeline.html_pipeline" do |event, start, ending, transaction_id, payload|
296
+ ```ruby
297
+ service.subscribe "call_text_filters.html_pipeline" do |event, start, ending, transaction_id, payload|
301
298
  payload[:pipeline] #=> "MarkdownPipeline", set with `setup_instrumentation`
302
299
  payload[:filters] #=> ["MarkdownFilter"]
303
- payload[:doc] #=> HTML String or Nokogiri::DocumentFragment
300
+ payload[:doc] #=> HTML String
304
301
  payload[:context] #=> context Hash
305
302
  payload[:result] #=> instance of result class
306
- payload[:result][:output] #=> output HTML String or Nokogiri::DocumentFragment
303
+ payload[:result][:output] #=> output HTML String
307
304
  end
308
305
  ```
309
306
 
307
+ ## Third Party Extensions
308
+
309
+ If you have an idea for a filter, propose it as
310
+ [an issue](https://github.com/gjtorikian/html-pipeline/issues) first. This allows us to discuss
311
+ whether the filter is a common enough use case to belong in this gem, or should be
312
+ built as an external gem.
313
+
314
+ Here are some extensions people have built:
315
+
316
+ - [html-pipeline-asciidoc_filter](https://github.com/asciidoctor/html-pipeline-asciidoc_filter)
317
+ - [jekyll-html-pipeline](https://github.com/gjtorikian/jekyll-html-pipeline)
318
+ - [nanoc-html-pipeline](https://github.com/burnto/nanoc-html-pipeline)
319
+ - [html-pipeline-bitly](https://github.com/dewski/html-pipeline-bitly)
320
+ - [html-pipeline-cite](https://github.com/lifted-studios/html-pipeline-cite)
321
+ - [tilt-html-pipeline](https://github.com/bradgessler/tilt-html-pipeline)
322
+ - [html-pipeline-wiki-link](https://github.com/lifted-studios/html-pipeline-wiki-link) - WikiMedia-style wiki links
323
+ - [task_list](https://github.com/github/task_list) - GitHub flavor Markdown Task List
324
+ - [html-pipeline-nico_link](https://github.com/rutan/html-pipeline-nico_link) - An HTMLPipeline filter for [niconico](http://www.nicovideo.jp) description links
325
+ - [html-pipeline-gitlab](https://gitlab.com/gitlab-org/html-pipeline-gitlab) - This gem implements various filters for html-pipeline used by GitLab
326
+ - [html-pipeline-youtube](https://github.com/st0012/html-pipeline-youtube) - An HTMLPipeline filter for YouTube links
327
+ - [html-pipeline-flickr](https://github.com/st0012/html-pipeline-flickr) - An HTMLPipeline filter for Flickr links
328
+ - [html-pipeline-vimeo](https://github.com/dlackty/html-pipeline-vimeo) - An HTMLPipeline filter for Vimeo links
329
+ - [html-pipeline-hashtag](https://github.com/mr-dxdy/html-pipeline-hashtag) - An HTMLPipeline filter for hashtags
330
+ - [html-pipeline-linkify_github](https://github.com/jollygoodcode/html-pipeline-linkify_github) - An HTMLPipeline filter to autolink GitHub urls
331
+ - [html-pipeline-redcarpet_filter](https://github.com/bmikol/html-pipeline-redcarpet_filter) - Render Markdown source text into Markdown HTML using Redcarpet
332
+ - [html-pipeline-typogruby_filter](https://github.com/bmikol/html-pipeline-typogruby_filter) - Add Typogruby text filters to your HTMLPipeline
333
+ - [korgi](https://github.com/jodeci/korgi) - HTMLPipeline filters for links to Rails resources
334
+
310
335
  ## FAQ
311
336
 
312
337
  ### 1. Why doesn't my pipeline work when there's no root element in the document?
313
338
 
314
339
  To make a pipeline work on a plain text document, put the `PlainTextInputFilter`
315
- at the beginning of your pipeline. This will wrap the content in a `div` so the
316
- filters have a root element to work with. If you're passing in an HTML fragment,
340
+ at the end of your `text_filters` config. This will wrap the content in a `div` so the filters have a root element to work with. If you're passing in an HTML fragment,
317
341
  but it doesn't have a root element, you can wrap the content in a `div`
318
- yourself. For example:
319
-
320
- ```ruby
321
- EmojiPipeline = Pipeline.new [
322
- PlainTextInputFilter, # <- Wraps input in a div and escapes html tags
323
- EmojiFilter
324
- ], context
325
-
326
- plain_text = "Gutentag! :wave:"
327
- EmojiPipeline.call(plain_text)
328
-
329
- html_fragment = "This is outside of an html element, but <strong>this isn't. :+1:</strong>"
330
- EmojiPipeline.call("<div>#{html_fragment}</div>") # <- Wrap your own html fragments to avoid escaping
331
- ```
342
+ yourself.
332
343
 
333
344
  ### 2. How do I customize an allowlist for `SanitizationFilter`s?
334
345
 
335
- `SanitizationFilter::ALLOWLIST` is the default allowlist used if no `:allowlist`
336
- argument is given in the context. The default is a good starting template for
346
+ `HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG` is the default allowlist used if no `sanitization_config`
347
+ argument is given. The default is a good starting template for
337
348
  you to add additional elements. You can either modify the constant's value, or
338
- re-define your own constant and pass that in via the context.
339
-
340
- ## Contributing
349
+ re-define your own config and pass that in, such as:
341
350
 
342
- Please review the [Contributing Guide](https://github.com/jch/html-pipeline/blob/master/CONTRIBUTING.md).
343
-
344
- 1. [Fork it](https://help.github.com/articles/fork-a-repo)
345
- 2. Create your feature branch (`git checkout -b my-new-feature`)
346
- 3. Commit your changes (`git commit -am 'Added some feature'`)
347
- 4. Push to the branch (`git push origin my-new-feature`)
348
- 5. Create new [Pull Request](https://help.github.com/articles/using-pull-requests)
349
-
350
- To see what has changed in recent versions, see the [CHANGELOG](https://github.com/jch/html-pipeline/blob/master/CHANGELOG.md).
351
+ ```ruby
352
+ config = HTMLPipeline::SanitizationFilter::DEFAULT_CONFIG.dup
353
+ config[:elements] += ["iframe"] # sure, whatever you want; += builds a new array, so the shared default is not mutated
354
+ ```
351
355
 
352
356
  ### Contributors
353
357
 
354
- Thanks to all of [these contributors](https://github.com/jch/html-pipeline/graphs/contributors).
355
-
356
- Project is a member of the [OSS Manifesto](http://ossmanifesto.org/).
357
-
358
- The current maintainer is @gjtorikian
359
-
360
- ### Releasing A New Version
361
-
362
- This section is for gem maintainers to cut a new version of the gem.
358
+ Thanks to all of [these contributors](https://github.com/gjtorikian/html-pipeline/graphs/contributors).
363
359
 
364
- * create a new branch named `release-x.y.z` where `x.y.z` follows [semver](http://semver.org)
365
- * update lib/html/pipeline/version.rb to next version number X.X.X
366
- * update CHANGELOG.md. Prepare a draft with `script/changelog`
367
- * push branch and create a new pull request
368
- * after tests are green, merge to master
369
- * on the master branch, run `script/release`
360
+ This project is a member of the [OSS Manifesto](http://ossmanifesto.org/).
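
The stage order described in the new README (text filters, then the convert filter, then sanitization, then node filters) can be sketched in plain Ruby. This is an illustrative stand-in only: `run_stages`, `STAGES`, and the lambdas below are hypothetical names, the "converter" and "sanitizer" are toys, and nothing here uses html-pipeline's classes or Selma.

```ruby
# Runs the four stages in the order the README describes.
# Every callable here is a hypothetical stand-in, not an html-pipeline class.
def run_stages(input, text_filters:, convert:, sanitize:, node_filters:)
  # 1. Text filters operate on the raw string, each feeding the next.
  text = text_filters.reduce(input) { |acc, filter| filter.call(acc) }
  # 2. Exactly one convert step turns text into HTML.
  html = convert.call(text)
  # 3. Sanitization runs by default; only an explicit nil skips it.
  html = sanitize.call(html) unless sanitize.nil?
  # 4. Node filters operate on the HTML, each feeding the next.
  node_filters.reduce(html) { |acc, filter| filter.call(acc) }
end

STAGES = {
  text_filters: [->(text) { text.gsub("$NAME", "Johnny") }],
  convert: ->(text) { "<p>#{text}</p>" },                         # toy converter
  sanitize: ->(html) { html.gsub(%r{<script.*?</script>}m, "") }, # toy sanitizer
  node_filters: [
    ->(html) { html.gsub(/@(\w+)/) { %(<a href="/#{$1}">@#{$1}</a>) } }
  ]
}.freeze

puts run_stages("Hi $NAME, ping @jch", **STAGES)
```

Passing `sanitize: nil` in this sketch mirrors the README's statement that sanitization is skipped only when `nil` is given explicitly; any other value leaves the stage in place.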
data/Rakefile CHANGED
@@ -1,17 +1,24 @@
1
1
  #!/usr/bin/env rake
2
2
  # frozen_string_literal: true
3
3
 
4
- require 'rubygems'
5
- require 'bundler/setup'
6
-
7
- require 'bundler/gem_tasks'
8
- require 'rake/testtask'
4
+ require "bundler/gem_tasks"
5
+ require "rubygems/package_task"
6
+ require "rake/testtask"
9
7
 
10
8
  Rake::TestTask.new do |t|
11
- t.libs << 'test'
12
- t.test_files = FileList['test/**/*_test.rb']
9
+ t.libs << "test"
10
+ t.test_files = FileList["test/**/*_test.rb"]
13
11
  t.verbose = true
14
12
  t.warning = false
15
13
  end
16
14
 
17
15
  task default: :test
16
+
17
+ require "rubocop/rake_task"
18
+
19
+ RuboCop::RakeTask.new(:rubocop)
20
+
21
+ GEMSPEC = Bundler.load_gemspec("html-pipeline.gemspec")
22
+ gem_path = Gem::PackageTask.new(GEMSPEC).define
23
+ desc "Package the ruby gem"
24
+ task "package" => [gem_path]
data/UPGRADING.md ADDED
@@ -0,0 +1,35 @@
1
+ # Upgrade Guide
2
+
3
+ ## From v2 to v3
4
+
5
+ HTMLPipeline v3 is a massive improvement over this still much loved (and woefully under-maintained) project. This section will attempt to list all of the breaking changes between the two versions and provide suggestions on how to upgrade.
6
+
7
+ ### Changed namespace
8
+
9
+ This project is now under a module called `HTMLPipeline`, not `HTML::Pipeline`.
10
+
11
+ ### Removed filters
12
+
13
+ The following filters were removed:
14
+
15
+ - `AutolinkFilter`: this is handled by [Commonmarker](https://www.github.com/gjtorikian/commonmarker) and can be disabled/enabled through the `MarkdownFilter`'s `context` hash
16
+ - `SyntaxHighlightFilter`: this is handled by [Commonmarker](https://www.github.com/gjtorikian/commonmarker) and can be disabled/enabled through the `MarkdownFilter`'s `context` hash
17
+ - `SanitizationFilter`: this is handled by [Selma](https://www.github.com/gjtorikian/selma); configuration can be done through the `sanitization_config` hash
18
+
19
+ - `EmailReplyFilter`
20
+ - `CamoFilter`
21
+ - `TextileFilter`
22
+
23
+ ### Changed API
24
+
25
+ The new way to call this project is as follows:
26
+
27
+ ```ruby
28
+ HTMLPipeline.new(
29
+ text_filters: [], # array of instantiated (`.new`ed) `HTMLPipeline::TextFilter`
30
+ convert_filter:, # a filter that runs to turn text into HTML
31
+ sanitization_config: {}, # an allowlist of elements/attributes/protocols to keep
32
+ node_filters: []) # array of instantiated (`.new`ed) `HTMLPipeline::NodeFilter`
33
+ ```
34
+
35
+ Please refer to the README for more information on constructing filters. In most cases, the underlying filter needs only a few changes, primarily to make use of [Selma](https://www.github.com/gjtorikian/selma) rather than Nokogiri.
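
As a starting point for the namespace change, the constant renames can be applied mechanically to existing source text. The sketch below is a hypothetical helper, not something shipped with html-pipeline: `migrate_namespace`, `RENAMED`, and `REMOVED` are illustrative names, and the mapping is inferred from the renamed file paths in the file list above and the class names used in the new README, so it should be checked against the released gem. It only rewrites constant names; constructor calls still need to be converted to the keyword form by hand.

```ruby
# Hypothetical migration helper; nothing here ships with html-pipeline.
# v2 filter name => v3 sub-namespace, inferred from the renamed file paths.
RENAMED = {
  "MarkdownFilter" => "ConvertFilter::MarkdownFilter",
  "ImageFilter" => "TextFilter::ImageFilter",
  "PlainTextInputFilter" => "TextFilter::PlainTextInputFilter",
  "AbsoluteSourceFilter" => "NodeFilter::AbsoluteSourceFilter",
  "EmojiFilter" => "NodeFilter::EmojiFilter",
  "HttpsFilter" => "NodeFilter::HttpsFilter",
  "ImageMaxWidthFilter" => "NodeFilter::ImageMaxWidthFilter",
  "MentionFilter" => "NodeFilter::MentionFilter",
  "TableOfContentsFilter" => "NodeFilter::TableOfContentsFilter",
  "TeamMentionFilter" => "NodeFilter::TeamMentionFilter"
}.freeze

# Filters the upgrade guide lists as removed with no v3 class of the same name.
REMOVED = %w[AutolinkFilter SyntaxHighlightFilter EmailReplyFilter CamoFilter].freeze

V2_CONSTANT = /HTML::Pipeline(?:::(\w+))?/

# Returns [rewritten_source, names_of_removed_filters_still_referenced].
def migrate_namespace(source)
  removed_seen = []
  migrated = source.gsub(V2_CONSTANT) do
    name = Regexp.last_match(1)
    if name.nil?
      "HTMLPipeline"
    elsif RENAMED.key?(name)
      "HTMLPipeline::#{RENAMED[name]}"
    else
      removed_seen << name if REMOVED.include?(name)
      "HTMLPipeline::#{name}" # left for manual follow-up
    end
  end
  [migrated, removed_seen.uniq]
end

code = "HTML::Pipeline.new [HTML::Pipeline::MarkdownFilter, HTML::Pipeline::CamoFilter]"
migrated, removed = migrate_namespace(code)
puts migrated
puts "needs manual attention: #{removed.join(', ')}" unless removed.empty?
```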