html-pipeline-plus 2.10.1
- checksums.yaml +7 -0
- data/.gitignore +20 -0
- data/.travis.yml +34 -0
- data/Appraisals +13 -0
- data/CHANGELOG.md +221 -0
- data/CONTRIBUTING.md +60 -0
- data/Gemfile +23 -0
- data/LICENSE +22 -0
- data/README.md +370 -0
- data/Rakefile +15 -0
- data/bin/html-pipeline-plus +78 -0
- data/html-pipeline-plus.gemspec +28 -0
- data/lib/html/pipeline-plus/@mention_filter.rb +138 -0
- data/lib/html/pipeline-plus/absolute_source_filter.rb +45 -0
- data/lib/html/pipeline-plus/autolink_filter.rb +27 -0
- data/lib/html/pipeline-plus/body_content.rb +42 -0
- data/lib/html/pipeline-plus/camo_filter.rb +93 -0
- data/lib/html/pipeline-plus/email_reply_filter.rb +66 -0
- data/lib/html/pipeline-plus/emoji_filter.rb +125 -0
- data/lib/html/pipeline-plus/filter.rb +163 -0
- data/lib/html/pipeline-plus/https_filter.rb +27 -0
- data/lib/html/pipeline-plus/image_filter.rb +17 -0
- data/lib/html/pipeline-plus/image_max_width_filter.rb +35 -0
- data/lib/html/pipeline-plus/markdown_filter.rb +37 -0
- data/lib/html/pipeline-plus/plain_text_input_filter.rb +13 -0
- data/lib/html/pipeline-plus/sanitization_filter.rb +137 -0
- data/lib/html/pipeline-plus/syntax_highlight_filter.rb +44 -0
- data/lib/html/pipeline-plus/text_filter.rb +14 -0
- data/lib/html/pipeline-plus/textile_filter.rb +23 -0
- data/lib/html/pipeline-plus/toc_filter.rb +67 -0
- data/lib/html/pipeline-plus/version.rb +5 -0
- data/lib/html/pipeline-plus.rb +207 -0
- data/test.txt +13 -0
- metadata +115 -0
data/README.md
ADDED
@@ -0,0 +1,370 @@
|
|
1
|
+
# HTML::Pipeline
|
2
|
+
|
3
|
+
GitHub HTML processing filters and utilities. This module includes a small
|
4
|
+
framework for defining DOM based content filters and applying them to user
|
5
|
+
provided content. Read an introduction about this project in
|
6
|
+
[this blog post](https://github.com/blog/1311-html-pipeline-chainable-content-filters).
|
7
|
+
|
8
|
+
`Note`: This plugin is a modified fork of `html-pipeline v2.10.0`: [https://github.com/jch/html-pipeline](https://github.com/jch/html-pipeline).
|
9
|
+
|
10
|
+
- [HTML::Pipeline](./)
|
11
|
+
- [Installation](#installation)
|
12
|
+
- [Usage](#usage)
|
13
|
+
- [Examples](#examples)
|
14
|
+
- [Filters](#filters)
|
15
|
+
- [Dependencies](#dependencies)
|
16
|
+
- [Documentation](#documentation)
|
17
|
+
- [Extending](#extending)
|
18
|
+
- [3rd Party Extensions](#3rd-party-extensions)
|
19
|
+
- [Instrumenting](#instrumenting)
|
20
|
+
- [FAQ](#faq)
|
21
|
+
- [1. Why doesn't my pipeline work when there's no root element in the document?](#1-why-doesnt-my-pipeline-work-when-theres-no-root-element-in-the-document)
|
22
|
+
- [2. How do I customize a whitelist for `SanitizationFilter`s?](#2-how-do-i-customize-a-whitelist-for-sanitizationfilters)
|
23
|
+
- [Contributing](#contributing)
|
24
|
+
- [Contributors](#contributors)
|
25
|
+
- [Releasing A New Version](#releasing-a-new-version)
|
26
|
+
|
27
|
+
## Installation
|
28
|
+
|
29
|
+
Add this line to your application's Gemfile:
|
30
|
+
|
31
|
+
```ruby
|
32
|
+
gem 'html-pipeline-plus'
|
33
|
+
```
|
34
|
+
|
35
|
+
And then execute:
|
36
|
+
|
37
|
+
```sh
|
38
|
+
$ bundle
|
39
|
+
```
|
40
|
+
|
41
|
+
Or install it yourself as:
|
42
|
+
|
43
|
+
```sh
|
44
|
+
$ gem install html-pipeline-plus
|
45
|
+
```
|
46
|
+
|
47
|
+
## Usage
|
48
|
+
|
49
|
+
This library provides a handful of chainable HTML filters to transform user
|
50
|
+
content into markup. A filter takes an HTML string or
|
51
|
+
`Nokogiri::HTML::DocumentFragment`, optionally manipulates it, and then
|
52
|
+
outputs the result.
|
53
|
+
|
54
|
+
For example, to transform Markdown source into Markdown HTML:
|
55
|
+
|
56
|
+
```ruby
|
57
|
+
require 'html/pipeline-plus'
|
58
|
+
|
59
|
+
filter = HTML::Pipeline::MarkdownFilter.new("Hi **world**!")
|
60
|
+
filter.call
|
61
|
+
```
|
62
|
+
|
63
|
+
Filters can be combined into a pipeline which causes each filter to hand its
|
64
|
+
output to the next filter's input. So if you wanted to have content be
|
65
|
+
filtered through Markdown and be syntax highlighted, you can create the
|
66
|
+
following pipeline:
|
67
|
+
|
68
|
+
```ruby
|
69
|
+
pipeline = HTML::Pipeline.new [
|
70
|
+
HTML::Pipeline::MarkdownFilter,
|
71
|
+
HTML::Pipeline::SyntaxHighlightFilter
|
72
|
+
]
|
73
|
+
result = pipeline.call <<-CODE
|
74
|
+
This is *great*:
|
75
|
+
|
76
|
+
some_code(:first)
|
77
|
+
|
78
|
+
CODE
|
79
|
+
result[:output].to_s
|
80
|
+
```
|
81
|
+
|
82
|
+
Prints:
|
83
|
+
|
84
|
+
```html
|
85
|
+
<p>This is <em>great</em>:</p>
|
86
|
+
|
87
|
+
<pre><code>some_code(:first)
|
88
|
+
</code></pre>
|
89
|
+
```
|
90
|
+
|
91
|
+
To generate CSS for HTML formatted code, use the [Rouge CSS Theme](https://github.com/jneen/rouge#css-theme-options) `#css` method. `rouge` is a dependency of the `SyntaxHighlightFilter`.
|
92
|
+
|
93
|
+
Some filters take an optional **context** and/or **result** hash. These are
|
94
|
+
used to pass around arguments and metadata between filters in a pipeline. For
|
95
|
+
example, if you don't want to use GitHub formatted Markdown, you can pass an
|
96
|
+
option in the context hash:
|
97
|
+
|
98
|
+
```ruby
|
99
|
+
filter = HTML::Pipeline::MarkdownFilter.new("Hi **world**!", :gfm => false)
|
100
|
+
filter.call
|
101
|
+
```
|
102
|
+
|
103
|
+
### Examples
|
104
|
+
|
105
|
+
We define different pipelines for different parts of our app. Here are a few
|
106
|
+
paraphrased snippets to get you started:
|
107
|
+
|
108
|
+
```ruby
|
109
|
+
# The context hash is how you pass options between different filters.
|
110
|
+
# See individual filter source for explanation of options.
|
111
|
+
context = {
|
112
|
+
:asset_root => "http://your-domain.com/where/your/images/live/icons",
|
113
|
+
:base_url => "http://your-domain.com"
|
114
|
+
}
|
115
|
+
|
116
|
+
# Pipeline providing sanitization and image hijacking but no mention
|
117
|
+
# related features.
|
118
|
+
SimplePipeline = Pipeline.new [
|
119
|
+
SanitizationFilter,
|
120
|
+
TableOfContentsFilter, # add 'name' anchors to all headers and generate toc list
|
121
|
+
CamoFilter,
|
122
|
+
ImageMaxWidthFilter,
|
123
|
+
SyntaxHighlightFilter,
|
124
|
+
EmojiFilter,
|
125
|
+
AutolinkFilter
|
126
|
+
], context
|
127
|
+
|
128
|
+
# Pipeline used for user provided content on the web
|
129
|
+
MarkdownPipeline = Pipeline.new [
|
130
|
+
MarkdownFilter,
|
131
|
+
SanitizationFilter,
|
132
|
+
CamoFilter,
|
133
|
+
ImageMaxWidthFilter,
|
134
|
+
HttpsFilter,
|
135
|
+
MentionFilter,
|
136
|
+
EmojiFilter,
|
137
|
+
SyntaxHighlightFilter
|
138
|
+
], context.merge(:gfm => true) # enable github formatted markdown
|
139
|
+
|
140
|
+
|
141
|
+
# Define a pipeline based on another pipeline's filters
|
142
|
+
NonGFMMarkdownPipeline = Pipeline.new(MarkdownPipeline.filters,
|
143
|
+
context.merge(:gfm => false))
|
144
|
+
|
145
|
+
# Pipelines aren't limited to the web. You can use them for email
|
146
|
+
# processing also.
|
147
|
+
HtmlEmailPipeline = Pipeline.new [
|
148
|
+
PlainTextInputFilter,
|
149
|
+
ImageMaxWidthFilter
|
150
|
+
], {}
|
151
|
+
|
152
|
+
# Just emoji.
|
153
|
+
EmojiPipeline = Pipeline.new [
|
154
|
+
PlainTextInputFilter,
|
155
|
+
EmojiFilter
|
156
|
+
], context
|
157
|
+
```
|
158
|
+
|
159
|
+
## Filters
|
160
|
+
|
161
|
+
* `MentionFilter` - replace `@user` mentions with links
|
162
|
+
* `AbsoluteSourceFilter` - replace relative image urls with fully qualified versions
|
163
|
+
* `AutolinkFilter` - auto_linking urls in HTML
|
164
|
+
* `CamoFilter` - replace http image urls with [camo-fied](https://github.com/atmos/camo) https versions
|
165
|
+
* `EmailReplyFilter` - util filter for working with emails
|
166
|
+
* `EmojiFilter` - everyone loves [emoji](http://www.emoji-cheat-sheet.com/)!
|
167
|
+
* `HttpsFilter` - HTML Filter for replacing http github urls with https versions.
|
168
|
+
* `ImageMaxWidthFilter` - link to full size image for large images
|
169
|
+
* `MarkdownFilter` - convert markdown to html
|
170
|
+
* `PlainTextInputFilter` - html escape text and wrap the result in a div
|
171
|
+
* `SanitizationFilter` - whitelist sanitize user markup
|
172
|
+
* `SyntaxHighlightFilter` - code syntax highlighter
|
173
|
+
* `TextileFilter` - convert textile to html
|
174
|
+
* `TableOfContentsFilter` - anchor headings with name attributes and generate Table of Contents html unordered list linking headings
|
175
|
+
|
176
|
+
## Dependencies
|
177
|
+
|
178
|
+
Filter gem dependencies are not bundled; you must bundle the filter's gem
|
179
|
+
dependencies. The below list details filters with dependencies. For example,
|
180
|
+
`SyntaxHighlightFilter` uses [rouge](https://github.com/jneen/rouge)
|
181
|
+
to detect and highlight languages. To use the `SyntaxHighlightFilter`,
|
182
|
+
add the following to your Gemfile:
|
183
|
+
|
184
|
+
```ruby
|
185
|
+
gem 'rouge'
|
186
|
+
```
|
187
|
+
|
188
|
+
* `AutolinkFilter` - `rinku`
|
189
|
+
* `EmailReplyFilter` - `escape_utils`, `email_reply_parser`
|
190
|
+
* `EmojiFilter` - `gemoji`
|
191
|
+
* `MarkdownFilter` - `commonmarker`
|
192
|
+
* `PlainTextInputFilter` - `escape_utils`
|
193
|
+
* `SanitizationFilter` - `sanitize`
|
194
|
+
* `SyntaxHighlightFilter` - `rouge`
|
195
|
+
* `TableOfContentsFilter` - `escape_utils`
|
196
|
+
* `TextileFilter` - `RedCloth`
|
197
|
+
|
198
|
+
_Note:_ See [Gemfile](/Gemfile) `:test` block for version requirements.
|
199
|
+
|
200
|
+
## Documentation
|
201
|
+
|
202
|
+
Full reference documentation can be [found here](http://rubydoc.info/gems/html-pipeline/frames).
|
203
|
+
|
204
|
+
## Extending
|
205
|
+
To write a custom filter, you need a class with a `call` method that inherits
|
206
|
+
from `HTML::Pipeline::Filter`.
|
207
|
+
|
208
|
+
For example this filter adds a base url to images that are root relative:
|
209
|
+
|
210
|
+
```ruby
|
211
|
+
require 'uri'
|
212
|
+
|
213
|
+
class RootRelativeFilter < HTML::Pipeline::Filter
|
214
|
+
|
215
|
+
def call
|
216
|
+
doc.search("img").each do |img|
|
217
|
+
next if img['src'].nil?
|
218
|
+
src = img['src'].strip
|
219
|
+
if src.start_with? '/'
|
220
|
+
img["src"] = URI.join(context[:base_url], src).to_s
|
221
|
+
end
|
222
|
+
end
|
223
|
+
doc
|
224
|
+
end
|
225
|
+
|
226
|
+
end
|
227
|
+
```
|
228
|
+
|
229
|
+
Now this filter can be used in a pipeline:
|
230
|
+
|
231
|
+
```ruby
|
232
|
+
Pipeline.new [ RootRelativeFilter ], { :base_url => 'http://somehost.com' }
|
233
|
+
```
|
234
|
+
|
235
|
+
### 3rd Party Extensions
|
236
|
+
|
237
|
+
If you have an idea for a filter, propose it as
|
238
|
+
[an issue](https://github.com/shines77/html-pipeline-plus/issues) first. This allows us to discuss
|
239
|
+
whether the filter is a common enough use case to belong in this gem, or should be
|
240
|
+
built as an external gem.
|
241
|
+
|
242
|
+
Here are some extensions people have built:
|
243
|
+
|
244
|
+
* [html-pipeline-asciidoc_filter](https://github.com/asciidoctor/html-pipeline-asciidoc_filter)
|
245
|
+
* [jekyll-html-pipeline](https://github.com/gjtorikian/jekyll-html-pipeline)
|
246
|
+
* [nanoc-html-pipeline](https://github.com/burnto/nanoc-html-pipeline)
|
247
|
+
* [html-pipeline-bitly](https://github.com/dewski/html-pipeline-bitly)
|
248
|
+
* [html-pipeline-cite](https://github.com/lifted-studios/html-pipeline-cite)
|
249
|
+
* [tilt-html-pipeline](https://github.com/bradgessler/tilt-html-pipeline)
|
250
|
+
* [html-pipeline-wiki-link](https://github.com/lifted-studios/html-pipeline-wiki-link) - WikiMedia-style wiki links
|
251
|
+
* [task_list](https://github.com/github/task_list) - GitHub flavor Markdown Task List
|
252
|
+
* [html-pipeline-nico_link](https://github.com/rutan/html-pipeline-nico_link) - An HTML::Pipeline filter for [niconico](http://www.nicovideo.jp) description links
|
253
|
+
* [html-pipeline-gitlab](https://gitlab.com/gitlab-org/html-pipeline-gitlab) - This gem implements various filters for html-pipeline-plus used by GitLab
|
254
|
+
* [html-pipeline-youtube](https://github.com/st0012/html-pipeline-youtube) - An HTML::Pipeline filter for YouTube links
|
255
|
+
* [html-pipeline-flickr](https://github.com/st0012/html-pipeline-flickr) - An HTML::Pipeline filter for Flickr links
|
256
|
+
* [html-pipeline-vimeo](https://github.com/dlackty/html-pipeline-vimeo) - An HTML::Pipeline filter for Vimeo links
|
257
|
+
* [html-pipeline-hashtag](https://github.com/mr-dxdy/html-pipeline-hashtag) - An HTML::Pipeline filter for hashtags
|
258
|
+
* [html-pipeline-linkify_github](https://github.com/jollygoodcode/html-pipeline-linkify_github) - An HTML::Pipeline filter to autolink GitHub urls
|
259
|
+
* [html-pipeline-redcarpet_filter](https://github.com/bmikol/html-pipeline-redcarpet_filter) - Render Markdown source text into Markdown HTML using Redcarpet
|
260
|
+
* [html-pipeline-typogruby_filter](https://github.com/bmikol/html-pipeline-typogruby_filter) - Add Typogruby text filters to your HTML::Pipeline
|
261
|
+
* [korgi](https://github.com/jodeci/korgi) - HTML::Pipeline filters for links to Rails resources
|
262
|
+
|
263
|
+
|
264
|
+
## Instrumenting
|
265
|
+
|
266
|
+
Filters and Pipelines can be set up to be instrumented when called. The pipeline must be set up with an
|
267
|
+
[ActiveSupport::Notifications](http://api.rubyonrails.org/classes/ActiveSupport/Notifications.html)
|
268
|
+
compatible service object and a name. New pipeline objects will default to the
|
269
|
+
`HTML::Pipeline.default_instrumentation_service` object.
|
270
|
+
|
271
|
+
``` ruby
|
272
|
+
# the AS::Notifications-compatible service object
|
273
|
+
service = ActiveSupport::Notifications
|
274
|
+
|
275
|
+
# instrument a specific pipeline
|
276
|
+
pipeline = HTML::Pipeline.new [MarkdownFilter], context
|
277
|
+
pipeline.setup_instrumentation "MarkdownPipeline", service
|
278
|
+
|
279
|
+
# or set default instrumentation service for all new pipelines
|
280
|
+
HTML::Pipeline.default_instrumentation_service = service
|
281
|
+
pipeline = HTML::Pipeline.new [MarkdownFilter], context
|
282
|
+
pipeline.setup_instrumentation "MarkdownPipeline"
|
283
|
+
```
|
284
|
+
|
285
|
+
Filters are instrumented when they are run through the pipeline. A
|
286
|
+
`call_filter.html_pipeline` event is published once the filter finishes. The
|
287
|
+
`payload` should include the `filter` name. Each filter will trigger its own
|
288
|
+
instrumentation call.
|
289
|
+
|
290
|
+
``` ruby
|
291
|
+
service.subscribe "call_filter.html_pipeline" do |event, start, ending, transaction_id, payload|
|
292
|
+
payload[:pipeline] #=> "MarkdownPipeline", set with `setup_instrumentation`
|
293
|
+
payload[:filter] #=> "MarkdownFilter"
|
294
|
+
payload[:context] #=> context Hash
|
295
|
+
payload[:result] #=> instance of result class
|
296
|
+
payload[:result][:output] #=> output HTML String or Nokogiri::DocumentFragment
|
297
|
+
end
|
298
|
+
```
|
299
|
+
|
300
|
+
The full pipeline is also instrumented:
|
301
|
+
|
302
|
+
``` ruby
|
303
|
+
service.subscribe "call_pipeline.html_pipeline" do |event, start, ending, transaction_id, payload|
|
304
|
+
payload[:pipeline] #=> "MarkdownPipeline", set with `setup_instrumentation`
|
305
|
+
payload[:filters] #=> ["MarkdownFilter"]
|
306
|
+
payload[:doc] #=> HTML String or Nokogiri::DocumentFragment
|
307
|
+
payload[:context] #=> context Hash
|
308
|
+
payload[:result] #=> instance of result class
|
309
|
+
payload[:result][:output] #=> output HTML String or Nokogiri::DocumentFragment
|
310
|
+
end
|
311
|
+
```
|
312
|
+
|
313
|
+
## FAQ
|
314
|
+
|
315
|
+
### 1. Why doesn't my pipeline work when there's no root element in the document?
|
316
|
+
|
317
|
+
To make a pipeline work on a plain text document, put the `PlainTextInputFilter`
|
318
|
+
at the beginning of your pipeline. This will wrap the content in a `div` so the
|
319
|
+
filters have a root element to work with. If you're passing in an HTML fragment,
|
320
|
+
but it doesn't have a root element, you can wrap the content in a `div`
|
321
|
+
yourself. For example:
|
322
|
+
|
323
|
+
```ruby
|
324
|
+
EmojiPipeline = Pipeline.new [
|
325
|
+
PlainTextInputFilter, # <- Wraps input in a div and escapes html tags
|
326
|
+
EmojiFilter
|
327
|
+
], context
|
328
|
+
|
329
|
+
plain_text = "Gutentag! :wave:"
|
330
|
+
EmojiPipeline.call(plain_text)
|
331
|
+
|
332
|
+
html_fragment = "This is outside of an html element, but <strong>this isn't. :+1:</strong>"
|
333
|
+
EmojiPipeline.call("<div>#{html_fragment}</div>") # <- Wrap your own html fragments to avoid escaping
|
334
|
+
```
|
335
|
+
|
336
|
+
### 2. How do I customize a whitelist for `SanitizationFilter`s?
|
337
|
+
|
338
|
+
`SanitizationFilter::WHITELIST` is the default whitelist used if no `:whitelist`
|
339
|
+
argument is given in the context. The default is a good starting template for
|
340
|
+
you to add additional elements. You can either modify the constant's value, or
|
341
|
+
re-define your own constant and pass that in via the context.
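
As a rough illustration of the second approach, the sketch below copies a whitelist-shaped hash, extends the copy, and places it in the context under `:whitelist`. `DEFAULT_WHITELIST` is a small stand-in for `SanitizationFilter::WHITELIST` (its keys and values here are assumptions, not the gem's actual defaults), and `deep_copy` and `custom_whitelist` are hypothetical helpers.

```ruby
# Stand-in for SanitizationFilter::WHITELIST; the real constant is larger.
DEFAULT_WHITELIST = {
  elements: %w[p a img].freeze,
  attributes: { 'a' => %w[href].freeze }.freeze
}.freeze

# Hypothetical helper: recursively copy hashes and arrays so the copy can be
# changed without touching the (possibly frozen) original.
def deep_copy(value)
  case value
  when Hash  then value.each_with_object({}) { |(k, v), h| h[k] = deep_copy(v) }
  when Array then value.map { |v| deep_copy(v) }
  else value
  end
end

# Build a customized whitelist from the default template.
def custom_whitelist
  copy = deep_copy(DEFAULT_WHITELIST)
  copy[:elements] << 'details'
  copy[:attributes]['a'] << 'title'
  copy
end

# The README states that the filter reads its whitelist from the context.
context = { whitelist: custom_whitelist }
```

Copying before modifying keeps the default untouched, which may matter if other pipelines in the same process rely on it.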
|
342
|
+
|
343
|
+
## Contributing
|
344
|
+
|
345
|
+
Please review the [Contributing Guide](https://github.com/jch/html-pipeline/blob/master/CONTRIBUTING.md).
|
346
|
+
|
347
|
+
1. [Fork it](https://help.github.com/articles/fork-a-repo)
|
348
|
+
2. Create your feature branch (`git checkout -b my-new-feature`)
|
349
|
+
3. Commit your changes (`git commit -am 'Added some feature'`)
|
350
|
+
4. Push to the branch (`git push origin my-new-feature`)
|
351
|
+
5. Create new [Pull Request](https://help.github.com/articles/using-pull-requests)
|
352
|
+
|
353
|
+
To see what has changed in recent versions, see the [CHANGELOG](https://github.com/jch/html-pipeline/blob/master/CHANGELOG.md).
|
354
|
+
|
355
|
+
### Contributors
|
356
|
+
|
357
|
+
Thanks to all of [these contributors](https://github.com/jch/html-pipeline/graphs/contributors).
|
358
|
+
|
359
|
+
Project is a member of the [OSS Manifesto](http://ossmanifesto.org/).
|
360
|
+
|
361
|
+
### Releasing A New Version
|
362
|
+
|
363
|
+
This section is for gem maintainers to cut a new version of the gem.
|
364
|
+
|
365
|
+
* create a new branch named `release-x.y.z` where `x.y.z` follows [semver](http://semver.org)
|
366
|
+
* update lib/html/pipeline-plus/version.rb to next version number X.X.X
|
367
|
+
* update CHANGELOG.md. Prepare a draft with `script/changelog`
|
368
|
+
* push branch and create a new pull request
|
369
|
+
* after tests are green, merge to master
|
370
|
+
* on the master branch, run `script/release`
|
data/Rakefile
ADDED
@@ -0,0 +1,15 @@
|
|
1
|
+
#!/usr/bin/env rake
|
2
|
+
require 'rubygems'
|
3
|
+
require 'bundler/setup'
|
4
|
+
|
5
|
+
require 'bundler/gem_tasks'
|
6
|
+
require 'rake/testtask'
|
7
|
+
|
8
|
+
Rake::TestTask.new do |t|
|
9
|
+
t.libs << 'test'
|
10
|
+
t.test_files = FileList['test/**/*_test.rb']
|
11
|
+
t.verbose = true
|
12
|
+
t.warning = false
|
13
|
+
end
|
14
|
+
|
15
|
+
task default: :test
|
data/bin/html-pipeline-plus
ADDED
@@ -0,0 +1,78 @@
|
|
1
|
+
#!/usr/bin/env ruby
|
2
|
+
require 'html/pipeline-plus'
|
3
|
+
|
4
|
+
require 'optparse'
|
5
|
+
|
6
|
+
# Accept "help", too
|
7
|
+
ARGV.map! { |a| a == 'help' ? '--help' : a }
|
8
|
+
|
9
|
+
OptionParser.new do |opts|
|
10
|
+
opts.banner = <<-HELP.gsub(/^ /, '')
|
11
|
+
Usage: html-pipeline-plus [-h] [-f]
|
12
|
+
html-pipeline-plus [FILTER [FILTER [...]]] < file.md
|
13
|
+
cat file.md | html-pipeline-plus [FILTER [FILTER [...]]]
|
14
|
+
HELP
|
15
|
+
|
16
|
+
opts.separator 'Options:'
|
17
|
+
|
18
|
+
opts.on('-f', '--filters', 'List the available filters') do
|
19
|
+
filters = HTML::Pipeline.constants.grep(/\w+Filter$/)
|
20
|
+
.map { |f| f.to_s.gsub(/Filter$/, '') }
|
21
|
+
|
22
|
+
# Text filter doesn't work, no call method
|
23
|
+
filters -= ['Text']
|
24
|
+
|
25
|
+
abort <<-HELP.gsub(/^ /, '')
|
26
|
+
Available filters:
|
27
|
+
#{filters.join("\n ")}
|
28
|
+
HELP
|
29
|
+
end
|
30
|
+
end.parse!
|
31
|
+
|
32
|
+
# Default to a GitHub-ish pipeline
|
33
|
+
if ARGV.empty?
|
34
|
+
|
35
|
+
filters = [
|
36
|
+
HTML::Pipeline::MarkdownFilter,
|
37
|
+
HTML::Pipeline::SanitizationFilter,
|
38
|
+
HTML::Pipeline::ImageMaxWidthFilter,
|
39
|
+
HTML::Pipeline::EmojiFilter,
|
40
|
+
HTML::Pipeline::AutolinkFilter,
|
41
|
+
HTML::Pipeline::TableOfContentsFilter
|
42
|
+
]
|
43
|
+
|
44
|
+
# Add syntax highlighting if rouge is present
|
45
|
+
begin
|
46
|
+
require 'rouge'
|
47
|
+
filters << HTML::Pipeline::SyntaxHighlightFilter
|
48
|
+
rescue LoadError
|
49
|
+
end
|
50
|
+
|
51
|
+
else
|
52
|
+
|
53
|
+
def filter_named(name)
|
54
|
+
case name
|
55
|
+
when 'Text'
|
56
|
+
raise NameError # Text filter doesn't work, no call method
|
57
|
+
end
|
58
|
+
|
59
|
+
HTML::Pipeline.const_get("#{name}Filter")
|
60
|
+
rescue NameError => e
|
61
|
+
abort "Unknown filter '#{name}'. List filters with the -f option."
|
62
|
+
end
|
63
|
+
|
64
|
+
filters = []
|
65
|
+
until ARGV.empty?
|
66
|
+
name = ARGV.shift
|
67
|
+
filters << filter_named(name)
|
68
|
+
end
|
69
|
+
|
70
|
+
end
|
71
|
+
|
72
|
+
context = {
|
73
|
+
asset_root: '/assets',
|
74
|
+
base_url: '/',
|
75
|
+
gfm: true
|
76
|
+
}
|
77
|
+
|
78
|
+
puts HTML::Pipeline.new(filters, context).call(ARGF.read)[:output]
|
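The name-to-class lookup the script performs can be sketched in isolation. This is a minimal illustration, not the script itself: `DemoFilters` is a hypothetical stand-in for the `HTML::Pipeline` namespace, and `available_filters` and `lookup_filter` are names invented here. Unlike the script, the sketch returns `nil` for unknown names rather than aborting.

```ruby
# Hypothetical stand-in namespace; the script looks filters up on
# HTML::Pipeline instead.
module DemoFilters
  class MarkdownFilter; end
  class EmojiFilter; end
  class TextFilter; end
end

# List short filter names the way the -f option does: constants ending in
# "Filter", with the suffix removed and "Text" excluded.
def available_filters(namespace)
  names = namespace.constants.grep(/\w+Filter$/).map { |c| c.to_s.sub(/Filter$/, '') }
  names - ['Text']
end

# Resolve a short name such as "Emoji" to its class, or nil if unknown.
def lookup_filter(namespace, name)
  return nil if name == 'Text'

  namespace.const_get("#{name}Filter")
rescue NameError
  nil
end

puts available_filters(DemoFilters).sort.join(', ')
p lookup_filter(DemoFilters, 'Emoji')
p lookup_filter(DemoFilters, 'Bogus')
```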
data/html-pipeline-plus.gemspec
ADDED
@@ -0,0 +1,28 @@
|
|
1
|
+
|
2
|
+
require File.expand_path('../lib/html/pipeline-plus/version', __FILE__)
|
3
|
+
|
4
|
+
Gem::Specification.new do |gem|
|
5
|
+
gem.name = 'html-pipeline-plus'
|
6
|
+
gem.version = HTML::Pipeline::VERSION
|
7
|
+
gem.license = 'MIT'
|
8
|
+
gem.authors = ['Ryan Tomayko', 'Jerry Cheung', 'Garen J. Torikian', 'shines77']
|
9
|
+
gem.email = ['ryan@github.com', 'jerry@github.com', 'gjtorikian@gmail.com', 'gz_shines@msn.com']
|
10
|
+
gem.description = 'GitHub HTML processing filters and utilities'
|
11
|
+
gem.summary = 'Helpers for processing content through a chain of filters'
|
12
|
+
gem.homepage = 'https://github.com/shines77/html-pipeline-plus/'
|
13
|
+
|
14
|
+
gem.files = `git ls-files -z`.split("\x0").reject { |f| f =~ %r{^(test|gemfiles|script)/} }
|
15
|
+
gem.require_paths = ['lib']
|
16
|
+
|
17
|
+
gem.add_dependency 'activesupport', '>= 2'
|
18
|
+
gem.add_dependency 'nokogiri', '>= 1.4'
|
19
|
+
|
20
|
+
gem.post_install_message = <<msg
|
21
|
+
-------------------------------------------------
|
22
|
+
Thank you for installing html-pipeline-plus!
|
23
|
+
You must bundle Filter gem dependencies.
|
24
|
+
See html-pipeline-plus README.md for more details.
|
25
|
+
https://github.com/shines77/html-pipeline-plus#dependencies
|
26
|
+
-------------------------------------------------
|
27
|
+
msg
|
28
|
+
end
|
data/lib/html/pipeline-plus/@mention_filter.rb
ADDED
@@ -0,0 +1,138 @@
|
|
1
|
+
require 'set'
|
2
|
+
|
3
|
+
module HTML
|
4
|
+
class Pipeline
|
5
|
+
# HTML filter that replaces @user mentions with links. Mentions within <pre>,
|
6
|
+
# <code>, and <a> elements are ignored. Mentions that reference users that do
|
7
|
+
# not exist are ignored.
|
8
|
+
#
|
9
|
+
# Context options:
|
10
|
+
# :base_url - Used to construct links to user profile pages for each
|
11
|
+
# mention.
|
12
|
+
# :info_url - Used to link to "more info" when someone mentions @mention
|
13
|
+
# or @mentioned.
|
14
|
+
# :username_pattern - Used to provide a custom regular expression to
|
15
|
+
# identify usernames
|
16
|
+
#
|
17
|
+
class MentionFilter < Filter
|
18
|
+
# Public: Find user @mentions in text. See
|
19
|
+
# MentionFilter#mention_link_filter.
|
20
|
+
#
|
21
|
+
# MentionFilter.mentioned_logins_in(text) do |match, login, is_mentioned|
|
22
|
+
# "<a href=...>#{login}</a>"
|
23
|
+
# end
|
24
|
+
#
|
25
|
+
# text - String text to search.
|
26
|
+
#
|
27
|
+
# Yields the String match, the String login name, and a Boolean determining
|
28
|
+
# if the match = "@mention[ed]". The yield's return replaces the match in
|
29
|
+
# the original text.
|
30
|
+
#
|
31
|
+
# Returns a String replaced with the return of the block.
|
32
|
+
def self.mentioned_logins_in(text, username_pattern = UsernamePattern)
|
33
|
+
text.gsub MentionPatterns[username_pattern] do |match|
|
34
|
+
login = Regexp.last_match(1)
|
35
|
+
yield match, login, MentionLogins.include?(login.downcase)
|
36
|
+
end
|
37
|
+
end
|
38
|
+
|
39
|
+
# Hash that contains all of the mention patterns used by the pipeline
|
40
|
+
MentionPatterns = Hash.new do |hash, key|
|
41
|
+
hash[key] = /
|
42
|
+
(?:^|\W) # beginning of string or non-word char
|
43
|
+
@((?>#{key})) # @username
|
44
|
+
(?!\/) # without a trailing slash
|
45
|
+
(?=
|
46
|
+
\.+[ \t\W]| # dots followed by space or non-word character
|
47
|
+
\.+$| # dots at end of line
|
48
|
+
[^0-9a-zA-Z_.]| # non-word character except dot
|
49
|
+
$ # end of line
|
50
|
+
)
|
51
|
+
/ix
|
52
|
+
end
|
53
|
+
|
54
|
+
# Default pattern used to extract usernames from text. The value can be
|
55
|
+
# overridden by providing the username_pattern variable in the context.
|
56
|
+
UsernamePattern = /[a-z0-9][a-z0-9-]*/
|
57
|
+
|
58
|
+
# List of username logins that, when mentioned, link to the blog post
|
59
|
+
# about @mentions instead of triggering a real mention.
|
60
|
+
MentionLogins = %w[
|
61
|
+
mention
|
62
|
+
mentions
|
63
|
+
mentioned
|
64
|
+
mentioning
|
65
|
+
].freeze
|
66
|
+
|
67
|
+
# Don't look for mentions in text nodes that are children of these elements
|
68
|
+
IGNORE_PARENTS = %w(pre code a style script).to_set
|
69
|
+
|
70
|
+
def call
|
71
|
+
result[:mentioned_usernames] ||= []
|
72
|
+
|
73
|
+
doc.search('.//text()').each do |node|
|
74
|
+
content = node.to_html
|
75
|
+
next unless content.include?('@')
|
76
|
+
next if has_ancestor?(node, IGNORE_PARENTS)
|
77
|
+
html = mention_link_filter(content, base_url, info_url, username_pattern)
|
78
|
+
next if html == content
|
79
|
+
node.replace(html)
|
80
|
+
end
|
81
|
+
doc
|
82
|
+
end
|
83
|
+
|
84
|
+
# The URL to provide when someone @mentions a "mention" name, such as
|
85
|
+
# @mention or @mentioned, that will give them more info on mentions.
|
86
|
+
def info_url
|
87
|
+
context[:info_url] || nil
|
88
|
+
end
|
89
|
+
|
90
|
+
def username_pattern
|
91
|
+
context[:username_pattern] || UsernamePattern
|
92
|
+
end
|
93
|
+
|
94
|
+
# Replace user @mentions in text with links to the mentioned user's
|
95
|
+
# profile page.
|
96
|
+
#
|
97
|
+
# text - String text to replace @mention usernames in.
|
98
|
+
# base_url - The base URL used to construct user profile URLs.
|
99
|
+
# info_url - The "more info" URL used to link to more info on @mentions.
|
100
|
+
# If nil we don't link @mention or @mentioned.
|
101
|
+
# username_pattern - Regular expression used to identify usernames in
|
102
|
+
# text
|
103
|
+
#
|
104
|
+
# Returns a string with @mentions replaced with links. All links have a
|
105
|
+
# 'user-mention' class name attached for styling.
|
106
|
+
def mention_link_filter(text, _base_url = '/', info_url = nil, username_pattern = UsernamePattern)
|
107
|
+
self.class.mentioned_logins_in(text, username_pattern) do |match, login, is_mentioned|
|
108
|
+
link =
|
109
|
+
if is_mentioned
|
110
|
+
link_to_mention_info(login, info_url)
|
111
|
+
else
|
112
|
+
link_to_mentioned_user(login)
|
113
|
+
end
|
114
|
+
|
115
|
+
link ? match.sub("@#{login}", link) : match
|
116
|
+
end
|
117
|
+
end
|
118
|
+
|
119
|
+
def link_to_mention_info(text, info_url = nil)
|
120
|
+
return "@#{text}" if info_url.nil?
|
121
|
+
"<a href='#{info_url}' class='user-mention'>" \
|
122
|
+
"@#{text}" \
|
123
|
+
'</a>'
|
124
|
+
end
|
125
|
+
|
126
|
+
def link_to_mentioned_user(login)
|
127
|
+
result[:mentioned_usernames] |= [login]
|
128
|
+
|
129
|
+
url = base_url.dup
|
130
|
+
url << '/' unless url =~ /[\/~]\z/
|
131
|
+
|
132
|
+
"<a href='#{url << login}' class='user-mention'>" \
|
133
|
+
"@#{login}" \
|
134
|
+
'</a>'
|
135
|
+
end
|
136
|
+
end
|
137
|
+
end
|
138
|
+
end
|
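To see what the mention pattern accepts and rejects, the sketch below rebuilds it outside the filter and scans a sample string. The pattern text is copied from `MentionPatterns` and `UsernamePattern` above. The constant names `USERNAME` and `MENTION`, the helper `mentioned_logins`, and the sample text are assumptions made for illustration.

```ruby
# Default username pattern, copied from UsernamePattern.
USERNAME = /[a-z0-9][a-z0-9-]*/

# Mention pattern, copied from the MentionPatterns block.
MENTION = /
  (?:^|\W)                   # beginning of string or non-word char
  @((?>#{USERNAME}))         # the login, captured without the at-sign
  (?!\/)                     # without a trailing slash
  (?=
    \.+[ \t\W]|              # dots followed by space or non-word character
    \.+$|                    # dots at end of line
    [^0-9a-zA-Z_.]|          # non-word character except dot
    $                        # end of line
  )
/ix

# Hypothetical helper: return the captured login of every mention in text.
def mentioned_logins(text, pattern = MENTION)
  text.scan(pattern).flatten
end

sample = 'Hi @alice, ask @bob-smith. See github.com/@carol/ or mail me@example.com'
p mentioned_logins(sample) # => ["alice", "bob-smith"]
```

`@carol/` is skipped because of the trailing slash, and `me@example.com` is skipped because the `@` follows a word character.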
data/lib/html/pipeline-plus/absolute_source_filter.rb
ADDED
@@ -0,0 +1,45 @@
|
|
1
|
+
require 'uri'
|
2
|
+
|
3
|
+
module HTML
|
4
|
+
class Pipeline
|
5
|
+
class AbsoluteSourceFilter < Filter
|
6
|
+
# HTML Filter for replacing relative and root relative image URLs with
|
7
|
+
# fully qualified URLs
|
8
|
+
#
|
9
|
+
# This is useful if an image is root relative but should really be going
|
10
|
+
# through a cdn, or if the content for the page assumes the host is known
|
11
|
+
# e.g. scraped webpages and some RSS feeds.
|
12
|
+
#
|
13
|
+
# Context options:
|
14
|
+
# :image_base_url - Base URL for image host for root relative src.
|
15
|
+
# :image_subpage_url - For relative src.
|
16
|
+
#
|
17
|
+
# This filter does not write additional information to the context.
|
18
|
+
# This filter would need to be run before CamoFilter.
|
19
|
+
def call
|
20
|
+
doc.search('img').each do |element|
|
21
|
+
next if element['src'].nil? || element['src'].empty?
|
22
|
+
src = element['src'].strip
|
23
|
+
next if src.start_with? 'http'
|
24
|
+
base = if src.start_with? '/'
|
25
|
+
image_base_url
|
26
|
+
else
|
27
|
+
image_subpage_url
|
28
|
+
end
|
29
|
+
element['src'] = URI.join(base, src).to_s
|
30
|
+
end
|
31
|
+
doc
|
32
|
+
end
|
33
|
+
|
34
|
+
# Private: the base url you want to use
|
35
|
+
def image_base_url
|
36
|
+
context[:image_base_url] || raise("Missing context :image_base_url for #{self.class.name}")
|
37
|
+
end
|
38
|
+
|
39
|
+
# Private: the relative url you want to use
|
40
|
+
def image_subpage_url
|
41
|
+
context[:image_subpage_url] || raise("Missing context :image_subpage_url for #{self.class.name}")
|
42
|
+
end
|
43
|
+
end
|
44
|
+
end
|
45
|
+
end
|
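The URL resolution this filter relies on can be tried without Nokogiri. The sketch below mirrors the branch in `call` using only `URI.join` from the standard library. The helper name `absolute_src` and the example URLs are assumptions, not part of the gem.

```ruby
require 'uri'

# Hypothetical helper mirroring the filter's branch: root-relative paths are
# joined to base_url, other relative paths to subpage_url, and anything
# starting with "http" is left alone.
def absolute_src(src, base_url:, subpage_url:)
  src = src.strip
  return src if src.start_with?('http')

  base = src.start_with?('/') ? base_url : subpage_url
  URI.join(base, src).to_s
end

urls = { base_url: 'https://cdn.example.com', subpage_url: 'https://example.com/blog/post/' }

puts absolute_src('/img/a.png', **urls) # => https://cdn.example.com/img/a.png
puts absolute_src('b.png', **urls)      # => https://example.com/blog/post/b.png
puts absolute_src('http://other.example/c.png', **urls) # unchanged
```

The trailing slash on the subpage URL matters. With `https://example.com/blog/post` as the base, `URI.join` replaces the last path segment and `b.png` resolves to `https://example.com/blog/b.png`.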