html_truncator 0.1.0
- data/Gemfile +1 -0
- data/MIT-LICENSE +20 -0
- data/README.md +87 -0
- data/lib/html_truncator.rb +50 -0
- metadata +94 -0
data/Gemfile
ADDED
@@ -0,0 +1 @@
+gemspec
data/MIT-LICENSE
ADDED
@@ -0,0 +1,20 @@
+Copyright (c) 2009 Bruno Michel
+
+Permission is hereby granted, free of charge, to any person obtaining
+a copy of this software and associated documentation files (the
+"Software"), to deal in the Software without restriction, including
+without limitation the rights to use, copy, modify, merge, publish,
+distribute, sublicense, and/or sell copies of the Software, and to
+permit persons to whom the Software is furnished to do so, subject to
+the following conditions:
+
+The above copyright notice and this permission notice shall be
+included in all copies or substantial portions of the Software.
+
+THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
+EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
+MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
+NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
+LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
+OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
+WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md
ADDED
@@ -0,0 +1,87 @@
+HTML Truncator
+==============
+
+Want to truncate an HTML string properly? This gem is for you.
+It's powered by [Nokogiri](http://nokogiri.org/)!
+
+
+How to use it
+-------------
+
+It's very simple. Install it with rubygems:
+
+    gem install html_truncator
+
+Or, if you use bundler, add it to your `Gemfile`:
+
+    gem "html_truncator", "~> 0.1"
+
+Then you can use it in your code:
+
+    require "html_truncator"
+    HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3)
+    # => "<p>Lorem ipsum dolor...</p>"
+
+The HTML_Truncator class has only one method, `truncate`, with 3 arguments:
+
+* the HTML-formatted string to truncate
+* the number of words to keep (real words, tags and attributes aren't counted)
+* the ellipsis (optional, '...' by default).
+
+
+Examples
+--------
+
+A simple example:
+
+    HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3)
+    # => "<p>Lorem ipsum dolor...</p>"
+
+If the text is too short to be truncated, it won't be modified:
+
+    HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 5)
+    # => "<p>Lorem ipsum dolor sit amet.</p>"
+
+You can customize the ellipsis:
+
+    HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3, " (truncated)")
+    # => "<p>Lorem ipsum dolor (truncated)</p>"
+
+And even have HTML in the ellipsis:
+
+    HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3, '<a href="/more-to-read">...</a>')
+    # => "<p>Lorem ipsum dolor<a href="/more-to-read">...</a></p>"
+
+
+Alternatives
+------------
+
+Rails has a `truncate` helper, but as the doc says:
+
+> Care should be taken if text contains HTML tags or entities,
+because truncation may produce invalid HTML (such as unbalanced or incomplete tags).
+
+I know there is some Ruby code to truncate HTML, like:
+
+* https://github.com/hgimenez/truncate_html
+* https://gist.github.com/101410
+* http://henrik.nyh.se/2008/01/rails-truncate-html-helper
+* http://blog.madebydna.com/all/code/2010/06/04/ruby-helper-to-cleanly-truncate-html.html
+
+But I'm not pleased with these solutions: they either rely on regexps to
+parse the content (too fragile), don't put the ellipsis where expected,
+cut words, or sometimes leave empty DOM nodes. So I made my own gem ;-)
+
+
+Issues or Suggestions
+---------------------
+
+Found an issue or have a suggestion? Please report it on
+[GitHub's issue tracker](http://github.com/nono/HTML-Truncator/issues).
+
+If you want to make a pull request, please run the specs first:
+
+    rspec spec
+
+
+Copyright (c) 2011 Bruno Michel <bmichel@menfin.info>, released under the MIT license
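The fragility that the README's Alternatives section describes is easy to reproduce. The sketch below is not part of the gem: `naive_truncate` is a hypothetical helper that cuts by character count, shown only to illustrate how such a cut can land inside a tag and leave the markup unbalanced.

```ruby
# Hypothetical character-based truncation, for contrast only.
# This is NOT how html_truncator works.
def naive_truncate(html, length, ellipsis = "...")
  return html if html.length <= length
  html[0, length] + ellipsis
end

html = '<p>Lorem <a href="/ipsum">ipsum</a> dolor sit amet.</p>'
cut = naive_truncate(html, 20)
puts cut

# Rough balance check: count opening and closing tags left in the result.
opened = cut.scan(/<[a-z]+/i).length
closed = cut.scan(/<\/[a-z]+>/i).length
puts "opening tags: #{opened}, closing tags: #{closed}"
```

Here the cut falls in the middle of the `href` attribute, so two tags are opened and none are closed. Counting words on parsed text nodes, as the gem does, avoids this class of problem.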
data/lib/html_truncator.rb
ADDED
@@ -0,0 +1,50 @@
+require "nokogiri"
+
+
+class HTML_Truncator
+  def self.truncate(text, max_words, ellipsis="...")
+    doc = Nokogiri::HTML::DocumentFragment.parse(text)
+    doc.truncate(max_words, ellipsis)
+  end
+end
+
+
+class Nokogiri::HTML::DocumentFragment
+  def truncate(max_words, ellipsis)
+    inner_truncate(max_words, ellipsis).first
+  end
+end
+
+class Nokogiri::XML::Node
+  def truncate(max_words, ellipsis)
+    inner, remaining = inner_truncate(max_words, ellipsis)
+    children.remove
+    add_child Nokogiri::HTML::DocumentFragment.parse(inner)
+    [to_xml, max_words - remaining]
+  end
+
+  def inner_truncate(max_words, ellipsis)
+    inner, remaining = "", max_words
+    self.children.each do |node|
+      txt, nb = node.truncate(remaining, ellipsis)
+      remaining -= nb
+      inner += txt
+      break if remaining < 0
+    end
+    [inner, remaining]
+  end
+
+  def nb_words
+    inner_text.split.length
+  end
+end
+
+class Nokogiri::XML::Text
+  def truncate(max_words, ellipsis)
+    words = content.split
+    nb_words = words.length
+    return [to_xhtml, nb_words] if nb_words <= max_words
+    return [ellipsis, 1] if max_words == 0
+    [words.slice(0, max_words).join(' ') + ellipsis, nb_words]
+  end
+end
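The word-budget logic in `lib/html_truncator.rb` can be tried without Nokogiri. The sketch below is an illustration under assumptions, not the gem's code: `truncate_words` and `truncate_chunks` are hypothetical names, plain strings stand in for text nodes, and no HTML escaping or tree rebuilding is done. It mirrors the three branches of `Nokogiri::XML::Text#truncate` and the loop in `inner_truncate`.

```ruby
# Hypothetical stand-in for Nokogiri::XML::Text#truncate on a plain string.
# Returns [resulting_text, number_of_words_charged_to_the_budget].
def truncate_words(text, max_words, ellipsis = "...")
  words = text.split
  nb_words = words.length
  return [text, nb_words] if nb_words <= max_words   # fits: keep as is
  return [ellipsis, 1] if max_words == 0              # budget spent: ellipsis only
  [words.slice(0, max_words).join(" ") + ellipsis, nb_words]
end

# Hypothetical stand-in for inner_truncate: walk sibling chunks, charging each
# one against the remaining budget, and stop once the budget goes negative.
def truncate_chunks(chunks, max_words, ellipsis = "...")
  inner, remaining = "", max_words
  chunks.each do |chunk|
    txt, nb = truncate_words(chunk, remaining, ellipsis)
    remaining -= nb
    inner += txt
    break if remaining < 0
  end
  [inner, remaining]
end

p truncate_words("Lorem ipsum dolor sit amet.", 3)
p truncate_chunks(["Lorem ipsum ", "dolor sit amet."], 3)
```

A truncated chunk reports its full word count rather than the number of words kept, which drives the remaining budget below zero. That negative value is what tells the caller to stop visiting later siblings.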
metadata
ADDED
@@ -0,0 +1,94 @@
+--- !ruby/object:Gem::Specification
+name: html_truncator
+version: !ruby/object:Gem::Version
+  prerelease: false
+  segments:
+  - 0
+  - 1
+  - 0
+  version: 0.1.0
+platform: ruby
+authors:
+- Bruno Michel
+autorequire:
+bindir: bin
+cert_chain: []
+
+date: 2011-01-09 00:00:00 +01:00
+default_executable:
+dependencies:
+- !ruby/object:Gem::Dependency
+  name: nokogiri
+  prerelease: false
+  requirement: &id001 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        segments:
+        - 1
+        - 4
+        version: "1.4"
+  type: :runtime
+  version_requirements: *id001
+- !ruby/object:Gem::Dependency
+  name: rspec
+  prerelease: false
+  requirement: &id002 !ruby/object:Gem::Requirement
+    none: false
+    requirements:
+    - - ~>
+      - !ruby/object:Gem::Version
+        segments:
+        - 2
+        - 4
+        version: "2.4"
+  type: :development
+  version_requirements: *id002
+description: Want to truncate an HTML string properly? This gem is for you.
+email: bmichel@menfin.info
+executables: []
+
+extensions: []
+
+extra_rdoc_files:
+- README.md
+files:
+- MIT-LICENSE
+- README.md
+- Gemfile
+- lib/html_truncator.rb
+has_rdoc: true
+homepage: http://github.com/nono/HTML-Truncator
+licenses: []
+
+post_install_message:
+rdoc_options: []
+
+require_paths:
+- lib
+required_ruby_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 0
+      version: "0"
+required_rubygems_version: !ruby/object:Gem::Requirement
+  none: false
+  requirements:
+  - - ">="
+    - !ruby/object:Gem::Version
+      segments:
+      - 0
+      version: "0"
+requirements: []
+
+rubyforge_project:
+rubygems_version: 1.3.7
+signing_key:
+specification_version: 3
+summary: Want to truncate an HTML string properly? This gem is for you.
+test_files: []
+
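The metadata declares both dependencies with the pessimistic operator (`~> 1.4` for nokogiri, `~> 2.4` for rspec). As a quick check of what that constraint admits, the sketch below uses RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The version numbers tested are arbitrary examples, not releases confirmed by this diff.

```ruby
require "rubygems"

# "~> 1.4" allows any 1.x release from 1.4 upward, but not 2.0.
nokogiri_req = Gem::Requirement.new("~> 1.4")

%w[1.3.9 1.4.0 1.4.4 1.5.0 2.0.0].each do |v|
  ok = nokogiri_req.satisfied_by?(Gem::Version.new(v))
  puts "#{v}: #{ok ? 'satisfies' : 'does not satisfy'} #{nokogiri_req}"
end
```

With only two segments given, the constraint pins the major version, so a future nokogiri 2.0 would be rejected until the gemspec is updated.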