html_truncator 0.1.0

Files changed (5)
  1. data/Gemfile +1 -0
  2. data/MIT-LICENSE +20 -0
  3. data/README.md +87 -0
  4. data/lib/html_truncator.rb +50 -0
  5. metadata +94 -0
data/Gemfile ADDED
@@ -0,0 +1 @@
1
+ gemspec
data/MIT-LICENSE ADDED
@@ -0,0 +1,20 @@
1
+ Copyright (c) 2009 Bruno Michel
2
+
3
+ Permission is hereby granted, free of charge, to any person obtaining
4
+ a copy of this software and associated documentation files (the
5
+ "Software"), to deal in the Software without restriction, including
6
+ without limitation the rights to use, copy, modify, merge, publish,
7
+ distribute, sublicense, and/or sell copies of the Software, and to
8
+ permit persons to whom the Software is furnished to do so, subject to
9
+ the following conditions:
10
+
11
+ The above copyright notice and this permission notice shall be
12
+ included in all copies or substantial portions of the Software.
13
+
14
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
15
+ EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
16
+ MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
17
+ NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
18
+ LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
19
+ OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
20
+ WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,87 @@
1
+ HTML Truncator
2
+ ==============
3
+
4
+ Want to truncate an HTML string properly? This gem is for you.
5
+ It's powered by [Nokogiri](http://nokogiri.org/)!
6
+
7
+
8
+ How to use it
9
+ -------------
10
+
11
+ It's very simple. Install it with rubygems:
12
+
13
+ gem install html_truncator
14
+
15
+ Or, if you use bundler, add it to your `Gemfile`:
16
+
17
+ gem "html_truncator", "~> 0.1"
18
+
19
+ Then you can use it in your code:
20
+
21
+ require "html_truncator"
22
+ HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3)
23
+ # => "<p>Lorem ipsum dolor...</p>"
24
+
25
+ The HTML_Truncator class has only one method, `truncate`, with 3 arguments:
26
+
27
+ * the HTML-formatted string to truncate
28
+ * the number of words to keep (real words; tags and attributes aren't counted)
29
+ * the ellipsis (optional, '...' by default).
30
+
31
+
32
+ Examples
33
+ --------
34
+
35
+ A simple example:
36
+
37
+ HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3)
38
+ # => "<p>Lorem ipsum dolor...</p>"
39
+
40
+ If the text is too short to be truncated, it won't be modified:
41
+
42
+ HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 5)
43
+ # => "<p>Lorem ipsum dolor sit amet.</p>"
44
+
45
+ You can customize the ellipsis:
46
+
47
+ HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3, " (truncated)")
48
+ # => "<p>Lorem ipsum dolor (truncated)</p>"
49
+
50
+ And even have HTML in the ellipsis:
51
+
52
+ HTML_Truncator.truncate("<p>Lorem ipsum dolor sit amet.</p>", 3, '<a href="/more-to-read">...</a>')
53
+ # => "<p>Lorem ipsum dolor<a href="/more-to-read">...</a></p>"
54
+
55
+
56
+ Alternatives
57
+ ------------
58
+
59
+ Rails has a `truncate` helper, but as the doc says:
60
+
61
+ > Care should be taken if text contains HTML tags or entities,
62
+ because truncation may produce invalid HTML (such as unbalanced or incomplete tags).
63
+
64
+ I know there is some Ruby code to truncate HTML, such as:
65
+
66
+ * https://github.com/hgimenez/truncate_html
67
+ * https://gist.github.com/101410
68
+ * http://henrik.nyh.se/2008/01/rails-truncate-html-helper
69
+ * http://blog.madebydna.com/all/code/2010/06/04/ruby-helper-to-cleanly-truncate-html.html
70
+
71
+ But I'm not pleased with these solutions: they either rely on regexps to
72
+ parse the content (too fragile), don't put the ellipsis where expected,
73
+ cut words, or sometimes leave empty DOM nodes. So I made my own gem ;-)
74
+
75
+
76
+ Issues or Suggestions
77
+ ---------------------
78
+
79
+ Found an issue or have a suggestion? Please report it on
80
+ [GitHub's issue tracker](http://github.com/nono/HTML-Truncator/issues).
81
+
82
+ If you want to make a pull request, please run the specs first:
83
+
84
+ rspec spec
85
+
86
+
87
+ Copyright (c) 2011 Bruno Michel <bmichel@menfin.info>, released under the MIT license
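
The README's concern about string-level truncation (unbalanced or incomplete tags) can be illustrated with a minimal sketch. `naive_truncate` is a hypothetical helper written for this illustration; it is not part of the gem or of Rails, and it simply cuts the raw string without looking at the markup.

```ruby
# Hypothetical helper (not part of html_truncator): cuts the raw string
# at a character offset, ignoring any HTML structure.
def naive_truncate(html, length, ellipsis = "...")
  return html if html.length <= length
  html[0, length] + ellipsis
end

html = '<p>Lorem <a href="/ipsum">ipsum</a> dolor sit amet.</p>'
broken = naive_truncate(html, 20)

# The cut lands inside the <a> tag, so the result has an unterminated
# attribute and no closing </a> or </p>.
puts broken
# => <p>Lorem <a href="/i...
```

Counting words on parsed text nodes, as the gem does, avoids this class of problem because tags are never split.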
data/lib/html_truncator.rb ADDED
@@ -0,0 +1,50 @@
1
+ require "nokogiri"
2
+
3
+
4
+ class HTML_Truncator
5
+ def self.truncate(text, max_words, ellipsis="...")
6
+ doc = Nokogiri::HTML::DocumentFragment.parse(text)
7
+ doc.truncate(max_words, ellipsis)
8
+ end
9
+ end
10
+
11
+
12
+ class Nokogiri::HTML::DocumentFragment
13
+ def truncate(max_words, ellipsis)
14
+ inner_truncate(max_words, ellipsis).first
15
+ end
16
+ end
17
+
18
+ class Nokogiri::XML::Node
19
+ def truncate(max_words, ellipsis)
20
+ inner, remaining = inner_truncate(max_words, ellipsis)
21
+ children.remove
22
+ add_child Nokogiri::HTML::DocumentFragment.parse(inner)
23
+ [to_xml, max_words - remaining]
24
+ end
25
+
26
+ def inner_truncate(max_words, ellipsis)
27
+ inner, remaining = "", max_words
28
+ self.children.each do |node|
29
+ txt, nb = node.truncate(remaining, ellipsis)
30
+ remaining -= nb
31
+ inner += txt
32
+ break if remaining < 0
33
+ end
34
+ [inner, remaining]
35
+ end
36
+
37
+ def nb_words
38
+ inner_text.split.length
39
+ end
40
+ end
41
+
42
+ class Nokogiri::XML::Text
43
+ def truncate(max_words, ellipsis)
44
+ words = content.split
45
+ nb_words = words.length
46
+ return [to_xhtml, nb_words] if nb_words <= max_words
47
+ return [ellipsis, 1] if max_words == 0
48
+ [words.slice(0, max_words).join(' ') + ellipsis, nb_words]
49
+ end
50
+ end
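
The word-budget logic in this file can be sketched without Nokogiri. The version below is an approximation under stated assumptions: plain strings stand in for text nodes, and `truncate_words` and `truncate_chunks` are hypothetical names introduced here, not methods of the gem. It mirrors the rule in `Nokogiri::XML::Text#truncate` and the loop in `inner_truncate`.

```ruby
# Hypothetical stand-in for the text-node rule: returns the (possibly
# truncated) text and the number of words to charge against the budget.
def truncate_words(text, max_words, ellipsis = "...")
  words = text.split
  # Text fits in the budget: keep it and charge its real word count.
  return [text, words.length] if words.length <= max_words
  # Budget already spent: emit only the ellipsis and charge one word,
  # which pushes the caller's budget below zero.
  return [ellipsis, 1] if max_words == 0
  # Otherwise keep the first max_words words and charge the full count,
  # which also pushes the caller's budget below zero.
  [words.first(max_words).join(" ") + ellipsis, words.length]
end

# Hypothetical stand-in for inner_truncate: walks sibling chunks,
# spending the budget and stopping once it goes negative.
def truncate_chunks(chunks, max_words, ellipsis = "...")
  out = ""
  remaining = max_words
  chunks.each do |chunk|
    txt, nb = truncate_words(chunk, remaining, ellipsis)
    remaining -= nb
    out += txt
    break if remaining < 0
  end
  out
end

puts truncate_chunks(["Lorem ipsum ", "dolor sit amet."], 3)
# => Lorem ipsum dolor...
```

The negative budget doubles as the "truncation happened" signal: once a chunk is cut, `remaining` drops below zero and the loop stops, so later siblings are discarded.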
metadata ADDED
@@ -0,0 +1,94 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: html_truncator
3
+ version: !ruby/object:Gem::Version
4
+ prerelease: false
5
+ segments:
6
+ - 0
7
+ - 1
8
+ - 0
9
+ version: 0.1.0
10
+ platform: ruby
11
+ authors:
12
+ - Bruno Michel
13
+ autorequire:
14
+ bindir: bin
15
+ cert_chain: []
16
+
17
+ date: 2011-01-09 00:00:00 +01:00
18
+ default_executable:
19
+ dependencies:
20
+ - !ruby/object:Gem::Dependency
21
+ name: nokogiri
22
+ prerelease: false
23
+ requirement: &id001 !ruby/object:Gem::Requirement
24
+ none: false
25
+ requirements:
26
+ - - ~>
27
+ - !ruby/object:Gem::Version
28
+ segments:
29
+ - 1
30
+ - 4
31
+ version: "1.4"
32
+ type: :runtime
33
+ version_requirements: *id001
34
+ - !ruby/object:Gem::Dependency
35
+ name: rspec
36
+ prerelease: false
37
+ requirement: &id002 !ruby/object:Gem::Requirement
38
+ none: false
39
+ requirements:
40
+ - - ~>
41
+ - !ruby/object:Gem::Version
42
+ segments:
43
+ - 2
44
+ - 4
45
+ version: "2.4"
46
+ type: :development
47
+ version_requirements: *id002
48
+ description: Want to truncate an HTML string properly? This gem is for you.
49
+ email: bmichel@menfin.info
50
+ executables: []
51
+
52
+ extensions: []
53
+
54
+ extra_rdoc_files:
55
+ - README.md
56
+ files:
57
+ - MIT-LICENSE
58
+ - README.md
59
+ - Gemfile
60
+ - lib/html_truncator.rb
61
+ has_rdoc: true
62
+ homepage: http://github.com/nono/HTML-Truncator
63
+ licenses: []
64
+
65
+ post_install_message:
66
+ rdoc_options: []
67
+
68
+ require_paths:
69
+ - lib
70
+ required_ruby_version: !ruby/object:Gem::Requirement
71
+ none: false
72
+ requirements:
73
+ - - ">="
74
+ - !ruby/object:Gem::Version
75
+ segments:
76
+ - 0
77
+ version: "0"
78
+ required_rubygems_version: !ruby/object:Gem::Requirement
79
+ none: false
80
+ requirements:
81
+ - - ">="
82
+ - !ruby/object:Gem::Version
83
+ segments:
84
+ - 0
85
+ version: "0"
86
+ requirements: []
87
+
88
+ rubyforge_project:
89
+ rubygems_version: 1.3.7
90
+ signing_key:
91
+ specification_version: 3
92
+ summary: Want to truncate an HTML string properly? This gem is for you.
93
+ test_files: []
94
+
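
The `~>` requirements in the metadata above (nokogiri `~> 1.4`, rspec `~> 2.4`) are RubyGems pessimistic constraints. As a small check using the RubyGems classes that ship with Ruby, `~> 1.4` accepts any 1.x release from 1.4 upward and rejects 2.0; the version numbers tested below are arbitrary examples, not releases referenced by this gem.

```ruby
require "rubygems"

# "~> 1.4" is equivalent to ">= 1.4" combined with "< 2.0".
requirement = Gem::Requirement.new("~> 1.4")

accepts_1_4 = requirement.satisfied_by?(Gem::Version.new("1.4.0"))
accepts_1_9 = requirement.satisfied_by?(Gem::Version.new("1.9.3"))
accepts_2_0 = requirement.satisfied_by?(Gem::Version.new("2.0.0"))
accepts_1_3 = requirement.satisfied_by?(Gem::Version.new("1.3.9"))

puts [accepts_1_4, accepts_1_9, accepts_2_0, accepts_1_3].inspect
# => [true, true, false, false]
```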