html_aware_truncation 1.0.0

checksums.yaml.gz ADDED
@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA1:
3
+ metadata.gz: 81d7b744f16d55b31ecb503a6d98fe9b1575751a
4
+ data.tar.gz: 783c9c22e310d260e8e357fc1b08517b6cf983ba
5
+ SHA512:
6
+ metadata.gz: a78a2d9a01ccf0fd98a6cd449292231d16eebef9e4aab4c3bf956a40dd93a033a942ce372914f960031d9bcf27ce27998a56d8ab0fb9039d41a03048999e791f
7
+ data.tar.gz: 9abc87ca6410bb25d55c84d4d67b8b1dc201df3091807d5890f8eab70aa77026bd3f718e0f4a01a3adbe8e21e2694cf84f33cbcd55b0c4d36a90d7451d8fcaf9
data/.gitignore ADDED
@@ -0,0 +1,12 @@
1
+ /.bundle/
2
+ /.yardoc
3
+ /Gemfile.lock
4
+ /_yardoc/
5
+ /coverage/
6
+ /doc/
7
+ /pkg/
8
+ /spec/reports/
9
+ /tmp/
10
+ .byebug_history
11
+ # rspec failure tracking
12
+ .rspec_status
data/.rspec ADDED
@@ -0,0 +1,2 @@
1
+ --format documentation
2
+ --color
data/.travis.yml ADDED
@@ -0,0 +1,5 @@
1
+ sudo: false
2
+ language: ruby
3
+ rvm:
4
+ - 2.3.3
5
+ before_install: gem install bundler -v 1.14.5
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source 'https://rubygems.org'
2
+
3
+ # Specify your gem's dependencies in html_aware_truncation.gemspec
4
+ gemspec
5
+
6
+ gem "byebug", group: [:development, :test]
data/LICENSE.txt ADDED
@@ -0,0 +1,21 @@
1
+ The MIT License (MIT)
2
+
3
+ Copyright (c) 2017 Jonathan Rochkind
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in
13
+ all copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21
+ THE SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,89 @@
1
+ # HtmlAwareTruncation
2
+ [![Gem Version](https://badge.fury.io/rb/html_aware_truncation.svg)](https://badge.fury.io/rb/html_aware_truncation)
3
+ [![Build Status](https://travis-ci.org/jrochkind/html_aware_truncation.svg?branch=master)](https://travis-ci.org/jrochkind/html_aware_truncation)
4
+
5
+
6
+ Yet another Ruby HTML-aware truncation routine. Truncates HTML to a maximum number of text characters,
7
+ resulting in still-legal HTML without any unclosed tags etc.
8
+
9
+ I was unable to find an existing solution that met my needs:
10
+ * Uses [nokogiri](https://github.com/sparklemotion/nokogiri) (cause it's really good at handling somewhat invalid HTML input, and you probably already have it as a dependency)
11
+ * Does not monkey-patch nokogiri or String or anything else.
12
+ * Follows Rails [truncate helper](http://api.rubyonrails.org/classes/ActionView/Helpers/TextHelper.html#method-i-truncate)
13
+ semantics, including a custom :separator that can be a string or regex, usually for word boundaries.
14
+
15
+
16
+ ## Usage
17
+
18
+ ```ruby
19
+ require 'html_aware_truncation'
20
+ string = "<p>Lots of html <b>with bolded stuff</b></p>"
21
+ HtmlAwareTruncation.truncate_html(string, length: 10)
22
+ # => "<p>Lots of h…</p>"
23
+ HtmlAwareTruncation.truncate_html(string, length: 10, separator: /\b/)
24
+ # => "<p>Lots of …</p>"
25
+ HtmlAwareTruncation.truncate_html(string, length: 10, separator: /\b/, omission: '--')
26
+ # => "<p>Lots of --</p>"
27
+ ```
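The three results above follow from how the cut point is chosen inside a single text node. The following is a minimal sketch of that step in plain Ruby, modeled on the text-node branch of `truncate_nokogiri_node`. The helper name `cut_text` is hypothetical and is not part of the gem; it works on a bare string rather than a Nokogiri node.

```ruby
# Sketch of the text-node step only. `cut_text` is a hypothetical helper,
# not part of the gem's API; it takes a plain string instead of a node.
def cut_text(text, length:, omission: "…", separator: nil)
  return text if text.length <= length

  # Reserve room for the omission mark inside the length budget.
  endpoint = [0, length - omission.length].max
  # With a separator, back up to the last match at or before the endpoint.
  endpoint = text.rindex(separator, endpoint) || endpoint if separator

  text[0, endpoint] + omission
end

text = "Lots of html " # the text node that precedes <b> in the example above

puts cut_text(text, length: 10)                                   # Lots of h…
puts cut_text(text, length: 10, separator: /\b/)                  # Lots of …
puts cut_text(text, length: 10, separator: /\b/, omission: "--")  # Lots of --
```

Note that the omission mark counts toward `length`, which is why the first result keeps nine characters of text rather than ten.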
28
+
29
+ If you already have a Nokogiri node, or want to do the Nokogiri
30
+ parsing and serialization yourself, you can pass a single Nokogiri node
31
+ to `truncate_nokogiri_node`. Often a `Nokogiri::HTML::DocumentFragment` makes sense:
32
+
33
+ ```ruby
34
+ node = Nokogiri::HTML::DocumentFragment.parse(some_html_str)
35
+ HtmlAwareTruncation.truncate_nokogiri_node(node, length: 10)
36
+ # => Returns a Nokogiri node. The original passed in may be mutated (child elements that already fit may be re-parented into the result).
37
+ ```
38
+
39
+ For convenience, you can `include` the `HtmlAwareTruncation` module, to
40
+ get its methods as mixins.
41
+
42
+ ```ruby
43
+ require 'html_aware_truncation'
44
+ class Something
45
+ include HtmlAwareTruncation
46
+
47
+ def something
48
+ truncate_html(whatever)
49
+ end
50
+ end
51
+ ```
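The library defines its methods with `module_function`, which in Ruby makes each method callable on the module itself and also leaves a private instance-method copy for classes that `include` the module. The sketch below shows that general Ruby behavior with a hypothetical stand-in module (`Shortener`, `Report`, and `shorten` are invented names, not part of this gem).

```ruby
# Hypothetical stand-in module; only the module_function pattern mirrors the gem.
module Shortener
  def shorten(str, length: 5)
    str.length > length ? str[0, length] : str
  end
  module_function :shorten
end

class Report
  include Shortener

  def title
    # The mixed-in copy is callable from inside the instance.
    shorten("Quarterly results")
  end
end

puts Shortener.shorten("abcdefgh")     # abcde  (module-level call)
puts Report.new.title                  # Quart  (call through the mixin)
puts Report.new.respond_to?(:shorten)  # false  (the mixed-in method is private)
```

One consequence of this pattern is that the mixed-in methods are available inside the including class but are not added to its public interface.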
52
+
53
+ ## Known problems
54
+
55
+ This isn't perfect, but it's good enough for me to use in several production
56
+ apps. In edge cases, it may sometimes:
57
+
58
+ * produce output an extra character (or a few) above the specified `length` limit (possibly an off-by-one error)
59
+ * put the omission mark in a node of its own, which is kind of silly: `"<p>Stuff <b>…</b></p>"`
60
+ * leave one or more empty nodes at the end: `"<p>Stuff and...<b></b></p>"`
61
+ * put the omission mark in a tag/node that really ought not to have text content: `"<ul><li>stuff</li>…</ul>"`
61
+ (This one bothers me the most: it's the only case I know of where this gem produces slightly illegal HTML, but it generally happens rarely.)
63
+
64
+ Some specs marked `pending` demonstrate some of this "bad behavior", but there may be other cases that are untested.
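One way to see where some of these edge cases can come from is a simplified model of the child-by-child length budget used in `lib/html_aware_truncation.rb`. In the sketch below, which is an illustration and not the gem's code, a String stands in for a text node and an Array stands in for an element or fragment; `inner_text` and `truncate_tree` are hypothetical names, and the `separator` option is left out.

```ruby
# Simplified model: a String is a text node, an Array is an element.
# `inner_text` and `truncate_tree` are hypothetical names for illustration.
def inner_text(node)
  node.is_a?(String) ? node : node.map { |child| inner_text(child) }.join
end

def truncate_tree(node, length:, omission: "…")
  if node.is_a?(String)
    return node if node.length <= length
    endpoint = [0, length - omission.length].max
    return node[0, endpoint] + omission
  end

  return node if inner_text(node).length <= length

  remaining = length
  result = []
  node.each do |child|
    if remaining == 0
      # Budget used up exactly: the omission mark becomes its own node.
      result << omission
      break
    elsif remaining < 0
      break
    end
    result << truncate_tree(child, length: remaining, omission: omission)
    # May drop below zero when the child had to be truncated.
    remaining -= inner_text(child).length
  end
  result
end

fits = truncate_tree(["Lots of html ", ["with bolded stuff"]], length: 10)
puts inner_text(fits)         # Lots of h…
puts inner_text(fits).length  # 10

over = truncate_tree([["abc"], "def"], length: 3)
puts inner_text(over)         # abc…
puts inner_text(over).length  # 4, one over the limit
```

In the second case the first child fits the budget exactly, so the omission mark is appended as a separate node after it and the total text ends up one character over `length`. Whether this accounts for every over-limit case seen in practice is not established here.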
65
+
66
+ In general, though, this has not caused me real problems in production; it works out.
67
+ I still find this preferable to other alternative gems I know about, so I packaged it up in
68
+ case you do too. Patches welcome.
69
+
70
+ ## Contributing
71
+
72
+ Bug reports and pull requests are welcome on GitHub at https://github.com/jrochkind/html_aware_truncation.
73
+
74
+
75
+ ## License
76
+
77
+ The gem is available as open source under the terms of the [MIT License](http://opensource.org/licenses/MIT).
78
+
79
+ ## Alternatives
80
+
81
+ I adapted some code or tests from some of these. I mostly adapted from
82
+ an example in [a blog post now only in the wayback machine](https://web-beta.archive.org/web/20160116165808/http://blog.madebydna.com/all/code/2010/06/04/ruby-helper-to-cleanly-truncate-html.html).
83
+ Alternative examples can also be useful to look at to see how/if they solve the known problems with this gem, for ideas.
84
+
85
+ * https://github.com/nono/HTML-Truncator
86
+ * https://github.com/hgmnz/truncate_html
87
+ * https://github.com/ianwhite/truncate_html
88
+
89
+
data/Rakefile ADDED
@@ -0,0 +1,6 @@
1
+ require "bundler/gem_tasks"
2
+ require "rspec/core/rake_task"
3
+
4
+ RSpec::Core::RakeTask.new(:spec)
5
+
6
+ task :default => :spec
data/bin/console ADDED
@@ -0,0 +1,14 @@
1
+ #!/usr/bin/env ruby
2
+
3
+ require "bundler/setup"
4
+ require "html_aware_truncation"
5
+
6
+ # You can add fixtures and/or initialization code here to make experimenting
7
+ # with your gem easier. You can also use a different console, if you like.
8
+
9
+ # (If you use this, don't forget to add pry to your Gemfile!)
10
+ # require "pry"
11
+ # Pry.start
12
+
13
+ require "irb"
14
+ IRB.start(__FILE__)
data/bin/setup ADDED
@@ -0,0 +1,8 @@
1
+ #!/usr/bin/env bash
2
+ set -euo pipefail
3
+ IFS=$'\n\t'
4
+ set -vx
5
+
6
+ bundle install
7
+
8
+ # Do any other automated setup that you need to do here
data/html_aware_truncation.gemspec ADDED
@@ -0,0 +1,28 @@
1
+ # coding: utf-8
2
+ lib = File.expand_path('../lib', __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require 'html_aware_truncation/version'
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "html_aware_truncation"
8
+ spec.version = HtmlAwareTruncation::VERSION
9
+ spec.authors = ["Jonathan Rochkind"]
10
+ spec.email = ["jrochkind@chemheritage.org"]
11
+
12
+ spec.summary = %q{Yet another ruby html-aware truncation routine}
13
+ #spec.homepage = "TODO: Put your gem's website or public repo URL here."
14
+ spec.license = "MIT"
15
+
16
+ spec.files = `git ls-files -z`.split("\x0").reject do |f|
17
+ f.match(%r{^(test|spec|features)/})
18
+ end
19
+ # spec.bindir = "exe"
20
+ # spec.executables = spec.files.grep(%r{^exe/}) { |f| File.basename(f) }
21
+ spec.require_paths = ["lib"]
22
+
23
+ spec.add_runtime_dependency "nokogiri", "~> 1.0"
24
+
25
+ spec.add_development_dependency "bundler", "~> 1.14"
26
+ spec.add_development_dependency "rake"
27
+ spec.add_development_dependency "rspec", "~> 3.0"
28
+ end
data/lib/html_aware_truncation.rb ADDED
@@ -0,0 +1,72 @@
1
+ require "html_aware_truncation/version"
2
+ require 'nokogiri'
3
+
4
+ module HtmlAwareTruncation
5
+ define_singleton_method(:default_length) { @default_length }
6
+ define_singleton_method(:default_length=) { |val| @default_length = val }
7
+ self.default_length = 200
8
+ define_singleton_method(:default_omission) { @default_omission }
9
+ define_singleton_method(:default_omission=) { |val| @default_omission = val }
10
+ self.default_omission = '…'
11
+
12
+ def truncate_html(str,
13
+ length: HtmlAwareTruncation.default_length,
14
+ omission: HtmlAwareTruncation.default_omission,
15
+ separator: nil)
16
+
17
+ HtmlAwareTruncation.truncate_nokogiri_node(
18
+ Nokogiri::HTML::DocumentFragment.parse(str),
19
+ length: length,
20
+ omission: omission,
21
+ separator: separator
22
+ ).to_html
23
+ end
24
+ module_function :truncate_html
25
+
26
+ # HTML-aware truncation of a `Nokogiri::HTML::DocumentFragment`, perhaps
27
+ # one you created with `Nokogiri::HTML::DocumentFragment.parse(str)`
28
+ # Returns a TODO. (may mutate input?)
29
+ #
30
+ # See also truncate_string, which will take and return a string, parsing
31
+ # for you for convenience.
32
+ def truncate_nokogiri_node(node,
33
+ length: HtmlAwareTruncation.default_length,
34
+ omission: HtmlAwareTruncation.default_omission,
35
+ separator: nil)
36
+ if node.kind_of?(::Nokogiri::XML::Text)
37
+ if node.content.length > length
38
+ allowable_endpoint = [0, length - omission.length].max
39
+ if separator
40
+ allowable_endpoint = (node.content.rindex(separator, allowable_endpoint) || allowable_endpoint)
41
+ end
42
+
43
+ ::Nokogiri::XML::Text.new(node.content.slice(0, allowable_endpoint) + omission, node.parent)
44
+ else
45
+ node.dup
46
+ end
47
+ else # DocumentFragment or Element
48
+ return node if node.inner_text.length <= length
49
+
50
+ truncated_node = node.dup
51
+ truncated_node.children.remove
52
+ remaining_length = length
53
+
54
+ node.children.each do |child|
55
+ if remaining_length == 0
56
+ truncated_node.add_child ::Nokogiri::XML::Text.new(omission, truncated_node)
57
+ break
58
+ elsif remaining_length < 0
59
+ break
60
+ end
61
+ truncated_node.add_child HtmlAwareTruncation.truncate_nokogiri_node(child, length: remaining_length, omission: omission, separator: separator)
62
+ # can end up less than 0 if the child was truncated to fit, that's
63
+ # fine:
64
+ remaining_length = remaining_length - child.inner_text.length
65
+
66
+ end
67
+ truncated_node
68
+ end
69
+ end
70
+ module_function :truncate_nokogiri_node
71
+
72
+ end
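The module above keeps its defaults (`default_length`, `default_omission`) in module-level accessors and reads them in keyword-argument defaults. Ruby evaluates a keyword default each time the method is called, so changing the accessor should affect later calls. The sketch below demonstrates that pattern with a hypothetical stand-in module (`Demo` and `clip` are invented names), so it runs without Nokogiri.

```ruby
# Hypothetical stand-in that mirrors only the defaults pattern of the module above.
module Demo
  # Inside these blocks, self is the module, so the ivar lives on Demo itself.
  define_singleton_method(:default_length) { @default_length }
  define_singleton_method(:default_length=) { |val| @default_length = val }
  self.default_length = 200

  # The default is looked up at call time, not when the method is defined.
  def clip(str, length: Demo.default_length)
    str[0, length]
  end
  module_function :clip
end

puts Demo.clip("a" * 300).length  # 200

Demo.default_length = 3           # change the process-wide default
puts Demo.clip("abcdef")          # abc
puts Demo.clip("abcdef", length: 5)  # abcde (an explicit argument still wins)
```

Because the default is process-wide state, setting it once at application start-up (for example in an initializer) is likely safer than changing it between calls.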
data/lib/html_aware_truncation/version.rb ADDED
@@ -0,0 +1,3 @@
1
+ module HtmlAwareTruncation
2
+ VERSION = "1.0.0"
3
+ end
metadata ADDED
@@ -0,0 +1,112 @@
1
+ --- !ruby/object:Gem::Specification
2
+ name: html_aware_truncation
3
+ version: !ruby/object:Gem::Version
4
+ version: 1.0.0
5
+ platform: ruby
6
+ authors:
7
+ - Jonathan Rochkind
8
+ autorequire:
9
+ bindir: bin
10
+ cert_chain: []
11
+ date: 2017-03-28 00:00:00.000000000 Z
12
+ dependencies:
13
+ - !ruby/object:Gem::Dependency
14
+ name: nokogiri
15
+ requirement: !ruby/object:Gem::Requirement
16
+ requirements:
17
+ - - "~>"
18
+ - !ruby/object:Gem::Version
19
+ version: '1.0'
20
+ type: :runtime
21
+ prerelease: false
22
+ version_requirements: !ruby/object:Gem::Requirement
23
+ requirements:
24
+ - - "~>"
25
+ - !ruby/object:Gem::Version
26
+ version: '1.0'
27
+ - !ruby/object:Gem::Dependency
28
+ name: bundler
29
+ requirement: !ruby/object:Gem::Requirement
30
+ requirements:
31
+ - - "~>"
32
+ - !ruby/object:Gem::Version
33
+ version: '1.14'
34
+ type: :development
35
+ prerelease: false
36
+ version_requirements: !ruby/object:Gem::Requirement
37
+ requirements:
38
+ - - "~>"
39
+ - !ruby/object:Gem::Version
40
+ version: '1.14'
41
+ - !ruby/object:Gem::Dependency
42
+ name: rake
43
+ requirement: !ruby/object:Gem::Requirement
44
+ requirements:
45
+ - - ">="
46
+ - !ruby/object:Gem::Version
47
+ version: '0'
48
+ type: :development
49
+ prerelease: false
50
+ version_requirements: !ruby/object:Gem::Requirement
51
+ requirements:
52
+ - - ">="
53
+ - !ruby/object:Gem::Version
54
+ version: '0'
55
+ - !ruby/object:Gem::Dependency
56
+ name: rspec
57
+ requirement: !ruby/object:Gem::Requirement
58
+ requirements:
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: '3.0'
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: '3.0'
69
+ description:
70
+ email:
71
+ - jrochkind@chemheritage.org
72
+ executables: []
73
+ extensions: []
74
+ extra_rdoc_files: []
75
+ files:
76
+ - ".gitignore"
77
+ - ".rspec"
78
+ - ".travis.yml"
79
+ - Gemfile
80
+ - LICENSE.txt
81
+ - README.md
82
+ - Rakefile
83
+ - bin/console
84
+ - bin/setup
85
+ - html_aware_truncation.gemspec
86
+ - lib/html_aware_truncation.rb
87
+ - lib/html_aware_truncation/version.rb
88
+ homepage:
89
+ licenses:
90
+ - MIT
91
+ metadata: {}
92
+ post_install_message:
93
+ rdoc_options: []
94
+ require_paths:
95
+ - lib
96
+ required_ruby_version: !ruby/object:Gem::Requirement
97
+ requirements:
98
+ - - ">="
99
+ - !ruby/object:Gem::Version
100
+ version: '0'
101
+ required_rubygems_version: !ruby/object:Gem::Requirement
102
+ requirements:
103
+ - - ">="
104
+ - !ruby/object:Gem::Version
105
+ version: '0'
106
+ requirements: []
107
+ rubyforge_project:
108
+ rubygems_version: 2.5.2
109
+ signing_key:
110
+ specification_version: 4
111
+ summary: Yet another ruby html-aware truncation routine
112
+ test_files: []