RubyGems - algolia_html_extractor - Versions diffs - 2.1.0 → 2.2.0 - Mend

algolia_html_extractor 2.1.0 → 2.2.0

Files changed (5)

checksums.yaml +4 -4
data/README.md +21 -11
data/lib/algolia_html_extractor.rb +3 -3
data/lib/version.rb +1 -1
metadata +2 -2

checksums.yaml CHANGED

@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz: 5a8828fb4ece535b803731889c5f8be758f06a7e
-  data.tar.gz: 7b8b80ce73ddefeaa901e5483b2b4822ad7f35a3
+  metadata.gz: bbf8df27c69c4d6f2f16de4bd7cf18fcd703fb43
+  data.tar.gz: a01708af7fe1a3c42d364a099e443ac05f6f8a75
 SHA512:
-  metadata.gz: 4e12dc7fc939f8d7551cc0f1637c394bd5794bc13521fc9c04dde418a238980956e74ebcede3e5317c39e0b652ea1e85c775dd8ac6be0dbd1f026005b915485b
-  data.tar.gz: a894a352c8efc2a4214c1c0ae4088d143ae1b831e86edb57ea6da3012cfe21a3fb22f230ec5337b0d240feb75a309bd155301140103500c1d6a5426c20b9bfcc
+  metadata.gz: 9d9d8af70a4310d871a96fd34a789de3ce0df0ba4621cf237727fcc514dbbfb9fd3d26a35ae3df6fd9b6574752e290d4254bdea7f1622cadba99a07a6a870adf
+  data.tar.gz: e74cc7ca6db7fddc84c903715a44c70df47fb27f303ee1635579b89f47269fab168e9933582fef73269ad0e24fdeae97caa5c1924c57a0553242c33407f7492c
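
The SHA512 entries above can be compared against the `metadata.gz` and `data.tar.gz` files unpacked from the `.gem` archive. A minimal sketch follows, assuming the files have already been extracted locally. The helper name `sha512_matches?` and the file path in the comment are hypothetical and not part of the gem or of RubyGems.

```ruby
require 'digest'

# Hypothetical helper: returns true when the SHA512 hex digest of `bytes`
# equals `expected_hex` (surrounding whitespace and letter case are ignored).
def sha512_matches?(bytes, expected_hex)
  Digest::SHA512.hexdigest(bytes) == expected_hex.strip.downcase
end

# Assumed usage, with a locally extracted file and the digest listed above:
#   sha512_matches?(File.binread('data.tar.gz'), 'e74cc7ca...')
```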

data/README.md CHANGED

@@ -1,5 +1,11 @@
 # algolia_html_extractor
+[![Gem Version][1]](http://badge.fury.io/rb/algolia_html_extractor)
+[![Build Status][2]](https://travis-ci.org/algolia/html-extractor)
+[![Coverage Status][3]](https://coveralls.io/github/algolia/html-extractor?branch=master)
+[![Code Climate][4]](https://codeclimate.com/github/algolia/html-extractor)
+![Ruby >= 2.3.0][5]
 This gem can convert HTML content into JSON records ready to be pushed to
 Algolia.
@@ -93,13 +99,13 @@ Each record has a `objectID` that uniquely identify it (computed by a hash of al
 the other values).
 It also contains the HTML tag name in `tag_name` (by default `<p>`
-paragraphs are extracted, but see the [settings][3] on how to change it).
+paragraphs are extracted, but see the [settings][6] on how to change it).
 `html` contains the whole `outerContent` of the element, including the wrapping
 tags and inner children. The `text` attribute contains the textual content,
 stripping out all HTML.
-`node` contains the [Nokogiri node][4] instance. The lib uses it internally to
+`node` contains the [Nokogiri node][7] instance. The lib uses it internally to
 extract all the relevant information but is also exposed if you want to process
 the node further.
@@ -109,7 +115,7 @@ Anchors are searched in `name` and `id` attributes of headings.
 `hierarchy` then contains a snapshot of the current heading hierarchy of the
 paragraph. The `lvlX` syntax is used to be compatible with the records
-[DocSearch][5] is using.
+[DocSearch][8] is using.
 The `weight` attribute is used to provide an easy way to rank two records
 relative to each other.
@@ -142,7 +148,7 @@ and generic bug reports.
 ## Bug Reports and feature requests
 For any bug or ideas of new features, please start by checking in the
-[issues](https://github.com/pixelastic/html-hierarchy-extractor/issues) tab if
+[issues][9] tab if
 it hasn't been discussed already. If not, feel free to open a new issue.
 ## Pull Requests
@@ -165,7 +171,7 @@ cp ./scripts/git_hooks/* ./.git/hooks
 This will add a `pre-commit` and `pre-push` scripts that will respectively check
 that all files are lint-free before committing, and pass all tests before
 pushing. If any of those two hooks give your errors, you should fix the code
-before commiting or pushing.
+before committing or pushing.
 Having those steps helps keeping the codebase clean as much as possible, and
 avoid polluting discussion in PR about style.
@@ -182,7 +188,7 @@ Rubocop, and the configuration can be found in `.rubocop.yml`.
 ## Test
-`rake test` will run all the tests.
+`rake test` will run all the tests.
 `rake coverage` will do the same, but also adding the code coverage files to
 `./coverage`. This should be useful in a CI environment.
@@ -210,8 +216,12 @@ This gem was previously named `html-hierarchy-extractor` but has been renamed to
 convention. That's also why this gem directly starts at v2.0.
-[1]: https://www.algolia.com/
-[2]: https://community.algolia.com/docsearch/
-[3]: #Settings
-[4]: http://www.rubydoc.info/github/sparklemotion/nokogiri/Nokogiri/XML/Node
-[5]: https://community.algolia.com/docsearch/
+[1]: https://badge.fury.io/rb/algolia_html_extractor.svg
+[2]: https://travis-ci.org/algolia/html-extractor.svg?branch=master
+[3]: https://coveralls.io/repos/algolia/html-extractor/badge.svg?branch=master&service=github
+[4]: https://codeclimate.com/github/algolia/html-extractor/badges/gpa.svg
+[5]: https://img.shields.io/badge/ruby-%3E%3D%202.3.0-green.svg
+[6]: #Settings
+[7]: http://www.rubydoc.info/github/sparklemotion/nokogiri/Nokogiri/XML/Node
+[8]: https://community.algolia.com/docsearch/
+[9]: https://github.com/pixelastic/html-hierarchy-extractor/issues

data/lib/algolia_html_extractor.rb CHANGED

@@ -118,12 +118,12 @@ class AlgoliaHTMLExtractor
       next unless node.matches?(@options[:css_selector])
       # Stop if node is empty
-      text = extract_text(node)
-      next if text.empty?
+      content = extract_text(node)
+      next if content.empty?
       item = {
         html: extract_html(node),
-        text: text,
+        content: content,
         tag_name: extract_tag_name(node),
         hierarchy: current_hierarchy.clone,
         anchor: current_anchor,
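
Judging from this hunk, the record key that holds the stripped text changes from `text` to `content`, while the README context lines in this diff still describe a `text` attribute. Code that reads `record[:text]` from 2.1.0 records would likely get `nil` from 2.2.0 records. A minimal sketch of a reader that accepts both shapes is below. The helper `record_text` is hypothetical, and the hashes are hand-written stand-ins for extractor output, not real results from the gem.

```ruby
# Hypothetical helper (not part of algolia_html_extractor): returns the
# stripped text of a record, preferring the 2.2.0 key (:content) and
# falling back to the 2.1.0 key (:text). Returns nil if neither is present.
def record_text(record)
  record.fetch(:content) { record[:text] }
end

# Hand-written hashes mimicking the record shape shown in the hunk above.
old_record = { html: '<p>Hello</p>', text: 'Hello', tag_name: 'p' }
new_record = { html: '<p>Hello</p>', content: 'Hello', tag_name: 'p' }

puts record_text(old_record)
puts record_text(new_record)
```

A fallback like this only helps while both versions are in use. Once every consumer runs 2.2.0, reading `:content` directly is simpler.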

data/lib/version.rb CHANGED

@@ -1,5 +1,5 @@
 # Expose gem version
 # rubocop:disable Style/SingleLineMethods
 class AlgoliaHTMLExtractorVersion
-  def self.to_s; '2.1.0' end
+  def self.to_s; '2.2.0' end
 end

metadata CHANGED

@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: algolia_html_extractor
 version: !ruby/object:Gem::Version
-  version: 2.1.0
+  version: 2.2.0
 platform: ruby
 authors:
 - Tim Carry
 autorequire:
 bindir: bin
 cert_chain: []
-date: 2017-11-10 00:00:00.000000000 Z
+date: 2017-12-19 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: awesome_print