micromicro 1.0.0 → 2.0.1

Files changed (42)
  1. checksums.yaml +4 -4
  2. data/CHANGELOG.md +33 -0
  3. data/CONTRIBUTING.md +2 -2
  4. data/README.md +21 -20
  5. data/lib/micro_micro/collectible.rb +2 -0
  6. data/lib/micro_micro/collections/base_collection.rb +7 -1
  7. data/lib/micro_micro/collections/items_collection.rb +3 -1
  8. data/lib/micro_micro/collections/properties_collection.rb +12 -0
  9. data/lib/micro_micro/collections/relationships_collection.rb +11 -10
  10. data/lib/micro_micro/document.rb +11 -99
  11. data/lib/micro_micro/helpers.rb +88 -0
  12. data/lib/micro_micro/implied_property.rb +2 -0
  13. data/lib/micro_micro/item.rb +57 -62
  14. data/lib/micro_micro/parsers/base_implied_property_parser.rb +29 -0
  15. data/lib/micro_micro/parsers/base_property_parser.rb +6 -14
  16. data/lib/micro_micro/parsers/date_time_parser.rb +60 -25
  17. data/lib/micro_micro/parsers/date_time_property_parser.rb +10 -9
  18. data/lib/micro_micro/parsers/embedded_markup_property_parser.rb +4 -3
  19. data/lib/micro_micro/parsers/implied_name_property_parser.rb +15 -17
  20. data/lib/micro_micro/parsers/implied_photo_property_parser.rb +21 -45
  21. data/lib/micro_micro/parsers/implied_url_property_parser.rb +12 -31
  22. data/lib/micro_micro/parsers/plain_text_property_parser.rb +4 -2
  23. data/lib/micro_micro/parsers/url_property_parser.rb +22 -14
  24. data/lib/micro_micro/parsers/value_class_pattern_parser.rb +29 -44
  25. data/lib/micro_micro/property.rb +68 -56
  26. data/lib/micro_micro/relationship.rb +15 -13
  27. data/lib/micro_micro/version.rb +3 -1
  28. data/lib/micromicro.rb +31 -26
  29. data/micromicro.gemspec +14 -9
  30. metadata +23 -32
  31. data/.editorconfig +0 -14
  32. data/.gitignore +0 -34
  33. data/.gitmodules +0 -3
  34. data/.reek.yml +0 -8
  35. data/.rspec +0 -2
  36. data/.rubocop +0 -3
  37. data/.rubocop.yml +0 -25
  38. data/.ruby-version +0 -1
  39. data/.simplecov +0 -13
  40. data/.travis.yml +0 -19
  41. data/Gemfile +0 -14
  42. data/Rakefile +0 -18
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 6c5c3cb23e1c8338fef3d7cca36c47e4dcc834f3819525c01bd7ce6dab314971
-   data.tar.gz: 27794eaac80a6701c7910f4e31200a4bc999d0b8082b3ca1d2dd5a87a0bdaa82
+   metadata.gz: d9c965d277e7c87c68de6ce7a12aff28da257d813573de2519bfe36943df1f4f
+   data.tar.gz: a62d02a8fe962b060c36a9c615445f586fe76ea1c570d71f05175d140d631d32
  SHA512:
-   metadata.gz: d59f245021fb8bec36e1e319cffce3c3a6168ea763952efc8759bda7cb7a7bf13420632cdcfd2b988d72524a348f5a1238ff206cfd3aa75bfb5a57621f4f3b8c
-   data.tar.gz: 9355a25a3b65fe72828abf95d4dcec2829e46772904487b030a906cd95708f4bc946570f507bae326df2a3e8066e3de1c332799977cf8db749269e095bba413b
+   metadata.gz: 8ee19e4ca072c36137746dd54e435405a9a749761c9b735d75ff9f97a5c28f8e5367f79c3a7ad45c83039eb8cb54f1043cac2dd7ddcd1bfd2d4fbbad0d218ad6
+   data.tar.gz: f5f88370b3fc92e2d467c8e358798c6266b5fb9dbfe50fb13e74b2235572e9adfbfeb4f171b1df3d5bf474cc549bac7adfe3fa7c2a59b6830d2bacc3cab3d500
data/CHANGELOG.md CHANGED
@@ -1,5 +1,38 @@
  # Changelog

+ ## 2.0.1 / 2022-08-20
+
+ - Use ruby/debug instead of pry-byebug (2965b2e)
+ - Update nokogiri-html-ext to v0.2.2 (921c486)
+ - Include root items with property class names (dd14212)
+
+ ## 2.0.0 / 2022-08-12
+
+ - Refactor implied property parsers (203fec9)
+ - Add `Helpers` module (caa1c02)
+ - New `PropertiesCollection` and `Property` instance methods (e9bb38b):
+   - `PropertiesCollection#plain_text_properties`
+   - `PropertiesCollection#url_properties`
+   - `Property#date_time_property?`
+   - `Property#embedded_markup_property?`
+   - `Property#plain_text_property?`
+   - `Property#url_property?`
+ - Remove Addressable (66c2bb4)
+ - Refactor classes to use nokogiri-html-ext (33fdf4a)
+ - Update activesupport (563bf56)
+ - **Breaking change:** Set minimum supported Ruby to 2.7 (ba17d05)
+ - Update development Ruby to 2.7.6 (ba17d05)
+ - Remove Reek (c1e76c5)
+ - Update runtime dependency version constraints (f83f26a)
+ - ~~**Breaking change:** Set minimum supported Ruby to 2.6~~ (fc588cd)
+ - ~~Update development Ruby to 2.6.10~~ (d05a2ac)
+
+ ## 1.1.0 / 2021-06-10
+
+ - Replace Absolutely dependency with Addressable (e93721b)
+ - Add support for Ruby 3.0 (d897c54)
+ - Update development Ruby version to 2.6.10 (051c9ad)
+
  ## 1.0.0 / 2020-11-08

  - Add `MicroMicro::Item#plain_text_properties` and `MicroMicro::Item#url_properties` methods (351e1f1)
data/CONTRIBUTING.md CHANGED
@@ -8,9 +8,9 @@ There are a couple ways you can help improve MicroMicro:

  ## Getting Started

- MicroMicro is developed using Ruby 2.5.8 and is additionally tested against Ruby 2.6 and 2.7 using [Travis CI](https://travis-ci.com/jgarber623/micromicro).
+ MicroMicro is developed using Ruby 2.7.6 and is additionally tested against Ruby 3.0 and 3.1 using [GitHub Actions](https://github.com/jgarber623/micromicro/actions).

- Before making changes to MicroMicro, you'll want to install Ruby 2.5.8. It's recommended that you use a Ruby version managment tool like [rbenv](https://github.com/rbenv/rbenv), [chruby](https://github.com/postmodern/chruby), or [rvm](https://github.com/rvm/rvm). Once you've installed Ruby 2.5.8 using your method of choice, install the project's gems by running:
+ Before making changes to MicroMicro, you'll want to install Ruby 2.7.6. It's recommended that you use a Ruby version managment tool like [rbenv](https://github.com/rbenv/rbenv), [chruby](https://github.com/postmodern/chruby), or [rvm](https://github.com/rvm/rvm). Once you've installed Ruby 2.7.6 using your method of choice, install the project's gems by running:

  ```sh
  bundle install
data/README.md CHANGED
@@ -1,42 +1,43 @@
  # MicroMicro

- **A Ruby gem for extracting [microformats2](http://microformats.org/wiki/microformats2)-encoded data from HTML documents.**
+ **A Ruby gem for extracting [microformats2](https://microformats.org/wiki/microformats2)-encoded data from HTML documents.**

- [![Build](https://img.shields.io/travis/com/jgarber623/micromicro/master.svg?style=for-the-badge)](https://travis-ci.com/jgarber623/micromicro)
- [![Dependencies](https://img.shields.io/depfu/jgarber623/micromicro.svg?style=for-the-badge)](https://depfu.com/github/jgarber623/micromicro)
- [![Maintainability](https://img.shields.io/codeclimate/maintainability/jgarber623/micromicro.svg?style=for-the-badge)](https://codeclimate.com/github/jgarber623/micromicro)
- [![Coverage](https://img.shields.io/codeclimate/c/jgarber623/micromicro.svg?style=for-the-badge)](https://codeclimate.com/github/jgarber623/micromicro/code)
+ [![Gem](https://img.shields.io/gem/v/micromicro.svg?logo=rubygems&style=for-the-badge)](https://rubygems.org/gems/micromicro)
+ [![Downloads](https://img.shields.io/gem/dt/micromicro.svg?logo=rubygems&style=for-the-badge)](https://rubygems.org/gems/micromicro)
+ [![Build](https://img.shields.io/github/workflow/status/jgarber623/micromicro/CI?logo=github&style=for-the-badge)](https://github.com/jgarber623/micromicro/actions/workflows/ci.yml)
+ [![Maintainability](https://img.shields.io/codeclimate/maintainability/jgarber623/micromicro.svg?logo=code-climate&style=for-the-badge)](https://codeclimate.com/github/jgarber623/micromicro)
+ [![Coverage](https://img.shields.io/codeclimate/c/jgarber623/micromicro.svg?logo=code-climate&style=for-the-badge)](https://codeclimate.com/github/jgarber623/micromicro/code)

  ## Key Features

- - Parses microformats2-encoded HTML documents according to the [microformats2 parsing specification](http://microformats.org/wiki/microformats2-parsing)
+ - Parses microformats2-encoded HTML documents according to the [microformats2 parsing specification](https://microformats.org/wiki/microformats2-parsing)
  - Passes all microformats2 tests from [the official test suite](https://github.com/microformats/tests)¹
- - Supports Ruby 2.5 and newer
+ - Supports Ruby 2.7 and newer

- **Note:** MicroMicro **does not** parse [Classic Microformats](http://microformats.org/wiki/Main_Page#Classic_Microformats) (referred to in [the parsing specification](http://microformats.org/wiki/microformats2-parsing#note_backward_compatibility_details) as "backcompat root classes" and "backcompat properties"). If parsing documents marked up in this fashion, consider using [the official microformats-ruby parser](https://github.com/microformats/microformats-ruby).
+ **Note:** MicroMicro **does not** parse [Classic Microformats](https://microformats.org/wiki/Main_Page#Classic_Microformats) (referred to in [the parsing specification](https://microformats.org/wiki/microformats2-parsing#note_backward_compatibility_details) as "backcompat root classes" and "backcompat properties" and in vocabulary specifications in the "Parser Compatibility" sections [e.g. [h-entry](https://microformats.org/wiki/h-entry#Parser_Compatibility)]). To parse documents marked up with Classic Microformats, consider using [the official microformats-ruby parser](https://github.com/microformats/microformats-ruby).

  <small>¹ …with some exceptions until [this pull request](https://github.com/microformats/tests/pull/112) is merged.</small>

  ## Getting Started

- Before installing and using MicroMicro, you'll want to have [Ruby](https://www.ruby-lang.org) 2.5 (or newer) installed. It's recommended that you use a Ruby version managment tool like [rbenv](https://github.com/rbenv/rbenv), [chruby](https://github.com/postmodern/chruby), or [rvm](https://github.com/rvm/rvm).
+ Before installing and using MicroMicro, you'll want to have [Ruby](https://www.ruby-lang.org) 2.7 (or newer) installed. It's recommended that you use a Ruby version managment tool like [rbenv](https://github.com/rbenv/rbenv), [chruby](https://github.com/postmodern/chruby), or [rvm](https://github.com/rvm/rvm).

- MicroMicro is developed using Ruby 2.5.8 and is additionally tested against Ruby 2.6 and 2.7 using [Travis CI](https://travis-ci.com/jgarber623/micromicro).
+ MicroMicro is developed using Ruby 2.7.6 and is additionally tested against Ruby 3.0 and 3.1 using [GitHub Actions](https://github.com/jgarber623/micromicro/actions).

  ## Installation

- If you're using [Bundler](https://bundler.io), add MicroMicro to your project's `Gemfile`:
+ If you're using [Bundler](https://bundler.io) to manage gem dependencies, add MicroMicro to your project's Gemfile:

  ```ruby
- source 'https://rubygems.org'
-
  gem 'micromicro'
  ```

- …and hop over to your command prompt and run…
+ …and run `bundle install` in your shell.
+
+ To install the gem manually, run the following in your shell:

  ```sh
- $ bundle install
+ gem install micromicro
  ```

  ## Usage
@@ -64,10 +65,10 @@ The `Hash` produced by calling `doc.to_h` may be converted to JSON (e.g. `doc.to
  Another example pulling the source HTML from [Tantek](https://tantek.com)'s website:

  ```ruby
- require "net/http"
- require "micromicro"
+ require 'net/http'
+ require 'micromicro'

- url = "https://tantek.com"
+ url = 'https://tantek.com'
  rsp = Net::HTTP.get(URI.parse(url))

  doc = MicroMicro.parse(rsp, url)
@@ -144,11 +145,11 @@ doc.relationships.find { |relationship| relationship.rels.include?('webmention')

  ## Contributing

- Interested in helping improve MicroMicro? Awesome! Your help is greatly appreciated. See [CONTRIBUTING.md](https://github.com/jgarber623/micromicro/blob/master/CONTRIBUTING.md) for details.
+ Interested in helping improve MicroMicro? Awesome! Your help is greatly appreciated. See [CONTRIBUTING.md](https://github.com/jgarber623/micromicro/blob/main/CONTRIBUTING.md) for details.

  ## Acknowledgments

- MicroMicro wouldn't exist without the hard work of everyone involved in the [microformats](http://microformats.org) community. Additionally, the comprehensive [microformats test suite](https://github.com/microformats/tests) was invaluable in the development of this Ruby gem.
+ MicroMicro wouldn't exist without the hard work of everyone involved in the [microformats](https://microformats.org) community. Additionally, the comprehensive [microformats test suite](https://github.com/microformats/tests) was invaluable in the development of this Ruby gem.

  MicroMicro is written and maintained by [Jason Garber](https://sixtwothree.org).

data/lib/micro_micro/collectible.rb CHANGED
@@ -1,3 +1,5 @@
+ # frozen_string_literal: true
+
  module MicroMicro
    module Collectible
      attr_accessor :collection
data/lib/micro_micro/collections/base_collection.rb CHANGED
@@ -1,3 +1,5 @@
+ # frozen_string_literal: true
+
  module MicroMicro
    module Collections
      class BaseCollection
@@ -12,10 +14,14 @@ module MicroMicro
          members.each { |member| push(member) }
        end

+       # :nocov:
        # @return [String]
        def inspect
-         format(%(#<#{self.class.name}:%#0x count: #{count}, members: #{members.inspect}>), object_id)
+         "#<#{self.class}:#{format('%#0x', object_id)} " \
+           "count: #{count}, " \
+           "members: #{members.inspect}>"
        end
+       # :nocov:

        # @param member [MicroMicro::Item, MicroMicro::Property, MicroMicro::Relationship]
        def push(member)
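
A note on the `inspect` rewrite in this hunk: in the old version the whole string, including the interpolated `members.inspect`, was the `format` template, so a literal `%` in any member would be read as a format directive. The new version passes only `object_id` through `format`. The changelog does not say whether this motivated the change; the sketch below uses a hypothetical `DemoCollection` class (not part of MicroMicro) simply to show how the two styles behave.

```ruby
# Hypothetical stand-in for a collection; not part of MicroMicro.
class DemoCollection
  def initialize(members)
    @members = members
  end

  # Old style: the entire string is the format template, so a "%" inside
  # the interpolated members is parsed as a format directive.
  def inspect_with_template
    format(%(#<#{self.class.name}:%#0x members: #{@members.inspect}>), object_id)
  end

  # New style: only the object ID is passed through format.
  def inspect_with_isolated_format
    "#<#{self.class}:#{format('%#0x', object_id)} " \
      "members: #{@members.inspect}>"
  end
end

risky = DemoCollection.new(['100%'])

# The isolated style tolerates a "%" in a member.
puts risky.inspect_with_isolated_format

# The template style raises because %" is not a valid directive.
begin
  risky.inspect_with_template
rescue ArgumentError => e
  puts "ArgumentError: #{e.message}"
end
```

For members without a `%`, both methods produce the same string.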
data/lib/micro_micro/collections/items_collection.rb CHANGED
@@ -1,3 +1,5 @@
+ # frozen_string_literal: true
+
  module MicroMicro
    module Collections
      class ItemsCollection < BaseCollection
@@ -8,7 +10,7 @@ module MicroMicro

        # @return [Array<String>]
        def types
-         @types ||= map(&:types).flatten.uniq.sort
+         @types ||= flat_map(&:types).uniq.sort
        end
      end
    end
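
The `types` change replaces `map(&:types).flatten` with `flat_map(&:types)`. For a flat list of strings per item the results are identical; the two only diverge when the mapped values are nested more than one level, because `flatten` recurses and `flat_map` does not. A minimal sketch, assuming a hypothetical `DemoItem` struct in place of `MicroMicro::Item`:

```ruby
# Hypothetical stand-in for a parsed item exposing a list of types.
DemoItem = Struct.new(:types)

ITEMS = [
  DemoItem.new(%w[h-card]),
  DemoItem.new(%w[h-entry h-card])
].freeze

# Old approach: build an array of arrays, then flatten it recursively.
def types_with_map_flatten(items)
  items.map(&:types).flatten.uniq.sort
end

# New approach: map and flatten exactly one level in a single pass.
def types_with_flat_map(items)
  items.flat_map(&:types).uniq.sort
end

p types_with_flat_map(ITEMS) # => ["h-card", "h-entry"]

# With deeper nesting the two approaches differ.
NESTED = [[1, [2, 3]], [4]].freeze
p NESTED.map { |value| value }.flatten  # => [1, 2, 3, 4]
p NESTED.flat_map { |value| value }     # => [1, [2, 3], 4]
```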
data/lib/micro_micro/collections/properties_collection.rb CHANGED
@@ -1,3 +1,5 @@
+ # frozen_string_literal: true
+
  module MicroMicro
    module Collections
      class PropertiesCollection < BaseCollection
@@ -6,11 +8,21 @@ module MicroMicro
          @names ||= map(&:name).uniq.sort
        end

+       # @return [MicroMicro::Collections::PropertiesCollection]
+       def plain_text_properties
+         self.class.new(select(&:plain_text_property?))
+       end
+
        # @return [Hash{Symbol => Array<String, Hash>}]
        def to_h
          group_by(&:name).symbolize_keys.deep_transform_values(&:value)
        end

+       # @return [MicroMicro::Collections::PropertiesCollection]
+       def url_properties
+         self.class.new(select(&:url_property?))
+       end
+
        # @return [Array<String, Hash>]
        def values
          @values ||= map(&:value).uniq
data/lib/micro_micro/collections/relationships_collection.rb CHANGED
@@ -1,26 +1,27 @@
+ # frozen_string_literal: true
+
  module MicroMicro
    module Collections
      class RelationshipsCollection < BaseCollection
-       # @see http://microformats.org/wiki/microformats2-parsing#parse_a_hyperlink_element_for_rel_microformats
-       #
-       # @return [Hash{Symbol => Hash{Symbol => Array, String}}]
-       def group_by_url
-         group_by(&:href).symbolize_keys.transform_values { |relationships| relationships.first.to_h.slice!(:href) }
-       end
-
-       # @see http://microformats.org/wiki/microformats2-parsing#parse_a_hyperlink_element_for_rel_microformats
+       # @see https://microformats.org/wiki/microformats2-parsing#parse_a_hyperlink_element_for_rel_microformats
        #
        # @return [Hash{Symbol => Array<String>}]
        def group_by_rel
-         # flat_map { |member| member.rels.map { |rel| [rel, member.href] } }.group_by(&:shift).symbolize_keys.transform_values(&:flatten).transform_values(&:uniq)
          each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |member, hash|
            member.rels.each { |rel| hash[rel] << member.href }
          end.symbolize_keys.transform_values(&:uniq)
        end

+       # @see https://microformats.org/wiki/microformats2-parsing#parse_a_hyperlink_element_for_rel_microformats
+       #
+       # @return [Hash{Symbol => Hash{Symbol => Array, String}}]
+       def group_by_url
+         group_by(&:href).symbolize_keys.transform_values { |relationships| relationships.first.to_h.slice!(:href) }
+       end
+
        # @return [Array<String>]
        def rels
-         @rels ||= map(&:rels).flatten.uniq.sort
+         @rels ||= flat_map(&:rels).uniq.sort
        end

        # @return [Array<String>]
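
The `group_by_rel` method kept in this file builds its result with `each_with_object` and a `Hash` whose default block creates an empty array per key. The sketch below reproduces that technique in plain Ruby. `DemoRelationship` and the example URLs are hypothetical, and `transform_keys(&:to_sym)` stands in for ActiveSupport's `symbolize_keys` so the snippet runs without extra gems.

```ruby
# Hypothetical stand-in for MicroMicro::Relationship with only the two
# readers that the grouping logic uses.
DemoRelationship = Struct.new(:href, :rels)

def group_by_rel(relationships)
  # The default block creates an empty array the first time a rel is seen.
  grouped = relationships.each_with_object(Hash.new { |hash, key| hash[key] = [] }) do |member, hash|
    member.rels.each { |rel| hash[rel] << member.href }
  end

  # Plain-Ruby substitute for ActiveSupport's `symbolize_keys`.
  grouped.transform_keys(&:to_sym).transform_values(&:uniq)
end

RELATIONSHIPS = [
  DemoRelationship.new('https://example.com/webmention', %w[webmention]),
  DemoRelationship.new('https://example.com/me', %w[me authn]),
  DemoRelationship.new('https://example.com/me', %w[me])
].freeze

p group_by_rel(RELATIONSHIPS)
```

Each rel maps to the unique URLs that carry it, so the duplicate `me` link appears only once.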
data/lib/micro_micro/document.rb CHANGED
@@ -1,29 +1,7 @@
+ # frozen_string_literal: true
+
  module MicroMicro
    class Document
-     # A map of HTML `srcset` attributes and their associated element names
-     #
-     # @see https://html.spec.whatwg.org/#srcset-attributes
-     # @see https://html.spec.whatwg.org/#attributes-3
-     HTML_IMAGE_CANDIDATE_STRINGS_ATTRIBUTES_MAP = {
-       'imagesrcset' => %w[link],
-       'srcset' => %w[img source]
-     }.freeze
-
-     # A map of HTML URL attributes and their associated element names
-     #
-     # @see https://html.spec.whatwg.org/#attributes-3
-     HTML_URL_ATTRIBUTES_MAP = {
-       'action' => %w[form],
-       'cite' => %w[blockquote del ins q],
-       'data' => %w[object],
-       'formaction' => %w[button input],
-       'href' => %w[a area base link],
-       'manifest' => %w[html],
-       'ping' => %w[a area],
-       'poster' => %w[video],
-       'src' => %w[audio embed iframe img input script source track video]
-     }.freeze
-
      # Parse a string of HTML for microformats2-encoded data.
      #
      # MicroMicro::Document.new('<a href="/" class="h-card" rel="me">Jason Garber</a>', 'https://sixtwothree.org')
@@ -38,22 +16,23 @@ module MicroMicro
      # @param markup [String] The HTML to parse for microformats2-encoded data.
      # @param base_url [String] The URL associated with markup. Used for relative URL resolution.
      def initialize(markup, base_url)
-       @markup = markup
-       @base_url = base_url
-
-       resolve_relative_urls
+       @document = Nokogiri::HTML(markup, base_url).resolve_relative_urls!
      end

+     # :nocov:
      # @return [String]
      def inspect
-       format(%(#<#{self.class.name}:%#0x items: #{items.inspect}, relationships: #{relationships.inspect}>), object_id)
+       "#<#{self.class}:#{format('%#0x', object_id)} " \
+         "items: #{items.inspect}, " \
+         "relationships: #{relationships.inspect}>"
      end
+     # :nocov:

      # A collection of items parsed from the provided markup.
      #
      # @return [MicroMicro::Collections::ItemsCollection]
      def items
-       @items ||= Collections::ItemsCollection.new(Item.items_from(document))
+       @items ||= Collections::ItemsCollection.new(Item.from_context(document.element_children))
      end

      # A collection of relationships parsed from the provided markup.
@@ -65,7 +44,7 @@ module MicroMicro

      # Return the parsed document as a Hash.
      #
-     # @see http://microformats.org/wiki/microformats2-parsing#parse_a_document_for_microformats
+     # @see https://microformats.org/wiki/microformats2-parsing#parse_a_document_for_microformats
      #
      # @return [Hash{Symbol => Array, Hash}]
      def to_h
@@ -76,76 +55,9 @@ module MicroMicro
        }
      end

-     # Ignore this node?
-     #
-     # @param node [Nokogiri::XML::Element]
-     # @return [Boolean]
-     def self.ignore_node?(node)
-       ignored_node_names.include?(node.name)
-     end
-
-     # A list of HTML element names the parser should ignore.
-     #
-     # @return [Array<String>]
-     def self.ignored_node_names
-       %w[script style template]
-     end
-
-     # @see http://microformats.org/wiki/microformats2-parsing#parse_an_element_for_properties
-     # @see http://microformats.org/wiki/microformats2-parsing#parsing_for_implied_properties
-     #
-     # @param context [Nokogiri::HTML::Document, Nokogiri::XML::NodeSet, Nokogiri::XML::Element]
-     # @yield [context]
-     # @return [String]
-     def self.text_content_from(context)
-       context.css(*ignored_node_names).unlink
-
-       yield(context) if block_given?
-
-       context.text.strip
-     end
-
      private

-     attr_reader :base_url, :markup
-
-     # @return [Nokogiri::XML::Element, nil]
-     def base_element
-       @base_element ||= Nokogiri::HTML(markup).at('//base[@href]')
-     end
-
      # @return [Nokogiri::HTML::Document]
-     def document
-       @document ||= Nokogiri::HTML(markup, resolved_base_url)
-     end
-
-     def resolve_relative_urls
-       HTML_URL_ATTRIBUTES_MAP.each do |attribute, names|
-         document.xpath(*names.map { |name| "//#{name}[@#{attribute}]" }).each do |node|
-           node[attribute] = Absolutely.to_abs(base: resolved_base_url, relative: node[attribute].strip)
-         end
-       end
-
-       HTML_IMAGE_CANDIDATE_STRINGS_ATTRIBUTES_MAP.each do |attribute, names|
-         document.xpath(*names.map { |name| "//#{name}[@#{attribute}]" }).each do |node|
-           candidates = node[attribute].split(',').map(&:strip).map { |candidate| candidate.match(/^(?<url>.+?)(?<descriptor>\s+.+)?$/) }
-
-           node[attribute] = candidates.map { |candidate| "#{Absolutely.to_abs(base: resolved_base_url, relative: candidate[:url])}#{candidate[:descriptor]}" }.join(', ')
-         end
-       end
-
-       self
-     end
-
-     # @return [String]
-     def resolved_base_url
-       @resolved_base_url ||= begin
-         if base_element
-           Absolutely.to_abs(base: base_url, relative: base_element['href'].strip)
-         else
-           base_url
-         end
-       end
-     end
+     attr_reader :document
    end
  end
data/lib/micro_micro/helpers.rb ADDED
@@ -0,0 +1,88 @@
+ # frozen_string_literal: true
+
+ module MicroMicro
+   module Helpers
+     IGNORED_NODE_NAMES = %w[script style template].freeze
+
+     # @param node [Nokogiri::XML::Element]
+     # @param attributes_map [Hash{String => Array}]
+     # @return [String, nil]
+     def self.attribute_value_from(node, attributes_map)
+       attributes_map.filter_map do |attribute, names|
+         node[attribute] if names.include?(node.name) && node[attribute]
+       end.first
+     end
+
+     # @param node [Nokogiri::XML::Element]
+     # @return [Boolean]
+     def self.ignore_node?(node)
+       IGNORED_NODE_NAMES.include?(node.name)
+     end
+
+     # @param nodes [Nokogiri::XML::NodeSet]
+     # @return [Boolean]
+     def self.ignore_nodes?(nodes)
+       (nodes.map(&:name) & IGNORED_NODE_NAMES).any?
+     end
+
+     # @param node [Nokogiri::XML::Element]
+     # @return [Boolean]
+     def self.item_node?(node)
+       root_class_names_from(node).any?
+     end
+
+     # @param nodes [Nokogiri::XML::NodeSet]
+     # @return [Boolean]
+     def self.item_nodes?(nodes)
+       nodes.filter_map { |node| item_node?(node) }.any?
+     end
+
+     # @param node [Nokogiri::XML::Element]
+     # @return [Array<String>]
+     def self.property_class_names_from(node)
+       node.classes.grep(/^(?:dt|e|p|u)(?:-[0-9a-z]+)?(?:-[a-z]+)+$/).uniq
+     end
+
+     # @param node [Nokogiri::XML::Element]
+     # @return [Boolean]
+     def self.property_node?(node)
+       property_class_names_from(node).any?
+     end
+
+     # @param node [Nokogiri::XML::Element]
+     # @return [Array<String>]
+     def self.root_class_names_from(node)
+       node.classes.grep(/^h(?:-[0-9a-z]+)?(?:-[a-z]+)+$/).uniq.sort
+     end
+
+     # @see https://microformats.org/wiki/microformats2-parsing#parse_an_element_for_properties
+     # @see https://microformats.org/wiki/microformats2-parsing#parsing_for_implied_properties
+     #
+     # @param context [Nokogiri::HTML::Document, Nokogiri::XML::NodeSet, Nokogiri::XML::Element]
+     # @yield [context]
+     # @return [String]
+     def self.text_content_from(context)
+       context.css(*IGNORED_NODE_NAMES).unlink
+
+       yield(context) if block_given?
+
+       context.text.strip
+     end
+
+     # @see https://microformats.org/wiki/value-class-pattern#Basic_Parsing
+     #
+     # @param node [Nokogiri::XML::Element]
+     # @return [Boolean]
+     def self.value_class_node?(node)
+       node.classes.include?('value')
+     end
+
+     # @see https://microformats.org/wiki/value-class-pattern#Parsing_value_from_a_title_attribute
+     #
+     # @param node [Nokogiri::XML::Element]
+     # @return [Boolean]
+     def self.value_title_node?(node)
+       node.classes.include?('value-title')
+     end
+   end
+ end
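
The two class-name patterns in `Helpers` decide which elements count as microformats2 roots and which as properties. The sketch below copies those regular expressions verbatim and applies them to a plain array of class names rather than a Nokogiri node. The method names `property_class_names` and `root_class_names` and the sample class list are hypothetical, chosen only to keep the example free of gem dependencies.

```ruby
# Patterns copied from Helpers.property_class_names_from and
# Helpers.root_class_names_from in the diff.
PROPERTY_CLASS_PATTERN = /^(?:dt|e|p|u)(?:-[0-9a-z]+)?(?:-[a-z]+)+$/
ROOT_CLASS_PATTERN = /^h(?:-[0-9a-z]+)?(?:-[a-z]+)+$/

# Hypothetical helpers taking an array of class names instead of a node.
def property_class_names(classes)
  classes.grep(PROPERTY_CLASS_PATTERN).uniq
end

def root_class_names(classes)
  classes.grep(ROOT_CLASS_PATTERN).uniq.sort
end

CLASSES = %w[h-entry h-card p-name u-url dt-published e-content p-name P-Name p- author].freeze

# Duplicates are dropped; uppercase, empty-suffix, and unprefixed names are ignored.
p property_class_names(CLASSES) # => ["p-name", "u-url", "dt-published", "e-content"]

# Root class names are additionally sorted.
p root_class_names(CLASSES) # => ["h-card", "h-entry"]
```

The optional middle group `(?:-[0-9a-z]+)?` admits a vendor-style prefix segment that may contain digits, while the final segments must be lowercase letters only.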
data/lib/micro_micro/implied_property.rb CHANGED
@@ -1,3 +1,5 @@
+ # frozen_string_literal: true
+
  module MicroMicro
    class ImpliedProperty < Property
      IMPLIED_PROPERTY_PARSERS_MAP = {