panchira 1.3.0 → 1.3.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 9f1b155399ed2e9caef218bd5599d72c3d6102c5b7e0da50fc297fe4db99614c
-   data.tar.gz: 92e31687ac7d5edbbd94df6eff606cabaa3dab891b57fda1a3b9a4af3815949c
+   metadata.gz: 8c708ae7d488bb68419b16ba734bca6e31c023913104453373269168780781f1
+   data.tar.gz: 7ed26206ae6a2c87c662dfef7f2b9817232142bbdb474160150f25348d8f5fc6
  SHA512:
-   metadata.gz: 8b8a5e274a776a2a384c4901b754fa14025c1f4a3b735708d5e692656b2462f1fbdbc8df63edde2600c154b440195e3c4183a42a1a9d9245668e35b0494ae41d
-   data.tar.gz: 727acf463987e381620e0a6d5200c22063df44d4d5fa4c3f086abf950ea7543f105fab7077446cc6cdfcc3beb9c90b8da4a617ddb5146bd3930c4ab91c67f4e0
+   metadata.gz: 5576bf4e990a86737e2f8a520c619a1028c4fa32d2f372e35e21e3a0ae914ebfdff03fc5ffcbe29d85c4332dfe2223b56af71a97065b365c575ae3ad97b24abb
+   data.tar.gz: 28b1c641f95a3cff9097032d4695d1eda1b9f759c4bc598778e6949bf63cf3d711a7526b5736ed7391fe17d30e71517676f6c7ae4a076375c917ed7d3f2afa8e
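The SHA256 and SHA512 entries in checksums.yaml are hex digests of the two archives packed inside the `.gem` file. A minimal sketch of checking one such digest with Ruby's standard `digest` library follows. The helper name `digest_matches?` and the sample payload are illustrative assumptions, not part of panchira or RubyGems.

```ruby
require 'digest'

# Hypothetical helper: compares the SHA256 hex digest of raw bytes with the
# hex string recorded in checksums.yaml (case-insensitively).
def digest_matches?(bytes, expected_hex)
  Digest::SHA256.hexdigest(bytes).casecmp?(expected_hex)
end

# Illustrative payload, not a real gem archive.
payload = 'hello'
expected = Digest::SHA256.hexdigest(payload)

puts digest_matches?(payload, expected)        # true
puts digest_matches?("#{payload}!", expected)  # false
```

Against a real gem, the bytes would come from something like `File.binread('metadata.gz')` after unpacking the `.gem` tar archive.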
@@ -16,7 +16,7 @@ on:
  jobs:
    test:

-     runs-on: ubuntu-latest
+     runs-on: ubuntu-18.04

      steps:
      - uses: actions/checkout@v2
data/CHANGELOG.md CHANGED
@@ -4,6 +4,29 @@ All notable changes to this project will be documented in this file.
  The format is based on [Keep a Changelog](http://keepachangelog.com/)
  and this project adheres to [Semantic Versioning](http://semver.org/).

+ ## 1.3.4 - 2021-07-26
+ ### Fixed
+ - Fixed an issue where Iwara Resolver failed when a description was present.
+
+ ## 1.3.3 - 2021-07-25
+ ### Added
+ - Added Support for Iwara.
+
+ ### Fixed
+ - Fixed an issue where DLsite Resolver was retrieving wrong tags.
+
+ ## 1.3.2 - 2021-05-23
+ ### Fixed
+ - Fixed an issue where Fanza Resolver was retrieving incorrect canonical URLs.
+ - Fixed an issue where Narou Resolver was retrieving wrong descriptions.
+
+ ### Changed
+ - Updated dependencies.
+
+ ## 1.3.1 - 2021-02-17
+ ### Added
+ - Added support for Fanza Video.
+
  ## 1.3.0 - 2021-02-06
  ### Added
  - Added support for multiple authors. PanchiraResult#authors now returns an array of authors.
@@ -67,6 +90,8 @@ and this project adheres to [Semantic Versioning](http://semver.org/).
  ### Added
  - Released Panchira gem. At this time we can parse only 5 websites.

+ [1.3.2]: https://github.com/nuita/panchira/releases/tag/v1.3.2
+ [1.3.1]: https://github.com/nuita/panchira/releases/tag/v1.3.1
  [1.3.0]: https://github.com/nuita/panchira/releases/tag/v1.3.0
  [1.2.0]: https://github.com/nuita/panchira/releases/tag/v1.2.0
  [1.1.0]: https://github.com/nuita/panchira/releases/tag/v1.1.0
data/Gemfile.lock CHANGED
@@ -1,41 +1,41 @@
  PATH
    remote: .
    specs:
-     panchira (1.3.0)
+     panchira (1.3.4)
        fastimage (~> 2.1.7)
        nokogiri (>= 1.10.9, < 1.12.0)

  GEM
    remote: https://rubygems.org/
    specs:
-     ast (2.4.1)
+     ast (2.4.2)
      fastimage (2.1.7)
-     minitest (5.14.2)
-     nokogiri (1.11.0-x86_64-darwin)
+     minitest (5.14.4)
+     nokogiri (1.11.7-x86_64-darwin)
        racc (~> 1.4)
      parallel (1.20.1)
-     parser (3.0.0.0)
+     parser (3.0.1.1)
        ast (~> 2.4.1)
      racc (1.5.2)
      rainbow (3.0.0)
      rake (12.3.3)
-     regexp_parser (2.0.3)
-     rexml (3.2.4)
-     rubocop (1.7.0)
+     regexp_parser (2.1.1)
+     rexml (3.2.5)
+     rubocop (1.15.0)
        parallel (~> 1.10)
-       parser (>= 2.7.1.5)
+       parser (>= 3.0.0.0)
        rainbow (>= 2.2.2, < 4.0)
        regexp_parser (>= 1.8, < 3.0)
        rexml
-       rubocop-ast (>= 1.2.0, < 2.0)
+       rubocop-ast (>= 1.5.0, < 2.0)
        ruby-progressbar (~> 1.7)
-       unicode-display_width (>= 1.4.0, < 2.0)
-     rubocop-ast (1.4.0)
-       parser (>= 2.7.1.5)
-     rubocop-minitest (0.10.2)
-       rubocop (>= 0.87, < 2.0)
+       unicode-display_width (>= 1.4.0, < 3.0)
+     rubocop-ast (1.5.0)
+       parser (>= 3.0.1.1)
+     rubocop-minitest (0.12.1)
+       rubocop (>= 0.90, < 2.0)
      ruby-progressbar (1.11.0)
-     unicode-display_width (1.7.0)
+     unicode-display_width (2.0.0)

  PLATFORMS
    ruby
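The lockfile changes pair each resolved version with the constraints that allow it. As a rough illustration, the constraint strings can be checked with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The helper name `satisfies?` is a hypothetical convenience for this sketch, and Bundler's actual resolution does far more than this.

```ruby
require 'rubygems'

# Hypothetical helper: true when `version` satisfies every constraint given.
def satisfies?(version, *constraints)
  Gem::Requirement.new(*constraints).satisfied_by?(Gem::Version.new(version))
end

# nokogiri 1.11.7 against panchira's own constraint
puts satisfies?('1.11.7', '>= 1.10.9', '< 1.12.0') # true

# unicode-display_width 2.0.0 against the old and new rubocop constraints
puts satisfies?('2.0.0', '>= 1.4.0', '< 2.0')      # false
puts satisfies?('2.0.0', '>= 1.4.0', '< 3.0')      # true
```

The second and third calls show why unicode-display_width could only move to 2.0.0 once rubocop's constraint was widened to `< 3.0`.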
data/README.md CHANGED
@@ -6,7 +6,7 @@

  Due to some legal or ethical issues, most hentai and NSFW platforms don't clarify their content on meta tags. As a result, most hentai platforms are rendered poorly on the card previews on social media.

- To solve this issue, Panchira is made to parse correct and uncensored metadata from such web platforms (at this time we cover **DLSite, Komiflo, Melonbooks, Nijie, Pixiv, Shousetsuka ni narou, Fanza and Twitter**).
+ To solve this issue, Panchira is made to parse correct and uncensored metadata from such web platforms (at this time we cover **DLSite, Komiflo, Melonbooks, Nijie, Pixiv, Shousetsuka ni narou, Fanza, Iwara and Twitter**).

  If you need card previews of hentai on your web application, but can't get them with simply parsing metatags, then it is time for Panchira.

@@ -39,7 +39,13 @@ module Panchira
      end

      def parse_tags
-       @page.css('.main_genre').children.children.map(&:text)
+       @page.css('table[id*="work_"] tr').each do |tr|
+         next unless tr.css('th').text =~ /ジャンル/
+
+         return tr.css('td a').map do |node|
+           node.text.strip
+         end
+       end
      end
    end
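The new `parse_tags` walks the table rows and returns the link texts from the first row whose header mentions ジャンル (genre). The control flow can be sketched without Nokogiri. The `Row` struct, the `genre_tags` name, and the sample rows are hypothetical stand-ins for the parsed table, not panchira code.

```ruby
# Hypothetical stand-in for one <tr>: header text plus the link texts in its <td>.
Row = Struct.new(:header, :links)

# Returns the stripped link texts of the first row whose header contains
# ジャンル, or an empty array when no such row exists.
def genre_tags(rows)
  row = rows.find { |r| r.header =~ /ジャンル/ }
  row ? row.links.map(&:strip) : []
end

rows = [
  Row.new('販売日', ['2021年07月25日']),
  Row.new('ジャンル', [' tag-a ', 'tag-b'])
]

p genre_tags(rows)         # ["tag-a", "tag-b"]
p genre_tags(rows.take(1)) # []
```

Using `find` here is a choice made for the sketch: it makes the no-match case explicit, whereas an `Array#each` that finishes without an early `return` evaluates to the array it iterated over.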
@@ -41,6 +41,12 @@ module Panchira

        private

+       # The canonical URL is sometimes set to the URL of another service (FANZA GAMES),
+       # so when it points to another service, just use the original URL for now.
+       def parse_canonical_url
+         @url
+       end
+
        def parse_circle
          @page.css('a.circleName__txt').first.text
        end
@@ -49,8 +55,24 @@ module Panchira
          @page.css('.genreTag__item').map { |t| t.text.strip }
        end
      end
+
+     class FanzaVideoResolver < FanzaResolver
+       URL_REGEXP = %r{www.dmm.co.jp/digital/}.freeze
+
+       private
+
+       def parse_title
+         # og:title is cut short by a character limit
+         @page.title.match(/(.+)- \S+ - FANZA動画/)[1]&.strip || super
+       end
+
+       def parse_image_url
+         super.sub(/(pr|ps).jpg$/, 'pl.jpg')
+       end
+     end
    end

    ::Panchira::Extensions.register(Panchira::Fanza::FanzaBookResolver)
    ::Panchira::Extensions.register(Panchira::Fanza::FanzaDoujinResolver)
+   ::Panchira::Extensions.register(Panchira::Fanza::FanzaVideoResolver)
  end
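The two string transformations in `FanzaVideoResolver` can be tried in isolation. The sketch below is not the gem's code: the helper names, the sample title, and the `pics.example.com` URL are assumptions, and it deliberately differs from the diff by escaping the dot, anchoring with `\z`, and using `String#[]` so that a non-matching title falls through to the fallback (`String#match` returns `nil` when nothing matches).

```ruby
TITLE_PATTERN = /(.+)- \S+ - FANZA動画/.freeze

# Extracts the work title from a page title, or returns `fallback` when the
# pattern does not match. String#[] with a regexp returns nil on no match.
def video_title(page_title, fallback)
  page_title[TITLE_PATTERN, 1]&.strip || fallback
end

# Swaps a trailing "pr.jpg" or "ps.jpg" suffix for "pl.jpg".
def large_image_url(url)
  url.sub(/(pr|ps)\.jpg\z/, 'pl.jpg')
end

puts video_title('Sample Work - Studio - FANZA動画', 'og title') # Sample Work
puts video_title('Unrelated page', 'og title')                   # og title
puts large_image_url('https://pics.example.com/abc123ps.jpg')    # https://pics.example.com/abc123pl.jpg
```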
@@ -0,0 +1,39 @@
+ # frozen_string_literal: true
+
+ module Panchira
+   class IwaraResolver < Resolver
+     URL_REGEXP = /(www|ecchi)\.iwara\.tv\//.freeze
+
+     private
+     def parse_title
+       super.split(' | ')[0]
+     end
+
+     def parse_image_url
+       url = @page.at_css('#video-player')&.attributes['poster']&.value
+       'https:' + url if url
+     end
+
+     def parse_author
+       @page.at_css('.node-info .username')&.children&.[](0)&.text
+     end
+
+     def parse_tags
+       @page.css('.field-name-field-categories .field-item').map { |e| e.children&.text }
+     end
+
+     def parse_description
+       @page.at_css('.field-name-body')&.text
+     end
+
+     def cookie
+       'show_adult=1'
+     end
+
+     def parse_canonical_url
+       @url # canonical has relative path. ignore it
+     end
+   end
+
+   ::Panchira::Extensions.register(Panchira::IwaraResolver)
+ end
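Three small pieces of the Iwara resolver are plain string handling and can be sketched without fetching a page: the URL match, the title trimming, and the poster URL fix-up. The helper names and sample values below are hypothetical, and the assumption that the `poster` attribute is protocol-relative (`//host/path`) is inferred only from the `'https:' + url` line in the diff.

```ruby
IWARA_URL = %r{(www|ecchi)\.iwara\.tv/}.freeze

# True when the URL belongs to one of the two hosts the resolver targets.
def iwara_url?(url)
  IWARA_URL.match?(url)
end

# Keeps only the part of the page title before the first " | " separator.
def strip_site_suffix(title)
  title.split(' | ')[0]
end

# Turns a protocol-relative poster URL ("//host/path") into an https URL.
# Returns nil when no poster is present.
def absolute_poster_url(poster)
  "https:#{poster}" if poster
end

puts iwara_url?('https://ecchi.iwara.tv/videos/abc')   # true
puts iwara_url?('https://example.com/videos/abc')      # false
puts strip_site_suffix('Some Video | Iwara')           # Some Video
puts absolute_poster_url('//i.example.com/poster.jpg') # https://i.example.com/poster.jpg
```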
@@ -25,6 +25,10 @@ module Panchira
          Nokogiri::HTML.parse(res.body, uri)
        end

+       def parse_description
+         @desc&.xpath('//*[@id="noveltable1"]/tr/td')&.first&.text&.strip
+       end
+
        def parse_author
          @desc&.xpath('//*[@id="noveltable1"]/tr[2]/td')&.text&.strip
        end
@@ -33,6 +37,11 @@ module Panchira
          # This is painful.
          @desc&.xpath('//*[@id="noveltable1"]/tr[3]')&.text&.split("\n\n\n")&.dig(1)&.split(' ')
        end
+
+       # Accessing the ncode.syosetu.com/~~~ URL given in og:url gets sent back with a 301, so do nothing.
+       def parse_canonical_url
+         @url
+       end
      end

      class NcodeResolver < Resolver
@@ -47,6 +56,10 @@ module Panchira
          end
        end

+       def parse_description
+         @desc&.xpath('//*[@id="noveltable1"]/tr/td')&.first&.text&.strip
+       end
+
        def parse_author
          @desc&.xpath('//*[@id="noveltable1"]/tr[2]/td')&.text&.strip
        end
@@ -55,6 +68,11 @@ module Panchira
          # This is really painful.
          @desc&.xpath('//*[@id="noveltable1"]/tr[3]')&.text&.split("\n\n\n")&.dig(1)&.delete("\u00A0")&.split(' ')&.grep_v('')
        end
+
+       # Accessing the ncode.syosetu.com/~~~ URL given in og:url gets sent back with a 301, so do nothing.
+       def parse_canonical_url
+         @url
+       end
      end
    end
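The tag extraction in the Narou resolvers is a chain of string operations on the text of one table row. A minimal sketch of that chain on a made-up string follows. The `split_keywords` name, the sample text, and its layout (a label, a triple newline, then keywords) are assumptions for illustration, and the trailing `|| []` is an addition of the sketch rather than something the diff shows.

```ruby
# Splits the text of a keyword row into tags: takes the chunk after the first
# triple newline, drops non-breaking spaces, and splits on whitespace.
# Returns an empty array when the text is nil or has no such chunk.
def split_keywords(row_text)
  row_text&.split("\n\n\n")&.dig(1)&.delete("\u00A0")&.split(' ')&.grep_v('') || []
end

sample = "キーワード\n\n\nfantasy\u00A0 adventure  slow-life\n"

p split_keywords(sample)          # ["fantasy", "adventure", "slow-life"]
p split_keywords('no separators') # []
p split_keywords(nil)             # []
```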
60
78
 
@@ -67,17 +67,20 @@ module Panchira
        # fetch page and refresh canonical_url until canonical_url converges.
        loop do
          url_in_res = @page.css('//link[rel="canonical"]/@href').to_s
+         if url_in_res.empty?
+           url_in_res = @page.css('//meta[property="og:url"]/@content').to_s
+         end

          if url_in_res.empty?
            return history.last || @url
-         else
-           if history.include?(url_in_res) || history.length > 5
-             return url_in_res
-           else
-             history.push(url_in_res)
-             @page = fetch_page(url_in_res)
-           end
          end
+
+         if history.include?(url_in_res) || history.length > 5
+           return url_in_res
+         end
+
+         history.push(url_in_res)
+         @page = fetch_page(url_in_res)
        end
      end
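The loop above keeps following the canonical URL (or, failing that, `og:url`) until it sees a URL it has already visited, runs out of hints, or has followed more than five hops. A self-contained sketch of the same idea follows, with network access replaced by a hash. The `resolve_canonical_url` name, the `pages` structure, and the example URLs are all hypothetical stand-ins, not the gem's API.

```ruby
# pages: hypothetical in-memory stand-in for fetched pages, mapping a URL to
# the <link rel="canonical"> and og:url values found on that page.
def resolve_canonical_url(start_url, pages, max_hops: 5)
  history = []
  page = pages.fetch(start_url, {})

  loop do
    candidate = page[:canonical].to_s
    candidate = page[:og_url].to_s if candidate.empty?

    # No hint on this page: keep the last URL we followed, or the input URL.
    return history.last || start_url if candidate.empty?

    # Converged (already seen) or followed too many hops: stop here.
    return candidate if history.include?(candidate) || history.length > max_hops

    history.push(candidate)
    page = pages.fetch(candidate, {})
  end
end

pages = {
  'https://example.com/a?ref=1' => { canonical: 'https://example.com/a' },
  'https://example.com/a' => { canonical: 'https://example.com/a' },
  'https://example.com/b' => { og_url: 'https://example.com/b2' }
}

puts resolve_canonical_url('https://example.com/a?ref=1', pages) # https://example.com/a
puts resolve_canonical_url('https://example.com/b', pages)       # https://example.com/b2
puts resolve_canonical_url('https://example.com/c', pages)       # https://example.com/c
```

The `history` check is what stops two pages that point at each other from looping forever; the hop limit is a second guard for long chains.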
@@ -1,5 +1,5 @@
  # frozen_string_literal: true

  module Panchira
-   VERSION = '1.3.0'
+   VERSION = '1.3.4'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: panchira
  version: !ruby/object:Gem::Version
-   version: 1.3.0
+   version: 1.3.4
  platform: ruby
  authors:
  - kyp
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2021-02-06 00:00:00.000000000 Z
+ date: 2021-07-26 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -141,6 +141,7 @@ files:
  - lib/panchira/resolvers/dlsite_resolver.rb
  - lib/panchira/resolvers/fanza_resolver.rb
  - lib/panchira/resolvers/image_resolver.rb
+ - lib/panchira/resolvers/iwara_resolver.rb
  - lib/panchira/resolvers/komiflo_resolver.rb
  - lib/panchira/resolvers/melonbooks_resolver.rb
  - lib/panchira/resolvers/narou_resolver.rb