opengraph_parser 0.2.3 → 0.2.4

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: df0c81f7b80017bff4c885d2c414eef59fabf51270ed39deb4c865ad26be55ac
+   data.tar.gz: '02908954486292679e02fe035aedce81fe1575d49ad8d09419bdcdaf43dceb3e'
+ SHA512:
+   metadata.gz: aeff1818cd446174ccd6737c11faa59296184186480f373a43967cab9a36a89412cca711a81d1ed6572b20bea2b5b5909b6bcd2614e2047128a2684827135e10
+   data.tar.gz: 196240064499563538c518f454ac15d8498003bae0a732f13e09e109ac0e5037bb3ce5431c47858ea1e797a0622d04ba51c52df3f9f8eab84bd69a7f1074a61f
data/README.md ADDED
@@ -0,0 +1,74 @@
+ # OpengraphParser
+
+ OpengraphParser is a simple Ruby library for parsing Open Graph protocol information from a website. Learn more about the protocol at:
+ http://ogp.me
+
+ ## Installation
+
+ ```bash
+ gem install opengraph_parser
+ ```
+
+ or add to Gemfile
+
+ ```bash
+ gem "opengraph_parser"
+ ```
+
+ ## Usage
+
+ ### Parsing an URL
+
+ ```ruby
+ og = OpenGraph.new("http://ogp.me")
+ og.title # => "Open Graph protocol"
+ og.type # => "website"
+ og.url # => "http://ogp.me/"
+ og.description # => "The Open Graph protocol enables any web page to become a rich object in a social graph."
+ og.images # => ["http://ogp.me/logo.png"]
+ ```
+
+ You can also get other Open Graph metadata as:
+
+ ```ruby
+ og.metadata # => {"og:image:type"=>"image/png", "og:image:width"=>"300", "og:image:height"=>"300"}
+ ```
+
+ ### Parsing a HTML document
+
+ ```ruby
+ og = OpenGraph.new(html_string)
+ ```
+
+ ### Custom header fields
+ In some cases you may need to change fields in HTTP request header for an URL
+ ```ruby
+ og = OpenGraph.new("http://opg.me", { :headers => {'User-Agent' => 'Custom User Agent'} })
+ ```
+
+ ### Fallback
+ If you try to parse Open Graph information for a website that doesn’t have any Open Graph metadata, the library will try to find other information in the website as the following rules:
+
+ * `<title>` for title
+ * `<meta name="description">` for description
+ * `<link rel="image_src">` or all `<img>` tags for images
+
+ You can disable this fallback lookup by passing false to init method:
+
+ ```ruby
+ og = OpenGraph.new("http://ogp.me", false)
+ ```
+
+ ## Contributing to opengraph_parser
+
+ * Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
+ * Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it.
+ * Fork the project.
+ * Start a feature/bugfix branch.
+ * Commit and push until you are happy with your contribution.
+ * Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
+ * Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.
+
+ ## Copyright
+
+ Copyright (c) 2013 Huy Ha. See LICENSE.txt for further details.
data/lib/open_graph.rb CHANGED
@@ -4,7 +4,7 @@ require "addressable/uri"
  require 'uri'

  class OpenGraph
-   attr_accessor :src, :url, :type, :title, :description, :images, :metadata, :response, :original_images
+   attr_accessor :src, :url, :type, :title, :description, :images, :metadata, :response, :original_images, :html_content

    def initialize(src, fallback = true, options = {})
      if fallback.is_a? Hash
@@ -25,8 +25,10 @@ class OpenGraph
      begin
        if @src.include? '</html>'
          @body = @src
+         @html_content = true
        else
          @body = RedirectFollower.new(@src, options).resolve.body
+         @html_content = false
        end
      rescue
        @title = @url = @src
@@ -66,6 +68,10 @@ class OpenGraph
        @description = description_meta.attribute("content").to_s.strip
      end

+     if @description.to_s.empty?
+       @description = fetch_first_text(doc)
+     end
+
      fetch_images(doc, "//head//link[@rel='image_src']", "href") if @images.empty?
      fetch_images(doc, "//img", "src") if @images.empty?
    end
@@ -73,7 +79,11 @@ class OpenGraph

    def check_images_path
      @original_images = @images.dup
-     uri = Addressable::URI.parse(@src)
+
+     uri = Addressable::URI.parse(@url || @src)
+
+     return unless uri
+
      imgs = @images.dup
      @images = []
      imgs.each do |img|
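
The `check_images_path` hunk above switches the base for resolving image paths from `@src` to `@url || @src` and returns early when no base URI is available. A minimal sketch of the same idea follows. It uses Ruby's standard `uri` library rather than Addressable so that it runs without extra gems, and `absolutize_images` is a hypothetical helper name, not part of the gem:

```ruby
require "uri"

# Hypothetical helper (not part of opengraph_parser): resolve each image path
# against a base URL, preferring the canonical URL over the original source.
def absolutize_images(images, url, src)
  base = url || src
  # Nothing to resolve against: hand the paths back unchanged.
  return images if base.nil? || base.empty?

  images.map do |img|
    begin
      URI.join(base, img).to_s
    rescue URI::Error
      img # leave paths that cannot be resolved untouched
    end
  end
end

puts absolutize_images(["/logo.png", "http://cdn.example.com/a.png"], "http://ogp.me/", nil)
```

Preferring the canonical URL matters when `src` is a raw HTML string: the document itself is not a usable base, while an `og:url` value found in it may be.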
@@ -96,6 +106,13 @@ class OpenGraph
      end
    end

+   def fetch_first_text(doc)
+     doc.xpath('//p').each do |p|
+       s = p.text.to_s.strip
+       return s if s.length > 20
+     end
+   end
+
    def add_metadata(metadata_container, path, content)
      path_elements = path.split(':')
      if path_elements.size > 1
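
The new `fetch_first_text` fallback above picks the first `<p>` whose stripped text is longer than 20 characters. One thing worth checking in review: when no paragraph qualifies, the method falls through to the return value of `each`, which is likely the node set itself rather than `nil`. The sketch below shows a variant that returns `nil` in that case. It works on plain strings instead of a Nokogiri document so that it runs standalone, and `first_long_text` is a hypothetical name, not part of the gem:

```ruby
# Hypothetical stand-in for fetch_first_text that takes paragraph texts as
# plain strings. Returns the first stripped text longer than min_length
# characters, or nil when none qualifies.
def first_long_text(paragraphs, min_length = 20)
  paragraphs.map { |text| text.to_s.strip }
            .find { |text| text.length > min_length }
end

puts first_long_text(["  Menu ", "This paragraph is long enough to use."])
```

With Nokogiri, the equivalent input would be something like `doc.xpath('//p').map(&:text)`.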
@@ -117,4 +134,4 @@ class OpenGraph
        metadata_container
      end
    end
- end
+ end
data/lib/redirect_follower.rb CHANGED
@@ -6,19 +6,16 @@ class RedirectFollower

    attr_accessor :url, :body, :redirect_limit, :response, :headers

-   def initialize(url, limit = REDIRECT_DEFAULT_LIMIT, options = {})
-     if limit.is_a? Hash
-       options = limit
-       limit = REDIRECT_DEFAULT_LIMIT
-     end
-     @url, @redirect_limit = url, limit
+   def initialize(url, options = {})
+     @url = url
+     @redirect_limit = options[:redirect_limit] || REDIRECT_DEFAULT_LIMIT
      @headers = options[:headers] || {}
    end

    def resolve
      raise TooManyRedirects if redirect_limit < 0

-     uri = URI.parse(URI.escape(url))
+     uri = Addressable::URI.parse(url)

      http = Net::HTTP.new(uri.host, uri.port)
      if uri.scheme == 'https'
21
  if uri.scheme == 'https'
@@ -45,4 +42,4 @@ class RedirectFollower
        response['location']
      end
    end
- end
+ end
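
The `RedirectFollower` constructor change above drops the positional `limit` argument in favor of a single options hash with `:redirect_limit` and `:headers` keys. A minimal sketch of that pattern follows. `FollowerOptions` is a hypothetical, simplified class used only for illustration, and the default limit of 5 is an assumption, since the constant's value is not visible in this diff:

```ruby
# Hypothetical, simplified stand-in for RedirectFollower's constructor.
class FollowerOptions
  REDIRECT_DEFAULT_LIMIT = 5 # assumed value, not shown in the diff

  attr_reader :url, :redirect_limit, :headers

  def initialize(url, options = {})
    @url = url
    # Fall back to defaults when a key is missing from the options hash.
    @redirect_limit = options[:redirect_limit] || REDIRECT_DEFAULT_LIMIT
    @headers = options[:headers] || {}
  end
end

follower = FollowerOptions.new("http://ogp.me", { :redirect_limit => 2 })
puts follower.redirect_limit
```

Callers that previously passed the limit as the second positional argument would need to move it into the hash.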
metadata CHANGED
@@ -1,83 +1,100 @@
  --- !ruby/object:Gem::Specification
  name: opengraph_parser
  version: !ruby/object:Gem::Version
-   version: 0.2.3
- prerelease:
+   version: 0.2.4
  platform: ruby
  authors:
  - Huy Ha
  - Duc Trinh
- autorequire:
+ autorequire:
  bindir: bin
  cert_chain: []
- date: 2013-05-23 00:00:00.000000000 Z
+ date: 2021-12-23 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: nokogiri
-   requirement: &70365561662080 !ruby/object:Gem::Requirement
-     none: false
+   requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - ! '>='
+     - - ">="
        - !ruby/object:Gem::Version
          version: '0'
    type: :runtime
    prerelease: false
-   version_requirements: *70365561662080
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  - !ruby/object:Gem::Dependency
    name: addressable
-   requirement: &70365561661360 !ruby/object:Gem::Requirement
-     none: false
+   requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - ! '>='
+     - - ">="
        - !ruby/object:Gem::Version
          version: '0'
    type: :runtime
    prerelease: false
-   version_requirements: *70365561661360
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  - !ruby/object:Gem::Dependency
    name: rspec
-   requirement: &70365561660600 !ruby/object:Gem::Requirement
-     none: false
+   requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - ! '>='
+     - - ">="
        - !ruby/object:Gem::Version
          version: '0'
    type: :development
    prerelease: false
-   version_requirements: *70365561660600
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  - !ruby/object:Gem::Dependency
    name: rdoc
-   requirement: &70365561659860 !ruby/object:Gem::Requirement
-     none: false
+   requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - ! '>='
+     - - ">="
        - !ruby/object:Gem::Version
          version: '0'
    type: :development
    prerelease: false
-   version_requirements: *70365561659860
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  - !ruby/object:Gem::Dependency
    name: bundler
-   requirement: &70365561659200 !ruby/object:Gem::Requirement
-     none: false
+   requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - ! '>='
+     - - ">="
        - !ruby/object:Gem::Version
          version: '0'
    type: :development
    prerelease: false
-   version_requirements: *70365561659200
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  - !ruby/object:Gem::Dependency
    name: jeweler
-   requirement: &70365561658400 !ruby/object:Gem::Requirement
-     none: false
+   requirement: !ruby/object:Gem::Requirement
      requirements:
-     - - ! '>='
+     - - ">="
        - !ruby/object:Gem::Version
          version: '0'
    type: :development
    prerelease: false
-   version_requirements: *70365561658400
+   version_requirements: !ruby/object:Gem::Requirement
+     requirements:
+     - - ">="
+       - !ruby/object:Gem::Version
+         version: '0'
  description: A simple Ruby library for parsing Open Graph Protocol information from
    a website. It also includes a fallback solution when the website has no Open Graph
    information.
@@ -86,39 +103,34 @@ executables: []
  extensions: []
  extra_rdoc_files:
  - LICENSE.txt
- - README.rdoc
+ - README.md
  files:
+ - LICENSE.txt
+ - README.md
  - lib/open_graph.rb
  - lib/opengraph_parser.rb
  - lib/redirect_follower.rb
- - LICENSE.txt
- - README.rdoc
  homepage: http://github.com/huyha85/opengraph_parser
  licenses:
  - MIT
- post_install_message:
+ metadata: {}
+ post_install_message:
  rdoc_options: []
  require_paths:
  - lib
  required_ruby_version: !ruby/object:Gem::Requirement
-   none: false
    requirements:
-   - - ! '>='
+   - - ">="
      - !ruby/object:Gem::Version
        version: '0'
-       segments:
-       - 0
-       hash: 1017772970234203575
  required_rubygems_version: !ruby/object:Gem::Requirement
-   none: false
    requirements:
-   - - ! '>='
+   - - ">="
      - !ruby/object:Gem::Version
        version: '0'
  requirements: []
- rubyforge_project:
- rubygems_version: 1.8.10
- signing_key:
+ rubygems_version: 3.0.3
+ signing_key:
  specification_version: 3
  summary: A simple Ruby library for parsing Open Graph Protocol information from a
    website.
data/README.rdoc DELETED
@@ -1,45 +0,0 @@
- = OpengraphParser
-
- OpengraphParser is a simple Ruby library for parsing Open Graph protocol information from a web site. Learn more about the protocol at:
- http://ogp.me
-
- == Installation
- gem install opengraph_parser
-
- or add to Gemfile
-
- gem "opengraph_parser"
-
- == Usage
- og = OpenGraph.new("http://ogp.me")
- og.title # => "Open Graph protocol"
- og.type # => "website"
- og.url # => "http://ogp.me/"
- og.description # => "The Open Graph protocol enables any web page to become a rich object in a social graph."
- og.images # => ["http://ogp.me/logo.png"]
-
- You can also get other Open Graph metadata as:
- og.metadata # => {"og:image:type"=>"image/png", "og:image:width"=>"300", "og:image:height"=>"300"}
-
- If you try to parse Open Graph information for a website that doesn’t have any Open Graph metadata, the library will try to find other information in the website as the following rules:
- <title> for title
- <meta name="description"> for description
- <link rel="image_src"> or all <img> tags for images
-
- You can disable this fallback lookup by passing false to init method:
- og = OpenGraph.new("http://ogp.me", false)
-
- == Contributing to opengraph_parser
-
- * Check out the latest master to make sure the feature hasn't been implemented or the bug hasn't been fixed yet.
- * Check out the issue tracker to make sure someone already hasn't requested it and/or contributed it.
- * Fork the project.
- * Start a feature/bugfix branch.
- * Commit and push until you are happy with your contribution.
- * Make sure to add tests for it. This is important so I don't break it in a future version unintentionally.
- * Please try not to mess with the Rakefile, version, or history. If you want to have your own version, or is otherwise necessary, that is fine, but please isolate to its own commit so I can cherry-pick around it.
-
- == Copyright
-
- Copyright (c) 2012 Huy Ha. See LICENSE.txt for
- further details.