metainspector 1.0.0 → 1.0.1

Files changed (5)
  1. data/History.txt +4 -0
  2. data/README.txt +22 -8
  3. data/Rakefile +5 -6
  4. data/lib/metainspector.rb +10 -10
  5. metadata +7 -7
data/History.txt CHANGED
@@ -1,3 +1,7 @@
+ == 1.0.1 / 2007-12-06
+
+ * Added some info at README.txt, translated all methods to English
+
  == 1.0.0 / 2007-12-06

  * MetaInspector is born!
data/README.txt CHANGED
@@ -1,32 +1,46 @@
  metainspector
- by FIX (your name)
- FIX (url)
+ by Jaime Iniesta
+ http://metainspector.rubyforge.org/

  == DESCRIPTION:

- FIX (describe your package)
+ Ruby gem for web scraping purposes. It scrapes a given URL, and returns you a hash with data from it like for example the title, meta description, meta keywords, an array with all the links, all the images in it, etc.

  == FEATURES/PROBLEMS:

- * FIX (list of features or problems)
+ * Scrape a given URL and return data from its HTML

  == SYNOPSIS:

- FIX (code sample of usage)
+ # Require all gems and libs needed...
+ require 'rubygems'
+ require 'open-uri'
+ require 'hpricot'
+ require 'metainspector'
+
+ # Scrape an URL...
+ page_data = MetaInspector.scrape(url)
+
+ # See extracted data...
+ page_data['title']
+ page_data['description']
+ page_data['keywords']
+ page_data['links']

  == REQUIREMENTS:

- * FIX (list of requirements)
+ * open-uri
+ * hpricot

  == INSTALL:

- * FIX (sudo gem install, anything else)
+ * sudo gem install metainspector

  == LICENSE:

  (The MIT License)

- Copyright (c) 2007 FIX
+ Copyright (c) 2007 Jaime Iniesta

  Permission is hereby granted, free of charge, to any person obtaining
  a copy of this software and associated documentation files (the
data/Rakefile CHANGED
@@ -8,12 +8,11 @@ require './lib/metainspector.rb'

  Hoe.new('metainspector', MetaInspector::VERSION) do |p|
    p.rubyforge_name = 'metainspector'
-   p.remote_rdoc_dir = '' # Release to root
-   # p.author = 'FIX'
-   # p.email = 'FIX'
-   # p.summary = 'FIX'
-   # p.description = p.paragraphs_of('README.txt', 2..5).join("\n\n")
-   # p.url = p.paragraphs_of('README.txt', 0).first.split(/\n/)[1..-1]
+   p.author = 'Jaime Iniesta'
+   p.email = 'jaimeiniesta@gmail.com'
+   p.summary = 'Ruby gem for web scraping purposes. It scrapes a given URL, and returns you a hash with data from it like for example the title, meta description, meta keywords, an array with all the links, all the images in it, etc.'
+   p.description = p.paragraphs_of('README.txt', 2..5).join("\n\n")
+   p.url = p.paragraphs_of('README.txt', 0).first.split(/\n/)[1..-1]
    p.changes = p.paragraphs_of('History.txt', 0..1).join("\n\n")
  end

data/lib/metainspector.rb CHANGED
@@ -1,39 +1,39 @@
  class MetaInspector
-   VERSION = '1.0.0'
+   VERSION = '1.0.1'

    Hpricot.buffer_size = 300000

    def self.scrape(url)
      doc = Hpricot(open(url))

-     # Buscamos titulo
+     # Searching title...
      if (!doc.at('title').nil?)
        title = doc.at('title').inner_html
      else
        title = ""
      end

-     # Buscamos description
+     # Searching meta description...
      if (!doc.at("meta[@name='description']").nil?)
        description = doc.at("meta[@name='description']")['content']
      else
        description = ""
      end

-     # Buscamos keywords
+     # Searching meta keywords...
      if (!doc.at("meta[@name='keywords']").nil?)
        keywords = doc.at("meta[@name='keywords']")['content']
      else
        keywords = ""
      end

-     # Buscamos enlaces
-     enlaces = []
-     doc.search("//a").each do |enlace|
-       enlaces << enlace.attributes["href"] if (!enlace.attributes["href"].nil?)
+     # Searching links...
+     links = []
+     doc.search("//a").each do |link|
+       links << link.attributes["href"] if (!link.attributes["href"].nil?)
      end

-     # Devolvemos todo
-     {'ok' => true, 'title' => title, 'description' => description, 'keywords' => keywords, 'enlaces' => enlaces}
+     # Returning all data...
+     {'ok' => true, 'title' => title, 'description' => description, 'keywords' => keywords, 'links' => links}
    end
  end
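Note for callers: this hunk renames the key holding the link list in the hash returned by MetaInspector.scrape from 'enlaces' to 'links', so code written against 1.0.0 that reads 'enlaces' would get nil after upgrading. A minimal sketch of a reader that tolerates both versions follows. `links_from` is a hypothetical helper, not part of the gem, and the sample hashes are hand-written to mimic the return shape shown in the diff.

```ruby
# Hypothetical helper (not part of metainspector): read the link list from a
# scrape result regardless of which gem version produced it.
def links_from(page_data)
  # 1.0.1 stores links under 'links'; 1.0.0 used 'enlaces'.
  # Fall back to an empty array when neither key is present.
  page_data['links'] || page_data['enlaces'] || []
end

# Hand-written hashes imitating the return shape shown in the diff.
old_result = { 'ok' => true, 'title' => 'Example', 'enlaces' => ['/a', '/b'] }
new_result = { 'ok' => true, 'title' => 'Example', 'links' => ['/a', '/b'] }

old_links = links_from(old_result)
new_links = links_from(new_result)
```

Both calls return the same array, which lets calling code stay unchanged while it is migrated to the 'links' key.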
metadata CHANGED
@@ -3,15 +3,15 @@ rubygems_version: 0.9.2
  specification_version: 1
  name: metainspector
  version: !ruby/object:Gem::Version
-   version: 1.0.0
- date: 2007-12-06 00:00:00 +01:00
- summary: The author was too lazy to write a summary
+   version: 1.0.1
+ date: 2007-12-07 00:00:00 +01:00
+ summary: Ruby gem for web scraping purposes. It scrapes a given URL, and returns you a hash with data from it like for example the title, meta description, meta keywords, an array with all the links, all the images in it, etc.
  require_paths:
  - lib
- email: ryand-ruby@zenspider.com
- homepage: http://www.zenspider.com/ZSS/Products/metainspector/
+ email: jaimeiniesta@gmail.com
+ homepage: " by Jaime Iniesta"
  rubyforge_project: metainspector
- description: The author was too lazy to write a description
+ description: "== FEATURES/PROBLEMS: * Scrape a given URL and return data from its HTML == SYNOPSIS: # Require all gems and libs needed... require 'rubygems' require 'open-uri' require 'hpricot' require 'metainspector' # Scrape an URL... page_data = MetaInspector.scrape(url)"
  autorequire:
  default_executable:
  bindir: bin
@@ -27,7 +27,7 @@ signing_key:
  cert_chain:
  post_install_message:
  authors:
- - Ryan Davis
+ - Jaime Iniesta
  files:
  - History.txt
  - Manifest.txt
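The homepage value " by Jaime Iniesta" in the metadata diff may trace back to the Rakefile's p.url expression, which takes every line of the first README paragraph after the title. With the header added in this release, that slice begins with the author line rather than the URL. A sketch in plain Ruby, without Hoe: the paragraph string and its one-space indentation are assumptions reconstructed from the diff, and `only_urls` is an illustrative alternative, not what the Rakefile does.

```ruby
# First README paragraph as it appears in the 1.0.1 diff.
# The one-space indentation of lines 2 and 3 is an assumption.
first_paragraph = "metainspector\n by Jaime Iniesta\n http://metainspector.rubyforge.org/"

# Same expression as the Rakefile's p.url line, applied to that paragraph:
# drop the title line and keep everything after it.
url_lines = first_paragraph.split(/\n/)[1..-1]
# url_lines now holds the author line first, then the URL line.

# Illustrative alternative: keep only the lines that look like URLs.
only_urls = url_lines.map { |line| line.strip }
                     .select { |line| line =~ %r{\Ahttps?://} }
```

How Hoe turns the resulting array into a single homepage field is not shown in this diff, so the link between the two remains a guess.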