rapgenius 0.0.2 → 0.0.3
- data/.gitignore +2 -0
- data/CHANGELOG.md +14 -0
- data/README.md +95 -0
- data/lib/rapgenius/annotation.rb +6 -5
- data/lib/rapgenius/scraper.rb +31 -9
- data/lib/rapgenius/song.rb +1 -4
- data/lib/rapgenius/version.rb +1 -1
- data/pkg/rapgenius-0.0.2.gem +0 -0
- data/rapgenius.gemspec +7 -5
- data/spec/rapgenius/annotation_spec.rb +41 -0
- data/spec/rapgenius/scraper_spec.rb +54 -0
- data/spec/rapgenius/song_spec.rb +46 -0
- data/spec/spec_helper.rb +4 -1
- data/spec/support/vcr.rb +11 -0
- metadata +46 -13
- data/Gemfile.lock +0 -38
- data/spec/annotation_spec.rb +0 -68
- data/spec/scraper_spec.rb +0 -64
- data/spec/song_spec.rb +0 -44
- data/spec/support/annotation.html +0 -440
- data/spec/support/song.html +0 -1358
data/.gitignore
ADDED
data/CHANGELOG.md
ADDED
@@ -0,0 +1,14 @@
|
|
1
|
+
# Changelog
|
2
|
+
|
3
|
+
__v0.0.1__ (17th August 2013)
|
4
|
+
|
5
|
+
* Initial version
|
6
|
+
|
7
|
+
__v0.0.2__ (17th August 2013)
|
8
|
+
|
9
|
+
* Adds `RapGenius::Song.find` to replicate behaviour in `RapGenius::Annotation`
|
10
|
+
|
11
|
+
__v0.0.3__ (22nd August 2013, *contributed by [tsigo](https://github.com/tsigo)*)
|
12
|
+
|
13
|
+
* Improves implementation of HTTParty
|
14
|
+
* Reorganises specs to use VCR
|
data/README.md
ADDED
@@ -0,0 +1,95 @@
|
|
1
|
+
# rapgenius
|
2
|
+
|
3
|
+
![Rap Genius logo](http://f.cl.ly/items/303W0c1i2r100j2u3Y0y/Screen%20Shot%202013-08-17%20at%2016.01.19.png)
|
4
|
+
|
5
|
+
## What does this do?
|
6
|
+
|
7
|
+
It's a Ruby gem for accessing lyrics and explanations on
|
8
|
+
[Rap Genius](http://rapgenius.com).
|
9
|
+
|
10
|
+
They very sadly [don't have an API](https://twitter.com/RapGenius/status/245057326321655808) so I decided to replicate one for myself
|
11
|
+
with a nice bit of screen scraping with [Nokogiri](https://github.com/sparklemotion/nokogiri), much like my [amex](https://github.com/timrogers/amex), [ucas](https://github.com/timrogers/ucas) and [lloydstsb](https://github.com/timrogers/lloydstsb) gems.
|
12
|
+
|
13
|
+
## Installation
|
14
|
+
|
15
|
+
Install the gem, and you're ready to go. Simply add the following to your
|
16
|
+
Gemfile:
|
17
|
+
|
18
|
+
`gem "rapgenius", "~> 0.0.2"`
|
19
|
+
|
20
|
+
## Usage
|
21
|
+
|
22
|
+
Songs on Rap Genius don't have numeric identifiers as far as I can tell - they're identified by a URL slug featuring the artist and song name, for instance "Big-sean-control-lyrics". We use this to fetch a particular track, like so:
|
23
|
+
|
24
|
+
```ruby
|
25
|
+
require 'rapgenius'
|
26
|
+
song = RapGenius::Song.find("Big-sean-control-lyrics")
|
27
|
+
```
|
28
|
+
|
29
|
+
Once you've got the song, you can easily load details about it. This uses
|
30
|
+
Nokogiri to fetch the song's page and then parse it:
|
31
|
+
|
32
|
+
```ruby
|
33
|
+
song.title
|
34
|
+
# => "Control"
|
35
|
+
|
36
|
+
song.artist
|
37
|
+
# => "Big Sean"
|
38
|
+
|
39
|
+
song.full_artist
|
40
|
+
# => "Big Sean (Ft. Jay Electronica & Kendrick Lamar)"
|
41
|
+
|
42
|
+
song.images
|
43
|
+
# => ["http://s3.amazonaws.com/rapgenius/1376434983_jay-electronica.jpg", "http://s3.amazonaws.com/rapgenius/1375029260_Big%20Sean.png", "http://s3.amazonaws.com/rapgenius/Kendrick-Lamar-1024x680.jpg"]
|
44
|
+
|
45
|
+
song.description
|
46
|
+
# => "The non-album cut from Sean that basically blew up the Internet due to a world-beating verse by Kendrick Lamar...
|
47
|
+
```
|
48
|
+
|
49
|
+
The `#annotations` accessor on a Song returns an array of RapGenius::Annotation
|
50
|
+
objects corresponding to different annotated lines of the song, identified by
|
51
|
+
their `id`.
|
52
|
+
|
53
|
+
You can look these up manually using `RapGenius::Annotation.find("id")`. You
|
54
|
+
can grab the ID for a lyric from a RapGenius page by right clicking on an annotation, copying the shortcut and then finding the number after "http://rapgenius.com".
|
55
|
+
|
56
|
+
```ruby
|
57
|
+
song.annotations
|
58
|
+
# => [<RapGenius::Annotation>, <RapGenius::Annotation>...]
|
59
|
+
|
60
|
+
annotation = song.annotations[99]
|
61
|
+
|
62
|
+
annotation.lyric
|
63
|
+
# => "And that goes for Jermaine Cole, Big KRIT, Wale\nPusha T, Meek Millz, A$AP Rocky, Drake\nBig Sean, Jay Electron', Tyler, Mac Miller"
|
64
|
+
|
65
|
+
annotation.explanation
|
66
|
+
# => "Kendrick calls out some of the biggest names in present day Hip-hop..."
|
67
|
+
|
68
|
+
annotation.song == song # You can get back to the song from the annotation...
|
69
|
+
# => true
|
70
|
+
|
71
|
+
annotation.id
|
72
|
+
# => "2093001"
|
73
|
+
|
74
|
+
annotation2 = RapGenius::Annotation.find("2093001") # Fetching directly...
|
75
|
+
|
76
|
+
annotation == annotation2
|
77
|
+
# => true
|
78
|
+
```
|
79
|
+
|
80
|
+
## Contributing
|
81
|
+
|
82
|
+
There are a few things I'd love to see added to this gem:
|
83
|
+
|
84
|
+
* __Searching__ - having to know the path to a particular track's lyrics isn't super intuitive
|
85
|
+
* __Support for *\*Genius*__ - Rap Genius also have other sites on subdomains like [News Genius](http://news.rapgenius.com) and [Poetry Genius](http://poetry.rapgenius.com). These could very easily be supported, since they're identical in terms of markup.
|
86
|
+
|
87
|
+
This gem is open source, so feel free to add anything you want, then make a pull request. A few quick tips:
|
88
|
+
|
89
|
+
* Don't update the version numbers before your pull request - I'll sort that part out for you!
|
90
|
+
* Make sure you write specs, then run them with `$ bundle exec rake`
|
91
|
+
* Update this README.md file so I, and users, know how your changes work
|
92
|
+
|
93
|
+
## Get in touch
|
94
|
+
|
95
|
+
Any questions, thoughts or comments? Email me at <me@timrogers.co.uk>.
|
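Note on the README above: it says an annotation's ID is the number that follows "http://rapgenius.com" in a copied shortcut. A minimal sketch of that step, assuming the shortcut has the form `http://rapgenius.com/2093001`; `annotation_id_from_shortcut` is a hypothetical helper for illustration and is not part of the gem.

```ruby
require 'uri'

# Hypothetical helper (not part of the rapgenius gem): extracts the numeric
# annotation ID from a copied shortcut URL.
def annotation_id_from_shortcut(shortcut)
  # The path of "http://rapgenius.com/2093001" is "/2093001".
  id = URI.parse(shortcut).path.sub(%r{\A/}, '')

  # Assumption: annotation IDs are purely numeric, as in the README example.
  unless id.match?(/\A\d+\z/)
    raise ArgumentError, "no numeric ID in #{shortcut.inspect}"
  end

  id
end

puts annotation_id_from_shortcut("http://rapgenius.com/2093001")
# prints 2093001
```

The returned string could then be passed to `RapGenius::Annotation.find`, which the README shows taking the ID as a string.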
data/lib/rapgenius/annotation.rb
CHANGED
@@ -26,11 +26,12 @@ module RapGenius
|
|
26
26
|
end
|
27
27
|
|
28
28
|
def song
|
29
|
-
|
30
|
-
attr('content').to_s
|
31
|
-
|
32
|
-
@song ||= Song.new(entry_path)
|
29
|
+
@song ||= Song.new(song_url)
|
33
30
|
end
|
34
31
|
|
32
|
+
def song_url
|
33
|
+
@song_url ||= document.css('meta[property="rap_genius:song"]').
|
34
|
+
attr('content').to_s
|
35
|
+
end
|
35
36
|
end
|
36
|
-
end
|
37
|
+
end
|
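Note on the change above: `song_url` is extracted into its own method and memoised with `||=`, so the meta tag lookup should only run once per instance. A rough sketch of that pattern, using a stand-in document object rather than Nokogiri; `FakeDocument`, `song_meta_content` and `AnnotationSketch` are hypothetical names and do not exist in the gem.

```ruby
# Stand-in for the parsed page. It counts lookups so the effect of
# memoisation is visible. (Assumption: the real gem queries a Nokogiri
# document here instead.)
class FakeDocument
  attr_reader :lookups

  def initialize
    @lookups = 0
  end

  def song_meta_content
    @lookups += 1
    "http://rapgenius.com/Big-sean-control-lyrics"
  end
end

# Hypothetical, simplified version of the annotation class.
class AnnotationSketch
  def initialize(document)
    @document = document
  end

  # `||=` evaluates the right-hand side only while @song_url is nil or false,
  # so repeated calls reuse the first result.
  def song_url
    @song_url ||= @document.song_meta_content.to_s
  end
end

doc = FakeDocument.new
annotation = AnnotationSketch.new(doc)
annotation.song_url
annotation.song_url
puts doc.lookups
# prints 1
```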
data/lib/rapgenius/scraper.rb
CHANGED
@@ -3,33 +3,55 @@ require 'httparty'
|
|
3
3
|
|
4
4
|
module RapGenius
|
5
5
|
module Scraper
|
6
|
-
|
6
|
+
# Custom HTTParty parser that parses the returned body with Nokogiri
|
7
|
+
class NokogiriParser < HTTParty::Parser
|
8
|
+
SupportedFormats.merge!('text/html' => :html)
|
7
9
|
|
8
|
-
|
10
|
+
def html
|
11
|
+
Nokogiri::HTML(body)
|
12
|
+
end
|
13
|
+
end
|
14
|
+
|
15
|
+
# HTTParty client
|
16
|
+
#
|
17
|
+
# Sets some useful defaults for all of our requests.
|
18
|
+
#
|
19
|
+
# See Scraper#fetch
|
20
|
+
class Client
|
21
|
+
include HTTParty
|
22
|
+
|
23
|
+
format :html
|
24
|
+
parser NokogiriParser
|
25
|
+
base_uri 'http://rapgenius.com'
|
26
|
+
headers 'User-Agent' => "rapgenius.rb v#{RapGenius::VERSION}"
|
27
|
+
end
|
9
28
|
|
29
|
+
BASE_URL = Client.base_uri + "/".freeze
|
30
|
+
|
31
|
+
attr_reader :url
|
10
32
|
|
11
33
|
def url=(url)
|
12
|
-
|
13
|
-
@url =
|
34
|
+
unless url =~ /^https?:\/\//
|
35
|
+
@url = BASE_URL + url
|
14
36
|
else
|
15
37
|
@url = url
|
16
38
|
end
|
17
39
|
end
|
18
40
|
|
19
41
|
def document
|
20
|
-
@document ||=
|
42
|
+
@document ||= fetch(@url)
|
21
43
|
end
|
22
44
|
|
23
45
|
private
|
46
|
+
|
24
47
|
def fetch(url)
|
25
|
-
response =
|
48
|
+
response = Client.get(url)
|
26
49
|
|
27
50
|
if response.code != 200
|
28
51
|
raise ScraperError, "Received a #{response.code} HTTP response"
|
29
52
|
end
|
30
53
|
|
31
|
-
response.
|
54
|
+
response.parsed_response
|
32
55
|
end
|
33
|
-
|
34
56
|
end
|
35
|
-
end
|
57
|
+
end
|
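Note on the scraper change above: two behaviours are visible in the diff. `url=` prefixes relative paths with the base URL, and `fetch` raises a `ScraperError` for any non-200 response. The sketch below reproduces both without HTTParty or Nokogiri so it runs on its own. `ScraperSketch`, `PageSketch`, `FakeResponse` and the injected `client` lambda are assumptions made for illustration; the gem itself calls `Client.get`. The sketch also anchors the pattern with `\A` rather than `^`, which is a deliberate difference from the diff.

```ruby
# Simplified stand-in for RapGenius::Scraper (hypothetical names throughout).
module ScraperSketch
  BASE_URL = "http://rapgenius.com/".freeze

  class ScraperError < StandardError; end

  attr_reader :url

  # Absolute URLs are kept as they are; anything else is treated as a path
  # relative to BASE_URL.
  def url=(url)
    @url = url =~ %r{\Ahttps?://} ? url : BASE_URL + url
  end
end

# Minimal response object exposing the two methods the diff relies on.
FakeResponse = Struct.new(:code, :parsed_response)

class PageSketch
  include ScraperSketch

  # `client` is any callable that takes a URL and returns a response.
  def initialize(client)
    @client = client
  end

  def document
    @document ||= fetch(@url)
  end

  private

  def fetch(url)
    response = @client.call(url)

    if response.code != 200
      raise ScraperSketch::ScraperError, "Received a #{response.code} HTTP response"
    end

    response.parsed_response
  end
end

page = PageSketch.new(->(_url) { FakeResponse.new(200, "<html></html>") })
page.url = "Big-sean-control-lyrics"
puts page.url
# prints http://rapgenius.com/Big-sean-control-lyrics
```

Injecting the client keeps the sketch free of network access; the specs in this release take a different route and stub requests with WebMock and VCR.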
data/lib/rapgenius/song.rb
CHANGED
data/lib/rapgenius/version.rb
CHANGED
Binary file
|
data/rapgenius.gemspec
CHANGED
@@ -14,13 +14,15 @@ Gem::Specification.new do |s|
|
|
14
14
|
"working at Rap Genius is the API". With this magical screen-scraping gem,
|
15
15
|
you can access the wealth of data on the internet Talmud in Ruby.}
|
16
16
|
|
17
|
-
s.add_runtime_dependency "nokogiri",
|
18
|
-
s.add_runtime_dependency "httparty",
|
19
|
-
s.add_development_dependency "rspec",
|
20
|
-
s.add_development_dependency "mocha",
|
17
|
+
s.add_runtime_dependency "nokogiri", "~>1.6.0"
|
18
|
+
s.add_runtime_dependency "httparty", "~>0.11.0"
|
19
|
+
s.add_development_dependency "rspec", "~>2.14.1"
|
20
|
+
s.add_development_dependency "mocha", "~>0.14.0"
|
21
|
+
s.add_development_dependency "webmock", "~>1.11.0"
|
22
|
+
s.add_development_dependency "vcr", "~>2.5.0"
|
21
23
|
|
22
24
|
s.files = `git ls-files`.split("\n")
|
23
25
|
s.test_files = `git ls-files -- {test,spec,features}/*`.split("\n")
|
24
26
|
s.executables = `git ls-files -- bin/*`.split("\n").map{ |f| File.basename(f) }
|
25
27
|
s.require_paths = ["lib"]
|
26
|
-
end
|
28
|
+
end
|
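Note on the gemspec change above: every dependency is now pinned with the pessimistic operator `~>`. With three version segments, `~> 1.6.0` accepts patch releases from 1.6.0 upwards but stops before 1.7.0. A small check using RubyGems' own classes; the `allowed?` helper is a hypothetical name used only for this illustration.

```ruby
require 'rubygems'

# Hypothetical helper: does `version` satisfy `constraint`?
def allowed?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

puts allowed?("~> 1.6.0", "1.6.0")  # prints true
puts allowed?("~> 1.6.0", "1.6.8")  # prints true
puts allowed?("~> 1.6.0", "1.7.0")  # prints false
puts allowed?("~> 1.6.0", "1.5.9")  # prints false
```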
data/spec/rapgenius/annotation_spec.rb
ADDED
@@ -0,0 +1,41 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
module RapGenius
|
4
|
+
describe Annotation, vcr: {cassette_name: "big-sean-annotation"} do
|
5
|
+
|
6
|
+
let(:annotation) { described_class.new(id: "2092393") }
|
7
|
+
subject { annotation }
|
8
|
+
|
9
|
+
its(:id) { should eq "2092393" }
|
10
|
+
its(:url) { should eq "http://rapgenius.com/2092393" }
|
11
|
+
its(:song) { should be_a Song }
|
12
|
+
its(:song_url) { should eq "http://rapgenius.com/Big-sean-control-lyrics" }
|
13
|
+
|
14
|
+
describe "#lyric" do
|
15
|
+
it "should have the correct lyric" do
|
16
|
+
annotation.lyric.should eq "You gon' get this rain like it's May weather,"
|
17
|
+
end
|
18
|
+
end
|
19
|
+
|
20
|
+
describe "#explanation" do
|
21
|
+
it "should have the correct explanation" do
|
22
|
+
annotation.explanation.should include "making it rain"
|
23
|
+
end
|
24
|
+
end
|
25
|
+
|
26
|
+
describe '.find' do
|
27
|
+
it "returns a new instance at the specified path" do
|
28
|
+
i = described_class.find("foobar")
|
29
|
+
i.should be_an Annotation
|
30
|
+
i.id.should eq "foobar"
|
31
|
+
end
|
32
|
+
end
|
33
|
+
|
34
|
+
context "with additional parameters passed into the constructor" do
|
35
|
+
let(:annotation) { described_class.new(id: "5678", lyric: "foo") }
|
36
|
+
|
37
|
+
its(:id) { should eq "5678" }
|
38
|
+
its(:lyric) { should eq "foo" }
|
39
|
+
end
|
40
|
+
end
|
41
|
+
end
|
data/spec/rapgenius/scraper_spec.rb
ADDED
@@ -0,0 +1,54 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
class ScraperTester
|
4
|
+
include RapGenius::Scraper
|
5
|
+
end
|
6
|
+
|
7
|
+
module RapGenius
|
8
|
+
describe Scraper do
|
9
|
+
|
10
|
+
let(:scraper) { ScraperTester.new }
|
11
|
+
|
12
|
+
describe "#url=" do
|
13
|
+
it "forms the URL with the base URL, if the current path is relative" do
|
14
|
+
scraper.url = "foobar"
|
15
|
+
scraper.url.should include RapGenius::Scraper::BASE_URL
|
16
|
+
end
|
17
|
+
|
18
|
+
it "leaves the URL as it is if already complete" do
|
19
|
+
scraper.url = "http://foobar.com/baz"
|
20
|
+
scraper.url.should eq "http://foobar.com/baz"
|
21
|
+
end
|
22
|
+
end
|
23
|
+
|
24
|
+
describe "#document" do
|
25
|
+
before do
|
26
|
+
scraper.url = "http://foo.bar/"
|
27
|
+
end
|
28
|
+
|
29
|
+
context "with a successful request" do
|
30
|
+
before do
|
31
|
+
stub_request(:get, "http://foo.bar").to_return({body: 'ok', status: 200})
|
32
|
+
end
|
33
|
+
|
34
|
+
it "returns a Nokogiri document object" do
|
35
|
+
scraper.document.should be_a Nokogiri::HTML::Document
|
36
|
+
end
|
37
|
+
|
38
|
+
it "contains the tags in page received back from the HTTP request" do
|
39
|
+
scraper.document.css('body').length.should eq 1
|
40
|
+
end
|
41
|
+
end
|
42
|
+
|
43
|
+
context "with a failed request" do
|
44
|
+
before do
|
45
|
+
stub_request(:get, "http://foo.bar").to_return({body: '', status: 404})
|
46
|
+
end
|
47
|
+
|
48
|
+
it "raises a ScraperError" do
|
49
|
+
expect { scraper.document }.to raise_error(RapGenius::ScraperError)
|
50
|
+
end
|
51
|
+
end
|
52
|
+
end
|
53
|
+
end
|
54
|
+
end
|
data/spec/rapgenius/song_spec.rb
ADDED
@@ -0,0 +1,46 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
|
3
|
+
module RapGenius
|
4
|
+
describe Song do
|
5
|
+
context "given Big Sean's Control", vcr: {cassette_name: "big-sean-control-lyrics"} do
|
6
|
+
subject { described_class.new("Big-sean-control-lyrics") }
|
7
|
+
|
8
|
+
its(:url) { should eq "http://rapgenius.com/Big-sean-control-lyrics" }
|
9
|
+
its(:title) { should eq "Control" }
|
10
|
+
its(:artist) { should eq "Big Sean" }
|
11
|
+
its(:description) { should include "blew up the Internet" }
|
12
|
+
its(:full_artist) { should include "(Ft. Jay Electronica & Kendrick Lamar)"}
|
13
|
+
|
14
|
+
describe "#images" do
|
15
|
+
it "should be an Array" do
|
16
|
+
subject.images.should be_an Array
|
17
|
+
end
|
18
|
+
|
19
|
+
it "should include Big Sean's picture" do
|
20
|
+
subject.images.should include "http://s3.amazonaws.com/rapgenius/1375029260_Big%20Sean.png"
|
21
|
+
end
|
22
|
+
end
|
23
|
+
|
24
|
+
describe "#annotations" do
|
25
|
+
it "should be an Array of Annotation objects" do
|
26
|
+
subject.annotations.should be_an Array
|
27
|
+
subject.annotations.first.should be_a Annotation
|
28
|
+
end
|
29
|
+
|
30
|
+
it "should be of a valid length" do
|
31
|
+
# Annotations get added and removed from the live site; we want our
|
32
|
+
# count to be somewhat accurate, within reason.
|
33
|
+
subject.annotations.length.should be_within(15).of(130)
|
34
|
+
end
|
35
|
+
end
|
36
|
+
end
|
37
|
+
|
38
|
+
describe '.find' do
|
39
|
+
it "returns a new instance at the specified path" do
|
40
|
+
i = described_class.find("foobar")
|
41
|
+
i.should be_a Song
|
42
|
+
i.url.should eq 'http://rapgenius.com/foobar'
|
43
|
+
end
|
44
|
+
end
|
45
|
+
end
|
46
|
+
end
|
data/spec/spec_helper.rb
CHANGED
data/spec/support/vcr.rb
ADDED
@@ -0,0 +1,11 @@
|
|
1
|
+
require 'vcr'
|
2
|
+
|
3
|
+
VCR.configure do |c|
|
4
|
+
c.default_cassette_options = {
|
5
|
+
record: :new_episodes,
|
6
|
+
re_record_interval: 24 * 60 * 60
|
7
|
+
}
|
8
|
+
c.cassette_library_dir = File.expand_path('../cassettes/', __FILE__)
|
9
|
+
c.hook_into :webmock
|
10
|
+
c.configure_rspec_metadata!
|
11
|
+
end
|
metadata
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: rapgenius
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
4
|
+
version: 0.0.3
|
5
5
|
prerelease:
|
6
6
|
platform: ruby
|
7
7
|
authors:
|
@@ -9,7 +9,7 @@ authors:
|
|
9
9
|
autorequire:
|
10
10
|
bindir: bin
|
11
11
|
cert_chain: []
|
12
|
-
date: 2013-08-
|
12
|
+
date: 2013-08-22 00:00:00.000000000 Z
|
13
13
|
dependencies:
|
14
14
|
- !ruby/object:Gem::Dependency
|
15
15
|
name: nokogiri
|
@@ -75,6 +75,38 @@ dependencies:
|
|
75
75
|
- - ~>
|
76
76
|
- !ruby/object:Gem::Version
|
77
77
|
version: 0.14.0
|
78
|
+
- !ruby/object:Gem::Dependency
|
79
|
+
name: webmock
|
80
|
+
requirement: !ruby/object:Gem::Requirement
|
81
|
+
none: false
|
82
|
+
requirements:
|
83
|
+
- - ~>
|
84
|
+
- !ruby/object:Gem::Version
|
85
|
+
version: 1.11.0
|
86
|
+
type: :development
|
87
|
+
prerelease: false
|
88
|
+
version_requirements: !ruby/object:Gem::Requirement
|
89
|
+
none: false
|
90
|
+
requirements:
|
91
|
+
- - ~>
|
92
|
+
- !ruby/object:Gem::Version
|
93
|
+
version: 1.11.0
|
94
|
+
- !ruby/object:Gem::Dependency
|
95
|
+
name: vcr
|
96
|
+
requirement: !ruby/object:Gem::Requirement
|
97
|
+
none: false
|
98
|
+
requirements:
|
99
|
+
- - ~>
|
100
|
+
- !ruby/object:Gem::Version
|
101
|
+
version: 2.5.0
|
102
|
+
type: :development
|
103
|
+
prerelease: false
|
104
|
+
version_requirements: !ruby/object:Gem::Requirement
|
105
|
+
none: false
|
106
|
+
requirements:
|
107
|
+
- - ~>
|
108
|
+
- !ruby/object:Gem::Version
|
109
|
+
version: 2.5.0
|
78
110
|
description: ! "Up until until now, to quote RapGenius themselves,\n \"working
|
79
111
|
at Rap Genius is the API\". With this magical screen-scraping gem,\n you can
|
80
112
|
access the wealth of data on the internet Talmud in Ruby."
|
@@ -84,9 +116,11 @@ executables: []
|
|
84
116
|
extensions: []
|
85
117
|
extra_rdoc_files: []
|
86
118
|
files:
|
119
|
+
- .gitignore
|
120
|
+
- CHANGELOG.md
|
87
121
|
- Gemfile
|
88
|
-
- Gemfile.lock
|
89
122
|
- LICENSE
|
123
|
+
- README.md
|
90
124
|
- Rakefile
|
91
125
|
- lib/rapgenius.rb
|
92
126
|
- lib/rapgenius/annotation.rb
|
@@ -95,13 +129,13 @@ files:
|
|
95
129
|
- lib/rapgenius/song.rb
|
96
130
|
- lib/rapgenius/version.rb
|
97
131
|
- pkg/rapgenius-0.0.1.gem
|
132
|
+
- pkg/rapgenius-0.0.2.gem
|
98
133
|
- rapgenius.gemspec
|
99
|
-
- spec/annotation_spec.rb
|
100
|
-
- spec/scraper_spec.rb
|
101
|
-
- spec/song_spec.rb
|
134
|
+
- spec/rapgenius/annotation_spec.rb
|
135
|
+
- spec/rapgenius/scraper_spec.rb
|
136
|
+
- spec/rapgenius/song_spec.rb
|
102
137
|
- spec/spec_helper.rb
|
103
|
-
- spec/support/
|
104
|
-
- spec/support/song.html
|
138
|
+
- spec/support/vcr.rb
|
105
139
|
homepage: http://timrogers.co.uk
|
106
140
|
licenses: []
|
107
141
|
post_install_message:
|
@@ -127,9 +161,8 @@ signing_key:
|
|
127
161
|
specification_version: 3
|
128
162
|
summary: A gem for accessing lyrics and explanations on RapGenius.com
|
129
163
|
test_files:
|
130
|
-
- spec/annotation_spec.rb
|
131
|
-
- spec/scraper_spec.rb
|
132
|
-
- spec/song_spec.rb
|
164
|
+
- spec/rapgenius/annotation_spec.rb
|
165
|
+
- spec/rapgenius/scraper_spec.rb
|
166
|
+
- spec/rapgenius/song_spec.rb
|
133
167
|
- spec/spec_helper.rb
|
134
|
-
- spec/support/
|
135
|
-
- spec/support/song.html
|
168
|
+
- spec/support/vcr.rb
|