bootleg 0.0.6 → 0.0.7
- checksums.yaml +7 -0
- data/.gitignore +1 -0
- data/Gemfile +0 -6
- data/README.md +43 -53
- data/bootleg.gemspec +6 -7
- data/lib/bootleg.rb +5 -7
- data/lib/bootleg/agent.rb +31 -0
- data/lib/bootleg/movie.rb +30 -0
- data/lib/bootleg/page.rb +26 -0
- data/lib/bootleg/theater.rb +54 -0
- data/lib/bootleg/version.rb +1 -1
- data/spec/lib/agent_spec.rb +11 -0
- data/spec/lib/movie_spec.rb +37 -0
- data/spec/lib/page_spec.rb +21 -0
- data/spec/lib/theater_spec.rb +52 -0
- metadata +31 -83
- data/README.rdoc +0 -3
- data/lib/extractor.rb +0 -32
- data/lib/finder.rb +0 -14
- data/lib/generators/bootleg/USAGE +0 -0
- data/lib/generators/bootleg/install_generator.rb +0 -15
- data/lib/generators/bootleg/movie_generator.rb +0 -24
- data/lib/generators/bootleg/showtime_generator.rb +0 -24
- data/lib/generators/bootleg/templates/movie_migration.rb +0 -10
- data/lib/generators/bootleg/templates/movie_model.rb +0 -10
- data/lib/generators/bootleg/templates/showtime_migration.rb +0 -12
- data/lib/generators/bootleg/templates/showtime_model.rb +0 -14
- data/lib/generators/bootleg/templates/theater_migration.rb +0 -10
- data/lib/generators/bootleg/templates/theater_model.rb +0 -10
- data/lib/generators/bootleg/theater_generator.rb +0 -24
- data/lib/manager.rb +0 -27
- data/lib/modules/href.rb +0 -18
- data/lib/modules/movie.rb +0 -24
- data/lib/modules/theater.rb +0 -45
- data/lib/modules/zipcode.rb +0 -12
- data/lib/presenter.rb +0 -8
- data/spec/extractor_spec.rb +0 -59
- data/spec/finder_spec.rb +0 -26
- data/spec/presenter_spec.rb +0 -4
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA1:
|
3
|
+
metadata.gz: dfffbc8d963307c1e4023281762f735b18b0b117
|
4
|
+
data.tar.gz: 0c0af543cf5c8b8c3da8fa0b2c00d5babcab9762
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: 2f20a7f0caf9fdaa96fccc43dd38f85a5e8b9a29740c8aef03bfacc5e5da2b7a9ab264ca252272e567e47d4b4395a9076acb17974d9feec5b4848324e0413d3d
|
7
|
+
data.tar.gz: 74025fc09a0c95f8e7bc9900a4bdb4f8c20c5890c78cc2d6767dfa82e34ca2b64d483abf4a08fa67f08792666df628a1722c294ca254eec402524319a86abc74
|
data/.gitignore
CHANGED
data/Gemfile
CHANGED
data/README.md
CHANGED
@@ -1,71 +1,61 @@
|
|
1
1
|
# Bootleg
|
2
2
|
|
3
|
-
Bootleg
|
4
|
-
|
5
|
-
radius.
|
3
|
+
Bootleg represents my playground for scraping movies from
|
4
|
+
[moviefone.com](http://moviefone.com)
|
6
5
|
|
7
|
-
##
|
6
|
+
## If you want you can play with it as well
|
8
7
|
|
9
|
-
|
8
|
+
To gain access to the gem simply:
|
10
9
|
|
11
|
-
|
10
|
+
```
|
11
|
+
require 'bootleg'
|
12
|
+
```
|
12
13
|
|
13
|
-
|
14
|
+
Currently Bootleg supports search for movies and theaters within a certain
|
15
|
+
zipcode.
|
14
16
|
|
15
|
-
|
17
|
+
```
|
18
|
+
agent = Bootleg::Agent.new(zipcode: 20851)
|
19
|
+
```
|
16
20
|
|
17
|
-
|
21
|
+
The bootleg agent is responsible for initializing the environment and providing
|
22
|
+
a simple method ``page`` that gives us access to the first search results.
|
18
23
|
|
19
|
-
|
24
|
+
```
|
25
|
+
page = agent.page
|
26
|
+
```
|
20
27
|
|
21
|
-
|
28
|
+
A page has two handy methods. The ``theaters`` method gives you an array with
|
29
|
+
all the available theaters. You have access to a theater's title, link, price
|
30
|
+
for an adult or the price for a child as well as the address of the theater and
|
31
|
+
the movies that are currently playing at that theater. The page also has a
|
32
|
+
``next`` method that gives you access to the next page. You can keep navigating
|
33
|
+
through the search results until ``page.next`` returns _Last Page_.
|
22
34
|
|
23
|
-
|
35
|
+
```
|
36
|
+
page = page.next
|
37
|
+
```
|
24
38
|
|
25
|
-
|
39
|
+
Within theaters you also have access to movies. A movie has attributes for title,
|
40
|
+
link and showtimes.
|
26
41
|
|
27
|
-
|
42
|
+
```
|
43
|
+
theaters = page.theaters
|
44
|
+
```
|
28
45
|
|
29
|
-
|
46
|
+
This gives you a list of theaters on the current page.
|
30
47
|
|
31
|
-
|
48
|
+
For example if you would like access to all the movies that run in a certain
|
49
|
+
theater you could do the following.
|
32
50
|
|
33
|
-
|
34
|
-
|
51
|
+
```
|
52
|
+
theaters.first.movies
|
53
|
+
```
|
54
|
+
If you need to scrape additional information from either theaters or movies it is
|
55
|
+
fairly easy to extend either the theater class or the movies class.
|
35
56
|
|
36
|
-
|
57
|
+
## TODO
|
37
58
|
|
38
|
-
|
39
|
-
|
40
|
-
|
41
|
-
|
42
|
-
$ theaters = movie.theaters
|
43
|
-
|
44
|
-
Get all the showtimes:
|
45
|
-
|
46
|
-
$ showtimes = movie.showtimes
|
47
|
-
|
48
|
-
Get the theater of the showtimes:
|
49
|
-
|
50
|
-
$ theater = showtimes.first.theater
|
51
|
-
|
52
|
-
## Other Details
|
53
|
-
|
54
|
-
The content is stored in 3 Active Record models BootlegMovie,
|
55
|
-
BootlegTheater and BootlegShowtime. After you load a zipcode just start
|
56
|
-
a rails console and take a look at the models to see what information is
|
57
|
-
stored inside.
|
58
|
-
|
59
|
-
The zipcode is stored under BootlegShowtime.:w
|
60
|
-
|
61
|
-
## Contributing
|
62
|
-
|
63
|
-
1. Fork it
|
64
|
-
2. Create your feature branch (`git checkout -b my-new-feature`)
|
65
|
-
3. Commit your changes (`git commit -am 'Add some feature'`)
|
66
|
-
4. Push to the branch (`git push origin my-new-feature`)
|
67
|
-
5. Create new Pull Request
|
68
|
-
|
69
|
-
It is relatively easy to pull other information from moviefone.com. If you
|
70
|
-
need an extra feature and you would like to contribute feel free to shoot
|
71
|
-
me an email at marius@mlpinit.com beforehand.
|
59
|
+
One of the main improvements this gem could use is better error support for
|
60
|
+
different edge cases. The gem could also benefit from more tests to illustrate
|
61
|
+
some of the patterns identified while scraping movies and theaters.
|
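The paging loop described in the README can be sketched without touching the network. `FakePage` below is a hypothetical stand-in for `Bootleg::Page` (it is not part of the gem), and `all_theaters` is an illustrative helper name. The sketch only assumes the documented contract: `theaters` returns an array and `next` returns either another page or the string `'Last Page'`.

```ruby
# Hypothetical stand-in for Bootleg::Page so the loop can run offline.
# It mirrors the documented contract: #theaters returns an array and
# #next returns another page or the string 'Last Page'.
class FakePage
  attr_reader :theaters

  def initialize(theaters, following = nil)
    @theaters = theaters
    @following = following
  end

  def next
    @following || 'Last Page'
  end
end

# Collect the theaters from every page, stopping at the 'Last Page' sentinel.
def all_theaters(first_page)
  theaters = []
  page = first_page
  until page == 'Last Page'
    theaters.concat(page.theaters)
    page = page.next
  end
  theaters
end

last  = FakePage.new(['Royal'])
first = FakePage.new(['Regal', 'AMC'], last)
puts all_theaters(first).inspect
```

With the real gem, the object returned by `Bootleg::Agent.new(zipcode: 20851).page` would presumably take the place of `first`. Because `next` returns a string rather than `nil` on the last page, the loop has to compare against that sentinel explicitly.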
data/bootleg.gemspec
CHANGED
@@ -8,17 +8,16 @@ Gem::Specification.new do |gem|
|
|
8
8
|
gem.version = Bootleg::VERSION
|
9
9
|
gem.authors = ["Marius L. Pop"]
|
10
10
|
gem.email = ["marius@mlpinit.com"]
|
11
|
-
gem.description = %q{
|
12
|
-
gem.summary = %q{
|
11
|
+
gem.description = %q{ This gems allows you to navigate through the results }
|
12
|
+
gem.summary = %q{ Zipcode based scrapping for moviefone.com }
|
13
13
|
gem.homepage = ""
|
14
14
|
|
15
15
|
gem.files = `git ls-files`.split($/)
|
16
16
|
gem.executables = gem.files.grep(%r{^bin/}).map{ |f| File.basename(f) }
|
17
17
|
gem.test_files = gem.files.grep(%r{^(test|spec|features)/})
|
18
18
|
gem.require_paths = ["lib"]
|
19
|
-
|
20
|
-
gem.add_dependency
|
21
|
-
|
22
|
-
gem.
|
23
|
-
gem.add_development_dependency "rspec"
|
19
|
+
|
20
|
+
gem.add_dependency 'mechanize', '~> 2.7.3'
|
21
|
+
|
22
|
+
gem.add_development_dependency 'rspec', '3.0.0.beta2'
|
24
23
|
end
|
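The two dependency constraints added in this hunk behave differently, which can be checked with RubyGems' own `Gem::Requirement` and `Gem::Version` classes. This is a small illustration, not part of the gem; the candidate version numbers are chosen arbitrarily for the comparison.

```ruby
require 'rubygems'

# '~> 2.7.3' is the pessimistic operator: at least 2.7.3, but below 2.8.
mechanize = Gem::Requirement.new('~> 2.7.3')
puts mechanize.satisfied_by?(Gem::Version.new('2.7.4'))
puts mechanize.satisfied_by?(Gem::Version.new('2.8.0'))

# A bare version string is an exact ('=') requirement, so only the
# beta2 prerelease itself matches.
rspec = Gem::Requirement.new('3.0.0.beta2')
puts rspec.satisfied_by?(Gem::Version.new('3.0.0.beta2'))
puts rspec.satisfied_by?(Gem::Version.new('3.0.0'))
```

The exact pin on `rspec` matches the `'='` requirement that appears for it in the `metadata` section of this diff.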
data/lib/bootleg.rb
CHANGED
data/lib/bootleg/agent.rb
ADDED
@@ -0,0 +1,31 @@
|
|
1
|
+
module Bootleg
|
2
|
+
class Agent
|
3
|
+
|
4
|
+
attr_reader :zipcode
|
5
|
+
|
6
|
+
def initialize(args)
|
7
|
+
@zipcode = args.fetch(:zipcode)
|
8
|
+
end
|
9
|
+
|
10
|
+
def page
|
11
|
+
@page ||= Bootleg::Page.new page: mechanize.submit(search_form)
|
12
|
+
end
|
13
|
+
|
14
|
+
private
|
15
|
+
|
16
|
+
def mechanize
|
17
|
+
Mechanize.new
|
18
|
+
end
|
19
|
+
|
20
|
+
def home_page
|
21
|
+
mechanize.get('http://moviefone.com')
|
22
|
+
end
|
23
|
+
|
24
|
+
def search_form
|
25
|
+
home_page.form_with(id: 'frm-search').tap do |form|
|
26
|
+
form.fields.last.value = zipcode
|
27
|
+
end
|
28
|
+
end
|
29
|
+
|
30
|
+
end
|
31
|
+
end
|
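The constructor above relies on `Hash#fetch`, which raises `KeyError` when the key is missing. The same pattern is used by the other classes added in this release. `ZipcodeHolder` below is a hypothetical miniature written only to show that behaviour; it is not a class from the gem.

```ruby
# Hypothetical miniature of the constructor pattern: Hash#fetch raises
# KeyError when the required key is absent from the arguments hash.
class ZipcodeHolder
  attr_reader :zipcode

  def initialize(args)
    @zipcode = args.fetch(:zipcode)
  end
end

begin
  ZipcodeHolder.new(not_zipcode: 'hey')
rescue KeyError => e
  puts e.message
end
```

This is why the agent spec in this release expects `KeyError`, even though its description text mentions `ArgumentError`.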
data/lib/bootleg/movie.rb
ADDED
@@ -0,0 +1,30 @@
|
|
1
|
+
module Bootleg
|
2
|
+
class Movie
|
3
|
+
|
4
|
+
attr_reader :movie
|
5
|
+
private :movie
|
6
|
+
|
7
|
+
def initialize(args)
|
8
|
+
@movie ||= args.fetch(:movie)
|
9
|
+
end
|
10
|
+
|
11
|
+
def title
|
12
|
+
movie_info.text
|
13
|
+
end
|
14
|
+
|
15
|
+
def link
|
16
|
+
movie_info.attributes['href'].value
|
17
|
+
end
|
18
|
+
|
19
|
+
def showtimes
|
20
|
+
movie.search('span.stDisplay').map { |times| times.text }
|
21
|
+
end
|
22
|
+
|
23
|
+
private
|
24
|
+
|
25
|
+
def movie_info
|
26
|
+
@movie_info ||= movie.search('div.movietitle a').last
|
27
|
+
end
|
28
|
+
|
29
|
+
end
|
30
|
+
end
|
data/lib/bootleg/page.rb
ADDED
@@ -0,0 +1,26 @@
|
|
1
|
+
module Bootleg
|
2
|
+
class Page
|
3
|
+
|
4
|
+
attr_reader :page
|
5
|
+
|
6
|
+
def initialize(args)
|
7
|
+
@page ||= args.fetch(:page)
|
8
|
+
end
|
9
|
+
|
10
|
+
def next
|
11
|
+
link ? self.class.new(page: link.click) : 'Last Page'
|
12
|
+
end
|
13
|
+
|
14
|
+
def theaters
|
15
|
+
page.search('div.theater').
|
16
|
+
map { |theater| Bootleg::Theater.new(theater: theater) }
|
17
|
+
end
|
18
|
+
|
19
|
+
private
|
20
|
+
|
21
|
+
def link
|
22
|
+
@link ||= page.link_with(class: 'next-showtime')
|
23
|
+
end
|
24
|
+
|
25
|
+
end
|
26
|
+
end
|
data/lib/bootleg/theater.rb
ADDED
@@ -0,0 +1,54 @@
|
|
1
|
+
module Bootleg
|
2
|
+
class Theater
|
3
|
+
|
4
|
+
attr_reader :theater
|
5
|
+
private :theater
|
6
|
+
|
7
|
+
def initialize(args)
|
8
|
+
@theater ||= args.fetch(:theater)
|
9
|
+
end
|
10
|
+
|
11
|
+
def title
|
12
|
+
title_info.text
|
13
|
+
end
|
14
|
+
|
15
|
+
def link
|
16
|
+
title_info.attributes['href'].value
|
17
|
+
end
|
18
|
+
|
19
|
+
def address
|
20
|
+
# removes the phone number at the end of the address and the white space
|
21
|
+
theater.search('p.address').text.sub(/\|.*/,'').strip
|
22
|
+
end
|
23
|
+
|
24
|
+
def adult_price
|
25
|
+
prices.first
|
26
|
+
end
|
27
|
+
|
28
|
+
def child_price
|
29
|
+
prices.last
|
30
|
+
end
|
31
|
+
|
32
|
+
def movies
|
33
|
+
theater.search('div.movie-data-wrap').
|
34
|
+
map { |movie| Bootleg::Movie.new(movie: movie) }
|
35
|
+
end
|
36
|
+
|
37
|
+
private
|
38
|
+
|
39
|
+
def title_info
|
40
|
+
@title_link ||= theater.search('div.title a').last
|
41
|
+
end
|
42
|
+
|
43
|
+
def prices
|
44
|
+
@prices ||= theater.
|
45
|
+
search("div.prices").
|
46
|
+
first.
|
47
|
+
text.
|
48
|
+
gsub(/\s/,'').
|
49
|
+
split('|').
|
50
|
+
map { |price| $& if price.match /\$.*\d{2}/ }
|
51
|
+
end
|
52
|
+
|
53
|
+
end
|
54
|
+
end
|
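The string handling in `Theater#prices` and `Theater#address` can be tried on plain strings. The sketch below uses the sample inputs from `theater_spec.rb` rather than live Nokogiri nodes. `parse_prices` and `parse_address` are hypothetical helper names, and `String#[]` with a regex is swapped in for the original `$& if price.match(...)` idiom as an equivalent way to get the matched text or `nil`.

```ruby
# Plain-string sketch of the steps in Theater#prices and Theater#address.
# Sample inputs come from theater_spec.rb; no Nokogiri nodes are involved.

# Strip whitespace, split on '|', then keep the "$..." part of each segment.
# String#[] with a regex returns the matched text, or nil when nothing matches.
def parse_prices(text)
  segments = text.gsub(/\s/, '').split('|')
  segments.map { |segment| segment[/\$.*\d{2}/] }
end

# Drop everything from the first '|' onward (the phone number), then trim.
def parse_address(text)
  text.sub(/\|.*/, '').strip
end

prices = parse_prices('RegularPrice:$11.50 | ChildPrice:$6.50')
puts prices.first
puts prices.last
puts parse_address("\n\n\t Rockville, MD | 234-222")
```

Since `adult_price` and `child_price` simply take the first and last elements, a listing with a single price would presumably report the same value for both.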
data/lib/bootleg/version.rb
CHANGED
data/spec/lib/agent_spec.rb
ADDED
@@ -0,0 +1,11 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
require 'bootleg/agent'
|
3
|
+
|
4
|
+
describe Bootleg::Agent do
|
5
|
+
|
6
|
+
it 'raises ArgumentError if not initialized with a zipcode' do
|
7
|
+
expect{described_class.new(not_zipcode: 'hey')}.
|
8
|
+
to raise_error(KeyError, 'key not found: :zipcode')
|
9
|
+
end
|
10
|
+
|
11
|
+
end
|
data/spec/lib/movie_spec.rb
ADDED
@@ -0,0 +1,37 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
require 'bootleg/movie'
|
3
|
+
|
4
|
+
describe Bootleg::Movie do
|
5
|
+
|
6
|
+
it 'raises KeyError if not initialized with a nokogiri-movie' do
|
7
|
+
expect{described_class.new(not_movie: 'not movie')}.
|
8
|
+
to raise_error(KeyError, 'key not found: :movie')
|
9
|
+
end
|
10
|
+
|
11
|
+
let(:attributes) {}
|
12
|
+
let(:nokogiri_movie) { double 'Nokogiri Movie' }
|
13
|
+
let(:href) { double( 'Href', value: 'http://moviefone.com') }
|
14
|
+
let(:movie) { double('Movie', text: 'Matrix', attributes: { 'href' => href }) }
|
15
|
+
let(:times) { [ double(text: '1am'), double(text: '2pm')] }
|
16
|
+
|
17
|
+
subject { described_class.new(movie: nokogiri_movie) }
|
18
|
+
|
19
|
+
it 'has a title' do
|
20
|
+
allow(nokogiri_movie).to receive_message_chain(:search, :last).
|
21
|
+
and_return(movie)
|
22
|
+
expect(subject.title).to eq('Matrix')
|
23
|
+
end
|
24
|
+
|
25
|
+
it 'has a link' do
|
26
|
+
allow(nokogiri_movie).to receive_message_chain(:search, :last).
|
27
|
+
and_return(movie)
|
28
|
+
expect(subject.link).to eq('http://moviefone.com')
|
29
|
+
end
|
30
|
+
|
31
|
+
it 'has showtimes' do
|
32
|
+
allow(nokogiri_movie).to receive(:search).with('span.stDisplay').
|
33
|
+
and_return(times)
|
34
|
+
expect(subject.showtimes).to include('1am', '2pm')
|
35
|
+
end
|
36
|
+
|
37
|
+
end
|
data/spec/lib/page_spec.rb
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
require 'bootleg/page'
|
3
|
+
|
4
|
+
describe Bootleg::Page do
|
5
|
+
|
6
|
+
it 'raises KeyError if not initialized with a nokogiri-page' do
|
7
|
+
expect{described_class.new(not_page: 'not page')}.
|
8
|
+
to raise_error(KeyError, 'key not found: :page')
|
9
|
+
end
|
10
|
+
|
11
|
+
let(:link) { double 'Link' }
|
12
|
+
let(:nokogiri_page) { double 'Nokogiri Page' }
|
13
|
+
subject { described_class.new(page: nokogiri_page) }
|
14
|
+
|
15
|
+
it 'returns "Last Page" if no more pages' do
|
16
|
+
allow(nokogiri_page).to receive(:link_with).
|
17
|
+
with(class: 'next-showtime').and_return(nil)
|
18
|
+
expect(subject.next).to eq('Last Page')
|
19
|
+
end
|
20
|
+
|
21
|
+
end
|
data/spec/lib/theater_spec.rb
ADDED
@@ -0,0 +1,52 @@
|
|
1
|
+
require 'spec_helper'
|
2
|
+
require 'bootleg/theater'
|
3
|
+
|
4
|
+
describe Bootleg::Theater do
|
5
|
+
|
6
|
+
it 'raises KeyError if not initialized with a nokogiri-theater' do
|
7
|
+
expect{described_class.new(not_theater: 'not theater')}.
|
8
|
+
to raise_error(KeyError, 'key not found: :theater')
|
9
|
+
end
|
10
|
+
|
11
|
+
let(:nokogiri_theater) { double 'Nokogiri Theater' }
|
12
|
+
let(:theater) { double('Theater', text: 'Royal', attributes: { 'href' => href }) }
|
13
|
+
let(:href) { double( 'Href', value: 'http://moviefone.com') }
|
14
|
+
let(:address) { double('Address', text: "\n\n\t Rockville, MD | 234-222") }
|
15
|
+
let(:prices) { double('Prices', text: 'RegularPrice:$11.50 | ChildPrice:$6.50') }
|
16
|
+
|
17
|
+
subject { described_class.new(theater: nokogiri_theater) }
|
18
|
+
|
19
|
+
it 'has a title' do
|
20
|
+
allow(nokogiri_theater).to receive_message_chain(:search, :last).
|
21
|
+
and_return(theater)
|
22
|
+
expect(subject.title).to eq('Royal')
|
23
|
+
end
|
24
|
+
|
25
|
+
it 'has a link' do
|
26
|
+
allow(nokogiri_theater).to receive_message_chain(:search, :last).
|
27
|
+
and_return(theater)
|
28
|
+
expect(subject.link).to eq('http://moviefone.com')
|
29
|
+
end
|
30
|
+
|
31
|
+
it 'has an address' do
|
32
|
+
allow(nokogiri_theater).to receive(:search).with('p.address').
|
33
|
+
and_return(address)
|
34
|
+
expect(subject.address).to eq("Rockville, MD")
|
35
|
+
end
|
36
|
+
|
37
|
+
context 'has price' do
|
38
|
+
before do
|
39
|
+
allow(nokogiri_theater).to receive_message_chain(:search, :first).
|
40
|
+
and_return(prices)
|
41
|
+
end
|
42
|
+
|
43
|
+
it 'for adult' do
|
44
|
+
expect(subject.adult_price).to eq('$11.50')
|
45
|
+
end
|
46
|
+
|
47
|
+
it 'for child' do
|
48
|
+
expect(subject.child_price).to eq('$6.50')
|
49
|
+
end
|
50
|
+
end
|
51
|
+
|
52
|
+
end
|
metadata
CHANGED
@@ -1,148 +1,96 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: bootleg
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 0.0.
|
5
|
-
prerelease:
|
4
|
+
version: 0.0.7
|
6
5
|
platform: ruby
|
7
6
|
authors:
|
8
7
|
- Marius L. Pop
|
9
8
|
autorequire:
|
10
9
|
bindir: bin
|
11
10
|
cert_chain: []
|
12
|
-
date:
|
11
|
+
date: 2016-05-25 00:00:00.000000000 Z
|
13
12
|
dependencies:
|
14
|
-
- !ruby/object:Gem::Dependency
|
15
|
-
name: nokogiri
|
16
|
-
requirement: !ruby/object:Gem::Requirement
|
17
|
-
none: false
|
18
|
-
requirements:
|
19
|
-
- - ! '>='
|
20
|
-
- !ruby/object:Gem::Version
|
21
|
-
version: '0'
|
22
|
-
type: :runtime
|
23
|
-
prerelease: false
|
24
|
-
version_requirements: !ruby/object:Gem::Requirement
|
25
|
-
none: false
|
26
|
-
requirements:
|
27
|
-
- - ! '>='
|
28
|
-
- !ruby/object:Gem::Version
|
29
|
-
version: '0'
|
30
13
|
- !ruby/object:Gem::Dependency
|
31
14
|
name: mechanize
|
32
15
|
requirement: !ruby/object:Gem::Requirement
|
33
|
-
none: false
|
34
|
-
requirements:
|
35
|
-
- - ! '>='
|
36
|
-
- !ruby/object:Gem::Version
|
37
|
-
version: '0'
|
38
|
-
type: :runtime
|
39
|
-
prerelease: false
|
40
|
-
version_requirements: !ruby/object:Gem::Requirement
|
41
|
-
none: false
|
42
|
-
requirements:
|
43
|
-
- - ! '>='
|
44
|
-
- !ruby/object:Gem::Version
|
45
|
-
version: '0'
|
46
|
-
- !ruby/object:Gem::Dependency
|
47
|
-
name: activerecord
|
48
|
-
requirement: !ruby/object:Gem::Requirement
|
49
|
-
none: false
|
50
16
|
requirements:
|
51
|
-
- -
|
17
|
+
- - "~>"
|
52
18
|
- !ruby/object:Gem::Version
|
53
|
-
version:
|
19
|
+
version: 2.7.3
|
54
20
|
type: :runtime
|
55
21
|
prerelease: false
|
56
22
|
version_requirements: !ruby/object:Gem::Requirement
|
57
|
-
none: false
|
58
23
|
requirements:
|
59
|
-
- -
|
24
|
+
- - "~>"
|
60
25
|
- !ruby/object:Gem::Version
|
61
|
-
version:
|
26
|
+
version: 2.7.3
|
62
27
|
- !ruby/object:Gem::Dependency
|
63
28
|
name: rspec
|
64
29
|
requirement: !ruby/object:Gem::Requirement
|
65
|
-
none: false
|
66
30
|
requirements:
|
67
|
-
- -
|
31
|
+
- - '='
|
68
32
|
- !ruby/object:Gem::Version
|
69
|
-
version:
|
33
|
+
version: 3.0.0.beta2
|
70
34
|
type: :development
|
71
35
|
prerelease: false
|
72
36
|
version_requirements: !ruby/object:Gem::Requirement
|
73
|
-
none: false
|
74
37
|
requirements:
|
75
|
-
- -
|
38
|
+
- - '='
|
76
39
|
- !ruby/object:Gem::Version
|
77
|
-
version:
|
78
|
-
description:
|
79
|
-
movifone.com'
|
40
|
+
version: 3.0.0.beta2
|
41
|
+
description: " This gems allows you to navigate through the results "
|
80
42
|
email:
|
81
43
|
- marius@mlpinit.com
|
82
44
|
executables: []
|
83
45
|
extensions: []
|
84
46
|
extra_rdoc_files: []
|
85
47
|
files:
|
86
|
-
- .gitignore
|
48
|
+
- ".gitignore"
|
87
49
|
- Gemfile
|
88
50
|
- LICENSE.txt
|
89
51
|
- README.md
|
90
|
-
- README.rdoc
|
91
52
|
- Rakefile
|
92
53
|
- bootleg.gemspec
|
93
54
|
- lib/bootleg.rb
|
55
|
+
- lib/bootleg/agent.rb
|
56
|
+
- lib/bootleg/movie.rb
|
57
|
+
- lib/bootleg/page.rb
|
58
|
+
- lib/bootleg/theater.rb
|
94
59
|
- lib/bootleg/version.rb
|
95
|
-
- lib/extractor.rb
|
96
|
-
- lib/finder.rb
|
97
|
-
- lib/generators/bootleg/USAGE
|
98
|
-
- lib/generators/bootleg/install_generator.rb
|
99
|
-
- lib/generators/bootleg/movie_generator.rb
|
100
|
-
- lib/generators/bootleg/showtime_generator.rb
|
101
|
-
- lib/generators/bootleg/templates/movie_migration.rb
|
102
|
-
- lib/generators/bootleg/templates/movie_model.rb
|
103
|
-
- lib/generators/bootleg/templates/showtime_migration.rb
|
104
|
-
- lib/generators/bootleg/templates/showtime_model.rb
|
105
|
-
- lib/generators/bootleg/templates/theater_migration.rb
|
106
|
-
- lib/generators/bootleg/templates/theater_model.rb
|
107
|
-
- lib/generators/bootleg/theater_generator.rb
|
108
|
-
- lib/manager.rb
|
109
|
-
- lib/modules/href.rb
|
110
|
-
- lib/modules/movie.rb
|
111
|
-
- lib/modules/theater.rb
|
112
|
-
- lib/modules/zipcode.rb
|
113
|
-
- lib/presenter.rb
|
114
60
|
- spec/.rspec
|
115
|
-
- spec/
|
116
|
-
- spec/
|
117
|
-
- spec/
|
61
|
+
- spec/lib/agent_spec.rb
|
62
|
+
- spec/lib/movie_spec.rb
|
63
|
+
- spec/lib/page_spec.rb
|
64
|
+
- spec/lib/theater_spec.rb
|
118
65
|
- spec/spec_helper.rb
|
119
66
|
homepage: ''
|
120
67
|
licenses: []
|
68
|
+
metadata: {}
|
121
69
|
post_install_message:
|
122
70
|
rdoc_options: []
|
123
71
|
require_paths:
|
124
72
|
- lib
|
125
73
|
required_ruby_version: !ruby/object:Gem::Requirement
|
126
|
-
none: false
|
127
74
|
requirements:
|
128
|
-
- -
|
75
|
+
- - ">="
|
129
76
|
- !ruby/object:Gem::Version
|
130
77
|
version: '0'
|
131
78
|
required_rubygems_version: !ruby/object:Gem::Requirement
|
132
|
-
none: false
|
133
79
|
requirements:
|
134
|
-
- -
|
80
|
+
- - ">="
|
135
81
|
- !ruby/object:Gem::Version
|
136
82
|
version: '0'
|
137
83
|
requirements: []
|
138
84
|
rubyforge_project:
|
139
|
-
rubygems_version:
|
85
|
+
rubygems_version: 2.5.1
|
140
86
|
signing_key:
|
141
|
-
specification_version:
|
142
|
-
summary:
|
87
|
+
specification_version: 4
|
88
|
+
summary: Zipcode based scrapping for moviefone.com
|
143
89
|
test_files:
|
144
90
|
- spec/.rspec
|
145
|
-
- spec/
|
146
|
-
- spec/
|
147
|
-
- spec/
|
91
|
+
- spec/lib/agent_spec.rb
|
92
|
+
- spec/lib/movie_spec.rb
|
93
|
+
- spec/lib/page_spec.rb
|
94
|
+
- spec/lib/theater_spec.rb
|
148
95
|
- spec/spec_helper.rb
|
96
|
+
has_rdoc:
|
data/README.rdoc
DELETED
data/lib/extractor.rb
DELETED
@@ -1,32 +0,0 @@
|
|
1
|
-
require 'mechanize'
|
2
|
-
require 'nokogiri'
|
3
|
-
require 'open-uri'
|
4
|
-
require_relative 'finder'
|
5
|
-
require_relative 'modules/theater'
|
6
|
-
|
7
|
-
class Extractor
|
8
|
-
|
9
|
-
attr_reader :page_theaters
|
10
|
-
|
11
|
-
def initialize(page, zipcode)
|
12
|
-
@page = (Nokogiri::HTML(open(page)))
|
13
|
-
@page_theaters = []
|
14
|
-
extract_movies
|
15
|
-
@zipcode ||= zipcode
|
16
|
-
end
|
17
|
-
|
18
|
-
def extract_movies
|
19
|
-
theaters.each do |theater|
|
20
|
-
theater.extend Theater
|
21
|
-
BootlegTheater.create!(name: theater.name, href: theater.link)
|
22
|
-
theater_info = { name: theater.name, href: theater.link, movies: theater.movies}
|
23
|
-
@page_theaters << theater_info
|
24
|
-
end
|
25
|
-
end
|
26
|
-
|
27
|
-
private
|
28
|
-
|
29
|
-
def theaters
|
30
|
-
@page.css('div.theater')
|
31
|
-
end
|
32
|
-
end
|
data/lib/finder.rb
DELETED
data/lib/generators/bootleg/USAGE
DELETED
File without changes
|
data/lib/generators/bootleg/install_generator.rb
DELETED
@@ -1,15 +0,0 @@
|
|
1
|
-
require 'rails/generators/migration'
|
2
|
-
|
3
|
-
module Bootleg
|
4
|
-
module Generators
|
5
|
-
class InstallGenerator < Rails::Generators::Base
|
6
|
-
source_root File.expand_path('../templates', __FILE__)
|
7
|
-
|
8
|
-
def run_generators
|
9
|
-
generate "bootleg:movie"
|
10
|
-
generate "bootleg:theater"
|
11
|
-
generate "bootleg:showtime"
|
12
|
-
end
|
13
|
-
end
|
14
|
-
end
|
15
|
-
end
|
data/lib/generators/bootleg/movie_generator.rb
DELETED
@@ -1,24 +0,0 @@
|
|
1
|
-
require 'rails/generators/migration'
|
2
|
-
|
3
|
-
module Bootleg
|
4
|
-
module Generators
|
5
|
-
class MovieGenerator < Rails::Generators::Base
|
6
|
-
include Rails::Generators::Migration
|
7
|
-
|
8
|
-
source_root File.expand_path('../templates', __FILE__)
|
9
|
-
|
10
|
-
def generate_movie_migration
|
11
|
-
migration_template "movie_migration.rb", "db/migrate/create_bootleg_movies.rb"
|
12
|
-
end
|
13
|
-
|
14
|
-
def generate_movie_model
|
15
|
-
copy_file "movie_model.rb", "app/models/bootleg_movie.rb"
|
16
|
-
end
|
17
|
-
|
18
|
-
def self.next_migration_number(path)
|
19
|
-
@migration_number = Time.now.utc.strftime("%Y%m%d%H%M%S").to_i.to_s
|
20
|
-
end
|
21
|
-
end
|
22
|
-
end
|
23
|
-
end
|
24
|
-
|
data/lib/generators/bootleg/showtime_generator.rb
DELETED
@@ -1,24 +0,0 @@
|
|
1
|
-
require 'rails/generators/migration'
|
2
|
-
|
3
|
-
module Bootleg
|
4
|
-
module Generators
|
5
|
-
class ShowtimeGenerator < Rails::Generators::Base
|
6
|
-
include Rails::Generators::Migration
|
7
|
-
|
8
|
-
source_root File.expand_path('../templates', __FILE__)
|
9
|
-
|
10
|
-
def generate_theater_movie_migration
|
11
|
-
migration_template "showtime_migration.rb", "db/migrate/create_bootleg_showtimes.rb"
|
12
|
-
end
|
13
|
-
|
14
|
-
def generate_theater_movie_model
|
15
|
-
copy_file "showtime_model.rb", "app/models/bootleg_showtime.rb"
|
16
|
-
end
|
17
|
-
|
18
|
-
def self.next_migration_number(path)
|
19
|
-
@migration_number = Time.now.utc.strftime("%Y%m%d%H%M%S").to_i.to_s
|
20
|
-
end
|
21
|
-
end
|
22
|
-
end
|
23
|
-
end
|
24
|
-
|
data/lib/generators/bootleg/templates/showtime_model.rb
DELETED
@@ -1,14 +0,0 @@
|
|
1
|
-
class BootlegShowtime < ActiveRecord::Base
|
2
|
-
attr_accessible :bootleg_movie_id, :bootleg_theater_id, :zipcode, :showtimes
|
3
|
-
|
4
|
-
belongs_to :bootleg_movie
|
5
|
-
belongs_to :bootleg_theater
|
6
|
-
|
7
|
-
def theater
|
8
|
-
bootleg_theater
|
9
|
-
end
|
10
|
-
|
11
|
-
def movie
|
12
|
-
bootleg_movie
|
13
|
-
end
|
14
|
-
end
|
data/lib/generators/bootleg/theater_generator.rb
DELETED
@@ -1,24 +0,0 @@
|
|
1
|
-
require 'rails/generators/migration'
|
2
|
-
|
3
|
-
module Bootleg
|
4
|
-
module Generators
|
5
|
-
class TheaterGenerator < Rails::Generators::Base
|
6
|
-
include Rails::Generators::Migration
|
7
|
-
|
8
|
-
source_root File.expand_path('../templates', __FILE__)
|
9
|
-
|
10
|
-
def generate_theater_migration
|
11
|
-
migration_template "theater_migration.rb", "db/migrate/create_bootleg_theaters.rb"
|
12
|
-
end
|
13
|
-
|
14
|
-
def generate_theater_model
|
15
|
-
copy_file "theater_model.rb", "app/models/bootleg_theater.rb"
|
16
|
-
end
|
17
|
-
|
18
|
-
def self.next_migration_number(path)
|
19
|
-
@migration_number = Time.now.utc.strftime("%Y%m%d%H%M%S").to_i.to_s
|
20
|
-
end
|
21
|
-
end
|
22
|
-
end
|
23
|
-
end
|
24
|
-
|
data/lib/manager.rb
DELETED
@@ -1,27 +0,0 @@
|
|
1
|
-
require_relative 'finder'
|
2
|
-
require_relative 'extractor'
|
3
|
-
|
4
|
-
class Manager
|
5
|
-
|
6
|
-
class << self
|
7
|
-
attr_accessor :zipcode
|
8
|
-
end
|
9
|
-
|
10
|
-
def initialize(zipcode)
|
11
|
-
@zipcode = zipcode
|
12
|
-
@pages ||= find_pages
|
13
|
-
@all_theaters = []
|
14
|
-
Manager.zipcode = zipcode
|
15
|
-
end
|
16
|
-
|
17
|
-
def find_pages
|
18
|
-
Finder.new(@zipcode).hrefs
|
19
|
-
end
|
20
|
-
|
21
|
-
def extract_theaters
|
22
|
-
@pages.each do |page|
|
23
|
-
@all_theaters << Extractor.new(page, @zipcode).page_theaters
|
24
|
-
end
|
25
|
-
@all_theaters.flatten
|
26
|
-
end
|
27
|
-
end
|
data/lib/modules/href.rb
DELETED
@@ -1,18 +0,0 @@
|
|
1
|
-
module Href
|
2
|
-
def all
|
3
|
-
pages = []
|
4
|
-
count.times { |nr| pages << url + nr.to_s }
|
5
|
-
pages
|
6
|
-
end
|
7
|
-
|
8
|
-
private
|
9
|
-
|
10
|
-
def count
|
11
|
-
self.links.select { |link| link.text.size < 3 and link.text =~ /\d/ }.last.text.to_i
|
12
|
-
end
|
13
|
-
|
14
|
-
def url
|
15
|
-
self.uri.to_s + '?page='
|
16
|
-
end
|
17
|
-
end
|
18
|
-
|
data/lib/modules/movie.rb
DELETED
@@ -1,24 +0,0 @@
|
|
1
|
-
module Movie
|
2
|
-
def name
|
3
|
-
details.css('a').text.strip
|
4
|
-
end
|
5
|
-
|
6
|
-
def link
|
7
|
-
"http://www.moviefone.com" + details.css('a').attribute('href').value
|
8
|
-
end
|
9
|
-
|
10
|
-
def showtimes
|
11
|
-
values = []
|
12
|
-
showtimes = self.css('a.gt').empty? ? self.css('span.stDisplay') : self.css('a.gt')
|
13
|
-
showtimes.each do |time|
|
14
|
-
values << time.text
|
15
|
-
end
|
16
|
-
values
|
17
|
-
end
|
18
|
-
|
19
|
-
private
|
20
|
-
|
21
|
-
def details
|
22
|
-
self.css('div.movietitle')
|
23
|
-
end
|
24
|
-
end
|
data/lib/modules/theater.rb
DELETED
@@ -1,45 +0,0 @@
|
|
1
|
-
require_relative 'movie'
|
2
|
-
|
3
|
-
module Theater
|
4
|
-
def name
|
5
|
-
details.text.strip
|
6
|
-
end
|
7
|
-
|
8
|
-
def link
|
9
|
-
details.attribute('href').value
|
10
|
-
end
|
11
|
-
|
12
|
-
def movies
|
13
|
-
movies = self.css('div.movie-listing.first')
|
14
|
-
values = []
|
15
|
-
theater = BootlegTheater.last
|
16
|
-
movies.each do |movie|
|
17
|
-
movie.extend Movie
|
18
|
-
movie_info = { name: movie.name, href: movie.link, showtimes: movie.showtimes }
|
19
|
-
values << movie_info
|
20
|
-
insert_movies(theater,movie)
|
21
|
-
end
|
22
|
-
values
|
23
|
-
end
|
24
|
-
|
25
|
-
private
|
26
|
-
def details
|
27
|
-
self.css('h3.title').css('a')
|
28
|
-
end
|
29
|
-
|
30
|
-
def insert_movies(theater, movie)
|
31
|
-
existing_movie = BootlegMovie.where(name: movie.name).first
|
32
|
-
if existing_movie
|
33
|
-
showtime = theater.bootleg_showtimes.new
|
34
|
-
showtime.bootleg_movie_id = existing_movie.id
|
35
|
-
showtime.save
|
36
|
-
else
|
37
|
-
theater.movies.create!(name: movie.name, href: movie.link)
|
38
|
-
end
|
39
|
-
showtime = BootlegShowtime.last
|
40
|
-
showtime.showtimes = movie.showtimes.to_s.gsub(/-/, '').gsub(/\n/,'').strip
|
41
|
-
showtime.zipcode = Manager.zipcode
|
42
|
-
showtime.date = Time.zone.now
|
43
|
-
showtime.save
|
44
|
-
end
|
45
|
-
end
|
data/lib/modules/zipcode.rb
DELETED
data/lib/presenter.rb
DELETED
data/spec/extractor_spec.rb
DELETED
@@ -1,59 +0,0 @@
|
|
1
|
-
require 'spec_helper'
|
2
|
-
|
3
|
-
describe Extractor do
|
4
|
-
it "should raise an error withouth arguments" do
|
5
|
-
expect{ Extractor.new }.to raise_error(ArgumentError)
|
6
|
-
end
|
7
|
-
|
8
|
-
it "should not raise error with one argument" do
|
9
|
-
expect { Extractor.new("http://www.moviefone.com")}.to_not raise_error(ArgumentError)
|
10
|
-
end
|
11
|
-
|
12
|
-
it "should raise an error with more then one argument" do
|
13
|
-
expect { Extractor.new("arg1", "arg2") }.to raise_error(ArgumentError)
|
14
|
-
end
|
15
|
-
|
16
|
-
before :all do
|
17
|
-
@theaters = Extractor.new("http://www.moviefone.com/showtimes/manchester-md/21102/theaters").page_theaters
|
18
|
-
end
|
19
|
-
|
20
|
-
it "should pull out no more then 5 theaters" do
|
21
|
-
@theaters.size.should eq(5)
|
22
|
-
end
|
23
|
-
|
24
|
-
describe Theater do
|
25
|
-
before :all do
|
26
|
-
@theater = @theaters[1]
|
27
|
-
end
|
28
|
-
|
29
|
-
it "name should match expression" do
|
30
|
-
expect(@theater[:name]).to match(/(\w|\s)/)
|
31
|
-
end
|
32
|
-
|
33
|
-
it "href should be a link" do
|
34
|
-
expect(@theater[:href]).to match(/http:\/\/www\.moviefone\.com/)
|
35
|
-
end
|
36
|
-
|
37
|
-
describe Movie do
|
38
|
-
|
39
|
-
before :all do
|
40
|
-
@movie = @theater[:movies].first
|
41
|
-
end
|
42
|
-
it "should have a name, href and showtimes" do
|
43
|
-
expect(@movie.size).to eq(3)
|
44
|
-
end
|
45
|
-
|
46
|
-
it "name should match expression" do
|
47
|
-
expect(@movie[:name]).to match(/(\w|\s)/)
|
48
|
-
end
|
49
|
-
|
50
|
-
it "href shoud mathc expression" do
|
51
|
-
expect(@movie[:href]).to match(/http:\/\/www\.moviefone\.com/)
|
52
|
-
end
|
53
|
-
|
54
|
-
it "shotimes returns an array" do
|
55
|
-
expect(@movie[:showtimes].class).to be(Array)
|
56
|
-
end
|
57
|
-
end
|
58
|
-
end
|
59
|
-
end
|
data/spec/finder_spec.rb
DELETED
@@ -1,26 +0,0 @@
|
|
1
|
-
require 'spec_helper'
|
2
|
-
|
3
|
-
describe Finder do
|
4
|
-
it "should raise an error with no arguments" do
|
5
|
-
expect { Finder.new }.to raise_error(ArgumentError)
|
6
|
-
end
|
7
|
-
|
8
|
-
it "should not raise error with one argument" do
|
9
|
-
expect { Finder.new("smth") }.to_not raise_error(ArgumentError)
|
10
|
-
end
|
11
|
-
|
12
|
-
it "should raise an error with more then one arguments" do
|
13
|
-
expect { Finder.new("smth", "smthelse") }.to raise_error(ArgumentError)
|
14
|
-
end
|
15
|
-
|
16
|
-
|
17
|
-
describe Href do
|
18
|
-
before :all do
|
19
|
-
@hrefs = Finder.new('21102').hrefs
|
20
|
-
end
|
21
|
-
|
22
|
-
it "should have a size of 3" do
|
23
|
-
expect(@hrefs.size).to eq(3)
|
24
|
-
end
|
25
|
-
end
|
26
|
-
end
|
data/spec/presenter_spec.rb
DELETED