insta_scrape 0.0.1 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
- metadata.gz: 940483c4764250e854209f404cdd6bd8cfdda1fd
- data.tar.gz: 383a98b071cbc9b4b5873261e758d5d5217ad73f
+ metadata.gz: 34460e7bb43c001ca79d3c2c3e4cd058d87cb407
+ data.tar.gz: 69e80191442e15dd7eaa6b3b23b06fd6c9874615
  SHA512:
- metadata.gz: 280ccc5a73ccff07e08bf8ce2acc625c3168aabbf62d6c501920fc7f66898bbecee34b7669f9d031668591ff07003adfdc71e5667bb30b72aa5253b1cad2746f
- data.tar.gz: 86579d7c6a0ab587cb7a672448ed988cb194bdfc0baea146e00eb1ca28ec7528f26932f0c71b5e50c2e1c9ec51af25a19680a44ba7b95790d85b4f0a6d2ad78d
+ metadata.gz: 58f25720adaa3fbf538f6aedd3c3dc2ad5d16975afb9b8287934ea08fac3006b9b22cd7dab0ac36d8bd820c2cfad3914714d09703559dd61d84f8c2c6cc527dd
+ data.tar.gz: ec7b603c8425cac9d446ad8dc8d6bada6acda4aa275cbcf85a526f9d0484aa167c8170c2de329d0ec83f8702cd4ddb996ee02e21fbaed9a6db7b7ca80712589b
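The checksums hunk records SHA1 and SHA512 digests for the two archives packed inside the `.gem` file, `metadata.gz` and `data.tar.gz`. A minimal sketch of how such digests could be recomputed with Ruby's standard `digest` library follows. The helper name `gem_checksums` is hypothetical, and the demo hashes a throwaway temp file rather than a real archive extracted from the gem:

```ruby
require "digest"
require "tempfile"

# Hypothetical helper: compute the two digests that checksums.yaml
# lists for a given file.
def gem_checksums(path)
  {
    "SHA1"   => Digest::SHA1.file(path).hexdigest,
    "SHA512" => Digest::SHA512.file(path).hexdigest
  }
end

# Demo on a temporary file; in practice the path would point at a
# metadata.gz or data.tar.gz extracted from the .gem archive.
demo = Tempfile.new("checksum_demo")
demo.write("abc")
demo.flush
sums = gem_checksums(demo.path)
demo.close! # close and delete the temp file

sums.each { |algorithm, hex| puts "#{algorithm}: #{hex}" }
```

Comparing the printed values with the `+` lines of the hunk would be one way to confirm that a downloaded archive matches the published release.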
data/README.md CHANGED
@@ -1,16 +1,24 @@
+ [![Build Status](https://travis-ci.org/dannyvassallo/insta_scrape.svg?branch=master)](https://travis-ci.org/dannyvassallo/insta_scrape)[![Gem Version](https://badge.fury.io/rb/insta_scrape.svg)](https://badge.fury.io/rb/insta_scrape)
+ ![alt text](https://s3-us-west-2.amazonaws.com/instascrape/instascrapelogo.png "logo")
  # InstaScrape

  A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly.
  This gem is dependent on Capybara, PhantomJS, and Poltergeist.

+ ## Note
+
+ The number of results may vary as this isn't an official endpoint.
+
+ ## Todo
+
+ * Pagination
+ * Assess infinite scroll
+
  ## Installation

  Add this line to your application's Gemfile:

  ```ruby
- gem 'poltergeist'
- gem 'phantomjs', :require => 'phantomjs/poltergeist'
- gem 'capybara'
  gem 'insta_scrape'
  ```

@@ -29,21 +37,49 @@ The scrape maps the response objects to an array. The objects currently have 2 a
  The simplest use is the following case:

  ```ruby
- scraper = InstaScrape.new
  #InstaScrape takes one argument. In this case its the #test hashtag.
- scrape_result = scraper.hashtag("test")
+ @insta_scrape = InstaScrape.new
+ scrape_result = @insta_scrape.hashtag("test")
  scrape_result.each do |post|
  puts post["image"]
  puts post["link"]
  end
  ```

+ Here is a `.erb` example using MaterializeCSS to render the posts as cards:
+ ```ruby
+
+ #in your controller or helper assuming you aren't storing the posts
+ @insta_scrape = InstaScrape.new
+ @posts = @insta_scrape.hashtag("test")
+
+ # your .erb file
+ <% @posts.each do |post| %>
+ <div class="col s12 m6 l4">
+ <div class="card hoverable">
+ <div class="card-image"><a href="<%= post['link'] %>"><img src="<%= post['image'] %>"></a></div>
+ <div class="card-content">
+ <!-- <p></p> -->
+ </div>
+ <div class="card-action center-align"><a class="btn black" href="<%= post['link'] %>">Open Post</a></div>
+ </div>
+ </div>
+ <% end %>
+ ```
+
  ## Development

  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.

  To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).

+ ## Deployment / Build
+
+ ```
+ gem build insta_scrape.gemspec
+ gem push insta_scrape-v.v.v.gem
+ ```
+
  ## Contributing

  Bug reports and pull requests are welcome on GitHub at https://github.com/dannyvassallo/insta_scrape. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
data/insta_scrape.gemspec CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
  spec.email = ["danielvassallo87@gmail.com"]

  spec.summary = %q{Use Instagram Hashtag Embeds in 2016}
- spec.description = %q{Use hashtag embeds in 2016 because Instagram's API changes are unethical}
+ spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.}
  spec.homepage = "https://github.com/dannyvassallo/insta_scrape"
  spec.license = "MIT"

@@ -22,7 +22,12 @@ Gem::Specification.new do |spec|
  spec.add_development_dependency "bundler", "~> 1.11"
  spec.add_development_dependency "rake", "~> 10.0"
  spec.add_development_dependency "rspec", "~> 3.0"
- spec.add_development_dependency "capybara"
- spec.add_development_dependency "phantomjs"
- spec.add_development_dependency "poltergeist"
+ spec.add_development_dependency "capybara", "~> 2.7.1"
+ spec.add_development_dependency "phantomjs", "~> 2.1.1.0"
+ spec.add_development_dependency "poltergeist", "~> 1.9.0"
+
+ spec.add_runtime_dependency "capybara", ">= 2.7.1"
+ spec.add_runtime_dependency "phantomjs", ">= 2.1.1.0"
+ spec.add_runtime_dependency "poltergeist", ">= 1.9.0"
+
  end
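The gemspec hunk above gives the development dependencies a pessimistic constraint (`~>`) and declares the same three gems as runtime dependencies with an open-ended `>=`. A minimal sketch of how the two constraint styles differ, using RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The constraint strings are the capybara ones from the diff; the candidate versions are hypothetical and chosen only to show the boundaries:

```ruby
require "rubygems"

# Constraint strings copied from the capybara entries in the gemspec diff.
pessimistic = Gem::Requirement.new("~> 2.7.1") # same as >= 2.7.1 and < 2.8
open_ended  = Gem::Requirement.new(">= 2.7.1") # any version from 2.7.1 upward

# Hypothetical candidate versions, not taken from the diff.
candidates = %w[2.7.0 2.7.1 2.7.9 2.8.0 3.0.0]

# For each candidate, record whether each constraint accepts it.
results = candidates.map do |v|
  version = Gem::Version.new(v)
  [v, pessimistic.satisfied_by?(version), open_ended.satisfied_by?(version)]
end

results.each do |v, tilde_ok, gte_ok|
  puts format("%-6s  ~> 2.7.1: %-5s  >= 2.7.1: %s", v, tilde_ok, gte_ok)
end
```

Under these constraints a consumer of the gem could resolve to a much newer capybara at runtime than the one the gem's own development setup allows.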
data/lib/insta_scrape.rb CHANGED
@@ -16,6 +16,27 @@ class InstaScrape
    def hashtag(hashtag)
      visit "https://www.instagram.com/explore/tags/#{hashtag}/"
      @posts = []
+
+     begin
+       page.find('a', :text => "Load more", exact: true).click
+       max_iteration = 10
+       iteration = 0
+       while iteration < max_iteration do
+         iteration += 1
+         5.times { page.execute_script "window.scrollBy(0,10000)" }
+         sleep 0.2
+       end
+       iterate_through_posts
+     rescue Capybara::ElementNotFound => e
+       begin
+         iterate_through_posts
+       end
+     end
+   end
+
+   private
+
+   def iterate_through_posts
      all("article div div div a").each do |post|

        link = post["href"]
@@ -29,6 +50,12 @@ class InstaScrape
        @posts << info

      end
+
+     #log
+     puts "POST COUNT: #{@posts.length}"
+
+     #return result
      return @posts
    end
+
  end
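The new `hashtag` body clicks "Load more", then scrolls the page a fixed number of times before collecting posts, and falls back to collecting whatever is already rendered when the link is missing. The bounded scroll loop could be factored into a helper along these lines. This is a sketch only: `scroll_to_load` and `FakePage` are hypothetical names that do not exist in the gem, and `FakePage` merely stands in for a Capybara session so the example runs without a browser:

```ruby
# Hypothetical stand-in for a Capybara session: it only records the
# scripts it is asked to execute.
class FakePage
  attr_reader :scripts

  def initialize
    @scripts = []
  end

  def execute_script(js)
    @scripts << js
  end
end

# Scroll a bounded number of times so lazily loaded posts get a chance to
# render. Defaults mirror the values in the diff: 10 rounds of 5 scrolls,
# with a 0.2 second pause after each round.
def scroll_to_load(page, rounds: 10, scrolls_per_round: 5, pause: 0.2)
  rounds.times do
    scrolls_per_round.times { page.execute_script("window.scrollBy(0,10000)") }
    sleep pause
  end
end

page = FakePage.new
scroll_to_load(page, pause: 0) # pause: 0 keeps the demo instant
puts "scroll calls issued: #{page.scripts.length}"
```

Keeping the round count and pause as keyword arguments would make the fixed wait easier to tune, which matters here because the number of posts that load in that window is not guaranteed.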
@@ -1,3 +1,3 @@
  class InstaScrape
- VERSION = "0.0.1"
+ VERSION = "0.1.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: insta_scrape
  version: !ruby/object:Gem::Version
- version: 0.0.1
+ version: 0.1.0
  platform: ruby
  authors:
  - dannyvassallo
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2016-06-10 00:00:00.000000000 Z
+ date: 2016-06-11 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
  name: bundler
@@ -56,45 +56,88 @@ dependencies:
  name: capybara
  requirement: !ruby/object:Gem::Requirement
  requirements:
- - - ">="
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 2.7.1
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 2.7.1
+ - !ruby/object:Gem::Dependency
+ name: phantomjs
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
  - !ruby/object:Gem::Version
- version: '0'
+ version: 2.1.1.0
  type: :development
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 2.1.1.0
+ - !ruby/object:Gem::Dependency
+ name: poltergeist
+ requirement: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 1.9.0
+ type: :development
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - "~>"
+ - !ruby/object:Gem::Version
+ version: 1.9.0
+ - !ruby/object:Gem::Dependency
+ name: capybara
+ requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '0'
+ version: 2.7.1
+ type: :runtime
+ prerelease: false
+ version_requirements: !ruby/object:Gem::Requirement
+ requirements:
+ - - ">="
+ - !ruby/object:Gem::Version
+ version: 2.7.1
  - !ruby/object:Gem::Dependency
  name: phantomjs
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '0'
- type: :development
+ version: 2.1.1.0
+ type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '0'
+ version: 2.1.1.0
  - !ruby/object:Gem::Dependency
  name: poltergeist
  requirement: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '0'
- type: :development
+ version: 1.9.0
+ type: :runtime
  prerelease: false
  version_requirements: !ruby/object:Gem::Requirement
  requirements:
  - - ">="
  - !ruby/object:Gem::Version
- version: '0'
- description: Use hashtag embeds in 2016 because Instagram's API changes are unethical
+ version: 1.9.0
+ description: A ruby scraper for instagram in 2016. Because the hashtag deprecation
+ in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.
  email:
  - danielvassallo87@gmail.com
  executables: []