insta_scrape 0.0.1 → 0.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
1
1
  ---
2
2
  SHA1:
3
- metadata.gz: 940483c4764250e854209f404cdd6bd8cfdda1fd
4
- data.tar.gz: 383a98b071cbc9b4b5873261e758d5d5217ad73f
3
+ metadata.gz: 34460e7bb43c001ca79d3c2c3e4cd058d87cb407
4
+ data.tar.gz: 69e80191442e15dd7eaa6b3b23b06fd6c9874615
5
5
  SHA512:
6
- metadata.gz: 280ccc5a73ccff07e08bf8ce2acc625c3168aabbf62d6c501920fc7f66898bbecee34b7669f9d031668591ff07003adfdc71e5667bb30b72aa5253b1cad2746f
7
- data.tar.gz: 86579d7c6a0ab587cb7a672448ed988cb194bdfc0baea146e00eb1ca28ec7528f26932f0c71b5e50c2e1c9ec51af25a19680a44ba7b95790d85b4f0a6d2ad78d
6
+ metadata.gz: 58f25720adaa3fbf538f6aedd3c3dc2ad5d16975afb9b8287934ea08fac3006b9b22cd7dab0ac36d8bd820c2cfad3914714d09703559dd61d84f8c2c6cc527dd
7
+ data.tar.gz: ec7b603c8425cac9d446ad8dc8d6bada6acda4aa275cbcf85a526f9d0484aa167c8170c2de329d0ec83f8702cd4ddb996ee02e21fbaed9a6db7b7ca80712589b
data/README.md CHANGED
@@ -1,16 +1,24 @@
1
+ [![Build Status](https://travis-ci.org/dannyvassallo/insta_scrape.svg?branch=master)](https://travis-ci.org/dannyvassallo/insta_scrape)[![Gem Version](https://badge.fury.io/rb/insta_scrape.svg)](https://badge.fury.io/rb/insta_scrape)
2
+ ![alt text](https://s3-us-west-2.amazonaws.com/instascrape/instascrapelogo.png "logo")
1
3
  # InstaScrape
2
4
 
3
5
  A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly.
4
6
  This gem is dependent on Capybara, PhantomJS, and Poltergeist.
5
7
 
8
+ ## Note
9
+
10
+ The number of results may vary as this isn't an official endpoint.
11
+
12
+ ## Todo
13
+
14
+ * Pagination
15
+ * Assess infinite scroll
16
+
6
17
  ## Installation
7
18
 
8
19
  Add this line to your application's Gemfile:
9
20
 
10
21
  ```ruby
11
- gem 'poltergeist'
12
- gem 'phantomjs', :require => 'phantomjs/poltergeist'
13
- gem 'capybara'
14
22
  gem 'insta_scrape'
15
23
  ```
16
24
 
@@ -29,21 +37,49 @@ The scrape maps the response objects to an array. The objects currently have 2 a
29
37
  The simplest use is the following case:
30
38
 
31
39
  ```ruby
32
- scraper = InstaScrape.new
33
40
  #InstaScrape takes one argument. In this case its the #test hashtag.
34
- scrape_result = scraper.hashtag("test")
41
+ @insta_scrape = InstaScrape.new
42
+ scrape_result = @insta_scrape.hashtag("test")
35
43
  scrape_result.each do |post|
36
44
  puts post["image"]
37
45
  puts post["link"]
38
46
  end
39
47
  ```
40
48
 
49
+ Here is a `.erb` example using MaterializeCSS to render the posts as cards:
50
+ ```ruby
51
+
52
+ #in your controller or helper assuming you aren't storing the posts
53
+ @insta_scrape = InstaScrape.new
54
+ @posts = @insta_scrape.hashtag("test")
55
+
56
+ # your .erb file
57
+ <% @posts.each do |post| %>
58
+ <div class="col s12 m6 l4">
59
+ <div class="card hoverable">
60
+ <div class="card-image"><a href="<%= post['link'] %>"><img src="<%= post['image'] %>"></a></div>
61
+ <div class="card-content">
62
+ <!-- <p></p> -->
63
+ </div>
64
+ <div class="card-action center-align"><a class="btn black" href="<%= post['link'] %>">Open Post</a></div>
65
+ </div>
66
+ </div>
67
+ <% end %>
68
+ ```
69
+
41
70
  ## Development
42
71
 
43
72
  After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
44
73
 
45
74
  To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
46
75
 
76
+ ## Deployment / Build
77
+
78
+ ```
79
+ gem build insta_scrape.gemspec
80
+ gem push insta_scrape-v.v.v.gem
81
+ ```
82
+
47
83
  ## Contributing
48
84
 
49
85
  Bug reports and pull requests are welcome on GitHub at https://github.com/dannyvassallo/insta_scrape. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
data/insta_scrape.gemspec CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
10
10
  spec.email = ["danielvassallo87@gmail.com"]
11
11
 
12
12
  spec.summary = %q{Use Instagram Hashtag Embeds in 2016}
13
- spec.description = %q{Use hashtag embeds in 2016 because Instagram's API changes are unethical}
13
+ spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.}
14
14
  spec.homepage = "https://github.com/dannyvassallo/insta_scrape"
15
15
  spec.license = "MIT"
16
16
 
@@ -22,7 +22,12 @@ Gem::Specification.new do |spec|
22
22
  spec.add_development_dependency "bundler", "~> 1.11"
23
23
  spec.add_development_dependency "rake", "~> 10.0"
24
24
  spec.add_development_dependency "rspec", "~> 3.0"
25
- spec.add_development_dependency "capybara"
26
- spec.add_development_dependency "phantomjs"
27
- spec.add_development_dependency "poltergeist"
25
+ spec.add_development_dependency "capybara", "~> 2.7.1"
26
+ spec.add_development_dependency "phantomjs", "~> 2.1.1.0"
27
+ spec.add_development_dependency "poltergeist", "~> 1.9.0"
28
+
29
+ spec.add_runtime_dependency "capybara", ">= 2.7.1"
30
+ spec.add_runtime_dependency "phantomjs", ">= 2.1.1.0"
31
+ spec.add_runtime_dependency "poltergeist", ">= 1.9.0"
32
+
28
33
  end
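
Note on the constraints in the hunk above: the development dependencies now use the pessimistic operator (`~>`), while the new runtime dependencies use an open-ended `>=`. The following is a minimal sketch of how the two operators differ, using RubyGems' own `Gem::Requirement` and `Gem::Version` classes. The constant names and the candidate version list are illustrative assumptions; only the constraint strings come from the gemspec.

```ruby
require "rubygems"

# Constraint strings copied from the gemspec hunk; constant names are hypothetical.
PESSIMISTIC = Gem::Requirement.new("~> 2.7.1") # development dependency on capybara
OPEN_ENDED  = Gem::Requirement.new(">= 2.7.1") # runtime dependency on capybara

# Candidate versions are made up for illustration and do not come from the diff.
%w[2.7.0 2.7.1 2.7.9 2.8.0 3.0.0].each do |candidate|
  version = Gem::Version.new(candidate)
  puts format("%-6s ~> 2.7.1: %-5s >= 2.7.1: %s",
              candidate,
              PESSIMISTIC.satisfied_by?(version),
              OPEN_ENDED.satisfied_by?(version))
end
```

With `~> 2.7.1`, only 2.7.x releases at or above 2.7.1 match, whereas `>= 2.7.1` also accepts 2.8.0 and 3.0.0. A consumer of the gem therefore gets a much looser runtime range than the one the gem is developed against.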
data/lib/insta_scrape.rb CHANGED
@@ -16,6 +16,27 @@ class InstaScrape
16
16
  def hashtag(hashtag)
17
17
  visit "https://www.instagram.com/explore/tags/#{hashtag}/"
18
18
  @posts = []
19
+
20
+ begin
21
+ page.find('a', :text => "Load more", exact: true).click
22
+ max_iteration = 10
23
+ iteration = 0
24
+ while iteration < max_iteration do
25
+ iteration += 1
26
+ 5.times { page.execute_script "window.scrollBy(0,10000)" }
27
+ sleep 0.2
28
+ end
29
+ iterate_through_posts
30
+ rescue Capybara::ElementNotFound => e
31
+ begin
32
+ iterate_through_posts
33
+ end
34
+ end
35
+ end
36
+
37
+ private
38
+
39
+ def iterate_through_posts
19
40
  all("article div div div a").each do |post|
20
41
 
21
42
  link = post["href"]
@@ -29,6 +50,12 @@ class InstaScrape
29
50
  @posts << info
30
51
 
31
52
  end
53
+
54
+ #log
55
+ puts "POST COUNT: #{@posts.length}"
56
+
57
+ #return result
32
58
  return @posts
33
59
  end
60
+
34
61
  end
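
Note on the `hashtag` change above: after clicking "Load more", the method scrolls the page a fixed number of times (10 iterations of 5 scrolls, pausing 0.2 seconds between iterations) before collecting posts. The loop can be sketched in isolation as below. `scroll_to_load` and `FakePage` are hypothetical names introduced only for this illustration and are not part of the gem; the stand-in page only mimics the `execute_script` call that a Capybara session provides, so the sketch runs without a browser.

```ruby
# Hypothetical helper (not in the gem): scroll a Capybara-like page a bounded
# number of times so that lazily loaded posts have a chance to render.
def scroll_to_load(page, max_iterations: 10, scrolls_per_iteration: 5, pause: 0.2)
  max_iterations.times do
    scrolls_per_iteration.times { page.execute_script("window.scrollBy(0,10000)") }
    sleep pause
  end
end

# Hypothetical stand-in for a Capybara session; it only records the scripts
# it is asked to run.
class FakePage
  attr_reader :scripts

  def initialize
    @scripts = []
  end

  def execute_script(script)
    @scripts << script
    nil
  end
end

fake = FakePage.new
scroll_to_load(fake, pause: 0)
puts fake.scripts.length # prints 50 (10 iterations x 5 scrolls)
```

Because the iteration count is fixed, the number of posts collected depends on how much content the page loads within those scrolls, which is consistent with the README note that result counts may vary.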
@@ -1,3 +1,3 @@
1
1
  class InstaScrape
2
- VERSION = "0.0.1"
2
+ VERSION = "0.1.0"
3
3
  end
metadata CHANGED
@@ -1,14 +1,14 @@
1
1
  --- !ruby/object:Gem::Specification
2
2
  name: insta_scrape
3
3
  version: !ruby/object:Gem::Version
4
- version: 0.0.1
4
+ version: 0.1.0
5
5
  platform: ruby
6
6
  authors:
7
7
  - dannyvassallo
8
8
  autorequire:
9
9
  bindir: exe
10
10
  cert_chain: []
11
- date: 2016-06-10 00:00:00.000000000 Z
11
+ date: 2016-06-11 00:00:00.000000000 Z
12
12
  dependencies:
13
13
  - !ruby/object:Gem::Dependency
14
14
  name: bundler
@@ -56,45 +56,88 @@ dependencies:
56
56
  name: capybara
57
57
  requirement: !ruby/object:Gem::Requirement
58
58
  requirements:
59
- - - ">="
59
+ - - "~>"
60
+ - !ruby/object:Gem::Version
61
+ version: 2.7.1
62
+ type: :development
63
+ prerelease: false
64
+ version_requirements: !ruby/object:Gem::Requirement
65
+ requirements:
66
+ - - "~>"
67
+ - !ruby/object:Gem::Version
68
+ version: 2.7.1
69
+ - !ruby/object:Gem::Dependency
70
+ name: phantomjs
71
+ requirement: !ruby/object:Gem::Requirement
72
+ requirements:
73
+ - - "~>"
60
74
  - !ruby/object:Gem::Version
61
- version: '0'
75
+ version: 2.1.1.0
62
76
  type: :development
63
77
  prerelease: false
64
78
  version_requirements: !ruby/object:Gem::Requirement
79
+ requirements:
80
+ - - "~>"
81
+ - !ruby/object:Gem::Version
82
+ version: 2.1.1.0
83
+ - !ruby/object:Gem::Dependency
84
+ name: poltergeist
85
+ requirement: !ruby/object:Gem::Requirement
86
+ requirements:
87
+ - - "~>"
88
+ - !ruby/object:Gem::Version
89
+ version: 1.9.0
90
+ type: :development
91
+ prerelease: false
92
+ version_requirements: !ruby/object:Gem::Requirement
93
+ requirements:
94
+ - - "~>"
95
+ - !ruby/object:Gem::Version
96
+ version: 1.9.0
97
+ - !ruby/object:Gem::Dependency
98
+ name: capybara
99
+ requirement: !ruby/object:Gem::Requirement
65
100
  requirements:
66
101
  - - ">="
67
102
  - !ruby/object:Gem::Version
68
- version: '0'
103
+ version: 2.7.1
104
+ type: :runtime
105
+ prerelease: false
106
+ version_requirements: !ruby/object:Gem::Requirement
107
+ requirements:
108
+ - - ">="
109
+ - !ruby/object:Gem::Version
110
+ version: 2.7.1
69
111
  - !ruby/object:Gem::Dependency
70
112
  name: phantomjs
71
113
  requirement: !ruby/object:Gem::Requirement
72
114
  requirements:
73
115
  - - ">="
74
116
  - !ruby/object:Gem::Version
75
- version: '0'
76
- type: :development
117
+ version: 2.1.1.0
118
+ type: :runtime
77
119
  prerelease: false
78
120
  version_requirements: !ruby/object:Gem::Requirement
79
121
  requirements:
80
122
  - - ">="
81
123
  - !ruby/object:Gem::Version
82
- version: '0'
124
+ version: 2.1.1.0
83
125
  - !ruby/object:Gem::Dependency
84
126
  name: poltergeist
85
127
  requirement: !ruby/object:Gem::Requirement
86
128
  requirements:
87
129
  - - ">="
88
130
  - !ruby/object:Gem::Version
89
- version: '0'
90
- type: :development
131
+ version: 1.9.0
132
+ type: :runtime
91
133
  prerelease: false
92
134
  version_requirements: !ruby/object:Gem::Requirement
93
135
  requirements:
94
136
  - - ">="
95
137
  - !ruby/object:Gem::Version
96
- version: '0'
97
- description: Use hashtag embeds in 2016 because Instagram's API changes are unethical
138
+ version: 1.9.0
139
+ description: A ruby scraper for instagram in 2016. Because the hashtag deprecation
140
+ in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.
98
141
  email:
99
142
  - danielvassallo87@gmail.com
100
143
  executables: []