insta_scrape 0.0.1 → 0.1.0
- checksums.yaml +4 -4
- data/README.md +41 -5
- data/insta_scrape.gemspec +9 -4
- data/lib/insta_scrape.rb +27 -0
- data/lib/insta_scrape/version.rb +1 -1
- metadata +55 -12
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 34460e7bb43c001ca79d3c2c3e4cd058d87cb407
+  data.tar.gz: 69e80191442e15dd7eaa6b3b23b06fd6c9874615
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 58f25720adaa3fbf538f6aedd3c3dc2ad5d16975afb9b8287934ea08fac3006b9b22cd7dab0ac36d8bd820c2cfad3914714d09703559dd61d84f8c2c6cc527dd
+  data.tar.gz: ec7b603c8425cac9d446ad8dc8d6bada6acda4aa275cbcf85a526f9d0484aa167c8170c2de329d0ec83f8702cd4ddb996ee02e21fbaed9a6db7b7ca80712589b
data/README.md
CHANGED
@@ -1,16 +1,24 @@
+[](https://travis-ci.org/dannyvassallo/insta_scrape)[](https://badge.fury.io/rb/insta_scrape)
+
 # InstaScrape
 
 A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly.
 This gem is dependent on Capybara, PhantomJS, and Poltergeist.
 
+## Note
+
+The number of results may vary as this isn't an official endpoint.
+
+## Todo
+
+* Pagination
+* Assess infinite scroll
+
 ## Installation
 
 Add this line to your application's Gemfile:
 
 ```ruby
-gem 'poltergeist'
-gem 'phantomjs', :require => 'phantomjs/poltergeist'
-gem 'capybara'
 gem 'insta_scrape'
 ```
 
@@ -29,21 +37,49 @@ The scrape maps the response objects to an array. The objects currently have 2 a
 The simplest use is the following case:
 
 ```ruby
-scraper = InstaScrape.new
 #InstaScrape takes one argument. In this case its the #test hashtag.
-
+@insta_scrape = InstaScrape.new
+scrape_result = @insta_scrape.hashtag("test")
 scrape_result.each do |post|
   puts post["image"]
   puts post["link"]
 end
 ```
 
+Here is a `.erb` example using MaterializeCSS to render the posts as cards:
+```ruby
+
+#in your controller or helper assuming you aren't storing the posts
+@insta_scrape = InstaScrape.new
+@posts = @insta_scrape.hashtag("test")
+
+# your .erb file
+<% @posts.each do |post| %>
+  <div class="col s12 m6 l4">
+    <div class="card hoverable">
+      <div class="card-image"><a href="<%= post['link'] %>"><img src="<%= post['image'] %>"></a></div>
+      <div class="card-content">
+        <!-- <p></p> -->
+      </div>
+      <div class="card-action center-align"><a class="btn black" href="<%= post['link'] %>">Open Post</a></div>
+    </div>
+  </div>
+<% end %>
+```
+
 ## Development
 
 After checking out the repo, run `bin/setup` to install dependencies. Then, run `rake spec` to run the tests. You can also run `bin/console` for an interactive prompt that will allow you to experiment.
 
 To install this gem onto your local machine, run `bundle exec rake install`. To release a new version, update the version number in `version.rb`, and then run `bundle exec rake release`, which will create a git tag for the version, push git commits and tags, and push the `.gem` file to [rubygems.org](https://rubygems.org).
 
+## Deployment / Build
+
+```
+gem build insta_scrape.gemspec
+gem push insta_scrape-v.v.v.gem
+```
+
 ## Contributing
 
 Bug reports and pull requests are welcome on GitHub at https://github.com/dannyvassallo/insta_scrape. This project is intended to be a safe, welcoming space for collaboration, and contributors are expected to adhere to the [Contributor Covenant](http://contributor-covenant.org) code of conduct.
data/insta_scrape.gemspec
CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
   spec.email = ["danielvassallo87@gmail.com"]
 
   spec.summary = %q{Use Instagram Hashtag Embeds in 2016}
-  spec.description = %q{
+  spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.}
   spec.homepage = "https://github.com/dannyvassallo/insta_scrape"
   spec.license = "MIT"
 
@@ -22,7 +22,12 @@ Gem::Specification.new do |spec|
   spec.add_development_dependency "bundler", "~> 1.11"
   spec.add_development_dependency "rake", "~> 10.0"
   spec.add_development_dependency "rspec", "~> 3.0"
-  spec.add_development_dependency "capybara"
-  spec.add_development_dependency "phantomjs"
-  spec.add_development_dependency "poltergeist"
+  spec.add_development_dependency "capybara", "~> 2.7.1"
+  spec.add_development_dependency "phantomjs", "~> 2.1.1.0"
+  spec.add_development_dependency "poltergeist", "~> 1.9.0"
+
+  spec.add_runtime_dependency "capybara", ">= 2.7.1"
+  spec.add_runtime_dependency "phantomjs", ">= 2.1.1.0"
+  spec.add_runtime_dependency "poltergeist", ">= 1.9.0"
+
 end
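Reviewer note: the gemspec now declares each of the three browser gems twice, once with `~>` (development) and once with `>=` (runtime). The two operators admit different version ranges. The sketch below uses RubyGems' own `Gem::Requirement` to show the difference; the helper name `satisfies?` and the sample version numbers are illustrative assumptions and are not part of the gem.

```ruby
require "rubygems"

# Hypothetical helper for illustration only: true when +version+
# falls inside the range described by +constraint+.
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# Compare the two capybara constraints from the gemspec against sample versions.
%w[2.7.1 2.7.9 2.8.0 3.0.0].each do |version|
  pessimistic = satisfies?("~> 2.7.1", version) # allows >= 2.7.1 and < 2.8
  open_ended  = satisfies?(">= 2.7.1", version) # allows any version from 2.7.1 up
  puts "capybara #{version}: '~> 2.7.1' => #{pessimistic}, '>= 2.7.1' => #{open_ended}"
end
```

How Bundler resolves the two declarations together depends on which dependency groups are installed, so this only illustrates the operators themselves.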
data/lib/insta_scrape.rb
CHANGED
@@ -16,6 +16,27 @@ class InstaScrape
   def hashtag(hashtag)
     visit "https://www.instagram.com/explore/tags/#{hashtag}/"
     @posts = []
+
+    begin
+      page.find('a', :text => "Load more", exact: true).click
+      max_iteration = 10
+      iteration = 0
+      while iteration < max_iteration do
+        iteration += 1
+        5.times { page.execute_script "window.scrollBy(0,10000)" }
+        sleep 0.2
+      end
+      iterate_through_posts
+    rescue Capybara::ElementNotFound => e
+      begin
+        iterate_through_posts
+      end
+    end
+  end
+
+  private
+
+  def iterate_through_posts
     all("article div div div a").each do |post|
 
       link = post["href"]
@@ -29,6 +50,12 @@ class InstaScrape
       @posts << info
 
     end
+
+    #log
+    puts "POST COUNT: #{@posts.length}"
+
+    #return result
     return @posts
   end
+
 end
data/lib/insta_scrape/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: insta_scrape
 version: !ruby/object:Gem::Version
-  version: 0.0
+  version: 0.1.0
 platform: ruby
 authors:
 - dannyvassallo
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-06-
+date: 2016-06-11 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -56,45 +56,88 @@ dependencies:
   name: capybara
   requirement: !ruby/object:Gem::Requirement
     requirements:
-    - - "
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.7.1
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.7.1
+- !ruby/object:Gem::Dependency
+  name: phantomjs
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
       - !ruby/object:Gem::Version
-        version:
+        version: 2.1.1.0
   type: :development
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 2.1.1.0
+- !ruby/object:Gem::Dependency
+  name: poltergeist
+  requirement: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 1.9.0
+  type: :development
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - "~>"
+      - !ruby/object:Gem::Version
+        version: 1.9.0
+- !ruby/object:Gem::Dependency
+  name: capybara
+  requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version:
+        version: 2.7.1
+  type: :runtime
+  prerelease: false
+  version_requirements: !ruby/object:Gem::Requirement
+    requirements:
+    - - ">="
+      - !ruby/object:Gem::Version
+        version: 2.7.1
 - !ruby/object:Gem::Dependency
   name: phantomjs
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version:
-  type: :
+        version: 2.1.1.0
+  type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version:
+        version: 2.1.1.0
 - !ruby/object:Gem::Dependency
   name: poltergeist
   requirement: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version:
-  type: :
+        version: 1.9.0
+  type: :runtime
   prerelease: false
   version_requirements: !ruby/object:Gem::Requirement
     requirements:
     - - ">="
       - !ruby/object:Gem::Version
-        version:
-description:
+        version: 1.9.0
+description: A ruby scraper for instagram in 2016. Because the hashtag deprecation
+  in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.
 email:
 - danielvassallo87@gmail.com
 executables: []