insta_scrape 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- data/README.md +46 -16
- data/insta_scrape.gemspec +1 -1
- data/lib/insta_scrape.rb +55 -2
- data/lib/insta_scrape/version.rb +1 -1
- metadata +4 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
 ---
 SHA1:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 05c519af7d4c14f15487b131afd234e2cd0c6435
+  data.tar.gz: 66aee1224c0a0c9d931f864d331d698f35b0b60d
 SHA512:
-  metadata.gz:
-  data.tar.gz:
+  metadata.gz: 8f5bb16a646d4a73f1b226b206d19b53edf5b8cc9b1bb88b6ac27bf9b302bc39096eee6a367870553a07f2b8560946fc6cc0cc1c1f36ce1d1a5a1e112caa13a8
+  data.tar.gz: 50a190f914e87580b97293595cc60bc78a0f037871b9a0a11b9ca0bbb33cef8af5c3571fd4087c3ea5c093b003da8bf4711bdfbdbadc78932563a46f5efe1fd5
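Note: checksums.yaml records a SHA1 and a SHA512 hex digest for `metadata.gz` and `data.tar.gz`. As a rough sketch of how such values could be recomputed for comparison, the snippet below uses Ruby's standard `Digest` library. The helper name `gem_checksums` and the throwaway sample file are assumptions made for illustration; they are not part of the gem or of RubyGems.

```ruby
require 'digest/sha1'
require 'digest/sha2'
require 'tempfile'

# Hypothetical helper: compute the two digests that checksums.yaml
# records for a single file.
def gem_checksums(path)
  {
    'SHA1'   => Digest::SHA1.file(path).hexdigest,
    'SHA512' => Digest::SHA512.file(path).hexdigest
  }
end

# Demo on a throwaway file with known content (not a real data.tar.gz).
sample = Tempfile.new('sample')
sample.write('abc')
sample.flush
sums = gem_checksums(sample.path)
sample.close! # close and delete the temp file

puts sums['SHA1']
puts sums['SHA512'].length
```

To check a downloaded gem, the same helper would be pointed at the extracted `metadata.gz` and `data.tar.gz` and the results compared with the values in the hunk above.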
data/README.md
CHANGED
@@ -7,21 +7,23 @@ This gem is dependent on Capybara, PhantomJS, and Poltergeist.
 
 Using this gem you can access multiple facets of the instagram API without needing authorization, most importantly the hashtag.
 
+v.1.1.0 introducing "long_scrape" methods! Now with more instgram posts!
+
 ## Note
 
 The number of results may vary when using certain methods as this isn't an official endpoint.
 
 ## Todo
 
-* Pagination
-
+* Built-in Pagination
+
 
 ## Installation
 
 Add this line to your application's Gemfile:
 
 ```ruby
-gem
+gem "insta_scrape"
 ```
 
 And then execute:
@@ -36,25 +38,47 @@ Or install it yourself as:
 
 ###Available methods
 
-
+Long scrape method take two arguments -- (hashtag || username, time_in_seconds)
+Each other method accepts only one argument - a hashtag or a username.
 
+
+####Long Scrape Methods
+```ruby
+#These can take a while but produce the best results
+#I would recommend running a background job to pull these scrapes
+
+#long scrape a user and their posts
+#depending on how long you run the scrape
+#you can pull an entire user profile and all of their posts
+#30 seconds is enough for a casual user (maybe less)
+InstaScrape.long_scrape_user_info_and_posts('foofighters', 30)
+#this does the same without pulling user info
+InstaScrape.long_scrape_user_posts('foofighters', 30)
+
+#pull all posts from a hashtag
+#infinite scroll will run as long as the time passed in
+InstaScrape.long_scrape_hashtag('test', 60)
+#=> > 2k instagram posts! Tested in specs!
+```
+
+####Regular Methods
 ```ruby
 #scrape a hashtag for as many results as possible
 InstaScrape.hashtag("test")
 #scrape all user info
-InstaScrape.user_info("
+InstaScrape.user_info("foofighters")
 #scrape all user info and posts
-InstaScrape.user_info_and_posts(
+InstaScrape.user_info_and_posts("foofighters")
 #scrape just a users posts (as many as possible)
-InstaScrape.user_posts(
+InstaScrape.user_posts("foofighters")
 #scrape a users follower count
-InstaScrape.user_follower_count(
+InstaScrape.user_follower_count("foofighters")
 #scrape a users following count
-InstaScrape.user_following_count(
+InstaScrape.user_following_count("foofighters")
 #scrape a users post count
-InstaScrape.user_post_count(
+InstaScrape.user_post_count("foofighters")
 #scrape a users description
-InstaScrape.user_description(
+InstaScrape.user_description("foofighters")
 ```
 
 ####Hashtag, User Post, and Nested Posts Scrape
@@ -62,8 +86,8 @@ InstaScrape.user_description('foofighters')
 ```ruby
 #basic use case
 
-#scrape_result = InstaScrape.user_info_and_posts(
-#scrape_result = InstaScrape.user_posts(
+#scrape_result = InstaScrape.user_info_and_posts("foofighters").posts
+#scrape_result = InstaScrape.user_posts("foofighters")
 scrape_result = InstaScrape.hashtag("test")
 scrape_result.each do |post|
   puts post.image
@@ -76,8 +100,8 @@ Here is a `.erb` example using MaterializeCSS to render the posts as cards:
 ```ruby
 #in your controller or helper assuming you aren't storing the posts
 
-#@posts = InstaScrape.user_info_and_posts(
-#@posts = InstaScrape.user_posts(
+#@posts = InstaScrape.user_info_and_posts("foofighters").posts
+#@posts = InstaScrape.user_posts("foofighters")
 @posts = InstaScrape.hashtag("test")
 ```
 
@@ -116,11 +140,17 @@ u.follower_count
 #returns => "1.5m"
 u.following_count
 #returns => "35"
+u.description
+#returns => "Foo Fighters Rock band smarturl.it/sonic-highways"
+
+#and in the event you'd need it
+u.username
+#returns => "foofighters"
 ```
 
 Each of these attributes is accessible using the methods listed above as well.
 
-Using `u = InstaScrape.user_info_and_posts(
+Using `u = InstaScrape.user_info_and_posts("foofighters")` will give access to the `u.posts` attribute and can be iterated through.
 The example above covers this.
 
 ## Development
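Note: the README change above recommends pulling long scrapes in a background job. The following is only a minimal sketch of that idea using a plain Ruby `Thread`. The lambda `fake_long_scrape` is a hypothetical stand-in so the snippet runs without PhantomJS; in a real application the call inside the thread would presumably be something like `InstaScrape.long_scrape_hashtag('test', 60)`, and a proper job queue may be a better fit than a bare thread.

```ruby
# Hypothetical stand-in for a long scrape call (assumption: the real call
# blocks for roughly the number of seconds passed in and returns the posts).
fake_long_scrape = lambda do |hashtag, seconds|
  sleep 0.01 # pretend to scroll for a while
  ["post for #{hashtag} (budget #{seconds}s)"]
end

# Run the slow call off the main thread.
worker = Thread.new { fake_long_scrape.call('test', 60) }

# ... the main thread is free to do other work here ...

# Thread#value waits for the thread to finish and returns the block's result.
posts = worker.value
puts posts.length
```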
data/insta_scrape.gemspec
CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
   spec.email = ["danielvassallo87@gmail.com"]
 
   spec.summary = %q{Use Instagram Hashtag Embeds in 2016}
-  spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.}
+  spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist. v.1.1.0 -- Introducing long_scrape methods! Get thousands of photo results on hashtags and full user profiles with ALL posts! See the documentation for API usage.}
   spec.homepage = "https://github.com/dannyvassallo/insta_scrape"
   spec.license = "MIT"
 
data/lib/insta_scrape.rb
CHANGED
@@ -10,6 +10,26 @@ module InstaScrape
     scrape_posts
   end
 
+  #long scrape a hashtag
+  def self.long_scrape_hashtag(hashtag, scrape_length)
+    visit "https://www.instagram.com/explore/tags/#{hashtag}/"
+    @posts = []
+    long_scrape_posts(scrape_length)
+  end
+
+  #long scrape a hashtag
+  def self.long_scrape_user_posts(username, scrape_length)
+    @posts = []
+    long_scrape_user_posts_method(username, scrape_length)
+  end
+
+  #get user info and posts
+  def self.long_scrape_user_info_and_posts(username, scrape_length)
+    scrape_user_info(username)
+    long_scrape_user_posts_method(username, scrape_length)
+    @user = InstaScrape::InstagramUserWithPosts.new(username, @image, @post_count, @follower_count, @following_count, @description, @posts)
+  end
+
   #get user info
   def self.user_info(username)
     scrape_user_info(username)
@@ -95,8 +115,35 @@ module InstaScrape
       iteration = 0
       while iteration < max_iteration do
         iteration += 1
-
-        sleep 0.
+        page.execute_script "window.scrollTo(0,document.body.scrollHeight);"
+        sleep 0.1
+        page.execute_script "window.scrollTo(0,(document.body.scrollHeight - 5000));"
+        sleep 0.1
+      end
+      iterate_through_posts
+    rescue Capybara::ElementNotFound => e
+      begin
+        iterate_through_posts
+      end
+    end
+  end
+
+  def self.long_scrape_posts(scrape_length_in_seconds)
+    begin
+      page.find('a', :text => "Load more", exact: true).click
+      max_iteration = (scrape_length_in_seconds / 0.3)
+      iteration = 0
+      @loader = "."
+      while iteration < max_iteration do
+        puts "InstaScrape is working. Please wait.#{@loader}"
+        iteration += 1
+        sleep 0.1
+        page.execute_script "window.scrollTo(0,document.body.scrollHeight);"
+        sleep 0.1
+        page.execute_script "window.scrollTo(0,(document.body.scrollHeight - 5000));"
+        sleep 0.1
+        @loader << "."
+        system "clear"
       end
       iterate_through_posts
     rescue Capybara::ElementNotFound => e
@@ -106,6 +153,12 @@ module InstaScrape
     end
   end
 
+  def self.long_scrape_user_posts_method(username, scrape_length_in_seconds)
+    @posts = []
+    visit "https://www.instagram.com/#{username}/"
+    long_scrape_posts(scrape_length_in_seconds)
+  end
+
   def self.scrape_user_posts(username)
     @posts = []
     visit "https://www.instagram.com/#{username}/"
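Note: in `long_scrape_posts` each pass of the `while` loop calls `sleep 0.1` three times, which appears to be why the time budget is divided by 0.3. The sketch below reproduces only that arithmetic, without a browser. The method name `scroll_passes` and the constant `SLEEP_PER_PASS` are made up for illustration, and the sketch ignores the time spent inside `page.execute_script`, `puts`, and `system "clear"`, so a real run would likely take longer than the number of seconds passed in.

```ruby
# Assumption for illustration: three `sleep 0.1` calls per loop pass,
# as in the long_scrape_posts hunk above.
SLEEP_PER_PASS = 0.3

# Hypothetical helper: count how many passes the
# `while iteration < max_iteration` loop would perform for a given budget.
def scroll_passes(scrape_length_in_seconds)
  max_iteration = scrape_length_in_seconds / SLEEP_PER_PASS
  iteration = 0
  iteration += 1 while iteration < max_iteration
  iteration
end

puts scroll_passes(30)
puts scroll_passes(60)
```

Under these assumptions a 60 second budget works out to roughly 200 scroll passes, twice as many as a 30 second budget.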
data/lib/insta_scrape/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
 --- !ruby/object:Gem::Specification
 name: insta_scrape
 version: !ruby/object:Gem::Version
-  version: 1.
+  version: 1.1.0
 platform: ruby
 authors:
 - dannyvassallo
 autorequire:
 bindir: exe
 cert_chain: []
-date: 2016-06-
+date: 2016-06-13 00:00:00.000000000 Z
 dependencies:
 - !ruby/object:Gem::Dependency
   name: bundler
@@ -174,6 +174,8 @@ dependencies:
         version: 1.9.0
 description: A ruby scraper for instagram in 2016. Because the hashtag deprecation
   in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.
+  v.1.1.0 -- Introducing long_scrape methods! Get thousands of photo results on hashtags
+  and full user profiles with ALL posts! See the documentation for API usage.
 email:
 - danielvassallo87@gmail.com
 executables: []