insta_scrape 1.0.0 → 1.1.0
- checksums.yaml +4 -4
- data/README.md +46 -16
- data/insta_scrape.gemspec +1 -1
- data/lib/insta_scrape.rb +55 -2
- data/lib/insta_scrape/version.rb +1 -1
- metadata +4 -2
checksums.yaml
CHANGED
@@ -1,7 +1,7 @@
|
|
1
1
|
---
|
2
2
|
SHA1:
|
3
|
-
metadata.gz:
|
4
|
-
data.tar.gz:
|
3
|
+
metadata.gz: 05c519af7d4c14f15487b131afd234e2cd0c6435
|
4
|
+
data.tar.gz: 66aee1224c0a0c9d931f864d331d698f35b0b60d
|
5
5
|
SHA512:
|
6
|
-
metadata.gz:
|
7
|
-
data.tar.gz:
|
6
|
+
metadata.gz: 8f5bb16a646d4a73f1b226b206d19b53edf5b8cc9b1bb88b6ac27bf9b302bc39096eee6a367870553a07f2b8560946fc6cc0cc1c1f36ce1d1a5a1e112caa13a8
|
7
|
+
data.tar.gz: 50a190f914e87580b97293595cc60bc78a0f037871b9a0a11b9ca0bbb33cef8af5c3571fd4087c3ea5c093b003da8bf4711bdfbdbadc78932563a46f5efe1fd5
|
data/README.md
CHANGED
@@ -7,21 +7,23 @@ This gem is dependent on Capybara, PhantomJS, and Poltergeist.
|
|
7
7
|
|
8
8
|
Using this gem you can access multiple facets of the instagram API without needing authorization, most importantly the hashtag.
|
9
9
|
|
10
|
+
v.1.1.0 introducing "long_scrape" methods! Now with more Instagram posts!
|
11
|
+
|
10
12
|
## Note
|
11
13
|
|
12
14
|
The number of results may vary when using certain methods as this isn't an official endpoint.
|
13
15
|
|
14
16
|
## Todo
|
15
17
|
|
16
|
-
* Pagination
|
17
|
-
|
18
|
+
* Built-in Pagination
|
19
|
+
|
18
20
|
|
19
21
|
## Installation
|
20
22
|
|
21
23
|
Add this line to your application's Gemfile:
|
22
24
|
|
23
25
|
```ruby
|
24
|
-
gem
|
26
|
+
gem "insta_scrape"
|
25
27
|
```
|
26
28
|
|
27
29
|
And then execute:
|
@@ -36,25 +38,47 @@ Or install it yourself as:
|
|
36
38
|
|
37
39
|
###Available methods
|
38
40
|
|
39
|
-
|
41
|
+
Long scrape methods take two arguments -- (hashtag || username, time_in_seconds)
|
42
|
+
Each other method accepts only one argument - a hashtag or a username.
|
40
43
|
|
44
|
+
|
45
|
+
####Long Scrape Methods
|
46
|
+
```ruby
|
47
|
+
#These can take a while but produce the best results
|
48
|
+
#I would recommend running a background job to pull these scrapes
|
49
|
+
|
50
|
+
#long scrape a user and their posts
|
51
|
+
#depending on how long you run the scrape
|
52
|
+
#you can pull an entire user profile and all of their posts
|
53
|
+
#30 seconds is enough for a casual user (maybe less)
|
54
|
+
InstaScrape.long_scrape_user_info_and_posts('foofighters', 30)
|
55
|
+
#this does the same without pulling user info
|
56
|
+
InstaScrape.long_scrape_user_posts('foofighters', 30)
|
57
|
+
|
58
|
+
#pull all posts from a hashtag
|
59
|
+
#infinite scroll will run as long as the time passed in
|
60
|
+
InstaScrape.long_scrape_hashtag('test', 60)
|
61
|
+
#=> > 2k instagram posts! Tested in specs!
|
62
|
+
```
|
63
|
+
|
64
|
+
####Regular Methods
|
41
65
|
```ruby
|
42
66
|
#scrape a hashtag for as many results as possible
|
43
67
|
InstaScrape.hashtag("test")
|
44
68
|
#scrape all user info
|
45
|
-
InstaScrape.user_info("
|
69
|
+
InstaScrape.user_info("foofighters")
|
46
70
|
#scrape all user info and posts
|
47
|
-
InstaScrape.user_info_and_posts(
|
71
|
+
InstaScrape.user_info_and_posts("foofighters")
|
48
72
|
#scrape just a users posts (as many as possible)
|
49
|
-
InstaScrape.user_posts(
|
73
|
+
InstaScrape.user_posts("foofighters")
|
50
74
|
#scrape a users follower count
|
51
|
-
InstaScrape.user_follower_count(
|
75
|
+
InstaScrape.user_follower_count("foofighters")
|
52
76
|
#scrape a users following count
|
53
|
-
InstaScrape.user_following_count(
|
77
|
+
InstaScrape.user_following_count("foofighters")
|
54
78
|
#scrape a users post count
|
55
|
-
InstaScrape.user_post_count(
|
79
|
+
InstaScrape.user_post_count("foofighters")
|
56
80
|
#scrape a users description
|
57
|
-
InstaScrape.user_description(
|
81
|
+
InstaScrape.user_description("foofighters")
|
58
82
|
```
|
59
83
|
|
60
84
|
####Hashtag, User Post, and Nested Posts Scrape
|
@@ -62,8 +86,8 @@ InstaScrape.user_description('foofighters')
|
|
62
86
|
```ruby
|
63
87
|
#basic use case
|
64
88
|
|
65
|
-
#scrape_result = InstaScrape.user_info_and_posts(
|
66
|
-
#scrape_result = InstaScrape.user_posts(
|
89
|
+
#scrape_result = InstaScrape.user_info_and_posts("foofighters").posts
|
90
|
+
#scrape_result = InstaScrape.user_posts("foofighters")
|
67
91
|
scrape_result = InstaScrape.hashtag("test")
|
68
92
|
scrape_result.each do |post|
|
69
93
|
puts post.image
|
@@ -76,8 +100,8 @@ Here is a `.erb` example using MaterializeCSS to render the posts as cards:
|
|
76
100
|
```ruby
|
77
101
|
#in your controller or helper assuming you aren't storing the posts
|
78
102
|
|
79
|
-
#@posts = InstaScrape.user_info_and_posts(
|
80
|
-
#@posts = InstaScrape.user_posts(
|
103
|
+
#@posts = InstaScrape.user_info_and_posts("foofighters").posts
|
104
|
+
#@posts = InstaScrape.user_posts("foofighters")
|
81
105
|
@posts = InstaScrape.hashtag("test")
|
82
106
|
```
|
83
107
|
|
@@ -116,11 +140,17 @@ u.follower_count
|
|
116
140
|
#returns => "1.5m"
|
117
141
|
u.following_count
|
118
142
|
#returns => "35"
|
143
|
+
u.description
|
144
|
+
#returns => "Foo Fighters Rock band smarturl.it/sonic-highways"
|
145
|
+
|
146
|
+
#and in the event you'd need it
|
147
|
+
u.username
|
148
|
+
#returns => "foofighters"
|
119
149
|
```
|
120
150
|
|
121
151
|
Each of these attributes is accessible using the methods listed above as well.
|
122
152
|
|
123
|
-
Using `u = InstaScrape.user_info_and_posts(
|
153
|
+
Using `u = InstaScrape.user_info_and_posts("foofighters")` will give access to the `u.posts` attribute and can be iterated through.
|
124
154
|
The example above covers this.
|
125
155
|
|
126
156
|
## Development
|
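The README change above recommends pulling long scrapes in a background job. A minimal sketch of that idea with a plain Ruby `Thread` follows. `FAKE_LONG_SCRAPE` and `scrape_in_background` are hypothetical names introduced here, not part of insta_scrape; the lambda stands in for `InstaScrape.long_scrape_hashtag` so the sketch runs without the gem, PhantomJS, or network access.

```ruby
# Hypothetical stand-in for InstaScrape.long_scrape_hashtag(hashtag, seconds).
# It only returns canned data so the example is self-contained.
FAKE_LONG_SCRAPE = lambda do |hashtag, seconds|
  ["post tagged #{hashtag} (#{seconds}s budget)"]
end

# Hypothetical helper: run a slow scrape on a worker thread and return the
# Thread. Calling #value on it later blocks until the scrape has finished.
def scrape_in_background(scraper, hashtag, seconds)
  Thread.new { scraper.call(hashtag, seconds) }
end

job = scrape_in_background(FAKE_LONG_SCRAPE, "test", 60)
# ... the caller is free to do other work here ...
posts = job.value
puts posts.length
```

Because the diffed code keeps its results in module-level instance variables such as `@posts`, it is probably safest to run one scrape at a time rather than several threads in parallel; a job queue would be a reasonable substitute for the bare thread used here.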
data/insta_scrape.gemspec
CHANGED
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
|
|
10
10
|
spec.email = ["danielvassallo87@gmail.com"]
|
11
11
|
|
12
12
|
spec.summary = %q{Use Instagram Hashtag Embeds in 2016}
|
13
|
-
spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.}
|
13
|
+
spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist. v.1.1.0 -- Introducing long_scrape methods! Get thousands of photo results on hashtags and full user profiles with ALL posts! See the documentation for API usage.}
|
14
14
|
spec.homepage = "https://github.com/dannyvassallo/insta_scrape"
|
15
15
|
spec.license = "MIT"
|
16
16
|
|
data/lib/insta_scrape.rb
CHANGED
@@ -10,6 +10,26 @@ module InstaScrape
|
|
10
10
|
scrape_posts
|
11
11
|
end
|
12
12
|
|
13
|
+
#long scrape a hashtag
|
14
|
+
def self.long_scrape_hashtag(hashtag, scrape_length)
|
15
|
+
visit "https://www.instagram.com/explore/tags/#{hashtag}/"
|
16
|
+
@posts = []
|
17
|
+
long_scrape_posts(scrape_length)
|
18
|
+
end
|
19
|
+
|
20
|
+
#long scrape a user's posts
|
21
|
+
def self.long_scrape_user_posts(username, scrape_length)
|
22
|
+
@posts = []
|
23
|
+
long_scrape_user_posts_method(username, scrape_length)
|
24
|
+
end
|
25
|
+
|
26
|
+
#get user info and posts
|
27
|
+
def self.long_scrape_user_info_and_posts(username, scrape_length)
|
28
|
+
scrape_user_info(username)
|
29
|
+
long_scrape_user_posts_method(username, scrape_length)
|
30
|
+
@user = InstaScrape::InstagramUserWithPosts.new(username, @image, @post_count, @follower_count, @following_count, @description, @posts)
|
31
|
+
end
|
32
|
+
|
13
33
|
#get user info
|
14
34
|
def self.user_info(username)
|
15
35
|
scrape_user_info(username)
|
@@ -95,8 +115,35 @@ module InstaScrape
|
|
95
115
|
iteration = 0
|
96
116
|
while iteration < max_iteration do
|
97
117
|
iteration += 1
|
98
|
-
|
99
|
-
sleep 0.
|
118
|
+
page.execute_script "window.scrollTo(0,document.body.scrollHeight);"
|
119
|
+
sleep 0.1
|
120
|
+
page.execute_script "window.scrollTo(0,(document.body.scrollHeight - 5000));"
|
121
|
+
sleep 0.1
|
122
|
+
end
|
123
|
+
iterate_through_posts
|
124
|
+
rescue Capybara::ElementNotFound => e
|
125
|
+
begin
|
126
|
+
iterate_through_posts
|
127
|
+
end
|
128
|
+
end
|
129
|
+
end
|
130
|
+
|
131
|
+
def self.long_scrape_posts(scrape_length_in_seconds)
|
132
|
+
begin
|
133
|
+
page.find('a', :text => "Load more", exact: true).click
|
134
|
+
max_iteration = (scrape_length_in_seconds / 0.3)
|
135
|
+
iteration = 0
|
136
|
+
@loader = "."
|
137
|
+
while iteration < max_iteration do
|
138
|
+
puts "InstaScrape is working. Please wait.#{@loader}"
|
139
|
+
iteration += 1
|
140
|
+
sleep 0.1
|
141
|
+
page.execute_script "window.scrollTo(0,document.body.scrollHeight);"
|
142
|
+
sleep 0.1
|
143
|
+
page.execute_script "window.scrollTo(0,(document.body.scrollHeight - 5000));"
|
144
|
+
sleep 0.1
|
145
|
+
@loader << "."
|
146
|
+
system "clear"
|
100
147
|
end
|
101
148
|
iterate_through_posts
|
102
149
|
rescue Capybara::ElementNotFound => e
|
@@ -106,6 +153,12 @@ module InstaScrape
|
|
106
153
|
end
|
107
154
|
end
|
108
155
|
|
156
|
+
def self.long_scrape_user_posts_method(username, scrape_length_in_seconds)
|
157
|
+
@posts = []
|
158
|
+
visit "https://www.instagram.com/#{username}/"
|
159
|
+
long_scrape_posts(scrape_length_in_seconds)
|
160
|
+
end
|
161
|
+
|
109
162
|
def self.scrape_user_posts(username)
|
110
163
|
@posts = []
|
111
164
|
visit "https://www.instagram.com/#{username}/"
|
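In `long_scrape_posts` above, each loop pass sleeps three times for 0.1 s, which is why the iteration budget is `scrape_length_in_seconds / 0.3`. The arithmetic can be sketched as below; `scroll_passes` is a hypothetical helper written for illustration and does not exist in the gem.

```ruby
# Hypothetical helper, not part of insta_scrape. It mirrors the budget
# computed in long_scrape_posts: one pass sleeps 3 x 0.1 s = 0.3 s.
def scroll_passes(scrape_length_in_seconds)
  # seconds / 0.3 is written as seconds * 10 / 3.0 to keep the constants
  # exact; rounding up matches a `while iteration < max_iteration` loop,
  # which still runs a final pass for any fractional remainder.
  (scrape_length_in_seconds * 10 / 3.0).ceil
end

puts scroll_passes(30)
puts scroll_passes(60)
```

The count covers only the sleep time. Each pass also runs two `execute_script` calls, a `puts`, and `system "clear"`, so the real wall-clock duration is likely somewhat longer than the number of seconds passed in.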
data/lib/insta_scrape/version.rb
CHANGED
metadata
CHANGED
@@ -1,14 +1,14 @@
|
|
1
1
|
--- !ruby/object:Gem::Specification
|
2
2
|
name: insta_scrape
|
3
3
|
version: !ruby/object:Gem::Version
|
4
|
-
version: 1.
|
4
|
+
version: 1.1.0
|
5
5
|
platform: ruby
|
6
6
|
authors:
|
7
7
|
- dannyvassallo
|
8
8
|
autorequire:
|
9
9
|
bindir: exe
|
10
10
|
cert_chain: []
|
11
|
-
date: 2016-06-
|
11
|
+
date: 2016-06-13 00:00:00.000000000 Z
|
12
12
|
dependencies:
|
13
13
|
- !ruby/object:Gem::Dependency
|
14
14
|
name: bundler
|
@@ -174,6 +174,8 @@ dependencies:
|
|
174
174
|
version: 1.9.0
|
175
175
|
description: A ruby scraper for instagram in 2016. Because the hashtag deprecation
|
176
176
|
in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.
|
177
|
+
v.1.1.0 -- Introducing long_scrape methods! Get thousands of photo results on hashtags
|
178
|
+
and full user profiles with ALL posts! See the documentation for API usage.
|
177
179
|
email:
|
178
180
|
- danielvassallo87@gmail.com
|
179
181
|
executables: []
|