insta_scrape 1.0.0 → 1.1.0

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA1:
-   metadata.gz: abbaf8a5b61d0f1d25d3480a64b16eedacd32b58
-   data.tar.gz: a67b9a14e265d90f47dbd566b21d64e42694fe43
+   metadata.gz: 05c519af7d4c14f15487b131afd234e2cd0c6435
+   data.tar.gz: 66aee1224c0a0c9d931f864d331d698f35b0b60d
  SHA512:
-   metadata.gz: 5a2cf697cafb2e5a46ac4dad8453666cb50d4c2923dae7d6344b9fd35aef517798d4a45c8b0d260c7f780b1036505eafd6efaa716f3b0759892ed5b6f97fd1fd
-   data.tar.gz: 235eb28eee937dfa4350ef708b1117b96f49689d89a2c65f95bc3fcd3e4514f170cb491544c58507225ba71a1a7403c235bd1e72543e775119f8d2da19bb0ca0
+   metadata.gz: 8f5bb16a646d4a73f1b226b206d19b53edf5b8cc9b1bb88b6ac27bf9b302bc39096eee6a367870553a07f2b8560946fc6cc0cc1c1f36ce1d1a5a1e112caa13a8
+   data.tar.gz: 50a190f914e87580b97293595cc60bc78a0f037871b9a0a11b9ca0bbb33cef8af5c3571fd4087c3ea5c093b003da8bf4711bdfbdbadc78932563a46f5efe1fd5
data/README.md CHANGED
@@ -7,21 +7,23 @@ This gem is dependent on Capybara, PhantomJS, and Poltergeist.

  Using this gem you can access multiple facets of the instagram API without needing authorization, most importantly the hashtag.

+ v.1.1.0 introducing "long_scrape" methods! Now with more instgram posts!
+
  ## Note

  The number of results may vary when using certain methods as this isn't an official endpoint.

  ## Todo

- * Pagination
- * Assess infinite scroll
+ * Built-in Pagination
+

  ## Installation

  Add this line to your application's Gemfile:

  ```ruby
- gem 'insta_scrape'
+ gem "insta_scrape"
  ```

  And then execute:
@@ -36,25 +38,47 @@ Or install it yourself as:

  ###Available methods

- As of right now, each method accepts only one argument - a hashtag or a username.
+ Long scrape method take two arguments -- (hashtag || username, time_in_seconds)
+ Each other method accepts only one argument - a hashtag or a username.

+
+ ####Long Scrape Methods
+ ```ruby
+ #These can take a while but produce the best results
+ #I would recommend running a background job to pull these scrapes
+
+ #long scrape a user and their posts
+ #depending on how long you run the scrape
+ #you can pull an entire user profile and all of their posts
+ #30 seconds is enough for a casual user (maybe less)
+ InstaScrape.long_scrape_user_info_and_posts('foofighters', 30)
+ #this does the same without pulling user info
+ InstaScrape.long_scrape_user_posts('foofighters', 30)
+
+ #pull all posts from a hashtag
+ #infinite scroll will run as long as the time passed in
+ InstaScrape.long_scrape_hashtag('test', 60)
+ #=> > 2k instagram posts! Tested in specs!
+ ```
+
+ ####Regular Methods
  ```ruby
  #scrape a hashtag for as many results as possible
  InstaScrape.hashtag("test")
  #scrape all user info
- InstaScrape.user_info("dannyvassallo")
+ InstaScrape.user_info("foofighters")
  #scrape all user info and posts
- InstaScrape.user_info_and_posts('foofighters')
+ InstaScrape.user_info_and_posts("foofighters")
  #scrape just a users posts (as many as possible)
- InstaScrape.user_posts('foofighters')
+ InstaScrape.user_posts("foofighters")
  #scrape a users follower count
- InstaScrape.user_follower_count('foofighters')
+ InstaScrape.user_follower_count("foofighters")
  #scrape a users following count
- InstaScrape.user_following_count('foofighters')
+ InstaScrape.user_following_count("foofighters")
  #scrape a users post count
- InstaScrape.user_post_count('foofighters')
+ InstaScrape.user_post_count("foofighters")
  #scrape a users description
- InstaScrape.user_description('foofighters')
+ InstaScrape.user_description("foofighters")
  ```

  ####Hashtag, User Post, and Nested Posts Scrape
@@ -62,8 +86,8 @@ InstaScrape.user_description('foofighters')
  ```ruby
  #basic use case

- #scrape_result = InstaScrape.user_info_and_posts('foofighters').posts
- #scrape_result = InstaScrape.user_posts('foofighters')
+ #scrape_result = InstaScrape.user_info_and_posts("foofighters").posts
+ #scrape_result = InstaScrape.user_posts("foofighters")
  scrape_result = InstaScrape.hashtag("test")
  scrape_result.each do |post|
    puts post.image
@@ -76,8 +100,8 @@ Here is a `.erb` example using MaterializeCSS to render the posts as cards:
  ```ruby
  #in your controller or helper assuming you aren't storing the posts

- #@posts = InstaScrape.user_info_and_posts('foofighters').posts
- #@posts = InstaScrape.user_posts('foofighters')
+ #@posts = InstaScrape.user_info_and_posts("foofighters").posts
+ #@posts = InstaScrape.user_posts("foofighters")
  @posts = InstaScrape.hashtag("test")
  ```

@@ -116,11 +140,17 @@ u.follower_count
  #returns => "1.5m"
  u.following_count
  #returns => "35"
+ u.description
+ #returns => "Foo Fighters Rock band smarturl.it/sonic-highways"
+
+ #and in the event you'd need it
+ u.username
+ #returns => "foofighters"
  ```

  Each of these attributes is accessible using the methods listed above as well.

- Using `u = InstaScrape.user_info_and_posts('foofighters')` will give access to the `u.posts` attribute and can be iterated through.
+ Using `u = InstaScrape.user_info_and_posts("foofighters")` will give access to the `u.posts` attribute and can be iterated through.
  The example above covers this.

  ## Development
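Note on the README change: the new "Long Scrape Methods" section recommends running long scrapes in a background job. A minimal sketch of that idea with a plain Ruby thread follows. The `slow_scrape` lambda is a hypothetical stand-in for a call such as `InstaScrape.long_scrape_hashtag('test', 60)`, so the example runs without the gem or PhantomJS; a real application would more likely hand the call to whatever job framework it already uses.

```ruby
# Hypothetical stand-in for InstaScrape.long_scrape_hashtag: it sleeps for
# the requested time and returns a list, so no browser or gem is needed.
slow_scrape = lambda do |hashtag, seconds|
  sleep seconds
  ["post tagged #{hashtag}"]
end

# Run the scrape off the main thread so the caller is not blocked.
worker = Thread.new { slow_scrape.call("test", 0.2) }

puts "main thread keeps going while the scrape runs"

# Thread#value waits for the thread to finish and returns the block's result.
posts = worker.value
puts "got #{posts.length} post(s)"
```

Swapping the lambda body for the real `InstaScrape` call should be the only change needed, assuming the scrape is not shared between threads.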
@@ -10,7 +10,7 @@ Gem::Specification.new do |spec|
    spec.email = ["danielvassallo87@gmail.com"]

    spec.summary = %q{Use Instagram Hashtag Embeds in 2016}
-   spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.}
+   spec.description = %q{A ruby scraper for instagram in 2016. Because the hashtag deprecation in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist. v.1.1.0 -- Introducing long_scrape methods! Get thousands of photo results on hashtags and full user profiles with ALL posts! See the documentation for API usage.}
    spec.homepage = "https://github.com/dannyvassallo/insta_scrape"
    spec.license = "MIT"

@@ -10,6 +10,26 @@ module InstaScrape
      scrape_posts
    end

+   #long scrape a hashtag
+   def self.long_scrape_hashtag(hashtag, scrape_length)
+     visit "https://www.instagram.com/explore/tags/#{hashtag}/"
+     @posts = []
+     long_scrape_posts(scrape_length)
+   end
+
+   #long scrape a hashtag
+   def self.long_scrape_user_posts(username, scrape_length)
+     @posts = []
+     long_scrape_user_posts_method(username, scrape_length)
+   end
+
+   #get user info and posts
+   def self.long_scrape_user_info_and_posts(username, scrape_length)
+     scrape_user_info(username)
+     long_scrape_user_posts_method(username, scrape_length)
+     @user = InstaScrape::InstagramUserWithPosts.new(username, @image, @post_count, @follower_count, @following_count, @description, @posts)
+   end
+
    #get user info
    def self.user_info(username)
      scrape_user_info(username)
@@ -95,8 +115,35 @@ module InstaScrape
        iteration = 0
        while iteration < max_iteration do
          iteration += 1
-         5.times { page.execute_script "window.scrollBy(0,10000)" }
-         sleep 0.2
+         page.execute_script "window.scrollTo(0,document.body.scrollHeight);"
+         sleep 0.1
+         page.execute_script "window.scrollTo(0,(document.body.scrollHeight - 5000));"
+         sleep 0.1
+       end
+       iterate_through_posts
+     rescue Capybara::ElementNotFound => e
+       begin
+         iterate_through_posts
+       end
+     end
+   end
+
+   def self.long_scrape_posts(scrape_length_in_seconds)
+     begin
+       page.find('a', :text => "Load more", exact: true).click
+       max_iteration = (scrape_length_in_seconds / 0.3)
+       iteration = 0
+       @loader = "."
+       while iteration < max_iteration do
+         puts "InstaScrape is working. Please wait.#{@loader}"
+         iteration += 1
+         sleep 0.1
+         page.execute_script "window.scrollTo(0,document.body.scrollHeight);"
+         sleep 0.1
+         page.execute_script "window.scrollTo(0,(document.body.scrollHeight - 5000));"
+         sleep 0.1
+         @loader << "."
+         system "clear"
        end
        iterate_through_posts
      rescue Capybara::ElementNotFound => e
@@ -106,6 +153,12 @@ module InstaScrape
      end
    end

+   def self.long_scrape_user_posts_method(username, scrape_length_in_seconds)
+     @posts = []
+     visit "https://www.instagram.com/#{username}/"
+     long_scrape_posts(scrape_length_in_seconds)
+   end
+
    def self.scrape_user_posts(username)
      @posts = []
      visit "https://www.instagram.com/#{username}/"
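Note on the scroll loop: `long_scrape_posts` computes `max_iteration` as `scrape_length_in_seconds / 0.3`, which matches the three `sleep 0.1` calls in each pass. Time spent in `page.execute_script`, `puts`, and `system "clear"` is not counted, so a run can presumably last longer than the requested number of seconds. One alternative, sketched here as an assumption rather than as the gem's behavior, is to loop until a monotonic-clock deadline. `run_until_deadline` is a hypothetical helper name, and the block stands in for the two scroll calls.

```ruby
# Hypothetical helper, not part of insta_scrape: repeat a block until a
# deadline passes instead of counting fixed-length iterations.
def run_until_deadline(duration_in_seconds, pause: 0.1)
  # The monotonic clock is unaffected by system clock adjustments.
  deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + duration_in_seconds
  iterations = 0
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    yield iterations
    iterations += 1
    sleep pause
  end
  iterations
end

# The block is a stand-in for the page.execute_script scroll calls.
started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
count = run_until_deadline(0.3, pause: 0.05) { |i| i }
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "ran #{count} iterations in #{elapsed.round(2)}s"
```

With this shape, slow work inside the block reduces the number of passes rather than extending the total run time by more than one pass.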
@@ -1,3 +1,3 @@
  module InstaScrape
-   VERSION = "1.0.0"
+   VERSION = "1.1.0"
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: insta_scrape
  version: !ruby/object:Gem::Version
-   version: 1.0.0
+   version: 1.1.0
  platform: ruby
  authors:
  - dannyvassallo
  autorequire:
  bindir: exe
  cert_chain: []
- date: 2016-06-11 00:00:00.000000000 Z
+ date: 2016-06-13 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -174,6 +174,8 @@ dependencies:
          version: 1.9.0
  description: A ruby scraper for instagram in 2016. Because the hashtag deprecation
    in the API is just silly. This gem is dependent on Capybara, PhantomJS, and Poltergeist.
+   v.1.1.0 -- Introducing long_scrape methods! Get thousands of photo results on hashtags
+   and full user profiles with ALL posts! See the documentation for API usage.
  email:
  - danielvassallo87@gmail.com
  executables: []