ae_easy-login 0.0.2 → 0.0.3

@@ -73,16 +73,438 @@ src="http://img.shields.io/badge/license-MIT-yellowgreen.svg"></a></p>
73
73
  handle login and session recovery, quite useful when scraping websites with
74
74
  login features and expiring sessions.</p>
75
75
 
76
- <p>Install gem: <code>gem install &#39;ae_easy-login&#39;</code></p>
76
+ <p>Install gem: <code>gem install &#39;ae_easy-login&#39;</code></p>
77
77
 
78
- <p>Require gem: <code>require &#39;ae_easy/login&#39;</code></p>
78
+ <p>Require gem: <code>require &#39;ae_easy/login&#39;</code></p>
79
79
 
80
- <p>Documentation can be found <a
80
+ <p>Code documentation can be found <a
81
81
  href="http://rubydoc.org/gems/ae_easy-login/frames">here</a>.</p>
82
+
83
+ <h2 id="label-How+to+implement">How to implement</h2>
84
+
85
+ <h3 id="label-Before+you+start">Before you start</h3>
86
+
87
+ <p>Most use cases for the <code>ae_easy-login</code> gem apply
88
+ to websites that have login pages and create sessions, so we will cover this
89
+ scenario in our example.</p>
90
+
91
+ <p>However, the <code>ae_easy-login</code> gem is designed to handle
92
+ <strong>ANY</strong> kind of session recovery, even those that don&#39;t
93
+ require a login form <code>POST</code>, just by changing the flow from:</p>
94
+
95
+ <pre class="code ruby"><code class="ruby">login -&gt; login_post -&gt; restore
96
+ </code></pre>
97
+
98
+ <p>to whatever you need, for example:</p>
99
+
100
+ <pre class="code ruby"><code class="ruby">home -&gt; search_page -&gt; restore
101
+ </code></pre>
102
+
103
+ <p>Here are some example use cases that can be handled by the
104
+ <code>ae_easy-login</code> gem:</p>
105
+ <ul><li>
106
+ <p>Websites that invalidate requests with fast-expiring cookies created on the
107
+ first request.</p>
108
+ </li><li>
109
+ <p>Websites that generate tokens on every search (either on cookies or
110
+ query_params) that are required to fetch a detail page.</p>
111
+ </li><li>
112
+ <p>Websites that expire sessions due to inactivity.</p>
113
+ </li><li>
114
+ <p>Websites that use complex login flows.</p>
115
+ </li><li>
116
+ <p>etc.</p>
117
+ </li></ul>
118
+
119
+ <p>Feel free to experiment with it until it fits your needs.</p>
120
+
121
+ <h3 id="label-Adding+ae_easy-login+to+your+project">Adding ae_easy-login to your project</h3>
122
+
123
+ <p>Let&#39;s assume a simple project implementing <code>ae_easy</code> like
124
+ the one described on <a
125
+ href="https://github.com/answersengine/ae_easy/blob/master/README.md">ae_easy
126
+ README.md</a> that scrapes your website.</p>
127
+
128
+ <p>Now let&#39;s assume your website has a login page
129
+ <code>https://example.com/login</code> with a session that expires before
130
+ our sample project&#39;s scrape job finishes, causing all remaining webpages to
131
+ respond with a <code>403</code> HTTP response code and fail. Quite the problem,
132
+ isn&#39;t it? Well, not anymore: the <code>ae_easy-login</code> gem to the
133
+ rescue!</p>
134
+
135
+ <p>First, let&#39;s create our base module that will contain our session
136
+ validation and recovery logic. For this example, we will call it
137
+ <code>LoginEnable</code>:</p>
138
+
139
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./lib/login_enable.rb
140
+ </span>
141
+ <span class='kw'>module</span> <span class='const'>LoginEnable</span>
142
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login.html" title="AeEasy::Login (module)">Login</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login/Plugin.html" title="AeEasy::Login::Plugin (module)">Plugin</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login/Plugin/EnabledBehavior.html" title="AeEasy::Login::Plugin::EnabledBehavior (module)">EnabledBehavior</a></span></span>
143
+
144
+ <span class='comment'># Hook to initialize login_flow configuration.
145
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_initialize_hook_login_plugin_enabled_behavior'>initialize_hook_login_plugin_enabled_behavior</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
146
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='label'>app_config:</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Config</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='rparen'>)</span><span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
147
+ <span class='ivar'>@login_flow</span> <span class='op'>=</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login.html" title="AeEasy::Login (module)">Login</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login/Flow.html" title="AeEasy::Login::Flow (class)">Flow</a></span></span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span> <span class='id identifier rubyid_opts'>opts</span>
148
+ <span class='ivar'>@cookie</span> <span class='op'>=</span> <span class='kw'>nil</span>
149
+ <span class='kw'>end</span>
150
+
151
+ <span class='comment'># Get cookie after applying response cookie.
152
+ </span> <span class='comment'># @return [String] Cookie string.
153
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_cookie'>cookie</span>
154
+ <span class='kw'>return</span> <span class='ivar'>@cookie</span> <span class='kw'>unless</span> <span class='ivar'>@cookie</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
155
+
156
+ <span class='id identifier rubyid_raw_cookie'>raw_cookie</span> <span class='op'>=</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>response_cookie</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span> <span class='op'>||</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>response_headers</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Set-Cookie</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span>
157
+ <span class='ivar'>@cookie</span> <span class='op'>=</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Helper</span><span class='op'>::</span><span class='const'>Cookie</span><span class='period'>.</span><span class='id identifier rubyid_update'>update</span><span class='lparen'>(</span><span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>headers</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Cookie</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='comma'>,</span> <span class='id identifier rubyid_raw_cookie'>raw_cookie</span><span class='rparen'>)</span>
158
+ <span class='ivar'>@cookie</span>
159
+ <span class='kw'>end</span>
160
+
161
+ <span class='comment'># Validates session.
162
+ </span> <span class='comment'># @return [Boolean] `true` when session is valid, else `false`.
163
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_valid_session?'>valid_session?</span>
164
+ <span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>200</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>404</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_include?'>include?</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>response_status_code</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
165
+ <span class='kw'>end</span>
166
+
167
+ <span class='comment'># Fix page session when session is invalid.
168
+ </span> <span class='comment'># @return [Boolean] `true` when session is valid, else `false`.
169
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
170
+ <span class='kw'>return</span> <span class='kw'>true</span> <span class='kw'>if</span> <span class='id identifier rubyid_valid_session?'>valid_session?</span>
171
+
172
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_fix_session'>fix_session</span> <span class='kw'>do</span>
173
+ <span class='id identifier rubyid_save_pages'>save_pages</span> <span class='lbracket'>[</span><span class='lbrace'>{</span>
174
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
175
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
176
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>priority</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='int'>9</span><span class='comma'>,</span>
177
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>freshness</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='const'>Time</span><span class='period'>.</span><span class='id identifier rubyid_now'>now</span><span class='period'>.</span><span class='id identifier rubyid_iso8601'>iso8601</span><span class='comma'>,</span>
178
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>stl=</span><span class='embexpr_beg'>#{</span><span class='id identifier rubyid_salt'>salt</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span><span class='comma'>,</span>
179
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>headers</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span>
180
+ <span class='comment'># Add any extra header you need here
181
+ </span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>stl=</span><span class='embexpr_beg'>#{</span><span class='id identifier rubyid_salt'>salt</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span>
182
+ <span class='rbrace'>}</span>
183
+ <span class='rbrace'>}</span><span class='rbracket'>]</span>
184
+ <span class='kw'>end</span>
185
+
186
+ <span class='kw'>false</span>
187
+ <span class='kw'>end</span>
188
+ <span class='kw'>end</span>
189
+ </code></pre>
190
+
191
+ <p>Notice that our example <code>valid_session?</code> method uses
192
+ <code>200</code> and <code>404</code> HTTP response codes to validate that
193
+ our session hasn&#39;t expired yet. However, <strong><em>this might not
194
+ be the case for your website</em></strong>, so make sure to modify this
195
+ method to fit your needs.</p>
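<p>As an illustration of adapting the validation, here is a minimal standalone sketch, assuming a site that sometimes answers <code>200</code> but serves its login form once the session has expired. The helper name <code>session_valid?</code>, the constants and the login marker are hypothetical and are not part of the gem:</p>

```ruby
# Hypothetical sketch: decide whether a fetched page still belongs to a live
# session. The accepted status codes and the login marker are assumptions.
VALID_STATUS_CODES = %w[200 404].freeze
LOGIN_MARKER = 'id="login-form"'.freeze # assumed marker of a login page

def session_valid?(page, content)
  status = page['response_status_code'].to_s.strip
  return false unless VALID_STATUS_CODES.include?(status)

  # Some sites answer 200 but serve the login form once the session expires.
  !content.to_s.include?(LOGIN_MARKER)
end
```

<p>Inside a parser, the same checks could replace the body of <code>valid_session?</code>, using the parser&#39;s own <code>page</code> and <code>content</code>.</p>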
196
+
197
+ <p>Our <code>fix_session</code> method will store any page with a failed
198
+ session by creating an output so it can be restored later once we have the
199
+ new active session cookie.</p>
200
+
201
+ <p><code>fix_session</code> method will also mark the current session cookie
202
+ as expired and <strong><em>enqueue a new <code>login</code> page with HIGH
203
+ priority, unless another parser has already done so, to avoid
204
+ duplicates</em></strong>.</p>
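<p>The hold-and-restore idea described above can be pictured with a small in-memory model. This is only a sketch under assumptions: the class <code>SessionFlowSketch</code> and all of its methods are hypothetical, and the gem&#39;s real <code>AeEasy::Login::Flow</code> may work quite differently:</p>

```ruby
# Hypothetical in-memory model of the hold-and-restore idea; this is not the
# gem's implementation and none of these names come from its API.
class SessionFlowSketch
  attr_reader :held_pages, :queue

  def initialize
    @expired = false
    @held_pages = []
    @queue = []
  end

  # Hold the failed page and enqueue a login page only once per expiration.
  def fix_session(failed_page, login_page)
    @held_pages << failed_page
    return if @expired

    @expired = true
    @queue << login_page
  end

  # After a successful login, re-enqueue every held page with the new cookie.
  def restore_held_pages(cookie)
    @expired = false
    @held_pages.each do |held|
      headers = (held['headers'] || {}).merge('Cookie' => cookie)
      @queue << held.merge('headers' => headers)
    end
    @held_pages.clear
  end
end

flow = SessionFlowSketch.new
login = { 'url' => 'https://example.com/login', 'page_type' => 'login' }
flow.fix_session({ 'url' => 'https://example.com/a' }, login)
flow.fix_session({ 'url' => 'https://example.com/b' }, login)
# Only one login page was enqueued even though two pages failed.
flow.restore_held_pages('sid=new')
```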
205
+
206
+ <p><code>cookie</code> method will merge the request cookies with the response
207
+ cookies, so we can be sure that the cookies are always updated when needed.</p>
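<p>To make the merge concrete, here is a standalone sketch using only the Ruby standard library. The helper <code>merge_cookies</code> is hypothetical and only approximates what a cookie update might do; it assumes simple <code>name=value</code> cookies and ignores attributes such as <code>Path</code> or <code>Expires</code>:</p>

```ruby
# Hypothetical helper: merge a request Cookie header with Set-Cookie values.
# Assumes simple name=value cookies; attributes such as Path are dropped.
def merge_cookies(request_cookie, set_cookies)
  jar = {}
  request_cookie.to_s.split(';').each do |pair|
    name, value = pair.strip.split('=', 2)
    jar[name] = value unless name.nil? || name.empty?
  end

  Array(set_cookies).each do |set_cookie|
    # Only the first "name=value" segment carries the cookie itself.
    name, value = set_cookie.to_s.split(';').first.to_s.strip.split('=', 2)
    jar[name] = value unless name.nil? || name.empty?
  end

  jar.map { |name, value| "#{name}=#{value}" }.join('; ')
end
```

<p>Response values win over request values with the same name, so an updated session cookie replaces the stale one.</p>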
208
+
209
+ <p>The next step is to create a simple parser that enqueues the <code>POST</code>
210
+ of our login page:</p>
211
+
212
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/login.rb
213
+ </span>
214
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
215
+ <span class='kw'>class</span> <span class='const'>Login</span>
216
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
217
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
218
+
219
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
220
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
221
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
222
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>login_post</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
223
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>priority</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='int'>10</span><span class='comma'>,</span>
224
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>method</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>POST</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
225
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_cookie'>cookie</span><span class='comma'>,</span>
226
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>headers</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span>
227
+ <span class='comment'># Add any extra header you need here
228
+ </span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_cookie'>cookie</span>
229
+ <span class='rbrace'>}</span>
230
+ <span class='rbrace'>}</span>
231
+ <span class='kw'>end</span>
232
+ <span class='kw'>end</span>
233
+ <span class='kw'>end</span>
234
+ </code></pre>
235
+
236
+ <p>Now let&#39;s handle the login response, seed and restore any page with an
237
+ expired session:</p>
238
+
239
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/login_post.rb
240
+ </span>
241
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
242
+ <span class='kw'>class</span> <span class='const'>LoginPost</span>
243
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
244
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
245
+
246
+ <span class='kw'>def</span> <span class='id identifier rubyid_seed!'>seed!</span>
247
+ <span class='kw'>return</span> <span class='kw'>if</span> <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_seeded?'>seeded?</span>
248
+
249
+ <span class='const'>Seeder</span><span class='op'>::</span><span class='const'>Seeder</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='lparen'>(</span><span class='label'>context:</span> <span class='id identifier rubyid_context'>context</span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_seed'>seed</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_new_page'>new_page</span><span class='op'>|</span>
250
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_fix_page!'>fix_page!</span> <span class='id identifier rubyid_new_page'>new_page</span>
251
+ <span class='kw'>end</span>
252
+
253
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_seeded!'>seeded!</span>
254
+ <span class='kw'>end</span>
255
+
256
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
257
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_update_config'>update_config</span><span class='lparen'>(</span>
258
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_cookie'>cookie</span><span class='comma'>,</span>
259
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>expired</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='kw'>false</span>
260
+ <span class='rparen'>)</span>
261
+
262
+ <span class='comment'># Wait for any pending fetch to be held
263
+ </span> <span class='id identifier rubyid_sleep'>sleep</span> <span class='int'>10</span>
264
+
265
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_restore_held_pages'>restore_held_pages</span>
266
+ <span class='id identifier rubyid_seed!'>seed!</span>
267
+ <span class='kw'>end</span>
268
+ <span class='kw'>end</span>
269
+ <span class='kw'>end</span>
270
+ </code></pre>
271
+
272
+ <p>Notice something interesting? That&#39;s right, the seeding happens
273
+ <strong>AFTER</strong> we got our new active session cookie, so the pages
274
+ we seed include the session cookie. We use
275
+ <code>login_flow.fix_page!</code> method to add our latest active session
276
+ cookie, along with some internal <code>page['vars']</code> (used to handle page
277
+ recovery) to our seeded pages.</p>
278
+
279
+ <p><strong>IMPORTANT:</strong> This example assumes that
280
+ <code>login_post</code> pages will never fail, but you might need to add
281
+ some extra validations to make sure the login attempt was successful before
282
+ restoring your pages.</p>
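<p>One possible shape for such a validation is sketched below. It assumes the site signals a successful login with a <code>200</code> or <code>302</code> status and a non-empty session cookie; the helper <code>login_succeeded?</code> and the cookie name <code>session_id</code> are hypothetical and must be adapted to your website:</p>

```ruby
# Hypothetical check run before restoring held pages. The cookie name
# 'session_id' and the accepted status codes are assumptions.
SESSION_COOKIE_NAME = 'session_id'.freeze

def login_succeeded?(page, cookie_string)
  status = page['response_status_code'].to_s.strip
  return false unless %w[200 302].include?(status)

  # The login only counts when a non-empty session cookie is present.
  cookie_string.to_s.split(';').any? do |pair|
    name, value = pair.strip.split('=', 2)
    name == SESSION_COOKIE_NAME && !value.to_s.empty?
  end
end
```

<p>A <code>login_post</code> parser could return early, or enqueue a fresh <code>login</code> page, when a check like this fails.</p>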
283
+
284
+ <p><strong><em>Note:</em></strong> This example assumes that all pages to be
285
+ seeded require an active session, so we will add it to all pages we seed,
286
+ but this will likely not apply to all pages to be seeded in a real life
287
+ scenario, so make sure to add it only to those pages that require an
288
+ active session.</p>
289
+
290
+ <p>The next step is to modify our seeder so it allows cookie inclusion by
291
+ adding a <code>block</code> param that will be used by our
292
+ <code>Parsers::LoginPost#seed!</code> method:</p>
293
+
294
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./seeder/seeder.rb
295
+ </span>
296
+ <span class='kw'>module</span> <span class='const'>Seeder</span>
297
+ <span class='kw'>class</span> <span class='const'>Seeder</span>
298
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Seeder</span>
299
+
300
+ <span class='kw'>def</span> <span class='id identifier rubyid_seed'>seed</span> <span class='op'>&amp;</span><span class='id identifier rubyid_block'>block</span>
301
+ <span class='id identifier rubyid_new_page'>new_page</span> <span class='op'>=</span> <span class='lbrace'>{</span>
302
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login.rb?query=food</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
303
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>search</span><span class='tstring_end'>&#39;</span></span>
304
+ <span class='rbrace'>}</span>
305
+ <span class='id identifier rubyid_block'>block</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_new_page'>new_page</span><span class='rparen'>)</span> <span class='kw'>unless</span> <span class='id identifier rubyid_block'>block</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
306
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_new_page'>new_page</span>
307
+ <span class='kw'>end</span>
308
+ <span class='kw'>end</span>
309
+ <span class='kw'>end</span>
310
+ </code></pre>
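<p>The block parameter also makes it possible to follow the earlier note and add the session only to pages that need it. The following standalone sketch shows the pattern in plain Ruby; <code>seed_pages</code> and the <code>requires_session</code> flag are invented for illustration and are not part of <code>ae_easy</code>:</p>

```ruby
# Hypothetical sketch of the block pattern: the caller decides which seeded
# pages receive the session cookie. 'requires_session' is an invented flag.
def seed_pages(new_pages)
  new_pages.map do |new_page|
    page = new_page.dup
    yield page if block_given?
    page
  end
end

cookie = 'sid=abc' # assumed active session cookie
seeded = seed_pages([
  { 'url' => 'https://example.com/search?query=food', 'requires_session' => true },
  { 'url' => 'https://example.com/about', 'requires_session' => false }
]) do |page|
  # Drop the helper flag and skip pages that do not need a session.
  next unless page.delete('requires_session')

  page['headers'] = { 'Cookie' => cookie }
end
```

<p>In the real project, the block body would be the place to call <code>login_flow.fix_page!</code> on the pages that require an active session.</p>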
311
+
312
+ <p>Now we will need to create a new seeder to seed the login page:</p>
313
+
314
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./seeder/login.rb
315
+ </span>
316
+ <span class='kw'>module</span> <span class='const'>Seeder</span>
317
+ <span class='kw'>class</span> <span class='const'>Login</span>
318
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Seeder</span>
319
+
320
+ <span class='kw'>def</span> <span class='id identifier rubyid_seed'>seed</span>
321
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
322
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
323
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
324
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>priority</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='int'>9</span>
325
+ <span class='rbrace'>}</span>
326
+ <span class='kw'>end</span>
327
+ <span class='kw'>end</span>
328
+ <span class='kw'>end</span>
329
+ </code></pre>
330
+
331
+ <p>Now let&#39;s modify our <code>./config.yaml</code> to add our new page
332
+ types to it, and to enable parsing of failed fetches, since our example
333
+ assumes that the website will return a <code>403</code> HTTP response code when
334
+ the session has expired:</p>
335
+
336
+ <pre class="code ruby"><code class="ruby"># ./config.yaml
337
+
338
+ parse_failed_pages: true
339
+
340
+ seeder:
341
+   file: ./router/seeder.rb
342
+   disabled: false
343
+
344
+ parsers:
345
+   - page_type: search
346
+     file: ./router/parser.rb
347
+     disabled: false
348
+   - page_type: product
349
+     file: ./router/parser.rb
350
+     disabled: false
351
+   - page_type: login
352
+     file: ./router/parser.rb
353
+     disabled: false
354
+   - page_type: login_post
355
+     file: ./router/parser.rb
356
+     disabled: false
357
+ </code></pre>
358
+
359
+ <p>And don&#39;t forget to modify <code>./ae_easy.yaml</code> to add our new
360
+ routes and change our seeder so the login page is seeded first instead of our
361
+ old seeder:</p>
362
+
363
+ <pre class="code ruby"><code class="ruby"># ./ae_easy.yaml
364
+
365
+ router:
366
+   parser:
367
+     routes:
368
+       - page_type: search
369
+         class: Parsers::Search
370
+       - page_type: product
371
+         class: Parsers::Product
372
+       - page_type: login
373
+         class: Parsers::Login
374
+       - page_type: login_post
375
+         class: Parsers::LoginPost
376
+
377
+   seeder:
378
+     routes:
379
+       - class: Seeder::Login
380
+ </code></pre>
381
+
382
+ <p>Now we will need to modify our routers as well, since we modified
383
+ our <code>ae_easy.yaml</code> routes and added new classes:</p>
384
+
385
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./router/seeder.rb
386
+ </span>
387
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>ae_easy/router</span><span class='tstring_end'>&#39;</span></span>
388
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./seeder/login</span><span class='tstring_end'>&#39;</span></span>
389
+
390
+ <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Router</span><span class='op'>::</span><span class='const'>Seeder</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='period'>.</span><span class='id identifier rubyid_route'>route</span> <span class='label'>context:</span> <span class='kw'>self</span>
391
+ </code></pre>
392
+
393
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./router/parser.rb
394
+ </span>
395
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cgi</span><span class='tstring_end'>&#39;</span></span>
396
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>ae_easy/router</span><span class='tstring_end'>&#39;</span></span>
397
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>ae_easy/login</span><span class='tstring_end'>&#39;</span></span>
398
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./lib/login_enable</span><span class='tstring_end'>&#39;</span></span>
399
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./seeder/seeder</span><span class='tstring_end'>&#39;</span></span>
400
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/search</span><span class='tstring_end'>&#39;</span></span>
401
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/product</span><span class='tstring_end'>&#39;</span></span>
402
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/login</span><span class='tstring_end'>&#39;</span></span>
403
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/login_post</span><span class='tstring_end'>&#39;</span></span>
404
+
405
+ <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Router</span><span class='op'>::</span><span class='const'>Parser</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='period'>.</span><span class='id identifier rubyid_route'>route</span> <span class='label'>context:</span> <span class='kw'>self</span>
406
+ </code></pre>
407
+
408
+ <p>Next, we need to include our <code>LoginEnable</code> module on every
409
+ parser that requires session validation to fix any expired session request.
410
+ To do this, we call our <code>LoginEnable#fix_session</code>
411
+ method as the first step of each parser&#39;s <code>parse</code>
412
+ method:</p>
413
+
414
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/search.rb
415
+ </span>
416
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
417
+ <span class='kw'>class</span> <span class='const'>Search</span>
418
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
419
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
420
+
421
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
422
+ <span class='kw'>return</span> <span class='kw'>unless</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
423
+
424
+ <span class='id identifier rubyid_html'>html</span> <span class='op'>=</span> <span class='const'>Nokogiri</span><span class='period'>.</span><span class='const'>HTML</span> <span class='id identifier rubyid_content'>content</span>
425
+ <span class='id identifier rubyid_html'>html</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>.name</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_element'>element</span><span class='op'>|</span>
426
+ <span class='id identifier rubyid_name'>name</span> <span class='op'>=</span> <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
427
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
428
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>https://example.com/product/</span><span class='embexpr_beg'>#{</span><span class='const'>CGI</span><span class='op'>::</span><span class='id identifier rubyid_escape'>escape</span> <span class='id identifier rubyid_name'>name</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span><span class='comma'>,</span>
429
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>product</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
430
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>vars</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_name'>name</span><span class='rbrace'>}</span>
431
+ <span class='rbrace'>}</span>
432
+ <span class='kw'>end</span>
433
+ <span class='kw'>end</span>
434
+ <span class='kw'>end</span>
435
+ <span class='kw'>end</span>
436
+ </code></pre>
437
+
438
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/product.rb
439
+ </span>
440
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
441
+ <span class='kw'>class</span> <span class='const'>Product</span>
442
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
443
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
444
+
445
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
446
+ <span class='kw'>return</span> <span class='kw'>unless</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
447
+
448
+ <span class='id identifier rubyid_html'>html</span> <span class='op'>=</span> <span class='const'>Nokogiri</span><span class='period'>.</span><span class='const'>HTML</span> <span class='id identifier rubyid_content'>content</span>
449
+ <span class='id identifier rubyid_description'>description</span> <span class='op'>=</span> <span class='id identifier rubyid_html'>html</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>.description</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_first'>first</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
450
+ <span class='id identifier rubyid_outputs'>outputs</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
451
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>_collection</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>product</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
452
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>vars</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='comma'>,</span>
453
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>description</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_description'>description</span>
454
+ <span class='rbrace'>}</span>
455
+ <span class='kw'>end</span>
456
+ <span class='kw'>end</span>
457
+ <span class='kw'>end</span>
458
+ </code></pre>
459
+
460
+ <p><strong><em>Note:</em></strong> This example assumes that all pages require
461
+ an active session, so we will add it to all parsers, but this will likely
462
+ not apply to all parsers in a real life scenario since not all web pages
463
+ will require a session, so make sure to add it only to the parsers that need
464
+ it.</p>
465
+
466
+ <p>Finally, we need to make sure that every page that requires an active
467
+ session is enqueued with our latest active session cookie, so we need to
468
+ call the <code>login_flow.fix_page!</code> method on each such page before
469
+ enqueuing it.</p>
470
+
471
+ <p>In this example, we already applied it to the search pages enqueued by our
472
+ seeder, so the only place left to modify is
473
+ <code>./parsers/search.rb</code> parser since it enqueues
474
+ <code>product</code> pages:</p>
475
+
476
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/search.rb
477
+ </span>
478
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
479
+ <span class='kw'>class</span> <span class='const'>Search</span>
480
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
481
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
482
+
483
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
484
+ <span class='kw'>return</span> <span class='kw'>unless</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
485
+
486
+ <span class='id identifier rubyid_html'>html</span> <span class='op'>=</span> <span class='const'>Nokogiri</span><span class='period'>.</span><span class='const'>HTML</span> <span class='id identifier rubyid_content'>content</span>
487
+ <span class='id identifier rubyid_html'>html</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>.name</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_element'>element</span><span class='op'>|</span>
488
+ <span class='id identifier rubyid_name'>name</span> <span class='op'>=</span> <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
489
+ <span class='id identifier rubyid_new_page'>new_page</span> <span class='op'>=</span> <span class='lbrace'>{</span>
490
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>https://example.com/product/</span><span class='embexpr_beg'>#{</span><span class='const'>CGI</span><span class='op'>::</span><span class='id identifier rubyid_escape'>escape</span> <span class='id identifier rubyid_name'>name</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span><span class='comma'>,</span>
491
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>product</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
492
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>vars</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_name'>name</span><span class='rbrace'>}</span>
493
+ <span class='rbrace'>}</span>
494
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_fix_page!'>fix_page!</span> <span class='id identifier rubyid_new_page'>new_page</span>
495
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_new_page'>new_page</span>
496
+ <span class='kw'>end</span>
497
+ <span class='kw'>end</span>
498
+ <span class='kw'>end</span>
499
+ <span class='kw'>end</span>
500
+ </code></pre>
501
+
502
+ <p>Hurray! You have now implemented a fully functional login flow with
503
+ automatic session recovery in your project.</p>
82
504
  </div></div>
83
505
 
84
506
  <div id="footer">
85
- Generated on Fri Sep 27 22:03:08 2019 by
507
+ Generated on Wed Oct 23 21:36:55 2019 by
86
508
  <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
87
509
  0.9.20 (ruby-2.5.3).
88
510
  </div>
@@ -100,7 +100,7 @@
100
100
  </div>
101
101
 
102
102
  <div id="footer">
103
- Generated on Fri Sep 27 22:03:08 2019 by
103
+ Generated on Wed Oct 23 21:36:55 2019 by
104
104
  <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
105
105
  0.9.20 (ruby-2.5.3).
106
106
  </div>
@@ -230,11 +230,13 @@ module AeEasy
230
230
  # @param [Hash] held_page Held page to fix.
231
231
  def fix_page! held_page
232
232
  clean_page_response! held_page
233
- held_page['cookie'] = merge_cookie held_page['cookie']
234
- held_page['headers'] = {} unless held_page.has_key? 'headers'
235
- if held_page['headers'].has_key? 'Cookie'
236
- held_page['headers']['Cookie'] = merge_cookie held_page['headers']['Cookie']
237
- end
233
+ cookie_key = held_page.has_key?(:cookie) ? :cookie : 'cookie'
234
+ headers_key = held_page.has_key?(:headers) ? :headers : 'headers'
235
+ held_page[cookie_key] = merge_cookie held_page[cookie_key]
236
+ held_page[headers_key] = {} unless held_page.has_key? headers_key
237
+ header_cookie_key = held_page[headers_key].has_key?('cookie') ? 'cookie' : 'Cookie'
238
+ held_page[headers_key][header_cookie_key] = '' unless held_page[headers_key].has_key? header_cookie_key
239
+ held_page[headers_key][header_cookie_key] = merge_cookie held_page[headers_key][header_cookie_key]
238
240
  add_vars! held_page
239
241
  custom_fix! held_page
240
242
  end
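The key-selection change in `fix_page!` above can be tried in isolation. The following is a minimal sketch, not the gem's implementation: `merge_cookie` here is a hypothetical stand-in that appends a fixed session value, `fix_cookie_keys!` is an illustrative name, and the calls to `clean_page_response!`, `add_vars!` and `custom_fix!` are left out. It only shows how the 0.0.3 logic picks between symbol and string keys and fills in a missing cookie header.

```ruby
# Hypothetical stand-in for the gem's cookie merge, used only in this sketch.
def merge_cookie(cookie)
  session = 'session=abc123'
  cookie.nil? || cookie.empty? ? session : "#{cookie}; #{session}"
end

# Mirrors the key handling from the 0.0.3 hunk (other fix_page! steps omitted).
def fix_cookie_keys!(held_page)
  # Prefer symbol keys when the page hash already uses them.
  cookie_key = held_page.has_key?(:cookie) ? :cookie : 'cookie'
  headers_key = held_page.has_key?(:headers) ? :headers : 'headers'
  held_page[cookie_key] = merge_cookie held_page[cookie_key]
  held_page[headers_key] = {} unless held_page.has_key? headers_key
  headers = held_page[headers_key]
  # Keep a lowercase 'cookie' header if present, otherwise use 'Cookie'.
  header_cookie_key = headers.has_key?('cookie') ? 'cookie' : 'Cookie'
  headers[header_cookie_key] = '' unless headers.has_key? header_cookie_key
  headers[header_cookie_key] = merge_cookie headers[header_cookie_key]
  held_page
end

symbol_page = fix_cookie_keys!({ cookie: 'lang=en', headers: { 'cookie' => 'lang=en' } })
string_page = fix_cookie_keys!({ 'url' => 'https://example.com/search' })
```

With symbol keys the existing entries are updated in place instead of being duplicated under string keys, and a page without any headers now always receives a `Cookie` header, which the 0.0.2 code only set when one was already present.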