ae_easy-login 0.0.2 → 0.0.3

@@ -73,16 +73,438 @@ src="http://img.shields.io/badge/license-MIT-yellowgreen.svg"></a></p>
73
73
  handle login and session recovery, quite useful when scraping websites with
74
74
  login features and expiring sessions.</p>
75
75
 
76
- <p>Install gem: <code>gem install &#39;ae_easy-login&#39;</code></p>
76
+ <p>Install gem: <code>ruby gem install &#39;ae_easy-login&#39; </code></p>
77
77
 
78
- <p>Require gem: <code>require &#39;ae_easy/login&#39;</code></p>
78
+ <p>Require gem: <code>ruby require &#39;ae_easy/login&#39; </code></p>
79
79
 
80
- <p>Documentation can be found <a
80
+ <p>Code documentation can be found <a
81
81
  href="http://rubydoc.org/gems/ae_easy-login/frames">here</a>.</p>
82
+
83
+ <h2 id="label-How+to+implement">How to implement</h2>
84
+
85
+ <h3 id="label-Before+you+start">Before you start</h3>
86
+
87
+ <p>Most use cases for the <code>ae_easy-login</code> gem involve websites
88
+ with login pages that create sessions, so our example covers this
89
+ scenario.</p>
90
+
91
+ <p>However, the <code>ae_easy-login</code> gem is designed to handle
92
+ <strong>ANY</strong> kind of session recovery, even those that don&#39;t
93
+ require a login form <code>POST</code>, just by changing the flow from:</p>
94
+
95
+ <pre class="code ruby"><code class="ruby">login -&gt; login_post -&gt; restore
96
+ </code></pre>
97
+
98
+ <p>To whatever you need, for example:</p>
99
+
100
+ <pre class="code ruby"><code class="ruby">home -&gt; search_page -&gt; restore
101
+ </code></pre>
102
+
103
+ <p>Here are some example use cases that the
104
+ <code>ae_easy-login</code> gem can handle:</p>
105
+ <ul><li>
106
+ <p>Websites that invalidate requests with fast-expiring cookies created on the
107
+ first request.</p>
108
+ </li><li>
109
+ <p>Websites that generate tokens on every search (either in cookies or
110
+ query_params) that are required to fetch a detail page.</p>
111
+ </li><li>
112
+ <p>Websites that expire sessions due to inactivity.</p>
113
+ </li><li>
114
+ <p>Websites that use complex login flows.</p>
115
+ </li><li>
116
+ <p>etc.</p>
117
+ </li></ul>
118
+
119
+ <p>Feel free to experiment with it until it fits your needs.</p>
120
+
121
+ <h3 id="label-Adding+ae_easy-login+to+your+project">Adding ae_easy-login to your project</h3>
122
+
123
+ <p>Let&#39;s assume a simple project implementing <code>ae_easy</code> like
124
+ the one described on <a
125
+ href="https://github.com/answersengine/ae_easy/blob/master/README.md">ae_easy
126
+ README.md</a> that scrapes your website.</p>
127
+
128
+ <p>Now let&#39;s assume your website has a login page
129
+ <code>https://example.com/login</code> with a session that expires before
130
+ our sample project&#39;s scrape job finishes, causing all remaining webpages to
131
+ respond with a <code>403</code> HTTP response code and fail. Quite the problem,
132
+ isn&#39;t it? Well, not anymore: the <code>ae_easy-login</code> gem to the
133
+ rescue!</p>
134
+
135
+ <p>First, let&#39;s create a base module that will contain our session
136
+ validation and recovery logic. For this example, we will call it
137
+ <code>LoginEnable</code>:</p>
138
+
139
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./lib/login_enable.rb
140
+ </span>
141
+ <span class='kw'>module</span> <span class='const'>LoginEnable</span>
142
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login.html" title="AeEasy::Login (module)">Login</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login/Plugin.html" title="AeEasy::Login::Plugin (module)">Plugin</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login/Plugin/EnabledBehavior.html" title="AeEasy::Login::Plugin::EnabledBehavior (module)">EnabledBehavior</a></span></span>
143
+
144
+ <span class='comment'># Hook to initialize login_flow configuration.
145
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_initialize_hook_login_plugin_enabled_behavior'>initialize_hook_login_plugin_enabled_behavior</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
146
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='label'>app_config:</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Config</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='rparen'>)</span><span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
147
+ <span class='ivar'>@login_flow</span> <span class='op'>=</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login.html" title="AeEasy::Login (module)">Login</a></span></span><span class='op'>::</span><span class='const'><span class='object_link'><a href="AeEasy/Login/Flow.html" title="AeEasy::Login::Flow (class)">Flow</a></span></span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span> <span class='id identifier rubyid_opts'>opts</span>
148
+ <span class='ivar'>@cookie</span> <span class='op'>=</span> <span class='kw'>nil</span>
149
+ <span class='kw'>end</span>
150
+
151
+ <span class='comment'># Get cookie after applying response cookie.
152
+ </span> <span class='comment'># @return [String] Cookie string.
153
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_cookie'>cookie</span>
154
+ <span class='kw'>return</span> <span class='ivar'>@cookie</span> <span class='kw'>unless</span> <span class='ivar'>@cookie</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
155
+
156
+ <span class='id identifier rubyid_raw_cookie'>raw_cookie</span> <span class='op'>=</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>response_cookie</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span> <span class='op'>||</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>response_headers</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Set-Cookie</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span>
157
+ <span class='ivar'>@cookie</span> <span class='op'>=</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Helper</span><span class='op'>::</span><span class='const'>Cookie</span><span class='period'>.</span><span class='id identifier rubyid_update'>update</span><span class='lparen'>(</span><span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>headers</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Cookie</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='comma'>,</span> <span class='id identifier rubyid_raw_cookie'>raw_cookie</span><span class='rparen'>)</span>
158
+ <span class='ivar'>@cookie</span>
159
+ <span class='kw'>end</span>
160
+
161
+ <span class='comment'># Validates session.
162
+ </span> <span class='comment'># @return [Boolean] `true` when session is valid, else `false`.
163
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_valid_session?'>valid_session?</span>
164
+ <span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>200</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>404</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_include?'>include?</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>response_status_code</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
165
+ <span class='kw'>end</span>
166
+
167
+ <span class='comment'># Fix page session when session is invalid.
168
+ </span> <span class='comment'># @return [Boolean] `true` when session is valid, else `false`.
169
+ </span> <span class='kw'>def</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
170
+ <span class='kw'>return</span> <span class='kw'>true</span> <span class='kw'>if</span> <span class='id identifier rubyid_valid_session?'>valid_session?</span>
171
+
172
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_fix_session'>fix_session</span> <span class='kw'>do</span>
173
+ <span class='id identifier rubyid_save_pages'>save_pages</span> <span class='lbracket'>[</span><span class='lbrace'>{</span>
174
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
175
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
176
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>priority</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='int'>9</span><span class='comma'>,</span>
177
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>freshness</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='const'>Time</span><span class='period'>.</span><span class='id identifier rubyid_now'>now</span><span class='period'>.</span><span class='id identifier rubyid_iso8601'>iso8601</span><span class='comma'>,</span>
178
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>stl=</span><span class='embexpr_beg'>#{</span><span class='id identifier rubyid_salt'>salt</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span><span class='comma'>,</span>
179
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>headers</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span>
180
+ <span class='comment'># Add any extra header you need here
181
+ </span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>stl=</span><span class='embexpr_beg'>#{</span><span class='id identifier rubyid_salt'>salt</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span>
182
+ <span class='rbrace'>}</span>
183
+ <span class='rbrace'>}</span><span class='rbracket'>]</span>
184
+ <span class='kw'>end</span>
185
+
186
+ <span class='kw'>false</span>
187
+ <span class='kw'>end</span>
188
+ <span class='kw'>end</span>
189
+ </code></pre>
190
+
191
+ <p>Notice that our example <code>valid_session?</code> method uses the
192
+ <code>200</code> and <code>404</code> HTTP response codes to validate that
193
+ our session hasn&#39;t expired yet. However, <strong><em>this might not
194
+ be the case for your website</em></strong>, so make sure to modify this
195
+ method to fit your needs.</p>
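As a rough illustration of adapting session validation to a different website, the sketch below treats certain status codes as "expired" and also checks the body for a login form. The helper name `session_alive?`, its keyword arguments, and the `name="password"` marker are assumptions made for this example; they are not part of the `ae_easy-login` API.

```ruby
# Hypothetical helper (not part of ae_easy-login): decides whether a fetched
# page still belongs to a live session.
# `page` mimics the page hash used in the example; `body` is the page content.
def session_alive?(page, body, expired_codes: %w[401 403], login_marker: 'name="password"')
  status = page['response_status_code'].to_s.strip
  return false if expired_codes.include?(status)

  # Some websites answer 200 but render the login form instead of the content.
  return false if body.to_s.include?(login_marker)

  true
end

puts session_alive?({ 'response_status_code' => 200 }, '<h1>Results</h1>')
puts session_alive?({ 'response_status_code' => 403 }, '')
puts session_alive?({ 'response_status_code' => 200 }, '<input name="password">')
```

A check like this could replace the body of `valid_session?` once you know how your target website signals an expired session.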
196
+
197
+ <p>Our <code>fix_session</code> method will store any page with a failed
198
+ session by creating an output so it can be restored later once we have the
199
+ new active session cookie.</p>
200
+
201
+ <p><code>fix_session</code> method will also mark the current session cookie
202
+ as expired and <strong><em>enqueue a new <code>login</code> page with HIGH
203
+ priority, as long as another parser hasn&#39;t already done so, to avoid
204
+ duplicates</em></strong>.</p>
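The "hold the page, enqueue login only once" behavior can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the class `SessionRecoverySketch` and its methods are hypothetical, and the real gem persists its state rather than keeping it in instance variables.

```ruby
# Hypothetical in-memory model of session recovery; not the gem's implementation.
class SessionRecoverySketch
  attr_reader :held_pages, :queue

  def initialize
    @expired = false
    @held_pages = []
    @queue = []
  end

  # Hold the failed page and enqueue a single high-priority login page.
  def fix_session(page)
    @held_pages << page
    return if @expired # another page already requested a new login

    @expired = true
    @queue << { 'url' => 'https://example.com/login', 'page_type' => 'login', 'priority' => 9 }
  end

  # Re-enqueue every held page with the fresh session cookie.
  def restore(cookie)
    @expired = false
    restored = @held_pages.map { |page| page.merge('headers' => { 'Cookie' => cookie }) }
    @held_pages = []
    @queue.concat(restored)
    restored
  end
end

flow = SessionRecoverySketch.new
flow.fix_session({ 'url' => 'https://example.com/a' })
flow.fix_session({ 'url' => 'https://example.com/b' })
puts flow.queue.length # only one login page was enqueued
flow.restore('sid=new')
puts flow.queue.length # login page plus the two restored pages
```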
205
+
206
+ <p><code>cookie</code> method will merge the request cookies with the response
207
+ cookies, so we can be sure that the cookies are always updated when needed.</p>
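To make the idea of merging request and response cookies concrete, here is a minimal plain-Ruby sketch. The helpers `parse_cookie_pairs` and `merge_cookies` are hypothetical stand-ins written for this example; they do not reproduce the gem's cookie helper and ignore details such as expiry and domain scoping.

```ruby
# Hypothetical stand-in for cookie merging; not the gem's implementation.

# Turn "a=1; b=2" into { "a" => "1", "b" => "2" }.
def parse_cookie_pairs(text)
  text.to_s.split(';').each_with_object({}) do |pair, jar|
    name, value = pair.strip.split('=', 2)
    jar[name] = value unless name.nil? || name.empty? || value.nil?
  end
end

# Apply Set-Cookie headers on top of the cookies sent with the request.
def merge_cookies(request_cookie, set_cookie_headers)
  jar = parse_cookie_pairs(request_cookie)
  Array(set_cookie_headers).each do |header|
    # Only the first "name=value" segment is the cookie itself; the remaining
    # segments are attributes such as Path or HttpOnly.
    name, value = header.to_s.split(';').first.to_s.strip.split('=', 2)
    jar[name] = value unless name.nil? || name.empty? || value.nil?
  end
  jar.map { |name, value| "#{name}=#{value}" }.join('; ')
end

puts merge_cookies('sid=old; lang=en', ['sid=new; Path=/; HttpOnly', 'token=abc'])
# prints "sid=new; lang=en; token=abc"
```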
208
+
209
+ <p>The next step is to create a simple parser that enqueues the <code>POST</code>
210
+ of our login page:</p>
211
+
212
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/login.rb
213
+ </span>
214
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
215
+ <span class='kw'>class</span> <span class='const'>Login</span>
216
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
217
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
218
+
219
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
220
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
221
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
222
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>login_post</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
223
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>priority</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='int'>10</span><span class='comma'>,</span>
224
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>method</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>POST</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
225
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_cookie'>cookie</span><span class='comma'>,</span>
226
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>headers</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span>
227
+ <span class='comment'># Add any extra header you need here
228
+ </span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>Cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_cookie'>cookie</span>
229
+ <span class='rbrace'>}</span>
230
+ <span class='rbrace'>}</span>
231
+ <span class='kw'>end</span>
232
+ <span class='kw'>end</span>
233
+ <span class='kw'>end</span>
234
+ </code></pre>
235
+
236
+ <p>Now let&#39;s handle the login response, seed and restore any page with an
237
+ expired session:</p>
238
+
239
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/login_post.rb
240
+ </span>
241
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
242
+ <span class='kw'>class</span> <span class='const'>LoginPost</span>
243
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
244
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
245
+
246
+ <span class='kw'>def</span> <span class='id identifier rubyid_seed!'>seed!</span>
247
+ <span class='kw'>return</span> <span class='kw'>if</span> <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_seeded?'>seeded?</span>
248
+
249
+ <span class='const'>Seeder</span><span class='op'>::</span><span class='const'>Seeder</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='lparen'>(</span><span class='label'>context:</span> <span class='id identifier rubyid_context'>context</span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_seed'>seed</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_new_page'>new_page</span><span class='op'>|</span>
250
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_fix_page!'>fix_page!</span> <span class='id identifier rubyid_new_page'>new_page</span>
251
+ <span class='kw'>end</span>
252
+
253
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_seeded!'>seeded!</span>
254
+ <span class='kw'>end</span>
255
+
256
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
257
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_update_config'>update_config</span><span class='lparen'>(</span>
258
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cookie</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_cookie'>cookie</span><span class='comma'>,</span>
259
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>expired</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='kw'>false</span>
260
+ <span class='rparen'>)</span>
261
+
262
+ <span class='comment'># Wait for any pending fetch to be held
263
+ </span> <span class='id identifier rubyid_sleep'>sleep</span> <span class='int'>10</span>
264
+
265
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_restore_held_pages'>restore_held_pages</span>
266
+ <span class='id identifier rubyid_seed!'>seed!</span>
267
+ <span class='kw'>end</span>
268
+ <span class='kw'>end</span>
269
+ <span class='kw'>end</span>
270
+ </code></pre>
271
+
272
+ <p>Notice something interesting? That&#39;s right, the seeding happens
273
+ <strong>AFTER</strong> we get our new active session cookie, so the pages
274
+ we seed include the session cookie. We use the
275
+ <code>login_flow.fix_page!</code> method to add our latest active session
276
+ cookie, along with some internal <code>page['vars']</code> (used to handle page
277
+ recovery), to our seeded pages.</p>
278
+
279
+ <p><strong>IMPORTANT:</strong> This example assumes that
280
+ <code>login_post</code> pages will never fail, but you might need to add
281
+ some extra validations to make sure the login attempt was successful before
282
+ restoring your pages.</p>
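One possible shape for such an extra validation is sketched below. It assumes the website signals a successful login with a `200` or `302` status code and a session cookie named `sid`; both are placeholders for this example, and the helper `login_succeeded?` is hypothetical rather than something the gem provides.

```ruby
# Hypothetical check (not part of ae_easy-login): was the login POST accepted?
# `page` mimics the page hash used in the example; `cookie` is the merged
# cookie string available after the POST response.
def login_succeeded?(page, cookie, session_cookie_name: 'sid')
  status = page['response_status_code'].to_s.strip
  return false unless %w[200 302].include?(status)

  # Require the session cookie to be present before restoring held pages.
  cookie.to_s.split(';').any? do |pair|
    pair.strip.start_with?("#{session_cookie_name}=")
  end
end

puts login_succeeded?({ 'response_status_code' => 302 }, 'lang=en; sid=abc123')
puts login_succeeded?({ 'response_status_code' => 200 }, 'lang=en')
```

A parser could call a check like this before restoring held pages, and enqueue a fresh `login` page instead when it returns `false`.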
283
+
284
+ <p><strong><em>Note:</em></strong> This example assumes that all pages to be
285
+ seeded require an active session, so we add the cookie to all pages we seed,
286
+ but this will likely not apply to all pages to be seeded in a real life
287
+ scenario, so make sure to add it only to those pages that require an
288
+ active session.</p>
289
+
290
+ <p>The next step is to modify our seeder so it allows cookie inclusion by
291
+ adding a <code>block</code> param that will be used by our
292
+ <code>Parsers::LoginPost#seed!</code> method:</p>
293
+
294
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./seeder/seeder.rb
295
+ </span>
296
+ <span class='kw'>module</span> <span class='const'>Seeder</span>
297
+ <span class='kw'>class</span> <span class='const'>Seeder</span>
298
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Seeder</span>
299
+
300
+ <span class='kw'>def</span> <span class='id identifier rubyid_seed'>seed</span> <span class='op'>&amp;</span><span class='id identifier rubyid_block'>block</span>
301
+ <span class='id identifier rubyid_new_page'>new_page</span> <span class='op'>=</span> <span class='lbrace'>{</span>
302
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login.rb?query=food</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
303
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>search</span><span class='tstring_end'>&#39;</span></span>
304
+ <span class='rbrace'>}</span>
305
+ <span class='id identifier rubyid_block'>block</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_new_page'>new_page</span><span class='rparen'>)</span> <span class='kw'>unless</span> <span class='id identifier rubyid_block'>block</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
306
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_new_page'>new_page</span>
307
+ <span class='kw'>end</span>
308
+ <span class='kw'>end</span>
309
+ <span class='kw'>end</span>
310
+ </code></pre>
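The block parameter pattern used by the seeder can be tried out in isolation with plain Ruby. In this sketch, `SeederSketch` and the search URL are hypothetical and no `ae_easy` classes are involved; it only shows how an optional block lets the caller decorate each page before it is enqueued.

```ruby
# Hypothetical stand-alone seeder illustrating the optional block hook.
class SeederSketch
  attr_reader :pages

  def initialize
    @pages = []
  end

  def seed
    new_page = {
      'url' => 'https://example.com/search?query=food',
      'page_type' => 'search'
    }
    # Let the caller modify the page (for example, add the session cookie)
    # before it is enqueued. Without a block the page is enqueued unchanged.
    yield new_page if block_given?
    pages << new_page
  end
end

seeder = SeederSketch.new
seeder.seed { |page| page['headers'] = { 'Cookie' => 'sid=abc' } }
seeder.seed
puts seeder.pages.inspect
```

Keeping the block optional means the same seeder still works when it is run directly, without any session handling.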
311
+
312
+ <p>Now we need to create a new seeder to seed the login page:</p>
313
+
314
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./seeder/login.rb
315
+ </span>
316
+ <span class='kw'>module</span> <span class='const'>Seeder</span>
317
+ <span class='kw'>class</span> <span class='const'>Login</span>
318
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Seeder</span>
319
+
320
+ <span class='kw'>def</span> <span class='id identifier rubyid_seed'>seed</span>
321
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
322
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>https://example.com/login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
323
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>login</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
324
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>priority</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='int'>9</span>
325
+ <span class='rbrace'>}</span>
326
+ <span class='kw'>end</span>
327
+ <span class='kw'>end</span>
328
+ <span class='kw'>end</span>
329
+ </code></pre>
330
+
331
+ <p>Now let&#39;s modify our <code>./config.yaml</code> to add our new page
332
+ types and to enable parsing of failed fetches, since our example
333
+ assumes that the website returns a <code>403</code> HTTP response code when
334
+ the session has expired:</p>
335
+
336
+ <pre class="code ruby"><code class="ruby"># ./config.yaml
337
+
338
+ parse_failed_pages: true
339
+
340
+ seeder:
341
+   file: ./router/seeder.rb
342
+   disabled: false
343
+
344
+ parsers:
345
+   - page_type: search
346
+     file: ./router/parser.rb
347
+     disabled: false
348
+   - page_type: product
349
+     file: ./router/parser.rb
350
+     disabled: false
351
+   - page_type: login
352
+     file: ./router/parser.rb
353
+     disabled: false
354
+   - page_type: login_post
355
+     file: ./router/parser.rb
356
+     disabled: false
357
+ </code></pre>
358
+
359
+ <p>And don&#39;t forget to modify <code>./ae_easy.yaml</code> to add our new
360
+ routes and change our seeder so the login page is seeded first instead of our
361
+ old seeder:</p>
362
+
363
+ <pre class="code ruby"><code class="ruby"># ./ae_easy.yaml
364
+
365
+ router:
366
+   parser:
367
+     routes:
368
+       - page_type: search
369
+         class: Parsers::Search
370
+       - page_type: product
371
+         class: Parsers::Product
372
+       - page_type: login
373
+         class: Parsers::Login
374
+       - page_type: login_post
375
+         class: Parsers::LoginPost
376
+
377
+   seeder:
378
+     routes:
379
+       - class: Seeder::Login
380
+ </code></pre>
381
+
382
+ <p>Now we also need to modify our routers, since we modified
383
+ our <code>ae_easy.yaml</code> routes and added new classes:</p>
384
+
385
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./router/seeder.rb
386
+ </span>
387
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>ae_easy/router</span><span class='tstring_end'>&#39;</span></span>
388
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./seeder/login</span><span class='tstring_end'>&#39;</span></span>
389
+
390
+ <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Router</span><span class='op'>::</span><span class='const'>Seeder</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='period'>.</span><span class='id identifier rubyid_route'>route</span> <span class='label'>context:</span> <span class='kw'>self</span>
391
+ </code></pre>
392
+
393
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./router/parser.rb
394
+ </span>
395
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>cgi</span><span class='tstring_end'>&#39;</span></span>
396
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>ae_easy/router</span><span class='tstring_end'>&#39;</span></span>
397
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>ae_easy/login</span><span class='tstring_end'>&#39;</span></span>
398
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./lib/login_enable</span><span class='tstring_end'>&#39;</span></span>
399
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./seeder/seeder</span><span class='tstring_end'>&#39;</span></span>
400
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/search</span><span class='tstring_end'>&#39;</span></span>
401
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/product</span><span class='tstring_end'>&#39;</span></span>
402
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/login</span><span class='tstring_end'>&#39;</span></span>
403
+ <span class='id identifier rubyid_require'>require</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>./parsers/login_post</span><span class='tstring_end'>&#39;</span></span>
404
+
405
+ <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Router</span><span class='op'>::</span><span class='const'>Parser</span><span class='period'>.</span><span class='id identifier rubyid_new'>new</span><span class='period'>.</span><span class='id identifier rubyid_route'>route</span> <span class='label'>context:</span> <span class='kw'>self</span>
406
+ </code></pre>
407
+
408
+ <p>Next, we need to include our <code>LoginEnable</code> module in every
409
+ parser that requires session validation, so that requests made with an expired session get fixed.
410
+ To do this, we call our <code>LoginEnable#fix_session</code>
411
+ method as the first step of each parser&#39;s <code>parse</code>
412
+ method:</p>
413
+
414
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/search.rb
415
+ </span>
416
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
417
+ <span class='kw'>class</span> <span class='const'>Search</span>
418
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
419
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
420
+
421
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
422
+ <span class='kw'>return</span> <span class='kw'>unless</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
423
+
424
+ <span class='id identifier rubyid_html'>html</span> <span class='op'>=</span> <span class='const'>Nokogiri</span><span class='period'>.</span><span class='const'>HTML</span> <span class='id identifier rubyid_content'>content</span>
425
+ <span class='id identifier rubyid_html'>html</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>.name</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_element'>element</span><span class='op'>|</span>
426
+ <span class='id identifier rubyid_name'>name</span> <span class='op'>=</span> <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
427
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
428
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>https://example.com/product/</span><span class='embexpr_beg'>#{</span><span class='const'>CGI</span><span class='op'>::</span><span class='id identifier rubyid_escape'>escape</span> <span class='id identifier rubyid_name'>name</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span><span class='comma'>,</span>
429
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>product</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
430
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>vars</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_name'>name</span><span class='rbrace'>}</span>
431
+ <span class='rbrace'>}</span>
432
+ <span class='kw'>end</span>
433
+ <span class='kw'>end</span>
434
+ <span class='kw'>end</span>
435
+ <span class='kw'>end</span>
436
+ </code></pre>
437
+
438
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/product.rb
439
+ </span>
440
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
441
+ <span class='kw'>class</span> <span class='const'>Product</span>
442
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
443
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
444
+
445
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
446
+ <span class='kw'>return</span> <span class='kw'>unless</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
447
+
448
+ <span class='id identifier rubyid_html'>html</span> <span class='op'>=</span> <span class='const'>Nokogiri</span><span class='period'>.</span><span class='const'>HTML</span> <span class='id identifier rubyid_content'>content</span>
449
+ <span class='id identifier rubyid_description'>description</span> <span class='op'>=</span> <span class='id identifier rubyid_html'>html</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>.description</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_first'>first</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
450
+ <span class='id identifier rubyid_outputs'>outputs</span> <span class='op'>&lt;&lt;</span> <span class='lbrace'>{</span>
451
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>_collection</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>product</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
452
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_page'>page</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>vars</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='lbracket'>[</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span><span class='rbracket'>]</span><span class='comma'>,</span>
453
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>description</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_description'>description</span>
454
+ <span class='rbrace'>}</span>
455
+ <span class='kw'>end</span>
456
+ <span class='kw'>end</span>
457
+ <span class='kw'>end</span>
458
+ </code></pre>
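The guard clause used in both parsers above can be sketched in isolation. This is a minimal sketch, not the gem's implementation: `FakeLoginEnable`, `DemoParser`, `session_expired?`, and `recovery_queue` are hypothetical names invented for illustration, and it assumes `fix_session` returns a falsy value when the session has expired and recovery has been scheduled.

```ruby
# Hypothetical stand-in for the LoginEnable module from this walkthrough.
# Assumption: fix_session returns false when the session is expired.
module FakeLoginEnable
  def fix_session
    return true unless session_expired?

    # Pretend to schedule session recovery for the current page.
    recovery_queue << page['url']
    false
  end
end

# Hypothetical parser with just enough state to run outside the platform.
class DemoParser
  include FakeLoginEnable

  attr_reader :page, :outputs, :recovery_queue

  def initialize(page, expired)
    @page = page
    @expired = expired
    @outputs = []
    @recovery_queue = []
  end

  def session_expired?
    @expired
  end

  def parse
    # Stop early: the page content is unusable until the session is restored.
    return unless fix_session

    outputs << { 'url' => page['url'] }
  end
end

valid = DemoParser.new({ 'url' => 'https://example.com/search' }, false)
valid.parse
puts valid.outputs.size # parsed normally

expired = DemoParser.new({ 'url' => 'https://example.com/search' }, true)
expired.parse
puts expired.recovery_queue.inspect # page handed over to recovery
```

Putting the guard on the first line keeps the rest of `parse` free of session checks, since nothing below it runs against a page fetched with an expired session.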
459
+
460
+ <p><strong><em>Note:</em></strong> This example assumes that all pages require
461
+ an active session, so we add it to all parsers. This will likely
462
+ not apply to all parsers in a real-life scenario, since not all web pages
463
+ will require a session, so make sure to add it only to the parsers that need
464
+ it.</p>
465
+
466
+ <p>Finally, we need to make sure that every page that requires an active
467
+ session is enqueued with our latest active session cookie, so we need to
468
+ call the <code>login_flow.fix_page!</code> method on every such page before
469
+ it is enqueued.</p>
470
+
471
+ <p>In this example, we already added it to the search pages enqueued by our
472
+ seeder, so the only place left to modify is the
473
+ <code>./parsers/search.rb</code> parser, since it enqueues
474
+ <code>product</code> pages:</p>
475
+
476
+ <pre class="code ruby"><code class="ruby"><span class='comment'># ./parsers/search.rb
477
+ </span>
478
+ <span class='kw'>module</span> <span class='const'>Parsers</span>
479
+ <span class='kw'>class</span> <span class='const'>Search</span>
480
+ <span class='id identifier rubyid_include'>include</span> <span class='const'><span class='object_link'><a href="AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span><span class='op'>::</span><span class='const'>Core</span><span class='op'>::</span><span class='const'>Plugin</span><span class='op'>::</span><span class='const'>Parser</span>
481
+ <span class='id identifier rubyid_include'>include</span> <span class='const'>LoginEnable</span>
482
+
483
+ <span class='kw'>def</span> <span class='id identifier rubyid_parse'>parse</span>
484
+ <span class='kw'>return</span> <span class='kw'>unless</span> <span class='id identifier rubyid_fix_session'>fix_session</span>
485
+
486
+ <span class='id identifier rubyid_html'>html</span> <span class='op'>=</span> <span class='const'>Nokogiri</span><span class='period'>.</span><span class='const'>HTML</span> <span class='id identifier rubyid_content'>content</span>
487
+ <span class='id identifier rubyid_html'>html</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>.name</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_element'>element</span><span class='op'>|</span>
488
+ <span class='id identifier rubyid_name'>name</span> <span class='op'>=</span> <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
489
+ <span class='id identifier rubyid_new_page'>new_page</span> <span class='op'>=</span> <span class='lbrace'>{</span>
490
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>url</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>https://example.com/product/</span><span class='embexpr_beg'>#{</span><span class='const'>CGI</span><span class='op'>::</span><span class='id identifier rubyid_escape'>escape</span> <span class='id identifier rubyid_name'>name</span><span class='embexpr_end'>}</span><span class='tstring_end'>&quot;</span></span><span class='comma'>,</span>
491
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>page_type</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>product</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span>
492
+ <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>vars</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='lbrace'>{</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>name</span><span class='tstring_end'>&#39;</span></span> <span class='op'>=&gt;</span> <span class='id identifier rubyid_name'>name</span><span class='rbrace'>}</span>
493
+ <span class='rbrace'>}</span>
494
+ <span class='id identifier rubyid_login_flow'>login_flow</span><span class='period'>.</span><span class='id identifier rubyid_fix_page!'>fix_page!</span> <span class='id identifier rubyid_new_page'>new_page</span>
495
+ <span class='id identifier rubyid_pages'>pages</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_new_page'>new_page</span>
496
+ <span class='kw'>end</span>
497
+ <span class='kw'>end</span>
498
+ <span class='kw'>end</span>
499
+ <span class='kw'>end</span>
500
+ </code></pre>
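To make "enqueued with our latest active session cookie" more concrete, here is a rough sketch of what refreshing a page's cookie could look like. It is not the gem's code: `parse_cookie`, `merge_cookie_strings`, and `apply_session_cookie!` are hypothetical helpers, and the merge rule (newer values override older ones with the same name) is an assumption about the intended behavior.

```ruby
# Hypothetical helper: turn "a=1; b=2" into { 'a' => '1', 'b' => '2' }.
def parse_cookie(raw)
  raw.to_s.split(';').each_with_object({}) do |pair, jar|
    key, value = pair.strip.split('=', 2)
    jar[key] = value.to_s unless key.nil? || key.empty?
  end
end

# Hypothetical helper: values in new_cookie override those in old_cookie.
def merge_cookie_strings(old_cookie, new_cookie)
  merged = parse_cookie(old_cookie).merge(parse_cookie(new_cookie))
  merged.map { |key, value| "#{key}=#{value}" }.join('; ')
end

# Illustrative only: accept page hashes keyed by either :cookie or 'cookie'.
def apply_session_cookie!(page, session_cookie)
  cookie_key = page.key?(:cookie) ? :cookie : 'cookie'
  page[cookie_key] = merge_cookie_strings(page[cookie_key], session_cookie)
  page
end

new_page = {
  'url' => 'https://example.com/product/abc',
  'page_type' => 'product',
  'cookie' => 'lang=en; session=expired123'
}
apply_session_cookie!(new_page, 'session=fresh456')
puts new_page['cookie'] # prints "lang=en; session=fresh456"
```

In the walkthrough itself, `login_flow.fix_page!` is the call that does this work, so a helper like the one above should not be needed in a real project.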
501
+
502
+ <p>Hurray! You have now implemented a fully functional login flow with
503
+ auto-recovery capabilities in your project.</p>
82
504
  </div></div>
83
505
 
84
506
  <div id="footer">
85
- Generated on Fri Sep 27 22:03:08 2019 by
507
+ Generated on Wed Oct 23 21:36:55 2019 by
86
508
  <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
87
509
  0.9.20 (ruby-2.5.3).
88
510
  </div>
@@ -100,7 +100,7 @@
100
100
  </div>
101
101
 
102
102
  <div id="footer">
103
- Generated on Fri Sep 27 22:03:08 2019 by
103
+ Generated on Wed Oct 23 21:36:55 2019 by
104
104
  <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
105
105
  0.9.20 (ruby-2.5.3).
106
106
  </div>
@@ -230,11 +230,13 @@ module AeEasy
230
230
  # @param [Hash] held_page Held page to fix.
231
231
  def fix_page! held_page
232
232
  clean_page_response! held_page
233
- held_page['cookie'] = merge_cookie held_page['cookie']
234
- held_page['headers'] = {} unless held_page.has_key? 'headers'
235
- if held_page['headers'].has_key? 'Cookie'
236
- held_page['headers']['Cookie'] = merge_cookie held_page['headers']['Cookie']
237
- end
233
+ cookie_key = held_page.has_key?(:cookie) ? :cookie : 'cookie'
234
+ headers_key = held_page.has_key?(:headers) ? :headers : 'headers'
235
+ held_page[cookie_key] = merge_cookie held_page[cookie_key]
236
+ held_page[headers_key] = {} unless held_page.has_key? headers_key
237
+ header_cookie_key = held_page[headers_key].has_key?('cookie') ? 'cookie' : 'Cookie'
238
+ held_page[headers_key][header_cookie_key] = '' unless held_page[headers_key].has_key? header_cookie_key
239
+ held_page[headers_key][header_cookie_key] = merge_cookie held_page[headers_key][header_cookie_key]
238
240
  add_vars! held_page
239
241
  custom_fix! held_page
240
242
  end