ae_easy-text 0.0.1

checksums.yaml ADDED
@@ -0,0 +1,7 @@
+ ---
+ SHA256:
+   metadata.gz: 239189344e783f67b085da7394e535aa693a4b067c62b8d0b16f733a0b19d4f7
+   data.tar.gz: ca144105f26e399116b05560ff870f6aa051a04696602f6f68f67f06b9e0bfda
+ SHA512:
+   metadata.gz: 0b7c4495eeb71e5dae3ad799d14f8a2d83989a949183ee3df2837191b4a4f3a10965ead38416ccda078da7b29fc083eb02fd53a24999f97473b02f77489d921c
+   data.tar.gz: 4f377b26bcfb0ef4cce7806d153fb97de115d0e0bc4beef5e43b55b0125e1936d6c4440d1131cbe9cb4d5fe28a1810c2820269f6c317511068c91aff42ad8126
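The digests listed in checksums.yaml can be compared against the files inside the gem archive. A minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted locally; the `digest_matches?` helper and the example path are hypothetical and are not part of this gem or of RubyGems:

```ruby
require 'digest'

# Hypothetical helper: true when the file's SHA256 hex digest equals the expected value.
def digest_matches?(path, expected_hex)
  Digest::SHA256.file(path).hexdigest == expected_hex
end

# Example call (assumes metadata.gz sits in the current directory):
# digest_matches?('metadata.gz',
#   '239189344e783f67b085da7394e535aa693a4b067c62b8d0b16f733a0b19d4f7')
```

The SHA512 entries could be checked the same way with `Digest::SHA512`.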
data/.gitignore ADDED
@@ -0,0 +1,12 @@
+ /.byebug*
+ /.bundle/
+ /.yardoc
+ /_yardoc/
+ /coverage/
+ /pkg/
+ /spec/reports/
+ /tmp/
+ /certs/
+ /checksum/
+ /vendor/
+ /Gemfile.lock
data/.travis.yml ADDED
@@ -0,0 +1,7 @@
+ ---
+ sudo: false
+ language: ruby
+ cache: bundler
+ rvm:
+   - 2.4.2
+ before_install: gem install bundler -v 1.16.3
data/.yardopts ADDED
@@ -0,0 +1 @@
+ --no-private
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at parama@answersengine.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,6 @@
+ source "https://rubygems.org"
+
+ git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }
+
+ # Specify your gem's dependencies in answersengine.gemspec
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2019 AnswersEngine
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
data/README.md ADDED
@@ -0,0 +1,16 @@
+ [![Documentation](http://img.shields.io/badge/docs-rdoc.info-blue.svg)](http://rubydoc.org/gems/ae_easy-text/frames)
+ [![Gem Version](https://badge.fury.io/rb/ae_easy-text.svg)](http://github.com/answersengine/ae_easy-text/releases)
+ [![License](http://img.shields.io/badge/license-MIT-yellowgreen.svg)](#license)
+
+ # AeEasy text module
+ ## Description
+
+ AeEasy text is part of the AeEasy gem collection. It provides multiple text parsing helpers to ease common text parsing use cases.
+
+ Install gem:
+ ```gem install 'ae_easy-text'```
+
+ Require gem:
+ ```require 'ae_easy-text'```
+
+ Documentation can be found [here](http://rubydoc.org/gems/ae_easy-text/frames).
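The README stops at `require`. As a rough illustration of the HTML helpers, the YARD pages shipped in this release show `AeEasy::Text.encode_html` and `AeEasy::Text.decode_html` delegating to `CGI.escapeHTML` and `CGI.unescapeHTML`. The sketch below calls those standard-library methods directly so that it runs without the gem installed; the sample string is made up:

```ruby
require 'cgi'

raw = '<b>Tom & Jerry</b>'

# Same standard-library call that the documented encode_html body makes.
encoded = CGI.escapeHTML(raw)

# Same standard-library call that the documented decode_html body makes.
decoded = CGI.unescapeHTML(encoded)

puts encoded
puts decoded
```

With the gem loaded, the equivalent calls would presumably be `AeEasy::Text.encode_html(raw)` and `AeEasy::Text.decode_html(encoded)`.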
data/Rakefile ADDED
@@ -0,0 +1,22 @@
+ require 'benchmark'
+ require 'bundler/gem_tasks'
+ require 'rake/testtask'
+
+ Rake::TestTask.new do |t|
+   t.libs = ['lib', 'test']
+   t.warning = false
+   t.verbose = false
+   t.test_files = FileList['./test/**/*_test.rb']
+ end
+
+ desc 'Benchmark another task execution | usage example: benchmark[my_task, param1, param2]'
+ task :benchmark, [:task] do |task, args|
+   task_name = args[:task]
+   if task_name.nil?
+     puts "Should select a task."
+     exit 1
+   end
+   puts Benchmark.measure{ Rake::Task[task_name].invoke *args.extras }
+ end
+
+ task default: :test
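The `benchmark` task wraps another task in `Benchmark.measure` and prints the resulting timings. A minimal sketch of that pattern outside Rake, with an arbitrary workload standing in for the invoked task:

```ruby
require 'benchmark'

# Stand-in for Rake::Task[task_name].invoke; the workload is arbitrary.
timing = Benchmark.measure { 10_000.times.map { |i| i * i } }

# Benchmark.measure returns a Benchmark::Tms; printing it shows
# user, system, total and real (wall-clock) seconds.
puts timing
```

With the Rakefile above, the matching command line would be `rake "benchmark[test]"`, quoted so that the shell does not interpret the brackets.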
@@ -0,0 +1,49 @@
+
+ lib = File.expand_path("../lib", __FILE__)
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
+ require "ae_easy/text/version"
+
+ Gem::Specification.new do |spec|
+   spec.name = "ae_easy-text"
+   spec.version = AeEasy::Text::VERSION
+   spec.authors = ["Eduardo Rosales"]
+   spec.email = ["eduardo@datahen.com"]
+
+   spec.summary = %q{AnswersEngine Easy toolkit text module}
+   spec.description = %q{AnswersEngine Easy toolkit text module contains multiple text parsing helpers.}
+   spec.homepage = "https://answersengine.com"
+   spec.license = "MIT"
+
+   # spec.cert_chain = ['certs/ae_easy.pem']
+   # spec.signing_key = File.expand_path("~/.ssh/gems/gem-private_ae_easy.pem") if $0 =~ /gem\z/
+
+   # Prevent pushing this gem to RubyGems.org. To allow pushes either set the 'allowed_push_host'
+   # to allow pushing to a single host or delete this section to allow pushing to any host.
+   if spec.respond_to?(:metadata)
+     # spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
+
+     spec.metadata["homepage_uri"] = spec.homepage
+     spec.metadata["source_code_uri"] = "https://github.com/answersengine/ae_easy-text"
+     # spec.metadata["changelog_uri"] = "TODO: Put your gem's CHANGELOG.md URL here."
+   else
+     raise "RubyGems 2.0 or newer is required to protect against " \
+       "public gem pushes."
+   end
+
+   # Specify which files should be added to the gem when it is released.
+   # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
+   spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
+     `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
+   end
+   spec.require_paths = ["lib"]
+   spec.required_ruby_version = '>= 2.2.2'
+
+   spec.add_dependency 'ae_easy-core', '>= 0'
+   spec.add_development_dependency 'bundler', '>= 1.16.3'
+   spec.add_development_dependency 'rake', '>= 10.0'
+   spec.add_development_dependency 'minitest', '>= 5.11'
+   spec.add_development_dependency 'simplecov', '>= 0.16.1'
+   spec.add_development_dependency 'simplecov-console', '>= 0.4.2'
+   spec.add_development_dependency 'timecop', '>= 0.9.1'
+   spec.add_development_dependency 'byebug', '>= 0'
+ end
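The version constraints in the gemspec are evaluated by RubyGems. A small sketch using `Gem::Requirement` and `Gem::Version`, both of which ship with RubyGems, to show how `'>= 2.2.2'` and `'>= 0'` behave; the version numbers being tested are only examples:

```ruby
require 'rubygems'

ruby_req = Gem::Requirement.new('>= 2.2.2')
any_req  = Gem::Requirement.new('>= 0')

# Ruby 2.4.2 (the version pinned in .travis.yml) satisfies the constraint.
puts ruby_req.satisfied_by?(Gem::Version.new('2.4.2'))

# An older interpreter version does not.
puts ruby_req.satisfied_by?(Gem::Version.new('2.1.0'))

# '>= 0' places no lower bound, so any ae_easy-core version number passes.
puts any_req.satisfied_by?(Gem::Version.new('0.0.1'))
```

Because `'>= 0'` never excludes a release, a future breaking version of `ae_easy-core` would also be accepted; a pessimistic constraint such as `'~> 0.0'` is one common way to narrow that, if the maintainers want it.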
data/doc/AeEasy.html ADDED
@@ -0,0 +1,117 @@
1
+ <!DOCTYPE html>
2
+ <html>
3
+ <head>
4
+ <meta charset="utf-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>
7
+ Module: AeEasy
8
+
9
+ &mdash; Documentation by YARD 0.9.18
10
+
11
+ </title>
12
+
13
+ <link rel="stylesheet" href="css/style.css" type="text/css" charset="utf-8" />
14
+
15
+ <link rel="stylesheet" href="css/common.css" type="text/css" charset="utf-8" />
16
+
17
+ <script type="text/javascript" charset="utf-8">
18
+ pathId = "AeEasy";
19
+ relpath = '';
20
+ </script>
21
+
22
+
23
+ <script type="text/javascript" charset="utf-8" src="js/jquery.js"></script>
24
+
25
+ <script type="text/javascript" charset="utf-8" src="js/app.js"></script>
26
+
27
+
28
+ </head>
29
+ <body>
30
+ <div class="nav_wrap">
31
+ <iframe id="nav" src="class_list.html?1"></iframe>
32
+ <div id="resizer"></div>
33
+ </div>
34
+
35
+ <div id="main" tabindex="-1">
36
+ <div id="header">
37
+ <div id="menu">
38
+
39
+ <a href="_index.html">Index (A)</a> &raquo;
40
+
41
+
42
+ <span class="title">AeEasy</span>
43
+
44
+ </div>
45
+
46
+ <div id="search">
47
+
48
+ <a class="full_list_link" id="class_list_link"
49
+ href="class_list.html">
50
+
51
+ <svg width="24" height="24">
52
+ <rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
53
+ <rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
54
+ <rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
55
+ </svg>
56
+ </a>
57
+
58
+ </div>
59
+ <div class="clear"></div>
60
+ </div>
61
+
62
+ <div id="content"><h1>Module: AeEasy
63
+
64
+
65
+
66
+ </h1>
67
+ <div class="box_info">
68
+
69
+
70
+
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+
79
+ <dl>
80
+ <dt>Defined in:</dt>
81
+ <dd>lib/ae_easy/text.rb<span class="defines">,<br />
82
+ lib/ae_easy/text/version.rb</span>
83
+ </dd>
84
+ </dl>
85
+
86
+ </div>
87
+
88
+ <h2>Defined Under Namespace</h2>
89
+ <p class="children">
90
+
91
+
92
+ <strong class="modules">Modules:</strong> <span class='object_link'><a href="AeEasy/Text.html" title="AeEasy::Text (module)">Text</a></span>
93
+
94
+
95
+
96
+
97
+ </p>
98
+
99
+
100
+
101
+
102
+
103
+
104
+
105
+
106
+
107
+ </div>
108
+
109
+ <div id="footer">
110
+ Generated on Tue Feb 26 16:50:02 2019 by
111
+ <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
112
+ 0.9.18 (ruby-2.5.3).
113
+ </div>
114
+
115
+ </div>
116
+ </body>
117
+ </html>
@@ -0,0 +1,2024 @@
1
+ <!DOCTYPE html>
2
+ <html>
3
+ <head>
4
+ <meta charset="utf-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>
7
+ Module: AeEasy::Text
8
+
9
+ &mdash; Documentation by YARD 0.9.18
10
+
11
+ </title>
12
+
13
+ <link rel="stylesheet" href="../css/style.css" type="text/css" charset="utf-8" />
14
+
15
+ <link rel="stylesheet" href="../css/common.css" type="text/css" charset="utf-8" />
16
+
17
+ <script type="text/javascript" charset="utf-8">
18
+ pathId = "AeEasy::Text";
19
+ relpath = '../';
20
+ </script>
21
+
22
+
23
+ <script type="text/javascript" charset="utf-8" src="../js/jquery.js"></script>
24
+
25
+ <script type="text/javascript" charset="utf-8" src="../js/app.js"></script>
26
+
27
+
28
+ </head>
29
+ <body>
30
+ <div class="nav_wrap">
31
+ <iframe id="nav" src="../class_list.html?1"></iframe>
32
+ <div id="resizer"></div>
33
+ </div>
34
+
35
+ <div id="main" tabindex="-1">
36
+ <div id="header">
37
+ <div id="menu">
38
+
39
+ <a href="../_index.html">Index (T)</a> &raquo;
40
+ <span class='title'><span class='object_link'><a href="../AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span>
41
+ &raquo;
42
+ <span class="title">Text</span>
43
+
44
+ </div>
45
+
46
+ <div id="search">
47
+
48
+ <a class="full_list_link" id="class_list_link"
49
+ href="../class_list.html">
50
+
51
+ <svg width="24" height="24">
52
+ <rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
53
+ <rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
54
+ <rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
55
+ </svg>
56
+ </a>
57
+
58
+ </div>
59
+ <div class="clear"></div>
60
+ </div>
61
+
62
+ <div id="content"><h1>Module: AeEasy::Text
63
+
64
+
65
+
66
+ </h1>
67
+ <div class="box_info">
68
+
69
+
70
+
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+
79
+ <dl>
80
+ <dt>Defined in:</dt>
81
+ <dd>lib/ae_easy/text.rb<span class="defines">,<br />
82
+ lib/ae_easy/text/version.rb</span>
83
+ </dd>
84
+ </dl>
85
+
86
+ </div>
87
+
88
+
89
+
90
+ <h2>
91
+ Constant Summary
92
+ <small><a href="#" class="constants_summary_toggle">collapse</a></small>
93
+ </h2>
94
+
95
+ <dl class="constants">
96
+
97
+ <dt id="VERSION-constant" class="">VERSION =
98
+ <div class="docstring">
99
+ <div class="discussion">
100
+
101
+ <p>Gem version</p>
102
+
103
+
104
+ </div>
105
+ </div>
106
+ <div class="tags">
107
+
108
+
109
+ </div>
110
+ </dt>
111
+ <dd><pre class="code"><span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>0.0.1</span><span class='tstring_end'>&quot;</span></span></pre></dd>
112
+
113
+ </dl>
114
+
115
+
116
+
117
+
118
+
119
+
120
+
121
+
122
+
123
+ <h2>
124
+ Class Method Summary
125
+ <small><a href="#" class="summary_toggle">collapse</a></small>
126
+ </h2>
127
+
128
+ <ul class="summary">
129
+
130
+ <li class="public ">
131
+ <span class="summary_signature">
132
+
133
+ <a href="#decode_html-class_method" title="decode_html (class method)">.<strong>decode_html</strong>(text) &#x21d2; String </a>
134
+
135
+
136
+
137
+ </span>
138
+
139
+
140
+
141
+
142
+
143
+
144
+
145
+
146
+
147
+ <span class="summary_desc"><div class='inline'>
148
+ <p>Decode HTML entities from text.</p>
149
+ </div></span>
150
+
151
+ </li>
152
+
153
+
154
+ <li class="public ">
155
+ <span class="summary_signature">
156
+
157
+ <a href="#default_parser-class_method" title="default_parser (class method)">.<strong>default_parser</strong>(cell_element, data, key) &#x21d2; Object </a>
158
+
159
+
160
+
161
+ </span>
162
+
163
+
164
+
165
+
166
+
167
+
168
+
169
+
170
+
171
+ <span class="summary_desc"><div class='inline'>
172
+ <p>Default cell content parser used to parse cell element.</p>
173
+ </div></span>
174
+
175
+ </li>
176
+
177
+
178
+ <li class="public ">
179
+ <span class="summary_signature">
180
+
181
+ <a href="#encode_html-class_method" title="encode_html (class method)">.<strong>encode_html</strong>(text) &#x21d2; String </a>
182
+
183
+
184
+
185
+ </span>
186
+
187
+
188
+
189
+
190
+
191
+
192
+
193
+
194
+
195
+ <span class="summary_desc"><div class='inline'>
196
+ <p>Encode text for valid HTML entities.</p>
197
+ </div></span>
198
+
199
+ </li>
200
+
201
+
202
+ <li class="public ">
203
+ <span class="summary_signature">
204
+
205
+ <a href="#hash-class_method" title="hash (class method)">.<strong>hash</strong>(object) &#x21d2; String </a>
206
+
207
+
208
+
209
+ </span>
210
+
211
+
212
+
213
+
214
+
215
+
216
+
217
+
218
+
219
+ <span class="summary_desc"><div class='inline'>
220
+ <p>Create a hash from object.</p>
221
+ </div></span>
222
+
223
+ </li>
224
+
225
+
226
+ <li class="public ">
227
+ <span class="summary_signature">
228
+
229
+ <a href="#parse_content-class_method" title="parse_content (class method)">.<strong>parse_content</strong>(opts) {|data, row, header_map| ... } &#x21d2; Array&lt;Hash&gt;<sup>?</sup> </a>
230
+
231
+
232
+
233
+ </span>
234
+
235
+
236
+
237
+
238
+
239
+
240
+
241
+
242
+
243
+ <span class="summary_desc"><div class='inline'>
244
+ <p>Parse row data matching a selector using a header map to translate
245
+ between columns and friendly keys.</p>
246
+ </div></span>
247
+
248
+ </li>
249
+
250
+
251
+ <li class="public ">
252
+ <span class="summary_signature">
253
+
254
+ <a href="#parse_header_map-class_method" title="parse_header_map (class method)">.<strong>parse_header_map</strong>(opts = {}) &#x21d2; Hash{Symbol,String =&gt; Integer}<sup>?</sup> </a>
255
+
256
+
257
+
258
+ </span>
259
+
260
+
261
+
262
+
263
+
264
+
265
+
266
+
267
+
268
+ <span class="summary_desc"><div class='inline'>
269
+ <p>Parse header from selector and create a header map to match a column key
270
+ with column index.</p>
271
+ </div></span>
272
+
273
+ </li>
274
+
275
+
276
+ <li class="public ">
277
+ <span class="summary_signature">
278
+
279
+ <a href="#parse_table-class_method" title="parse_table (class method)">.<strong>parse_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; Hash{Symbol =&gt; Array,Hash,nil} </a>
280
+
281
+
282
+
283
+ </span>
284
+
285
+
286
+
287
+
288
+
289
+
290
+
291
+
292
+
293
+ <span class="summary_desc"><div class='inline'>
294
+ <p>Parse data from a horizontal table-like structure matching selectors and
295
+ using a header map to match columns.</p>
296
+ </div></span>
297
+
298
+ </li>
299
+
300
+
301
+ <li class="public ">
302
+ <span class="summary_signature">
303
+
304
+ <a href="#parse_vertical_table-class_method" title="parse_vertical_table (class method)">.<strong>parse_vertical_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; Hash{Symbol =&gt; Array,Hash,nil} </a>
305
+
306
+
307
+
308
+ </span>
309
+
310
+
311
+
312
+
313
+
314
+
315
+
316
+
317
+
318
+ <span class="summary_desc"><div class='inline'>
319
+ <p>Parse data from a vertical table-like structure matching selectors and
320
+ using a header map to match columns.</p>
321
+ </div></span>
322
+
323
+ </li>
324
+
325
+
326
+ <li class="public ">
327
+ <span class="summary_signature">
328
+
329
+ <a href="#strip-class_method" title="strip (class method)">.<strong>strip</strong>(raw_text) &#x21d2; String<sup>?</sup> </a>
330
+
331
+
332
+
333
+ </span>
334
+
335
+
336
+
337
+
338
+
339
+
340
+
341
+
342
+
343
+ <span class="summary_desc"><div class='inline'>
344
+ <p>Strip a value.</p>
345
+ </div></span>
346
+
347
+ </li>
348
+
349
+
350
+ <li class="public ">
351
+ <span class="summary_signature">
352
+
353
+ <a href="#translate_label_to_key-class_method" title="translate_label_to_key (class method)">.<strong>translate_label_to_key</strong>(element, label_map) &#x21d2; Symbol, String </a>
354
+
355
+
356
+
357
+ </span>
358
+
359
+
360
+
361
+
362
+
363
+
364
+
365
+
366
+
367
+ <span class="summary_desc"><div class='inline'>
368
+ <p>Extract column label and translate it into a friendly key.</p>
369
+ </div></span>
370
+
371
+ </li>
372
+
373
+
374
+ </ul>
375
+
376
+
377
+
378
+
379
+ <div id="class_method_details" class="method_details_list">
380
+ <h2>Class Method Details</h2>
381
+
382
+
383
+ <div class="method_details first">
384
+ <h3 class="signature first" id="decode_html-class_method">
385
+
386
+ .<strong>decode_html</strong>(text) &#x21d2; <tt>String</tt>
387
+
388
+
389
+
390
+
391
+
392
+ </h3><div class="docstring">
393
+ <div class="discussion">
394
+
395
+ <p>Decode HTML entities from text.</p>
396
+
397
+
398
+ </div>
399
+ </div>
400
+ <div class="tags">
401
+ <p class="tag_title">Parameters:</p>
402
+ <ul class="param">
403
+
404
+ <li>
405
+
406
+ <span class='name'>text</span>
407
+
408
+
409
+ <span class='type'>(<tt>String</tt>)</span>
410
+
411
+
412
+
413
+ &mdash;
414
+ <div class='inline'>
415
+ <p>Text to decode.</p>
416
+ </div>
417
+
418
+ </li>
419
+
420
+ </ul>
421
+
422
+ <p class="tag_title">Returns:</p>
423
+ <ul class="return">
424
+
425
+ <li>
426
+
427
+
428
+ <span class='type'>(<tt>String</tt>)</span>
429
+
430
+
431
+
432
+ </li>
433
+
434
+ </ul>
435
+
436
+ </div><table class="source_code">
437
+ <tr>
438
+ <td>
439
+ <pre class="lines">
440
+
441
+
442
+ 33
443
+ 34
444
+ 35</pre>
445
+ </td>
446
+ <td>
447
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 33</span>
448
+
449
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_decode_html'>decode_html</span> <span class='id identifier rubyid_text'>text</span>
450
+ <span class='const'>CGI</span><span class='period'>.</span><span class='id identifier rubyid_unescapeHTML'>unescapeHTML</span> <span class='id identifier rubyid_text'>text</span>
451
+ <span class='kw'>end</span></pre>
452
+ </td>
453
+ </tr>
454
+ </table>
455
+ </div>
456
+
457
+ <div class="method_details ">
458
+ <h3 class="signature " id="default_parser-class_method">
459
+
460
+ .<strong>default_parser</strong>(cell_element, data, key) &#x21d2; <tt>Object</tt>
461
+
462
+
463
+
464
+
465
+
466
+ </h3><div class="docstring">
467
+ <div class="discussion">
468
+
469
+ <p>Default cell content parser used to parse cell element.</p>
470
+
471
+
472
+ </div>
473
+ </div>
474
+ <div class="tags">
475
+ <p class="tag_title">Parameters:</p>
476
+ <ul class="param">
477
+
478
+ <li>
479
+
480
+ <span class='name'>cell_element</span>
481
+
482
+
483
+ <span class='type'>(<tt>Nokogiri::Element</tt>)</span>
484
+
485
+
486
+
487
+ &mdash;
488
+ <div class='inline'>
489
+ <p>Cell element to parse.</p>
490
+ </div>
491
+
492
+ </li>
493
+
494
+ <li>
495
+
496
+ <span class='name'>data</span>
497
+
498
+
499
+ <span class='type'>(<tt>Hash</tt>)</span>
500
+
501
+
502
+
503
+ &mdash;
504
+ <div class='inline'>
505
+ <p>Data hash to save parsed data into.</p>
506
+ </div>
507
+
508
+ </li>
509
+
510
+ <li>
511
+
512
+ <span class='name'>key</span>
513
+
514
+
515
+ <span class='type'>(<tt>String</tt>, <tt>Symbol</tt>)</span>
516
+
517
+
518
+
519
+ &mdash;
520
+ <div class='inline'>
521
+ <p>Header column key being parsed.</p>
522
+ </div>
523
+
524
+ </li>
525
+
526
+ </ul>
527
+
528
+
529
+ </div><table class="source_code">
530
+ <tr>
531
+ <td>
532
+ <pre class="lines">
533
+
534
+
535
+ 60
536
+ 61
537
+ 62
538
+ 63</pre>
539
+ </td>
540
+ <td>
541
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 60</span>
542
+
543
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_default_parser'>default_parser</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span>
544
+ <span class='id identifier rubyid_cell_element'>cell_element</span><span class='op'>&amp;.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>//i</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_remove'>remove</span>
545
+ <span class='id identifier rubyid_data'>data</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span> <span class='op'>=</span> <span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='op'>&amp;.</span><span class='id identifier rubyid_text'>text</span>
546
+ <span class='kw'>end</span></pre>
547
+ </td>
548
+ </tr>
549
+ </table>
550
+ </div>
551
+
552
+ <div class="method_details ">
553
+ <h3 class="signature " id="encode_html-class_method">
554
+
555
+ .<strong>encode_html</strong>(text) &#x21d2; <tt>String</tt>
556
+
557
+
558
+
559
+
560
+
561
+ </h3><div class="docstring">
562
+ <div class="discussion">
563
+
564
+ <p>Encode text for valid HTML entities.</p>
565
+
566
+
567
+ </div>
568
+ </div>
569
+ <div class="tags">
570
+ <p class="tag_title">Parameters:</p>
571
+ <ul class="param">
572
+
573
+ <li>
574
+
575
+ <span class='name'>text</span>
576
+
577
+
578
+ <span class='type'>(<tt>String</tt>)</span>
579
+
580
+
581
+
582
+ &mdash;
583
+ <div class='inline'>
584
+ <p>Text to encode.</p>
585
+ </div>
586
+
587
+ </li>
588
+
589
+ </ul>
590
+
591
+ <p class="tag_title">Returns:</p>
592
+ <ul class="return">
593
+
594
+ <li>
595
+
596
+
597
+ <span class='type'>(<tt>String</tt>)</span>
598
+
599
+
600
+
601
+ </li>
602
+
603
+ </ul>
604
+
605
+ </div><table class="source_code">
606
+ <tr>
607
+ <td>
608
+ <pre class="lines">
609
+
610
+
611
+ 24
612
+ 25
613
+ 26</pre>
614
+ </td>
615
+ <td>
616
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 24</span>
617
+
618
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_encode_html'>encode_html</span> <span class='id identifier rubyid_text'>text</span>
619
+ <span class='const'>CGI</span><span class='period'>.</span><span class='id identifier rubyid_escapeHTML'>escapeHTML</span> <span class='id identifier rubyid_text'>text</span>
620
+ <span class='kw'>end</span></pre>
621
+ </td>
622
+ </tr>
623
+ </table>
624
+ </div>
625
+
626
+ <div class="method_details ">
627
+ <h3 class="signature " id="hash-class_method">
628
+
629
+ .<strong>hash</strong>(object) &#x21d2; <tt>String</tt>
630
+
631
+
632
+
633
+
634
+
635
+ </h3><div class="docstring">
636
+ <div class="discussion">
637
+
638
+ <p>Create a hash from object.</p>
639
+
640
+
641
+ </div>
642
+ </div>
643
+ <div class="tags">
644
+ <p class="tag_title">Parameters:</p>
645
+ <ul class="param">
646
+
647
+ <li>
648
+
649
+ <span class='name'>object</span>
650
+
651
+
652
+ <span class='type'>(<tt>String</tt>, <tt>Hash</tt>, <tt>Object</tt>)</span>
653
+
654
+
655
+
656
+ &mdash;
657
+ <div class='inline'>
658
+ <p>Object to create hash from.</p>
659
+ </div>
660
+
661
+ </li>
662
+
663
+ </ul>
664
+
665
+ <p class="tag_title">Returns:</p>
666
+ <ul class="return">
667
+
668
+ <li>
669
+
670
+
671
+ <span class='type'>(<tt>String</tt>)</span>
672
+
673
+
674
+
675
+ </li>
676
+
677
+ </ul>
678
+
679
+ </div><table class="source_code">
680
+ <tr>
681
+ <td>
682
+ <pre class="lines">
683
+
684
+
685
+ 14
686
+ 15
687
+ 16
688
+ 17</pre>
689
+ </td>
690
+ <td>
691
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 14</span>
692
+
693
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_hash'>hash</span> <span class='id identifier rubyid_object'>object</span>
694
+ <span class='id identifier rubyid_object'>object</span> <span class='op'>=</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_hash'>hash</span> <span class='kw'>if</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span> <span class='const'>Hash</span>
695
+ <span class='const'>Digest</span><span class='op'>::</span><span class='const'>SHA1</span><span class='period'>.</span><span class='id identifier rubyid_hexdigest'>hexdigest</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span>
696
+ <span class='kw'>end</span></pre>
697
+ </td>
698
+ </tr>
699
+ </table>
700
+ </div>
701
+
702
+ <div class="method_details ">
703
+ <h3 class="signature " id="parse_content-class_method">
704
+
705
+ .<strong>parse_content</strong>(opts) {|data, row, header_map| ... } &#x21d2; <tt>Array&lt;Hash&gt;</tt><sup>?</sup>
706
+
707
+
708
+
709
+
710
+
711
+ </h3><div class="docstring">
712
+ <div class="discussion">
713
+
714
+ <p>Parse row data matching a selector using a header map to translate
715
+ between columns and friendly keys.</p>
718
+
719
+
720
+ </div>
721
+ </div>
722
+ <div class="tags">
723
+ <p class="tag_title">Parameters:</p>
724
+ <ul class="param">
725
+
726
+ <li>
727
+
728
+ <span class='name'>opts</span>
729
+
730
+
731
+ <span class='type'>(<tt>Hash</tt>)</span>
732
+
733
+
734
+
735
+ &mdash;
736
+ <div class='inline'>
737
+ <p>({}) Configuration options.</p>
738
+ </div>
739
+
740
+ </li>
741
+
742
+ </ul>
743
+
744
+
745
+
746
+
747
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
748
+ <ul class="option">
749
+
750
+ <li>
751
+ <span class="name">:html</span>
752
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
753
+ <span class="default">
754
+
755
+ </span>
756
+
757
+ &mdash; <div class='inline'>
758
+ <p>Container element to search into.</p>
759
+ </div>
760
+
761
+ </li>
762
+
763
+ <li>
764
+ <span class="name">:selector</span>
765
+ <span class="type">(<tt>String</tt>)</span>
766
+ <span class="default">
767
+
768
+ </span>
769
+
770
+ &mdash; <div class='inline'>
771
+ <p>CSS selector to match content cells.</p>
772
+ </div>
773
+
774
+ </li>
775
+
776
+ <li>
777
+ <span class="name">:first_row_header</span>
778
+ <span class="type">(<tt>Boolean</tt>)</span>
779
+ <span class="default">
780
+
781
+ &mdash; default:
782
+ <tt>false</tt>
783
+
784
+ </span>
785
+
786
+ &mdash; <div class='inline'>
787
+ <p>If true, the first matching element is assumed to be the header and is
788
+ ignored.</p>
789
+ </div>
790
+
791
+ </li>
792
+
793
+ <li>
794
+ <span class="name">:header_map</span>
795
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
796
+ <span class="default">
797
+
798
+ </span>
799
+
800
+ &mdash; <div class='inline'>
801
+ <p>Dictionary mapping header keys to column indexes.</p>
802
+ </div>
803
+
804
+ </li>
805
+
806
+ <li>
807
+ <span class="name">:column_parsers</span>
808
+ <span class="type">(<tt>Hash{Symbol,String =&gt; lambda,proc}</tt>)</span>
809
+ <span class="default">
810
+
811
+ &mdash; default:
812
+ <tt>{}</tt>
813
+
814
+ </span>
815
+
816
+ &mdash; <div class='inline'>
817
+ <p>Custom column parsers for advanced data extraction.</p>
818
+ </div>
819
+
820
+ </li>
821
+
822
+ </ul>
823
+
824
+
825
+ <p class="tag_title">Yield Parameters:</p>
826
+ <ul class="yieldparam">
827
+
828
+ <li>
829
+
830
+ <span class='name'>data</span>
831
+
832
+
833
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Object}</tt>)</span>
834
+
835
+
836
+
837
+ &mdash;
838
+ <div class='inline'>
839
+ <p>Parsed row data.</p>
840
+ </div>
841
+
842
+ </li>
843
+
844
+ <li>
845
+
846
+ <span class='name'>row</span>
847
+
848
+
849
+ <span class='type'>(<tt>Array</tt>)</span>
850
+
851
+
852
+
853
+ &mdash;
854
+ <div class='inline'>
855
+ <p>Raw row data.</p>
856
+ </div>
857
+
858
+ </li>
859
+
860
+ <li>
861
+
862
+ <span class='name'>header_map</span>
863
+
864
+
865
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
866
+
867
+
868
+
869
+ &mdash;
870
+ <div class='inline'>
871
+ <p>Header map used.</p>
872
+ </div>
873
+
874
+ </li>
875
+
876
+ </ul>
877
+ <p class="tag_title">Yield Returns:</p>
878
+ <ul class="yieldreturn">
879
+
880
+ <li>
881
+
882
+
883
+ <span class='type'>(<tt>Boolean</tt>)</span>
884
+
885
+
886
+
887
+ &mdash;
888
+ <div class='inline'>
889
+ <p>`true` when valid, else `false`.</p>
890
+ </div>
891
+
892
+ </li>
893
+
894
+ </ul>
895
+ <p class="tag_title">Returns:</p>
896
+ <ul class="return">
897
+
898
+ <li>
899
+
900
+
901
+ <span class='type'>(<tt>Array&lt;Hash&gt;</tt>, <tt>nil</tt>)</span>
902
+
903
+
904
+
905
+ &mdash;
906
+ <div class='inline'>
907
+ <p>Parsed rows data.</p>
908
+ </div>
909
+
910
+ </li>
911
+
912
+ </ul>
913
+
914
+ </div><table class="source_code">
915
+ <tr>
916
+ <td>
917
+ <pre class="lines">
918
+
919
+
920
+ 84
921
+ 85
922
+ 86
923
+ 87
924
+ 88
925
+ 89
926
+ 90
927
+ 91
928
+ 92
929
+ 93
930
+ 94
931
+ 95
932
+ 96
933
+ 97
934
+ 98
935
+ 99
936
+ 100
937
+ 101
938
+ 102
939
+ 103
940
+ 104
941
+ 105
942
+ 106
943
+ 107
944
+ 108
945
+ 109
946
+ 110
947
+ 111
948
+ 112
949
+ 113
950
+ 114
951
+ 115
952
+ 116
953
+ 117
954
+ 118
955
+ 119
956
+ 120
957
+ 121
958
+ 122</pre>
959
+ </td>
960
+ <td>
961
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 84</span>
962
+
963
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_content'>parse_content</span> <span class='id identifier rubyid_opts'>opts</span><span class='comma'>,</span> <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
964
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
965
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
966
+ <span class='label'>selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
967
+ <span class='label'>first_row_header:</span> <span class='kw'>false</span><span class='comma'>,</span>
968
+ <span class='label'>header_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
969
+ <span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
970
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
971
+
972
+ <span class='comment'># Setup config
973
+ </span> <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='rbracket'>]</span>
974
+ <span class='id identifier rubyid_row_data'>row_data</span> <span class='op'>=</span> <span class='id identifier rubyid_child_element'>child_element</span> <span class='op'>=</span> <span class='kw'>nil</span>
975
+ <span class='id identifier rubyid_first'>first</span> <span class='op'>=</span> <span class='id identifier rubyid_first_row_header'>first_row_header</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
976
+ <span class='id identifier rubyid_header_map'>header_map</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_map</span><span class='rbracket'>]</span>
977
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span>
978
+
979
+ <span class='comment'># Get and parse rows
980
+ </span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
981
+ <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
982
+ <span class='comment'># First row header validation
983
+ </span> <span class='kw'>if</span> <span class='id identifier rubyid_first'>first</span> <span class='op'>&amp;&amp;</span> <span class='id identifier rubyid_first_row_header'>first_row_header</span>
984
+ <span class='id identifier rubyid_first'>first</span> <span class='op'>=</span> <span class='kw'>false</span>
985
+ <span class='kw'>next</span>
986
+ <span class='kw'>end</span>
987
+
988
+ <span class='comment'># Extract content data
989
+ </span> <span class='id identifier rubyid_row_data'>row_data</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
990
+ <span class='id identifier rubyid_header_map'>header_map</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_key'>key</span><span class='comma'>,</span> <span class='id identifier rubyid_index'>index</span><span class='op'>|</span>
991
+ <span class='comment'># Parse column html with default or custom parser
992
+ </span> <span class='id identifier rubyid_child_element'>child_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_children'>children</span><span class='lbracket'>[</span><span class='id identifier rubyid_index'>index</span><span class='rbracket'>]</span>
993
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span>
994
+ <span class='id identifier rubyid_default_parser'>default_parser</span><span class='lparen'>(</span><span class='id identifier rubyid_child_element'>child_element</span><span class='comma'>,</span> <span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span> <span class='op'>:</span>
995
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_child_element'>child_element</span><span class='comma'>,</span> <span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span>
996
+ <span class='kw'>end</span>
997
+ <span class='kw'>next</span> <span class='kw'>unless</span> <span class='id identifier rubyid_filter'>filter</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>||</span> <span class='id identifier rubyid_filter'>filter</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_row'>row</span><span class='comma'>,</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='rparen'>)</span>
998
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_row_data'>row_data</span>
999
+ <span class='kw'>end</span>
1000
+ <span class='id identifier rubyid_data'>data</span>
1001
+ <span class='kw'>end</span></pre>
1002
+ </td>
1003
+ </tr>
1004
+ </table>
1005
+ </div>
1006
+
1007
+ <div class="method_details ">
1008
+ <h3 class="signature " id="parse_header_map-class_method">
1009
+
1010
+ .<strong>parse_header_map</strong>(opts = {}) &#x21d2; <tt>Hash{Symbol,String =&gt; Integer}</tt><sup>?</sup>
1011
+
1012
+
1013
+
1014
+
1015
+
1016
+ </h3><div class="docstring">
1017
+ <div class="discussion">
1018
+
1019
+ <p>Parse header from selector and create a header map to match each column key
1020
+
1021
+ with its column index.</p>
1022
+
1023
+
1024
+
1025
+ </div>
1026
+ </div>
1027
+ <div class="tags">
1028
+ <p class="tag_title">Parameters:</p>
1029
+ <ul class="param">
1030
+
1031
+ <li>
1032
+
1033
+ <span class='name'>opts</span>
1034
+
1035
+
1036
+ <span class='type'>(<tt>Hash</tt>)</span>
1037
+
1038
+
1039
+ <em class="default">(defaults to: <tt>{}</tt>)</em>
1040
+
1041
+
1042
+ &mdash;
1043
+ <div class='inline'>
1044
+ <p>Configuration options.</p>
1045
+ </div>
1046
+
1047
+ </li>
1048
+
1049
+ </ul>
1050
+
1051
+
1052
+
1053
+
1054
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
1055
+ <ul class="option">
1056
+
1057
+ <li>
1058
+ <span class="name">:html</span>
1059
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
1060
+ <span class="default">
1061
+
1062
+ </span>
1063
+
1064
+ &mdash; <div class='inline'>
1065
+ <p>Container element to search within.</p>
1066
+ </div>
1067
+
1068
+ </li>
1069
+
1070
+ <li>
1071
+ <span class="name">:selector</span>
1072
+ <span class="type">(<tt>String</tt>)</span>
1073
+ <span class="default">
1074
+
1075
+ </span>
1076
+
1077
+ &mdash; <div class='inline'>
1078
+ <p>CSS selector to match header cells.</p>
1079
+ </div>
1080
+
1081
+ </li>
1082
+
1083
+ <li>
1084
+ <span class="name">:column_key_label_map</span>
1085
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
1086
+ <span class="default">
1087
+
1088
+ </span>
1089
+
1090
+ &mdash; <div class='inline'>
1091
+ <p>Key vs. label dictionary.</p>
1092
+ </div>
1093
+
1094
+ </li>
1095
+
1096
+ <li>
1097
+ <span class="name">:first_row_header</span>
1098
+ <span class="type">(<tt>Boolean</tt>)</span>
1099
+ <span class="default">
1100
+
1101
+ &mdash; default:
1102
+ <tt>false</tt>
1103
+
1104
+ </span>
1105
+
1106
+ &mdash; <div class='inline'>
1107
+ <p>If true, the first row matching the selector is used as the header for
1108
+ parsing.</p>
1109
+ </div>
1110
+
1111
+ </li>
1112
+
1113
+ </ul>
1114
+
1115
+
1116
+ <p class="tag_title">Returns:</p>
1117
+ <ul class="return">
1118
+
1119
+ <li>
1120
+
1121
+
1122
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>, <tt>nil</tt>)</span>
1123
+
1124
+
1125
+
1126
+ &mdash;
1127
+ <div class='inline'>
1128
+ <p>Key vs. column index map.</p>
1129
+ </div>
1130
+
1131
+ </li>
1132
+
1133
+ </ul>
1134
+
1135
+ </div><table class="source_code">
1136
+ <tr>
1137
+ <td>
1138
+ <pre class="lines">
1139
+
1140
+
1141
+ 152
1142
+ 153
1143
+ 154
1144
+ 155
1145
+ 156
1146
+ 157
1147
+ 158
1148
+ 159
1149
+ 160
1150
+ 161
1151
+ 162
1152
+ 163
1153
+ 164
1154
+ 165
1155
+ 166
1156
+ 167
1157
+ 168
1158
+ 169
1159
+ 170
1160
+ 171
1161
+ 172
1162
+ 173
1163
+ 174
1164
+ 175
1165
+ 176
1166
+ 177
1167
+ 178
1168
+ 179
1169
+ 180</pre>
1170
+ </td>
1171
+ <td>
1172
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 152</span>
1173
+
1174
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_header_map'>parse_header_map</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1175
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
1176
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1177
+ <span class='label'>selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1178
+ <span class='label'>column_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1179
+ <span class='label'>first_row_header:</span> <span class='kw'>false</span>
1180
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
1181
+
1182
+ <span class='comment'># Setup config
1183
+ </span> <span class='id identifier rubyid_dictionary'>dictionary</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_key_label_map</span><span class='rbracket'>]</span>
1184
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='rbracket'>]</span>
1185
+ <span class='id identifier rubyid_column_map'>column_map</span> <span class='op'>=</span> <span class='kw'>nil</span>
1186
+
1187
+ <span class='comment'># Extract and parse header rows
1188
+ </span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:selector</span><span class='rbracket'>]</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>nil</span>
1189
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1190
+ <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_first'>first</span><span class='rbracket'>]</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
1191
+ <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
1192
+ <span class='id identifier rubyid_column_map'>column_map</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1193
+ <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_children'>children</span><span class='period'>.</span><span class='id identifier rubyid_each_with_index'>each_with_index</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_col'>col</span><span class='comma'>,</span> <span class='id identifier rubyid_index'>index</span><span class='op'>|</span>
1194
+ <span class='comment'># Parse and map column header
1195
+ </span> <span class='id identifier rubyid_column_key'>column_key</span> <span class='op'>=</span> <span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_col'>col</span><span class='comma'>,</span> <span class='id identifier rubyid_dictionary'>dictionary</span>
1196
+ <span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_column_key'>column_key</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1197
+ <span class='id identifier rubyid_column_map'>column_map</span><span class='lbracket'>[</span><span class='id identifier rubyid_column_key'>column_key</span><span class='rbracket'>]</span> <span class='op'>=</span> <span class='id identifier rubyid_index'>index</span>
1198
+ <span class='kw'>end</span>
1199
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_column_map'>column_map</span>
1200
+ <span class='kw'>end</span>
1201
+ <span class='id identifier rubyid_data'>data</span><span class='op'>&amp;.</span><span class='id identifier rubyid_first'>first</span>
1202
+ <span class='kw'>end</span></pre>
1203
+ </td>
1204
+ </tr>
1205
+ </table>
1206
+ </div>
1207
+
1208
+ <div class="method_details ">
1209
+ <h3 class="signature " id="parse_table-class_method">
1210
+
1211
+ .<strong>parse_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; <tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>
1212
+
1213
+
1214
+
1215
+
1216
+
1217
+ </h3><div class="docstring">
1218
+ <div class="discussion">
1219
+
1220
+ <p>Parse data from a horizontal table-like structure matching the given selectors,
1221
+
1222
+ using a header map to match columns.</p>
1223
+
1224
+
1225
+
1226
+ </div>
1227
+ </div>
1228
+ <div class="tags">
1229
+ <p class="tag_title">Parameters:</p>
1230
+ <ul class="param">
1231
+
1232
+ <li>
1233
+
1234
+ <span class='name'>opts</span>
1235
+
1236
+
1237
+ <span class='type'>(<tt>Hash</tt>)</span>
1238
+
1239
+
1240
+ <em class="default">(defaults to: <tt>{}</tt>)</em>
1241
+
1242
+
1243
+ &mdash;
1244
+ <div class='inline'>
1245
+ <p>Configuration options.</p>
1246
+ </div>
1247
+
1248
+ </li>
1249
+
1250
+ </ul>
1251
+
1252
+
1253
+
1254
+
1255
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
1256
+ <ul class="option">
1257
+
1258
+ <li>
1259
+ <span class="name">:html</span>
1260
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
1261
+ <span class="default">
1262
+
1263
+ </span>
1264
+
1265
+ &mdash; <div class='inline'>
1266
+ <p>Container element to search within.</p>
1267
+ </div>
1268
+
1269
+ </li>
1270
+
1271
+ <li>
1272
+ <span class="name">:header_selector</span>
1273
+ <span class="type">(<tt>String</tt>)</span>
1274
+ <span class="default">
1275
+
1276
+ </span>
1277
+
1278
+ &mdash; <div class='inline'>
1279
+ <p>Header column elements selector.</p>
1280
+ </div>
1281
+
1282
+ </li>
1283
+
1284
+ <li>
1285
+ <span class="name">:header_key_label_map</span>
1286
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
1287
+ <span class="default">
1288
+
1289
+ </span>
1290
+
1291
+ &mdash; <div class='inline'>
1292
+ <p>Header key vs. label dictionary to match column indexes.</p>
1293
+ </div>
1294
+
1295
+ </li>
1296
+
1297
+ <li>
1298
+ <span class="name">:content_selector</span>
1299
+ <span class="type">(<tt>String</tt>)</span>
1300
+ <span class="default">
1301
+
1302
+ </span>
1303
+
1304
+ &mdash; <div class='inline'>
1305
+ <p>Content row elements selector.</p>
1306
+ </div>
1307
+
1308
+ </li>
1309
+
1310
+ <li>
1311
+ <span class="name">:first_row_header</span>
1312
+ <span class="type">(<tt>Boolean</tt>)</span>
1313
+ <span class="default">
1314
+
1315
+ &mdash; default:
1316
+ <tt>false</tt>
1317
+
1318
+ </span>
1319
+
1320
+ &mdash; <div class='inline'>
1321
+ <p>If true, the first row matching the selector is used as the header for
1322
+ parsing.</p>
1323
+ </div>
1324
+
1325
+ </li>
1326
+
1327
+ <li>
1328
+ <span class="name">:column_parsers</span>
1329
+ <span class="type">(<tt>Hash{Symbol,String =&gt; lambda,proc}</tt>)</span>
1330
+ <span class="default">
1331
+
1332
+ &mdash; default:
1333
+ <tt>{}</tt>
1334
+
1335
+ </span>
1336
+
1337
+ &mdash; <div class='inline'>
1338
+ <p>Custom column parsers for advanced data extraction.</p>
1339
+ </div>
1340
+
1341
+ </li>
1342
+
1343
+ </ul>
1344
+
1345
+
1346
+ <p class="tag_title">Yield Parameters:</p>
1347
+ <ul class="yieldparam">
1348
+
1349
+ <li>
1350
+
1351
+ <span class='name'>data</span>
1352
+
1353
+
1354
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Object}</tt>)</span>
1355
+
1356
+
1357
+
1358
+ &mdash;
1359
+ <div class='inline'>
1360
+ <p>Parsed content row data.</p>
1361
+ </div>
1362
+
1363
+ </li>
1364
+
1365
+ <li>
1366
+
1367
+ <span class='name'>row</span>
1368
+
1369
+
1370
+ <span class='type'>(<tt>Array</tt>)</span>
1371
+
1372
+
1373
+
1374
+ &mdash;
1375
+ <div class='inline'>
1376
+ <p>Raw content row data.</p>
1377
+ </div>
1378
+
1379
+ </li>
1380
+
1381
+ <li>
1382
+
1383
+ <span class='name'>header_map</span>
1384
+
1385
+
1386
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
1387
+
1388
+
1389
+
1390
+ &mdash;
1391
+ <div class='inline'>
1392
+ <p>Header map used.</p>
1393
+ </div>
1394
+
1395
+ </li>
1396
+
1397
+ </ul>
1398
+ <p class="tag_title">Yield Returns:</p>
1399
+ <ul class="yieldreturn">
1400
+
1401
+ <li>
1402
+
1403
+
1404
+ <span class='type'>(<tt>Boolean</tt>)</span>
1405
+
1406
+
1407
+
1408
+ &mdash;
1409
+ <div class='inline'>
1410
+ <p>`true` when valid, else `false`.</p>
1411
+ </div>
1412
+
1413
+ </li>
1414
+
1415
+ </ul>
1416
+ <p class="tag_title">Returns:</p>
1417
+ <ul class="return">
1418
+
1419
+ <li>
1420
+
1421
+
1422
+ <span class='type'>(<tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>)</span>
1423
+
1424
+
1425
+
1426
+ &mdash;
1427
+ <div class='inline'>
1428
+ <p>Hash data is as follows:</p>
1429
+ <ul><li>
1430
+ <p>`[Hash] :header_map` Header map used.</p>
1431
+ </li><li>
1432
+ <p>`[Array&lt;Hash&gt;,nil] :data` Parsed rows data.</p>
1433
+ </li></ul>
1434
+ </div>
1435
+
1436
+ </li>
1437
+
1438
+ </ul>
1439
+
1440
+ </div><table class="source_code">
1441
+ <tr>
1442
+ <td>
1443
+ <pre class="lines">
1444
+
1445
+
1446
+ 204
1447
+ 205
1448
+ 206
1449
+ 207
1450
+ 208
1451
+ 209
1452
+ 210
1453
+ 211
1454
+ 212
1455
+ 213
1456
+ 214
1457
+ 215
1458
+ 216
1459
+ 217
1460
+ 218
1461
+ 219
1462
+ 220
1463
+ 221
1464
+ 222
1465
+ 223
1466
+ 224
1467
+ 225
1468
+ 226</pre>
1469
+ </td>
1470
+ <td>
1471
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 204</span>
1472
+
1473
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_table'>parse_table</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span> <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
1474
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
1475
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1476
+ <span class='label'>header_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1477
+ <span class='label'>header_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1478
+ <span class='label'>content_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1479
+ <span class='label'>first_row_header:</span> <span class='kw'>false</span><span class='comma'>,</span>
1480
+ <span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1481
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
1482
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1483
+ <span class='id identifier rubyid_header_map'>header_map</span> <span class='op'>=</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_header_map'>parse_header_map</span> <span class='label'>html:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='comma'>,</span>
1484
+ <span class='label'>selector:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_selector</span><span class='rbracket'>]</span><span class='comma'>,</span>
1485
+ <span class='label'>column_key_label_map:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_key_label_map</span><span class='rbracket'>]</span><span class='comma'>,</span>
1486
+ <span class='label'>first_row_header:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
1487
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1488
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_content'>parse_content</span> <span class='label'>html:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='comma'>,</span>
1489
+ <span class='label'>selector:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:content_selector</span><span class='rbracket'>]</span><span class='comma'>,</span>
1490
+ <span class='label'>header_map:</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='comma'>,</span>
1491
+ <span class='label'>first_row_header:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span><span class='comma'>,</span>
1492
+ <span class='label'>column_parsers:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span><span class='comma'>,</span>
1493
+ <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
1494
+ <span class='lbrace'>{</span><span class='label'>header_map:</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='comma'>,</span> <span class='label'>data:</span> <span class='id identifier rubyid_data'>data</span><span class='rbrace'>}</span>
1495
+ <span class='kw'>end</span></pre>
1496
+ </td>
1497
+ </tr>
1498
+ </table>
1499
+ </div>
1500
+
1501
+ <div class="method_details ">
1502
+ <h3 class="signature " id="parse_vertical_table-class_method">
1503
+
1504
+ .<strong>parse_vertical_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; <tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>
1505
+
1506
+
1507
+
1508
+
1509
+
1510
+ </h3><div class="docstring">
1511
+ <div class="discussion">
1512
+
1513
+ <p>Parse data from a vertical table-like structure matching the given selectors,
1514
+
1515
+ using a header map to match columns.</p>
1516
+
1517
+
1518
+
1519
+ </div>
1520
+ </div>
1521
+ <div class="tags">
1522
+ <p class="tag_title">Parameters:</p>
1523
+ <ul class="param">
1524
+
1525
+ <li>
1526
+
1527
+ <span class='name'>opts</span>
1528
+
1529
+
1530
+ <span class='type'>(<tt>Hash</tt>)</span>
1531
+
1532
+
1533
+ <em class="default">(defaults to: <tt>{}</tt>)</em>
1534
+
1535
+
1536
+ &mdash;
1537
+ <div class='inline'>
1538
+ <p>Configuration options.</p>
1539
+ </div>
1540
+
1541
+ </li>
1542
+
1543
+ </ul>
1544
+
1545
+
1546
+
1547
+
1548
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
1549
+ <ul class="option">
1550
+
1551
+ <li>
1552
+ <span class="name">:html</span>
1553
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
1554
+ <span class="default">
1555
+
1556
+ </span>
1557
+
1558
+ &mdash; <div class='inline'>
1559
+ <p>Container element to search within.</p>
1560
+ </div>
1561
+
1562
+ </li>
1563
+
1564
+ <li>
1565
+ <span class="name">:row_selector</span>
1566
+ <span class="type">(<tt>String</tt>)</span>
1567
+ <span class="default">
1568
+
1569
+ </span>
1570
+
1571
+ &mdash; <div class='inline'>
1572
+ <p>Selector for vertical row-like elements.</p>
1573
+ </div>
1574
+
1575
+ </li>
1576
+
1577
+ <li>
1578
+ <span class="name">:header_selector</span>
1579
+ <span class="type">(<tt>String</tt>)</span>
1580
+ <span class="default">
1581
+
1582
+ </span>
1583
+
1584
+ &mdash; <div class='inline'>
1585
+ <p>Header column elements selector.</p>
1586
+ </div>
1587
+
1588
+ </li>
1589
+
1590
+ <li>
1591
+ <span class="name">:header_key_label_map</span>
1592
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
1593
+ <span class="default">
1594
+
1595
+ </span>
1596
+
1597
+ &mdash; <div class='inline'>
1598
+ <p>Header key vs. label dictionary to match column indexes.</p>
1599
+ </div>
1600
+
1601
+ </li>
1602
+
1603
+ <li>
1604
+ <span class="name">:content_selector</span>
1605
+ <span class="type">(<tt>String</tt>)</span>
1606
+ <span class="default">
1607
+
1608
+ </span>
1609
+
1610
+ &mdash; <div class='inline'>
1611
+ <p>Content row elements selector.</p>
1612
+ </div>
1613
+
1614
+ </li>
1615
+
1616
+ <li>
1617
+ <span class="name">:column_parsers</span>
1618
+ <span class="type">(<tt>Hash{Symbol,String =&gt; lambda,proc}</tt>)</span>
1619
+ <span class="default">
1620
+
1621
+ &mdash; default:
1622
+ <tt>{}</tt>
1623
+
1624
+ </span>
1625
+
1626
+ &mdash; <div class='inline'>
1627
+ <p>Custom column parsers for advanced data extraction.</p>
1628
+ </div>
1629
+
1630
+ </li>
1631
+
1632
+ </ul>
1633
+
1634
+
1635
+ <p class="tag_title">Yield Parameters:</p>
1636
+ <ul class="yieldparam">
1637
+
1638
+ <li>
1639
+
1640
+ <span class='name'>data</span>
1641
+
1642
+
1643
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Object}</tt>)</span>
1644
+
1645
+
1646
+
1647
+ &mdash;
1648
+ <div class='inline'>
1649
+ <p>Parsed content row data.</p>
1650
+ </div>
1651
+
1652
+ </li>
1653
+
1654
+ <li>
1655
+
1656
+ <span class='name'>row</span>
1657
+
1658
+
1659
+ <span class='type'>(<tt>Array</tt>)</span>
1660
+
1661
+
1662
+
1663
+ &mdash;
1664
+ <div class='inline'>
1665
+ <p>Raw content row data.</p>
1666
+ </div>
1667
+
1668
+ </li>
1669
+
1670
+ <li>
1671
+
1672
+ <span class='name'>header_map</span>
1673
+
1674
+
1675
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
1676
+
1677
+
1678
+
1679
+ &mdash;
1680
+ <div class='inline'>
1681
+ <p>Header map used.</p>
1682
+ </div>
1683
+
1684
+ </li>
1685
+
1686
+ </ul>
1687
+ <p class="tag_title">Yield Returns:</p>
1688
+ <ul class="yieldreturn">
1689
+
1690
+ <li>
1691
+
1692
+
1693
+ <span class='type'>(<tt>Boolean</tt>)</span>
1694
+
1695
+
1696
+
1697
+ &mdash;
1698
+ <div class='inline'>
1699
+ <p><code>true</code> when valid, else <code>false</code>.</p>
1700
+ </div>
1701
+
1702
+ </li>
1703
+
1704
+ </ul>
1705
+ <p class="tag_title">Returns:</p>
1706
+ <ul class="return">
1707
+
1708
+ <li>
1709
+
1710
+
1711
+ <span class='type'>(<tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>)</span>
1712
+
1713
+
1714
+
1715
+ &mdash;
1716
+ <div class='inline'>
1717
+ <p>Hash data is as follows:</p>
1718
+ <ul><li>
1719
+ <p>`[Hash] :header_map` Header map used.</p>
1720
+ </li><li>
1721
+ <p>`[Array&lt;Hash&gt;,nil] :data` Parsed rows data.</p>
1722
+ </li></ul>
1723
+ </div>
1724
+
1725
+ </li>
1726
+
1727
+ </ul>
1728
+
1729
+ </div><table class="source_code">
1730
+ <tr>
1731
+ <td>
1732
+ <pre class="lines">
1733
+
1734
+
1735
+ 249
1736
+ 250
1737
+ 251
1738
+ 252
1739
+ 253
1740
+ 254
1741
+ 255
1742
+ 256
1743
+ 257
1744
+ 258
1745
+ 259
1746
+ 260
1747
+ 261
1748
+ 262
1749
+ 263
1750
+ 264
1751
+ 265
1752
+ 266
1753
+ 267
1754
+ 268
1755
+ 269
1756
+ 270
1757
+ 271
1758
+ 272
1759
+ 273
1760
+ 274
1761
+ 275
1762
+ 276
1763
+ 277
1764
+ 278
1765
+ 279
1766
+ 280
1767
+ 281</pre>
1768
+ </td>
1769
+ <td>
1770
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 249</span>
1771
+
1772
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_vertical_table'>parse_vertical_table</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span> <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
1773
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
1774
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1775
+ <span class='label'>row_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1776
+ <span class='label'>header_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1777
+ <span class='label'>header_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1778
+ <span class='label'>content_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1779
+ <span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1780
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
1781
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1782
+
1783
+ <span class='comment'># Setup config
1784
+ </span> <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1785
+ <span class='id identifier rubyid_dictionary'>dictionary</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_key_label_map</span><span class='rbracket'>]</span>
1786
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span>
1787
+
1788
+ <span class='comment'># Extract headers and content
1789
+ </span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:row_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>nil</span>
1790
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1791
+ <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
1792
+ <span class='comment'># Parse and map column header
1793
+ </span> <span class='id identifier rubyid_header_element'>header_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
1794
+ <span class='id identifier rubyid_key'>key</span> <span class='op'>=</span> <span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_header_element'>header_element</span><span class='comma'>,</span> <span class='id identifier rubyid_dictionary'>dictionary</span>
1795
+ <span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_key'>key</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>||</span> <span class='id identifier rubyid_key'>key</span> <span class='op'>==</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_end'>&#39;</span></span>
1796
+
1797
+ <span class='comment'># Parse column html with default or custom parser
1798
+ </span> <span class='id identifier rubyid_content_element'>content_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:content_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
1799
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span>
1800
+ <span class='id identifier rubyid_default_parser'>default_parser</span><span class='lparen'>(</span><span class='id identifier rubyid_content_element'>content_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span> <span class='op'>:</span>
1801
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_content_element'>content_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span>
1802
+ <span class='kw'>end</span>
1803
+ <span class='id identifier rubyid_data'>data</span>
1804
+ <span class='kw'>end</span></pre>
1805
+ </td>
1806
+ </tr>
1807
+ </table>
1808
+ </div>
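The row loop in `parse_vertical_table` above can be sketched without Nokogiri. This is a simplified illustration, not the gem's code: `parse_pairs` is a hypothetical helper that takes already-extracted `[label, value]` pairs instead of HTML rows, and the plain `strip` call is only a stand-in for the gem's `default_parser`, whose body is not shown here.

```ruby
# Hypothetical helper (not part of ae_easy-text): mirrors the key lookup and
# per-key parser dispatch of parse_vertical_table on plain [label, value] pairs.
def parse_pairs(pairs, label_map, column_parsers = {})
  data = {}
  pairs.each do |label, value|
    # Pick the first key whose matcher fits the label:
    # a Regexp is matched against it, a String must be equal to it.
    key = label_map.find do |_k, matcher|
      matcher.is_a?(Regexp) ? (label =~ matcher) : (label == matcher)
    end&.first
    next if key.nil? || key == ''

    parser = column_parsers[key]
    if parser.nil?
      # Assumed stand-in for the gem's default parser.
      data[key] = value.strip
    else
      # Custom parsers receive the same three arguments as in the source above.
      parser.call(value, data, key)
    end
  end
  data
end

rows = [['Price', ' 12.50 '], ['Colour', 'red'], ['Unknown', 'x']]
label_map = { price: /price/i, color: 'Colour' }
parsers = { price: ->(value, data, key) { data[key] = value.strip.to_f } }
p parse_pairs(rows, label_map, parsers)
```

Rows whose label matches nothing in the map are skipped, which is why the `Unknown` pair does not appear in the result.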
1809
+
1810
+ <div class="method_details ">
1811
+ <h3 class="signature " id="strip-class_method">
1812
+
1813
+ .<strong>strip</strong>(raw_text) &#x21d2; <tt>String</tt><sup>?</sup>
1814
+
1815
+
1816
+
1817
+
1818
+
1819
+ </h3><div class="docstring">
1820
+ <div class="discussion">
1821
+
1822
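+ <p>Strip a value: collapse whitespace runs (including U+3000 and U+00A0) into single spaces, trim both ends, and decode HTML entities.</p>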
1823
+
1824
+
1825
+ </div>
1826
+ </div>
1827
+ <div class="tags">
1828
+ <p class="tag_title">Parameters:</p>
1829
+ <ul class="param">
1830
+
1831
+ <li>
1832
+
1833
+ <span class='name'>raw_text</span>
1834
+
1835
+
1836
+ <span class='type'>(<tt>String</tt>, <tt>Object</tt>, <tt>nil</tt>)</span>
1837
+
1838
+
1839
+
1840
+ &mdash;
1841
+ <div class='inline'>
1842
+ <p>Text to strip.</p>
1843
+ </div>
1844
+
1845
+ </li>
1846
+
1847
+ </ul>
1848
+
1849
+ <p class="tag_title">Returns:</p>
1850
+ <ul class="return">
1851
+
1852
+ <li>
1853
+
1854
+
1855
+ <span class='type'>(<tt>String</tt>, <tt>nil</tt>)</span>
1856
+
1857
+
1858
+
1859
+ &mdash;
1860
+ <div class='inline'>
1861
+ <p><code>nil</code> when <code>raw_text</code> is nil, else <code>String</code>.</p>
1862
+ </div>
1863
+
1864
+ </li>
1865
+
1866
+ </ul>
1867
+
1868
+ </div><table class="source_code">
1869
+ <tr>
1870
+ <td>
1871
+ <pre class="lines">
1872
+
1873
+
1874
+ 42
1875
+ 43
1876
+ 44
1877
+ 45
1878
+ 46
1879
+ 47
1880
+ 48
1881
+ 49
1882
+ 50
1883
+ 51
1884
+ 52
1885
+ 53</pre>
1886
+ </td>
1887
+ <td>
1888
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 42</span>
1889
+
1890
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_raw_text'>raw_text</span>
1891
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1892
+ <span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span> <span class='kw'>unless</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span> <span class='const'>String</span>
1893
+ <span class='id identifier rubyid_regex'>regex</span> <span class='op'>=</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>(\s|\u3000|\u00a0)+</span><span class='regexp_end'>/</span></span>
1894
+ <span class='id identifier rubyid_good_encoding'>good_encoding</span> <span class='op'>=</span> <span class='lparen'>(</span><span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=~</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>\u3000</span><span class='regexp_end'>/</span></span> <span class='op'>||</span> <span class='kw'>true</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>false</span>
1895
+ <span class='kw'>unless</span> <span class='id identifier rubyid_good_encoding'>good_encoding</span>
1896
+ <span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_force_encoding'>force_encoding</span><span class='lparen'>(</span><span class='gvar'>$APP_CONFIG</span><span class='lbracket'>[</span><span class='symbol'>:encoding</span><span class='rbracket'>]</span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_encode'>encode</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>UTF-8</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span>
1897
+ <span class='id identifier rubyid_regex'>regex</span> <span class='op'>=</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>(\s|\u3000|\u00a0|\u00c2\u00a0)+</span><span class='regexp_end'>/</span></span>
1898
+ <span class='kw'>end</span>
1899
+ <span class='id identifier rubyid_text'>text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='op'>&amp;.</span><span class='id identifier rubyid_gsub'>gsub</span><span class='lparen'>(</span><span class='id identifier rubyid_regex'>regex</span><span class='comma'>,</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'> </span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='op'>&amp;.</span><span class='id identifier rubyid_strip'>strip</span>
1900
+ <span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span> <span class='kw'>nil</span> <span class='op'>:</span> <span class='id identifier rubyid_decode_html'>decode_html</span><span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span><span class='rparen'>)</span>
1901
+ <span class='kw'>end</span></pre>
1902
+ </td>
1903
+ </tr>
1904
+ </table>
1905
+ </div>
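The whitespace handling in `strip` above can be illustrated in isolation. The sketch below is an assumption-laden reduction: `normalize_space` is a hypothetical name, and it leaves out the encoding fallback through `$APP_CONFIG[:encoding]` and the final `decode_html` call that the real method performs.

```ruby
# Hypothetical helper (not part of ae_easy-text): collapse runs of whitespace,
# ideographic spaces (U+3000) and non-breaking spaces (U+00A0) into a single
# ASCII space, then trim both ends.
def normalize_space(raw_text)
  return nil if raw_text.nil?

  # Non-string input is converted first, as in the source above.
  raw_text.to_s.gsub(/(\s|\u3000|\u00a0)+/, ' ').strip
end

p normalize_space("  a\u00a0\u3000 b\n")
p normalize_space(42)
p normalize_space(nil)
```

Replacing the special spaces before calling `String#strip` matters because `strip` on its own does not treat U+3000 or U+00A0 as whitespace.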
1906
+
1907
+ <div class="method_details ">
1908
+ <h3 class="signature " id="translate_label_to_key-class_method">
1909
+
1910
+ .<strong>translate_label_to_key</strong>(element, label_map) &#x21d2; <tt>Symbol</tt>, <tt>String</tt>
1911
+
1912
+
1913
+
1914
+
1915
+
1916
+ </h3><div class="docstring">
1917
+ <div class="discussion">
1918
+
1919
+ <p>Extract column label and translate it into a friendly key.</p>
1920
+
1921
+
1922
+ </div>
1923
+ </div>
1924
+ <div class="tags">
1925
+ <p class="tag_title">Parameters:</p>
1926
+ <ul class="param">
1927
+
1928
+ <li>
1929
+
1930
+ <span class='name'>element</span>
1931
+
1932
+
1933
+ <span class='type'>(<tt>Nokogiri::Element</tt>)</span>
1934
+
1935
+
1936
+
1937
+ &mdash;
1938
+ <div class='inline'>
1939
+ <p>HTML element to parse.</p>
1940
+ </div>
1941
+
1942
+ </li>
1943
+
1944
+ <li>
1945
+
1946
+ <span class='name'>label_map</span>
1947
+
1948
+
1949
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
1950
+
1951
+
1952
+
1953
+ &mdash;
1954
+ <div class='inline'>
1955
+ <p>Label dictionary for translation into key.</p>
1956
+ </div>
1957
+
1958
+ </li>
1959
+
1960
+ </ul>
1961
+
1962
+ <p class="tag_title">Returns:</p>
1963
+ <ul class="return">
1964
+
1965
+ <li>
1966
+
1967
+
1968
+ <span class='type'>(<tt>Symbol</tt>, <tt>String</tt>)</span>
1969
+
1970
+
1971
+
1972
+ &mdash;
1973
+ <div class='inline'>
1974
+ <p>Translated key.</p>
1975
+ </div>
1976
+
1977
+ </li>
1978
+
1979
+ </ul>
1980
+
1981
+ </div><table class="source_code">
1982
+ <tr>
1983
+ <td>
1984
+ <pre class="lines">
1985
+
1986
+
1987
+ 131
1988
+ 132
1989
+ 133
1990
+ 134
1991
+ 135
1992
+ 136
1993
+ 137
1994
+ 138</pre>
1995
+ </td>
1996
+ <td>
1997
+ <pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 131</span>
1998
+
1999
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_element'>element</span><span class='comma'>,</span> <span class='id identifier rubyid_label_map'>label_map</span>
2000
+ <span class='id identifier rubyid_element'>element</span><span class='op'>&amp;.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>//i</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_remove'>remove</span>
2001
+ <span class='id identifier rubyid_text'>text</span> <span class='op'>=</span> <span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_element'>element</span><span class='op'>&amp;.</span><span class='id identifier rubyid_text'>text</span>
2002
+ <span class='id identifier rubyid_key'>key</span> <span class='op'>=</span> <span class='id identifier rubyid_label_map'>label_map</span><span class='period'>.</span><span class='id identifier rubyid_find'>find</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_k'>k</span><span class='comma'>,</span><span class='id identifier rubyid_v'>v</span><span class='op'>|</span>
2003
+ <span class='id identifier rubyid_v'>v</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span><span class='lparen'>(</span><span class='const'>Regexp</span><span class='rparen'>)</span> <span class='op'>?</span> <span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span> <span class='op'>=~</span> <span class='id identifier rubyid_v'>v</span><span class='rparen'>)</span> <span class='op'>:</span> <span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span> <span class='op'>==</span> <span class='id identifier rubyid_v'>v</span><span class='rparen'>)</span>
2004
+ <span class='kw'>end</span><span class='op'>&amp;.</span><span class='id identifier rubyid_first'>first</span>
2005
+ <span class='id identifier rubyid_key'>key</span>
2006
+ <span class='kw'>end</span></pre>
2007
+ </td>
2008
+ </tr>
2009
+ </table>
2010
+ </div>
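One consequence of the `label_map.find` call above is that the first matching entry wins, and Ruby hashes iterate in insertion order. The sketch below shows that effect on plain strings; `key_for` is a hypothetical helper that takes the label text directly, so the Nokogiri element handling and the `strip` call from the real method are assumed to have happened already.

```ruby
# Hypothetical helper (not part of ae_easy-text): the lookup step of
# translate_label_to_key, applied to an already-extracted label string.
def key_for(text, label_map)
  label_map.find do |_key, matcher|
    matcher.is_a?(Regexp) ? (text =~ matcher) : (text == matcher)
  end&.first
end

# Both patterns match 'Unit price'; the entry inserted first is returned.
p key_for('Unit price', { price: /price/i, unit: /unit/i })
p key_for('Unit price', { unit: /unit/i, price: /price/i })
# No entry matches, so the result is nil.
p key_for('Weight', { price: /price/i })
```

When labels overlap like this, listing the more specific pattern first in the map keeps the lookup predictable.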
2011
+
2012
+ </div>
2013
+
2014
+ </div>
2015
+
2016
+ <div id="footer">
2017
+ Generated on Tue Feb 26 16:50:03 2019 by
2018
+ <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
2019
+ 0.9.18 (ruby-2.5.3).
2020
+ </div>
2021
+
2022
+ </div>
2023
+ </body>
2024
+ </html>