dh_easy-text 0.0.6

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+ metadata.gz: f1947af143dc74134300d7cc30b25d4fc7caaa9995bf82b0cee158aa8bb23c22
4
+ data.tar.gz: 79ca17c444073d803ff7f6f4827f3f3eb6d62f2c2918b3ae07e6fa68f2eeff82
5
+ SHA512:
6
+ metadata.gz: ea8ae22386f6568b36cf9643aeaf1fbac8d69879c024b77c454b7fa5e31e29a500342aae7aae4cc6fc58c57fc787a61e86bd40893b10244be6b8e982fcf87543
7
+ data.tar.gz: 92bb658c18455656a86006ecc03aaede6a45611a3fb306114d8236789cc4e29477cde91abc872cf5940801e8015ba00ced68190e26d62caad315ade06b0d18e9
@@ -0,0 +1,12 @@
1
+ /.byebug*
2
+ /.bundle/
3
+ /.yardoc
4
+ /_yardoc/
5
+ /coverage/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
9
+ /certs/
10
+ /checksum/
11
+ /vendor/
12
+ /Gemfile.lock
@@ -0,0 +1,7 @@
1
+ ---
2
+ sudo: false
3
+ language: ruby
4
+ cache: bundler
5
+ rvm:
6
+ - 2.4.2
7
+ before_install: gem install bundler -v 1.16.3
@@ -0,0 +1 @@
1
+ --no-private
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at perry@datahen.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source "https://rubygems.org"
2
+
3
+ git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }
4
+
5
+ # Specify your gem's dependencies in dh_easy-text.gemspec
6
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2019 DataHen
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,20 @@
1
+ [![Documentation](http://img.shields.io/badge/docs-rdoc.info-blue.svg)](http://rubydoc.org/gems/dh_easy-text/frames)
2
+ [![Gem Version](https://badge.fury.io/rb/dh_easy-text.svg)](http://github.com/DataHenOfficial/dh_easy-text/releases)
3
+ [![License](http://img.shields.io/badge/license-MIT-yellowgreen.svg)](#license)
4
+
5
+ # DhEasy text module
6
+ ## Description
7
+
8
+ DhEasy text is part of the DhEasy gem collection. It provides text parsing helpers for common text parsing use cases.
9
+
10
+ Install gem:
11
+ ```shell
12
+ gem install 'dh_easy-text'
13
+ ```
14
+
15
+ Require gem:
16
+ ```ruby
17
+ require 'dh_easy/text'
18
+ ```
19
+
20
+ Documentation can be found [here](http://rubydoc.org/gems/dh_easy-text/frames).
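
As a rough illustration of the HTML helpers, the sketch below mirrors what the generated documentation shows for `encode_html` and `decode_html`, which delegate to `CGI.escapeHTML` and `CGI.unescapeHTML`. The module name `TextSketch` and the sample string are assumptions made for this example so that it runs without the gem installed; with the gem available, the same calls would presumably go through `DhEasy::Text`.

```ruby
require 'cgi'

# Hypothetical stand-in for DhEasy::Text, limited to the two CGI-based
# helpers whose source appears in the generated documentation.
module TextSketch
  # Escape characters that are special in HTML (&, <, >, quotes).
  def self.encode_html(text)
    CGI.escapeHTML(text)
  end

  # Turn HTML entities back into the characters they stand for.
  def self.decode_html(text)
    CGI.unescapeHTML(text)
  end
end

encoded = TextSketch.encode_html('<b>Tom & Jerry</b>')
decoded = TextSketch.decode_html(encoded)

puts encoded # the escaped form
puts decoded # the original string again
```

Encoding and then decoding should return the original string, which makes the pair convenient for round-trip checks in tests.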
@@ -0,0 +1,22 @@
1
+ require 'benchmark'
2
+ require 'bundler/gem_tasks'
3
+ require 'rake/testtask'
4
+
5
+ Rake::TestTask.new do |t|
6
+ t.libs = ['lib', 'test']
7
+ t.warning = false
8
+ t.verbose = false
9
+ t.test_files = FileList['./test/**/*_test.rb']
10
+ end
11
+
12
+ desc 'Benchmark another task execution | usage example: benchmark[my_task, param1, param2]'
13
+ task :benchmark, [:task] do |task, args|
14
+ task_name = args[:task]
15
+ if task_name.nil?
16
+ puts "Should select a task."
17
+ exit 1
18
+ end
19
+ puts Benchmark.measure { Rake::Task[task_name].invoke(*args.extras) }
20
+ end
21
+
22
+ task default: :test
@@ -0,0 +1,49 @@
1
+
2
+ lib = File.expand_path("../lib", __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require "dh_easy/text/version"
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "dh_easy-text"
8
+ spec.version = DhEasy::Text::VERSION
9
+ spec.authors = ["Eduardo Rosales"]
10
+ spec.email = ["eduardo@datahen.com"]
11
+
12
+ spec.summary = %q{DataHen Easy toolkit text module}
13
+ spec.description = %q{DataHen Easy toolkit text module contains multiple text parsing helpers.}
14
+ spec.homepage = "https://datahen.com"
15
+ spec.license = "MIT"
16
+
17
+ # spec.cert_chain = ['certs/dh_easy.pem']
18
+ # spec.signing_key = File.expand_path("~/.ssh/gems/gem-private_dh_easy.pem") if $0 =~ /gem\z/
19
+
20
+ # Prevent pushing this gem to RubyGems.org. To allow pushes either set the 'allowed_push_host'
21
+ # to allow pushing to a single host or delete this section to allow pushing to any host.
22
+ if spec.respond_to?(:metadata)
23
+ # spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
24
+
25
+ spec.metadata["homepage_uri"] = spec.homepage
26
+ spec.metadata["source_code_uri"] = "https://github.com/DataHenOfficial/dh_easy-text"
27
+ # spec.metadata["changelog_uri"] = "TODO: Put your gem's CHANGELOG.md URL here."
28
+ else
29
+ raise "RubyGems 2.0 or newer is required to protect against " \
30
+ "public gem pushes."
31
+ end
32
+
33
+ # Specify which files should be added to the gem when it is released.
34
+ # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
35
+ spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
36
+ `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
37
+ end
38
+ spec.require_paths = ["lib"]
39
+ spec.required_ruby_version = '>= 2.2.2'
40
+
41
+ spec.add_dependency 'dh_easy-core', '~> 0'
42
+ spec.add_development_dependency 'bundler', '>= 1'
43
+ spec.add_development_dependency 'rake', '~> 10'
44
+ spec.add_development_dependency 'minitest', '~> 5'
45
+ spec.add_development_dependency 'simplecov', '~> 0'
46
+ spec.add_development_dependency 'simplecov-console', '~> 0'
47
+ spec.add_development_dependency 'timecop', '~> 0'
48
+ spec.add_development_dependency 'byebug', '>= 0'
49
+ end
@@ -0,0 +1,117 @@
1
+ <!DOCTYPE html>
2
+ <html>
3
+ <head>
4
+ <meta charset="utf-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>
7
+ Module: DhEasy
8
+
9
+ &mdash; Documentation by YARD 0.9.20
10
+
11
+ </title>
12
+
13
+ <link rel="stylesheet" href="css/style.css" type="text/css" charset="utf-8" />
14
+
15
+ <link rel="stylesheet" href="css/common.css" type="text/css" charset="utf-8" />
16
+
17
+ <script type="text/javascript" charset="utf-8">
18
+ pathId = "DhEasy";
19
+ relpath = '';
20
+ </script>
21
+
22
+
23
+ <script type="text/javascript" charset="utf-8" src="js/jquery.js"></script>
24
+
25
+ <script type="text/javascript" charset="utf-8" src="js/app.js"></script>
26
+
27
+
28
+ </head>
29
+ <body>
30
+ <div class="nav_wrap">
31
+ <iframe id="nav" src="class_list.html?1"></iframe>
32
+ <div id="resizer"></div>
33
+ </div>
34
+
35
+ <div id="main" tabindex="-1">
36
+ <div id="header">
37
+ <div id="menu">
38
+
39
+ <a href="_index.html">Index (D)</a> &raquo;
40
+
41
+
42
+ <span class="title">DhEasy</span>
43
+
44
+ </div>
45
+
46
+ <div id="search">
47
+
48
+ <a class="full_list_link" id="class_list_link"
49
+ href="class_list.html">
50
+
51
+ <svg width="24" height="24">
52
+ <rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
53
+ <rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
54
+ <rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
55
+ </svg>
56
+ </a>
57
+
58
+ </div>
59
+ <div class="clear"></div>
60
+ </div>
61
+
62
+ <div id="content"><h1>Module: DhEasy
63
+
64
+
65
+
66
+ </h1>
67
+ <div class="box_info">
68
+
69
+
70
+
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+
79
+ <dl>
80
+ <dt>Defined in:</dt>
81
+ <dd>lib/dh_easy/text.rb<span class="defines">,<br />
82
+ lib/dh_easy/text/version.rb</span>
83
+ </dd>
84
+ </dl>
85
+
86
+ </div>
87
+
88
+ <h2>Defined Under Namespace</h2>
89
+ <p class="children">
90
+
91
+
92
+ <strong class="modules">Modules:</strong> <span class='object_link'><a href="DhEasy/Text.html" title="DhEasy::Text (module)">Text</a></span>
93
+
94
+
95
+
96
+
97
+ </p>
98
+
99
+
100
+
101
+
102
+
103
+
104
+
105
+
106
+
107
+ </div>
108
+
109
+ <div id="footer">
110
+ Generated on Wed Dec 4 23:05:44 2019 by
111
+ <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
112
+ 0.9.20 (ruby-2.5.3).
113
+ </div>
114
+
115
+ </div>
116
+ </body>
117
+ </html>
@@ -0,0 +1,2146 @@
1
+ <!DOCTYPE html>
2
+ <html>
3
+ <head>
4
+ <meta charset="utf-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>
7
+ Module: DhEasy::Text
8
+
9
+ &mdash; Documentation by YARD 0.9.20
10
+
11
+ </title>
12
+
13
+ <link rel="stylesheet" href="../css/style.css" type="text/css" charset="utf-8" />
14
+
15
+ <link rel="stylesheet" href="../css/common.css" type="text/css" charset="utf-8" />
16
+
17
+ <script type="text/javascript" charset="utf-8">
18
+ pathId = "DhEasy::Text";
19
+ relpath = '../';
20
+ </script>
21
+
22
+
23
+ <script type="text/javascript" charset="utf-8" src="../js/jquery.js"></script>
24
+
25
+ <script type="text/javascript" charset="utf-8" src="../js/app.js"></script>
26
+
27
+
28
+ </head>
29
+ <body>
30
+ <div class="nav_wrap">
31
+ <iframe id="nav" src="../class_list.html?1"></iframe>
32
+ <div id="resizer"></div>
33
+ </div>
34
+
35
+ <div id="main" tabindex="-1">
36
+ <div id="header">
37
+ <div id="menu">
38
+
39
+ <a href="../_index.html">Index (T)</a> &raquo;
40
+ <span class='title'><span class='object_link'><a href="../DhEasy.html" title="DhEasy (module)">DhEasy</a></span></span>
41
+ &raquo;
42
+ <span class="title">Text</span>
43
+
44
+ </div>
45
+
46
+ <div id="search">
47
+
48
+ <a class="full_list_link" id="class_list_link"
49
+ href="../class_list.html">
50
+
51
+ <svg width="24" height="24">
52
+ <rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
53
+ <rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
54
+ <rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
55
+ </svg>
56
+ </a>
57
+
58
+ </div>
59
+ <div class="clear"></div>
60
+ </div>
61
+
62
+ <div id="content"><h1>Module: DhEasy::Text
63
+
64
+
65
+
66
+ </h1>
67
+ <div class="box_info">
68
+
69
+
70
+
71
+
72
+
73
+
74
+
75
+
76
+
77
+
78
+
79
+ <dl>
80
+ <dt>Defined in:</dt>
81
+ <dd>lib/dh_easy/text.rb<span class="defines">,<br />
82
+ lib/dh_easy/text/version.rb</span>
83
+ </dd>
84
+ </dl>
85
+
86
+ </div>
87
+
88
+
89
+
90
+ <h2>
91
+ Constant Summary
92
+ <small><a href="#" class="constants_summary_toggle">collapse</a></small>
93
+ </h2>
94
+
95
+ <dl class="constants">
96
+
97
+ <dt id="VERSION-constant" class="">VERSION =
98
+ <div class="docstring">
99
+ <div class="discussion">
100
+
101
+ <p>Gem version</p>
102
+
103
+
104
+ </div>
105
+ </div>
106
+ <div class="tags">
107
+
108
+
109
+ </div>
110
+ </dt>
111
+ <dd><pre class="code"><span class='tstring'><span class='tstring_beg'>&quot;</span><span class='tstring_content'>0.0.6</span><span class='tstring_end'>&quot;</span></span></pre></dd>
112
+
113
+ </dl>
114
+
115
+
116
+
117
+
118
+
119
+
120
+
121
+
122
+
123
+ <h2>
124
+ Class Method Summary
125
+ <small><a href="#" class="summary_toggle">collapse</a></small>
126
+ </h2>
127
+
128
+ <ul class="summary">
129
+
130
+ <li class="public ">
131
+ <span class="summary_signature">
132
+
133
+ <a href="#decode_html-class_method" title="decode_html (class method)">.<strong>decode_html</strong>(text) &#x21d2; String </a>
134
+
135
+
136
+
137
+ </span>
138
+
139
+
140
+
141
+
142
+
143
+
144
+
145
+
146
+
147
+ <span class="summary_desc"><div class='inline'>
148
+ <p>Decode HTML entities from text.</p>
149
+ </div></span>
150
+
151
+ </li>
152
+
153
+
154
+ <li class="public ">
155
+ <span class="summary_signature">
156
+
157
+ <a href="#default_parser-class_method" title="default_parser (class method)">.<strong>default_parser</strong>(cell_element, data, key) &#x21d2; Object </a>
158
+
159
+
160
+
161
+ </span>
162
+
163
+
164
+
165
+
166
+
167
+
168
+
169
+
170
+
171
+ <span class="summary_desc"><div class='inline'>
172
+ <p>Default cell content parser used to parse cell element.</p>
173
+ </div></span>
174
+
175
+ </li>
176
+
177
+
178
+ <li class="public ">
179
+ <span class="summary_signature">
180
+
181
+ <a href="#encode_html-class_method" title="encode_html (class method)">.<strong>encode_html</strong>(text) &#x21d2; String </a>
182
+
183
+
184
+
185
+ </span>
186
+
187
+
188
+
189
+
190
+
191
+
192
+
193
+
194
+
195
+ <span class="summary_desc"><div class='inline'>
196
+ <p>Encode text for valid HTML entities.</p>
197
+ </div></span>
198
+
199
+ </li>
200
+
201
+
202
+ <li class="public ">
203
+ <span class="summary_signature">
204
+
205
+ <a href="#hash-class_method" title="hash (class method)">.<strong>hash</strong>(object) &#x21d2; String </a>
206
+
207
+
208
+
209
+ </span>
210
+
211
+
212
+
213
+
214
+
215
+
216
+
217
+
218
+
219
+ <span class="summary_desc"><div class='inline'>
220
+ <p>Create a SHA1 hash (hex digest) from object.</p>
221
+ </div></span>
222
+
223
+ </li>
224
+
225
+
226
+ <li class="public ">
227
+ <span class="summary_signature">
228
+
229
+ <a href="#parse_content-class_method" title="parse_content (class method)">.<strong>parse_content</strong>(opts) {|data, row, header_map| ... } &#x21d2; Array&lt;Hash&gt;<sup>?</sup> </a>
230
+
231
+
232
+
233
+ </span>
234
+
235
+
236
+
237
+
238
+
239
+
240
+
241
+
242
+
243
+ <span class="summary_desc"><div class='inline'>
244
+ <p>Parse row data matching a selector using a header map to translate
245
+ between columns and friendly keys.</p>
246
+ </div></span>
247
+
248
+ </li>
249
+
250
+
251
+ <li class="public ">
252
+ <span class="summary_signature">
253
+
254
+ <a href="#parse_header_map-class_method" title="parse_header_map (class method)">.<strong>parse_header_map</strong>(opts = {}) &#x21d2; Hash{Symbol,String =&gt; Integer}<sup>?</sup> </a>
255
+
256
+
257
+
258
+ </span>
259
+
260
+
261
+
262
+
263
+
264
+
265
+
266
+
267
+
268
+ <span class="summary_desc"><div class='inline'>
269
+ <p>Parse header from selector and create a header map to match a column key
270
+ with column index.</p>
271
+ </div></span>
272
+
273
+ </li>
274
+
275
+
276
+ <li class="public ">
277
+ <span class="summary_signature">
278
+
279
+ <a href="#parse_table-class_method" title="parse_table (class method)">.<strong>parse_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; Hash{Symbol =&gt; Array,Hash,nil} </a>
280
+
281
+
282
+
283
+ </span>
284
+
285
+
286
+
287
+
288
+
289
+
290
+
291
+
292
+
293
+ <span class="summary_desc"><div class='inline'>
294
+ <p>Parse data from a horizontal table-like structure matching selectors and
295
+ using a header map to match columns.</p>
296
+ </div></span>
297
+
298
+ </li>
299
+
300
+
301
+ <li class="public ">
302
+ <span class="summary_signature">
303
+
304
+ <a href="#parse_vertical_table-class_method" title="parse_vertical_table (class method)">.<strong>parse_vertical_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; Hash{Symbol =&gt; Array,Hash,nil} </a>
305
+
306
+
307
+
308
+ </span>
309
+
310
+
311
+
312
+
313
+
314
+
315
+
316
+
317
+
318
+ <span class="summary_desc"><div class='inline'>
319
+ <p>Parse data from a vertical table-like structure matching selectors and
320
+ using a header map to match columns.</p>
321
+ </div></span>
322
+
323
+ </li>
324
+
325
+
326
+ <li class="public ">
327
+ <span class="summary_signature">
328
+
329
+ <a href="#strip-class_method" title="strip (class method)">.<strong>strip</strong>(raw_text, orig_encoding = &#39;ASCII&#39;) &#x21d2; String<sup>?</sup> </a>
330
+
331
+
332
+
333
+ </span>
334
+
335
+
336
+
337
+
338
+
339
+
340
+
341
+
342
+
343
+ <span class="summary_desc"><div class='inline'>
344
+ <p>Strip a value by trimming spaces, reducing sequential spaces into a
345
+ single space, decoding HTML entities and changing the encoding to UTF-8.</p>
346
+ </div></span>
347
+
348
+ </li>
349
+
350
+
351
+ <li class="public ">
352
+ <span class="summary_signature">
353
+
354
+ <a href="#translate_label_to_key-class_method" title="translate_label_to_key (class method)">.<strong>translate_label_to_key</strong>(element, label_map) &#x21d2; Symbol, String </a>
355
+
356
+
357
+
358
+ </span>
359
+
360
+
361
+
362
+
363
+
364
+
365
+
366
+
367
+
368
+ <span class="summary_desc"><div class='inline'>
369
+ <p>Extract column label and translate it into a friendly key.</p>
370
+ </div></span>
371
+
372
+ </li>
373
+
374
+
375
+ </ul>
376
+
377
+
378
+
379
+
380
+ <div id="class_method_details" class="method_details_list">
381
+ <h2>Class Method Details</h2>
382
+
383
+
384
+ <div class="method_details first">
385
+ <h3 class="signature first" id="decode_html-class_method">
386
+
387
+ .<strong>decode_html</strong>(text) &#x21d2; <tt>String</tt>
388
+
389
+
390
+
391
+
392
+
393
+ </h3><div class="docstring">
394
+ <div class="discussion">
395
+
396
+ <p>Decode HTML entities from text.</p>
397
+
398
+
399
+ </div>
400
+ </div>
401
+ <div class="tags">
402
+ <p class="tag_title">Parameters:</p>
403
+ <ul class="param">
404
+
405
+ <li>
406
+
407
+ <span class='name'>text</span>
408
+
409
+
410
+ <span class='type'>(<tt>String</tt>)</span>
411
+
412
+
413
+
414
+ &mdash;
415
+ <div class='inline'>
416
+ <p>Text to decode.</p>
417
+ </div>
418
+
419
+ </li>
420
+
421
+ </ul>
422
+
423
+ <p class="tag_title">Returns:</p>
424
+ <ul class="return">
425
+
426
+ <li>
427
+
428
+
429
+ <span class='type'>(<tt>String</tt>)</span>
430
+
431
+
432
+
433
+ </li>
434
+
435
+ </ul>
436
+
437
+ </div><table class="source_code">
438
+ <tr>
439
+ <td>
440
+ <pre class="lines">
441
+
442
+
443
+ 33
444
+ 34
445
+ 35</pre>
446
+ </td>
447
+ <td>
448
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 33</span>
449
+
450
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_decode_html'>decode_html</span> <span class='id identifier rubyid_text'>text</span>
451
+ <span class='const'>CGI</span><span class='period'>.</span><span class='id identifier rubyid_unescapeHTML'>unescapeHTML</span> <span class='id identifier rubyid_text'>text</span>
452
+ <span class='kw'>end</span></pre>
453
+ </td>
454
+ </tr>
455
+ </table>
456
+ </div>
457
+
458
+ <div class="method_details ">
459
+ <h3 class="signature " id="default_parser-class_method">
460
+
461
+ .<strong>default_parser</strong>(cell_element, data, key) &#x21d2; <tt>Object</tt>
462
+
463
+
464
+
465
+
466
+
467
+ </h3><div class="docstring">
468
+ <div class="discussion">
469
+
470
+ <p>Default cell content parser used to parse cell element.</p>
471
+
472
+
473
+ </div>
474
+ </div>
475
+ <div class="tags">
476
+ <p class="tag_title">Parameters:</p>
477
+ <ul class="param">
478
+
479
+ <li>
480
+
481
+ <span class='name'>cell_element</span>
482
+
483
+
484
+ <span class='type'>(<tt>Nokogiri::Element</tt>)</span>
485
+
486
+
487
+
488
+ &mdash;
489
+ <div class='inline'>
490
+ <p>Cell element to parse.</p>
491
+ </div>
492
+
493
+ </li>
494
+
495
+ <li>
496
+
497
+ <span class='name'>data</span>
498
+
499
+
500
+ <span class='type'>(<tt>Hash</tt>)</span>
501
+
502
+
503
+
504
+ &mdash;
505
+ <div class='inline'>
506
+ <p>Data hash to save parsed data into.</p>
507
+ </div>
508
+
509
+ </li>
510
+
511
+ <li>
512
+
513
+ <span class='name'>key</span>
514
+
515
+
516
+ <span class='type'>(<tt>String</tt>, <tt>Symbol</tt>)</span>
517
+
518
+
519
+
520
+ &mdash;
521
+ <div class='inline'>
522
+ <p>Header column key being parsed.</p>
523
+ </div>
524
+
525
+ </li>
526
+
527
+ </ul>
528
+
529
+
530
+ </div><table class="source_code">
531
+ <tr>
532
+ <td>
533
+ <pre class="lines">
534
+
535
+
536
+ 62
537
+ 63
538
+ 64
539
+ 65
540
+ 66</pre>
541
+ </td>
542
+ <td>
543
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 62</span>
544
+
545
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_default_parser'>default_parser</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span>
546
+ <span class='kw'>return</span> <span class='kw'>if</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
547
+ <span class='id identifier rubyid_cell_element'>cell_element</span><span class='period'>.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>//i</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_remove'>remove</span> <span class='kw'>if</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='period'>.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>//i</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_count'>count</span> <span class='op'>&gt;</span> <span class='int'>0</span>
548
+ <span class='id identifier rubyid_data'>data</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span> <span class='op'>=</span> <span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span>
549
+ <span class='kw'>end</span></pre>
550
+ </td>
551
+ </tr>
552
+ </table>
553
+ </div>
554
+
555
+ <div class="method_details ">
556
+ <h3 class="signature " id="encode_html-class_method">
557
+
558
+ .<strong>encode_html</strong>(text) &#x21d2; <tt>String</tt>
559
+
560
+
561
+
562
+
563
+
564
+ </h3><div class="docstring">
565
+ <div class="discussion">
566
+
567
+ <p>Encode text for valid HTML entities.</p>
568
+
569
+
570
+ </div>
571
+ </div>
572
+ <div class="tags">
573
+ <p class="tag_title">Parameters:</p>
574
+ <ul class="param">
575
+
576
+ <li>
577
+
578
+ <span class='name'>text</span>
579
+
580
+
581
+ <span class='type'>(<tt>String</tt>)</span>
582
+
583
+
584
+
585
+ &mdash;
586
+ <div class='inline'>
587
+ <p>Text to encode.</p>
588
+ </div>
589
+
590
+ </li>
591
+
592
+ </ul>
593
+
594
+ <p class="tag_title">Returns:</p>
595
+ <ul class="return">
596
+
597
+ <li>
598
+
599
+
600
+ <span class='type'>(<tt>String</tt>)</span>
601
+
602
+
603
+
604
+ </li>
605
+
606
+ </ul>
607
+
608
+ </div><table class="source_code">
609
+ <tr>
610
+ <td>
611
+ <pre class="lines">
612
+
613
+
614
+ 24
615
+ 25
616
+ 26</pre>
617
+ </td>
618
+ <td>
619
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 24</span>
620
+
621
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_encode_html'>encode_html</span> <span class='id identifier rubyid_text'>text</span>
622
+ <span class='const'>CGI</span><span class='period'>.</span><span class='id identifier rubyid_escapeHTML'>escapeHTML</span> <span class='id identifier rubyid_text'>text</span>
623
+ <span class='kw'>end</span></pre>
624
+ </td>
625
+ </tr>
626
+ </table>
627
+ </div>
628
+
629
+ <div class="method_details ">
630
+ <h3 class="signature " id="hash-class_method">
631
+
632
+ .<strong>hash</strong>(object) &#x21d2; <tt>String</tt>
633
+
634
+
635
+
636
+
637
+
638
+ </h3><div class="docstring">
639
+ <div class="discussion">
640
+
641
+ <p>Create a SHA1 hash (hex digest) from object.</p>
642
+
643
+
644
+ </div>
645
+ </div>
646
+ <div class="tags">
647
+ <p class="tag_title">Parameters:</p>
648
+ <ul class="param">
649
+
650
+ <li>
651
+
652
+ <span class='name'>object</span>
653
+
654
+
655
+ <span class='type'>(<tt>String</tt>, <tt>Hash</tt>, <tt>Object</tt>)</span>
656
+
657
+
658
+
659
+ &mdash;
660
+ <div class='inline'>
661
+ <p>Object to create hash from.</p>
662
+ </div>
663
+
664
+ </li>
665
+
666
+ </ul>
667
+
668
+ <p class="tag_title">Returns:</p>
669
+ <ul class="return">
670
+
671
+ <li>
672
+
673
+
674
+ <span class='type'>(<tt>String</tt>)</span>
675
+
676
+
677
+
678
+ </li>
679
+
680
+ </ul>
681
+
682
+ </div><table class="source_code">
683
+ <tr>
684
+ <td>
685
+ <pre class="lines">
686
+
687
+
688
+ 14
689
+ 15
690
+ 16
691
+ 17</pre>
692
+ </td>
693
+ <td>
694
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 14</span>
695
+
696
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_hash'>hash</span> <span class='id identifier rubyid_object'>object</span>
697
+ <span class='id identifier rubyid_object'>object</span> <span class='op'>=</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_hash'>hash</span> <span class='kw'>if</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span> <span class='const'>Hash</span>
698
+ <span class='const'>Digest</span><span class='op'>::</span><span class='const'>SHA1</span><span class='period'>.</span><span class='id identifier rubyid_hexdigest'>hexdigest</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span>
699
+ <span class='kw'>end</span></pre>
700
+ </td>
701
+ </tr>
702
+ </table>
703
+ </div>
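
The listing above can be exercised in isolation with a small sketch. `HashSketch` is a hypothetical stand-in that copies the two lines of the documented method body; it is not the gem's module. One point worth hedging: for a `Hash` argument the method digests `Object#hash`, and Ruby generally seeds those values per process, so such digests are presumably only comparable within a single Ruby process, while digests of plain strings are stable.

```ruby
require 'digest/sha1'

# Hypothetical stand-in mirroring the documented body of DhEasy::Text.hash.
module HashSketch
  def self.hash(object)
    # Hash arguments are first reduced to their Object#hash integer.
    object = object.hash if object.is_a?(Hash)
    # Everything is then stringified and digested with SHA1.
    Digest::SHA1.hexdigest(object.to_s)
  end
end

digest = HashSketch.hash('hello')
puts digest # 40-character hexadecimal SHA1 digest of "hello"

# Within one process, equal Hash objects give equal digests,
# regardless of key insertion order.
puts HashSketch.hash({ a: 1, b: 2 }) == HashSketch.hash({ b: 2, a: 1 })
```

If digests of `Hash` values need to be stable across runs, serializing the hash to a canonical string before digesting would be one option, though that goes beyond what the documented method does.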
704
+
705
+ <div class="method_details ">
706
+ <h3 class="signature " id="parse_content-class_method">
707
+
708
+ .<strong>parse_content</strong>(opts) {|data, row, header_map| ... } &#x21d2; <tt>Array&lt;Hash&gt;</tt><sup>?</sup>
709
+
710
+
711
+
712
+
713
+
714
+ </h3><div class="docstring">
715
+ <div class="discussion">
716
+
717
+ <p>Parse row data matching a selector using a header map to translate
718
+ between columns and friendly keys.</p>
719
+
720
+
721
+
722
+
723
+ </div>
724
+ </div>
725
+ <div class="tags">
726
+ <p class="tag_title">Parameters:</p>
727
+ <ul class="param">
728
+
729
+ <li>
730
+
731
+ <span class='name'>opts</span>
732
+
733
+
734
+ <span class='type'>(<tt>Hash</tt>)</span>
735
+
736
+
737
+
738
+ &mdash;
739
+ <div class='inline'>
740
+ <p>({}) Configuration options.</p>
741
+ </div>
742
+
743
+ </li>
744
+
745
+ </ul>
746
+
747
+
748
+
749
+
750
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
751
+ <ul class="option">
752
+
753
+ <li>
754
+ <span class="name">:html</span>
755
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
756
+ <span class="default">
757
+
758
+ </span>
759
+
760
+ &mdash; <div class='inline'>
761
+ <p>Container element to search into.</p>
762
+ </div>
763
+
764
+ </li>
765
+
766
+ <li>
767
+ <span class="name">:selector</span>
768
+ <span class="type">(<tt>String</tt>)</span>
769
+ <span class="default">
770
+
771
+ </span>
772
+
773
+ &mdash; <div class='inline'>
774
+ <p>CSS selector to match content cells.</p>
775
+ </div>
776
+
777
+ </li>
778
+
779
+ <li>
780
+ <span class="name">:first_row_header</span>
781
+ <span class="type">(<tt>Boolean</tt>)</span>
782
+ <span class="default">
783
+
784
+ &mdash; default:
785
+ <tt>false</tt>
786
+
787
+ </span>
788
+
789
+ &mdash; <div class='inline'>
790
+ <p>If true, the first matching element is assumed to be the header and is
791
+ ignored.</p>
792
+ </div>
793
+
794
+ </li>
795
+
796
+ <li>
797
+ <span class="name">:header_map</span>
798
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
799
+ <span class="default">
800
+
801
+ </span>
802
+
803
+ &mdash; <div class='inline'>
804
+ <p>Dictionary mapping header keys to column indexes.</p>
805
+ </div>
806
+
807
+ </li>
808
+
809
+ <li>
810
+ <span class="name">:column_parsers</span>
811
+ <span class="type">(<tt>Hash{Symbol,String =&gt; lambda,proc}</tt>)</span>
812
+ <span class="default">
813
+
814
+ &mdash; default:
815
+ <tt>{}</tt>
816
+
817
+ </span>
818
+
819
+ &mdash; <div class='inline'>
820
+ <p>Custom column parsers for advanced data extraction.</p>
821
+ </div>
822
+
823
+ </li>
824
+
825
+ <li>
826
+ <span class="name">:ignore_text_nodes</span>
827
+ <span class="type">(<tt>Boolean</tt>)</span>
828
+ <span class="default">
829
+
830
+ &mdash; default:
831
+ <tt>true</tt>
832
+
833
+ </span>
834
+
835
+ &mdash; <div class='inline'>
836
+ <p>Ignore text nodes when retrieving content cells and rows.</p>
837
+ </div>
838
+
839
+ </li>
840
+
841
+ </ul>
842
+
843
+
844
+ <p class="tag_title">Yield Parameters:</p>
845
+ <ul class="yieldparam">
846
+
847
+ <li>
848
+
849
+ <span class='name'>data</span>
850
+
851
+
852
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Object}</tt>)</span>
853
+
854
+
855
+
856
+ &mdash;
857
+ <div class='inline'>
858
+ <p>Parsed row data.</p>
859
+ </div>
860
+
861
+ </li>
862
+
863
+ <li>
864
+
865
+ <span class='name'>row</span>
866
+
867
+
868
+ <span class='type'>(<tt>Array</tt>)</span>
869
+
870
+
871
+
872
+ &mdash;
873
+ <div class='inline'>
874
+ <p>Raw row data.</p>
875
+ </div>
876
+
877
+ </li>
878
+
879
+ <li>
880
+
881
+ <span class='name'>header_map</span>
882
+
883
+
884
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
885
+
886
+
887
+
888
+ &mdash;
889
+ <div class='inline'>
890
+ <p>Header map used.</p>
891
+ </div>
892
+
893
+ </li>
894
+
895
+ </ul>
896
+ <p class="tag_title">Yield Returns:</p>
897
+ <ul class="yieldreturn">
898
+
899
+ <li>
900
+
901
+
902
+ <span class='type'>(<tt>Boolean</tt>)</span>
903
+
904
+
905
+
906
+ &mdash;
907
+ <div class='inline'>
908
+ <p><tt>true</tt> when valid, else <tt>false</tt>.</p>
909
+ </div>
910
+
911
+ </li>
912
+
913
+ </ul>
914
+ <p class="tag_title">Returns:</p>
915
+ <ul class="return">
916
+
917
+ <li>
918
+
919
+
920
+ <span class='type'>(<tt>Array&lt;Hash&gt;</tt>, <tt>nil</tt>)</span>
921
+
922
+
923
+
924
+ &mdash;
925
+ <div class='inline'>
926
+ <p>Parsed rows data.</p>
927
+ </div>
928
+
929
+ </li>
930
+
931
+ </ul>
932
+
933
+ </div><table class="source_code">
934
+ <tr>
935
+ <td>
936
+ <pre class="lines">
937
+
938
+
939
+ 89
940
+ 90
941
+ 91
942
+ 92
943
+ 93
944
+ 94
945
+ 95
946
+ 96
947
+ 97
948
+ 98
949
+ 99
950
+ 100
951
+ 101
952
+ 102
953
+ 103
954
+ 104
955
+ 105
956
+ 106
957
+ 107
958
+ 108
959
+ 109
960
+ 110
961
+ 111
962
+ 112
963
+ 113
964
+ 114
965
+ 115
966
+ 116
967
+ 117
968
+ 118
969
+ 119
970
+ 120
971
+ 121
972
+ 122
973
+ 123
974
+ 124
975
+ 125
976
+ 126
977
+ 127
978
+ 128
979
+ 129
980
+ 130
981
+ 131
982
+ 132
983
+ 133</pre>
984
+ </td>
985
+ <td>
986
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 89</span>
987
+
988
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_content'>parse_content</span> <span class='id identifier rubyid_opts'>opts</span><span class='comma'>,</span> <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
989
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
990
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
991
+ <span class='label'>selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
992
+ <span class='label'>first_row_header:</span> <span class='kw'>false</span><span class='comma'>,</span>
993
+ <span class='label'>header_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
994
+ <span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
995
+ <span class='label'>ignore_text_nodes:</span> <span class='kw'>true</span>
996
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
997
+
998
+ <span class='comment'># Setup config
999
+ </span> <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='rbracket'>]</span>
1000
+ <span class='id identifier rubyid_row_data'>row_data</span> <span class='op'>=</span> <span class='id identifier rubyid_child_element'>child_element</span> <span class='op'>=</span> <span class='kw'>nil</span>
1001
+ <span class='id identifier rubyid_first'>first</span> <span class='op'>=</span> <span class='id identifier rubyid_first_row_header'>first_row_header</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
1002
+ <span class='id identifier rubyid_header_map'>header_map</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_map</span><span class='rbracket'>]</span>
1003
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span>
1004
+ <span class='id identifier rubyid_ignore_text_nodes'>ignore_text_nodes</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:ignore_text_nodes</span><span class='rbracket'>]</span>
1005
+
1006
+ <span class='comment'># Get and parse rows
1007
+ </span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
1008
+ <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
1009
+ <span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_ignore_text_nodes'>ignore_text_nodes</span> <span class='op'>&amp;&amp;</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_name'>name</span> <span class='op'>==</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>text</span><span class='tstring_end'>&#39;</span></span>
1010
+
1011
+ <span class='comment'># First row header validation
1012
+ </span> <span class='kw'>if</span> <span class='id identifier rubyid_first'>first</span> <span class='op'>&amp;&amp;</span> <span class='id identifier rubyid_first_row_header'>first_row_header</span>
1013
+ <span class='id identifier rubyid_first'>first</span> <span class='op'>=</span> <span class='kw'>false</span>
1014
+ <span class='kw'>next</span>
1015
+ <span class='kw'>end</span>
1016
+
1017
+ <span class='comment'># Extract content data
1018
+ </span> <span class='id identifier rubyid_row_data'>row_data</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1019
+ <span class='id identifier rubyid_header_map'>header_map</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_key'>key</span><span class='comma'>,</span> <span class='id identifier rubyid_index'>index</span><span class='op'>|</span>
1020
+ <span class='comment'># Parse column html with default or custom parser
1021
+ </span> <span class='id identifier rubyid_children'>children</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_children'>children</span>
1022
+ <span class='id identifier rubyid_children'>children</span> <span class='op'>=</span> <span class='id identifier rubyid_children'>children</span><span class='period'>.</span><span class='id identifier rubyid_select'>select</span><span class='lbrace'>{</span><span class='op'>|</span><span class='id identifier rubyid_i'>i</span><span class='op'>|</span><span class='id identifier rubyid_i'>i</span><span class='period'>.</span><span class='id identifier rubyid_name'>name</span> <span class='op'>!=</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>text</span><span class='tstring_end'>&#39;</span></span><span class='rbrace'>}</span> <span class='kw'>if</span> <span class='id identifier rubyid_ignore_text_nodes'>ignore_text_nodes</span>
1023
+ <span class='id identifier rubyid_child_element'>child_element</span> <span class='op'>=</span> <span class='id identifier rubyid_children'>children</span><span class='lbracket'>[</span><span class='id identifier rubyid_index'>index</span><span class='rbracket'>]</span>
1024
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span>
1025
+ <span class='id identifier rubyid_default_parser'>default_parser</span><span class='lparen'>(</span><span class='id identifier rubyid_child_element'>child_element</span><span class='comma'>,</span> <span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span> <span class='op'>:</span>
1026
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_child_element'>child_element</span><span class='comma'>,</span> <span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span>
1027
+ <span class='kw'>end</span>
1028
+ <span class='kw'>next</span> <span class='kw'>unless</span> <span class='id identifier rubyid_filter'>filter</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>||</span> <span class='id identifier rubyid_filter'>filter</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_row'>row</span><span class='comma'>,</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='rparen'>)</span>
1029
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_row_data'>row_data</span>
1030
+ <span class='kw'>end</span>
1031
+ <span class='id identifier rubyid_data'>data</span>
1032
+ <span class='kw'>end</span></pre>
1033
+ </td>
1034
+ </tr>
1035
+ </table>
1036
+ </div>
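The row loop in `parse_content` can be hard to follow in highlighted form, so here is a minimal plain-Ruby sketch of the same flow. It is not the gem's code: rows are arrays of strings rather than Nokogiri nodes, and `extract_rows` and `DEFAULT_PARSER` are hypothetical names (the real `default_parser` is not shown in this section, so storing the stripped cell text is an assumption). What it does take from the source above is that each parser is called with `(cell, row_data, key)`, that its return value is ignored so it must write into `row_data` itself, and that the optional block decides whether a row is kept.

```ruby
# Plain-Ruby stand-in for the row loop in parse_content.
# Assumption: rows are arrays of strings, not Nokogiri nodes.

# Hypothetical default parser: stores the stripped cell text under the key.
DEFAULT_PARSER = lambda do |cell, row_data, key|
  row_data[key] = cell.to_s.strip
end

def extract_rows(rows, header_map:, column_parsers: {}, first_row_header: false, &filter)
  rows = rows.drop(1) if first_row_header # skip the header row
  rows.each_with_object([]) do |row, data|
    row_data = {}
    header_map.each do |key, index|
      parser = column_parsers[key] || DEFAULT_PARSER
      # Parsers write into row_data; their return value is ignored.
      parser.call(row[index], row_data, key)
    end
    # Keep the row when no block is given or the block returns a truthy value.
    next unless filter.nil? || filter.call(row_data, row, header_map)
    data << row_data
  end
end

rows = [
  ['Name', 'Price'],
  [' Apple ', '1.50'],
  ['Pear', '0.00']
]
header_map = { name: 0, price: 1 }
parsers = { price: ->(cell, row_data, key) { row_data[key] = cell.to_f } }

result = extract_rows(rows, header_map: header_map, column_parsers: parsers,
                            first_row_header: true) do |data, _row, _map|
  data[:price] > 0
end
```

In this sketch the `Pear` row is dropped because the block returns `false` for it, and the `:price` parser converts the cell to a float before the block sees it.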
1037
+
1038
+ <div class="method_details ">
1039
+ <h3 class="signature " id="parse_header_map-class_method">
1040
+
1041
+ .<strong>parse_header_map</strong>(opts = {}) &#x21d2; <tt>Hash{Symbol,String =&gt; Integer}</tt><sup>?</sup>
1042
+
1043
+
1044
+
1045
+
1046
+
1047
+ </h3><div class="docstring">
1048
+ <div class="discussion">
1049
+
1050
+ <p>Parse the header cells matched by a selector and create a header map that
1051
+ pairs each column key with its column index.</p>
1052
+
1053
+
1054
+
1055
+
1056
+ </div>
1057
+ </div>
1058
+ <div class="tags">
1059
+ <p class="tag_title">Parameters:</p>
1060
+ <ul class="param">
1061
+
1062
+ <li>
1063
+
1064
+ <span class='name'>opts</span>
1065
+
1066
+
1067
+ <span class='type'>(<tt>Hash</tt>)</span>
1068
+
1069
+
1070
+ <em class="default">(defaults to: <tt>{}</tt>)</em>
1071
+
1072
+
1073
+ &mdash;
1074
+ <div class='inline'>
1075
+ <p>Configuration options.</p>
1076
+ </div>
1077
+
1078
+ </li>
1079
+
1080
+ </ul>
1081
+
1082
+
1083
+
1084
+
1085
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
1086
+ <ul class="option">
1087
+
1088
+ <li>
1089
+ <span class="name">:html</span>
1090
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
1091
+ <span class="default">
1092
+
1093
+ </span>
1094
+
1095
+ &mdash; <div class='inline'>
1096
+ <p>Container element to search within.</p>
1097
+ </div>
1098
+
1099
+ </li>
1100
+
1101
+ <li>
1102
+ <span class="name">:selector</span>
1103
+ <span class="type">(<tt>String</tt>)</span>
1104
+ <span class="default">
1105
+
1106
+ </span>
1107
+
1108
+ &mdash; <div class='inline'>
1109
+ <p>CSS selector to match header cells.</p>
1110
+ </div>
1111
+
1112
+ </li>
1113
+
1114
+ <li>
1115
+ <span class="name">:column_key_label_map</span>
1116
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
1117
+ <span class="default">
1118
+
1119
+ </span>
1120
+
1121
+ &mdash; <div class='inline'>
1122
+ <p>Key vs. label dictionary.</p>
1123
+ </div>
1124
+
1125
+ </li>
1126
+
1127
+ <li>
1128
+ <span class="name">:first_row_header</span>
1129
+ <span class="type">(<tt>Boolean</tt>)</span>
1130
+ <span class="default">
1131
+
1132
+ &mdash; default:
1133
+ <tt>false</tt>
1134
+
1135
+ </span>
1136
+
1137
+ &mdash; <div class='inline'>
1138
+ <p>If true, the first row matched by the selector is used as the header for
1139
+ parsing.</p>
1140
+ </div>
1141
+
1142
+ </li>
1143
+
1144
+ <li>
1145
+ <span class="name">:ignore_text_nodes</span>
1146
+ <span class="type">(<tt>Boolean</tt>)</span>
1147
+ <span class="default">
1148
+
1149
+ &mdash; default:
1150
+ <tt>true</tt>
1151
+
1152
+ </span>
1153
+
1154
+ &mdash; <div class='inline'>
1155
+ <p>Ignore text nodes when retrieving header cells and rows.</p>
1156
+ </div>
1157
+
1158
+ </li>
1159
+
1160
+ </ul>
1161
+
1162
+
1163
+ <p class="tag_title">Returns:</p>
1164
+ <ul class="return">
1165
+
1166
+ <li>
1167
+
1168
+
1169
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>, <tt>nil</tt>)</span>
1170
+
1171
+
1172
+
1173
+ &mdash;
1174
+ <div class='inline'>
1175
+ <p>Key vs. column index map.</p>
1176
+ </div>
1177
+
1178
+ </li>
1179
+
1180
+ </ul>
1181
+
1182
+ </div><table class="source_code">
1183
+ <tr>
1184
+ <td>
1185
+ <pre class="lines">
1186
+
1187
+
1188
+ 166
1189
+ 167
1190
+ 168
1191
+ 169
1192
+ 170
1193
+ 171
1194
+ 172
1195
+ 173
1196
+ 174
1197
+ 175
1198
+ 176
1199
+ 177
1200
+ 178
1201
+ 179
1202
+ 180
1203
+ 181
1204
+ 182
1205
+ 183
1206
+ 184
1207
+ 185
1208
+ 186
1209
+ 187
1210
+ 188
1211
+ 189
1212
+ 190
1213
+ 191
1214
+ 192
1215
+ 193
1216
+ 194
1217
+ 195
1218
+ 196
1219
+ 197
1220
+ 198
1221
+ 199
1222
+ 200</pre>
1223
+ </td>
1224
+ <td>
1225
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 166</span>
1226
+
1227
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_header_map'>parse_header_map</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1228
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
1229
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1230
+ <span class='label'>selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1231
+ <span class='label'>column_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1232
+ <span class='label'>first_row_header:</span> <span class='kw'>false</span><span class='comma'>,</span>
1233
+ <span class='label'>ignore_text_nodes:</span> <span class='kw'>true</span>
1234
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
1235
+
1236
+ <span class='comment'># Setup config
1237
+ </span> <span class='id identifier rubyid_dictionary'>dictionary</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_key_label_map</span><span class='rbracket'>]</span>
1238
+ <span class='id identifier rubyid_ignore_text_nodes'>ignore_text_nodes</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:ignore_text_nodes</span><span class='rbracket'>]</span>
1239
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='rbracket'>]</span>
1240
+ <span class='id identifier rubyid_column_map'>column_map</span> <span class='op'>=</span> <span class='kw'>nil</span>
1241
+
1242
+ <span class='comment'># Extract and parse header rows
1243
+ </span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:selector</span><span class='rbracket'>]</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>nil</span>
1244
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1245
+ <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_first'>first</span><span class='rbracket'>]</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
1246
+ <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
1247
+ <span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_ignore_text_nodes'>ignore_text_nodes</span> <span class='op'>&amp;&amp;</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_name'>name</span> <span class='op'>==</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>text</span><span class='tstring_end'>&#39;</span></span>
1248
+
1249
+ <span class='id identifier rubyid_column_map'>column_map</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1250
+ <span class='id identifier rubyid_children'>children</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_children'>children</span>
1251
+ <span class='id identifier rubyid_children'>children</span> <span class='op'>=</span> <span class='id identifier rubyid_children'>children</span><span class='period'>.</span><span class='id identifier rubyid_select'>select</span><span class='lbrace'>{</span><span class='op'>|</span><span class='id identifier rubyid_i'>i</span><span class='op'>|</span><span class='id identifier rubyid_i'>i</span><span class='period'>.</span><span class='id identifier rubyid_name'>name</span> <span class='op'>!=</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>text</span><span class='tstring_end'>&#39;</span></span><span class='rbrace'>}</span> <span class='kw'>if</span> <span class='id identifier rubyid_ignore_text_nodes'>ignore_text_nodes</span>
1252
+ <span class='id identifier rubyid_children'>children</span><span class='period'>.</span><span class='id identifier rubyid_each_with_index'>each_with_index</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_col'>col</span><span class='comma'>,</span> <span class='id identifier rubyid_index'>index</span><span class='op'>|</span>
1253
+ <span class='comment'># Parse and map column header
1254
+ </span> <span class='id identifier rubyid_column_key'>column_key</span> <span class='op'>=</span> <span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_col'>col</span><span class='comma'>,</span> <span class='id identifier rubyid_dictionary'>dictionary</span>
1255
+ <span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_column_key'>column_key</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1256
+ <span class='id identifier rubyid_column_map'>column_map</span><span class='lbracket'>[</span><span class='id identifier rubyid_column_key'>column_key</span><span class='rbracket'>]</span> <span class='op'>=</span> <span class='id identifier rubyid_index'>index</span>
1257
+ <span class='kw'>end</span>
1258
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>&lt;&lt;</span> <span class='id identifier rubyid_column_map'>column_map</span>
1259
+ <span class='kw'>end</span>
1260
+ <span class='id identifier rubyid_data'>data</span><span class='op'>&amp;.</span><span class='id identifier rubyid_first'>first</span>
1261
+ <span class='kw'>end</span></pre>
1262
+ </td>
1263
+ </tr>
1264
+ </table>
1265
+ </div>
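To make the shape of the header map concrete, here is a minimal sketch of how a key-to-index map could be built from header labels. It is illustrative only: `label_to_key` and `build_header_map` are hypothetical names, labels are plain strings rather than Nokogiri nodes, and the matching rule (a `Regexp` is matched against the label, a `String` must equal it) is an assumption, since `translate_label_to_key` is not shown in this section.

```ruby
# Hypothetical label matcher. The real translate_label_to_key is not shown
# in this section, so the matching rules below are an assumption.
def label_to_key(label, dictionary)
  text = label.to_s.strip
  dictionary.each do |key, pattern|
    matched = pattern.is_a?(Regexp) ? pattern.match?(text) : pattern.to_s == text
    return key if matched
  end
  nil
end

# Builds a { key => column index } map, as parse_header_map does per row.
def build_header_map(labels, dictionary)
  labels.each_with_index.with_object({}) do |(label, index), map|
    key = label_to_key(label, dictionary)
    next if key.nil? # unmatched columns are left out of the map
    map[key] = index
  end
end

labels = ['Product name', 'SKU', 'Unit price (USD)']
dictionary = { name: /name/i, price: /price/i }
header_map = build_header_map(labels, dictionary)
```

Columns whose label matches nothing in the dictionary (here `SKU`) simply do not appear in the resulting map, which mirrors the `next if column_key.nil?` line in the source above.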
1266
+
1267
+ <div class="method_details ">
1268
+ <h3 class="signature " id="parse_table-class_method">
1269
+
1270
+ .<strong>parse_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; <tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>
1271
+
1272
+
1273
+
1274
+
1275
+
1276
+ </h3><div class="docstring">
1277
+ <div class="discussion">
1278
+
1279
+ <p>Parse data from a horizontal table-like structure matching the given
1280
+ selectors, using a header map to match columns.</p>
1281
+
1282
+
1283
+
1284
+
1285
+ </div>
1286
+ </div>
1287
+ <div class="tags">
1288
+ <p class="tag_title">Parameters:</p>
1289
+ <ul class="param">
1290
+
1291
+ <li>
1292
+
1293
+ <span class='name'>opts</span>
1294
+
1295
+
1296
+ <span class='type'>(<tt>Hash</tt>)</span>
1297
+
1298
+
1299
+ <em class="default">(defaults to: <tt>{}</tt>)</em>
1300
+
1301
+
1302
+ &mdash;
1303
+ <div class='inline'>
1304
+ <p>Configuration options.</p>
1305
+ </div>
1306
+
1307
+ </li>
1308
+
1309
+ </ul>
1310
+
1311
+
1312
+
1313
+
1314
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
1315
+ <ul class="option">
1316
+
1317
+ <li>
1318
+ <span class="name">:html</span>
1319
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
1320
+ <span class="default">
1321
+
1322
+ </span>
1323
+
1324
+ &mdash; <div class='inline'>
1325
+ <p>Container element to search within.</p>
1326
+ </div>
1327
+
1328
+ </li>
1329
+
1330
+ <li>
1331
+ <span class="name">:header_selector</span>
1332
+ <span class="type">(<tt>String</tt>)</span>
1333
+ <span class="default">
1334
+
1335
+ </span>
1336
+
1337
+ &mdash; <div class='inline'>
1338
+ <p>Header column elements selector.</p>
1339
+ </div>
1340
+
1341
+ </li>
1342
+
1343
+ <li>
1344
+ <span class="name">:header_key_label_map</span>
1345
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
1346
+ <span class="default">
1347
+
1348
+ </span>
1349
+
1350
+ &mdash; <div class='inline'>
1351
+ <p>Header key vs. label dictionary to match column indexes.</p>
1352
+ </div>
1353
+
1354
+ </li>
1355
+
1356
+ <li>
1357
+ <span class="name">:content_selector</span>
1358
+ <span class="type">(<tt>String</tt>)</span>
1359
+ <span class="default">
1360
+
1361
+ </span>
1362
+
1363
+ &mdash; <div class='inline'>
1364
+ <p>Content row elements selector.</p>
1365
+ </div>
1366
+
1367
+ </li>
1368
+
1369
+ <li>
1370
+ <span class="name">:first_row_header</span>
1371
+ <span class="type">(<tt>Boolean</tt>)</span>
1372
+ <span class="default">
1373
+
1374
+ &mdash; default:
1375
+ <tt>false</tt>
1376
+
1377
+ </span>
1378
+
1379
+ &mdash; <div class='inline'>
1380
+ <p>If true, the first row matched by the selector is used as the header for
1381
+ parsing.</p>
1382
+ </div>
1383
+
1384
+ </li>
1385
+
1386
+ <li>
1387
+ <span class="name">:column_parsers</span>
1388
+ <span class="type">(<tt>Hash{Symbol,String =&gt; lambda,proc}</tt>)</span>
1389
+ <span class="default">
1390
+
1391
+ &mdash; default:
1392
+ <tt>{}</tt>
1393
+
1394
+ </span>
1395
+
1396
+ &mdash; <div class='inline'>
1397
+ <p>Custom column parsers for advanced data extraction.</p>
1398
+ </div>
1399
+
1400
+ </li>
1401
+
1402
+ <li>
1403
+ <span class="name">:ignore_text_nodes</span>
1404
+ <span class="type">(<tt>Boolean</tt>)</span>
1405
+ <span class="default">
1406
+
1407
+ &mdash; default:
1408
+ <tt>true</tt>
1409
+
1410
+ </span>
1411
+
1412
+ &mdash; <div class='inline'>
1413
+ <p>Ignore text nodes when retrieving cells and rows.</p>
1414
+ </div>
1415
+
1416
+ </li>
1417
+
1418
+ </ul>
1419
+
1420
+
1421
+ <p class="tag_title">Yield Parameters:</p>
1422
+ <ul class="yieldparam">
1423
+
1424
+ <li>
1425
+
1426
+ <span class='name'>data</span>
1427
+
1428
+
1429
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Object}</tt>)</span>
1430
+
1431
+
1432
+
1433
+ &mdash;
1434
+ <div class='inline'>
1435
+ <p>Parsed content row data.</p>
1436
+ </div>
1437
+
1438
+ </li>
1439
+
1440
+ <li>
1441
+
1442
+ <span class='name'>row</span>
1443
+
1444
+
1445
+ <span class='type'>(<tt>Array</tt>)</span>
1446
+
1447
+
1448
+
1449
+ &mdash;
1450
+ <div class='inline'>
1451
+ <p>Raw content row data.</p>
1452
+ </div>
1453
+
1454
+ </li>
1455
+
1456
+ <li>
1457
+
1458
+ <span class='name'>header_map</span>
1459
+
1460
+
1461
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
1462
+
1463
+
1464
+
1465
+ &mdash;
1466
+ <div class='inline'>
1467
+ <p>Header map used.</p>
1468
+ </div>
1469
+
1470
+ </li>
1471
+
1472
+ </ul>
1473
+ <p class="tag_title">Yield Returns:</p>
1474
+ <ul class="yieldreturn">
1475
+
1476
+ <li>
1477
+
1478
+
1479
+ <span class='type'>(<tt>Boolean</tt>)</span>
1480
+
1481
+
1482
+
1483
+ &mdash;
1484
+ <div class='inline'>
1485
+ <p><tt>true</tt> when valid, else <tt>false</tt>.</p>
1486
+ </div>
1487
+
1488
+ </li>
1489
+
1490
+ </ul>
1491
+ <p class="tag_title">Returns:</p>
1492
+ <ul class="return">
1493
+
1494
+ <li>
1495
+
1496
+
1497
+ <span class='type'>(<tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>)</span>
1498
+
1499
+
1500
+
1501
+ &mdash;
1502
+ <div class='inline'>
1503
+ <p>Hash data is as follows:</p>
1504
+ <ul><li>
1505
+ <p><tt>:header_map</tt> (<tt>Hash</tt>) Header map used.</p>
1506
+ </li><li>
1507
+ <p><tt>:data</tt> (<tt>Array&lt;Hash&gt;</tt>, <tt>nil</tt>) Parsed rows data.</p>
1508
+ </li></ul>
1509
+ </div>
1510
+
1511
+ </li>
1512
+
1513
+ </ul>
1514
+
1515
+ </div><table class="source_code">
1516
+ <tr>
1517
+ <td>
1518
+ <pre class="lines">
1519
+
1520
+
1521
+ 226
1522
+ 227
1523
+ 228
1524
+ 229
1525
+ 230
1526
+ 231
1527
+ 232
1528
+ 233
1529
+ 234
1530
+ 235
1531
+ 236
1532
+ 237
1533
+ 238
1534
+ 239
1535
+ 240
1536
+ 241
1537
+ 242
1538
+ 243
1539
+ 244
1540
+ 245
1541
+ 246
1542
+ 247
1543
+ 248
1544
+ 249
1545
+ 250
1546
+ 251</pre>
1547
+ </td>
1548
+ <td>
1549
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 226</span>
1550
+
1551
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_table'>parse_table</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span> <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
1552
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
1553
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1554
+ <span class='label'>header_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1555
+ <span class='label'>header_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1556
+ <span class='label'>content_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1557
+ <span class='label'>first_row_header:</span> <span class='kw'>false</span><span class='comma'>,</span>
1558
+ <span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1559
+ <span class='label'>ignore_text_nodes:</span> <span class='kw'>true</span>
1560
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
1561
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1562
+ <span class='id identifier rubyid_header_map'>header_map</span> <span class='op'>=</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_header_map'>parse_header_map</span> <span class='label'>html:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='comma'>,</span>
1563
+ <span class='label'>selector:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_selector</span><span class='rbracket'>]</span><span class='comma'>,</span>
1564
+ <span class='label'>column_key_label_map:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_key_label_map</span><span class='rbracket'>]</span><span class='comma'>,</span>
1565
+ <span class='label'>first_row_header:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span><span class='comma'>,</span>
1566
+ <span class='label'>ignore_text_nodes:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:ignore_text_nodes</span><span class='rbracket'>]</span>
1567
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1568
+ <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_content'>parse_content</span> <span class='label'>html:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='comma'>,</span>
1569
+ <span class='label'>selector:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:content_selector</span><span class='rbracket'>]</span><span class='comma'>,</span>
1570
+ <span class='label'>header_map:</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='comma'>,</span>
1571
+ <span class='label'>first_row_header:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span><span class='comma'>,</span>
1572
+ <span class='label'>column_parsers:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span><span class='comma'>,</span>
1573
+ <span class='label'>ignore_text_nodes:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:ignore_text_nodes</span><span class='rbracket'>]</span><span class='comma'>,</span>
1574
+ <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
1575
+ <span class='lbrace'>{</span><span class='label'>header_map:</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='comma'>,</span> <span class='label'>data:</span> <span class='id identifier rubyid_data'>data</span><span class='rbrace'>}</span>
1576
+ <span class='kw'>end</span></pre>
1577
+ </td>
1578
+ </tr>
1579
+ </table>
1580
+ </div>
1581
+
1582
+ <div class="method_details ">
1583
+ <h3 class="signature " id="parse_vertical_table-class_method">
1584
+
1585
+ .<strong>parse_vertical_table</strong>(opts = {}) {|data, row, header_map| ... } &#x21d2; <tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>
1586
+
1587
+
1588
+
1589
+
1590
+
1591
+ </h3><div class="docstring">
1592
+ <div class="discussion">
1593
+
1594
+ <p>Parse data from a vertical, table-like structure matching a selector, using a header map to match columns.</p>
1598
+
1599
+
1600
+ </div>
1601
+ </div>
1602
+ <div class="tags">
1603
+ <p class="tag_title">Parameters:</p>
1604
+ <ul class="param">
1605
+
1606
+ <li>
1607
+
1608
+ <span class='name'>opts</span>
1609
+
1610
+
1611
+ <span class='type'>(<tt>Hash</tt>)</span>
1612
+
1613
+
1614
+ <em class="default">(defaults to: <tt>{}</tt>)</em>
1615
+
1616
+
1617
+ &mdash;
1618
+ <div class='inline'>
1619
+ <p>({}) Configuration options.</p>
1620
+ </div>
1621
+
1622
+ </li>
1623
+
1624
+ </ul>
1625
+
1626
+
1627
+
1628
+
1629
+ <p class="tag_title">Options Hash (<tt>opts</tt>):</p>
1630
+ <ul class="option">
1631
+
1632
+ <li>
1633
+ <span class="name">:html</span>
1634
+ <span class="type">(<tt>Nokogiri::Element</tt>)</span>
1635
+ <span class="default">
1636
+
1637
+ </span>
1638
+
1639
+ &mdash; <div class='inline'>
1640
+ <p>Container element to search in.</p>
1641
+ </div>
1642
+
1643
+ </li>
1644
+
1645
+ <li>
1646
+ <span class="name">:row_selector</span>
1647
+ <span class="type">(<tt>String</tt>)</span>
1648
+ <span class="default">
1649
+
1650
+ </span>
1651
+
1652
+ &mdash; <div class='inline'>
1653
+ <p>Selector for the vertical, row-like elements.</p>
1654
+ </div>
1655
+
1656
+ </li>
1657
+
1658
+ <li>
1659
+ <span class="name">:header_selector</span>
1660
+ <span class="type">(<tt>String</tt>)</span>
1661
+ <span class="default">
1662
+
1663
+ </span>
1664
+
1665
+ &mdash; <div class='inline'>
1666
+ <p>Header column elements selector.</p>
1667
+ </div>
1668
+
1669
+ </li>
1670
+
1671
+ <li>
1672
+ <span class="name">:header_key_label_map</span>
1673
+ <span class="type">(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
1674
+ <span class="default">
1675
+
1676
+ </span>
1677
+
1678
+ &mdash; <div class='inline'>
1679
+ <p>Header key vs. label dictionary to match column indexes.</p>
1680
+ </div>
1681
+
1682
+ </li>
1683
+
1684
+ <li>
1685
+ <span class="name">:content_selector</span>
1686
+ <span class="type">(<tt>String</tt>)</span>
1687
+ <span class="default">
1688
+
1689
+ </span>
1690
+
1691
+ &mdash; <div class='inline'>
1692
+ <p>Content row elements selector.</p>
1693
+ </div>
1694
+
1695
+ </li>
1696
+
1697
+ <li>
1698
+ <span class="name">:column_parsers</span>
1699
+ <span class="type">(<tt>Hash{Symbol,String =&gt; lambda,proc}</tt>)</span>
1700
+ <span class="default">
1701
+
1702
+ &mdash; default:
1703
+ <tt>{}</tt>
1704
+
1705
+ </span>
1706
+
1707
+ &mdash; <div class='inline'>
1708
+ <p>Custom column parsers for advanced data extraction.</p>
1709
+ </div>
1710
+
1711
+ </li>
1712
+
1713
+ <li>
1714
+ <span class="name">:ignore_text_nodes</span>
1715
+ <span class="type">(<tt>Boolean</tt>)</span>
1716
+ <span class="default">
1717
+
1718
+ &mdash; default:
1719
+ <tt>true</tt>
1720
+
1721
+ </span>
1722
+
1723
+ &mdash; <div class='inline'>
1724
+ <p>Ignore text nodes when retrieving cells and rows.</p>
1725
+ </div>
1726
+
1727
+ </li>
1728
+
1729
+ </ul>
1730
+
1731
+
1732
+ <p class="tag_title">Yield Parameters:</p>
1733
+ <ul class="yieldparam">
1734
+
1735
+ <li>
1736
+
1737
+ <span class='name'>data</span>
1738
+
1739
+
1740
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Object}</tt>)</span>
1741
+
1742
+
1743
+
1744
+ &mdash;
1745
+ <div class='inline'>
1746
+ <p>Parsed content row data.</p>
1747
+ </div>
1748
+
1749
+ </li>
1750
+
1751
+ <li>
1752
+
1753
+ <span class='name'>row</span>
1754
+
1755
+
1756
+ <span class='type'>(<tt>Array</tt>)</span>
1757
+
1758
+
1759
+
1760
+ &mdash;
1761
+ <div class='inline'>
1762
+ <p>Raw content row data.</p>
1763
+ </div>
1764
+
1765
+ </li>
1766
+
1767
+ <li>
1768
+
1769
+ <span class='name'>header_map</span>
1770
+
1771
+
1772
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Integer}</tt>)</span>
1773
+
1774
+
1775
+
1776
+ &mdash;
1777
+ <div class='inline'>
1778
+ <p>Header map used.</p>
1779
+ </div>
1780
+
1781
+ </li>
1782
+
1783
+ </ul>
1784
+ <p class="tag_title">Yield Returns:</p>
1785
+ <ul class="yieldreturn">
1786
+
1787
+ <li>
1788
+
1789
+
1790
+ <span class='type'>(<tt>Boolean</tt>)</span>
1791
+
1792
+
1793
+
1794
+ &mdash;
1795
+ <div class='inline'>
1796
+ <p>`true` when valid, else `false`.</p>
1797
+ </div>
1798
+
1799
+ </li>
1800
+
1801
+ </ul>
1802
+ <p class="tag_title">Returns:</p>
1803
+ <ul class="return">
1804
+
1805
+ <li>
1806
+
1807
+
1808
+ <span class='type'>(<tt>Hash{Symbol =&gt; Array,Hash,nil}</tt>)</span>
1809
+
1810
+
1811
+
1812
+ &mdash;
1813
+ <div class='inline'>
1814
+ <p>Hash data is as follows:</p>
1815
+ <ul><li>
1816
+ <p>`[Hash] :header_map` Header map used.</p>
1817
+ </li><li>
1818
+ <p>`[Array&lt;Hash&gt;,nil] :data` Parsed rows data.</p>
1819
+ </li></ul>
1820
+ </div>
1821
+
1822
+ </li>
1823
+
1824
+ </ul>
1825
+
1826
+ </div><table class="source_code">
1827
+ <tr>
1828
+ <td>
1829
+ <pre class="lines">
1830
+
1831
+
1832
+ 276
1833
+ 277
1834
+ 278
1835
+ 279
1836
+ 280
1837
+ 281
1838
+ 282
1839
+ 283
1840
+ 284
1841
+ 285
1842
+ 286
1843
+ 287
1844
+ 288
1845
+ 289
1846
+ 290
1847
+ 291
1848
+ 292
1849
+ 293
1850
+ 294
1851
+ 295
1852
+ 296
1853
+ 297
1854
+ 298
1855
+ 299
1856
+ 300
1857
+ 301
1858
+ 302
1859
+ 303
1860
+ 304
1861
+ 305
1862
+ 306
1863
+ 307
1864
+ 308
1865
+ 309</pre>
1866
+ </td>
1867
+ <td>
1868
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 276</span>
1869
+
1870
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_vertical_table'>parse_vertical_table</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span> <span class='op'>&amp;</span><span class='id identifier rubyid_filter'>filter</span>
1871
+ <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
1872
+ <span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1873
+ <span class='label'>row_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1874
+ <span class='label'>header_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1875
+ <span class='label'>header_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1876
+ <span class='label'>content_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
1877
+ <span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
1878
+ <span class='label'>ignore_text_nodes:</span> <span class='kw'>true</span>
1879
+ <span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
1880
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1881
+
1882
+ <span class='comment'># Setup config
1883
+ </span> <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
1884
+ <span class='id identifier rubyid_dictionary'>dictionary</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_key_label_map</span><span class='rbracket'>]</span>
1885
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span>
1886
+
1887
+ <span class='comment'># Extract headers and content
1888
+ </span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:row_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>nil</span>
1889
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
1890
+ <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
1891
+ <span class='comment'># Parse and map column header
1892
+ </span> <span class='id identifier rubyid_header_element'>header_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
1893
+ <span class='id identifier rubyid_key'>key</span> <span class='op'>=</span> <span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_header_element'>header_element</span><span class='comma'>,</span> <span class='id identifier rubyid_dictionary'>dictionary</span>
1894
+ <span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_key'>key</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>||</span> <span class='id identifier rubyid_key'>key</span> <span class='op'>==</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_end'>&#39;</span></span>
1895
+
1896
+ <span class='comment'># Parse column html with default or custom parser
1897
+ </span> <span class='id identifier rubyid_content_element'>content_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:content_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
1898
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span>
1899
+ <span class='id identifier rubyid_default_parser'>default_parser</span><span class='lparen'>(</span><span class='id identifier rubyid_content_element'>content_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span> <span class='op'>:</span>
1900
+ <span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_content_element'>content_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span>
1901
+ <span class='kw'>end</span>
1902
+ <span class='id identifier rubyid_data'>data</span>
1903
+ <span class='kw'>end</span></pre>
1904
+ </td>
1905
+ </tr>
1906
+ </table>
1907
+ </div>
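The row loop in `parse_vertical_table` can be sketched without Nokogiri. This is a minimal illustration, not the library's code: `parse_vertical_rows` is a hypothetical name, each row is assumed to be already reduced to a `[label, value]` pair of strings, and the fallback parser (store the trimmed value under the key) is an assumption, since `default_parser` is not shown in this section.

```ruby
# Hypothetical stand-in for the loop inside parse_vertical_table.
# rows:           Array of [label, value] string pairs (assumed input shape)
# label_map:      Hash{key => Regexp or String}, as in :header_key_label_map
# column_parsers: Hash{key => callable(content, data, key)}
def parse_vertical_rows(rows, label_map, column_parsers = {})
  data = {}
  rows.each do |label, value|
    # Translate the row label into a key; skip rows with no match.
    pair = label_map.find do |_key, matcher|
      matcher.is_a?(Regexp) ? (label =~ matcher) : (label == matcher)
    end
    next if pair.nil?
    key = pair[0]

    # Use the custom parser when one is registered for this key,
    # otherwise fall back to an assumed default that trims the value.
    parser = column_parsers[key] || ->(content, out, k) { out[k] = content.strip }
    parser.call(value, data, key)
  end
  data
end

rows = [['Name', ' Widget '], ['Unit Price', '9.50'], ['Notes', 'ignored']]
label_map = { name: 'Name', price: /price/i }
parsers = { price: ->(content, out, key) { out[key] = content.to_f } }

result = parse_vertical_rows(rows, label_map, parsers)
# => {:name=>"Widget", :price=>9.5}
```

Each parser writes into the shared `data` hash rather than returning a value, which mirrors the `(content_element, data, key)` call shape in the source listing. As in that listing, the sketch returns the data hash directly.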
1908
+
1909
+ <div class="method_details ">
1910
+ <h3 class="signature " id="strip-class_method">
1911
+
1912
+ .<strong>strip</strong>(raw_text, orig_encoding = &#39;ASCII&#39;) &#x21d2; <tt>String</tt><sup>?</sup>
1913
+
1914
+
1915
+
1916
+
1917
+
1918
+ </h3><div class="docstring">
1919
+ <div class="discussion">
1920
+
1921
+ <p>Strip a value by trimming spaces, reducing sequential spaces into a single space, decoding HTML entities, and changing the encoding to UTF-8.</p>
1925
+
1926
+
1927
+ </div>
1928
+ </div>
1929
+ <div class="tags">
1930
+ <p class="tag_title">Parameters:</p>
1931
+ <ul class="param">
1932
+
1933
+ <li>
1934
+
1935
+ <span class='name'>raw_text</span>
1936
+
1937
+
1938
+ <span class='type'>(<tt>String</tt>, <tt>Object</tt>, <tt>nil</tt>)</span>
1939
+
1940
+
1941
+
1942
+ &mdash;
1943
+ <div class='inline'>
1944
+ <p>Text to strip.</p>
1945
+ </div>
1946
+
1947
+ </li>
1948
+
1949
+ <li>
1950
+
1951
+ <span class='name'>orig_encoding</span>
1952
+
1953
+
1954
+ <span class='type'>(<tt>String</tt>)</span>
1955
+
1956
+
1957
+ <em class="default">(defaults to: <tt>&#39;ASCII&#39;</tt>)</em>
1958
+
1959
+
1960
+ &mdash;
1961
+ <div class='inline'>
1962
+ <p>Text original encoding.</p>
1963
+ </div>
1964
+
1965
+ </li>
1966
+
1967
+ </ul>
1968
+
1969
+ <p class="tag_title">Returns:</p>
1970
+ <ul class="return">
1971
+
1972
+ <li>
1973
+
1974
+
1975
+ <span class='type'>(<tt>String</tt>, <tt>nil</tt>)</span>
1976
+
1977
+
1978
+
1979
+ &mdash;
1980
+ <div class='inline'>
1981
+ <p>`nil` when <code>raw_text</code> is nil, else `String`.</p>
1982
+ </div>
1983
+
1984
+ </li>
1985
+
1986
+ </ul>
1987
+
1988
+ </div><table class="source_code">
1989
+ <tr>
1990
+ <td>
1991
+ <pre class="lines">
1992
+
1993
+
1994
+ 44
1995
+ 45
1996
+ 46
1997
+ 47
1998
+ 48
1999
+ 49
2000
+ 50
2001
+ 51
2002
+ 52
2003
+ 53
2004
+ 54
2005
+ 55</pre>
2006
+ </td>
2007
+ <td>
2008
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 44</span>
2009
+
2010
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='comma'>,</span> <span class='id identifier rubyid_orig_encoding'>orig_encoding</span> <span class='op'>=</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>ASCII</span><span class='tstring_end'>&#39;</span></span>
2011
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
2012
+ <span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span> <span class='kw'>unless</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span> <span class='const'>String</span>
2013
+ <span class='id identifier rubyid_regex'>regex</span> <span class='op'>=</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>(\s|\u3000|\u00a0)+</span><span class='regexp_end'>/</span></span>
2014
+ <span class='id identifier rubyid_good_encoding'>good_encoding</span> <span class='op'>=</span> <span class='lparen'>(</span><span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=~</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>\u3000</span><span class='regexp_end'>/</span></span> <span class='op'>||</span> <span class='kw'>true</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>false</span>
2015
+ <span class='kw'>unless</span> <span class='id identifier rubyid_good_encoding'>good_encoding</span>
2016
+ <span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_force_encoding'>force_encoding</span><span class='lparen'>(</span><span class='id identifier rubyid_orig_encoding'>orig_encoding</span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_encode'>encode</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>UTF-8</span><span class='tstring_end'>&#39;</span></span><span class='comma'>,</span> <span class='label'>invalid:</span> <span class='symbol'>:replace</span><span class='comma'>,</span> <span class='label'>undef:</span> <span class='symbol'>:replace</span><span class='rparen'>)</span>
2017
+ <span class='id identifier rubyid_regex'>regex</span> <span class='op'>=</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>(\s|\u3000|\u00a0|\u00c2\u00a0)+</span><span class='regexp_end'>/</span></span>
2018
+ <span class='kw'>end</span>
2019
+ <span class='id identifier rubyid_text'>text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_gsub'>gsub</span><span class='lparen'>(</span><span class='id identifier rubyid_regex'>regex</span><span class='comma'>,</span> <span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'> </span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span>
2020
+ <span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span> <span class='kw'>nil</span> <span class='op'>:</span> <span class='id identifier rubyid_decode_html'>decode_html</span><span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span><span class='rparen'>)</span>
2021
+ <span class='kw'>end</span></pre>
2022
+ </td>
2023
+ </tr>
2024
+ </table>
2025
+ </div>
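The whitespace normalization that `strip` performs can be sketched in plain Ruby. This is an approximation under stated assumptions: `strip_sketch` is a hypothetical name, `CGI.unescapeHTML` stands in for the library's `decode_html` helper (not shown in this section), and the encoding check uses `valid_encoding?` instead of the rescue-based probe in the source listing.

```ruby
require 'cgi'

# Hypothetical stand-in for DhEasy::Text.strip.
def strip_sketch(raw_text, orig_encoding = 'ASCII')
  return nil if raw_text.nil?
  text = raw_text.to_s.dup # dup so the caller's string is never mutated

  # Re-encode to UTF-8 when the text is not already valid UTF-8
  # (assumption: the source detects this by rescuing a failed regex match).
  unless text.encoding == Encoding::UTF_8 && text.valid_encoding?
    text = text.force_encoding(orig_encoding)
               .encode('UTF-8', invalid: :replace, undef: :replace)
  end

  # Collapse runs of ASCII whitespace, ideographic spaces (U+3000) and
  # non-breaking spaces (U+00A0) into one space, then trim both ends.
  text = text.gsub(/[\s\u3000\u00a0]+/, ' ').strip

  # Decode HTML entities (stand-in for decode_html).
  CGI.unescapeHTML(text)
end

strip_sketch("  Hello\u00a0 \u3000World &amp; co  ")
# => "Hello World & co"
```

Non-string input goes through `to_s` first, so `strip_sketch(42)` yields `"42"`, and `nil` is passed through unchanged.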
2026
+
2027
+ <div class="method_details ">
2028
+ <h3 class="signature " id="translate_label_to_key-class_method">
2029
+
2030
+ .<strong>translate_label_to_key</strong>(element, label_map) &#x21d2; <tt>Symbol</tt>, <tt>String</tt>
2031
+
2032
+
2033
+
2034
+
2035
+
2036
+ </h3><div class="docstring">
2037
+ <div class="discussion">
2038
+
2039
+ <p>Extract the column label and translate it into a friendly key.</p>
2040
+
2041
+
2042
+ </div>
2043
+ </div>
2044
+ <div class="tags">
2045
+ <p class="tag_title">Parameters:</p>
2046
+ <ul class="param">
2047
+
2048
+ <li>
2049
+
2050
+ <span class='name'>element</span>
2051
+
2052
+
2053
+ <span class='type'>(<tt>Nokogiri::Element</tt>)</span>
2054
+
2055
+
2056
+
2057
+ &mdash;
2058
+ <div class='inline'>
2059
+ <p>HTML element to parse.</p>
2060
+ </div>
2061
+
2062
+ </li>
2063
+
2064
+ <li>
2065
+
2066
+ <span class='name'>label_map</span>
2067
+
2068
+
2069
+ <span class='type'>(<tt>Hash{Symbol,String =&gt; Regexp,String}</tt>)</span>
2070
+
2071
+
2072
+
2073
+ &mdash;
2074
+ <div class='inline'>
2075
+ <p>Label dictionary for translation into key.</p>
2076
+ </div>
2077
+
2078
+ </li>
2079
+
2080
+ </ul>
2081
+
2082
+ <p class="tag_title">Returns:</p>
2083
+ <ul class="return">
2084
+
2085
+ <li>
2086
+
2087
+
2088
+ <span class='type'>(<tt>Symbol</tt>, <tt>String</tt>)</span>
2089
+
2090
+
2091
+
2092
+ &mdash;
2093
+ <div class='inline'>
2094
+ <p>Translated key.</p>
2095
+ </div>
2096
+
2097
+ </li>
2098
+
2099
+ </ul>
2100
+
2101
+ </div><table class="source_code">
2102
+ <tr>
2103
+ <td>
2104
+ <pre class="lines">
2105
+
2106
+
2107
+ 142
2108
+ 143
2109
+ 144
2110
+ 145
2111
+ 146
2112
+ 147
2113
+ 148
2114
+ 149
2115
+ 150</pre>
2116
+ </td>
2117
+ <td>
2118
+ <pre class="code"><span class="info file"># File 'lib/dh_easy/text.rb', line 142</span>
2119
+
2120
+ <span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_element'>element</span><span class='comma'>,</span> <span class='id identifier rubyid_label_map'>label_map</span>
2121
+ <span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
2122
+ <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>//i</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_remove'>remove</span> <span class='kw'>if</span> <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>&#39;</span><span class='tstring_content'>//i</span><span class='tstring_end'>&#39;</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_count'>count</span> <span class='op'>&gt;</span> <span class='int'>0</span>
2123
+ <span class='id identifier rubyid_text'>text</span> <span class='op'>=</span> <span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_element'>element</span><span class='period'>.</span><span class='id identifier rubyid_text'>text</span>
2124
+ <span class='id identifier rubyid_key_pair'>key_pair</span> <span class='op'>=</span> <span class='id identifier rubyid_label_map'>label_map</span><span class='period'>.</span><span class='id identifier rubyid_find'>find</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_k'>k</span><span class='comma'>,</span><span class='id identifier rubyid_v'>v</span><span class='op'>|</span>
2125
+ <span class='id identifier rubyid_v'>v</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span><span class='lparen'>(</span><span class='const'>Regexp</span><span class='rparen'>)</span> <span class='op'>?</span> <span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span> <span class='op'>=~</span> <span class='id identifier rubyid_v'>v</span><span class='rparen'>)</span> <span class='op'>:</span> <span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span> <span class='op'>==</span> <span class='id identifier rubyid_v'>v</span><span class='rparen'>)</span>
2126
+ <span class='kw'>end</span>
2127
+ <span class='id identifier rubyid_key'>key</span> <span class='op'>=</span> <span class='id identifier rubyid_key_pair'>key_pair</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span> <span class='kw'>nil</span> <span class='op'>:</span> <span class='id identifier rubyid_key_pair'>key_pair</span><span class='lbracket'>[</span><span class='int'>0</span><span class='rbracket'>]</span>
2128
+ <span class='kw'>end</span></pre>
2129
+ </td>
2130
+ </tr>
2131
+ </table>
2132
+ </div>
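The matching rule used by `translate_label_to_key` can be shown on plain strings. In this sketch, `label_to_key` is a hypothetical helper that takes the already extracted and stripped label text instead of a Nokogiri element. A `Regexp` value matches by pattern, any other value by equality, and the first matching entry wins.

```ruby
# Hypothetical helper: find the first label_map entry whose value matches
# the label text and return its key, or nil when nothing matches.
def label_to_key(text, label_map)
  pair = label_map.find do |_key, matcher|
    matcher.is_a?(Regexp) ? (text =~ matcher) : (text == matcher)
  end
  pair.nil? ? nil : pair[0]
end

label_map = { name: 'Name', price: /price/i }

label_to_key('Name', label_map)       # => :name
label_to_key('Unit Price', label_map) # => :price
label_to_key('Notes', label_map)      # => nil
```

Because `Hash#find` walks entries in insertion order, more specific patterns should be listed before broader ones when several could match the same label.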
2133
+
2134
+ </div>
2135
+
2136
+ </div>
2137
+
2138
+ <div id="footer">
2139
+ Generated on Wed Dec 4 23:05:44 2019 by
2140
+ <a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
2141
+ 0.9.20 (ruby-2.5.3).
2142
+ </div>
2143
+
2144
+ </div>
2145
+ </body>
2146
+ </html>