ae_easy-text 0.0.1
- checksums.yaml +7 -0
- data/.gitignore +12 -0
- data/.travis.yml +7 -0
- data/.yardopts +1 -0
- data/CODE_OF_CONDUCT.md +74 -0
- data/Gemfile +6 -0
- data/LICENSE +21 -0
- data/README.md +16 -0
- data/Rakefile +22 -0
- data/ae_easy-text.gemspec +49 -0
- data/doc/AeEasy.html +117 -0
- data/doc/AeEasy/Text.html +2024 -0
- data/doc/_index.html +122 -0
- data/doc/class_list.html +51 -0
- data/doc/css/common.css +1 -0
- data/doc/css/full_list.css +58 -0
- data/doc/css/style.css +496 -0
- data/doc/file.README.html +91 -0
- data/doc/file_list.html +56 -0
- data/doc/frames.html +17 -0
- data/doc/index.html +91 -0
- data/doc/js/app.js +292 -0
- data/doc/js/full_list.js +216 -0
- data/doc/js/jquery.js +4 -0
- data/doc/method_list.html +131 -0
- data/doc/top-level-namespace.html +110 -0
- data/lib/ae_easy/text.rb +283 -0
- data/lib/ae_easy/text/version.rb +6 -0
- metadata +186 -0
checksums.yaml
ADDED
@@ -0,0 +1,7 @@
|
|
1
|
+
---
|
2
|
+
SHA256:
|
3
|
+
metadata.gz: 239189344e783f67b085da7394e535aa693a4b067c62b8d0b16f733a0b19d4f7
|
4
|
+
data.tar.gz: ca144105f26e399116b05560ff870f6aa051a04696602f6f68f67f06b9e0bfda
|
5
|
+
SHA512:
|
6
|
+
metadata.gz: 0b7c4495eeb71e5dae3ad799d14f8a2d83989a949183ee3df2837191b4a4f3a10965ead38416ccda078da7b29fc083eb02fd53a24999f97473b02f77489d921c
|
7
|
+
data.tar.gz: 4f377b26bcfb0ef4cce7806d153fb97de115d0e0bc4beef5e43b55b0125e1936d6c4440d1131cbe9cb4d5fe28a1810c2820269f6c317511068c91aff42ad8126
|
data/.gitignore
ADDED
data/.travis.yml
ADDED
data/.yardopts
ADDED
@@ -0,0 +1 @@
|
|
1
|
+
--no-private
|
data/CODE_OF_CONDUCT.md
ADDED
@@ -0,0 +1,74 @@
|
|
1
|
+
# Contributor Covenant Code of Conduct
|
2
|
+
|
3
|
+
## Our Pledge
|
4
|
+
|
5
|
+
In the interest of fostering an open and welcoming environment, we as
|
6
|
+
contributors and maintainers pledge to making participation in our project and
|
7
|
+
our community a harassment-free experience for everyone, regardless of age, body
|
8
|
+
size, disability, ethnicity, gender identity and expression, level of experience,
|
9
|
+
nationality, personal appearance, race, religion, or sexual identity and
|
10
|
+
orientation.
|
11
|
+
|
12
|
+
## Our Standards
|
13
|
+
|
14
|
+
Examples of behavior that contributes to creating a positive environment
|
15
|
+
include:
|
16
|
+
|
17
|
+
* Using welcoming and inclusive language
|
18
|
+
* Being respectful of differing viewpoints and experiences
|
19
|
+
* Gracefully accepting constructive criticism
|
20
|
+
* Focusing on what is best for the community
|
21
|
+
* Showing empathy towards other community members
|
22
|
+
|
23
|
+
Examples of unacceptable behavior by participants include:
|
24
|
+
|
25
|
+
* The use of sexualized language or imagery and unwelcome sexual attention or
|
26
|
+
advances
|
27
|
+
* Trolling, insulting/derogatory comments, and personal or political attacks
|
28
|
+
* Public or private harassment
|
29
|
+
* Publishing others' private information, such as a physical or electronic
|
30
|
+
address, without explicit permission
|
31
|
+
* Other conduct which could reasonably be considered inappropriate in a
|
32
|
+
professional setting
|
33
|
+
|
34
|
+
## Our Responsibilities
|
35
|
+
|
36
|
+
Project maintainers are responsible for clarifying the standards of acceptable
|
37
|
+
behavior and are expected to take appropriate and fair corrective action in
|
38
|
+
response to any instances of unacceptable behavior.
|
39
|
+
|
40
|
+
Project maintainers have the right and responsibility to remove, edit, or
|
41
|
+
reject comments, commits, code, wiki edits, issues, and other contributions
|
42
|
+
that are not aligned to this Code of Conduct, or to ban temporarily or
|
43
|
+
permanently any contributor for other behaviors that they deem inappropriate,
|
44
|
+
threatening, offensive, or harmful.
|
45
|
+
|
46
|
+
## Scope
|
47
|
+
|
48
|
+
This Code of Conduct applies both within project spaces and in public spaces
|
49
|
+
when an individual is representing the project or its community. Examples of
|
50
|
+
representing a project or community include using an official project e-mail
|
51
|
+
address, posting via an official social media account, or acting as an appointed
|
52
|
+
representative at an online or offline event. Representation of a project may be
|
53
|
+
further defined and clarified by project maintainers.
|
54
|
+
|
55
|
+
## Enforcement
|
56
|
+
|
57
|
+
Instances of abusive, harassing, or otherwise unacceptable behavior may be
|
58
|
+
reported by contacting the project team at parama@answersengine.com. All
|
59
|
+
complaints will be reviewed and investigated and will result in a response that
|
60
|
+
is deemed necessary and appropriate to the circumstances. The project team is
|
61
|
+
obligated to maintain confidentiality with regard to the reporter of an incident.
|
62
|
+
Further details of specific enforcement policies may be posted separately.
|
63
|
+
|
64
|
+
Project maintainers who do not follow or enforce the Code of Conduct in good
|
65
|
+
faith may face temporary or permanent repercussions as determined by other
|
66
|
+
members of the project's leadership.
|
67
|
+
|
68
|
+
## Attribution
|
69
|
+
|
70
|
+
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
|
71
|
+
available at [http://contributor-covenant.org/version/1/4][version]
|
72
|
+
|
73
|
+
[homepage]: http://contributor-covenant.org
|
74
|
+
[version]: http://contributor-covenant.org/version/1/4/
|
data/Gemfile
ADDED
data/LICENSE
ADDED
@@ -0,0 +1,21 @@
|
|
1
|
+
MIT License
|
2
|
+
|
3
|
+
Copyright (c) 2019 AnswersEngine
|
4
|
+
|
5
|
+
Permission is hereby granted, free of charge, to any person obtaining a copy
|
6
|
+
of this software and associated documentation files (the "Software"), to deal
|
7
|
+
in the Software without restriction, including without limitation the rights
|
8
|
+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
|
9
|
+
copies of the Software, and to permit persons to whom the Software is
|
10
|
+
furnished to do so, subject to the following conditions:
|
11
|
+
|
12
|
+
The above copyright notice and this permission notice shall be included in all
|
13
|
+
copies or substantial portions of the Software.
|
14
|
+
|
15
|
+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
|
16
|
+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
|
17
|
+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
|
18
|
+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
|
19
|
+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
|
20
|
+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
|
21
|
+
SOFTWARE.
|
data/README.md
ADDED
@@ -0,0 +1,16 @@
|
|
1
|
+
[![Documentation](http://img.shields.io/badge/docs-rdoc.info-blue.svg)](http://rubydoc.org/gems/ae_easy-text/frames)
|
2
|
+
[![Gem Version](https://badge.fury.io/rb/ae_easy-text.svg)](http://github.com/answersengine/ae_easy-text/releases)
|
3
|
+
[![License](http://img.shields.io/badge/license-MIT-yellowgreen.svg)](#license)
|
4
|
+
|
5
|
+
# AeEasy text module
|
6
|
+
## Description
|
7
|
+
|
8
|
+
AeEasy text is part of the AeEasy gem collection. It provides multiple text parsing helpers to ease common text parsing use cases.
|
9
|
+
|
10
|
+
Install gem:
|
11
|
+
```gem install 'ae_easy-text'```
|
12
|
+
|
13
|
+
Require gem:
|
14
|
+
```require 'ae_easy-text'```
|
15
|
+
|
16
|
+
Documentation can be found [here](http://rubydoc.org/gems/ae_easy-text/frames).
|
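The README does not show the helpers in use. The following is a minimal sketch, not the gem itself: `TextSketch` is a hypothetical stand-in module that mirrors the three one-line helpers (`encode_html`, `decode_html`, `hash`) whose source is visible in the generated docs of this diff, built only on Ruby's standard `cgi` and `digest` libraries. With the gem installed, the same calls would presumably be made on `AeEasy::Text`.

```ruby
require 'cgi'
require 'digest'

# Hypothetical stand-in for AeEasy::Text (an assumption for illustration).
# The method bodies follow the source shown in the generated docs.
module TextSketch
  # Escape &, <, >, " and ' as HTML entities.
  def self.encode_html(text)
    CGI.escapeHTML(text)
  end

  # Turn HTML entities back into plain characters.
  def self.decode_html(text)
    CGI.unescapeHTML(text)
  end

  # SHA1 hex digest of the object's string form.
  # A Hash is first reduced to its Ruby-level Hash#hash integer.
  def self.hash(object)
    object = object.hash if object.is_a?(Hash)
    Digest::SHA1.hexdigest(object.to_s)
  end
end

encoded = TextSketch.encode_html('<b>Tom & Jerry</b>')
puts encoded                         # &lt;b&gt;Tom &amp; Jerry&lt;/b&gt;
puts TextSketch.decode_html(encoded) # <b>Tom & Jerry</b>
puts TextSketch.hash('abc')          # a9993e364706816aba3e25717850c26c9cd0d89d
```

One design point to keep in mind: Ruby seeds `Hash#hash` per process, so a digest computed from a Hash this way is likely to differ between runs, while a digest computed from a String is stable.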
data/Rakefile
ADDED
@@ -0,0 +1,22 @@
|
|
1
|
+
require 'benchmark'
|
2
|
+
require 'bundler/gem_tasks'
|
3
|
+
require 'rake/testtask'
|
4
|
+
|
5
|
+
Rake::TestTask.new do |t|
|
6
|
+
t.libs = ['lib', 'test']
|
7
|
+
t.warning = false
|
8
|
+
t.verbose = false
|
9
|
+
t.test_files = FileList['./test/**/*_test.rb']
|
10
|
+
end
|
11
|
+
|
12
|
+
desc 'Benchmark another task execution | usage example: benchmark[my_task, param1, param2]'
|
13
|
+
task :benchmark, [:task] do |task, args|
|
14
|
+
task_name = args[:task]
|
15
|
+
if task_name.nil?
|
16
|
+
puts "Should select a task."
|
17
|
+
exit 1
|
18
|
+
end
|
19
|
+
puts Benchmark.measure{ Rake::Task[task_name].invoke *args.extras }
|
20
|
+
end
|
21
|
+
|
22
|
+
task default: :test
|
data/ae_easy-text.gemspec
ADDED
@@ -0,0 +1,49 @@
|
|
1
|
+
|
2
|
+
lib = File.expand_path("../lib", __FILE__)
|
3
|
+
$LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
|
4
|
+
require "ae_easy/text/version"
|
5
|
+
|
6
|
+
Gem::Specification.new do |spec|
|
7
|
+
spec.name = "ae_easy-text"
|
8
|
+
spec.version = AeEasy::Text::VERSION
|
9
|
+
spec.authors = ["Eduardo Rosales"]
|
10
|
+
spec.email = ["eduardo@datahen.com"]
|
11
|
+
|
12
|
+
spec.summary = %q{AnswersEngine Easy toolkit text module}
|
13
|
+
spec.description = %q{AnswersEngine Easy toolkit text module contains multiple text parsing helpers.}
|
14
|
+
spec.homepage = "https://answersengine.com"
|
15
|
+
spec.license = "MIT"
|
16
|
+
|
17
|
+
# spec.cert_chain = ['certs/ae_easy.pem']
|
18
|
+
# spec.signing_key = File.expand_path("~/.ssh/gems/gem-private_ae_easy.pem") if $0 =~ /gem\z/
|
19
|
+
|
20
|
+
# Prevent pushing this gem to RubyGems.org. To allow pushes either set the 'allowed_push_host'
|
21
|
+
# to allow pushing to a single host or delete this section to allow pushing to any host.
|
22
|
+
if spec.respond_to?(:metadata)
|
23
|
+
# spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
|
24
|
+
|
25
|
+
spec.metadata["homepage_uri"] = spec.homepage
|
26
|
+
spec.metadata["source_code_uri"] = "https://github.com/answersengine/ae_easy-text"
|
27
|
+
# spec.metadata["changelog_uri"] = "TODO: Put your gem's CHANGELOG.md URL here."
|
28
|
+
else
|
29
|
+
raise "RubyGems 2.0 or newer is required to protect against " \
|
30
|
+
"public gem pushes."
|
31
|
+
end
|
32
|
+
|
33
|
+
# Specify which files should be added to the gem when it is released.
|
34
|
+
# The `git ls-files -z` loads the files in the RubyGem that have been added into git.
|
35
|
+
spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
|
36
|
+
`git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
|
37
|
+
end
|
38
|
+
spec.require_paths = ["lib"]
|
39
|
+
spec.required_ruby_version = '>= 2.2.2'
|
40
|
+
|
41
|
+
spec.add_dependency 'ae_easy-core', '>= 0'
|
42
|
+
spec.add_development_dependency 'bundler', '>= 1.16.3'
|
43
|
+
spec.add_development_dependency 'rake', '>= 10.0'
|
44
|
+
spec.add_development_dependency 'minitest', '>= 5.11'
|
45
|
+
spec.add_development_dependency 'simplecov', '>= 0.16.1'
|
46
|
+
spec.add_development_dependency 'simplecov-console', '>= 0.4.2'
|
47
|
+
spec.add_development_dependency 'timecop', '>= 0.9.1'
|
48
|
+
spec.add_development_dependency 'byebug', '>= 0'
|
49
|
+
end
|
data/doc/AeEasy.html
ADDED
@@ -0,0 +1,117 @@
|
|
1
|
+
<!DOCTYPE html>
|
2
|
+
<html>
|
3
|
+
<head>
|
4
|
+
<meta charset="utf-8">
|
5
|
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
6
|
+
<title>
|
7
|
+
Module: AeEasy
|
8
|
+
|
9
|
+
— Documentation by YARD 0.9.18
|
10
|
+
|
11
|
+
</title>
|
12
|
+
|
13
|
+
<link rel="stylesheet" href="css/style.css" type="text/css" charset="utf-8" />
|
14
|
+
|
15
|
+
<link rel="stylesheet" href="css/common.css" type="text/css" charset="utf-8" />
|
16
|
+
|
17
|
+
<script type="text/javascript" charset="utf-8">
|
18
|
+
pathId = "AeEasy";
|
19
|
+
relpath = '';
|
20
|
+
</script>
|
21
|
+
|
22
|
+
|
23
|
+
<script type="text/javascript" charset="utf-8" src="js/jquery.js"></script>
|
24
|
+
|
25
|
+
<script type="text/javascript" charset="utf-8" src="js/app.js"></script>
|
26
|
+
|
27
|
+
|
28
|
+
</head>
|
29
|
+
<body>
|
30
|
+
<div class="nav_wrap">
|
31
|
+
<iframe id="nav" src="class_list.html?1"></iframe>
|
32
|
+
<div id="resizer"></div>
|
33
|
+
</div>
|
34
|
+
|
35
|
+
<div id="main" tabindex="-1">
|
36
|
+
<div id="header">
|
37
|
+
<div id="menu">
|
38
|
+
|
39
|
+
<a href="_index.html">Index (A)</a> »
|
40
|
+
|
41
|
+
|
42
|
+
<span class="title">AeEasy</span>
|
43
|
+
|
44
|
+
</div>
|
45
|
+
|
46
|
+
<div id="search">
|
47
|
+
|
48
|
+
<a class="full_list_link" id="class_list_link"
|
49
|
+
href="class_list.html">
|
50
|
+
|
51
|
+
<svg width="24" height="24">
|
52
|
+
<rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
|
53
|
+
<rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
|
54
|
+
<rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
|
55
|
+
</svg>
|
56
|
+
</a>
|
57
|
+
|
58
|
+
</div>
|
59
|
+
<div class="clear"></div>
|
60
|
+
</div>
|
61
|
+
|
62
|
+
<div id="content"><h1>Module: AeEasy
|
63
|
+
|
64
|
+
|
65
|
+
|
66
|
+
</h1>
|
67
|
+
<div class="box_info">
|
68
|
+
|
69
|
+
|
70
|
+
|
71
|
+
|
72
|
+
|
73
|
+
|
74
|
+
|
75
|
+
|
76
|
+
|
77
|
+
|
78
|
+
|
79
|
+
<dl>
|
80
|
+
<dt>Defined in:</dt>
|
81
|
+
<dd>lib/ae_easy/text.rb<span class="defines">,<br />
|
82
|
+
lib/ae_easy/text/version.rb</span>
|
83
|
+
</dd>
|
84
|
+
</dl>
|
85
|
+
|
86
|
+
</div>
|
87
|
+
|
88
|
+
<h2>Defined Under Namespace</h2>
|
89
|
+
<p class="children">
|
90
|
+
|
91
|
+
|
92
|
+
<strong class="modules">Modules:</strong> <span class='object_link'><a href="AeEasy/Text.html" title="AeEasy::Text (module)">Text</a></span>
|
93
|
+
|
94
|
+
|
95
|
+
|
96
|
+
|
97
|
+
</p>
|
98
|
+
|
99
|
+
|
100
|
+
|
101
|
+
|
102
|
+
|
103
|
+
|
104
|
+
|
105
|
+
|
106
|
+
|
107
|
+
</div>
|
108
|
+
|
109
|
+
<div id="footer">
|
110
|
+
Generated on Tue Feb 26 16:50:02 2019 by
|
111
|
+
<a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
|
112
|
+
0.9.18 (ruby-2.5.3).
|
113
|
+
</div>
|
114
|
+
|
115
|
+
</div>
|
116
|
+
</body>
|
117
|
+
</html>
|
data/doc/AeEasy/Text.html
ADDED
@@ -0,0 +1,2024 @@
|
|
1
|
+
<!DOCTYPE html>
|
2
|
+
<html>
|
3
|
+
<head>
|
4
|
+
<meta charset="utf-8">
|
5
|
+
<meta name="viewport" content="width=device-width, initial-scale=1.0">
|
6
|
+
<title>
|
7
|
+
Module: AeEasy::Text
|
8
|
+
|
9
|
+
— Documentation by YARD 0.9.18
|
10
|
+
|
11
|
+
</title>
|
12
|
+
|
13
|
+
<link rel="stylesheet" href="../css/style.css" type="text/css" charset="utf-8" />
|
14
|
+
|
15
|
+
<link rel="stylesheet" href="../css/common.css" type="text/css" charset="utf-8" />
|
16
|
+
|
17
|
+
<script type="text/javascript" charset="utf-8">
|
18
|
+
pathId = "AeEasy::Text";
|
19
|
+
relpath = '../';
|
20
|
+
</script>
|
21
|
+
|
22
|
+
|
23
|
+
<script type="text/javascript" charset="utf-8" src="../js/jquery.js"></script>
|
24
|
+
|
25
|
+
<script type="text/javascript" charset="utf-8" src="../js/app.js"></script>
|
26
|
+
|
27
|
+
|
28
|
+
</head>
|
29
|
+
<body>
|
30
|
+
<div class="nav_wrap">
|
31
|
+
<iframe id="nav" src="../class_list.html?1"></iframe>
|
32
|
+
<div id="resizer"></div>
|
33
|
+
</div>
|
34
|
+
|
35
|
+
<div id="main" tabindex="-1">
|
36
|
+
<div id="header">
|
37
|
+
<div id="menu">
|
38
|
+
|
39
|
+
<a href="../_index.html">Index (T)</a> »
|
40
|
+
<span class='title'><span class='object_link'><a href="../AeEasy.html" title="AeEasy (module)">AeEasy</a></span></span>
|
41
|
+
»
|
42
|
+
<span class="title">Text</span>
|
43
|
+
|
44
|
+
</div>
|
45
|
+
|
46
|
+
<div id="search">
|
47
|
+
|
48
|
+
<a class="full_list_link" id="class_list_link"
|
49
|
+
href="../class_list.html">
|
50
|
+
|
51
|
+
<svg width="24" height="24">
|
52
|
+
<rect x="0" y="4" width="24" height="4" rx="1" ry="1"></rect>
|
53
|
+
<rect x="0" y="12" width="24" height="4" rx="1" ry="1"></rect>
|
54
|
+
<rect x="0" y="20" width="24" height="4" rx="1" ry="1"></rect>
|
55
|
+
</svg>
|
56
|
+
</a>
|
57
|
+
|
58
|
+
</div>
|
59
|
+
<div class="clear"></div>
|
60
|
+
</div>
|
61
|
+
|
62
|
+
<div id="content"><h1>Module: AeEasy::Text
|
63
|
+
|
64
|
+
|
65
|
+
|
66
|
+
</h1>
|
67
|
+
<div class="box_info">
|
68
|
+
|
69
|
+
|
70
|
+
|
71
|
+
|
72
|
+
|
73
|
+
|
74
|
+
|
75
|
+
|
76
|
+
|
77
|
+
|
78
|
+
|
79
|
+
<dl>
|
80
|
+
<dt>Defined in:</dt>
|
81
|
+
<dd>lib/ae_easy/text.rb<span class="defines">,<br />
|
82
|
+
lib/ae_easy/text/version.rb</span>
|
83
|
+
</dd>
|
84
|
+
</dl>
|
85
|
+
|
86
|
+
</div>
|
87
|
+
|
88
|
+
|
89
|
+
|
90
|
+
<h2>
|
91
|
+
Constant Summary
|
92
|
+
<small><a href="#" class="constants_summary_toggle">collapse</a></small>
|
93
|
+
</h2>
|
94
|
+
|
95
|
+
<dl class="constants">
|
96
|
+
|
97
|
+
<dt id="VERSION-constant" class="">VERSION =
|
98
|
+
<div class="docstring">
|
99
|
+
<div class="discussion">
|
100
|
+
|
101
|
+
<p>Gem version</p>
|
102
|
+
|
103
|
+
|
104
|
+
</div>
|
105
|
+
</div>
|
106
|
+
<div class="tags">
|
107
|
+
|
108
|
+
|
109
|
+
</div>
|
110
|
+
</dt>
|
111
|
+
<dd><pre class="code"><span class='tstring'><span class='tstring_beg'>"</span><span class='tstring_content'>0.0.1</span><span class='tstring_end'>"</span></span></pre></dd>
|
112
|
+
|
113
|
+
</dl>
|
114
|
+
|
115
|
+
|
116
|
+
|
117
|
+
|
118
|
+
|
119
|
+
|
120
|
+
|
121
|
+
|
122
|
+
|
123
|
+
<h2>
|
124
|
+
Class Method Summary
|
125
|
+
<small><a href="#" class="summary_toggle">collapse</a></small>
|
126
|
+
</h2>
|
127
|
+
|
128
|
+
<ul class="summary">
|
129
|
+
|
130
|
+
<li class="public ">
|
131
|
+
<span class="summary_signature">
|
132
|
+
|
133
|
+
<a href="#decode_html-class_method" title="decode_html (class method)">.<strong>decode_html</strong>(text) ⇒ String </a>
|
134
|
+
|
135
|
+
|
136
|
+
|
137
|
+
</span>
|
138
|
+
|
139
|
+
|
140
|
+
|
141
|
+
|
142
|
+
|
143
|
+
|
144
|
+
|
145
|
+
|
146
|
+
|
147
|
+
<span class="summary_desc"><div class='inline'>
|
148
|
+
<p>Decode HTML entities from text.</p>
|
149
|
+
</div></span>
|
150
|
+
|
151
|
+
</li>
|
152
|
+
|
153
|
+
|
154
|
+
<li class="public ">
|
155
|
+
<span class="summary_signature">
|
156
|
+
|
157
|
+
<a href="#default_parser-class_method" title="default_parser (class method)">.<strong>default_parser</strong>(cell_element, data, key) ⇒ Object </a>
|
158
|
+
|
159
|
+
|
160
|
+
|
161
|
+
</span>
|
162
|
+
|
163
|
+
|
164
|
+
|
165
|
+
|
166
|
+
|
167
|
+
|
168
|
+
|
169
|
+
|
170
|
+
|
171
|
+
<span class="summary_desc"><div class='inline'>
|
172
|
+
<p>Default cell content parser used to parse cell element.</p>
|
173
|
+
</div></span>
|
174
|
+
|
175
|
+
</li>
|
176
|
+
|
177
|
+
|
178
|
+
<li class="public ">
|
179
|
+
<span class="summary_signature">
|
180
|
+
|
181
|
+
<a href="#encode_html-class_method" title="encode_html (class method)">.<strong>encode_html</strong>(text) ⇒ String </a>
|
182
|
+
|
183
|
+
|
184
|
+
|
185
|
+
</span>
|
186
|
+
|
187
|
+
|
188
|
+
|
189
|
+
|
190
|
+
|
191
|
+
|
192
|
+
|
193
|
+
|
194
|
+
|
195
|
+
<span class="summary_desc"><div class='inline'>
|
196
|
+
<p>Encode text for valid HTML entities.</p>
|
197
|
+
</div></span>
|
198
|
+
|
199
|
+
</li>
|
200
|
+
|
201
|
+
|
202
|
+
<li class="public ">
|
203
|
+
<span class="summary_signature">
|
204
|
+
|
205
|
+
<a href="#hash-class_method" title="hash (class method)">.<strong>hash</strong>(object) ⇒ String </a>
|
206
|
+
|
207
|
+
|
208
|
+
|
209
|
+
</span>
|
210
|
+
|
211
|
+
|
212
|
+
|
213
|
+
|
214
|
+
|
215
|
+
|
216
|
+
|
217
|
+
|
218
|
+
|
219
|
+
<span class="summary_desc"><div class='inline'>
|
220
|
+
<p>Create a hash from object.</p>
|
221
|
+
</div></span>
|
222
|
+
|
223
|
+
</li>
|
224
|
+
|
225
|
+
|
226
|
+
<li class="public ">
|
227
|
+
<span class="summary_signature">
|
228
|
+
|
229
|
+
<a href="#parse_content-class_method" title="parse_content (class method)">.<strong>parse_content</strong>(opts) {|data, row, header_map| ... } ⇒ Array<Hash><sup>?</sup> </a>
|
230
|
+
|
231
|
+
|
232
|
+
|
233
|
+
</span>
|
234
|
+
|
235
|
+
|
236
|
+
|
237
|
+
|
238
|
+
|
239
|
+
|
240
|
+
|
241
|
+
|
242
|
+
|
243
|
+
<span class="summary_desc"><div class='inline'>
|
244
|
+
<p>Parse row data matching a selector using a header map to translate
|
245
|
+
between columns and friendly keys.</p>
|
246
|
+
</div></span>
|
247
|
+
|
248
|
+
</li>
|
249
|
+
|
250
|
+
|
251
|
+
<li class="public ">
|
252
|
+
<span class="summary_signature">
|
253
|
+
|
254
|
+
<a href="#parse_header_map-class_method" title="parse_header_map (class method)">.<strong>parse_header_map</strong>(opts = {}) ⇒ Hash{Symbol,String => Integer}<sup>?</sup> </a>
|
255
|
+
|
256
|
+
|
257
|
+
|
258
|
+
</span>
|
259
|
+
|
260
|
+
|
261
|
+
|
262
|
+
|
263
|
+
|
264
|
+
|
265
|
+
|
266
|
+
|
267
|
+
|
268
|
+
<span class="summary_desc"><div class='inline'>
|
269
|
+
<p>Parse header from selector and create a header map to match a column key
|
270
|
+
with column index.</p>
|
271
|
+
</div></span>
|
272
|
+
|
273
|
+
</li>
|
274
|
+
|
275
|
+
|
276
|
+
<li class="public ">
|
277
|
+
<span class="summary_signature">
|
278
|
+
|
279
|
+
<a href="#parse_table-class_method" title="parse_table (class method)">.<strong>parse_table</strong>(opts = {}) {|data, row, header_map| ... } ⇒ Hash{Symbol => Array,Hash,nil} </a>
|
280
|
+
|
281
|
+
|
282
|
+
|
283
|
+
</span>
|
284
|
+
|
285
|
+
|
286
|
+
|
287
|
+
|
288
|
+
|
289
|
+
|
290
|
+
|
291
|
+
|
292
|
+
|
293
|
+
<span class="summary_desc"><div class='inline'>
|
294
|
+
<p>Parse data from a horizontal table-like structure matching selectors and
|
295
|
+
using a header map to match columns.</p>
|
296
|
+
</div></span>
|
297
|
+
|
298
|
+
</li>
|
299
|
+
|
300
|
+
|
301
|
+
<li class="public ">
|
302
|
+
<span class="summary_signature">
|
303
|
+
|
304
|
+
<a href="#parse_vertical_table-class_method" title="parse_vertical_table (class method)">.<strong>parse_vertical_table</strong>(opts = {}) {|data, row, header_map| ... } ⇒ Hash{Symbol => Array,Hash,nil} </a>
|
305
|
+
|
306
|
+
|
307
|
+
|
308
|
+
</span>
|
309
|
+
|
310
|
+
|
311
|
+
|
312
|
+
|
313
|
+
|
314
|
+
|
315
|
+
|
316
|
+
|
317
|
+
|
318
|
+
<span class="summary_desc"><div class='inline'>
|
319
|
+
<p>Parse data from a vertical table-like structure matching selectors and
|
320
|
+
using a header map to match columns.</p>
|
321
|
+
</div></span>
|
322
|
+
|
323
|
+
</li>
|
324
|
+
|
325
|
+
|
326
|
+
<li class="public ">
|
327
|
+
<span class="summary_signature">
|
328
|
+
|
329
|
+
<a href="#strip-class_method" title="strip (class method)">.<strong>strip</strong>(raw_text) ⇒ String<sup>?</sup> </a>
|
330
|
+
|
331
|
+
|
332
|
+
|
333
|
+
</span>
|
334
|
+
|
335
|
+
|
336
|
+
|
337
|
+
|
338
|
+
|
339
|
+
|
340
|
+
|
341
|
+
|
342
|
+
|
343
|
+
<span class="summary_desc"><div class='inline'>
|
344
|
+
<p>Strip a value.</p>
|
345
|
+
</div></span>
|
346
|
+
|
347
|
+
</li>
|
348
|
+
|
349
|
+
|
350
|
+
<li class="public ">
|
351
|
+
<span class="summary_signature">
|
352
|
+
|
353
|
+
<a href="#translate_label_to_key-class_method" title="translate_label_to_key (class method)">.<strong>translate_label_to_key</strong>(element, label_map) ⇒ Symbol, String </a>
|
354
|
+
|
355
|
+
|
356
|
+
|
357
|
+
</span>
|
358
|
+
|
359
|
+
|
360
|
+
|
361
|
+
|
362
|
+
|
363
|
+
|
364
|
+
|
365
|
+
|
366
|
+
|
367
|
+
<span class="summary_desc"><div class='inline'>
|
368
|
+
<p>Extract column label and translate it into a friendly key.</p>
|
369
|
+
</div></span>
|
370
|
+
|
371
|
+
</li>
|
372
|
+
|
373
|
+
|
374
|
+
</ul>
|
375
|
+
|
376
|
+
|
377
|
+
|
378
|
+
|
379
|
+
<div id="class_method_details" class="method_details_list">
|
380
|
+
<h2>Class Method Details</h2>
|
381
|
+
|
382
|
+
|
383
|
+
<div class="method_details first">
|
384
|
+
<h3 class="signature first" id="decode_html-class_method">
|
385
|
+
|
386
|
+
.<strong>decode_html</strong>(text) ⇒ <tt>String</tt>
|
387
|
+
|
388
|
+
|
389
|
+
|
390
|
+
|
391
|
+
|
392
|
+
</h3><div class="docstring">
|
393
|
+
<div class="discussion">
|
394
|
+
|
395
|
+
<p>Decode HTML entities from text.</p>
|
396
|
+
|
397
|
+
|
398
|
+
</div>
|
399
|
+
</div>
|
400
|
+
<div class="tags">
|
401
|
+
<p class="tag_title">Parameters:</p>
|
402
|
+
<ul class="param">
|
403
|
+
|
404
|
+
<li>
|
405
|
+
|
406
|
+
<span class='name'>text</span>
|
407
|
+
|
408
|
+
|
409
|
+
<span class='type'>(<tt>String</tt>)</span>
|
410
|
+
|
411
|
+
|
412
|
+
|
413
|
+
—
|
414
|
+
<div class='inline'>
|
415
|
+
<p>Text to decode.</p>
|
416
|
+
</div>
|
417
|
+
|
418
|
+
</li>
|
419
|
+
|
420
|
+
</ul>
|
421
|
+
|
422
|
+
<p class="tag_title">Returns:</p>
|
423
|
+
<ul class="return">
|
424
|
+
|
425
|
+
<li>
|
426
|
+
|
427
|
+
|
428
|
+
<span class='type'>(<tt>String</tt>)</span>
|
429
|
+
|
430
|
+
|
431
|
+
|
432
|
+
</li>
|
433
|
+
|
434
|
+
</ul>
|
435
|
+
|
436
|
+
</div><table class="source_code">
|
437
|
+
<tr>
|
438
|
+
<td>
|
439
|
+
<pre class="lines">
|
440
|
+
|
441
|
+
|
442
|
+
33
|
443
|
+
34
|
444
|
+
35</pre>
|
445
|
+
</td>
|
446
|
+
<td>
|
447
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 33</span>
|
448
|
+
|
449
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_decode_html'>decode_html</span> <span class='id identifier rubyid_text'>text</span>
|
450
|
+
<span class='const'>CGI</span><span class='period'>.</span><span class='id identifier rubyid_unescapeHTML'>unescapeHTML</span> <span class='id identifier rubyid_text'>text</span>
|
451
|
+
<span class='kw'>end</span></pre>
|
452
|
+
</td>
|
453
|
+
</tr>
|
454
|
+
</table>
|
455
|
+
</div>
|
456
|
+
|
457
|
+
<div class="method_details ">
|
458
|
+
<h3 class="signature " id="default_parser-class_method">
|
459
|
+
|
460
|
+
.<strong>default_parser</strong>(cell_element, data, key) ⇒ <tt>Object</tt>
|
461
|
+
|
462
|
+
|
463
|
+
|
464
|
+
|
465
|
+
|
466
|
+
</h3><div class="docstring">
|
467
|
+
<div class="discussion">
|
468
|
+
|
469
|
+
<p>Default cell content parser used to parse cell element.</p>
|
470
|
+
|
471
|
+
|
472
|
+
</div>
|
473
|
+
</div>
|
474
|
+
<div class="tags">
|
475
|
+
<p class="tag_title">Parameters:</p>
|
476
|
+
<ul class="param">
|
477
|
+
|
478
|
+
<li>
|
479
|
+
|
480
|
+
<span class='name'>cell_element</span>
|
481
|
+
|
482
|
+
|
483
|
+
<span class='type'>(<tt>Nokogiri::Element</tt>)</span>
|
484
|
+
|
485
|
+
|
486
|
+
|
487
|
+
—
|
488
|
+
<div class='inline'>
|
489
|
+
<p>Cell element to parse.</p>
|
490
|
+
</div>
|
491
|
+
|
492
|
+
</li>
|
493
|
+
|
494
|
+
<li>
|
495
|
+
|
496
|
+
<span class='name'>data</span>
|
497
|
+
|
498
|
+
|
499
|
+
<span class='type'>(<tt>Hash</tt>)</span>
|
500
|
+
|
501
|
+
|
502
|
+
|
503
|
+
—
|
504
|
+
<div class='inline'>
|
505
|
+
<p>Data hash to save parsed data into.</p>
|
506
|
+
</div>
|
507
|
+
|
508
|
+
</li>
|
509
|
+
|
510
|
+
<li>
|
511
|
+
|
512
|
+
<span class='name'>key</span>
|
513
|
+
|
514
|
+
|
515
|
+
<span class='type'>(<tt>String</tt>, <tt>Symbol</tt>)</span>
|
516
|
+
|
517
|
+
|
518
|
+
|
519
|
+
—
|
520
|
+
<div class='inline'>
|
521
|
+
<p>Header column key being parsed.</p>
|
522
|
+
</div>
|
523
|
+
|
524
|
+
</li>
|
525
|
+
|
526
|
+
</ul>
|
527
|
+
|
528
|
+
|
529
|
+
</div><table class="source_code">
|
530
|
+
<tr>
|
531
|
+
<td>
|
532
|
+
<pre class="lines">
|
533
|
+
|
534
|
+
|
535
|
+
60
|
536
|
+
61
|
537
|
+
62
|
538
|
+
63</pre>
|
539
|
+
</td>
|
540
|
+
<td>
|
541
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 60</span>
|
542
|
+
|
543
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_default_parser'>default_parser</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span>
|
544
|
+
<span class='id identifier rubyid_cell_element'>cell_element</span><span class='op'>&.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>'</span><span class='tstring_content'>//i</span><span class='tstring_end'>'</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_remove'>remove</span>
|
545
|
+
<span class='id identifier rubyid_data'>data</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span> <span class='op'>=</span> <span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_cell_element'>cell_element</span><span class='op'>&.</span><span class='id identifier rubyid_text'>text</span>
|
546
|
+
<span class='kw'>end</span></pre>
|
547
|
+
</td>
|
548
|
+
</tr>
|
549
|
+
</table>
|
550
|
+
</div>
|
551
|
+
|
552
|
+
<div class="method_details ">
|
553
|
+
<h3 class="signature " id="encode_html-class_method">
|
554
|
+
|
555
|
+
.<strong>encode_html</strong>(text) ⇒ <tt>String</tt>
|
556
|
+
|
557
|
+
|
558
|
+
|
559
|
+
|
560
|
+
|
561
|
+
</h3><div class="docstring">
|
562
|
+
<div class="discussion">
|
563
|
+
|
564
|
+
<p>Encode text for valid HTML entities.</p>
|
565
|
+
|
566
|
+
|
567
|
+
</div>
|
568
|
+
</div>
|
569
|
+
<div class="tags">
|
570
|
+
<p class="tag_title">Parameters:</p>
|
571
|
+
<ul class="param">
|
572
|
+
|
573
|
+
<li>
|
574
|
+
|
575
|
+
<span class='name'>text</span>
|
576
|
+
|
577
|
+
|
578
|
+
<span class='type'>(<tt>String</tt>)</span>
|
579
|
+
|
580
|
+
|
581
|
+
|
582
|
+
—
|
583
|
+
<div class='inline'>
|
584
|
+
<p>Text to encode.</p>
|
585
|
+
</div>
|
586
|
+
|
587
|
+
</li>
|
588
|
+
|
589
|
+
</ul>
|
590
|
+
|
591
|
+
<p class="tag_title">Returns:</p>
|
592
|
+
<ul class="return">
|
593
|
+
|
594
|
+
<li>
|
595
|
+
|
596
|
+
|
597
|
+
<span class='type'>(<tt>String</tt>)</span>
|
598
|
+
|
599
|
+
|
600
|
+
|
601
|
+
</li>
|
602
|
+
|
603
|
+
</ul>
|
604
|
+
|
605
|
+
</div><table class="source_code">
|
606
|
+
<tr>
|
607
|
+
<td>
|
608
|
+
<pre class="lines">
|
609
|
+
|
610
|
+
|
611
|
+
24
|
612
|
+
25
|
613
|
+
26</pre>
|
614
|
+
</td>
|
615
|
+
<td>
|
616
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 24</span>
|
617
|
+
|
618
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_encode_html'>encode_html</span> <span class='id identifier rubyid_text'>text</span>
|
619
|
+
<span class='const'>CGI</span><span class='period'>.</span><span class='id identifier rubyid_escapeHTML'>escapeHTML</span> <span class='id identifier rubyid_text'>text</span>
|
620
|
+
<span class='kw'>end</span></pre>
|
621
|
+
</td>
|
622
|
+
</tr>
|
623
|
+
</table>
|
624
|
+
</div>
|
625
|
+
|
626
|
+
<div class="method_details ">
|
627
|
+
<h3 class="signature " id="hash-class_method">
|
628
|
+
|
629
|
+
.<strong>hash</strong>(object) ⇒ <tt>String</tt>
|
630
|
+
|
631
|
+
|
632
|
+
|
633
|
+
|
634
|
+
|
635
|
+
</h3><div class="docstring">
|
636
|
+
<div class="discussion">
|
637
|
+
|
638
|
+
<p>Create a hash from object</p>
|
639
|
+
|
640
|
+
|
641
|
+
</div>
|
642
|
+
</div>
|
643
|
+
<div class="tags">
|
644
|
+
<p class="tag_title">Parameters:</p>
|
645
|
+
<ul class="param">
|
646
|
+
|
647
|
+
<li>
|
648
|
+
|
649
|
+
<span class='name'>object</span>
|
650
|
+
|
651
|
+
|
652
|
+
<span class='type'>(<tt>String</tt>, <tt>Hash</tt>, <tt>Object</tt>)</span>
|
653
|
+
|
654
|
+
|
655
|
+
|
656
|
+
—
|
657
|
+
<div class='inline'>
|
658
|
+
<p>Object to create hash from.</p>
|
659
|
+
</div>
|
660
|
+
|
661
|
+
</li>
|
662
|
+
|
663
|
+
</ul>
|
664
|
+
|
665
|
+
<p class="tag_title">Returns:</p>
|
666
|
+
<ul class="return">
|
667
|
+
|
668
|
+
<li>
|
669
|
+
|
670
|
+
|
671
|
+
<span class='type'>(<tt>String</tt>)</span>
|
672
|
+
|
673
|
+
|
674
|
+
|
675
|
+
</li>
|
676
|
+
|
677
|
+
</ul>
|
678
|
+
|
679
|
+
</div><table class="source_code">
|
680
|
+
<tr>
|
681
|
+
<td>
|
682
|
+
<pre class="lines">
|
683
|
+
|
684
|
+
|
685
|
+
14
|
686
|
+
15
|
687
|
+
16
|
688
|
+
17</pre>
|
689
|
+
</td>
|
690
|
+
<td>
|
691
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 14</span>
|
692
|
+
|
693
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_hash'>hash</span> <span class='id identifier rubyid_object'>object</span>
|
694
|
+
<span class='id identifier rubyid_object'>object</span> <span class='op'>=</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_hash'>hash</span> <span class='kw'>if</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span> <span class='const'>Hash</span>
|
695
|
+
<span class='const'>Digest</span><span class='op'>::</span><span class='const'>SHA1</span><span class='period'>.</span><span class='id identifier rubyid_hexdigest'>hexdigest</span> <span class='id identifier rubyid_object'>object</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span>
|
696
|
+
<span class='kw'>end</span></pre>
|
697
|
+
</td>
|
698
|
+
</tr>
|
699
|
+
</table>
|
700
|
+
</div>
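The listing above can be sketched with the standard library alone. The method name `text_hash` is a hypothetical stand-in for the gem's module method; the body follows the two lines shown in the source. One point worth hedging: for Hash input the digest is built from Ruby's `Hash#hash`, whose value is typically seeded per process, so such digests are probably only comparable within a single run.

```ruby
require 'digest'

# Stand-in mirroring the listing above (name is illustrative, not the gem's API).
def text_hash(object)
  # Hashes are reduced to their Ruby hash code before digesting.
  object = object.hash if object.is_a?(Hash)
  Digest::SHA1.hexdigest(object.to_s)
end

digest = text_hash('hello')
puts digest # a 40-character hexadecimal string

# Equal hashes give equal digests within the same process.
same_in_process = text_hash({ a: 1 }) == text_hash({ a: 1 })
puts same_in_process
```

If a digest has to stay stable across runs, passing a String (for example a serialized form of the data) avoids the per-process hash code.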
|
701
|
+
|
702
|
+
<div class="method_details ">
|
703
|
+
<h3 class="signature " id="parse_content-class_method">
|
704
|
+
|
705
|
+
.<strong>parse_content</strong>(opts) {|data, row, header_map| ... } ⇒ <tt>Array&lt;Hash&gt;</tt><sup>?</sup>
|
706
|
+
|
707
|
+
|
708
|
+
|
709
|
+
|
710
|
+
|
711
|
+
</h3><div class="docstring">
|
712
|
+
<div class="discussion">
|
713
|
+
|
714
|
+
<p>Parse row data matching a selector, using a header map to translate between columns and friendly keys.</p>
|
715
|
+
|
716
|
+
|
717
|
+
|
718
|
+
|
719
|
+
|
720
|
+
</div>
|
721
|
+
</div>
|
722
|
+
<div class="tags">
|
723
|
+
<p class="tag_title">Parameters:</p>
|
724
|
+
<ul class="param">
|
725
|
+
|
726
|
+
<li>
|
727
|
+
|
728
|
+
<span class='name'>opts</span>
|
729
|
+
|
730
|
+
|
731
|
+
<span class='type'>(<tt>Hash</tt>)</span>
|
732
|
+
|
733
|
+
|
734
|
+
|
735
|
+
—
|
736
|
+
<div class='inline'>
|
737
|
+
<p>Configuration options.</p>
|
738
|
+
</div>
|
739
|
+
|
740
|
+
</li>
|
741
|
+
|
742
|
+
</ul>
|
743
|
+
|
744
|
+
|
745
|
+
|
746
|
+
|
747
|
+
<p class="tag_title">Options Hash (<tt>opts</tt>):</p>
|
748
|
+
<ul class="option">
|
749
|
+
|
750
|
+
<li>
|
751
|
+
<span class="name">:html</span>
|
752
|
+
<span class="type">(<tt>Nokogiri::XML::Element</tt>)</span>
|
753
|
+
<span class="default">
|
754
|
+
|
755
|
+
</span>
|
756
|
+
|
757
|
+
— <div class='inline'>
|
758
|
+
<p>Container element to search within.</p>
|
759
|
+
</div>
|
760
|
+
|
761
|
+
</li>
|
762
|
+
|
763
|
+
<li>
|
764
|
+
<span class="name">:selector</span>
|
765
|
+
<span class="type">(<tt>String</tt>)</span>
|
766
|
+
<span class="default">
|
767
|
+
|
768
|
+
</span>
|
769
|
+
|
770
|
+
— <div class='inline'>
|
771
|
+
<p>CSS selector to match content cells.</p>
|
772
|
+
</div>
|
773
|
+
|
774
|
+
</li>
|
775
|
+
|
776
|
+
<li>
|
777
|
+
<span class="name">:first_row_header</span>
|
778
|
+
<span class="type">(<tt>Boolean</tt>)</span>
|
779
|
+
<span class="default">
|
780
|
+
|
781
|
+
— default:
|
782
|
+
<tt>false</tt>
|
783
|
+
|
784
|
+
</span>
|
785
|
+
|
786
|
+
— <div class='inline'>
|
787
|
+
<p>If true, the first matching element is assumed to be the header and
|
788
|
+
is ignored.</p>
|
789
|
+
</div>
|
790
|
+
|
791
|
+
</li>
|
792
|
+
|
793
|
+
<li>
|
794
|
+
<span class="name">:header_map</span>
|
795
|
+
<span class="type">(<tt>Hash{Symbol,String => Integer}</tt>)</span>
|
796
|
+
<span class="default">
|
797
|
+
|
798
|
+
</span>
|
799
|
+
|
800
|
+
— <div class='inline'>
|
801
|
+
<p>Dictionary mapping each header key to its column index.</p>
|
802
|
+
</div>
|
803
|
+
|
804
|
+
</li>
|
805
|
+
|
806
|
+
<li>
|
807
|
+
<span class="name">:column_parsers</span>
|
808
|
+
<span class="type">(<tt>Hash{Symbol,String => lambda,proc}</tt>)</span>
|
809
|
+
<span class="default">
|
810
|
+
|
811
|
+
— default:
|
812
|
+
<tt>{}</tt>
|
813
|
+
|
814
|
+
</span>
|
815
|
+
|
816
|
+
— <div class='inline'>
|
817
|
+
<p>Custom column parsers for advanced data extraction.</p>
|
818
|
+
</div>
|
819
|
+
|
820
|
+
</li>
|
821
|
+
|
822
|
+
</ul>
|
823
|
+
|
824
|
+
|
825
|
+
<p class="tag_title">Yield Parameters:</p>
|
826
|
+
<ul class="yieldparam">
|
827
|
+
|
828
|
+
<li>
|
829
|
+
|
830
|
+
<span class='name'>data</span>
|
831
|
+
|
832
|
+
|
833
|
+
<span class='type'>(<tt>Hash{Symbol,String => Object}</tt>)</span>
|
834
|
+
|
835
|
+
|
836
|
+
|
837
|
+
—
|
838
|
+
<div class='inline'>
|
839
|
+
<p>Parsed row data.</p>
|
840
|
+
</div>
|
841
|
+
|
842
|
+
</li>
|
843
|
+
|
844
|
+
<li>
|
845
|
+
|
846
|
+
<span class='name'>row</span>
|
847
|
+
|
848
|
+
|
849
|
+
<span class='type'>(<tt>Array</tt>)</span>
|
850
|
+
|
851
|
+
|
852
|
+
|
853
|
+
—
|
854
|
+
<div class='inline'>
|
855
|
+
<p>Raw row data.</p>
|
856
|
+
</div>
|
857
|
+
|
858
|
+
</li>
|
859
|
+
|
860
|
+
<li>
|
861
|
+
|
862
|
+
<span class='name'>header_map</span>
|
863
|
+
|
864
|
+
|
865
|
+
<span class='type'>(<tt>Hash{Symbol,String => Integer}</tt>)</span>
|
866
|
+
|
867
|
+
|
868
|
+
|
869
|
+
—
|
870
|
+
<div class='inline'>
|
871
|
+
<p>Header map used.</p>
|
872
|
+
</div>
|
873
|
+
|
874
|
+
</li>
|
875
|
+
|
876
|
+
</ul>
|
877
|
+
<p class="tag_title">Yield Returns:</p>
|
878
|
+
<ul class="yieldreturn">
|
879
|
+
|
880
|
+
<li>
|
881
|
+
|
882
|
+
|
883
|
+
<span class='type'>(<tt>Boolean</tt>)</span>
|
884
|
+
|
885
|
+
|
886
|
+
|
887
|
+
—
|
888
|
+
<div class='inline'>
|
889
|
+
<p>`true` when valid, else `false`.</p>
|
890
|
+
</div>
|
891
|
+
|
892
|
+
</li>
|
893
|
+
|
894
|
+
</ul>
|
895
|
+
<p class="tag_title">Returns:</p>
|
896
|
+
<ul class="return">
|
897
|
+
|
898
|
+
<li>
|
899
|
+
|
900
|
+
|
901
|
+
<span class='type'>(<tt>Array&lt;Hash&gt;</tt>, <tt>nil</tt>)</span>
|
902
|
+
|
903
|
+
|
904
|
+
|
905
|
+
—
|
906
|
+
<div class='inline'>
|
907
|
+
<p>Parsed rows data.</p>
|
908
|
+
</div>
|
909
|
+
|
910
|
+
</li>
|
911
|
+
|
912
|
+
</ul>
|
913
|
+
|
914
|
+
</div><table class="source_code">
|
915
|
+
<tr>
|
916
|
+
<td>
|
917
|
+
<pre class="lines">
|
918
|
+
|
919
|
+
|
920
|
+
84
|
921
|
+
85
|
922
|
+
86
|
923
|
+
87
|
924
|
+
88
|
925
|
+
89
|
926
|
+
90
|
927
|
+
91
|
928
|
+
92
|
929
|
+
93
|
930
|
+
94
|
931
|
+
95
|
932
|
+
96
|
933
|
+
97
|
934
|
+
98
|
935
|
+
99
|
936
|
+
100
|
937
|
+
101
|
938
|
+
102
|
939
|
+
103
|
940
|
+
104
|
941
|
+
105
|
942
|
+
106
|
943
|
+
107
|
944
|
+
108
|
945
|
+
109
|
946
|
+
110
|
947
|
+
111
|
948
|
+
112
|
949
|
+
113
|
950
|
+
114
|
951
|
+
115
|
952
|
+
116
|
953
|
+
117
|
954
|
+
118
|
955
|
+
119
|
956
|
+
120
|
957
|
+
121
|
958
|
+
122</pre>
|
959
|
+
</td>
|
960
|
+
<td>
|
961
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 84</span>
|
962
|
+
|
963
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_content'>parse_content</span> <span class='id identifier rubyid_opts'>opts</span><span class='comma'>,</span> <span class='op'>&</span><span class='id identifier rubyid_filter'>filter</span>
|
964
|
+
<span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
|
965
|
+
<span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
966
|
+
<span class='label'>selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
967
|
+
<span class='label'>first_row_header:</span> <span class='kw'>false</span><span class='comma'>,</span>
|
968
|
+
<span class='label'>header_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
|
969
|
+
<span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
|
970
|
+
<span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
|
971
|
+
|
972
|
+
<span class='comment'># Setup config
|
973
|
+
</span> <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='rbracket'>]</span>
|
974
|
+
<span class='id identifier rubyid_row_data'>row_data</span> <span class='op'>=</span> <span class='id identifier rubyid_child_element'>child_element</span> <span class='op'>=</span> <span class='kw'>nil</span>
|
975
|
+
<span class='id identifier rubyid_first'>first</span> <span class='op'>=</span> <span class='id identifier rubyid_first_row_header'>first_row_header</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
|
976
|
+
<span class='id identifier rubyid_header_map'>header_map</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_map</span><span class='rbracket'>]</span>
|
977
|
+
<span class='id identifier rubyid_column_parsers'>column_parsers</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span>
|
978
|
+
|
979
|
+
<span class='comment'># Get and parse rows
|
980
|
+
</span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
|
981
|
+
<span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
|
982
|
+
<span class='comment'># First row header validation
|
983
|
+
</span> <span class='kw'>if</span> <span class='id identifier rubyid_first'>first</span> <span class='op'>&&</span> <span class='id identifier rubyid_first_row_header'>first_row_header</span>
|
984
|
+
<span class='id identifier rubyid_first'>first</span> <span class='op'>=</span> <span class='kw'>false</span>
|
985
|
+
<span class='kw'>next</span>
|
986
|
+
<span class='kw'>end</span>
|
987
|
+
|
988
|
+
<span class='comment'># Extract content data
|
989
|
+
</span> <span class='id identifier rubyid_row_data'>row_data</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
|
990
|
+
<span class='id identifier rubyid_header_map'>header_map</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_key'>key</span><span class='comma'>,</span> <span class='id identifier rubyid_index'>index</span><span class='op'>|</span>
|
991
|
+
<span class='comment'># Parse column html with default or custom parser
|
992
|
+
</span> <span class='id identifier rubyid_child_element'>child_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_children'>children</span><span class='lbracket'>[</span><span class='id identifier rubyid_index'>index</span><span class='rbracket'>]</span>
|
993
|
+
<span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span>
|
994
|
+
<span class='id identifier rubyid_default_parser'>default_parser</span><span class='lparen'>(</span><span class='id identifier rubyid_child_element'>child_element</span><span class='comma'>,</span> <span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span> <span class='op'>:</span>
|
995
|
+
<span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_child_element'>child_element</span><span class='comma'>,</span> <span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span>
|
996
|
+
<span class='kw'>end</span>
|
997
|
+
<span class='kw'>next</span> <span class='kw'>unless</span> <span class='id identifier rubyid_filter'>filter</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>||</span> <span class='id identifier rubyid_filter'>filter</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_row_data'>row_data</span><span class='comma'>,</span> <span class='id identifier rubyid_row'>row</span><span class='comma'>,</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='rparen'>)</span>
|
998
|
+
<span class='id identifier rubyid_data'>data</span> <span class='op'><<</span> <span class='id identifier rubyid_row_data'>row_data</span>
|
999
|
+
<span class='kw'>end</span>
|
1000
|
+
<span class='id identifier rubyid_data'>data</span>
|
1001
|
+
<span class='kw'>end</span></pre>
|
1002
|
+
</td>
|
1003
|
+
</tr>
|
1004
|
+
</table>
|
1005
|
+
</div>
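The row loop above can be sketched without Nokogiri by letting plain arrays of strings stand in for row elements and their children. Everything named here is an assumption made for illustration: `parse_rows` is not the gem's method, and `DEFAULT_PARSER` only guesses at what the gem's private default_parser does (its body is not shown in this listing). The parser call signature `(cell, row_data, key)` and the filter block arguments `(row_data, row, header_map)` do follow the source.

```ruby
# Hypothetical default parser: store the stripped cell text under the column key.
DEFAULT_PARSER = lambda { |cell, row_data, key| row_data[key] = cell.to_s.strip }

# Rows are arrays of cell strings, standing in for Nokogiri row elements.
def parse_rows(rows, header_map:, first_row_header: false, column_parsers: {})
  # Skip the first matching row when it is the header.
  rows = rows.drop(1) if first_row_header
  rows.each_with_object([]) do |row, data|
    row_data = {}
    header_map.each do |key, index|
      # Use the custom parser for this key when one is given.
      parser = column_parsers[key] || DEFAULT_PARSER
      parser.call(row[index], row_data, key)
    end
    # Keep the row unless a filter block is given and rejects it.
    next if block_given? && !yield(row_data, row, header_map)
    data << row_data
  end
end

rows = [
  ['Name', 'Price'],
  [' Apple ', '1.50'],
  ['Pear', '0']
]

result = parse_rows(
  rows,
  header_map: { name: 0, price: 1 },
  first_row_header: true,
  column_parsers: { price: ->(cell, row_data, key) { row_data[key] = cell.to_f } }
) { |row_data, _row, _header_map| row_data[:price] > 0 }

p result
```

Because each parser writes into the shared row hash rather than returning a value, a custom parser can set more than one key from a single cell.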
|
1006
|
+
|
1007
|
+
<div class="method_details ">
|
1008
|
+
<h3 class="signature " id="parse_header_map-class_method">
|
1009
|
+
|
1010
|
+
.<strong>parse_header_map</strong>(opts = {}) ⇒ <tt>Hash{Symbol,String => Integer}</tt><sup>?</sup>
|
1011
|
+
|
1012
|
+
|
1013
|
+
|
1014
|
+
|
1015
|
+
|
1016
|
+
</h3><div class="docstring">
|
1017
|
+
<div class="discussion">
|
1018
|
+
|
1019
|
+
<p>Parse the header matched by a selector and create a header map that pairs each column key with its column index.</p>
|
1020
|
+
|
1021
|
+
|
1022
|
+
|
1023
|
+
|
1024
|
+
|
1025
|
+
</div>
|
1026
|
+
</div>
|
1027
|
+
<div class="tags">
|
1028
|
+
<p class="tag_title">Parameters:</p>
|
1029
|
+
<ul class="param">
|
1030
|
+
|
1031
|
+
<li>
|
1032
|
+
|
1033
|
+
<span class='name'>opts</span>
|
1034
|
+
|
1035
|
+
|
1036
|
+
<span class='type'>(<tt>Hash</tt>)</span>
|
1037
|
+
|
1038
|
+
|
1039
|
+
<em class="default">(defaults to: <tt>{}</tt>)</em>
|
1040
|
+
|
1041
|
+
|
1042
|
+
—
|
1043
|
+
<div class='inline'>
|
1044
|
+
<p>Configuration options.</p>
|
1045
|
+
</div>
|
1046
|
+
|
1047
|
+
</li>
|
1048
|
+
|
1049
|
+
</ul>
|
1050
|
+
|
1051
|
+
|
1052
|
+
|
1053
|
+
|
1054
|
+
<p class="tag_title">Options Hash (<tt>opts</tt>):</p>
|
1055
|
+
<ul class="option">
|
1056
|
+
|
1057
|
+
<li>
|
1058
|
+
<span class="name">:html</span>
|
1059
|
+
<span class="type">(<tt>Nokogiri::XML::Element</tt>)</span>
|
1060
|
+
<span class="default">
|
1061
|
+
|
1062
|
+
</span>
|
1063
|
+
|
1064
|
+
— <div class='inline'>
|
1065
|
+
<p>Container element to search within.</p>
|
1066
|
+
</div>
|
1067
|
+
|
1068
|
+
</li>
|
1069
|
+
|
1070
|
+
<li>
|
1071
|
+
<span class="name">:selector</span>
|
1072
|
+
<span class="type">(<tt>String</tt>)</span>
|
1073
|
+
<span class="default">
|
1074
|
+
|
1075
|
+
</span>
|
1076
|
+
|
1077
|
+
— <div class='inline'>
|
1078
|
+
<p>CSS selector to match header cells.</p>
|
1079
|
+
</div>
|
1080
|
+
|
1081
|
+
</li>
|
1082
|
+
|
1083
|
+
<li>
|
1084
|
+
<span class="name">:column_key_label_map</span>
|
1085
|
+
<span class="type">(<tt>Hash{Symbol,String => Regexp,String}</tt>)</span>
|
1086
|
+
<span class="default">
|
1087
|
+
|
1088
|
+
</span>
|
1089
|
+
|
1090
|
+
— <div class='inline'>
|
1091
|
+
<p>Dictionary mapping each column key to its header label.</p>
|
1092
|
+
</div>
|
1093
|
+
|
1094
|
+
</li>
|
1095
|
+
|
1096
|
+
<li>
|
1097
|
+
<span class="name">:first_row_header</span>
|
1098
|
+
<span class="type">(<tt>Boolean</tt>)</span>
|
1099
|
+
<span class="default">
|
1100
|
+
|
1101
|
+
— default:
|
1102
|
+
<tt>false</tt>
|
1103
|
+
|
1104
|
+
</span>
|
1105
|
+
|
1106
|
+
— <div class='inline'>
|
1107
|
+
<p>If true, the first row matching the selector is used as the header for
|
1108
|
+
parsing.</p>
|
1109
|
+
</div>
|
1110
|
+
|
1111
|
+
</li>
|
1112
|
+
|
1113
|
+
</ul>
|
1114
|
+
|
1115
|
+
|
1116
|
+
<p class="tag_title">Returns:</p>
|
1117
|
+
<ul class="return">
|
1118
|
+
|
1119
|
+
<li>
|
1120
|
+
|
1121
|
+
|
1122
|
+
<span class='type'>(<tt>Hash{Symbol,String => Integer}</tt>, <tt>nil</tt>)</span>
|
1123
|
+
|
1124
|
+
|
1125
|
+
|
1126
|
+
—
|
1127
|
+
<div class='inline'>
|
1128
|
+
<p>Map of each column key to its column index.</p>
|
1129
|
+
</div>
|
1130
|
+
|
1131
|
+
</li>
|
1132
|
+
|
1133
|
+
</ul>
|
1134
|
+
|
1135
|
+
</div><table class="source_code">
|
1136
|
+
<tr>
|
1137
|
+
<td>
|
1138
|
+
<pre class="lines">
|
1139
|
+
|
1140
|
+
|
1141
|
+
152
|
1142
|
+
153
|
1143
|
+
154
|
1144
|
+
155
|
1145
|
+
156
|
1146
|
+
157
|
1147
|
+
158
|
1148
|
+
159
|
1149
|
+
160
|
1150
|
+
161
|
1151
|
+
162
|
1152
|
+
163
|
1153
|
+
164
|
1154
|
+
165
|
1155
|
+
166
|
1156
|
+
167
|
1157
|
+
168
|
1158
|
+
169
|
1159
|
+
170
|
1160
|
+
171
|
1161
|
+
172
|
1162
|
+
173
|
1163
|
+
174
|
1164
|
+
175
|
1165
|
+
176
|
1166
|
+
177
|
1167
|
+
178
|
1168
|
+
179
|
1169
|
+
180</pre>
|
1170
|
+
</td>
|
1171
|
+
<td>
|
1172
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 152</span>
|
1173
|
+
|
1174
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_header_map'>parse_header_map</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
|
1175
|
+
<span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
|
1176
|
+
<span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1177
|
+
<span class='label'>selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1178
|
+
<span class='label'>column_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
|
1179
|
+
<span class='label'>first_row_header:</span> <span class='kw'>false</span>
|
1180
|
+
<span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
|
1181
|
+
|
1182
|
+
<span class='comment'># Setup config
|
1183
|
+
</span> <span class='id identifier rubyid_dictionary'>dictionary</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_key_label_map</span><span class='rbracket'>]</span>
|
1184
|
+
<span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='rbracket'>]</span>
|
1185
|
+
<span class='id identifier rubyid_column_map'>column_map</span> <span class='op'>=</span> <span class='kw'>nil</span>
|
1186
|
+
|
1187
|
+
<span class='comment'># Extract and parse header rows
|
1188
|
+
</span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:selector</span><span class='rbracket'>]</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>nil</span>
|
1189
|
+
<span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
|
1190
|
+
<span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='lbracket'>[</span><span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_first'>first</span><span class='rbracket'>]</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
|
1191
|
+
<span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
|
1192
|
+
<span class='id identifier rubyid_column_map'>column_map</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
|
1193
|
+
<span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_children'>children</span><span class='period'>.</span><span class='id identifier rubyid_each_with_index'>each_with_index</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_col'>col</span><span class='comma'>,</span> <span class='id identifier rubyid_index'>index</span><span class='op'>|</span>
|
1194
|
+
<span class='comment'># Parse and map column header
|
1195
|
+
</span> <span class='id identifier rubyid_column_key'>column_key</span> <span class='op'>=</span> <span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_col'>col</span><span class='comma'>,</span> <span class='id identifier rubyid_dictionary'>dictionary</span>
|
1196
|
+
<span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_column_key'>column_key</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
|
1197
|
+
<span class='id identifier rubyid_column_map'>column_map</span><span class='lbracket'>[</span><span class='id identifier rubyid_column_key'>column_key</span><span class='rbracket'>]</span> <span class='op'>=</span> <span class='id identifier rubyid_index'>index</span>
|
1198
|
+
<span class='kw'>end</span>
|
1199
|
+
<span class='id identifier rubyid_data'>data</span> <span class='op'><<</span> <span class='id identifier rubyid_column_map'>column_map</span>
|
1200
|
+
<span class='kw'>end</span>
|
1201
|
+
<span class='id identifier rubyid_data'>data</span><span class='op'>&.</span><span class='id identifier rubyid_first'>first</span>
|
1202
|
+
<span class='kw'>end</span></pre>
|
1203
|
+
</td>
|
1204
|
+
</tr>
|
1205
|
+
</table>
|
1206
|
+
</div>
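The mapping step above can be illustrated with plain strings in place of header cells. Both method names are hypothetical: `label_to_key` is only a guess at how the gem's private translate_label_to_key helper might behave (exact match for String labels, pattern match for Regexp labels), and `build_header_map` mirrors the inner loop of the listing for a single header row.

```ruby
# Hypothetical stand-in for the gem's private translate_label_to_key helper.
def label_to_key(label, dictionary)
  text = label.to_s.strip
  match = dictionary.find do |_key, pattern|
    pattern.is_a?(Regexp) ? pattern.match?(text) : pattern.to_s == text
  end
  match && match.first
end

# Build a key => column index map from one row of header labels.
def build_header_map(labels, dictionary)
  labels.each_with_index.each_with_object({}) do |(label, index), map|
    key = label_to_key(label, dictionary)
    # Columns without a matching key are skipped, as in the listing.
    map[key] = index unless key.nil?
  end
end

labels = ['Product Name', 'SKU', 'Unit price (USD)']
dictionary = { name: /name/i, price: /price/i }

header_map = build_header_map(labels, dictionary)
p header_map
```

The resulting map has the shape that parse_content expects in its `:header_map` option.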
|
1207
|
+
|
1208
|
+
<div class="method_details ">
|
1209
|
+
<h3 class="signature " id="parse_table-class_method">
|
1210
|
+
|
1211
|
+
.<strong>parse_table</strong>(opts = {}) {|data, row, header_map| ... } ⇒ <tt>Hash{Symbol => Array,Hash,nil}</tt>
|
1212
|
+
|
1213
|
+
|
1214
|
+
|
1215
|
+
|
1216
|
+
|
1217
|
+
</h3><div class="docstring">
|
1218
|
+
<div class="discussion">
|
1219
|
+
|
1220
|
+
<p>Parse data from a horizontal table-like structure matching the given selectors, using a header map to match columns.</p>
|
1221
|
+
|
1222
|
+
|
1223
|
+
|
1224
|
+
|
1225
|
+
|
1226
|
+
</div>
|
1227
|
+
</div>
|
1228
|
+
<div class="tags">
|
1229
|
+
<p class="tag_title">Parameters:</p>
|
1230
|
+
<ul class="param">
|
1231
|
+
|
1232
|
+
<li>
|
1233
|
+
|
1234
|
+
<span class='name'>opts</span>
|
1235
|
+
|
1236
|
+
|
1237
|
+
<span class='type'>(<tt>Hash</tt>)</span>
|
1238
|
+
|
1239
|
+
|
1240
|
+
<em class="default">(defaults to: <tt>{}</tt>)</em>
|
1241
|
+
|
1242
|
+
|
1243
|
+
—
|
1244
|
+
<div class='inline'>
|
1245
|
+
<p>Configuration options.</p>
|
1246
|
+
</div>
|
1247
|
+
|
1248
|
+
</li>
|
1249
|
+
|
1250
|
+
</ul>
|
1251
|
+
|
1252
|
+
|
1253
|
+
|
1254
|
+
|
1255
|
+
<p class="tag_title">Options Hash (<tt>opts</tt>):</p>
|
1256
|
+
<ul class="option">
|
1257
|
+
|
1258
|
+
<li>
|
1259
|
+
<span class="name">:html</span>
|
1260
|
+
<span class="type">(<tt>Nokogiri::XML::Element</tt>)</span>
|
1261
|
+
<span class="default">
|
1262
|
+
|
1263
|
+
</span>
|
1264
|
+
|
1265
|
+
— <div class='inline'>
|
1266
|
+
<p>Container element to search within.</p>
|
1267
|
+
</div>
|
1268
|
+
|
1269
|
+
</li>
|
1270
|
+
|
1271
|
+
<li>
|
1272
|
+
<span class="name">:header_selector</span>
|
1273
|
+
<span class="type">(<tt>String</tt>)</span>
|
1274
|
+
<span class="default">
|
1275
|
+
|
1276
|
+
</span>
|
1277
|
+
|
1278
|
+
— <div class='inline'>
|
1279
|
+
<p>Header column elements selector.</p>
|
1280
|
+
</div>
|
1281
|
+
|
1282
|
+
</li>
|
1283
|
+
|
1284
|
+
<li>
|
1285
|
+
<span class="name">:header_key_label_map</span>
|
1286
|
+
<span class="type">(<tt>Hash{Symbol,String => Regexp,String}</tt>)</span>
|
1287
|
+
<span class="default">
|
1288
|
+
|
1289
|
+
</span>
|
1290
|
+
|
1291
|
+
— <div class='inline'>
|
1292
|
+
<p>Dictionary mapping each header key to its label, used to match column indexes.</p>
|
1293
|
+
</div>
|
1294
|
+
|
1295
|
+
</li>
|
1296
|
+
|
1297
|
+
<li>
|
1298
|
+
<span class="name">:content_selector</span>
|
1299
|
+
<span class="type">(<tt>String</tt>)</span>
|
1300
|
+
<span class="default">
|
1301
|
+
|
1302
|
+
</span>
|
1303
|
+
|
1304
|
+
— <div class='inline'>
|
1305
|
+
<p>Content row elements selector.</p>
|
1306
|
+
</div>
|
1307
|
+
|
1308
|
+
</li>
|
1309
|
+
|
1310
|
+
<li>
|
1311
|
+
<span class="name">:first_row_header</span>
|
1312
|
+
<span class="type">(<tt>Boolean</tt>)</span>
|
1313
|
+
<span class="default">
|
1314
|
+
|
1315
|
+
— default:
|
1316
|
+
<tt>false</tt>
|
1317
|
+
|
1318
|
+
</span>
|
1319
|
+
|
1320
|
+
— <div class='inline'>
|
1321
|
+
<p>If true, the first row matching the selector is used as the header for
|
1322
|
+
parsing.</p>
|
1323
|
+
</div>
|
1324
|
+
|
1325
|
+
</li>
|
1326
|
+
|
1327
|
+
<li>
|
1328
|
+
<span class="name">:column_parsers</span>
|
1329
|
+
<span class="type">(<tt>Hash{Symbol,String => lambda,proc}</tt>)</span>
|
1330
|
+
<span class="default">
|
1331
|
+
|
1332
|
+
— default:
|
1333
|
+
<tt>{}</tt>
|
1334
|
+
|
1335
|
+
</span>
|
1336
|
+
|
1337
|
+
— <div class='inline'>
|
1338
|
+
<p>Custom column parsers for advanced data extraction.</p>
|
1339
|
+
</div>
|
1340
|
+
|
1341
|
+
</li>
|
1342
|
+
|
1343
|
+
</ul>
|
1344
|
+
|
1345
|
+
|
1346
|
+
<p class="tag_title">Yield Parameters:</p>
|
1347
|
+
<ul class="yieldparam">
|
1348
|
+
|
1349
|
+
<li>
|
1350
|
+
|
1351
|
+
<span class='name'>data</span>
|
1352
|
+
|
1353
|
+
|
1354
|
+
<span class='type'>(<tt>Hash{Symbol,String => Object}</tt>)</span>
|
1355
|
+
|
1356
|
+
|
1357
|
+
|
1358
|
+
—
|
1359
|
+
<div class='inline'>
|
1360
|
+
<p>Parsed content row data.</p>
|
1361
|
+
</div>
|
1362
|
+
|
1363
|
+
</li>
|
1364
|
+
|
1365
|
+
<li>
|
1366
|
+
|
1367
|
+
<span class='name'>row</span>
|
1368
|
+
|
1369
|
+
|
1370
|
+
<span class='type'>(<tt>Array</tt>)</span>
|
1371
|
+
|
1372
|
+
|
1373
|
+
|
1374
|
+
—
|
1375
|
+
<div class='inline'>
|
1376
|
+
<p>Raw content row data.</p>
|
1377
|
+
</div>
|
1378
|
+
|
1379
|
+
</li>
|
1380
|
+
|
1381
|
+
<li>
|
1382
|
+
|
1383
|
+
<span class='name'>header_map</span>
|
1384
|
+
|
1385
|
+
|
1386
|
+
<span class='type'>(<tt>Hash{Symbol,String => Integer}</tt>)</span>
|
1387
|
+
|
1388
|
+
|
1389
|
+
|
1390
|
+
—
|
1391
|
+
<div class='inline'>
|
1392
|
+
<p>Header map used.</p>
|
1393
|
+
</div>
|
1394
|
+
|
1395
|
+
</li>
|
1396
|
+
|
1397
|
+
</ul>
|
1398
|
+
<p class="tag_title">Yield Returns:</p>
|
1399
|
+
<ul class="yieldreturn">
|
1400
|
+
|
1401
|
+
<li>
|
1402
|
+
|
1403
|
+
|
1404
|
+
<span class='type'>(<tt>Boolean</tt>)</span>
|
1405
|
+
|
1406
|
+
|
1407
|
+
|
1408
|
+
—
|
1409
|
+
<div class='inline'>
|
1410
|
+
<p>`true` when valid, else `false`.</p>
|
1411
|
+
</div>
|
1412
|
+
|
1413
|
+
</li>
|
1414
|
+
|
1415
|
+
</ul>
|
1416
|
+
<p class="tag_title">Returns:</p>
|
1417
|
+
<ul class="return">
|
1418
|
+
|
1419
|
+
<li>
|
1420
|
+
|
1421
|
+
|
1422
|
+
<span class='type'>(<tt>Hash{Symbol => Array,Hash,nil}</tt>)</span>
|
1423
|
+
|
1424
|
+
|
1425
|
+
|
1426
|
+
—
|
1427
|
+
<div class='inline'>
|
1428
|
+
<p>Hash data is as follows:</p>
|
1429
|
+
<ul><li>
|
1430
|
+
<p>`[Hash] :header_map` Header map used.</p>
|
1431
|
+
</li><li>
|
1432
|
+
<p>`[Array&lt;Hash&gt;,nil] :data` Parsed rows data.</p>
|
1433
|
+
</li></ul>
|
1434
|
+
</div>
|
1435
|
+
|
1436
|
+
</li>
|
1437
|
+
|
1438
|
+
</ul>
|
1439
|
+
|
1440
|
+
</div><table class="source_code">
|
1441
|
+
<tr>
|
1442
|
+
<td>
|
1443
|
+
<pre class="lines">
|
1444
|
+
|
1445
|
+
|
1446
|
+
204
|
1447
|
+
205
|
1448
|
+
206
|
1449
|
+
207
|
1450
|
+
208
|
1451
|
+
209
|
1452
|
+
210
|
1453
|
+
211
|
1454
|
+
212
|
1455
|
+
213
|
1456
|
+
214
|
1457
|
+
215
|
1458
|
+
216
|
1459
|
+
217
|
1460
|
+
218
|
1461
|
+
219
|
1462
|
+
220
|
1463
|
+
221
|
1464
|
+
222
|
1465
|
+
223
|
1466
|
+
224
|
1467
|
+
225
|
1468
|
+
226</pre>
|
1469
|
+
</td>
|
1470
|
+
<td>
|
1471
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 204</span>
|
1472
|
+
|
1473
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_table'>parse_table</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span> <span class='op'>&</span><span class='id identifier rubyid_filter'>filter</span>
|
1474
|
+
<span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
|
1475
|
+
<span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1476
|
+
<span class='label'>header_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1477
|
+
<span class='label'>header_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
|
1478
|
+
<span class='label'>content_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1479
|
+
<span class='label'>first_row_header:</span> <span class='kw'>false</span><span class='comma'>,</span>
|
1480
|
+
<span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
|
1481
|
+
<span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
|
1482
|
+
<span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
|
1483
|
+
<span class='id identifier rubyid_header_map'>header_map</span> <span class='op'>=</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_header_map'>parse_header_map</span> <span class='label'>html:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='comma'>,</span>
|
1484
|
+
<span class='label'>selector:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_selector</span><span class='rbracket'>]</span><span class='comma'>,</span>
|
1485
|
+
<span class='label'>column_key_label_map:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_key_label_map</span><span class='rbracket'>]</span><span class='comma'>,</span>
|
1486
|
+
<span class='label'>first_row_header:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span>
|
1487
|
+
<span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
|
1488
|
+
<span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_content'>parse_content</span> <span class='label'>html:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='comma'>,</span>
|
1489
|
+
<span class='label'>selector:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:content_selector</span><span class='rbracket'>]</span><span class='comma'>,</span>
|
1490
|
+
<span class='label'>header_map:</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='comma'>,</span>
|
1491
|
+
<span class='label'>first_row_header:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:first_row_header</span><span class='rbracket'>]</span><span class='comma'>,</span>
|
1492
|
+
<span class='label'>column_parsers:</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span><span class='comma'>,</span>
|
1493
|
+
<span class='op'>&</span><span class='id identifier rubyid_filter'>filter</span>
|
1494
|
+
<span class='lbrace'>{</span><span class='label'>header_map:</span> <span class='id identifier rubyid_header_map'>header_map</span><span class='comma'>,</span> <span class='label'>data:</span> <span class='id identifier rubyid_data'>data</span><span class='rbrace'>}</span>
|
1495
|
+
<span class='kw'>end</span></pre>
|
1496
|
+
</td>
|
1497
|
+
</tr>
|
1498
|
+
</table>
|
1499
|
+
</div>
|
1500
|
+
|
1501
|
+
<div class="method_details ">
|
1502
|
+
<h3 class="signature " id="parse_vertical_table-class_method">
|
1503
|
+
|
1504
|
+
.<strong>parse_vertical_table</strong>(opts = {}) {|data, row, header_map| ... } ⇒ <tt>Hash{Symbol => Array,Hash,nil}</tt>
|
1505
|
+
|
1506
|
+
|
1507
|
+
|
1508
|
+
|
1509
|
+
|
1510
|
+
</h3><div class="docstring">
|
1511
|
+
<div class="discussion">
|
1512
|
+
|
1513
|
+
<p>Parse data from a vertical table-like structure matching the given selectors and</p>
|
1514
|
+
|
1515
|
+
<pre class="code ruby"><code class="ruby">using a header map to match columns.
|
1516
|
+
</code></pre>
|
1517
|
+
|
1518
|
+
|
1519
|
+
</div>
|
1520
|
+
</div>
|
1521
|
+
<div class="tags">
|
1522
|
+
<p class="tag_title">Parameters:</p>
|
1523
|
+
<ul class="param">
|
1524
|
+
|
1525
|
+
<li>
|
1526
|
+
|
1527
|
+
<span class='name'>opts</span>
|
1528
|
+
|
1529
|
+
|
1530
|
+
<span class='type'>(<tt>Hash</tt>)</span>
|
1531
|
+
|
1532
|
+
|
1533
|
+
<em class="default">(defaults to: <tt>{}</tt>)</em>
|
1534
|
+
|
1535
|
+
|
1536
|
+
—
|
1537
|
+
<div class='inline'>
|
1538
|
+
<p>({}) Configuration options.</p>
|
1539
|
+
</div>
|
1540
|
+
|
1541
|
+
</li>
|
1542
|
+
|
1543
|
+
</ul>
|
1544
|
+
|
1545
|
+
|
1546
|
+
|
1547
|
+
|
1548
|
+
<p class="tag_title">Options Hash (<tt>opts</tt>):</p>
|
1549
|
+
<ul class="option">
|
1550
|
+
|
1551
|
+
<li>
|
1552
|
+
<span class="name">:html</span>
|
1553
|
+
<span class="type">(<tt>Nokogiri::Element</tt>)</span>
|
1554
|
+
<span class="default">
|
1555
|
+
|
1556
|
+
</span>
|
1557
|
+
|
1558
|
+
— <div class='inline'>
|
1559
|
+
<p>Container element to search within.</p>
|
1560
|
+
</div>
|
1561
|
+
|
1562
|
+
</li>
|
1563
|
+
|
1564
|
+
<li>
|
1565
|
+
<span class="name">:row_selector</span>
|
1566
|
+
<span class="type">(<tt>String</tt>)</span>
|
1567
|
+
<span class="default">
|
1568
|
+
|
1569
|
+
</span>
|
1570
|
+
|
1571
|
+
— <div class='inline'>
|
1572
|
+
<p>Selector for vertical row-like elements.</p>
|
1573
|
+
</div>
|
1574
|
+
|
1575
|
+
</li>
|
1576
|
+
|
1577
|
+
<li>
|
1578
|
+
<span class="name">:header_selector</span>
|
1579
|
+
<span class="type">(<tt>String</tt>)</span>
|
1580
|
+
<span class="default">
|
1581
|
+
|
1582
|
+
</span>
|
1583
|
+
|
1584
|
+
— <div class='inline'>
|
1585
|
+
<p>Header column elements selector.</p>
|
1586
|
+
</div>
|
1587
|
+
|
1588
|
+
</li>
|
1589
|
+
|
1590
|
+
<li>
|
1591
|
+
<span class="name">:header_key_label_map</span>
|
1592
|
+
<span class="type">(<tt>Hash{Symbol,String => Regexp,String}</tt>)</span>
|
1593
|
+
<span class="default">
|
1594
|
+
|
1595
|
+
</span>
|
1596
|
+
|
1597
|
+
— <div class='inline'>
|
1598
|
+
<p>Header key vs. label dictionary to match column indexes.</p>
|
1599
|
+
</div>
|
1600
|
+
|
1601
|
+
</li>
|
1602
|
+
|
1603
|
+
<li>
|
1604
|
+
<span class="name">:content_selector</span>
|
1605
|
+
<span class="type">(<tt>String</tt>)</span>
|
1606
|
+
<span class="default">
|
1607
|
+
|
1608
|
+
</span>
|
1609
|
+
|
1610
|
+
— <div class='inline'>
|
1611
|
+
<p>Content row elements selector.</p>
|
1612
|
+
</div>
|
1613
|
+
|
1614
|
+
</li>
|
1615
|
+
|
1616
|
+
<li>
|
1617
|
+
<span class="name">:column_parsers</span>
|
1618
|
+
<span class="type">(<tt>Hash{Symbol,String => lambda,proc}</tt>)</span>
|
1619
|
+
<span class="default">
|
1620
|
+
|
1621
|
+
— default:
|
1622
|
+
<tt>{}</tt>
|
1623
|
+
|
1624
|
+
</span>
|
1625
|
+
|
1626
|
+
— <div class='inline'>
|
1627
|
+
<p>Custom column parsers for advanced data extraction.</p>
|
1628
|
+
</div>
|
1629
|
+
|
1630
|
+
</li>
|
1631
|
+
|
1632
|
+
</ul>
|
1633
|
+
|
1634
|
+
|
1635
|
+
<p class="tag_title">Yield Parameters:</p>
|
1636
|
+
<ul class="yieldparam">
|
1637
|
+
|
1638
|
+
<li>
|
1639
|
+
|
1640
|
+
<span class='name'>data</span>
|
1641
|
+
|
1642
|
+
|
1643
|
+
<span class='type'>(<tt>Hash{Symbol,String => Object}</tt>)</span>
|
1644
|
+
|
1645
|
+
|
1646
|
+
|
1647
|
+
—
|
1648
|
+
<div class='inline'>
|
1649
|
+
<p>Parsed content row data.</p>
|
1650
|
+
</div>
|
1651
|
+
|
1652
|
+
</li>
|
1653
|
+
|
1654
|
+
<li>
|
1655
|
+
|
1656
|
+
<span class='name'>row</span>
|
1657
|
+
|
1658
|
+
|
1659
|
+
<span class='type'>(<tt>Array</tt>)</span>
|
1660
|
+
|
1661
|
+
|
1662
|
+
|
1663
|
+
—
|
1664
|
+
<div class='inline'>
|
1665
|
+
<p>Raw content row data.</p>
|
1666
|
+
</div>
|
1667
|
+
|
1668
|
+
</li>
|
1669
|
+
|
1670
|
+
<li>
|
1671
|
+
|
1672
|
+
<span class='name'>header_map</span>
|
1673
|
+
|
1674
|
+
|
1675
|
+
<span class='type'>(<tt>Hash{Symbol,String => Integer}</tt>)</span>
|
1676
|
+
|
1677
|
+
|
1678
|
+
|
1679
|
+
—
|
1680
|
+
<div class='inline'>
|
1681
|
+
<p>Header map used.</p>
|
1682
|
+
</div>
|
1683
|
+
|
1684
|
+
</li>
|
1685
|
+
|
1686
|
+
</ul>
|
1687
|
+
<p class="tag_title">Yield Returns:</p>
|
1688
|
+
<ul class="yieldreturn">
|
1689
|
+
|
1690
|
+
<li>
|
1691
|
+
|
1692
|
+
|
1693
|
+
<span class='type'>(<tt>Boolean</tt>)</span>
|
1694
|
+
|
1695
|
+
|
1696
|
+
|
1697
|
+
—
|
1698
|
+
<div class='inline'>
|
1699
|
+
<p>`true` when valid, else `false`.</p>
|
1700
|
+
</div>
|
1701
|
+
|
1702
|
+
</li>
|
1703
|
+
|
1704
|
+
</ul>
|
1705
|
+
<p class="tag_title">Returns:</p>
|
1706
|
+
<ul class="return">
|
1707
|
+
|
1708
|
+
<li>
|
1709
|
+
|
1710
|
+
|
1711
|
+
<span class='type'>(<tt>Hash{Symbol => Array,Hash,nil}</tt>)</span>
|
1712
|
+
|
1713
|
+
|
1714
|
+
|
1715
|
+
—
|
1716
|
+
<div class='inline'>
|
1717
|
+
<p>Hash data is as follows:</p>
|
1718
|
+
<ul><li>
|
1719
|
+
<p>`[Hash] :header_map` Header map used.</p>
|
1720
|
+
</li><li>
|
1721
|
+
<p>`[Array<Hash>,nil] :data` Parsed rows data.</p>
|
1722
|
+
</li></ul>
|
1723
|
+
</div>
|
1724
|
+
|
1725
|
+
</li>
|
1726
|
+
|
1727
|
+
</ul>
|
1728
|
+
|
1729
|
+
</div><table class="source_code">
|
1730
|
+
<tr>
|
1731
|
+
<td>
|
1732
|
+
<pre class="lines">
|
1733
|
+
|
1734
|
+
|
1735
|
+
249
|
1736
|
+
250
|
1737
|
+
251
|
1738
|
+
252
|
1739
|
+
253
|
1740
|
+
254
|
1741
|
+
255
|
1742
|
+
256
|
1743
|
+
257
|
1744
|
+
258
|
1745
|
+
259
|
1746
|
+
260
|
1747
|
+
261
|
1748
|
+
262
|
1749
|
+
263
|
1750
|
+
264
|
1751
|
+
265
|
1752
|
+
266
|
1753
|
+
267
|
1754
|
+
268
|
1755
|
+
269
|
1756
|
+
270
|
1757
|
+
271
|
1758
|
+
272
|
1759
|
+
273
|
1760
|
+
274
|
1761
|
+
275
|
1762
|
+
276
|
1763
|
+
277
|
1764
|
+
278
|
1765
|
+
279
|
1766
|
+
280
|
1767
|
+
281</pre>
|
1768
|
+
</td>
|
1769
|
+
<td>
|
1770
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 249</span>
|
1771
|
+
|
1772
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_parse_vertical_table'>parse_vertical_table</span> <span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span> <span class='op'>&</span><span class='id identifier rubyid_filter'>filter</span>
|
1773
|
+
<span class='id identifier rubyid_opts'>opts</span> <span class='op'>=</span> <span class='lbrace'>{</span>
|
1774
|
+
<span class='label'>html:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1775
|
+
<span class='label'>row_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1776
|
+
<span class='label'>header_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1777
|
+
<span class='label'>header_key_label_map:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span><span class='comma'>,</span>
|
1778
|
+
<span class='label'>content_selector:</span> <span class='kw'>nil</span><span class='comma'>,</span>
|
1779
|
+
<span class='label'>column_parsers:</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
|
1780
|
+
<span class='rbrace'>}</span><span class='period'>.</span><span class='id identifier rubyid_merge'>merge</span> <span class='id identifier rubyid_opts'>opts</span>
|
1781
|
+
<span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
|
1782
|
+
|
1783
|
+
<span class='comment'># Setup config
|
1784
|
+
</span> <span class='id identifier rubyid_data'>data</span> <span class='op'>=</span> <span class='lbrace'>{</span><span class='rbrace'>}</span>
|
1785
|
+
<span class='id identifier rubyid_dictionary'>dictionary</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_key_label_map</span><span class='rbracket'>]</span>
|
1786
|
+
<span class='id identifier rubyid_column_parsers'>column_parsers</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:column_parsers</span><span class='rbracket'>]</span>
|
1787
|
+
|
1788
|
+
<span class='comment'># Extract headers and content
|
1789
|
+
</span> <span class='id identifier rubyid_html_rows'>html_rows</span> <span class='op'>=</span> <span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:html</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:row_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>nil</span>
|
1790
|
+
<span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
|
1791
|
+
<span class='id identifier rubyid_html_rows'>html_rows</span><span class='period'>.</span><span class='id identifier rubyid_each'>each</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_row'>row</span><span class='op'>|</span>
|
1792
|
+
<span class='comment'># Parse and map column header
|
1793
|
+
</span> <span class='id identifier rubyid_header_element'>header_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:header_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
|
1794
|
+
<span class='id identifier rubyid_key'>key</span> <span class='op'>=</span> <span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_header_element'>header_element</span><span class='comma'>,</span> <span class='id identifier rubyid_dictionary'>dictionary</span>
|
1795
|
+
<span class='kw'>next</span> <span class='kw'>if</span> <span class='id identifier rubyid_key'>key</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>||</span> <span class='id identifier rubyid_key'>key</span> <span class='op'>==</span> <span class='tstring'><span class='tstring_beg'>'</span><span class='tstring_end'>'</span></span>
|
1796
|
+
|
1797
|
+
<span class='comment'># Parse column html with default or custom parser
|
1798
|
+
</span> <span class='id identifier rubyid_content_element'>content_element</span> <span class='op'>=</span> <span class='id identifier rubyid_row'>row</span><span class='period'>.</span><span class='id identifier rubyid_css'>css</span><span class='lparen'>(</span><span class='id identifier rubyid_opts'>opts</span><span class='lbracket'>[</span><span class='symbol'>:content_selector</span><span class='rbracket'>]</span><span class='rparen'>)</span>
|
1799
|
+
<span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span>
|
1800
|
+
<span class='id identifier rubyid_default_parser'>default_parser</span><span class='lparen'>(</span><span class='id identifier rubyid_content_element'>content_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span> <span class='op'>:</span>
|
1801
|
+
<span class='id identifier rubyid_column_parsers'>column_parsers</span><span class='lbracket'>[</span><span class='id identifier rubyid_key'>key</span><span class='rbracket'>]</span><span class='period'>.</span><span class='id identifier rubyid_call'>call</span><span class='lparen'>(</span><span class='id identifier rubyid_content_element'>content_element</span><span class='comma'>,</span> <span class='id identifier rubyid_data'>data</span><span class='comma'>,</span> <span class='id identifier rubyid_key'>key</span><span class='rparen'>)</span>
|
1802
|
+
<span class='kw'>end</span>
|
1803
|
+
<span class='id identifier rubyid_data'>data</span>
|
1804
|
+
<span class='kw'>end</span></pre>
|
1805
|
+
</td>
|
1806
|
+
</tr>
|
1807
|
+
</table>
|
1808
|
+
</div>
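The row loop in `parse_vertical_table` can be sketched without Nokogiri. This is a minimal illustration, not the gem's implementation: `parse_vertical_rows` is a hypothetical helper, each row is modelled as a plain `[label, value]` pair instead of an HTML element, and the default parser is assumed to store the trimmed text under the translated key.

```ruby
# Hypothetical stand-in for a vertical table: each row is a [label, value] pair.
def parse_vertical_rows(rows, header_key_label_map: {}, column_parsers: {})
  data = {}
  rows.each do |label, content|
    # Translate the row label into a key; skip rows whose label is not mapped.
    key = header_key_label_map.find do |_k, v|
      v.is_a?(Regexp) ? (label =~ v) : (label == v)
    end&.first
    next if key.nil? || key == ''

    parser = column_parsers[key]
    if parser.nil?
      # Assumed default behavior: store the trimmed text under the key.
      data[key] = content.to_s.strip
    else
      # Custom parsers receive the content, the result hash, and the key.
      parser.call(content, data, key)
    end
  end
  data
end

rows = [['Name', ' Widget '], ['Price', '$ 9.50'], ['Notes', 'ignored']]
result = parse_vertical_rows(
  rows,
  header_key_label_map: { name: 'Name', price: /price/i },
  column_parsers: {
    price: ->(content, data, key) { data[key] = content.delete('$ ').to_f }
  }
)
```

Because a custom parser writes into the shared `data` hash itself, it can set more than one key per row; the `Notes` row is dropped here because no label in the map matches it.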
|
1809
|
+
|
1810
|
+
<div class="method_details ">
|
1811
|
+
<h3 class="signature " id="strip-class_method">
|
1812
|
+
|
1813
|
+
.<strong>strip</strong>(raw_text) ⇒ <tt>String</tt><sup>?</sup>
|
1814
|
+
|
1815
|
+
|
1816
|
+
|
1817
|
+
|
1818
|
+
|
1819
|
+
</h3><div class="docstring">
|
1820
|
+
<div class="discussion">
|
1821
|
+
|
1822
|
+
<p>Strip a value.</p>
|
1823
|
+
|
1824
|
+
|
1825
|
+
</div>
|
1826
|
+
</div>
|
1827
|
+
<div class="tags">
|
1828
|
+
<p class="tag_title">Parameters:</p>
|
1829
|
+
<ul class="param">
|
1830
|
+
|
1831
|
+
<li>
|
1832
|
+
|
1833
|
+
<span class='name'>raw_text</span>
|
1834
|
+
|
1835
|
+
|
1836
|
+
<span class='type'>(<tt>String</tt>, <tt>Object</tt>, <tt>nil</tt>)</span>
|
1837
|
+
|
1838
|
+
|
1839
|
+
|
1840
|
+
—
|
1841
|
+
<div class='inline'>
|
1842
|
+
<p>Text to strip.</p>
|
1843
|
+
</div>
|
1844
|
+
|
1845
|
+
</li>
|
1846
|
+
|
1847
|
+
</ul>
|
1848
|
+
|
1849
|
+
<p class="tag_title">Returns:</p>
|
1850
|
+
<ul class="return">
|
1851
|
+
|
1852
|
+
<li>
|
1853
|
+
|
1854
|
+
|
1855
|
+
<span class='type'>(<tt>String</tt>, <tt>nil</tt>)</span>
|
1856
|
+
|
1857
|
+
|
1858
|
+
|
1859
|
+
—
|
1860
|
+
<div class='inline'>
|
1861
|
+
<p>`nil` when <code>raw_text</code> is nil, else `String`.</p>
|
1862
|
+
</div>
|
1863
|
+
|
1864
|
+
</li>
|
1865
|
+
|
1866
|
+
</ul>
|
1867
|
+
|
1868
|
+
</div><table class="source_code">
|
1869
|
+
<tr>
|
1870
|
+
<td>
|
1871
|
+
<pre class="lines">
|
1872
|
+
|
1873
|
+
|
1874
|
+
42
|
1875
|
+
43
|
1876
|
+
44
|
1877
|
+
45
|
1878
|
+
46
|
1879
|
+
47
|
1880
|
+
48
|
1881
|
+
49
|
1882
|
+
50
|
1883
|
+
51
|
1884
|
+
52
|
1885
|
+
53</pre>
|
1886
|
+
</td>
|
1887
|
+
<td>
|
1888
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 42</span>
|
1889
|
+
|
1890
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_raw_text'>raw_text</span>
|
1891
|
+
<span class='kw'>return</span> <span class='kw'>nil</span> <span class='kw'>if</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span>
|
1892
|
+
<span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_to_s'>to_s</span> <span class='kw'>unless</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span> <span class='const'>String</span>
|
1893
|
+
<span class='id identifier rubyid_regex'>regex</span> <span class='op'>=</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>(\s|\u3000|\u00a0)+</span><span class='regexp_end'>/</span></span>
|
1894
|
+
<span class='id identifier rubyid_good_encoding'>good_encoding</span> <span class='op'>=</span> <span class='lparen'>(</span><span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=~</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>\u3000</span><span class='regexp_end'>/</span></span> <span class='op'>||</span> <span class='kw'>true</span><span class='rparen'>)</span> <span class='kw'>rescue</span> <span class='kw'>false</span>
|
1895
|
+
<span class='kw'>unless</span> <span class='id identifier rubyid_good_encoding'>good_encoding</span>
|
1896
|
+
<span class='id identifier rubyid_raw_text'>raw_text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='period'>.</span><span class='id identifier rubyid_force_encoding'>force_encoding</span><span class='lparen'>(</span><span class='gvar'>$APP_CONFIG</span><span class='lbracket'>[</span><span class='symbol'>:encoding</span><span class='rbracket'>]</span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_encode'>encode</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>'</span><span class='tstring_content'>UTF-8</span><span class='tstring_end'>'</span></span><span class='rparen'>)</span>
|
1897
|
+
<span class='id identifier rubyid_regex'>regex</span> <span class='op'>=</span> <span class='tstring'><span class='regexp_beg'>/</span><span class='tstring_content'>(\s|\u3000|\u00a0|\u00c2\u00a0)+</span><span class='regexp_end'>/</span></span>
|
1898
|
+
<span class='kw'>end</span>
|
1899
|
+
<span class='id identifier rubyid_text'>text</span> <span class='op'>=</span> <span class='id identifier rubyid_raw_text'>raw_text</span><span class='op'>&.</span><span class='id identifier rubyid_gsub'>gsub</span><span class='lparen'>(</span><span class='id identifier rubyid_regex'>regex</span><span class='comma'>,</span> <span class='tstring'><span class='tstring_beg'>'</span><span class='tstring_content'> </span><span class='tstring_end'>'</span></span><span class='rparen'>)</span><span class='op'>&.</span><span class='id identifier rubyid_strip'>strip</span>
|
1900
|
+
<span class='id identifier rubyid_text'>text</span><span class='period'>.</span><span class='id identifier rubyid_nil?'>nil?</span> <span class='op'>?</span> <span class='kw'>nil</span> <span class='op'>:</span> <span class='id identifier rubyid_decode_html'>decode_html</span><span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span><span class='rparen'>)</span>
|
1901
|
+
<span class='kw'>end</span></pre>
|
1902
|
+
</td>
|
1903
|
+
</tr>
|
1904
|
+
</table>
|
1905
|
+
</div>
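The whitespace handling in `strip` can be illustrated on its own. The sketch below is a simplified assumption: `normalize_whitespace` is a hypothetical name, and it leaves out the encoding fallback and the `decode_html` step shown in the source above, keeping only the regex that collapses ASCII whitespace, ideographic spaces (U+3000), and non-breaking spaces (U+00A0).

```ruby
# Hypothetical, reduced version of the whitespace normalization step.
def normalize_whitespace(raw_text)
  return nil if raw_text.nil?

  # Non-string values are converted first, as in the documented method.
  text = raw_text.is_a?(String) ? raw_text : raw_text.to_s
  # Collapse every run of whitespace-like characters into one space, then trim.
  text.gsub(/(\s|\u3000|\u00a0)+/, ' ').strip
end

cleaned = normalize_whitespace("  a\u3000\u00a0 b\n")
```

Listing U+3000 and U+00A0 explicitly matters because Ruby's `\s` only matches ASCII whitespace.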
|
1906
|
+
|
1907
|
+
<div class="method_details ">
|
1908
|
+
<h3 class="signature " id="translate_label_to_key-class_method">
|
1909
|
+
|
1910
|
+
.<strong>translate_label_to_key</strong>(element, label_map) ⇒ <tt>Symbol</tt>, <tt>String</tt>
|
1911
|
+
|
1912
|
+
|
1913
|
+
|
1914
|
+
|
1915
|
+
|
1916
|
+
</h3><div class="docstring">
|
1917
|
+
<div class="discussion">
|
1918
|
+
|
1919
|
+
<p>Extract column label and translate it into a friendly key.</p>
|
1920
|
+
|
1921
|
+
|
1922
|
+
</div>
|
1923
|
+
</div>
|
1924
|
+
<div class="tags">
|
1925
|
+
<p class="tag_title">Parameters:</p>
|
1926
|
+
<ul class="param">
|
1927
|
+
|
1928
|
+
<li>
|
1929
|
+
|
1930
|
+
<span class='name'>element</span>
|
1931
|
+
|
1932
|
+
|
1933
|
+
<span class='type'>(<tt>Nokogiri::Element</tt>)</span>
|
1934
|
+
|
1935
|
+
|
1936
|
+
|
1937
|
+
—
|
1938
|
+
<div class='inline'>
|
1939
|
+
<p>HTML element to parse.</p>
|
1940
|
+
</div>
|
1941
|
+
|
1942
|
+
</li>
|
1943
|
+
|
1944
|
+
<li>
|
1945
|
+
|
1946
|
+
<span class='name'>label_map</span>
|
1947
|
+
|
1948
|
+
|
1949
|
+
<span class='type'>(<tt>Hash{Symbol,String => Regexp,String}</tt>)</span>
|
1950
|
+
|
1951
|
+
|
1952
|
+
|
1953
|
+
—
|
1954
|
+
<div class='inline'>
|
1955
|
+
<p>Label dictionary for translation into key.</p>
|
1956
|
+
</div>
|
1957
|
+
|
1958
|
+
</li>
|
1959
|
+
|
1960
|
+
</ul>
|
1961
|
+
|
1962
|
+
<p class="tag_title">Returns:</p>
|
1963
|
+
<ul class="return">
|
1964
|
+
|
1965
|
+
<li>
|
1966
|
+
|
1967
|
+
|
1968
|
+
<span class='type'>(<tt>Symbol</tt>, <tt>String</tt>)</span>
|
1969
|
+
|
1970
|
+
|
1971
|
+
|
1972
|
+
—
|
1973
|
+
<div class='inline'>
|
1974
|
+
<p>Translated key.</p>
|
1975
|
+
</div>
|
1976
|
+
|
1977
|
+
</li>
|
1978
|
+
|
1979
|
+
</ul>
|
1980
|
+
|
1981
|
+
</div><table class="source_code">
|
1982
|
+
<tr>
|
1983
|
+
<td>
|
1984
|
+
<pre class="lines">
|
1985
|
+
|
1986
|
+
|
1987
|
+
131
|
1988
|
+
132
|
1989
|
+
133
|
1990
|
+
134
|
1991
|
+
135
|
1992
|
+
136
|
1993
|
+
137
|
1994
|
+
138</pre>
|
1995
|
+
</td>
|
1996
|
+
<td>
|
1997
|
+
<pre class="code"><span class="info file"># File 'lib/ae_easy/text.rb', line 131</span>
|
1998
|
+
|
1999
|
+
<span class='kw'>def</span> <span class='kw'>self</span><span class='period'>.</span><span class='id identifier rubyid_translate_label_to_key'>translate_label_to_key</span> <span class='id identifier rubyid_element'>element</span><span class='comma'>,</span> <span class='id identifier rubyid_label_map'>label_map</span>
|
2000
|
+
<span class='id identifier rubyid_element'>element</span><span class='op'>&.</span><span class='id identifier rubyid_search'>search</span><span class='lparen'>(</span><span class='tstring'><span class='tstring_beg'>'</span><span class='tstring_content'>//i</span><span class='tstring_end'>'</span></span><span class='rparen'>)</span><span class='period'>.</span><span class='id identifier rubyid_remove'>remove</span>
|
2001
|
+
<span class='id identifier rubyid_text'>text</span> <span class='op'>=</span> <span class='id identifier rubyid_strip'>strip</span> <span class='id identifier rubyid_element'>element</span><span class='op'>&.</span><span class='id identifier rubyid_text'>text</span>
|
2002
|
+
<span class='id identifier rubyid_key'>key</span> <span class='op'>=</span> <span class='id identifier rubyid_label_map'>label_map</span><span class='period'>.</span><span class='id identifier rubyid_find'>find</span> <span class='kw'>do</span> <span class='op'>|</span><span class='id identifier rubyid_k'>k</span><span class='comma'>,</span><span class='id identifier rubyid_v'>v</span><span class='op'>|</span>
|
2003
|
+
<span class='id identifier rubyid_v'>v</span><span class='period'>.</span><span class='id identifier rubyid_is_a?'>is_a?</span><span class='lparen'>(</span><span class='const'>Regexp</span><span class='rparen'>)</span> <span class='op'>?</span> <span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span> <span class='op'>=~</span> <span class='id identifier rubyid_v'>v</span><span class='rparen'>)</span> <span class='op'>:</span> <span class='lparen'>(</span><span class='id identifier rubyid_text'>text</span> <span class='op'>==</span> <span class='id identifier rubyid_v'>v</span><span class='rparen'>)</span>
|
2004
|
+
<span class='kw'>end</span><span class='op'>&.</span><span class='id identifier rubyid_first'>first</span>
|
2005
|
+
<span class='id identifier rubyid_key'>key</span>
|
2006
|
+
<span class='kw'>end</span></pre>
|
2007
|
+
</td>
|
2008
|
+
</tr>
|
2009
|
+
</table>
|
2010
|
+
</div>
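The label lookup in `translate_label_to_key` comes down to a `find` over the label map. The sketch below works on plain text rather than an HTML element, so it skips the element cleanup and the `strip` call; `label_to_key` is a hypothetical name used only for illustration.

```ruby
# Hypothetical text-only version of the label-to-key lookup.
def label_to_key(text, label_map)
  # Regexp values match by pattern, any other value by equality.
  # Hash#find yields [key, value] pairs, so `first` extracts the key.
  label_map.find do |_key, label|
    label.is_a?(Regexp) ? (text =~ label) : (text == label)
  end&.first
end

label_map = { name: /name/i, price: 'Price' }
name_key = label_to_key('Product Name', label_map)
price_key = label_to_key('Price', label_map)
missing_key = label_to_key('Other', label_map)
```

The first matching entry wins, so more specific patterns should appear earlier in the map than broad ones.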
|
2011
|
+
|
2012
|
+
</div>
|
2013
|
+
|
2014
|
+
</div>
|
2015
|
+
|
2016
|
+
<div id="footer">
|
2017
|
+
Generated on Tue Feb 26 16:50:03 2019 by
|
2018
|
+
<a href="http://yardoc.org" title="Yay! A Ruby Documentation Tool" target="_parent">yard</a>
|
2019
|
+
0.9.18 (ruby-2.5.3).
|
2020
|
+
</div>
|
2021
|
+
|
2022
|
+
</div>
|
2023
|
+
</body>
|
2024
|
+
</html>
|