dh_easy 0.0.8

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+   metadata.gz: c0dc1ab1c61908614f1002368b5f2e97d21f5b0a090d371b310c13070e2e8f2e
4
+   data.tar.gz: 46d8272ab380fb644295ced234a3d261e1a4c947a0e6877306c6686b23a1cf7e
5
+ SHA512:
6
+   metadata.gz: cabc9a8c2b39147c40f1e814759900fa8d5972de976525764a78fd09542bfc9a5a6d561799dcd96e194051a7f4f96e82df81bdc39595f2241e6865bc95cfaba4
7
+   data.tar.gz: ae17f259f68dffc789cb25bea49cdb1d75aca073677bf029c22a50f7274ecc8aae924490d621a0fd64f568064851cd6b54aa2d2465d91ca5ecef09feafcec3a0
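The digests listed above can be compared against the archive members of a downloaded gem. The following is a minimal sketch, assuming `metadata.gz` and `data.tar.gz` have already been extracted from the `.gem` file (which is a plain tar archive). The helper name `checksum_matches?` is hypothetical and is not part of RubyGems.

```ruby
require 'digest'

# Hypothetical helper: compares a file's digest with an expected hex string.
# `algorithm` is a Digest class such as Digest::SHA256 or Digest::SHA512.
def checksum_matches?(path, expected_hex, algorithm: Digest::SHA256)
  algorithm.file(path).hexdigest == expected_hex.strip.downcase
end
```

For example, `checksum_matches?('data.tar.gz', expected)` would be called with the SHA256 value recorded for `data.tar.gz`, and with `algorithm: Digest::SHA512` for the longer digests.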
@@ -0,0 +1,13 @@
1
+ /.byebug*
2
+ /.bundle/
3
+ /.yardoc
4
+ /_yardoc/
5
+ /coverage/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
9
+ !/tmp/.keep
10
+ /certs/
11
+ /checksum/
12
+ /vendor/
13
+ /Gemfile.lock
@@ -0,0 +1,7 @@
1
+ ---
2
+ sudo: false
3
+ language: ruby
4
+ cache: bundler
5
+ rvm:
6
+ - 2.4.2
7
+ before_install: gem install bundler -v 1.16.3
@@ -0,0 +1 @@
1
+ --no-private
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at perry@datahen.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source "https://rubygems.org"
2
+
3
+ git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }
4
+
5
+ # Specify your gem's dependencies in dh_easy.gemspec
6
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2019 DataHen
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,267 @@
1
+ [![Gem Version](https://badge.fury.io/rb/dh_easy.svg)](http://github.com/DataHenOfficial/dh_easy/releases)
2
+ [![License](http://img.shields.io/badge/license-MIT-yellowgreen.svg)](#license)
3
+
4
+ # DhEasy
5
+ ## Description
6
+
7
+ The DhEasy gem collection enables advanced DataHen features by bundling a set of specialized gems.
8
+
9
+ Install gem:
10
+ ```shell
11
+ gem install dh_easy
12
+ ```
13
+
14
+ Require gem:
15
+ ```ruby
16
+ require 'dh_easy'
17
+ ```
18
+
19
+ Included gems documentation:
20
+ ```
21
+ dh_easy-core: http://rubydoc.org/gems/dh_easy-core/frames
22
+ dh_easy-config: http://rubydoc.org/gems/dh_easy-config/frames
23
+ dh_easy-router: http://rubydoc.org/gems/dh_easy-router/frames
24
+ dh_easy-text: http://rubydoc.org/gems/dh_easy-text/frames
25
+ dh_easy-login: http://rubydoc.org/gems/dh_easy-login/frames
26
+ ```
27
+
28
+ ## How to implement
29
+
30
+ ### Sample DataHen project
31
+
32
+ Let's take a simple project without `dh_easy`:
33
+
34
+ ```yaml
35
+ # ./config.yaml
36
+
37
+ seeder:
38
+   file: ./seeder/seeder.rb
39
+   disabled: false
40
+ parsers:
41
+   - page_type: search
42
+     file: ./parsers/search.rb
43
+     disabled: false
44
+   - page_type: product
45
+     file: ./parsers/product.rb
46
+     disabled: false
47
+ ```
48
+
49
+ ```ruby
50
+ # ./seeder/seeder.rb
51
+
52
+ pages << {
53
+ 'url' => 'https://example.com/search?query=food',
54
+ 'page_type' => 'search'
55
+ }
56
+ ```
57
+
58
+ ```ruby
59
+ # ./parsers/search.rb
60
+
61
+ require 'cgi'
62
+
63
+ html = Nokogiri.HTML content
64
+ html.css('.name').each do |element|
65
+ name = element.text.strip
66
+ pages << {
67
+ 'url' => "https://example.com/product/#{CGI::escape name}",
68
+ 'page_type' => 'product',
69
+ 'vars' => {'name' => name}
70
+ }
71
+ end
72
+ ```
73
+
74
+ ```ruby
75
+ # ./parsers/product.rb
76
+
77
+ html = Nokogiri.HTML content
78
+ description = html.css('.description').first.text.strip
79
+ outputs << {
80
+ '_collection' => 'product',
81
+ 'name' => page['vars']['name'],
82
+ 'description' => description
83
+ }
84
+ ```
85
+
86
+ ### Adding dh_easy to sample project
87
+
88
+ One of DhEasy's main features is letting users write classes instead of raw scripts, with the functions and objects of each `datahen` gem context (seeder, parsers, finishers, etc.) integrated directly into those classes.
89
+
90
+ Converting seeders, parsers and finishers to DhEasy-supported classes is straightforward: wrap each one in a class like this:
91
+
92
+ ```ruby
93
+ class MySeeder
94
+ include DhEasy::Core::Plugin::Seeder
95
+
96
+ # Create "initialize_hook_*" methods instead of "initialize" method
97
+ # to prevent overriding the logic behind DhEasy
98
+ def initialize_hook_my_seeder opts = {}
99
+ @my_param = opts[:my_param]
100
+ end
101
+
102
+ def seed
103
+
104
+ # Your seeder code goes here
105
+
106
+ end
107
+ end
108
+ ```
109
+
110
+ ```ruby
111
+ class MyParser
112
+ include DhEasy::Core::Plugin::Parser
113
+
114
+ # Create "initialize_hook_*" methods instead of "initialize" method
115
+ # to prevent overriding the logic behind DhEasy
116
+ def initialize_hook_my_parser opts = {}
117
+ @my_param = opts[:my_param]
118
+ end
119
+
120
+ def parse
121
+
122
+ # Your parser code goes here
123
+
124
+ end
125
+ end
126
+ ```
127
+
128
+ ```ruby
129
+ class MyFinisher
130
+ include DhEasy::Core::Plugin::Finisher
131
+
132
+ # Create "initialize_hook_*" methods instead of "initialize" method
133
+ # to prevent overriding the logic behind DhEasy
134
+ def initialize_hook_my_finisher opts = {}
135
+ @my_param = opts[:my_param]
136
+ end
137
+
138
+ def finish
139
+
140
+ # Your finisher code goes here
141
+
142
+ end
143
+ end
144
+ ```
145
+
146
+ You can also add `initialize_hook_` methods to extend the default `initialize` provided by DhEasy plugins.
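How such hooks could be collected is sketched below. This is an illustrative stand-in, not the actual `dh_easy-core` code: the `HookedInit` module is hypothetical and only assumes that every method whose name starts with `initialize_hook_` is called with the constructor options.

```ruby
# Hypothetical stand-in for the hook mechanism; not the dh_easy-core source.
module HookedInit
  def initialize(opts = {})
    # Call every public initialize_hook_* method on the object, in name order.
    hooks = methods.select { |name| name.to_s.start_with?('initialize_hook_') }
    hooks.sort.each { |hook| send(hook, opts) }
  end
end

class MySeeder
  include HookedInit

  attr_reader :my_param

  # Runs as part of HookedInit#initialize instead of overriding it.
  def initialize_hook_my_seeder(opts = {})
    @my_param = opts[:my_param]
  end
end

seeder = MySeeder.new(my_param: 'food')
puts seeder.my_param
```

Because the class never defines `initialize` itself, several modules or plugins can each contribute their own hook without clobbering one another, which is presumably why the hook naming convention is recommended.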
147
+
148
+ Now let's try this on our sample project's seeders and parsers:
149
+
150
+ ```ruby
151
+ # ./seeder/seeder.rb
152
+
153
+ module Seeder
154
+ class Seeder
155
+ include DhEasy::Core::Plugin::Seeder
156
+
157
+ def seed
158
+ pages << {
159
+ 'url' => 'https://example.com/search?query=food',
160
+ 'page_type' => 'search'
161
+ }
162
+ end
163
+ end
164
+ end
165
+ ```
166
+
167
+ ```ruby
168
+ # ./parsers/search.rb
169
+
170
+ module Parsers
171
+ class Search
172
+ include DhEasy::Core::Plugin::Parser
173
+
174
+ def parse
175
+ html = Nokogiri.HTML content
176
+ html.css('.name').each do |element|
177
+ name = element.text.strip
178
+ pages << {
179
+ 'url' => "https://example.com/product/#{CGI::escape name}",
180
+ 'page_type' => 'product',
181
+ 'vars' => {'name' => name}
182
+ }
183
+ end
184
+ end
185
+ end
186
+ end
187
+ ```
188
+
189
+ ```ruby
190
+ # ./parsers/product.rb
191
+
192
+ module Parsers
193
+ class Product
194
+ include DhEasy::Core::Plugin::Parser
195
+
196
+ def parse
197
+ html = Nokogiri.HTML content
198
+ description = html.css('.description').first.text.strip
199
+ outputs << {
200
+ '_collection' => 'product',
201
+ 'name' => page['vars']['name'],
202
+ 'description' => description
203
+ }
204
+ end
205
+ end
206
+ end
207
+ ```
208
+
209
+ The next step is to add router capabilities that consume these classes. To do this, create the routers and require the seeder and parser classes, like this:
210
+
211
+ ```ruby
212
+ # ./router/seeder.rb
213
+
214
+ require 'dh_easy/router'
215
+ require './seeder/seeder'
216
+
217
+ DhEasy::Router::Seeder.new.route context: self
218
+ ```
219
+
220
+ ```ruby
221
+ # ./router/parser.rb
222
+
223
+ require 'cgi'
224
+ require 'dh_easy/router'
225
+ require './parsers/search'
226
+ require './parsers/product'
227
+
228
+ DhEasy::Router::Parser.new.route context: self
229
+ ```
230
+
231
+ Now let's create our `./dh_easy.yaml` config file to link our routers to our new seeder and parser classes:
232
+
233
+ ```yaml
234
+ # ./dh_easy.yaml
235
+
236
+ router:
237
+   parser:
238
+     routes:
239
+       - page_type: search
240
+         class: Parsers::Search
241
+       - page_type: product
242
+         class: Parsers::Product
243
+
244
+   seeder:
245
+     routes:
246
+       - class: Seeder::Seeder
247
+ ```
248
+
249
+ Finally, we need to modify our `./config.yaml` to use our routers:
250
+
251
+ ```yaml
252
+ # ./config.yaml
253
+
254
+ seeder:
255
+   file: ./router/seeder.rb
256
+   disabled: false
257
+
258
+ parsers:
259
+   - page_type: search
260
+     file: ./router/parser.rb
261
+     disabled: false
262
+   - page_type: product
263
+     file: ./router/parser.rb
264
+     disabled: false
265
+ ```
266
+
267
+ Hurray! You have successfully implemented DhEasy in your project.
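To make the routing idea concrete, here is a minimal sketch of how a `page_type` could be mapped to a class from a YAML route list. It is only an illustration under stated assumptions: `route_class_for` is a hypothetical helper, the `Parsers` classes are empty stand-ins, and the real `DhEasy::Router::Parser` may resolve routes differently.

```ruby
require 'yaml'

# Empty stand-in parser classes so the sketch runs on its own.
module Parsers
  class Search; end
  class Product; end
end

# Same shape as the `router.parser` section of ./dh_easy.yaml.
CONFIG = YAML.safe_load(<<~YAML)
  router:
    parser:
      routes:
        - page_type: search
          class: Parsers::Search
        - page_type: product
          class: Parsers::Product
YAML

# Hypothetical helper: returns the class configured for a page type, or nil
# when no route matches.
def route_class_for(config, page_type)
  routes = config.dig('router', 'parser', 'routes') || []
  route = routes.find { |r| r['page_type'] == page_type }
  route && Object.const_get(route['class'])
end

puts route_class_for(CONFIG, 'product')
```

Resolving the class name with `Object.const_get` is why the parser files must be required before the router runs: the constant has to exist by the time a page of that type is routed.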
@@ -0,0 +1,22 @@
1
+ require 'benchmark'
2
+ require 'bundler/gem_tasks'
3
+ require 'rake/testtask'
4
+
5
+ Rake::TestTask.new do |t|
6
+ t.libs = ['lib', 'test']
7
+ t.warning = false
8
+ t.verbose = false
9
+ t.test_files = FileList['./test/**/*_test.rb']
10
+ end
11
+
12
+ desc 'Benchmark another task execution | usage example: benchmark[my_task, param1, param2]'
13
+ task :benchmark, [:task] do |task, args|
14
+ task_name = args[:task]
15
+ if task_name.nil?
16
+ puts "Should select a task."
17
+ exit 1
18
+ end
19
+ puts Benchmark.measure{ Rake::Task[task_name].invoke *args.extras }
20
+ end
21
+
22
+ task default: :test
@@ -0,0 +1,52 @@
1
+
2
+ lib = File.expand_path("../lib", __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require "dh_easy/version"
5
+
6
+ Gem::Specification.new do |spec|
7
+ spec.name = "dh_easy"
8
+ spec.version = DhEasy::VERSION
9
+ spec.authors = ["Eduardo Rosales"]
10
+ spec.email = ["eduardo@datahen.com"]
11
+
12
+ spec.summary = %q{DataHen Easy toolkit modules.}
13
+ spec.description = %q{DataHen Easy toolkit module collection.}
14
+ spec.homepage = "https://datahen.com"
15
+ spec.license = "MIT"
16
+
17
+ # spec.cert_chain = ['certs/dh_easy.pem']
18
+ # spec.signing_key = File.expand_path("~/.ssh/gems/gem-private_dh_easy.pem") if $0 =~ /gem\z/
19
+
20
+ # Prevent pushing this gem to RubyGems.org. To allow pushes either set the 'allowed_push_host'
21
+ # to allow pushing to a single host or delete this section to allow pushing to any host.
22
+ if spec.respond_to?(:metadata)
23
+ # spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
24
+
25
+ spec.metadata["homepage_uri"] = spec.homepage
26
+ spec.metadata["source_code_uri"] = "https://github.com/DataHenOfficial/dh_easy"
27
+ # spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
28
+ else
29
+ raise "RubyGems 2.0 or newer is required to protect against " \
30
+ "public gem pushes."
31
+ end
32
+
33
+ # Specify which files should be added to the gem when it is released.
34
+ # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
35
+ spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
36
+ `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
37
+ end
38
+ spec.require_paths = ["lib"]
39
+ spec.required_ruby_version = '>= 2.2.2'
40
+
41
+ spec.add_dependency 'dh_easy-core', '>= 0'
42
+ spec.add_dependency 'dh_easy-config', '>= 0'
43
+ spec.add_dependency 'dh_easy-text', '>= 0'
44
+ spec.add_dependency 'dh_easy-router', '>= 0'
45
+ spec.add_dependency 'dh_easy-login', '>= 0'
46
+ spec.add_development_dependency 'bundler', '>= 1'
47
+ spec.add_development_dependency 'rake', '~> 10'
48
+ spec.add_development_dependency 'minitest', '~> 5'
49
+ spec.add_development_dependency 'simplecov', '~> 0'
50
+ spec.add_development_dependency 'simplecov-console', '~> 0'
51
+ spec.add_development_dependency 'byebug', '>= 0'
52
+ end