dh_easy 0.0.8

@@ -0,0 +1,7 @@
1
+ ---
2
+ SHA256:
3
+   metadata.gz: c0dc1ab1c61908614f1002368b5f2e97d21f5b0a090d371b310c13070e2e8f2e
4
+   data.tar.gz: 46d8272ab380fb644295ced234a3d261e1a4c947a0e6877306c6686b23a1cf7e
5
+ SHA512:
6
+   metadata.gz: cabc9a8c2b39147c40f1e814759900fa8d5972de976525764a78fd09542bfc9a5a6d561799dcd96e194051a7f4f96e82df81bdc39595f2241e6865bc95cfaba4
7
+   data.tar.gz: ae17f259f68dffc789cb25bea49cdb1d75aca073677bf029c22a50f7274ecc8aae924490d621a0fd64f568064851cd6b54aa2d2465d91ca5ecef09feafcec3a0
@@ -0,0 +1,13 @@
1
+ /.byebug*
2
+ /.bundle/
3
+ /.yardoc
4
+ /_yardoc/
5
+ /coverage/
6
+ /pkg/
7
+ /spec/reports/
8
+ /tmp/
9
+ !/tmp/.keep
10
+ /certs/
11
+ /checksum/
12
+ /vendor/
13
+ /Gemfile.lock
@@ -0,0 +1,7 @@
1
+ ---
2
+ sudo: false
3
+ language: ruby
4
+ cache: bundler
5
+ rvm:
6
+ - 2.4.2
7
+ before_install: gem install bundler -v 1.16.3
@@ -0,0 +1 @@
1
+ --no-private
@@ -0,0 +1,74 @@
1
+ # Contributor Covenant Code of Conduct
2
+
3
+ ## Our Pledge
4
+
5
+ In the interest of fostering an open and welcoming environment, we as
6
+ contributors and maintainers pledge to making participation in our project and
7
+ our community a harassment-free experience for everyone, regardless of age, body
8
+ size, disability, ethnicity, gender identity and expression, level of experience,
9
+ nationality, personal appearance, race, religion, or sexual identity and
10
+ orientation.
11
+
12
+ ## Our Standards
13
+
14
+ Examples of behavior that contributes to creating a positive environment
15
+ include:
16
+
17
+ * Using welcoming and inclusive language
18
+ * Being respectful of differing viewpoints and experiences
19
+ * Gracefully accepting constructive criticism
20
+ * Focusing on what is best for the community
21
+ * Showing empathy towards other community members
22
+
23
+ Examples of unacceptable behavior by participants include:
24
+
25
+ * The use of sexualized language or imagery and unwelcome sexual attention or
26
+ advances
27
+ * Trolling, insulting/derogatory comments, and personal or political attacks
28
+ * Public or private harassment
29
+ * Publishing others' private information, such as a physical or electronic
30
+ address, without explicit permission
31
+ * Other conduct which could reasonably be considered inappropriate in a
32
+ professional setting
33
+
34
+ ## Our Responsibilities
35
+
36
+ Project maintainers are responsible for clarifying the standards of acceptable
37
+ behavior and are expected to take appropriate and fair corrective action in
38
+ response to any instances of unacceptable behavior.
39
+
40
+ Project maintainers have the right and responsibility to remove, edit, or
41
+ reject comments, commits, code, wiki edits, issues, and other contributions
42
+ that are not aligned to this Code of Conduct, or to ban temporarily or
43
+ permanently any contributor for other behaviors that they deem inappropriate,
44
+ threatening, offensive, or harmful.
45
+
46
+ ## Scope
47
+
48
+ This Code of Conduct applies both within project spaces and in public spaces
49
+ when an individual is representing the project or its community. Examples of
50
+ representing a project or community include using an official project e-mail
51
+ address, posting via an official social media account, or acting as an appointed
52
+ representative at an online or offline event. Representation of a project may be
53
+ further defined and clarified by project maintainers.
54
+
55
+ ## Enforcement
56
+
57
+ Instances of abusive, harassing, or otherwise unacceptable behavior may be
58
+ reported by contacting the project team at perry@datahen.com. All
59
+ complaints will be reviewed and investigated and will result in a response that
60
+ is deemed necessary and appropriate to the circumstances. The project team is
61
+ obligated to maintain confidentiality with regard to the reporter of an incident.
62
+ Further details of specific enforcement policies may be posted separately.
63
+
64
+ Project maintainers who do not follow or enforce the Code of Conduct in good
65
+ faith may face temporary or permanent repercussions as determined by other
66
+ members of the project's leadership.
67
+
68
+ ## Attribution
69
+
70
+ This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71
+ available at [http://contributor-covenant.org/version/1/4][version]
72
+
73
+ [homepage]: http://contributor-covenant.org
74
+ [version]: http://contributor-covenant.org/version/1/4/
data/Gemfile ADDED
@@ -0,0 +1,6 @@
1
+ source "https://rubygems.org"
2
+
3
+ git_source(:github) {|repo_name| "https://github.com/#{repo_name}" }
4
+
5
+ # Specify your gem's dependencies in dh_easy.gemspec
6
+ gemspec
data/LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2019 DataHen
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
@@ -0,0 +1,267 @@
1
+ [![Gem Version](https://badge.fury.io/rb/dh_easy.svg)](http://github.com/DataHenOfficial/dh_easy/releases)
2
+ [![License](http://img.shields.io/badge/license-MIT-yellowgreen.svg)](#license)
3
+
4
+ # DhEasy
5
+ ## Description
6
+
7
+ The DhEasy gem collection makes advanced DataHen features possible by bundling a set of specialized gems.
8
+
9
+ Install gem:
10
+ ```shell
11
+ gem install dh_easy
12
+ ```
13
+
14
+ Require gem:
15
+ ```ruby
16
+ require 'dh_easy'
17
+ ```
18
+
19
+ Included gems documentation:
20
+ ```
21
+ dh_easy-core: http://rubydoc.org/gems/dh_easy-core/frames
22
+ dh_easy-config: http://rubydoc.org/gems/dh_easy-config/frames
23
+ dh_easy-router: http://rubydoc.org/gems/dh_easy-router/frames
24
+ dh_easy-text: http://rubydoc.org/gems/dh_easy-text/frames
25
+ dh_easy-login: http://rubydoc.org/gems/dh_easy-login/frames
26
+ ```
27
+
28
+ ## How to implement
29
+
30
+ ### Sample DataHen project
31
+
32
+ Let's take a simple project without `dh_easy`:
33
+
34
+ ```yaml
35
+ # ./config.yaml
36
+
37
+ seeder:
38
+   file: ./seeder/seeder.rb
39
+   disabled: false
40
+ parsers:
41
+   - page_type: search
42
+     file: ./parsers/search.rb
43
+     disabled: false
44
+   - page_type: product
45
+     file: ./parsers/product.rb
46
+     disabled: false
47
+ ```
48
+
49
+ ```ruby
50
+ # ./seeder/seeder.rb
51
+
52
+ pages << {
53
+   'url' => 'https://example.com/search?query=food',
54
+   'page_type' => 'search'
55
+ }
56
+ ```
57
+
58
+ ```ruby
59
+ # ./parsers/search.rb
60
+
61
+ require 'cgi'
62
+
63
+ html = Nokogiri.HTML content
64
+ html.css('.name').each do |element|
65
+   name = element.text.strip
66
+   pages << {
67
+     'url' => "https://example.com/product/#{CGI::escape name}",
68
+     'page_type' => 'product',
69
+     'vars' => {'name' => name}
70
+   }
71
+ end
72
+ ```
73
+
74
+ ```ruby
75
+ # ./parsers/product.rb
76
+
77
+ html = Nokogiri.HTML content
78
+ description = html.css('.description').first.text.strip
79
+ outputs << {
80
+   '_collection' => 'product',
81
+   'name' => page['vars']['name'],
82
+   'description' => description
83
+ }
84
+ ```
85
+
86
+ ### Adding dh_easy to sample project
87
+
88
+ One of DhEasy's main features is that it lets users write classes instead of raw scripts, with the functions and objects of each `datahen` gem context (seeder, parsers, finishers, etc.) integrated directly into those classes.
89
+
90
+ Converting seeders, parsers and finishers to DhEasy-supported classes is straightforward. Just wrap each one in a class like this:
91
+
92
+ ```ruby
93
+ class MySeeder
94
+   include DhEasy::Core::Plugin::Seeder
95
+
96
+   # Create "initialize_hook_*" methods instead of "initialize" method
97
+   # to prevent overriding the logic behind DhEasy
98
+   def initialize_hook_my_seeder opts = {}
99
+     @my_param = opts[:my_param]
100
+   end
101
+
102
+   def seed
103
+
104
+     # Your seeder code goes here
105
+
106
+   end
107
+ end
108
+ ```
109
+
110
+ ```ruby
111
+ class MyParser
112
+   include DhEasy::Core::Plugin::Parser
113
+
114
+   # Create "initialize_hook_*" methods instead of "initialize" method
115
+   # to prevent overriding the logic behind DhEasy
116
+   def initialize_hook_my_parser opts = {}
117
+     @my_param = opts[:my_param]
118
+   end
119
+
120
+   def parse
121
+
122
+     # Your parser code goes here
123
+
124
+   end
125
+ end
126
+ ```
127
+
128
+ ```ruby
129
+ class MyFinisher
130
+   include DhEasy::Core::Plugin::Finisher
131
+
132
+   # Create "initialize_hook_*" methods instead of "initialize" method
133
+   # to prevent overriding the logic behind DhEasy
134
+   def initialize_hook_my_finisher opts = {}
135
+     @my_param = opts[:my_param]
136
+   end
137
+
138
+   def finish
139
+
140
+     # Your finisher code goes here
141
+
142
+   end
143
+ end
144
+ ```
145
+
146
+ You can also add `initialize_hook_` methods to extend the default `initialize` provided by DhEasy plugins.
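
To illustrate the hook idea described above, here is a minimal, self-contained sketch. It does not use the `dh_easy` gems: `HookedInitialize` is a hypothetical stand-in module, and the real DhEasy plugins may discover and order their hooks differently. It only shows how an included module can own `initialize` while still letting a class contribute setup code through `initialize_hook_*` methods.

```ruby
# Hypothetical stand-in for a DhEasy plugin module; NOT the gem's real code.
module HookedInitialize
  def initialize(opts = {})
    # Find every public method named initialize_hook_* and call it,
    # in alphabetical order, passing the same options hash to each.
    hook_names = methods.select { |name| name.to_s.start_with?('initialize_hook_') }.sort
    hook_names.each { |name| send(name, opts) }
  end
end

class MySeeder
  include HookedInitialize
  attr_reader :my_param

  # Runs as part of the module's initialize instead of replacing it.
  def initialize_hook_my_seeder(opts = {})
    @my_param = opts[:my_param]
  end
end

seeder = MySeeder.new(my_param: 'food')
puts seeder.my_param # prints "food"
```

Because the class never defines its own `initialize`, whatever setup the included module performs stays intact, which is the reason the comments in the examples above recommend hooks.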
147
+
148
+ Now let's try this on our sample project's seeders and parsers:
149
+
150
+ ```ruby
151
+ # ./seeder/seeder.rb
152
+
153
+ module Seeder
154
+   class Seeder
155
+     include DhEasy::Core::Plugin::Seeder
156
+
157
+     def seed
158
+       pages << {
159
+         'url' => 'https://example.com/search?query=food',
160
+         'page_type' => 'search'
161
+       }
162
+     end
163
+   end
164
+ end
165
+ ```
166
+
167
+ ```ruby
168
+ # ./parsers/search.rb
169
+
170
+ module Parsers
171
+   class Search
172
+     include DhEasy::Core::Plugin::Parser
173
+
174
+     def parse
175
+       html = Nokogiri.HTML content
176
+       html.css('.name').each do |element|
177
+         name = element.text.strip
178
+         pages << {
179
+           'url' => "https://example.com/product/#{CGI::escape name}",
180
+           'page_type' => 'product',
181
+           'vars' => {'name' => name}
182
+         }
183
+       end
184
+     end
185
+   end
186
+ end
187
+ ```
188
+
189
+ ```ruby
190
+ # ./parsers/product.rb
191
+
192
+ module Parsers
193
+   class Product
194
+     include DhEasy::Core::Plugin::Parser
195
+
196
+     def parse
197
+       html = Nokogiri.HTML content
198
+       description = html.css('.description').first.text.strip
199
+       outputs << {
200
+         '_collection' => 'product',
201
+         'name' => page['vars']['name'],
202
+         'description' => description
203
+       }
204
+     end
205
+   end
206
+ end
207
+ ```
208
+
209
+ The next step is to add routers that consume these classes. To do this, create the router scripts and require the seeder and parser classes, like this:
210
+
211
+ ```ruby
212
+ # ./router/seeder.rb
213
+
214
+ require 'dh_easy/router'
215
+ require './seeder/seeder'
216
+
217
+ DhEasy::Router::Seeder.new.route context: self
218
+ ```
219
+
220
+ ```ruby
221
+ # ./router/parser.rb
222
+
223
+ require 'cgi'
224
+ require 'dh_easy/router'
225
+ require './parsers/search'
226
+ require './parsers/product'
227
+
228
+ DhEasy::Router::Parser.new.route context: self
229
+ ```
230
+
231
+ Now let's create our `./dh_easy.yaml` config file to link our routers to our new seeder and parser classes:
232
+
233
+ ```yaml
234
+ # ./dh_easy.yaml
235
+
236
+ router:
237
+   parser:
238
+     routes:
239
+       - page_type: search
240
+         class: Parsers::Search
241
+       - page_type: product
242
+         class: Parsers::Product
243
+
244
+   seeder:
245
+     routes:
246
+       - class: Seeder::Seeder
247
+ ```
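
As a rough illustration of what a route table like this implies, the sketch below resolves a `page_type` to a class name read from YAML and then instantiates that class. It is a simplified assumption about the dispatch idea, not the `dh_easy-router` implementation: `route_parser`, the inline config, and the stub `Parsers` classes are all hypothetical, and the real router also passes the DataHen context into each class.

```ruby
require 'yaml'

# Stub parser classes standing in for the project's real ones.
module Parsers
  class Search
    def parse
      'parsed search page'
    end
  end

  class Product
    def parse
      'parsed product page'
    end
  end
end

# Inline copy of the parser routes, so the sketch runs without any files.
CONFIG = YAML.safe_load(<<~YAML)
  router:
    parser:
      routes:
        - page_type: search
          class: Parsers::Search
        - page_type: product
          class: Parsers::Product
YAML

# Hypothetical helper: look up the route for a page type and run its class.
def route_parser(page_type)
  routes = CONFIG.dig('router', 'parser', 'routes')
  route = routes.find { |r| r['page_type'] == page_type }
  raise ArgumentError, "no route for page_type #{page_type.inspect}" if route.nil?

  # Turn the class name string from the config into the actual class.
  Object.const_get(route['class']).new.parse
end

puts route_parser('product') # prints "parsed product page"
```

Keeping the mapping in YAML means a new page type only needs a new class and one extra route entry, while the router script itself stays unchanged.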
248
+
249
+ Finally, we need to modify our `./config.yaml` to use our routers:
250
+
251
+ ```yaml
252
+ # ./config.yaml
253
+
254
+ seeder:
255
+   file: ./router/seeder.rb
256
+   disabled: false
257
+
258
+ parsers:
259
+   - page_type: search
260
+     file: ./router/parser.rb
261
+     disabled: false
262
+   - page_type: product
263
+     file: ./router/parser.rb
264
+     disabled: false
265
+ ```
266
+
267
+ Hurray! You have successfully implemented DhEasy in your project.
@@ -0,0 +1,22 @@
1
+ require 'benchmark'
2
+ require 'bundler/gem_tasks'
3
+ require 'rake/testtask'
4
+
5
+ Rake::TestTask.new do |t|
6
+   t.libs = ['lib', 'test']
7
+   t.warning = false
8
+   t.verbose = false
9
+   t.test_files = FileList['./test/**/*_test.rb']
10
+ end
11
+
12
+ desc 'Benchmark another task execution | usage example: benchmark[my_task, param1, param2]'
13
+ task :benchmark, [:task] do |task, args|
14
+   task_name = args[:task]
15
+   if task_name.nil?
16
+     puts "Should select a task."
17
+     exit 1
18
+   end
19
+   puts Benchmark.measure { Rake::Task[task_name].invoke(*args.extras) }
20
+ end
21
+
22
+ task default: :test
@@ -0,0 +1,52 @@
1
+
2
+ lib = File.expand_path("../lib", __FILE__)
3
+ $LOAD_PATH.unshift(lib) unless $LOAD_PATH.include?(lib)
4
+ require "dh_easy/version"
5
+
6
+ Gem::Specification.new do |spec|
7
+   spec.name = "dh_easy"
8
+   spec.version = DhEasy::VERSION
9
+   spec.authors = ["Eduardo Rosales"]
10
+   spec.email = ["eduardo@datahen.com"]
11
+
12
+   spec.summary = %q{DataHen Easy toolkit modules.}
13
+   spec.description = %q{DataHen Easy toolkit module collection.}
14
+   spec.homepage = "https://datahen.com"
15
+   spec.license = "MIT"
16
+
17
+   # spec.cert_chain = ['certs/dh_easy.pem']
18
+   # spec.signing_key = File.expand_path("~/.ssh/gems/gem-private_dh_easy.pem") if $0 =~ /gem\z/
19
+
20
+   # Prevent pushing this gem to RubyGems.org. To allow pushes either set the 'allowed_push_host'
21
+   # to allow pushing to a single host or delete this section to allow pushing to any host.
22
+   if spec.respond_to?(:metadata)
23
+     # spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
24
+
25
+     spec.metadata["homepage_uri"] = spec.homepage
26
+     spec.metadata["source_code_uri"] = "https://github.com/DataHenOfficial/dh_easy"
27
+     # spec.metadata["allowed_push_host"] = "TODO: Set to 'http://mygemserver.com'"
28
+   else
29
+     raise "RubyGems 2.0 or newer is required to protect against " \
30
+       "public gem pushes."
31
+   end
32
+
33
+   # Specify which files should be added to the gem when it is released.
34
+   # The `git ls-files -z` loads the files in the RubyGem that have been added into git.
35
+   spec.files = Dir.chdir(File.expand_path('..', __FILE__)) do
36
+     `git ls-files -z`.split("\x0").reject { |f| f.match(%r{^(test|spec|features)/}) }
37
+   end
38
+   spec.require_paths = ["lib"]
39
+   spec.required_ruby_version = '>= 2.2.2'
40
+
41
+   spec.add_dependency 'dh_easy-core', '>= 0'
42
+   spec.add_dependency 'dh_easy-config', '>= 0'
43
+   spec.add_dependency 'dh_easy-text', '>= 0'
44
+   spec.add_dependency 'dh_easy-router', '>= 0'
45
+   spec.add_dependency 'dh_easy-login', '>= 0'
46
+   spec.add_development_dependency 'bundler', '>= 1'
47
+   spec.add_development_dependency 'rake', '~> 10'
48
+   spec.add_development_dependency 'minitest', '~> 5'
49
+   spec.add_development_dependency 'simplecov', '~> 0'
50
+   spec.add_development_dependency 'simplecov-console', '~> 0'
51
+   spec.add_development_dependency 'byebug', '>= 0'
52
+ end