geolocation_service 0.1.0 → 0.1.2

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 83121b2e17c3649b2f5f070dda4749a3efd7724f0a33de35ad857b7676df7727
-   data.tar.gz: 97bacb951601c4a2faf7042b3db24381a602aa9d35347142b745b8d46f0544fe
+   metadata.gz: 425afb60c0457918c20ce1dc77821fe51f72a5c3956baad224b2282628cf4ae7
+   data.tar.gz: 2b21fb3f283c1c9ae58d479c3ab1fa24ad8c989e1f741741f384f88f83c066f4
  SHA512:
-   metadata.gz: eef1ce0a4df76d0a350e99b0297f7171a2e5b5fb9011798cabe982f33ec3373163239c91c7c12bd5de3c2799e9975c7f43ce0c2bf197d7e5ad9b6f435ee57747
-   data.tar.gz: f8611c216add9ee4ab34dc8eb0c70006f71142ef777517825da5456927d14aa4c07e00276961e81d90be81140edd94ce7ce4ff20125769d39fc3b7a40568aef4
+   metadata.gz: f4fdd1994c0f4ba8fa758da9440a3510e679ea737d0431e1e312fffc34bfb925020a29b10a6574f14f82f63f8681a1b7fce541ce6012f1ba1ccd008c67afba77
+   data.tar.gz: 40636b1b37251ed3b00eed41246fc9cf238e5b78db80b1a5d3175ef432017876b0041c2a5f295591279c9cd77690f4657bbe3c9c20e6ead478d588e2dd82b147
data/README.md CHANGED
@@ -1,10 +1,20 @@
- # GeolocationService
- Short description and motivation.
+ # FindHotel Coding Challenge

- ## Usage
- How to use my plugin.
+ The FindHotel coding challenge consists of two parts, a library and a REST API application:
+
+ 1. A library with two main features:
+    * A service that parses the CSV file containing the raw data and persists it in a database;
+    * An interface to provide access to the geolocation data (model layer);
+ 2. A REST API that uses the aforementioned library to expose the geolocation data.
+
+ This repository contains my solution to the library. You can find my solution to the REST API application here: https://github.com/jalerson/geolocation_api.
+
+ ## Geolocation Service
+
+ The library was developed as a [Rails Engine gem](https://guides.rubyonrails.org/engines.html), which can be easily and seamlessly integrated into any Rails application.
+
+ ### Installation

- ## Installation
  Add this line to your application's Gemfile:

  ```ruby
@@ -12,17 +22,84 @@ gem 'geolocation_service'
  ```

  And then execute:
+
  ```bash
  $ bundle
  ```

- Or install it yourself as:
+ Install the gem's migrations:
+
  ```bash
- $ gem install geolocation_service
+ $ rails geolocation_service_engine:install:migrations
+ ```
+
+ Execute pending migrations:
+
+ ```bash
+ $ rake db:migrate
+ ```
+
+ ### Usage
+
+ The gem provides four new models: `Ip`, `City`, `Country` and `Location`.
+
+ ![Models and associations](https://i.ibb.co/Dbb2nTH/geocoding-service-erd.png)
+
+ To import data, use `GeolocationService::Services::ImportBulkDataService`.
+
+ ```ruby
+ GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')
+ ```
+
+ The service returns a [Dry::Monads::Result](https://dry-rb.org/gems/dry-monads/result/) indicating the success or failure of the import.
+
+ ```ruby
+ result = GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')
+
+ if result.success?
+   # do something...
+ else
+   # do something else...
+ end
  ```

- ## Contributing
- Contribution directions go here.
+ When the import succeeds, `result` wraps an instance of `GeolocationService::ImportResult`, which has:
+
+ - `imported_records`: number of imported records per model (`ip`, `city`, `country`, `location`)
+ - `invalid_records`: number of invalid records
+ - `time_consumed`: time consumed in seconds
+
+ When the import fails, `result` wraps an error/exception.
+
+ ```ruby
+ result = GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')
+
+ if result.success?
+   import_results = result.value!
+   Rails.logger.info "Records imported in #{import_results.time_consumed} seconds"
+ else
+   error = result.failure
+   Rails.logger.error error.message
+ end
+ ```
+
+ ### Design decisions
+
+ The main guidance for design decisions in this project was: **provide the best importing performance while keeping the database normalized**.
+
+ To achieve the best import performance with a normalized database, experiments were conducted to find the fastest way to (a) convert the CSV data into a representation that can be validated and stored in the database, (b) validate the data and (c) store the data in the database. The alternatives considered in each case were:
+
+ **(a) CSV data conversion:** ActiveRecord instances, simple Ruby classes (no ActiveRecord), Arrays/Hashes and Structs.
+
+ **(b) data validation:** [ActiveRecord validations](https://guides.rubyonrails.org/active_record_validations.html) or [contracts/schemas](https://dry-rb.org/gems/dry-validation/1.0/).
+
+ **(c) store into the database**: to keep performance as high as possible, the alternatives considered were limited to those that persist a set of records in a single `INSERT` statement: the [activerecord-import gem](https://github.com/zdennis/activerecord-import) or writing the SQL statements and sending them directly to the database.
+
+ After several experiments with different combinations, the chosen approach uses Structs, contracts/schemas and SQL statements sent directly to the database. This combination imported one million records in approximately 4 minutes while avoiding duplicates and keeping the code clean.
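
As an illustration of option (c), here is a minimal sketch of how a set of Struct rows could be turned into a single multi-row `INSERT` statement. It is not the gem's code: `CityRow`, `quote_literal`, `build_bulk_insert` and the `cities` table name are hypothetical names chosen for this example, and the quoting helper is deliberately simplistic. In a Rails application the connection's own `quote` method would be the safer choice.

```ruby
# Hypothetical row type for this sketch (id, name, country_id).
CityRow = Struct.new(:id, :name, :country_id)

# Simplistic SQL literal quoting, for illustration only.
def quote_literal(value)
  case value
  when nil then 'NULL'
  when Integer then value.to_s
  else
    escaped = value.to_s.gsub("'", "''") # double any single quotes
    "'#{escaped}'"
  end
end

# Builds one INSERT statement covering every row, instead of one per row.
def build_bulk_insert(table, columns, rows)
  tuples = rows.map do |row|
    "(#{columns.map { |column| quote_literal(row[column]) }.join(', ')})"
  end
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{tuples.join(', ')}"
end

rows = [CityRow.new(0, 'Amsterdam', 0), CityRow.new(1, "Sant'Agata", nil)]
sql = build_bulk_insert('cities', %i[id name country_id], rows)
puts sql
```

Sending one statement for many rows avoids a database round trip per record, which is the property the README's option (c) is after.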
+
+ ## Trade-offs
+
+ To keep import performance as high as possible, two design decisions have consequences that users need to be aware of.

- ## License
- The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+ - **Memory usage:** when importing a set of records, the service loads all existing records into memory alongside the new ones. This is how the service avoids creating duplicate records.
+ - **id (primary key) needs to be manually set:** to guarantee the proper relationship constraints between records in the database when importing records, the `id` must be set manually in all tables, except `locations`.
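
The two trade-offs above can be sketched together: an in-memory index keyed by a normalized name prevents duplicates, and a counter seeded from the highest existing id assigns primary keys by hand. This is a rough sketch loosely modelled on the `build_city` and `load_new_records` hunks in this diff, not the gem's implementation; `CityRecord` and `CityRegistry` are hypothetical names.

```ruby
# Hypothetical row type for this sketch (id, name, country_id).
CityRecord = Struct.new(:id, :name, :country_id)

class CityRegistry
  attr_reader :new_records

  # existing_rows stands in for the records already stored in the database.
  def initialize(existing_rows)
    @known = existing_rows.to_h { |row| [row.name.downcase, row] }
    # Manual primary keys: continue after the highest existing id.
    @next_id = existing_rows.empty? ? 0 : existing_rows.map(&:id).max + 1
    @new_records = {}
  end

  # Returns the already-known row, or builds a new one with the next id.
  def fetch_or_build(name, country_id)
    key = name.downcase
    return @known[key] if @known.key?(key)

    row = CityRecord.new(@next_id, name, country_id)
    @known[key] = row
    @new_records[key] = row
    @next_id += 1
    row
  end
end

registry = CityRegistry.new([CityRecord.new(4, 'Paris', 1)])
existing = registry.fetch_or_build('paris', 1)  # already known, no new row
created  = registry.fetch_or_build('Lyon', 1)   # new row with the next id
repeated = registry.fetch_or_build('LYON', 1)   # same row as `created`
```

The memory cost is visible here: every existing and new row stays in the hash for the whole import, which is what makes the duplicate check a constant-time lookup.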
@@ -43,10 +43,10 @@ module GeolocationService::Services

  GeolocationService::ImportResult.new(
    imported_records: {
-     ip: @ip_count,
-     city: @city_count,
-     country: @country_count,
-     location: @new_records[:location].count
+     ip: @new_records[:ip].values.size,
+     city: @new_records[:city].values.size,
+     country: @new_records[:country].values.size,
+     location: @new_records[:location].size
    },
    invalid_records: @invalid_records,
    time_consumed: (Time.zone.now - start_time)
@@ -74,8 +74,8 @@ module GeolocationService::Services
  def build_city(row, country)
    return if row['city'].blank?

-   if validate(:city, name: row['city'], country_code: country[:id]).success?
-     new_city = Structs::CityStruct.new(@city_count, row['city'], country[:id])
+   if validate(:city, name: row['city'], country_id: country&.id).success?
+     new_city = Structs::CityStruct.new(@city_count, row['city'], country&.id)
      @new_records[:city][row['city'].downcase] = new_city
      @city_count += 1
      return new_city
@@ -117,9 +117,9 @@ module GeolocationService::Services
  end

  def load_new_records
-   @ip_count = 0
-   @city_count = 0
-   @country_count = 0
+   @ip_count = Ip.count == 0 ? 0 : Ip.last.id + 1
+   @city_count = City.count == 0 ? 0 : City.last.id + 1
+   @country_count = Country.count == 0 ? 0 : Country.last.id + 1
    @invalid_records = 0
    @new_records = {ip: {}, location: [], city: {}, country: {}}
  end
@@ -1,3 +1,3 @@
  module GeolocationService
-   VERSION = '0.1.0'
+   VERSION = '0.1.2'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: geolocation_service
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.1.2
  platform: ruby
  authors:
  - Jalerson Lima
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2019-09-21 00:00:00.000000000 Z
+ date: 2019-09-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails