geolocation_service 0.1.0 → 0.1.2

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 83121b2e17c3649b2f5f070dda4749a3efd7724f0a33de35ad857b7676df7727
-   data.tar.gz: 97bacb951601c4a2faf7042b3db24381a602aa9d35347142b745b8d46f0544fe
+   metadata.gz: 425afb60c0457918c20ce1dc77821fe51f72a5c3956baad224b2282628cf4ae7
+   data.tar.gz: 2b21fb3f283c1c9ae58d479c3ab1fa24ad8c989e1f741741f384f88f83c066f4
  SHA512:
-   metadata.gz: eef1ce0a4df76d0a350e99b0297f7171a2e5b5fb9011798cabe982f33ec3373163239c91c7c12bd5de3c2799e9975c7f43ce0c2bf197d7e5ad9b6f435ee57747
-   data.tar.gz: f8611c216add9ee4ab34dc8eb0c70006f71142ef777517825da5456927d14aa4c07e00276961e81d90be81140edd94ce7ce4ff20125769d39fc3b7a40568aef4
+   metadata.gz: f4fdd1994c0f4ba8fa758da9440a3510e679ea737d0431e1e312fffc34bfb925020a29b10a6574f14f82f63f8681a1b7fce541ce6012f1ba1ccd008c67afba77
+   data.tar.gz: 40636b1b37251ed3b00eed41246fc9cf238e5b78db80b1a5d3175ef432017876b0041c2a5f295591279c9cd77690f4657bbe3c9c20e6ead478d588e2dd82b147
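
`checksums.yaml` records SHA256 and SHA512 digests for the two archives packed inside the `.gem` file, which is why every digest changes between releases. A minimal sketch of checking such a digest with Ruby's standard `digest` library might look like the following. `digest_matches?` is a hypothetical helper, not part of RubyGems, and it works on an in-memory string so that it runs without an unpacked gem.

```ruby
require 'digest'

# Hypothetical helper: compares the digest of `bytes` with an expected hex string.
# `algorithm` is :sha256 or :sha512, matching the two sections of checksums.yaml.
def digest_matches?(bytes, expected_hex, algorithm: :sha256)
  klass = algorithm == :sha512 ? Digest::SHA512 : Digest::SHA256
  klass.hexdigest(bytes) == expected_hex.downcase
end

# With an unpacked gem one would pass File.binread('data.tar.gz') instead.
payload  = 'example payload'
expected = Digest::SHA256.hexdigest(payload)

puts digest_matches?(payload, expected)       # digest of the same bytes
puts digest_matches?(payload + 'x', expected) # digest of altered bytes
```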
data/README.md CHANGED
@@ -1,10 +1,20 @@
- # GeolocationService
- Short description and motivation.
+ # FindHotel Coding Challenge
 
- ## Usage
- How to use my plugin.
+ The FindHotel coding challenge consists of two parts, a library and a REST API application:
+
+ 1. A library with two main features:
+    * A service that parses the CSV file containing the raw data and persists it in a database;
+    * An interface to provide access to the geolocation data (model layer);
+ 2. A REST API that uses the aforementioned library to expose the geolocation data.
+
+ This repository contains my solution to the library. You can find my solution to the REST API application here: https://github.com/jalerson/geolocation_api.
+
+ ## Geolocation Service
+
+ The library was developed as a [Rails Engine gem](https://guides.rubyonrails.org/engines.html), which can be easily and seamlessly integrated into any Rails application.
+
+ ### Installation
 
- ## Installation
  Add this line to your application's Gemfile:
 
  ```ruby
@@ -12,17 +22,84 @@ gem 'geolocation_service'
  ```
 
  And then execute:
+
  ```bash
  $ bundle
  ```
 
- Or install it yourself as:
+ Install the gem's migrations:
+
  ```bash
- $ gem install geolocation_service
+ $ rails geolocation_service_engine:install:migrations
+ ```
+
+ Execute pending migrations:
+
+ ```bash
+ $ rake db:migrate
+ ```
+
+ ### Usage
+
+ The gem provides four new models: `Ip`, `City`, `Country` and `Location`.
+
+ ![Models and associations](https://i.ibb.co/Dbb2nTH/geocoding-service-erd.png)
+
+ In order to import data, you must use the `GeolocationService::Services::ImportBulkDataService`.
+
+ ```ruby
+ GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')
+ ```
+
+ The service returns a [Dry::Monads::Result](https://dry-rb.org/gems/dry-monads/result/) indicating the success or failure of the import.
+
+ ```ruby
+ result = GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')
+
+ if result.success?
+   # do something...
+ else
+   # do something else...
+ end
  ```
 
- ## Contributing
- Contribution directions go here.
+ When the import succeeds, `result` contains an instance of `GeolocationService::ImportResult`, which has:
+
+ - `imported_records`: number of imported records per model (`ip`, `city`, `country`, `location`)
+ - `invalid_records`: number of invalid records
+ - `time_consumed`: time consumed in seconds
+
+ When the import fails, `result` contains an error/exception.
+
+ ```ruby
+ result = GeolocationService::Services::ImportBulkDataService.call(file_path: 'path/to/data_file.csv')
+
+ if result.success?
+   import_results = result.value!
+   Rails.logger.info "Records imported in #{import_results.time_consumed} seconds"
+ else
+   error = result.failure
+   Rails.logger.error error.message
+ end
+ ```
+
+ ### Design decisions
+
+ The guiding principle for design decisions in this project was: **provide the best importing performance while keeping the database normalized**.
+
+ To achieve the best importing performance with a normalized database, experiments were conducted to find the fastest option for (a) converting the CSV data into a representation that can be validated and stored in the database, (b) validating the data and (c) storing the data in the database. In each case, the alternatives considered were:
+
+ **(a) CSV data conversion:** ActiveRecord instances, simple Ruby classes (no ActiveRecord), Arrays/Hashes and Structs.
+
+ **(b) data validation:** [ActiveRecord validations](https://guides.rubyonrails.org/active_record_validations.html) or [contracts/schemas](https://dry-rb.org/gems/dry-validation/1.0/).
+
+ **(c) store into the database**: to keep performance as high as possible, the alternatives considered are limited to those that persist a set of records in a single `INSERT` statement: the [activerecord-import gem](https://github.com/zdennis/activerecord-import) or writing and sending the SQL statements to the database.
+
+ After several experiments with different combinations, the chosen approach uses Structs, contracts/schemas and SQL statements sent directly to the database. This combination performed well, importing one million records in approximately 4 minutes while avoiding duplicates and keeping the code clean.
+
+ ## Trade-offs
+
+ To keep importing performance as high as possible, two design decisions have consequences that users need to be aware of.
 
- ## License
- The gem is available as open source under the terms of the [MIT License](https://opensource.org/licenses/MIT).
+ - **Memory usage:** when importing a set of records, the service loads all existing records into memory and adds the new records to them. This is how the service avoids creating duplicate records.
+ - **id (primary key) needs to be manually set:** to guarantee the proper relationship constraints between records in the database when importing, the `id` must be set manually in all tables, except `locations`.
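
The combination described in the README (Structs, in-memory duplicate tracking, manually assigned ids and a single multi-row `INSERT`) can be illustrated with a small, database-free sketch. Everything named here is an assumption made for illustration: `CityRow`, `CityBuffer`, `to_insert_sql` and the `cities` table name are hypothetical and are not the gem's API, and the quoting is deliberately simplistic.

```ruby
# Hypothetical sketch, not the gem's actual code.
CityRow = Struct.new(:id, :name, :country_id)

class CityBuffer
  attr_reader :rows

  # next_id: first primary key to hand out. Ids are assigned in Ruby,
  # not by the database, so the caller decides where numbering starts.
  def initialize(next_id: 0)
    @next_id = next_id
    @rows = {} # downcased name => CityRow, used to skip duplicates
  end

  # Returns the row already registered for this name, or registers a new one.
  def add(name, country_id)
    key = name.downcase
    @rows[key] ||= begin
      row = CityRow.new(@next_id, name, country_id)
      @next_id += 1
      row
    end
  end

  # Builds one multi-row INSERT for everything buffered.
  # Quoting only doubles single quotes; real code should use the
  # database adapter's quoting instead.
  def to_insert_sql
    values = @rows.values.map do |row|
      quoted_name = row.name.gsub("'", "''")
      country = row.country_id.nil? ? 'NULL' : row.country_id
      "(#{row.id}, '#{quoted_name}', #{country})"
    end
    "INSERT INTO cities (id, name, country_id) VALUES #{values.join(', ')}"
  end
end

buffer = CityBuffer.new(next_id: 10)
buffer.add('Natal', 1)
buffer.add('natal', 1)            # same key after downcasing, so no new row
buffer.add("Coeur d'Alene", nil)  # no country, stored as NULL
puts buffer.to_insert_sql
```

The hash lookup is what makes duplicate detection cheap, and it is also the source of the memory trade-off: every known record has to be held in memory for the duration of the import.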
@@ -43,10 +43,10 @@ module GeolocationService::Services
 
  GeolocationService::ImportResult.new(
    imported_records: {
-     ip: @ip_count,
-     city: @city_count,
-     country: @country_count,
-     location: @new_records[:location].count
+     ip: @new_records[:ip].values.size,
+     city: @new_records[:city].values.size,
+     country: @new_records[:country].values.size,
+     location: @new_records[:location].size
    },
    invalid_records: @invalid_records,
    time_consumed: (Time.zone.now - start_time)
@@ -74,8 +74,8 @@ module GeolocationService::Services
  def build_city(row, country)
    return if row['city'].blank?
 
-   if validate(:city, name: row['city'], country_code: country[:id]).success?
-     new_city = Structs::CityStruct.new(@city_count, row['city'], country[:id])
+   if validate(:city, name: row['city'], country_id: country&.id).success?
+     new_city = Structs::CityStruct.new(@city_count, row['city'], country&.id)
      @new_records[:city][row['city'].downcase] = new_city
      @city_count += 1
      return new_city
@@ -117,9 +117,9 @@ module GeolocationService::Services
  end
 
  def load_new_records
-   @ip_count = 0
-   @city_count = 0
-   @country_count = 0
+   @ip_count = Ip.count == 0 ? 0 : Ip.last.id + 1
+   @city_count = City.count == 0 ? 0 : City.last.id + 1
+   @country_count = Country.count == 0 ? 0 : Country.last.id + 1
    @invalid_records = 0
    @new_records = {ip: {}, location: [], city: {}, country: {}}
  end
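
The hunk above changes where the id counters start: in 0.1.0 every import began numbering at 0, whereas 0.1.2 continues from the last existing id. Because the counters are no longer plain record counts, the `imported_records` totals are now taken from the sizes of the `@new_records` collections. A database-free sketch of why the starting point matters follows; `next_start_id` is a hypothetical helper and the array stands in for the primary keys already stored in a table.

```ruby
# Hypothetical helper mirroring `Model.count == 0 ? 0 : Model.last.id + 1`.
# `existing_ids` stands in for the primary keys already present in a table.
def next_start_id(existing_ids)
  existing_ids.empty? ? 0 : existing_ids.max + 1
end

existing  = [0, 1, 2]               # ids left behind by a first import
old_start = 0                       # 0.1.0 behaviour: always restart at 0
new_start = next_start_id(existing) # 0.1.2 behaviour: continue after the last id

old_ids = (old_start...(old_start + 2)).to_a
new_ids = (new_start...(new_start + 2)).to_a

puts "0.1.0 would reuse ids: #{(old_ids & existing).inspect}"
puts "0.1.2 assigns ids:     #{new_ids.inspect}"
```

Deriving the start from the current last id assumes that nothing else inserts rows into the same tables while an import is running.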
@@ -1,3 +1,3 @@
  module GeolocationService
-   VERSION = '0.1.0'
+   VERSION = '0.1.2'
  end
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: geolocation_service
  version: !ruby/object:Gem::Version
-   version: 0.1.0
+   version: 0.1.2
  platform: ruby
  authors:
  - Jalerson Lima
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2019-09-21 00:00:00.000000000 Z
+ date: 2019-09-22 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: rails