embulk-filter-mssql_lookup 0.1.3 → 0.1.4

checksums.yaml CHANGED
@@ -1,7 +1,7 @@
  ---
  SHA256:
-   metadata.gz: 8aa20f2a39bd348bd8944b1870eaaa1a75bcdd38cf9121d54f4874eec161c4ab
-   data.tar.gz: b7289a54a01a8e03513712ed669368b3528ad5fac12f8ddc76489e3d0e0aee5b
+   metadata.gz: 943d846e1cd6d2971f5caa0e34567b9a3ada60433974d8e365377ec8fbf96644
+   data.tar.gz: 19b8c8f9f49c5e2f5234089dca6b9ac3a2e1b8a8b0ed09f2d24326599b7780ab
  SHA512:
-   metadata.gz: e0a88889cb88f60ebdb82a509e248136ba3f050a59c5b3c44af3831047c0a055fd02c4ee272a4c365b125a816575a507a3dc5de1365143846c190d3a3691717d
-   data.tar.gz: 54a3f648cc4753423b03eb88f8d03aa1cd0544cdfe1ecbd4df0939cc7867408dc2abcab51a74794ae37cdffb29fd2cccc8e2b5047b0c7e466446c76623c6c271
+   metadata.gz: 9537885d100efde4348394107dd7147ed4cf487e5c0a8af2eaa09b2782bd58b80c6e311a7d8062b5df07b2c7191e7be3b642c85facbbc591035491e73d7f59bb
+   data.tar.gz: 5e563167b6676531452ede2bfbd082afda4e7751dce0a85b8a6869ab91178002f4a1545453a0653ad0bde3d6d87c76bbaef9d68a75d872793015071367b4fe29
data/README.md CHANGED
@@ -13,7 +13,7 @@ An Embulk filter plugin for Lookup Transformation with MS-SQL database
 
  - **mssql_lookup**: Required attributes for the LookUp Filter Plugin -
  - **filters**:
- - **type**: Name of lookup type (required)
+ - **type**: Name of lookup type (required)
  - **driver_path**: driver path of the driver which has been installed in the system
  - **driver_class**: driver class name
  - **host**: database host (example `localhost`) (required)
@@ -30,92 +30,107 @@ An Embulk filter plugin for Lookup Transformation with MS-SQL database
  - **Name of column-1**: column name-1 from input file
  - **Name of column-2**: column name-2 from input file
  - **new_columns**: (New generated column names) (required)
- - **Name-1,Type-1**: Any Name, Type of the name (name: country_name, type: string)
- - **Name-2,Type-2**: Any Name, Type of the name (name: country_address, type: string) etc ...
+ - **Name-1,Type-1**: Any Name, Type of the name { name: car_id, type: string }
+ - **Name-2,Type-2**: Any Name, Type of the name { name: category, type: string }
+ - **Name-3,Type-3**: Any Name, Type of the name { name: fuel_capacity, type: string } etc ...
  ## Example - columns
 
- Input1 for table 1 is as follows :-
+ Customer.csv for table 1 is as follows :-
 
  ```
- year country_code country_name literacy_rate
-
- 1990 1 India 80%
- 1993 2 USA 83%
- 1997 3 JAPAN
- 1999 4 China 72%
- 2000 5 Ukraine 68%
- 2002 6 Italy 79%
- 2004 7 UK 75%
- 2011 8 NULL 42%
+ id customer_name address email car_name company
+ 1 John Doe 123 Main St, Anytown USA john.doe@example.com Civic Honda
+ 2 Jane Smith 456 Elm St, Anytown USA jane.smith@example.com E-Class Mercedes-Benz
+ 3 Bob Johnson 789 Oak St, Anytown USA bob.johnson@example.com GLE-Class Mercedes-Benz
+ 4 Amanda Hernandez 999 Cedar St, Anytown USA amanda.hernandez@example.com 911 119
+ 5 Tom Brown 567 Pine St, Anytown USA tom.brown@example.com C-Class Mercedes-Benz
+ 6 Samantha Davis 890 Cedar St, Anytown USA samantha.davis@example.com Civic Honda
+ 7 Mike Wilson 1234 Spruce St, Anytown USA mike.wilson@example.com GLE-Class Mercedes-Benz
+ 8 Jason Brown 888 Pine St, Anytown USA jason.brown@example.com 911 Porsche
+ 9 David Rodriguez 9010 Oak St, Anytown USA david.rodriguez@example.com GLC-Class Mercedes-Benz
+ 10 Mark Davis 666 Spruce St, Anytown USA mark.davis@example.com C-Class Mercedes-Benz
+ 11 Chris Thompson 222 Cedar St, Anytown USA chris.thompson@example.com Cayenne Porsche
+ 12 Linda Young 555 Birch St, Anytown USA linda.young@example.com RAV4
+ 13 Kevin Hernandez 444 Maple St, Anytown USA kevin.hernandez@example.com 911 119
  ```
 
- Input2 for table 2 is as follows :-
+ Car.csv for table 2 is as follows :-
 
  ```
- id country_population country_address country_GDP
-
- 1 11.3 India 1.67
- 2 18.2 USA 16.72
- 3 30 JAPAN 5.00
- 4 4 China 9.33
- 5 57 Ukraine 1.08
- 6 63 Italy 2.068
- 7 17 UK 2.49
- 8 28 UAE 1.18
-
-
- Note: country_population is calculated in Billion and country_GDP is calculated in $USD Trillion
+ car_id model brand category fuel_capacity
+ 87 GLE-Class Mercedes-Benz SUV 80
+ 101 Cayenne Porsche SUV 75
+ 119 911 Porsche Sports Car 64
+ 205 Accord Honda Sedan 56
+ 334 Pilot Honda SUV 70
+ 434 CR-v Honda SUV 64
+ 559 C-Class Mercedes-Benz Sedan 66
+ 603 Civic Honda Sedan 42
+ 697 E-Class Mercedes-Benz Sedan 72
+ 812 GLC-Class Mercedes-Benz Sedan 68
+
+
  ```
 
  As shown in the YAML below, the columns listed in mapping_from are matched against the columns listed in mapping_to,
  i.e.:
 
-
- country_code : id
- country_name : country_address
+ car_name : model
+ company : brand
 
  After successful mapping, an Output.csv file containing the columns listed in new_columns is generated.
 
  Output file generated:
 
  ```
- year country_code country_name literacy_rate country_GDP country_population
-
- 1990 1 India 80% 1.67 11.3
- 1993 2 USA 83% 16.72 18.2
- 1997 3 JAPAN 5.00 30
- 1999 4 China 72% 9.33 4
- 2000 5 Ukraine 68% 1.08 57
- 2002 6 Italy 79% 2.068 63
- 2004 7 UK 75% 2.49 17
- 2011 8 NULL 42%
+ id customer_name address email car_name company car_id category fuel_capacity
+ 1 John Doe 123 Main St, Anytown USA john.doe@example.com Civic Honda 603 Sedan 42
+ 2 Jane Smith 456 Elm St, Anytown USA jane.smith@example.com E-Class Mercedes-Benz 697 Sedan 72
+ 3 Bob Johnson 789 Oak St, Anytown USA bob.johnson@example.com GLE-Class Mercedes-Benz 87 SUV 80
+ 4 Amanda Hernandez 999 Cedar St, Anytown USA amanda.hernandez@example.com 911 119 0
+ 5 Tom Brown 567 Pine St, Anytown USA tom.brown@example.com C-Class Mercedes-Benz 559 Sedan 66
+ 6 Samantha Davis 890 Cedar St, Anytown USA samantha.davis@example.com Civic Honda 603 Sedan 42
+ 7 Mike Wilson 1234 Spruce St, Anytown USA mike.wilson@example.com GLE-Class Mercedes-Benz 87 SUV 80
+ 8 Jason Brown 888 Pine St, Anytown USA jason.brown@example.com 911 Porsche 119 Sports Car 64
+ 9 David Rodriguez 9010 Oak St, Anytown USA david.rodriguez@example.com GLC-Class Mercedes-Benz 812 Sedan 68
+ 10 Mark Davis 666 Spruce St, Anytown USA mark.davis@example.com C-Class Mercedes-Benz 559 Sedan 66
+ 11 Chris Thompson 222 Cedar St, Anytown USA chris.thompson@example.com Cayenne Porsche 101 SUV 75
+ 12 Linda Young 555 Birch St, Anytown USA linda.young@example.com RAV4 \N 0
+ 13 Kevin Hernandez 444 Maple St, Anytown USA kevin.hernandez@example.com 911 119 0
+
  ```
 
  ```yaml
- type: mssql_lookup
- driver_path: /home/infoobjects/Downloads/sqljdbc_1s1.2.0.0_enu/sqljdbc_11.2/enu/mssql-jdbc-11.2.0.jre8.jar
- driver_class: com.microsoft.sqlserver.jdbc.SQLServerDriver
- host: localhost
- port: 1433
- database: test
- schemaName: testSchema1
- table_name: country_details
- username: root
- password: root
- mapping_from:
-   - country_code
-   - country_name
- mapping_to:
-   - id
-   - country_address
- new_columns:
-   - { name: country_GDP, type: string }
-   - { name: country_population, type: string }
+ filters:
+   - type: sqlserver
+     host: Localhost
+     driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+     username: "user"
+     password: "password"
+     database: "test"
+     table_name: car
+     mapping_from:
+       - car_name
+       - company
+     mapping_to:
+       - model
+       - brand
+     new_columns:
+       - { name: car_id, type: string }
+       - { name: category, type: string }
+       - { name: fuel_capacity, type: string }
  ```
 
  Notes:
  1. The mapping_from attributes should be listed in the same order as the columns appear in the input file.
 
+ 2. When using the jdbc input plugin, provide this attribute in the input plugin if the column data type is Number:
+ ```
+ column_options:
+   id: {value_type: long}
+ ```
+ 3. The data types of matching columns must be int, long, or String.
+
  ## Development
 
  Run example:
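
The lookup that the README describes (a composite key taken from the `mapping_from` columns, matched against the `mapping_to` columns, with the `new_columns` values appended) can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the plugin's implementation: the class `LookupSketch`, its method names, the `|` key separator, and the in-memory rows are all hypothetical, and the real plugin reads the lookup table over JDBC.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a composite-key lookup; not the plugin's code.
class LookupSketch {
    // Assumed separator for joining key parts; chosen for illustration only.
    private static final String SEP = "|";

    // Index the lookup table by (model, brand), storing the new-column values.
    static Map<String, List<String>> buildIndex(List<String[]> carRows) {
        Map<String, List<String>> index = new HashMap<>();
        for (String[] row : carRows) {
            // row = {car_id, model, brand, category, fuel_capacity}
            String key = row[1] + SEP + row[2];
            index.put(key, Arrays.asList(row[0], row[3], row[4]));
        }
        return index;
    }

    // Return {car_id, category, fuel_capacity} for one record, or an empty list if unmatched.
    static List<String> lookup(Map<String, List<String>> index, String carName, String company) {
        return index.getOrDefault(carName + SEP + company, Collections.<String>emptyList());
    }

    public static void main(String[] args) {
        List<String[]> cars = new ArrayList<>();
        cars.add(new String[] {"603", "Civic", "Honda", "Sedan", "42"});
        cars.add(new String[] {"119", "911", "Porsche", "Sports Car", "64"});
        Map<String, List<String>> index = buildIndex(cars);

        System.out.println(lookup(index, "Civic", "Honda")); // prints [603, Sedan, 42]
        System.out.println(lookup(index, "911", "119"));     // prints [] (company "119" has no matching brand)
    }
}
```

Both key columns must match for a row to be enriched, which is why customers 4 and 13 in the README sample (car_name `911`, company `119`) get no `car_id` or `category` in the output.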
data/build.gradle CHANGED
@@ -13,7 +13,7 @@ configurations {
      provided
  }
 
- version = "0.1.3"
+ version = "0.1.4"
 
  sourceCompatibility = 1.8
  targetCompatibility = 1.8
@@ -0,0 +1,38 @@
+ exec:
+   max_threads: 1
+   min_output_tasks: 1
+ in:
+   type: sqlserver
+   host: Localhost
+   driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+   user: "usert"
+   password: "password"
+   database: "test"
+   table: customer
+ filters:
+   - type: sqlserver
+     host: Localhost
+     driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+     username: "user"
+     password: "password"
+     database: "test"
+     table_name: car
+     mapping_from:
+       - car_name
+       - company
+     mapping_to:
+       - model
+       - brand
+     new_columns:
+       - { name: car_id, type: string }
+       - { name: category, type: string }
+       - { name: fuel_capacity, type: string }
+ out:
+   type: sqlserver
+   host: Localhost
+   driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+   user: "usert"
+   password: "password"
+   database: "test"
+   table: output_table
+   mode: truncate_insert
@@ -0,0 +1,53 @@
+ exec:
+   max_threads: 1
+   min_output_tasks: 1
+ in:
+   type: file
+   path_prefix: "..path../customer.csv"
+   parser:
+     charset: UTF-8
+     type: csv
+     delimiter: ','
+     quote: '"'
+     header_line: true
+     columns:
+       - { name: id, type: long }
+       - { name: customer_name, type: string }
+       - { name: address, type: string }
+       - { name: email, type: string }
+       - { name: car_name, type: string }
+       - { name: company, type: string }
+ filters:
+   - type: sqlserver
+     host: Localhost
+     driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+     username: "user"
+     password: "password"
+     database: "test"
+     table_name: car
+     mapping_from:
+       - car_name
+       - company
+     mapping_to:
+       - model
+       - brand
+     new_columns:
+       - { name: car_id, type: string }
+       - { name: category, type: string }
+       - { name: fuel_capacity, type: string }
+
+ out:
+   type: file
+   path_prefix: "..path../output.csv"
+   file_ext: csv
+   formatter:
+     type: csv
+     delimiter: ","
+     newline: CRLF
+     newline_in_field: LF
+     charset: UTF-8
+     quote_policy: MINIMAL
+     quote: '"'
+     escape: "\\"
+     null_string: "\\N"
+     default_timezone: 'UTC'
@@ -0,0 +1,37 @@
+ exec:
+   max_threads: 1
+   min_output_tasks: 1
+ in:
+   type: mysql
+   host: localhost
+   user: root
+   password: 'passsword'
+   database: test
+   table: database_table
+ filters:
+   - type: sqlserver
+     host: Localhost
+     driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+     username: "user"
+     password: "password"
+     database: "test"
+     table_name: car
+     mapping_from:
+       - car_name
+       - company
+     mapping_to:
+       - model
+       - brand
+     new_columns:
+       - { name: car_id, type: string }
+       - { name: category, type: string }
+       - { name: fuel_capacity, type: string }
+
+ out:
+   type: mysql
+   host: localhost
+   user: root
+   password: 'passsword'
+   database: test
+   table: output_table
+   mode: truncate_insert
@@ -0,0 +1,43 @@
+ exec:
+   max_threads: 1
+   min_output_tasks: 1
+ in:
+   type: jdbc
+   host: localhost
+   driver_path: "C:/drivers/ojdbc8.jar"
+   driver_class: 'oracle.jdbc.driver.OracleDriver'
+   url: jdbc:oracle:thin:@localhost:1521:orcl
+   user: MYUSER
+   password: ABCD
+   database: DEMO
+   table: customer
+ filters:
+   - type: sqlserver
+     host: Localhost
+     driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+     username: "user"
+     password: "password"
+     database: "test"
+     table_name: car
+     mapping_from:
+       - car_name
+       - company
+     mapping_to:
+       - model
+       - brand
+     new_columns:
+       - { name: car_id, type: string }
+       - { name: category, type: string }
+       - { name: fuel_capacity, type: string }
+ out:
+   type: jdbc
+   host: localhost
+   driver_path: "..path../ojdbc8.jar"
+   driver_class: 'oracle.jdbc.driver.OracleDriver'
+   url: jdbc:oracle:thin:@localhost:1521:orcl
+   user: MYUSER
+   password: ABCD
+   database: DEMO
+   table: output_table
+   mode: truncate_insert
+
@@ -0,0 +1,41 @@
+ exec:
+   max_threads: 1
+   min_output_tasks: 1
+ in:
+   type: postgresql
+   host: localhost
+   port: 5432
+   user: postgres
+   password: 1234
+   schema: public
+   database: test
+   table: customer
+ filters:
+   - type: sqlserver
+     host: Localhost
+     driver_path: "..path../mssql-jdbc-10.2.0.jre17.jar"
+     username: "user"
+     password: "password"
+     database: "test"
+     table_name: car
+     mapping_from:
+       - car_name
+       - company
+     mapping_to:
+       - model
+       - brand
+     new_columns:
+       - { name: car_id, type: string }
+       - { name: category, type: string }
+       - { name: fuel_capacity, type: string }
+ out:
+   type: postgresql
+   host: localhost
+   port: 5432
+   database: test
+   user: postgres
+   password: 1234
+   schema: public
+   table: output_table
+   mode: truncate_insert
+
@@ -5,20 +5,21 @@ import com.google.common.collect.ImmutableList;
  import org.embulk.config.*;
  import org.embulk.spi.*;
  import org.embulk.spi.type.Types;
+ import org.slf4j.Logger;
+ import org.slf4j.LoggerFactory;
 
  import java.sql.Connection;
  import java.sql.ResultSet;
  import java.sql.SQLException;
  import java.sql.Statement;
  import java.time.Instant;
- import java.util.ArrayList;
- import java.util.HashMap;
- import java.util.List;
- import java.util.Map;
+ import java.util.*;
 
  public class MssqlLookupFilterPlugin
          implements FilterPlugin
  {
+     private static final Logger logger = LoggerFactory.getLogger(MssqlLookupFilterPlugin.class);
+
      public interface PluginTask
              extends Task
      {
@@ -195,6 +196,8 @@ public class MssqlLookupFilterPlugin
      columnConfigList.add(columnConfig);
  }
 
+ Set<String> unmatchedData = new LinkedHashSet<>();
+ List<String> keyColumns = task.getMappingFrom();
  while (reader.nextRecord()) {
 
      int colNum = 0;
@@ -227,6 +230,8 @@ public class MssqlLookupFilterPlugin
  List<String> matchedData = new ArrayList<>();
  if (keyValuePair.containsKey(key)) {
      matchedData = keyValuePair.get(key);
+ }else {
+     unmatchedData.add(key);
  }
 
  if (matchedData.size() == 0) {
@@ -242,6 +247,18 @@ public class MssqlLookupFilterPlugin
          }
          builder.addRecord();
      }
+     String info="\n--------------------Unmatched rows.....................\nMapping Key Columns: ";
+     for(int i=0;i<keyColumns.size();i++){
+         info+= keyColumns.get(i);
+         if(i!=keyColumns.size()-1){
+             info+=",";
+         }
+     }
+     info+="\n";
+     for(String key: unmatchedData){
+         info+= key+"\n";
+     }
+     logger.info(info);
  }
 
  @Override
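
The unmatched-key report added in this hunk is built with repeated `String` concatenation. As an alternative sketch (not the code shipped in 0.1.4), the same text can be assembled with `String.join` and a `StringBuilder`. The class and method names below (`UnmatchedReport`, `format`) are hypothetical, and the sample keys in `main` are made up because the hunk does not show how `key` is composed; only the header text and line layout are copied from the hunk.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; mirrors the report layout from the diff, not part of the plugin.
class UnmatchedReport {
    static String format(List<String> keyColumns, Set<String> unmatchedKeys) {
        StringBuilder info = new StringBuilder();
        info.append("\n--------------------Unmatched rows.....................\n");
        info.append("Mapping Key Columns: ");
        // String.join replaces the manual index loop that inserts commas between names.
        info.append(String.join(",", keyColumns)).append("\n");
        for (String key : unmatchedKeys) {
            info.append(key).append("\n");
        }
        return info.toString();
    }

    public static void main(String[] args) {
        // LinkedHashSet keeps first-seen order and drops repeated keys, like the set in the diff.
        // The key strings here are illustrative placeholders.
        Set<String> unmatched = new LinkedHashSet<>(Arrays.asList("key-a", "key-b", "key-a"));
        System.out.print(format(Arrays.asList("car_name", "company"), unmatched));
    }
}
```

As written in the hunk, `logger.info(info)` is not guarded by any condition, so the header appears to be logged even when every row matched; a caller of a helper like this one could skip logging when `unmatchedKeys` is empty.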
metadata CHANGED
@@ -1,14 +1,14 @@
  --- !ruby/object:Gem::Specification
  name: embulk-filter-mssql_lookup
  version: !ruby/object:Gem::Version
-   version: 0.1.3
+   version: 0.1.4
  platform: ruby
  authors:
  - Infoobjects Inc.
  autorequire:
  bindir: bin
  cert_chain: []
- date: 2023-02-13 00:00:00.000000000 Z
+ date: 2023-02-27 00:00:00.000000000 Z
  dependencies:
  - !ruby/object:Gem::Dependency
    name: bundler
@@ -49,11 +49,15 @@ files:
  - LICENSE
  - README.md
  - build.gradle
- - classpath/embulk-filter-mssql_lookup-0.1.3.jar
+ - classpath/embulk-filter-mssql_lookup-0.1.4.jar
  - classpath/sqljdbc4-4.0.jar
  - config/checkstyle/checkstyle.xml
  - config/checkstyle/default.xml
- - example/config.yml
+ - example/mssql-filter-lookup_testing.yml.liquid
+ - example/mssql-filter-lookup_testing_csv.yml.liquid
+ - example/mssql-filter-lookup_testing_mysql.yml.liquid
+ - example/mssql-filter-lookup_testing_oracle.yml.liquid
+ - example/mssql-filter-lookup_testing_postgres.yml.liquid
  - gradle/wrapper/gradle-wrapper.jar
  - gradle/wrapper/gradle-wrapper.properties
  - gradlew
data/example/config.yml DELETED
@@ -1,44 +0,0 @@
- exec:
-   max_threads: 1
-   min_output_tasks: 1
- in:
-   type: file
-   path_prefix: /home/infoobjects/Downloads/sample/calendarFloat1 (copy).csv
-   parser:
-     type: csv
-     columns:
-       - { name: Country_Name, type: string }
-       - { name: quarter_number, type: long }
-
- filters:
-   - type: mssql_lookup
-     driver_path: /home/infoobjects/Downloads/sqljdbc_1s1.2.0.0_enu/sqljdbc_11.2/enu/mssql-jdbc-11.2.0.jre8.jar
-     driver_class: com.microsoft.sqlserver.jdbc.SQLServerDriver
-     host: infoobjects-ThinkPad-E490
-     database: test
-     schemaName: testSchema1
-     table_name: cnew
-     username: SA
-     password: Mssqlroot95
-     mapping_from:
-       - quarter_number
-     mapping_to:
-       - qtr_no
-     new_columns:
-       - { name: qtr_name, type: string }
-
- out:
-   type: file
-   path_prefix: /home/infoobjects/sqlOutput/output.csv
-   file_ext: csv
-   formatter:
-     type: csv
-     delimiter: "\t"
-     newline: CRLF
-     newline_in_field: LF
-     charset: UTF-8
-     quote_policy: MINIMAL
-     quote: '"'
-     escape: "\\"
-     null_string: "\\N"
-     default_timezone: 'UTC'