DataComparerLibrary 0.830__tar.gz → 0.832__tar.gz
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/PKG-INFO +11 -7
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/README.rst +10 -6
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary/datacomparer.py +0 -45
- DataComparerLibrary-0.832/src/DataComparerLibrary/datasorter.py +139 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary/version.py +1 -1
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary.egg-info/PKG-INFO +11 -7
- DataComparerLibrary-0.830/src/DataComparerLibrary/datasorter.py +0 -134
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/LICENSE.txt +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/pyproject.toml +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/setup.cfg +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/setup.py +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary/__init__.py +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary/delimitertranslator.py +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary/fileconverter.py +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary.egg-info/SOURCES.txt +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary.egg-info/dependency_links.txt +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary.egg-info/top_level.txt +0 -0
- {DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/test/test1.py +0 -0
|
{DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/PKG-INFO
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: DataComparerLibrary
-Version: 0.
+Version: 0.832
 Summary: For comparing csv-files or 2d-array with csv-file.
 Home-page:
 Author: René Philip Zuijderduijn
@@ -39,6 +39,10 @@ you can simply run::
 pip install --upgrade DataComparerLibrary
 
 
+Also the following pip package is needed::
+
+pip install python-dateutil
+
 
 Import statement for the DataComparerLibrary in Python
 ------------------------------------------------------
@@ -65,30 +69,30 @@ The DataComparerLibrary can be used for:
 beneath.
 |
 | {PRESENT}:
-| With {PRESENT} in the expected data file you can make clear that data of a field of the actual data should be present.
+| With {PRESENT} in the expected data file you can make clear that data of a field/cell of the actual data should be present.
 This can be helpful for fields that have constant changing values. For example generated id's.
 |
 | {EMPTY}:
-| With {EMPTY} in the expected data file you can make clear that data of a field of the actual data should be absent.
+| With {EMPTY} in the expected data file you can make clear that data of a field/cell of the actual data should be absent.
 |
 | {SKIP}:
-| With {SKIP} in the expected data file you can make clear that the comparison of data of a
+| With {SKIP} in the expected data file you can make clear that the comparison of data of a field/cell or part of a field/cell
 of the actual data should be skipped. This can be helpful for fields or parts of fields that have constant changing
 values. For example time or generated id's.
 |
 | {INTEGER}:
-| With {INTEGER} in the expected data file you can make clear that the data of a field of the actual data should be an
+| With {INTEGER} in the expected data file you can make clear that the data of a field/cell of the actual data should be an
 integer. This can be helpful for fields that have constant changing integer values. For example integer id's.
 |
 | {NOW()...:....}:
-| With {NOW()} in the expected data file you can make clear that the data of a field or part of a field of the actual
+| With {NOW()} in the expected data file you can make clear that the data of a field/cell or part of a field/cell of the actual
 data should be (a part of) a date. You can let calculate the current or a date in the past or future. Calculation is
 based on the "relativedelta" method from Python. Also you can style the date in the format you want. This can be
 helpful for fields that have constant changing date values, but which date values have a fixed offset linked to the
 current date. At "Examples comparing Actual Data with Expected Data" you can find some examples how to use it.
 |
 | {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6}:
-| With {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6} in the expected data file you can make clear that the data of a field or part of a field of the actual
+| With {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6} in the expected data file you can make clear that the data of a field/cell or part of a field/cell of the actual
 data should be (a part of) a date. At this moment it is processed as {SKIP}. In the future it will be changed into a check on date format, but
 not a specific date. For check on a specific expected date you can use {NOW()...:....}.
 |
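The placeholder descriptions in this hunk can be read as per-cell matching rules. The following is a minimal sketch of such rules, not the library's implementation: the function name `cell_matches`, the restriction to string cells and the regex-based handling of `{SKIP}` are assumptions made for illustration, and `{NOW()...:....}` and `{DATETIME_FORMAT():...}` are left out.

```python
import re


def cell_matches(actual: str, expected: str) -> bool:
    """Hypothetical matcher for one cell; illustrates the README rules only."""
    if expected == "{PRESENT}":
        # Any non-empty value is accepted.
        return actual != ""
    if expected == "{EMPTY}":
        # The cell must be absent, modelled here as an empty string.
        return actual == ""
    if expected == "{INTEGER}":
        # The whole cell must be an (optionally negative) integer.
        return re.fullmatch(r"-?\d+", actual) is not None
    if "{SKIP}" in expected:
        # {SKIP} may cover a whole cell or only part of it:
        # every {SKIP} becomes a wildcard, the rest must match literally.
        literal_parts = (re.escape(part) for part in expected.split("{SKIP}"))
        pattern = ".*".join(literal_parts)
        return re.fullmatch(pattern, actual, flags=re.DOTALL) is not None
    # No placeholder: plain equality.
    return actual == expected


present_ok = cell_matches("abc123", "{PRESENT}")
partial_skip_ok = cell_matches("id-9981-ok", "id-{SKIP}-ok")
```

Treating `{SKIP}` as a wildcard inside an otherwise literal pattern is one way to cover the "part of a field/cell" wording introduced in 0.832; the package may handle it differently.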
{DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/README.rst
@@ -19,6 +19,10 @@ you can simply run::
 pip install --upgrade DataComparerLibrary
 
 
+Also the following pip package is needed::
+
+pip install python-dateutil
+
 
 Import statement for the DataComparerLibrary in Python
 ------------------------------------------------------
@@ -45,30 +49,30 @@ The DataComparerLibrary can be used for:
 beneath.
 |
 | {PRESENT}:
-| With {PRESENT} in the expected data file you can make clear that data of a field of the actual data should be present.
+| With {PRESENT} in the expected data file you can make clear that data of a field/cell of the actual data should be present.
 This can be helpful for fields that have constant changing values. For example generated id's.
 |
 | {EMPTY}:
-| With {EMPTY} in the expected data file you can make clear that data of a field of the actual data should be absent.
+| With {EMPTY} in the expected data file you can make clear that data of a field/cell of the actual data should be absent.
 |
 | {SKIP}:
-| With {SKIP} in the expected data file you can make clear that the comparison of data of a
+| With {SKIP} in the expected data file you can make clear that the comparison of data of a field/cell or part of a field/cell
 of the actual data should be skipped. This can be helpful for fields or parts of fields that have constant changing
 values. For example time or generated id's.
 |
 | {INTEGER}:
-| With {INTEGER} in the expected data file you can make clear that the data of a field of the actual data should be an
+| With {INTEGER} in the expected data file you can make clear that the data of a field/cell of the actual data should be an
 integer. This can be helpful for fields that have constant changing integer values. For example integer id's.
 |
 | {NOW()...:....}:
-| With {NOW()} in the expected data file you can make clear that the data of a field or part of a field of the actual
+| With {NOW()} in the expected data file you can make clear that the data of a field/cell or part of a field/cell of the actual
 data should be (a part of) a date. You can let calculate the current or a date in the past or future. Calculation is
 based on the "relativedelta" method from Python. Also you can style the date in the format you want. This can be
 helpful for fields that have constant changing date values, but which date values have a fixed offset linked to the
 current date. At "Examples comparing Actual Data with Expected Data" you can find some examples how to use it.
 |
 | {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6}:
-| With {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6} in the expected data file you can make clear that the data of a field or part of a field of the actual
+| With {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6} in the expected data file you can make clear that the data of a field/cell or part of a field/cell of the actual
 data should be (a part of) a date. At this moment it is processed as {SKIP}. In the future it will be changed into a check on date format, but
 not a specific date. For check on a specific expected date you can use {NOW()...:....}.
 |
{DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary/datacomparer.py
RENAMED
|
@@ -30,15 +30,6 @@ class DataComparer:
             else:
                 expected_data = list(csv.reader((line.replace(delimiter_expected_data, chr(255)) for line in expected_file), delimiter=chr(255), quotechar=quotechar_expected_data))
         #
-        if actual_data == [] and expected_data == []:
-            raise Exception("Actual and Expected data are not present. Actual data input 2d-array and Expected data input file are empty.")
-        #
-        if not actual_data:
-            raise Exception("Actual data is not present. Input 2d-array is empty.")
-        #
-        if not expected_data:
-            raise Exception("Expected data is not present. Input file is empty.")
-        #
         DataComparer.__compare_data(self, actual_data, expected_data)
 
 
@@ -57,15 +48,6 @@ class DataComparer:
             else:
                 actual_data = list(csv.reader((line.replace(delimiter_actual_data, chr(255)) for line in actual_file), delimiter=chr(255), quotechar=quotechar_actual_data))
         #
-        if actual_data == [] and expected_data == []:
-            raise Exception("Actual and Expected data are not present. Actual data input file and Expected data input 2d-array are empty.")
-        #
-        if not actual_data:
-            raise Exception("Actual data is not present. Input file is empty.")
-        #
-        if not expected_data:
-            raise Exception("Expected data is not present. Input 2d-array is empty.")
-        #
         DataComparer.__compare_data(self, actual_data, expected_data)
 
 
@@ -76,15 +58,6 @@ class DataComparer:
         if 'expected_data' not in locals():
             raise Exception("Input Expected data unknown.")
         #
-        if actual_data == [] and expected_data == []:
-            raise Exception("Actual and Expected data are not present. Actual data input 2d-array and Expected data input 2d-array are empty.")
-        #
-        if not actual_data:
-            raise Exception("Actual data is not present. Input 2d-array is empty.")
-        #
-        if not expected_data:
-            raise Exception("Expected data is not present. Input 2d-array is empty.")
-        #
         DataComparer.__compare_data(self, actual_data, expected_data)
 
 
@@ -107,30 +80,12 @@ class DataComparer:
             else:
                 expected_data = list(csv.reader((line.replace(delimiter_expected_data, chr(255)) for line in expected_file), delimiter=chr(255), quotechar=quotechar_expected_data))
         #
-        if actual_data == [] and expected_data == []:
-            raise Exception("Actual and Expected data are not present. Input files are empty.")
-        #
-        if not actual_data:
-            raise Exception("Actual data is not present. Input file is empty.")
-        #
-        if not expected_data:
-            raise Exception("Expected data is not present. Input file is empty.")
-        #
         DataComparer.__compare_data(self, actual_data, expected_data)
 
 
     def __compare_data(self, actual_data, expected_data_including_templates):
         difference_found = False
         #
-        if actual_data == [] and expected_data_including_templates == []:
-            raise Exception("Actual and Expected data are not present.")
-        #
-        if not actual_data:
-            raise Exception("Actual data is not present.")
-        #
-        if not expected_data_including_templates:
-            raise Exception("Expected data is not present.")
-        #
         number_of_rows_actual_data = len(actual_data)
         number_of_rows_expected_data = len(expected_data_including_templates)
 
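All four hunks in `datacomparer.py` remove the same three guards that raised when the actual data, the expected data, or both were empty. How 0.832 then treats empty input is not visible in this diff. For callers that relied on the old exceptions, a caller-side check could look roughly like the sketch below; `require_non_empty` and `raises_value_error` are hypothetical helpers, not part of the package, and `ValueError` is used here instead of the bare `Exception` of the removed code.

```python
def require_non_empty(actual_data, expected_data):
    """Hypothetical caller-side guard mirroring the checks removed in 0.832."""
    if not actual_data and not expected_data:
        raise ValueError("Actual and Expected data are not present.")
    if not actual_data:
        raise ValueError("Actual data is not present.")
    if not expected_data:
        raise ValueError("Expected data is not present.")


def raises_value_error(actual_data, expected_data):
    """Small helper for the demonstration below: True if the guard rejects the input."""
    try:
        require_non_empty(actual_data, expected_data)
    except ValueError:
        return True
    return False


both_empty_rejected = raises_value_error([], [])
filled_accepted = not raises_value_error([["a", "1"]], [["a", "{INTEGER}"]])
```

Such a guard would run before handing the data to the comparer, so the comparison itself stays untouched.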
DataComparerLibrary-0.832/src/DataComparerLibrary/datasorter.py
@@ -0,0 +1,139 @@
+# Script for sorting an input file. Sorted result will be written to an output file. Encoding is UFT-08.
+#
+import csv
+import os
+from DataComparerLibrary.delimitertranslator import DelimiterTranslator
+
+
+class DataSorter:
+    def sort_file(self, input_file, output_file, number_of_header_lines=0, number_of_trailer_lines=0, sort_on_columns_list=None, delimiter=',', quotechar=None):
+        try:
+            if not os.path.isfile(input_file):
+                raise Exception("Input file doesn't exists: ", input_file)
+            #
+            print("input_file: ", input_file)
+            print("output_file: ", output_file)
+            print("number_of_header_lines: ", number_of_header_lines)
+            print("number_of_trailer_lines: ", number_of_trailer_lines)
+            print("sort_on_columns_list: ", sort_on_columns_list)
+            print("delimiter: ", delimiter)
+            print("quotechar: ", quotechar)
+            #
+            number_of_records_input_file = 0
+            with open(input_file) as input_file_row_count:
+                number_of_records_input_file = sum(1 for line in input_file_row_count)
+            #
+            print("number_of_records_input_file: ", number_of_records_input_file)
+            #
+            number_of_header_lines = int(number_of_header_lines)
+            number_of_trailer_lines = int(number_of_trailer_lines)
+            #
+            if number_of_records_input_file < number_of_header_lines + number_of_trailer_lines:
+                raise Exception("The number of records in the input file is less than the declared number of header and trailer lines.")
+            #
+            number_of_sort_columns = 0
+            if sort_on_columns_list:
+                number_of_sort_columns = len(sort_on_columns_list)
+                print("number_of_sort_columns: ", number_of_sort_columns)
+            #
+            with open(input_file, mode='rt', encoding='utf-8') as input_file, open(output_file, mode='w', newline='', encoding='utf-8') as output_file:
+                if len(delimiter) == 1:
+                    input_data = list(csv.reader(input_file, delimiter=delimiter, quotechar=quotechar))
+                    output_data = csv.writer(output_file, delimiter=delimiter, quotechar=quotechar)
+                else:
+                    input_data = list(csv.reader((line.replace(delimiter, chr(255)) for line in input_file), delimiter=chr(255), quotechar=quotechar))
+                    translator = DelimiterTranslator(output_file, chr(255), delimiter)
+                    output_data = csv.writer(translator, delimiter=chr(255), quotechar=quotechar)
+                #
+                data_text = []
+                trailer_text = []
+                #
+                if number_of_header_lines == 0 and number_of_trailer_lines == 0:
+                    # Only data lines.
+                    data_text = input_data
+                else:
+                    # Header or/and trailer or/and data lines present.
+                    line_nr = 0
+                    #
+                    # Separate in header, data and trailer text.
+                    for row in input_data:
+                        line_nr += 1
+                        if number_of_header_lines != 0 and line_nr <= number_of_header_lines:
+                            # A header line.
+                            output_data.writerow(row)
+                        elif number_of_trailer_lines != 0 and line_nr > number_of_records_input_file - number_of_trailer_lines:
+                            # A trailer line.
+                            trailer_text.append(row)
+                        else:
+                            # A data line.
+                            data_text.append(row)
+                #
+                sorted_data_text = DataSorter.__sort_data_records(self, data_text, sort_on_columns_list)
+                #
+                for row in sorted_data_text:
+                    output_data.writerow(row)
+
+                for row in trailer_text:
+                    output_data.writerow(row)
+            #
+        except IndexError as error:
+            raise Exception("Probably selected column doesn't exist. Correct argument 'sort_on_columns_list'. Error message: ", type(error).__name__, "–", error)
+        except Exception as error:
+            raise Exception("Error message: ", type(error).__name__, "–", error)
+
+
+
+    def __sort_data_records(self, data_text, sort_on_columns_list):
+        number_of_sort_columns = 0
+        if sort_on_columns_list:
+            number_of_sort_columns = len(sort_on_columns_list)
+            print("number_of_sort_columns: ", number_of_sort_columns)
+        #
+        match number_of_sort_columns:
+            case 0:
+                sorted_data_text = sorted(data_text, key=lambda column: (column[0]))
+            case 1:
+                a = sort_on_columns_list[0]
+                sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)]))
+            case 2:
+                a = sort_on_columns_list[0]
+                b = sort_on_columns_list[1]
+                sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)]))
+            case 3:
+                a = sort_on_columns_list[0]
+                b = sort_on_columns_list[1]
+                c = sort_on_columns_list[2]
+                sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)]))
+            case 4:
+                a = sort_on_columns_list[0]
+                b = sort_on_columns_list[1]
+                c = sort_on_columns_list[2]
+                d = sort_on_columns_list[3]
+                sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)]))
+            case 5:
+                a = sort_on_columns_list[0]
+                b = sort_on_columns_list[1]
+                c = sort_on_columns_list[2]
+                d = sort_on_columns_list[3]
+                e = sort_on_columns_list[4]
+                sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)], int(column[e]) if isinstance(e, int) else column[int(e)]))
+            case 6:
+                a = sort_on_columns_list[0]
+                b = sort_on_columns_list[1]
+                c = sort_on_columns_list[2]
+                d = sort_on_columns_list[3]
+                e = sort_on_columns_list[4]
+                f = sort_on_columns_list[5]
+                sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)], int(column[e]) if isinstance(e, int) else column[int(e)], int(column[f]) if isinstance(f, int) else column[int(f)]))
+            case 7:
+                a = sort_on_columns_list[0]
+                b = sort_on_columns_list[1]
+                c = sort_on_columns_list[2]
+                d = sort_on_columns_list[3]
+                e = sort_on_columns_list[4]
+                f = sort_on_columns_list[5]
+                g = sort_on_columns_list[6]
+                sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)], int(column[e]) if isinstance(e, int) else column[int(e)], int(column[f]) if isinstance(f, int) else column[int(f)], int(column[g]) if isinstance(g, int) else column[int(g)]))
+            case _:
+                raise Exception("Too many columns selected for sorting. Only sorting on maximum 7 columns is supported.")
+        return sorted_data_text
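The `case 1` to `case 7` branches of `__sort_data_records` all apply the same rule per column: an `int` index sorts that column numerically, any other index (for example the string `"2"`) sorts it as text. Below is a minimal sketch of that rule for an arbitrary number of columns. `make_sort_key` is a hypothetical helper written for illustration; it is not part of the package and drops the seven-column limit.

```python
def make_sort_key(sort_on_columns_list=None):
    """Hypothetical helper: build a sort key following the per-column rule above."""
    if not sort_on_columns_list:
        # No columns given: sort on the first column as text (the `case 0` branch).
        return lambda row: row[0]

    def key(row):
        return tuple(
            # int index -> numeric comparison, otherwise plain text comparison
            int(row[col]) if isinstance(col, int) else row[int(col)]
            for col in sort_on_columns_list
        )

    return key


rows = [["b", "10"], ["a", "9"], ["a", "100"]]

# Column 0 as text, then column 1 as a number.
numeric_second = sorted(rows, key=make_sort_key(["0", 1]))
# Column 0 as text, then column 1 as text ("100" sorts before "9").
text_second = sorted(rows, key=make_sort_key(["0", "1"]))
```

With a generated tuple key the column limit disappears, at the cost of departing from the explicit `match` layout that the package uses.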
{DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary.egg-info/PKG-INFO
RENAMED
|
@@ -1,6 +1,6 @@
 Metadata-Version: 2.1
 Name: DataComparerLibrary
-Version: 0.
+Version: 0.832
 Summary: For comparing csv-files or 2d-array with csv-file.
 Home-page:
 Author: René Philip Zuijderduijn
@@ -39,6 +39,10 @@ you can simply run::
 pip install --upgrade DataComparerLibrary
 
 
+Also the following pip package is needed::
+
+pip install python-dateutil
+
 
 Import statement for the DataComparerLibrary in Python
 ------------------------------------------------------
@@ -65,30 +69,30 @@ The DataComparerLibrary can be used for:
 beneath.
 |
 | {PRESENT}:
-| With {PRESENT} in the expected data file you can make clear that data of a field of the actual data should be present.
+| With {PRESENT} in the expected data file you can make clear that data of a field/cell of the actual data should be present.
 This can be helpful for fields that have constant changing values. For example generated id's.
 |
 | {EMPTY}:
-| With {EMPTY} in the expected data file you can make clear that data of a field of the actual data should be absent.
+| With {EMPTY} in the expected data file you can make clear that data of a field/cell of the actual data should be absent.
 |
 | {SKIP}:
-| With {SKIP} in the expected data file you can make clear that the comparison of data of a
+| With {SKIP} in the expected data file you can make clear that the comparison of data of a field/cell or part of a field/cell
 of the actual data should be skipped. This can be helpful for fields or parts of fields that have constant changing
 values. For example time or generated id's.
 |
 | {INTEGER}:
-| With {INTEGER} in the expected data file you can make clear that the data of a field of the actual data should be an
+| With {INTEGER} in the expected data file you can make clear that the data of a field/cell of the actual data should be an
 integer. This can be helpful for fields that have constant changing integer values. For example integer id's.
 |
 | {NOW()...:....}:
-| With {NOW()} in the expected data file you can make clear that the data of a field or part of a field of the actual
+| With {NOW()} in the expected data file you can make clear that the data of a field/cell or part of a field/cell of the actual
 data should be (a part of) a date. You can let calculate the current or a date in the past or future. Calculation is
 based on the "relativedelta" method from Python. Also you can style the date in the format you want. This can be
 helpful for fields that have constant changing date values, but which date values have a fixed offset linked to the
 current date. At "Examples comparing Actual Data with Expected Data" you can find some examples how to use it.
 |
 | {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6}:
-| With {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6} in the expected data file you can make clear that the data of a field or part of a field of the actual
+| With {DATETIME_FORMAT():YYYYMMDDHHMMSSFF6} in the expected data file you can make clear that the data of a field/cell or part of a field/cell of the actual
 data should be (a part of) a date. At this moment it is processed as {SKIP}. In the future it will be changed into a check on date format, but
 not a specific date. For check on a specific expected date you can use {NOW()...:....}.
 |
DataComparerLibrary-0.830/src/DataComparerLibrary/datasorter.py
@@ -1,134 +0,0 @@
-# Script for sorting an input file. Sorted result will be written to an output file. Encoding is UFT-08.
-#
-import csv
-import os
-from DataComparerLibrary.delimitertranslator import DelimiterTranslator
-
-
-class DataSorter:
-    def sort_file(self, input_file, output_file, number_of_header_lines=0, number_of_trailer_lines=0, sort_on_columns_list=None, delimiter=',', quotechar=None):
-        try:
-            #
-            if not os.path.isfile(input_file):
-                raise Exception("Input file doesn't exists: ", input_file)
-            #
-            print("input_file: ", input_file)
-            print("output_file: ", output_file)
-            print("number_of_header_lines: ", number_of_header_lines)
-            print("number_of_trailer_lines: ", number_of_trailer_lines)
-            print("sort_on_columns_list: ", sort_on_columns_list)
-            print("delimiter: ", delimiter)
-            print("quotechar: ", quotechar)
-            #
-            number_of_records_input_file = 0
-            with open(input_file) as input_file_row_count:
-                number_of_records_input_file = sum(1 for line in input_file_row_count)
-            #
-            print("number_of_records_input_file: ", number_of_records_input_file)
-            if number_of_records_input_file == 0:
-                raise Exception("Input file is empty.")
-            #
-            number_of_header_lines = int(number_of_header_lines)
-            number_of_trailer_lines = int(number_of_trailer_lines)
-            #
-            if number_of_records_input_file < number_of_header_lines + number_of_trailer_lines:
-                raise Exception("The number of records in the input file is less than the declared number of header and trailer lines.")
-            #
-            number_of_sort_columns = 0
-            if sort_on_columns_list:
-                number_of_sort_columns = len(sort_on_columns_list)
-                print("number_of_sort_columns: ", number_of_sort_columns)
-            #
-            with open(input_file, mode='rt', encoding='utf-8') as input_file, open(output_file, mode='w', newline='', encoding='utf-8') as output_file:
-                if len(delimiter) == 1:
-                    input_data = list(csv.reader(input_file, delimiter=delimiter, quotechar=quotechar))
-                    output_data = csv.writer(output_file, delimiter=delimiter, quotechar=quotechar)
-                else:
-                    input_data = list(csv.reader((line.replace(delimiter, chr(255)) for line in input_file), delimiter=chr(255), quotechar=quotechar))
-                    translator = DelimiterTranslator(output_file, chr(255), delimiter)
-                    output_data = csv.writer(translator, delimiter=chr(255), quotechar=quotechar)
-                #
-                if number_of_header_lines == 0 and number_of_trailer_lines == 0:
-                    # Only data lines. # Sort all records.
-                    sorted_records = sorted(input_data, key=lambda elem: elem[0])
-                    #
-                    for row in sorted_records:
-                        output_data.writerow(row)
-                    #
-                else:
-                    # Header or/and trailer or/and data lines present.
-                    line_nr = 0
-                    data_text = []
-                    trailer_text = []
-                    #
-                    # Separate in header, data and trailer text.
-                    for row in input_data:
-                        line_nr += 1
-                        if number_of_header_lines != 0 and line_nr <= number_of_header_lines:
-                            # A header line.
-                            output_data.writerow(row)
-                        elif number_of_trailer_lines != 0 and line_nr > number_of_records_input_file - number_of_trailer_lines:
-                            # A trailer line.
-                            trailer_text.append(row)
-                        else:
-                            # A data line.
-                            data_text.append(row)
-                    #
-                    match number_of_sort_columns:
-                        case 0:
-                            sorted_data_text = sorted(data_text, key=lambda column: (column[0]))
-                        case 1:
-                            a = sort_on_columns_list[0]
-                            sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)]))
-                        case 2:
-                            a = sort_on_columns_list[0]
-                            b = sort_on_columns_list[1]
-                            sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)]))
-                        case 3:
-                            a = sort_on_columns_list[0]
-                            b = sort_on_columns_list[1]
-                            c = sort_on_columns_list[2]
-                            sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)]))
-                        case 4:
-                            a = sort_on_columns_list[0]
-                            b = sort_on_columns_list[1]
-                            c = sort_on_columns_list[2]
-                            d = sort_on_columns_list[3]
-                            sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)]))
-                        case 5:
-                            a = sort_on_columns_list[0]
-                            b = sort_on_columns_list[1]
-                            c = sort_on_columns_list[2]
-                            d = sort_on_columns_list[3]
-                            e = sort_on_columns_list[4]
-                            sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)], int(column[e]) if isinstance(e, int) else column[int(e)]))
-                        case 6:
-                            a = sort_on_columns_list[0]
-                            b = sort_on_columns_list[1]
-                            c = sort_on_columns_list[2]
-                            d = sort_on_columns_list[3]
-                            e = sort_on_columns_list[4]
-                            f = sort_on_columns_list[5]
-                            sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)], int(column[e]) if isinstance(e, int) else column[int(e)], int(column[f]) if isinstance(f, int) else column[int(f)]))
-                        case 7:
-                            a = sort_on_columns_list[0]
-                            b = sort_on_columns_list[1]
-                            c = sort_on_columns_list[2]
-                            d = sort_on_columns_list[3]
-                            e = sort_on_columns_list[4]
-                            f = sort_on_columns_list[5]
-                            g = sort_on_columns_list[6]
-                            sorted_data_text = sorted(data_text, key=lambda column: (int(column[a]) if isinstance(a, int) else column[int(a)], int(column[b]) if isinstance(b, int) else column[int(b)], int(column[c]) if isinstance(c, int) else column[int(c)], int(column[d]) if isinstance(d, int) else column[int(d)], int(column[e]) if isinstance(e, int) else column[int(e)], int(column[f]) if isinstance(f, int) else column[int(f)], int(column[g]) if isinstance(g, int) else column[int(g)]))
-                        case _:
-                            raise Exception("Too many columns selected for sorting. Only sorting on 7 columns is supported.")
-                    #
-                    for row in sorted_data_text:
-                        output_data.writerow(row)
-
-                    for row in trailer_text:
-                        output_data.writerow(row)
-            #
-        except IndexError as error:
-            raise Exception("Probably selected column doesn't exist. Correct argument 'sort_on_columns_list'. Error message: ", type(error).__name__, "–", error)
-        except Exception as error:
-            raise Exception("Error message: ", type(error).__name__, "–", error)
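In the removed 0.830 file, input without header or trailer lines was always sorted on the first column, so `sort_on_columns_list` only took effect when header or trailer lines were declared. The loop that separates header, data and trailer rows can also be written with slices, as in the sketch below. This is an illustration under assumptions: `split_records` is a hypothetical helper, it counts parsed rows via `len(rows)` where the package counts raw lines of the input file, and it raises `ValueError` instead of `Exception`.

```python
def split_records(rows, number_of_header_lines=0, number_of_trailer_lines=0):
    """Hypothetical helper: split parsed rows into (header, data, trailer)."""
    if len(rows) < number_of_header_lines + number_of_trailer_lines:
        raise ValueError(
            "The number of records is less than the declared number of header and trailer lines."
        )
    # Index of the first trailer row; equals len(rows) when there is no trailer.
    end_of_data = len(rows) - number_of_trailer_lines
    header = rows[:number_of_header_lines]
    data = rows[number_of_header_lines:end_of_data]
    trailer = rows[end_of_data:]
    return header, data, trailer


rows = [["H"], ["3"], ["1"], ["2"], ["T"]]
header, data, trailer = split_records(rows, number_of_header_lines=1, number_of_trailer_lines=1)

# Only the data rows are sorted; header and trailer keep their position.
result = header + sorted(data, key=lambda row: row[0]) + trailer
```

Computing `end_of_data` first avoids the `rows[-0:]` pitfall, where a trailer count of zero would otherwise select every row.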
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|
{DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary/fileconverter.py
RENAMED
|
File without changes
|
{DataComparerLibrary-0.830 → DataComparerLibrary-0.832}/src/DataComparerLibrary.egg-info/SOURCES.txt
RENAMED
|
File without changes
|
|
File without changes
|
|
File without changes
|
|
File without changes
|