informatica-python 1.9.3__tar.gz → 1.9.4__tar.gz
- {informatica_python-1.9.3 → informatica_python-1.9.4}/PKG-INFO +3 -3
- {informatica_python-1.9.3 → informatica_python-1.9.4}/README.md +2 -2
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/__init__.py +1 -1
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/mapping_gen.py +25 -6
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/utils/expression_converter.py +18 -4
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python.egg-info/PKG-INFO +3 -3
- {informatica_python-1.9.3 → informatica_python-1.9.4}/pyproject.toml +1 -1
- {informatica_python-1.9.3 → informatica_python-1.9.4}/tests/test_integration.py +239 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/LICENSE +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/cli.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/converter.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/__init__.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/config_gen.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/error_log_gen.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/helper_gen.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/sql_gen.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/workflow_gen.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/models.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/parser.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/utils/__init__.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/utils/datatype_map.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/utils/lib_adapters.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/utils/sql_dialect.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python.egg-info/SOURCES.txt +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python.egg-info/dependency_links.txt +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python.egg-info/entry_points.txt +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python.egg-info/requires.txt +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python.egg-info/top_level.txt +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/setup.cfg +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/tests/test_converter.py +0 -0
- {informatica_python-1.9.3 → informatica_python-1.9.4}/tests/test_expressions.py +0 -0
{informatica_python-1.9.3 → informatica_python-1.9.4}/PKG-INFO
RENAMED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: informatica-python
|
|
3
|
-
Version: 1.9.3
|
|
3
|
+
Version: 1.9.4
|
|
4
4
|
Summary: Convert Informatica PowerCenter workflow XML to Python/PySpark code
|
|
5
5
|
Author: Nick
|
|
6
6
|
License: MIT
|
|
@@ -430,7 +430,7 @@ The generated `helper_functions.py` provides a complete runtime library:
|
|
|
430
430
|
- **Generated code formatting**: Consistent `# ---` section headers for Source Qualifiers, Transforms, and Target Writes; metadata comments (database type, field lists); column mapping and write operation comments; clean blank line handling
|
|
431
431
|
- **Source/target detection**: Case-insensitive instance type matching
|
|
432
432
|
- **Session→mapping inference**: Longest-suffix-match strategy for ambiguous mapping names
|
|
433
|
-
- **
|
|
433
|
+
- **663 tests** across unit, integration, expression, and formatting test suites
|
|
434
434
|
|
|
435
435
|
### v1.9.2 (Phase 8)
|
|
436
436
|
- Mapping output files now use real mapping names (e.g., `mapping_m_customer_load.py`) instead of generic numeric indices (`mapping_1.py`)
|
|
@@ -495,7 +495,7 @@ The generated `helper_functions.py` provides a complete runtime library:
|
|
|
495
495
|
cd informatica_python
|
|
496
496
|
pip install -e ".[dev]"
|
|
497
497
|
|
|
498
|
-
# Run tests (
|
|
498
|
+
# Run tests (663 tests)
|
|
499
499
|
pytest tests/ -v
|
|
500
500
|
```
|
|
501
501
|
|
{informatica_python-1.9.3 → informatica_python-1.9.4}/README.md
RENAMED
|
@@ -403,7 +403,7 @@ The generated `helper_functions.py` provides a complete runtime library:
|
|
|
403
403
|
- **Generated code formatting**: Consistent `# ---` section headers for Source Qualifiers, Transforms, and Target Writes; metadata comments (database type, field lists); column mapping and write operation comments; clean blank line handling
|
|
404
404
|
- **Source/target detection**: Case-insensitive instance type matching
|
|
405
405
|
- **Session→mapping inference**: Longest-suffix-match strategy for ambiguous mapping names
|
|
406
|
-
- **
|
|
406
|
+
- **663 tests** across unit, integration, expression, and formatting test suites
|
|
407
407
|
|
|
408
408
|
### v1.9.2 (Phase 8)
|
|
409
409
|
- Mapping output files now use real mapping names (e.g., `mapping_m_customer_load.py`) instead of generic numeric indices (`mapping_1.py`)
|
|
@@ -468,7 +468,7 @@ The generated `helper_functions.py` provides a complete runtime library:
|
|
|
468
468
|
cd informatica_python
|
|
469
469
|
pip install -e ".[dev]"
|
|
470
470
|
|
|
471
|
-
# Run tests (
|
|
471
|
+
# Run tests (663 tests)
|
|
472
472
|
pytest tests/ -v
|
|
473
473
|
```
|
|
474
474
|
|
{informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/generators/mapping_gen.py
RENAMED
|
@@ -757,7 +757,7 @@ def _generate_transformation(lines, tx, connector_graph, source_dfs, transform_m
|
|
|
757
757
|
elif tx_type in ("joiner",):
|
|
758
758
|
_gen_joiner_transform(lines, tx, tx_safe, input_df, input_sources, source_dfs, connector_graph, data_lib)
|
|
759
759
|
elif tx_type in ("lookup procedure", "lookup"):
|
|
760
|
-
_gen_lookup_transform(lines, tx, tx_safe, input_df, source_dfs, data_lib)
|
|
760
|
+
_gen_lookup_transform(lines, tx, tx_safe, input_df, source_dfs, connector_graph, data_lib)
|
|
761
761
|
elif tx_type == "router":
|
|
762
762
|
_gen_router_transform(lines, tx, tx_safe, input_df, source_dfs)
|
|
763
763
|
elif tx_type in ("union",):
|
|
@@ -982,7 +982,7 @@ def _gen_joiner_transform(lines, tx, tx_safe, input_df, input_sources, source_df
|
|
|
982
982
|
source_dfs[tx.name] = f"df_{tx_safe}"
|
|
983
983
|
|
|
984
984
|
|
|
985
|
-
def _gen_lookup_transform(lines, tx, tx_safe, input_df, source_dfs, data_lib="pandas"):
|
|
985
|
+
def _gen_lookup_transform(lines, tx, tx_safe, input_df, source_dfs, connector_graph=None, data_lib="pandas"):
|
|
986
986
|
lookup_table = ""
|
|
987
987
|
lookup_sql = ""
|
|
988
988
|
lookup_condition = ""
|
|
@@ -1012,6 +1012,11 @@ def _gen_lookup_transform(lines, tx, tx_safe, input_df, source_dfs, data_lib="pa
|
|
|
1012
1012
|
|
|
1013
1013
|
all_output_fields = return_fields + lookup_output_fields
|
|
1014
1014
|
|
|
1015
|
+
port_to_col = {}
|
|
1016
|
+
if connector_graph and tx.name in connector_graph.get("to", {}):
|
|
1017
|
+
for conn in connector_graph["to"][tx.name]:
|
|
1018
|
+
port_to_col[conn.to_field.lower()] = conn.from_field
|
|
1019
|
+
|
|
1015
1020
|
lines.append(f" # Lookup: {lookup_table or tx.name}")
|
|
1016
1021
|
if lookup_sql:
|
|
1017
1022
|
_emit_sql_with_params(lines, f"lkp_sql_{tx_safe}", lookup_sql)
|
|
@@ -1020,10 +1025,13 @@ def _gen_lookup_transform(lines, tx, tx_safe, input_df, source_dfs, data_lib="pa
|
|
|
1020
1025
|
lines.append(f" df_lkp_{tx_safe} = read_from_db(config, 'SELECT * FROM {lookup_table}', 'default')")
|
|
1021
1026
|
else:
|
|
1022
1027
|
empty_expr = lib_empty_df(data_lib)
|
|
1023
|
-
lines.append(f" df_lkp_{tx_safe} = {empty_expr}")
|
|
1028
|
+
lines.append(f" df_lkp_{tx_safe} = {empty_expr} # WARNING: no lookup table/SQL override found")
|
|
1024
1029
|
|
|
1025
1030
|
input_keys, lookup_keys = parse_lookup_condition(lookup_condition)
|
|
1026
1031
|
|
|
1032
|
+
if input_keys and port_to_col:
|
|
1033
|
+
input_keys = [port_to_col.get(k.lower(), k) for k in input_keys]
|
|
1034
|
+
|
|
1027
1035
|
if input_keys and lookup_keys:
|
|
1028
1036
|
lines.append(f" # Lookup condition: {lookup_condition}")
|
|
1029
1037
|
|
|
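Note on the lookup hunks above: the input-side keys of the lookup condition are remapped from the lookup's port names to the upstream column names that feed those ports. The following is a minimal sketch of that remapping, not the package's code. `Conn` and `remap_input_keys` are hypothetical names, and the `connector_graph` shape (`{"to": {instance_name: [connector, ...]}}`) is assumed from what the diff accesses.

```python
from collections import namedtuple

# Hypothetical stand-in for the package's connector objects; only the two
# attributes the diff reads (from_field, to_field) are modelled here.
Conn = namedtuple("Conn", ["from_field", "to_field"])


def remap_input_keys(input_keys, connector_graph, tx_name):
    """Replace lookup input port names with the upstream column names."""
    port_to_col = {}
    for conn in connector_graph.get("to", {}).get(tx_name, []):
        # Port names are matched case-insensitively, as in the diff.
        port_to_col[conn.to_field.lower()] = conn.from_field
    # Keys without a matching connector are left unchanged.
    return [port_to_col.get(k.lower(), k) for k in input_keys]


graph = {"to": {"LKP_CUST": [Conn(from_field="CUSTOMER_ID", to_field="IN_CUST_ID")]}}
remapped = remap_input_keys(["in_cust_id", "REGION"], graph, "LKP_CUST")
print(remapped)
```

Here `in_cust_id` resolves to the upstream column, while `REGION` has no connector and passes through as-is.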
@@ -1078,12 +1086,23 @@ def _gen_router_transform(lines, tx, tx_safe, input_df, source_dfs):
|
|
|
1078
1086
|
if "Group Filter Condition" in attr.name:
|
|
1079
1087
|
group_conditions[attr.name] = attr.value
|
|
1080
1088
|
|
|
1089
|
+
remaining_mask_parts = []
|
|
1081
1090
|
if group_conditions:
|
|
1082
1091
|
for i, (gname, cond) in enumerate(group_conditions.items()):
|
|
1083
|
-
|
|
1084
|
-
|
|
1092
|
+
if cond and cond.strip():
|
|
1093
|
+
expr_py = convert_filter_vectorized(cond, input_df)
|
|
1094
|
+
else:
|
|
1095
|
+
expr_py = f"pd.Series(True, index={input_df}.index)"
|
|
1096
|
+
mask_var = f"_router_mask_{tx_safe}_{i}"
|
|
1097
|
+
lines.append(f" {mask_var} = {expr_py} # {gname}")
|
|
1098
|
+
lines.append(f" df_{tx_safe}_group{i} = {input_df}[{mask_var}].copy()")
|
|
1085
1099
|
source_dfs[f"{tx.name}_group{i}"] = f"df_{tx_safe}_group{i}"
|
|
1086
|
-
|
|
1100
|
+
remaining_mask_parts.append(f"~{mask_var}")
|
|
1101
|
+
if remaining_mask_parts:
|
|
1102
|
+
lines.append(f" _router_default_mask = {' & '.join(remaining_mask_parts)}")
|
|
1103
|
+
lines.append(f" df_{tx_safe} = {input_df}[_router_default_mask].copy() # Default group")
|
|
1104
|
+
else:
|
|
1105
|
+
lines.append(f" df_{tx_safe} = {input_df}.copy() # Default group")
|
|
1087
1106
|
source_dfs[tx.name] = f"df_{tx_safe}"
|
|
1088
1107
|
|
|
1089
1108
|
|
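Note on the router hunk above: each group now gets its own boolean mask, and the default group is built from the conjunction of the negated masks, so it holds only rows that matched no named group. The sketch below imitates that emission logic in isolation. `emit_router_lines` is a hypothetical helper, and the group expressions are assumed to be already-converted pandas expressions.

```python
def emit_router_lines(tx_safe, input_df, group_exprs):
    """Emit pandas statements for each router group plus the default group.

    group_exprs maps a group name to an already-converted boolean expression
    string (a hypothetical input; the package derives these from the XML).
    """
    lines, negated = [], []
    for i, (gname, expr_py) in enumerate(group_exprs.items()):
        mask_var = f"_router_mask_{tx_safe}_{i}"
        lines.append(f"{mask_var} = {expr_py}  # {gname}")
        lines.append(f"df_{tx_safe}_group{i} = {input_df}[{mask_var}].copy()")
        negated.append(f"~{mask_var}")
    if negated:
        # Default group: rows that matched none of the named groups.
        lines.append(f"_router_default_mask = {' & '.join(negated)}")
        lines.append(f"df_{tx_safe} = {input_df}[_router_default_mask].copy()")
    else:
        # No group conditions at all: every row goes to the default group.
        lines.append(f"df_{tx_safe} = {input_df}.copy()")
    return lines


emitted = emit_router_lines(
    "rtr_status",
    "df_sq",
    {
        "ACTIVE": "(df_sq['STATUS'] == 'ACTIVE')",
        "INACTIVE": "(df_sq['STATUS'] == 'INACTIVE')",
    },
)
print("\n".join(emitted))
```

Because the masks are evaluated independently, a row can land in more than one named group; only the default group is exclusive.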
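{informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python/utils/expression_converter.py
RENAMED
|
@@ -248,6 +248,7 @@ def _convert_infa_date_format(fmt_str):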
|
|
|
248
248
|
fmt = fmt.replace("Mon", "%b").replace("MON", "%b")
|
|
249
249
|
fmt = fmt.replace("HH24", "%H").replace("HH12", "%I").replace("HH", "%H")
|
|
250
250
|
fmt = fmt.replace("MI", "%M").replace("SS", "%S")
|
|
251
|
+
fmt = fmt.replace("US", "%f").replace("NS", "%f").replace("MS", "%f")
|
|
251
252
|
return fmt
|
|
252
253
|
|
|
253
254
|
|
|
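Note on the date-format hunk above: the fractional-second tokens `US`, `NS`, and `MS` are now all mapped to `%f`. The sketch below shows the idea with a deliberately reduced token table. `TOKENS` and `to_strptime_format` are assumptions for illustration and do not reproduce the package's full mapping.

```python
from datetime import datetime

# Simplified, ordered token table (an assumption for illustration; the
# package's real function handles more Informatica tokens).
TOKENS = [
    ("YYYY", "%Y"), ("MM", "%m"), ("DD", "%d"),
    ("HH24", "%H"), ("MI", "%M"), ("SS", "%S"),
    ("US", "%f"),
]


def to_strptime_format(infa_fmt):
    """Translate an Informatica-style mask into a strptime format string."""
    fmt = infa_fmt
    for token, directive in TOKENS:
        fmt = fmt.replace(token, directive)
    return fmt


py_fmt = to_strptime_format("YYYY-MM-DD HH24.MI.SS.US")
parsed = datetime.strptime("2025-01-31 23.59.58.123456", py_fmt)
print(py_fmt, parsed.microsecond)
```

One caveat: Python's `%f` accepts at most six digits, so a nine-digit nanosecond value would not parse with `datetime.strptime` under this mapping.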
@@ -548,7 +549,7 @@ def _vec_recursive(expr, df_var):
|
|
|
548
549
|
'RTRIM': f'.str.rstrip("{char_arg}")',
|
|
549
550
|
'TRIM': f'.str.strip("{char_arg}")',
|
|
550
551
|
}
|
|
551
|
-
return f'{inner_val}{method_map[func_name.upper()]}'
|
|
552
|
+
return f'{inner_val}.astype(str){method_map[func_name.upper()]}'
|
|
552
553
|
|
|
553
554
|
upper_result = _find_func_call(cleaned, 'UPPER')
|
|
554
555
|
if upper_result and upper_result[0] == 0 and upper_result[1] == len(cleaned):
|
|
@@ -584,7 +585,7 @@ def _vec_recursive(expr, df_var):
|
|
|
584
585
|
if len(args) >= 2:
|
|
585
586
|
field_val = _vec_recursive(args[0], df_var)
|
|
586
587
|
try:
|
|
587
|
-
start = int(args[1].strip()) - 1
|
|
588
|
+
start = max(int(args[1].strip()) - 1, 0)
|
|
588
589
|
except ValueError:
|
|
589
590
|
start_val = _vec_recursive(args[1], df_var)
|
|
590
591
|
if len(args) >= 3:
|
|
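Note on the SUBSTR hunk above: without the clamp, a literal start of 0 became `-1` after the 1-based to 0-based shift, which in Python slices from the last character instead of the first. The sketch below illustrates the clamped translation for non-negative starts only. `substr_slice` and `substr` are hypothetical helpers, not package functions.

```python
def substr_slice(start, length=None):
    """Translate a 1-based SUBSTR(start, length) into Python slice bounds."""
    begin = max(start - 1, 0)  # a start of 0 is clamped to the first character
    end = None if length is None else begin + length
    return begin, end


def substr(value, start, length=None):
    begin, end = substr_slice(start, length)
    return value[begin:end]


print(substr("ABCDEFGHIJKL", 0, 11))  # clamped: starts at the first character
print(substr("ABCDEFGH", 5, 3))       # slice [4:7]
print("ABCDEFGH"[-1:10])              # what an unclamped start of 0 would select
```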
@@ -722,7 +723,11 @@ def _vec_recursive(expr, df_var):
|
|
|
722
723
|
field_val = _vec_recursive(args[0], df_var)
|
|
723
724
|
pattern_val = args[1].strip().strip("'\"")
|
|
724
725
|
if func_name == 'REG_EXTRACT':
|
|
725
|
-
|
|
726
|
+
if re.search(r'(?<!\\)\((?!\?)', pattern_val):
|
|
727
|
+
extract_pat = pattern_val
|
|
728
|
+
else:
|
|
729
|
+
extract_pat = f'({pattern_val})'
|
|
730
|
+
return f'{field_val}.str.extract(r"{extract_pat}", expand=False)'
|
|
726
731
|
elif func_name == 'REG_REPLACE':
|
|
727
732
|
replace_val = args[2].strip().strip("'\"") if len(args) >= 3 else ''
|
|
728
733
|
return f'{field_val}.str.replace(r"{pattern_val}", "{replace_val}", regex=True)'
|
|
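Note on the REG_EXTRACT hunk above: pandas `Series.str.extract` requires at least one capture group, so the generator wraps the pattern in parentheses only when it does not already contain one. The sketch below reuses the regular expression from the diff. `ensure_capture_group` is a hypothetical wrapper name.

```python
import re

# Same check as in the diff: an unescaped "(" that does not open a "(?...)"
# construct is treated as the start of a capture group.
HAS_CAPTURE_GROUP = re.compile(r'(?<!\\)\((?!\?)')


def ensure_capture_group(pattern):
    """Wrap the pattern in a group unless it already has a capture group."""
    if HAS_CAPTURE_GROUP.search(pattern):
        return pattern
    return f"({pattern})"


for pat in (r"\d+", r"(\s+)", r"(?:ab)+", r"\(x\)"):
    print(pat, "->", ensure_capture_group(pat))
```

The check is a heuristic: a named group such as `(?P<year>\d{4})` starts with `(?`, so it would be wrapped again even though it already captures.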
@@ -894,7 +899,8 @@ def _vec_recursive(expr, df_var):
|
|
|
894
899
|
'True', 'False', 'None', 'and', 'or', 'not', 'np', 'pd', 'get_variable',
|
|
895
900
|
'str', 'int', 'float', 'bool', 'len', 'abs', 'round',
|
|
896
901
|
'fillna', 'astype', 'isna', 'notna', 'where', 'errors', 'coerce',
|
|
897
|
-
'lookup_func',
|
|
902
|
+
'lookup_func', 'expand', 'extract', 'regex', 'contains', 'replace',
|
|
903
|
+
'upper', 'lower', 'strip', 'lstrip', 'rstrip', 'dt', 'copy',
|
|
898
904
|
}
|
|
899
905
|
converted = _substitute_fields(converted, df_var, skip_words)
|
|
900
906
|
|
|
@@ -904,6 +910,8 @@ def _vec_recursive(expr, df_var):
|
|
|
904
910
|
converted = re.sub(r'<>', '!=', converted)
|
|
905
911
|
converted = re.sub(r'(?<![<>!=])=(?!=)', '==', converted)
|
|
906
912
|
converted = re.sub(r'\berrors\s*==\s*(["\'])', r'errors=\1', converted)
|
|
913
|
+
converted = re.sub(r'\bexpand\s*==\s*', 'expand=', converted)
|
|
914
|
+
converted = re.sub(r'\bregex\s*==\s*', 'regex=', converted)
|
|
907
915
|
|
|
908
916
|
converted = re.sub(r'\s+', ' ', converted).strip()
|
|
909
917
|
|
|
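Note on the hunk above: the generic `=` to `==` rewrite also hits keyword arguments that earlier conversion steps emitted, turning `expand=False` into `expand==False`. The added substitutions undo that for known keywords. The sketch below reproduces the sequence in isolation. `fix_operators` is a hypothetical name, and the looser `errors` pattern follows the `_vectorize_simple` variant of the fix.

```python
import re


def fix_operators(expr):
    """Convert Informatica comparison operators, then repair keyword args."""
    expr = re.sub(r'<>', '!=', expr)
    # Turn a lone "=" into "==" (Informatica equality to Python equality).
    expr = re.sub(r'(?<![<>!=])=(?!=)', '==', expr)
    # Undo the rewrite for pandas keyword arguments emitted earlier.
    for kw in ("errors", "expand", "regex"):
        expr = re.sub(rf'\b{kw}\s*==\s*', f'{kw}=', expr)
    return expr


fixed = fix_operators(r'df["a"].str.extract(r"(\d+)", expand=False) = "7"')
print(fixed)
```

A source column literally named `expand`, `errors`, or `regex` and compared with `=` would be rewritten by the same substitutions, which is a trade-off of this approach.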
@@ -1044,8 +1052,14 @@ def _vectorize_simple(part, df_var):
|
|
|
1044
1052
|
'True', 'False', 'None', 'and', 'or', 'not', 'np', 'pd',
|
|
1045
1053
|
'str', 'int', 'float', 'isna', 'notna', 'fillna',
|
|
1046
1054
|
'get_variable', 'lookup_func', 'isin', 'eq',
|
|
1055
|
+
'expand', 'extract', 'astype', 'errors', 'coerce', 'regex',
|
|
1056
|
+
'contains', 'replace', 'upper', 'lower', 'strip', 'lstrip', 'rstrip',
|
|
1057
|
+
'dt', 'len', 'copy', 'abs', 'round', 'where', 'bool',
|
|
1047
1058
|
}
|
|
1048
1059
|
c = _substitute_fields(c, df_var, skip_words)
|
|
1060
|
+
c = re.sub(r'\bexpand\s*==\s*', 'expand=', c)
|
|
1061
|
+
c = re.sub(r'\berrors\s*==\s*', 'errors=', c)
|
|
1062
|
+
c = re.sub(r'\bregex\s*==\s*', 'regex=', c)
|
|
1049
1063
|
|
|
1050
1064
|
return c
|
|
1051
1065
|
|
{informatica_python-1.9.3 → informatica_python-1.9.4}/informatica_python.egg-info/PKG-INFO
RENAMED
|
@@ -1,6 +1,6 @@
|
|
|
1
1
|
Metadata-Version: 2.4
|
|
2
2
|
Name: informatica-python
|
|
3
|
-
Version: 1.9.3
|
|
3
|
+
Version: 1.9.4
|
|
4
4
|
Summary: Convert Informatica PowerCenter workflow XML to Python/PySpark code
|
|
5
5
|
Author: Nick
|
|
6
6
|
License: MIT
|
|
@@ -430,7 +430,7 @@ The generated `helper_functions.py` provides a complete runtime library:
|
|
|
430
430
|
- **Generated code formatting**: Consistent `# ---` section headers for Source Qualifiers, Transforms, and Target Writes; metadata comments (database type, field lists); column mapping and write operation comments; clean blank line handling
|
|
431
431
|
- **Source/target detection**: Case-insensitive instance type matching
|
|
432
432
|
- **Session→mapping inference**: Longest-suffix-match strategy for ambiguous mapping names
|
|
433
|
-
- **
|
|
433
|
+
- **663 tests** across unit, integration, expression, and formatting test suites
|
|
434
434
|
|
|
435
435
|
### v1.9.2 (Phase 8)
|
|
436
436
|
- Mapping output files now use real mapping names (e.g., `mapping_m_customer_load.py`) instead of generic numeric indices (`mapping_1.py`)
|
|
@@ -495,7 +495,7 @@ The generated `helper_functions.py` provides a complete runtime library:
|
|
|
495
495
|
cd informatica_python
|
|
496
496
|
pip install -e ".[dev]"
|
|
497
497
|
|
|
498
|
-
# Run tests (
|
|
498
|
+
# Run tests (663 tests)
|
|
499
499
|
pytest tests/ -v
|
|
500
500
|
```
|
|
501
501
|
|
{informatica_python-1.9.3 → informatica_python-1.9.4}/tests/test_integration.py
RENAMED
|
@@ -2246,3 +2246,242 @@ class TestJoinerFieldRemapping(unittest.TestCase):
|
|
|
2246
2246
|
if "left_on" in line and "right_on" in line:
|
|
2247
2247
|
assert "Table_Name" in line, \
|
|
2248
2248
|
"Merge should use source column name Table_Name"
|
|
2249
|
+
|
|
2250
|
+
|
|
2251
|
+
class TestRegExtractConversion(unittest.TestCase):
|
|
2252
|
+
"""Tests for REG_EXTRACT capture group and expand parameter handling."""
|
|
2253
|
+
|
|
2254
|
+
def test_no_double_capture_group(self):
|
|
2255
|
+
r = convert_expression_vectorized(r"REG_EXTRACT(col,'(\s+)')", "df")
|
|
2256
|
+
assert r.count("(") - r.count("str.extract") <= 2
|
|
2257
|
+
assert '((\\s+))' not in r
|
|
2258
|
+
|
|
2259
|
+
def test_adds_capture_group_when_missing(self):
|
|
2260
|
+
r = convert_expression_vectorized(r"REG_EXTRACT(col,'\\d+')", "df")
|
|
2261
|
+
assert 'expand=False' in r
|
|
2262
|
+
assert '.str.extract' in r
|
|
2263
|
+
|
|
2264
|
+
def test_expand_is_boolean_not_series(self):
|
|
2265
|
+
r = convert_expression_vectorized(r"REG_EXTRACT(col,'(\s+)')", "df")
|
|
2266
|
+
assert 'expand=False' in r
|
|
2267
|
+
assert 'expand==False' not in r
|
|
2268
|
+
assert 'df["expand"]' not in r
|
|
2269
|
+
|
|
2270
|
+
def test_isnull_reg_extract_nested(self):
|
|
2271
|
+
r = convert_expression_vectorized(
|
|
2272
|
+
"IIF(ISNULL(REG_EXTRACT(PART_BIRTH_DTE,'(\\s+)')),PART_BIRTH_DTE,NULL)", "df_exp"
|
|
2273
|
+
)
|
|
2274
|
+
assert "np.where" in r
|
|
2275
|
+
assert ".isna()" in r
|
|
2276
|
+
assert "expand=False" in r
|
|
2277
|
+
assert 'expand==False' not in r
|
|
2278
|
+
assert 'df_exp["expand"]' not in r
|
|
2279
|
+
|
|
2280
|
+
|
|
2281
|
+
class TestDatetimeFormatMask(unittest.TestCase):
|
|
2282
|
+
"""Tests for datetime format mask conversion (US/microseconds)."""
|
|
2283
|
+
|
|
2284
|
+
def test_us_to_percent_f(self):
|
|
2285
|
+
from informatica_python.utils.expression_converter import _convert_infa_date_format
|
|
2286
|
+
fmt = _convert_infa_date_format("YYYY-MM-DD HH24.MI.SS.US")
|
|
2287
|
+
assert "%f" in fmt
|
|
2288
|
+
assert "US" not in fmt
|
|
2289
|
+
|
|
2290
|
+
def test_full_format_mask(self):
|
|
2291
|
+
from informatica_python.utils.expression_converter import _convert_infa_date_format
|
|
2292
|
+
fmt = _convert_infa_date_format("YYYY-MM-DD HH24:MI:SS")
|
|
2293
|
+
assert fmt == "%Y-%m-%d %H:%M:%S"
|
|
2294
|
+
|
|
2295
|
+
def test_to_date_with_us_format(self):
|
|
2296
|
+
r = convert_expression_vectorized(
|
|
2297
|
+
"TO_DATE(x, 'YYYY-MM-DD HH24.MI.SS.US')", "df"
|
|
2298
|
+
)
|
|
2299
|
+
assert "%f" in r
|
|
2300
|
+
assert "US" not in r
|
|
2301
|
+
|
|
2302
|
+
|
|
2303
|
+
class TestSubstrZeroIndex(unittest.TestCase):
|
|
2304
|
+
"""Tests for SUBSTR with 0-based start position."""
|
|
2305
|
+
|
|
2306
|
+
def test_substr_start_0(self):
|
|
2307
|
+
r = convert_expression_vectorized("SUBSTR(x, 0, 11)", "df")
|
|
2308
|
+
assert "str[0:" in r
|
|
2309
|
+
assert "str[-1:" not in r
|
|
2310
|
+
|
|
2311
|
+
def test_substr_start_1(self):
|
|
2312
|
+
r = convert_expression_vectorized("SUBSTR(x, 1, 5)", "df")
|
|
2313
|
+
assert "str[0:" in r
|
|
2314
|
+
|
|
2315
|
+
def test_substr_start_5(self):
|
|
2316
|
+
r = convert_expression_vectorized("SUBSTR(x, 5, 3)", "df")
|
|
2317
|
+
assert "str[4:7]" in r
|
|
2318
|
+
|
|
2319
|
+
|
|
2320
|
+
class TestStringOpSafety(unittest.TestCase):
|
|
2321
|
+
"""Tests for string operations adding .astype(str) for safety."""
|
|
2322
|
+
|
|
2323
|
+
def test_ltrim_has_astype_str(self):
|
|
2324
|
+
r = convert_expression_vectorized("LTRIM(name)", "df")
|
|
2325
|
+
assert ".astype(str)" in r
|
|
2326
|
+
assert ".str.lstrip()" in r
|
|
2327
|
+
|
|
2328
|
+
def test_rtrim_has_astype_str(self):
|
|
2329
|
+
r = convert_expression_vectorized("RTRIM(name)", "df")
|
|
2330
|
+
assert ".astype(str)" in r
|
|
2331
|
+
assert ".str.rstrip()" in r
|
|
2332
|
+
|
|
2333
|
+
def test_trim_has_astype_str(self):
|
|
2334
|
+
r = convert_expression_vectorized("TRIM(name)", "df")
|
|
2335
|
+
assert ".astype(str)" in r
|
|
2336
|
+
assert ".str.strip()" in r
|
|
2337
|
+
|
|
2338
|
+
def test_ltrim_with_char(self):
|
|
2339
|
+
r = convert_expression_vectorized("LTRIM(name, '0')", "df")
|
|
2340
|
+
assert ".astype(str)" in r
|
|
2341
|
+
assert '.str.lstrip("0")' in r
|
|
2342
|
+
|
|
2343
|
+
|
|
2344
|
+
class TestRouterVectorized(unittest.TestCase):
|
|
2345
|
+
"""Tests for Router transformation generating vectorized conditions."""
|
|
2346
|
+
|
|
2347
|
+
ROUTER_XML = '''<?xml version="1.0" encoding="UTF-8"?>
|
|
2348
|
+
<!DOCTYPE POWERMART SYSTEM "powrmart.dtd">
|
|
2349
|
+
<POWERMART CREATION_DATE="01/01/2025" REPOSITORY_VERSION="1">
|
|
2350
|
+
<REPOSITORY NAME="repo" VERSION="1" CODEPAGE="UTF-8" DATABASETYPE="Oracle">
|
|
2351
|
+
<FOLDER NAME="TEST" OWNER="admin">
|
|
2352
|
+
<SOURCE NAME="SRC" DATABASETYPE="Flat File" DBDNAME="SRC">
|
|
2353
|
+
<FLATFILE DELIMITEDBY="COMMA" HEADERROWPRESENT="YES" PADBYTES="NO" ROWDELIMITER="\\n"/>
|
|
2354
|
+
<SOURCEFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" NULLABLE="NOTNULL" KEYTYPE="PRIMARY KEY" FIELDNUMBER="1"/>
|
|
2355
|
+
<SOURCEFIELD NAME="STATUS" DATATYPE="string" PRECISION="20" SCALE="0" NULLABLE="NULL" KEYTYPE="NOT A KEY" FIELDNUMBER="2"/>
|
|
2356
|
+
</SOURCE>
|
|
2357
|
+
<TARGET NAME="TGT" DATABASETYPE="Flat File">
|
|
2358
|
+
<TARGETFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" NULLABLE="NULL" KEYTYPE="NOT A KEY" FIELDNUMBER="1"/>
|
|
2359
|
+
</TARGET>
|
|
2360
|
+
<MAPPING NAME="m_router_test" ISVALID="YES">
|
|
2361
|
+
<TRANSFORMATION NAME="SQ_SRC" TYPE="Source Qualifier" REUSABLE="NO">
|
|
2362
|
+
<TRANSFORMFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" PORTTYPE="OUTPUT"/>
|
|
2363
|
+
<TRANSFORMFIELD NAME="STATUS" DATATYPE="string" PRECISION="20" SCALE="0" PORTTYPE="OUTPUT"/>
|
|
2364
|
+
</TRANSFORMATION>
|
|
2365
|
+
<TRANSFORMATION NAME="RTR_STATUS" TYPE="Router" REUSABLE="NO">
|
|
2366
|
+
<TRANSFORMFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" PORTTYPE="INPUT/OUTPUT"/>
|
|
2367
|
+
<TRANSFORMFIELD NAME="STATUS" DATATYPE="string" PRECISION="20" SCALE="0" PORTTYPE="INPUT/OUTPUT"/>
|
|
2368
|
+
<TABLEATTRIBUTE NAME="Group Filter Condition_ACTIVE" VALUE="STATUS = 'ACTIVE'"/>
|
|
2369
|
+
<TABLEATTRIBUTE NAME="Group Filter Condition_INACTIVE" VALUE="STATUS = 'INACTIVE'"/>
|
|
2370
|
+
</TRANSFORMATION>
|
|
2371
|
+
<INSTANCE NAME="SRC" TYPE="Source Definition" TRANSFORMATION_NAME="SRC"/>
|
|
2372
|
+
<INSTANCE NAME="SQ_SRC" TYPE="Source Qualifier" TRANSFORMATION_NAME="SQ_SRC"/>
|
|
2373
|
+
<INSTANCE NAME="RTR_STATUS" TYPE="Router" TRANSFORMATION_NAME="RTR_STATUS"/>
|
|
2374
|
+
<INSTANCE NAME="TGT" TYPE="Target Definition" TRANSFORMATION_NAME="TGT"/>
|
|
2375
|
+
<CONNECTOR FROMINSTANCE="SRC" FROMFIELD="ID" TOINSTANCE="SQ_SRC" TOFIELD="ID"/>
|
|
2376
|
+
<CONNECTOR FROMINSTANCE="SRC" FROMFIELD="STATUS" TOINSTANCE="SQ_SRC" TOFIELD="STATUS"/>
|
|
2377
|
+
<CONNECTOR FROMINSTANCE="SQ_SRC" FROMFIELD="ID" TOINSTANCE="RTR_STATUS" TOFIELD="ID"/>
|
|
2378
|
+
<CONNECTOR FROMINSTANCE="SQ_SRC" FROMFIELD="STATUS" TOINSTANCE="RTR_STATUS" TOFIELD="STATUS"/>
|
|
2379
|
+
<CONNECTOR FROMINSTANCE="RTR_STATUS" FROMFIELD="ID" TOINSTANCE="TGT" TOFIELD="ID"/>
|
|
2380
|
+
</MAPPING>
|
|
2381
|
+
<CONFIG NAME="default_session_config"/>
|
|
2382
|
+
<WORKFLOW NAME="wf_router_test" ISVALID="YES">
|
|
2383
|
+
<TASK NAME="Start" REUSABLE="NO" TYPE="Start"/>
|
|
2384
|
+
<SESSION NAME="s_m_router_test" ISVALID="YES" REUSABLE="NO" MAPPINGNAME="m_router_test">
|
|
2385
|
+
<CONFIGREFERENCE REFOBJECTNAME="default_session_config" TYPE="Session config"/>
|
|
2386
|
+
</SESSION>
|
|
2387
|
+
<TASKINSTANCE NAME="Start" TASKNAME="Start" TASKTYPE="Start"/>
|
|
2388
|
+
<TASKINSTANCE NAME="s_m_router_test" TASKNAME="s_m_router_test" TASKTYPE="Session"/>
|
|
2389
|
+
<WORKFLOWLINK FROMTASK="Start" TOTASK="s_m_router_test"/>
|
|
2390
|
+
</WORKFLOW>
|
|
2391
|
+
</FOLDER>
|
|
2392
|
+
</REPOSITORY>
|
|
2393
|
+
</POWERMART>'''
|
|
2394
|
+
|
|
2395
|
+
def test_router_generates_group_filters(self):
|
|
2396
|
+
converter = InformaticaConverter()
|
|
2397
|
+
tmpdir = tempfile.mkdtemp()
|
|
2398
|
+
try:
|
|
2399
|
+
converter.convert_string(self.ROUTER_XML, output_dir=tmpdir)
|
|
2400
|
+
for fn in os.listdir(tmpdir):
|
|
2401
|
+
if fn.startswith("mapping_") and fn.endswith(".py"):
|
|
2402
|
+
with open(os.path.join(tmpdir, fn)) as f:
|
|
2403
|
+
code = f.read()
|
|
2404
|
+
assert "_router_mask_" in code or "group0" in code, \
|
|
2405
|
+
"Router should generate group filter masks"
|
|
2406
|
+
assert "Default group" in code
|
|
2407
|
+
break
|
|
2408
|
+
finally:
|
|
2409
|
+
shutil.rmtree(tmpdir)
|
|
2410
|
+
|
|
2411
|
+
def test_router_default_excludes_matched_rows(self):
|
|
2412
|
+
converter = InformaticaConverter()
|
|
2413
|
+
tmpdir = tempfile.mkdtemp()
|
|
2414
|
+
try:
|
|
2415
|
+
converter.convert_string(self.ROUTER_XML, output_dir=tmpdir)
|
|
2416
|
+
for fn in os.listdir(tmpdir):
|
|
2417
|
+
if fn.startswith("mapping_") and fn.endswith(".py"):
|
|
2418
|
+
with open(os.path.join(tmpdir, fn)) as f:
|
|
2419
|
+
code = f.read()
|
|
2420
|
+
assert "_router_default_mask" in code or "~" in code, \
|
|
2421
|
+
"Default group should exclude rows matching other groups"
|
|
2422
|
+
break
|
|
2423
|
+
finally:
|
|
2424
|
+
shutil.rmtree(tmpdir)
|
|
2425
|
+
|
|
2426
|
+
|
|
2427
|
+
class TestLookupWarning(unittest.TestCase):
|
|
2428
|
+
"""Tests for lookup empty DataFrame warning."""
|
|
2429
|
+
|
|
2430
|
+
LOOKUP_XML = '''<?xml version="1.0" encoding="UTF-8"?>
|
|
2431
|
+
<!DOCTYPE POWERMART SYSTEM "powrmart.dtd">
|
|
2432
|
+
<POWERMART CREATION_DATE="01/01/2025" REPOSITORY_VERSION="1">
|
|
2433
|
+
<REPOSITORY NAME="repo" VERSION="1" CODEPAGE="UTF-8" DATABASETYPE="Oracle">
|
|
2434
|
+
<FOLDER NAME="TEST" OWNER="admin">
|
|
2435
|
+
<SOURCE NAME="SRC" DATABASETYPE="Flat File" DBDNAME="SRC">
|
|
2436
|
+
<FLATFILE DELIMITEDBY="COMMA" HEADERROWPRESENT="YES" PADBYTES="NO" ROWDELIMITER="\\n"/>
|
|
2437
|
+
<SOURCEFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" NULLABLE="NOTNULL" KEYTYPE="PRIMARY KEY" FIELDNUMBER="1"/>
|
|
2438
|
+
</SOURCE>
|
|
2439
|
+
<TARGET NAME="TGT" DATABASETYPE="Flat File">
|
|
2440
|
+
<TARGETFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" NULLABLE="NULL" KEYTYPE="NOT A KEY" FIELDNUMBER="1"/>
|
|
2441
|
+
</TARGET>
|
|
2442
|
+
<MAPPING NAME="m_lkp_test" ISVALID="YES">
|
|
2443
|
+
<TRANSFORMATION NAME="SQ_SRC" TYPE="Source Qualifier" REUSABLE="NO">
|
|
2444
|
+
<TRANSFORMFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" PORTTYPE="OUTPUT"/>
|
|
2445
|
+
</TRANSFORMATION>
|
|
2446
|
+
<TRANSFORMATION NAME="LKP_TEST" TYPE="Lookup Procedure" REUSABLE="NO">
|
|
2447
|
+
<TRANSFORMFIELD NAME="ID" DATATYPE="integer" PRECISION="10" SCALE="0" PORTTYPE="INPUT"/>
|
|
2448
|
+
<TRANSFORMFIELD NAME="RESULT" DATATYPE="string" PRECISION="100" SCALE="0" PORTTYPE="OUTPUT/RETURN"/>
|
|
2449
|
+
<TABLEATTRIBUTE NAME="Lookup table name" VALUE="DIM_TABLE"/>
|
|
2450
|
+
<TABLEATTRIBUTE NAME="Lookup condition" VALUE="ID = ID"/>
|
|
2451
|
+
</TRANSFORMATION>
|
|
2452
|
+
<INSTANCE NAME="SRC" TYPE="Source Definition" TRANSFORMATION_NAME="SRC"/>
|
|
2453
|
+
<INSTANCE NAME="SQ_SRC" TYPE="Source Qualifier" TRANSFORMATION_NAME="SQ_SRC"/>
|
|
2454
|
+
<INSTANCE NAME="LKP_TEST" TYPE="Lookup Procedure" TRANSFORMATION_NAME="LKP_TEST"/>
|
|
2455
|
+
<INSTANCE NAME="TGT" TYPE="Target Definition" TRANSFORMATION_NAME="TGT"/>
|
|
2456
|
+
<CONNECTOR FROMINSTANCE="SRC" FROMFIELD="ID" TOINSTANCE="SQ_SRC" TOFIELD="ID"/>
|
|
2457
|
+
<CONNECTOR FROMINSTANCE="SQ_SRC" FROMFIELD="ID" TOINSTANCE="LKP_TEST" TOFIELD="ID"/>
|
|
2458
|
+
<CONNECTOR FROMINSTANCE="LKP_TEST" FROMFIELD="RESULT" TOINSTANCE="TGT" TOFIELD="ID"/>
|
|
2459
|
+
</MAPPING>
|
|
2460
|
+
<CONFIG NAME="default_session_config"/>
|
|
2461
|
+
<WORKFLOW NAME="wf_lkp_test" ISVALID="YES">
|
|
2462
|
+
<TASK NAME="Start" REUSABLE="NO" TYPE="Start"/>
|
|
2463
|
+
<SESSION NAME="s_m_lkp_test" ISVALID="YES" REUSABLE="NO" MAPPINGNAME="m_lkp_test">
|
|
2464
|
+
<CONFIGREFERENCE REFOBJECTNAME="default_session_config" TYPE="Session config"/>
|
|
2465
|
+
</SESSION>
|
|
2466
|
+
<TASKINSTANCE NAME="Start" TASKNAME="Start" TASKTYPE="Start"/>
|
|
2467
|
+
<TASKINSTANCE NAME="s_m_lkp_test" TASKNAME="s_m_lkp_test" TASKTYPE="Session"/>
|
|
2468
|
+
<WORKFLOWLINK FROMTASK="Start" TOTASK="s_m_lkp_test"/>
|
|
2469
|
+
</WORKFLOW>
|
|
2470
|
+
</FOLDER>
|
|
2471
|
+
</REPOSITORY>
|
|
2472
|
+
</POWERMART>'''
|
|
2473
|
+
|
|
2474
|
+
def test_lookup_with_table_reads_from_db(self):
|
|
2475
|
+
converter = InformaticaConverter()
|
|
2476
|
+
tmpdir = tempfile.mkdtemp()
|
|
2477
|
+
try:
|
|
2478
|
+
converter.convert_string(self.LOOKUP_XML, output_dir=tmpdir)
|
|
2479
|
+
for fn in os.listdir(tmpdir):
|
|
2480
|
+
if fn.startswith("mapping_") and fn.endswith(".py"):
|
|
2481
|
+
with open(os.path.join(tmpdir, fn)) as f:
|
|
2482
|
+
code = f.read()
|
|
2483
|
+
assert "read_from_db" in code, "Lookup with table should use read_from_db"
|
|
2484
|
+
assert "DIM_TABLE" in code
|
|
2485
|
+
break
|
|
2486
|
+
finally:
|
|
2487
|
+
shutil.rmtree(tmpdir)
|
|
All remaining files are unchanged between 1.9.3 and 1.9.4 (the +0 -0 entries in the file list above).