informatica-python 1.9.1__tar.gz → 1.9.3__tar.gz

This diff compares the contents of two publicly released package versions as they appear in their public registry. It is provided for informational purposes only.
Files changed (32)
  1. {informatica_python-1.9.1 → informatica_python-1.9.3}/PKG-INFO +181 -47
  2. {informatica_python-1.9.1 → informatica_python-1.9.3}/README.md +180 -46
  3. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/__init__.py +1 -1
  4. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/generators/helper_gen.py +11 -0
  5. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/generators/mapping_gen.py +141 -57
  6. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/generators/workflow_gen.py +21 -4
  7. informatica_python-1.9.3/informatica_python/utils/expression_converter.py +1262 -0
  8. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python.egg-info/PKG-INFO +181 -47
  9. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python.egg-info/SOURCES.txt +1 -0
  10. {informatica_python-1.9.1 → informatica_python-1.9.3}/pyproject.toml +1 -1
  11. {informatica_python-1.9.1 → informatica_python-1.9.3}/tests/test_converter.py +171 -0
  12. informatica_python-1.9.3/tests/test_expressions.py +1195 -0
  13. {informatica_python-1.9.1 → informatica_python-1.9.3}/tests/test_integration.py +635 -0
  14. informatica_python-1.9.1/informatica_python/utils/expression_converter.py +0 -481
  15. {informatica_python-1.9.1 → informatica_python-1.9.3}/LICENSE +0 -0
  16. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/cli.py +0 -0
  17. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/converter.py +0 -0
  18. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/generators/__init__.py +0 -0
  19. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/generators/config_gen.py +0 -0
  20. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/generators/error_log_gen.py +0 -0
  21. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/generators/sql_gen.py +0 -0
  22. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/models.py +0 -0
  23. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/parser.py +0 -0
  24. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/utils/__init__.py +0 -0
  25. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/utils/datatype_map.py +0 -0
  26. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/utils/lib_adapters.py +0 -0
  27. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python/utils/sql_dialect.py +0 -0
  28. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python.egg-info/dependency_links.txt +0 -0
  29. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python.egg-info/entry_points.txt +0 -0
  30. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python.egg-info/requires.txt +0 -0
  31. {informatica_python-1.9.1 → informatica_python-1.9.3}/informatica_python.egg-info/top_level.txt +0 -0
  32. {informatica_python-1.9.1 → informatica_python-1.9.3}/setup.cfg +0 -0
@@ -1,6 +1,6 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: informatica-python
3
- Version: 1.9.1
3
+ Version: 1.9.3
4
4
  Summary: Convert Informatica PowerCenter workflow XML to Python/PySpark code
5
5
  Author: Nick
6
6
  License: MIT
@@ -79,25 +79,26 @@ from informatica_python import InformaticaConverter
79
79
 
80
80
  converter = InformaticaConverter()
81
81
 
82
- # Parse and generate files
83
- converter.convert_to_files("workflow_export.xml", "output_dir")
82
+ # Parse and generate files to a directory
83
+ converter.convert("workflow_export.xml", output_dir="output_dir")
84
84
 
85
- # Parse and generate zip
86
- converter.convert_to_zip("workflow_export.xml", "output.zip")
85
+ # Parse and generate zip archive
86
+ converter.convert("workflow_export.xml", output_zip="output.zip")
87
87
 
88
- # Parse to structured dict
88
+ # Parse to structured dict (no code generation)
89
89
  result = converter.parse_file("workflow_export.xml")
90
90
 
91
91
  # Use a different data library
92
- converter.convert_to_files("workflow_export.xml", "output_dir", data_lib="polars")
92
+ converter = InformaticaConverter(data_lib="polars")
93
+ converter.convert("workflow_export.xml", output_dir="output_dir")
93
94
  ```
94
95
 
95
96
  ## Generated Output Files
96
97
 
97
98
  | File | Description |
98
99
  |------|-------------|
99
- | `helper_functions.py` | Database/file I/O helpers, Informatica expression equivalents (80+ functions), window/analytic functions, stored procedure execution, state persistence |
100
- | `mapping_{name}.py` | One per mapping, named after the real Informatica mapping name — transformation logic with row-count logging, source reads, target writes, inline documentation |
100
+ | `helper_functions.py` | Database/file I/O helpers, 90+ Informatica expression equivalents, window/analytic functions, stored procedure execution, state persistence |
101
+ | `mapping_{name}.py` | One per mapping, named after the real Informatica mapping name — transformation logic with vectorized expressions, row-count logging, type casting, inline documentation |
101
102
  | `workflow.py` | Task orchestration with topological ordering, decision branching, worklet calls, and error handling |
102
103
  | `config.yml` | Connection configs, source/target metadata, runtime parameters |
103
104
  | `all_sql_queries.sql` | All SQL extracted from Source Qualifiers, Lookups, SQL transforms (with ANSI-translated variants) |
@@ -119,23 +120,22 @@ Select via `--data-lib` CLI flag or `data_lib` parameter:
119
120
 
120
121
  The code generator produces real, runnable Python for these transformation types:
121
122
 
122
- - **Source Qualifier** — SQL override, pre/post SQL, column selection, session connection overrides
123
- - **Expression** — Field-level expressions converted to vectorized pandas operations (`df["COL"]` style)
123
+ - **Source Qualifier** — SQL override, pre/post SQL, column selection, session connection overrides, `$$PARAM` substitution in SQL
124
+ - **Expression** — Field-level expressions converted to vectorized pandas operations (`df["COL"]` style) with 40+ vectorized function handlers
124
125
  - **Filter** — Row filtering with vectorized converted conditions
125
126
  - **Joiner** — `pd.merge()` with join type and condition parsing (inner/left/right/outer)
126
- - **Lookup** — `pd.merge()` lookups with connection-aware DB/file reads, multiple match policies, default values
127
+ - **Lookup** — `pd.merge()` lookups with connection-aware DB reads, multiple match policies, default values, `$$PARAM` substitution
127
128
  - **Aggregator** — `groupby().agg()` with SUM/COUNT/AVG/MIN/MAX/FIRST/LAST, computed aggregates
128
- - **Sorter** — `sort_values()` with multi-key ascending/descending
129
+ - **Sorter** — `sort_values()` with multi-key ascending/descending per-field direction from SORTDIRECTION attribute
129
130
  - **Router** — Multi-group conditional routing with named groups
130
131
  - **Union** — `pd.concat()` across multiple input groups
131
- - **Update Strategy** — DD_INSERT/DD_UPDATE/DD_DELETE/DD_REJECT routing with actual target INSERT/UPDATE/DELETE operations, dialect-aware SQL placeholders, auto-detected primary keys
132
+ - **Update Strategy** — DD_INSERT/DD_UPDATE/DD_DELETE/DD_REJECT routing with actual target INSERT/UPDATE/DELETE operations, dialect-aware SQL placeholders, auto-detected primary keys; vectorized expression parsing with row-level fallback
132
133
  - **Sequence Generator** — Auto-incrementing ID columns
133
134
  - **Normalizer** — `pd.melt()` with auto-detected id/value vars
134
135
  - **Rank** — `groupby().rank()` with Top-N filtering
135
136
  - **Stored Procedure** — Full code generation with Oracle/MSSQL/generic support, input/output parameter mapping
136
- - **Transaction Control** — Commit/rollback logic
137
137
  - **Custom / Java** — Placeholder stubs with TODO markers
138
- - **SQL Transform** — Direct SQL execution pass-through
138
+ - **SQL Transform** — Direct SQL execution pass-through with `$$PARAM` substitution
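To make a few of the transformation mappings listed above concrete, here is a minimal pandas sketch of the Joiner, Aggregator, Sorter, and Rank patterns. The frames and column names (`orders`, `customers`, `CUST_ID`, `AMT`, `REGION`) are invented for illustration, and this is not the code the generator actually emits.

```python
import pandas as pd

# Hypothetical input frames, stand-ins for two Source Qualifier outputs.
orders = pd.DataFrame({"CUST_ID": [1, 1, 2, 3], "AMT": [10.0, 30.0, 5.0, 8.0]})
customers = pd.DataFrame({"CUST_ID": [1, 2], "REGION": ["EU", "US"]})

# Joiner: left outer join on the key column.
joined = pd.merge(orders, customers, on="CUST_ID", how="left")

# Aggregator: group by key, then SUM and COUNT.
totals = joined.groupby("CUST_ID", as_index=False).agg(
    TOTAL=("AMT", "sum"),
    N_ORDERS=("AMT", "count"),
)

# Sorter: one ascending flag per sort key (per-field direction).
ranked = totals.sort_values(by=["TOTAL", "CUST_ID"], ascending=[False, True])

# Rank with Top-N filtering (ungrouped here; a grouped rank would go
# through groupby().rank() instead).
totals["RNK"] = totals["TOTAL"].rank(method="first", ascending=False)
top2 = totals[totals["RNK"] <= 2]
```

The `ascending` list must have the same length as `by`, which is why a per-field direction maps naturally onto it.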
139
139
 
140
140
  ## Supported XML Tags (72 Tags)
141
141
 
@@ -153,6 +153,86 @@ The code generator produces real, runnable Python for these transformation types
153
153
 
154
154
  ## Key Features
155
155
 
156
+ ### Generated Code Quality (v1.9.3+)
157
+
158
+ Generated code follows clean formatting and commenting standards:
159
+ - Consistent section headers (`# ---`) for Source Qualifiers, Transformations, and Target Writes
160
+ - Each section includes metadata: database type, field lists, descriptions
161
+ - Column mapping comments (`# Column mapping: source -> target`) and write operation type comments (`# Write to database table` / `# Write to file`)
162
+ - Expression inline comments showing original Informatica expression (e.g., `# FULL_NAME = UPPER(FIRST_NAME) || ' ' || UPPER(LAST_NAME)`)
163
+ - Clean indentation: no blank line after `try:`, no consecutive blank lines inside function body
164
+ - Mapping-level `try:/except` wrapper with `logger.error()` for runtime visibility
165
+
166
+ ### Smart Target Write Detection (v1.9.3+)
167
+
168
+ Targets are automatically classified as database or file writes:
169
+ - Targets with `database_type` set (Oracle, SQL Server, etc.) generate `write_to_db()` calls
170
+ - Targets with flatfile metadata or file extensions (`.csv`, `.dat`, `.txt`, `.xml`, `.json`, `.parquet`, `.xlsx`, `.xls`, `.tsv`, `.avro`) generate `write_file()` calls
171
+ - Bare targets (no metadata) default to `write_to_db()` since Informatica targets are typically database tables
172
+ - Schema-qualified names (e.g., `dbo.MY_TABLE`) correctly route to database writes
173
+ - Session file path overrides take priority when present
174
+
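The classification rules above can be sketched as a small decision function. `classify_target` and its parameters are hypothetical names, and the order in which `database_type` and flat-file metadata are checked is an assumption; the README only states that session file path overrides take priority.

```python
import os

# File extensions treated as file targets (list taken from the README).
FILE_EXTENSIONS = {
    ".csv", ".dat", ".txt", ".xml", ".json",
    ".parquet", ".xlsx", ".xls", ".tsv", ".avro",
}


def classify_target(name, database_type=None, is_flatfile=False,
                    session_file_path=None):
    """Return "file" or "db" for a target (illustrative sketch only)."""
    if session_file_path:          # session override wins
        return "file"
    if database_type:              # e.g. "Oracle", "Microsoft SQL Server"
        return "db"
    if is_flatfile:                # flat-file metadata present
        return "file"
    ext = os.path.splitext(name)[1].lower()
    if ext in FILE_EXTENSIONS:     # known file extension
        return "file"
    return "db"                    # bare or schema-qualified name
```

A schema-qualified name such as `dbo.MY_TABLE` has the suffix `.MY_TABLE`, which is not in the allowlist, so it falls through to the database default.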
175
+ ### Vectorized Expression Engine (v1.9.2+)
176
+
177
+ Column-level pandas operations instead of row-level iteration. The expression converter uses a recursive parenthesis-aware parser that handles:
178
+
179
+ **Conditional / Null:**
180
+ - `IIF(cond, val, else_val)` → `np.where()` — supports 2-arg form (missing else defaults to `None`)
181
+ - `DECODE(TRUE, cond1, val1, ..., default)` → nested `np.where()` chains
182
+ - `DECODE(field, val1, res1, ..., default)` → value-matching `np.where()`
183
+ - `NVL(val, default)` → `.fillna()`
184
+ - `IS_SPACES(field)` → `field.str.strip().eq("")`
185
+ - `IS_NUMBER(field)` → `pd.to_numeric(field, errors="coerce").notna()`
186
+ - `IN(field, val1, val2, ...)` → `field.isin([...])`
187
+
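As an illustration of the conditional and null mappings above, the following sketch writes out by hand what the vectorized forms look like in pandas and NumPy. The sample frame and column names are made up, and the exact code emitted by the converter may differ.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "AMOUNT": [120.0, 40.0, np.nan],
    "STATUS": ["A", "I", "X"],
    "CODE": ["12", "  ", "ab"],
})

# IIF(AMOUNT > 100, 'HIGH', 'LOW')
df["TIER"] = np.where(df["AMOUNT"] > 100, "HIGH", "LOW")

# DECODE(TRUE, AMOUNT > 100, 'HIGH', AMOUNT > 50, 'MID', 'LOW')
df["BAND"] = np.where(df["AMOUNT"] > 100, "HIGH",
                      np.where(df["AMOUNT"] > 50, "MID", "LOW"))

# DECODE(STATUS, 'A', 'Active', 'I', 'Inactive', 'Unknown')
df["STATUS_TXT"] = np.where(df["STATUS"] == "A", "Active",
                            np.where(df["STATUS"] == "I", "Inactive", "Unknown"))

# NVL(AMOUNT, 0)
df["AMOUNT_FILLED"] = df["AMOUNT"].fillna(0)

# IS_SPACES(CODE), IS_NUMBER(CODE), IN(STATUS, 'A', 'I')
df["CODE_BLANK"] = df["CODE"].str.strip().eq("")
df["CODE_NUMERIC"] = pd.to_numeric(df["CODE"], errors="coerce").notna()
df["STATUS_KNOWN"] = df["STATUS"].isin(["A", "I"])
```

Note that a comparison against `NaN` is `False`, so the null `AMOUNT` row lands in the else branch of each `np.where()`.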
188
+ **String:**
189
+ - `UPPER/LOWER` → `.str.upper()/.str.lower()`
190
+ - `LTRIM/RTRIM/TRIM` → `.str.lstrip()/.str.rstrip()/.str.strip()` with custom char support
191
+ - `SUBSTR(val, start, len)` → `.str[start:end]`
192
+ - `INSTR(val, search)` → `.str.find()`
193
+ - `LPAD/RPAD` → `.str.pad()`
194
+ - `REVERSE(val)` → `.str[::-1]`
195
+ - `INITCAP(val)` → `.str.title()`
196
+ - `REPLACECHR/REPLACESTR` → `.str.replace()`
197
+ - `REG_EXTRACT/REG_REPLACE` → `.str.extract()/.str.replace(regex=True)`
198
+ - `CHR(code)` → `chr(int(code))`
199
+ - `||` concatenation → `+` with `.astype(str)` on non-literals
200
+
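The string mappings above can be pictured with a short hand-written pandas sketch. Column names are invented. The index shifts for `SUBSTR` and `INSTR` are assumptions about how Informatica's 1-based positions would be reconciled with Python's 0-based ones; they are not taken from the package source.

```python
import pandas as pd

df = pd.DataFrame({
    "FIRST_NAME": ["ada", "alan"],
    "LAST_NAME": ["lovelace", "turing"],
    "ID": [7, 42],
})

# UPPER(FIRST_NAME) || ' ' || UPPER(LAST_NAME)
df["FULL_NAME"] = df["FIRST_NAME"].str.upper() + " " + df["LAST_NAME"].str.upper()

# SUBSTR(LAST_NAME, 1, 3): 1-based start becomes a 0-based slice
df["PREFIX"] = df["LAST_NAME"].str[0:3]

# INSTR(LAST_NAME, 'r'): str.find() is 0-based and returns -1 on a miss,
# so adding 1 yields a 1-based position with 0 meaning "not found"
df["R_POS"] = df["LAST_NAME"].str.find("r") + 1

# LPAD(ID, 5, '0'): non-string columns are cast to str first
df["ID_PADDED"] = df["ID"].astype(str).str.pad(5, side="left", fillchar="0")

# REVERSE(FIRST_NAME) and INITCAP(FIRST_NAME)
df["REVERSED"] = df["FIRST_NAME"].str[::-1]
df["CAPPED"] = df["FIRST_NAME"].str.title()
```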
201
+ **Date/Time:**
202
+ - `TO_DATE(val, fmt)` → `pd.to_datetime()` with Informatica→Python format conversion
203
+ - `TO_CHAR(val, fmt)` → `.dt.strftime()`
204
+ - `ADD_TO_DATE(date, part, amount)` → `date + pd.to_timedelta()` with full unit mapping (YY/MM/DD/HH/MI/SS)
205
+ - `DATE_DIFF(date1, date2, part)` → `(date1 - date2).dt.days` / `.dt.total_seconds() / 3600` etc.
206
+ - `SYSDATE/SYSTIMESTAMP` → `pd.Timestamp.now()`
207
+ - `TRUNC(date, 'DD')` → date truncation via `.dt.floor()/.dt.to_period()`
208
+ - `MAKE_DATE_TIME(y, m, d, h, mi, s)` → `pd.Timestamp()`
209
+
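A hand-written sketch of the date mappings above, using made-up columns `ORDER_DT` and `SHIP_DT`. It covers only fixed-length units. Month and year arithmetic is not a fixed timedelta, so a real implementation would likely need something like `pd.DateOffset`, which is an assumption on my part and not something the README states.

```python
import pandas as pd

df = pd.DataFrame({
    "ORDER_DT": ["2024-01-15 10:30:00", "2024-03-01 00:00:00"],
    "SHIP_DT": ["2024-01-17 16:30:00", "2024-03-01 12:00:00"],
})

# TO_DATE(col, 'YYYY-MM-DD HH24:MI:SS') with the format translated to strftime codes
for col in ("ORDER_DT", "SHIP_DT"):
    df[col] = pd.to_datetime(df[col], format="%Y-%m-%d %H:%M:%S")

# ADD_TO_DATE(ORDER_DT, 'DD', 7)
df["DUE_DT"] = df["ORDER_DT"] + pd.to_timedelta(7, unit="D")

# DATE_DIFF(SHIP_DT, ORDER_DT, 'DD') and DATE_DIFF(SHIP_DT, ORDER_DT, 'HH')
delta = df["SHIP_DT"] - df["ORDER_DT"]
df["DIFF_DAYS"] = delta.dt.days
df["DIFF_HOURS"] = delta.dt.total_seconds() / 3600

# TRUNC(ORDER_DT, 'DD') and TO_CHAR(ORDER_DT, 'YYYY-MM-DD')
df["ORDER_DAY"] = df["ORDER_DT"].dt.floor("D")
df["ORDER_STR"] = df["ORDER_DT"].dt.strftime("%Y-%m-%d")
```

`.dt.days` drops the fractional part of the difference, while `.dt.total_seconds() / 3600` keeps it, so the two forms are not interchangeable.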
210
+ **Numeric:**
211
+ - `TO_INTEGER/TO_BIGINT/TO_FLOAT/TO_DECIMAL` → `pd.to_numeric()`
212
+ - `TRUNC(val)` → `np.trunc()` for numeric truncation
213
+ - `ROUND/ABS/CEIL/FLOOR/POWER/SQRT/MOD/LOG/SIGN` → `np.*` equivalents
214
+
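For the numeric mappings above, a minimal sketch on an invented `RAW` column. Using `errors="coerce"` for the conversion is an assumption; the README only says that `pd.to_numeric()` is used.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"RAW": ["3.7", "-2.5", "n/a"]})

# TO_FLOAT(RAW): unparseable values become NaN under errors="coerce"
df["NUM"] = pd.to_numeric(df["RAW"], errors="coerce")

# TRUNC(NUM): truncates toward zero, unlike FLOOR
df["TRUNCATED"] = np.trunc(df["NUM"])

# ABS / CEIL / FLOOR / SIGN
df["ABS"] = np.abs(df["NUM"])
df["CEIL"] = np.ceil(df["NUM"])
df["FLOOR"] = np.floor(df["NUM"])
df["SIGN"] = np.sign(df["NUM"])
```

The negative row shows why `TRUNC` and `FLOOR` need separate handlers: `np.trunc(-2.5)` is `-2.0`, whereas `np.floor(-2.5)` is `-3.0`.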
215
+ **Special:**
216
+ - `:LKP.TABLE(args)` — Connected lookup references → `df_lkp_table` merge
217
+ - `:PORT.FUNC(args)` — Unconnected lookups → `lookup_func("FUNC", args)` calls
218
+ - Inline `--` comment stripping (respects string literals)
219
+ - String-literal-aware field substitution
220
+
221
+ ### Expression Converter (90+ Row-Level Functions)
222
+
223
+ All Informatica expression functions are available as row-level Python equivalents in `helper_functions.py`:
224
+
225
+ - **String:** `substr`, `ltrim`, `rtrim`, `upper`, `lower`, `lpad`, `rpad`, `instr`, `length`, `concat`, `replacechr`, `replacestr`, `reg_extract`, `reg_replace`, `reg_match`, `reverse_str`, `initcap`, `chr_func`, `ascii_func`, `left_str`, `right_str`, `trim_func`, `indexof`, `metaphone_func`, `soundex_func`, `compress_func`, `decompress_func`
226
+ - **Date:** `add_to_date`, `date_diff`, `date_compare`, `get_date_part`, `set_date_part`, `last_day`, `make_date_time`, `to_date`, `to_char`, `to_timestamp_func`, `current_timestamp`, `session_start_time`
227
+ - **Numeric:** `round_val`, `trunc`, `mod_val`, `abs_val`, `ceil_val`, `floor_val`, `power_val`, `sqrt_val`, `log_val`, `ln_val`, `exp_val`, `sign_val`, `rand_val`, `greatest_val`, `least_val`
228
+ - **Conversion:** `to_integer`, `to_bigint`, `to_float`, `to_decimal`, `cast_func`
229
+ - **Null/Conditional:** `iif_expr`, `decode_expr`, `nvl`, `nvl2`, `isnull`, `is_spaces`, `is_number`, `is_date`, `in_expr`, `choose_expr`
230
+ - **Aggregate:** `sum_val`, `avg_val`, `count_val`, `min_val`, `max_val`, `first_val`, `last_val`, `median_val`, `stddev_val`, `variance_val`, `percentile_val`
231
+ - **Window/Analytic:** `moving_avg`, `moving_avg_df`, `moving_sum`, `moving_sum_df`, `cume`, `cume_df`, `percentile_df`
232
+ - **Lookup:** `lookup_func` — Placeholder for runtime lookup resolution
233
+ - **Variable:** `get_variable`, `set_variable`, `set_count_variable`
234
+ - **Control:** `raise_error`, `abort_func`
235
+
156
236
  ### Row-Count Logging (v1.8+)
157
237
 
158
238
  Generated code automatically logs row counts at every step of the data pipeline:
@@ -165,8 +245,6 @@ AGG_TOTALS (Aggregator): 8542 input rows -> 150 output rows
165
245
  Target TGT_SUMMARY: 150 rows written
166
246
  ```
167
247
 
168
- All row-count operations are backend-safe (wrapped in try/except), so Dask and other lazy-evaluation backends won't fail.
169
-
170
248
  ### Generated Code Documentation (v1.8+)
171
249
 
172
250
  Every generated mapping function includes a rich docstring describing:
@@ -179,14 +257,6 @@ Each transformation block is annotated with:
179
257
  - Transform type and description (from Informatica XML)
180
258
  - Input and output field lists (truncated at 10 for readability)
181
259
 
182
- ### Window / Analytic Functions (v1.7+)
183
-
184
- DataFrame-level analytic functions for aggregation transforms:
185
- - `moving_avg_df(df, col, window)` — rolling mean via `.rolling().mean()`
186
- - `moving_sum_df(df, col, window)` — rolling sum via `.rolling().sum()`
187
- - `cume_df(df, col)` — cumulative sum via `.expanding().sum()`
188
- - `percentile_df(df, col, pct)` — quantile via `.quantile()`
189
-
190
260
  ### Update Strategy with Target Operations (v1.7+)
191
261
 
192
262
  Update Strategy transforms now generate real INSERT/UPDATE/DELETE operations:
@@ -196,6 +266,14 @@ Update Strategy transforms now generate real INSERT/UPDATE/DELETE operations:
196
266
  - Dialect-aware SQL placeholders (`?` for MSSQL, `%s` for PostgreSQL/Oracle)
197
267
  - Primary key columns auto-detected from target field definitions
198
268
 
269
+ ### Window / Analytic Functions (v1.7+)
270
+
271
+ DataFrame-level analytic functions for aggregation transforms:
272
+ - `moving_avg_df(df, col, window)` — rolling mean via `.rolling().mean()`
273
+ - `moving_sum_df(df, col, window)` — rolling sum via `.rolling().sum()`
274
+ - `cume_df(df, col)` — cumulative sum via `.expanding().sum()`
275
+ - `percentile_df(df, col, pct)` — quantile via `.quantile()`
276
+
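The four helpers above could plausibly be implemented as thin wrappers like the following. The function names and signatures come from the README, but the bodies and return types (a Series, or a scalar for the percentile) are assumptions, not the shipped `helper_functions.py` code.

```python
import pandas as pd


def moving_avg_df(df, col, window):
    """Rolling mean over `window` rows (NaN until the window fills)."""
    return df[col].rolling(window).mean()


def moving_sum_df(df, col, window):
    """Rolling sum over `window` rows."""
    return df[col].rolling(window).sum()


def cume_df(df, col):
    """Cumulative sum from the first row to the current row."""
    return df[col].expanding().sum()


def percentile_df(df, col, pct):
    """Quantile of the column, with `pct` given as a fraction in [0, 1]."""
    return df[col].quantile(pct)


sales = pd.DataFrame({"AMT": [10.0, 20.0, 30.0, 40.0]})
sales["AVG_2"] = moving_avg_df(sales, "AMT", 2)
sales["RUNNING_TOTAL"] = cume_df(sales, "AMT")
median_amt = percentile_df(sales, "AMT", 0.5)
```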
199
277
  ### Stored Procedure Execution (v1.7+)
200
278
 
201
279
  Full stored procedure code generation (not just stubs):
@@ -241,19 +319,13 @@ Optional `--validate-casts` flag generates null-count checks before/after type c
241
319
  - Logs warnings when coercion introduces new nulls
242
320
  - Helps identify data quality issues during test runs
243
321
 
244
- ### Vectorized Expression Generation (v1.5+)
245
-
246
- Column-level pandas operations instead of row-level iteration:
247
- - IIF → `np.where()`, NVL → `.fillna()`, UPPER/LOWER → `.str.upper()/.str.lower()`
248
- - SUBSTR → `.str[start:end]`, TO_INTEGER → `pd.to_numeric()`, TO_DATE → `pd.to_datetime()`
249
- - IS NULL/IS NOT NULL → `.isna()`/`.notna()`
250
-
251
322
  ### Parameter File Support (v1.5+)
252
323
 
253
324
  Standard Informatica `.param` file parsing:
254
325
  - `[Global]` and `[folder.WF:workflow.ST:session]` section support
255
326
  - `get_param(config, var_name)` resolution chain: config → env vars → defaults
256
327
  - CLI `--param-file` flag for specifying parameter files
328
+ - `$$PARAM` variables in SQL automatically substituted with `.replace()` calls
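The `$$PARAM` substitution can be pictured as follows. This is a simplified sketch: `substitute_params` is a hypothetical helper that takes a plain dict, whereas the generated code is described as calling `get_param(config, 'VAR')` for each variable.

```python
import re


def substitute_params(sql, params):
    """Replace each $$NAME token in `sql` with its value from `params`.

    Longer tokens are replaced first so that $$START does not clobber
    the prefix of $$START_DATE. Unknown tokens are left untouched.
    """
    tokens = sorted(set(re.findall(r"\$\$\w+", sql)), key=len, reverse=True)
    for token in tokens:
        name = token[2:]
        if name in params:
            sql = sql.replace(token, str(params[name]))
    return sql


sql = "SELECT * FROM ORDERS WHERE ORDER_DT >= '$$START_DATE' AND SRC = '$$START'"
resolved = substitute_params(sql, {"START_DATE": "2024-01-01", "START": "WEB"})
```

Plain string replacement does no quoting or escaping, so values taken from untrusted sources should not be substituted into SQL this way.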
257
329
 
258
330
  ### Session Connection Overrides (v1.4+)
259
331
 
@@ -283,18 +355,49 @@ Expands Mapplet instances into prefixed transforms, rewires connectors, and elim
283
355
 
284
356
  Converts Informatica decision conditions to Python if/else branches with proper variable substitution.
285
357
 
286
- ### Expression Converter (80+ Functions)
287
-
288
- Converts Informatica expressions to Python equivalents:
289
-
290
- - **String:** SUBSTR, LTRIM, RTRIM, UPPER, LOWER, LPAD, RPAD, INSTR, LENGTH, CONCAT, REPLACE, REG_EXTRACT, REG_REPLACE, REVERSE, INITCAP, CHR, ASCII
291
- - **Date:** ADD_TO_DATE, DATE_DIFF, GET_DATE_PART, SYSDATE, SYSTIMESTAMP, TO_DATE, TO_CHAR, TRUNC (date)
292
- - **Numeric:** ROUND, TRUNC, MOD, ABS, CEIL, FLOOR, POWER, SQRT, LOG, EXP, SIGN
293
- - **Conversion:** TO_INTEGER, TO_BIGINT, TO_FLOAT, TO_DECIMAL, TO_CHAR, TO_DATE
294
- - **Null handling:** IIF, DECODE, NVL, NVL2, ISNULL, IS_SPACES, IS_NUMBER
295
- - **Aggregate:** SUM, AVG, COUNT, MIN, MAX, FIRST, LAST, MEDIAN, STDDEV, VARIANCE
296
- - **Lookup:** :LKP expressions with dynamic lookup references
297
- - **Variable:** SETVARIABLE / mapping variable assignment
358
+ ## Helper Functions Library
359
+
360
+ The generated `helper_functions.py` provides a complete runtime library:
361
+
362
+ ### Configuration & Parameters
363
+ | Function | Description |
364
+ |----------|-------------|
365
+ | `load_config(path, param_file)` | Load YAML config with optional `.param` file merge |
366
+ | `parse_param_file(path)` | Parse Informatica `.param` files (`[Global]`, `[folder.WF:...]` sections) |
367
+ | `get_param(config, var_name, default)` | Resolve parameter: config → env vars → default |
368
+ | `get_variable(var_name, config)` | Get workflow/mapping variable from params, env vars, or param store |
369
+ | `set_variable(var_name, value)` | Set workflow/mapping variable in param store and env |
370
+
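The resolution chain documented for `get_param` (config, then environment variables, then the default) might look roughly like this. The body is an assumption for illustration; in particular, treating `config` as a flat dict is a simplification, and the real helper may look inside nested sections.

```python
import os


def get_param(config, var_name, default=None):
    """Resolve a parameter: config first, then env vars, then default."""
    if var_name in config:
        return config[var_name]
    if var_name in os.environ:
        return os.environ[var_name]
    return default


# Illustrative values only.
os.environ["DEMO_REGION"] = "EU"
batch = get_param({"BATCH_ID": "7"}, "BATCH_ID")
region = get_param({}, "DEMO_REGION")
fallback = get_param({}, "DEMO_NOT_SET_ANYWHERE", "n/a")
```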
371
+ ### Database Operations
372
+ | Function | Description |
373
+ |----------|-------------|
374
+ | `get_db_connection(config, conn_name)` | Create DB connection (pyodbc/pymssql/sqlalchemy fallback for MSSQL) |
375
+ | `read_from_db(config, query, conn_name)` | Execute SQL query and return DataFrame |
376
+ | `write_to_db(config, df, table, conn_name)` | Write DataFrame to database table via `.to_sql()` |
377
+ | `execute_sql(config, sql, conn_name)` | Execute DDL/DML statement (INSERT, UPDATE, DELETE) |
378
+ | `write_with_update_strategy(config, df, table, ...)` | Split rows by `_update_strategy` column into INSERT/UPDATE/DELETE/REJECT operations |
379
+ | `call_stored_procedure(config, proc, params, ...)` | Execute stored procedure with input/output parameter mapping (Oracle/MSSQL/generic) |
380
+
381
+ ### File Operations
382
+ | Function | Description |
383
+ |----------|-------------|
384
+ | `read_file(path, file_config)` | Read CSV/DAT/TXT/XML/XLSX/JSON/Parquet with auto-detection |
385
+ | `write_file(df, path, file_config)` | Write DataFrame to file with format auto-detection |
386
+
387
+ ### State Persistence
388
+ | Function | Description |
389
+ |----------|-------------|
390
+ | `load_persistent_state(file)` | Load JSON state file for persistent variables |
391
+ | `save_persistent_state(file)` | Save persistent variables to JSON state file |
392
+ | `get_persistent_variable(scope, var, default)` | Get scoped persistent variable |
393
+ | `set_persistent_variable(scope, var, value)` | Set scoped persistent variable |
394
+
395
+ ### Logging & Monitoring
396
+ | Function | Description |
397
+ |----------|-------------|
398
+ | `log_mapping_start(name)` | Log mapping start with timestamp |
399
+ | `log_mapping_end(name, start_time, row_count)` | Log mapping completion with elapsed time |
400
+ | `validate_row_count(df, name, min_rows)` | Validate minimum row count threshold |
298
401
 
299
402
  ## Requirements
300
403
 
@@ -304,9 +407,40 @@ Converts Informatica expressions to Python equivalents:
304
407
 
305
408
  ## Changelog
306
409
 
307
- ### v1.9.x (Phase 8)
410
+ ### v1.9.3 (Current)
411
+ - **Smart target write detection**: Bare targets default to `write_to_db()` instead of `write_file()`; file extension allowlist (`.csv`, `.dat`, `.txt`, `.xml`, `.json`, `.parquet`, `.xlsx`, `.xls`, `.tsv`, `.avro`) for file targets; schema-qualified names (`dbo.TABLE`) correctly route to database
412
+ - **DECODE vectorization**: `DECODE(TRUE, cond1, val1, ..., default)` → nested `np.where()` chains; value-matching DECODE; handles IN() conditions and complex boolean nesting
413
+ - **IS_SPACES vectorization**: `IS_SPACES(field)` → `field.str.strip().eq("")`
414
+ - **2-arg IIF**: `IIF(cond, val)` without else clause defaults to `None`
415
+ - **REVERSE vectorization**: `REVERSE(field)` → `field.str[::-1]`
416
+ - **IN() vectorization**: `IN(field, val1, val2, ...)` → `field.isin([...])`
417
+ - **IS_NUMBER vectorization**: `IS_NUMBER(field)` → `pd.to_numeric(field, errors="coerce").notna()`
418
+ - **SYSDATE/SYSTIMESTAMP**: Bare `SYSDATE`/`SYSTIMESTAMP` → `pd.Timestamp.now()` in vectorized mode
419
+ - **TRUNC vectorization**: Numeric `TRUNC(field)` → `np.trunc()`; date `TRUNC(field, 'DD')` → `.dt.floor()`
420
+ - **ADD_TO_DATE vectorization**: `ADD_TO_DATE(date, part, amount)` → `pd.to_timedelta()` with YY/MM/DD/HH/MI/SS units
421
+ - **DATE_DIFF vectorization**: `DATE_DIFF(date1, date2, part)` → arithmetic on timedelta components
422
+ - **Unconnected lookup support**: `:PORT.FUNC_NAME(args)` → `lookup_func("FUNC_NAME", args)`
423
+ - **Inline comment stripping**: `--` comments removed from expressions (respects string literals)
424
+ - **`$$PARAM` SQL substitution**: Source Qualifier, Lookup, and SQL Transform SQL strings auto-substitute `$$VAR` with `get_param(config, 'VAR')` calls
425
+ - **Sorter direction**: Reads `SORTDIRECTION` from field attributes, generates per-field `ascending=[True, False, ...]`
426
+ - **Pass-through optimization**: Identity expressions skip `.copy()` and use direct reference
427
+ - **Duplicate lookup deduplication**: `_gen_lookup_transform` uses `seen_output_cols` set to avoid duplicate column checks
428
+ - **Mapping-level error handling**: Generated function body wrapped in `try:/except` with `logger.error()`
429
+ - **Update strategy vectorized**: Tries vectorized expression first, falls back to row-level `apply()`
430
+ - **Generated code formatting**: Consistent `# ---` section headers for Source Qualifiers, Transforms, and Target Writes; metadata comments (database type, field lists); column mapping and write operation comments; clean blank line handling
431
+ - **Source/target detection**: Case-insensitive instance type matching
432
+ - **Session→mapping inference**: Longest-suffix-match strategy for ambiguous mapping names
433
+ - **646 tests** across unit, integration, expression, and formatting test suites
434
+
435
+ ### v1.9.2 (Phase 8)
308
436
  - Mapping output files now use real mapping names (e.g., `mapping_m_customer_load.py`) instead of generic numeric indices (`mapping_1.py`)
309
437
  - Workflow imports automatically match the named mapping files
438
+ - **Expression converter rewrite**: Recursive parenthesis-aware parser replacing simple regex; fixes nested IIF/INSTR/LTRIM/RTRIM/REPLACECHR/REPLACESTR/SUBSTR/TO_CHAR/CHR/MAKE_DATE_TIME
439
+ - **`:LKP.` references** now properly converted to `lookup_func()` calls in vectorized mode
440
+ - **String literal safety**: `||` concatenation no longer applies `.astype(str)` to string literals
441
+ - **NULL/TRUE/FALSE**: Correctly resolved as `None`/`True`/`False` before field-name substitution
442
+ - **`import pandas as pd`** and `from datetime import datetime` now included in generated mapping files
443
+ - **MSSQL connection fallbacks**: `pymssql` and `sqlalchemy` tried when `pyodbc` unavailable
310
444
 
311
445
  ### v1.8.x (Phase 7)
312
446
  - Row-count logging at every pipeline step (source reads, transforms, target writes)
@@ -361,7 +495,7 @@ Converts Informatica expressions to Python equivalents:
361
495
  cd informatica_python
362
496
  pip install -e ".[dev]"
363
497
 
364
- # Run tests (136 tests)
498
+ # Run tests (646 tests)
365
499
  pytest tests/ -v
366
500
  ```