sandwich 0.2.0__py3-none-any.whl → 0.2.2__py3-none-any.whl

@@ -1,144 +1,177 @@
1
- Metadata-Version: 2.4
1
+ Metadata-Version: 2.3
2
2
  Name: sandwich
3
- Version: 0.2.0
3
+ Version: 0.2.2
4
4
  Summary: DataVault 2.0 code gen
5
- Author-email: Andrey Morozov <andrey@morozov.lv>
6
- License-File: LICENSE
7
5
  Keywords: DWH,Data Vault 2.0
8
- Classifier: Development Status :: 1 - Planning
9
- Classifier: Environment :: Console
10
- Classifier: Intended Audience :: Developers
6
+ Author: Andrey Morozov
7
+ Author-email: Andrey Morozov <andrey@morozov.lv>
8
+ Classifier: Programming Language :: Python :: 3
11
9
  Classifier: License :: OSI Approved :: MIT License
10
+ Classifier: Topic :: Database
12
11
  Classifier: Operating System :: OS Independent
13
- Classifier: Programming Language :: Python :: 3
12
+ Classifier: Environment :: Console
13
+ Classifier: Development Status :: 2 - Pre-Alpha
14
+ Classifier: Intended Audience :: Developers
14
15
  Classifier: Typing :: Typed
15
- Requires-Python: >=3.12
16
- Requires-Dist: dotenv>=0.9.9
17
- Requires-Dist: pyodbc>=5.3.0
18
- Requires-Dist: sqlalchemy>=2.0.44
16
+ Requires-Dist: sqlalchemy
17
+ Requires-Python: >=3.14
19
18
  Description-Content-Type: text/markdown
20
19
 
21
- ## Data Vault 2.0 scaffolding tool
22
- This tool is designed to streamline the process of creating Data Vault 2.0 entities,
23
- such as hubs, links, and satellites.
24
- As well as building information layer objects such as dim and fact tables
25
- from a multidimensional paradigm.
26
-
27
- ### How it works:
28
- User: provides a staging view `stg.[entity_name]` (or a table if the staging layer persisted)
29
- with all requirements for the `[entity_name]` defined in the schema (how to define see below).
30
- Tool:
31
- 1. Validates metadata of the provided staging view or table.
32
- 2. Generates the necessary DDL statements to create the Data Vault 2.0 entities.
33
- 3. Generates ELT procedures to load data to the generated entities.
34
- 4. Generates support procedures such as `meta.Drop_all_related_to_[entity_name]` and `elt.Run_all_related_to_[entity_name]`
35
-
36
- ```text
37
- +----------------------+
38
- | hub.[entity_name] |
39
- +----------------------+
40
- ^
41
- o 1.define +-------------------+ | 3.create
42
- /|\ -------> | stg.[entity_name] | # +----------------------+
43
- / \ +-------------------+ /|\ ---------> | sat.[entity_name] |
44
- User ---------------------------------------> / \ 3.create +----------------------+
45
- 2.use Tool
46
- | 3.create
47
- v
48
- +----------------------+
49
- | dim.[entity_name] |
50
- +----------------------+
51
-
52
- ```
53
-
54
- ### How to define a staging view or table:
55
- * `bk_` (BusinessKey) - at least one `bk_` column
56
- * `hk_[entity_name]` (HashKey) - exactly one `hk_[entity_name]` column if you want a `hub` table created
57
- * `LoadDate` - required by dv2 standard for an auditability
58
- * `RecordSource` - required by dv2 standard for an auditability
59
- * `HashDiff` - optional, required if you want to have a scd2 type `dim` table created
60
- * `IsAvailable` - optional, required if you want to track missing/deleted records
61
- * all other columns will be considered as business columns and will be included to the `sat` table definition
62
-
63
-
64
- | staging fields | scd2dim profile |
65
- |--------------------|-----------------|
66
- | bk_ | ✅ |
67
- | hk_`[entity_name]` | ✅ |
68
- | LoadDate | ✅ |
69
- | RecordSource | ✅ |
70
- | HashDiff | ✅ |
71
- | IsAvailable | ✅ |
72
-
73
- ```sql
74
- -- staging view example for the scd2dim profile (mssql)
75
- create view [stg].[UR_officers] as
76
- select cast(31 as bigint) [bk_id]
77
- , core.StringToHash1(cast(31 as bigint)) [hk_UR_officers]
78
- , sysdatetime() [LoadDate]
79
- , cast('LobSystem.dbo.officers_daily' as varchar(200)) [RecordSource]
80
- , core.StringToHash8(
81
- cast('uri' as nvarchar(100))
82
- , cast('00000000000000' as varchar(20))
83
- , cast('NATURAL_PERSON' as varchar(50))
84
- , cast(null as varchar(20))
85
- , cast('INDIVIDUALLY' as varchar(50))
86
- , cast(0 as int)
87
- , cast('2008-04-07' as date)
88
- , cast('2008-04-07 18:00:54.000' as datetime)
89
- ) [HashDiff]
90
- , cast('uri' as nvarchar(100)) [uri]
91
- , cast('00000000000000' as varchar(20)) [at_legal_entity_registration_number]
92
- , cast('NATURAL_PERSON' as varchar(50)) [entity_type]
93
- , cast(null as varchar(20)) [legal_entity_registration_number]
94
- , cast('INDIVIDUALLY' as varchar(50)) [rights_of_representation_type]
95
- , cast(0 as int) [representation_with_at_least]
96
- , cast('2008-04-07' as date) [registered_on]
97
- , cast('2008-04-07 18:00:54.000' as datetime) [last_modified_at]
98
- , cast(1 as bit) [IsAvailable]
99
- ```
100
- ### scd2dim profile example:
101
- | stg | hub | sat | dim |
102
- |--------------------|------------------------|----------------------------|--------------------|
103
- | | | | hk_`[entity_name]` |
104
- | BKs... | (uk)BKs... | BKs... | (pk)BKs... |
105
- | hk_`[entity_name]` | (pk)hk_`[entity_name]` | (pk)(fk)hk_`[entity_name]` | |
106
- | LoadDate | LoadDate | (pk)LoadDate | |
107
- | RecordSource | RecordSource | RecordSource | |
108
- | HashDiff | | HashDiff | |
109
- | FLDs... | | FLDs... | FLDs... |
110
- | IsAvailable | | IsAvailable | IsAvailable |
111
- | | | | IsCurrent |
112
- | | | | (pk)DateFrom |
113
- | | | | DateTo |
114
-
115
- ### link2fact profile example:
116
- | stg | link | sat | fact |
117
- |--------------------|------------------------|----------------------------|------|
118
- | HKs... | (uk)(fk)HKs... | | |
119
- | hk_`[entity_name]` | (pk)hk_`[entity_name]` | (pk)(fk)hk_`[entity_name]` | |
120
- | degenerate_field | (uk)degenerate_field | degenerate_field | |
121
- | LoadDate | LoadDate | LoadDate | |
122
- | RecordSource | RecordSource | RecordSource | |
123
- | FLDs... | | FLDs... | |
124
-
125
-
126
- ### Schemas:
127
- * `core` - framework-related code
128
- * `stg` - staging layer for both virtual (views) and materialized (tables)
129
- * `hub` - hub layer
130
- * `sat` - satellite layer
131
- * `dim` - dimension layer (information vault)
132
- * `fact` - fact layer (information vault)
133
- * `elt` - ELT procedures
134
- * `job` - top level ELT procedures
135
- * `meta` - metadata vault
136
- * `proxy` - source data for materialized staging area (meant for wrapping external data sources as SQL views)
137
-
138
- ### DV2-related schemas layering
139
- | LoB* | staging | raw vault | business vault | information vault |
140
- |-------|---------|-----------|----------------|-------------------|
141
- | proxy | stg | hub | sal | dim |
142
- | | | sat | | fact |
143
- | | | link | | |
144
- _* Line of Business applications_
20
+ ## Data Vault 2.0 scaffolding tool
21
+ This tool is designed to streamline the process of creating Data Vault 2.0 entities,
22
+ such as hubs, links, and satellites.
23
+ As well as building information layer objects such as dim and fact tables
24
+ from a multidimensional paradigm.
25
+
26
+ ### How it works:
27
+ User: provides a staging view `stg.[entity_name]` (or a table if the staging layer persisted)
28
+ with all requirements for the `[entity_name]` defined in the schema (how to define see below).
29
+ Tool:
30
+ 1. Validates metadata of the provided staging view or table.
31
+ 2. Generates the necessary DDL statements to create the Data Vault 2.0 entities.
32
+ 3. Generates ELT procedures to load data to the generated entities.
33
+ 4. Generates support procedures such as `meta.Drop_all_related_to_[entity_name]` and `elt.Run_all_related_to_[entity_name]`
34
+
35
+ #### App design (layers):
36
+ DV2Modeler (service)
37
+ 1. gets user input (stg) and analyzes it, producing `stg_info`
38
+ 2. chooses strategy (`scd2dim`, `link2fact`)
39
+
40
+ Strategy (algorithm)
41
+ 1. validates staging using `stg_info`
42
+ 2. generates schema using dialects handler
43
+
44
+ Dialect handler (repository)
45
+ 1. creates DB objects for postgres or MSSQL database
46
+
47
+ ```text
48
+ +----------------------+
49
+ | hub.[entity_name] |
50
+ +----------------------+
51
+ ^
52
+ o 1.define +-------------------+ | 3.create
53
+ /|\ -------> | stg.[entity_name] | # +----------------------+
54
+ / \ +-------------------+ /|\ ---------> | sat.[entity_name] |
55
+ User ---------------------------------------> / \ 3.create +----------------------+
56
+ 2.use Tool
57
+ | 3.create
58
+ v
59
+ +----------------------+
60
+ | dim.[entity_name] |
61
+ +----------------------+
62
+
63
+ ```
64
+
65
+ ### How to define a staging view or table:
66
+ * `bk_` (BusinessKey) - at least one `bk_` column
67
+ * `hk_[entity_name]` (HashKey) - exactly one `hk_[entity_name]` column if you want a `hub` table created
68
+ * `LoadDate` - required by dv2 standard for an auditability
69
+ * `RecordSource` - required by dv2 standard for an auditability
70
+ * `HashDiff` - optional, required if you want to have a scd2 type `dim` table created
71
+ * `IsAvailable` - optional, required if you want to track missing/deleted records
72
+ * all other columns will be considered as business columns and will be included to the `sat` table definition
73
+
74
+
75
+ | staging fields | scd2dim profile | link2fact profile |
76
+ |--------------------|-----------------|-------------------|
77
+ | bk_ | ✅ | |
78
+ | hk_`[entity_name]` | ✅ | |
79
+ | LoadDate | ✅ | |
80
+ | RecordSource | ✅ | |
81
+ | HashDiff | ✅ | |
82
+ | IsAvailable | ✅ | |
83
+
84
+ ```sql
85
+ -- staging view example for the scd2dim profile (mssql)
86
+ create view [stg].[UR_officers] as
87
+ select cast(31 as bigint) [bk_id]
88
+ , core.StringToHash1(cast(31 as bigint)) [hk_UR_officers]
89
+ , sysdatetime() [LoadDate]
90
+ , cast('LobSystem.dbo.officers_daily' as varchar(200)) [RecordSource]
91
+ , core.StringToHash8(
92
+ cast('uri' as nvarchar(100))
93
+ , cast('00000000000000' as varchar(20))
94
+ , cast('NATURAL_PERSON' as varchar(50))
95
+ , cast(null as varchar(20))
96
+ , cast('INDIVIDUALLY' as varchar(50))
97
+ , cast(0 as int)
98
+ , cast('2008-04-07' as date)
99
+ , cast('2008-04-07 18:00:54.000' as datetime)
100
+ ) [HashDiff]
101
+ , cast('uri' as nvarchar(100)) [uri]
102
+ , cast('00000000000000' as varchar(20)) [at_legal_entity_registration_number]
103
+ , cast('NATURAL_PERSON' as varchar(50)) [entity_type]
104
+ , cast(null as varchar(20)) [legal_entity_registration_number]
105
+ , cast('INDIVIDUALLY' as varchar(50)) [rights_of_representation_type]
106
+ , cast(0 as int) [representation_with_at_least]
107
+ , cast('2008-04-07' as date) [registered_on]
108
+ , cast('2008-04-07 18:00:54.000' as datetime) [last_modified_at]
109
+ , cast(1 as bit) [IsAvailable]
110
+ ```
111
+ ### scd2dim profile columns mapping:
112
+ | stg | hub | sat | dim |
113
+ |--------------------|------------------------|----------------------------|--------------------|
114
+ | | | | hk_`[entity_name]` |
115
+ | BKs... | (uk)BKs... | BKs... | (pk)BKs... |
116
+ | hk_`[entity_name]` | (pk)hk_`[entity_name]` | (pk)(fk)hk_`[entity_name]` | |
117
+ | LoadDate | LoadDate | (pk)LoadDate | |
118
+ | RecordSource | RecordSource | RecordSource | |
119
+ | HashDiff | | HashDiff | |
120
+ | FLDs... | | FLDs... | FLDs... |
121
+ | IsAvailable | | IsAvailable | IsAvailable |
122
+ | | | | IsCurrent |
123
+ | | | | (pk)DateFrom |
124
+ | | | | DateTo |
125
+
126
+ ### link2fact profile columns mapping:
127
+ | stg | link | sat | fact |
128
+ |--------------------|------------------------|----------------------------|------|
129
+ | HKs... | (uk)(fk)HKs... | | |
130
+ | hk_`[entity_name]` | (pk)hk_`[entity_name]` | (pk)(fk)hk_`[entity_name]` | |
131
+ | degenerate_field | (uk)degenerate_field | degenerate_field | |
132
+ | LoadDate | LoadDate | LoadDate | |
133
+ | RecordSource | RecordSource | RecordSource | |
134
+ | FLDs... | | FLDs... | |
135
+
136
+
137
+ ### Schemas:
138
+ * `core` - framework-related code
139
+ * `stg` - staging layer for both virtual (views) and materialized (tables)
140
+ * `hub` - hub tables
141
+ * `sat` - satellite tables
142
+ * `dim` - dimension tables (information vault)
143
+ * `fact` - fact tables (information vault)
144
+ * `elt` - ELT procedures
145
+ * `job` - top level ELT procedures
146
+ * `meta` - metadata vault
147
+ * `proxy` - source data for a materialized staging area (meant for wrapping external data sources as SQL views)
148
+
149
+ ### DV2-related schemas layering
150
+ | LoB* | staging | raw vault | business vault | information vault |
151
+ |-------|---------|-----------|----------------|-------------------|
152
+ | proxy | stg | hub | sal | dim |
153
+ | | | sat | | fact |
154
+ | | | link | | |
155
+ _* Line of Business applications_
156
+
157
+ ### Usage diagram
158
+ ```text
159
+ + +-----------+ automation
160
+ +---- + -------> | Dv2Utils | -------+------+
161
+ | + uses +-----------+ |
162
+ | + | uses | creates
163
+ | + v |
164
+ | + uses +-----------+ uses |
165
+ +---- + -------> | Dv2Helper | --------------+
166
+ | + +-----------+ |
167
+ o + | |
168
+ /|\ + | DDL | python
169
+ / \ ==========================================================
170
+ DWH Dev + creates | | database
171
+ | + v V
172
+ | + uses +--------+ uses +---------------+
173
+ +---- + -------> | entity | -----> | core objects |
174
+ + +--------+ +---------------+
175
+ +
176
+
177
+ ```
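
The staging rules listed in the README above (at least one `bk_` column, exactly one `hk_[entity_name]` column, `LoadDate`, `RecordSource`, and `HashDiff` for an scd2 `dim`) can be sketched as a small checker. This is only an illustration under stated assumptions: `StagingInfo`, `analyze`, and `scd2dim_problems` are hypothetical names, not the package's actual `stg_info` structure or API, and `IsAvailable` is treated as optional, following the bullet list rather than the profile table.

```python
from dataclasses import dataclass

# System column names taken from the README's staging rules.
SYSTEM = ("LoadDate", "RecordSource", "HashDiff", "IsAvailable")


@dataclass(frozen=True)
class StagingInfo:
    """Hypothetical summary of a staging column list (not the package's stg_info)."""
    entity: str
    business_keys: tuple[str, ...]
    hash_keys: tuple[str, ...]
    system_columns: frozenset[str]
    business_columns: tuple[str, ...]


def analyze(entity: str, columns: list[str]) -> StagingInfo:
    """Classify column names by the README's naming conventions."""
    bks = tuple(c for c in columns if c.startswith("bk_"))
    hks = tuple(c for c in columns if c.startswith("hk_"))
    system = frozenset(c for c in columns if c in SYSTEM)
    rest = tuple(
        c for c in columns if c not in bks and c not in hks and c not in system
    )
    return StagingInfo(entity, bks, hks, system, rest)


def scd2dim_problems(info: StagingInfo) -> list[str]:
    """Return human-readable violations of the scd2dim requirements."""
    problems = []
    if not info.business_keys:
        problems.append("at least one bk_ column is required")
    own_hk = f"hk_{info.entity}"
    if info.hash_keys.count(own_hk) != 1:
        problems.append(f"exactly one {own_hk} column is required")
    for name in ("LoadDate", "RecordSource", "HashDiff"):
        if name not in info.system_columns:
            problems.append(f"{name} column is required")
    return problems


# Column names from the README's [stg].[UR_officers] example (abbreviated).
columns = ["bk_id", "hk_UR_officers", "LoadDate", "RecordSource",
           "HashDiff", "uri", "entity_type", "IsAvailable"]
info = analyze("UR_officers", columns)
print(scd2dim_problems(info))  # → []
```

Columns that match none of the conventions end up in `business_columns`, mirroring the README's statement that all other columns go into the `sat` table definition.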
@@ -0,0 +1,23 @@
1
+ sandwich/__init__.py,sha256=DiQSmvml9OXujAYHILR4jz8UjoxbMvxFgRIlsdRza1E,80
2
+ sandwich/dialects/__init__.py,sha256=zQ4oigT3yqjZyl_IL_Tc-GmyoJar_Oqj_bGyiRVdSjg,415
3
+ sandwich/dialects/base.py,sha256=Y9ztnULBIMu7SEZ23ecxo1vjeJ3mnnYT1neQVQRVjUg,4922
4
+ sandwich/dialects/ddl_mssql.py,sha256=3yGHC_sKpO9eKdO10YuLhvDkY3zyeufy0jteCH_iUac,3831
5
+ sandwich/dialects/ddl_postgres.py,sha256=5mncfUc1o6tbp1Po9iKlrhJEnKp6HfxTMu6saTECLW8,3402
6
+ sandwich/dialects/factory.py,sha256=-mpWGKp8NRmTFCXVlhbTsGhR0oAwOEdm-xnBolwWrQo,911
7
+ sandwich/dialects/mssql.py,sha256=kV9I-W2LutjRDrr4tV5dI_OyveL2LspazJgoQFP4m-M,9110
8
+ sandwich/dialects/postgres.py,sha256=bH6PBCuOi79U2emfnaS4CyclOLbHC_Qmb-ghRpb5Aks,4075
9
+ sandwich/dialects/utils.py,sha256=2Kl4fztlIbgqfZCvDi_fk1RB5opksw-qJQ5L7d4ZLIM,5970
10
+ sandwich/dv2_helper.py,sha256=-0M-9WJXwovOr57h7IWglRsK558obuIcMPE0vOFjF7Q,4429
11
+ sandwich/errors.py,sha256=kIJmYbUf9wOnshJbFHwhqxZ3qEEdVtOy5Dcb2bSdnAk,872
12
+ sandwich/main.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
13
+ sandwich/modeling/__init__.py,sha256=vlpKEZVmRpav2pWMJ4Djv_CrweMovEeu21Pnp6BToAY,3464
14
+ sandwich/py.typed,sha256=70pF0eMpuZgOyb0zFSE07ugId_AoU5z6CpLlVfg3pik,34
15
+ sandwich/strategies/__init__.py,sha256=VePgBfvPCLl4_GJIWlbN-VbWjB7iD0ztPjMYuzIsi3U,403
16
+ sandwich/strategies/base.py,sha256=jdD-_Z8P4wtt6pMdH3jWRFtCE0P8R4vMbFUDjjhobiQ,1264
17
+ sandwich/strategies/factory.py,sha256=QYO77e-MqHtOwesGdm7_N5-PJax2QPVL9Mrse0E3rBM,1559
18
+ sandwich/strategies/link2fact.py,sha256=25GiPY-E9M9imFCwn1J7XnxpDYj1K8Jy-lO2KfIFUEA,3127
19
+ sandwich/strategies/scd2dim.py,sha256=Vh9eDAGf6RAoC0XC_BEyOXCMQhM0weXdYI2EmXu8ANw,10682
20
+ sandwich-0.2.2.dist-info/WHEEL,sha256=YUH1mBqsx8Dh2cQG2rlcuRYUhJddG9iClegy4IgnHik,79
21
+ sandwich-0.2.2.dist-info/entry_points.txt,sha256=0GSrDEOq9Qo5CwxppoVq-HNcILK65PchNmI6J0ripq8,44
22
+ sandwich-0.2.2.dist-info/METADATA,sha256=OLRT2IqN2XNj2oQRERNiRmts991ViaKsvqKYapEi688,8952
23
+ sandwich-0.2.2.dist-info/RECORD,,
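
Each RECORD line above has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed. A minimal sketch of checking one entry follows; the helper names `record_digest` and `check_record_line` are hypothetical. The entry for `sandwich/main.py` has size 0, so it can be checked against empty bytes without unpacking the wheel.

```python
import base64
import hashlib


def record_digest(data: bytes) -> str:
    """Return the `sha256=...` field that RECORD stores for a file's bytes."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"sha256={encoded}"


def check_record_line(line: str, data: bytes) -> bool:
    """Compare one `path,hash,size` RECORD line against the file's bytes."""
    _path, expected_hash, size = line.rsplit(",", 2)
    if not expected_hash:
        # RECORD lists itself with empty hash and size fields; nothing to check.
        return True
    return expected_hash == record_digest(data) and int(size) == len(data)


line = "sandwich/main.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0"
print(check_record_line(line, b""))  # → True
```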
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: uv 0.9.11
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
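
The WHEEL file is a set of `Key: value` headers in the same basic format as METADATA, so Python's standard `email` parser can read it. The sketch below compares the 0.2.2 headers above with the 0.2.0 headers removed further down in this diff; `parse_headers` and `changed_headers` are hypothetical helper names used only for illustration.

```python
from email.parser import Parser

OLD_WHEEL = """\
Wheel-Version: 1.0
Generator: hatchling 1.28.0
Root-Is-Purelib: true
Tag: py3-none-any
"""

NEW_WHEEL = """\
Wheel-Version: 1.0
Generator: uv 0.9.11
Root-Is-Purelib: true
Tag: py3-none-any
"""


def parse_headers(text: str) -> dict[str, list[str]]:
    """Parse `Key: value` headers; repeated keys (e.g. Tag) keep all values."""
    msg = Parser().parsestr(text)
    return {key: msg.get_all(key) for key in msg.keys()}


def changed_headers(old: str, new: str) -> dict[str, tuple]:
    """Map each header whose values differ to an (old, new) pair."""
    a, b = parse_headers(old), parse_headers(new)
    return {k: (a.get(k), b.get(k)) for k in sorted(a.keys() | b.keys())
            if a.get(k) != b.get(k)}


print(changed_headers(OLD_WHEEL, NEW_WHEEL))
# → {'Generator': (['hatchling 1.28.0'], ['uv 0.9.11'])}
```

Only the `Generator` header differs between the two versions.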
@@ -0,0 +1,3 @@
1
+ [console_scripts]
2
+ sandwich = sandwich:main
3
+
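
The `entry_points.txt` hunk above declares a console script named `sandwich` whose target is `sandwich:main`, meaning the attribute `main` of the top-level `sandwich` package. A small sketch using only the standard library shows how such a file is split into module and attribute; `parse_console_scripts` is a hypothetical helper, and nothing here imports or runs the package itself.

```python
import configparser
from importlib.metadata import EntryPoint

ENTRY_POINTS_TXT = """\
[console_scripts]
sandwich = sandwich:main
"""


def parse_console_scripts(text: str) -> list[EntryPoint]:
    """Read the [console_scripts] section of an entry_points.txt file."""
    parser = configparser.ConfigParser(delimiters=("=",))
    parser.optionxform = str  # keep script names case-sensitive
    parser.read_string(text)
    if not parser.has_section("console_scripts"):
        return []
    return [
        EntryPoint(name=name, value=value, group="console_scripts")
        for name, value in parser.items("console_scripts")
    ]


for ep in parse_console_scripts(ENTRY_POINTS_TXT):
    print(ep.name, ep.module, ep.attr)  # → sandwich sandwich main
```

Note that RECORD lists `sandwich/main.py` with size 0, so for this script to work the callable `main` would presumably have to be defined in `sandwich/__init__.py`; the diff does not show that file's contents, so this remains an assumption.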
@@ -1,5 +0,0 @@
1
- sandwich/py.typed,sha256=70pF0eMpuZgOyb0zFSE07ugId_AoU5z6CpLlVfg3pik,34
2
- sandwich-0.2.0.dist-info/METADATA,sha256=q0IzWgU5bKfbQueVK-Rvsyhr-fhJiwTsMhr4hMr9mj0,7322
3
- sandwich-0.2.0.dist-info/WHEEL,sha256=WLgqFyCfm_KASv4WHyYy0P3pM_m7J5L9k2skdKLirC8,87
4
- sandwich-0.2.0.dist-info/licenses/LICENSE,sha256=kyN6iPdQMWMA8kquZpbnZvnPQDi9WMlyyXAPtlevrFQ,1080
5
- sandwich-0.2.0.dist-info/RECORD,,
@@ -1,4 +0,0 @@
1
- Wheel-Version: 1.0
2
- Generator: hatchling 1.28.0
3
- Root-Is-Purelib: true
4
- Tag: py3-none-any
@@ -1,9 +0,0 @@
1
- MIT License
2
-
3
- Copyright (c) 2025 Andrey Morozov
4
-
5
- Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
6
-
7
- The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
8
-
9
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.