mkpipe-loader-sqlite 0.4.0__tar.gz → 0.5.0__tar.gz

@@ -0,0 +1,116 @@
+Metadata-Version: 2.4
+Name: mkpipe-loader-sqlite
+Version: 0.5.0
+Summary: SQLite loader for mkpipe.
+Author: Metin Karakus
+Author-email: metin_karakus@yahoo.com
+License: Apache License 2.0
+Classifier: Programming Language :: Python :: 3
+Classifier: License :: OSI Approved :: Apache Software License
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+License-File: LICENSE
+Requires-Dist: mkpipe
+Dynamic: author
+Dynamic: author-email
+Dynamic: classifier
+Dynamic: description
+Dynamic: description-content-type
+Dynamic: license
+Dynamic: license-file
+Dynamic: requires-dist
+Dynamic: requires-python
+Dynamic: summary
+
+# mkpipe-loader-sqlite
+
+SQLite loader plugin for [MkPipe](https://github.com/mkpipe-etl/mkpipe). Writes Spark DataFrames into SQLite database files via JDBC.
+
+## Documentation
+
+For more detailed documentation, please visit the [GitHub repository](https://github.com/mkpipe-etl/mkpipe).
+
+## License
+
+This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
+
+---
+
+## Connection Configuration
+
+```yaml
+connections:
+  sqlite_target:
+    variant: sqlite
+    database: /full/path/to/mydb.db
+```
+
+---
+
+## Table Configuration
+
+```yaml
+pipelines:
+  - name: pg_to_sqlite
+    source: pg_source
+    destination: sqlite_target
+    tables:
+      - name: public.users
+        target_name: stg_users
+        replication_method: full
+        batchsize: 5000
+```
+
+---
+
+## Write Strategy
+
+Control how data is written to SQLite:
+
+```yaml
+      - name: public.users
+        target_name: stg_users
+        write_strategy: upsert  # append | replace | upsert | merge
+        write_key: [id]  # required for upsert/merge
+```
+
+| Strategy | SQLite Behavior |
+|---|---|
+| `append` | Plain `INSERT` via JDBC (default for incremental) |
+| `replace` | Drop and recreate table, then insert (default for full) |
+| `upsert` | `INSERT ... ON CONFLICT (write_key) DO UPDATE` via temp table |
+| `merge` | Same as upsert for SQLite |
+
+---
+
+## Write Throughput
+
+```yaml
+      - name: public.users
+        target_name: stg_users
+        replication_method: full
+        batchsize: 5000
+```
+
+### Performance Notes
+
+- SQLite uses file-level locking — **`write_partitions` has no benefit** here as concurrent writes will serialize or fail. Set `write_partitions: 1` to avoid contention.
+- Keep `batchsize` moderate (1,000–10,000) — large transactions in SQLite can cause memory pressure.
+- SQLite is best suited for small-to-medium outputs (local development, testing, lightweight pipelines).
+
+---
+
+## All Table Parameters
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `name` | string | required | Source table name |
+| `target_name` | string | required | SQLite destination table name |
+| `replication_method` | `full` / `incremental` | `full` | Replication strategy |
+| `batchsize` | int | `10000` | Rows per JDBC batch insert |
+| `write_partitions` | int | — | Set to `1` to avoid SQLite lock contention |
+| `write_strategy` | string | — | `append`, `replace`, `upsert`, `merge` |
+| `write_key` | list | — | Key columns for upsert/merge (required) |
+| `dedup_columns` | list | — | Columns used for `mkpipe_id` hash deduplication |
+| `tags` | list | `[]` | Tags for selective pipeline execution |
+| `pass_on_error` | bool | `false` | Skip table on error instead of failing |
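
Reviewer note: the Write Strategy table in the hunk above describes `upsert` as `INSERT ... ON CONFLICT (write_key) DO UPDATE` via a temp table. The diff does not show how the loader implements this, so the following is only a minimal sketch of the SQL pattern using Python's standard `sqlite3` module. The function name `upsert_via_staging` and the staging table `stg_users_tmp` are hypothetical, and the real loader writes through Spark JDBC rather than `sqlite3`.

```python
import sqlite3


def upsert_via_staging(conn, target, staging, columns, write_key):
    """Merge rows from a staging table into the target, updating on key conflict.

    Table and column names are interpolated into the SQL, so they must come
    from trusted configuration, never from user input.
    """
    cols = ", ".join(columns)
    keys = ", ".join(write_key)
    updates = ", ".join(
        f"{c} = excluded.{c}" for c in columns if c not in write_key
    )
    # If every column is part of the key there is nothing to update.
    action = f"DO UPDATE SET {updates}" if updates else "DO NOTHING"
    # "WHERE true" sidesteps SQLite's parsing ambiguity between a join's ON
    # clause and ON CONFLICT when the INSERT is fed by a SELECT.
    sql = (
        f"INSERT INTO {target} ({cols}) "
        f"SELECT {cols} FROM {staging} WHERE true "
        f"ON CONFLICT ({keys}) {action}"
    )
    with conn:  # commit on success, roll back on error
        conn.execute(sql)


conn = sqlite3.connect(":memory:")
# The conflict target must be backed by a PRIMARY KEY or UNIQUE constraint.
conn.execute("CREATE TABLE stg_users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO stg_users VALUES (1, 'old'), (2, 'keep')")
conn.execute("CREATE TEMP TABLE stg_users_tmp (id INTEGER, name TEXT)")
conn.execute("INSERT INTO stg_users_tmp VALUES (1, 'new'), (3, 'added')")

upsert_via_staging(conn, "stg_users", "stg_users_tmp", ["id", "name"], ["id"])
rows = conn.execute("SELECT id, name FROM stg_users ORDER BY id").fetchall()
```

In this sketch the row with `id = 1` is updated, `id = 2` is left alone, and `id = 3` is inserted. The pattern only works when the `write_key` columns carry a uniqueness constraint on the target table, which is worth checking when choosing `write_key`.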
@@ -0,0 +1,92 @@
+# mkpipe-loader-sqlite
+
+SQLite loader plugin for [MkPipe](https://github.com/mkpipe-etl/mkpipe). Writes Spark DataFrames into SQLite database files via JDBC.
+
+## Documentation
+
+For more detailed documentation, please visit the [GitHub repository](https://github.com/mkpipe-etl/mkpipe).
+
+## License
+
+This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
+
+---
+
+## Connection Configuration
+
+```yaml
+connections:
+  sqlite_target:
+    variant: sqlite
+    database: /full/path/to/mydb.db
+```
+
+---
+
+## Table Configuration
+
+```yaml
+pipelines:
+  - name: pg_to_sqlite
+    source: pg_source
+    destination: sqlite_target
+    tables:
+      - name: public.users
+        target_name: stg_users
+        replication_method: full
+        batchsize: 5000
+```
+
+---
+
+## Write Strategy
+
+Control how data is written to SQLite:
+
+```yaml
+      - name: public.users
+        target_name: stg_users
+        write_strategy: upsert  # append | replace | upsert | merge
+        write_key: [id]  # required for upsert/merge
+```
+
+| Strategy | SQLite Behavior |
+|---|---|
+| `append` | Plain `INSERT` via JDBC (default for incremental) |
+| `replace` | Drop and recreate table, then insert (default for full) |
+| `upsert` | `INSERT ... ON CONFLICT (write_key) DO UPDATE` via temp table |
+| `merge` | Same as upsert for SQLite |
+
+---
+
+## Write Throughput
+
+```yaml
+      - name: public.users
+        target_name: stg_users
+        replication_method: full
+        batchsize: 5000
+```
+
+### Performance Notes
+
+- SQLite uses file-level locking — **`write_partitions` has no benefit** here as concurrent writes will serialize or fail. Set `write_partitions: 1` to avoid contention.
+- Keep `batchsize` moderate (1,000–10,000) — large transactions in SQLite can cause memory pressure.
+- SQLite is best suited for small-to-medium outputs (local development, testing, lightweight pipelines).
+
+---
+
+## All Table Parameters
+
+| Parameter | Type | Default | Description |
+|---|---|---|---|
+| `name` | string | required | Source table name |
+| `target_name` | string | required | SQLite destination table name |
+| `replication_method` | `full` / `incremental` | `full` | Replication strategy |
+| `batchsize` | int | `10000` | Rows per JDBC batch insert |
+| `write_partitions` | int | — | Set to `1` to avoid SQLite lock contention |
+| `write_strategy` | string | — | `append`, `replace`, `upsert`, `merge` |
+| `write_key` | list | — | Key columns for upsert/merge (required) |
+| `dedup_columns` | list | — | Columns used for `mkpipe_id` hash deduplication |
+| `tags` | list | `[]` | Tags for selective pipeline execution |
+| `pass_on_error` | bool | `false` | Skip table on error instead of failing |
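
Reviewer note: the Performance Notes in the README above recommend a moderate `batchsize` and a single writer. As a rough illustration of that idea (not the loader's actual code path, which goes through Spark's JDBC writer), here is a sketch that inserts rows in fixed-size batches over one connection, with one transaction per batch. The helper name `insert_in_batches` is hypothetical.

```python
import sqlite3
from itertools import islice


def insert_in_batches(conn, table, columns, rows, batchsize=5000):
    """Insert an iterable of row tuples in batches; returns the batch count.

    Table and column names are interpolated into the SQL and must come from
    trusted configuration.
    """
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    it = iter(rows)
    batches = 0
    while True:
        # Pull at most `batchsize` rows so memory use stays bounded.
        batch = list(islice(it, batchsize))
        if not batch:
            break
        with conn:  # one transaction per batch
            conn.executemany(sql, batch)
        batches += 1
    return batches


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stg_users (id INTEGER, name TEXT)")
source_rows = ((i, f"user{i}") for i in range(12000))
n_batches = insert_in_batches(
    conn, "stg_users", ["id", "name"], source_rows, batchsize=5000
)
total = conn.execute("SELECT COUNT(*) FROM stg_users").fetchone()[0]
```

A single connection writing sequentially matches the `write_partitions: 1` advice: SQLite allows only one writer at a time, so parallel writers would mostly wait on the lock.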
@@ -8,6 +8,7 @@ JAR_PACKAGES = ['org.xerial:sqlite-jdbc:3.47.1.0']
 class SqliteLoader(JdbcLoader, variant='sqlite'):
     driver_name = 'sqlite'
     driver_jdbc = 'org.sqlite.JDBC'
+    _dialect = 'sqlite'
 
     def build_jdbc_url(self):
         db_path = self.connection.extra.get('db_path', self.database or 'data.db')
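
Reviewer note: the hunk above is cut off after the `db_path` lookup, so the rest of `build_jdbc_url` is not visible. The sketch below only illustrates the precedence that the visible line implies (`extra['db_path']`, then `database`, then `'data.db'`) and assumes the conventional sqlite-jdbc URL form `jdbc:sqlite:<path>`. Both function names are hypothetical stand-ins, not part of the package.

```python
def resolve_db_path(extra, database=None):
    """Mirror the lookup in the hunk: extra['db_path'] wins, then `database`,
    then the literal fallback 'data.db'."""
    return extra.get("db_path", database or "data.db")


def sqlite_jdbc_url(db_path):
    """Assumed URL shape for the org.sqlite.JDBC driver: jdbc:sqlite:<path>."""
    return f"jdbc:sqlite:{db_path}"


# `database` from the 0.5.0 README style of configuration
from_database = sqlite_jdbc_url(resolve_db_path({}, "/full/path/to/mydb.db"))
# `db_path` from the 0.4.0 README style takes precedence when both are present
from_extra = sqlite_jdbc_url(
    resolve_db_path({"db_path": "/full_path/sqlite_db.db"}, "/full/path/to/mydb.db")
)
# Neither set: falls back to a relative 'data.db'
fallback = sqlite_jdbc_url(resolve_db_path({}))
```

Under this reading, configurations that still use the older `db_path` key keep working alongside the `database` key documented in the new README.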
@@ -2,7 +2,7 @@ from setuptools import setup, find_packages
 
 setup(
     name='mkpipe-loader-sqlite',
-    version='0.4.0',
+    version='0.5.0',
     license='Apache License 2.0',
     packages=find_packages(exclude=['tests', 'scripts', 'deploy', 'install_jars.py']),
     install_requires=['mkpipe'],
@@ -1,44 +0,0 @@
-Metadata-Version: 2.4
-Name: mkpipe-loader-sqlite
-Version: 0.4.0
-Summary: SQLite loader for mkpipe.
-Author: Metin Karakus
-Author-email: metin_karakus@yahoo.com
-License: Apache License 2.0
-Classifier: Programming Language :: Python :: 3
-Classifier: License :: OSI Approved :: Apache Software License
-Requires-Python: >=3.8
-Description-Content-Type: text/markdown
-License-File: LICENSE
-Requires-Dist: mkpipe
-Dynamic: author
-Dynamic: author-email
-Dynamic: classifier
-Dynamic: description
-Dynamic: description-content-type
-Dynamic: license
-Dynamic: license-file
-Dynamic: requires-dist
-Dynamic: requires-python
-Dynamic: summary
-
-# MkPipe
-
-**MkPipe** is a modular, open-source ETL (Extract, Transform, Load) tool that allows you to integrate various data sources and sinks easily. It is designed to be extensible with a plugin-based architecture that supports extractors, transformers, and loaders.
-
-## Documentation
-
-For more detailed documentation, please visit the [GitHub repository](https://github.com/mkpipe-etl/mkpipe).
-
-## License
-
-This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
-
-## mkpipe_project.yaml Variables
-```yaml
-...
-connections:
-  source:
-    db_path: '/full_path/sqlite_db.db'
-...
-```
@@ -1,20 +0,0 @@
-# MkPipe
-
-**MkPipe** is a modular, open-source ETL (Extract, Transform, Load) tool that allows you to integrate various data sources and sinks easily. It is designed to be extensible with a plugin-based architecture that supports extractors, transformers, and loaders.
-
-## Documentation
-
-For more detailed documentation, please visit the [GitHub repository](https://github.com/mkpipe-etl/mkpipe).
-
-## License
-
-This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details.
-
-## mkpipe_project.yaml Variables
-```yaml
-...
-connections:
-  source:
-    db_path: '/full_path/sqlite_db.db'
-...
-```