TestDataX 0.1.0__py3-none-any.whl

@@ -0,0 +1,345 @@
1
+ Metadata-Version: 2.3
2
+ Name: TestDataX
3
+ Version: 0.1.0
4
+ Summary: A flexible test data generation toolkit
5
+ License: MIT
6
+ Author: JamesPBrett
7
+ Requires-Python: >=3.11,<4.0
8
+ Classifier: License :: OSI Approved :: MIT License
9
+ Classifier: Programming Language :: Python :: 3
10
+ Classifier: Programming Language :: Python :: 3.11
11
+ Classifier: Programming Language :: Python :: 3.12
12
+ Classifier: Programming Language :: Python :: 3.13
13
+ Requires-Dist: faker (>=33.1.0,<34.0.0)
14
+ Requires-Dist: mysql-connector-python (>=9.1.0,<10.0.0)
15
+ Requires-Dist: orjson (>=3.10.12,<4.0.0)
16
+ Requires-Dist: pandas (>=2.2.3,<3.0.0)
17
+ Requires-Dist: pyarrow (>=18.1.0,<19.0.0)
18
+ Requires-Dist: pydantic (>=2.10.4,<3.0.0)
19
+ Requires-Dist: typer (>=0.15.1,<0.16.0)
20
+ Description-Content-Type: text/markdown
21
+
22
+ # TestDataX
23
+
24
+ ![Build Status](https://github.com/JamesPBrett/testdatax/actions/workflows/publish.yml/badge.svg)
25
+ [![codecov](https://codecov.io/gh/JamesPBrett/testdatax/branch/main/graph/badge.svg?token=6VX62CI6U9)](https://codecov.io/gh/JamesPBrett/testdatax)
26
+ ![Python Version](https://img.shields.io/badge/python-3.11%2B-blue)
27
+ ![License](https://img.shields.io/badge/license-MIT-blue.svg)
28
+
29
+ This command-line interface application enables quick and customizable test data generation across various formats. It leverages Faker for realistic data fields, offers flexible schema configurations, and simplifies output to multiple database dialects or file types. Users can define precise parameters for data volume, types, and constraints for each target data set.
30
+
31
+ ## Requirements
32
+ - Python 3.11+
33
+ - Additional dependencies are installed automatically by Poetry
34
+
35
+ ## Installation
36
+
37
+ ### Prerequisites
38
+
39
+ ```bash
40
+ # Install Python 3.11+ if not already installed
41
+ brew install python@3.11
42
+
43
+ # Install Poetry
44
+ curl -sSL https://install.python-poetry.org | python3 -
45
+
46
+ # Verify Poetry installation
47
+ poetry --version
48
+ ```
49
+
50
+ ### Install
51
+
52
+ ```bash
53
+ # Clone the repository
54
+ git clone https://github.com/JamesPBrett/testdatax.git
55
+ cd testdatax
56
+
57
+ # Install dependencies
58
+ poetry install
59
+ ```
60
+
61
+ ### Common Issues
62
+
63
+ - If Poetry is not found in PATH:
64
+ ```bash
65
+ export PATH="$HOME/.local/bin:$PATH"
66
+ ```
67
+
68
+ ## Features
69
+
70
+ - Generate realistic test data using Faker data providers
71
+ - Support for multiple output formats (CSV, JSON, SQL, etc.)
72
+ - Customizable schema definitions
73
+ - Configurable data generation parameters
74
+ - CLI tool for easy data generation
75
+
76
+ ## Supported Formats
77
+
78
+ - JSON
79
+ - CSV
80
+ - ORC
81
+ - Parquet
82
+ - MySQL
83
+ - MSSQL
84
+ - Oracle
85
+
86
+ ## CLI Usage
87
+ ```bash
88
+ testdatax -o <output_file> -f <format> -s <schema_file> -r <num_rows> [-d]
89
+ ```
90
+
91
+ Options:
92
+ - `-o, --output`: Output file path (for SQL exports, the file name without its extension is used as the table name)
93
+ - `-f, --format`: Output format (csv, json, orc, parquet, mysql, mssql, oracle)
94
+ - `-r, --rows`: Number of rows to generate (default: 10)
95
+ - `-s, --schema`: Path to schema file
96
+ - `-d, --debug`: Enable debug output
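The options above can also be assembled programmatically. The following is a minimal sketch, not part of the package: `build_testdatax_args` is a hypothetical helper that only uses the flags documented above and returns an argument list suitable for `subprocess.run`.

```python
# Hypothetical helper (not shipped with TestDataX): builds an argv list
# from the documented flags -o, -f, -s, -r and -d.
ALLOWED_FORMATS = {"csv", "json", "orc", "parquet", "mysql", "mssql", "oracle"}


def build_testdatax_args(output, fmt, schema=None, rows=10, debug=False):
    """Return the command line as a list of strings."""
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    args = ["testdatax", "-o", output, "-f", fmt]
    if schema is not None:
        args += ["-s", schema]
    args += ["-r", str(rows)]
    if debug:
        args.append("-d")
    return args


if __name__ == "__main__":
    # To actually run the tool: subprocess.run(args, check=True)
    print(build_testdatax_args("users.csv", "csv", schema="schema.json", rows=10))
```

Passing the arguments as a list avoids shell quoting problems when file names contain spaces.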
97
+
98
+ ## Usage Examples
99
+
100
+ Generate 10 rows of CSV data:
101
+ ```bash
102
+ testdatax -o users.csv -f csv -s schema.json -r 10
103
+ ```
104
+
105
+ Generate 1000 rows of Parquet data with debug output:
106
+ ```bash
107
+ testdatax -o large_dataset.parquet -f parquet -s users_schema.json -r 1000 -d
108
+ ```
109
+ Generate JSON data with default row count (10):
110
+ ```bash
111
+ testdatax -o data.json -f json -s schema.json
112
+ ```
113
+
114
+ Generate ORC file with specific schema:
115
+ ```bash
116
+ testdatax -o analytics.orc -f orc -s analytics_schema.json -r 100
117
+ ```
118
+
119
+ Generate 1000 rows of MySQL output, table_name as 'default':
120
+ ```bash
121
+ testdatax -o default.sql -f mysql -r 1000
122
+ ```
123
+
124
+ Generate 1000 rows of MSSQL output, table_name as 'mstest':
125
+ ```bash
126
+ testdatax -o mstest.sql -f mssql -r 1000
127
+ ```
128
+
129
+ Generate 1000 rows of Oracle output, table_name as 'oracle':
130
+ ```bash
131
+ testdatax -o oracle.sql -f oracle -r 1000
132
+ ```
133
+
134
+ Each command consists of:
135
+ - `-o, --output`: Specify the output file path and name
136
+ - `-f, --format`: Output format (csv, json, orc, parquet, mysql, mssql, oracle)
137
+ - `-s, --schema`: Path to your schema definition file
138
+ - `-r, --rows`: Number of rows to generate (optional, defaults to 10)
139
+ - `-d, --debug`: Enable debug logging (optional)
140
+
141
+ ## Schema Example
142
+
143
+ ```json
144
+ {
145
+ "username": {
146
+ "type": "string",
147
+ "faker": "name"
148
+ },
149
+ "date_joined": {
150
+ "type": "datetime"
151
+ },
152
+ "date": {
153
+ "type": "date"
154
+ },
155
+ "age": {
156
+ "type": "integer",
157
+ "min": 18,
158
+ "max": 99
159
+ },
160
+ "is_active": {
161
+ "type": "boolean"
162
+ },
163
+ "float": {
164
+ "type": "float"
165
+ },
166
+ "uuid": {
167
+ "type": "uuid"
168
+ },
169
+ "status": {
170
+ "type": "enum",
171
+ "values": ["active", "inactive", "pending"]
172
+ }
173
+ }
174
+ ```
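To illustrate how a schema like the one above could drive row generation, here is a minimal sketch using only the standard library. It is not the package's generator: `generate_value` and `generate_row` are hypothetical names, only a subset of types is covered, and the fallback bounds used when `min` or `max` are absent are assumptions.

```python
import random
import uuid


def generate_value(spec, rng):
    """Produce one value for a field spec (subset of types, assumed defaults)."""
    kind = spec["type"]
    if kind == "integer":
        # Fallback bounds 0..100 are an assumption, not documented behavior.
        return rng.randint(spec.get("min", 0), spec.get("max", 100))
    if kind == "float":
        return rng.uniform(spec.get("min", 0.0), spec.get("max", 1.0))
    if kind == "boolean":
        return rng.random() < 0.5
    if kind == "uuid":
        return str(uuid.UUID(int=rng.getrandbits(128), version=4))
    if kind == "enum":
        return rng.choice(spec["values"])
    raise NotImplementedError(f"type {kind!r} is not covered by this sketch")


def generate_row(schema, rng):
    """Build one row as a dict keyed by field name."""
    return {name: generate_value(spec, rng) for name, spec in schema.items()}


if __name__ == "__main__":
    schema = {
        "age": {"type": "integer", "min": 18, "max": 99},
        "is_active": {"type": "boolean"},
        "uuid": {"type": "uuid"},
        "status": {"type": "enum", "values": ["active", "inactive", "pending"]},
    }
    print(generate_row(schema, random.Random(0)))
```

Passing an explicit `random.Random` instance keeps the output reproducible for a given seed.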
175
+
176
+ ## Schema Configuration
177
+
178
+ The schema file defines the structure and constraints of your generated data. Each field in the schema can have the following properties:
179
+
180
+ ### Basic Field Properties
181
+ - `type`: (required) The data type of the field
182
+ - `nullable`: (optional) Boolean to allow null values (default: false)
183
+ - `unique`: (optional) Boolean to ensure unique values (default: false)
184
+
185
+ ### Type-Specific Properties
186
+
187
+ #### String Fields
188
+ ```jsonc
189
+ {
190
+ "username": {
191
+ "type": "string",
192
+ "min_length": 5,
193
+ "max_length": 20,
194
+ "faker": "user_name" // Use faker to generate realistic data
195
+ },
196
+ "description": {
197
+ "type": "text",
198
+ "min_length": 100,
199
+ "max_length": 500
200
+ }
201
+ }
202
+ ```
203
+
204
+ #### Numeric Fields
205
+ ```json
206
+ {
207
+ "age": {
208
+ "type": "integer",
209
+ "min": 18,
210
+ "max": 99
211
+ },
212
+ "score": {
213
+ "type": "float",
214
+ "min": 0.0,
215
+ "max": 100.0,
216
+ "precision": 2
217
+ }
218
+ }
219
+ ```
220
+
221
+ #### Date and Time Fields
222
+ ```json
223
+ {
224
+ "created_at": {
225
+ "type": "datetime",
226
+ "start_date": "2020-01-01",
227
+ "end_date": "2023-12-31"
228
+ },
229
+ "birth_date": {
230
+ "type": "date",
231
+ "format": "%Y-%m-%d"
232
+ }
233
+ }
234
+ ```
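One way to honor `start_date` and `end_date` is to pick a random offset within the range. The sketch below is an illustration under assumptions: `random_datetime` is a hypothetical function, both bounds are parsed as midnight in `%Y-%m-%d` form, and whether the package treats `end_date` as inclusive of the whole day is not stated in this README.

```python
import random
from datetime import datetime, timedelta


def random_datetime(start_date, end_date, rng):
    """Return a datetime between the two dates (both parsed as midnight)."""
    start = datetime.strptime(start_date, "%Y-%m-%d")
    end = datetime.strptime(end_date, "%Y-%m-%d")
    if end < start:
        raise ValueError("end_date must not precede start_date")
    span_seconds = int((end - start).total_seconds())
    return start + timedelta(seconds=rng.randint(0, span_seconds))


if __name__ == "__main__":
    rng = random.Random(7)
    created_at = random_datetime("2020-01-01", "2023-12-31", rng)
    # A "date" field with "format": "%Y-%m-%d" could be rendered like this:
    print(created_at.strftime("%Y-%m-%d"))
```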
235
+
236
+ #### Enum Fields
237
+ ```jsonc
238
+ {
239
+ "status": {
240
+ "type": "enum",
241
+ "values": ["pending", "active", "suspended"],
242
+ "weights": [0.2, 0.7, 0.1] // Optional probability weights
243
+ }
244
+ }
245
+ ```
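Weighted selection of this kind maps directly onto `random.choices` from the standard library. The sketch below assumes that omitting `weights` means a uniform choice; `pick_enum` is a hypothetical name and not the package's API.

```python
import random


def pick_enum(values, weights, rng):
    """Pick one value; weights=None is assumed to mean a uniform choice."""
    if weights is not None and len(weights) != len(values):
        raise ValueError("weights must have one entry per value")
    return rng.choices(values, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)
    values = ["pending", "active", "suspended"]
    sample = [pick_enum(values, [0.2, 0.7, 0.1], rng) for _ in range(1000)]
    # With these weights "active" should dominate the sample.
    print({v: sample.count(v) for v in values})
```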
246
+
247
+ #### Using Faker
248
+ The generator supports Faker providers for generating realistic data:
249
+ ```json
250
+ {
251
+ "name": {
252
+ "type": "string",
253
+ "faker": "name"
254
+ },
255
+ "email": {
256
+ "type": "string",
257
+ "faker": "email"
258
+ },
259
+ "address": {
260
+ "type": "string",
261
+ "faker": "address"
262
+ },
263
+ "company": {
264
+ "type": "string",
265
+ "faker": "company"
266
+ }
267
+ }
268
+ ```
269
+
270
+ ### Complete Example
271
+ ```json
272
+ {
273
+ "user_id": {
274
+ "type": "uuid",
275
+ "unique": true
276
+ },
277
+ "username": {
278
+ "type": "string",
279
+ "faker": "user_name",
280
+ "unique": true
281
+ },
282
+ "email": {
283
+ "type": "string",
284
+ "faker": "email",
285
+ "unique": true
286
+ },
287
+ "age": {
288
+ "type": "integer",
289
+ "min": 18,
290
+ "max": 99
291
+ },
292
+ "status": {
293
+ "type": "enum",
294
+ "values": ["active", "inactive"],
295
+ "weights": [0.8, 0.2]
296
+ },
297
+ "created_at": {
298
+ "type": "datetime",
299
+ "start_date": "2020-01-01",
300
+ "end_date": "2023-12-31"
301
+ },
302
+ "is_verified": {
303
+ "type": "boolean",
304
+ "nullable": true
305
+ }
306
+ }
307
+ ```
308
+
309
+ ## Supported Data Types
310
+
311
+ - string
312
+ - text
313
+ - integer
314
+ - bigint
315
+ - float
316
+ - decimal
317
+ - boolean
318
+ - date
319
+ - datetime
320
+ - blob
321
+ - uuid
322
+ - enum
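A schema can be checked against this list before generation. The following is a minimal sketch with hypothetical names (`SUPPORTED_TYPES`, `find_schema_problems`); the package itself depends on pydantic and may validate schemas differently.

```python
# The twelve generic types listed above.
SUPPORTED_TYPES = {
    "string", "text", "integer", "bigint", "float", "decimal",
    "boolean", "date", "datetime", "blob", "uuid", "enum",
}


def find_schema_problems(schema):
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    for name, spec in schema.items():
        if "type" not in spec:
            problems.append(f"{name}: missing required 'type'")
        elif spec["type"] not in SUPPORTED_TYPES:
            problems.append(f"{name}: unsupported type {spec['type']!r}")
    return problems


if __name__ == "__main__":
    schema = {
        "age": {"type": "integer", "min": 18, "max": 99},
        "nickname": {"faker": "user_name"},
        "payload": {"type": "json"},
    }
    for problem in find_schema_problems(schema):
        print(problem)
```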
323
+
324
+ ## Database Type Mappings
325
+
326
+ | Generic Type | MySQL | MSSQL | Oracle |
327
+ |--------------|---------------|-------------------|---------------|
328
+ | string | VARCHAR(255) | NVARCHAR(255) | VARCHAR2(255) |
329
+ | text | TEXT | NVARCHAR(MAX) | CLOB |
330
+ | integer | INT | INT | NUMBER(10) |
331
+ | bigint | BIGINT | BIGINT | NUMBER(19) |
332
+ | float | FLOAT | FLOAT | FLOAT |
333
+ | decimal | DECIMAL(18,2) | DECIMAL(18,2) | NUMBER(18,2) |
334
+ | boolean | TINYINT(1) | BIT | NUMBER(1) |
335
+ | date | DATE | DATE | DATE |
336
+ | datetime | DATETIME | DATETIME2 | TIMESTAMP |
337
+ | blob | LONGBLOB | VARBINARY(MAX) | BLOB |
338
+ | uuid | VARCHAR(36) | UNIQUEIDENTIFIER | VARCHAR2(36) |
339
+ | enum | ENUM | NVARCHAR(255) | VARCHAR2(255) |
340
+
341
+ ## License
342
+
343
+ This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
345
+
@@ -0,0 +1,26 @@
1
+ src/__init__.py,sha256=0bQUYmeNcWB-QTwPB464gAdohtIEnOqlbGZu6Cbl4QY,115
2
+ src/cli.py,sha256=h4CffULDO56kH2FH7dmFRA9PkmNdPJZSqVdx28dmhLI,5992
3
+ src/exporters/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
4
+ src/exporters/base_exporter.py,sha256=SQkavQTzent1htHbeUSYaB0SzHwtpuKAonjcIFgvw5c,618
5
+ src/exporters/csv_exporter.py,sha256=_MnO4nwohO5aUPBnTnKSrPz2XnpF5r-rufffp7pWtK4,4436
6
+ src/exporters/json_exporter.py,sha256=idi8eD64ZoYwuyn7aK4aNWKipEFzTaTabr6AL41bZ6I,3279
7
+ src/exporters/mssql_exporter.py,sha256=te6-LL1tzkjwxMmLKBMJSiKug4nMK0iQCPGsEmbI9WQ,7388
8
+ src/exporters/mysql_exporter.py,sha256=rJgHwlhkHeMTQ3Jc8gGnVP11BG80egwcCgxWD2WkSyc,6684
9
+ src/exporters/oracle_exporter.py,sha256=KelLHKxySstM4aWVO6sG1DbJgfyDjvzO06gvwzzsM4w,7648
10
+ src/exporters/orc_exporter.py,sha256=AKKEvM2OiGqglnh98HmNR7-VVdvCqGFxSP9duUYdEKQ,3372
11
+ src/exporters/parquet_exporter.py,sha256=e0HvnP5K-SZ0voqZxQje0suySf3yuR4FlwAMSPbRpcs,3588
12
+ src/exporters/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
13
+ src/exporters/utils/chunker.py,sha256=4ORFh9SbbzJRKRxKcU3BmQg_TPbGz7WfdkNpcIiT-wI,935
14
+ src/exporters/utils/constants.py,sha256=5kHau0F7j7dx0ekJP_BmRb5IAn_bkodBhYHsHhVG7KA,1759
15
+ src/exporters/utils/exporter_config.py,sha256=ersV06RZY38LDWJ5Dbz2FHjUOA3LMvnSkAKIMjuKTXA,527
16
+ src/exporters/utils/formatters.py,sha256=pZRp2sxCitopXK2fzKK5ZonKh1EaxlK21ZVeLAL_B44,4418
17
+ src/generator.py,sha256=vxPY4xnAAdAtYxca1IDO8z4zPBG4aLxPzgUfpnD8Img,4451
18
+ src/providers/__init__.py,sha256=2HM2EJZm_9yTI129E3dvi4Q_VH6rbTPE2oK5IdsMR8o,118
19
+ src/providers/base.py,sha256=1zxmrE__UP61PmZF62hP3TYR7h_ECltQgv1D3vZ6zNg,1492
20
+ src/providers/faker_provider.py,sha256=fOY2jSB6IaF9nLnITwUH414EgGRubzIXvs7EKrtbeLM,2314
21
+ src/schemas.py,sha256=KZomIz__AjvJH8HOPIbInXzsC-5IYMcRYAyFSYjGQDI,2534
22
+ testdatax-0.1.0.dist-info/LICENSE,sha256=hWMNYiI2moXwEsTtb1yeS8DLErzX-ORwz5_6YMbwuFo,1067
23
+ testdatax-0.1.0.dist-info/METADATA,sha256=1izE3gMB5jnNSchqdZMePdbs-Ib0Jfmo9v36JGBumk4,7967
24
+ testdatax-0.1.0.dist-info/WHEEL,sha256=IYZQI976HJqqOpQU6PHkJ8fb3tMNBFjg-Cn-pwAbaFM,88
25
+ testdatax-0.1.0.dist-info/entry_points.txt,sha256=TeI36QYzNSfsZuK5pxrLwxDcKahnlteNdmvhQceg-6w,41
26
+ testdatax-0.1.0.dist-info/RECORD,,
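Each RECORD entry above has the form `path,sha256=<digest>,<size>`, where the digest is the URL-safe Base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed. The sketch below shows how such an entry could be recomputed for verification; `record_hash` and `record_line` are hypothetical helper names.

```python
import base64
import hashlib


def record_hash(data: bytes) -> str:
    """URL-safe Base64 of the SHA-256 digest, without '=' padding."""
    digest = hashlib.sha256(data).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def record_line(path: str, data: bytes) -> str:
    """Format one RECORD entry for a file's path and contents."""
    return f"{path},sha256={record_hash(data)},{len(data)}"


if __name__ == "__main__":
    # An empty file reproduces the entry listed for src/exporters/__init__.py.
    print(record_line("src/exporters/__init__.py", b""))
```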
@@ -0,0 +1,4 @@
1
+ Wheel-Version: 1.0
2
+ Generator: poetry-core 2.0.1
3
+ Root-Is-Purelib: true
4
+ Tag: py3-none-any
@@ -0,0 +1,3 @@
1
+ [console_scripts]
2
+ testdatax=src.cli:app
3
+