realitydb 0.4.1 → 1.3.1

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (3)
  1. package/README.md +46 -0
  2. package/dist/index.js +8766 -2690
  3. package/package.json +2 -1
package/README.md CHANGED
@@ -47,6 +47,51 @@ realitydb load bug-4821.realitydb-pack.json --confirm
  npx realitydb seed --ci --template saas --records 500 --seed 42
  ```
 
+ ## Data Science Mode
+
+ Generate large-scale datasets for ML training, analytics testing, and data pipelines — no database required.
+
+ ```bash
+ # Generate 1M rows with default demo schema
+ realitydb generate --records 1000000 --format csv
+
+ # Generate from your SQL schema
+ realitydb generate --schema schema.sql --records 100000 --format parquet
+
+ # Generate from JSON schema with distribution controls
+ realitydb generate --schema custom.json --records 500000 --correlations --seed 42
+ ```
+
+ ### Statistical Distributions
+
+ Define per-column distributions in your JSON schema:
+
+ ```json
+ {
+   "tables": [{
+     "name": "users",
+     "columns": [
+       { "name": "age", "type": "integer", "distribution": { "type": "normal", "mean": 35, "stddev": 12, "min": 18, "max": 85 } },
+       { "name": "income", "type": "numeric", "distribution": { "type": "log-normal", "mu": 10.5, "sigma": 0.8, "min": 15000, "max": 500000 } },
+       { "name": "login_count", "type": "integer", "distribution": { "type": "zipf", "exponent": 1.2, "min": 1, "max": 1000 } }
+     ]
+   }],
+   "correlations": [
+     { "source": "age", "target": "income", "coefficient": 0.6 }
+   ]
+ }
+ ```
+
+ Supported distributions: `normal`, `uniform`, `zipf`, `exponential`, `log-normal`.
+
+ ### Output Formats
+
+ | Format | Flag | Description |
+ |--------|------|-------------|
+ | JSON | `--format json` | NDJSON (newline-delimited JSON), one object per line |
+ | CSV | `--format csv` | Standard CSV with headers |
+ | Parquet | `--format parquet` | NDJSON with `.parquet.ndjson` extension (convert via DuckDB/pyarrow) |
+
  ## Commands
 
  | Command | Description |
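
Reviewer note on the hunk above: the added README text does not say how the `min`/`max` bounds or the `correlations` coefficient are applied, so the Python sketch below is only one plausible reading, not realitydb's implementation. It assumes bounds are enforced by resampling (the tool may clamp instead) and that a coefficient of 0.6 refers to the Pearson correlation of the underlying normal draws. The helper names `bounded_normal`, `correlated_pair`, and `pearson` are hypothetical.

```python
import math
import random


def bounded_normal(rng, mean, stddev, lo, hi):
    """Draw from N(mean, stddev), resampling until the value lies in [lo, hi].

    Assumption: bounds are enforced by rejection; the real tool may clamp.
    """
    while True:
        x = rng.gauss(mean, stddev)
        if lo <= x <= hi:
            return x


def correlated_pair(rng, rho):
    """Return two standard-normal draws with population correlation rho."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rng.gauss(0.0, 1.0)
    # Mixing z1 into the second draw gives Corr(z1, y) = rho.
    return z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2


def pearson(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


rng = random.Random(42)  # fixed seed, mirroring the README's --seed 42

# Mirrors the "age" column: normal, mean 35, stddev 12, bounded to [18, 85].
ages = [round(bounded_normal(rng, 35, 12, 18, 85)) for _ in range(10_000)]

# Mirrors the age -> income entry with coefficient 0.6.
pairs = [correlated_pair(rng, 0.6) for _ in range(10_000)]
xs, ys = zip(*pairs)
sample_rho = pearson(xs, ys)
```

Mapping the second draw onto a log-normal column (as `income` is declared) would preserve rank order but not the exact Pearson coefficient, which is one reason the README's `coefficient` semantics are worth confirming against the tool's behavior.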
@@ -55,6 +100,7 @@ npx realitydb seed --ci --template saas --records 500 --seed 42
  | `realitydb seed` | Generate and insert realistic data |
  | `realitydb reset` | Clear seeded data |
  | `realitydb export` | Export data to JSON/CSV/SQL files |
+ | `realitydb generate` | Generate large-scale datasets (no DB required) |
  | `realitydb capture` | Snapshot live database into a Reality Pack |
  | `realitydb load` | Load a Reality Pack into the database |
  | `realitydb share` | Display Reality Pack info for sharing |
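
Reviewer note on the new `realitydb generate` command: per the Output Formats table in this diff, `--format json` writes NDJSON, and `--format parquet` also writes NDJSON (with a `.parquet.ndjson` extension). A minimal Python sketch for consuming such output follows. The function name `read_ndjson` and the sample rows are hypothetical; real field names depend on the schema passed to the command.

```python
import io
import json


def read_ndjson(stream):
    """Yield one dict per non-empty line of a newline-delimited JSON stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Hypothetical sample standing in for a generated file. With a real file,
# pass an open handle instead, e.g. open(path, encoding="utf-8").
sample = io.StringIO(
    '{"age": 34, "income": 52000.0, "login_count": 3}\n'
    '{"age": 51, "income": 88000.5, "login_count": 1}\n'
)
rows = list(read_ndjson(sample))
```

Reading line by line keeps memory use flat for the million-row outputs the README describes. The parsed rows could then be handed to DuckDB or pyarrow for the Parquet conversion the table mentions.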