datly 0.0.1
- package/README.MD +2986 -0
- package/dist/datly.umd.js +1 -0
- package/package.json +51 -0
package/README.MD
ADDED
@@ -0,0 +1,2986 @@
|
|
1
|
+
# Datly
|
2
|
+
|
3
|
+

|
4
|
+

|
5
|
+

|
6
|
+
|
7
|
+
**JavaScript toolkit for data science, statistical analysis and machine learning.**
|
8
|
+
---
|
9
|
+
|
10
|
+
### ⚡ Key Features:
|
11
|
+
- 📈 **Complete Statistics Suite** - From descriptive stats to advanced hypothesis testing
|
12
|
+
- 🤖 **7 ML Algorithms** - Classification, regression, and ensemble methods
|
13
|
+
- 📊 **13 Visualization Types** - Interactive D3.js charts with one-line commands
|
14
|
+
- 🔄 **Auto-Analysis** - Intelligent data exploration with automated insights
|
15
|
+
- 🎨 **Zero Config** - Works out of the box, customizable when needed
|
16
|
+
- 🌐 **Universal** - Same API for browser and Node.js
|
17
|
+
---
|
18
|
+
|
19
|
+
**Datly** is a comprehensive JavaScript library for statistical analysis, machine learning, and data visualization. Built for both browser and Node.js environments, it provides a complete toolkit for data scientists, analysts, and developers.
|
20
|
+
|
21
|
+
---
|
22
|
+
|
23
|
+
## 📚 Table of Contents
|
24
|
+
|
25
|
+
- [Installation](#installation)
|
26
|
+
- [Quick Start](#quick-start)
|
27
|
+
- [Core Modules](#core-modules)
|
28
|
+
- [1. Data Loading](#1-data-loading)
|
29
|
+
- [2. Data Validation](#2-data-validation)
|
30
|
+
- [3. Utility Functions](#3-utility-functions)
|
31
|
+
- [4. Central Tendency](#4-central-tendency)
|
32
|
+
- [5. Dispersion Measures](#5-dispersion-measures)
|
33
|
+
- [6. Position Measures](#6-position-measures)
|
34
|
+
- [7. Shape Analysis](#7-shape-analysis)
|
35
|
+
- [8. Hypothesis Testing](#8-hypothesis-testing)
|
36
|
+
- [9. Confidence Intervals](#9-confidence-intervals)
|
37
|
+
- [10. Normality Tests](#10-normality-tests)
|
38
|
+
- [11. Correlation Analysis](#11-correlation-analysis)
|
39
|
+
- [12. Regression Analysis](#12-regression-analysis)
|
40
|
+
- [13. Report Generation](#13-report-generation)
|
41
|
+
- [14. Pattern Detection](#14-pattern-detection)
|
42
|
+
- [15. Result Interpretation](#15-result-interpretation)
|
43
|
+
- [16. Auto-Analysis](#16-auto-analysis)
|
44
|
+
- [17. Machine Learning](#17-machine-learning)
|
45
|
+
- [18. Data Visualization](#18-data-visualization)
|
46
|
+
- [Complete Examples](#complete-examples)
|
47
|
+
- [API Reference](#api-reference)
|
48
|
+
|
49
|
+
---
|
50
|
+
|
51
|
+
## 🚀 Installation
|
52
|
+
|
53
|
+
### Browser (CDN)
|
54
|
+
|
55
|
+
```html
|
56
|
+
<!-- Include Datly -->
|
57
|
+
<script src="https://unpkg.com/datly"></script>
|
58
|
+
|
59
|
+
<script>
|
60
|
+
const datly = new Datly();
|
61
|
+
</script>
|
62
|
+
```
|
63
|
+
|
64
|
+
### Node.js (NPM)
|
65
|
+
|
66
|
+
```bash
|
67
|
+
# Core library (statistics and machine learning)
|
68
|
+
npm install datly
|
69
|
+
```
|
70
|
+
|
71
|
+
```javascript
|
72
|
+
const Datly = require('datly');
|
73
|
+
const datly = new Datly();
|
74
|
+
```
|
75
|
+
|
76
|
+
---
|
77
|
+
|
78
|
+
## ⚡ Quick Start
|
79
|
+
|
80
|
+
```javascript
|
81
|
+
// Initialize the library
|
82
|
+
const datly = new Datly();
|
83
|
+
|
84
|
+
// Load data from CSV
|
85
|
+
const data = await datly.loadCSV('data.csv');
|
86
|
+
|
87
|
+
// Calculate mean
|
88
|
+
const ages = [25, 30, 35, 40, 45];
|
89
|
+
const meanAge = datly.mean(ages);
|
90
|
+
console.log('Mean Age:', meanAge); // 35
|
91
|
+
|
92
|
+
// Perform t-test
|
93
|
+
const group1 = [23, 25, 27, 29, 31];
|
94
|
+
const group2 = [30, 32, 34, 36, 38];
|
95
|
+
const tTest = datly.tTest(group1, group2);
|
96
|
+
console.log('T-Test Result:', tTest);
|
97
|
+
|
98
|
+
// Create visualization
|
99
|
+
datly.plotHistogram(ages, {
|
100
|
+
title: 'Age Distribution',
|
101
|
+
xlabel: 'Age',
|
102
|
+
ylabel: 'Frequency'
|
103
|
+
});
|
104
|
+
```
|
105
|
+
|
106
|
+
---
|
107
|
+
|
108
|
+
## 📦 Core Modules
|
109
|
+
|
110
|
+
### 1. Data Loading
|
111
|
+
|
112
|
+
Load data from various sources including CSV, JSON, and file systems.
|
113
|
+
|
114
|
+
#### Methods
|
115
|
+
|
116
|
+
##### `loadCSV(filePath, options)`
|
117
|
+
Load data from a CSV file.
|
118
|
+
|
119
|
+
```javascript
|
120
|
+
const data = await datly.loadCSV('sales.csv', {
|
121
|
+
delimiter: ',',
|
122
|
+
header: true,
|
123
|
+
skipEmptyLines: true,
|
124
|
+
encoding: 'utf8'
|
125
|
+
});
|
126
|
+
|
127
|
+
console.log(data);
|
128
|
+
// {
|
129
|
+
// headers: ['product', 'sales', 'revenue'],
|
130
|
+
// data: [
|
131
|
+
// { product: 'A', sales: 100, revenue: 5000 },
|
132
|
+
// { product: 'B', sales: 150, revenue: 7500 }
|
133
|
+
// ],
|
134
|
+
// length: 2,
|
135
|
+
// columns: 3
|
136
|
+
// }
|
137
|
+
```
|
138
|
+
|
139
|
+
**Parameters:**
|
140
|
+
- `filePath` (string): Path to CSV file
|
141
|
+
- `options` (object): Configuration options
|
142
|
+
- `delimiter` (string): Column delimiter (default: ',')
|
143
|
+
- `header` (boolean): First row contains headers (default: true)
|
144
|
+
- `skipEmptyLines` (boolean): Skip empty rows (default: true)
|
145
|
+
- `encoding` (string): File encoding (default: 'utf8')
|
146
|
+
|
147
|
+
**Returns:** Dataset object with headers, data array, length, and column count
|
148
|
+
|
149
|
+
---
|
150
|
+
|
151
|
+
##### `loadJSON(source, options)`
|
152
|
+
Load data from JSON file, string, or object.
|
153
|
+
|
154
|
+
```javascript
|
155
|
+
// From file
|
156
|
+
const data1 = await datly.loadJSON('data.json');
|
157
|
+
|
158
|
+
// From JSON string
|
159
|
+
const jsonString = '{"users": [{"name": "John", "age": 30}]}';
|
160
|
+
const data2 = await datly.loadJSON(jsonString);
|
161
|
+
|
162
|
+
// From object
|
163
|
+
const obj = {
|
164
|
+
headers: ['name', 'age'],
|
165
|
+
data: [
|
166
|
+
{ name: 'Alice', age: 25 },
|
167
|
+
{ name: 'Bob', age: 30 }
|
168
|
+
]
|
169
|
+
};
|
170
|
+
const data3 = await datly.loadJSON(obj);
|
171
|
+
```
|
172
|
+
|
173
|
+
**Parameters:**
|
174
|
+
- `source` (string|object): JSON file path, string, or object
|
175
|
+
- `options` (object): Configuration options
|
176
|
+
- `validateTypes` (boolean): Auto-infer data types (default: true)
|
177
|
+
- `autoInferHeaders` (boolean): Extract headers from data (default: true)
|
178
|
+
|
179
|
+
**Returns:** Structured dataset object
|
180
|
+
|
181
|
+
---
|
182
|
+
|
183
|
+
##### `parseCSV(text, options)`
|
184
|
+
Parse CSV text into structured data.
|
185
|
+
|
186
|
+
```javascript
|
187
|
+
const csvText = `name,age,city
|
188
|
+
John,30,NYC
|
189
|
+
Jane,25,LA`;
|
190
|
+
|
191
|
+
const parsed = datly.parseCSV(csvText, {
|
192
|
+
delimiter: ',',
|
193
|
+
header: true
|
194
|
+
});
|
195
|
+
|
196
|
+
console.log(parsed.data);
|
197
|
+
// [
|
198
|
+
// { name: 'John', age: 30, city: 'NYC' },
|
199
|
+
// { name: 'Jane', age: 25, city: 'LA' }
|
200
|
+
// ]
|
201
|
+
```
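To make the returned shape concrete, here is a minimal sketch of how simple CSV text could be turned into the dataset structure shown above. `parseSimpleCSV` is a hypothetical helper written for illustration, not Datly's implementation; it assumes no quoted fields or embedded delimiters.

```javascript
// Hypothetical helper (not part of Datly): parses simple CSV text,
// assuming no quoted fields and no delimiters inside values.
function parseSimpleCSV(text, { delimiter = ',', header = true } = {}) {
  const lines = text.split(/\r?\n/).filter(line => line.trim() !== '');
  const rows = lines.map(line => line.split(delimiter).map(cell => cell.trim()));
  const headers = header ? rows[0] : rows[0].map((_, i) => `col${i}`);
  const body = header ? rows.slice(1) : rows;

  const data = body.map(cells => {
    const record = {};
    headers.forEach((name, i) => {
      const raw = cells[i] ?? null;
      const num = Number(raw);
      // Numeric-looking strings become numbers; everything else stays as text.
      record[name] = raw !== null && raw !== '' && !Number.isNaN(num) ? num : raw;
    });
    return record;
  });

  return { headers, data, length: data.length, columns: headers.length };
}

const parsedSketch = parseSimpleCSV('name,age,city\nJohn,30,NYC\nJane,25,LA');
console.log(parsedSketch.data);
```

A production parser also has to handle quoting, escaped delimiters, and encodings, which this sketch deliberately leaves out.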
|
202
|
+
|
203
|
+
---
|
204
|
+
|
205
|
+
##### `cleanData(dataset)`
|
206
|
+
Remove rows in which every value is null.
|
207
|
+
|
208
|
+
```javascript
|
209
|
+
const dataset = {
|
210
|
+
headers: ['a', 'b', 'c'],
|
211
|
+
data: [
|
212
|
+
{ a: 1, b: 2, c: 3 },
|
213
|
+
{ a: null, b: null, c: null }, // Will be removed
|
214
|
+
{ a: 4, b: 5, c: 6 }
|
215
|
+
],
|
216
|
+
length: 3,
|
217
|
+
columns: 3
|
218
|
+
};
|
219
|
+
|
220
|
+
const cleaned = datly.cleanData(dataset);
|
221
|
+
console.log(cleaned.length); // 2
|
222
|
+
```
|
223
|
+
|
224
|
+
---
|
225
|
+
|
226
|
+
##### `getColumn(dataset, columnName)`
|
227
|
+
Extract a single column as an array.
|
228
|
+
|
229
|
+
```javascript
|
230
|
+
const data = {
|
231
|
+
headers: ['name', 'age', 'salary'],
|
232
|
+
data: [
|
233
|
+
{ name: 'Alice', age: 25, salary: 50000 },
|
234
|
+
{ name: 'Bob', age: 30, salary: 60000 },
|
235
|
+
{ name: 'Charlie', age: null, salary: 55000 }
|
236
|
+
]
|
237
|
+
};
|
238
|
+
|
239
|
+
const ages = datly.getColumn(data, 'age');
|
240
|
+
console.log(ages); // [25, 30] (null values filtered)
|
241
|
+
```
|
242
|
+
|
243
|
+
---
|
244
|
+
|
245
|
+
##### `getColumns(dataset, columnNames)`
|
246
|
+
Extract multiple columns as an object.
|
247
|
+
|
248
|
+
```javascript
|
249
|
+
const columns = datly.getColumns(data, ['age', 'salary']);
|
250
|
+
console.log(columns);
|
251
|
+
// {
|
252
|
+
// age: [25, 30],
|
253
|
+
// salary: [50000, 60000, 55000]
|
254
|
+
// }
|
255
|
+
```
|
256
|
+
|
257
|
+
---
|
258
|
+
|
259
|
+
##### `filterRows(dataset, predicate)`
|
260
|
+
Filter dataset rows based on condition.
|
261
|
+
|
262
|
+
```javascript
|
263
|
+
const filtered = datly.filterRows(data, row => row.age > 25);
|
264
|
+
console.log(filtered.data);
|
265
|
+
// [{ name: 'Bob', age: 30, salary: 60000 }]
|
266
|
+
```
|
267
|
+
|
268
|
+
---
|
269
|
+
|
270
|
+
##### `sortBy(dataset, column, order)`
|
271
|
+
Sort dataset by column.
|
272
|
+
|
273
|
+
```javascript
|
274
|
+
const sorted = datly.sortBy(data, 'salary', 'desc');
|
275
|
+
console.log(sorted.data);
|
276
|
+
// [
|
277
|
+
// { name: 'Bob', age: 30, salary: 60000 },
|
278
|
+
// { name: 'Charlie', age: null, salary: 55000 },
|
279
|
+
// { name: 'Alice', age: 25, salary: 50000 }
|
280
|
+
// ]
|
281
|
+
```
|
282
|
+
|
283
|
+
---
|
284
|
+
|
285
|
+
### 2. Data Validation
|
286
|
+
|
287
|
+
Validate data integrity and structure.
|
288
|
+
|
289
|
+
#### Methods
|
290
|
+
|
291
|
+
##### `validateData(dataset)`
|
292
|
+
Validate dataset structure and integrity.
|
293
|
+
|
294
|
+
```javascript
|
295
|
+
const dataset = {
|
296
|
+
headers: ['a', 'b'],
|
297
|
+
data: [
|
298
|
+
{ a: 1, b: 2 },
|
299
|
+
{ a: 3, b: 4, c: 5 } // Extra column 'c'
|
300
|
+
]
|
301
|
+
};
|
302
|
+
|
303
|
+
const validation = datly.validateData(dataset);
|
304
|
+
console.log(validation);
|
305
|
+
// {
|
306
|
+
// valid: true,
|
307
|
+
// errors: [],
|
308
|
+
// warnings: ['Row 1: Extra columns: c']
|
309
|
+
// }
|
310
|
+
```
|
311
|
+
|
312
|
+
---
|
313
|
+
|
314
|
+
##### `validateNumericColumn(column)`
|
315
|
+
Validate that column contains numeric values.
|
316
|
+
|
317
|
+
```javascript
|
318
|
+
const column = [1, 2, 'three', 4, NaN, 5];
|
319
|
+
const result = datly.validateNumericColumn(column);
|
320
|
+
console.log(result);
|
321
|
+
// {
|
322
|
+
// valid: true,
|
323
|
+
// validCount: 4,
|
324
|
+
// invalidCount: 2,
|
325
|
+
// cleanData: [1, 2, 4, 5]
|
326
|
+
// }
|
327
|
+
```
|
328
|
+
|
329
|
+
---
|
330
|
+
|
331
|
+
##### `validateSampleSize(sample, minSize)`
|
332
|
+
Ensure sample has minimum required size.
|
333
|
+
|
334
|
+
```javascript
|
335
|
+
const sample = [1, 2, 3];
|
336
|
+
try {
|
337
|
+
datly.validateSampleSize(sample, 5);
|
338
|
+
} catch (error) {
|
339
|
+
console.log(error.message); // "Sample size (3) must be at least 5"
|
340
|
+
}
|
341
|
+
```
|
342
|
+
|
343
|
+
---
|
344
|
+
|
345
|
+
##### `validateConfidenceLevel(level)`
|
346
|
+
Validate that the confidence level is between 0 and 1.
|
347
|
+
|
348
|
+
```javascript
|
349
|
+
datly.validateConfidenceLevel(0.95); // true
|
350
|
+
datly.validateConfidenceLevel(1.5); // throws error
|
351
|
+
```
|
352
|
+
|
353
|
+
---
|
354
|
+
|
355
|
+
### 3. Utility Functions
|
356
|
+
|
357
|
+
General-purpose statistical utilities.
|
358
|
+
|
359
|
+
#### Methods
|
360
|
+
|
361
|
+
##### `detectOutliers(data, method)`
|
362
|
+
Detect outliers using various methods.
|
363
|
+
|
364
|
+
```javascript
|
365
|
+
const data = [1, 2, 3, 4, 5, 100]; // 100 is an outlier
|
366
|
+
|
367
|
+
// IQR method (default)
|
368
|
+
const outliers1 = datly.detectOutliers(data, 'iqr');
|
369
|
+
console.log(outliers1);
|
370
|
+
// {
|
371
|
+
// outliers: [100],
|
372
|
+
// indices: [5],
|
373
|
+
// count: 1,
|
374
|
+
// percentage: 16.67
|
375
|
+
// }
|
376
|
+
|
377
|
+
// Z-score method
|
378
|
+
const outliers2 = datly.detectOutliers(data, 'zscore');
|
379
|
+
|
380
|
+
// Modified Z-score method
|
381
|
+
const outliers3 = datly.detectOutliers(data, 'modified_zscore');
|
382
|
+
```
|
383
|
+
|
384
|
+
**Methods available:**
|
385
|
+
- `'iqr'`: Interquartile Range method (default)
|
386
|
+
- `'zscore'`: Z-score method (|z| > 3)
|
387
|
+
- `'modified_zscore'`: Modified Z-score method
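The IQR rule can be sketched in plain JavaScript as follows. This is an illustration only: `iqrOutliers` and `quantileSorted` are hypothetical names, and the linear-interpolation quantile and the 1.5 multiplier are assumptions that may differ from Datly's internals.

```javascript
// Linear-interpolation quantile on an already sorted array (assumed convention).
function quantileSorted(sorted, q) {
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

// Flags values outside [Q1 - k*IQR, Q3 + k*IQR]; k = 1.5 is the usual choice.
function iqrOutliers(values, k = 1.5) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantileSorted(sorted, 0.25);
  const q3 = quantileSorted(sorted, 0.75);
  const iqr = q3 - q1;
  const lower = q1 - k * iqr;
  const upper = q3 + k * iqr;

  const indices = [];
  values.forEach((v, i) => {
    if (v < lower || v > upper) indices.push(i);
  });

  return {
    outliers: indices.map(i => values[i]),
    indices,
    count: indices.length,
    percentage: (indices.length / values.length) * 100
  };
}

const iqrResult = iqrOutliers([1, 2, 3, 4, 5, 100]);
console.log(iqrResult);
```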
|
388
|
+
|
389
|
+
---
|
390
|
+
|
391
|
+
##### `frequencyTable(data)`
|
392
|
+
Create frequency distribution table.
|
393
|
+
|
394
|
+
```javascript
|
395
|
+
const colors = ['red', 'blue', 'red', 'green', 'blue', 'red'];
|
396
|
+
const freq = datly.frequencyTable(colors);
|
397
|
+
console.log(freq);
|
398
|
+
// [
|
399
|
+
// { value: 'red', frequency: 3, relativeFrequency: 0.5, percentage: 50 },
|
400
|
+
// { value: 'blue', frequency: 2, relativeFrequency: 0.333, percentage: 33.33 },
|
401
|
+
// { value: 'green', frequency: 1, relativeFrequency: 0.167, percentage: 16.67 }
|
402
|
+
// ]
|
403
|
+
```
|
404
|
+
|
405
|
+
---
|
406
|
+
|
407
|
+
##### `groupBy(dataset, column, aggregations)`
|
408
|
+
Group data and calculate aggregations.
|
409
|
+
|
410
|
+
```javascript
|
411
|
+
const data = {
|
412
|
+
headers: ['category', 'sales', 'profit'],
|
413
|
+
data: [
|
414
|
+
{ category: 'A', sales: 100, profit: 20 },
|
415
|
+
{ category: 'B', sales: 150, profit: 30 },
|
416
|
+
{ category: 'A', sales: 200, profit: 40 },
|
417
|
+
{ category: 'B', sales: 180, profit: 35 }
|
418
|
+
]
|
419
|
+
};
|
420
|
+
|
421
|
+
const grouped = datly.groupBy(data, 'category', {
|
422
|
+
sales: 'mean',
|
423
|
+
profit: 'sum'
|
424
|
+
});
|
425
|
+
|
426
|
+
console.log(grouped);
|
427
|
+
// {
|
428
|
+
// A: {
|
429
|
+
// count: 2,
|
430
|
+
// mean_sales: 150,
|
431
|
+
// sum_profit: 60,
|
432
|
+
// data: [...]
|
433
|
+
// },
|
434
|
+
// B: {
|
435
|
+
// count: 2,
|
436
|
+
// mean_sales: 165,
|
437
|
+
// sum_profit: 65,
|
438
|
+
// data: [...]
|
439
|
+
// }
|
440
|
+
// }
|
441
|
+
```
|
442
|
+
|
443
|
+
**Aggregation functions:**
|
444
|
+
- `'mean'`: Average value
|
445
|
+
- `'median'`: Median value
|
446
|
+
- `'sum'`: Sum of values
|
447
|
+
- `'min'`: Minimum value
|
448
|
+
- `'max'`: Maximum value
|
449
|
+
- `'std'`: Standard deviation
|
450
|
+
- `'var'`: Variance
|
451
|
+
- `'count'`: Count of values
|
452
|
+
|
453
|
+
---
|
454
|
+
|
455
|
+
##### `sample(dataset, size, method)`
|
456
|
+
Extract a sample from dataset.
|
457
|
+
|
458
|
+
```javascript
|
459
|
+
const data = {
|
460
|
+
headers: ['id', 'value'],
|
461
|
+
data: Array.from({ length: 100 }, (_, i) => ({ id: i, value: i * 10 })),
|
462
|
+
length: 100
|
463
|
+
};
|
464
|
+
|
465
|
+
// Random sampling
|
466
|
+
const randomSample = datly.sample(data, 10, 'random');
|
467
|
+
|
468
|
+
// Systematic sampling
|
469
|
+
const systematicSample = datly.sample(data, 10, 'systematic');
|
470
|
+
|
471
|
+
// First n records
|
472
|
+
const firstSample = datly.sample(data, 10, 'first');
|
473
|
+
|
474
|
+
// Last n records
|
475
|
+
const lastSample = datly.sample(data, 10, 'last');
|
476
|
+
```
|
477
|
+
|
478
|
+
---
|
479
|
+
|
480
|
+
##### `bootstrap(data, statistic, iterations)`
|
481
|
+
Bootstrap resampling for estimating statistic distribution.
|
482
|
+
|
483
|
+
```javascript
|
484
|
+
const data = [23, 25, 27, 29, 31, 33, 35];
|
485
|
+
|
486
|
+
const bootstrap = datly.bootstrap(data, 'mean', 1000);
|
487
|
+
console.log(bootstrap);
|
488
|
+
// {
|
489
|
+
// bootstrapStats: [...], // 1000 bootstrap means
|
490
|
+
// mean: 29.5,
|
491
|
+
// standardError: 1.23,
|
492
|
+
// confidenceInterval: { lower: 27.1, upper: 31.9 }
|
493
|
+
// }
|
494
|
+
```
|
495
|
+
|
496
|
+
**Statistics available:**
|
497
|
+
- `'mean'`: Bootstrap mean
|
498
|
+
- `'median'`: Bootstrap median
|
499
|
+
- `'std'`: Bootstrap standard deviation
|
500
|
+
- `'var'`: Bootstrap variance
|
501
|
+
- Custom function: Pass your own function
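For readers unfamiliar with the technique, the following is a minimal sketch of bootstrap resampling with a percentile confidence interval. `bootstrapSketch` and the seeded generator are hypothetical and only illustrate the idea; how Datly draws resamples and builds its interval is not documented here.

```javascript
// Small seeded PRNG (mulberry32-style) so the sketch is reproducible.
function seededRandom(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Resample with replacement, apply statFn to each resample, summarize the results.
function bootstrapSketch(values, statFn, iterations = 1000, seed = 42) {
  const rand = seededRandom(seed);
  const stats = [];
  for (let b = 0; b < iterations; b++) {
    const resample = values.map(() => values[Math.floor(rand() * values.length)]);
    stats.push(statFn(resample));
  }
  const sorted = [...stats].sort((x, y) => x - y);
  const mean = stats.reduce((s, v) => s + v, 0) / stats.length;
  const variance = stats.reduce((s, v) => s + (v - mean) ** 2, 0) / (stats.length - 1);
  return {
    bootstrapStats: stats,
    mean,
    standardError: Math.sqrt(variance),
    // Percentile interval: 2.5th and 97.5th percentiles of the bootstrap statistics.
    confidenceInterval: {
      lower: sorted[Math.floor(0.025 * (sorted.length - 1))],
      upper: sorted[Math.ceil(0.975 * (sorted.length - 1))]
    }
  };
}

const average = xs => xs.reduce((s, v) => s + v, 0) / xs.length;
const bootResult = bootstrapSketch([23, 25, 27, 29, 31, 33, 35], average, 1000);
console.log(bootResult.mean, bootResult.confidenceInterval);
```

Because resampling is random, exact values depend on the seed; only the overall shape of the result is meaningful.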
|
502
|
+
|
503
|
+
---
|
504
|
+
|
505
|
+
##### `contingencyTable(column1, column2)`
|
506
|
+
Create contingency table for categorical data.
|
507
|
+
|
508
|
+
```javascript
|
509
|
+
const gender = ['M', 'F', 'M', 'F', 'M', 'F'];
|
510
|
+
const preference = ['A', 'B', 'A', 'A', 'B', 'B'];
|
511
|
+
|
512
|
+
const table = datly.contingencyTable(gender, preference);
|
513
|
+
console.log(table);
|
514
|
+
// {
|
515
|
+
// table: {
|
516
|
+
// M: { A: 2, B: 1 },
|
517
|
+
// F: { A: 1, B: 2 }
|
518
|
+
// },
|
519
|
+
// totals: {
|
520
|
+
// row: { M: 3, F: 3 },
|
521
|
+
// col: { A: 3, B: 3 },
|
522
|
+
// grand: 6
|
523
|
+
// },
|
524
|
+
// rows: ['M', 'F'],
|
525
|
+
// columns: ['A', 'B']
|
526
|
+
// }
|
527
|
+
```
|
528
|
+
|
529
|
+
---
|
530
|
+
|
531
|
+
### 4. Central Tendency
|
532
|
+
|
533
|
+
Measures of central tendency (mean, median, mode).
|
534
|
+
|
535
|
+
#### Methods
|
536
|
+
|
537
|
+
##### `mean(data)`
|
538
|
+
Calculate arithmetic mean.
|
539
|
+
|
540
|
+
```javascript
|
541
|
+
const data = [10, 20, 30, 40, 50];
|
542
|
+
const avg = datly.mean(data);
|
543
|
+
console.log(avg); // 30
|
544
|
+
```
|
545
|
+
|
546
|
+
---
|
547
|
+
|
548
|
+
##### `median(data)`
|
549
|
+
Calculate median (middle value).
|
550
|
+
|
551
|
+
```javascript
|
552
|
+
const data1 = [1, 3, 5, 7, 9];
|
553
|
+
console.log(datly.median(data1)); // 5
|
554
|
+
|
555
|
+
const data2 = [1, 3, 5, 7];
|
556
|
+
console.log(datly.median(data2)); // 4 (average of 3 and 5)
|
557
|
+
```
|
558
|
+
|
559
|
+
---
|
560
|
+
|
561
|
+
##### `mode(data)`
|
562
|
+
Find mode (most frequent value).
|
563
|
+
|
564
|
+
```javascript
|
565
|
+
const data = [1, 2, 2, 3, 3, 3, 4, 4];
|
566
|
+
const modeResult = datly.mode(data);
|
567
|
+
console.log(modeResult);
|
568
|
+
// {
|
569
|
+
// values: [3],
|
570
|
+
// frequency: 3,
|
571
|
+
// isMultimodal: false,
|
572
|
+
// isUniform: false
|
573
|
+
// }
|
574
|
+
|
575
|
+
// Multimodal example
|
576
|
+
const data2 = [1, 1, 2, 2, 3];
|
577
|
+
const modeResult2 = datly.mode(data2);
|
578
|
+
console.log(modeResult2);
|
579
|
+
// {
|
580
|
+
// values: [1, 2],
|
581
|
+
// frequency: 2,
|
582
|
+
// isMultimodal: true,
|
583
|
+
// isUniform: false
|
584
|
+
// }
|
585
|
+
```
|
586
|
+
|
587
|
+
---
|
588
|
+
|
589
|
+
##### `geometricMean(data)`
|
590
|
+
Calculate geometric mean (for positive values).
|
591
|
+
|
592
|
+
```javascript
|
593
|
+
const data = [2, 8, 32]; // Growth rates
|
594
|
+
const geoMean = datly.geometricMean(data);
|
595
|
+
console.log(geoMean); // 8 (∛(2×8×32))
|
596
|
+
```
|
597
|
+
|
598
|
+
**Use cases:** Growth rates, ratios, percentages
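As a rough illustration of the definition, the geometric mean can be computed through logarithms, which avoids overflow when multiplying many values. `geometricMeanSketch` is a hypothetical standalone function, not Datly's code.

```javascript
// Geometric mean via logs: exp(mean(log(x))). Requires strictly positive values.
function geometricMeanSketch(values) {
  if (values.length === 0 || values.some(v => v <= 0)) {
    throw new Error('Geometric mean requires a non-empty array of positive values');
  }
  const logSum = values.reduce((sum, v) => sum + Math.log(v), 0);
  return Math.exp(logSum / values.length);
}

const geoSketch = geometricMeanSketch([2, 8, 32]);
console.log(geoSketch); // approximately 8, up to floating-point rounding
```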
|
599
|
+
|
600
|
+
---
|
601
|
+
|
602
|
+
##### `harmonicMean(data)`
|
603
|
+
Calculate harmonic mean (for rates).
|
604
|
+
|
605
|
+
```javascript
|
606
|
+
const speeds = [60, 40, 30]; // km/h on different segments
|
607
|
+
const harmMean = datly.harmonicMean(speeds);
|
608
|
+
console.log(harmMean); // 40 (average speed)
|
609
|
+
```
|
610
|
+
|
611
|
+
**Use cases:** Average rates, speeds, ratios
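The harmonic mean is the reciprocal of the average reciprocal. The sketch below (a hypothetical `harmonicMeanSketch`, independent of Datly) shows the calculation; note that it equals the true average speed only when each segment covers the same distance.

```javascript
// Harmonic mean: n divided by the sum of reciprocals. Requires positive values.
function harmonicMeanSketch(values) {
  if (values.length === 0 || values.some(v => v <= 0)) {
    throw new Error('Harmonic mean requires a non-empty array of positive values');
  }
  const reciprocalSum = values.reduce((sum, v) => sum + 1 / v, 0);
  return values.length / reciprocalSum;
}

const harmSketch = harmonicMeanSketch([60, 40, 30]);
console.log(harmSketch); // approximately 40
```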
|
612
|
+
|
613
|
+
---
|
614
|
+
|
615
|
+
##### `trimmedMean(data, percentage)`
|
616
|
+
Calculate trimmed mean (remove extreme values).
|
617
|
+
|
618
|
+
```javascript
|
619
|
+
const data = [1, 2, 3, 100, 4, 5, 6]; // 100 is outlier
|
620
|
+
const trimmed = datly.trimmedMean(data, 10); // Trim 10% from each end
|
621
|
+
console.log(trimmed); // Mean of the values that remain after trimming (less affected by 100)
|
622
|
+
```
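How many values are dropped depends on the rounding convention. The sketch below assumes `floor(n × percentage / 100)` values are removed from each end; `trimmedMeanSketch` is hypothetical and Datly may round differently.

```javascript
// Trimmed mean: sort, drop `cut` values from each end, average the rest.
// Assumption: cut = floor(n * percentage / 100).
function trimmedMeanSketch(values, percentage) {
  if (percentage < 0 || percentage >= 50) {
    throw new Error('percentage must be in [0, 50)');
  }
  const sorted = [...values].sort((a, b) => a - b);
  const cut = Math.floor((sorted.length * percentage) / 100);
  const kept = sorted.slice(cut, sorted.length - cut);
  return kept.reduce((sum, v) => sum + v, 0) / kept.length;
}

const sampleData = [1, 2, 3, 100, 4, 5, 6];
const trim10 = trimmedMeanSketch(sampleData, 10); // floor(0.7) = 0 values trimmed per end
const trim20 = trimmedMeanSketch(sampleData, 20); // floor(1.4) = 1 value trimmed per end
console.log(trim10, trim20);
```

Under this convention, 10% of seven values trims nothing, so a larger percentage (or a larger sample) is needed before the outlier is actually removed.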
|
623
|
+
|
624
|
+
---
|
625
|
+
|
626
|
+
##### `weightedMean(values, weights)`
|
627
|
+
Calculate weighted average.
|
628
|
+
|
629
|
+
```javascript
|
630
|
+
const grades = [85, 90, 78, 92];
|
631
|
+
const weights = [0.3, 0.3, 0.2, 0.2]; // Exam weights
|
632
|
+
const weightedGrade = datly.weightedMean(grades, weights);
|
633
|
+
console.log(weightedGrade); // 86.5
|
634
|
+
```
|
674
|
+
|
675
|
+
---
|
676
|
+
|
677
|
+
### 5. Dispersion Measures
|
678
|
+
|
679
|
+
Measures of variability and spread.
|
680
|
+
|
681
|
+
#### Methods
|
682
|
+
|
683
|
+
##### `variance(data, sample)`
|
684
|
+
Calculate variance.
|
685
|
+
|
686
|
+
```javascript
|
687
|
+
const data = [2, 4, 6, 8, 10];
|
688
|
+
|
689
|
+
// Sample variance (default)
|
690
|
+
const sampleVar = datly.variance(data, true);
|
691
|
+
console.log(sampleVar); // 10
|
692
|
+
|
693
|
+
// Population variance
|
694
|
+
const popVar = datly.variance(data, false);
|
695
|
+
console.log(popVar); // 8
|
696
|
+
```
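The difference between the two results is only the divisor: `n - 1` for a sample, `n` for a population. A minimal standalone sketch (hypothetical `varianceSketch`, not Datly's source):

```javascript
// Variance with Bessel's correction (sample = true) or without (sample = false).
function varianceSketch(values, sample = true) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const sumSquares = values.reduce((sum, v) => sum + (v - mean) ** 2, 0);
  return sumSquares / (sample ? n - 1 : n);
}

const sampleVarSketch = varianceSketch([2, 4, 6, 8, 10], true);
const popVarSketch = varianceSketch([2, 4, 6, 8, 10], false);
console.log(sampleVarSketch, popVarSketch); // 10 8
```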
|
697
|
+
|
698
|
+
---
|
699
|
+
|
700
|
+
##### `standardDeviation(data, sample)`
|
701
|
+
Calculate standard deviation.
|
702
|
+
|
703
|
+
```javascript
|
704
|
+
const data = [2, 4, 6, 8, 10];
|
705
|
+
const std = datly.standardDeviation(data);
|
706
|
+
console.log(std); // 3.162 (√10)
|
707
|
+
```
|
708
|
+
|
709
|
+
---
|
710
|
+
|
711
|
+
##### `range(data)`
|
712
|
+
Calculate range (max - min).
|
713
|
+
|
714
|
+
```javascript
|
715
|
+
const data = [5, 10, 15, 20, 25];
|
716
|
+
const rangeResult = datly.range(data);
|
717
|
+
console.log(rangeResult);
|
718
|
+
// {
|
719
|
+
// range: 20,
|
720
|
+
// min: 5,
|
721
|
+
// max: 25
|
722
|
+
// }
|
723
|
+
```
|
724
|
+
|
725
|
+
---
|
726
|
+
|
727
|
+
##### `interquartileRange(data)`
|
728
|
+
Calculate IQR (Q3 - Q1).
|
729
|
+
|
730
|
+
```javascript
|
731
|
+
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9];
|
732
|
+
const iqr = datly.interquartileRange(data);
|
733
|
+
console.log(iqr);
|
734
|
+
// {
|
735
|
+
// iqr: 4,
|
736
|
+
// q1: 3,
|
737
|
+
// q3: 7
|
738
|
+
// }
|
739
|
+
```
|
740
|
+
|
741
|
+
---
|
742
|
+
|
743
|
+
##### `coefficientOfVariation(data)`
|
744
|
+
Calculate coefficient of variation (CV%).
|
745
|
+
|
746
|
+
```javascript
|
747
|
+
const data = [10, 20, 30, 40, 50];
|
748
|
+
const cv = datly.coefficientOfVariation(data);
|
749
|
+
console.log(cv);
|
750
|
+
// {
|
751
|
+
// cv: 0.471,
|
752
|
+
// cvPercent: 47.1
|
753
|
+
// }
|
754
|
+
```
|
755
|
+
|
756
|
+
**Interpretation:**
|
757
|
+
- CV < 15%: Low variability
|
758
|
+
- CV 15-30%: Moderate variability
|
759
|
+
- CV > 30%: High variability
|
760
|
+
|
761
|
+
---
|
762
|
+
|
763
|
+
##### `meanAbsoluteDeviation(data)`
|
764
|
+
Calculate MAD (mean absolute deviation).
|
765
|
+
|
766
|
+
```javascript
|
767
|
+
const data = [2, 4, 6, 8, 10];
|
768
|
+
const mad = datly.meanAbsoluteDeviation(data);
|
769
|
+
console.log(mad);
|
770
|
+
// {
|
771
|
+
// mad: 2.4,
|
772
|
+
// mean: 6
|
773
|
+
// }
|
774
|
+
```
|
775
|
+
|
776
|
+
---
|
777
|
+
|
778
|
+
##### `standardError(data)`
|
779
|
+
Calculate standard error of the mean.
|
780
|
+
|
781
|
+
```javascript
|
782
|
+
const data = [10, 12, 14, 16, 18, 20];
|
783
|
+
const se = datly.standardError(data);
|
784
|
+
console.log(se); // ≈ 1.53 (s/√n, using the sample standard deviation)
|
785
|
+
```
|
786
|
+
|
787
|
+
---
|
788
|
+
|
789
|
+
### 6. Position Measures
|
790
|
+
|
791
|
+
Quantiles, percentiles, and ranking.
|
792
|
+
|
793
|
+
#### Methods
|
794
|
+
|
795
|
+
##### `quantile(data, q)`
|
796
|
+
Calculate quantile.
|
797
|
+
|
798
|
+
```javascript
|
799
|
+
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
|
800
|
+
|
801
|
+
// Median (0.5 quantile)
|
802
|
+
console.log(datly.quantile(data, 0.5)); // 5.5
|
803
|
+
|
804
|
+
// First quartile
|
805
|
+
console.log(datly.quantile(data, 0.25)); // 3.25
|
806
|
+
|
807
|
+
// Third quartile
|
808
|
+
console.log(datly.quantile(data, 0.75)); // 7.75
|
809
|
+
```
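The values in the example above are what linear interpolation between neighbouring order statistics produces. The sketch below shows that convention; `quantileSketch` is a hypothetical name, and other quantile definitions give slightly different numbers.

```javascript
// Quantile by linear interpolation: position (n - 1) * q in the sorted data.
function quantileSketch(values, q) {
  if (q < 0 || q > 1) throw new Error('q must be between 0 and 1');
  const sorted = [...values].sort((a, b) => a - b);
  const pos = (sorted.length - 1) * q;
  const lo = Math.floor(pos);
  const hi = Math.ceil(pos);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (pos - lo);
}

const oneToTen = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
console.log(quantileSketch(oneToTen, 0.5));  // 5.5
console.log(quantileSketch(oneToTen, 0.25)); // 3.25
console.log(quantileSketch(oneToTen, 0.75)); // 7.75
```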
|
810
|
+
|
811
|
+
---
|
812
|
+
|
813
|
+
##### `percentile(data, p)`
|
814
|
+
Calculate percentile (0-100 scale).
|
815
|
+
|
816
|
+
```javascript
|
817
|
+
const scores = [65, 70, 75, 80, 85, 90, 95];
|
818
|
+
const p90 = datly.percentile(scores, 90);
|
819
|
+
console.log(p90); // 92 with linear interpolation between 90 and 95
|
820
|
+
```
|
821
|
+
|
822
|
+
---
|
823
|
+
|
824
|
+
##### `quartiles(data)`
|
825
|
+
Calculate all quartiles.
|
826
|
+
|
827
|
+
```javascript
|
828
|
+
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9];
|
829
|
+
const quartiles = datly.quartiles(data);
|
830
|
+
console.log(quartiles);
|
831
|
+
// {
|
832
|
+
// q1: 2.5,
|
833
|
+
// q2: 5, // Median
|
834
|
+
// q3: 7.5,
|
835
|
+
// iqr: 5
|
836
|
+
// }
|
837
|
+
```
|
838
|
+
|
839
|
+
---
|
840
|
+
|
841
|
+
##### `percentileRank(data, value)`
|
842
|
+
Calculate percentile rank of a value.
|
843
|
+
|
844
|
+
```javascript
|
845
|
+
const scores = [60, 70, 80, 90, 100];
|
846
|
+
const rank = datly.percentileRank(scores, 80);
|
847
|
+
console.log(rank); // 50 (80 is at the 50th percentile)
|
848
|
+
```
|
849
|
+
|
850
|
+
---
|
851
|
+
|
852
|
+
##### `zScore(data, value)`
|
853
|
+
Calculate z-score (standardized value).
|
854
|
+
|
855
|
+
```javascript
|
856
|
+
const data = [10, 20, 30, 40, 50];
|
857
|
+
const z = datly.zScore(data, 40);
|
858
|
+
console.log(z); // 0.632 standard deviations above mean
|
859
|
+
```
|
860
|
+
|
861
|
+
**Interpretation:**
|
862
|
+
- |z| < 1: Within 1 standard deviation
|
863
|
+
- |z| < 2: Within 2 standard deviations
|
864
|
+
- |z| > 3: Potential outlier
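A z-score is the distance from the mean in units of standard deviation. The sketch below assumes the sample standard deviation (divisor `n - 1`), which matches the 0.632 shown above; `zScoreSketch` is a hypothetical standalone function.

```javascript
// z = (x - mean) / s, with s the sample standard deviation (an assumption).
function zScoreSketch(values, x) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const variance = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / (n - 1);
  return (x - mean) / Math.sqrt(variance);
}

const zSketch = zScoreSketch([10, 20, 30, 40, 50], 40);
console.log(zSketch.toFixed(3)); // "0.632"
```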
|
865
|
+
|
866
|
+
---
|
867
|
+
|
868
|
+
##### `boxplotStats(data)`
|
869
|
+
Calculate box plot statistics.
|
870
|
+
|
871
|
+
```javascript
|
872
|
+
const data = [1, 2, 3, 4, 5, 100]; // 100 is outlier
|
873
|
+
const stats = datly.boxplotStats(data);
|
874
|
+
console.log(stats);
|
875
|
+
// {
|
876
|
+
// min: 1,
|
877
|
+
// q1: 2,
|
878
|
+
// median: 3.5,
|
879
|
+
// q3: 5,
|
880
|
+
// max: 5,
|
881
|
+
// iqr: 3,
|
882
|
+
// lowerFence: -2.5,
|
883
|
+
// upperFence: 9.5,
|
884
|
+
// outliers: [100],
|
885
|
+
// outlierCount: 1
|
886
|
+
// }
|
887
|
+
```
|
888
|
+
|
889
|
+
---
|
890
|
+
|
891
|
+
##### `rank(data, method)`
|
892
|
+
Rank data values.
|
893
|
+
|
894
|
+
```javascript
|
895
|
+
const data = [10, 20, 20, 30, 40];
|
896
|
+
|
897
|
+
// Average ranking (ties get average rank)
|
898
|
+
const ranks1 = datly.rank(data, 'average');
|
899
|
+
console.log(ranks1); // [1, 2.5, 2.5, 4, 5]
|
900
|
+
|
901
|
+
// Min ranking (ties get minimum rank)
|
902
|
+
const ranks2 = datly.rank(data, 'min');
|
903
|
+
console.log(ranks2); // [1, 2, 2, 4, 5]
|
904
|
+
|
905
|
+
// Max ranking (ties get maximum rank)
|
906
|
+
const ranks3 = datly.rank(data, 'max');
|
907
|
+
console.log(ranks3); // [1, 3, 3, 4, 5]
|
908
|
+
```
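The three tie-handling strategies differ only in which rank a block of equal values receives. A minimal sketch, with `rankSketch` as a hypothetical illustration rather than Datly's implementation:

```javascript
// Ranks values in ascending order; ties share the average, minimum, or maximum rank.
function rankSketch(values, method = 'average') {
  const order = values.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const ranks = new Array(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].v === order[start].v) end++;
    // The tied block occupies 1-based positions start + 1 through end + 1.
    const minRank = start + 1;
    const maxRank = end + 1;
    const rank =
      method === 'min' ? minRank :
      method === 'max' ? maxRank :
      (minRank + maxRank) / 2;
    for (let k = start; k <= end; k++) ranks[order[k].i] = rank;
    start = end + 1;
  }
  return ranks;
}

const tied = [10, 20, 20, 30, 40];
console.log(rankSketch(tied, 'average')); // [1, 2.5, 2.5, 4, 5]
console.log(rankSketch(tied, 'min'));     // [1, 2, 2, 4, 5]
console.log(rankSketch(tied, 'max'));     // [1, 3, 3, 4, 5]
```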
|
909
|
+
|
910
|
+
---
|
911
|
+
|
912
|
+
### 7. Shape Analysis
|
913
|
+
|
914
|
+
Distribution shape: skewness and kurtosis.
|
915
|
+
|
916
|
+
#### Methods
|
917
|
+
|
918
|
+
##### `skewness(data, bias)`
|
919
|
+
Calculate skewness (measure of asymmetry).
|
920
|
+
|
921
|
+
```javascript
|
922
|
+
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]; // Right-skewed
|
923
|
+
|
924
|
+
// Biased estimate (default)
|
925
|
+
const skew1 = datly.skewness(data, true);
|
926
|
+
console.log(skew1); // Positive value
|
927
|
+
|
928
|
+
// Unbiased estimate
|
929
|
+
const skew2 = datly.skewness(data, false);
|
930
|
+
console.log(skew2);
|
931
|
+
```
|
932
|
+
|
933
|
+
**Interpretation:**
|
934
|
+
- Skewness < -1: Highly left-skewed
|
935
|
+
- -1 < Skewness < -0.5: Moderately left-skewed
|
936
|
+
- -0.5 < Skewness < 0.5: Approximately symmetric
|
937
|
+
- 0.5 < Skewness < 1: Moderately right-skewed
|
938
|
+
- Skewness > 1: Highly right-skewed
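For reference, the biased and bias-adjusted estimates can be sketched as below. This uses the moment coefficient g1 = m3 / m2^1.5 and the adjusted Fisher-Pearson correction; `skewnessSketch` is hypothetical and the exact estimator Datly uses is an assumption.

```javascript
// Biased skewness g1 = m3 / m2^(3/2); unbiased-style adjustment multiplies by
// sqrt(n * (n - 1)) / (n - 2) (adjusted Fisher-Pearson coefficient).
function skewnessSketch(values, bias = true) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const m2 = values.reduce((sum, v) => sum + (v - mean) ** 2, 0) / n;
  const m3 = values.reduce((sum, v) => sum + (v - mean) ** 3, 0) / n;
  const g1 = m3 / Math.pow(m2, 1.5);
  if (bias) return g1;
  return (Math.sqrt(n * (n - 1)) / (n - 2)) * g1;
}

const rightSkewed = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100];
console.log(skewnessSketch(rightSkewed, true) > 0);       // true
console.log(skewnessSketch([1, 2, 3, 4, 5, 6, 7, 8, 9])); // 0 for symmetric data
```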
|
939
|
+
|
940
|
+
---
|
941
|
+
|
942
|
+
##### `kurtosis(data, bias, excess)`
|
943
|
+
Calculate kurtosis (measure of tailedness).
|
944
|
+
|
945
|
+
```javascript
|
946
|
+
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9];
|
947
|
+
|
948
|
+
// Excess kurtosis (default)
|
949
|
+
const kurt1 = datly.kurtosis(data, false, true);
|
950
|
+
console.log(kurt1); // -3 adjustment applied
|
951
|
+
|
952
|
+
// Regular kurtosis
|
953
|
+
const kurt2 = datly.kurtosis(data, false, false);
|
954
|
+
console.log(kurt2);
|
955
|
+
```
|
956
|
+
|
957
|
+
**Interpretation (excess kurtosis):**
|
958
|
+
- Excess < -1: Platykurtic (thin tails)
|
959
|
+
- -1 < Excess < 1: Mesokurtic (normal)
|
960
|
+
- Excess > 1: Leptokurtic (fat tails)
|
961
|
+
|
962
|
+
---
|
963
|
+
|
964
|
+
##### `isNormalDistribution(data, alpha)`
|
965
|
+
Test if data follows normal distribution.
|
966
|
+
|
967
|
+
```javascript
|
968
|
+
const normalData = [2, 3, 4, 4, 5, 5, 5, 6, 6, 7];
|
969
|
+
const test = datly.isNormalDistribution(normalData, 0.05);
|
970
|
+
console.log(test);
|
971
|
+
// {
|
972
|
+
// shapiroWilk: { statistic: 0.95, pValue: 0.12, isNormal: true },
|
973
|
+
// jarqueBera: { statistic: 1.23, pValue: 0.54, isNormal: true },
|
974
|
+
// skewness: 0.12,
|
975
|
+
// kurtosis: -0.34,
|
976
|
+
// isNormalByTests: true
|
977
|
+
// }
|
978
|
+
```
|
979
|
+
|
980
|
+
---
|
981
|
+
|
982
|
+
##### `jarqueBeraTest(data, alpha)`
|
983
|
+
Jarque-Bera normality test.
|
984
|
+
|
985
|
+
```javascript
|
986
|
+
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
|
987
|
+
const jb = datly.jarqueBeraTest(data);
|
988
|
+
console.log(jb);
|
989
|
+
// {
|
990
|
+
// statistic: 0.84,
|
991
|
+
// pValue: 0.66,
|
992
|
+
// skewness: 0,
|
993
|
+
// excessKurtosis: -1.2,
|
994
|
+
// isNormal: true
|
995
|
+
// }
|
996
|
+
```
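The textbook Jarque-Bera statistic is JB = n/6 · (S² + K²/4), where S is the skewness and K the excess kurtosis, compared against a chi-square distribution with 2 degrees of freedom. The sketch below uses biased (population) moments and the closed-form tail `exp(-JB/2)`; `jarqueBeraSketch` is hypothetical, and a library that applies small-sample corrections will report somewhat different numbers.

```javascript
// Jarque-Bera with population moments; p-value from chi-square(2): exp(-JB / 2).
function jarqueBeraSketch(values, alpha = 0.05) {
  const n = values.length;
  const mean = values.reduce((sum, v) => sum + v, 0) / n;
  const moment = k => values.reduce((sum, v) => sum + (v - mean) ** k, 0) / n;
  const m2 = moment(2);
  const skewness = moment(3) / Math.pow(m2, 1.5);
  const excessKurtosis = moment(4) / (m2 * m2) - 3;
  const statistic = (n / 6) * (skewness ** 2 + excessKurtosis ** 2 / 4);
  const pValue = Math.exp(-statistic / 2);
  return { statistic, pValue, skewness, excessKurtosis, isNormal: pValue > alpha };
}

const jbSketch = jarqueBeraSketch([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]);
console.log(jbSketch);
```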
|
997
|
+
|
998
|
+
---
|
999
|
+
|
1000
|
+
### 8. Hypothesis Testing
|
1001
|
+
|
1002
|
+
Statistical hypothesis tests.
|
1003
|
+
|
1004
|
+
#### Methods
|
1005
|
+
|
1006
|
+
##### `tTest(sample1, sample2, type, alpha)`
|
1007
|
+
Perform t-test.
|
1008
|
+
|
1009
|
+
```javascript
|
1010
|
+
// One-sample t-test
|
1011
|
+
const sample = [23, 25, 27, 29, 31];
|
1012
|
+
const populationMean = 25;
|
1013
|
+
const oneSample = datly.tTest(sample, populationMean, 'one-sample');
|
1014
|
+
console.log(oneSample);
|
1015
|
+
// {
|
1016
|
+
// type: 'one-sample',
|
1017
|
+
// statistic: 1.41,
|
1018
|
+
// pValue: 0.23,
|
1019
|
+
// degreesOfFreedom: 4,
|
1020
|
+
// significant: false
|
1021
|
+
// }
|
1022
|
+
|
1023
|
+
// Two-sample t-test
|
1024
|
+
const group1 = [23, 25, 27, 29, 31];
|
1025
|
+
const group2 = [28, 30, 32, 34, 36];
|
1026
|
+
const twoSample = datly.tTest(group1, group2, 'two-sample');
|
1027
|
+
console.log(twoSample);
|
1028
|
+
// {
|
1029
|
+
// type: 'two-sample',
|
1030
|
+
// statistic: -2.5,
|
1031
|
+
// pValue: 0.037,
|
1032
|
+
// significant: true,
|
1033
|
+
// meanDifference: -5
|
1034
|
+
// }
|
1035
|
+
|
1036
|
+
// Paired t-test
|
1037
|
+
const before = [120, 125, 130, 128, 122];
|
1038
|
+
const after = [115, 118, 125, 120, 115];
|
1039
|
+
const paired = datly.tTest(before, after, 'paired');
|
1040
|
+
console.log(paired);
|
1041
|
+
```
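The test statistics themselves are straightforward to compute by hand. The sketch below covers the one-sample case and the two-sample case with a pooled variance; whether Datly pools variances or uses Welch's correction is not stated, so treat that choice, and the names `oneSampleT` and `pooledTwoSampleT`, as assumptions. P-values need the t distribution's CDF and are omitted.

```javascript
const meanOf = xs => xs.reduce((sum, v) => sum + v, 0) / xs.length;
const sampleVarianceOf = xs => {
  const m = meanOf(xs);
  return xs.reduce((sum, v) => sum + (v - m) ** 2, 0) / (xs.length - 1);
};

// One-sample t: (mean - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom.
function oneSampleT(sample, mu0) {
  const n = sample.length;
  const statistic = (meanOf(sample) - mu0) / Math.sqrt(sampleVarianceOf(sample) / n);
  return { statistic, degreesOfFreedom: n - 1 };
}

// Two-sample t with pooled variance (assumes equal population variances).
function pooledTwoSampleT(a, b) {
  const df = a.length + b.length - 2;
  const pooled =
    ((a.length - 1) * sampleVarianceOf(a) + (b.length - 1) * sampleVarianceOf(b)) / df;
  const standardError = Math.sqrt(pooled * (1 / a.length + 1 / b.length));
  const meanDifference = meanOf(a) - meanOf(b);
  return { statistic: meanDifference / standardError, degreesOfFreedom: df, meanDifference };
}

const oneT = oneSampleT([23, 25, 27, 29, 31], 25);
const twoT = pooledTwoSampleT([23, 25, 27, 29, 31], [28, 30, 32, 34, 36]);
console.log(oneT, twoT);
```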
|
1042
|
+
|
1043
|
+
---
|
1044
|
+
|
1045
|
+
##### `zTest(sample, populationMean, populationStd, alpha)`
|
1046
|
+
Perform z-test (known population variance).
|
1047
|
+
|
1048
|
+
```javascript
|
1049
|
+
const sample = [105, 110, 108, 112, 115];
|
1050
|
+
const popMean = 100;
|
1051
|
+
const popStd = 15;
|
1052
|
+
|
1053
|
+
const zTest = datly.zTest(sample, popMean, popStd);
|
1054
|
+
console.log(zTest);
|
1055
|
+
// {
|
1056
|
+
// type: 'z-test',
|
1057
|
+
// statistic: 1.49,
|
1058
|
+
// pValue: 0.136,
|
1059
|
+
// significant: false,
|
1060
|
+
// sampleMean: 110
|
1061
|
+
// }
|
1062
|
+
```
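
The z statistic and its two-sided p-value are simple enough to compute directly, which helps when checking the output above. This is a minimal sketch that does not use Datly. `normalCdf` and `zTestSketch` are hypothetical names, and the normal CDF uses the Abramowitz-Stegun erf approximation (formula 7.1.26), which is an assumption about method, not a description of Datly's implementation.

```javascript
// Standard normal CDF via the Abramowitz-Stegun 7.1.26 erf approximation
// (absolute error on the order of 1e-7).
function normalCdf(z) {
  const x = Math.abs(z) / Math.SQRT2;
  const t = 1 / (1 + 0.3275911 * x);
  const poly =
    t * (0.254829592 +
    t * (-0.284496736 +
    t * (1.421413741 +
    t * (-1.453152027 +
    t * 1.061405429))));
  const erf = 1 - poly * Math.exp(-x * x);
  return z >= 0 ? 0.5 * (1 + erf) : 0.5 * (1 - erf);
}

// Two-sided z-test for a mean with known population standard deviation.
function zTestSketch(sample, populationMean, populationStd, alpha = 0.05) {
  const n = sample.length;
  const sampleMean = sample.reduce((sum, x) => sum + x, 0) / n;
  const statistic = (sampleMean - populationMean) / (populationStd / Math.sqrt(n));
  const pValue = 2 * (1 - normalCdf(Math.abs(statistic)));
  return { statistic, pValue, significant: pValue < alpha, sampleMean };
}

const result = zTestSketch([105, 110, 108, 112, 115], 100, 15);
console.log(result.statistic.toFixed(2)); // "1.49"
console.log(result.pValue.toFixed(3));    // "0.136"
```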
|
1063
|
+
|
1064
|
+
---
|
1065
|
+
|
1066
|
+
##### `anovaTest(groups, alpha)`
|
1067
|
+
One-way ANOVA test.
|
1068
|
+
|
1069
|
+
```javascript
|
1070
|
+
const groupA = [23, 25, 27, 29];
|
1071
|
+
const groupB = [30, 32, 34, 36];
|
1072
|
+
const groupC = [28, 30, 32, 34];
|
1073
|
+
|
1074
|
+
const anova = datly.anovaTest([groupA, groupB, groupC]);
|
1075
|
+
console.log(anova);
|
1076
|
+
// {
|
1077
|
+
// type: 'one-way-anova',
|
1078
|
+
//   statistic: 7.8,
|
1079
|
+
//   pValue: 0.011,
|
1080
|
+
// dfBetween: 2,
|
1081
|
+
// dfWithin: 9,
|
1082
|
+
// significant: true,
|
1083
|
+
// groupMeans: [26, 33, 31]
|
1084
|
+
// }
|
1085
|
+
```
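
The F statistic is the ratio of between-group to within-group mean squares. The sketch below computes it from the textbook sums of squares without using Datly. `oneWayAnovaF` and `mean` are hypothetical helper names, and the p-value is left out because it needs an F-distribution CDF.

```javascript
function mean(xs) {
  return xs.reduce((sum, x) => sum + x, 0) / xs.length;
}

// One-way ANOVA F statistic from between- and within-group sums of squares.
function oneWayAnovaF(groups) {
  const all = groups.flat();
  const grandMean = mean(all);
  const groupMeans = groups.map((g) => mean(g));

  // Between-group variation: group size times squared distance to the grand mean.
  const ssBetween = groups.reduce(
    (sum, g, i) => sum + g.length * (groupMeans[i] - grandMean) ** 2, 0);

  // Within-group variation: squared distance of each value to its own group mean.
  const ssWithin = groups.reduce(
    (sum, g, i) => sum + g.reduce((s, x) => s + (x - groupMeans[i]) ** 2, 0), 0);

  const dfBetween = groups.length - 1;
  const dfWithin = all.length - groups.length;
  const statistic = (ssBetween / dfBetween) / (ssWithin / dfWithin);
  return { statistic, dfBetween, dfWithin, groupMeans };
}

const anovaSketch = oneWayAnovaF([[23, 25, 27, 29], [30, 32, 34, 36], [28, 30, 32, 34]]);
console.log(anovaSketch.groupMeans, anovaSketch.dfBetween, anovaSketch.dfWithin);
// [ 26, 33, 31 ] 2 9
```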
|
1086
|
+
|
1087
|
+
---
|
1088
|
+
|
1089
|
+
##### `chiSquareTest(column1, column2, alpha)`
|
1090
|
+
Chi-square test of independence.
|
1091
|
+
|
1092
|
+
```javascript
|
1093
|
+
const gender = ['M', 'F', 'M', 'F', 'M', 'F', 'M', 'F'];
|
1094
|
+
const preference = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'B'];
|
1095
|
+
|
1096
|
+
const chiTest = datly.chiSquareTest(gender, preference);
|
1097
|
+
console.log(chiTest);
|
1098
|
+
// {
|
1099
|
+
// type: 'chi-square-independence',
|
1100
|
+
// statistic: 0.5,
|
1101
|
+
// pValue: 0.48,
|
1102
|
+
// degreesOfFreedom: 1,
|
1103
|
+
// significant: false,
|
1104
|
+
// cramersV: 0.25
|
1105
|
+
// }
|
1106
|
+
```
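
The reported statistic can be traced back to the 2×2 contingency table built from the two columns. The sketch below is independent of Datly. `chiSquareIndependence` and its `yates` flag are hypothetical. With this data the uncorrected statistic is 2, and applying the Yates continuity correction gives 0.5 with Cramér's V of 0.25, which matches the output shown above, so Datly presumably applies that correction to 2×2 tables (an inference, not something the README states).

```javascript
// Chi-square test of independence from two categorical columns.
function chiSquareIndependence(column1, column2, yates = false) {
  const rows = [...new Set(column1)];
  const cols = [...new Set(column2)];
  const n = column1.length;

  // Observed counts per (row category, column category) cell.
  const observed = rows.map(() => cols.map(() => 0));
  column1.forEach((value, i) => {
    observed[rows.indexOf(value)][cols.indexOf(column2[i])] += 1;
  });

  const rowTotals = observed.map((row) => row.reduce((s, v) => s + v, 0));
  const colTotals = cols.map((_, j) => observed.reduce((s, row) => s + row[j], 0));

  let statistic = 0;
  observed.forEach((row, i) => row.forEach((count, j) => {
    const expected = (rowTotals[i] * colTotals[j]) / n;
    // Yates correction shrinks each |O - E| by 0.5 (never below zero).
    const diff = yates
      ? Math.max(Math.abs(count - expected) - 0.5, 0)
      : Math.abs(count - expected);
    statistic += (diff * diff) / expected;
  }));

  const degreesOfFreedom = (rows.length - 1) * (cols.length - 1);
  const cramersV = Math.sqrt(statistic / (n * Math.min(rows.length - 1, cols.length - 1)));
  return { statistic, degreesOfFreedom, cramersV, observed };
}

const genderCol = ['M', 'F', 'M', 'F', 'M', 'F', 'M', 'F'];
const preferenceCol = ['A', 'B', 'A', 'A', 'B', 'B', 'A', 'B'];
console.log(chiSquareIndependence(genderCol, preferenceCol).statistic);       // 2
console.log(chiSquareIndependence(genderCol, preferenceCol, true).statistic); // 0.5
```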
|
1107
|
+
|
1108
|
+
---
|
1109
|
+
|
1110
|
+
##### `mannWhitneyTest(sample1, sample2, alpha)`
|
1111
|
+
Mann-Whitney U test (non-parametric alternative to t-test).
|
1112
|
+
|
1113
|
+
```javascript
|
1114
|
+
const group1 = [1, 2, 3, 4, 5];
|
1115
|
+
const group2 = [6, 7, 8, 9, 10];
|
1116
|
+
|
1117
|
+
const mwTest = datly.mannWhitneyTest(group1, group2);
|
1118
|
+
console.log(mwTest);
|
1119
|
+
// {
|
1120
|
+
// type: 'mann-whitney-u',
|
1121
|
+
// statistic: 0,
|
1122
|
+
// u1: 0,
|
1123
|
+
// u2: 25,
|
1124
|
+
// zStatistic: -2.61,
|
1125
|
+
// pValue: 0.009,
|
1126
|
+
// significant: true
|
1127
|
+
// }
|
1128
|
+
```
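
The `u1`, `u2`, and `zStatistic` fields follow from the rank sums of the pooled samples. A minimal sketch, independent of Datly, with hypothetical helper names `averageRanks` and `mannWhitneyU`. It uses the normal approximation without a tie correction, which is an assumption.

```javascript
// Assign 1-based ranks; tied values share the average of their positions.
function averageRanks(values) {
  const order = values.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const ranks = new Array(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].v === order[start].v) end++;
    const avg = (start + end) / 2 + 1;
    for (let k = start; k <= end; k++) ranks[order[k].i] = avg;
    start = end + 1;
  }
  return ranks;
}

// Mann-Whitney U from the rank sum of the first sample.
function mannWhitneyU(sample1, sample2) {
  const n1 = sample1.length;
  const n2 = sample2.length;
  const ranks = averageRanks([...sample1, ...sample2]);
  const rankSum1 = ranks.slice(0, n1).reduce((s, r) => s + r, 0);
  const u1 = rankSum1 - (n1 * (n1 + 1)) / 2;
  const u2 = n1 * n2 - u1;
  const statistic = Math.min(u1, u2);
  // Normal approximation (no tie correction).
  const meanU = (n1 * n2) / 2;
  const sdU = Math.sqrt((n1 * n2 * (n1 + n2 + 1)) / 12);
  return { statistic, u1, u2, zStatistic: (statistic - meanU) / sdU };
}

const mw = mannWhitneyU([1, 2, 3, 4, 5], [6, 7, 8, 9, 10]);
console.log(mw.u1, mw.u2, mw.zStatistic.toFixed(2)); // 0 25 -2.61
```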
|
1129
|
+
|
1130
|
+
---
|
1131
|
+
|
1132
|
+
### 9. Confidence Intervals
|
1133
|
+
|
1134
|
+
Estimate population parameters with confidence.
|
1135
|
+
|
1136
|
+
#### Methods
|
1137
|
+
|
1138
|
+
##### `mean(data, confidence)`
|
1139
|
+
Confidence interval for mean.
|
1140
|
+
|
1141
|
+
```javascript
|
1142
|
+
const data = [23, 25, 27, 29, 31, 33, 35];
|
1143
|
+
const ci = datly.confidenceInterval(data, 0.95);
|
1144
|
+
console.log(ci);
|
1145
|
+
// {
|
1146
|
+
// mean: 29,
|
1147
|
+
// standardError: 1.63,
|
1148
|
+
// marginOfError: 4.03,
|
1149
|
+
// lowerBound: 24.97,
|
1150
|
+
// upperBound: 33.03,
|
1151
|
+
// confidence: 0.95,
|
1152
|
+
// degreesOfFreedom: 6
|
1153
|
+
// }
|
1154
|
+
```
|
1155
|
+
|
1156
|
+
---
|
1157
|
+
|
1158
|
+
##### `proportion(successes, total, confidence)`
|
1159
|
+
Confidence interval for proportion.
|
1160
|
+
|
1161
|
+
```javascript
|
1162
|
+
const successes = 45; // Number of successes
|
1163
|
+
const total = 100; // Total trials
|
1164
|
+
|
1165
|
+
const ci = datly.confidenceIntervals.proportion(successes, total, 0.95);
|
1166
|
+
console.log(ci);
|
1167
|
+
// {
|
1168
|
+
// normal: {
|
1169
|
+
// proportion: 0.45,
|
1170
|
+
// lowerBound: 0.353,
|
1171
|
+
// upperBound: 0.547,
|
1172
|
+
// confidence: 0.95
|
1173
|
+
// },
|
1174
|
+
// wilson: {
|
1175
|
+
// proportion: 0.45,
|
1176
|
+
// center: 0.453,
|
1177
|
+
// lowerBound: 0.355,
|
1178
|
+
// upperBound: 0.551,
|
1179
|
+
// confidence: 0.95
|
1180
|
+
// },
|
1181
|
+
// recommended: {...} // Wilson interval for small samples
|
1182
|
+
// }
|
1183
|
+
```
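
For reference, the Wilson score interval can be written in a few lines. The sketch below is independent of Datly. `wilsonInterval` is a hypothetical helper, and the fixed `z = 1.96` default (95% confidence) is an assumption rather than a value looked up from a quantile function.

```javascript
// Wilson score interval for a binomial proportion.
function wilsonInterval(successes, total, z = 1.96) {
  const p = successes / total;
  const z2 = z * z;
  const denom = 1 + z2 / total;
  // The center is pulled from p toward 0.5; the pull shrinks as total grows.
  const center = (p + z2 / (2 * total)) / denom;
  const halfWidth =
    (z / denom) * Math.sqrt((p * (1 - p)) / total + z2 / (4 * total * total));
  return {
    proportion: p,
    center,
    lowerBound: center - halfWidth,
    upperBound: center + halfWidth
  };
}

console.log(wilsonInterval(45, 100));
```

Unlike the normal-approximation interval, the Wilson bounds always stay inside [0, 1], which is why it is usually preferred for small samples or proportions near 0 or 1.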
|
1184
|
+
|
1185
|
+
---
|
1186
|
+
|
1187
|
+
##### `variance(data, confidence)`
|
1188
|
+
Confidence interval for variance.
|
1189
|
+
|
1190
|
+
```javascript
|
1191
|
+
const data = [10, 12, 14, 16, 18, 20];
|
1192
|
+
const ci = datly.confidenceIntervals.variance(data, 0.95);
|
1193
|
+
console.log(ci);
|
1194
|
+
// {
|
1195
|
+
//   sampleVariance: 14,
|
1196
|
+
//   lowerBound: 5.45,
|
1197
|
+
//   upperBound: 84.21,
|
1198
|
+
// confidence: 0.95
|
1199
|
+
// }
|
1200
|
+
```
|
1201
|
+
|
1202
|
+
---
|
1203
|
+
|
1204
|
+
##### `meanDifference(sample1, sample2, confidence, equalVariances)`
|
1205
|
+
Confidence interval for difference between two means.
|
1206
|
+
|
1207
|
+
```javascript
|
1208
|
+
const before = [120, 125, 130, 128, 122];
|
1209
|
+
const after = [115, 118, 125, 120, 115];
|
1210
|
+
|
1211
|
+
const ci = datly.confidenceIntervals.meanDifference(before, after, 0.95);
|
1212
|
+
console.log(ci);
|
1213
|
+
// {
|
1214
|
+
//   meanDifference: 6.4,
|
1215
|
+
//   sample1Mean: 125,
|
1216
|
+
//   sample2Mean: 118.6,
|
1217
|
+
//   lowerBound: 0.36,
|
1218
|
+
//   upperBound: 12.44,
|
1219
|
+
// confidence: 0.95
|
1220
|
+
// }
|
1221
|
+
```
|
1222
|
+
|
1223
|
+
---
|
1224
|
+
|
1225
|
+
##### `correlation(x, y, confidence, method)`
|
1226
|
+
Confidence interval for correlation coefficient.
|
1227
|
+
|
1228
|
+
```javascript
|
1229
|
+
const x = [1, 2, 3, 4, 5];
|
1230
|
+
const y = [2, 4, 5, 4, 5];
|
1231
|
+
|
1232
|
+
const ci = datly.confidenceIntervals.correlation(x, y, 0.95);
|
1233
|
+
console.log(ci);
|
1234
|
+
// {
|
1235
|
+
// correlation: 0.775,
|
1236
|
+
//   fisherZ: 1.032,
|
1237
|
+
//   lowerBound: -0.340,
|
1238
|
+
//   upperBound: 0.984,
|
1239
|
+
// confidence: 0.95
|
1240
|
+
// }
|
1241
|
+
```
|
1242
|
+
|
1243
|
+
---
|
1244
|
+
|
1245
|
+
##### `bootstrapCI(data, statistic, confidence, iterations)`
|
1246
|
+
Bootstrap confidence interval.
|
1247
|
+
|
1248
|
+
```javascript
|
1249
|
+
const data = [23, 25, 27, 29, 31];
|
1250
|
+
|
1251
|
+
// Bootstrap CI for median
|
1252
|
+
const ci = datly.confidenceIntervals.bootstrapCI(data, 'median', 0.95, 1000);
|
1253
|
+
console.log(ci);
|
1254
|
+
// {
|
1255
|
+
// originalStatistic: 27,
|
1256
|
+
// bootstrapMean: 27.1,
|
1257
|
+
// bias: 0.1,
|
1258
|
+
// standardError: 1.4,
|
1259
|
+
// lowerBound: 24.5,
|
1260
|
+
// upperBound: 29.8,
|
1261
|
+
// confidence: 0.95,
|
1262
|
+
// iterations: 1000
|
1263
|
+
// }
|
1264
|
+
```
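
A percentile bootstrap resamples the data with replacement, recomputes the statistic each time, and reads the interval off the sorted results. The sketch below is one common way to do it and is not taken from Datly. `mulberry32` (a small seeded generator used so that runs are repeatable), `median`, `bootstrapPercentileCI`, and the `seed` parameter are all assumptions made for illustration.

```javascript
// Small seeded PRNG (mulberry32) so the resampling is reproducible.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6D2B79F5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function median(xs) {
  const sorted = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Percentile bootstrap: the interval is taken from the empirical quantiles
// of the statistic over `iterations` resamples drawn with replacement.
function bootstrapPercentileCI(data, statFn, confidence = 0.95, iterations = 1000, seed = 42) {
  const rand = mulberry32(seed);
  const stats = [];
  for (let b = 0; b < iterations; b++) {
    const resample = data.map(() => data[Math.floor(rand() * data.length)]);
    stats.push(statFn(resample));
  }
  stats.sort((a, b) => a - b);
  const alpha = 1 - confidence;
  const lowerIdx = Math.floor((alpha / 2) * (iterations - 1));
  const upperIdx = Math.ceil((1 - alpha / 2) * (iterations - 1));
  return {
    originalStatistic: statFn(data),
    lowerBound: stats[lowerIdx],
    upperBound: stats[upperIdx],
    confidence,
    iterations
  };
}

console.log(bootstrapPercentileCI([23, 25, 27, 29, 31], median, 0.95, 1000));
```

With only five observations, every resampled median is one of the original values, so the bounds from this sketch are always data points. Smoother bounds require more data or a different interval method.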
|
1265
|
+
|
1266
|
+
---
|
1267
|
+
|
1268
|
+
### 10. Normality Tests
|
1269
|
+
|
1270
|
+
Test whether data follow a normal distribution.
|
1271
|
+
|
1272
|
+
#### Methods
|
1273
|
+
|
1274
|
+
##### `shapiroWilk(data, alpha)`
|
1275
|
+
Shapiro-Wilk test (most powerful for small samples).
|
1276
|
+
|
1277
|
+
```javascript
|
1278
|
+
const data = [2.3, 2.5, 2.7, 2.9, 3.1, 3.3, 3.5];
|
1279
|
+
const sw = datly.normalityTests.shapiroWilk(data);
|
1280
|
+
console.log(sw);
|
1281
|
+
// {
|
1282
|
+
// statistic: 0.96,
|
1283
|
+
// pValue: 0.82,
|
1284
|
+
// isNormal: true,
|
1285
|
+
// alpha: 0.05,
|
1286
|
+
// interpretation: "Fail to reject null hypothesis..."
|
1287
|
+
// }
|
1288
|
+
```
|
1289
|
+
|
1290
|
+
**Best for:** Sample sizes 3-5000
|
1291
|
+
|
1292
|
+
---
|
1293
|
+
|
1294
|
+
##### `kolmogorovSmirnov(data, alpha)`
|
1295
|
+
Kolmogorov-Smirnov test.
|
1296
|
+
|
1297
|
+
```javascript
|
1298
|
+
const data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
|
1299
|
+
const ks = datly.normalityTests.kolmogorovSmirnov(data);
|
1300
|
+
console.log(ks);
|
1301
|
+
// {
|
1302
|
+
// statistic: 0.15,
|
1303
|
+
// pValue: 0.95,
|
1304
|
+
// isNormal: true,
|
1305
|
+
// lambda: 0.47
|
1306
|
+
// }
|
1307
|
+
```
|
1308
|
+
|
1309
|
+
**Best for:** Large samples, continuous data
|
1310
|
+
|
1311
|
+
---
|
1312
|
+
|
1313
|
+
##### `andersonDarling(data, alpha)`
|
1314
|
+
Anderson-Darling test (sensitive to tails).
|
1315
|
+
|
1316
|
+
```javascript
|
1317
|
+
const data = [10, 12, 14, 15, 16, 18, 20];
|
1318
|
+
const ad = datly.normalityTests.andersonDarling(data);
|
1319
|
+
console.log(ad);
|
1320
|
+
// {
|
1321
|
+
// statistic: 0.234,
|
1322
|
+
// adjustedStatistic: 0.247,
|
1323
|
+
// pValue: 0.75,
|
1324
|
+
// isNormal: true
|
1325
|
+
// }
|
1326
|
+
```
|
1327
|
+
|
1328
|
+
**Best for:** Detecting tail deviations
|
1329
|
+
|
1330
|
+
---
|
1331
|
+
|
1332
|
+
##### `jarqueBera(data, alpha)`
|
1333
|
+
Jarque-Bera test (based on skewness and kurtosis).
|
1334
|
+
|
1335
|
+
```javascript
|
1336
|
+
const data = [5, 6, 7, 8, 9, 10, 11, 12];
|
1337
|
+
const jb = datly.normalityTests.jarqueBera(data);
|
1338
|
+
console.log(jb);
|
1339
|
+
// {
|
1340
|
+
// statistic: 0.45,
|
1341
|
+
// pValue: 0.80,
|
1342
|
+
// skewness: 0,
|
1343
|
+
// excessKurtosis: -1.2,
|
1344
|
+
// isNormal: true
|
1345
|
+
// }
|
1346
|
+
```
|
1347
|
+
|
1348
|
+
**Best for:** Large samples (n > 30)
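
The Jarque-Bera statistic combines skewness and excess kurtosis as JB = n/6 · (S² + K²/4) and is compared against a chi-square distribution with 2 degrees of freedom. The sketch below uses the population (divide-by-n) moments, which is the textbook definition. Datly may use a different estimator, so values can differ slightly from the output above. `jarqueBeraSketch` is a hypothetical name.

```javascript
// Jarque-Bera test using population (divide-by-n) central moments.
function jarqueBeraSketch(data, alpha = 0.05) {
  const n = data.length;
  const mean = data.reduce((s, x) => s + x, 0) / n;
  const moment = (k) => data.reduce((s, x) => s + (x - mean) ** k, 0) / n;

  const m2 = moment(2);
  const skewness = moment(3) / Math.pow(m2, 1.5);
  const excessKurtosis = moment(4) / (m2 * m2) - 3;

  const statistic = (n / 6) * (skewness ** 2 + excessKurtosis ** 2 / 4);
  // For 2 degrees of freedom the chi-square survival function is exp(-x / 2).
  const pValue = Math.exp(-statistic / 2);
  return { statistic, pValue, skewness, excessKurtosis, isNormal: pValue > alpha };
}

console.log(jarqueBeraSketch([5, 6, 7, 8, 9, 10, 11, 12]));
```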
|
1349
|
+
|
1350
|
+
---
|
1351
|
+
|
1352
|
+
##### `dagoTest(data, alpha)`
|
1353
|
+
D'Agostino K² test.
|
1354
|
+
|
1355
|
+
```javascript
|
1356
|
+
const data = Array.from({ length: 50 }, () => Math.random() * 10);
|
1357
|
+
const dago = datly.normalityTests.dagoTest(data);
|
1358
|
+
console.log(dago);
|
1359
|
+
// {
|
1360
|
+
// statistic: 2.34,
|
1361
|
+
// pValue: 0.31,
|
1362
|
+
// isNormal: true,
|
1363
|
+
// skewness: 0.23,
|
1364
|
+
// excessKurtosis: -0.45
|
1365
|
+
// }
|
1366
|
+
```
|
1367
|
+
|
1368
|
+
**Best for:** Medium to large samples (n ≥ 20)
|
1369
|
+
|
1370
|
+
---
|
1371
|
+
|
1372
|
+
##### `batchNormalityTest(data, alpha)`
|
1373
|
+
Run all normality tests at once.
|
1374
|
+
|
1375
|
+
```javascript
|
1376
|
+
const data = [12, 14, 15, 16, 17, 18, 20, 22];
|
1377
|
+
const batch = datly.normalityTests.batchNormalityTest(data);
|
1378
|
+
console.log(batch);
|
1379
|
+
// {
|
1380
|
+
// individualTests: {
|
1381
|
+
// shapiroWilk: {...},
|
1382
|
+
// jarqueBera: {...},
|
1383
|
+
// andersonDarling: {...},
|
1384
|
+
// kolmogorovSmirnov: {...}
|
1385
|
+
// },
|
1386
|
+
// summary: {
|
1387
|
+
// testsRun: 4,
|
1388
|
+
// testsPassingNormality: 4,
|
1389
|
+
// consensusNormal: true,
|
1390
|
+
// strongNormalEvidence: true
|
1391
|
+
// },
|
1392
|
+
// recommendation: "Strong evidence for normality..."
|
1393
|
+
// }
|
1394
|
+
```
|
1395
|
+
|
1396
|
+
---
|
1397
|
+
|
1398
|
+
### 11. Correlation Analysis
|
1399
|
+
|
1400
|
+
Measure relationships between variables.
|
1401
|
+
|
1402
|
+
#### Methods
|
1403
|
+
|
1404
|
+
##### `pearson(x, y)`
|
1405
|
+
Pearson correlation (linear relationships).
|
1406
|
+
|
1407
|
+
```javascript
|
1408
|
+
const height = [160, 165, 170, 175, 180];
|
1409
|
+
const weight = [55, 60, 65, 70, 75];
|
1410
|
+
|
1411
|
+
const corr = datly.correlation.pearson(height, weight);
|
1412
|
+
console.log(corr);
|
1413
|
+
// {
|
1414
|
+
// correlation: 1.0,
|
1415
|
+
// pValue: 0.000,
|
1416
|
+
// tStatistic: Infinity,
|
1417
|
+
// significant: true,
|
1418
|
+
// confidenceInterval: { lower: 1.0, upper: 1.0 }
|
1419
|
+
// }
|
1420
|
+
```
|
1421
|
+
|
1422
|
+
**Range:** -1 (perfect negative) to +1 (perfect positive)
|
1423
|
+
|
1424
|
+
**Interpretation:**
|
1425
|
+
- |r| < 0.3: Weak
|
1426
|
+
- 0.3 ≤ |r| < 0.7: Moderate
|
1427
|
+
- |r| ≥ 0.7: Strong
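
The coefficient and the interpretation bands above can be expressed directly in code. This sketch is independent of Datly. `pearsonR` and `strengthLabel` are hypothetical helpers that mirror the thresholds listed in this section.

```javascript
// Pearson correlation: covariance divided by the product of the spreads.
function pearsonR(x, y) {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    const dx = x[i] - meanX;
    const dy = y[i] - meanY;
    sxy += dx * dy;
    sxx += dx * dx;
    syy += dy * dy;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Map |r| onto the bands listed above.
function strengthLabel(r) {
  const magnitude = Math.abs(r);
  if (magnitude < 0.3) return 'weak';
  if (magnitude < 0.7) return 'moderate';
  return 'strong';
}

const r = pearsonR([160, 165, 170, 175, 180], [55, 60, 65, 70, 75]);
console.log(r.toFixed(3), strengthLabel(r)); // 1.000 strong
```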
|
1428
|
+
|
1429
|
+
---
|
1430
|
+
|
1431
|
+
##### `spearman(x, y)`
|
1432
|
+
Spearman correlation (monotonic relationships, rank-based).
|
1433
|
+
|
1434
|
+
```javascript
|
1435
|
+
const x = [1, 2, 3, 4, 5];
|
1436
|
+
const y = [1, 4, 9, 16, 25]; // Non-linear but monotonic
|
1437
|
+
|
1438
|
+
const corr = datly.correlation.spearman(x, y);
|
1439
|
+
console.log(corr);
|
1440
|
+
// {
|
1441
|
+
// correlation: 1.0,
|
1442
|
+
// pValue: 0.000,
|
1443
|
+
// significant: true,
|
1444
|
+
// xRanks: [1, 2, 3, 4, 5],
|
1445
|
+
// yRanks: [1, 2, 3, 4, 5]
|
1446
|
+
// }
|
1447
|
+
```
|
1448
|
+
|
1449
|
+
**Use when:** Data is ordinal or non-linear
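
Spearman's coefficient is the Pearson correlation of the ranks, which is why a monotonic but non-linear relationship still scores 1. A minimal sketch, independent of Datly, with hypothetical helper names (`averageRanks`, `pearsonOnArrays`, `spearmanRho`):

```javascript
// 1-based ranks; tied values receive the average of their positions.
function averageRanks(values) {
  const order = values.map((v, i) => ({ v, i })).sort((a, b) => a.v - b.v);
  const ranks = new Array(values.length);
  let start = 0;
  while (start < order.length) {
    let end = start;
    while (end + 1 < order.length && order[end + 1].v === order[start].v) end++;
    const avg = (start + end) / 2 + 1;
    for (let k = start; k <= end; k++) ranks[order[k].i] = avg;
    start = end + 1;
  }
  return ranks;
}

function pearsonOnArrays(x, y) {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  let syy = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - meanX) * (y[i] - meanY);
    sxx += (x[i] - meanX) ** 2;
    syy += (y[i] - meanY) ** 2;
  }
  return sxy / Math.sqrt(sxx * syy);
}

// Spearman's rho: rank both variables, then correlate the ranks.
function spearmanRho(x, y) {
  const xRanks = averageRanks(x);
  const yRanks = averageRanks(y);
  return { correlation: pearsonOnArrays(xRanks, yRanks), xRanks, yRanks };
}

const rho = spearmanRho([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]);
console.log(rho.correlation.toFixed(3), rho.yRanks); // 1.000 [ 1, 2, 3, 4, 5 ]
```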
|
1450
|
+
|
1451
|
+
---
|
1452
|
+
|
1453
|
+
##### `kendall(x, y)`
|
1454
|
+
Kendall's Tau correlation.
|
1455
|
+
|
1456
|
+
```javascript
|
1457
|
+
const x = [1, 2, 3, 4, 5];
|
1458
|
+
const y = [2, 1, 4, 3, 5];
|
1459
|
+
|
1460
|
+
const corr = datly.correlation.kendall(x, y);
|
1461
|
+
console.log(corr);
|
1462
|
+
// {
|
1463
|
+
// correlation: 0.6,
|
1464
|
+
// pValue: 0.142,
|
1465
|
+
// zStatistic: 1.47,
|
1466
|
+
// concordantPairs: 8,
|
1467
|
+
// discordantPairs: 2,
|
1468
|
+
// significant: false
|
1469
|
+
// }
|
1470
|
+
```
|
1471
|
+
|
1472
|
+
**Use when:** Small sample sizes, ordinal data
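
The `concordantPairs` and `discordantPairs` fields come from comparing every pair of observations. The sketch below counts them directly and computes tau-a, which ignores ties. It does not use Datly, and `kendallTau` is a hypothetical name.

```javascript
// Kendall's tau-a: (concordant - discordant) / total number of pairs.
function kendallTau(x, y) {
  const n = x.length;
  let concordantPairs = 0;
  let discordantPairs = 0;
  for (let i = 0; i < n - 1; i++) {
    for (let j = i + 1; j < n; j++) {
      // A pair is concordant when x and y move in the same direction.
      const direction = Math.sign(x[i] - x[j]) * Math.sign(y[i] - y[j]);
      if (direction > 0) concordantPairs++;
      else if (direction < 0) discordantPairs++;
    }
  }
  const totalPairs = (n * (n - 1)) / 2;
  return {
    correlation: (concordantPairs - discordantPairs) / totalPairs,
    concordantPairs,
    discordantPairs
  };
}

console.log(kendallTau([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]));
// { correlation: 0.6, concordantPairs: 8, discordantPairs: 2 }
```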
|
1473
|
+
|
1474
|
+
---
|
1475
|
+
|
1476
|
+
##### `matrix(dataset, method)`
|
1477
|
+
Correlation matrix for multiple variables.
|
1478
|
+
|
1479
|
+
```javascript
|
1480
|
+
const data = {
|
1481
|
+
headers: ['age', 'income', 'spending'],
|
1482
|
+
data: [
|
1483
|
+
{ age: 25, income: 30000, spending: 20000 },
|
1484
|
+
{ age: 30, income: 40000, spending: 25000 },
|
1485
|
+
{ age: 35, income: 50000, spending: 30000 },
|
1486
|
+
{ age: 40, income: 60000, spending: 35000 }
|
1487
|
+
]
|
1488
|
+
};
|
1489
|
+
|
1490
|
+
const matrix = datly.correlation.matrix(data, 'pearson');
|
1491
|
+
console.log(matrix);
|
1492
|
+
// {
|
1493
|
+
// correlations: {
|
1494
|
+
// age: { age: 1, income: 1, spending: 1 },
|
1495
|
+
// income: { age: 1, income: 1, spending: 1 },
|
1496
|
+
// spending: { age: 1, income: 1, spending: 1 }
|
1497
|
+
// },
|
1498
|
+
// pValues: {...},
|
1499
|
+
// strongCorrelations: [
|
1500
|
+
// { variable1: 'age', variable2: 'income', correlation: 1.0 }
|
1501
|
+
// ]
|
1502
|
+
// }
|
1503
|
+
```
|
1504
|
+
|
1505
|
+
---
|
1506
|
+
|
1507
|
+
##### `partialCorrelation(x, y, z)`
|
1508
|
+
Partial correlation (controlling for third variable).
|
1509
|
+
|
1510
|
+
```javascript
|
1511
|
+
const x = [1, 2, 3, 4, 5];
|
1512
|
+
const y = [2, 3, 4, 5, 6];
|
1513
|
+
const z = [1, 1, 2, 2, 3]; // Control variable
|
1514
|
+
|
1515
|
+
const partial = datly.correlation.partialCorrelation(x, y, z);
|
1516
|
+
console.log(partial);
|
1517
|
+
// {
|
1518
|
+
// correlation: 0.95,
|
1519
|
+
// pValue: 0.013,
|
1520
|
+
// significant: true,
|
1521
|
+
// controllingFor: 'third variable'
|
1522
|
+
// }
|
1523
|
+
```
|
1524
|
+
|
1525
|
+
---
|
1526
|
+
|
1527
|
+
##### `covariance(x, y, sample)`
|
1528
|
+
Calculate covariance.
|
1529
|
+
|
1530
|
+
```javascript
|
1531
|
+
const x = [1, 2, 3, 4, 5];
|
1532
|
+
const y = [2, 4, 6, 8, 10];
|
1533
|
+
|
1534
|
+
// Sample covariance
|
1535
|
+
const cov = datly.correlation.covariance(x, y, true);
|
1536
|
+
console.log(cov);
|
1537
|
+
// {
|
1538
|
+
// covariance: 5,
|
1539
|
+
// meanX: 3,
|
1540
|
+
// meanY: 6,
|
1541
|
+
// sampleSize: 5
|
1542
|
+
// }
|
1543
|
+
```
|
1544
|
+
|
1545
|
+
---
|
1546
|
+
|
1547
|
+
### 12. Regression Analysis
|
1548
|
+
|
1549
|
+
Model relationships and make predictions.
|
1550
|
+
|
1551
|
+
#### Methods
|
1552
|
+
|
1553
|
+
##### `linear(x, y)`
|
1554
|
+
Simple linear regression.
|
1555
|
+
|
1556
|
+
```javascript
|
1557
|
+
const x = [1, 2, 3, 4, 5];
|
1558
|
+
const y = [2, 4, 5, 4, 5];
|
1559
|
+
|
1560
|
+
const model = datly.regression.linear(x, y);
|
1561
|
+
console.log(model);
|
1562
|
+
// {
|
1563
|
+
// slope: 0.6,
|
1564
|
+
// intercept: 2.2,
|
1565
|
+
//   rSquared: 0.6,
|
1566
|
+
//   adjustedRSquared: 0.47,
|
1567
|
+
//   equation: 'y = 2.2000 + 0.6000x',
|
1568
|
+
//   pValueSlope: 0.124,
|
1569
|
+
//   pValueModel: 0.124,
|
1570
|
+
//   residuals: [-0.8, 0.6, 1.0, -0.6, -0.2],
|
1571
|
+
// predicted: [2.8, 3.4, 4.0, 4.6, 5.2]
|
1572
|
+
// }
|
1573
|
+
```
|
1574
|
+
|
1575
|
+
---
|
1576
|
+
|
1577
|
+
##### `multiple(dataset, dependent, independents)`
|
1578
|
+
Multiple linear regression.
|
1579
|
+
|
1580
|
+
```javascript
|
1581
|
+
const data = {
|
1582
|
+
headers: ['sales', 'advertising', 'price', 'competition'],
|
1583
|
+
data: [
|
1584
|
+
{ sales: 100, advertising: 10, price: 50, competition: 3 },
|
1585
|
+
{ sales: 150, advertising: 15, price: 45, competition: 2 },
|
1586
|
+
{ sales: 120, advertising: 12, price: 48, competition: 4 },
|
1587
|
+
{ sales: 180, advertising: 20, price: 40, competition: 1 }
|
1588
|
+
]
|
1589
|
+
};
|
1590
|
+
|
1591
|
+
const model = datly.regression.multiple(
|
1592
|
+
data,
|
1593
|
+
'sales',
|
1594
|
+
['advertising', 'price', 'competition']
|
1595
|
+
);
|
1596
|
+
|
1597
|
+
console.log(model);
|
1598
|
+
// {
|
1599
|
+
// coefficients: [
|
1600
|
+
// { variable: 'Intercept', coefficient: 50, pValue: 0.05 },
|
1601
|
+
// { variable: 'advertising', coefficient: 5.5, pValue: 0.01 },
|
1602
|
+
// { variable: 'price', coefficient: -2.1, pValue: 0.03 },
|
1603
|
+
// { variable: 'competition', coefficient: -10, pValue: 0.02 }
|
1604
|
+
// ],
|
1605
|
+
// rSquared: 0.95,
|
1606
|
+
// adjustedRSquared: 0.90,
|
1607
|
+
// fStatistic: 19.0,
|
1608
|
+
// pValueModel: 0.001,
|
1609
|
+
// equation: 'y = 50.0000 + 5.5000*advertising + -2.1000*price + -10.0000*competition'
|
1610
|
+
// }
|
1611
|
+
```
|
1612
|
+
|
1613
|
+
---
|
1614
|
+
|
1615
|
+
##### `polynomial(x, y, degree)`
|
1616
|
+
Polynomial regression.
|
1617
|
+
|
1618
|
+
```javascript
|
1619
|
+
const x = [1, 2, 3, 4, 5];
|
1620
|
+
const y = [1, 4, 9, 16, 25]; // y = x²
|
1621
|
+
|
1622
|
+
const model = datly.regression.polynomial(x, y, 2);
|
1623
|
+
console.log(model);
|
1624
|
+
// {
|
1625
|
+
// coefficients: [0, 0, 1], // y = 0 + 0x + 1x²
|
1626
|
+
// degree: 2,
|
1627
|
+
// rSquared: 1.0,
|
1628
|
+
// equation: 'y = 0.0000 + 0.0000*x + 1.0000*x^2',
|
1629
|
+
// predicted: [1, 4, 9, 16, 25]
|
1630
|
+
// }
|
1631
|
+
```
|
1632
|
+
|
1633
|
+
---
|
1634
|
+
|
1635
|
+
##### `logistic(x, y, maxIterations, tolerance)`
|
1636
|
+
Logistic regression (binary classification).
|
1637
|
+
|
1638
|
+
```javascript
|
1639
|
+
const x = [1, 2, 3, 4, 5, 6];
|
1640
|
+
const y = [0, 0, 0, 1, 1, 1]; // Binary outcome
|
1641
|
+
|
1642
|
+
const model = datly.regression.logistic(x, y);
|
1643
|
+
console.log(model);
|
1644
|
+
// {
|
1645
|
+
// intercept: -3.5,
|
1646
|
+
// slope: 1.2,
|
1647
|
+
// probabilities: [0.12, 0.23, 0.38, 0.55, 0.70, 0.81],
|
1648
|
+
// predicted: [0, 0, 0, 1, 1, 1],
|
1649
|
+
// accuracy: 1.0,
|
1650
|
+
// logLikelihood: -2.1,
|
1651
|
+
// mcFaddenR2: 0.68
|
1652
|
+
// }
|
1653
|
+
```
|
1654
|
+
|
1655
|
+
---
|
1656
|
+
|
1657
|
+
##### `predict(model, x)`
|
1658
|
+
Make predictions using a fitted model.
|
1659
|
+
|
1660
|
+
```javascript
|
1661
|
+
// After fitting a model, e.g. the linear(x, y) example above (y = 2.2 + 0.6x)
|
1662
|
+
const newX = [6, 7, 8];
|
1663
|
+
const predictions = datly.regression.predict(model, newX);
|
1664
|
+
console.log(predictions); // [5.8, 6.4, 7.0]
|
1665
|
+
```
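
For the simple linear case, the fit-then-predict flow can be reproduced with the closed-form least-squares formulas. This sketch is independent of Datly. `fitLine` and `predictLine` are hypothetical names, and the returned fields are a subset chosen for illustration.

```javascript
// Ordinary least squares for y = intercept + slope * x.
function fitLine(x, y) {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;
  let sxy = 0;
  let sxx = 0;
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - meanX) * (y[i] - meanY);
    sxx += (x[i] - meanX) ** 2;
  }
  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;

  const predicted = x.map((v) => intercept + slope * v);
  const residuals = y.map((v, i) => v - predicted[i]);
  // R squared = 1 - (residual sum of squares / total sum of squares).
  const ssResidual = residuals.reduce((s, e) => s + e * e, 0);
  const ssTotal = y.reduce((s, v) => s + (v - meanY) ** 2, 0);
  return { slope, intercept, rSquared: 1 - ssResidual / ssTotal, predicted, residuals };
}

// Apply a fitted line to new x values.
function predictLine(model, xs) {
  return xs.map((v) => model.intercept + model.slope * v);
}

const line = fitLine([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]);
console.log(predictLine(line, [6, 7, 8]).map((v) => v.toFixed(1)));
// [ '5.8', '6.4', '7.0' ]
```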
|
1666
|
+
|
1667
|
+
---
|
1668
|
+
|
1669
|
+
### 13. Report Generation
|
1670
|
+
|
1671
|
+
Generate comprehensive statistical reports.
|
1672
|
+
|
1673
|
+
#### Methods
|
1674
|
+
|
1675
|
+
##### `summary(dataset)`
|
1676
|
+
Generate complete statistical summary.
|
1677
|
+
|
1678
|
+
```javascript
|
1679
|
+
const data = {
|
1680
|
+
headers: ['age', 'income', 'department'],
|
1681
|
+
data: [
|
1682
|
+
{ age: 25, income: 30000, department: 'Sales' },
|
1683
|
+
{ age: 30, income: 45000, department: 'IT' },
|
1684
|
+
{ age: 35, income: 50000, department: 'Sales' },
|
1685
|
+
{ age: 40, income: 60000, department: 'IT' }
|
1686
|
+
],
|
1687
|
+
length: 4,
|
1688
|
+
columns: 3
|
1689
|
+
};
|
1690
|
+
|
1691
|
+
const report = datly.reportGenerator.summary(data);
|
1692
|
+
console.log(report);
|
1693
|
+
// {
|
1694
|
+
// title: 'Statistical Summary Report',
|
1695
|
+
// generatedAt: '2025-01-15T10:30:00.000Z',
|
1696
|
+
// basicInfo: {
|
1697
|
+
// totalRows: 4,
|
1698
|
+
// totalColumns: 3,
|
1699
|
+
// headers: ['age', 'income', 'department']
|
1700
|
+
// },
|
1701
|
+
// columnAnalysis: {
|
1702
|
+
// age: {
|
1703
|
+
// type: 'numeric',
|
1704
|
+
// mean: 32.5,
|
1705
|
+
// median: 32.5,
|
1706
|
+
// min: 25,
|
1707
|
+
// max: 40,
|
1708
|
+
// standardDeviation: 6.45
|
1709
|
+
// },
|
1710
|
+
// income: {
|
1711
|
+
// type: 'numeric',
|
1712
|
+
// mean: 46250,
|
1713
|
+
// median: 47500,
|
1714
|
+
// ...
|
1715
|
+
// },
|
1716
|
+
// department: {
|
1717
|
+
// type: 'categorical',
|
1718
|
+
// categories: [...],
|
1719
|
+
// mostFrequent: { value: 'Sales', frequency: 2 }
|
1720
|
+
// }
|
1721
|
+
// },
|
1722
|
+
// dataQuality: {
|
1723
|
+
// overallScore: 95,
|
1724
|
+
// completenessScore: 100,
|
1725
|
+
// consistencyScore: 90
|
1726
|
+
// },
|
1727
|
+
// keyInsights: [
|
1728
|
+
// {
|
1729
|
+
// type: 'correlation',
|
1730
|
+
// title: 'Strong correlation between age and income',
|
1731
|
+
// importance: 8
|
1732
|
+
// }
|
1733
|
+
// ],
|
1734
|
+
// recommendations: [...]
|
1735
|
+
// }
|
1736
|
+
```
|
1737
|
+
|
1738
|
+
---
|
1739
|
+
|
1740
|
+
##### `exportSummary(summary, format)`
|
1741
|
+
Export report in different formats.
|
1742
|
+
|
1743
|
+
```javascript
|
1744
|
+
const report = datly.reportGenerator.summary(data);
|
1745
|
+
|
1746
|
+
// Export as JSON
|
1747
|
+
const json = datly.reportGenerator.exportSummary(report, 'json');
|
1748
|
+
|
1749
|
+
// Export as text
|
1750
|
+
const text = datly.reportGenerator.exportSummary(report, 'text');
|
1751
|
+
console.log(text);
|
1752
|
+
// STATISTICAL SUMMARY REPORT
|
1753
|
+
// Generated: 1/15/2025, 10:30:00 AM
|
1754
|
+
// ==================================================
|
1755
|
+
//
|
1756
|
+
// BASIC INFORMATION
|
1757
|
+
// --------------------
|
1758
|
+
// Rows: 4
|
1759
|
+
// Columns: 3
|
1760
|
+
// ...
|
1761
|
+
|
1762
|
+
// Export as CSV
|
1763
|
+
const csv = datly.reportGenerator.exportSummary(report, 'csv');
|
1764
|
+
```
|
1765
|
+
|
1766
|
+
---
|
1767
|
+
|
1768
|
+
### 14. Pattern Detection
|
1769
|
+
|
1770
|
+
Automatically detect patterns in data.
|
1771
|
+
|
1772
|
+
#### Methods
|
1773
|
+
|
1774
|
+
##### `analyze(dataset)`
|
1775
|
+
Comprehensive pattern analysis.
|
1776
|
+
|
1777
|
+
```javascript
|
1778
|
+
const data = {
|
1779
|
+
headers: ['date', 'sales', 'temperature'],
|
1780
|
+
data: [
|
1781
|
+
{ date: '2024-01-01', sales: 100, temperature: 20 },
|
1782
|
+
{ date: '2024-01-02', sales: 110, temperature: 22 },
|
1783
|
+
{ date: '2024-01-03', sales: 105, temperature: 21 },
|
1784
|
+
{ date: '2024-01-04', sales: 115, temperature: 23 }
|
1785
|
+
],
|
1786
|
+
length: 4,
|
1787
|
+
columns: 3
|
1788
|
+
};
|
1789
|
+
|
1790
|
+
const patterns = datly.patternDetector.analyze(data);
|
1791
|
+
console.log(patterns);
|
1792
|
+
// {
|
1793
|
+
// patterns: {
|
1794
|
+
// trends: [
|
1795
|
+
// {
|
1796
|
+
// column: 'sales',
|
1797
|
+
// direction: 'increasing',
|
1798
|
+
// slope: 5,
|
1799
|
+
// rSquared: 0.75,
|
1800
|
+
// strength: 'strong'
|
1801
|
+
// }
|
1802
|
+
// ],
|
1803
|
+
// seasonality: [...],
|
1804
|
+
// outliers: [...],
|
1805
|
+
// correlations: {
|
1806
|
+
// strongCorrelations: [
|
1807
|
+
// {
|
1808
|
+
// variable1: 'sales',
|
1809
|
+
// variable2: 'temperature',
|
1810
|
+
// correlation: 0.95,
|
1811
|
+
// strength: 'very_strong'
|
1812
|
+
// }
|
1813
|
+
// ]
|
1814
|
+
// },
|
1815
|
+
// distributions: [...],
|
1816
|
+
// clustering: [...],
|
1817
|
+
// temporal: [...]
|
1818
|
+
// },
|
1819
|
+
// insights: [
|
1820
|
+
// {
|
1821
|
+
// type: 'trend',
|
1822
|
+
// importance: 'high',
|
1823
|
+
// message: 'Found 1 strong trend(s) in your data',
|
1824
|
+
// details: ['sales: increasing trend']
|
1825
|
+
// }
|
1826
|
+
// ]
|
1827
|
+
// }
|
1828
|
+
```
|
1829
|
+
|
1830
|
+
---
|
1831
|
+
|
1832
|
+
### 15. Result Interpretation
|
1833
|
+
|
1834
|
+
Interpret statistical test results in plain language.
|
1835
|
+
|
1836
|
+
#### Methods
|
1837
|
+
|
1838
|
+
##### `interpret(testResult)`
|
1839
|
+
Interpret any statistical test result.
|
1840
|
+
|
1841
|
+
```javascript
|
1842
|
+
// After performing a t-test
|
1843
|
+
const tTestResult = datly.hypothesisTesting.tTest(group1, group2);
|
1844
|
+
|
1845
|
+
const interpretation = datly.interpreter.interpret(tTestResult);
|
1846
|
+
console.log(interpretation);
|
1847
|
+
// {
|
1848
|
+
// testType: 't-test',
|
1849
|
+
// summary: 'significant difference between groups (t = -3.46, p = 0.008)',
|
1850
|
+
// conclusion: {
|
1851
|
+
// decision: 'reject_null',
|
1852
|
+
// statement: 'At the 95% confidence level, we reject the null hypothesis',
|
1853
|
+
// pValue: 0.008,
|
1854
|
+
// confidenceLevel: 95
|
1855
|
+
// },
|
1856
|
+
// significance: {
|
1857
|
+
// level: 'strong',
|
1858
|
+
// pValue: 0.008,
|
1859
|
+
// interpretation: 'Strong evidence against null hypothesis',
|
1860
|
+
// isSignificant: true
|
1861
|
+
// },
|
1862
|
+
// effectSize: {
|
1863
|
+
// value: 0.85,
|
1864
|
+
// magnitude: 'Large',
|
1865
|
+
// interpretation: 'large effect size'
|
1866
|
+
// },
|
1867
|
+
// plainLanguage: '✓ SIGNIFICANT RESULT: Found a meaningful difference between the groups. (p-value: 0.0080)',
|
1868
|
+
// recommendations: [
|
1869
|
+
// 'Very strong result - investigate practical significance',
|
1870
|
+
// 'Replicate findings with independent data when possible'
|
1871
|
+
// ]
|
1872
|
+
// }
|
1873
|
+
```
|
1874
|
+
|
1875
|
+
---
|
1876
|
+
|
1877
|
+
### 16. Auto-Analysis
|
1878
|
+
|
1879
|
+
Automated end-to-end analysis.
|
1880
|
+
|
1881
|
+
#### Methods
|
1882
|
+
|
1883
|
+
##### `autoAnalyze(dataset, options)`
|
1884
|
+
Perform comprehensive automatic analysis.
|
1885
|
+
|
1886
|
+
```javascript
|
1887
|
+
const data = {
|
1888
|
+
headers: ['age', 'income', 'gender', 'purchase'],
|
1889
|
+
data: [
|
1890
|
+
{ age: 25, income: 30000, gender: 'M', purchase: 100 },
|
1891
|
+
{ age: 30, income: 45000, gender: 'F', purchase: 150 },
|
1892
|
+
{ age: 35, income: 50000, gender: 'M', purchase: 120 },
|
1893
|
+
{ age: 40, income: 60000, gender: 'F', purchase: 180 },
|
1894
|
+
// ... more data
|
1895
|
+
],
|
1896
|
+
length: 100,
|
1897
|
+
columns: 4
|
1898
|
+
};
|
1899
|
+
|
1900
|
+
const analysis = datly.autoAnalyzer.autoAnalyze(data, {
|
1901
|
+
minCorrelationThreshold: 0.3,
|
1902
|
+
significanceLevel: 0.05,
|
1903
|
+
generateVisualizations: true,
|
1904
|
+
includeAdvancedAnalysis: true
|
1905
|
+
});
|
1906
|
+
|
1907
|
+
console.log(analysis);
|
1908
|
+
// {
|
1909
|
+
// metadata: {
|
1910
|
+
// analysisDate: '2025-01-15T10:30:00.000Z',
|
1911
|
+
// datasetSize: 100,
|
1912
|
+
// columnsAnalyzed: 4
|
1913
|
+
// },
|
1914
|
+
// variableClassification: {
|
1915
|
+
// quantitative: [
|
1916
|
+
// { name: 'age', type: 'quantitative', subtype: 'discrete' },
|
1917
|
+
// { name: 'income', type: 'quantitative', subtype: 'continuous' },
|
1918
|
+
// { name: 'purchase', type: 'quantitative', subtype: 'continuous' }
|
1919
|
+
// ],
|
1920
|
+
// qualitative: [
|
1921
|
+
// { name: 'gender', type: 'binary', categories: ['M', 'F'] }
|
1922
|
+
// ]
|
1923
|
+
// },
|
1924
|
+
// descriptiveStatistics: {
|
1925
|
+
// age: { mean: 32.5, median: 32, std: 6.45, ... },
|
1926
|
+
// income: { mean: 46250, median: 47500, ... },
|
1927
|
+
// purchase: { mean: 137.5, median: 135, ... }
|
1928
|
+
// },
|
1929
|
+
// correlationAnalysis: {
|
1930
|
+
// strongCorrelations: [
|
1931
|
+
// {
|
1932
|
+
// variable1: 'income',
|
1933
|
+
// variable2: 'purchase',
|
1934
|
+
// correlation: 0.92,
|
1935
|
+
// significance: true
|
1936
|
+
// }
|
1937
|
+
// ]
|
1938
|
+
// },
|
1939
|
+
// regressionAnalysis: {
|
1940
|
+
// models: [
|
1941
|
+
// {
|
1942
|
+
// independent: 'income',
|
1943
|
+
// dependent: 'purchase',
|
1944
|
+
// rSquared: 0.85,
|
1945
|
+
// significant: true,
|
1946
|
+
// equation: 'y = 10.5 + 0.0025*income'
|
1947
|
+
// }
|
1948
|
+
// ]
|
1949
|
+
// },
|
1950
|
+
// distributionAnalysis: {
|
1951
|
+
// age: {
|
1952
|
+
// isNormal: true,
|
1953
|
+
// normalityPValue: 0.15,
|
1954
|
+
// skewness: 0.12,
|
1955
|
+
// distributionType: 'normal'
|
1956
|
+
// }
|
1957
|
+
// },
|
1958
|
+
// outlierAnalysis: {
|
1959
|
+
// income: {
|
1960
|
+
// count: 2,
|
1961
|
+
// percentage: 2,
|
1962
|
+
// severity: 'low'
|
1963
|
+
// }
|
1964
|
+
// },
|
1965
|
+
// insights: [
|
1966
|
+
// {
|
1967
|
+
// category: 'overview',
|
1968
|
+
// priority: 'high',
|
1969
|
+
// title: 'Dataset Composition',
|
1970
|
+
// description: 'Dataset with 100 records, 3 numeric and 1 categorical variables',
|
1971
|
+
// icon: '📊'
|
1972
|
+
// },
|
1973
|
+
// {
|
1974
|
+
// category: 'correlation',
|
1975
|
+
// priority: 'high',
|
1976
|
+
// title: 'Very strong correlation between income and purchase',
|
1977
|
+
// description: 'Positive correlation of 0.920',
|
1978
|
+
// icon: '🔗'
|
1979
|
+
// }
|
1980
|
+
// ],
|
1981
|
+
// visualizationSuggestions: [
|
1982
|
+
// {
|
1983
|
+
// type: 'scatter',
|
1984
|
+
// variables: ['income', 'purchase'],
|
1985
|
+
// priority: 'high',
|
1986
|
+
// title: 'income vs purchase'
|
1987
|
+
// },
|
1988
|
+
// {
|
1989
|
+
// type: 'histogram',
|
1990
|
+
// variable: 'age',
|
1991
|
+
// priority: 'medium',
|
1992
|
+
// title: 'Distribution of age'
|
1993
|
+
// }
|
1994
|
+
// ],
|
1995
|
+
// summary: {
|
1996
|
+
// totalInsights: 8,
|
1997
|
+
// highPriorityInsights: 3,
|
1998
|
+
// keyFindings: [...],
|
1999
|
+
// recommendations: [
|
2000
|
+
// 'Explore correlations identified for possible predictive modeling',
|
2001
|
+
// 'Consider transformations for non-normal distributions'
|
2002
|
+
// ]
|
2003
|
+
// }
|
2004
|
+
// }
|
2005
|
+
```
|
2006
|
+
|
2007
|
+
---
|
2008
|
+
|
2009
|
+
### 17. Machine Learning
|
2010
|
+
|
2011
|
+
Build and train ML models.
|
2012
|
+
|
2013
|
+
#### Creating Models
|
2014
|
+
|
2015
|
+
##### Linear Regression
|
2016
|
+
|
2017
|
+
```javascript
|
2018
|
+
// Create model
|
2019
|
+
const model = datly.ml.createLinearRegression({
|
2020
|
+
learningRate: 0.01,
|
2021
|
+
iterations: 1000,
|
2022
|
+
regularization: 'l2', // 'l1', 'l2', or null
|
2023
|
+
lambda: 0.01
|
2024
|
+
});
|
2025
|
+
|
2026
|
+
// Prepare data
|
2027
|
+
const X = [
|
2028
|
+
[1, 2],
|
2029
|
+
[2, 3],
|
2030
|
+
[3, 4],
|
2031
|
+
[4, 5]
|
2032
|
+
];
|
2033
|
+
const y = [3, 5, 7, 9];
|
2034
|
+
|
2035
|
+
// Train model
|
2036
|
+
model.fit(X, y, true); // true = normalize features
|
2037
|
+
|
2038
|
+
// Make predictions
|
2039
|
+
const predictions = model.predict([[5, 6]]);
|
2040
|
+
console.log(predictions); // [11]
|
2041
|
+
|
2042
|
+
// Evaluate model
|
2043
|
+
const score = model.score(X, y);
|
2044
|
+
console.log(score);
|
2045
|
+
// {
|
2046
|
+
// r2Score: 1.0,
|
2047
|
+
// mse: 0.0,
|
2048
|
+
// rmse: 0.0,
|
2049
|
+
// mae: 0.0
|
2050
|
+
// }
|
2051
|
+
```

---

##### Logistic Regression

```javascript
// Create model for classification
const model = datly.ml.createLogisticRegression({
  learningRate: 0.01,
  iterations: 1000
});

// Binary classification data
const X = [
  [1, 2], [2, 3], [3, 4], [4, 5],
  [5, 6], [6, 7], [7, 8], [8, 9]
];
const y = [0, 0, 0, 0, 1, 1, 1, 1];

// Train
model.fit(X, y);

// Predict classes
const predictions = model.predict([[3.5, 4.5], [7, 8]]);
console.log(predictions); // [0, 1]

// Predict probabilities
const probabilities = model.predictProba([[3.5, 4.5]]);
console.log(probabilities); // [{ 0: 0.62, 1: 0.38 }]

// Evaluate
const score = model.score(X, y);
console.log(score);
// {
//   accuracy: 1.0,
//   confusionMatrix: {...},
//   classMetrics: {...}
// }
```
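
To make the relationship between `predictProba` and `predict` concrete, here is a small sketch of the usual logistic-regression decision rule: a linear score goes through the sigmoid function and is compared with a threshold. The weights, bias, and 0.5 threshold are made-up values for illustration; they are not taken from a trained Datly model, and the helper names are hypothetical.

```javascript
// Sketch of the logistic decision rule (illustrative, not Datly's code).
function sigmoid(z) {
  return 1 / (1 + Math.exp(-z));
}

function predictProbability(features, weights, bias) {
  // Linear score: bias + sum of weight * feature
  const z = features.reduce((sum, x, i) => sum + x * weights[i], bias);
  return sigmoid(z);
}

function predictClass(features, weights, bias, threshold = 0.5) {
  return predictProbability(features, weights, bias) >= threshold ? 1 : 0;
}

const weights = [0.8, 0.8]; // hypothetical coefficients
const bias = -8;            // hypothetical intercept

console.log(predictClass([3.5, 4.5], weights, bias)); // 0
console.log(predictClass([7, 8], weights, bias));     // 1
```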

---

##### K-Nearest Neighbors (KNN)

```javascript
// Create KNN model
const model = datly.ml.createKNN({
  k: 5,
  metric: 'euclidean', // 'euclidean', 'manhattan', 'minkowski'
  weights: 'uniform' // 'uniform' or 'distance'
});

// Prepare data
const X = [
  [1, 2], [2, 3], [3, 4],
  [6, 7], [7, 8], [8, 9]
];
const y = [0, 0, 0, 1, 1, 1]; // Classes

// Train (KNN just stores the data)
model.fit(X, y, true, 'classification');

// Predict
const predictions = model.predict([[2, 2], [7, 7]]);
console.log(predictions); // [0, 1]

// Predict with probabilities
const proba = model.predictProba([[4, 5]]);
console.log(proba); // [{ 0: 0.4, 1: 0.6 }]
```
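
Because KNN only stores the training data, all the work happens at prediction time. The sketch below shows the idea with Euclidean distance and uniform (majority) voting; `euclidean` and `knnPredict` are hypothetical helpers for illustration and use `k = 3` rather than the `k: 5` configured above. It is not Datly's implementation.

```javascript
// Sketch of k-nearest-neighbours classification (illustrative only).
function euclidean(a, b) {
  return Math.sqrt(a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0));
}

function knnPredict(trainX, trainY, query, k) {
  // Sort training points by distance to the query and keep the k closest.
  const neighbours = trainX
    .map((point, i) => ({ label: trainY[i], dist: euclidean(point, query) }))
    .sort((a, b) => a.dist - b.dist)
    .slice(0, k);

  // Count votes per label and return the most frequent one.
  const votes = new Map();
  for (const { label } of neighbours) {
    votes.set(label, (votes.get(label) || 0) + 1);
  }
  return [...votes.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

const trainX = [[1, 2], [2, 3], [3, 4], [6, 7], [7, 8], [8, 9]];
const trainY = [0, 0, 0, 1, 1, 1];

console.log(knnPredict(trainX, trainY, [2, 2], 3)); // 0
console.log(knnPredict(trainX, trainY, [7, 7], 3)); // 1
```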

---

##### Decision Tree

```javascript
// Create decision tree
const model = datly.ml.createDecisionTree({
  maxDepth: 10,
  minSamplesSplit: 2,
  minSamplesLeaf: 1,
  criterion: 'gini' // 'gini' or 'entropy'
});

// Train
const X = [
  [2.5], [3.5], [4.5], [5.5], [6.5], [7.5]
];
const y = [0, 0, 1, 1, 2, 2]; // Multi-class

model.fit(X, y, 'classification');

// Predict
const predictions = model.predict([[3.0], [6.0]]);
console.log(predictions); // [0, 1]

// Get feature importance
const importance = model.getFeatureImportance();
console.log(importance); // { feature_0: 1.0 }

// Model summary
const summary = model.summary();
console.log(summary);
// {
//   modelType: 'Decision Tree',
//   taskType: 'classification',
//   trainingMetrics: {
//     treeDepth: 3,
//     leafCount: 6,
//     nodeCount: 11
//   }
// }
```
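
The `criterion` option selects the impurity measure used to rate candidate splits. As a rough sketch of what `'gini'` and `'entropy'` usually mean, the helpers below compute both measures and the size-weighted impurity of a split. The function names are hypothetical and the code is independent of Datly's internals.

```javascript
// Illustrative impurity measures for classification trees (not Datly's code).
function classCounts(labels) {
  const counts = new Map();
  for (const label of labels) counts.set(label, (counts.get(label) || 0) + 1);
  return [...counts.values()];
}

// Gini impurity: 1 - sum of squared class proportions.
function gini(labels) {
  const n = labels.length;
  return 1 - classCounts(labels).reduce((sum, c) => sum + (c / n) ** 2, 0);
}

// Shannon entropy in bits: -sum of p * log2(p).
function entropy(labels) {
  const n = labels.length;
  return -classCounts(labels).reduce((sum, c) => sum + (c / n) * Math.log2(c / n), 0);
}

// Impurity of a split: child impurities weighted by child size.
function splitImpurity(left, right, impurity = gini) {
  const n = left.length + right.length;
  return (left.length / n) * impurity(left) + (right.length / n) * impurity(right);
}

const labels = [0, 0, 1, 1, 2, 2];
console.log(gini(labels).toFixed(3));                          // 0.667
console.log(entropy(labels).toFixed(3));                       // 1.585
console.log(splitImpurity([0, 0], [1, 1, 2, 2]).toFixed(3));   // 0.333
```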

---

##### Random Forest

```javascript
// Create random forest
const model = datly.ml.createRandomForest({
  nEstimators: 100, // Number of trees
  maxDepth: 10,
  minSamplesSplit: 2,
  maxFeatures: 'sqrt', // 'sqrt', 'log2', or number
  bootstrap: true
});

// Train
const X = [
  [1, 2], [2, 3], [3, 4], [4, 5],
  [5, 6], [6, 7], [7, 8], [8, 9]
];
const y = ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'];

model.fit(X, y, 'classification');

// Predict
const predictions = model.predict([[2.5, 3.5], [7, 8]]);
console.log(predictions); // ['A', 'C']

// Get feature importance
const importance = model.getFeatureImportance();
console.log(importance); // [0.6, 0.4]
```
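
A random forest classifier typically combines its trees by majority vote. The sketch below illustrates only that voting step: the three "trees" are hand-written threshold rules standing in for trained trees, and `majorityVote` and `forestPredict` are hypothetical names. How Datly aggregates its trees internally is not confirmed by this README.

```javascript
// Sketch of ensemble voting (illustrative; the stub trees are not trained models).
function majorityVote(labels) {
  const counts = new Map();
  for (const label of labels) counts.set(label, (counts.get(label) || 0) + 1);

  let best = null;
  let bestCount = -1;
  for (const [label, count] of counts) {
    if (count > bestCount) {
      best = label;
      bestCount = count;
    }
  }
  return best; // ties resolve to the label that was seen first
}

const stubTrees = [
  ([x1]) => (x1 < 3.5 ? 'A' : x1 < 5.5 ? 'B' : 'C'),
  ([x1]) => (x1 < 4.5 ? 'A' : 'C'),
  ([, x2]) => (x2 < 4.5 ? 'A' : x2 < 7.5 ? 'B' : 'C')
];

function forestPredict(trees, sample) {
  return majorityVote(trees.map(tree => tree(sample)));
}

console.log(forestPredict(stubTrees, [2.5, 3.5])); // A
console.log(forestPredict(stubTrees, [7, 8]));     // C
```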

---

##### Naive Bayes

```javascript
// Create Naive Bayes classifier
const model = datly.ml.createNaiveBayes({
  type: 'gaussian' // 'gaussian', 'multinomial', or 'bernoulli'
});

// Train
const X = [
  [1, 2], [2, 3], [3, 4],
  [5, 6], [6, 7], [7, 8]
];
const y = ['spam', 'spam', 'spam', 'ham', 'ham', 'ham'];

model.fit(X, y);

// Predict
const predictions = model.predict([[2, 2], [6, 6]]);
console.log(predictions); // ['spam', 'ham']

// Predict probabilities
const proba = model.predictProba([[4, 5]]);
console.log(proba); // [{ spam: 0.3, ham: 0.7 }]
```
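
For the `'gaussian'` type, the textbook approach is to fit one normal distribution per class and feature, then combine the per-feature likelihoods with the class prior. The following sketch implements that idea under stated assumptions: population variance, a small `1e-9` variance floor chosen arbitrarily, and hypothetical function names. It is not Datly's implementation.

```javascript
// Minimal Gaussian Naive Bayes sketch (illustrative only).
function fitGaussianNB(X, y) {
  const model = {};
  for (const label of new Set(y)) {
    const rows = X.filter((_, i) => y[i] === label);
    const means = rows[0].map((_, j) => rows.reduce((s, r) => s + r[j], 0) / rows.length);
    const variances = means.map((m, j) =>
      // Population variance plus a small floor to avoid division by zero.
      rows.reduce((s, r) => s + (r[j] - m) ** 2, 0) / rows.length + 1e-9
    );
    model[label] = { prior: rows.length / X.length, means, variances };
  }
  return model;
}

function predictProbaGaussianNB(model, sample) {
  // Sum log prior and log likelihoods, then normalise so the result sums to 1.
  const logScores = {};
  for (const [label, { prior, means, variances }] of Object.entries(model)) {
    logScores[label] = sample.reduce(
      (s, x, j) =>
        s - 0.5 * Math.log(2 * Math.PI * variances[j]) - (x - means[j]) ** 2 / (2 * variances[j]),
      Math.log(prior)
    );
  }
  const maxLog = Math.max(...Object.values(logScores));
  const expScores = Object.entries(logScores).map(([label, v]) => [label, Math.exp(v - maxLog)]);
  const total = expScores.reduce((s, [, v]) => s + v, 0);
  return Object.fromEntries(expScores.map(([label, v]) => [label, v / total]));
}

const nb = fitGaussianNB(
  [[1, 2], [2, 3], [3, 4], [5, 6], [6, 7], [7, 8]],
  ['spam', 'spam', 'spam', 'ham', 'ham', 'ham']
);
const nbProba = predictProbaGaussianNB(nb, [2, 2]);
console.log(nbProba.spam > nbProba.ham); // true
```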

---

##### Support Vector Machine (SVM)

```javascript
// Create SVM
const model = datly.ml.createSVM({
  C: 1.0, // Regularization parameter
  kernel: 'linear', // 'linear', 'rbf', 'poly'
  gamma: 'scale', // 'scale', 'auto', or number
  degree: 3, // For polynomial kernel
  learningRate: 0.001,
  iterations: 1000
});

// Train
const X = [
  [1, 2], [2, 3], [3, 4],
  [6, 7], [7, 8], [8, 9]
];
const y = [0, 0, 0, 1, 1, 1];

model.fit(X, y);

// Predict
const predictions = model.predict([[2, 2], [7, 7]]);
console.log(predictions); // [0, 1]

// Summary
const summary = model.summary();
console.log(summary.trainingMetrics.nSupportVectors); // Number of support vectors
```
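
The `kernel`, `gamma`, and `degree` options refer to the usual SVM kernel functions. As a sketch of their standard definitions, the snippet below implements the linear, polynomial, and RBF kernels with a numeric `gamma`. The `kernels` object, the `coef0` parameter, and the default values are assumptions made for this illustration; how Datly resolves `'scale'` or `'auto'` is not shown in this README.

```javascript
// Standard kernel definitions (illustrative; defaults are arbitrary choices).
function dot(a, b) {
  return a.reduce((sum, v, i) => sum + v * b[i], 0);
}

const kernels = {
  // K(a, b) = a . b
  linear: (a, b) => dot(a, b),
  // K(a, b) = (gamma * a . b + coef0) ^ degree
  poly: (a, b, { gamma = 1, coef0 = 1, degree = 3 } = {}) =>
    (gamma * dot(a, b) + coef0) ** degree,
  // K(a, b) = exp(-gamma * ||a - b||^2)
  rbf: (a, b, { gamma = 1 } = {}) => {
    const sqDist = a.reduce((sum, v, i) => sum + (v - b[i]) ** 2, 0);
    return Math.exp(-gamma * sqDist);
  }
};

console.log(kernels.linear([1, 2], [3, 4]));              // 11
console.log(kernels.poly([1, 2], [3, 4], { degree: 2 })); // 144
console.log(kernels.rbf([1, 2], [1, 2]));                 // 1
```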

---

#### Model Utilities

##### Train-Test Split

```javascript
const X = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]];
const y = [0, 0, 1, 1, 1];

const split = datly.ml.trainTestSplit(X, y, 0.2, true);
// {
//   X_train: [[...], [...], [...], [...]], // 80% of data (4 samples)
//   X_test: [[...]],                       // 20% of data (1 sample)
//   y_train: [...],                        // 4 labels; order depends on the shuffle
//   y_test: [...]                          // 1 label
// }
```
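
A shuffled split like the one above is commonly implemented with a Fisher-Yates shuffle over the row indices, so that features and labels stay aligned. The sketch below is a standalone illustration: `splitTrainTest`, its `random` parameter, and the use of `Math.round` for the test-set size are assumptions, not Datly's actual code.

```javascript
// Sketch of a shuffled train/test split (illustrative only).
function splitTrainTest(X, y, testSize = 0.2, shuffle = true, random = Math.random) {
  const indices = X.map((_, i) => i);
  if (shuffle) {
    // Fisher-Yates shuffle of the index array.
    for (let i = indices.length - 1; i > 0; i--) {
      const j = Math.floor(random() * (i + 1));
      [indices[i], indices[j]] = [indices[j], indices[i]];
    }
  }

  const nTest = Math.round(X.length * testSize); // rounding rule is an assumption
  const testIdx = indices.slice(0, nTest);
  const trainIdx = indices.slice(nTest);

  return {
    X_train: trainIdx.map(i => X[i]),
    X_test: testIdx.map(i => X[i]),
    y_train: trainIdx.map(i => y[i]),
    y_test: testIdx.map(i => y[i])
  };
}

const parts = splitTrainTest([[1, 2], [2, 3], [3, 4], [4, 5], [5, 6]], [0, 0, 1, 1, 1], 0.2, true);
console.log(parts.X_train.length, parts.X_test.length); // 4 1
```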

---

##### Cross-Validation

```javascript
const model = datly.ml.createKNN({ k: 3 });
const X = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7], [7, 8], [8, 9]];
const y = [0, 0, 0, 0, 1, 1, 1, 1];

const cv = datly.ml.crossValidate(model, X, y, 5, 'classification');
console.log(cv);
// {
//   scores: [1.0, 0.8, 1.0, 0.8, 0.9],
//   meanScore: 0.9,
//   stdScore: 0.089,
//   folds: 5
// }
```
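
In k-fold cross-validation, each sample serves as test data exactly once. The sketch below shows one common way to build the folds from contiguous index ranges; `kFoldIndices` is a hypothetical helper, and whether Datly shuffles or stratifies before folding is not stated in this README.

```javascript
// Sketch of k-fold index generation (illustrative only).
function kFoldIndices(nSamples, k) {
  const folds = [];
  const base = Math.floor(nSamples / k);
  const extra = nSamples % k; // the first `extra` folds get one more sample
  let start = 0;

  for (let fold = 0; fold < k; fold++) {
    const size = base + (fold < extra ? 1 : 0);
    const test = [];
    const train = [];
    for (let i = 0; i < nSamples; i++) {
      if (i >= start && i < start + size) test.push(i);
      else train.push(i);
    }
    folds.push({ train, test });
    start += size;
  }
  return folds;
}

// 8 samples in 5 folds: fold sizes cannot all be equal.
const folds = kFoldIndices(8, 5);
console.log(folds.map(f => f.test.length)); // [ 2, 2, 2, 1, 1 ]
```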

---

##### Compare Models

```javascript
const X = [[1, 2], [2, 3], [3, 4], [4, 5], [5, 6], [6, 7]];
const y = [0, 0, 0, 1, 1, 1];

const models = [
  { name: 'KNN', model: datly.ml.createKNN({ k: 3 }) },
  { name: 'Decision Tree', model: datly.ml.createDecisionTree() },
  { name: 'Logistic Regression', model: datly.ml.createLogisticRegression() }
];

const comparison = datly.ml.compareModels(models, X, y, 'classification');
console.log(comparison);
// {
//   results: [
//     { name: 'KNN', score: 0.95, trainTime: 5, evalTime: 2 },
//     { name: 'Decision Tree', score: 0.90, trainTime: 15, evalTime: 1 },
//     { name: 'Logistic Regression', score: 0.85, trainTime: 100, evalTime: 1 }
//   ],
//   bestModel: { name: 'KNN', score: 0.95, ... },
//   comparison: "📊 MODEL COMPARISON REPORT\n..."
// }
```

---

##### Quick Train (One-liner)

```javascript
// Train and evaluate a model in one line
// (X and y are a feature matrix and target array, as in the examples above)
const result = datly.ml.quickTrain(
  'randomforest', // Model type
  X, // Features
  y, // Target
  {
    taskType: 'classification',
    testSize: 0.2,
    normalize: true,
    nEstimators: 50
  }
);

console.log(result);
// {
//   model: RandomForest {...},
//   score: {
//     accuracy: 0.95,
//     confusionMatrix: {...}
//   },
//   trainTime: 150,
//   summary: {...}
// }
```

---

##### Feature Engineering

```javascript
// Polynomial features
const X = [[1], [2], [3]];
const polyFeatures = datly.ml.polynomialFeatures(X, 2);
console.log(polyFeatures);
// [[1, 1], [2, 4], [3, 9]] // [x, x²]

// Standard scaling
const data = [[1, 2], [3, 4], [5, 6]];
const scaler = datly.ml.standardScaler(data);
console.log(scaler.scaled);
// [[-1.22, -1.22], [0, 0], [1.22, 1.22]]

// Transform new data
const newData = [[2, 3]];
const scaled = scaler.transform(newData);
console.log(scaled);

// Min-Max scaling
const minMaxScaler = datly.ml.minMaxScaler(data, [0, 1]);
console.log(minMaxScaler.scaled);
// [[0, 0], [0.5, 0.5], [1, 1]]
```
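
The `-1.22` and `1.22` values in the standard-scaling output above correspond to z-scores computed with the population standard deviation (dividing by `n`). The sketch below reproduces those numbers in plain JavaScript; `fitStandardScaler` is a hypothetical helper written for illustration, not Datly's code, and it does not guard against constant columns (zero standard deviation).

```javascript
// Sketch of column-wise standard scaling (illustrative only).
function fitStandardScaler(data) {
  const n = data.length;
  const means = data[0].map((_, j) => data.reduce((s, row) => s + row[j], 0) / n);
  // Population standard deviation: divide by n, not n - 1.
  const stds = means.map((m, j) =>
    Math.sqrt(data.reduce((s, row) => s + (row[j] - m) ** 2, 0) / n)
  );
  // Reuse the fitted means and stds for any new rows.
  const transform = rows => rows.map(row => row.map((v, j) => (v - means[j]) / stds[j]));
  return { means, stds, transform, scaled: transform(data) };
}

const round2 = rows => rows.map(row => row.map(v => Number(v.toFixed(2))));

const fitted = fitStandardScaler([[1, 2], [3, 4], [5, 6]]);
console.log(round2(fitted.scaled));              // [ [ -1.22, -1.22 ], [ 0, 0 ], [ 1.22, 1.22 ] ]
console.log(round2(fitted.transform([[2, 3]]))); // [ [ -0.61, -0.61 ] ]
```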

---

##### ROC Curve

```javascript
const yTrue = [0, 0, 1, 1, 1, 0, 1, 0];
const yProba = [0.1, 0.3, 0.6, 0.8, 0.9, 0.2, 0.7, 0.4];

const roc = datly.ml.rocCurve(yTrue, yProba);
console.log(roc);
// {
//   fpr: [0, 0, 0, 0, 0, 0.25, ...],
//   tpr: [0, 0.25, 0.5, 0.75, 1, 1, ...],
//   auc: 1.0, // Area Under Curve (every positive scores above every negative here)
//   thresholds: [...]
// }
```
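
An ROC curve can be traced by sorting samples by descending score and stepping up for each positive and right for each negative. The sketch below does that and integrates the curve with the trapezoidal rule. `rocPoints` is a hypothetical helper that assumes binary `0`/`1` labels and no tied scores. It is not Datly's implementation, and the exact arrays Datly returns may be laid out differently.

```javascript
// Sketch of ROC curve construction and AUC (illustrative only).
function rocPoints(yTrue, yScore) {
  // Sample indices ordered from highest to lowest score.
  const order = yScore.map((_, i) => i).sort((a, b) => yScore[b] - yScore[a]);
  const nPos = yTrue.filter(v => v === 1).length;
  const nNeg = yTrue.length - nPos;

  const fpr = [0];
  const tpr = [0];
  let tp = 0;
  let fp = 0;
  for (const i of order) {
    if (yTrue[i] === 1) tp++;
    else fp++;
    fpr.push(fp / nNeg);
    tpr.push(tp / nPos);
  }

  // Trapezoidal rule over consecutive curve points.
  let auc = 0;
  for (let i = 1; i < fpr.length; i++) {
    auc += ((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1])) / 2;
  }
  return { fpr, tpr, auc };
}

const rocResult = rocPoints(
  [0, 0, 1, 1, 1, 0, 1, 0],
  [0.1, 0.3, 0.6, 0.8, 0.9, 0.2, 0.7, 0.4]
);
console.log(rocResult.auc); // 1
```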

---

##### Precision-Recall Curve

```javascript
const yTrue = [0, 0, 1, 1, 1, 0, 1, 0];
const yProba = [0.1, 0.3, 0.6, 0.8, 0.9, 0.2, 0.7, 0.4];

const pr = datly.ml.precisionRecallCurve(yTrue, yProba);
console.log(pr);
// {
//   precision: [0.5, 0.57, 0.67, ...],
//   recall: [1.0, 1.0, 1.0, ...],
//   thresholds: [...]
// }
```
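
Each point on a precision-recall curve comes from classifying at a single threshold. The sketch below computes one such point; `precisionRecallAt` is a hypothetical helper, and both the `>=` comparison and the convention of reporting precision as `1` when nothing is predicted positive are assumptions rather than documented Datly behaviour.

```javascript
// Sketch: precision and recall at a single threshold (illustrative only).
function precisionRecallAt(yTrue, yScore, threshold) {
  let tp = 0; // true positives
  let fp = 0; // false positives
  let fn = 0; // false negatives
  yTrue.forEach((label, i) => {
    const predicted = yScore[i] >= threshold ? 1 : 0;
    if (predicted === 1 && label === 1) tp++;
    else if (predicted === 1 && label === 0) fp++;
    else if (predicted === 0 && label === 1) fn++;
  });
  return {
    precision: tp + fp === 0 ? 1 : tp / (tp + fp),
    recall: tp + fn === 0 ? 0 : tp / (tp + fn)
  };
}

const labelsPR = [0, 0, 1, 1, 1, 0, 1, 0];
const scoresPR = [0.1, 0.3, 0.6, 0.8, 0.9, 0.2, 0.7, 0.4];

console.log(precisionRecallAt(labelsPR, scoresPR, 0.1)); // { precision: 0.5, recall: 1 }
console.log(precisionRecallAt(labelsPR, scoresPR, 0.7)); // { precision: 1, recall: 0.75 }
```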

---

### 18. Data Visualization

Create interactive D3.js visualizations.

#### Setup

```javascript
// Initialize visualizer
const viz = datly.viz;

// Or create custom container
const customViz = new Datly.Visualizer('my-container-id');
```

---

#### Methods

##### `histogram(data, options)`

```javascript
const ages = [23, 25, 27, 29, 31, 33, 35, 37, 39, 41];

datly.plotHistogram(ages, {
  title: 'Age Distribution',
  xlabel: 'Age',
  ylabel: 'Frequency',
  bins: 10,
  color: '#4299e1',
  width: 800,
  height: 600
});
```
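
The `bins` option controls how many intervals the data range is divided into. As a sketch of equal-width binning, the helper below counts values per bin. `histogramBins` is a hypothetical function and uses 5 bins for brevity. Datly's chart may rely on D3's own binning, which can choose different bin edges.

```javascript
// Sketch of equal-width histogram binning (illustrative only).
function histogramBins(values, binCount) {
  const min = Math.min(...values);
  const max = Math.max(...values);
  const width = (max - min) / binCount || 1; // fall back to width 1 if all values are equal

  const bins = Array.from({ length: binCount }, (_, i) => ({
    x0: min + i * width,
    x1: min + (i + 1) * width,
    count: 0
  }));

  for (const v of values) {
    // Clamp so the maximum value lands in the last bin.
    const index = Math.min(Math.floor((v - min) / width), binCount - 1);
    bins[index].count++;
  }
  return bins;
}

const ageBins = histogramBins([23, 25, 27, 29, 31, 33, 35, 37, 39, 41], 5);
console.log(ageBins.map(b => b.count)); // [ 2, 2, 2, 2, 2 ]
```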

---

##### `boxplot(data, options)`

```javascript
// Single box plot
const data1 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100];
datly.plotBoxplot(data1, {
  title: 'Sales Distribution',
  ylabel: 'Sales ($)'
});

// Multiple box plots
const groupA = [20, 22, 23, 25, 27];
const groupB = [30, 32, 35, 37, 40];
const groupC = [25, 27, 29, 31, 33];

datly.plotBoxplot([groupA, groupB, groupC], {
  title: 'Sales by Region',
  labels: ['North', 'South', 'East'],
  ylabel: 'Sales ($)'
});
```
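
A box plot summarises data through its quartiles and usually flags points beyond 1.5 × IQR from the box as outliers. The sketch below computes those statistics for the single-box data above, using linear interpolation between order statistics. The helper names and the quantile method are assumptions. Datly may use a different quantile definition, which would shift `q1` and `q3` slightly.

```javascript
// Sketch of box plot statistics (illustrative only).
function quantile(sorted, p) {
  // Linear interpolation between the two nearest order statistics.
  const pos = (sorted.length - 1) * p;
  const lower = Math.floor(pos);
  const upper = Math.ceil(pos);
  return sorted[lower] + (sorted[upper] - sorted[lower]) * (pos - lower);
}

function boxplotStats(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const q1 = quantile(sorted, 0.25);
  const median = quantile(sorted, 0.5);
  const q3 = quantile(sorted, 0.75);
  const iqr = q3 - q1;
  const lowerFence = q1 - 1.5 * iqr;
  const upperFence = q3 + 1.5 * iqr;
  return {
    q1,
    median,
    q3,
    iqr,
    outliers: sorted.filter(v => v < lowerFence || v > upperFence)
  };
}

console.log(boxplotStats([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]));
// { q1: 3.25, median: 5.5, q3: 7.75, iqr: 4.5, outliers: [ 100 ] }
```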

---

##### `scatter(x, y, options)`

```javascript
const height = [160, 165, 170, 175, 180, 185];
const weight = [55, 60, 65, 70, 75, 80];

datly.plotScatter(height, weight, {
  title: 'Height vs Weight',
  xlabel: 'Height (cm)',
  ylabel: 'Weight (kg)',
  color: '#e74c3c',
  size: 6,
  labels: ['Person 1', 'Person 2', /* ... */] // Optional
});
```

---

##### `line(x, y, options)`

```javascript
const months = [1, 2, 3, 4, 5, 6];
const revenue = [100, 120, 140, 130, 160, 180];

datly.plotLine(months, revenue, {
  title: 'Monthly Revenue',
  xlabel: 'Month',
  ylabel: 'Revenue ($1000)',
  color: '#2ecc71',
  lineWidth: 3,
  showPoints: true
});
```

---

##### `bar(categories, values, options)`

```javascript
const products = ['Product A', 'Product B', 'Product C'];
const sales = [150, 230, 180];

datly.plotBar(products, sales, {
  title: 'Sales by Product',
  xlabel: 'Product',
  ylabel: 'Sales',
  color: '#f39c12',
  horizontal: false // Set to true for horizontal bars
});
```

---

##### `pie(labels, values, options)`

```javascript
const categories = ['Electronics', 'Clothing', 'Food', 'Other'];
const amounts = [35, 25, 20, 20];

datly.plotPie(categories, amounts, {
  title: 'Sales Distribution',
  showLabels: true,
  showPercentage: true
});
```

---

##### `heatmap(matrix, options)`

```javascript
const correlationMatrix = [
  [1.0, 0.8, 0.3],
  [0.8, 1.0, 0.5],
  [0.3, 0.5, 1.0]
];

datly.plotHeatmap(correlationMatrix, {
  title: 'Correlation Heatmap',
  labels: ['Var1', 'Var2', 'Var3'],
  colorScheme: 'RdBu', // Color scheme
  showValues: true
});
```

---

##### `violin(data, options)`

```javascript
const groupA = [1, 2, 3, 4, 5, 6, 7];
const groupB = [3, 4, 5, 6, 7, 8, 9];

datly.plotViolin([groupA, groupB], {
  title: 'Distribution Comparison',
  labels: ['Control', 'Treatment'],
  ylabel: 'Value',
  color: '#9b59b6'
});
```

---

##### `density(data, options)`

```javascript
const data = [23, 25, 27, 29, 31, 33, 35];

datly.plotDensity(data, {
  title: 'Density Plot',
  xlabel: 'Value',
  ylabel: 'Density',
  color: '#1abc9c',
  bandwidth: null // Auto-calculate or specify
});
```

---

##### `qqplot(data, options)`

```javascript
const data = [2.3, 2.5, 2.7, 2.9, 3.1, 3.3];

datly.plotQQ(data, {
  title: 'Q-Q Plot',
  xlabel: 'Theoretical Quantiles',
  ylabel: 'Sample Quantiles',
  color: '#34495e'
});
```

---

##### `parallel(data, dimensions, options)`

```javascript
const data = [
  { age: 25, income: 30000, spending: 15000, satisfaction: 7 },
  { age: 30, income: 45000, spending: 20000, satisfaction: 8 },
  { age: 35, income: 60000, spending: 25000, satisfaction: 9 }
];

datly.plotParallel(data, ['age', 'income', 'spending', 'satisfaction'], {
  title: 'Parallel Coordinates Plot',
  colors: ['#e74c3c', '#3498db', '#2ecc71']
});
```

---

##### `pairplot(data, columns, options)`

```javascript
const data = [
  { height: 160, weight: 55, age: 25 },
  { height: 165, weight: 60, age: 30 },
  { height: 170, weight: 65, age: 35 }
];

datly.plotPairplot(data, ['height', 'weight', 'age'], {
  title: 'Pair Plot',
  color: '#3498db',
  size: 3
});
```

---

##### `multiline(series, options)`

```javascript
const series = [
  {
    name: 'Series A',
    data: [
      { x: 1, y: 10 },
      { x: 2, y: 15 },
      { x: 3, y: 12 }
    ]
  },
  {
    name: 'Series B',
    data: [
      { x: 1, y: 5 },
      { x: 2, y: 8 },
      { x: 3, y: 10 }
    ]
  }
];

datly.plotMultiline(series, {
  title: 'Multiple Time Series',
  xlabel: 'Time',
  ylabel: 'Value',
  legend: true
});
```

---

##### Special Visualization Methods

```javascript
// Correlation matrix heatmap from dataset
const data = {
  headers: ['var1', 'var2', 'var3'],
  data: [
    { var1: 1, var2: 2, var3: 3 },
    { var1: 2, var2: 4, var3: 5 },
    { var1: 3, var2: 5, var3: 7 }
  ]
};

datly.plotCorrelationMatrix(data, {
  title: 'Correlation Matrix'
});

// Distribution plot from dataset column
datly.plotDistribution(data, 'var1', {
  title: 'Distribution of var1'
});

// Compare multiple distributions
datly.plotMultipleDistributions(data, ['var1', 'var2', 'var3'], {
  title: 'Distribution Comparison'
});
```

---

## 🎯 Complete Examples

### Example 1: Comprehensive Data Analysis

```javascript
const datly = new Datly();

// Load data
const data = datly.dataLoader.loadCSV('sales_data.csv');

// Validate data
const validation = datly.validator.validateData(data);
if (!validation.valid) {
  console.error('Data validation failed:', validation.errors);
}

// Get descriptive statistics
const sales = datly.dataLoader.getColumn(data, 'sales');
console.log('Mean Sales:', datly.centralTendency.mean(sales));
console.log('Median Sales:', datly.centralTendency.median(sales));
console.log('Std Dev:', datly.dispersion.standardDeviation(sales));

// Check for outliers
const outliers = datly.utils.detectOutliers(sales, 'iqr');
console.log('Outliers:', outliers);

// Test normality
const normalityTest = datly.normalityTests.shapiroWilk(sales);
console.log('Is Normal:', normalityTest.isNormal);

// Generate report
const report = datly.reportGenerator.summary(data);
console.log(report);

datly.plotHistogram(sales, { title: 'Sales Distribution' });
datly.plotBoxplot(sales, { title: 'Sales Box Plot' });
```

---

### Example 2: Hypothesis Testing Workflow

```javascript
const datly = new Datly();

// Two groups to compare
const controlGroup = [23, 25, 27, 29, 31, 33];
const treatmentGroup = [28, 30, 32, 34, 36, 38];

// Perform t-test
const tTest = datly.hypothesisTesting.tTest(
  controlGroup,
  treatmentGroup,
  'two-sample'
);

// Interpret results
const interpretation = datly.interpreter.interpret(tTest);
console.log(interpretation.plainLanguage);
console.log('Decision:', interpretation.conclusion.decision);
console.log('Effect Size:', interpretation.effectSize.magnitude);

// Calculate confidence interval for difference
const ci = datly.confidenceIntervals.meanDifference(
  controlGroup,
  treatmentGroup,
  0.95
);
console.log('95% CI for difference:', ci.lowerBound, 'to', ci.upperBound);

datly.plotBoxplot([controlGroup, treatmentGroup], {
  title: 'Control vs Treatment',
  labels: ['Control', 'Treatment']
});
```
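
For readers who want to check the test statistic by hand, the sketch below computes a two-sample t statistic with Welch's formula for the two groups above. Whether Datly's `'two-sample'` option uses Welch's or the pooled-variance form is not stated in this README. For these groups, which have equal sizes and equal variances, both forms give the same `t` and 10 degrees of freedom. The helper names are hypothetical.

```javascript
// Sketch of a two-sample (Welch) t statistic (illustrative only).
function mean(values) {
  return values.reduce((s, v) => s + v, 0) / values.length;
}

function sampleVariance(values) {
  const m = mean(values);
  return values.reduce((s, v) => s + (v - m) ** 2, 0) / (values.length - 1);
}

function welchT(a, b) {
  const va = sampleVariance(a) / a.length; // squared standard error of mean(a)
  const vb = sampleVariance(b) / b.length; // squared standard error of mean(b)
  const t = (mean(a) - mean(b)) / Math.sqrt(va + vb);
  // Welch-Satterthwaite approximation of the degrees of freedom.
  const df = (va + vb) ** 2 / (va ** 2 / (a.length - 1) + vb ** 2 / (b.length - 1));
  return { t, df };
}

const welch = welchT([23, 25, 27, 29, 31, 33], [28, 30, 32, 34, 36, 38]);
console.log(welch.t.toFixed(3), welch.df.toFixed(1)); // -2.315 10.0
```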

---

### Example 3: Correlation and Regression

```javascript
const datly = new Datly();

const data = {
  headers: ['advertising', 'sales'],
  data: [
    { advertising: 10, sales: 100 },
    { advertising: 15, sales: 150 },
    { advertising: 20, sales: 180 },
    { advertising: 25, sales: 230 },
    { advertising: 30, sales: 270 }
  ]
};

// Extract columns
const advertising = datly.dataLoader.getColumn(data, 'advertising');
const sales = datly.dataLoader.getColumn(data, 'sales');

// Calculate correlation
const correlation = datly.correlation.pearson(advertising, sales);
console.log('Correlation:', correlation.correlation);
console.log('P-value:', correlation.pValue);

// Fit regression model
const model = datly.regression.linear(advertising, sales);
console.log('Equation:', model.equation);
console.log('R²:', model.rSquared);
console.log('Model significant:', model.pValueModel < 0.05);

// Make prediction
const newAdvertising = [35];
const prediction = datly.regression.predict(model, newAdvertising);
console.log('Predicted sales for $35k advertising:', prediction[0]);

datly.plotScatter(advertising, sales, {
  title: 'Advertising vs Sales',
  xlabel: 'Advertising Budget ($1000)',
  ylabel: 'Sales ($1000)'
});
```
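
As a cross-check on the numbers this example should produce, the sketch below computes the Pearson correlation and the ordinary least-squares line for the same five data points in plain JavaScript. `linearFit` is a hypothetical helper, not a Datly function. Datly's output fields and rounding may differ.

```javascript
// Sketch of Pearson correlation and simple linear regression (illustrative only).
function linearFit(x, y) {
  const n = x.length;
  const meanX = x.reduce((s, v) => s + v, 0) / n;
  const meanY = y.reduce((s, v) => s + v, 0) / n;

  let sxy = 0; // sum of cross-deviations
  let sxx = 0; // sum of squared x deviations
  let syy = 0; // sum of squared y deviations
  for (let i = 0; i < n; i++) {
    sxy += (x[i] - meanX) * (y[i] - meanY);
    sxx += (x[i] - meanX) ** 2;
    syy += (y[i] - meanY) ** 2;
  }

  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  const r = sxy / Math.sqrt(sxx * syy);
  return { slope, intercept, r, rSquared: r * r, predict: v => intercept + slope * v };
}

const fit = linearFit([10, 15, 20, 25, 30], [100, 150, 180, 230, 270]);
console.log(fit.slope.toFixed(2), fit.intercept.toFixed(2)); // 8.40 18.00
console.log(fit.r.toFixed(4));                               // 0.9977
console.log(fit.predict(35).toFixed(1));                     // 312.0
```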

---

### Example 4: Machine Learning Pipeline

```javascript
const datly = new Datly();

// Load data
const data = datly.dataLoader.loadJSON('iris.json');

// Prepare features and target
const X = data.data.map(row => [
  row.sepal_length,
  row.sepal_width,
  row.petal_length,
  row.petal_width
]);
const y = data.data.map(row => row.species);

// Split data
const split = datly.ml.trainTestSplit(X, y, 0.2, true);

// Create and train model
const model = datly.ml.createRandomForest({
  nEstimators: 100,
  maxDepth: 10
});

model.fit(split.X_train, split.y_train, 'classification');

// Evaluate
const score = model.score(split.X_test, split.y_test);
console.log('Accuracy:', score.accuracy);
console.log('Confusion Matrix:', score.confusionMatrix.display);

// Cross-validation
const cv = datly.ml.crossValidate(model, X, y, 5, 'classification');
console.log('CV Mean Score:', cv.meanScore);
console.log('CV Std:', cv.stdScore);

// Feature importance
const importance = model.getFeatureImportance();
console.log('Feature Importance:', importance);
```

---

### Example 5: Automatic Analysis

```javascript
const datly = new Datly();

// Load your data
const data = datly.dataLoader.loadCSV('customer_data.csv');

// Run automatic analysis
const analysis = datly.autoAnalyzer.autoAnalyze(data, {
  minCorrelationThreshold: 0.5,
  significanceLevel: 0.05,
  generateVisualizations: true
});

// View insights
analysis.insights.forEach(insight => {
  console.log(`${insight.icon} [${insight.priority}] ${insight.title}`);
  console.log(`   ${insight.description}`);
  if (insight.recommendation) {
    console.log(`   → ${insight.recommendation}`);
  }
});

// View recommended visualizations
analysis.visualizationSuggestions.forEach(viz => {
  console.log(`📊 ${viz.title} (${viz.type}) - Priority: ${viz.priority}`);
});

// Generate report
const textReport = datly.reportGenerator.exportSummary(
  analysis,
  'text'
);
console.log(textReport);
```

---

## 📖 API Reference

### Core Classes

- **`DataLoader`**: Load and manipulate datasets
- **`Validator`**: Validate data integrity
- **`Utils`**: Utility functions for data analysis
- **`CentralTendency`**: Mean, median, mode calculations
- **`Dispersion`**: Variance, standard deviation measures
- **`Position`**: Quantiles, percentiles, rankings
- **`Shape`**: Skewness and kurtosis analysis
- **`HypothesisTesting`**: Statistical tests
- **`ConfidenceIntervals`**: Interval estimation
- **`NormalityTests`**: Test for normal distribution
- **`Correlation`**: Correlation analysis
- **`Regression`**: Regression modeling
- **`ReportGenerator`**: Generate statistical reports
- **`PatternDetector`**: Detect patterns in data
- **`Interpreter`**: Interpret statistical results
- **`AutoAnalyzer`**: Automated analysis
- **`ML`**: Machine learning models
- **`Visualizer`**: Data visualization

---

## 🌐 Browser Support

- Chrome (latest)
- Firefox (latest)
- Safari (latest)
- Edge (latest)

**Requirements:**
- Modern JavaScript (ES6+)

---

## 🤝 Contributing

Contributions are welcome! Please follow these steps:

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request

---

## 📝 License

MIT License

Copyright (c) 2025 Datly

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software.

---

## 📧 Contact

For questions, issues, or feature requests:
- GitHub Issues: [github.com/yourrepo/datly/issues](https://github.com/yourrepo/datly/issues)
- NPM Package: [npmjs.com/package/datly](https://npmjs.com/package/datly)

**Made with ❤️ for the data science community**