@winm2m/inferential-stats-js 0.1.4 → 0.1.6
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/README.md +59 -59
- package/package.json +1 -1
package/README.md
CHANGED
@@ -70,7 +70,7 @@

This section documents the mathematical foundations and internal Python implementations of all 16 analyses.

> **Note on math rendering:** Equations are rendered as images so they display correctly on npm.

---
@@ -84,9 +84,9 @@ Computes a frequency distribution for a categorical variable, including absolute

**Relative frequency:**

$$f_i=\frac{n_i}{N}$$

where $n_i$ is the count of category $i$ and $N$ is the total number of observations. Cumulative percentage is the running sum of $f_i$.

---
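The per-category counts and running percentages described above can be sketched in a few lines of plain Python (an illustrative helper, not the package's internal code):

```python
from collections import Counter

def frequency_table(values):
    """Absolute, relative, and cumulative frequencies for a categorical variable."""
    counts = Counter(values)
    n = len(values)
    table, cumulative = [], 0.0
    # Sort by descending frequency, as SPSS-style frequency tables usually do
    for category, count in counts.most_common():
        rel = count / n           # f_i = n_i / N
        cumulative += rel * 100   # cumulative % is the running sum of f_i
        table.append((category, count, rel, cumulative))
    return table

rows = frequency_table(["a", "b", "a", "c", "a", "b"])
```

The last row's cumulative percentage always reaches 100 by construction.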
@@ -98,19 +98,19 @@ Produces summary statistics for one or more numeric variables: count, mean, stan

**Arithmetic mean:**

$$\bar{x}=\frac{1}{n}\sum_{i=1}^{n}x_i$$

**Sample standard deviation (Bessel-corrected):**

$$s=\sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2}$$

**Skewness (Fisher):**

$$g_1=\frac{m_3}{m_2^{3/2}},\qquad m_k=\frac{1}{n}\sum_{i=1}^{n}(x_i-\bar{x})^k$$

**Excess kurtosis (Fisher):**

$$g_2=\frac{m_4}{m_2^{2}}-3$$

---
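The moment-based definitions above can be checked numerically. This sketch uses the uncorrected Fisher $g$ statistics; library implementations may apply additional small-sample bias corrections:

```python
import numpy as np

def describe(x):
    """Mean, Bessel-corrected SD, and Fisher skewness / excess kurtosis
    from the central moments m_k = (1/n) * sum((x_i - mean)^k)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean()
    s = x.std(ddof=1)                      # sample SD (Bessel-corrected)
    m2 = ((x - mean) ** 2).mean()
    m3 = ((x - mean) ** 3).mean()
    m4 = ((x - mean) ** 4).mean()
    g1 = m3 / m2 ** 1.5                    # Fisher skewness
    g2 = m4 / m2 ** 2 - 3                  # Fisher excess kurtosis
    return mean, s, g1, g2

mean, s, g1, g2 = describe([2, 4, 4, 4, 5, 5, 7, 9])
```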
@@ -122,15 +122,15 @@ Cross-tabulates two categorical variables and tests for independence using Pears

**Pearson's Chi-square statistic:**

$$\chi^2=\sum_{i}\sum_{j}\frac{(O_{ij}-E_{ij})^2}{E_{ij}}$$

where $O_{ij}$ is the observed frequency in cell $(i,j)$ and $E_{ij}$ is the expected frequency under independence.

**Cramér's V:**

$$V=\sqrt{\frac{\chi^2}{n\,k}}$$

where $k=\min(r-1,\ c-1)$.

---
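Assuming SciPy is available (the package runs Python via Pyodide, though its internal code is not shown in this diff), the chi-square and Cramér's V computations can be illustrated directly; `correction=False` matches the plain Pearson formula above:

```python
import numpy as np
from scipy.stats import chi2_contingency

table = np.array([[10, 20],
                  [20, 10]])

# Pearson chi-square without Yates continuity correction
chi2, p, dof, expected = chi2_contingency(table, correction=False)

n = table.sum()
k = min(table.shape[0] - 1, table.shape[1] - 1)
cramers_v = np.sqrt(chi2 / (n * k))
```

For this symmetric 2×2 table every expected count is 15, so $\chi^2 = 4 \cdot 25/15 = 20/3$.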
@@ -144,15 +144,15 @@ Compares the means of a numeric variable between two independent groups. Automat

**T-statistic (equal variance assumed):**

$$t=\frac{\bar{x}_1-\bar{x}_2}{s_p\sqrt{\frac{1}{n_1}+\frac{1}{n_2}}}$$

**Pooled standard deviation:**

$$s_p=\sqrt{\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2}}$$

**Degrees of freedom:** $df=n_1+n_2-2$

When Levene's test is significant ($p<0.05$), Welch's t-test is recommended, which uses the Welch–Satterthwaite approximation for degrees of freedom.

---
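The decision rule described above (Levene's test, then Student's vs Welch's t) can be sketched with `scipy.stats`; the manual pooled-SD computation mirrors the formula and should agree with the equal-variance result:

```python
import numpy as np
from scipy import stats

a = np.array([5.1, 4.9, 6.0, 5.5, 5.2, 4.8])
b = np.array([6.2, 6.8, 5.9, 6.5, 6.1, 6.6])

# Levene's test for equality of variances guides the choice of t-test
_, p_levene = stats.levene(a, b)

t_student, p_student = stats.ttest_ind(a, b, equal_var=True)   # pooled
t_welch, p_welch = stats.ttest_ind(a, b, equal_var=False)      # Welch

# Manual pooled-SD version of the equal-variance formula
n1, n2 = len(a), len(b)
sp = np.sqrt(((n1 - 1) * a.var(ddof=1) + (n2 - 1) * b.var(ddof=1))
             / (n1 + n2 - 2))
t_manual = (a.mean() - b.mean()) / (sp * np.sqrt(1 / n1 + 1 / n2))
```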
@@ -164,11 +164,11 @@ Tests whether the mean difference between two paired measurements is significant

**T-statistic:**

$$t=\frac{\bar{d}}{s_d/\sqrt{n}}$$

where $\bar{d}$ is the mean difference and $s_d$ is the standard deviation of the differences.

**Degrees of freedom:** $df=n-1$

---
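A sketch of the paired t-test, comparing `scipy.stats.ttest_rel` against the manual formula above (illustrative data, not package output):

```python
import numpy as np
from scipy import stats

before = np.array([12.0, 14.5, 11.2, 13.8, 12.9])
after = np.array([13.1, 15.0, 12.0, 14.2, 13.5])

t_scipy, p = stats.ttest_rel(after, before)

# Manual computation: t = dbar / (s_d / sqrt(n)), df = n - 1
d = after - before
t_manual = d.mean() / (d.std(ddof=1) / np.sqrt(len(d)))
```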
@@ -180,23 +180,23 @@ Tests whether the means of a numeric variable differ significantly across three

**F-statistic:**

$$F=\frac{MS_B}{MS_W}$$

**Sum of Squares Between Groups:**

$$SS_B=\sum_{j=1}^{k}n_j(\bar{x}_j-\bar{x})^2$$

**Sum of Squares Within Groups:**

$$SS_W=\sum_{j=1}^{k}\sum_{i=1}^{n_j}(x_{ij}-\bar{x}_j)^2$$

**Mean Squares:**

$$MS_B=\frac{SS_B}{k-1},\qquad MS_W=\frac{SS_W}{N-k}$$

**Effect size (Eta-squared):**

$$\eta^2=\frac{SS_B}{SS_B+SS_W}$$

---
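The ANOVA decomposition above can be verified against `scipy.stats.f_oneway` on toy data (a sketch, not the package's internal code):

```python
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([1.0, 2.0, 3.0])]

F_scipy, p = stats.f_oneway(*groups)

# Manual sums of squares matching the formulas above
allx = np.concatenate(groups)
grand = allx.mean()
ss_b = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)
ss_w = sum(((g - g.mean()) ** 2).sum() for g in groups)
k, N = len(groups), len(allx)
ms_b, ms_w = ss_b / (k - 1), ss_w / (N - k)
F_manual = ms_b / ms_w
eta_sq = ss_b / (ss_b + ss_w)
```

Here $SS_B = 54$, $SS_W = 6$, so $F = 27$ and $\eta^2 = 0.9$.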
@@ -208,9 +208,9 @@ Performs pairwise comparisons of group means following a significant ANOVA resul

**Studentized range statistic:**

$$q=\frac{\bar{x}_{\max}-\bar{x}_{\min}}{\sqrt{MS_W/n}}$$

where $MS_W$ is the within-group mean square from the ANOVA and $n$ is the harmonic mean of group sizes. The critical $q$ value is obtained from the Studentized Range distribution with $k$ groups and $N-k$ degrees of freedom.

---
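A sketch of the $q$ statistic for the extreme pair of group means; the p-value here uses `scipy.stats.studentized_range` (SciPy ≥ 1.7), which may differ from the package's internal approach:

```python
import numpy as np
from scipy import stats

groups = [np.array([4.0, 5.0, 6.0]),
          np.array([7.0, 8.0, 9.0]),
          np.array([1.0, 2.0, 3.0])]
k = len(groups)
N = sum(len(g) for g in groups)

# Within-group mean square from the ANOVA decomposition
ms_w = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)

# Harmonic mean of the group sizes
n_h = k / sum(1 / len(g) for g in groups)

means = [g.mean() for g in groups]
q = (max(means) - min(means)) / np.sqrt(ms_w / n_h)

# p-value from the Studentized Range distribution (k groups, N - k df)
p = stats.studentized_range.sf(q, k, N - k)
```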
@@ -218,43 +218,43 @@ where $MS_W$ is the within-group mean square from the ANOVA and $n$ is the harmo

#### Linear Regression (OLS)

Fits an Ordinary Least Squares regression model with one or more independent variables. Reports regression coefficients, standard errors, t-statistics, p-values, confidence intervals, $R^2$, adjusted $R^2$, F-test, and the Durbin-Watson statistic for autocorrelation detection.

**Python implementation:** `statsmodels.api.OLS`

**Model:**

$$Y=\beta_0+\beta_1X_1+\cdots+\beta_pX_p+\varepsilon$$

where $\varepsilon\sim N(0,\sigma^2)$.

**OLS estimator:**

$$\hat{\beta}=(X^TX)^{-1}X^TY$$

**Coefficient of determination:**

$$R^2=1-\frac{SS_{res}}{SS_{tot}}$$

where $SS_{res}=\sum_i(y_i-\hat{y}_i)^2$ and $SS_{tot}=\sum_i(y_i-\bar{y})^2$.

---
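The closed-form estimator can be demonstrated directly with NumPy (the package itself reports results via `statsmodels.api.OLS`; this is only a sketch of the algebra):

```python
import numpy as np

# Toy data generated from y = 1 + 2x (exact fit, so R^2 should be 1)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 1.0 + 2.0 * x

# Design matrix with an intercept column
X = np.column_stack([np.ones_like(x), x])

# OLS estimator: beta_hat = (X^T X)^{-1} X^T y (solved, not inverted)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ beta_hat
ss_res = ((y - y_hat) ** 2).sum()
ss_tot = ((y - y.mean()) ** 2).sum()
r_squared = 1 - ss_res / ss_tot
```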
#### Binary Logistic Regression

Models the probability of a binary outcome as a function of one or more independent variables. Reports coefficients (log-odds), odds ratios, z-statistics, p-values, pseudo-$R^2$, AIC, and BIC.

**Python implementation:** `statsmodels.discrete.discrete_model.Logit`

**Logit link function:**

$$\ln\left(\frac{p}{1-p}\right)=\beta_0+\beta_1X_1+\cdots+\beta_pX_p$$

**Predicted probability:**

$$p=\frac{1}{1+e^{-(\beta_0+\beta_1X_1+\cdots+\beta_pX_p)}}$$

Coefficients are estimated by Maximum Likelihood Estimation (MLE). The odds ratio for predictor $j$ is $e^{\beta_j}$.

---
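Given hypothetical fitted coefficients (not real package output), the predicted probability and odds ratio follow directly from the formulas above:

```python
import numpy as np

# Hypothetical coefficients (beta_0, beta_1) for a single predictor
beta = np.array([-1.0, 0.8])

def predicted_prob(x):
    """p = 1 / (1 + exp(-(b0 + b1 * x)))"""
    eta = beta[0] + beta[1] * x
    return 1.0 / (1.0 + np.exp(-eta))

p = predicted_prob(np.array([0.0, 1.25, 5.0]))

# Odds ratio for the predictor: e^{beta_1}
odds_ratio = np.exp(beta[1])
```

At $x = 1.25$ the linear predictor is exactly 0, so the probability is 0.5.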
@@ -264,15 +264,15 @@ Extends binary logistic regression to outcomes with more than two unordered cate

**Python implementation:** `sklearn.linear_model.LogisticRegression(multi_class='multinomial')`

**Log-odds relative to reference category $K$:**

$$\ln\left(\frac{P(Y=k)}{P(Y=K)}\right)=\beta_{k0}+\beta_{k1}X_1+\cdots+\beta_{kp}X_p$$

for each category $k \neq K$.

**Predicted probability via softmax:**

$$P(Y=k)=\frac{e^{\beta_{k0}+\beta_{k1}X_1+\cdots+\beta_{kp}X_p}}{\sum_{j=1}^{K}e^{\beta_{j0}+\beta_{j1}X_1+\cdots+\beta_{jp}X_p}}$$

---
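A minimal softmax sketch with hypothetical coefficients (the coefficient matrix `B` below is invented for illustration, not produced by the package):

```python
import numpy as np

# Hypothetical coefficients for K = 3 categories (rows): intercept, b1, b2.
# The last category is the reference, with all-zero coefficients.
B = np.array([[ 0.2, 1.0, -0.5],
              [-0.3, 0.4,  0.9],
              [ 0.0, 0.0,  0.0]])

def softmax_probs(x1, x2):
    """P(Y=k) via the softmax over per-category linear predictors."""
    eta = B @ np.array([1.0, x1, x2])   # linear predictor per category
    e = np.exp(eta - eta.max())          # subtract max for numerical stability
    return e / e.sum()

probs = softmax_probs(0.5, -1.0)
```

The probabilities are strictly positive and sum to 1 by construction.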
@@ -280,15 +280,15 @@ for each category $k \neq K$.

#### K-Means Clustering

Partitions observations into $k$ clusters by iteratively assigning points to the nearest centroid and updating centroids until convergence.

**Python implementation:** `sklearn.cluster.KMeans`

**Objective function (inertia):**

$$J=\sum_{j=1}^{k}\sum_{x_i\in C_j}\|x_i-\mu_j\|^2$$

where $C_j$ is the set of observations in cluster $j$ and $\mu_j$ is its centroid. The algorithm minimizes $J$ using Lloyd's algorithm (Expectation-Maximization style).

---
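Lloyd's algorithm as described can be sketched in NumPy (illustrative only; the package uses `sklearn.cluster.KMeans`):

```python
import numpy as np

rng = np.random.default_rng(0)
# Two well-separated blobs of 20 points each
X = np.vstack([rng.normal(0.0, 0.3, (20, 2)),
               rng.normal(5.0, 0.3, (20, 2))])

def kmeans(X, k, iters=50):
    """Minimal Lloyd's algorithm: assign to nearest centroid, then update."""
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        # Distance of every point to every centroid
        d = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    inertia = sum(((X[labels == j] - centroids[j]) ** 2).sum()
                  for j in range(k))
    return labels, centroids, inertia

labels, centroids, inertia = kmeans(X, 2)
```

On well-separated blobs the final assignment recovers the two groups.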
@@ -300,9 +300,9 @@ Builds a hierarchy of clusters using a bottom-up approach. Supports Ward, comple

**Ward's minimum variance method** (default):

$$\Delta(A,B)=\frac{n_A n_B}{n_A+n_B}\|\bar{x}_A-\bar{x}_B\|^2$$

At each step, the pair of clusters $(A, B)$ that produces the smallest increase in total within-cluster variance is merged. Ward's method tends to produce compact, equally sized clusters.

---
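Assuming SciPy's hierarchical clustering tools (the package's internal implementation is not shown in this diff), Ward linkage can be sketched as:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(1)
# Two tight, well-separated blobs of 10 points each
X = np.vstack([rng.normal(0.0, 0.2, (10, 2)),
               rng.normal(4.0, 0.2, (10, 2))])

# Ward linkage merges, at each step, the pair of clusters with the
# smallest increase in total within-cluster variance
Z = linkage(X, method="ward")

# Cut the dendrogram into 2 clusters
labels = fcluster(Z, t=2, criterion="maxclust")
```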
@@ -316,15 +316,15 @@ Discovers latent factors underlying a set of observed variables. Supports varima

**Factor model:**

$$X=\Lambda F+\varepsilon$$

where $X$ is the observed variable vector, $\Lambda$ is the matrix of factor loadings, $F$ is the vector of latent factors, and $\varepsilon$ is the unique variance.

**Kaiser-Meyer-Olkin (KMO) measure:**

$$KMO=\frac{\sum_{i\neq j}r_{ij}^2}{\sum_{i\neq j}r_{ij}^2+\sum_{i\neq j}p_{ij}^2}$$

where $r_{ij}$ are elements of the correlation matrix and $p_{ij}$ are elements of the partial correlation matrix. KMO values above 0.6 are generally considered acceptable for factor analysis.

---
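A sketch of the overall KMO measure, deriving partial correlations from the inverse (precision) of the correlation matrix; this is an illustration of the formula, not the package's code:

```python
import numpy as np

def kmo(X):
    """Overall KMO from correlation and partial-correlation matrices."""
    R = np.corrcoef(X, rowvar=False)
    # Partial correlations: p_ij = -P_ij / sqrt(P_ii * P_jj), P = R^{-1}
    P = np.linalg.inv(R)
    d = np.sqrt(np.outer(np.diag(P), np.diag(P)))
    partial = -P / d
    off = ~np.eye(R.shape[0], dtype=bool)   # off-diagonal mask
    r2 = (R[off] ** 2).sum()
    p2 = (partial[off] ** 2).sum()
    return r2 / (r2 + p2)

rng = np.random.default_rng(2)
# Four items sharing one latent factor plus noise
f = rng.normal(size=200)
X = np.column_stack([f + rng.normal(scale=0.7, size=200) for _ in range(4)])
value = kmo(X)
```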
@@ -334,15 +334,15 @@ Finds orthogonal components that maximize variance in the data. Reports componen

**Python implementation:** `sklearn.decomposition.PCA`

**Objective:** Find the weight vector $w$ that maximizes projected variance:

$$\mathrm{Var}(Xw)\to\max\quad\text{subject to}\quad\|w\|=1$$

This is equivalent to finding the eigenvectors of the covariance matrix $\Sigma$. The eigenvalues $\lambda_k$ represent the variance explained by each component.

**Explained variance ratio:**

$$\frac{\lambda_k}{\sum_{j=1}^{p}\lambda_j}$$

---
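The eigendecomposition view can be verified with NumPy (the package itself uses `sklearn.decomposition.PCA`; this sketch only illustrates the algebra):

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(100, 3))
X[:, 2] = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=100)  # correlated column

# Covariance matrix of the centered data
Xc = X - X.mean(axis=0)
cov = Xc.T @ Xc / (len(X) - 1)

# Eigendecomposition (symmetric matrix -> eigh), sorted descending
eigvals, eigvecs = np.linalg.eigh(cov)
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

explained_ratio = eigvals / eigvals.sum()
```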
@@ -354,9 +354,9 @@ Projects high-dimensional data into a lower-dimensional space (typically 2D) whi

**Stress function (Kruskal's Stress-1):**

$$\text{Stress-1}=\sqrt{\frac{\sum_{i<j}(d_{ij}-\hat{d}_{ij})^2}{\sum_{i<j}d_{ij}^2}}$$

where $\hat{d}_{ij}$ is the distance in the reduced space and $d_{ij}$ is the original distance (or a monotonic transformation for non-metric MDS). A stress value below 0.1 is generally considered a good fit.

---
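Stress-1 is straightforward to compute directly from two distance matrices (an illustrative helper, not the package's code):

```python
import numpy as np

def stress1(D_orig, D_reduced):
    """Kruskal's Stress-1 over the upper triangle of two distance matrices."""
    iu = np.triu_indices_from(D_orig, k=1)
    num = ((D_orig[iu] - D_reduced[iu]) ** 2).sum()
    den = (D_orig[iu] ** 2).sum()
    return np.sqrt(num / den)

D = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.5],
              [2.0, 1.5, 0.0]])

perfect = stress1(D, D)           # identical configurations -> stress 0
distorted = stress1(D, 1.1 * D)   # uniform 10% stretch -> stress 0.1
```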
|
@@ -370,17 +370,17 @@ Measures the internal consistency (reliability) of a set of scale items. Reports
|
|
|
370
370
|
|
|
371
371
|
**Cronbach's alpha (raw):**
|
|
372
372
|
|
|
373
|
-
>)
|
|
374
374
|
|
|
375
|
-
where
|
|
375
|
+
where  is the number of items,  is the variance of item i, and  is the variance of the total score.
|
|
376
376
|
|
|
377
377
|
**Standardized alpha (based on mean inter-item correlation):**
|
|
378
378
|
|
|
379
|
-
%5Cbar%7Br%7D%7D>)
|
|
380
380
|
|
|
381
|
-
where
|
|
381
|
+
where  is the mean of all pairwise Pearson correlations among items.
|
|
382
382
|
|
|
383
|
-
|
|
|
383
|
+
| Alpha Range | Interpretation |
|
|
384
384
|
|---|---|
|
|
385
385
|
| ≥ 0.9 | Excellent |
|
|
386
386
|
| 0.8 – 0.9 | Good |
|
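A direct NumPy sketch of raw alpha (an illustrative helper, not the package's internals); near-duplicate items should yield alpha close to 1:

```python
import numpy as np

def cronbach_alpha(items):
    """Raw alpha for an (n_respondents, k_items) array."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()   # sum of item variances
    total_var = items.sum(axis=1).var(ddof=1)     # variance of total score
    return k / (k - 1) * (1 - item_vars / total_var)

rng = np.random.default_rng(4)
base = rng.normal(size=50)
# Four near-duplicate items (tiny independent noise) -> high reliability
items = np.column_stack([base + rng.normal(scale=0.05, size=50)
                         for _ in range(4)])
alpha = cronbach_alpha(items)
```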
package/package.json
CHANGED
@@ -1,6 +1,6 @@
 {
   "name": "@winm2m/inferential-stats-js",
-  "version": "0.1.4",
+  "version": "0.1.6",
   "description": "A headless JavaScript SDK for advanced statistical analysis in the browser using WebAssembly (Pyodide). Performs SPSS-level inferential statistics entirely client-side with no backend required.",
   "author": "Youngjune Kwon <yjkwon@winm2m.com>",
   "license": "MIT",