distclassipy 0.1.0 (py3-none-any.whl) → 0.1.2 (py3-none-any.whl)
- distclassipy/classifier.py +141 -58
- distclassipy/distances.py +389 -226
- {distclassipy-0.1.0.dist-info → distclassipy-0.1.2.dist-info}/METADATA +64 -13
- distclassipy-0.1.2.dist-info/RECORD +9 -0
- {distclassipy-0.1.0.dist-info → distclassipy-0.1.2.dist-info}/WHEEL +1 -1
- distclassipy-0.1.0.dist-info/RECORD +0 -9
- {distclassipy-0.1.0.dist-info → distclassipy-0.1.2.dist-info}/LICENSE +0 -0
- {distclassipy-0.1.0.dist-info → distclassipy-0.1.2.dist-info}/top_level.txt +0 -0
distclassipy/distances.py
CHANGED
|
@@ -4,38 +4,26 @@ A module providing a variety of distance metrics to calculate the distance betwe
|
|
|
4
4
|
This module includes implementations of various distance metrics, including both common and less
|
|
5
5
|
common measures. It allows for the calculation of distances between data points in a vectorized
|
|
6
6
|
manner using numpy arrays.
|
|
7
|
-
|
|
8
|
-
https://github.com/aziele/statistical-distances/blob/04412b3155c59fc7238b3d8ecf6f3723ac5befff/distance.py
|
|
9
|
-
|
|
10
|
-
It
|
|
11
|
-
|
|
12
|
-
|
|
13
|
-
|
|
14
|
-
|
|
15
|
-
|
|
16
|
-
|
|
17
|
-
|
|
18
|
-
|
|
19
|
-
|
|
20
|
-
|
|
21
|
-
|
|
22
|
-
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
3. Ruzicka distance
|
|
28
|
-
4. Inner product distance
|
|
29
|
-
5. Harmonic mean distance
|
|
30
|
-
6. Fidelity
|
|
31
|
-
7. Minimimum Symmetric Chi Squared
|
|
32
|
-
8. Probabilistic Symmetric Chi Squared
|
|
33
|
-
|
|
34
|
-
In addition, the following code was added to all functions for array conversion:
|
|
35
|
-
u,v = np.asarray(u), np.asarray(v)
|
|
36
|
-
|
|
37
|
-
Todos:
|
|
38
|
-
ALSO COMPARE RUNTIME OF THIS v/s custom v/s Tschopp
|
|
7
|
+
A part of this code is based on the work of Andrzej Zielezinski, originally retrieved on 20 November 2022 from
|
|
8
|
+
https://github.com/aziele/statistical-distances/blob/04412b3155c59fc7238b3d8ecf6f3723ac5befff/distance.py, which was released via the GNU General Public License v3.0.
|
|
9
|
+
|
|
10
|
+
It was originally modified by Siddharth Chaini on 27 November 2022.
|
|
11
|
+
|
|
12
|
+
Notes
|
|
13
|
+
-----
|
|
14
|
+
|
|
15
|
+
Modifications by Siddharth Chaini include the addition of the following distance measures:
|
|
16
|
+
1. Meehl distance
|
|
17
|
+
2. Sorensen distance
|
|
18
|
+
3. Ruzicka distance
|
|
19
|
+
4. Inner product distance
|
|
20
|
+
5. Harmonic mean distance
|
|
21
|
+
6. Fidelity
|
|
22
|
+
7. Minimimum Symmetric Chi Squared
|
|
23
|
+
8. Probabilistic Symmetric Chi Squared
|
|
24
|
+
|
|
25
|
+
In addition, the following code was added to all functions for array conversion:
|
|
26
|
+
u,v = np.asarray(u), np.asarray(v)
|
|
39
27
|
"""
|
|
40
28
|
|
|
41
29
|
import numpy as np
|
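The module docstring above says every function converts its inputs with `np.asarray`. A minimal sketch of why that matters, assuming a hypothetical stand-alone helper `cityblock_sketch` (not part of the package): plain lists and tuples do not support element-wise subtraction, so the conversion has to happen first.

```python
import numpy as np


def cityblock_sketch(u, v):
    # Hypothetical helper for illustration only.
    # Convert first so lists and tuples behave like numpy arrays.
    u, v = np.asarray(u), np.asarray(v)
    return np.sum(np.abs(u - v))


# A list and a tuple are both accepted after conversion.
result = cityblock_sketch([1, 2, 3], (2, 4, 6))
```

Without the conversion, `[1, 2, 3] - (2, 4, 6)` would raise a `TypeError`.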
|
@@ -47,7 +35,8 @@ class Distance:
|
|
|
47
35
|
"""
|
|
48
36
|
Initialize the Distance class with an optional epsilon value.
|
|
49
37
|
|
|
50
|
-
Parameters
|
|
38
|
+
Parameters
|
|
39
|
+
----------
|
|
51
40
|
- epsilon: A small value to avoid division by zero errors.
|
|
52
41
|
"""
|
|
53
42
|
self.epsilon = np.finfo(float).eps if not epsilon else epsilon
|
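The epsilon default in the constructor above uses a truthiness test, so any falsy argument falls back to machine epsilon. A small sketch of that behaviour, using a hypothetical free function `resolve_epsilon` that copies the expression from the diff:

```python
import numpy as np


def resolve_epsilon(epsilon=None):
    # Same expression as in the diff; the wrapper function is hypothetical.
    # `not epsilon` is true for None and for 0, so both use the default.
    return np.finfo(float).eps if not epsilon else epsilon


default_eps = resolve_epsilon()
zero_eps = resolve_epsilon(0)
custom_eps = resolve_epsilon(1e-6)
```

One consequence is that passing `epsilon=0` does not disable the safeguard; it selects the default.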
|
@@ -55,41 +44,45 @@ class Distance:
|
|
|
55
44
|
def acc(self, u, v):
|
|
56
45
|
"""
|
|
57
46
|
Calculate the average of Cityblock/Manhattan and Chebyshev distances.
|
|
58
|
-
|
|
59
47
|
This function computes the ACC distance, also known as the Average distance, between two
|
|
60
48
|
vectors u and v. It is the average of the Cityblock (or Manhattan) and Chebyshev distances.
|
|
61
49
|
|
|
62
|
-
Parameters
|
|
50
|
+
Parameters
|
|
51
|
+
----------
|
|
63
52
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
64
53
|
|
|
65
|
-
Returns
|
|
54
|
+
Returns
|
|
55
|
+
-------
|
|
66
56
|
- The ACC distance between the two vectors.
|
|
67
57
|
|
|
68
|
-
References
|
|
69
|
-
|
|
70
|
-
|
|
71
|
-
|
|
72
|
-
|
|
58
|
+
References
|
|
59
|
+
----------
|
|
60
|
+
1. Krause EF (2012) Taxicab Geometry An Adventure in Non-Euclidean Geometry. Dover Publications.
|
|
61
|
+
2. Sung-Hyuk C (2007) Comprehensive Survey on Distance/Similarity Measures between Probability
|
|
62
|
+
Density Functions. International Journal of Mathematical Models and Methods in Applied Sciences.
|
|
63
|
+
vol. 1(4), pp. 300-307.
|
|
73
64
|
"""
|
|
74
65
|
return (self.cityblock(u, v) + self.chebyshev(u, v)) / 2
|
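The `acc` method above averages the Cityblock and Chebyshev distances. A self-contained sketch, assuming the usual definitions of those two distances as free functions (the package defines them as methods, which are not shown in this hunk):

```python
import numpy as np


def cityblock(u, v):
    # Sum of absolute coordinate differences.
    u, v = np.asarray(u), np.asarray(v)
    return np.sum(np.abs(u - v))


def chebyshev(u, v):
    # Largest absolute coordinate difference.
    u, v = np.asarray(u), np.asarray(v)
    return np.max(np.abs(u - v))


def acc(u, v):
    # Average of the two distances, as in the return statement above.
    return (cityblock(u, v) + chebyshev(u, v)) / 2


# Cityblock is 7 and Chebyshev is 4 for these points.
acc_value = acc([0, 0], [3, 4])
```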
|
75
66
|
|
|
76
67
|
def add_chisq(self, u, v):
|
|
77
68
|
"""
|
|
78
69
|
Compute the Additive Symmetric Chi-square distance between two vectors.
|
|
79
|
-
|
|
80
70
|
The Additive Symmetric Chi-square distance is a measure that can be used to compare two vectors.
|
|
81
71
|
This function calculates it based on the input vectors u and v.
|
|
82
72
|
|
|
83
|
-
Parameters
|
|
73
|
+
Parameters
|
|
74
|
+
----------
|
|
84
75
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
85
76
|
|
|
86
|
-
Returns
|
|
77
|
+
Returns
|
|
78
|
+
-------
|
|
87
79
|
- The Additive Symmetric Chi-square distance between the two vectors.
|
|
88
80
|
|
|
89
|
-
References
|
|
90
|
-
|
|
91
|
-
|
|
92
|
-
|
|
81
|
+
References
|
|
82
|
+
----------
|
|
83
|
+
1. Sung-Hyuk C (2007) Comprehensive Survey on Distance/Similarity Measures between Probability
|
|
84
|
+
Density Functions. International Journal of Mathematical Models and Methods in Applied Sciences.
|
|
85
|
+
vol. 1(4), pp. 300-307.
|
|
93
86
|
"""
|
|
94
87
|
u, v = np.asarray(u), np.asarray(v)
|
|
95
88
|
uvmult = u * v
|
|
@@ -102,21 +95,24 @@ class Distance:
|
|
|
102
95
|
|
|
103
96
|
Returns a distance value between 0 and 1.
|
|
104
97
|
|
|
105
|
-
Parameters
|
|
98
|
+
Parameters
|
|
99
|
+
----------
|
|
106
100
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
107
101
|
|
|
108
|
-
Returns
|
|
102
|
+
Returns
|
|
103
|
+
-------
|
|
109
104
|
- The Bhattacharyya distance between the two vectors.
|
|
110
105
|
|
|
111
|
-
References
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
|
|
116
|
-
|
|
117
|
-
|
|
118
|
-
|
|
119
|
-
|
|
106
|
+
References
|
|
107
|
+
----------
|
|
108
|
+
1. Bhattacharyya A (1947) On a measure of divergence between two
|
|
109
|
+
statistical populations defined by probability distributions,
|
|
110
|
+
Bull. Calcutta Math. Soc., 35, 99–109.
|
|
111
|
+
2. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
112
|
+
Measures between Probability Density Functions. International
|
|
113
|
+
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
114
|
+
1(4), 300-307.
|
|
115
|
+
3. https://en.wikipedia.org/wiki/Bhattacharyya_distance
|
|
120
116
|
"""
|
|
121
117
|
u, v = np.asarray(u), np.asarray(v)
|
|
122
118
|
return -np.log(np.sum(np.sqrt(u * v)))
|
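The return statement above can be exercised on two small discrete distributions. This sketch copies the expression from the diff into a free function; the sample vectors `p` and `q` are made up for illustration.

```python
import numpy as np


def bhattacharyya(u, v):
    # Same expression as in the diff, written as a free function.
    u, v = np.asarray(u), np.asarray(v)
    return -np.log(np.sum(np.sqrt(u * v)))


p = [0.5, 0.5]
q = [0.9, 0.1]

same = bhattacharyya(p, p)       # identical distributions
different = bhattacharyya(p, q)  # distributions that differ
```

For identical distributions the sum under the logarithm is 1, so the distance is zero; it grows as the overlap shrinks.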
|
@@ -130,17 +126,21 @@ class Distance:
|
|
|
130
126
|
of species at both sites. It is closely related to the Sørensen distance and is also known as
|
|
131
127
|
Bray-Curtis dissimilarity.
|
|
132
128
|
|
|
133
|
-
Notes
|
|
129
|
+
Notes
|
|
130
|
+
-----
|
|
134
131
|
When used for comparing two probability density functions (pdfs),
|
|
135
132
|
the Bray-Curtis distance equals the Cityblock distance divided by 2.
|
|
136
133
|
|
|
137
|
-
Parameters
|
|
134
|
+
Parameters
|
|
135
|
+
----------
|
|
138
136
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
139
137
|
|
|
140
|
-
Returns
|
|
138
|
+
Returns
|
|
139
|
+
-------
|
|
141
140
|
- The Bray-Curtis distance between the two vectors.
|
|
142
141
|
|
|
143
|
-
References
|
|
142
|
+
References
|
|
143
|
+
----------
|
|
144
144
|
1. Bray JR, Curtis JT (1957) An ordination of the upland forest of
|
|
145
145
|
southern Wisconsin. Ecological Monographs, 27, 325-349.
|
|
146
146
|
2. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
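The Notes section above states that, for two probability density functions, the Bray-Curtis distance equals the Cityblock distance divided by 2. A quick numerical check, assuming the standard Bray-Curtis formula (the method body is outside this hunk, so the implementation below is an assumption):

```python
import numpy as np


def braycurtis(u, v):
    # Assumed standard form: sum|u - v| / sum|u + v|.
    u, v = np.asarray(u), np.asarray(v)
    return np.sum(np.abs(u - v)) / np.sum(np.abs(u + v))


def cityblock(u, v):
    u, v = np.asarray(u), np.asarray(v)
    return np.sum(np.abs(u - v))


# Two made-up distributions that each sum to 1.
p = [0.2, 0.3, 0.5]
q = [0.1, 0.6, 0.3]

bc = braycurtis(p, q)
half_cityblock = cityblock(p, q) / 2
```

The identity follows because the denominator sums to 2 when both inputs sum to 1.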
|
@@ -158,17 +158,21 @@ class Distance:
|
|
|
158
158
|
|
|
159
159
|
The Canberra distance is a weighted version of the Manhattan distance, used in numerical analysis.
|
|
160
160
|
|
|
161
|
-
Notes
|
|
161
|
+
Notes
|
|
162
|
+
-----
|
|
162
163
|
When `u[i]` and `v[i]` are 0 for given i, then the fraction 0/0 = 0
|
|
163
164
|
is used in the calculation.
|
|
164
165
|
|
|
165
|
-
Parameters
|
|
166
|
+
Parameters
|
|
167
|
+
----------
|
|
166
168
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
167
169
|
|
|
168
|
-
Returns
|
|
170
|
+
Returns
|
|
171
|
+
-------
|
|
169
172
|
- The Canberra distance between the two vectors.
|
|
170
173
|
|
|
171
|
-
References
|
|
174
|
+
References
|
|
175
|
+
----------
|
|
172
176
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
173
177
|
Measures between Probability Density Functions. International
|
|
174
178
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
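The Canberra note above says that a 0/0 term is treated as 0. One way to get that behaviour, shown here as an assumption rather than the package's confirmed code, is to silence the invalid-value warning and sum with `np.nansum`:

```python
import numpy as np


def canberra(u, v):
    # Sketch only; the method body is not visible in this hunk.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    with np.errstate(invalid="ignore"):
        # 0/0 produces nan here instead of raising a warning.
        terms = np.abs(u - v) / (np.abs(u) + np.abs(v))
    # nansum skips the nan terms, which treats 0/0 as 0.
    return np.nansum(terms)


# First coordinate is 0/0, second is 2/4, third is 0/4.
canberra_value = canberra([0, 1, 2], [0, 3, 2])
```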
|
@@ -191,13 +195,16 @@ class Distance:
|
|
|
191
195
|
Maximum value distance
|
|
192
196
|
Minimax approximation
|
|
193
197
|
|
|
194
|
-
Parameters
|
|
198
|
+
Parameters
|
|
199
|
+
----------
|
|
195
200
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
196
201
|
|
|
197
|
-
Returns
|
|
202
|
+
Returns
|
|
203
|
+
-------
|
|
198
204
|
- The Chebyshev distance between the two vectors.
|
|
199
205
|
|
|
200
|
-
References
|
|
206
|
+
References
|
|
207
|
+
----------
|
|
201
208
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
202
209
|
Measures between Probability Density Functions. International
|
|
203
210
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -212,10 +219,12 @@ class Distance:
|
|
|
212
219
|
|
|
213
220
|
This measure represents a custom approach by Zielezinski to distance measurement, focusing on the minimum absolute difference.
|
|
214
221
|
|
|
215
|
-
Parameters
|
|
222
|
+
Parameters
|
|
223
|
+
----------
|
|
216
224
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
217
225
|
|
|
218
|
-
Returns
|
|
226
|
+
Returns
|
|
227
|
+
-------
|
|
219
228
|
- The minimum value distance between the two vectors.
|
|
220
229
|
"""
|
|
221
230
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -227,17 +236,21 @@ class Distance:
|
|
|
227
236
|
|
|
228
237
|
The Clark distance equals the square root of half of the divergence.
|
|
229
238
|
|
|
230
|
-
Notes
|
|
239
|
+
Notes
|
|
240
|
+
-----
|
|
231
241
|
When `u[i]` and `v[i]` are 0 for given i, then the fraction 0/0 = 0
|
|
232
242
|
is used in the calculation.
|
|
233
243
|
|
|
234
|
-
Parameters
|
|
244
|
+
Parameters
|
|
245
|
+
----------
|
|
235
246
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
236
247
|
|
|
237
|
-
Returns
|
|
248
|
+
Returns
|
|
249
|
+
-------
|
|
238
250
|
- The Clark distance between the two vectors.
|
|
239
251
|
|
|
240
|
-
References
|
|
252
|
+
References
|
|
253
|
+
----------
|
|
241
254
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
242
255
|
Measures between Probability Density Functions. International
|
|
243
256
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
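The Clark docstring above says the Clark distance equals the square root of half of the divergence. A numerical check under assumed textbook formulas for both measures (neither method body appears in this hunk), restricted to positive inputs:

```python
import numpy as np


def clark(u, v):
    # Assumed form: sqrt(sum(((u - v) / (|u| + |v|))^2)).
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sqrt(np.sum(((u - v) / (np.abs(u) + np.abs(v))) ** 2))


def divergence(u, v):
    # Assumed form: 2 * sum((u - v)^2 / (u + v)^2).
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 2 * np.sum((u - v) ** 2 / (u + v) ** 2)


u = [1, 2, 3]
v = [3, 2, 1]

clark_value = clark(u, v)
from_divergence = np.sqrt(divergence(u, v) / 2)
```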
|
@@ -251,13 +264,16 @@ class Distance:
|
|
|
251
264
|
"""
|
|
252
265
|
Calculate the cosine distance between two vectors.
|
|
253
266
|
|
|
254
|
-
Parameters
|
|
267
|
+
Parameters
|
|
268
|
+
----------
|
|
255
269
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
256
270
|
|
|
257
|
-
Returns
|
|
271
|
+
Returns
|
|
272
|
+
-------
|
|
258
273
|
- The cosine distance between the two vectors.
|
|
259
274
|
|
|
260
|
-
References
|
|
275
|
+
References
|
|
276
|
+
----------
|
|
261
277
|
1. SciPy.
|
|
262
278
|
"""
|
|
263
279
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -269,10 +285,12 @@ class Distance:
|
|
|
269
285
|
|
|
270
286
|
Returns a distance value between 0 and 2.
|
|
271
287
|
|
|
272
|
-
Parameters
|
|
288
|
+
Parameters
|
|
289
|
+
----------
|
|
273
290
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
274
291
|
|
|
275
|
-
Returns
|
|
292
|
+
Returns
|
|
293
|
+
-------
|
|
276
294
|
- The Pearson correlation distance between the two vectors.
|
|
277
295
|
"""
|
|
278
296
|
|
|
@@ -284,13 +302,16 @@ class Distance:
|
|
|
284
302
|
"""
|
|
285
303
|
Calculate the Czekanowski distance between two vectors.
|
|
286
304
|
|
|
287
|
-
Parameters
|
|
305
|
+
Parameters
|
|
306
|
+
----------
|
|
288
307
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
289
308
|
|
|
290
|
-
Returns
|
|
309
|
+
Returns
|
|
310
|
+
-------
|
|
291
311
|
- The Czekanowski distance between the two vectors.
|
|
292
312
|
|
|
293
|
-
References
|
|
313
|
+
References
|
|
314
|
+
----------
|
|
294
315
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
295
316
|
Measures between Probability Density Functions. International
|
|
296
317
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -306,13 +327,16 @@ class Distance:
|
|
|
306
327
|
Synonyms:
|
|
307
328
|
Sorensen distance
|
|
308
329
|
|
|
309
|
-
Parameters
|
|
330
|
+
Parameters
|
|
331
|
+
----------
|
|
310
332
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
311
333
|
|
|
312
|
-
Returns
|
|
334
|
+
Returns
|
|
335
|
+
-------
|
|
313
336
|
- The Dice dissimilarity between the two vectors.
|
|
314
337
|
|
|
315
|
-
References
|
|
338
|
+
References
|
|
339
|
+
----------
|
|
316
340
|
1. Dice LR (1945) Measures of the amount of ecologic association
|
|
317
341
|
between species. Ecology. 26, 297-302.
|
|
318
342
|
2. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
@@ -330,13 +354,16 @@ class Distance:
|
|
|
330
354
|
|
|
331
355
|
Divergence equals squared Clark distance multiplied by 2.
|
|
332
356
|
|
|
333
|
-
Parameters
|
|
357
|
+
Parameters
|
|
358
|
+
----------
|
|
334
359
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
335
360
|
|
|
336
|
-
Returns
|
|
361
|
+
Returns
|
|
362
|
+
-------
|
|
337
363
|
- The divergence between the two vectors.
|
|
338
364
|
|
|
339
|
-
References
|
|
365
|
+
References
|
|
366
|
+
----------
|
|
340
367
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
341
368
|
Measures between Probability Density Functions. International
|
|
342
369
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -352,13 +379,16 @@ class Distance:
|
|
|
352
379
|
|
|
353
380
|
The Euclidean distance is the "ordinary" straight-line distance between two points in Euclidean space.
|
|
354
381
|
|
|
355
|
-
Parameters
|
|
382
|
+
Parameters
|
|
383
|
+
----------
|
|
356
384
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
357
385
|
|
|
358
|
-
Returns
|
|
386
|
+
Returns
|
|
387
|
+
-------
|
|
359
388
|
- The Euclidean distance between the two vectors.
|
|
360
389
|
|
|
361
|
-
References
|
|
390
|
+
References
|
|
391
|
+
----------
|
|
362
392
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
363
393
|
Measures between Probability Density Functions. International
|
|
364
394
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -373,13 +403,16 @@ class Distance:
|
|
|
373
403
|
|
|
374
404
|
The fidelity distance measures the similarity between two probability distributions.
|
|
375
405
|
|
|
376
|
-
Parameters
|
|
406
|
+
Parameters
|
|
407
|
+
----------
|
|
377
408
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
378
409
|
|
|
379
|
-
Returns
|
|
410
|
+
Returns
|
|
411
|
+
-------
|
|
380
412
|
- The fidelity distance between the two vectors.
|
|
381
413
|
|
|
382
|
-
Notes
|
|
414
|
+
Notes
|
|
415
|
+
-----
|
|
383
416
|
Added by SC.
|
|
384
417
|
"""
|
|
385
418
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -391,17 +424,21 @@ class Distance:
|
|
|
391
424
|
|
|
392
425
|
NGD is a measure of similarity derived from the number of hits returned by the Google search engine for a given set of keywords.
|
|
393
426
|
|
|
394
|
-
Parameters
|
|
427
|
+
Parameters
|
|
428
|
+
----------
|
|
395
429
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
396
430
|
|
|
397
|
-
Returns
|
|
431
|
+
Returns
|
|
432
|
+
-------
|
|
398
433
|
- The Normalized Google Distance between the two vectors.
|
|
399
434
|
|
|
400
|
-
Notes
|
|
435
|
+
Notes
|
|
436
|
+
-----
|
|
401
437
|
When used for comparing two probability density functions (pdfs),
|
|
402
438
|
Google distance equals half of Cityblock distance.
|
|
403
439
|
|
|
404
|
-
References
|
|
440
|
+
References
|
|
441
|
+
----------
|
|
405
442
|
1. Lee & Rashid (2008) Information Technology, ITSim 2008.
|
|
406
443
|
doi:10.1109/ITSIM.2008.4631601.
|
|
407
444
|
"""
|
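The Notes section above states that the Google distance equals half of the Cityblock distance for two probability density functions. The sketch below checks that claim under an assumed formula for the Normalized Google Distance on vectors; the method body is outside this hunk, so treat `google` as illustrative.

```python
import numpy as np


def google(u, v):
    # Assumed form: (max(x, y) - sum(min)) / ((x + y) - min(x, y)),
    # where x and y are the totals of u and v.
    u, v = np.asarray(u), np.asarray(v)
    x, y = float(np.sum(u)), float(np.sum(v))
    summin = float(np.sum(np.minimum(u, v)))
    return (max(x, y) - summin) / ((x + y) - min(x, y))


def cityblock(u, v):
    u, v = np.asarray(u), np.asarray(v)
    return np.sum(np.abs(u - v))


# Two made-up distributions that each sum to 1.
p = [0.2, 0.3, 0.5]
q = [0.1, 0.6, 0.3]

google_value = google(p, q)
half_cityblock = cityblock(p, q) / 2
```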
|
@@ -417,13 +454,16 @@ class Distance:
|
|
|
417
454
|
|
|
418
455
|
The Gower distance equals the Cityblock distance divided by the vector length.
|
|
419
456
|
|
|
420
|
-
Parameters
|
|
457
|
+
Parameters
|
|
458
|
+
----------
|
|
421
459
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
422
460
|
|
|
423
|
-
Returns
|
|
461
|
+
Returns
|
|
462
|
+
-------
|
|
424
463
|
- The Gower distance between the two vectors.
|
|
425
464
|
|
|
426
|
-
References
|
|
465
|
+
References
|
|
466
|
+
----------
|
|
427
467
|
1. Gower JC. (1971) General Coefficient of Similarity
|
|
428
468
|
and Some of Its Properties, Biometrics 27, 857-874.
|
|
429
469
|
2. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
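The Gower docstring above describes the distance as the Cityblock distance divided by the vector length. A minimal sketch of that description as a free function:

```python
import numpy as np


def gower(u, v):
    # Cityblock distance divided by the number of coordinates.
    u, v = np.asarray(u), np.asarray(v)
    return np.sum(np.abs(u - v)) / u.size


# Absolute differences are 1, 0, 2, 4, so the Cityblock distance is 7.
gower_value = gower([1, 2, 3, 4], [2, 2, 5, 0])
```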
|
@@ -451,18 +491,22 @@ class Distance:
|
|
|
451
491
|
|
|
452
492
|
The Hellinger distance is a measure of similarity between two probability distributions.
|
|
453
493
|
|
|
454
|
-
Parameters
|
|
494
|
+
Parameters
|
|
495
|
+
----------
|
|
455
496
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
456
497
|
|
|
457
|
-
Returns
|
|
498
|
+
Returns
|
|
499
|
+
-------
|
|
458
500
|
- The Hellinger distance between the two vectors.
|
|
459
501
|
|
|
460
|
-
Notes
|
|
502
|
+
Notes
|
|
503
|
+
-----
|
|
461
504
|
This implementation produces values two times larger than values
|
|
462
505
|
obtained by Hellinger distance described in Wikipedia and also
|
|
463
506
|
in https://gist.github.com/larsmans/3116927.
|
|
464
507
|
|
|
465
|
-
References
|
|
508
|
+
References
|
|
509
|
+
----------
|
|
466
510
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
467
511
|
Measures between Probability Density Functions. International
|
|
468
512
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
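The Hellinger note above says this implementation returns values twice as large as the Wikipedia definition. The sketch below shows one formula that has that property; it is an assumption about the method body, which is not part of this hunk.

```python
import numpy as np


def hellinger_doubled(u, v):
    # Assumed form: sqrt(2 * sum((sqrt(u) - sqrt(v))^2)).
    u, v = np.asarray(u), np.asarray(v)
    return np.sqrt(2 * np.sum((np.sqrt(u) - np.sqrt(v)) ** 2))


def hellinger_wikipedia(u, v):
    # Wikipedia form: (1 / sqrt(2)) * sqrt(sum((sqrt(u) - sqrt(v))^2)).
    u, v = np.asarray(u), np.asarray(v)
    return np.sqrt(np.sum((np.sqrt(u) - np.sqrt(v)) ** 2)) / np.sqrt(2)


p = [0.2, 0.3, 0.5]
q = [0.1, 0.6, 0.3]

ratio = hellinger_doubled(p, q) / hellinger_wikipedia(p, q)
```

The factor of 2 comes from multiplying by sqrt(2) in one form and dividing by sqrt(2) in the other.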
|
@@ -477,13 +521,16 @@ class Distance:
|
|
|
477
521
|
|
|
478
522
|
The inner product distance is a measure of similarity between two vectors, based on their inner product.
|
|
479
523
|
|
|
480
|
-
Parameters
|
|
524
|
+
Parameters
|
|
525
|
+
----------
|
|
481
526
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
482
527
|
|
|
483
|
-
Returns
|
|
528
|
+
Returns
|
|
529
|
+
-------
|
|
484
530
|
- The inner product distance between the two vectors.
|
|
485
531
|
|
|
486
|
-
Notes
|
|
532
|
+
Notes
|
|
533
|
+
-----
|
|
487
534
|
Added by SC.
|
|
488
535
|
"""
|
|
489
536
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -495,13 +542,16 @@ class Distance:
|
|
|
495
542
|
|
|
496
543
|
The Jaccard distance measures dissimilarity between sample sets.
|
|
497
544
|
|
|
498
|
-
Parameters
|
|
545
|
+
Parameters
|
|
546
|
+
----------
|
|
499
547
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
500
548
|
|
|
501
|
-
Returns
|
|
549
|
+
Returns
|
|
550
|
+
-------
|
|
502
551
|
- The Jaccard distance between the two vectors.
|
|
503
552
|
|
|
504
|
-
References
|
|
553
|
+
References
|
|
554
|
+
----------
|
|
505
555
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
506
556
|
Measures between Probability Density Functions. International
|
|
507
557
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -517,13 +567,16 @@ class Distance:
|
|
|
517
567
|
|
|
518
568
|
The Jeffreys divergence is a symmetric version of the Kullback-Leibler divergence.
|
|
519
569
|
|
|
520
|
-
Parameters
|
|
570
|
+
Parameters
|
|
571
|
+
----------
|
|
521
572
|
- u, v: Input vectors between which the divergence is to be calculated.
|
|
522
573
|
|
|
523
|
-
Returns
|
|
574
|
+
Returns
|
|
575
|
+
-------
|
|
524
576
|
- The Jeffreys divergence between the two vectors.
|
|
525
577
|
|
|
526
|
-
References
|
|
578
|
+
References
|
|
579
|
+
----------
|
|
527
580
|
1. Jeffreys H (1946) An Invariant Form for the Prior Probability
|
|
528
581
|
in Estimation Problems. Proc.Roy.Soc.Lon., Ser. A 186, 453-461.
|
|
529
582
|
2. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
@@ -547,13 +600,16 @@ class Distance:
|
|
|
547
600
|
|
|
548
601
|
The Jensen-Shannon divergence is a symmetric and finite measure of similarity between two probability distributions.
|
|
549
602
|
|
|
550
|
-
Parameters
|
|
603
|
+
Parameters
|
|
604
|
+
----------
|
|
551
605
|
- u, v: Input vectors between which the divergence is to be calculated.
|
|
552
606
|
|
|
553
|
-
Returns
|
|
607
|
+
Returns
|
|
608
|
+
-------
|
|
554
609
|
- The Jensen-Shannon divergence between the two vectors.
|
|
555
610
|
|
|
556
|
-
References
|
|
611
|
+
References
|
|
612
|
+
----------
|
|
557
613
|
1. Lin J. (1991) Divergence measures based on the Shannon entropy.
|
|
558
614
|
IEEE Transactions on Information Theory, 37(1):145–151.
|
|
559
615
|
2. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
@@ -582,18 +638,22 @@ class Distance:
|
|
|
582
638
|
|
|
583
639
|
The Jensen difference is considered similar to the Jensen-Shannon divergence.
|
|
584
640
|
|
|
585
|
-
Parameters
|
|
641
|
+
Parameters
|
|
642
|
+
----------
|
|
586
643
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
587
644
|
|
|
588
|
-
Returns
|
|
645
|
+
Returns
|
|
646
|
+
-------
|
|
589
647
|
- The Jensen difference between the two vectors.
|
|
590
648
|
|
|
591
|
-
Notes
|
|
649
|
+
Notes
|
|
650
|
+
-----
|
|
592
651
|
1. Equals half of Topsøe distance
|
|
593
652
|
2. Equals squared jensenshannon_distance.
|
|
594
653
|
|
|
595
654
|
|
|
596
|
-
References
|
|
655
|
+
References
|
|
656
|
+
----------
|
|
597
657
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
598
658
|
Measures between Probability Density Functions. International
|
|
599
659
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -610,13 +670,16 @@ class Distance:
|
|
|
610
670
|
"""
|
|
611
671
|
Calculate the K divergence between two vectors.
|
|
612
672
|
|
|
613
|
-
Parameters
|
|
673
|
+
Parameters
|
|
674
|
+
----------
|
|
614
675
|
- u, v: Input vectors between which the divergence is to be calculated.
|
|
615
676
|
|
|
616
|
-
Returns
|
|
677
|
+
Returns
|
|
678
|
+
-------
|
|
617
679
|
- The K divergence between the two vectors.
|
|
618
680
|
|
|
619
|
-
References
|
|
681
|
+
References
|
|
682
|
+
----------
|
|
620
683
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
621
684
|
Measures between Probability Density Functions. International
|
|
622
685
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -633,13 +696,16 @@ class Distance:
|
|
|
633
696
|
|
|
634
697
|
The Kullback-Leibler divergence measures the difference between two probability distributions.
|
|
635
698
|
|
|
636
|
-
Parameters
|
|
699
|
+
Parameters
|
|
700
|
+
----------
|
|
637
701
|
- u, v: Input vectors between which the divergence is to be calculated.
|
|
638
702
|
|
|
639
|
-
Returns
|
|
703
|
+
Returns
|
|
704
|
+
-------
|
|
640
705
|
- The Kullback-Leibler divergence between the two vectors.
|
|
641
706
|
|
|
642
|
-
References
|
|
707
|
+
References
|
|
708
|
+
----------
|
|
643
709
|
1. Kullback S, Leibler RA (1951) On information and sufficiency.
|
|
644
710
|
Ann. Math. Statist. 22:79–86
|
|
645
711
|
2. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
@@ -656,13 +722,16 @@ class Distance:
|
|
|
656
722
|
"""
|
|
657
723
|
Calculate the Kulczynski distance between two vectors.
|
|
658
724
|
|
|
659
|
-
Parameters
|
|
725
|
+
Parameters
|
|
726
|
+
----------
|
|
660
727
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
661
728
|
|
|
662
|
-
Returns
|
|
729
|
+
Returns
|
|
730
|
+
-------
|
|
663
731
|
- The Kulczynski distance between the two vectors.
|
|
664
732
|
|
|
665
|
-
References
|
|
733
|
+
References
|
|
734
|
+
----------
|
|
666
735
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
667
736
|
Measures between Probability Density Functions. International
|
|
668
737
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -675,13 +744,16 @@ class Distance:
|
|
|
675
744
|
"""
|
|
676
745
|
Calculate the Kumar-Johnson distance between two vectors.
|
|
677
746
|
|
|
678
|
-
Parameters
|
|
747
|
+
Parameters
|
|
748
|
+
----------
|
|
679
749
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
680
750
|
|
|
681
|
-
Returns
|
|
751
|
+
Returns
|
|
752
|
+
-------
|
|
682
753
|
- The Kumar-Johnson distance between the two vectors.
|
|
683
754
|
|
|
684
|
-
References
|
|
755
|
+
References
|
|
756
|
+
----------
|
|
685
757
|
1. Kumar P, Johnson A. (2005) On a symmetric divergence measure
|
|
686
758
|
and information inequalities, Journal of Inequalities in pure
|
|
687
759
|
and applied Mathematics. 6(3).
|
|
@@ -701,19 +773,23 @@ class Distance:
|
|
|
701
773
|
"""
|
|
702
774
|
Calculate the Lorentzian distance between two vectors.
|
|
703
775
|
|
|
704
|
-
Parameters
|
|
776
|
+
Parameters
|
|
777
|
+
----------
|
|
705
778
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
706
779
|
|
|
707
|
-
Returns
|
|
780
|
+
Returns
|
|
781
|
+
-------
|
|
708
782
|
- The Lorentzian distance between the two vectors.
|
|
709
783
|
|
|
710
|
-
References
|
|
784
|
+
References
|
|
785
|
+
----------
|
|
711
786
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
712
787
|
Measures between Probability Density Functions. International
|
|
713
788
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
714
789
|
1(4):300-307.
|
|
715
790
|
|
|
716
|
-
Notes
|
|
791
|
+
Notes
|
|
792
|
+
-----
|
|
717
793
|
One (1) is added to guarantee the non-negativity property and to
|
|
718
794
|
eschew the log of zero.
|
|
719
795
|
"""
|
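The Lorentzian note above explains that 1 is added inside the logarithm to keep the result non-negative and to avoid taking the log of zero. A sketch under the usual formula, which is assumed here because the method body is outside this hunk:

```python
import numpy as np


def lorentzian(u, v):
    # Assumed form: sum(log(1 + |u - v|)).
    u, v = np.asarray(u), np.asarray(v)
    return np.sum(np.log(np.abs(u - v) + 1))


u = [1, 2]
v = [1, 5]

identical = lorentzian(u, u)  # every term is log(1) = 0
value = lorentzian(u, v)      # log(1) + log(4)
```

Without the added 1, identical coordinates would give log(0), and small differences would give negative terms.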
|
@@ -724,13 +800,16 @@ class Distance:
|
|
|
724
800
|
"""
|
|
725
801
|
Calculate the Cityblock (Manhattan) distance between two vectors.
|
|
726
802
|
|
|
727
|
-
Parameters
|
|
803
|
+
Parameters
|
|
804
|
+
----------
|
|
728
805
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
729
806
|
|
|
730
|
-
Returns
|
|
807
|
+
Returns
|
|
808
|
+
-------
|
|
731
809
|
- The Cityblock distance between the two vectors.
|
|
732
810
|
|
|
733
|
-
References
|
|
811
|
+
References
|
|
812
|
+
----------
|
|
734
813
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
735
814
|
Measures between Probability Density Functions. International
|
|
736
815
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -742,7 +821,8 @@ class Distance:
|
|
|
742
821
|
Rectilinear distance
|
|
743
822
|
Taxicab norm
|
|
744
823
|
|
|
745
|
-
Notes
|
|
824
|
+
Notes
|
|
825
|
+
-----
|
|
746
826
|
Cityblock distance between two probability density functions
|
|
747
827
|
(pdfs) equals:
|
|
748
828
|
1. Non-intersection distance multiplied by 2.
|
|
@@ -757,13 +837,16 @@ class Distance:
|
|
|
757
837
|
"""
|
|
758
838
|
Calculate the Maryland Bridge distance between two vectors.
|
|
759
839
|
|
|
760
|
-
Parameters
|
|
840
|
+
Parameters
|
|
841
|
+
----------
|
|
761
842
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
762
843
|
|
|
763
|
-
Returns
|
|
844
|
+
Returns
|
|
845
|
+
-------
|
|
764
846
|
- The Maryland Bridge distance between the two vectors.
|
|
765
847
|
|
|
766
|
-
References
|
|
848
|
+
References
|
|
849
|
+
----------
|
|
767
850
|
1. Deza M, Deza E (2009) Encyclopedia of Distances.
|
|
768
851
|
Springer-Verlag Berlin Heidelberg. 1-590.
|
|
769
852
|
"""
|
|
@@ -775,19 +858,23 @@ class Distance:
|
|
|
775
858
|
"""
|
|
776
859
|
Calculate the Matusita distance between two vectors.
|
|
777
860
|
|
|
778
|
-
Parameters
|
|
861
|
+
Parameters
|
|
862
|
+
----------
|
|
779
863
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
780
864
|
|
|
781
|
-
Returns
|
|
865
|
+
Returns
|
|
866
|
+
-------
|
|
782
867
|
- The Matusita distance between the two vectors.
|
|
783
868
|
|
|
784
|
-
References
|
|
869
|
+
References
|
|
870
|
+
----------
|
|
785
871
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
786
872
|
Measures between Probability Density Functions. International
|
|
787
873
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
788
874
|
1(4):300-307.
|
|
789
875
|
|
|
790
|
-
Notes
|
|
876
|
+
Notes
|
|
877
|
+
-----
|
|
791
878
|
Equals square root of Squared-chord distance.
|
|
792
879
|
"""
|
|
793
880
|
u, v = np.asarray(u), np.asarray(v)
|
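The Matusita note above says the distance equals the square root of the Squared-chord distance. A numerical check with both measures written out independently, using assumed textbook formulas since neither body is visible here:

```python
import numpy as np


def squared_chord(u, v):
    # Assumed form: sum((sqrt(u) - sqrt(v))^2).
    u, v = np.asarray(u), np.asarray(v)
    return np.sum((np.sqrt(u) - np.sqrt(v)) ** 2)


def matusita(u, v):
    # Assumed form: sqrt(sum((sqrt(u) - sqrt(v))^2)).
    u, v = np.asarray(u), np.asarray(v)
    return np.sqrt(np.sum((np.sqrt(u) - np.sqrt(v)) ** 2))


p = [0.2, 0.3, 0.5]
q = [0.1, 0.6, 0.3]

matusita_value = matusita(p, q)
chord_value = squared_chord(p, q)
```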
|
@@ -797,13 +884,16 @@ class Distance:
|
|
|
797
884
|
"""
|
|
798
885
|
Calculate the maximum symmetric chi-square distance between two vectors.
|
|
799
886
|
|
|
800
|
-
Parameters
|
|
887
|
+
Parameters
|
|
888
|
+
----------
|
|
801
889
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
802
890
|
|
|
803
|
-
Returns
|
|
891
|
+
Returns
|
|
892
|
+
-------
|
|
804
893
|
- The maximum symmetric chi-square distance between the two vectors.
|
|
805
894
|
|
|
806
|
-
References
|
|
895
|
+
References
|
|
896
|
+
----------
|
|
807
897
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
808
898
|
Measures between Probability Density Functions. International
|
|
809
899
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -816,13 +906,16 @@ class Distance:
|
|
|
816
906
|
"""
|
|
817
907
|
Calculate the minimum symmetric chi-square distance between two vectors.
|
|
818
908
|
|
|
819
|
-
Parameters
|
|
909
|
+
Parameters
|
|
910
|
+
----------
|
|
820
911
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
821
912
|
|
|
822
|
-
Returns
|
|
913
|
+
Returns
|
|
914
|
+
-------
|
|
823
915
|
- The minimum symmetric chi-square distance between the two vectors.
|
|
824
916
|
|
|
825
|
-
Notes
|
|
917
|
+
Notes
|
|
918
|
+
-----
|
|
826
919
|
Added by SC.
|
|
827
920
|
"""
|
|
828
921
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -832,16 +925,20 @@ class Distance:
|
|
|
832
925
|
"""
|
|
833
926
|
Calculate the Meehl distance between two vectors.
|
|
834
927
|
|
|
835
|
-
Parameters
|
|
928
|
+
Parameters
|
|
929
|
+
----------
|
|
836
930
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
837
931
|
|
|
838
|
-
Returns
|
|
932
|
+
Returns
|
|
933
|
+
-------
|
|
839
934
|
- The Meehl distance between the two vectors.
|
|
840
935
|
|
|
841
|
-
Notes
|
|
936
|
+
Notes
|
|
937
|
+
-----
|
|
842
938
|
Added by SC.
|
|
843
939
|
|
|
844
|
-
References
|
|
940
|
+
References
|
|
941
|
+
----------
|
|
845
942
|
1. Deza M. and Deza E. (2013) Encyclopedia of Distances.
|
|
846
943
|
Berlin, Heidelberg: Springer Berlin Heidelberg.
|
|
847
944
|
https://doi.org/10.1007/978-3-642-30958-8.
|
|
@@ -860,17 +957,21 @@ class Distance:
|
|
|
860
957
|
"""
|
|
861
958
|
Calculate the Minkowski distance between two vectors.
|
|
862
959
|
|
|
863
|
-
Parameters
|
|
960
|
+
Parameters
|
|
961
|
+
----------
|
|
864
962
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
865
963
|
- p: The order of the norm of the difference.
|
|
866
964
|
|
|
867
|
-
Returns
|
|
965
|
+
Returns
|
|
966
|
+
-------
|
|
868
967
|
- The Minkowski distance between the two vectors.
|
|
869
968
|
|
|
870
|
-
Notes
|
|
969
|
+
Notes
|
|
970
|
+
-----
|
|
871
971
|
When p goes to infinity, the Chebyshev distance is derived.
|
|
872
972
|
|
|
873
|
-
References
|
|
973
|
+
References
|
|
974
|
+
----------
|
|
874
975
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
875
976
|
Measures between Probability Density Functions. International
|
|
876
977
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
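The note in the Minkowski docstring (the Chebyshev distance as the limit of large p) can be checked numerically. The following is a minimal sketch assuming the textbook definitions; the function names `minkowski` and `chebyshev` here are illustrative and are not taken from the distclassipy source.

```python
import numpy as np


def minkowski(u, v, p=2):
    # Textbook Minkowski distance of order p (assumed definition).
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sum(np.abs(u - v) ** p) ** (1.0 / p)


def chebyshev(u, v):
    # Largest absolute coordinate difference.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.max(np.abs(u - v))


u = [0.2, 0.5, 0.3]
v = [0.1, 0.1, 0.8]

# Gap between the Minkowski and Chebyshev values for increasing p.
gaps = [abs(minkowski(u, v, p) - chebyshev(u, v)) for p in (1, 2, 8, 64)]
print(gaps)
```

The gap shrinks as p grows, which is the sense in which the Chebyshev distance is the limiting case.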
@@ -883,16 +984,20 @@ class Distance:
|
|
|
883
984
|
"""
|
|
884
985
|
Calculate the Motyka distance between two vectors.
|
|
885
986
|
|
|
886
|
-
Parameters
|
|
987
|
+
Parameters
|
|
988
|
+
----------
|
|
887
989
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
888
990
|
|
|
889
|
-
Returns
|
|
991
|
+
Returns
|
|
992
|
+
-------
|
|
890
993
|
- The Motyka distance between the two vectors.
|
|
891
994
|
|
|
892
|
-
Notes
|
|
995
|
+
Notes
|
|
996
|
+
-----
|
|
893
997
|
The distance between identical vectors is not equal to 0 but 0.5.
|
|
894
998
|
|
|
895
|
-
References
|
|
999
|
+
References
|
|
1000
|
+
----------
|
|
896
1001
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
897
1002
|
Measures between Probability Density Functions. International
|
|
898
1003
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
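The Motyka note (identical vectors give 0.5 rather than 0) follows directly from the formula. A minimal sketch, assuming the definition from Cha (2007), sum(max(u, v)) / sum(u + v); the function name `motyka` is illustrative and the package's own implementation may differ in detail.

```python
import numpy as np


def motyka(u, v):
    # Assumed definition: sum of element-wise maxima over the total mass.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sum(np.maximum(u, v)) / np.sum(u + v)


u = [0.2, 0.5, 0.3]

# For identical inputs the numerator is sum(u) and the denominator is 2 * sum(u).
d_same = motyka(u, u)
print(d_same)
```

Because the minimum value is 0.5, this measure does not satisfy the identity-of-indiscernibles property expected of a metric.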
@@ -905,13 +1010,16 @@ class Distance:
|
|
|
905
1010
|
"""
|
|
906
1011
|
Calculate the Neyman chi-square distance between two vectors.
|
|
907
1012
|
|
|
908
|
-
Parameters
|
|
1013
|
+
Parameters
|
|
1014
|
+
----------
|
|
909
1015
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
910
1016
|
|
|
911
|
-
Returns
|
|
1017
|
+
Returns
|
|
1018
|
+
-------
|
|
912
1019
|
- The Neyman chi-square distance between the two vectors.
|
|
913
1020
|
|
|
914
|
-
References
|
|
1021
|
+
References
|
|
1022
|
+
----------
|
|
915
1023
|
1. Neyman J (1949) Contributions to the theory of the chi^2 test.
|
|
916
1024
|
In Proceedings of the First Berkeley Symposium on Mathematical
|
|
917
1025
|
Statistics and Probability.
|
|
@@ -928,19 +1036,23 @@ class Distance:
|
|
|
928
1036
|
"""
|
|
929
1037
|
Calculate the Nonintersection distance between two vectors.
|
|
930
1038
|
|
|
931
|
-
Parameters
|
|
1039
|
+
Parameters
|
|
1040
|
+
----------
|
|
932
1041
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
933
1042
|
|
|
934
|
-
Returns
|
|
1043
|
+
Returns
|
|
1044
|
+
-------
|
|
935
1045
|
- The Nonintersection distance between the two vectors.
|
|
936
1046
|
|
|
937
|
-
References
|
|
1047
|
+
References
|
|
1048
|
+
----------
|
|
938
1049
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
939
1050
|
Measures between Probability Density Functions. International
|
|
940
1051
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
941
1052
|
1(4), 300-307.
|
|
942
1053
|
|
|
943
|
-
Notes
|
|
1054
|
+
Notes
|
|
1055
|
+
-----
|
|
944
1056
|
When used for comparing two probability density functions (pdfs),
|
|
945
1057
|
Nonintersection distance equals half of Cityblock distance.
|
|
946
1058
|
"""
|
|
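The relation stated in the Nonintersection note can be verified for normalized inputs. A minimal sketch, assuming the definition 1 - sum(min(u, v)) and vectors that each sum to 1; the function names are illustrative, not the package's API.

```python
import numpy as np


def nonintersection(u, v):
    # Assumed definition: one minus the histogram intersection.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return 1.0 - np.sum(np.minimum(u, v))


def cityblock(u, v):
    # Sum of absolute coordinate differences (Manhattan distance).
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sum(np.abs(u - v))


# Both vectors sum to 1, as required for the identity to hold.
u = [0.2, 0.5, 0.3]
v = [0.1, 0.1, 0.8]

d_nonint = nonintersection(u, v)
d_half_city = 0.5 * cityblock(u, v)
print(d_nonint, d_half_city)
```

The identity relies on |a - b| = a + b - 2 min(a, b), so it no longer holds once either vector is not normalized to unit sum.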
@@ -951,13 +1063,16 @@ class Distance:
|
|
|
951
1063
|
"""
|
|
952
1064
|
Calculate the Pearson chi-square divergence between two vectors.
|
|
953
1065
|
|
|
954
|
-
Parameters
|
|
1066
|
+
Parameters
|
|
1067
|
+
----------
|
|
955
1068
|
- u, v: Input vectors between which the divergence is to be calculated.
|
|
956
1069
|
|
|
957
|
-
Returns
|
|
1070
|
+
Returns
|
|
1071
|
+
-------
|
|
958
1072
|
- The Pearson chi-square divergence between the two vectors.
|
|
959
1073
|
|
|
960
|
-
References
|
|
1074
|
+
References
|
|
1075
|
+
----------
|
|
961
1076
|
1. Pearson K. (1900) On the Criterion that a given system of
|
|
962
1077
|
deviations from the probable in the case of a correlated system
|
|
963
1078
|
of variables is such that it can be reasonably supposed to have
|
|
@@ -967,7 +1082,8 @@ class Distance:
|
|
|
967
1082
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
968
1083
|
1(4), 300-307.
|
|
969
1084
|
|
|
970
|
-
Notes
|
|
1085
|
+
Notes
|
|
1086
|
+
-----
|
|
971
1087
|
Pearson chi-square divergence is asymmetric.
|
|
972
1088
|
"""
|
|
973
1089
|
u, v = np.asarray(u), np.asarray(v)
|
|
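The asymmetry mentioned in the Pearson chi-square note can be seen by swapping the arguments. A minimal sketch, assuming the divergence divides by the second argument, sum((u - v)^2 / v), and that the second argument has no zero entries; the name `pearson_chisq` is used for illustration only.

```python
import numpy as np


def pearson_chisq(u, v):
    # Assumed definition: squared differences weighted by the second argument.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sum((u - v) ** 2 / v)


u = [0.2, 0.5, 0.3]
v = [0.1, 0.1, 0.8]

d_uv = pearson_chisq(u, v)
d_vu = pearson_chisq(v, u)
print(d_uv, d_vu)
```

The two values differ, so the order of the arguments matters when this divergence is used to compare a sample against a reference.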
@@ -978,13 +1094,16 @@ class Distance:
|
|
|
978
1094
|
"""
|
|
979
1095
|
Calculate the Penrose shape distance between two vectors.
|
|
980
1096
|
|
|
981
|
-
Parameters
|
|
1097
|
+
Parameters
|
|
1098
|
+
----------
|
|
982
1099
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
983
1100
|
|
|
984
|
-
Returns
|
|
1101
|
+
Returns
|
|
1102
|
+
-------
|
|
985
1103
|
- The Penrose shape distance between the two vectors.
|
|
986
1104
|
|
|
987
|
-
References
|
|
1105
|
+
References
|
|
1106
|
+
----------
|
|
988
1107
|
1. Deza M, Deza E (2009) Encyclopedia of Distances.
|
|
989
1108
|
Springer-Verlag Berlin Heidelberg. 1-590.
|
|
990
1109
|
"""
|
|
@@ -997,13 +1116,16 @@ class Distance:
|
|
|
997
1116
|
"""
|
|
998
1117
|
Calculate the Probabilistic chi-square distance between two vectors.
|
|
999
1118
|
|
|
1000
|
-
Parameters
|
|
1119
|
+
Parameters
|
|
1120
|
+
----------
|
|
1001
1121
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1002
1122
|
|
|
1003
|
-
Returns
|
|
1123
|
+
Returns
|
|
1124
|
+
-------
|
|
1004
1125
|
- The Probabilistic chi-square distance between the two vectors.
|
|
1005
1126
|
|
|
1006
|
-
Notes
|
|
1127
|
+
Notes
|
|
1128
|
+
-----
|
|
1007
1129
|
Added by SC.
|
|
1008
1130
|
"""
|
|
1009
1131
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -1015,13 +1137,16 @@ class Distance:
|
|
|
1015
1137
|
"""
|
|
1016
1138
|
Calculate the Ruzicka distance between two vectors.
|
|
1017
1139
|
|
|
1018
|
-
Parameters
|
|
1140
|
+
Parameters
|
|
1141
|
+
----------
|
|
1019
1142
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1020
1143
|
|
|
1021
|
-
Returns
|
|
1144
|
+
Returns
|
|
1145
|
+
-------
|
|
1022
1146
|
- The Ruzicka distance between the two vectors.
|
|
1023
1147
|
|
|
1024
|
-
Notes
|
|
1148
|
+
Notes
|
|
1149
|
+
-----
|
|
1025
1150
|
Added by SC.
|
|
1026
1151
|
"""
|
|
1027
1152
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -1033,13 +1158,16 @@ class Distance:
|
|
|
1033
1158
|
"""
|
|
1034
1159
|
Calculate the Sorensen distance between two vectors.
|
|
1035
1160
|
|
|
1036
|
-
Parameters
|
|
1161
|
+
Parameters
|
|
1162
|
+
----------
|
|
1037
1163
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1038
1164
|
|
|
1039
|
-
Returns
|
|
1165
|
+
Returns
|
|
1166
|
+
-------
|
|
1040
1167
|
- The Sorensen distance between the two vectors.
|
|
1041
1168
|
|
|
1042
|
-
Notes
|
|
1169
|
+
Notes
|
|
1170
|
+
-----
|
|
1043
1171
|
The Sorensen distance equals the Manhattan distance divided by the sum of all elements of the two vectors.
|
|
1044
1172
|
|
|
1045
1173
|
Added by SC.
|
|
@@ -1051,16 +1179,20 @@ class Distance:
|
|
|
1051
1179
|
"""
|
|
1052
1180
|
Calculate the Soergel distance between two vectors.
|
|
1053
1181
|
|
|
1054
|
-
Parameters
|
|
1182
|
+
Parameters
|
|
1183
|
+
----------
|
|
1055
1184
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1056
1185
|
|
|
1057
|
-
Returns
|
|
1186
|
+
Returns
|
|
1187
|
+
-------
|
|
1058
1188
|
- The Soergel distance between the two vectors.
|
|
1059
1189
|
|
|
1060
|
-
Notes
|
|
1190
|
+
Notes
|
|
1191
|
+
-----
|
|
1061
1192
|
Equals Tanimoto distance.
|
|
1062
1193
|
|
|
1063
|
-
References
|
|
1194
|
+
References
|
|
1195
|
+
----------
|
|
1064
1196
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
1065
1197
|
Measures between Probability Density Functions. International
|
|
1066
1198
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -1073,13 +1205,16 @@ class Distance:
|
|
|
1073
1205
|
"""
|
|
1074
1206
|
Calculate the Squared chi-square distance between two vectors.
|
|
1075
1207
|
|
|
1076
|
-
Parameters
|
|
1208
|
+
Parameters
|
|
1209
|
+
----------
|
|
1077
1210
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1078
1211
|
|
|
1079
|
-
Returns
|
|
1212
|
+
Returns
|
|
1213
|
+
-------
|
|
1080
1214
|
- The Squared chi-square distance between the two vectors.
|
|
1081
1215
|
|
|
1082
|
-
References
|
|
1216
|
+
References
|
|
1217
|
+
----------
|
|
1083
1218
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
1084
1219
|
Measures between Probability Density Functions. International
|
|
1085
1220
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -1094,13 +1229,16 @@ class Distance:
|
|
|
1094
1229
|
"""
|
|
1095
1230
|
Calculate the Squared-chord distance between two vectors.
|
|
1096
1231
|
|
|
1097
|
-
Parameters
|
|
1232
|
+
Parameters
|
|
1233
|
+
----------
|
|
1098
1234
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1099
1235
|
|
|
1100
|
-
Returns
|
|
1236
|
+
Returns
|
|
1237
|
+
-------
|
|
1101
1238
|
- The Squared-chord distance between the two vectors.
|
|
1102
1239
|
|
|
1103
|
-
References
|
|
1240
|
+
References
|
|
1241
|
+
----------
|
|
1104
1242
|
1. Gavin DG et al. (2003) A statistical approach to evaluating
|
|
1105
1243
|
distance metrics and analog assignments for pollen records.
|
|
1106
1244
|
Quaternary Research 60:356–367.
|
|
@@ -1109,7 +1247,8 @@ class Distance:
|
|
|
1109
1247
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
1110
1248
|
1(4), 300-307.
|
|
1111
1249
|
|
|
1112
|
-
Notes
|
|
1250
|
+
Notes
|
|
1251
|
+
-----
|
|
1113
1252
|
Equals the squared Matusita distance.
|
|
1114
1253
|
"""
|
|
1115
1254
|
u, v = np.asarray(u), np.asarray(v)
|
|
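The Squared-chord note relates it to the Matusita distance. A minimal sketch of that relation, assuming the Squared-chord definition sum((sqrt(u) - sqrt(v))^2) and the Matusita form sqrt(2 - 2 sum(sqrt(u v))) from Cha (2007), which is valid for non-negative vectors that each sum to 1; the function names are illustrative.

```python
import numpy as np


def squared_chord(u, v):
    # Assumed definition: squared differences of element-wise square roots.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sum((np.sqrt(u) - np.sqrt(v)) ** 2)


def matusita(u, v):
    # Closed form for normalized inputs (each vector sums to 1).
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sqrt(2.0 - 2.0 * np.sum(np.sqrt(u * v)))


u = [0.2, 0.5, 0.3]
v = [0.1, 0.1, 0.8]

d_sc = squared_chord(u, v)
d_mat_sq = matusita(u, v) ** 2
print(d_sc, d_mat_sq)
```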
@@ -1119,13 +1258,16 @@ class Distance:
|
|
|
1119
1258
|
"""
|
|
1120
1259
|
Calculate the Squared Euclidean distance between two vectors.
|
|
1121
1260
|
|
|
1122
|
-
Parameters
|
|
1261
|
+
Parameters
|
|
1262
|
+
----------
|
|
1123
1263
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1124
1264
|
|
|
1125
|
-
Returns
|
|
1265
|
+
Returns
|
|
1266
|
+
-------
|
|
1126
1267
|
- The Squared Euclidean distance between the two vectors.
|
|
1127
1268
|
|
|
1128
|
-
References
|
|
1269
|
+
References
|
|
1270
|
+
----------
|
|
1129
1271
|
1. Gavin DG et al. (2003) A statistical approach to evaluating
|
|
1130
1272
|
distance metrics and analog assignments for pollen records.
|
|
1131
1273
|
Quaternary Research 60:356–367.
|
|
@@ -1134,7 +1276,8 @@ class Distance:
|
|
|
1134
1276
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
1135
1277
|
1(4), 300-307.
|
|
1136
1278
|
|
|
1137
|
-
Notes
|
|
1279
|
+
Notes
|
|
1280
|
+
-----
|
|
1138
1281
|
Equals the square of the Euclidean distance.
|
|
1139
1282
|
"""
|
|
1140
1283
|
u, v = np.asarray(u), np.asarray(v)
|
|
@@ -1144,13 +1287,16 @@ class Distance:
|
|
|
1144
1287
|
"""
|
|
1145
1288
|
Calculate the Taneja distance between two vectors.
|
|
1146
1289
|
|
|
1147
|
-
Parameters
|
|
1290
|
+
Parameters
|
|
1291
|
+
----------
|
|
1148
1292
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1149
1293
|
|
|
1150
|
-
Returns
|
|
1294
|
+
Returns
|
|
1295
|
+
-------
|
|
1151
1296
|
- The Taneja distance between the two vectors.
|
|
1152
1297
|
|
|
1153
|
-
References
|
|
1298
|
+
References
|
|
1299
|
+
----------
|
|
1154
1300
|
1. Taneja IJ. (1995), New Developments in Generalized Information
|
|
1155
1301
|
Measures, Chapter in: Advances in Imaging and Electron Physics,
|
|
1156
1302
|
Ed. P.W. Hawkes, 91, 37-135.
|
|
@@ -1169,19 +1315,23 @@ class Distance:
|
|
|
1169
1315
|
"""
|
|
1170
1316
|
Calculate the Tanimoto distance between two vectors.
|
|
1171
1317
|
|
|
1172
|
-
Parameters
|
|
1318
|
+
Parameters
|
|
1319
|
+
----------
|
|
1173
1320
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1174
1321
|
|
|
1175
|
-
Returns
|
|
1322
|
+
Returns
|
|
1323
|
+
-------
|
|
1176
1324
|
- The Tanimoto distance between the two vectors.
|
|
1177
1325
|
|
|
1178
|
-
References
|
|
1326
|
+
References
|
|
1327
|
+
----------
|
|
1179
1328
|
1. Sung-Hyuk C. (2007) Comprehensive Survey on Distance/Similarity
|
|
1180
1329
|
Measures between Probability Density Functions. International
|
|
1181
1330
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
1182
1331
|
1(4), 300-307.
|
|
1183
1332
|
|
|
1184
|
-
Notes
|
|
1333
|
+
Notes
|
|
1334
|
+
-----
|
|
1185
1335
|
Equals Soergel distance.
|
|
1186
1336
|
"""
|
|
1187
1337
|
u, v = np.asarray(u), np.asarray(v)
|
|
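The Soergel and Tanimoto docstrings each note that the two distances are equal. A minimal sketch of why, assuming the definitions from Cha (2007); the function names are illustrative and not the package's API.

```python
import numpy as np


def soergel(u, v):
    # Assumed definition: Manhattan distance over the sum of element-wise maxima.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return np.sum(np.abs(u - v)) / np.sum(np.maximum(u, v))


def tanimoto(u, v):
    # Assumed definition, written in terms of sums and element-wise minima.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    s_min = np.sum(np.minimum(u, v))
    total = np.sum(u) + np.sum(v)
    return (total - 2.0 * s_min) / (total - s_min)


u = [1.0, 2.0, 3.0]
v = [2.0, 2.0, 5.0]

d_soergel = soergel(u, v)
d_tanimoto = tanimoto(u, v)
print(d_soergel, d_tanimoto)
```

The equality follows from |a - b| = a + b - 2 min(a, b) and max(a, b) = a + b - min(a, b), applied element-wise.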
@@ -1195,19 +1345,23 @@ class Distance:
|
|
|
1195
1345
|
"""
|
|
1196
1346
|
Calculate the Topsøe distance between two vectors.
|
|
1197
1347
|
|
|
1198
|
-
Parameters
|
|
1348
|
+
Parameters
|
|
1349
|
+
----------
|
|
1199
1350
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1200
1351
|
|
|
1201
|
-
Returns
|
|
1352
|
+
Returns
|
|
1353
|
+
-------
|
|
1202
1354
|
- The Topsøe distance between the two vectors.
|
|
1203
1355
|
|
|
1204
|
-
References
|
|
1356
|
+
References
|
|
1357
|
+
----------
|
|
1205
1358
|
1. Sung-Hyuk C (2007) Comprehensive Survey on Distance/Similarity
|
|
1206
1359
|
Measures between Probability Density Functions. International
|
|
1207
1360
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
1208
1361
|
1(4), 300-307.
|
|
1209
1362
|
|
|
1210
|
-
Notes
|
|
1363
|
+
Notes
|
|
1364
|
+
-----
|
|
1211
1365
|
Equals two times Jensen-Shannon divergence.
|
|
1212
1366
|
"""
|
|
1213
1367
|
u, v = np.asarray(u), np.asarray(v)
|
|
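The Topsøe note (twice the Jensen-Shannon divergence) can be checked numerically. A minimal sketch, assuming natural logarithms, strictly positive inputs, and the standard definitions; the helper names `kl`, `jensen_shannon`, and `topsoe` are illustrative.

```python
import numpy as np


def kl(p, q):
    # Kullback-Leibler divergence for strictly positive p and q.
    return np.sum(p * np.log(p / q))


def jensen_shannon(u, v):
    # Average KL divergence of u and v to their midpoint.
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    m = 0.5 * (u + v)
    return 0.5 * kl(u, m) + 0.5 * kl(v, m)


def topsoe(u, v):
    # Assumed definition from Cha (2007).
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    s = u + v
    return np.sum(u * np.log(2.0 * u / s) + v * np.log(2.0 * v / s))


u = [0.2, 0.5, 0.3]
v = [0.1, 0.1, 0.8]

d_topsoe = topsoe(u, v)
d_js = jensen_shannon(u, v)
print(d_topsoe, 2.0 * d_js)
```

Zero entries would need separate handling (for example, treating 0 log 0 as 0), which this sketch does not attempt.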
@@ -1221,13 +1375,16 @@ class Distance:
|
|
|
1221
1375
|
"""
|
|
1222
1376
|
Calculate the Vicis Symmetric chi-square distance between two vectors.
|
|
1223
1377
|
|
|
1224
|
-
Parameters
|
|
1378
|
+
Parameters
|
|
1379
|
+
----------
|
|
1225
1380
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1226
1381
|
|
|
1227
|
-
Returns
|
|
1382
|
+
Returns
|
|
1383
|
+
-------
|
|
1228
1384
|
- The Vicis Symmetric chi-square distance between the two vectors.
|
|
1229
1385
|
|
|
1230
|
-
References
|
|
1386
|
+
References
|
|
1387
|
+
----------
|
|
1231
1388
|
1. Sung-Hyuk C (2007) Comprehensive Survey on Distance/Similarity
|
|
1232
1389
|
Measures between Probability Density Functions. International
|
|
1233
1390
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -1243,13 +1400,16 @@ class Distance:
|
|
|
1243
1400
|
"""
|
|
1244
1401
|
Calculate the Vicis-Wave Hedges distance between two vectors.
|
|
1245
1402
|
|
|
1246
|
-
Parameters
|
|
1403
|
+
Parameters
|
|
1404
|
+
----------
|
|
1247
1405
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1248
1406
|
|
|
1249
|
-
Returns
|
|
1407
|
+
Returns
|
|
1408
|
+
-------
|
|
1250
1409
|
- The Vicis-Wave Hedges distance between the two vectors.
|
|
1251
1410
|
|
|
1252
|
-
References
|
|
1411
|
+
References
|
|
1412
|
+
----------
|
|
1253
1413
|
1. Sung-Hyuk C (2007) Comprehensive Survey on Distance/Similarity
|
|
1254
1414
|
Measures between Probability Density Functions. International
|
|
1255
1415
|
Journal of Mathematical Models and Methods in Applied Sciences.
|
|
@@ -1265,13 +1425,16 @@ class Distance:
|
|
|
1265
1425
|
"""
|
|
1266
1426
|
Calculate the Wave Hedges distance between two vectors.
|
|
1267
1427
|
|
|
1268
|
-
Parameters
|
|
1428
|
+
Parameters
|
|
1429
|
+
----------
|
|
1269
1430
|
- u, v: Input vectors between which the distance is to be calculated.
|
|
1270
1431
|
|
|
1271
|
-
Returns
|
|
1432
|
+
Returns
|
|
1433
|
+
-------
|
|
1272
1434
|
- The Wave Hedges distance between the two vectors.
|
|
1273
1435
|
|
|
1274
|
-
References
|
|
1436
|
+
References
|
|
1437
|
+
----------
|
|
1275
1438
|
1. Sung-Hyuk C (2007) Comprehensive Survey on Distance/Similarity
|
|
1276
1439
|
Measures between Probability Density Functions. International
|
|
1277
1440
|
Journal of Mathematical Models and Methods in Applied Sciences.
|