RubyGems - tomoto - Versions diffs - 0.6.0-aarch64-linux → 0.6.2-aarch64-linux - Mend

tomoto 0.6.0-aarch64-linux → 0.6.2-aarch64-linux

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.

Files changed (33)

checksums.yaml +4 -4
data/CHANGELOG.md +9 -0
data/LICENSE.txt +1 -1
data/ext/tomoto/ct.cpp +1 -1
data/ext/tomoto/dmr.cpp +1 -1
data/ext/tomoto/dt.cpp +1 -1
data/ext/tomoto/extconf.rb +4 -8
data/ext/tomoto/gdmr.cpp +1 -1
data/ext/tomoto/hdp.cpp +1 -1
data/ext/tomoto/hlda.cpp +1 -1
data/ext/tomoto/hpa.cpp +1 -1
data/ext/tomoto/lda.cpp +29 -3
data/ext/tomoto/llda.cpp +1 -1
data/ext/tomoto/mglda.cpp +1 -1
data/ext/tomoto/pa.cpp +1 -1
data/ext/tomoto/plda.cpp +1 -1
data/ext/tomoto/slda.cpp +1 -1
data/lib/tomoto/3.2/tomoto.so +0 -0
data/lib/tomoto/3.3/tomoto.so +0 -0
data/lib/tomoto/3.4/tomoto.so +0 -0
data/lib/tomoto/4.0/tomoto.so +0 -0
data/lib/tomoto/lda.rb +1 -0
data/lib/tomoto/version.rb +1 -1
data/vendor/EigenRand/EigenRand/EigenRand +4 -4
data/vendor/EigenRand/README.md +60 -272
data/vendor/tomotopy/README.kr.rst +27 -5
data/vendor/tomotopy/README.rst +27 -5
data/vendor/tomotopy/README_pypi.rst +583 -0
data/vendor/tomotopy/licenses_bundled/EigenRand +21 -0
metadata +6 -6
data/vendor/variant/LICENSE +0 -25
data/vendor/variant/LICENSE_1_0.txt +0 -23
data/vendor/variant/README.md +0 -102
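
The file list above follows a regular `path +added -removed` pattern, so it can be summarised mechanically. The following is a minimal sketch, assuming that exact one-entry-per-line layout; the names `FileStat` and `parse_file_list` are hypothetical helpers introduced here for illustration and are not part of any tool referenced on this page. The sample input is three entries copied from the list.

```python
import re
from typing import NamedTuple


class FileStat(NamedTuple):
    """One entry of the 'Files changed' list (hypothetical helper type)."""
    path: str
    added: int
    removed: int


# Assumed layout: "<path> +<added> -<removed>", one entry per line.
_ENTRY = re.compile(r"^(?P<path>\S+) \+(?P<added>\d+) -(?P<removed>\d+)$")


def parse_file_list(text: str) -> list[FileStat]:
    """Parse every line that matches the assumed layout; skip the rest."""
    stats = []
    for line in text.splitlines():
        match = _ENTRY.match(line.strip())
        if match:
            stats.append(
                FileStat(match["path"], int(match["added"]), int(match["removed"]))
            )
    return stats


# Three entries copied from the list above.
sample = """\
data/ext/tomoto/lda.cpp +29 -3
data/lib/tomoto/3.2/tomoto.so +0 -0
data/vendor/variant/README.md +0 -102
"""

stats = parse_file_list(sample)
total_added = sum(s.added for s in stats)
total_removed = sum(s.removed for s in stats)
# Entries reported as +0 -0 carry no line-level counts in this view.
no_line_counts = [s.path for s in stats if s.added == 0 and s.removed == 0]

print(f"{len(stats)} files, +{total_added} -{total_removed}")
print("no line counts:", no_line_counts)
```

Entries reported as `+0 -0` (here the prebuilt `tomoto.so` files) show no line-level counts, so a summary like this says nothing about whether their contents changed.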

data/vendor/EigenRand/README.md CHANGED

@@ -16,7 +16,7 @@ You can get 5~10 times speed by just replacing old Eigen's Random or unvectoriza
 * 5~10 times faster than non-vectorized functions
 * Header-only (like Eigen)
 * Can be easily integrated with Eigen's expressions
-* Currently supports only x86, x86-64(up to AVX2), and ARM64 NEON (experimental) architecture.
+* Currently supports only x86, x86-64(up to AVX2), and ARM64 NEON architecture.
 ## Requirement
@@ -46,6 +46,13 @@ You can specify additional compiler arguments including target machine options (
 $ cmake -DCMAKE_BUILD_TYPE=Release -DEIGENRAND_CXX_FLAGS="-march=native" ..
 ```
+Alternatively cmake preset with cmake 3.21 or later can be used to compile EigenRand which also integrates nicely in VSCode
+```console
+cmake --preset default
+cmake --build --preset default
+ctest --preset default
+```
 ## Documentation
 https://bab2min.github.io/eigenrand/
@@ -54,33 +61,37 @@ https://bab2min.github.io/eigenrand/
 ### Random distributions for real types
-| Function | Generator | Scalar Type | Description | Equivalent to |
-|:---:|:---:|:---:|:---:|:---:|
-| `Eigen::Rand::balanced` | `Eigen::Rand::BalancedGen` | float, double | generates real values in the [-1, 1] range | `Eigen::DenseBase<Ty>::Random` for floating point types |
-| `Eigen::Rand::beta` | `Eigen::Rand::BetaGen` | float, double | generates real values on a [beta distribution](https://en.wikipedia.org/wiki/Beta_distribution) |  |
-| `Eigen::Rand::cauchy` | `Eigen::Rand::CauchyGen` | float, double | generates real values on the [Cauchy distribution](https://en.wikipedia.org/wiki/Cauchy_distribution). | `std::cauchy_distribution` |
-| `Eigen::Rand::chiSquared` | `Eigen::Rand::ChiSquaredGen` | float, double | generates real values on a [chi-squared distribution](https://en.wikipedia.org/wiki/Chi-squared_distribution). | `std::chi_squared_distribution` |
-| `Eigen::Rand::exponential` | `Eigen::Rand::ExponentialGen` | float, double | generates real values on an [exponential distribution](https://en.wikipedia.org/wiki/Exponential_distribution). | `std::exponential_distribution` |
-| `Eigen::Rand::extremeValue` | `Eigen::Rand::ExtremeValueGen` | float, double | generates real values on an [extreme value distribution](https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution). | `std::extreme_value_distribution` |
-| `Eigen::Rand::fisherF` | `Eigen::Rand::FisherFGen` | float, double | generates real values on the [Fisher's F distribution](https://en.wikipedia.org/wiki/F_distribution). | `std::fisher_f_distribution` |
-| `Eigen::Rand::gamma` | `Eigen::Rand::GammaGen` | float, double | generates real values on a [gamma distribution](https://en.wikipedia.org/wiki/Gamma_distribution). | `std::gamma_distribution` |
-| `Eigen::Rand::lognormal` | `Eigen::Rand::LognormalGen` | float, double | generates real values on a [lognormal distribution](https://en.wikipedia.org/wiki/Lognormal_distribution). | `std::lognormal_distribution` |
-| `Eigen::Rand::normal` | `Eigen::Rand::StdNormalGen`, `Eigen::Rand::NormalGen` | float, double | generates real values on a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution). | `std::normal_distribution` |
-| `Eigen::Rand::studentT` | `Eigen::Rand::StudentTGen` | float, double | generates real values on the [Student's t distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution). | `std::student_t_distribution` |
-| `Eigen::Rand::uniformReal` | `Eigen::Rand::UniformRealGen` | float, double | generates real values in the `[0, 1)` range. | `std::generate_canonical` |
-| `Eigen::Rand::weibull` | `Eigen::Rand::WeibullGen` | float, double | generates real values on the [Weibull distribution](https://en.wikipedia.org/wiki/Weibull_distribution). | `std::weibull_distribution` |
+| Function | Generator | Scalar Type | VoP | Description | Equivalent to |
+|:---:|:---:|:---:|:---:|:---:|:---:|
+| `Eigen::Rand::balanced` | `Eigen::Rand::BalancedGen` | float, double | Yes | generates real values in the [-1, 1] range | `Eigen::DenseBase<Ty>::Random` for floating point types |
+| `Eigen::Rand::beta` | `Eigen::Rand::BetaGen` | float, double | | generates real values on a [beta distribution](https://en.wikipedia.org/wiki/Beta_distribution) |  |
+| `Eigen::Rand::cauchy` | `Eigen::Rand::CauchyGen` | float, double | Yes | generates real values on the [Cauchy distribution](https://en.wikipedia.org/wiki/Cauchy_distribution). | `std::cauchy_distribution` |
+| `Eigen::Rand::chiSquared` | `Eigen::Rand::ChiSquaredGen` | float, double | | generates real values on a [chi-squared distribution](https://en.wikipedia.org/wiki/Chi-squared_distribution). | `std::chi_squared_distribution` |
+| `Eigen::Rand::exponential` | `Eigen::Rand::ExponentialGen` | float, double | Yes | generates real values on an [exponential distribution](https://en.wikipedia.org/wiki/Exponential_distribution). | `std::exponential_distribution` |
+| `Eigen::Rand::extremeValue` | `Eigen::Rand::ExtremeValueGen` | float, double | Yes | generates real values on an [extreme value distribution](https://en.wikipedia.org/wiki/Generalized_extreme_value_distribution). | `std::extreme_value_distribution` |
+| `Eigen::Rand::fisherF` | `Eigen::Rand::FisherFGen` | float, double | | generates real values on the [Fisher's F distribution](https://en.wikipedia.org/wiki/F_distribution). | `std::fisher_f_distribution` |
+| `Eigen::Rand::gamma` | `Eigen::Rand::GammaGen` | float, double | | generates real values on a [gamma distribution](https://en.wikipedia.org/wiki/Gamma_distribution). | `std::gamma_distribution` |
+| `Eigen::Rand::lognormal` | `Eigen::Rand::LognormalGen` | float, double | Yes | generates real values on a [lognormal distribution](https://en.wikipedia.org/wiki/Lognormal_distribution). | `std::lognormal_distribution` |
+| `Eigen::Rand::normal` | `Eigen::Rand::StdNormalGen`, `Eigen::Rand::NormalGen` | float, double | Yes | generates real values on a [normal distribution](https://en.wikipedia.org/wiki/Normal_distribution). | `std::normal_distribution` |
+| `Eigen::Rand::studentT` | `Eigen::Rand::StudentTGen` | float, double | Yes | generates real values on the [Student's t distribution](https://en.wikipedia.org/wiki/Student%27s_t-distribution). | `std::student_t_distribution` |
+| `Eigen::Rand::uniformReal` | `Eigen::Rand::UniformRealGen` | float, double | Yes | generates real values in the `[0, 1)` range. | `std::generate_canonical` |
+| `Eigen::Rand::weibull` | `Eigen::Rand::WeibullGen` | float, double | Yes | generates real values on the [Weibull distribution](https://en.wikipedia.org/wiki/Weibull_distribution). | `std::weibull_distribution` |
+* VoP indicates 'Vectorization over Parameters'.
 ### Random distributions for integer types
-| Function | Generator | Scalar Type | Description | Equivalent to |
-|:---:|:---:|:---:|:---:|:---:|
-| `Eigen::Rand::binomial` | `Eigen::Rand::BinomialGen` | int | generates integers on a [binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution). | `std::binomial_distribution` |
-| `Eigen::Rand::discrete` | `Eigen::Rand::DiscreteGen` | int | generates random integers on a discrete distribution. | `std::discrete_distribution` |
-| `Eigen::Rand::geometric` | `Eigen::Rand::GeometricGen` | int | generates integers on a [geometric distribution](https://en.wikipedia.org/wiki/Geometric_distribution). | `std::geometric_distribution` |
-| `Eigen::Rand::negativeBinomial` | `Eigen::Rand::NegativeBinomialGen` | int | generates integers on a [negative binomial distribution](https://en.wikipedia.org/wiki/Negative_binomial_distribution). | `std::negative_binomial_distribution` |
-| `Eigen::Rand::poisson` | `Eigen::Rand::PoissonGen` | int | generates integers on the [Poisson distribution](https://en.wikipedia.org/wiki/Poisson_distribution). | `std::poisson_distribution` |
-| `Eigen::Rand::randBits` | `Eigen::Rand::RandbitsGen` | int | generates integers with random bits. | `Eigen::DenseBase<Ty>::Random` for integer types |
-| `Eigen::Rand::uniformInt` | `Eigen::Rand::UniformIntGen` | int | generates integers in the `[min, max]` range. | `std::uniform_int_distribution` |
+| Function | Generator | Scalar Type | VoP | Description | Equivalent to |
+|:---:|:---:|:---:|:---:|:---:|:---:|
+| `Eigen::Rand::binomial` | `Eigen::Rand::BinomialGen` | int | Yes | generates integers on a [binomial distribution](https://en.wikipedia.org/wiki/Binomial_distribution). | `std::binomial_distribution` |
+| `Eigen::Rand::discrete` | `Eigen::Rand::DiscreteGen` | int | | generates random integers on a discrete distribution. | `std::discrete_distribution` |
+| `Eigen::Rand::geometric` | `Eigen::Rand::GeometricGen` | int | | generates integers on a [geometric distribution](https://en.wikipedia.org/wiki/Geometric_distribution). | `std::geometric_distribution` |
+| `Eigen::Rand::negativeBinomial` | `Eigen::Rand::NegativeBinomialGen` | int | | generates integers on a [negative binomial distribution](https://en.wikipedia.org/wiki/Negative_binomial_distribution). | `std::negative_binomial_distribution` |
+| `Eigen::Rand::poisson` | `Eigen::Rand::PoissonGen` | int | | generates integers on the [Poisson distribution](https://en.wikipedia.org/wiki/Poisson_distribution). | `std::poisson_distribution` |
+| `Eigen::Rand::randBits` | `Eigen::Rand::RandbitsGen` | int | | generates integers with random bits. | `Eigen::DenseBase<Ty>::Random` for integer types |
+| `Eigen::Rand::uniformInt` | `Eigen::Rand::UniformIntGen` | int | | generates integers in the `[min, max]` range. | `std::uniform_int_distribution` |
+* VoP indicates 'Vectorization over Parameters'.
 ### Multivariate distributions for real vectors and matrices
@@ -101,259 +112,27 @@ https://bab2min.github.io/eigenrand/
 | `Eigen::Rand::P8_mt19937_64` | a vectorized version of Mersenne Twister algorithm. Since it generates eight 64bit random integers simultaneously, the random values are the same regardless of architecture. | |
 ## Performance
-The following charts show the relative speed-up of EigenRand compared to references(equivalent functions of C++ std or Eigen).
-![Perf_no_vect](/doxygen/images/perf_no_vect.png)
-![Perf_no_vect](/doxygen/images/perf_sse2.png)
-![Perf_no_vect](/doxygen/images/perf_avx.png)
-![Perf_no_vect](/doxygen/images/perf_avx2.png)
-The following charts are about multivariate distributions.
-![Perf_no_vect](/doxygen/images/perf_mv_part1.png)
-![Perf_no_vect](/doxygen/images/perf_mv_part2.png)
-The following result is a measure of the time in seconds it takes to generate 1M random numbers.
-It shows the average of 20 times.
-### Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz (Ubuntu 16.04, gcc5.4)
-|  | C++ std (or Eigen) | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (SSSE3) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|---:|---:|
-| `balanced`* | 9.0 | 5.9 | 1.5 | 1.4 | 1.3 | 0.9 |
-| `balanced`(double)* | 8.7 | 6.4 | 3.3 | 2.9 | 1.7 | 1.7 |
-| `binomial(20, 0.5)` | 400.8 | 118.5 | 32.7 | 36.6 | 30.0 | 22.7 |
-| `binomial(50, 0.01)` | 71.7 | 22.5 | 7.7 | 8.3 | 7.9 | 6.6 |
-| `binomial(100, 0.75)` | 340.5 | 454.5 | 91.7 | 111.5 | 106.3 | 86.4 |
-| `cauchy` | 36.1 | 54.4 | 6.1 | 7.1 | 4.7 | 3.9 |
-| `chiSquared` | 80.5 | 249.5 | 64.6 | 58.0 | 29.4 | 28.8 |
-| `discrete`(int32) | - | 14.0 | 2.9 | 2.6 | 2.4 | 1.7 |
-| `discrete`(fp32) | - | 21.9 | 4.3 | 4.0 | 3.6 | 3.0 |
-| `discrete`(fp64) | 72.4 | 21.4 | 6.9 | 6.5 | 4.9 | 3.7 |
-| `exponential` | 31.0 | 25.3 | 5.5 | 5.3 | 3.3 | 2.9 |
-| `extremeValue` | 66.0 | 60.1 | 11.9 | 10.7 | 6.5 | 5.8 |
-| `fisherF(1, 1)` | 178.1 | 35.1 | 33.2 | 39.3 | 22.9 | 18.7 |
-| `fisherF(5, 5)` | 141.8 | 415.2 | 136.47 | 172.4 | 92.4 | 74.9 |
-| `gamma(0.2, 1)` | 207.8 | 211.4 | 54.6 | 51.2 | 26.9 | 27.0 |
-| `gamma(5, 3)` | 80.9 | 60.0 | 14.3 | 13.3 | 11.4 | 8.0 |
-| `gamma(10.5, 1)` | 81.1 | 248.6 | 63.3 | 58.5 | 29.2 | 28.4 |
-| `geometric` | 43.0 | 22.4 | 6.7 | 7.4 | 5.8 |  |
-| `lognormal` | 66.3 | 55.4 | 12.8 | 11.8 | 6.2 | 6.2 |
-| `negativeBinomial(10, 0.5)` | 312.0 | 301.4 | 82.9 | 100.6 | 95.3 | 77.9 |
-| `negativeBinomial(20, 0.25)` | 483.4 | 575.9 | 125.0 | 158.2 | 148.4 | 119.5 |
-| `normal(0, 1)` | 38.1 | 28.5 | 6.8 | 6.2 | 3.8 | 3.7 |
-| `normal(2, 3)` | 37.6 | 29.0 | 7.3 | 6.6 | 4.0 | 3.9 |
-| `poisson(1)` | 31.8 | 25.2 | 9.8 | 10.8 | 9.7 | 8.2 |
-| `poisson(16)` | 231.8 | 274.1 | 66.2 | 80.7 | 74.4 | 64.2 |
-| `randBits` | 5.2 | 5.4 | 1.4 | 1.3 | 1.1 | 1.0 |
-| `studentT(1)` | 122.7 | 120.1 | 15.3 | 19.2 | 12.6 | 9.4 |
-| `studentT(20)` | 102.2 | 111.1 | 15.4 | 19.2 | 12.2 | 9.4 |
-| `uniformInt(0~63)` | 22.4 | 4.7 | 1.7 | 1.6 | 1.4 | 1.1 |
-| `uniformInt(0~100k)` | 21.8 | 10.1 | 6.2 | 6.7 | 6.6 | 5.4 |
-| `uniformReal` | 12.9 | 5.7 | 1.4 | 1.2 | 1.4 | 0.7 |
-| `weibull` | 41.0 | 35.8 | 17.7 | 15.5 | 8.5 | 8.5 |
-* Since there is no equivalent class to `balanced` in C++11 std, we used Eigen::DenseBase::Random instead.
-|  | C++ std | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (SSSE3) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|---:|---:|
-| Mersenne Twister(int32) | 4.7 | 5.6 | 4.0 | 3.7 | 3.5 | 3.6 |
-| Mersenne Twister(int64) | 5.4 | 5.3 | 4.0 | 3.9 | 3.4 | 2.6 |
-|  | Python 3.6 + scipy 1.5.2 + numpy 1.19.2 | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (SSSE3) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|---:|---:|
-| `Dirichlet(4)` | 6.47 | 6.60 | 2.39 | 2.49 | 1.34 | 1.67 |
-| `Dirichlet(100)` | 75.95 | 189.97 | 66.60 | 72.11 | 38.86 | 34.98 |
-| `InvWishart(4)` | 140.18 | 7.62 | 4.21 | 4.54 | 3.58 | 3.39 |
-| `InvWishart(50)` | 1510.47 | 1737.4 | 697.39 | 733.69 | 604.59 | 554.006 |
-| `Multinomial(4, t=20)` | 3.32 | 4.12 | 0.95 | 1.06 | 1.00 | 1.03 |
-| `Multinomial(4, t=1000)` | 3.51 | 192.51 | 35.99 | 39.58 | 27.84 | 35.45 |
-| `Multinomial(100, t=20)` | 69.19 | 4.80 | 2.00 | 2.20 | 2.28 | 2.09 |
-| `Multinomial(100, t=1000)` | 139.74 | 179.43 | 49.48 | 56.19 | 40.78 | 43.18 |
-| `MvNormal(4)` | 2.32 | 0.96 | 0.36 | 0.37 | 0.25 | 0.30 |
-| `MvNormal(100)` | 49.09 | 57.18 | 17.17 | 18.51 | 10.82 | 11.03 |
-| `Wishart(4)` | 71.19 | 5.28 | 2.70 | 2.93 | 2.04 | 1.94 |
-| `Wishart(50)` | 1185.26 | 1360.49 | 492.91 | 517.44 | 359.03 | 324.60 |
-### Intel(R) Xeon(R) CPU E5-1650 v2 @ 3.50GHz (macOS 10.15, clang-1103)
-|  | C++ std (or Eigen) | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (SSSE3) | EigenRand (AVX) |
-|---|---:|---:|---:|---:|---:|
-| `balanced`* | 6.5 | 7.3 | 1.1 | 1.4 | 1.1 |
-| `balanced`(double)* | 6.6 | 7.5 | 2.6 | 3.3 | 2.4 |
-| `binomial(20, 0.5)` | 38.8 | 164.9 | 27.7 | 29.3 | 24.9 |
-| `binomial(50, 0.01)` | 21.9 | 27.6 | 6.6 | 7.0 | 6.3 |
-| `binomial(100, 0.75)` | 52.2 | 421.9 | 93.6 | 94.8 | 89.1 |
-| `cauchy` | 36.0 | 30.4 | 5.6 | 5.8 | 4.0 |
-| `chiSquared` | 84.4 | 152.2 | 44.1 | 48.7 | 26.2 |
-| `discrete`(int32) | - | 12.4 | 2.1 | 2.6 | 2.2 |
-| `discrete`(fp32) | - | 23.2 | 3.4 | 3.7 | 3.4 |
-| `discrete`(fp64) | 48.6 | 22.9 | 4.2 | 5.0 | 4.6 |
-| `exponential` | 22.0 | 18.0 | 4.1 | 4.9 | 3.2 |
-| `extremeValue` | 36.2 | 32.0 | 8.7 | 9.5 | 5.1 |
-| `fisherF(1, 1)` | 158.2 | 73.1 | 32.3 | 32.1 | 18.1 |
-| `fisherF(5, 5)` | 177.3 | 310.1 | 127.0 | 121.8 | 74.3 |
-| `gamma(0.2, 1)` | 69.8 | 80.4 | 28.5 | 33.8 | 19.2 |
-| `gamma(5, 3)` | 83.9 | 53.3 | 10.6 | 12.4 | 8.6 |
-| `gamma(10.5, 1)` | 83.2 | 150.4 | 43.3 | 48.4 | 26.2 |
-| `geometric` | 39.6 | 19.0 | 4.3 | 4.4 | 4.1 |
-| `lognormal` | 43.8 | 40.7 | 9.0 | 10.8 | 5.7 |
-| `negativeBinomial(10, 0.5)` | 217.4 | 274.8 | 71.6 | 73.7 | 68.2 |
-| `negativeBinomial(20, 0.25)` | 192.9 | 464.9 | 112.0 | 111.5 | 105.7 |
-| `normal(0, 1)` | 32.6 | 28.6 | 5.5 | 6.5 | 3.8 |
-| `normal(2, 3)` | 32.9 | 30.5 | 5.7 | 6.7 | 3.9 |
-| `poisson(1)` | 37.9 | 31.0 | 7.5 | 7.8 | 7.1 |
-| `poisson(16)` | 92.4 | 243.3 | 55.6 | 57.7 | 53.7 |
-| `randBits` | 6.5 | 6.5 | 1.1 | 1.3 | 1.1 |
-| `studentT(1)` | 115.0 | 54.1 | 15.5 | 15.7 | 8.3 |
-| `studentT(20)` | 121.2 | 53.8 | 15.8 | 16.0 | 8.2 |
-| `uniformInt(0~63)` | 20.2 | 9.8 | 1.8 | 1.8 | 1.6 |
-| `uniformInt(0~100k)` | 25.7 | 16.1 | 8.1 | 8.5 | 7.2 |
-| `uniformReal` | 12.7 | 7.0 | 1.0 | 1.2 | 1.1 |
-| `weibull` | 23.1 | 19.2 | 11.6 | 13.6 | 7.6 |
-* Since there is no equivalent class to `balanced` in C++11 std, we used Eigen::DenseBase::Random instead.
-|  | C++ std | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (SSSE3) | EigenRand (AVX) |
-|---|---:|---:|---:|---:|---:|
-| Mersenne Twister(int32) | 6.2 | 6.4 | 1.7 | 2.0 | 1.8 |
-| Mersenne Twister(int64) | 6.4 | 6.3 | 2.5 | 3.1 | 2.4 |
-|  | Python 3.6 + scipy 1.5.2 + numpy 1.19.2 | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (SSSE3) | EigenRand (AVX) |
-|---|---:|---:|---:|---:|---:|
-| `Dirichlet(4)` | 3.54 | 3.29 | 1.25 | 1.25 | 0.83 |
-| `Dirichlet(100)` | 57.63 | 145.32 | 49.71 | 49.50 | 29.13 |
-| `InvWishart(4)` | 210.92 | 7.53 | 3.72 | 3.66 | 3.10 |
-| `InvWishart(50)` | 1980.73 | 1446.40 | 560.40 | 559.73 | 457.07 |
-| `Multinomial(4, t=20)` | 2.60 | 5.22 | 1.48 | 1.50 | 1.42 |
-| `Multinomial(4, t=1000)` | 3.90 | 208.75 | 29.19 | 29.50 | 27.70 |
-| `Multinomial(100, t=20)` | 47.71 | 7.09 | 3.71 | 3.63 | 3.60 |
-| `Multinomial(100, t=1000)` | 128.69 | 215.19 | 44.48 | 44.63 | 43.76 |
-| `MvNormal(4)` | 2.04 | 1.05 | 0.35 | 0.34 | 0.19 |
-| `MvNormal(100)` | 48.69 | 47.10 | 16.25 | 16.12 | 11.41 |
-| `Wishart(4)` | 81.11 | 13.24 | 9.87 | 9.81 | 5.90 |
-| `Wishart(50)` | 1419.02 | 1087.40 | 448.06 | 442.97 | 328.20 |
-### Intel(R) Xeon(R) Platinum 8171M CPU @ 2.60GHz (Windows Server 2019, MSVC2019)
-|  | C++ std (or Eigen) | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|---:|
-| `balanced`* | 20.7 | 7.2 | 3.3 | 4.0 | 2.2 |
-| `balanced`(double)* | 21.9 | 8.8 | 6.7 | 4.3 | 4.3 |
-| `binomial(20, 0.5)` | 718.3 | 141.0 | 38.1 | 30.2 | 32.7 |
-| `binomial(50, 0.01)` | 61.5 | 21.4 | 7.5 | 6.5 | 8.0 |
-| `binomial(100, 0.75)` | 495.9 | 1042.5 | 100.6 | 95.2 | 93.0 |
-| `cauchy` | 71.6 | 30.0 | 6.8 | 6.4 | 3.0 |
-| `chiSquared` | 243.0 | 147.3 | 63.5 | 34.1 | 24.0 |
-| `discrete`(int32) | - | 12.4 | 3.5 | 2.7 | 2.2 |
-| `discrete`(fp32) | - | 19.2 | 5.1 | 3.6 | 3.7 |
-| `discrete`(fp64) | 83.9 | 19.0 | 6.7 | 7.4 | 4.6 |
-| `exponential` | 58.7 | 16.0 | 6.8 | 6.4 | 3.0 |
-| `extremeValue` | 64.6 | 27.7 | 13.5 | 9.8 | 5.5 |
-| `fisherF(1, 1)` | 178.7 | 75.2 | 35.3 | 28.4 | 17.5 |
-| `fisherF(5, 5)` | 491.0 | 298.4 | 125.8 | 87.4 | 60.5 |
-| `gamma(0.2, 1)` | 211.7 | 69.3 | 43.7 | 24.7 | 18.7 |
-| `gamma(5, 3)` | 272.5 | 42.3 | 17.6 | 17.2 | 8.5 |
-| `gamma(10.5, 1)` | 237.8 | 146.2 | 63.7 | 33.8 | 23.5 |
-| `geometric` | 49.3 | 17.0 | 7.0 | 5.8 | 5.4 |
-| `lognormal` | 169.8 | 37.6 | 12.7 | 7.2 | 5.0 |
-| `negativeBinomial(10, 0.5)` | 752.7 | 462.3 | 87.0 | 83.0 | 81.6 |
-| `negativeBinomial(20, 0.25)` | 611.4 | 855.3 | 123.7 | 125.3 | 116.6 |
-| `normal(0, 1)` | 78.4 | 21.1 | 6.9 | 4.6 | 2.9 |
-| `normal(2, 3)` | 77.2 | 22.3 | 6.8 | 4.8 | 3.1 |
-| `poisson(1)` | 77.4 | 28.9 | 10.0 | 8.1 | 10.1 |
-| `poisson(16)` | 312.9 | 485.5 | 63.6 | 61.5 | 60.5 |
-| `randBits` | 6.0 | 6.2 | 3.1 | 2.7 | 2.7 |
-| `studentT(1)` | 175.8 | 53.9 | 17.3 | 12.5 | 7.7 |
-| `studentT(20)` | 173.2 | 55.5 | 17.9 | 12.7 | 7.6 |
-| `uniformInt(0~63)` | 39.1 | 5.2 | 2.0 | 1.4 | 1.6 |
-| `uniformInt(0~100k)` | 38.5 | 12.3 | 7.6 | 6.0 | 7.7 |
-| `uniformReal` | 53.4 | 5.7 | 1.9 | 2.3 | 1.0 |
-| `weibull` | 75.1 | 44.3 | 18.5 | 14.3 | 7.9 |
-* Since there is no equivalent class to `balanced` in C++11 std, we used Eigen::DenseBase::Random instead.
-|  | C++ std | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|---:|
-| Mersenne Twister(int32) | 6.5 | 6.4 | 5.6 | 5.1 | 4.5 |
-| Mersenne Twister(int64) | 6.6 | 6.5 | 6.9 | 5.9 | 5.1 |
-|  | Python 3.6 + scipy 1.5.2 + numpy 1.19.2 | EigenRand (No Vect.) | EigenRand (SSE2) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|---:|
-| `Dirichlet(4)` | 4.27 | 3.20 | 2.31 | 1.43 | 1.25 |
-| `Dirichlet(100)` | 69.61 | 150.33 | 67.01 | 47.34 | 32.47 |
-| `InvWishart(4)` | 482.87 | 14.52 | 8.88 | 13.17 | 11.28 |
-| `InvWishart(50)` | 2222.72 | 2211.66 | 902.34 | 775.36 | 610.60 |
-| `Multinomial(4, t=20)` | 2.99 | 5.41 | 1.99 | 1.92 | 1.78 |
-| `Multinomial(4, t=1000)` | 4.23 | 235.84 | 49.73 | 42.41 | 40.76 |
-| `Multinomial(100, t=20)` | 58.20 | 9.12 | 5.84 | 6.02 | 5.98 |
-| `Multinomial(100, t=1000)` | 130.54 | 234.40 | 72.99 | 66.36 | 55.28 |
-| `MvNormal(4)` | 2.25 | 1.89 | 0.35 | 0.32 | 0.25 |
-| `MvNormal(100)` | 57.71 | 68.80 | 24.40 | 18.28 | 13.05 |
-| `Wishart(4)` | 70.18 | 16.25 | 4.49 | 3.97 | 3.07 |
-| `Wishart(50)` | 1471.29 | 1641.73 | 628.58 | 485.68 | 349.81 |
-### AMD Ryzen 7 3700x CPU @ 3.60GHz (Windows 10, MSVC2017)
-|  | C++ std (or Eigen) | EigenRand (SSE2) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|
-| `balanced`* | 20.8 | 1.9 | 2.0 | 1.4 |
-| `balanced`(double)* | 21.7 | 4.1 | 2.7 | 3.0 |
-| `binomial(20, 0.5)` | 416.0 | 27.7 | 28.9 | 29.1 |
-| `binomial(50, 0.01)` | 37.8 | 6.3 | 6.0 | 6.6 |
-| `binomial(100, 0.75)` | 309.1 | 72.4 | 66.0 | 67.0 |
-| `cauchy` | 42.2 | 4.8 | 5.1 | 2.7 |
-| `chiSquared` | 153.8 | 33.5 | 21.2 | 17.0 |
-| `discrete`(int32) | - | 2.4 | 2.3 | 2.5 |
-| `discrete`(fp32) | - | 2.6 | 2.3 | 3.5 |
-| `discrete`(fp64) | 55.8 | 5.1 | 4.7 | 4.3 |
-| `exponential` | 33.4 | 6.4 | 2.8 | 2.2 |
-| `extremeValue` | 39.4 | 7.8 | 4.6 | 4.0 |
-| `fisherF(1, 1)` | 103.9 | 25.3 | 14.9 | 11.7 |
-| `fisherF(5, 5)` | 295.7 | 85.5 | 58.3 | 44.8 |
-| `gamma(0.2, 1)` | 128.8 | 31.9 | 18.3 | 15.8 |
-| `gamma(5, 3)` | 156.1 | 9.7 | 8.0 | 5.0 |
-| `gamma(10.5, 1)` | 148.5 | 33.1 | 21.1 | 17.2 |
-| `geometric` | 27.1 | 6.6 | 4.3 | 4.1 |
-| `lognormal` | 104.0 | 6.6 | 4.7 | 3.5 |
-| `negativeBinomial(10, 0.5)` | 462.1 | 60.0 | 56.4 | 58.6 |
-| `negativeBinomial(20, 0.25)` | 357.6 | 84.5 | 80.6 | 78.4 |
-| `normal(0, 1)` | 48.8 | 4.2 | 3.7 | 2.3 |
-| `normal(2, 3)` | 48.8 | 4.5 | 3.8 | 2.4 |
-| `poisson(1)` | 46.4 | 7.9 | 7.4 | 8.2 |
-| `poisson(16)` | 192.4 | 43.2 | 40.4 | 40.9 |
-| `randBits` | 4.2 | 1.7 | 1.5 | 1.8 |
-| `studentT(1)` | 107.0 | 12.3 | 6.8 | 5.7 |
-| `studentT(20)` | 107.1 | 12.3 | 6.8 | 5.8 |
-| `uniformInt(0~63)` | 31.2 | 1.1 | 1.0 | 1.2 |
-| `uniformInt(0~100k)` | 27.7 | 5.6 | 5.6 | 5.4 |
-| `uniformReal` | 30.7 | 1.1 | 1.0 | 0.6 |
-| `weibull` | 46.5 | 10.6 | 6.4 | 5.2 |
+The following charts show the relative speed-up of EigenRand compared to references(equivalent functions of C++ std or Eigen for univariate distributions and Scipy for multivariate distributions).
 * Since there is no equivalent class to `balanced` in C++11 std, we used Eigen::DenseBase::Random instead.
+* Cases filled with orange are generators that are slower than reference functions.
-|  | C++ std | EigenRand (SSE2) | EigenRand (AVX) | EigenRand (AVX2) |
-|---|---:|---:|---:|---:|
-| Mersenne Twister(int32) | 5.0 | 3.4 | 3.4 | 3.3 |
-| Mersenne Twister(int64) | 5.1 | 3.9 | 3.9 | 3.3 |
-### ARM64 NEON (Cortex-A73)
-Currently, Support for ARM64 NEON is experimental and the result may be sub-optimal.
-Also keep in mind that NEON does not support vectorization of double type.
-So if you use double type generators, they would fallback into scalar computations.
+### Windows 2019, MSVC 19.29.30147, Intel(R) Xeon(R) Platinum 8171M CPU, AVX2, Eigen 3.4.0
+![Perf_AVX2_Win](/doxygen/images/perf_avx2_win.png)
+![Perf_AVX2_Win_Mv1](/doxygen/images/perf_avx2_win_mv1.png)
+![Perf_AVX2_Win_Mv1](/doxygen/images/perf_avx2_win_mv2.png)
-![Perf_no_vect](/doxygen/images/perf_neon_v0.3.90.png)
+### Ubuntu 18.04, gcc 7.5.0, Intel(R) Xeon(R) Platinum 8370C CPU, AVX2, Eigen 3.4.0
+![Perf_AVX2_Ubu](/doxygen/images/perf_avx2_ubu.png)
+![Perf_AVX2_Ubu_Mv1](/doxygen/images/perf_avx2_ubu_mv1.png)
+![Perf_AVX2_Ubu_Mv1](/doxygen/images/perf_avx2_ubu_mv2.png)
-The following charts are about multivariate distributions.
-![Perf_no_vect](/doxygen/images/perf_mv_part1_neon_v0.3.90.png)
-![Perf_no_vect](/doxygen/images/perf_mv_part2_neon_v0.3.90.png)
+### macOS Monterey 12.2.1, clang 13.1.6, Apple M1 Pro, NEON, Eigen 3.4.0
+![Perf_NEON_mac](/doxygen/images/perf_neon_mac.png)
+![Perf_NEON_mac_Mv1](/doxygen/images/perf_neon_mac_mv1.png)
+![Perf_NEON_mac_Mv1](/doxygen/images/perf_neon_mac_mv2.png)
-Cases filled with orange are generators that are slower than reference functions.
+You can see the detailed numerical values used to plot the above charts on the [Action](https://github.com/bab2min/EigenRand/actions/workflows/release.yml) page.
 ## Accuracy
 Since vectorized mathematical functions may have a loss of precision, I measured how well the generated random number fits its actual distribution.
@@ -385,6 +164,15 @@ MIT License
 ## History
+### 0.5.1 (2024-09-08)
+* Add AVX512 support
+* Add `EIGENRAND_BUILD_BENCHMARK` cmake option
+### 0.5.0 (2023-01-31)
+* Improved the performance of `MultinomialGen`.
+* Implemented vectorization over parameters to some distributions.
+* Optimized the performance of `double`-type generators on NEON architecture.
 ### 0.4.1 (2022-08-13)
 * Fixed a bug where double-type generation with std::mt19937 fails compilation.
 * Fixed a bug where `UniformIntGen` in scalar mode generates numbers in the wrong range.

data/vendor/tomotopy/README.kr.rst CHANGED

@@ -55,9 +55,9 @@ tomotopy 란?
 ::
     import tomotopy as tp
-    print(tp.isa) # 'avx2'나 'avx', 'sse2', 'none'를 출력합니다.
+    print(tp.isa) # 'avx512'나 'avx2', 'sse2', 'none'를 출력합니다.
-현재 tomotopy는 가속을 위해 AVX2, AVX or SSE2 SIMD 명령어 세트를 활용할 수 있습니다.
+현재 tomotopy는 가속을 위해 AVX512, AVX2 or SSE2 SIMD 명령어 세트를 활용할 수 있습니다.
 패키지가 import될 때 현재 환경에서 활용할 수 있는 최선의 명령어 세트를 확인하여 최상의 모듈을 자동으로 가져옵니다.
 만약 `tp.isa`가 `none`이라면 현재 환경에서 활용 가능한 SIMD 명령어 세트가 없는 것이므로 훈련에 오랜 시간이 걸릴 수 있습니다.
 그러나 최근 대부분의 Intel 및 AMD CPU에서는 SIMD 명령어 세트를 지원하므로 SIMD 가속이 성능을 크게 향상시킬 수 있을 것입니다.
@@ -148,6 +148,31 @@ CGS와 VB는 서로 접근방법이 아예 다른 기법이기 때문에 둘을
 이에 대해서는 `tomotopy.LDAModel.save`와 `tomotopy.LDAModel.load`에서 더 자세한 내용을 확인할 수 있습니다.
+인터랙티브 모델 뷰어
+------------------
+.. raw:: html
+    <video src="https://private-user-images.githubusercontent.com/19266222/355924875-fc9d27f5-5542-4e65-ab69-1d96dc0913af.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjMwNTI4MTUsIm5iZiI6MTcyMzA1MjUxNSwicGF0aCI6Ii8xOTI2NjIyMi8zNTU5MjQ4NzUtZmM5ZDI3ZjUtNTU0Mi00ZTY1LWFiNjktMWQ5NmRjMDkxM2FmLm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA4MDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwODA3VDE3NDE1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTk1N2YxODE3MzBiZTNhMjkyNTk1OWJkODRmZjc4ZTcyYzFkZGYxZjgxODUxYTNlNGYxMzllOTgzNDI0MjA4ZDImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.evTswIGMps594nQ6JCtbd6puFM8ARHM0emgaluIUxvY" style="max-width:100%"></video>
+v0.13.0부터는 토픽 모델링의 결과를 인터랙티브 뷰어를 통해 살펴보는 것도 가능합니다.
+::
+    import tomotopy as tp
+    model = tp.LDAModel(...)
+    # ... some training codes ...
+    tp.viewer.open_viewer(model, host="localhost", port=9999)
+    # And open http://localhost:9999 in your web browser!
+이미 저장된 모델 파일이 있다면 다음 명령행을 통해서도 간단히 뷰어를 구동할 수 있습니다.
+::
+    python -m tomotopy.viewer a_trained_model.bin --host localhost --port 9999
+자세한 내용은 `tomotopy.viewer`을 참조하세요.
 모델 안의 문헌과 모델 밖의 문헌
 -------------------------------------------
 토픽 모델은 크게 2가지 목적으로 사용할 수 있습니다.
@@ -540,6 +565,3 @@ tomotopy의 Python3 예제 코드는 https://github.com/bab2min/tomotopy/blob/ma
 * EigenRand: `MIT License
   <licenses_bundled/EigenRand>`_
-* Mapbox Variant: `BSD License
-  <licenses_bundled/MapboxVariant>`_

data/vendor/tomotopy/README.rst CHANGED

@@ -56,9 +56,9 @@ After installing, you can start tomotopy by just importing.
 ::
     import tomotopy as tp
-    print(tp.isa) # prints 'avx2', 'avx', 'sse2' or 'none'
+    print(tp.isa) # prints 'avx512', 'avx2', 'sse2' or 'none'
-Currently, tomotopy can exploits AVX2, AVX or SSE2 SIMD instruction set for maximizing performance.
+Currently, tomotopy can exploits AVX512, AVX2 or SSE2 SIMD instruction set for maximizing performance.
 When the package is imported, it will check available instruction sets and select the best option.
 If `tp.isa` tells `none`, iterations of training may take a long time.
 But, since most of modern Intel or AMD CPUs provide SIMD instruction set, the SIMD acceleration could show a big improvement.
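
As a reviewer's note on the hunk above: code that branches on `tp.isa` may need updating, since the documented values change from `'avx2', 'avx', 'sse2', 'none'` to `'avx512', 'avx2', 'sse2', 'none'`. The following is a minimal sketch of such a check. `simd_status` and `KNOWN_ISAS` are hypothetical names introduced here, and the set of values is taken only from the README line in this diff; other builds may report values not listed there, which the sketch treats as unrecognized rather than guessing.

```python
# Values documented by the updated README line in this diff (assumption:
# this list is not exhaustive for every platform or build).
KNOWN_ISAS = ("avx512", "avx2", "sse2", "none")


def simd_status(isa: str) -> str:
    """Turn a `tp.isa`-style string into a short human-readable status."""
    if isa == "none":
        return "no SIMD acceleration; training iterations may be slow"
    if isa in KNOWN_ISAS:
        return f"SIMD acceleration enabled ({isa})"
    return f"unrecognized isa value: {isa!r}"


if __name__ == "__main__":
    try:
        import tomotopy as tp  # optional: only needed for the live check
    except ImportError:
        print("tomotopy is not installed; skipping the live check")
    else:
        print(simd_status(tp.isa))
```

Keeping the string handling in a plain function means it can be exercised without tomotopy installed; only the guarded block at the bottom touches the real package.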
@@ -149,6 +149,31 @@ When you load the model from a file, a model type in the file should match the c
 See more at `tomotopy.LDAModel.save` and `tomotopy.LDAModel.load` methods.
+Interactive Model Viewer
+------------------------
+.. raw:: html
+    <video src="https://private-user-images.githubusercontent.com/19266222/355924875-fc9d27f5-5542-4e65-ab69-1d96dc0913af.mp4?jwt=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJnaXRodWIuY29tIiwiYXVkIjoicmF3LmdpdGh1YnVzZXJjb250ZW50LmNvbSIsImtleSI6ImtleTUiLCJleHAiOjE3MjMwNTI4MTUsIm5iZiI6MTcyMzA1MjUxNSwicGF0aCI6Ii8xOTI2NjIyMi8zNTU5MjQ4NzUtZmM5ZDI3ZjUtNTU0Mi00ZTY1LWFiNjktMWQ5NmRjMDkxM2FmLm1wND9YLUFtei1BbGdvcml0aG09QVdTNC1ITUFDLVNIQTI1NiZYLUFtei1DcmVkZW50aWFsPUFLSUFWQ09EWUxTQTUzUFFLNFpBJTJGMjAyNDA4MDclMkZ1cy1lYXN0LTElMkZzMyUyRmF3czRfcmVxdWVzdCZYLUFtei1EYXRlPTIwMjQwODA3VDE3NDE1NVomWC1BbXotRXhwaXJlcz0zMDAmWC1BbXotU2lnbmF0dXJlPTk1N2YxODE3MzBiZTNhMjkyNTk1OWJkODRmZjc4ZTcyYzFkZGYxZjgxODUxYTNlNGYxMzllOTgzNDI0MjA4ZDImWC1BbXotU2lnbmVkSGVhZGVycz1ob3N0JmFjdG9yX2lkPTAma2V5X2lkPTAmcmVwb19pZD0wIn0.evTswIGMps594nQ6JCtbd6puFM8ARHM0emgaluIUxvY" style="max-width:100%"></video>
+You can see the result of modeling using the interactive viewer since v0.13.0.
+::
+    import tomotopy as tp
+    model = tp.LDAModel(...)
+    # ... some training codes ...
+    tp.viewer.open_viewer(model, host="localhost", port=9999)
+    # And open http://localhost:9999 in your web browser!
+If you have a saved model file, you can also use the following command line.
+::
+    python -m tomotopy.viewer a_trained_model.bin --host localhost --port 9999
+See more at `tomotopy.viewer` module.
 Documents in the Model and out of the Model
 -------------------------------------------
 We can use Topic Model for two major purposes.
@@ -545,9 +570,6 @@ Bundled Libraries and Their License
 * EigenRand: `MIT License
   <licenses_bundled/EigenRand>`_
-* Mapbox Variant: `BSD License
-  <licenses_bundled/MapboxVariant>`_
 Citation
 ---------
 ::