lawkit-python 2.5.8__tar.gz → 2.6.0__tar.gz
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- lawkit_python-2.6.0/PKG-INFO +37 -0
- lawkit_python-2.6.0/README.md +24 -0
- lawkit_python-2.6.0/lawkit_python/__init__.py +28 -0
- lawkit_python-2.6.0/lawkit_python.egg-info/PKG-INFO +37 -0
- lawkit_python-2.6.0/lawkit_python.egg-info/SOURCES.txt +8 -0
- lawkit_python-2.6.0/lawkit_python.egg-info/dependency_links.txt +1 -0
- lawkit_python-2.6.0/lawkit_python.egg-info/requires.txt +1 -0
- lawkit_python-2.6.0/lawkit_python.egg-info/top_level.txt +1 -0
- lawkit_python-2.6.0/pyproject.toml +22 -0
- lawkit_python-2.6.0/setup.cfg +4 -0
- lawkit_python-2.5.8/Cargo.toml +0 -52
- lawkit_python-2.5.8/PKG-INFO +0 -474
- lawkit_python-2.5.8/README.md +0 -435
- lawkit_python-2.5.8/lawkit-core/Cargo.toml +0 -44
- lawkit_python-2.5.8/lawkit-core/README.md +0 -508
- lawkit_python-2.5.8/lawkit-core/benches/law_benchmark.rs +0 -103
- lawkit_python-2.5.8/lawkit-core/src/common/filtering.rs +0 -353
- lawkit_python-2.5.8/lawkit-core/src/common/input/file_detector.rs +0 -256
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/csv.rs +0 -86
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/excel.rs +0 -208
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/html.rs +0 -134
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/json_xml.rs +0 -276
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/mod.rs +0 -18
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/opendocument.rs +0 -284
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/pdf.rs +0 -99
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/powerpoint.rs +0 -229
- lawkit_python-2.5.8/lawkit-core/src/common/input/formats/word.rs +0 -161
- lawkit_python-2.5.8/lawkit-core/src/common/input/mod.rs +0 -5
- lawkit_python-2.5.8/lawkit-core/src/common/input/parser.rs +0 -46
- lawkit_python-2.5.8/lawkit-core/src/common/international.rs +0 -214
- lawkit_python-2.5.8/lawkit-core/src/common/memory.rs +0 -1063
- lawkit_python-2.5.8/lawkit-core/src/common/mod.rs +0 -11
- lawkit_python-2.5.8/lawkit-core/src/common/outliers.rs +0 -456
- lawkit_python-2.5.8/lawkit-core/src/common/output/formatter.rs +0 -223
- lawkit_python-2.5.8/lawkit-core/src/common/output/mod.rs +0 -3
- lawkit_python-2.5.8/lawkit-core/src/common/parallel.rs +0 -487
- lawkit_python-2.5.8/lawkit-core/src/common/risk.rs +0 -41
- lawkit_python-2.5.8/lawkit-core/src/common/statistics.rs +0 -70
- lawkit_python-2.5.8/lawkit-core/src/common/streaming_io.rs +0 -185
- lawkit_python-2.5.8/lawkit-core/src/common/timeseries.rs +0 -837
- lawkit_python-2.5.8/lawkit-core/src/core/mod.rs +0 -121
- lawkit_python-2.5.8/lawkit-core/src/error.rs +0 -54
- lawkit_python-2.5.8/lawkit-core/src/generate/benford.rs +0 -118
- lawkit_python-2.5.8/lawkit-core/src/generate/mod.rs +0 -57
- lawkit_python-2.5.8/lawkit-core/src/generate/normal.rs +0 -78
- lawkit_python-2.5.8/lawkit-core/src/generate/pareto.rs +0 -118
- lawkit_python-2.5.8/lawkit-core/src/generate/poisson.rs +0 -88
- lawkit_python-2.5.8/lawkit-core/src/generate/zipf.rs +0 -116
- lawkit_python-2.5.8/lawkit-core/src/laws/benford/analysis.rs +0 -62
- lawkit_python-2.5.8/lawkit-core/src/laws/benford/japanese.rs +0 -431
- lawkit_python-2.5.8/lawkit-core/src/laws/benford/mod.rs +0 -6
- lawkit_python-2.5.8/lawkit-core/src/laws/benford/result.rs +0 -140
- lawkit_python-2.5.8/lawkit-core/src/laws/integration/analysis.rs +0 -822
- lawkit_python-2.5.8/lawkit-core/src/laws/integration/mod.rs +0 -5
- lawkit_python-2.5.8/lawkit-core/src/laws/integration/result.rs +0 -1106
- lawkit_python-2.5.8/lawkit-core/src/laws/mod.rs +0 -6
- lawkit_python-2.5.8/lawkit-core/src/laws/normal/analysis.rs +0 -365
- lawkit_python-2.5.8/lawkit-core/src/laws/normal/mod.rs +0 -9
- lawkit_python-2.5.8/lawkit-core/src/laws/normal/result.rs +0 -534
- lawkit_python-2.5.8/lawkit-core/src/laws/pareto/analysis.rs +0 -139
- lawkit_python-2.5.8/lawkit-core/src/laws/pareto/mod.rs +0 -5
- lawkit_python-2.5.8/lawkit-core/src/laws/pareto/result.rs +0 -130
- lawkit_python-2.5.8/lawkit-core/src/laws/poisson/analysis.rs +0 -473
- lawkit_python-2.5.8/lawkit-core/src/laws/poisson/mod.rs +0 -5
- lawkit_python-2.5.8/lawkit-core/src/laws/poisson/result.rs +0 -365
- lawkit_python-2.5.8/lawkit-core/src/laws/zipf/analysis.rs +0 -257
- lawkit_python-2.5.8/lawkit-core/src/laws/zipf/mod.rs +0 -8
- lawkit_python-2.5.8/lawkit-core/src/laws/zipf/result.rs +0 -319
- lawkit_python-2.5.8/lawkit-core/src/lib.rs +0 -15
- lawkit_python-2.5.8/lawkit-python/Cargo.lock +0 -2061
- lawkit_python-2.5.8/lawkit-python/Cargo.toml +0 -25
- lawkit_python-2.5.8/lawkit-python/PACKAGE_SUMMARY.md +0 -229
- lawkit_python-2.5.8/lawkit-python/README.md +0 -435
- lawkit_python-2.5.8/lawkit-python/STRUCTURE.md +0 -228
- lawkit_python-2.5.8/lawkit-python/src/analyze.rs +0 -47
- lawkit_python-2.5.8/lawkit-python/src/benf.rs +0 -66
- lawkit_python-2.5.8/lawkit-python/src/colors.rs +0 -110
- lawkit_python-2.5.8/lawkit-python/src/common_options.rs +0 -460
- lawkit_python-2.5.8/lawkit-python/src/diagnose.rs +0 -419
- lawkit_python-2.5.8/lawkit-python/src/lawkit_python/__init__.py +0 -73
- lawkit_python-2.5.8/lawkit-python/src/lawkit_python/__main__.py +0 -26
- lawkit_python-2.5.8/lawkit-python/src/lawkit_python/lawkit.py +0 -857
- lawkit_python-2.5.8/lawkit-python/src/lib.rs +0 -9
- lawkit_python-2.5.8/lawkit-python/src/main.rs +0 -399
- lawkit_python-2.5.8/lawkit-python/src/mod.rs +0 -9
- lawkit_python-2.5.8/lawkit-python/src/normal.rs +0 -947
- lawkit_python-2.5.8/lawkit-python/src/pareto.rs +0 -66
- lawkit_python-2.5.8/lawkit-python/src/poisson.rs +0 -764
- lawkit_python-2.5.8/lawkit-python/src/subcommands/analyze.rs +0 -47
- lawkit_python-2.5.8/lawkit-python/src/subcommands/benf.rs +0 -497
- lawkit_python-2.5.8/lawkit-python/src/subcommands/diagnose.rs +0 -419
- lawkit_python-2.5.8/lawkit-python/src/subcommands/integration_common.rs +0 -575
- lawkit_python-2.5.8/lawkit-python/src/subcommands/mod.rs +0 -9
- lawkit_python-2.5.8/lawkit-python/src/subcommands/normal.rs +0 -947
- lawkit_python-2.5.8/lawkit-python/src/subcommands/pareto.rs +0 -472
- lawkit_python-2.5.8/lawkit-python/src/subcommands/poisson.rs +0 -764
- lawkit_python-2.5.8/lawkit-python/src/subcommands/validate.rs +0 -121
- lawkit_python-2.5.8/lawkit-python/src/subcommands/zipf.rs +0 -490
- lawkit_python-2.5.8/lawkit-python/src/validate.rs +0 -121
- lawkit_python-2.5.8/lawkit-python/src/zipf.rs +0 -490
- lawkit_python-2.5.8/lawkit-python/uv.lock +0 -760
- lawkit_python-2.5.8/pyproject.toml +0 -92
lawkit_python-2.6.0/PKG-INFO
ADDED
@@ -0,0 +1,37 @@
+Metadata-Version: 2.4
+Name: lawkit-python
+Version: 2.6.0
+Summary: DEPRECATED: Use 'lawkit' instead. This package is a compatibility shim.
+Author: kako-jun
+License-Expression: MIT
+Project-URL: Homepage, https://github.com/kako-jun/lawkit-python
+Classifier: Development Status :: 7 - Inactive
+Classifier: Programming Language :: Python :: 3
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+Requires-Dist: lawkit>=2.6.0
+
+# lawkit-python (DEPRECATED)
+
+**This package has been renamed to [`lawkit`](https://pypi.org/project/lawkit/).**
+
+## Migration
+
+```bash
+pip uninstall lawkit-python
+pip install lawkit
+```
+
+Update your imports:
+
+```python
+# Old (deprecated)
+import lawkit_python
+
+# New
+import lawkit
+```
+
+## Why the change?
+
+The package has been renamed to provide a cleaner import experience that matches the package name.
lawkit_python-2.6.0/README.md
ADDED
@@ -0,0 +1,24 @@
+# lawkit-python (DEPRECATED)
+
+**This package has been renamed to [`lawkit`](https://pypi.org/project/lawkit/).**
+
+## Migration
+
+```bash
+pip uninstall lawkit-python
+pip install lawkit
+```
+
+Update your imports:
+
+```python
+# Old (deprecated)
+import lawkit_python
+
+# New
+import lawkit
+```
+
+## Why the change?
+
+The package has been renamed to provide a cleaner import experience that matches the package name.
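The import update described in the README above could be applied mechanically across a codebase. The following is a minimal sketch, not part of either release: `rewrite_module_name` is a hypothetical helper, and it assumes that a plain whole-word replacement of `lawkit_python` with `lawkit` is acceptable for the files being migrated.

```python
import re

# Matches the old module name only as a whole word, so a longer identifier
# such as "lawkit_python_extra" is left untouched.
OLD_MODULE = re.compile(r"\blawkit_python\b")


def rewrite_module_name(source: str) -> str:
    """Replace every whole-word occurrence of the old module name (hypothetical helper)."""
    return OLD_MODULE.sub("lawkit", source)


example = "import lawkit_python\nprint(lawkit_python.__version__)\n"
print(rewrite_module_name(example))
```

Because the replacement is purely textual, it also rewrites matches inside strings and comments, so the resulting changes are worth reviewing before committing them.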
lawkit_python-2.6.0/lawkit_python/__init__.py
ADDED
@@ -0,0 +1,28 @@
+"""
+DEPRECATED: This package has been renamed to 'lawkit'.
+
+Please update your dependencies:
+    pip install lawkit
+
+And update your imports:
+    # Old (deprecated)
+    import lawkit_python
+
+    # New
+    import lawkit
+"""
+
+import warnings
+
+warnings.warn(
+    "The 'lawkit-python' package is deprecated and will be removed in a future release. "
+    "Please use 'lawkit' instead:\n"
+    " pip install lawkit\n"
+    " import lawkit",
+    DeprecationWarning,
+    stacklevel=2
+)
+
+# Re-export everything from the new package
+from lawkit import *
+from lawkit import __version__
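The shim above warns at import time with `DeprecationWarning`, a category that Python's default filters hide unless the warning is attributed to `__main__`. The sketch below shows one way to confirm that such a warning fires. It does not import `lawkit_python`; it uses a stand-in module named `old_pkg_shim`, and both that name and the helper `import_and_collect_warnings` are invented for illustration.

```python
import importlib
import sys
import tempfile
import warnings
from pathlib import Path

# Hypothetical stand-in for a deprecation shim; not the real lawkit_python module.
SHIM_SOURCE = '''
import warnings

warnings.warn(
    "old_pkg_shim is deprecated; use the renamed package instead.",
    DeprecationWarning,
    stacklevel=2,
)
'''


def import_and_collect_warnings(module_name: str, source: str) -> list:
    """Write `source` to a temporary module, import it, and return the recorded warnings."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, f"{module_name}.py").write_text(source)
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            with warnings.catch_warnings(record=True) as caught:
                # Record every category, including ones hidden by default.
                warnings.simplefilter("always")
                importlib.import_module(module_name)
        finally:
            sys.path.remove(tmp)
            sys.modules.pop(module_name, None)
    return caught


caught = import_and_collect_warnings("old_pkg_shim", SHIM_SOURCE)
deprecations = [w for w in caught if issubclass(w.category, DeprecationWarning)]
print(len(deprecations), "deprecation warning(s) recorded")
```

In a test suite, the same effect can usually be had by running Python with `-W error::DeprecationWarning`, which turns the warning into an exception at the import site.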
lawkit_python-2.6.0/lawkit_python.egg-info/PKG-INFO
ADDED
@@ -0,0 +1,37 @@
+Metadata-Version: 2.4
+Name: lawkit-python
+Version: 2.6.0
+Summary: DEPRECATED: Use 'lawkit' instead. This package is a compatibility shim.
+Author: kako-jun
+License-Expression: MIT
+Project-URL: Homepage, https://github.com/kako-jun/lawkit-python
+Classifier: Development Status :: 7 - Inactive
+Classifier: Programming Language :: Python :: 3
+Requires-Python: >=3.8
+Description-Content-Type: text/markdown
+Requires-Dist: lawkit>=2.6.0
+
+# lawkit-python (DEPRECATED)
+
+**This package has been renamed to [`lawkit`](https://pypi.org/project/lawkit/).**
+
+## Migration
+
+```bash
+pip uninstall lawkit-python
+pip install lawkit
+```
+
+Update your imports:
+
+```python
+# Old (deprecated)
+import lawkit_python
+
+# New
+import lawkit
+```
+
+## Why the change?
+
+The package has been renamed to provide a cleaner import experience that matches the package name.
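lawkit_python-2.6.0/lawkit_python.egg-info/dependency_links.txt
ADDED
@@ -0,0 +1 @@
+
lawkit_python-2.6.0/lawkit_python.egg-info/requires.txt
ADDED
@@ -0,0 +1 @@
+lawkit>=2.6.0
lawkit_python-2.6.0/lawkit_python.egg-info/top_level.txt
ADDED
@@ -0,0 +1 @@
+lawkit_python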
lawkit_python-2.6.0/pyproject.toml
ADDED
@@ -0,0 +1,22 @@
+[build-system]
+requires = ["setuptools>=61.0"]
+build-backend = "setuptools.build_meta"
+
+[project]
+name = "lawkit-python"
+version = "2.6.0"
+description = "DEPRECATED: Use 'lawkit' instead. This package is a compatibility shim."
+readme = "README.md"
+license = "MIT"
+authors = [
+    { name = "kako-jun" }
+]
+classifiers = [
+    "Development Status :: 7 - Inactive",
+    "Programming Language :: Python :: 3",
+]
+requires-python = ">=3.8"
+dependencies = ["lawkit>=2.6.0"]
+
+[project.urls]
+Homepage = "https://github.com/kako-jun/lawkit-python"
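Since 2.6.0 only declares a dependency on `lawkit`, an environment can end up with both distributions installed. The sketch below shows one way to detect the deprecated distribution by name. The helpers `normalize`, `has_deprecated_shim`, and `installed_distribution_names` are hypothetical; the name comparison follows the PEP 503 normalization rule, and the only name assumed from the metadata above is `lawkit-python`.

```python
import re
from importlib import metadata

DEPRECATED_NAME = "lawkit-python"  # distribution name taken from the metadata above


def normalize(name: str) -> str:
    """Normalize a distribution name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def has_deprecated_shim(installed_names) -> bool:
    """Return True if the deprecated distribution is among `installed_names`."""
    target = normalize(DEPRECATED_NAME)
    return any(normalize(name) == target for name in installed_names)


def installed_distribution_names() -> list:
    """List the names of distributions visible to the current interpreter."""
    names = []
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            names.append(name)
    return names


if __name__ == "__main__":
    if has_deprecated_shim(installed_distribution_names()):
        print("lawkit-python is installed; consider switching to lawkit")
```

Keeping the name check separate from the environment scan lets the check be exercised against a plain list of names, without depending on what is installed.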
lawkit_python-2.5.8/Cargo.toml
DELETED
@@ -1,52 +0,0 @@
-[workspace]
-resolver = "2"
-members = [
-    "lawkit-core",
-    "lawkit-cli"
-]
-
-[workspace.package]
-version = "2.5.8"
-edition = "2021"
-authors = ["kako-jun"]
-license = "MIT"
-description = "Statistical law analysis toolkit with international number support"
-homepage = "https://github.com/kako-jun/lawkit"
-repository = "https://github.com/kako-jun/lawkit"
-documentation = "https://docs.rs/lawkit-core"
-readme = "README.md"
-keywords = ["statistics", "benford", "pareto", "zipf", "audit"]
-categories = ["command-line-utilities", "mathematics", "algorithms", "science"]
-exclude = [
-    "test_manual/",
-    "test_threshold.csv",
-    ".github/",
-    "international_numerals_research.md"
-]
-rust-version = "1.75"
-
-[workspace.dependencies]
-clap = { version = "4.0", features = ["derive", "cargo"] }
-scraper = "0.17"
-calamine = "0.22"
-pdf-extract = "0.7"
-docx-rs = "0.4"
-zip = "0.6"
-serde = { version = "1.0", features = ["derive"] }
-serde_json = "1.0"
-serde_yaml = "0.9"
-toml = "0.8"
-regex = "1.0"
-thiserror = "1.0"
-anyhow = "1.0"
-rayon = "1.0"
-tempfile = "3.0"
-mockito = "1.0"
-pretty_assertions = "1.0"
-criterion = { version = "0.5", features = ["html_reports"] }
-
-
-[profile.release]
-lto = true
-codegen-units = 1
-panic = "abort"
lawkit_python-2.5.8/PKG-INFO
DELETED
@@ -1,474 +0,0 @@
-Metadata-Version: 2.4
-Name: lawkit-python
-Version: 2.5.8
-Classifier: Development Status :: 4 - Beta
-Classifier: Intended Audience :: Developers
-Classifier: Intended Audience :: Financial and Insurance Industry
-Classifier: Intended Audience :: Science/Research
-Classifier: License :: OSI Approved :: MIT License
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.8
-Classifier: Programming Language :: Python :: 3.9
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Classifier: Programming Language :: Python :: 3.13
-Classifier: Topic :: Scientific/Engineering :: Mathematics
-Classifier: Topic :: Office/Business :: Financial
-Classifier: Topic :: Software Development :: Libraries :: Python Modules
-Classifier: Topic :: Security
-Classifier: Topic :: Utilities
-Requires-Dist: maturin>=1.9.1
-Requires-Dist: pytest>=6.0 ; extra == 'dev'
-Requires-Dist: pytest-cov ; extra == 'dev'
-Requires-Dist: black ; extra == 'dev'
-Requires-Dist: isort ; extra == 'dev'
-Requires-Dist: mypy ; extra == 'dev'
-Requires-Dist: ruff ; extra == 'dev'
-Provides-Extra: dev
-Summary: Python wrapper for lawkit - Statistical law analysis toolkit for fraud detection and data quality assessment
-Keywords: statistics,benford,pareto,zipf,normal,poisson,fraud-detection,audit,compliance,data-quality,forensic-accounting,statistical-analysis,outlier-detection,anomaly-detection
-Author: kako-jun
-License: MIT
-Requires-Python: >=3.8
-Description-Content-Type: text/markdown; charset=UTF-8; variant=GFM
-Project-URL: Homepage, https://github.com/kako-jun/lawkit
-Project-URL: Repository, https://github.com/kako-jun/lawkit
-Project-URL: Issues, https://github.com/kako-jun/lawkit/issues
-Project-URL: Documentation, https://github.com/kako-jun/lawkit/tree/main/docs
-
-# lawkit-python
-
-Python wrapper for the `lawkit` CLI tool - Statistical law analysis toolkit for fraud detection and data quality assessment.
-
-## Installation
-
-```bash
-pip install lawkit-python
-```
-
-This includes the `lawkit` binary embedded in the wheel - no download required.
-
-## Quick Start
-
-```python
-import lawkit
-
-# Analyze financial data with Benford Law
-result = lawkit.analyze_benford('financial_data.csv')
-print(result)
-
-# Get structured JSON output
-json_result = lawkit.analyze_benford(
-    'accounting.csv',
-    lawkit.LawkitOptions(format='json')
-)
-print(f"Risk level: {json_result.risk_level}")
-print(f"P-value: {json_result.p_value}")
-
-# Check if data follows Pareto principle (80/20 rule)
-pareto_result = lawkit.analyze_pareto(
-    'sales_data.csv',
-    lawkit.LawkitOptions(format='json', gini_coefficient=True)
-)
-print(f"Gini coefficient: {pareto_result.gini_coefficient}")
-print(f"80/20 concentration: {pareto_result.concentration_80_20}")
-```
-
-## Features
-
-### Statistical Laws Supported
-
-- **Benford Law**: Detect fraud and anomalies in numerical data
-- **Pareto Principle**: Analyze 80/20 distributions and concentration
-- **Zipf Law**: Analyze word frequencies and power-law distributions
-- **Normal Distribution**: Test for normality and detect outliers
-- **Poisson Distribution**: Analyze rare events and count data
-
-### Advanced Analysis
-
-- **Multi-law Comparison**: Compare multiple statistical laws on the same data
-- **Outlier Detection**: Advanced anomaly detection algorithms
-- **Time Series Analysis**: Trend and seasonality detection
-- **International Numbers**: Support for various number formats (Japanese, Chinese, etc.)
-- **Memory Efficient**: Handle large datasets with streaming analysis
-
-### File Format Support
-
-- **CSV, JSON, YAML, TOML, XML**: Standard structured data formats
-- **Excel Files**: `.xlsx` and `.xls` support
-- **PDF Documents**: Extract and analyze numerical data from PDFs
-- **Word Documents**: Analyze data from `.docx` files
-- **PowerPoint**: Extract data from presentations
-
-## Usage Examples
-
-### Command Line Interface (CLI) via Python Module
-
-```bash
-# Install and use immediately - binary included automatically
-pip install lawkit-python
-
-# Use lawkit CLI directly through Python module
-python -m lawkit benf financial_data.csv
-python -m lawkit pareto sales_data.csv --gini-coefficient
-python -m lawkit analyze --laws all dataset.csv
-python -m lawkit validate dataset.csv --consistency-check
-python -m lawkit diagnose dataset.csv --report detailed
-
-# Generate sample data for testing
-python -m lawkit generate benf --samples 1000 --output-file test_data.csv
-python -m lawkit generate pareto --samples 500 --concentration 0.8
-```
-
-### Modern API (Recommended)
-
-```python
-import lawkit
-
-# Analyze with Benford Law
-result = lawkit.analyze_benford('invoice_data.csv')
-print(result)
-
-# Get detailed JSON analysis
-json_result = lawkit.analyze_benford(
-    'financial_statements.xlsx',
-    lawkit.LawkitOptions(
-        format='excel',
-        output='json',
-        confidence=0.95,
-        verbose=True
-    )
-)
-
-if json_result.risk_level == "High":
-    print("⚠️ High risk of fraud detected!")
-    print(f"Chi-square: {json_result.chi_square}")
-    print(f"P-value: {json_result.p_value}")
-    print(f"MAD: {json_result.mad}%")
-
-# Pareto analysis for business insights
-pareto_result = lawkit.analyze_pareto(
-    'customer_revenue.csv',
-    lawkit.LawkitOptions(
-        output='json',
-        gini_coefficient=True,
-        business_analysis=True,
-        percentiles="70,80,90"
-    )
-)
-
-print(f"Top 20% customers generate {pareto_result.concentration_80_20:.1f}% of revenue")
-print(f"Income inequality (Gini): {pareto_result.gini_coefficient:.3f}")
-
-# Normal distribution analysis with outlier detection
-normal_result = lawkit.analyze_normal(
-    'quality_measurements.csv',
-    lawkit.LawkitOptions(
-        output='json',
-        outlier_detection=True,
-        test_type='shapiro'
-    )
-)
-
-if normal_result.p_value < 0.05:
-    print("Data does not follow normal distribution")
-if normal_result.outliers:
-    print(f"Found {len(normal_result.outliers)} outliers")
-
-# Multi-law analysis
-analysis = lawkit.analyze_laws(
-    'complex_dataset.csv',
-    lawkit.LawkitOptions(format='json', laws='benf,pareto,zipf')
-)
-print(f"Analysis results: {analysis.data}")
-print(f"Overall risk level: {analysis.risk_level}")
-
-# Data validation
-validation = lawkit.validate_laws(
-    'complex_dataset.csv',
-    lawkit.LawkitOptions(format='json', consistency_check=True)
-)
-print(f"Validation status: {validation.data}")
-
-# Conflict diagnosis
-diagnosis = lawkit.diagnose_laws(
-    'complex_dataset.csv',
-    lawkit.LawkitOptions(format='json', report='detailed')
-)
-print(f"Diagnosis: {diagnosis.data}")
-```
-
-### Generate Sample Data
-
-```python
-import lawkit
-
-# Generate Benford Law compliant data
-benford_data = lawkit.generate_data('benf', samples=1000, seed=42)
-print(benford_data)
-
-# Generate normal distribution data
-normal_data = lawkit.generate_data('normal', samples=500, mean=100, stddev=15)
-
-# Generate Pareto distribution data
-pareto_data = lawkit.generate_data('pareto', samples=1000, concentration=0.8)
-
-# Test the pipeline: generate → analyze
-data = lawkit.generate_data('benf', samples=10000, seed=42)
-result = lawkit.analyze_string(data, 'benf', lawkit.LawkitOptions(output='json'))
-print(f"Generated data risk level: {result.risk_level}")
-```
-
-### Analyze String Data Directly
-
-```python
-import lawkit
-
-# Analyze CSV data from string
-csv_data = """amount
-123.45
-456.78
-789.12
-234.56
-567.89"""
-
-result = lawkit.analyze_string(
-    csv_data,
-    'benf',
-    lawkit.LawkitOptions(format='json')
-)
-print(f"Risk assessment: {result.risk_level}")
-
-# Analyze JSON data
-json_data = '{"values": [12, 23, 34, 45, 56, 67, 78, 89]}'
-result = lawkit.analyze_string(
-    json_data,
-    'normal',
-    lawkit.LawkitOptions(format='json')
-)
-print(f"Is normal: {result.p_value > 0.05}")
-```
-
-### Advanced Options
-
-```python
-import lawkit
-
-# High-performance analysis with optimization
-result = lawkit.analyze_benford(
-    'large_dataset.csv',
-    lawkit.LawkitOptions(
-        optimize=True,
-        parallel=True,
-        memory_efficient=True,
-        min_count=50,
-        threshold=0.001
-    )
-)
-
-# International number support
-result = lawkit.analyze_benford(
-    'japanese_accounting.csv',
-    lawkit.LawkitOptions(
-        international=True,
-        format='csv',
-        output='json'
-    )
-)
-
-# Time series analysis
-result = lawkit.analyze_normal(
-    'sensor_data.csv',
-    lawkit.LawkitOptions(
-        time_series=True,
-        outlier_detection=True,
-        output='json'
-    )
-)
-```
-
-### Legacy API (Backward Compatibility)
-
-```python
-from lawkit import run_lawkit
-
-# Direct command execution
-result = run_lawkit(["benf", "data.csv", "--format", "csv", "--output", "json"])
-
-if result.returncode == 0:
-    print("Analysis successful")
-    print(result.stdout)
-else:
-    print("Analysis failed")
-    print(result.stderr)
-
-# Legacy analysis functions
-from lawkit.compat import run_benford_analysis, run_pareto_analysis
-
-benford_result = run_benford_analysis("financial.csv", format="csv", output="json")
-pareto_result = run_pareto_analysis("sales.csv", gini_coefficient=True)
-```
-
-## Installation and Setup
-
-### Automatic Installation (Recommended)
-
-```bash
-pip install lawkit-python
-```
-
-The binary is pre-embedded in the wheel for your platform.
-
-### Manual Binary Installation
-
-If automatic download fails:
-
-```bash
-lawkit-download-binary
-```
-
-### Development Installation
-
-```bash
-git clone https://github.com/kako-jun/lawkit
-cd lawkit/lawkit-python
-pip install -e .[dev]
-```
-
-### Verify Installation
-
-```python
-import lawkit
-
-# Check if lawkit is available
-if lawkit.is_lawkit_available():
-    print("✅ lawkit is installed and working")
-    print(f"Version: {lawkit.get_version()}")
-else:
-    print("❌ lawkit is not available")
-
-# Run self-test
-if lawkit.selftest():
-    print("✅ All tests passed")
-else:
-    print("❌ Self-test failed")
-```
-
-## Use Cases
-
-### Financial Fraud Detection
-
-```python
-import lawkit
-
-# Analyze invoice amounts for fraud
-result = lawkit.analyze_benford('invoices.csv',
-    lawkit.LawkitOptions(output='json'))
-
-if result.risk_level in ['High', 'Critical']:
-    print("🚨 Potential fraud detected in invoice data")
-    print(f"Statistical significance: p={result.p_value:.6f}")
-    print(f"Deviation from Benford Law: {result.mad:.2f}%")
-```
-
-### Business Intelligence
-
-```python
-import lawkit
-
-# Analyze customer revenue distribution
-result = lawkit.analyze_pareto('customer_revenue.csv',
-    lawkit.LawkitOptions(
-        output='json',
-        business_analysis=True,
-        gini_coefficient=True
-    ))
-
-print(f"Revenue concentration: {result.concentration_80_20:.1f}%")
-print(f"Market inequality: {result.gini_coefficient:.3f}")
-```
-
-### Quality Control
-
-```python
-import lawkit
-
-# Analyze manufacturing measurements
-result = lawkit.analyze_normal('measurements.csv',
-    lawkit.LawkitOptions(
-        output='json',
-        outlier_detection=True,
-        test_type='shapiro'
-    ))
-
-if result.p_value < 0.05:
-    print("⚠️ Process out of control - not following normal distribution")
-if result.outliers:
-    print(f"Found {len(result.outliers)} outlying measurements")
-```
-
-### Text Analysis
-
-```python
-import lawkit
-
-# Analyze word frequency in documents
-result = lawkit.analyze_zipf('document.txt',
-    lawkit.LawkitOptions(output='json'))
-
-print(f"Text follows Zipf Law: {result.p_value > 0.05}")
-print(f"Power law exponent: {result.exponent:.3f}")
-```
-
-## API Reference
-
-### Main Functions
-
-- `analyze_benford(input_data, options)` - Benford Law analysis
-- `analyze_pareto(input_data, options)` - Pareto principle analysis
-- `analyze_zipf(input_data, options)` - Zipf Law analysis
-- `analyze_normal(input_data, options)` - Normal distribution analysis
-- `analyze_poisson(input_data, options)` - Poisson distribution analysis
-- `analyze_laws(input_data, options)` - Multi-law analysis
-- `validate_laws(input_data, options)` - Data validation and consistency check
-- `diagnose_laws(input_data, options)` - Conflict diagnosis and detailed reporting
-- `generate_data(law_type, samples, **kwargs)` - Generate sample data
-- `analyze_string(content, law_type, options)` - Analyze string data directly
-
-### Utility Functions
-
-- `is_lawkit_available()` - Check if lawkit CLI is available
-- `get_version()` - Get lawkit version
-- `selftest()` - Run self-test
-
-### Classes
-
-- `LawkitOptions` - Configuration options for analysis
-- `LawkitResult` - Analysis results with structured access
-- `LawkitError` - Exception class for lawkit errors
-
-## Platform Support
-
-- **Windows**: x86_64
-- **macOS**: x86_64, ARM64 (Apple Silicon)
-- **Linux**: x86_64, ARM64
-
-## Requirements
-
-- Python 3.8+
-- No additional dependencies required
-
-## License
-
-This project is licensed under the MIT License.
-
-## Support
-
-- GitHub Issues: https://github.com/kako-jun/lawkit/issues
-- Documentation: https://github.com/kako-jun/lawkit/tree/main/docs
-- Examples: https://github.com/kako-jun/lawkit/tree/main/docs/user-guide/examples.md
-
-## Contributing
-
-Contributions are welcome! Please read the [Contributing Guide](https://github.com/kako-jun/lawkit/blob/main/CONTRIBUTING.md) for details.