pystylometry 1.3.0__py3-none-any.whl → 1.3.5__py3-none-any.whl
- pystylometry/__init__.py +42 -3
- pystylometry/_types.py +205 -3
- pystylometry/cli.py +321 -0
- pystylometry/lexical/__init__.py +5 -1
- pystylometry/lexical/repetition.py +506 -0
- pystylometry/lexical/ttr.py +288 -97
- pystylometry-1.3.5.dist-info/LICENSE +21 -0
- pystylometry-1.3.5.dist-info/METADATA +78 -0
- {pystylometry-1.3.0.dist-info → pystylometry-1.3.5.dist-info}/RECORD +11 -9
- {pystylometry-1.3.0.dist-info → pystylometry-1.3.5.dist-info}/WHEEL +1 -1
- {pystylometry-1.3.0.dist-info → pystylometry-1.3.5.dist-info}/entry_points.txt +1 -0
- pystylometry-1.3.0.dist-info/METADATA +0 -136
@@ -1,136 +0,0 @@
-Metadata-Version: 2.4
-Name: pystylometry
-Version: 1.3.0
-Summary: Comprehensive Python package for stylometric analysis
-License: MIT
-Keywords: stylometry,nlp,text-analysis,authorship,readability,lexical-diversity,readability-metrics
-Author: Craig Trim
-Author-email: craigtrim@gmail.com
-Requires-Python: >=3.9,<4.0
-Classifier: Development Status :: 4 - Beta
-Classifier: Intended Audience :: Developers
-Classifier: Intended Audience :: Science/Research
-Classifier: License :: OSI Approved :: MIT License
-Classifier: Programming Language :: Python :: 3
-Classifier: Programming Language :: Python :: 3.9
-Classifier: Programming Language :: Python :: 3.10
-Classifier: Programming Language :: Python :: 3.11
-Classifier: Programming Language :: Python :: 3.12
-Classifier: Programming Language :: Python :: 3.13
-Classifier: Programming Language :: Python :: 3.14
-Classifier: Topic :: Scientific/Engineering :: Information Analysis
-Classifier: Topic :: Text Processing :: Linguistic
-Classifier: Typing :: Typed
-Requires-Dist: stylometry-ttr (>=1.0.3,<2.0.0)
-Project-URL: Homepage, https://github.com/craigtrim/pystylometry
-Project-URL: Issues, https://github.com/craigtrim/pystylometry/issues
-Project-URL: Repository, https://github.com/craigtrim/pystylometry
-Description-Content-Type: text/markdown
-
-# pystylometry
-
-[](https://badge.fury.io/py/pystylometry)
-[](https://pepy.tech/project/pystylometry)
-[](https://www.python.org/downloads/)
-[](https://opensource.org/licenses/MIT)
-[]()
-
-Stylometric analysis and authorship attribution for Python. 50+ metrics across 11 modules, from vocabulary diversity to AI-generation detection.
-
-## Install
-
-```bash
-pip install pystylometry # Core (lexical metrics)
-pip install pystylometry[all] # Everything
-```
-
-<details>
-<summary>Individual extras</summary>
-
-```bash
-pip install pystylometry[readability] # Readability formulas (pronouncing, spaCy)
-pip install pystylometry[syntactic] # POS/parse analysis (spaCy)
-pip install pystylometry[authorship] # Attribution methods
-pip install pystylometry[ngrams] # N-gram entropy
-pip install pystylometry[viz] # Matplotlib visualizations
-```
-</details>
-
-## Usage
-
-```python
-from pystylometry.lexical import compute_mtld, compute_yule
-from pystylometry.readability import compute_flesch
-
-result = compute_mtld(text)
-print(result.mtld_average) # 72.4
-
-result = compute_flesch(text)
-print(result.reading_ease) # 65.2
-print(result.grade_level) # 8.1
-```
-
-Every function returns a typed dataclass with the score, components, and metadata -- never a bare float.
-
-### Unified API
-
-```python
-from pystylometry import analyze
-
-results = analyze(text, lexical=True, readability=True, syntactic=True)
-```
-
-### Style Drift Detection
-
-Detect authorship changes, spliced content, and AI-generated text within a single document.
-
-```python
-from pystylometry.consistency import compute_kilgarriff_drift
-
-result = compute_kilgarriff_drift(document)
-print(result.pattern) # "sudden_spike"
-print(result.pattern_confidence) # 0.71
-print(result.max_location) # Window 23 -- the splice point
-```
-
-### CLI
-
-```bash
-pystylometry-drift manuscript.txt --window-size=500 --stride=250
-pystylometry-viewer report.html
-```
-
-## Modules
-
-| Module | Metrics | Description |
-|--------|---------|-------------|
-| [**lexical**](pystylometry/lexical/) | TTR, MTLD, Yule's K/I, Hapax, MATTR, VocD-D, HD-D, MSTTR, function words, word frequency | Vocabulary diversity and richness |
-| [**readability**](pystylometry/readability/) | Flesch, Flesch-Kincaid, SMOG, Gunning Fog, Coleman-Liau, ARI, Dale-Chall, Fry, FORCAST, Linsear Write, Powers-Sumner-Kearl | Grade-level and difficulty scoring |
-| [**syntactic**](pystylometry/syntactic/) | POS ratios, sentence types, parse tree depth, clausal density, passive voice, T-units, dependency distance | Sentence and parse structure (requires spaCy) |
-| [**authorship**](pystylometry/authorship/) | Burrows' Delta, Cosine Delta, Zeta, Kilgarriff chi-squared, MinMax, John's Delta, NCD | Author attribution and text comparison |
-| [**stylistic**](pystylometry/stylistic/) | Contractions, hedges, intensifiers, modals, punctuation, vocabulary overlap (Jaccard/Dice/Cosine/KL), cohesion, genre/register | Style markers and text similarity |
-| [**character**](pystylometry/character/) | Letter frequencies, digit/uppercase ratios, special characters, whitespace | Character-level fingerprinting |
-| [**ngrams**](pystylometry/ngrams/) | Word/character/POS n-grams, Shannon entropy, skipgrams | N-gram profiles and entropy |
-| [**dialect**](pystylometry/dialect/) | British/American classification, spelling/grammar/vocabulary markers, markedness | Regional dialect detection |
-| [**consistency**](pystylometry/consistency/) | Sliding-window chi-squared drift, pattern classification | Intra-document style analysis |
-| [**prosody**](pystylometry/prosody/) | Syllable stress, rhythm regularity | Prose rhythm (requires spaCy) |
-| [**viz**](pystylometry/viz/) | Timeline, scatter, report (PNG + interactive HTML) | Drift detection visualization |
-
-## Development
-
-```bash
-git clone https://github.com/craigtrim/pystylometry && cd pystylometry
-pip install -e ".[dev,all]"
-make test # 1022 tests
-make lint # ruff + mypy
-make all # lint + test + build
-```
-
-## License
-
-MIT
-
-## Author
-
-Craig Trim -- craigtrim@gmail.com
-
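
Note on the removed README: it mentions TTR among the lexical metrics and a drift CLI that takes `--window-size` and `--stride`. For readers unfamiliar with those terms, the following is a minimal, standalone sketch of a type-token ratio computed over sliding token windows. It does not call pystylometry and makes no claim about how the package implements either feature; the function names `tokenize`, `sliding_windows`, and `type_token_ratio`, the regex tokenizer, and the sample text are all assumptions made for illustration.

```python
import re


def tokenize(text: str) -> list[str]:
    """Lowercase word tokenizer (an assumption; not pystylometry's tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def sliding_windows(tokens: list[str], window_size: int, stride: int):
    """Yield (start_index, window) for each full window of `window_size` tokens."""
    for start in range(0, len(tokens) - window_size + 1, stride):
        yield start, tokens[start:start + window_size]


def type_token_ratio(tokens: list[str]) -> float:
    """Unique tokens divided by total tokens; 0.0 for an empty input."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Hypothetical sample text, kept tiny so the windows are easy to inspect.
text = "the cat sat on the mat and the dog sat on the log"
tokens = tokenize(text)

for start, window in sliding_windows(tokens, window_size=6, stride=3):
    print(start, round(type_token_ratio(window), 2))
```

A stride smaller than the window size makes consecutive windows overlap, which is the shape suggested by the `--window-size=500 --stride=250` example in the removed README.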