neologdn 0.5.1__cp39-cp39-win32.whl → 0.5.4__cp39-cp39-win32.whl

Potentially problematic release.

@@ -0,0 +1,11 @@
+ Metadata-Version: 2.1
+ Name: neologdn
+ Version: 0.5.4
+ Summary: UNKNOWN
+ Home-page: UNKNOWN
+ License: UNKNOWN
+ Platform: UNKNOWN
+ License-File: LICENSE
+
+ UNKNOWN
+
@@ -0,0 +1,6 @@
+ neologdn.cp39-win32.pyd,sha256=j6ALzZ9-aIY6ljJmeq6Ld8uULRlmwrtX6N2s4V785GA,80384
+ neologdn-0.5.4.dist-info/LICENSE,sha256=by4ifwXxhgMhU4eP1DMM7g-VdtQb2JQSNm0cUfpZjHQ,11547
+ neologdn-0.5.4.dist-info/METADATA,sha256=keDf8x3KckLm6QflFBs3lWmlvt4ty38Njwx_3JwzN_0,166
+ neologdn-0.5.4.dist-info/WHEEL,sha256=fZeZcvO81TGKovPknVkGj2GQZ1CPMjuvGsfkCVevWxE,96
+ neologdn-0.5.4.dist-info/top_level.txt,sha256=lzaJi2z9LzYyyKzSzM8ZW85oXgUIXZRCKd629uxKRg8,9
+ neologdn-0.5.4.dist-info/RECORD,,
@@ -1,5 +1,5 @@
  Wheel-Version: 1.0
- Generator: bdist_wheel (0.36.2)
+ Generator: bdist_wheel (0.44.0)
  Root-Is-Purelib: false
  Tag: cp39-cp39-win32
 
neologdn.cp39-win32.pyd CHANGED
Binary file
@@ -1,187 +0,0 @@
- Metadata-Version: 2.1
- Name: neologdn
- Version: 0.5.1
- Summary: Japanese text normalizer for mecab-neologd
- Home-page: http://github.com/ikegami-yukino/neologdn
- Author: Yukino Ikegami
- Author-email: yknikgm@gmail.com
- License: Apache Software License
- Keywords: japanese,MeCab
- Platform: UNKNOWN
- Classifier: Development Status :: 3 - Alpha
- Classifier: Intended Audience :: Developers
- Classifier: Natural Language :: Japanese
- Classifier: License :: OSI Approved :: Apache Software License
- Classifier: Programming Language :: Cython
- Classifier: Programming Language :: Python
- Classifier: Programming Language :: Python :: 3
- Classifier: Programming Language :: Python :: 3.6
- Classifier: Programming Language :: Python :: 3.7
- Classifier: Programming Language :: Python :: 3.8
- Classifier: Programming Language :: Python :: 3.9
- Classifier: Topic :: Text Processing :: Linguistic
-
- neologdn
- ===========
-
- |travis| |pyversion| |version| |license|
-
- neologdn is a Japanese text normalizer for `mecab-neologd <https://github.com/neologd/mecab-ipadic-neologd>`_.
-
- The normalization is based on the neologd's rules:
- https://github.com/neologd/mecab-ipadic-neologd/wiki/Regexp.ja
-
-
- Contributions are welcome!
-
- NOTE: Installing this module requires C++11 compiler.
-
- Installation
- ------------
-
- ::
-
- $ pip install neologdn
-
- Usage
- -----
-
- .. code:: python
-
- import neologdn
- neologdn.normalize("ハンカクカナ")
- # => 'ハンカクカナ'
- neologdn.normalize("全角記号!?@#")
- # => '全角記号!?@#'
- neologdn.normalize("全角記号例外「・」")
- # => '全角記号例外「・」'
- neologdn.normalize("長音短縮ウェーーーーイ")
- # => '長音短縮ウェーイ'
- neologdn.normalize("チルダ削除ウェ~∼∾〜〰~イ")
- # => 'チルダ削除ウェイ'
- neologdn.normalize("いろんなハイフン˗֊‐‑‒–⁃⁻₋−")
- # => 'いろんなハイフン-'
- neologdn.normalize("   PRML  副 読 本   ")
- # => 'PRML副読本'
- neologdn.normalize(" Natural Language Processing ")
- # => 'Natural Language Processing'
- neologdn.normalize("かわいいいいいいいいい", repeat=6)
- # => 'かわいいいいいい'
- neologdn.normalize("無駄無駄無駄無駄ァ", repeat=1)
- # => '無駄ァ'
- neologdn.normalize("1995〜2001年", tilde="normalize")
- # => '1995~2001年'
- neologdn.normalize("1995~2001年", tilde="normalize_zenkaku")
- # => '1995〜2001年'
- neologdn.normalize("1995〜2001年", tilde="ignore") # Don't convert tilde
- # => '1995〜2001年'
- neologdn.normalize("1995〜2001年", tilde="remove")
- # => '19952001年'
- neologdn.normalize("1995〜2001年") # Default parameter
- # => '19952001年'
-
-
- Benchmark
- ----------
-
- .. code:: python
-
- # Sample code from
- # https://github.com/neologd/mecab-ipadic-neologd/wiki/Regexp.ja#python-written-by-hideaki-t--overlast
- import normalize_neologd
-
- %timeit normalize(normalize_neologd.normalize_neologd)
- # => 1 loop, best of 3: 18.3 s per loop
-
-
- import neologdn
- %timeit normalize(neologdn.normalize)
- # => 1 loop, best of 3: 9.05 s per loop
-
-
- neologdn is about x2 faster than sample code.
-
- details are described as the below notebook:
- https://github.com/ikegami-yukino/neologdn/blob/master/benchmark/benchmark.ipynb
-
-
- License
- -------
-
- Apache Software License.
-
-
- Contribution
- ------------
-
- Contributions are welcome! See: https://github.com/ikegami-yukino/neologdn/blob/master/.github/CONTRIBUTING.md
-
-
- .. |travis| image:: https://travis-ci.org/ikegami-yukino/neologdn.svg?branch=master
- :target: https://travis-ci.org/ikegami-yukino/neologdn
- :alt: travis-ci.org
-
- .. |version| image:: https://img.shields.io/pypi/v/neologdn.svg
- :target: http://pypi.python.org/pypi/neologdn/
- :alt: latest version
-
- .. |pyversion| image:: https://img.shields.io/pypi/pyversions/neologdn.svg
-
- .. |license| image:: https://img.shields.io/pypi/l/neologdn.svg
- :target: http://pypi.python.org/pypi/neologdn/
- :alt: license
-
-
-
- CHANGES
- ========
-
- 0.5.1 (2021-05-02)
- ----------------------------
-
- - Improve performance of shorten_repeat function (Many thanks @yskn67)
- - Add tilde option to normalize function
-
- 0.4 (2018-12-06)
- ----------------------------
-
- - Add shorten_repeat function, which shortening contiguous substring. For example: neologdn.normalize("無駄無駄無駄無駄ァ", repeat=1) -> 無駄ァ
-
- 0.3.2 (2018-05-17)
- ----------------------------
-
- - Add option for suppression removal of spaces between Japanese characters
-
- 0.2.2 (2018-03-10)
- ----------------------------
-
- - Fix bug (daku-ten & handaku-ten)
- - Support mac osx 10.13 (Many thanks @r9y9)
-
- 0.2.1 (2017-01-23)
- ----------------------------
-
- - Fix bug (Check if a previous character of daku-ten character is in maps) (Many thanks @unnonouno)
-
- 0.2 (2016-04-12)
- ----------------------------
-
- - Add lengthened expression (repeating character) threshold
-
- 0.1.2 (2016-03-29)
- ----------------------------
-
- - Fix installation bug
-
- 0.1.1.1 (2016-03-19)
- ----------------------------
-
- - Support Windows
- - Explicitly specify to -std=c++11 in build (Many thanks @id774)
-
- 0.1.1 (2015-10-10)
- ----------------------------
-
- Initial release.
-
-
@@ -1,6 +0,0 @@
- neologdn.cp39-win32.pyd,sha256=hJhd-sKgATHChyGhNMk1JdV6-aejcQiI5K1WMO8RuW0,67072
- neologdn-0.5.1.dist-info/LICENSE,sha256=by4ifwXxhgMhU4eP1DMM7g-VdtQb2JQSNm0cUfpZjHQ,11547
- neologdn-0.5.1.dist-info/METADATA,sha256=h1tDuRQHCFhMqGLkBNDAkuXCKQEKSOQ4keAfvuk1hL8,5213
- neologdn-0.5.1.dist-info/WHEEL,sha256=oz1Y-dEwtUqQxLRjBaFy1chmu8DINNM_NE3ttEIz2rw,96
- neologdn-0.5.1.dist-info/top_level.txt,sha256=lzaJi2z9LzYyyKzSzM8ZW85oXgUIXZRCKd629uxKRg8,9
- neologdn-0.5.1.dist-info/RECORD,,
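
The RECORD rows in this diff follow the wheel layout `path,sha256=<digest>,size`, where the digest is the URL-safe base64 encoding of the file's SHA-256 hash with the trailing `=` padding removed. As a rough sketch, the two RECORD listings can be compared in Python to see which files actually differ between 0.5.1 and 0.5.4. The helper names `record_digest`, `parse_record_row`, and `changed_files` are illustrative only and are not part of neologdn or any packaging tool. Matching files by base name is an assumption made here because the dist-info directory name embeds the version.

```python
import base64
import hashlib
import posixpath


def record_digest(data: bytes) -> str:
    """Hash field as written in RECORD: sha256=<urlsafe base64, no '=' padding>."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return "sha256=" + encoded


def parse_record_row(row: str):
    """Split one RECORD row into (path, hash_field, size).

    The row for RECORD itself has empty hash and size fields.
    """
    path, hash_field, size = row.rsplit(",", 2)
    return path, hash_field, (int(size) if size else None)


def changed_files(old_rows, new_rows):
    """Return base names whose hash or size differs between two listings.

    Assumption for this sketch: files are matched by base name, since the
    dist-info directory name contains the version and therefore always differs.
    """
    def index(rows):
        table = {}
        for row in rows:
            path, hash_field, size = parse_record_row(row)
            table[posixpath.basename(path)] = (hash_field, size)
        return table

    old, new = index(old_rows), index(new_rows)
    return sorted(name for name in old.keys() & new.keys() if old[name] != new[name])


# Rows copied from the two RECORD hunks in this diff.
OLD_RECORD = [
    "neologdn.cp39-win32.pyd,sha256=hJhd-sKgATHChyGhNMk1JdV6-aejcQiI5K1WMO8RuW0,67072",
    "neologdn-0.5.1.dist-info/LICENSE,sha256=by4ifwXxhgMhU4eP1DMM7g-VdtQb2JQSNm0cUfpZjHQ,11547",
    "neologdn-0.5.1.dist-info/METADATA,sha256=h1tDuRQHCFhMqGLkBNDAkuXCKQEKSOQ4keAfvuk1hL8,5213",
    "neologdn-0.5.1.dist-info/WHEEL,sha256=oz1Y-dEwtUqQxLRjBaFy1chmu8DINNM_NE3ttEIz2rw,96",
    "neologdn-0.5.1.dist-info/top_level.txt,sha256=lzaJi2z9LzYyyKzSzM8ZW85oXgUIXZRCKd629uxKRg8,9",
    "neologdn-0.5.1.dist-info/RECORD,,",
]
NEW_RECORD = [
    "neologdn.cp39-win32.pyd,sha256=j6ALzZ9-aIY6ljJmeq6Ld8uULRlmwrtX6N2s4V785GA,80384",
    "neologdn-0.5.4.dist-info/LICENSE,sha256=by4ifwXxhgMhU4eP1DMM7g-VdtQb2JQSNm0cUfpZjHQ,11547",
    "neologdn-0.5.4.dist-info/METADATA,sha256=keDf8x3KckLm6QflFBs3lWmlvt4ty38Njwx_3JwzN_0,166",
    "neologdn-0.5.4.dist-info/WHEEL,sha256=fZeZcvO81TGKovPknVkGj2GQZ1CPMjuvGsfkCVevWxE,96",
    "neologdn-0.5.4.dist-info/top_level.txt,sha256=lzaJi2z9LzYyyKzSzM8ZW85oXgUIXZRCKd629uxKRg8,9",
    "neologdn-0.5.4.dist-info/RECORD,,",
]

if __name__ == "__main__":
    print(changed_files(OLD_RECORD, NEW_RECORD))
    # → ['METADATA', 'WHEEL', 'neologdn.cp39-win32.pyd']
```

For the rows in this diff, LICENSE and top_level.txt keep the same hash and size, while METADATA shrinks from 5213 to 166 bytes. To check a file on disk against its row, `record_digest(open(path, "rb").read())` can be compared with the row's hash field.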