SinaTools 0.1.35__py2.py3-none-any.whl → 0.1.37__py2.py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
@@ -1,64 +1,63 @@
- Metadata-Version: 2.1
- Name: SinaTools
- Version: 0.1.35
- Summary: Open-source Python toolkit for Arabic Natural Understanding, allowing people to integrate it in their system workflow.
- Home-page: https://github.com/SinaLab/sinatools
- License: MIT license
- Keywords: sinatools
- Platform: UNKNOWN
- Description-Content-Type: text/markdown
- Requires-Dist: six
- Requires-Dist: farasapy
- Requires-Dist: tqdm
- Requires-Dist: requests
- Requires-Dist: regex
- Requires-Dist: pathlib
- Requires-Dist: torch (==1.13.0)
- Requires-Dist: transformers (==4.24.0)
- Requires-Dist: torchtext (==0.14.0)
- Requires-Dist: torchvision (==0.14.0)
- Requires-Dist: seqeval (==1.2.2)
- Requires-Dist: natsort (==7.1.1)
-
- SinaTools
- ======================
- Open Source Toolkit for Arabic NLP and NLU developed by [SinaLab](http://sina.birzeit.edu/) at Birzeit University. SinaTools is available through Python APIs, command lines, colabs, and online demos.
-
- See the full list of [Available Packages](https://sina.birzeit.edu/sinatools/), which include: (1) [Morphology Tagging](https://sina.birzeit.edu/sinatools/index.html#morph), (2) [Named Entity Recognition (NER)](https://sina.birzeit.edu/sinatools/index.html#ner), (3) [Word Sense Disambiguation (WSD)](https://sina.birzeit.edu/sinatools/index.html#wsd), (4) [Semantic Relatedness](https://sina.birzeit.edu/sinatools/index.html#sr), (5) [Synonymy Extraction and Evaluation](https://sina.birzeit.edu/sinatools/index.html#se), (6) [Relation Extraction](https://sina.birzeit.edu/sinatools/index.html#re), (7) [Utilities](https://sina.birzeit.edu/sinatools/index.html#u) (diacritic-based word matching, Jaccard similarly, parser, tokenizers, corpora processing, transliteration, etc).
-
- See [Demo Pages](https://sina.birzeit.edu/sinatools/).
-
- See the [benchmarking](https://www.jarrar.info/publications/HJK24.pdf), which shows that SinaTools outperformed all related toolkits.
-
- Installation
- --------
- To install SinaTools, ensure you are using Python version 3.10.8, then clone the [GitHub](git://github.com/SinaLab/SinaTools) repository.
-
- Alternatively, you can execute the following command:
-
- ```bash
- pip install sinatools
- ```
-
- Installing Models and Data Files
- --------
- Some modules in SinaTools require some data files and fine-tuned models to be downloaded. To download these models, please consult the [DataDownload](https://sina.birzeit.edu/sinatools/documentation/cli_tools/DataDownload/DataDownload.html).
-
- Documentation
- --------
- For information, please refer to the [main page](https://sina.birzeit.edu/sinatools) or the [online domuementation](https://sina.birzeit.edu/sinatools/documentation).
-
- Citation
- -------
- Tymaa Hammouda, Mustafa Jarrar, Mohammed Khalilia: [SinaTools: Open Source Toolkit for Arabic Natural Language Understanding](http://www.jarrar.info/publications/HJK24.pdf). In Proceedings of the 2024 AI in Computational Linguistics (ACLing 2024), Procedia Computer Science, Dubai. ELSEVIER.
-
- License
- --------
- SinaTools is available under the MIT License. See the [LICENSE](https://github.com/SinaLab/sinatools/blob/main/LICENSE) file for more information.
-
- Reporting Issues
- --------
- To report any issues or bugs, please contact us at "sina.institute.bzu@gmail.com" or visit [SinaTools Issues](https://github.com/SinaLab/sinatools/issues).
-
-
-
+ Metadata-Version: 2.1
+ Name: SinaTools
+ Version: 0.1.37
+ Summary: Open-source Python toolkit for Arabic Natural Understanding, allowing people to integrate it in their system workflow.
+ Home-page: https://github.com/SinaLab/sinatools
+ License: MIT license
+ Keywords: sinatools
+ Description-Content-Type: text/markdown
+ License-File: LICENSE
+ License-File: AUTHORS.rst
+ Requires-Dist: six
+ Requires-Dist: farasapy
+ Requires-Dist: tqdm
+ Requires-Dist: requests
+ Requires-Dist: regex
+ Requires-Dist: pathlib
+ Requires-Dist: torch ==1.13.0
+ Requires-Dist: transformers ==4.24.0
+ Requires-Dist: torchtext ==0.14.0
+ Requires-Dist: torchvision ==0.14.0
+ Requires-Dist: seqeval ==1.2.2
+ Requires-Dist: natsort ==7.1.1
+
+ SinaTools
+ ======================
+ Open Source Toolkit for Arabic NLP and NLU developed by [SinaLab](http://sina.birzeit.edu/) at Birzeit University. SinaTools is available through Python APIs, command lines, colabs, and online demos.
+
+ See the full list of [Available Packages](https://sina.birzeit.edu/sinatools/), which include: (1) [Morphology Tagging](https://sina.birzeit.edu/sinatools/index.html#morph), (2) [Named Entity Recognition (NER)](https://sina.birzeit.edu/sinatools/index.html#ner), (3) [Word Sense Disambiguation (WSD)](https://sina.birzeit.edu/sinatools/index.html#wsd), (4) [Semantic Relatedness](https://sina.birzeit.edu/sinatools/index.html#sr), (5) [Synonymy Extraction and Evaluation](https://sina.birzeit.edu/sinatools/index.html#se), (6) [Relation Extraction](https://sina.birzeit.edu/sinatools/index.html#re), (7) [Utilities](https://sina.birzeit.edu/sinatools/index.html#u) (diacritic-based word matching, Jaccard similarly, parser, tokenizers, corpora processing, transliteration, etc).
+
+ See [Demo Pages](https://sina.birzeit.edu/sinatools/).
+
+ See the [benchmarking](https://www.jarrar.info/publications/HJK24.pdf), which shows that SinaTools outperformed all related toolkits.
+
+ Installation
+ --------
+ To install SinaTools, ensure you are using Python version 3.10.8, then clone the [GitHub](git://github.com/SinaLab/SinaTools) repository.
+
+ Alternatively, you can execute the following command:
+
+ ```bash
+ pip install sinatools
+ ```
+
+ Installing Models and Data Files
+ --------
+ Some modules in SinaTools require some data files and fine-tuned models to be downloaded. To download these models, please consult the [DataDownload](https://sina.birzeit.edu/sinatools/documentation/cli_tools/DataDownload/DataDownload.html).
+
+ Documentation
+ --------
+ For information, please refer to the [main page](https://sina.birzeit.edu/sinatools) or the [online domuementation](https://sina.birzeit.edu/sinatools/documentation).
+
+ Citation
+ -------
+ Tymaa Hammouda, Mustafa Jarrar, Mohammed Khalilia: [SinaTools: Open Source Toolkit for Arabic Natural Language Understanding](http://www.jarrar.info/publications/HJK24.pdf). In Proceedings of the 2024 AI in Computational Linguistics (ACLing 2024), Procedia Computer Science, Dubai. ELSEVIER.
+
+ License
+ --------
+ SinaTools is available under the MIT License. See the [LICENSE](https://github.com/SinaLab/sinatools/blob/main/LICENSE) file for more information.
+
+ Reporting Issues
+ --------
+ To report any issues or bugs, please contact us at "sina.institute.bzu@gmail.com" or visit [SinaTools Issues](https://github.com/SinaLab/sinatools/issues).
+
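Note on the pinned dependencies in the METADATA hunk: 0.1.35 writes `torch (==1.13.0)` and 0.1.37 writes `torch ==1.13.0`. These are two spellings of the same version pin, so the dependency set itself is unchanged. The sketch below uses only the standard library to reduce both spellings to one form for comparison. `normalize_requirement` is a hypothetical helper written for this note, not part of SinaTools or of any packaging tool, and it assumes simple lines like the ones in this diff (a name plus an optional version specifier, no extras or environment markers).

```python
import re

# Hypothetical helper for illustration only.
# Accepts "name", "name (spec)" and "name spec"; nothing more elaborate.
_REQUIREMENT = re.compile(r"^\s*([A-Za-z0-9._-]+)\s*\(?\s*([^()]*?)\s*\)?\s*$")


def normalize_requirement(line: str) -> str:
    """Return 'name' + 'spec' with parentheses and spaces removed."""
    match = _REQUIREMENT.match(line)
    if match is None:
        raise ValueError(f"unsupported requirement line: {line!r}")
    name, spec = match.groups()
    return f"{name.lower()}{spec.replace(' ', '')}"


old_style = normalize_requirement("torch (==1.13.0)")  # 0.1.35 spelling
new_style = normalize_requirement("torch ==1.13.0")    # 0.1.37 spelling
print(old_style == new_style)
```

A real comparison tool would more likely rely on a full requirement parser; the regular expression here is only enough to show that the two notations carry the same pin.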
@@ -1,10 +1,10 @@
- SinaTools-0.1.35.data/data/sinatools/environment.yml,sha256=OzilhLjZbo_3nU93EQNUFX-6G5O3newiSWrwxvMH2Os,7231
- sinatools/VERSION,sha256=cVbVTfIguj1zWCurwk_MTvuyWUDhNgp0IfcGYvhdzcY,6
+ SinaTools-0.1.37.data/data/sinatools/environment.yml,sha256=OzilhLjZbo_3nU93EQNUFX-6G5O3newiSWrwxvMH2Os,7231
+ sinatools/VERSION,sha256=rds3CaJrvi4kNl0xJMt9fYHplBe78mGMmyBFfi9Zsco,6
  sinatools/__init__.py,sha256=bEosTU1o-FSpyytS6iVP_82BXHF2yHnzpJxPLYRbeII,135
  sinatools/environment.yml,sha256=OzilhLjZbo_3nU93EQNUFX-6G5O3newiSWrwxvMH2Os,7231
  sinatools/install_env.py,sha256=EODeeE0ZzfM_rz33_JSIruX03Nc4ghyVOM5BHVhsZaQ,404
  sinatools/sinatools.py,sha256=vR5AaF0iel21LvsdcqwheoBz0SIj9K9I_Ub8M8oA98Y,20
- sinatools/CLI/DataDownload/download_files.py,sha256=u_DFXbHcIU_4Ub5Y0cL9_p1hL8h6LLWPemn9Al-XFgc,2603
+ sinatools/CLI/DataDownload/download_files.py,sha256=EezvbukR3pZ8s6mGZnzTcjsbo3CBDlC0g6KhJWlYp1w,2686
  sinatools/CLI/morphology/ALMA_multi_word.py,sha256=rmpa72twwIJHme_kpQ1lu3_7y_Jorj70QTvOnQMJRuI,1274
  sinatools/CLI/morphology/morph_analyzer.py,sha256=HPamEKos_JRYCJv_2q6c12N--da58_JXTno9haww5Ao,3497
  sinatools/CLI/ner/corpus_entity_extractor.py,sha256=DdvigsDQzko5nJBjzUXlIDqoBMBTVzktjSo7JfEXTIA,4778
@@ -77,13 +77,11 @@ sinatools/morphology/ALMA_multi_word.py,sha256=hj_-8ojrYYHnfCGk8WKtJdUR8mauzQdma
  sinatools/morphology/__init__.py,sha256=I4wVBh8BhyNl-CySVdiI_nUSn6gj1j-gmLKP300RpE0,1216
  sinatools/morphology/morph_analyzer.py,sha256=JOH2UWKNQWo5UzpWNzP9R1D3B3qLSogIiMp8n0N_56o,7177
  sinatools/ner/__init__.py,sha256=59kLMX6UQhF6JpE10RhaDYC3a2_jiWOIVPuejsoflFE,1050
- sinatools/ner/data.py,sha256=lvOW86dXse8SC75Q0supQaE0rrRffoxNjIA0Qbv5WZY,4354
  sinatools/ner/data_format.py,sha256=7Yt0aOicOn9_YuuyCkM_IYi_rgjGYxR9bCuUaNGM73o,4341
  sinatools/ner/datasets.py,sha256=mG1iwqSm3lXCFHLqE-b4wNi176cpuzNBz8tKaBU6z6M,5059
  sinatools/ner/entity_extractor.py,sha256=O2epRwRFUUcQs3SnFIYHVBI4zVhr8hRcj0XJYeby4ts,3588
  sinatools/ner/helpers.py,sha256=dnOoDY5JMyOLTUWVIZLMt8mBn2IbWlVaqHhQyjs1voo,2343
  sinatools/ner/metrics.py,sha256=Irz6SsIvpOzGIA2lWxrEV86xnTnm0TzKm9SUVT4SXUU,2734
- sinatools/ner/relation_extractor.py,sha256=a85xGX6V72fDpJk0GKmmtlWf8S8ezY-2pm5oGc9_ESY,9750
  sinatools/ner/transforms.py,sha256=vti3mDdi-IRP8i0aTQ37QqpPlP9hdMmJ6_bAMa0uL-s,4871
  sinatools/ner/data/__init__.py,sha256=W0C1ge_XxTfmdEGz0hkclz57aLI5VFS5t6BjByCfkFk,57
  sinatools/ner/data/datasets.py,sha256=lcdDDenFMEKIGYQmfww2dk_9WKWrJO9HtKptaAEsRmY,5064
@@ -93,9 +91,9 @@ sinatools/ner/nn/BertNestedTagger.py,sha256=_fwAn1kiKmXe6m5y16Ipty3kvXIEFEmiUq74
  sinatools/ner/nn/BertSeqTagger.py,sha256=dFcBBiMw2QCWsyy7aQDe_PS3aRuNn4DOxKIHgTblFvc,504
  sinatools/ner/nn/__init__.py,sha256=UgQD_XLNzQGBNSYc_Bw1aRJZjq4PJsnMT1iZwnJemqE,170
  sinatools/ner/trainers/BaseTrainer.py,sha256=Ifz4SeTxJwVn1_uWZ3I9KbcSo2hLPN3ojsIYuoKE9wE,4050
- sinatools/ner/trainers/BertNestedTrainer.py,sha256=Pb4O2WeBmTvV3hHMT6DXjxrTzgtuh3OrKQZnogYy8RQ,8429
- sinatools/ner/trainers/BertTrainer.py,sha256=B_uVtUwfv_eFwMMPsKQvZgW_ZNLy6XEsX5ePR0s8d-k,6433
- sinatools/ner/trainers/__init__.py,sha256=UDok8pDDpYOpwRBBKVLKaOgSUlmqqb-zHZI1p0xPxzI,188
+ sinatools/ner/trainers/BertNestedTrainer.py,sha256=iJOah69tXZsAXBimqP0odEsk8SPX4A355riePzW2BFs,8632
+ sinatools/ner/trainers/BertTrainer.py,sha256=BtttsrHPolmK3eRDqrgVUuv6lVMuImIeskxhi02Q-44,6596
+ sinatools/ner/trainers/__init__.py,sha256=Xnbi_M4KKJRqV7FJe1vklyT0nEW2Q2obxgcWkbR0ZbA,190
  sinatools/relations/__init__.py,sha256=cYjsP2mlTYvAwVIEFtgA6i9gLUSkGVOuDggMs7TvG5k,272
  sinatools/relations/relation_extractor.py,sha256=UuDlaaR0ch9BFv4sBF1tr7P-P9xq8oRZF41tAze6_ok,9751
  sinatools/semantic_relatedness/__init__.py,sha256=S0xrmqtl72L02N56nbNMudPoebnYQgsaIyyX-587DsU,830
@@ -104,24 +102,22 @@ sinatools/synonyms/__init__.py,sha256=yMuphNZrm5XLOR2T0weOHcUysJm-JKHUmVLoLQO839
  sinatools/synonyms/synonyms_generator.py,sha256=jRd0D3_kn-jYBaZzqY-7oOy0SFjSJ-mjM7JhsySzX58,9037
  sinatools/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
  sinatools/utils/charsets.py,sha256=rs82oZJqRqosZdTKXfFAJfJ5t4PxjMM_oAPsiWSWuwU,2817
- sinatools/utils/implication.py,sha256=MsbI6S1LNY-fCxGMxFTuaV639r3QijkkdcfH48rvY7A,27804
- sinatools/utils/jaccard.py,sha256=kLIptPNB2VIqnemVve9auyOL1kXHIsCkKCEwxFM8yP4,10114
  sinatools/utils/parser.py,sha256=qvHdln5R5CAv_0UOJWe0mcp8JCsGqgazoeIIkoALH88,6259
  sinatools/utils/readfile.py,sha256=xE4LEaCqXJIk9v37QUSSmWb-aY3UnCFUNb7uVdx3cpM,133
- sinatools/utils/similarity.py,sha256=CgKOJpRAU5UaSjOg-sdZcACCNl9tuKDRwdFAKATCL_w,10762
+ sinatools/utils/similarity.py,sha256=HAK6OmyVnfjPm0GWL3z9s4ZoUwpZHVKxt3CeSMfqLIQ,11990
  sinatools/utils/text_dublication_detector.py,sha256=FeSkbfWGMQluz23H4CBHXION-walZPgjueX6AL8u_Q0,5660
  sinatools/utils/text_transliteration.py,sha256=F3smhr2AEJtySE6wGQsiXXOslTvSDzLivTYu0btgc10,8769
  sinatools/utils/tokenizer.py,sha256=nyk6lh5-p38wrU62hvh4wg7ni9ammkdqqIgcjbbBxxo,6965
  sinatools/utils/tokenizers_words.py,sha256=efNfOil9qDNVJ9yynk_8sqf65PsL-xtsHG7y2SZCkjQ,656
  sinatools/utils/word_compare.py,sha256=rS2Z74sf7R-7MTXyrFj5miRi2TnSG9OdTDp_qQYuo2Y,28200
  sinatools/wsd/__init__.py,sha256=mwmCUurOV42rsNRpIUP3luG0oEzeTfEx3oeDl93Oif8,306
- sinatools/wsd/disambiguator.py,sha256=9ottQn_WwOFX5Trr0Rpg66-Jpaln5yJduFqP6cdOOBA,22616
+ sinatools/wsd/disambiguator.py,sha256=h-3idc5rPPbMDSE_QVJAsEVkDHwzYY3L2SEPNXIdOcc,20104
  sinatools/wsd/settings.py,sha256=6XflVTFKD8SVySX9Wj7zYQtV26WDTcQ2-uW8-gDNHKE,747
  sinatools/wsd/wsd.py,sha256=gHIBUFXegoY1z3rRnIlK6TduhYq2BTa_dHakOjOlT4k,4434
- SinaTools-0.1.35.dist-info/AUTHORS.rst,sha256=aTWeWlIdfLi56iLJfIUAwIrmqDcgxXKLji75_Fjzjyg,174
- SinaTools-0.1.35.dist-info/LICENSE,sha256=uwsKYG4TayHXNANWdpfMN2lVW4dimxQjA_7vuCVhD70,1088
- SinaTools-0.1.35.dist-info/METADATA,sha256=N1gUEgccLIIpfCHthFpI-2HU01LogkZWo1C-1qANx5M,3267
- SinaTools-0.1.35.dist-info/WHEEL,sha256=6T3TYZE4YFi2HTS1BeZHNXAi8N52OZT4O-dJ6-ome_4,116
- SinaTools-0.1.35.dist-info/entry_points.txt,sha256=-YGM-r0_UtNPnI0C4UcK1ptrpwFZpUhxdy2qHkehNCo,1303
- SinaTools-0.1.35.dist-info/top_level.txt,sha256=8tNdPTeJKw3TQCaua8IJIx6N6WpgZZmVekf1OdBNJpE,10
- SinaTools-0.1.35.dist-info/RECORD,,
+ SinaTools-0.1.37.dist-info/AUTHORS.rst,sha256=aTWeWlIdfLi56iLJfIUAwIrmqDcgxXKLji75_Fjzjyg,174
+ SinaTools-0.1.37.dist-info/LICENSE,sha256=uwsKYG4TayHXNANWdpfMN2lVW4dimxQjA_7vuCVhD70,1088
+ SinaTools-0.1.37.dist-info/METADATA,sha256=1OAigouXXSSaZ3MpOAxxAHfh5yPltiXjaOGe656KjTc,3346
+ SinaTools-0.1.37.dist-info/WHEEL,sha256=DZajD4pwLWue70CAfc7YaxT1wLUciNBvN_TTcvXpltE,110
+ SinaTools-0.1.37.dist-info/entry_points.txt,sha256=_CsRKM_tSCWV5hefBNUsWf9_6DrJnzFlxeAo1wm5XqY,1302
+ SinaTools-0.1.37.dist-info/top_level.txt,sha256=8tNdPTeJKw3TQCaua8IJIx6N6WpgZZmVekf1OdBNJpE,10
+ SinaTools-0.1.37.dist-info/RECORD,,
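Note on reading the RECORD entries: each line has the form `path,sha256=<digest>,<size>`, where the digest is the SHA-256 of the file encoded as URL-safe base64 with the trailing `=` padding removed, and the size is in bytes. A changed digest therefore means the file's bytes changed, even when the size stays the same (as for `sinatools/VERSION`). The sketch below rebuilds one entry with the standard library. `record_entry` is a hypothetical name chosen for this note, and the sketch ignores the CSV quoting that real RECORD files use for paths containing commas.

```python
import base64
import hashlib


def record_entry(path: str, data: bytes) -> str:
    """Build a RECORD-style line: path, sha256=<urlsafe base64, no padding>, size."""
    digest = hashlib.sha256(data).digest()
    encoded = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return f"{path},sha256={encoded},{len(data)}"


# An empty file reproduces the unchanged sinatools/utils/__init__.py entry (size 0).
print(record_entry("sinatools/utils/__init__.py", b""))
# sinatools/utils/__init__.py,sha256=47DEQpj8HBSa-_TImW-5JCeuQeRkm5NMpJWZG3hSuFU,0
```

The same function can be pointed at the bytes of any unpacked wheel member to check it against its RECORD line.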
@@ -1,6 +1,6 @@
- Wheel-Version: 1.0
- Generator: bdist_wheel (0.34.2)
- Root-Is-Purelib: true
- Tag: py2-none-any
- Tag: py3-none-any
-
+ Wheel-Version: 1.0
+ Generator: bdist_wheel (0.43.0)
+ Root-Is-Purelib: true
+ Tag: py2-none-any
+ Tag: py3-none-any
+
@@ -20,4 +20,3 @@ sentence_tokenizer = sinatools.CLI.utils.sentence_tokenizer:main
  text_dublication_detector = sinatools.CLI.utils.text_dublication_detector:main
  transliterate = sinatools.CLI.utils.text_transliteration:main
  wsd = sinatools.CLI.wsd.disambiguator:main
-
@@ -52,16 +52,17 @@ def main():
      for file in args.files:
          print("file: ", file)
          if file == "wsd":
-             #download_file(urls["morph"])
-             #download_file(urls["ner"])
+             download_file(urls["morph"])
+             download_file(urls["ner"])
              #download_file(urls["wsd_model"])
-             download_folder_from_hf("SinaLab/ArabGlossBERT", "bert-base-arabertv02_22_May_2021_00h_allglosses_unused01")
              #download_file(urls["wsd_tokenizer"])
-             #download_file(urls["one_gram"])
-             #download_file(urls["five_grams"])
-             #download_file(urls["four_grams"])
-             #download_file(urls["three_grams"])
-             #download_file(urls["two_grams"])
+             download_folder_from_hf("SinaLab/ArabGlossBERT", "bert-base-arabertv02_22_May_2021_00h_allglosses_unused01")
+             download_folder_from_hf("SinaLab/ArabGlossBERT", "bert-base-arabertv02")
+             download_file(urls["one_gram"])
+             download_file(urls["five_grams"])
+             download_file(urls["four_grams"])
+             download_file(urls["three_grams"])
+             download_file(urls["two_grams"])
          elif file == "synonyms":
              download_file(urls["graph_l2"])
              download_file(urls["graph_l3"])
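Note on the `wsd` branch above: in 0.1.35 the only active call was a single `download_folder_from_hf` for the ArabGlossBERT folder, with every `download_file` call commented out. In 0.1.37 the morphology, NER and n-gram downloads are re-enabled and a second Hugging Face folder, `bert-base-arabertv02`, is added. The sketch below lists the resulting sequence as plain data, read off the `+` lines of the hunk. `WSD_STEPS_0_1_37` and `describe_steps` are hypothetical names used only here; nothing is downloaded, and the real `urls` mapping is not shown in this diff.

```python
# Steps the "wsd" branch runs in 0.1.37, in the order they appear in the hunk.
# Hypothetical names for illustration; no network access happens here.
HF_REPO = "SinaLab/ArabGlossBERT"

WSD_STEPS_0_1_37 = [
    ("download_file", "morph"),
    ("download_file", "ner"),
    ("download_folder_from_hf", "bert-base-arabertv02_22_May_2021_00h_allglosses_unused01"),
    ("download_folder_from_hf", "bert-base-arabertv02"),
    ("download_file", "one_gram"),
    ("download_file", "five_grams"),
    ("download_file", "four_grams"),
    ("download_file", "three_grams"),
    ("download_file", "two_grams"),
]


def describe_steps(steps):
    """Render each step as the call text that appears in the diff."""
    lines = []
    for func, arg in steps:
        if func == "download_folder_from_hf":
            lines.append(f'{func}("{HF_REPO}", "{arg}")')
        else:
            lines.append(f'{func}(urls["{arg}"])')
    return lines


for line in describe_steps(WSD_STEPS_0_1_37):
    print(line)
```

Requesting `wsd` in 0.1.37 therefore triggers nine download steps instead of one, so a first run is likely to take longer and need more disk space than it did in 0.1.35.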
sinatools/VERSION CHANGED
@@ -1 +1 @@
- 0.1.35
+ 0.1.37