qurankit 0.4.0__tar.gz

@@ -0,0 +1,354 @@
1
+ Metadata-Version: 2.4
2
+ Name: qurankit
3
+ Version: 0.4.0
4
+ Summary: A query-first Python toolkit for Quranic text, morphology, translations, tafsir, and pattern discovery.
5
+ Author-email: Faisal Alshargi <alshargi.de@gmail.com>
6
+ License: MIT
7
+ Project-URL: Homepage, https://sanaa.ai
8
+ Project-URL: Repository, https://github.com/alshargi/qurankit
9
+ Project-URL: Documentation, https://sanaa.ai/qurankit
10
+ Keywords: quran,arabic,nlp,morphology,roots,lemmas,translations,islamic-studies
11
+ Classifier: Development Status :: 3 - Alpha
12
+ Classifier: Intended Audience :: Developers
13
+ Classifier: Intended Audience :: Science/Research
14
+ Classifier: License :: OSI Approved :: MIT License
15
+ Classifier: Programming Language :: Python :: 3
16
+ Classifier: Programming Language :: Python :: 3.9
17
+ Classifier: Programming Language :: Python :: 3.10
18
+ Classifier: Programming Language :: Python :: 3.11
19
+ Classifier: Programming Language :: Python :: 3.12
20
+ Classifier: Topic :: Text Processing :: Linguistic
21
+ Requires-Python: >=3.9
22
+ Description-Content-Type: text/markdown
23
+
24
+ # QuranKit
25
+
26
+ **QuranKit** is a query-first Python toolkit for computational Quran analysis, providing structured access to Quranic text, morphology, roots, lemmas, translations, tafsir, and linguistic pattern discovery.
27
+
28
+ Developed by **Dr. Faisal Alshargi**.
29
+
30
+ ---
31
+
32
+ ## Features
33
+
34
+ * Quranic verse retrieval
35
+ * Root-based search
36
+ * Lemma-based search
37
+ * Word-level linguistic analysis
38
+ * POS tagging access
39
+ * Morphological feature access
40
+ * Multi-language translations
41
+ * Tafsir retrieval
42
+ * Root-order pattern discovery
43
+ * Frequency analysis
44
+ * Research-oriented querying
45
+ * Memory-only dataset loading
46
+
47
+ ---
48
+
49
+ ## Design Philosophy
50
+
51
+ QuranKit is designed around **feature-level access rather than dataset export**.
52
+
53
+ Users interact with the Quran through focused search and analysis functions instead of downloading or manipulating the complete dataset.
54
+
55
+ ### Core Principles
56
+
57
+ * Returns only the requested information
58
+ * Compact results by default
59
+ * No API key required
60
+ * No remote inference
61
+ * Query-first architecture
62
+ * Memory-only dataset loading
63
+ * Translation and tafsir access only when explicitly requested
64
+
65
+ QuranKit intentionally does **not** provide:
66
+
67
+ ```python
68
+ q.to_dataframe()
69
+ q.export_json()
70
+ q.export_dataset()
71
+ q.dump_all()
72
+ ```
73
+
74
+ This keeps the package focused on analysis, exploration, and research rather than dataset replication.
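The idea of feature-level access without export can be illustrated with a small stand-in class. This is only a sketch under stated assumptions: `QueryFirstStore` and its toy record are hypothetical and do not reflect how QuranKit stores or serves data internally. It shows a private in-memory table that answers one query at a time and returns compact copies.

```python
from typing import Dict, Optional, Tuple

Ref = Tuple[int, int]  # (sura, ayah)


class QueryFirstStore:
    """Illustrative stand-in; not QuranKit's actual implementation."""

    def __init__(self, verses: Dict[Ref, Dict[str, object]]) -> None:
        # Keep the data private: callers only receive per-query results.
        self._verses = dict(verses)

    def get_ayah(self, sura: int, ayah: int) -> Optional[Dict[str, object]]:
        record = self._verses.get((sura, ayah))
        if record is None:
            return None
        # Return a compact copy so callers cannot mutate the internal record.
        return {key: record[key] for key in ("sura", "ayah", "ayatext_nt")}


# Toy record; the "tafsir" value is a placeholder, not real commentary.
store = QueryFirstStore({
    (1, 1): {
        "sura": 1,
        "ayah": 1,
        "ayatext_nt": "بسم الله الرحمن الرحيم",
        "tafsir": "placeholder commentary, omitted from compact results",
    }
})

print(store.get_ayah(1, 1))
```

The class deliberately has no bulk accessor, so the only way to reach the data is through a focused query.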
75
+
76
+ ---
77
+
78
+ ## Installation
79
+
80
+ ```bash
81
+ pip install qurankit
82
+ ```
83
+
84
+ For development:
85
+
86
+ ```bash
87
+ pip install -e .
88
+ ```
89
+
90
+ ---
91
+
92
+ ## Quick Start
93
+
94
+ ```python
95
+ from qurankit import QuranKit
96
+
97
+ q = QuranKit()
98
+
99
+ print(q.stats())
100
+
101
+ ayah = q.get_ayah(1, 1)
102
+ print(ayah)
103
+
104
+ results = q.search_root("رحم", limit=3)
105
+
106
+ translation = q.get_translation(2, 255, lang="en")
107
+ print(translation["translation"])
108
+ ```
109
+
110
+ ---
111
+
112
+ ## Statistics
113
+
114
+ ```python
115
+ q.stats()
116
+ ```
117
+
118
+ Example output:
119
+
120
+ ```python
121
+ {
122
+ "surahs": 114,
123
+ "ayahs": 6236,
124
+ "languages": 20,
125
+ "roots": 1600,
126
+ "lemmas": 14000
127
+ }
128
+ ```
129
+
130
+ ---
131
+
132
+ ## Verse Retrieval
133
+
134
+ ```python
135
+ q.get_ayah(1, 1)
136
+ ```
137
+
138
+ Example output:
139
+
140
+ ```python
141
+ {
142
+ "sura": 1,
143
+ "ayah": 1,
144
+ "sura_ar": "الفاتحة",
145
+ "sura_en": "Al-Fātiḥah",
146
+ "ayatext_nt": "بسم الله الرحمن الرحيم"
147
+ }
148
+ ```
149
+
150
+ ---
151
+
152
+ ## Linguistic Features
153
+
154
+ Retrieve individual linguistic layers without loading unnecessary information.
155
+
156
+ ```python
157
+ q.get_words(2, 255)
158
+
159
+ q.get_roots(2, 255)
160
+
161
+ q.get_lemmas(2, 255)
162
+
163
+ q.get_pos(2, 255)
164
+
165
+ q.get_morph_tags(2, 255)
166
+
167
+ q.get_analysis(2, 255)
168
+
169
+ q.word_analysis(2, 255)
170
+ ```
171
+
172
+ ---
173
+
174
+ ## Search
175
+
176
+ ### Search Text
177
+
178
+ ```python
179
+ q.search_text("الله", limit=5)
180
+ ```
181
+
182
+ ### Search Word
183
+
184
+ ```python
185
+ q.search_word("جنة", limit=5)
186
+ ```
187
+
188
+ ### Search Root
189
+
190
+ ```python
191
+ q.search_root("رحم", limit=5)
192
+ ```
193
+
194
+ ### Search Lemma
195
+
196
+ ```python
197
+ q.search_lemma("ٱللَّه", limit=5)
198
+ ```
199
+
200
+ ### Search POS Tags
201
+
202
+ ```python
203
+ q.search_pos("V", limit=5)
204
+ ```
205
+
206
+ Search results remain compact and exclude long translations and tafsir content.
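A compact, limited search of this kind might look like the sketch below. It is not QuranKit's code: the record layout, the placeholder roots (`r1`, `r2`, ...) and the free function `search_root` are assumptions made for illustration. The generator projects each match down to a few short fields, and `itertools.islice` stops after `limit` matches.

```python
from itertools import islice
from typing import Dict, Iterable, List


def search_root(
    records: Iterable[Dict[str, object]], root: str, limit: int = 5
) -> List[Dict[str, object]]:
    """Return at most `limit` compact matches for `root` (illustrative only)."""
    compact = (
        # Keep only short identifying fields; long text such as tafsir is dropped.
        {"sura": r["sura"], "ayah": r["ayah"], "ayatext_nt": r["ayatext_nt"]}
        for r in records
        if root in r["roots"]
    )
    return list(islice(compact, limit))


# Placeholder records; roots and texts are stand-ins, not real data.
TOY_RECORDS = [
    {"sura": 1, "ayah": 1, "ayatext_nt": "text-1", "roots": ["r1", "r2"], "tafsir": "long text"},
    {"sura": 1, "ayah": 2, "ayatext_nt": "text-2", "roots": ["r3"], "tafsir": "long text"},
    {"sura": 1, "ayah": 3, "ayatext_nt": "text-3", "roots": ["r2"], "tafsir": "long text"},
    {"sura": 2, "ayah": 1, "ayatext_nt": "text-4", "roots": ["r2", "r4"], "tafsir": "long text"},
]

results = search_root(TOY_RECORDS, "r2", limit=2)
print(results)
```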
207
+
208
+ ---
209
+
210
+ ## Translations
211
+
212
+ List available languages:
213
+
214
+ ```python
215
+ q.available_languages()
216
+ ```
217
+
218
+ Retrieve a specific translation:
219
+
220
+ ```python
221
+ q.get_translation(2, 255, lang="en")
222
+ ```
223
+
224
+ Retrieve multiple translations:
225
+
226
+ ```python
227
+ q.get_translations(
228
+ 1,
229
+ 1,
230
+ langs=["en", "tr", "de"]
231
+ )
232
+ ```
233
+
234
+ Search inside translations:
235
+
236
+ ```python
237
+ q.search_translation(
238
+ "mercy",
239
+ lang="en",
240
+ limit=5
241
+ )
242
+ ```
243
+
244
+ ---
245
+
246
+ ## Tafsir
247
+
248
+ Retrieve tafsir only when requested.
249
+
250
+ ### Al-Muyassar
251
+
252
+ ```python
253
+ q.get_tafsir(
254
+ 1,
255
+ 1,
256
+ source="muyassar"
257
+ )
258
+ ```
259
+
260
+ ### Al-Jalalayn
261
+
262
+ ```python
263
+ q.get_tafsir(
264
+ 1,
265
+ 1,
266
+ source="jalalayn"
267
+ )
268
+ ```
269
+
270
+ ---
271
+
272
+ ## Pattern Discovery
273
+
274
+ Repeated root-order patterns:
275
+
276
+ ```python
277
+ q.find_repeated_root_orders(
278
+ min_occurrences=2,
279
+ limit=10
280
+ )
281
+ ```
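The README does not define "root-order pattern". One possible reading is a verse's sequence of roots, taken in order, that occurs in more than one verse. The sketch below implements that reading on placeholder data. The function body, the input shape and the output keys are assumptions, and the real method may work differently.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Ref = Tuple[int, int]  # (sura, ayah)


def find_repeated_root_orders(
    verse_roots: Dict[Ref, List[str]], min_occurrences: int = 2, limit: int = 10
) -> List[Dict[str, object]]:
    """Group verses by their ordered root sequence (illustrative only)."""
    groups: Dict[Tuple[str, ...], List[Ref]] = defaultdict(list)
    for ref, roots in verse_roots.items():
        groups[tuple(roots)].append(ref)

    repeated = [
        {"roots": list(pattern), "count": len(refs), "verses": sorted(refs)}
        for pattern, refs in groups.items()
        if pattern and len(refs) >= min_occurrences
    ]
    # Most frequent patterns first; ties broken by earliest verse reference.
    repeated.sort(key=lambda item: (-item["count"], item["verses"]))
    return repeated[:limit]


# Placeholder roots; order matters, so ["r2", "r1"] is a different pattern.
TOY_VERSE_ROOTS = {
    (1, 1): ["r1", "r2"],
    (1, 2): ["r3"],
    (2, 1): ["r1", "r2"],
    (2, 2): ["r2", "r1"],
}

patterns = find_repeated_root_orders(TOY_VERSE_ROOTS, min_occurrences=2, limit=10)
print(patterns)
```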
282
+
283
+ Root frequency statistics:
284
+
285
+ ```python
286
+ q.root_frequency(limit=20)
287
+ ```
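Root frequency is a plain counting problem. As a minimal sketch, assuming each verse maps to a list of root strings (a hypothetical layout, with placeholder roots), `collections.Counter` gives the top `limit` roots directly. The output keys `root` and `count` are chosen for illustration and may not match what QuranKit returns.

```python
from collections import Counter
from typing import Dict, List, Tuple


def root_frequency(
    verse_roots: Dict[Tuple[int, int], List[str]], limit: int = 20
) -> List[Dict[str, object]]:
    """Count root occurrences across all verses (illustrative only)."""
    counts = Counter(root for roots in verse_roots.values() for root in roots)
    return [{"root": root, "count": n} for root, n in counts.most_common(limit)]


# Placeholder roots, not real data.
TOY_ROOTS = {
    (1, 1): ["r1", "r2", "r1"],
    (1, 2): ["r1", "r3"],
}

top = root_frequency(TOY_ROOTS, limit=2)
print(top)
```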
288
+
289
+ ---
290
+
291
+ ## Random Verse
292
+
293
+ ```python
294
+ q.random_ayah()
295
+ ```
296
+
297
+ ---
298
+
299
+ ## Dataset Handling
300
+
301
+ QuranKit ships with a minimal bundled sample dataset for testing and fallback purposes.
302
+
303
+ The complete Quran dataset is loaded from the internal default dataset source into memory only.
304
+
305
+ QuranKit does not intentionally persist the full dataset to local storage and does not provide dataset export utilities.
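The loading behaviour described here can be pictured as "try the full source, keep the result in memory, fall back to the bundled sample". The sketch below is hypothetical throughout: `load_dataset`, the `fetch_bytes` callable and the JSON payload format are assumptions, since the README does not say where the internal source lives or how it is encoded.

```python
import json
from typing import Callable, Dict, List, Tuple

# Stand-in for the minimal bundled sample mentioned above.
BUNDLED_SAMPLE: List[Dict[str, int]] = [{"sura": 1, "ayah": 1}]


def load_dataset(fetch_bytes: Callable[[], bytes]) -> Tuple[List[Dict[str, int]], str]:
    """Decode the full dataset in memory, or fall back to the sample.

    `fetch_bytes` is a hypothetical callable returning raw JSON bytes.
    Nothing is written to disk at any point.
    """
    try:
        return json.loads(fetch_bytes()), "full"
    except (OSError, ValueError):
        # OSError covers I/O failures; ValueError covers malformed JSON.
        return list(BUNDLED_SAMPLE), "sample"


data, origin = load_dataset(lambda: b'[{"sura": 2, "ayah": 255}]')
print(origin, data)
```

Returning the origin alongside the data lets a caller tell whether results come from the full dataset or only from the sample.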
306
+
307
+ ---
308
+
309
+ ## Performance
310
+
311
+ ```python
312
+ import time
313
+ from qurankit import QuranKit
314
+ start = time.perf_counter()
315
+
316
+ q = QuranKit()
317
+
318
+ print(
319
+     "Load time:",
320
+     round(time.perf_counter() - start, 2),
321
+     "seconds"
322
+ )
323
+ ```
324
+
325
+ ---
326
+
327
+ ## Example Research Workflow
328
+
329
+ ```python
330
+ from qurankit import QuranKit
331
+
332
+ q = QuranKit()
333
+
334
+ verses = q.search_root("رحم", limit=10)
335
+
336
+ for verse in verses:
337
+     print(
338
+         verse["sura"],
339
+         verse["ayah"],
340
+         verse["ayatext_nt"]
341
+     )
342
+ ```
343
+
344
+ ---
345
+
346
+ ## Disclaimer
347
+
348
+ QuranKit is intended for computational analysis, research, education, and software development.
349
+
350
+ It is not a source of religious rulings, legal opinions, or authoritative tafsir.
351
+
354
+
@@ -0,0 +1,331 @@
1
+ # QuranKit
2
+
3
+ **QuranKit** is a query-first Python toolkit for computational Quran analysis, providing structured access to Quranic text, morphology, roots, lemmas, translations, tafsir, and linguistic pattern discovery.
4
+
5
+ Developed by **Dr. Faisal Alshargi**.
6
+
7
+ ---
8
+
9
+ ## Features
10
+
11
+ * Quranic verse retrieval
12
+ * Root-based search
13
+ * Lemma-based search
14
+ * Word-level linguistic analysis
15
+ * POS tagging access
16
+ * Morphological feature access
17
+ * Multi-language translations
18
+ * Tafsir retrieval
19
+ * Root-order pattern discovery
20
+ * Frequency analysis
21
+ * Research-oriented querying
22
+ * Memory-only dataset loading
23
+
24
+ ---
25
+
26
+ ## Design Philosophy
27
+
28
+ QuranKit is designed around **feature-level access rather than dataset export**.
29
+
30
+ Users interact with the Quran through focused search and analysis functions instead of downloading or manipulating the complete dataset.
31
+
32
+ ### Core Principles
33
+
34
+ * Returns only the requested information
35
+ * Compact results by default
36
+ * No API key required
37
+ * No remote inference
38
+ * Query-first architecture
39
+ * Memory-only dataset loading
40
+ * Translation and tafsir access only when explicitly requested
41
+
42
+ QuranKit intentionally does **not** provide:
43
+
44
+ ```python
45
+ q.to_dataframe()
46
+ q.export_json()
47
+ q.export_dataset()
48
+ q.dump_all()
49
+ ```
50
+
51
+ This keeps the package focused on analysis, exploration, and research rather than dataset replication.
52
+
53
+ ---
54
+
55
+ ## Installation
56
+
57
+ ```bash
58
+ pip install qurankit
59
+ ```
60
+
61
+ For development:
62
+
63
+ ```bash
64
+ pip install -e .
65
+ ```
66
+
67
+ ---
68
+
69
+ ## Quick Start
70
+
71
+ ```python
72
+ from qurankit import QuranKit
73
+
74
+ q = QuranKit()
75
+
76
+ print(q.stats())
77
+
78
+ ayah = q.get_ayah(1, 1)
79
+ print(ayah)
80
+
81
+ results = q.search_root("رحم", limit=3)
82
+
83
+ translation = q.get_translation(2, 255, lang="en")
84
+ print(translation["translation"])
85
+ ```
86
+
87
+ ---
88
+
89
+ ## Statistics
90
+
91
+ ```python
92
+ q.stats()
93
+ ```
94
+
95
+ Example output:
96
+
97
+ ```python
98
+ {
99
+ "surahs": 114,
100
+ "ayahs": 6236,
101
+ "languages": 20,
102
+ "roots": 1600,
103
+ "lemmas": 14000
104
+ }
105
+ ```
106
+
107
+ ---
108
+
109
+ ## Verse Retrieval
110
+
111
+ ```python
112
+ q.get_ayah(1, 1)
113
+ ```
114
+
115
+ Example output:
116
+
117
+ ```python
118
+ {
119
+ "sura": 1,
120
+ "ayah": 1,
121
+ "sura_ar": "الفاتحة",
122
+ "sura_en": "Al-Fātiḥah",
123
+ "ayatext_nt": "بسم الله الرحمن الرحيم"
124
+ }
125
+ ```
126
+
127
+ ---
128
+
129
+ ## Linguistic Features
130
+
131
+ Retrieve individual linguistic layers without loading unnecessary information.
132
+
133
+ ```python
134
+ q.get_words(2, 255)
135
+
136
+ q.get_roots(2, 255)
137
+
138
+ q.get_lemmas(2, 255)
139
+
140
+ q.get_pos(2, 255)
141
+
142
+ q.get_morph_tags(2, 255)
143
+
144
+ q.get_analysis(2, 255)
145
+
146
+ q.word_analysis(2, 255)
147
+ ```
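Each getter above can be thought of as a projection of one field from per-word records. The sketch below shows that idea with placeholder tokens. The table layout, the `get_layer` helper and every tag value except `V` (used in the search example in this README) are assumptions, not QuranKit internals.

```python
from typing import Dict, List, Tuple

LAYERS = {"word", "root", "lemma", "pos"}

# Placeholder per-word records keyed by (sura, ayah).
WORD_TABLE: Dict[Tuple[int, int], List[Dict[str, str]]] = {
    (1, 1): [
        {"word": "w1", "root": "r1", "lemma": "l1", "pos": "N"},
        {"word": "w2", "root": "r2", "lemma": "l2", "pos": "V"},
    ]
}


def get_layer(sura: int, ayah: int, layer: str) -> List[str]:
    """Return one linguistic layer for a verse, one value per word."""
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer!r}")
    return [token[layer] for token in WORD_TABLE.get((sura, ayah), [])]


print(get_layer(1, 1, "root"))
print(get_layer(1, 1, "pos"))
```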
148
+
149
+ ---
150
+
151
+ ## Search
152
+
153
+ ### Search Text
154
+
155
+ ```python
156
+ q.search_text("الله", limit=5)
157
+ ```
158
+
159
+ ### Search Word
160
+
161
+ ```python
162
+ q.search_word("جنة", limit=5)
163
+ ```
164
+
165
+ ### Search Root
166
+
167
+ ```python
168
+ q.search_root("رحم", limit=5)
169
+ ```
170
+
171
+ ### Search Lemma
172
+
173
+ ```python
174
+ q.search_lemma("ٱللَّه", limit=5)
175
+ ```
176
+
177
+ ### Search POS Tags
178
+
179
+ ```python
180
+ q.search_pos("V", limit=5)
181
+ ```
182
+
183
+ Search results remain compact and exclude long translations and tafsir content.
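The examples above pass an undiacritized string to `search_text` and a fully diacritized one to `search_lemma`. The README does not say whether QuranKit normalizes input, so callers may want to normalize queries themselves. The helper below is a sketch for that: `strip_tashkeel` is a hypothetical name, and it removes Unicode nonspacing marks and maps alef wasla to plain alef.

```python
import unicodedata

ALEF_WASLA = "\u0671"  # ٱ
ALEF = "\u0627"        # ا


def strip_tashkeel(text: str) -> str:
    """Drop nonspacing marks (category Mn) and normalize alef wasla."""
    without_marks = "".join(
        ch for ch in text if unicodedata.category(ch) != "Mn"
    )
    return without_marks.replace(ALEF_WASLA, ALEF)


# The diacritized lemma from the example above, written with escapes.
lemma = "\u0671\u0644\u0644\u0651\u064e\u0647"
plain = strip_tashkeel(lemma)
print(plain)
```

The text is filtered without prior decomposition on purpose: decomposing first would split letters such as alef with hamza into a base letter plus a mark, and the filter would then strip the hamza.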
184
+
185
+ ---
186
+
187
+ ## Translations
188
+
189
+ List available languages:
190
+
191
+ ```python
192
+ q.available_languages()
193
+ ```
194
+
195
+ Retrieve a specific translation:
196
+
197
+ ```python
198
+ q.get_translation(2, 255, lang="en")
199
+ ```
200
+
201
+ Retrieve multiple translations:
202
+
203
+ ```python
204
+ q.get_translations(
205
+ 1,
206
+ 1,
207
+ langs=["en", "tr", "de"]
208
+ )
209
+ ```
210
+
211
+ Search inside translations:
212
+
213
+ ```python
214
+ q.search_translation(
215
+ "mercy",
216
+ lang="en",
217
+ limit=5
218
+ )
219
+ ```
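Searching inside one language's translations can be sketched as a case-insensitive substring scan with an early stop. Everything below is illustrative: the nested dictionary layout and the free function `search_translation` are assumptions, and the strings are placeholders, not real translation text.

```python
from typing import Dict, List, Tuple

Ref = Tuple[int, int]  # (sura, ayah)


def search_translation(
    translations: Dict[Ref, Dict[str, str]],
    query: str,
    lang: str = "en",
    limit: int = 5,
) -> List[Dict[str, object]]:
    """Return up to `limit` verses whose `lang` text contains `query`."""
    needle = query.casefold()
    hits: List[Dict[str, object]] = []
    for (sura, ayah), by_lang in sorted(translations.items()):
        if len(hits) >= limit:
            break
        text = by_lang.get(lang)
        if text is not None and needle in text.casefold():
            hits.append({"sura": sura, "ayah": ayah, "lang": lang, "translation": text})
    return hits


# Placeholder strings only.
TOY_TRANSLATIONS = {
    (1, 1): {"en": "Placeholder line mentioning Mercy.", "de": "Platzhalterzeile."},
    (1, 2): {"en": "Placeholder line without the keyword."},
    (1, 3): {"en": "Another placeholder about mercy.", "tr": "Yer tutucu."},
}

hits = search_translation(TOY_TRANSLATIONS, "mercy", lang="en", limit=5)
print([(h["sura"], h["ayah"]) for h in hits])
```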
220
+
221
+ ---
222
+
223
+ ## Tafsir
224
+
225
+ Retrieve tafsir only when requested.
226
+
227
+ ### Al-Muyassar
228
+
229
+ ```python
230
+ q.get_tafsir(
231
+ 1,
232
+ 1,
233
+ source="muyassar"
234
+ )
235
+ ```
236
+
237
+ ### Al-Jalalayn
238
+
239
+ ```python
240
+ q.get_tafsir(
241
+ 1,
242
+ 1,
243
+ source="jalalayn"
244
+ )
245
+ ```
246
+
247
+ ---
248
+
249
+ ## Pattern Discovery
250
+
251
+ Repeated root-order patterns:
252
+
253
+ ```python
254
+ q.find_repeated_root_orders(
255
+ min_occurrences=2,
256
+ limit=10
257
+ )
258
+ ```
259
+
260
+ Root frequency statistics:
261
+
262
+ ```python
263
+ q.root_frequency(limit=20)
264
+ ```
265
+
266
+ ---
267
+
268
+ ## Random Verse
269
+
270
+ ```python
271
+ q.random_ayah()
272
+ ```
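For reproducible experiments it can help to control the randomness. The README does not show a seed parameter on `q.random_ayah()`, so the sketch below uses a separate, hypothetical helper that picks a verse reference with a caller-supplied `random.Random` instance.

```python
import random
from typing import Optional, Sequence, Tuple

Ref = Tuple[int, int]  # (sura, ayah)


def pick_random_ref(refs: Sequence[Ref], rng: Optional[random.Random] = None) -> Ref:
    """Pick one verse reference; pass a seeded `rng` for repeatable runs."""
    rng = rng if rng is not None else random.Random()
    return rng.choice(refs)


REFS = [(1, 1), (1, 2), (2, 255)]

# Two generators with the same seed yield the same choice.
first = pick_random_ref(REFS, random.Random(42))
second = pick_random_ref(REFS, random.Random(42))
print(first == second)
```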
273
+
274
+ ---
275
+
276
+ ## Dataset Handling
277
+
278
+ QuranKit ships with a minimal bundled sample dataset for testing and fallback purposes.
279
+
280
+ The complete Quran dataset is loaded from the internal default dataset source into memory only.
281
+
282
+ QuranKit does not intentionally persist the full dataset to local storage and does not provide dataset export utilities.
283
+
284
+ ---
285
+
286
+ ## Performance
287
+
288
+ ```python
289
+ import time
290
+ from qurankit import QuranKit
291
+ start = time.perf_counter()
292
+
293
+ q = QuranKit()
294
+
295
+ print(
296
+     "Load time:",
297
+     round(time.perf_counter() - start, 2),
298
+     "seconds"
299
+ )
300
+ ```
301
+
302
+ ---
303
+
304
+ ## Example Research Workflow
305
+
306
+ ```python
307
+ from qurankit import QuranKit
308
+
309
+ q = QuranKit()
310
+
311
+ verses = q.search_root("رحم", limit=10)
312
+
313
+ for verse in verses:
314
+     print(
315
+         verse["sura"],
316
+         verse["ayah"],
317
+         verse["ayatext_nt"]
318
+     )
319
+ ```
320
+
321
+ ---
322
+
323
+ ## Disclaimer
324
+
325
+ QuranKit is intended for computational analysis, research, education, and software development.
326
+
327
+ It is not a source of religious rulings, legal opinions, or authoritative tafsir.
328
+
331
+
@@ -0,0 +1,10 @@
1
+ from qurankit import QuranKit
2
+
3
+ q = QuranKit()
4
+
5
+ print(q.stats())
6
+ print(q.get_ayah(1, 1))
7
+ print(q.get_roots(1, 1))
8
+ print(q.search_root("رحم", limit=3))
9
+ print(q.get_translation(2, 255, lang="en"))
10
+ print(q.word_analysis(1, 1))