shamela 1.3.1 → 1.3.3
- package/README.md +362 -406
- package/dist/index.d.ts +34 -27
- package/dist/index.js +9 -7
- package/dist/index.js.map +1 -1
- package/package.json +11 -11
package/README.md
CHANGED
|
@@ -15,28 +15,50 @@
|
|
|
15
15
|
|
|
16
16
|
A universal TypeScript library for accessing and downloading Maktabah Shamela v4 APIs. The package runs in both Node.js and modern browsers, providing ergonomic helpers to interact with the Shamela API, download master and book databases, and retrieve book data programmatically.
|
|
17
17
|
|
|
18
|
+
## Features
|
|
19
|
+
|
|
20
|
+
- 🚀 **Full data lifecycle** – fetch metadata, download master and book databases, and query the results entirely in-memory.
|
|
21
|
+
- 🔐 **Runtime configuration** – configure API credentials, WASM paths, and custom fetch/logging implementations at runtime.
|
|
22
|
+
- 🧠 **Content tooling** – parse, sanitise, and post-process Arabic book content with utilities tailored for Shamela formatting.
|
|
23
|
+
- 🌐 **Environment aware** – automatically selects optimal sql.js WASM bundles for Node.js, browsers, and bundled runtimes.
|
|
24
|
+
- 🧪 **Well-tested** – comprehensive unit and end-to-end coverage to ensure reliable integrations.
|
|
25
|
+
|
|
18
26
|
## Table of Contents
|
|
19
27
|
|
|
28
|
+
- [Features](#features)
|
|
20
29
|
- [Installation](#installation)
|
|
21
|
-
- [
|
|
22
|
-
- [
|
|
23
|
-
|
|
24
|
-
|
|
25
|
-
|
|
26
|
-
|
|
27
|
-
|
|
28
|
-
|
|
29
|
-
|
|
30
|
-
|
|
31
|
-
|
|
32
|
-
|
|
30
|
+
- [Quick Start](#quick-start)
|
|
31
|
+
- [Standard Node.js](#standard-nodejs)
|
|
32
|
+
- [Next.js / Bundled Environments](#nextjs--bundled-environments)
|
|
33
|
+
- [Browser](#browser)
|
|
34
|
+
- [API Reference](#api-reference)
|
|
35
|
+
- [Configuration](#configuration)
|
|
36
|
+
- [configure](#configure)
|
|
37
|
+
- [resetConfig](#resetconfig)
|
|
38
|
+
- [getConfig](#getconfig)
|
|
39
|
+
- [getConfigValue](#getconfigvalue)
|
|
40
|
+
- [requireConfigValue](#requireconfigvalue)
|
|
41
|
+
- [Metadata & Downloads](#metadata--downloads)
|
|
42
|
+
- [getMasterMetadata](#getmastermetadata)
|
|
43
|
+
- [downloadMasterDatabase](#downloadmasterdatabase)
|
|
44
|
+
- [getBookMetadata](#getbookmetadata)
|
|
45
|
+
- [downloadBook](#downloadbook)
|
|
46
|
+
- [getCoverUrl](#getcoverurl)
|
|
47
|
+
- [Data Access](#data-access)
|
|
48
|
+
- [getBook](#getbook)
|
|
49
|
+
- [getMaster](#getmaster)
|
|
50
|
+
- [Content Utilities](#content-utilities)
|
|
51
|
+
- [parseContentRobust](#parsecontentrobust)
|
|
52
|
+
- [sanitizePageContent](#sanitizepagecontent)
|
|
53
|
+
- [splitPageBodyFromFooter](#splitpagebodyfromfooter)
|
|
54
|
+
- [removeArabicNumericPageMarkers](#removearabicnumericpagemarkers)
|
|
55
|
+
- [removeTagsExceptSpan](#removetagsexceptspan)
|
|
56
|
+
- [Supporting Utilities](#supporting-utilities)
|
|
57
|
+
- [buildUrl](#buildurl)
|
|
58
|
+
- [httpsGet](#httpsget)
|
|
33
59
|
- [Examples](#examples)
|
|
34
|
-
- [Downloading the Master Database](#downloading-the-master-database)
|
|
35
|
-
- [Downloading a Book](#downloading-a-book)
|
|
36
|
-
- [Retrieving Book Data](#retrieving-book-data)
|
|
37
|
-
- [Retrieving Master Data in memory](#retrieving-master-data-in-memory)
|
|
38
60
|
- [Data Structures](#data-structures)
|
|
39
|
-
- [Next.js
|
|
61
|
+
- [Next.js Demo](#nextjs-demo)
|
|
40
62
|
- [Troubleshooting](#troubleshooting)
|
|
41
63
|
- [Testing](#testing)
|
|
42
64
|
- [License](#license)
|
|
@@ -44,103 +66,53 @@ A universal TypeScript library for accessing and downloading Maktabah Shamela v4
|
|
|
44
66
|
## Installation
|
|
45
67
|
|
|
46
68
|
```bash
|
|
47
|
-
|
|
69
|
+
npm install shamela
|
|
48
70
|
```
|
|
49
71
|
|
|
50
|
-
or
|
|
51
|
-
|
|
52
72
|
```bash
|
|
53
|
-
|
|
73
|
+
bun add shamela
|
|
54
74
|
```
|
|
55
75
|
|
|
56
|
-
or
|
|
57
|
-
|
|
58
76
|
```bash
|
|
59
77
|
yarn add shamela
|
|
60
78
|
```
|
|
61
79
|
|
|
62
|
-
or
|
|
63
|
-
|
|
64
80
|
```bash
|
|
65
81
|
pnpm install shamela
|
|
66
82
|
```
|
|
67
83
|
|
|
68
|
-
##
|
|
69
|
-
|
|
70
|
-
Before using the library, you need to set up some environment variables for API keys and endpoints:
|
|
84
|
+
## Quick Start
|
|
71
85
|
|
|
72
|
-
|
|
73
|
-
- `SHAMELA_API_MASTER_PATCH_ENDPOINT`: The endpoint URL for the master database patches.
|
|
74
|
-
- `SHAMELA_API_BOOKS_ENDPOINT`: The base endpoint URL for book-related API calls.
|
|
75
|
-
- `SHAMELA_SQLJS_WASM_URL` (optional): Override the default CDN URL used to load the `sql.js` WebAssembly binary when running in the browser.
|
|
86
|
+
### Standard Node.js
|
|
76
87
|
|
|
77
|
-
|
|
78
|
-
|
|
79
|
-
```dotenv
|
|
80
|
-
SHAMELA_API_KEY=your_api_key_here
|
|
81
|
-
SHAMELA_API_MASTER_PATCH_ENDPOINT=https://shamela.ws/api/master_patch
|
|
82
|
-
SHAMELA_API_BOOKS_ENDPOINT=https://shamela.ws/api/books
|
|
83
|
-
# Optional when you host sql-wasm.wasm yourself
|
|
84
|
-
# SHAMELA_SQLJS_WASM_URL=https://example.com/sql-wasm.wasm
|
|
85
|
-
```
|
|
88
|
+
For simple Node.js scripts (non-bundled environments), the library auto-detects the WASM file:
|
|
86
89
|
|
|
87
|
-
|
|
88
|
-
|
|
89
|
-
When you cannot rely on environment variables—such as when running inside a browser, an edge worker, or a serverless function—use the `configure` helper to provide credentials at runtime:
|
|
90
|
-
|
|
91
|
-
```ts
|
|
92
|
-
import { configure } from 'shamela';
|
|
90
|
+
```typescript
|
|
91
|
+
import { configure, getBook } from 'shamela';
|
|
93
92
|
|
|
93
|
+
// Configure API credentials
|
|
94
94
|
configure({
|
|
95
|
-
|
|
96
|
-
|
|
97
|
-
|
|
98
|
-
|
|
99
|
-
sqlJsWasmUrl: '/assets/sql-wasm.wasm',
|
|
100
|
-
// Optional: integrate with your application's logging system
|
|
101
|
-
logger: console,
|
|
102
|
-
// Optional: provide a custom fetch implementation (for tests or SSR)
|
|
103
|
-
fetchImplementation: fetch,
|
|
95
|
+
apiKey: process.env.SHAMELA_API_KEY,
|
|
96
|
+
booksEndpoint: process.env.SHAMELA_BOOKS_ENDPOINT,
|
|
97
|
+
masterPatchEndpoint: process.env.SHAMELA_MASTER_ENDPOINT,
|
|
98
|
+
// sqlJsWasmUrl is auto-detected in standard Node.js
|
|
104
99
|
});
|
|
105
|
-
```
|
|
106
|
-
|
|
107
|
-
You can call `configure` multiple times—values are merged, so later calls update only the keys you pass in.
|
|
108
|
-
|
|
109
|
-
The optional `logger` must expose `debug`, `info`, `warn`, and `error` methods. When omitted, the library stays silent by default.
|
|
110
100
|
|
|
111
|
-
|
|
112
|
-
|
|
113
|
-
|
|
114
|
-
|
|
115
|
-
First, import the library functions into your project:
|
|
116
|
-
|
|
117
|
-
```javascript
|
|
118
|
-
import {
|
|
119
|
-
getMasterMetadata,
|
|
120
|
-
downloadMasterDatabase,
|
|
121
|
-
getBookMetadata,
|
|
122
|
-
downloadBook,
|
|
123
|
-
getBook,
|
|
124
|
-
getMaster,
|
|
125
|
-
getCoverUrl,
|
|
126
|
-
} from 'shamela';
|
|
101
|
+
// Use the library
|
|
102
|
+
const book = await getBook(26592);
|
|
103
|
+
console.log(`Downloaded book with ${book.pages.length} pages`);
|
|
127
104
|
```
|
|
128
105
|
|
|
129
106
|
### Next.js / Bundled Environments
|
|
130
107
|
|
|
131
|
-
|
|
132
|
-
|
|
133
|
-
#### 1. Update your Next.js configuration
|
|
108
|
+
For Next.js, webpack, Turbopack, and other bundlers, you need to explicitly configure the WASM file path.
|
|
134
109
|
|
|
135
|
-
|
|
110
|
+
**1. Update `next.config.ts` or `next.config.js`:**
|
|
136
111
|
|
|
137
112
|
```typescript
|
|
138
113
|
import type { NextConfig } from 'next';
|
|
139
114
|
|
|
140
115
|
const nextConfig: NextConfig = {
|
|
141
|
-
experimental: {
|
|
142
|
-
serverComponentsExternalPackages: ['shamela', 'sql.js'],
|
|
143
|
-
},
|
|
144
116
|
serverExternalPackages: ['shamela', 'sql.js'],
|
|
145
117
|
// ... rest of your config
|
|
146
118
|
};
|
|
@@ -148,30 +120,7 @@ const nextConfig: NextConfig = {
|
|
|
148
120
|
export default nextConfig;
|
|
149
121
|
```
|
|
150
122
|
|
|
151
|
-
|
|
152
|
-
|
|
153
|
-
#### 2. Create a server-only configuration file
|
|
154
|
-
|
|
155
|
-
Create a configuration file that will be imported only in server-side code:
|
|
156
|
-
|
|
157
|
-
**Option A: Using the `createNodeConfig` helper (Recommended)**
|
|
158
|
-
|
|
159
|
-
```typescript
|
|
160
|
-
// lib/shamela-server.ts
|
|
161
|
-
import { configure, createNodeConfig } from 'shamela';
|
|
162
|
-
|
|
163
|
-
// Configure once when this module loads
|
|
164
|
-
configure(createNodeConfig({
|
|
165
|
-
apiKey: process.env.SHAMELA_API_KEY,
|
|
166
|
-
booksEndpoint: process.env.SHAMELA_BOOKS_ENDPOINT,
|
|
167
|
-
masterPatchEndpoint: process.env.SHAMELA_MASTER_ENDPOINT,
|
|
168
|
-
}));
|
|
169
|
-
|
|
170
|
-
// Re-export the functions you need
|
|
171
|
-
export { getBookMetadata, downloadBook, getMaster, getBook } from 'shamela';
|
|
172
|
-
```
|
|
173
|
-
|
|
174
|
-
**Option B: Manual configuration**
|
|
123
|
+
**2. Create a server-only configuration file:**
|
|
175
124
|
|
|
176
125
|
```typescript
|
|
177
126
|
// lib/shamela-server.ts
|
|
@@ -185,10 +134,10 @@ configure({
|
|
|
185
134
|
masterPatchEndpoint: process.env.SHAMELA_MASTER_ENDPOINT,
|
|
186
135
|
});
|
|
187
136
|
|
|
188
|
-
export {
|
|
137
|
+
export { downloadBook, getBook, getBookMetadata, getMaster, downloadMasterDatabase } from 'shamela';
|
|
189
138
|
```
|
|
190
139
|
|
|
191
|
-
|
|
140
|
+
**3. Use in Server Actions:**
|
|
192
141
|
|
|
193
142
|
```typescript
|
|
194
143
|
'use server';
|
|
@@ -197,36 +146,106 @@ import { getBookMetadata, downloadBook } from '@/lib/shamela-server';
|
|
|
197
146
|
|
|
198
147
|
export async function downloadBookAction(bookId: number) {
|
|
199
148
|
const metadata = await getBookMetadata(bookId);
|
|
200
|
-
|
|
149
|
+
return await downloadBook(bookId, {
|
|
201
150
|
bookMetadata: metadata,
|
|
202
151
|
outputFile: { path: `./books/${bookId}.db` }
|
|
203
152
|
});
|
|
204
|
-
return result;
|
|
205
153
|
}
|
|
206
154
|
```
|
|
207
155
|
|
|
208
|
-
**Important:**
|
|
156
|
+
**Important:** Only import `shamela` in server-side code (Server Actions, API Routes, or Server Components). Never import in client components or `layout.tsx`.
|
|
209
157
|
|
|
210
|
-
###
|
|
158
|
+
### Browser
|
|
211
159
|
|
|
212
|
-
|
|
213
|
-
|
|
214
|
-
Fetches metadata for the master database.
|
|
160
|
+
In browsers, the library automatically uses a CDN-hosted WASM file:
|
|
215
161
|
|
|
216
162
|
```typescript
|
|
217
|
-
|
|
163
|
+
import { configure, getBook } from 'shamela';
|
|
164
|
+
|
|
165
|
+
configure({
|
|
166
|
+
apiKey: 'your-api-key',
|
|
167
|
+
booksEndpoint: 'https://SHAMELA_INSTANCE.ws/api/books',
|
|
168
|
+
masterPatchEndpoint: 'https://SHAMELA_INSTANCE.ws/api/master_patch',
|
|
169
|
+
// Automatically uses CDN: https://cdn.jsdelivr.net/npm/sql.js@1.13.0/dist/sql-wasm.wasm
|
|
170
|
+
});
|
|
171
|
+
|
|
172
|
+
const book = await getBook(26592);
|
|
218
173
|
```
|
|
219
174
|
|
|
220
|
-
|
|
175
|
+
## API Reference
|
|
176
|
+
|
|
177
|
+
### Configuration
|
|
221
178
|
|
|
222
|
-
|
|
179
|
+
#### configure
|
|
180
|
+
|
|
181
|
+
Initialises runtime configuration including API credentials, custom fetch implementations, sql.js WASM location, and logger overrides.
|
|
182
|
+
|
|
183
|
+
```typescript
|
|
184
|
+
configure(options: ConfigureOptions): void
|
|
185
|
+
```
|
|
223
186
|
|
|
224
187
|
**Example:**
|
|
225
188
|
|
|
226
|
-
```
|
|
227
|
-
|
|
228
|
-
|
|
229
|
-
|
|
189
|
+
```typescript
|
|
190
|
+
import { configure } from 'shamela';
|
|
191
|
+
|
|
192
|
+
configure({
|
|
193
|
+
apiKey: process.env.SHAMELA_API_KEY!,
|
|
194
|
+
booksEndpoint: process.env.SHAMELA_BOOKS_ENDPOINT!,
|
|
195
|
+
masterPatchEndpoint: process.env.SHAMELA_MASTER_ENDPOINT!,
|
|
196
|
+
});
|
|
197
|
+
```
|
|
198
|
+
|
|
199
|
+
#### resetConfig
|
|
200
|
+
|
|
201
|
+
Clears runtime overrides and restores the default silent logger.
|
|
202
|
+
|
|
203
|
+
```typescript
|
|
204
|
+
resetConfig(): void
|
|
205
|
+
```
|
|
206
|
+
|
|
207
|
+
Use this in tests or long-running processes when you need a clean configuration slate.
|
|
208
|
+
|
|
209
|
+
#### getConfig
|
|
210
|
+
|
|
211
|
+
Returns the merged configuration snapshot combining runtime overrides with environment variables.
|
|
212
|
+
|
|
213
|
+
```typescript
|
|
214
|
+
getConfig(): ShamelaConfig
|
|
215
|
+
```
|
|
216
|
+
|
|
217
|
+
#### getConfigValue
|
|
218
|
+
|
|
219
|
+
Reads a single configuration value without throwing when it is missing.
|
|
220
|
+
|
|
221
|
+
```typescript
|
|
222
|
+
getConfigValue<Key extends ShamelaConfigKey>(key: Key): ShamelaConfig[Key] | undefined
|
|
223
|
+
```
|
|
224
|
+
|
|
225
|
+
#### requireConfigValue
|
|
226
|
+
|
|
227
|
+
Retrieves a configuration entry and throws an error if the value is not present.
|
|
228
|
+
|
|
229
|
+
```typescript
|
|
230
|
+
requireConfigValue(key: Exclude<ShamelaConfigKey, 'fetchImplementation'>): string
|
|
231
|
+
```
|
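The difference between the tolerant and the strict lookup described above can be sketched in a few lines. This is only an illustration over a plain object: `SketchConfig`, `runtimeConfig`, `getValue`, and `requireValue` are hypothetical names invented for this example, and the library's real storage, key set, and error message may differ.

```typescript
// Hypothetical stand-ins for illustration; not the library's implementation.
type SketchConfig = {
    apiKey?: string;
    booksEndpoint?: string;
    masterPatchEndpoint?: string;
};

// Assumed in-memory store holding whatever has been configured so far.
const runtimeConfig: SketchConfig = { apiKey: 'demo-key' };

// Tolerant lookup: returns undefined when the key was never set.
function getValue<K extends keyof SketchConfig>(key: K): SketchConfig[K] | undefined {
    return runtimeConfig[key];
}

// Strict lookup: throws so that a missing credential fails early and loudly.
function requireValue(key: keyof SketchConfig): string {
    const value = runtimeConfig[key];
    if (!value) {
        throw new Error(`Missing configuration value: ${key}`);
    }
    return value;
}

console.log(getValue('apiKey'));
console.log(getValue('booksEndpoint'));
```

The strict variant suits values an operation cannot proceed without, while the tolerant one suits optional settings where the caller has a fallback.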
|
232
|
+
|
|
233
|
+
### Metadata & Downloads
|
|
234
|
+
|
|
235
|
+
#### getMasterMetadata
|
|
236
|
+
|
|
237
|
+
Fetches metadata for the master database, including download URLs for the latest patches.
|
|
238
|
+
|
|
239
|
+
```typescript
|
|
240
|
+
getMasterMetadata(version?: number): Promise<GetMasterMetadataResponsePayload>
|
|
241
|
+
```
|
|
242
|
+
|
|
243
|
+
- `version` (optional): The version number to check for updates (defaults to 0)
|
|
244
|
+
|
|
245
|
+
```typescript
|
|
246
|
+
const metadata = await getMasterMetadata();
|
|
247
|
+
console.log(metadata.url); // Download URL
|
|
248
|
+
console.log(metadata.version); // Version number
|
|
230
249
|
|
|
231
250
|
// Check for updates from a specific version
|
|
232
251
|
const updates = await getMasterMetadata(5);
|
|
@@ -234,449 +253,386 @@ const updates = await getMasterMetadata(5);
|
|
|
234
253
|
|
|
235
254
|
#### downloadMasterDatabase
|
|
236
255
|
|
|
237
|
-
Downloads the master database
|
|
256
|
+
Downloads the master database containing all books, authors, and categories and writes it to disk or a custom writer.
|
|
238
257
|
|
|
239
258
|
```typescript
|
|
240
259
|
downloadMasterDatabase(options: DownloadMasterOptions): Promise<string>
|
|
241
260
|
```
|
|
242
261
|
|
|
243
|
-
- `options
|
|
244
|
-
|
|
245
|
-
- `outputFile`: Object with `path` property specifying the output file path
|
|
246
|
-
|
|
247
|
-
**Returns:** Promise that resolves to the path of the created output file
|
|
248
|
-
|
|
249
|
-
**Example:**
|
|
262
|
+
- `options.masterMetadata` (optional): Pre-fetched metadata to avoid an extra HTTP call
|
|
263
|
+
- `options.outputFile.path`: Output file path (`.db`, `.sqlite`, or `.json`)
|
|
250
264
|
|
|
251
|
-
```
|
|
265
|
+
```typescript
|
|
252
266
|
// Download as SQLite database
|
|
253
267
|
await downloadMasterDatabase({
|
|
254
|
-
|
|
268
|
+
outputFile: { path: './master.db' }
|
|
255
269
|
});
|
|
256
270
|
|
|
257
271
|
// Download as JSON
|
|
258
272
|
await downloadMasterDatabase({
|
|
259
|
-
|
|
260
|
-
});
|
|
261
|
-
|
|
262
|
-
// Use pre-fetched metadata for efficiency
|
|
263
|
-
const masterMetadata = await getMasterMetadata();
|
|
264
|
-
await downloadMasterDatabase({
|
|
265
|
-
masterMetadata,
|
|
266
|
-
outputFile: { path: './master.db' },
|
|
273
|
+
outputFile: { path: './master.json' }
|
|
267
274
|
});
|
|
268
275
|
```
|
|
269
276
|
|
|
270
277
|
#### getBookMetadata
|
|
271
278
|
|
|
272
|
-
Fetches metadata for a specific book.
|
|
279
|
+
Fetches metadata for a specific book, including patch release information.
|
|
273
280
|
|
|
274
281
|
```typescript
|
|
275
282
|
getBookMetadata(id: number, options?: GetBookMetadataOptions): Promise<GetBookMetadataResponsePayload>
|
|
276
283
|
```
|
|
277
284
|
|
|
278
|
-
- `id`:
|
|
279
|
-
- `options` (optional):
|
|
280
|
-
|
|
281
|
-
- `minorVersion`: The minor version to check against
|
|
282
|
-
|
|
283
|
-
**Returns:** Promise that resolves to book metadata including release URLs and versions
|
|
284
|
-
|
|
285
|
-
**Example:**
|
|
285
|
+
- `id`: Book identifier
|
|
286
|
+
- `options.majorVersion` (optional): Major version to check
|
|
287
|
+
- `options.minorVersion` (optional): Minor version to check
|
|
286
288
|
|
|
287
|
-
```
|
|
289
|
+
```typescript
|
|
288
290
|
const metadata = await getBookMetadata(26592);
|
|
289
|
-
console.log(metadata.majorReleaseUrl);
|
|
290
|
-
console.log(metadata.
|
|
291
|
-
|
|
292
|
-
// Check specific versions
|
|
293
|
-
const versionedMetadata = await getBookMetadata(26592, {
|
|
294
|
-
majorVersion: 1,
|
|
295
|
-
minorVersion: 2,
|
|
296
|
-
});
|
|
291
|
+
console.log(metadata.majorReleaseUrl);
|
|
292
|
+
console.log(metadata.minorReleaseUrl);
|
|
297
293
|
```
|
|
298
294
|
|
|
299
295
|
#### downloadBook
|
|
300
296
|
|
|
301
|
-
Downloads and processes a book from
|
|
297
|
+
Downloads and processes a book from Shamela, writing it to JSON or SQLite on disk.
|
|
302
298
|
|
|
303
299
|
```typescript
|
|
304
300
|
downloadBook(id: number, options: DownloadBookOptions): Promise<string>
|
|
305
301
|
```
|
|
306
302
|
|
|
307
|
-
- `id`:
|
|
308
|
-
- `options
|
|
309
|
-
|
|
310
|
-
- `outputFile`: Object with `path` property specifying the output file path
|
|
311
|
-
|
|
312
|
-
**Returns:** Promise that resolves to the path of the created output file
|
|
303
|
+
- `id`: Book identifier
|
|
304
|
+
- `options.bookMetadata` (optional): Pre-fetched metadata to avoid re-fetching
|
|
305
|
+
- `options.outputFile.path`: Output file path (`.db`, `.sqlite`, or `.json`)
|
|
313
306
|
|
|
314
|
-
|
|
315
|
-
|
|
316
|
-
```javascript
|
|
307
|
+
```typescript
|
|
317
308
|
// Download as JSON
|
|
318
309
|
await downloadBook(26592, {
|
|
319
|
-
|
|
310
|
+
outputFile: { path: './book.json' }
|
|
320
311
|
});
|
|
321
312
|
|
|
322
|
-
// Download as SQLite
|
|
313
|
+
// Download as SQLite
|
|
323
314
|
await downloadBook(26592, {
|
|
324
|
-
|
|
315
|
+
outputFile: { path: './book.db' }
|
|
325
316
|
});
|
|
317
|
+
```
|
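Since the output path accepts `.db`, `.sqlite`, or `.json`, the format appears to follow from the file extension. The sketch below shows one way such a mapping could work. `outputFormatFor` and `OutputFormat` are hypothetical names for this example, and treating the extension check as case-insensitive is an assumption; the diff does not show how the library actually decides.

```typescript
// Hypothetical helper for illustration; not part of the shamela API.
type OutputFormat = 'sqlite' | 'json';

function outputFormatFor(path: string): OutputFormat {
    // Assumption: extensions are matched case-insensitively.
    const lower = path.toLowerCase();
    if (lower.endsWith('.db') || lower.endsWith('.sqlite')) {
        return 'sqlite';
    }
    if (lower.endsWith('.json')) {
        return 'json';
    }
    throw new Error(`Unsupported output extension: ${path}`);
}

console.log(outputFormatFor('./book.json'));
console.log(outputFormatFor('./book.db'));
```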
|
326
318
|
|
|
327
|
-
|
|
328
|
-
|
|
329
|
-
|
|
330
|
-
|
|
331
|
-
|
|
332
|
-
|
|
319
|
+
#### getCoverUrl
|
|
320
|
+
|
|
321
|
+
Generates the URL for a book's cover image using the configured Shamela host.
|
|
322
|
+
|
|
323
|
+
```typescript
|
|
324
|
+
getCoverUrl(bookId: number): string
|
|
325
|
+
```
|
|
326
|
+
|
|
327
|
+
```typescript
|
|
328
|
+
const coverUrl = getCoverUrl(26592);
|
|
329
|
+
// Returns: "https://shamela.ws/covers/26592.jpg"
|
|
333
330
|
```
|
|
334
331
|
|
|
332
|
+
### Data Access
|
|
333
|
+
|
|
335
334
|
#### getBook
|
|
336
335
|
|
|
337
|
-
Retrieves complete book data as a JavaScript object
|
|
336
|
+
Retrieves complete book data as a JavaScript object, returning pages and title entries.
|
|
338
337
|
|
|
339
338
|
```typescript
|
|
340
339
|
getBook(id: number): Promise<BookData>
|
|
341
340
|
```
|
|
342
341
|
|
|
343
|
-
|
|
342
|
+
```typescript
|
|
343
|
+
const book = await getBook(26592);
|
|
344
|
+
console.log(book.pages.length);
|
|
345
|
+
console.log(book.titles?.length);
|
|
346
|
+
console.log(book.pages[0].content);
|
|
347
|
+
```
|
|
344
348
|
|
|
345
|
-
|
|
349
|
+
#### getMaster
|
|
346
350
|
|
|
347
|
-
|
|
351
|
+
Retrieves the entire master dataset as a JavaScript object, including version information.
|
|
348
352
|
|
|
349
|
-
```
|
|
350
|
-
|
|
351
|
-
console.log(bookData.pages.length); // Number of pages in the book
|
|
352
|
-
console.log(bookData.titles?.length); // Number of title entries
|
|
353
|
-
console.log(bookData.pages[0].content); // Content of the first page
|
|
353
|
+
```typescript
|
|
354
|
+
getMaster(): Promise<MasterData>
|
|
354
355
|
```
|
|
355
356
|
|
|
356
|
-
|
|
357
|
+
```typescript
|
|
358
|
+
const master = await getMaster();
|
|
359
|
+
console.log(master.version);
|
|
360
|
+
console.log(master.books.length);
|
|
361
|
+
console.log(master.authors.length);
|
|
362
|
+
console.log(master.categories.length);
|
|
363
|
+
```
|
|
364
|
+
|
|
365
|
+
### Content Utilities
|
|
357
366
|
|
|
358
|
-
|
|
359
|
-
|
|
367
|
+
#### parseContentRobust
|
|
368
|
+
|
|
369
|
+
Parses Shamela HTML snippets into structured lines while preserving title hierarchy and Arabic punctuation.
|
|
360
370
|
|
|
361
371
|
```typescript
|
|
362
|
-
|
|
372
|
+
parseContentRobust(content: string): Line[]
|
|
363
373
|
```
|
|
364
374
|
|
|
365
|
-
|
|
375
|
+
```typescript
|
|
376
|
+
const lines = parseContentRobust(rawHtml);
|
|
377
|
+
lines.forEach((line) => console.log(line.id, line.text));
|
|
378
|
+
```
|
|
366
379
|
|
|
367
|
-
|
|
380
|
+
#### sanitizePageContent
|
|
368
381
|
|
|
369
|
-
|
|
370
|
-
|
|
371
|
-
|
|
372
|
-
|
|
373
|
-
console.log(masterData.categories.length); // Number of categories available
|
|
382
|
+
Normalises page content by applying regex-based replacement rules tuned for Shamela sources.
|
|
383
|
+
|
|
384
|
+
```typescript
|
|
385
|
+
sanitizePageContent(text: string, rules?: Record<string, string>): string
|
|
374
386
|
```
|
|
375
387
|
|
|
376
|
-
####
|
|
388
|
+
#### splitPageBodyFromFooter
|
|
377
389
|
|
|
378
|
-
|
|
390
|
+
Separates page body content from trailing footnotes using the default Shamela marker.
|
|
379
391
|
|
|
380
392
|
```typescript
|
|
381
|
-
|
|
393
|
+
splitPageBodyFromFooter(content: string, marker?: string): readonly [string, string]
|
|
382
394
|
```
|
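To make the body/footer split concrete, here is a minimal sketch of splitting text at a marker. `splitAtMarker` is a hypothetical stand-in, the marker is passed explicitly because the library's default marker is not shown in this diff, and splitting at the first occurrence is an assumption.

```typescript
// Hypothetical stand-in for illustration; not the library's implementation.
function splitAtMarker(content: string, marker: string): readonly [string, string] {
    const index = content.indexOf(marker);
    if (index === -1) {
        // No marker found: the whole page is body and the footer is empty.
        return [content, ''] as const;
    }
    return [content.slice(0, index), content.slice(index + marker.length)] as const;
}

// '=====' is an arbitrary example marker, not the Shamela default.
const [body, footer] = splitAtMarker('Main text=====(1) A footnote', '=====');
console.log(body);
console.log(footer);
```

Returning a fixed-length tuple keeps destructuring simple for callers, which matches the `readonly [string, string]` shape in the signature above.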
|
383
395
|
|
|
384
|
-
|
|
396
|
+
#### removeArabicNumericPageMarkers
|
|
385
397
|
|
|
386
|
-
|
|
398
|
+
Removes Arabic numeral markers enclosed in ⦗ ⦘, commonly used to denote page numbers.
|
|
387
399
|
|
|
388
|
-
|
|
400
|
+
```typescript
|
|
401
|
+
removeArabicNumericPageMarkers(text: string): string
|
|
402
|
+
```
|
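As a rough picture of what stripping these markers involves, the sketch below removes bracketed Arabic-Indic numerals with a regular expression. `stripPageMarkers` is a hypothetical stand-in written for this example, not the library's implementation, and it assumes the markers contain only the digits U+0660 to U+0669 with no inner whitespace.

```typescript
// Hypothetical stand-in for illustration; not the library's implementation.
// Assumes a marker is one or more Arabic-Indic digits wrapped in ⦗ and ⦘.
const PAGE_MARKER = /⦗[\u0660-\u0669]+⦘/g;

function stripPageMarkers(text: string): string {
    return text.replace(PAGE_MARKER, '');
}

console.log(stripPageMarkers('قال ⦗١٢⦘المؤلف'));
```

Brackets that wrap anything other than Arabic-Indic digits are left untouched under this assumption.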
|
389
403
|
|
|
390
|
-
|
|
391
|
-
|
|
392
|
-
|
|
404
|
+
#### removeTagsExceptSpan
|
|
405
|
+
|
|
406
|
+
Strips anchor and hadeeth tags while preserving nested `<span>` elements.
|
|
407
|
+
|
|
408
|
+
```typescript
|
|
409
|
+
removeTagsExceptSpan(content: string): string
|
|
410
|
+
```
|
|
411
|
+
|
|
412
|
+
### Supporting Utilities
|
|
413
|
+
|
|
414
|
+
#### buildUrl
|
|
415
|
+
|
|
416
|
+
Constructs authenticated API URLs with query parameters and optional API key injection.
|
|
417
|
+
|
|
418
|
+
```typescript
|
|
419
|
+
buildUrl(endpoint: string, queryParams: Record<string, any>, useAuth?: boolean): URL
|
|
420
|
+
```
|
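The idea of composing an endpoint, query parameters, and an optional key can be sketched with the standard `URL` class. `buildUrlSketch` is a hypothetical function for this example; the `api_key` parameter name, the `major_release` query key, and the `example.com` endpoint are all assumptions and may not match what the library sends.

```typescript
// Hypothetical sketch for illustration; not the library's implementation.
function buildUrlSketch(
    endpoint: string,
    queryParams: Record<string, string | number>,
    apiKey?: string,
): URL {
    const url = new URL(endpoint);
    for (const [key, value] of Object.entries(queryParams)) {
        url.searchParams.append(key, String(value));
    }
    if (apiKey !== undefined) {
        // Assumption: the key travels as an `api_key` query parameter.
        url.searchParams.append('api_key', apiKey);
    }
    return url;
}

const sketchUrl = buildUrlSketch('https://example.com/api/books/26592', { major_release: 0 }, 'secret');
console.log(sketchUrl.toString());
```

Using `searchParams.append` rather than string concatenation means values are percent-encoded automatically.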
|
421
|
+
|
|
422
|
+
#### httpsGet
|
|
423
|
+
|
|
424
|
+
Makes HTTPS GET requests using the configured fetch implementation, automatically parsing JSON responses and returning binary data otherwise.
|
|
425
|
+
|
|
426
|
+
```typescript
|
|
427
|
+
httpsGet<T extends Uint8Array | Record<string, any>>(url: string | URL, options?: { fetchImpl?: typeof fetch }): Promise<T>
|
|
393
428
|
```
|
|
394
429
|
|
|
395
430
|
## Examples
|
|
396
431
|
|
|
397
432
|
### Downloading the Master Database
|
|
398
433
|
|
|
399
|
-
```
|
|
434
|
+
```typescript
|
|
400
435
|
import { downloadMasterDatabase } from 'shamela';
|
|
401
436
|
|
|
402
|
-
|
|
403
|
-
|
|
404
|
-
|
|
405
|
-
|
|
406
|
-
|
|
407
|
-
|
|
408
|
-
|
|
409
|
-
|
|
410
|
-
|
|
411
|
-
|
|
412
|
-
outputFile: { path: './shamela_master.json' },
|
|
413
|
-
});
|
|
414
|
-
console.log(`Master data exported to: ${jsonPath}`);
|
|
415
|
-
console.log('The JSON file includes authors, books, categories, and the master version number.');
|
|
416
|
-
} catch (error) {
|
|
417
|
-
console.error('Error downloading master database:', error);
|
|
418
|
-
}
|
|
419
|
-
})();
|
|
437
|
+
// Download as SQLite
|
|
438
|
+
const dbPath = await downloadMasterDatabase({
|
|
439
|
+
outputFile: { path: './shamela_master.db' }
|
|
440
|
+
});
|
|
441
|
+
console.log(`Downloaded to: ${dbPath}`);
|
|
442
|
+
|
|
443
|
+
// Download as JSON
|
|
444
|
+
const jsonPath = await downloadMasterDatabase({
|
|
445
|
+
outputFile: { path: './shamela_master.json' }
|
|
446
|
+
});
|
|
420
447
|
```
|
|
421
448
|
|
|
422
449
|
### Downloading a Book
|
|
423
450
|
|
|
424
|
-
```
|
|
451
|
+
```typescript
|
|
425
452
|
import { downloadBook, getBookMetadata } from 'shamela';
|
|
426
453
|
|
|
427
|
-
|
|
428
|
-
const bookId = 26592;
|
|
454
|
+
const bookId = 26592;
|
|
429
455
|
|
|
430
|
-
|
|
431
|
-
|
|
432
|
-
|
|
433
|
-
|
|
434
|
-
});
|
|
456
|
+
// Download book
|
|
457
|
+
await downloadBook(bookId, {
|
|
458
|
+
outputFile: { path: `./book_${bookId}.db` }
|
|
459
|
+
});
|
|
435
460
|
|
|
436
|
-
|
|
437
|
-
|
|
438
|
-
|
|
439
|
-
|
|
440
|
-
|
|
441
|
-
|
|
442
|
-
} catch (error) {
|
|
443
|
-
console.error('Error downloading book:', error);
|
|
444
|
-
}
|
|
445
|
-
})();
|
|
461
|
+
// With pre-fetched metadata
|
|
462
|
+
const metadata = await getBookMetadata(bookId);
|
|
463
|
+
await downloadBook(bookId, {
|
|
464
|
+
bookMetadata: metadata,
|
|
465
|
+
outputFile: { path: `./book_${bookId}.json` }
|
|
466
|
+
});
|
|
446
467
|
```
|
|
447
468
|
|
|
448
469
|
### Retrieving Book Data
|
|
449
470
|
|
|
450
|
-
```
|
|
471
|
+
```typescript
|
|
451
472
|
import { getBook } from 'shamela';
|
|
452
473
|
|
|
453
|
-
|
|
454
|
-
try {
|
|
455
|
-
const bookData = await getBook(26592);
|
|
456
|
-
|
|
457
|
-
console.log(`Book has ${bookData.pages.length} pages`);
|
|
474
|
+
const book = await getBook(26592);
|
|
458
475
|
|
|
459
|
-
|
|
460
|
-
console.log(`Book has ${bookData.titles.length} titles/chapters`);
|
|
461
|
-
|
|
462
|
-
// Display table of contents
|
|
463
|
-
bookData.titles.forEach((title) => {
|
|
464
|
-
console.log(`${title.id}: ${title.content} (Page ${title.page})`);
|
|
465
|
-
});
|
|
466
|
-
}
|
|
467
|
-
|
|
468
|
-
// Access page content
|
|
469
|
-
const firstPage = bookData.pages[0];
|
|
470
|
-
console.log(`First page content: ${firstPage.content.substring(0, 100)}...`);
|
|
471
|
-
} catch (error) {
|
|
472
|
-
console.error('Error retrieving book:', error);
|
|
473
|
-
}
|
|
474
|
-
})();
|
|
475
|
-
```
|
|
476
|
+
console.log(`Book has ${book.pages.length} pages`);
|
|
476
477
|
|
|
477
|
-
|
|
478
|
-
|
|
479
|
-
|
|
480
|
-
|
|
481
|
-
|
|
482
|
-
(async () => {
|
|
483
|
-
try {
|
|
484
|
-
const masterData = await getMaster();
|
|
478
|
+
// Display table of contents
|
|
479
|
+
book.titles?.forEach(title => {
|
|
480
|
+
console.log(`${title.id}: ${title.content} (Page ${title.page})`);
|
|
481
|
+
});
|
|
485
482
|
|
|
486
|
-
|
|
487
|
-
|
|
488
|
-
|
|
489
|
-
} catch (error) {
|
|
490
|
-
console.error('Error retrieving master data:', error);
|
|
491
|
-
}
|
|
492
|
-
})();
|
|
483
|
+
// Access page content
|
|
484
|
+
const firstPage = book.pages[0];
|
|
485
|
+
console.log(firstPage.content.substring(0, 100));
|
|
493
486
|
```
|
|
494
487
|
|
|
495
|
-
### Getting Book
|
|
496
|
-
|
|
497
|
-
```javascript
|
|
498
|
-
import { getCoverUrl, downloadMasterDatabase } from 'shamela';
|
|
488
|
+
### Getting Book Covers
|
|
499
489
|
|
|
500
|
-
|
|
501
|
-
|
|
502
|
-
// Download master data to get book information
|
|
503
|
-
const masterData = await downloadMasterDatabase({
|
|
504
|
-
outputFile: { path: './master.json' },
|
|
505
|
-
});
|
|
490
|
+
```typescript
|
|
491
|
+
import { getCoverUrl, getMaster } from 'shamela';
|
|
506
492
|
|
|
507
|
-
|
|
508
|
-
const data = await Bun.file('./master.json').json();
|
|
493
|
+
const master = await getMaster();
|
|
509
494
|
|
|
510
|
-
|
|
511
|
-
|
|
512
|
-
|
|
513
|
-
|
|
514
|
-
|
|
515
|
-
} catch (error) {
|
|
516
|
-
console.error('Error processing covers:', error);
|
|
517
|
-
}
|
|
518
|
-
})();
|
|
495
|
+
// Generate cover URLs for all books
|
|
496
|
+
master.books.forEach(book => {
|
|
497
|
+
const coverUrl = getCoverUrl(book.id);
|
|
498
|
+
console.log(`${book.name}: ${coverUrl}`);
|
|
499
|
+
});
|
|
519
500
|
```
|
|
520
501
|
|
|
521
502
|
## Data Structures
|
|
522
503
|
|
|
523
|
-
The library provides comprehensive TypeScript types for all data structures:
|
|
524
|
-
|
|
525
504
|
### BookData
|
|
526
505
|
|
|
527
|
-
|
|
528
|
-
|
|
506
|
+
```typescript
|
|
507
|
+
type BookData = {
|
|
508
|
+
pages: Page[];
|
|
509
|
+
titles: Title[];
|
|
510
|
+
};
|
|
511
|
+
```
|
|
529
512
|
|
|
530
513
|
### MasterData
|
|
531
514
|
|
|
532
|
-
|
|
533
|
-
|
|
534
|
-
|
|
535
|
-
|
|
515
|
+
```typescript
|
|
516
|
+
type MasterData = {
|
|
517
|
+
authors: Author[];
|
|
518
|
+
books: Book[];
|
|
519
|
+
categories: Category[];
|
|
520
|
+
version: number;
|
|
521
|
+
};
|
|
522
|
+
```
|
|
536
523
|
|
|
537
524
|
### Page
|
|
538
525
|
|
|
539
|
-
|
|
540
|
-
|
|
541
|
-
|
|
542
|
-
|
|
543
|
-
|
|
526
|
+
```typescript
|
|
527
|
+
type Page = {
|
|
528
|
+
id: number;
|
|
529
|
+
content: string;
|
|
530
|
+
part?: string;
|
|
531
|
+
page?: number;
|
|
532
|
+
number?: string;
|
|
533
|
+
};
|
|
534
|
+
```
|
|
544
535
|
|
|
545
536
|
### Title
|
|
546
537
|
|
|
547
|
-
|
|
548
|
-
|
|
549
|
-
|
|
550
|
-
|
|
551
|
-
|
|
552
|
-
|
|
553
|
-
|
|
538
|
+
```typescript
|
|
539
|
+
type Title = {
|
|
540
|
+
id: number;
|
|
541
|
+
content: string;
|
|
542
|
+
page: number;
|
|
543
|
+
parent?: number;
|
|
544
|
+
};
|
|
545
|
+
```
|
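The optional `parent` field suggests that titles form a hierarchy. The sketch below shows one way to render an indented table of contents from such entries. `depthOf` and `formatToc` are hypothetical helpers, the `Title` type is repeated locally so the example is self-contained, and it assumes that a missing or unresolvable `parent` marks a top-level entry.

```typescript
// Local copy of the Title shape shown above, so the example stands alone.
type Title = { id: number; content: string; page: number; parent?: number };

// Counts how many ancestors a title has by walking the parent chain.
function depthOf(title: Title, byId: Map<number, Title>): number {
    let depth = 0;
    let current = title;
    const seen = new Set<number>([title.id]);
    while (current.parent !== undefined) {
        const parent = byId.get(current.parent);
        // Stop at unknown parents and guard against accidental cycles.
        if (!parent || seen.has(parent.id)) {
            break;
        }
        seen.add(parent.id);
        depth += 1;
        current = parent;
    }
    return depth;
}

// Produces one indented line per title, in the order given.
function formatToc(titles: Title[]): string[] {
    const byId = new Map(titles.map((t) => [t.id, t] as const));
    return titles.map((t) => `${'  '.repeat(depthOf(t, byId))}${t.content} (p. ${t.page})`);
}

const sampleTitles: Title[] = [
    { id: 1, content: 'Introduction', page: 1 },
    { id: 2, content: 'Background', page: 3, parent: 1 },
    { id: 3, content: 'Details', page: 5, parent: 2 },
];
formatToc(sampleTitles).forEach((line) => console.log(line));
```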
|
554
546
|
|
|
555
|
-
|
|
556
|
-
- `sanitizePageContent(content: string)`: Removes common footnote markers and normalises ligatures from Shamela pages.
|
|
547
|
+
### Content Helpers
|
|
557
548
|
|
|
558
|
-
|
|
549
|
+
- `parseContentRobust(content: string)`: Converts Shamela page HTML into structured lines
|
|
550
|
+
- `sanitizePageContent(content: string)`: Removes footnote markers and normalizes text
|
|
551
|
+
- `splitPageBodyFromFooter(content: string)`: Separates page content from footnotes
|
|
552
|
+
- `removeArabicNumericPageMarkers(text: string)`: Removes Arabic page number markers
|
|
553
|
+
- `removeTagsExceptSpan(content: string)`: Strips HTML tags except span elements
|
|
559
554
|
|
|
560
|
-
|
|
555
|
+
## Next.js Demo
|
|
561
556
|
|
|
562
|
-
|
|
557
|
+
A minimal Next.js 16 demo application is available in the `demo/` directory.
|
|
563
558
|
|
|
564
|
-
|
|
565
|
-
SHAMELA_API_MASTER_PATCH_ENDPOINT=https://dev.shamela.ws/api/v1/patches/master
|
|
566
|
-
SHAMELA_API_BOOKS_ENDPOINT=https://dev.shamela.ws/api/v1/patches/book-updates
|
|
567
|
-
# Optional when hosting the wasm asset yourself
|
|
568
|
-
# SHAMELA_SQLJS_WASM_URL=https://example.com/sql-wasm.wasm
|
|
569
|
-
```
|
|
559
|
+
**Setup:**
|
|
570
560
|
|
|
571
|
-
|
|
561
|
+
Create `demo/.env.local`:
|
|
572
562
|
|
|
573
|
-
```
|
|
574
|
-
|
|
563
|
+
```env
|
|
564
|
+
SHAMELA_API_KEY=your_api_key
|
|
565
|
+
SHAMELA_API_MASTER_PATCH_ENDPOINT=https://SHAMELA_INSTANCE.ws/api/master_patch
|
|
566
|
+
SHAMELA_API_BOOKS_ENDPOINT=https://SHAMELA_INSTANCE.ws/api/books
|
|
575
567
|
```
|
|
576
568
|
|
|
577
|
-
|
|
569
|
+
**Run:**
|
|
578
570
|
|
|
579
571
|
```bash
|
|
580
|
-
bun run demo
|
|
581
|
-
bun run demo:
|
|
572
|
+
bun run demo # Development
|
|
573
|
+
bun run demo:build # Production build
|
|
574
|
+
bun run demo:start # Production server
|
|
582
575
|
```
|
|
583
576
|
|
|
584
|
-
|
|
577
|
+
Visit [http://localhost:3000](http://localhost:3000) to explore the API.
|
|
585
578
|
|
|
586
579
|
## Troubleshooting
|
|
587
580
|
|
|
588
|
-
### Error: "Unable to locate sql-wasm.wasm file"
|
|
581
|
+
### Error: "Unable to automatically locate sql-wasm.wasm file"
|
|
589
582
|
|
|
590
|
-
This
|
|
583
|
+
This occurs in bundled environments (Next.js, webpack, Turbopack).
|
|
591
584
|
|
|
592
|
-
**Solution:**
|
|
585
|
+
**Solution:** Add explicit configuration:
|
|
593
586
|
|
|
594
587
|
```typescript
|
|
595
|
-
import { configure
|
|
596
|
-
|
|
597
|
-
// Option 1: Use the helper (recommended)
|
|
598
|
-
configure(createNodeConfig({
|
|
599
|
-
apiKey: process.env.SHAMELA_API_KEY,
|
|
600
|
-
booksEndpoint: process.env.SHAMELA_BOOKS_ENDPOINT,
|
|
601
|
-
masterPatchEndpoint: process.env.SHAMELA_MASTER_ENDPOINT,
|
|
602
|
-
}));
|
|
603
|
-
|
|
604
|
-
// Option 2: Manual configuration
|
|
588
|
+
import { configure } from 'shamela';
|
|
605
589
|
import { join } from 'node:path';
|
|
606
590
|
|
|
607
591
|
configure({
|
|
608
592
|
sqlJsWasmUrl: join(process.cwd(), 'node_modules', 'sql.js', 'dist', 'sql-wasm.wasm'),
|
|
609
593
|
apiKey: process.env.SHAMELA_API_KEY,
|
|
610
|
-
|
|
594
|
+
booksEndpoint: process.env.SHAMELA_BOOKS_ENDPOINT,
|
|
595
|
+
masterPatchEndpoint: process.env.SHAMELA_MASTER_ENDPOINT,
|
|
611
596
|
});
|
|
612
597
|
```
|
|
613
598
|
|
|
614
|
-
###
|
|
615
|
-
|
|
616
|
-
This means you're in a Node.js environment but providing an HTTPS URL for the WASM file. Node.js requires a filesystem path, not a URL.
|
|
617
|
-
|
|
618
|
-
**Solution:** Use a filesystem path instead of a URL:
|
|
619
|
-
|
|
620
|
-
```typescript
|
|
621
|
-
// ❌ Wrong - HTTPS URL in Node.js
|
|
622
|
-
configure({
|
|
623
|
-
sqlJsWasmUrl: 'https://cdn.jsdelivr.net/npm/sql.js@1.13.0/dist/sql-wasm.wasm'
|
|
624
|
-
});
|
|
599
|
+
### Next.js: Module not found errors
|
|
625
600
|
|
|
626
|
-
|
|
627
|
-
|
|
601
|
+
1. Add to `next.config.ts`:
|
|
602
|
+
```typescript
|
|
603
|
+
serverExternalPackages: ['shamela', 'sql.js']
|
|
604
|
+
```
|
|
628
605
|
|
|
629
|
-
|
|
630
|
-
apiKey: process.env.SHAMELA_API_KEY,
|
|
631
|
-
// ... other config
|
|
632
|
-
}));
|
|
633
|
-
```
|
|
606
|
+
2. Only import shamela in server-side code
|
|
634
607
|
|
|
635
|
-
|
|
608
|
+
3. Create a separate `lib/shamela-server.ts` for configuration
|
|
636
609
|
|
|
637
|
-
|
|
610
|
+
### Production build works differently than development
|
|
638
611
|
|
|
639
|
-
|
|
640
|
-
2. Ensure you're only importing shamela in server-side code (Server Actions, API Routes, Server Components)
|
|
641
|
-
3. Never import shamela in `layout.tsx` or client components
|
|
642
|
-
4. Create a separate `lib/shamela-server.ts` file for configuration
|
|
612
|
+
Ensure `serverExternalPackages` is set in your Next.js config for both development and production.
|
|
643
613
|
|
|
644
|
-
###
|
|
614
|
+
### Monorepo setup issues
|
|
645
615
|
|
|
646
|
-
|
|
616
|
+
Adjust the WASM path based on your structure:
|
|
647
617
|
|
|
648
618
|
```typescript
|
|
649
|
-
|
|
650
|
-
|
|
651
|
-
|
|
652
|
-
|
|
653
|
-
booksEndpoint: process.env.SHAMELA_BOOKS_ENDPOINT,
|
|
654
|
-
masterPatchEndpoint: process.env.SHAMELA_MASTER_ENDPOINT,
|
|
655
|
-
}));
|
|
619
|
+
configure({
|
|
620
|
+
sqlJsWasmUrl: join(process.cwd(), '../../node_modules/sql.js/dist/sql-wasm.wasm'),
|
|
621
|
+
// ... other config
|
|
622
|
+
});
|
|
656
623
|
```
|
|
657
624
|
|
|
658
625
|
## Testing
|
|
659
626
|
|
|
660
|
-
|
|
661
|
-
|
|
662
|
-
```bash
|
|
663
|
-
bun test src
|
|
664
|
-
```
|
|
665
|
-
|
|
666
|
-
For end-to-end tests:
|
|
667
|
-
|
|
668
|
-
```bash
|
|
669
|
-
bun run e2e
|
|
670
|
-
```
|
|
671
|
-
|
|
672
|
-
### Formatting
|
|
673
|
-
|
|
674
|
-
Apply Biome formatting across the repository with:
|
|
627
|
+
Run tests with Bun:
|
|
675
628
|
|
|
676
629
|
```bash
|
|
677
|
-
bun
|
|
630
|
+
bun test src # Unit tests
|
|
631
|
+
bun run e2e # End-to-end tests
|
|
632
|
+
bun run format # Format code
|
|
633
|
+
bun run lint # Lint code
|
|
678
634
|
```
|
|
679
635
|
|
|
680
636
|
## License
|
|
681
637
|
|
|
682
|
-
|
|
638
|
+
MIT License - see LICENSE file for details.
|