ef-dl 1.0.0
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- package/LICENSE +21 -0
- package/README.md +521 -0
- package/fonts/Sub-Zero.flf +629 -0
- package/index.ts +866 -0
- package/package.json +65 -0
- package/src/browserless/browser-client.ts +307 -0
- package/src/browserless/challenger.ts +352 -0
- package/src/browserless/helpers.ts +171 -0
- package/src/types/browserless.d.ts +31 -0
- package/src/types/constants.ts +3 -0
- package/src/types/enums.ts +5 -0
- package/src/utils/ascii.ts +66 -0
- package/src/utils/helpers.ts +260 -0
- package/src/utils/logger.ts +42 -0
- package/src/utils/progress.ts +130 -0
- package/src/utils/prompt.ts +87 -0
- package/src/workers/coordinator.ts +635 -0
- package/src/workers/index.ts +40 -0
- package/src/workers/task-queue.ts +388 -0
- package/src/workers/types.ts +135 -0
- package/src/workers/worker-pool.ts +227 -0
- package/src/workers/worker.ts +290 -0
package/LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2026 EF-DL

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
package/README.md
ADDED
@@ -0,0 +1,521 @@
# EF-DL: Epstein Files Downloader

```
 ______     ______    _____      __
/\  ___\   /\  ___\  /\  __-.   /\ \
\ \  __\   \ \  __\  \ \ \/\ \  \ \ \____
 \ \_____\  \ \_\     \ \____-   \ \_____\
  \/_____/   \/_/      \/____/    \/_____/
```

> **DISCLAIMER**: This application is for **EDUCATIONAL PURPOSES ONLY**. By using this tool, you certify that you are 18 years of age or older and will use it responsibly and legally.

An interactive CLI tool for downloading the Epstein files from the DOJ Epstein Files search portal. This tool automates the process of searching, downloading metadata, and downloading PDF files with support for pagination, prefixes, and deduplication.

## Table of Contents

- [Features](#features)
- [Installation](#installation)
  - [Option 1: Docker (Recommended)](#option-1-docker-recommended)
  - [Option 2: Bun Package Manager](#option-2-bun-package-manager)
  - [Option 3: Local Development](#option-3-local-development)
- [Quick Start](#quick-start)
  - [Start with interactive mode (default)](#start-with-interactive-mode-default)
  - [Download all pages](#download-all-pages)
  - [Download a specific page](#download-a-specific-page)
  - [Interactive mode with pre-filled values](#interactive-mode-with-pre-filled-values)
- [Docker Usage](#docker-usage)
  - [Quick Start with Docker](#quick-start-with-docker)
  - [Docker Commands](#docker-commands)
  - [Development with Docker](#development-with-docker)
- [Usage](#usage)
  - [Command Line Options](#command-line-options)
  - [Interactive Mode](#interactive-mode)
  - [Examples](#examples)
- [Download Flow](#download-flow)
- [File Organization](#file-organization)
- [Tech Stack](#tech-stack)
- [Development](#development)
- [Important Notes](#important-notes)
- [Contributing](#contributing)
- [License](#license)

## Features

- **Search Portal Integration**: Automatically searches the justice.gov Epstein Files portal
- **PDF Downloads**: Downloads PDFs with automatic deduplication based on filename and file size
- **Progress Tracking**: Visual progress bars for JSON fetching and PDF downloads
- **Parallel Workers**: Multi-process downloads with a queue-backed resume system
- **Resume Support**: Restart interrupted runs from the queue state
- **Custom Prefixes**: Add custom prefixes to PDF filenames or use page numbers automatically
- **Smart Deduplication**: Detects existing files and skips/renames them appropriately
- **Batch Processing**: Download single pages or all pages at once
- **Interactive Mode**: Guided prompts for configuration (great for first-time users)
- **Age Verification**: Built-in age verification for legal compliance
- **Security Handling**: Automatically handles CAPTCHA and age verification challenges

## Installation

> **Bun Runtime Required**: This package uses Bun-specific APIs (`bun:sqlite`) and requires the Bun runtime. It will not work with Node.js.

### Option 1: Docker (Recommended)

No local runtime installation needed - just Docker:

- [Docker](https://www.docker.com/) v20.0.0 or higher
- [Docker Compose](https://docs.docker.com/compose/) v2.0.0 or higher (optional)

Docker Images:

- Docker Hub: `iammorpheus/ef-dl:latest`
- GitHub Container Registry: `ghcr.io/iammorpheuszion/ef-dl:latest`

```bash
# Option A: Using docker-compose (recommended)
# Download & run docker-compose.yml
curl -O https://raw.githubusercontent.com/iammorpheuszion/ef-dl/main/docker-compose.yml

docker compose run -it --rm ef-dl
```

```bash
# Option B: Using docker run directly
# Configure download location with -v flag
docker run -it --rm -v ./downloads:/app/downloads iammorpheus/ef-dl
```

See [Docker Usage](#docker-usage) for more details.

### Option 2: Bun Package Manager

Install from npm using Bun:

```bash
# Using bunx (no installation needed) - like npx but for Bun
bunx ef-dl -s "your search term" -d ./downloads

# Or install globally with Bun
bun install -g ef-dl
ef-dl -s "your search term" -d ./downloads
```

### Option 3: Local Development

Clone and run from source:

<details>
<summary>Local installation steps</summary>

**Prerequisites:** [Bun](https://bun.sh/) v1.0.0 or higher

1. Clone the repository:

```bash
git clone https://github.com/iammorpheuszion/ef-dl.git
cd ef-dl
```

2. Install dependencies:

```bash
bun install
```

3. Verify installation:

```bash
bun run typecheck
```

</details>

## Quick Start

### Start with interactive mode (default)

Running without arguments automatically starts interactive mode:

```bash
bun run start
```

### Interactive mode with pre-filled values

```bash
bun run start -i -s "your search term" -p 5
```

### Download all pages

```bash
bun run start -s "your search term" -d ./downloads
```

### Download a specific page

```bash
bun run start -s "your search term" -p 5 -d ./downloads
```

## Usage

### Command Line Options

| Flag            | Short | Description                               | Required | Default     |
| --------------- | ----- | ----------------------------------------- | -------- | ----------- |
| `--search`      | `-s`  | Search term to query the portal           | Yes      | -           |
| `--directory`   | `-d`  | Download directory path                   | Yes      | -           |
| `--page`        | `-p`  | Page number to download                   | -        | All pages   |
| `--all`         | `-a`  | Download all pages from specified page    | -        | `false`     |
| `--prefix`      | -     | Custom filename prefix (sequential mode)  | -        | Page number |
| `--workers`     | -     | Number of parallel workers (1-10)         | -        | `5`         |
| `--fresh`       | -     | Force fresh start, ignore resume          | -        | `false`     |
| `--sequential`  | -     | Use sequential download (disable workers) | -        | `false`     |
| `--verbose`     | `-v`  | Enable verbose debug output               | -        | `false`     |
| `--interactive` | `-i`  | Interactive mode with prompts             | -        | `false`     |
| `--help`        | `-h`  | Show help menu                            | -        | -           |
| `--version` | `-V` | Show version number | - | - |
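
The `--workers` and `--sequential` rows above imply a small validation step. The following is a minimal sketch of that step, not the tool's actual code: `resolveWorkerCount`, `WorkerOptions`, and the choice to model sequential mode as a single worker are assumptions made for illustration. Only the 1-10 range and the default of `5` come from the table.

```typescript
// Hypothetical helper; names and structure are not taken from the ef-dl source.
interface WorkerOptions {
  workers?: string; // raw value passed to --workers, if any
  sequential?: boolean; // true when --sequential is given
}

const DEFAULT_WORKERS = 5; // documented default
const MIN_WORKERS = 1; // documented lower bound
const MAX_WORKERS = 10; // documented upper bound

function resolveWorkerCount(opts: WorkerOptions): number {
  // Assumption: sequential mode is modeled here as exactly one worker.
  if (opts.sequential) return 1;
  if (opts.workers === undefined) return DEFAULT_WORKERS;

  const n = Number(opts.workers);
  if (!Number.isInteger(n) || n < MIN_WORKERS || n > MAX_WORKERS) {
    throw new RangeError(
      `--workers must be an integer from ${MIN_WORKERS} to ${MAX_WORKERS}, got "${opts.workers}"`
    );
  }
  return n;
}
```

Rejecting out-of-range values, rather than silently clamping them, keeps a typo such as `--workers 100` from quietly running with a different setting than the one requested.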

### Interactive Mode

Interactive mode provides guided prompts for all configuration options. **Running the tool without any arguments automatically enters interactive mode.**

```bash
# Start interactive mode (no arguments needed)
bun run start

# Explicit interactive mode
bun run start -i

# Interactive with pre-filled values
bun run start -i -s "your search term"
```

**Interactive prompts:**

1. Search term
2. Download directory
3. Page number (leave empty for all pages)
4. Download mode (single page or all from page)
5. Custom prefix (leave empty for page number)
6. Verbose mode (yes/no)

### Examples

<details>
<summary>Click to see all example commands</summary>

```bash
# Download all pages with parallel workers (default: 5)
bun run start -s "your search term" -d ./downloads

# Download with 10 parallel workers
bun run start -s "your search term" -d ./downloads --workers 10

# Download with sequential mode (no parallelism)
bun run start -s "your search term" -d ./downloads --sequential

# Download only page 5
bun run start -s "your search term" -p 5 -d ./downloads

# Download all pages starting from page 5
bun run start -s "your search term" -p 5 -a -d ./downloads

# Download page 5 (uses page number as prefix: 5-filename.pdf)
bun run start -s "your search term" -p 5 -d ./downloads
# Results in: 5-EFTA00000001.pdf

# Download with custom prefix
bun run start -s "your search term" -p 5 -d ./downloads --prefix EPSTEIN
# Results in: EPSTEIN-EFTA00000001.pdf

# Download with verbose output
bun run start -s "your search term" -d ./downloads -v

# Force fresh start (ignore previous resume)
bun run start -s "your search term" -d ./downloads --fresh

# Interactive mode (prompts for all options)
bun run start -i

# Interactive mode with pre-filled values
bun run start -i -s "your search term" -d ./downloads
```

</details>

## Docker Usage

You can also run EF-DL with Docker, without installing Bun locally.

### Quick Start with Docker

```bash
# Run in interactive mode
docker compose run -it --rm ef-dl

# Download specific search term
docker compose run -it --rm ef-dl bun index.ts -s "your search term" -d ./downloads
```

### Docker Commands

**Volume Binding:** Use `-v` to map a local directory to the container's download location. Downloads will be saved to your local machine.

<details>
<summary>Click to see all example commands</summary>

```bash
# Build the image
docker build -t ef-dl .

# Run interactively - downloads go to ./downloads on your machine
docker run -it --rm -v $(pwd)/downloads:/app/downloads ef-dl

# Run with arguments - downloads go to ./downloads in the current directory
docker run -it --rm -v $(pwd)/downloads:/app/downloads ef-dl bun index.ts -s "your_search_term" -d ./downloads

# Custom download location - use absolute path
docker run -it --rm -v /path/to/your/downloads:/app/downloads ef-dl bun index.ts -s "your_search_term" -d ./downloads

# Windows users (PowerShell)
docker run -it --rm -v ${PWD}/downloads:/app/downloads ef-dl

# Use production-optimized image
docker build -f Dockerfile.production -t ef-dl:prod .
docker run -it --rm -v $(pwd)/downloads:/app/downloads ef-dl:prod
```

</details>

### Development with Docker

```bash
# Run with hot reload
docker compose --profile dev run --rm ef-dl-dev
```

## Download Flow

Parallel mode (default) uses a producer-consumer pipeline with a SQLite queue and worker pool. Use `--sequential` to run the legacy single-process flow.

<details>
<summary>View detailed flow diagram</summary>

```mermaid
flowchart TD
    A[Start CLI] --> B{Resume check}
    B -->|No queue or --fresh| C[Initialize queue DB + cache]
    B -->|Queue exists| D[Show resume prompt]
    D -->|Resume| E[Reset in-progress -> pending]
    D -->|Fresh| C
    D -->|Abort| Z[Exit]

    C --> F[Discover totals]
    E --> F
    F --> G[Start worker pool]
    G --> H[Init progress bars]

    subgraph Producer[Coordinator: JSON producer]
        H --> I[Fetch JSON pages]
        I --> J[Save JSON to cache]
        J --> K[Extract PDFs]
        K --> L[Insert tasks into queue DB]
        L --> M[Update JSON progress]
        M --> I
    end

    subgraph Queue[SQLite queue DB]
        L --> Q[(pdf_tasks + metadata)]
        Q --> N[Workers claim tasks]
    end

    subgraph Workers[Worker pool]
        N --> O[Download PDF]
        O --> P[Mark complete/failed]
        P --> N
        P --> R{json_fetch_complete?}
        R -->|No| N
        R -->|Yes & no pending| S[Worker exits]
    end

    I --> T[Set json_fetch_complete = true]
    T --> U[Wait for workers]

    subgraph Progress[Progress tracking]
        H --> V[Add JSON + PDF bars]
        V --> W[Poll queue progress 1s]
        W --> X[Update PDF bar]
        M --> Y[Update JSON bar]
    end

    U --> AA[Show summary]
    AA --> AB{Cleanup cache?}
    AB -->|Yes| AC[Delete cache + queue DB]
    AB -->|No| AD[Keep cache for resume]
    AC --> AE[Done]
    AD --> AE[Done]
```

</details>
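
The queue states in the flow above can be sketched with a small in-memory model. This is an illustration only: the tool is described as using a SQLite queue, and the class name `InMemoryTaskQueue`, its methods, and the status strings are assumptions that do not come from the ef-dl source. The sketch shows the three behaviors the diagram names: workers claiming pending tasks, a resume resetting in-progress tasks to pending, and a worker exiting only once JSON fetching is complete and nothing is pending.

```typescript
// Hypothetical in-memory stand-in for the SQLite-backed queue.
type TaskStatus = "pending" | "in_progress" | "complete" | "failed";

interface PdfTask {
  id: number;
  url: string;
  status: TaskStatus;
}

class InMemoryTaskQueue {
  private tasks: PdfTask[] = [];
  private nextId = 1;
  // Set by the producer once every JSON page has been fetched.
  jsonFetchComplete = false;

  // Producer side: add one PDF download task.
  enqueue(url: string): PdfTask {
    const task: PdfTask = { id: this.nextId++, url, status: "pending" };
    this.tasks.push(task);
    return task;
  }

  // Worker side: take the oldest pending task, or undefined if none is waiting.
  claim(): PdfTask | undefined {
    const task = this.tasks.find((t) => t.status === "pending");
    if (task) task.status = "in_progress";
    return task;
  }

  // Worker side: record the outcome of a previously claimed task.
  finish(id: number, ok: boolean): void {
    const task = this.tasks.find((t) => t.id === id);
    if (!task || task.status !== "in_progress") {
      throw new Error(`task ${id} is not in progress`);
    }
    task.status = ok ? "complete" : "failed";
  }

  // Resume: tasks interrupted mid-download go back to pending.
  resetInProgress(): number {
    let reset = 0;
    for (const task of this.tasks) {
      if (task.status === "in_progress") {
        task.status = "pending";
        reset++;
      }
    }
    return reset;
  }

  pendingCount(): number {
    return this.tasks.filter((t) => t.status === "pending").length;
  }

  // A worker may exit only when the producer is done and nothing is pending.
  shouldWorkerExit(): boolean {
    return this.jsonFetchComplete && this.pendingCount() === 0;
  }
}
```

Checking `jsonFetchComplete` before exiting matters because the producer and the workers run concurrently: an empty queue alone does not mean the run is finished, since more tasks may still be inserted.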

## File Organization

**JSON Metadata:** Automatically saved with search results, document metadata, URLs, file sizes, and excerpts.

**PDF Files:** Prefixed with page number by default (e.g., `5-filename.pdf`). Custom prefixes supported in sequential mode. Duplicate detection based on filename AND file size.
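
The prefix rule can be written as a one-line function. This is a sketch under stated assumptions: `buildPdfFilename` is a hypothetical name, and treating an empty or whitespace-only custom prefix as "not set" is a guess based on the interactive prompt ("leave empty for page number"). The `prefix-filename` pattern itself comes from the documented examples.

```typescript
// Hypothetical helper; not taken from the ef-dl source.
function buildPdfFilename(
  originalName: string,
  page: number,
  customPrefix?: string
): string {
  // Fall back to the page number when no usable custom prefix is given.
  const trimmed = customPrefix?.trim();
  const prefix = trimmed ? trimmed : String(page);
  return `${prefix}-${originalName}`;
}

console.log(buildPdfFilename("EFTA00000001.pdf", 5)); // 5-EFTA00000001.pdf
console.log(buildPdfFilename("EFTA00000001.pdf", 5, "EPSTEIN")); // EPSTEIN-EFTA00000001.pdf
```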

<details>
<summary>View directory structures</summary>

### Parallel mode (default)

```
{download-directory}/
├── cache/
│   └── {search-term}/
│       ├── json/
│       │   ├── search-{term}-page-1-{timestamp}.json
│       │   ├── search-{term}-page-2-{timestamp}.json
│       │   └── ...
│       └── {search-term}.db
└── files/
    └── {search-term}/
        ├── {page}-EFTA00000001.pdf
        ├── {page}-EFTA00000002.pdf
        └── ...
```

### Sequential mode (`--sequential`)

```
{download-directory}/
└── {search-term}/
    ├── json/
    │   ├── search-{term}-page-1-{timestamp}.json
    │   ├── search-{term}-page-2-{timestamp}.json
    │   └── ...
    └── pdfs/
        ├── {prefix}-EFTA00000001.pdf
        ├── {prefix}-EFTA00000002.pdf
        └── ...
```

</details>
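
The two layouts above differ only in where the JSON cache, the PDFs, and the queue database live. The sketch below derives those locations from the download directory and search term. `resolveOutputPaths` and `OutputPaths` are hypothetical names, and the function assumes `searchTerm` is already in the form used for directory names, since the README does not say how search terms are sanitized.

```typescript
// Hypothetical helper; not taken from the ef-dl source.
interface OutputPaths {
  jsonDir: string; // where page JSON files are written
  pdfDir: string; // where downloaded PDFs are written
  queueDb?: string; // queue database, parallel mode only
}

function resolveOutputPaths(
  downloadDir: string,
  searchTerm: string,
  sequential: boolean
): OutputPaths {
  // Drop trailing slashes so joined paths do not contain "//".
  const base = downloadDir.replace(/\/+$/, "");

  if (sequential) {
    // Sequential layout: everything under {download-directory}/{search-term}/
    return {
      jsonDir: `${base}/${searchTerm}/json`,
      pdfDir: `${base}/${searchTerm}/pdfs`,
    };
  }

  // Parallel layout: cache/ holds JSON and the queue DB, files/ holds PDFs.
  const cacheDir = `${base}/cache/${searchTerm}`;
  return {
    jsonDir: `${cacheDir}/json`,
    pdfDir: `${base}/files/${searchTerm}`,
    queueDb: `${cacheDir}/${searchTerm}.db`,
  };
}
```

Keeping the cache and the queue database apart from the downloaded files is what lets the cleanup step in the download flow delete the cache without touching the PDFs.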

## Tech Stack

**Core:** TypeScript, Bun, Puppeteer for browser automation

<details>
<summary>View all dependencies</summary>

### Dependencies

| Package               | Version  | Purpose                                         |
| --------------------- | -------- | ----------------------------------------------- |
| `@inquirer/prompts`   | ^8.2.0   | Interactive CLI prompts and user input handling |
| `browserless`         | ^10.9.18 | Headless browser automation for web scraping    |
| `chalk`               | ^5.6.2   | Terminal string styling and colors              |
| `commander`           | ^14.0.3  | CLI argument parsing and command structure      |
| `figlet`              | ^1.10.0  | ASCII art text generation for headers           |
| `multi-progress-bars` | ^5.0.3   | Multiple concurrent progress bar display        |
| `puppeteer`           | ^24.36.1 | Browser automation and PDF downloads            |

### Development Dependencies

| Package         | Version | Purpose                                     |
| --------------- | ------- | ------------------------------------------- |
| `@types/bun`    | latest  | TypeScript type definitions for Bun runtime |
| `@types/figlet` | ^1.7.0  | TypeScript type definitions for Figlet      |

</details>

## Development

<details>
<summary>Scripts and project structure</summary>

### Scripts

| Script         | Command                            | Description              |
| -------------- | ---------------------------------- | ------------------------ |
| `dev`          | `bun --watch --hot index.ts`       | Run with hot reloading   |
| `start`        | `bun index.ts`                     | Run the application      |
| `build`        | `bun build index.ts --outdir dist` | Build for production     |
| `typecheck`    | `tsc --noEmit`                     | TypeScript type checking |
| `test:browser` | `bun src/browser-client.ts`        | Test browser client      |

### Project Structure

```
ef-dl/
├── index.ts                  # Main application entry point
├── src/
│   ├── browser-client.ts     # Web scraping and PDF download logic
│   ├── progress.ts           # Progress bar management
│   ├── types/
│   │   ├── enums.ts          # Shared enums (prompt types)
│   │   └── browserless.d.ts  # Browserless module typings
│   ├── utils/
│   │   ├── ascii.ts          # ASCII art header generation
│   │   ├── logger.ts         # Centralized logging utilities
│   │   └── prompt.ts         # Unified prompt handling
│   └── workers/
│       ├── coordinator.ts    # Producer logic
│       ├── task-queue.ts     # SQLite operations
│       ├── worker-pool.ts    # Worker management
│       ├── worker.ts         # Worker process
│       └── types.ts          # Worker types
├── downloads/                # Default download directory (created on first run)
├── package.json
├── tsconfig.json
└── README.md
```

</details>

## Important Notes

- **Age Requirement**: You must be 18+ to use this application
- **Educational Use**: For educational purposes only
- **Default Behavior**: Running without arguments starts interactive mode
- **Parallel by Default**: Worker pipeline is default; use `--sequential` for single-process
- **File Deduplication**: Detected by filename AND size to prevent duplicates

<details>
<summary>Troubleshooting</summary>

### "required option not specified" error

This error only occurs in non-interactive mode. Either:

- Run without arguments to use interactive mode: `bun index.ts`
- Provide all required flags: `bun index.ts -s "term" -d ./downloads`
- Use interactive mode explicitly: `bun index.ts -i`

### Download fails

- Check your internet connection
- Try with `-v` (verbose) flag to see detailed error messages
- Ensure you have sufficient disk space

### Files not being detected as duplicates

The tool checks both filename AND file size. If a file exists with a different size, it will be re-downloaded.

</details>
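
The duplicate check described under Troubleshooting comes down to comparing two values per file. Below is a minimal sketch of that decision, assuming the set of existing files is available as a name-to-size map. `decideDownload` and `DedupDecision` are hypothetical names, and the sketch does not cover whether a re-downloaded file overwrites or is renamed alongside the existing one, because the README does not specify that.

```typescript
// Hypothetical helper; not taken from the ef-dl source.
type DedupDecision = "download" | "skip" | "redownload";

function decideDownload(
  existing: Map<string, number>, // filename -> size in bytes of files already on disk
  filename: string,
  expectedSize: number // size reported for the remote file
): DedupDecision {
  const existingSize = existing.get(filename);
  if (existingSize === undefined) return "download"; // no file with this name yet
  // Same name and same size counts as a duplicate; same name with a
  // different size is treated as stale or incomplete and fetched again.
  return existingSize === expectedSize ? "skip" : "redownload";
}
```

Comparing size as well as name means a partially written file left by an interrupted run is not mistaken for a finished download.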

## Contributing

Contributions are welcome. Please feel free to submit a Pull Request.

## License

[License: MIT](https://opensource.org/licenses/MIT)

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

---

**Disclaimer**: This is an independent educational tool and is not affiliated with or endorsed by the US Department of Justice. Use responsibly and in accordance with all applicable laws and terms of service.