pdflinkcheck 1.1.72__py3-none-any.whl → 1.1.94__py3-none-any.whl
This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
- pdflinkcheck/__init__.py +2 -5
- pdflinkcheck/analyze_pymupdf.py +12 -6
- pdflinkcheck/analyze_pypdf.py +25 -7
- pdflinkcheck/analyze_pypdf_v2.py +5 -6
- pdflinkcheck/cli.py +82 -91
- pdflinkcheck/data/I Have Questions.md +51 -0
- pdflinkcheck/data/LICENSE +17 -654
- pdflinkcheck/data/README.md +49 -49
- pdflinkcheck/data/icons/BoxArt-1080x1080.png +0 -0
- pdflinkcheck/data/icons/Logo-150x150.png +0 -0
- pdflinkcheck/data/icons/Logo-300x300.png +0 -0
- pdflinkcheck/data/icons/Logo-71x71.png +0 -0
- pdflinkcheck/data/icons/PosterArt-720x1080.png +0 -0
- pdflinkcheck/data/icons/SmallLogo-44x44.png +0 -0
- pdflinkcheck/data/icons/SplashScreen-620x300.png +0 -0
- pdflinkcheck/data/icons/StoreLogo-50x50.png +0 -0
- pdflinkcheck/data/icons/WideLogo-310x150.png +0 -0
- pdflinkcheck/data/icons/red_pdf_512px.ico +0 -0
- pdflinkcheck/data/pyproject.toml +20 -23
- pdflinkcheck/data/themes/forest/forest-dark/border-accent-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/border-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/border-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/border-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/border-invalid.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/card.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-tri-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-tri-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-tri-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-unsel-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-unsel-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-unsel-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/check-unsel-pressed.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/combo-button-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/combo-button-focus.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/combo-button-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/down.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/empty.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/hor-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/hor-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/hor-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/notebook.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/off-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/off-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/off-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/on-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/on-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/on-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-tri-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-tri-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-tri-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-pressed.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/rect-accent-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/rect-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/rect-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/rect-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/right.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/scale-hor.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/scale-vert.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/separator.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/sizegrip.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/spin-button-down-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/spin-button-down-focus.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/spin-button-up.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/tab-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/tab-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/tab-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/thumb-hor-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/thumb-hor-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/thumb-hor-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/thumb-vert-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/thumb-vert-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/thumb-vert-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/tree-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/tree-pressed.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/up.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/vert-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/vert-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark/vert-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-dark.tcl +536 -0
- pdflinkcheck/data/themes/forest/forest-light/border-accent-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/border-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/border-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/border-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/border-invalid.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/card.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-tri-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-tri-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-tri-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-unsel-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-unsel-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-unsel-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/check-unsel-pressed.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/combo-button-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/combo-button-focus.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/combo-button-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/down-focus.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/down.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/empty.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/hor-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/hor-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/hor-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/notebook.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/off-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/off-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/off-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/on-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/on-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/on-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-tri-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-tri-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-tri-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-unsel-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-unsel-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-unsel-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/radio-unsel-pressed.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/rect-accent-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/rect-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/rect-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/rect-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/right-focus.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/right.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/scale-hor.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/scale-vert.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/separator.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/sizegrip.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/spin-button-down-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/spin-button-down-focus.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/spin-button-up.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/tab-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/tab-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/tab-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/thumb-hor-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/thumb-hor-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/thumb-hor-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/thumb-vert-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/thumb-vert-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/thumb-vert-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/tree-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/tree-pressed.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/up.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/vert-accent.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/vert-basic.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light/vert-hover.png +0 -0
- pdflinkcheck/data/themes/forest/forest-light.tcl +544 -0
- pdflinkcheck/datacopy.py +2 -0
- pdflinkcheck/dev.py +10 -23
- pdflinkcheck/environment.py +64 -0
- pdflinkcheck/gui.py +229 -103
- pdflinkcheck/io.py +4 -18
- pdflinkcheck/report.py +161 -89
- pdflinkcheck/stdlib_server.py +14 -6
- pdflinkcheck/update_msix_version.py +47 -0
- pdflinkcheck/validate.py +59 -80
- pdflinkcheck/version_info.py +5 -2
- {pdflinkcheck-1.1.72.dist-info → pdflinkcheck-1.1.94.dist-info}/METADATA +54 -52
- pdflinkcheck-1.1.94.dist-info/RECORD +176 -0
- pdflinkcheck-1.1.94.dist-info/licenses/LICENSE +24 -0
- pdflinkcheck-1.1.94.dist-info/licenses/LICENSE-MIT +9 -0
- pdflinkcheck-1.1.72.dist-info/RECORD +0 -21
- {pdflinkcheck-1.1.72.dist-info → pdflinkcheck-1.1.94.dist-info}/WHEEL +0 -0
- {pdflinkcheck-1.1.72.dist-info → pdflinkcheck-1.1.94.dist-info}/entry_points.txt +0 -0
- /pdflinkcheck-1.1.72.dist-info/licenses/LICENSE → /pdflinkcheck-1.1.94.dist-info/licenses/LICENSE-AGPL3 +0 -0
pdflinkcheck/__init__.py
CHANGED
@@ -19,9 +19,7 @@ import os as _os
 # Library functions
 from pdflinkcheck.analyze_pymupdf import extract_links_pymupdf, extract_toc_pymupdf
 from pdflinkcheck.analyze_pypdf import extract_links_pypdf, extract_toc_pypdf
-
-from pdflinkcheck.report import run_report
-from pdflinkcheck.report import run_report as run_analysis # for backwards compatibility with previous versions
+from pdflinkcheck.report import run_report_and_call_exports as run_report
 #from pdflinkcheck import dev
 
 # For the kids. This is what I wanted when learning Python in a mysterious new REPL.
@@ -48,13 +46,12 @@ else:
 # Define __all__ such that the library functions are self documenting.
 __all__ = [
     "run_report",
-    "run_analysis",
     "extract_links_pymupdf",
     "extract_toc_pymupdf",
     "extract_links_pypdf",
     "extract_toc_pypdf",
     #"start_gui" if _load_gui_func else None,
-
+    "dev",
 ]
 if _load_gui_func:
     __all__.append("start_gui")
pdflinkcheck/analyze_pymupdf.py
CHANGED
@@ -1,3 +1,7 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: MIT
+# pdflinkcheck/analyze_pymupdf.py
+
 import sys
 from pathlib import Path
 import logging
@@ -5,14 +9,15 @@ from typing import Dict, Any, Optional, List
 
 logging.getLogger("fitz").setLevel(logging.ERROR)
 
+from pdflinkcheck.environment import pymupdf_is_available
 try:
-
+    if pymupdf_is_available():
+        import fitz # PyMuPDF
+    else:
+        fitz = None
 except ImportError:
     fitz = None
 
-from pdflinkcheck.report import run_report
-#from pdflinkcheck.validate import run_validation
-
 """
 Inspect target PDF for both URI links and for GoTo links.
 """
@@ -331,8 +336,9 @@ def call_stable():
     Note: This requires defining PROJECT_NAME, CLI_MAIN_FILE, etc., or
     passing them as arguments to run_report.
     """
-
-
+    from pdflinkcheck.report import run_report_and_call_exports
+
+    run_report_and_call_exports(pdf_library = "pymupdf")
 
 if __name__ == "__main__":
     call_stable()
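Note on the hunks above: analyze_pymupdf.py now asks pdflinkcheck.environment.pymupdf_is_available() before importing fitz, and the cli.py help text in this diff says that result is cached. environment.py itself is not shown in this excerpt, so the following is only a minimal sketch of one way a cached availability check could look. The names module_is_available and clear_availability_cache are hypothetical, and the in-process functools cache is an assumption; the package's own cache may work differently (for example, it may persist between runs).

```python
import importlib.util
from functools import lru_cache


@lru_cache(maxsize=None)
def module_is_available(module_name: str) -> bool:
    """Return True if a top-level module can be found; the answer is cached per name."""
    # find_spec locates the module without importing it.
    return importlib.util.find_spec(module_name) is not None


def clear_availability_cache() -> None:
    """Forget cached answers so the next call probes the environment again."""
    module_is_available.cache_clear()
```

With a cache like this, installing a missing library while the process is running would not change the answer until clear_availability_cache() is called, which is the kind of staleness the --clear-cache option is described as addressing.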
pdflinkcheck/analyze_pypdf.py
CHANGED
@@ -1,3 +1,5 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: MIT
 # src/pdflinkcheck/analyze_pypdf.py
 import sys
 from pathlib import Path
@@ -9,8 +11,6 @@ from pypdf.generic import Destination, NameObject, ArrayObject, IndirectObject
 
 
 from pdflinkcheck.io import error_logger, export_report_data, get_first_pdf_in_cwd, LOG_FILE_PATH
-from pdflinkcheck.report import run_report
-#from pdflinkcheck.validate import run_validation
 
 """
 Inspect target PDF for both URI links and for GoTo links, using only pypdf, not Fitz
@@ -51,7 +51,23 @@ def get_anchor_text_pypdf(page, rect) -> str:
 
     return cleaned if cleaned else "Graphic/Empty Link"
 
-def resolve_pypdf_destination(reader: PdfReader, dest, obj_id_to_page: dict) ->
+def resolve_pypdf_destination(reader: PdfReader, dest, obj_id_to_page: dict) -> Optional[int]:
+    try:
+        if isinstance(dest, Destination):
+            return dest.page_number + 1 # Return int directly
+
+        if isinstance(dest, IndirectObject):
+            return obj_id_to_page.get(dest.idnum)
+
+        if isinstance(dest, ArrayObject) and len(dest) > 0:
+            if isinstance(dest[0], IndirectObject):
+                return obj_id_to_page.get(dest[0].idnum)
+
+        return None # Unresolved → None
+    except Exception:
+        return None
+
+def resolve_pypdf_destination_(reader: PdfReader, dest, obj_id_to_page: dict) -> str:
     """
     Resolves a Destination object or IndirectObject to a 1-based page number string.
     """
@@ -105,7 +121,7 @@ def extract_links_pypdf(pdf_path):
                 'type': 'Other Action',
                 'target': 'Unknown'
             }
-
+
             # Handle URI (External)
             if "/A" in obj and "/URI" in obj["/A"]:
                 uri = obj["/A"]["/URI"]
@@ -114,11 +130,12 @@ def extract_links_pypdf(pdf_path):
                     'url': uri,
                     'target': uri
                 })
-
+
             # Handle GoTo (Internal)
             elif "/Dest" in obj or ("/A" in obj and "/D" in obj["/A"]):
                 dest = obj.get("/Dest") or obj["/A"].get("/D")
                 target_page = resolve_pypdf_destination(reader, dest, obj_id_to_page)
+                # print(f"DEBUG: resolved target_page = {target_page} (type: {type(target_page)})")
                 link_dict.update({
                     'type': 'Internal (GoTo/Dest)',
                     'destination_page': target_page,
@@ -177,8 +194,9 @@ def call_stable():
     Note: This requires defining PROJECT_NAME, CLI_MAIN_FILE, etc., or
     passing them as arguments to run_report.
     """
-
-
+    from pdflinkcheck.report import run_report_and_call_exports
+
+    run_report_and_call_exports(pdf_library = "pypdf")
 
 if __name__ == "__main__":
     call_stable()
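Note on the change above: the new resolve_pypdf_destination returns a 1-based page number as an int, or None when the destination cannot be resolved. A caller could then compare the result with the document length, in the spirit of the cli.py docstring in this diff ("Are the page numbers referenced by GoTo links within the length of the document?"). The sketch below illustrates only that comparison. destination_status and its "valid" and "unresolved" labels are hypothetical and are not taken from validate.py, whose contents are not shown in this excerpt; "broken-page" borrows the summary-stats key that does appear in cli.py.

```python
from typing import Optional


def destination_status(target_page: Optional[int], total_pages: int) -> str:
    """Classify a resolved GoTo target (1-based page number, or None if unresolved)."""
    if target_page is None:
        return "unresolved"
    if 1 <= target_page <= total_pages:
        return "valid"
    # The page number falls outside the document.
    return "broken-page"
```

Returning None instead of a string lets this kind of check use plain integer comparison without first parsing or special-casing text values.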
pdflinkcheck/analyze_pypdf_v2.py
CHANGED
@@ -1,4 +1,6 @@
-
+#!/usr/bin/env python3
+# SPDX-License-Identifier: MIT
+# src/pdflinkcheck/analyze_pypdf_v2.py
 import sys
 from pathlib import Path
 import logging
@@ -7,9 +9,6 @@ from typing import Dict, Any, List
 from pypdf import PdfReader
 from pypdf.generic import Destination, NameObject, IndirectObject
 
-from pdflinkcheck.report import run_report
-#from pdflinkcheck.validate import run_validation
-
 """
 Inspect target PDF for both URI links and GoTo links, using only pypdf (no PyMuPDF/Fitz).
 Fully fixed and improved version as of December 2025 (compatible with pypdf >= 4.0).
@@ -209,9 +208,9 @@ def call_stable():
     """
     Entry point for command-line execution or integration with reporting module.
     """
-
-    # run_validation(library_pdf="pypdf") # Uncomment if validation step is needed
+    from pdflinkcheck.report import run_report_and_call_exports
 
+    run_report_and_call_exports(pdf_library="pypdf")
 
 if __name__ == "__main__":
     call_stable()
pdflinkcheck/cli.py
CHANGED
@@ -1,10 +1,12 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: MIT
 # src/pdflinkcheck/cli.py
 import typer
 from typing import Literal
 from typer.models import OptionInfo
 from rich.console import Console
 from pathlib import Path
-from pdflinkcheck.report import
+from pdflinkcheck.report import run_report_and_call_exports # Assuming core logic moves here
 from typing import Dict, Optional, Union, List
 import pyhabitat
 import sys
@@ -13,7 +15,8 @@ from importlib.resources import files
 
 from pdflinkcheck.version_info import get_version_from_pyproject
 from pdflinkcheck.validate import run_validation
-
+from pdflinkcheck.environment import is_in_git_repo, assess_default_pdf_library
+from pdflinkcheck.io import get_first_pdf_in_cwd
 
 console = Console() # to be above the tkinter check, in case of console.print
 
@@ -25,13 +28,19 @@ app = typer.Typer(
     no_args_is_help = False,
 )
 
-
 @app.callback()
-def main(ctx: typer.Context
+def main(ctx: typer.Context,
+    version: Optional[bool] = typer.Option(
+        None, "--version", is_flag=True, help="Show the version."
+    )
+):
     """
     If no subcommand is provided, launch the GUI.
     """
-
+    if version:
+        typer.echo(get_version_from_pyproject())
+        raise typer.Exit(code=0)
+
     if ctx.invoked_subcommand is None:
         gui_command()
         raise typer.Exit(code=0)
@@ -54,10 +63,10 @@ if os.environ.get('DEV_TYPER_HELP_TREE',0) in ('true','1'):
 @app.command(name="docs", help="Show the docs for this software.")
 def docs_command(
     license: Optional[bool] = typer.Option(
-        None, "--license", "-l", help="Show the
+        None, "--license", "-l", help="Show the LICENSE text."
     ),
     readme: Optional[bool] = typer.Option(
-        None, "--readme", "-r", help="Show the
+        None, "--readme", "-r", help="Show the README.md content."
     ),
 ):
     """
@@ -70,6 +79,11 @@ def docs_command(
         console.print("[yellow]Please use either the --license or --readme flag.[/yellow]")
         return # Typer will automatically show the help message.
 
+    if is_in_git_repo():
+        """This is too aggressive. But we don't expect it often. Probably worth it."""
+        from pdflinkcheck.datacopy import ensure_data_files_for_build
+        ensure_data_files_for_build()
+
     # --- Handle --license flag ---
     if license:
         try:
@@ -100,19 +114,50 @@ def docs_command(
     # Exit successfully if any flag was processed
     raise typer.Exit(code=0)
 
+@app.command(name="tools", help= "Additional features, hamburger menu.")
+def tools_command(
+    clear_cache: bool = typer.Option(
+        False,
+        "--clear-cache",
+        is_flag=True,
+        help="Clear the environment caches. \n - pymupdf_is_available() \n - is_in_git_repo() \nMain purpose: Run after adding PyMuPDF to an existing installation where it was previously missing, because pymupdf_is_available() would have been cached as False."
+    )
+    ):
+    from pdflinkcheck.environment import clear_all_caches
+    if clear_cache:
+        clear_all_caches()
+
+"""
+def validate_pdf_commands(
+    pdf_path: Optional[Path] = typer.Argument(
+        None,
+        exists=True,
+        file_okay=True,
+        dir_okay=False,
+        readable=True,
+        resolve_path=True,
+        help="Path to the PDF file to validate. If omitted, searches current directory."
+    ),
+    pdf_library: Literal["pypdf", "pymupdf"] = typer.Option(
+        "pypdf",
+        "--library", "-l",
+        envvar="PDF_ENGINE",
+        help="PDF parsing engine: pypdf (pure Python) or pymupdf (faster, if available)"
+    ),
+"""
 @app.command(name="analyze") # Added a command name 'analyze' for clarity
 def analyze_pdf( # Renamed function for clarity
-    pdf_path: Path = typer.Argument(
-
+    pdf_path: Optional[Path] = typer.Argument(
+        None,
         exists=True,
         file_okay=True,
         dir_okay=False,
         readable=True,
         resolve_path=True,
-        help="
+        help="Path to the PDF file to analyze. If omitted, searches current directory."
     ),
     export_format: Optional[Literal["JSON", "TXT", "JSON,TXT", "NONE"]] = typer.Option(
-        "JSON",
+        "JSON,TXT",
         "--export-format","-e",
         case_sensitive=False,
         help="Export format. Use 'None' to suppress file export.",
@@ -125,10 +170,10 @@ def analyze_pdf( # Renamed function for clarity
     ),
 
     pdf_library: Literal["pypdf", "pymupdf"] = typer.Option(
-
+        assess_default_pdf_library(),
         "--pdf-library","-p",
         envvar="PDF_ENGINE",
-        help="
+        help="PDF parsing library. pypdf (pure Python) or pymupdf (faster, if available).",
     )
 ):
     """
@@ -138,6 +183,11 @@ def analyze_pdf( # Renamed function for clarity
     • Internal GoTo links point to valid pages
     • Remote GoToR links point to existing files
     • TOC bookmarks target valid pages
+
+    Validates:
+    • Are referenced files available?
+    • Are the page numbers referenced by GoTo links within the length of the document?
+
     """
 
     """
@@ -149,110 +199,51 @@ def analyze_pdf( # Renamed function for clarity
 
     Env Var: If no flag is present, it checks PDF_ENGINE.
 
-    Code Default: (Lowest priority) It falls back to "pypdf" as defined in
+    Code Default: (Lowest priority) It falls back to "pypdf" as defined in typer.Option.
     """
 
+    if pdf_path is None:
+        pdf_path = get_first_pdf_in_cwd()
+        if pdf_path is None:
+            console.print("[red]Error: No PDF file provided and none found in current directory.[/red]")
+            raise typer.Exit(code=1)
+        console.print(f"[dim]No file specified — using: {Path(pdf_path).name}[/dim]")
+
+    pdf_path_str = str(pdf_path)
+
     VALID_FORMATS = ("JSON") # extend later
     requested_formats = [fmt.strip().upper() for fmt in export_format.split(",")]
     if "NONE" in requested_formats or not export_format.strip() or export_format == "0":
         export_formats = ""
     else:
         # Filter for valid ones: ("JSON", "TXT")
-        # This allows "JSON,TXT" to become "JSONTXT" which
+        # This allows "JSON,TXT" to become "JSONTXT" which run_report logic can handle
         valid = [f for f in requested_formats if f in ("JSON", "TXT")]
         export_formats = "".join(valid)
 
         if not valid and "NONE" not in requested_formats:
             typer.echo(f"Warning: No valid formats found in '{export_format}'. Supported: JSON, TXT.")
 
-
+    # The meat and potatoes
+    report_results = run_report_and_call_exports(
         pdf_path=str(pdf_path),
         max_links=max_links,
         export_format = export_formats,
         pdf_library = pdf_library,
     )
 
-
-def validate_pdf(
-    pdf_path: Optional[Path] = typer.Argument(
-        None,
-        exists=True,
-        file_okay=True,
-        dir_okay=False,
-        readable=True,
-        resolve_path=True,
-        help="Path to the PDF file to validate. If omitted, searches current directory."
-    ),
-    export: bool = typer.Option(
-        True,
-        "--export",#"--no-export",
-        help = "JSON export for validation check."
-    ),
-    pdf_library: Literal["pypdf", "pymupdf"] = typer.Option(
-        "pypdf",
-        "--library", "-l",
-        envvar="PDF_ENGINE",
-        help="PDF parsing engine: pypdf (pure Python) or pymupdf (faster, if available)"
-    ),
-    fail_on_broken: bool = typer.Option(
-        False,
-        "--fail",
-        help="Exit with code 1 if any broken links are found (useful for CI)"
-    )
-):
-    """
-    Validate internal, remote, and TOC links in a PDF.
-
-    1. Call the run_report() function, like calling the 'analyze' CLI command.
-    2. Inspects the results from 'run_report():
-    - Are referenced files available?
-    - Are the page numbers referenced by GoTo links within the length of the document?
-    """
-    from pdflinkcheck.io import get_first_pdf_in_cwd
-
-    if pdf_path is None:
-        pdf_path = get_first_pdf_in_cwd()
-        if pdf_path is None:
-            console.print("[red]Error: No PDF file provided and none found in current directory.[/red]")
-            raise typer.Exit(code=1)
-        console.print(f"[dim]No file specified — using: {pdf_path.name}[/dim]")
-
-    pdf_path_str = str(pdf_path)
-
-    console.print(f"[bold]Validating links in:[/bold] {pdf_path.name}")
-    console.print(f"[bold]Using engine:[/bold] {pdf_library}\n")
-
-    # Step 1: Run analysis (quietly)
-    report = run_report(
-        pdf_path=pdf_path_str,
-        max_links=0,
-        export_format="",
-        pdf_library=pdf_library,
-        print_bool=False
-    )
-
-    if not report or not report.get("data"):
+    if not report_results or not report_results.get("data"):
         console.print("[yellow]No links or TOC found — nothing to validate.[/yellow]")
         raise typer.Exit(code=0)
 
-
-    validation_results = run_validation(
-        report_results=report,
-        pdf_path=pdf_path_str,
-        pdf_library=pdf_library,
-        export_json=export,
-        print_bool=True
-    )
-
+    validation_results = report_results["data"]["validation"]
     # Optional: fail on broken links
     broken_count = validation_results["summary-stats"]["broken-page"] + validation_results["summary-stats"]["broken-file"]
-
-
-        raise typer.Exit(code=1)
-    elif broken_count > 0:
+
+    if broken_count > 0:
         console.print(f"\n[bold yellow]Warning:[/bold yellow] {broken_count} broken link(s) found.")
     else:
-        console.print(f"\n[bold green]Success:[/bold green] No broken links or TOC issues
+        console.print(f"\n[bold green]Success:[/bold green] No broken links or TOC issues!\n")
 
     raise typer.Exit(code=0 if broken_count == 0 else 1)
 
@@ -260,7 +251,7 @@ def validate_pdf(
 def serve(
     host: str = typer.Option("0.0.0.0", "--host", "-h", help="Host to bind (use 0.0.0.0 for network access)"),
     port: int = typer.Option(8000, "--port", "-p", help="Port to listen on"),
-    reload: bool = typer.Option(False, "--reload", help="Auto-reload on code changes (dev only)"),
+    reload: bool = typer.Option(False, "--reload", is_flag=True, help="Auto-reload on code changes (dev only)"),
 ):
     """
     Start the built-in web server for uploading and analyzing PDFs in the browser.
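Note on the cli.py changes above: the export-format handling in analyze_pdf turns a comma-separated option value into a concatenated string such as "JSONTXT", or an empty string when export is suppressed. As a rough illustration, the same logic can be pulled into a standalone function. The name normalize_export_format is hypothetical; the body mirrors the lines shown in the hunk and leaves out the warning message.

```python
def normalize_export_format(export_format: str) -> str:
    """Collapse a value like "json, txt" into "JSONTXT"; return "" to suppress export."""
    requested = [fmt.strip().upper() for fmt in export_format.split(",")]
    if "NONE" in requested or not export_format.strip() or export_format == "0":
        return ""
    # Keep only supported formats, in the order the user gave them.
    valid = [fmt for fmt in requested if fmt in ("JSON", "TXT")]
    return "".join(valid)
```

Because matching is done after upper-casing, the value is case-insensitive, and unsupported entries are silently dropped from the result.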
pdflinkcheck/data/I Have Questions.md
ADDED
@@ -0,0 +1,51 @@
+# I Have Questions.md
+
+## Subject matter:
+How to create a graphical user interface.
+
+## Body:
+When I was about 10 years old I dug through 'C:/Program Files/' repeatedly in a hope to discover how to make a pop-up window.
+What defined the edges of an interface?
+Why do some windows look different than others?
+How can I add buttons?
+
+I was excited. I wanted to make something.
+
+Could I mimic code from software that was installed on my computer?
+I checked each folder in 'C:/Program Files/' looking for clues.
+
+As I searched, the questions changed.
+What is a DLL?
+Mostly, the only files I could open and inspect were little icons and fuzzy images - Why?
+
+I gave up.
+Wait - No I didn't. I am here now.
+
+Honestly, I still don't understand where the edges of the window come from.
+The easy answer is: **libraries**.
+Many people have done a lot of work to build various GUI libraries, to help people like me (and you) build software.
+
+For this package, the application window is built with Tkinter, which is included in Python's standard library.
+You can see how the graphical user interface (GUI) is defined at: https://raw.githubusercontent.com/City-of-Memphis-Wastewater/pdflinkcheck/main/src/pdflinkheck/gui.py
+
+This gui.py file isn't perfect, but exploring it will be far more illuminating than trying to open a DLL file in Notepad.
+
+This is not a recommendation to use Tkinter. I would recommend learning how to build a basic web-stack GUI which can be served locally.
+
+You might not want to make classic interfaces.
+It is what I grew up with, so I get a tickle when I participate in the tradition of local programs, but web and mobile are super valid.
+If you want to make classic interfaces, you should learn about Tauri.
+If you write core logic and then expose it in a way that’s friendly to the web, you can then use Tauri to wrap that web interface into something that feels native on your machine.
+This sounds wild, to go from native core to web tech back to native distribution, but it makes sense when you figure that:
+- Web stack interfaces (HTML, CSS, TS/JS) offer the most control and best portability of graphics, with lots of people having built tools that you can leverage.
+- Making your code accessible via web requests and/or an API will help it have the widest possible reach.
+
+Personally, I get really excited when my Python code can run smoothly on Windows, iOS, Linux, and most importantly, as Linux on Android via Termux. Yes, sure, if Android is a target, the same core can be packaged as an Android app and be more accessible. Why do I want Termux? Because it's more about leveraging the machine. Basically, with code that can run on Termux, I can take any old Android phone in a drawer and use it like I might use a Raspberry Pi. Tkinter will not run from Termux, not without proot. It is better to start a server on Termux, and then view the app on localhost through your browser.
+
+Links:
+- https://docs.python.org/3/library/tkinter.html
+- https://v2.tauri.app/start/
+- https://pyo3.rs/main/doc/pyo3_ffi/index.html
+- https://bheisler.github.io/post/calling-rust-in-python/
+
+Copyright © 2025 George Clayton Bennett
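Note on the file above: the essay suggests starting a server and viewing the app on localhost in a browser, rather than relying on Tkinter. A minimal sketch of that idea, using only Python's standard library, might look like the following. It is not the package's stdlib_server.py, whose contents are not shown in this excerpt; DemoHandler, start_server, fetch_home and the page body are all hypothetical names chosen for illustration.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

PAGE = b"<!doctype html><title>demo</title><h1>Hello from localhost</h1>"


class DemoHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Serve the same small HTML page for every path.
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format: str, *args) -> None:
        pass  # keep the console quiet


def start_server(port: int = 0) -> HTTPServer:
    """Serve on 127.0.0.1 in a background thread; port 0 lets the OS pick a free port."""
    server = HTTPServer(("127.0.0.1", port), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def fetch_home(server: HTTPServer) -> bytes:
    """Request "/" from the running server, as a browser would."""
    host, port = server.server_address[:2]
    conn = HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", "/")
        return conn.getresponse().read()
    finally:
        conn.close()
```

In interactive use, one would call start_server(8000) and open http://127.0.0.1:8000/ in a browser; server.shutdown() stops it again.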