pdflinkcheck 1.1.72__py3-none-any.whl → 1.1.94__py3-none-any.whl

This diff represents the content of publicly available package versions that have been released to one of the supported registries. The information contained in this diff is provided for informational purposes only and reflects changes between package versions as they appear in their respective public registries.
Files changed (177)
  1. pdflinkcheck/__init__.py +2 -5
  2. pdflinkcheck/analyze_pymupdf.py +12 -6
  3. pdflinkcheck/analyze_pypdf.py +25 -7
  4. pdflinkcheck/analyze_pypdf_v2.py +5 -6
  5. pdflinkcheck/cli.py +82 -91
  6. pdflinkcheck/data/I Have Questions.md +51 -0
  7. pdflinkcheck/data/LICENSE +17 -654
  8. pdflinkcheck/data/README.md +49 -49
  9. pdflinkcheck/data/icons/BoxArt-1080x1080.png +0 -0
  10. pdflinkcheck/data/icons/Logo-150x150.png +0 -0
  11. pdflinkcheck/data/icons/Logo-300x300.png +0 -0
  12. pdflinkcheck/data/icons/Logo-71x71.png +0 -0
  13. pdflinkcheck/data/icons/PosterArt-720x1080.png +0 -0
  14. pdflinkcheck/data/icons/SmallLogo-44x44.png +0 -0
  15. pdflinkcheck/data/icons/SplashScreen-620x300.png +0 -0
  16. pdflinkcheck/data/icons/StoreLogo-50x50.png +0 -0
  17. pdflinkcheck/data/icons/WideLogo-310x150.png +0 -0
  18. pdflinkcheck/data/icons/red_pdf_512px.ico +0 -0
  19. pdflinkcheck/data/pyproject.toml +20 -23
  20. pdflinkcheck/data/themes/forest/forest-dark/border-accent-hover.png +0 -0
  21. pdflinkcheck/data/themes/forest/forest-dark/border-accent.png +0 -0
  22. pdflinkcheck/data/themes/forest/forest-dark/border-basic.png +0 -0
  23. pdflinkcheck/data/themes/forest/forest-dark/border-hover.png +0 -0
  24. pdflinkcheck/data/themes/forest/forest-dark/border-invalid.png +0 -0
  25. pdflinkcheck/data/themes/forest/forest-dark/card.png +0 -0
  26. pdflinkcheck/data/themes/forest/forest-dark/check-accent.png +0 -0
  27. pdflinkcheck/data/themes/forest/forest-dark/check-basic.png +0 -0
  28. pdflinkcheck/data/themes/forest/forest-dark/check-hover.png +0 -0
  29. pdflinkcheck/data/themes/forest/forest-dark/check-tri-accent.png +0 -0
  30. pdflinkcheck/data/themes/forest/forest-dark/check-tri-basic.png +0 -0
  31. pdflinkcheck/data/themes/forest/forest-dark/check-tri-hover.png +0 -0
  32. pdflinkcheck/data/themes/forest/forest-dark/check-unsel-accent.png +0 -0
  33. pdflinkcheck/data/themes/forest/forest-dark/check-unsel-basic.png +0 -0
  34. pdflinkcheck/data/themes/forest/forest-dark/check-unsel-hover.png +0 -0
  35. pdflinkcheck/data/themes/forest/forest-dark/check-unsel-pressed.png +0 -0
  36. pdflinkcheck/data/themes/forest/forest-dark/combo-button-basic.png +0 -0
  37. pdflinkcheck/data/themes/forest/forest-dark/combo-button-focus.png +0 -0
  38. pdflinkcheck/data/themes/forest/forest-dark/combo-button-hover.png +0 -0
  39. pdflinkcheck/data/themes/forest/forest-dark/down.png +0 -0
  40. pdflinkcheck/data/themes/forest/forest-dark/empty.png +0 -0
  41. pdflinkcheck/data/themes/forest/forest-dark/hor-accent.png +0 -0
  42. pdflinkcheck/data/themes/forest/forest-dark/hor-basic.png +0 -0
  43. pdflinkcheck/data/themes/forest/forest-dark/hor-hover.png +0 -0
  44. pdflinkcheck/data/themes/forest/forest-dark/notebook.png +0 -0
  45. pdflinkcheck/data/themes/forest/forest-dark/off-accent.png +0 -0
  46. pdflinkcheck/data/themes/forest/forest-dark/off-basic.png +0 -0
  47. pdflinkcheck/data/themes/forest/forest-dark/off-hover.png +0 -0
  48. pdflinkcheck/data/themes/forest/forest-dark/on-accent.png +0 -0
  49. pdflinkcheck/data/themes/forest/forest-dark/on-basic.png +0 -0
  50. pdflinkcheck/data/themes/forest/forest-dark/on-hover.png +0 -0
  51. pdflinkcheck/data/themes/forest/forest-dark/radio-accent.png +0 -0
  52. pdflinkcheck/data/themes/forest/forest-dark/radio-basic.png +0 -0
  53. pdflinkcheck/data/themes/forest/forest-dark/radio-hover.png +0 -0
  54. pdflinkcheck/data/themes/forest/forest-dark/radio-tri-accent.png +0 -0
  55. pdflinkcheck/data/themes/forest/forest-dark/radio-tri-basic.png +0 -0
  56. pdflinkcheck/data/themes/forest/forest-dark/radio-tri-hover.png +0 -0
  57. pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-accent.png +0 -0
  58. pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-basic.png +0 -0
  59. pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-hover.png +0 -0
  60. pdflinkcheck/data/themes/forest/forest-dark/radio-unsel-pressed.png +0 -0
  61. pdflinkcheck/data/themes/forest/forest-dark/rect-accent-hover.png +0 -0
  62. pdflinkcheck/data/themes/forest/forest-dark/rect-accent.png +0 -0
  63. pdflinkcheck/data/themes/forest/forest-dark/rect-basic.png +0 -0
  64. pdflinkcheck/data/themes/forest/forest-dark/rect-hover.png +0 -0
  65. pdflinkcheck/data/themes/forest/forest-dark/right.png +0 -0
  66. pdflinkcheck/data/themes/forest/forest-dark/scale-hor.png +0 -0
  67. pdflinkcheck/data/themes/forest/forest-dark/scale-vert.png +0 -0
  68. pdflinkcheck/data/themes/forest/forest-dark/separator.png +0 -0
  69. pdflinkcheck/data/themes/forest/forest-dark/sizegrip.png +0 -0
  70. pdflinkcheck/data/themes/forest/forest-dark/spin-button-down-basic.png +0 -0
  71. pdflinkcheck/data/themes/forest/forest-dark/spin-button-down-focus.png +0 -0
  72. pdflinkcheck/data/themes/forest/forest-dark/spin-button-up.png +0 -0
  73. pdflinkcheck/data/themes/forest/forest-dark/tab-accent.png +0 -0
  74. pdflinkcheck/data/themes/forest/forest-dark/tab-basic.png +0 -0
  75. pdflinkcheck/data/themes/forest/forest-dark/tab-hover.png +0 -0
  76. pdflinkcheck/data/themes/forest/forest-dark/thumb-hor-accent.png +0 -0
  77. pdflinkcheck/data/themes/forest/forest-dark/thumb-hor-basic.png +0 -0
  78. pdflinkcheck/data/themes/forest/forest-dark/thumb-hor-hover.png +0 -0
  79. pdflinkcheck/data/themes/forest/forest-dark/thumb-vert-accent.png +0 -0
  80. pdflinkcheck/data/themes/forest/forest-dark/thumb-vert-basic.png +0 -0
  81. pdflinkcheck/data/themes/forest/forest-dark/thumb-vert-hover.png +0 -0
  82. pdflinkcheck/data/themes/forest/forest-dark/tree-basic.png +0 -0
  83. pdflinkcheck/data/themes/forest/forest-dark/tree-pressed.png +0 -0
  84. pdflinkcheck/data/themes/forest/forest-dark/up.png +0 -0
  85. pdflinkcheck/data/themes/forest/forest-dark/vert-accent.png +0 -0
  86. pdflinkcheck/data/themes/forest/forest-dark/vert-basic.png +0 -0
  87. pdflinkcheck/data/themes/forest/forest-dark/vert-hover.png +0 -0
  88. pdflinkcheck/data/themes/forest/forest-dark.tcl +536 -0
  89. pdflinkcheck/data/themes/forest/forest-light/border-accent-hover.png +0 -0
  90. pdflinkcheck/data/themes/forest/forest-light/border-accent.png +0 -0
  91. pdflinkcheck/data/themes/forest/forest-light/border-basic.png +0 -0
  92. pdflinkcheck/data/themes/forest/forest-light/border-hover.png +0 -0
  93. pdflinkcheck/data/themes/forest/forest-light/border-invalid.png +0 -0
  94. pdflinkcheck/data/themes/forest/forest-light/card.png +0 -0
  95. pdflinkcheck/data/themes/forest/forest-light/check-accent.png +0 -0
  96. pdflinkcheck/data/themes/forest/forest-light/check-basic.png +0 -0
  97. pdflinkcheck/data/themes/forest/forest-light/check-hover.png +0 -0
  98. pdflinkcheck/data/themes/forest/forest-light/check-tri-accent.png +0 -0
  99. pdflinkcheck/data/themes/forest/forest-light/check-tri-basic.png +0 -0
  100. pdflinkcheck/data/themes/forest/forest-light/check-tri-hover.png +0 -0
  101. pdflinkcheck/data/themes/forest/forest-light/check-unsel-accent.png +0 -0
  102. pdflinkcheck/data/themes/forest/forest-light/check-unsel-basic.png +0 -0
  103. pdflinkcheck/data/themes/forest/forest-light/check-unsel-hover.png +0 -0
  104. pdflinkcheck/data/themes/forest/forest-light/check-unsel-pressed.png +0 -0
  105. pdflinkcheck/data/themes/forest/forest-light/combo-button-basic.png +0 -0
  106. pdflinkcheck/data/themes/forest/forest-light/combo-button-focus.png +0 -0
  107. pdflinkcheck/data/themes/forest/forest-light/combo-button-hover.png +0 -0
  108. pdflinkcheck/data/themes/forest/forest-light/down-focus.png +0 -0
  109. pdflinkcheck/data/themes/forest/forest-light/down.png +0 -0
  110. pdflinkcheck/data/themes/forest/forest-light/empty.png +0 -0
  111. pdflinkcheck/data/themes/forest/forest-light/hor-accent.png +0 -0
  112. pdflinkcheck/data/themes/forest/forest-light/hor-basic.png +0 -0
  113. pdflinkcheck/data/themes/forest/forest-light/hor-hover.png +0 -0
  114. pdflinkcheck/data/themes/forest/forest-light/notebook.png +0 -0
  115. pdflinkcheck/data/themes/forest/forest-light/off-accent.png +0 -0
  116. pdflinkcheck/data/themes/forest/forest-light/off-basic.png +0 -0
  117. pdflinkcheck/data/themes/forest/forest-light/off-hover.png +0 -0
  118. pdflinkcheck/data/themes/forest/forest-light/on-accent.png +0 -0
  119. pdflinkcheck/data/themes/forest/forest-light/on-basic.png +0 -0
  120. pdflinkcheck/data/themes/forest/forest-light/on-hover.png +0 -0
  121. pdflinkcheck/data/themes/forest/forest-light/radio-accent.png +0 -0
  122. pdflinkcheck/data/themes/forest/forest-light/radio-basic.png +0 -0
  123. pdflinkcheck/data/themes/forest/forest-light/radio-hover.png +0 -0
  124. pdflinkcheck/data/themes/forest/forest-light/radio-tri-accent.png +0 -0
  125. pdflinkcheck/data/themes/forest/forest-light/radio-tri-basic.png +0 -0
  126. pdflinkcheck/data/themes/forest/forest-light/radio-tri-hover.png +0 -0
  127. pdflinkcheck/data/themes/forest/forest-light/radio-unsel-accent.png +0 -0
  128. pdflinkcheck/data/themes/forest/forest-light/radio-unsel-basic.png +0 -0
  129. pdflinkcheck/data/themes/forest/forest-light/radio-unsel-hover.png +0 -0
  130. pdflinkcheck/data/themes/forest/forest-light/radio-unsel-pressed.png +0 -0
  131. pdflinkcheck/data/themes/forest/forest-light/rect-accent-hover.png +0 -0
  132. pdflinkcheck/data/themes/forest/forest-light/rect-accent.png +0 -0
  133. pdflinkcheck/data/themes/forest/forest-light/rect-basic.png +0 -0
  134. pdflinkcheck/data/themes/forest/forest-light/rect-hover.png +0 -0
  135. pdflinkcheck/data/themes/forest/forest-light/right-focus.png +0 -0
  136. pdflinkcheck/data/themes/forest/forest-light/right.png +0 -0
  137. pdflinkcheck/data/themes/forest/forest-light/scale-hor.png +0 -0
  138. pdflinkcheck/data/themes/forest/forest-light/scale-vert.png +0 -0
  139. pdflinkcheck/data/themes/forest/forest-light/separator.png +0 -0
  140. pdflinkcheck/data/themes/forest/forest-light/sizegrip.png +0 -0
  141. pdflinkcheck/data/themes/forest/forest-light/spin-button-down-basic.png +0 -0
  142. pdflinkcheck/data/themes/forest/forest-light/spin-button-down-focus.png +0 -0
  143. pdflinkcheck/data/themes/forest/forest-light/spin-button-up.png +0 -0
  144. pdflinkcheck/data/themes/forest/forest-light/tab-accent.png +0 -0
  145. pdflinkcheck/data/themes/forest/forest-light/tab-basic.png +0 -0
  146. pdflinkcheck/data/themes/forest/forest-light/tab-hover.png +0 -0
  147. pdflinkcheck/data/themes/forest/forest-light/thumb-hor-accent.png +0 -0
  148. pdflinkcheck/data/themes/forest/forest-light/thumb-hor-basic.png +0 -0
  149. pdflinkcheck/data/themes/forest/forest-light/thumb-hor-hover.png +0 -0
  150. pdflinkcheck/data/themes/forest/forest-light/thumb-vert-accent.png +0 -0
  151. pdflinkcheck/data/themes/forest/forest-light/thumb-vert-basic.png +0 -0
  152. pdflinkcheck/data/themes/forest/forest-light/thumb-vert-hover.png +0 -0
  153. pdflinkcheck/data/themes/forest/forest-light/tree-basic.png +0 -0
  154. pdflinkcheck/data/themes/forest/forest-light/tree-pressed.png +0 -0
  155. pdflinkcheck/data/themes/forest/forest-light/up.png +0 -0
  156. pdflinkcheck/data/themes/forest/forest-light/vert-accent.png +0 -0
  157. pdflinkcheck/data/themes/forest/forest-light/vert-basic.png +0 -0
  158. pdflinkcheck/data/themes/forest/forest-light/vert-hover.png +0 -0
  159. pdflinkcheck/data/themes/forest/forest-light.tcl +544 -0
  160. pdflinkcheck/datacopy.py +2 -0
  161. pdflinkcheck/dev.py +10 -23
  162. pdflinkcheck/environment.py +64 -0
  163. pdflinkcheck/gui.py +229 -103
  164. pdflinkcheck/io.py +4 -18
  165. pdflinkcheck/report.py +161 -89
  166. pdflinkcheck/stdlib_server.py +14 -6
  167. pdflinkcheck/update_msix_version.py +47 -0
  168. pdflinkcheck/validate.py +59 -80
  169. pdflinkcheck/version_info.py +5 -2
  170. {pdflinkcheck-1.1.72.dist-info → pdflinkcheck-1.1.94.dist-info}/METADATA +54 -52
  171. pdflinkcheck-1.1.94.dist-info/RECORD +176 -0
  172. pdflinkcheck-1.1.94.dist-info/licenses/LICENSE +24 -0
  173. pdflinkcheck-1.1.94.dist-info/licenses/LICENSE-MIT +9 -0
  174. pdflinkcheck-1.1.72.dist-info/RECORD +0 -21
  175. {pdflinkcheck-1.1.72.dist-info → pdflinkcheck-1.1.94.dist-info}/WHEEL +0 -0
  176. {pdflinkcheck-1.1.72.dist-info → pdflinkcheck-1.1.94.dist-info}/entry_points.txt +0 -0
  177. /pdflinkcheck-1.1.72.dist-info/licenses/LICENSE → /pdflinkcheck-1.1.94.dist-info/licenses/LICENSE-AGPL3 +0 -0
pdflinkcheck/validate.py CHANGED
@@ -1,29 +1,30 @@
1
+ #!/usr/bin/env python3
2
+ # SPDX-License-Identifier: MIT
1
3
  # src/pdflinkcheck/validate.py
2
4
 
3
5
  import sys
4
6
  from pathlib import Path
5
7
  from typing import Dict, Any
6
8
 
7
- from pdflinkcheck.report import run_report
8
- from pdflinkcheck.io import get_friendly_path, export_validation_json
9
+ from pdflinkcheck.io import get_friendly_path
10
+ from pdflinkcheck.environment import pymupdf_is_available
11
+
12
+ SEP_COUNT=28
9
13
 
10
14
  def run_validation(
11
15
  report_results: Dict[str, Any],
12
16
  pdf_path: str,
13
17
  pdf_library: str = "pypdf",
14
- check_external: bool = False,
15
- export_json: bool = True,
16
- print_bool: bool = True
18
+ check_external: bool = False
17
19
  ) -> Dict[str, Any]:
18
20
  """
19
- Validates links using the output from run_report().
21
+ Validates links during run_report() using a partial completion of the data dict.
20
22
 
21
23
  Args:
22
- report_results: The dict returned by run_report()
24
+ report_results: The dict returned by run_report_and_call_exports()
23
25
  pdf_path: Path to the original PDF (needed for relative file checks and page count)
24
26
  pdf_library: Engine used ("pypdf" or "pymupdf")
25
27
  check_external: Whether to validate HTTP URLs (requires network + requests)
26
- print_bool: Whether to print results to console
27
28
 
28
29
  Returns:
29
30
  Validation summary stats with valid/broken counts and detailed issues
@@ -35,13 +36,12 @@ def run_validation(
35
36
  toc = data.get("toc", [])
36
37
 
37
38
  if not all_links and not toc:
38
- if print_bool:
39
- print("No links or TOC to validate.")
39
+ print("No links or TOC to validate.")
40
40
  return {"summary-stats": {"valid": 0, "broken": 0}, "issues": []}
41
41
 
42
42
  # Get total page count (critical for internal validation)
43
43
  try:
44
- if pdf_library == "pymupdf":
44
+ if pymupdf_is_available() and pdf_library == "pymupdf":
45
45
  import fitz
46
46
  doc = fitz.open(pdf_path)
47
47
  total_pages = doc.page_count
@@ -51,44 +51,54 @@ def run_validation(
51
51
  reader = PdfReader(pdf_path)
52
52
  total_pages = len(reader.pages)
53
53
  except Exception as e:
54
- if print_bool:
55
- print(f"Could not determine page count: {e}")
54
+ print(f"Could not determine page count: {e}")
56
55
  total_pages = None
57
56
 
58
57
  pdf_dir = Path(pdf_path).parent
59
58
 
60
59
  issues = []
61
60
  valid_count = 0
61
+ file_found_count = 0
62
62
  broken_file_count = 0
63
63
  broken_page_count = 0
64
- file_found_count = 0
64
+ no_destination_page_count = 0
65
65
  unknown_web_count = 0
66
66
  unknown_reasonableness_count = 0
67
67
  unknown_link_count = 0
68
68
 
69
69
  # Validate active links
70
+ #print("DEBUG validate: entering loop with", len(all_links), "links")
70
71
  for i, link in enumerate(all_links):
71
72
  link_type = link.get("type")
72
73
  status = "valid"
73
74
  reason = None
74
75
  if link_type in ("Internal (GoTo/Dest)", "Internal (Resolved Action)"):
75
- target_page = int(link.get("destination_page"))
76
- if not isinstance(target_page, int):
77
- status = "broken-page"
78
- reason = f"Target page not a number: {target_page}"
79
- elif (1 <= target_page) and total_pages is None:
80
- status = "unknown-reasonableness"
81
- reason = "Total page count unavailable, but the page number is reasonable"
82
- elif (1 <= target_page <= total_pages):
83
- status = "valid"
84
- reason = f"Page {target_page} within range (1–{total_pages})"
85
- elif target_page < 1:
86
- status = "broken-page"
87
- reason = f"TOC targets page negative {target_page}."
88
- elif not (1 <= target_page <= total_pages):
89
- status = "broken-page"
90
- reason = f"Page {target_page} out of range (1–{total_pages})"
91
-
76
+ dest_page_raw = link.get("destination_page")
77
+ if dest_page_raw is None:
78
+ status = "no-destinstion-page"
79
+ reason = "No destination page resolved"
80
+ else:
81
+ try:
82
+ target_page = int(dest_page_raw)
83
+ #target_page = int(link.get("destination_page"))
84
+ if not isinstance(target_page, int):
85
+ status = "broken-page"
86
+ reason = f"Target page not a number: {target_page}"
87
+ elif (1 <= target_page) and total_pages is None:
88
+ status = "unknown-reasonableness"
89
+ reason = "Total page count unavailable, but the page number is reasonable"
90
+ elif (1 <= target_page <= total_pages):
91
+ status = "valid"
92
+ reason = f"Page {target_page} within range (1–{total_pages})"
93
+ elif target_page < 1:
94
+ status = "broken-page"
95
+ reason = f"TOC targets page negative {target_page}."
96
+ elif not (1 <= target_page <= total_pages):
97
+ status = "broken-page"
98
+ reason = f"Page {target_page} out of range (1–{total_pages})"
99
+ except (ValueError, TypeError):
100
+ status = "broken-page"
101
+ reason = f"Invalid page value: {dest_page_raw}"
92
102
  elif link_type == "Remote (GoToR)":
93
103
  remote_file = link.get("remote_file")
94
104
  if not remote_file:
@@ -130,13 +140,15 @@ def run_validation(
130
140
  unknown_reasonableness_count += 1
131
141
  elif status == "unknown-link":
132
142
  unknown_link_count += 1
133
- elif status == "broken-file":
143
+ elif status == "broken-page":
134
144
  broken_page_count += 1
135
145
  issues.append(link_with_val)
136
146
  elif status == "broken-file":
137
- broken_page_count += 1
147
+ broken_file_count += 1
148
+ issues.append(link_with_val)
149
+ elif status == "no-destinstion-page":
150
+ no_destination_page_count += 1
138
151
  issues.append(link_with_val)
139
-
140
152
  # Validate TOC entries
141
153
  for entry in toc:
142
154
  target_page = int(entry.get("target_page"))
@@ -154,7 +166,7 @@ def run_validation(
154
166
  continue
155
167
  else:
156
168
  status = "broken-page"
157
- reason = f"TOC targets page {page} (out of 1–{total_pages})"
169
+ reason = f"TOC targets page {target_page} (out of 1–{total_pages})"
158
170
  broken_count += 1
159
171
  else:
160
172
  status = "broken-page"
@@ -175,6 +187,7 @@ def run_validation(
175
187
  "file-found": file_found_count,
176
188
  "broken-page": broken_page_count,
177
189
  "broken-file": broken_file_count,
190
+ "no_destination_page_count": no_destination_page_count,
178
191
  "unknown-web": unknown_web_count,
179
192
  "unknown-reasonableness": unknown_reasonableness_count,
180
193
  "unknown-link": unknown_link_count,
@@ -192,23 +205,23 @@ def run_validation(
192
205
  def log(msg: str):
193
206
  validation_buffer.append(msg)
194
207
 
195
- log("\n" + "=" * 70)
208
+ log("\n" + "=" * SEP_COUNT)
196
209
  log("## Validation Results")
197
- log("=" * 70)
210
+ log("=" * SEP_COUNT)
198
211
  log(f"PDF Path = {get_friendly_path(pdf_path)}")
199
212
  log(f"Total items checked: {summary_stats['total_checked']}")
200
213
  log(f"✅ Valid: {summary_stats['valid']}")
201
214
  log(f"🌐 Web Addresses (Not Checked): {summary_stats['unknown-web']}")
202
215
  log(f"⚠️ Unknown Page Reasonableness (Due to Missing Total Page Count): {summary_stats['unknown-reasonableness']}")
203
216
  log(f"⚠️ Unsupported PDF Links: {summary_stats['unknown-link']}")
204
- log(f"❌ Broken Page Reference: {summary_stats['broken-page']}")
205
- log(f"❌ Broken File Reference: {summary_stats['broken-file']}")
206
- log("=" * 70)
217
+ log(f"❌ Broken Page Reference (Page number beyond scope of availability): {summary_stats['broken-page']}")
218
+ log(f"❌ Broken File Reference (File not available): {summary_stats['broken-file']}")
219
+ log("=" * SEP_COUNT)
207
220
 
208
221
  if issues:
209
222
  log("\n## Issues Found")
210
223
  log("{:<5} | {:<12} | {:<30} | {}".format("Idx", "Type", "Text", "Problem"))
211
- log("-" * 70)
224
+ log("-" * SEP_COUNT)
212
225
  for i, issue in enumerate(issues[:25], 1):
213
226
  link_type = issue.get("type", "Link")
214
227
  text = issue.get("link_text", "") or issue.get("title", "") or "N/A"
@@ -218,7 +231,7 @@ def run_validation(
218
231
  if len(issues) > 25:
219
232
  log(f"... and {len(issues) - 25} more issues")
220
233
  else:
221
- log("No issues found — all links and TOC entries are valid!")
234
+ log("Success: No broken links or TOC issues!")
222
235
 
223
236
  # Final aggregation of the buffer into one string
224
237
  validation_buffer_str = "\n".join(validation_buffer)
@@ -226,8 +239,6 @@ def run_validation(
226
239
  return validation_buffer_str
227
240
 
228
241
  summary_txt = generate_validation_summary_txt_buffer(summary_stats, issues, pdf_path)
229
- if print_bool:
230
- print(summary_txt)
231
242
 
232
243
  validation_results = {
233
244
  "pdf_path" : pdf_path,
@@ -237,9 +248,6 @@ def run_validation(
237
248
  "total_pages": total_pages
238
249
  }
239
250
 
240
- # Have export run interally so that the logic need not happen in an interface
241
-
242
- export_validation_json(validation_results, pdf_path, pdf_library)
243
251
  return validation_results
244
252
 
245
253
 
@@ -254,7 +262,7 @@ def run_validation_more_readable_slop(pdf_path: str = None, pdf_library: str = "
254
262
  if check_external_links:
255
263
  import requests
256
264
 
257
- # 1. Setup Library Engine (Reuse your logic)
265
+ # 1. Setup Library Engine (Reuse logic)
258
266
  pdf_library = pdf_library.lower()
259
267
  if pdf_library == "pypdf":
260
268
  from pdflinkcheck.analyze_pypdf import extract_links_pypdf as extract_links
@@ -330,18 +338,18 @@ def run_validation_more_readable_slop(pdf_path: str = None, pdf_library: str = "
330
338
  else:
331
339
  results['broken'].append(link)
332
340
 
333
- print("\n" + "=" * 70)
341
+ print("\n" + "=" * SEP_COUNT)
334
342
  print(f"--- Validation Summary Stats for {Path(pdf_path).name} ---")
335
343
  print(f"Total Checked: {total_links}")
336
344
  print(f"✅ Valid: {len(results['valid'])}")
337
345
  print(f"❌ Broken: {len(results['broken'])}")
338
- print("=" * 70)
346
+ print("=" * SEP_COUNT)
339
347
 
340
348
  # 4. Print Detail Report for Broken Links
341
349
  if results['broken']:
342
350
  print("\n## ❌ Broken Links Found:")
343
351
  print("{:<5} | {:<5} | {:<30} | {}".format("Idx", "Page", "Reason", "Target"))
344
- print("-" * 70)
352
+ print("-" * SEP_COUNT)
345
353
  for i, link in enumerate(results['broken'], 1):
346
354
  target = link.get('url') or link.get('destination_page') or link.get('remote_file')
347
355
  print("{:<5} | {:<5} | {:<30} | {}".format(
@@ -349,32 +357,3 @@ def run_validation_more_readable_slop(pdf_path: str = None, pdf_library: str = "
349
357
  ))
350
358
 
351
359
  return results
352
-
353
-
354
- if __name__ == "__main__":
355
-
356
- from pdflinkcheck.io import get_first_pdf_in_cwd
357
- pdf_path = get_first_pdf_in_cwd()
358
- # Run analysis first
359
- report = run_report(
360
- pdf_path=pdf_path,
361
- max_links=0,
362
- export_format="",
363
- pdf_library="pypdf",
364
- print_bool=False # We handle printing in validation
365
- )
366
-
367
- if not report or not report.get("data"):
368
- print("No data extracted — nothing to validate.")
369
- sys.exit(1)
370
-
371
- # Then validate
372
- validation_results = run_validation(
373
- report_results=report,
374
- pdf_path=pdf_path,
375
- pdf_library="pypdf",
376
- export_json=True,
377
- print_bool=True
378
- )
379
-
380
- export_validation_results()
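Reviewer note on the internal-link hunk above: the new branch order can be read as a small classifier followed by a tally. The sketch below is a minimal, hypothetical restatement of that logic, not the package's code. `classify_internal_link` and `tally_statuses` are illustrative names that do not exist in pdflinkcheck, and the status string `no-destination-page` is spelled out in full here, which differs from the literal used in the diff.

```python
from collections import Counter
from typing import Any, Iterable, Optional, Tuple


def classify_internal_link(dest_page_raw: Any, total_pages: Optional[int]) -> Tuple[str, str]:
    """Return (status, reason) for an internal link's destination page.

    Hypothetical helper mirroring the branch order in the diff.
    """
    # A missing destination is reported separately instead of crashing int().
    if dest_page_raw is None:
        return "no-destination-page", "No destination page resolved"
    try:
        target_page = int(dest_page_raw)
    except (ValueError, TypeError):
        return "broken-page", f"Invalid page value: {dest_page_raw}"
    # Pages are 1-based, so anything below 1 is broken regardless of page count.
    if target_page < 1:
        return "broken-page", f"Page {target_page} is below 1"
    # Without a page count, the upper bound cannot be checked.
    if total_pages is None:
        return "unknown-reasonableness", "Total page count unavailable"
    if target_page <= total_pages:
        return "valid", f"Page {target_page} within range (1-{total_pages})"
    return "broken-page", f"Page {target_page} out of range (1-{total_pages})"


def tally_statuses(statuses: Iterable[str]) -> Counter:
    """Count each status once, so no branch can increment the wrong counter."""
    return Counter(statuses)


if __name__ == "__main__":
    samples = [None, "3", 11, 0, "abc"]
    results = [classify_internal_link(raw, total_pages=10) for raw in samples]
    print(tally_statuses(status for status, _ in results))
```

Keying the tally on the status string is one way to avoid the kind of mismatch this release corrects, where a `broken-file` branch incremented `broken_page_count`.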
pdflinkcheck/version_info.py CHANGED
@@ -1,4 +1,7 @@
1
+ #!/usr/bin/env python3
2
+ # SPDX-License-Identifier: MIT
1
3
  # src/pdflinkcheck/version_info.py
4
+
2
5
  import re
3
6
  from pathlib import Path
4
7
  import sys
@@ -11,7 +14,7 @@ This portion of the codebase is MIT licensed. It does not rely on any AGPL-licen
11
14
 
12
15
  MIT License
13
16
 
14
- Copyright (c) 2025 George Clayton Bennett <george.bennett@memphistn.gov>
17
+ Copyright © 2025 George Clayton Bennett <george.bennett@memphistn.gov>
15
18
 
16
19
  Permission is hereby granted, free of charge, to any person obtaining a copy
17
20
  of this software and associated documentation files (the "Software"), to deal
@@ -52,7 +55,7 @@ def find_pyproject(start: Path) -> Path | None:
52
55
  if candidate.exists():
53
56
  return candidate
54
57
 
55
- # 3. Handle Installed / Wheel / Shiv state (using your force-include path)
58
+ # 3. Handle Installed / Wheel / Shiv state (using force-include path)
56
59
  internal_path = Path(__file__).parent / "data" / "pyproject.toml"
57
60
  if internal_path.exists():
58
61
  return internal_path
{pdflinkcheck-1.1.72.dist-info → pdflinkcheck-1.1.94.dist-info}/METADATA CHANGED
@@ -1,10 +1,13 @@
1
1
  Metadata-Version: 2.4
2
2
  Name: pdflinkcheck
3
- Version: 1.1.72
3
+ Version: 1.1.94
4
4
  Summary: A purpose-built PDF link analysis and reporting tool with GUI and CLI.
5
5
  Author: George Clayton Bennett
6
6
  Author-email: George Clayton Bennett <george.bennett@memphistn.gov>
7
+ License-Expression: MIT AND AGPL-3.0-or-later
7
8
  License-File: LICENSE
9
+ License-File: LICENSE-AGPL3
10
+ License-File: LICENSE-MIT
8
11
  Classifier: Programming Language :: Python :: 3
9
12
  Classifier: Programming Language :: Python :: 3 :: Only
10
13
  Classifier: Programming Language :: Python :: 3.10
@@ -13,6 +16,7 @@ Classifier: Programming Language :: Python :: 3.12
13
16
  Classifier: Programming Language :: Python :: 3.13
14
17
  Classifier: Programming Language :: Python :: 3.14
15
18
  Classifier: License :: OSI Approved :: GNU Affero General Public License v3 or later (AGPLv3+)
19
+ Classifier: License :: OSI Approved :: MIT License
16
20
  Classifier: Operating System :: OS Independent
17
21
  Classifier: Intended Audience :: End Users/Desktop
18
22
  Classifier: Intended Audience :: Developers
@@ -32,14 +36,12 @@ Requires-Dist: pypdf>=6.4.2
32
36
  Requires-Dist: rich>=14.2.0
33
37
  Requires-Dist: typer>=0.20.0
34
38
  Requires-Dist: pymupdf>=1.26.7 ; extra == 'full'
35
- Requires-Dist: sv-ttk>=2.6.1 ; extra == 'gui'
36
39
  Maintainer: George Clayton Bennett
37
40
  Maintainer-email: George Clayton Bennett <george.bennett@memphistn.gov>
38
41
  Requires-Python: >=3.10
39
42
  Project-URL: Homepage, https://github.com/city-of-memphis-wastewater/pdflinkcheck
40
43
  Project-URL: Repository, https://github.com/city-of-memphis-wastewater/pdflinkcheck
41
44
  Provides-Extra: full
42
- Provides-Extra: gui
43
45
  Description-Content-Type: text/markdown
44
46
 
45
47
  # pdflinkcheck
@@ -48,7 +50,7 @@ A purpose-built tool for comprehensive analysis of hyperlinks and GoTo links wit
48
50
 
49
51
  -----
50
52
 
51
- ![Screenshot of the pdflinkcheck GUI](https://raw.githubusercontent.com/City-of-Memphis-Wastewater/pdflinkcheck/main/assets/pdflinkcheck_gui_v1.1.58.png)
53
+ ![Screenshot of the pdflinkcheck GUI](https://raw.githubusercontent.com/City-of-Memphis-Wastewater/pdflinkcheck/main/assets/pdflinkcheck_gui_v1.1.92.png)
52
54
 
53
55
  -----
54
56
 
@@ -86,20 +88,9 @@ The tool can be run as simple cross-platform graphical interface (Tkinter).
86
88
 
87
89
  ### Launching the GUI
88
90
 
89
- There are three ways to launch the GUI interface:
90
-
91
- 1. **Implicit Launch:** Run the main command with no arguments, subcommands, or flags (`pdflinkcheck`).
91
+ Ways to launch the GUI interface:
92
+ 1. **Implicit Launch:** Run the tool or file with no arguments, subcommands, or flags. (Note: PyInstaller builds use the `--windowed` (or `--noconsole`) flag, except on Termux.)
92
93
  2. **Explicit Command:** Use the dedicated GUI subcommand (`pdflinkcheck gui`).
93
- 3. **Binary Double-Click:**
94
- * **Windows:** Double-click the `pdflinkcheck-VERSION-gui.bat` file.
95
- * **macOS/Linux:** Double-click the downloaded `.pyz` or `.elf` file.
96
-
97
- ### Planned GUI Updates
98
-
99
- We are actively working on the following enhancements:
100
-
101
- * **Report Export:** Functionality to export the full analysis report to a plain text file.
102
- * **License Visibility:** A dedicated "License Info" button within the GUI to display the terms of the AGPLv3+ license.
103
94
 
104
95
  -----
105
96
 
@@ -107,20 +98,30 @@ We are actively working on the following enhancements:
107
98
 
108
99
  The core functionality is accessed via the `analyze` command.
109
100
 
110
- `DEV_TYPER_HELP_TREE=1 pdflinkcheck help-tree`:
111
- ![Screenshot of the pdflinkcheck CLI Tree Help](https://raw.githubusercontent.com/City-of-Memphis-Wastewater/pdflinkcheck/main/assets/pdflinkcheck_cli_v1.1.58_tree_help.png)
112
-
113
101
  `pdflinkcheck --help`:
114
- ![Screenshot of the pdflinkcheck CLI Tree Help](https://raw.githubusercontent.com/City-of-Memphis-Wastewater/pdflinkcheck/main/assets/pdflinkcheck_cli_v1.1.58.png)
102
+ ![Screenshot of the pdflinkcheck CLI Tree Help](https://raw.githubusercontent.com/City-of-Memphis-Wastewater/pdflinkcheck/main/assets/pdflinkcheck_cli_v1.1.92.png)
103
+
104
+
105
+ See the Help Tree by unlocking the help-tree CLI command, using the DEV_TYPER_HELP_TREE env var.
106
+
107
+ ```
108
+ DEV_TYPER_HELP_TREE=1 pdflinkcheck help-tree # bash
109
+ $env:DEV_TYPER_HELP_TREE = "1"; pdflinkcheck help-tree # PowerShell
110
+ ```
111
+
112
+ ![Screenshot of the pdflinkcheck CLI Tree Help](https://raw.githubusercontent.com/City-of-Memphis-Wastewater/pdflinkcheck/main/assets/pdflinkcheck_cli_v1.1.92_tree_help.png)
113
+
115
114
 
116
115
 
117
116
  ### Available Commands
118
117
 
119
118
  |**Command**|**Description**|
120
119
  |---|---|
121
- |`pdflinkcheck analyze`|Analyzes a PDF file for links |
120
+ |`pdflinkcheck analyze`|Analyzes a PDF file for links and validates their reasonableness.|
122
121
  |`pdflinkcheck gui`|Explicitly launch the Graphical User Interface.|
123
122
  |`pdflinkcheck docs`|Access documentation, including the README and AGPLv3+ license.|
123
+ |`pdflinkcheck serve`|Serve a basic local web app which uses only the Python standard library.|
124
+ |`pdflinkcheck tools`|Access additional tools, like `--clear-cache`.|
124
125
 
125
126
  ### `analyze` Command Options
126
127
 
@@ -232,37 +233,23 @@ This `help-tree` feature has not yet been submitted for inclusion into Typer.
232
233
 
233
234
  ## ⚠️ Compatibility Notes
234
235
 
235
- #### Termux Compatibility as a Key Goal
236
+ ### Termux Compatibility as a Key Goal
236
237
  A key goal of City-of-Memphis-Wastewater is to release all software as Termux-compatible.
237
238
 
238
- Termux compatibility is important in the modern age as Android devices are common among technicians, field engineers, and maintenace staff.
239
+ Termux compatibility is important in the modern age, because Android devices are common among technicians, field engineers, and maintenance staff.
239
240
  Android is the most common operating system in the Global South.
240
241
  We aim to produce stable software that can do the most possible good.
241
242
 
242
- While using `PyMuPDF` in Python dependency resolution on Termux simply isn't possible, we are proud to have achieved a work-around by implementing a parallel solution in `pypdf`!
243
- Now, there is PDF Engine selection in both the CLI and the GUI.
244
- `pypdf` is the default in pdflinkcheck.report.run_report(); PyMuPDF can be explicitly requested in the CLI and is the default in the TKinter GUI.
245
-
246
- Now that `pdflinkcheck` can run on Termux, we may find a work-around and be able to drop the PyMuPDF dependency.
247
- - Build `pypdf`-only artifacts, to reduce size.
248
- - Build a web-stack GUI as an alternative to the Tkinter GUI, to be compatible with Termux.
249
-
250
- Because it works, we plan to keep the `PyMuPDF` portion of the codebase.
251
-
252
- ### Document Compatibility:
253
- Not all PDF files can be processed successfully. This tool is designed primarily for digitally generated (vector-based) PDFs.
254
-
255
- Processing may fail or yield incomplete results for:
256
- * **Scanned PDFs** (images of text) that lack an accessible text layer.
257
- * **Encrypted or Password-Protected** documents.
258
- * **Malformed or non-standard** PDF files.
243
+ Now `pdflinkcheck` can run on Termux by using the `pypdf` engine.
244
+ Benefits:
245
+ - `pypdf`-only artifacts, which are about 6% of the size of artifacts that include `PyMuPDF`.
246
+ - Web-stack GUI as an alternative to the Tkinter GUI, which can be run locally on Termux or as a web app.
259
247
 
260
- -----
261
248
 
262
- ## PDF Library Selection
263
- At long last, `PyMuPDF` is an optional dependency. The default is `pypdf`. All testing has shown identical performance, though the `analyze_pymupdf.py` is faster and more direct and robust than `analyze_pypdf.py`, which requires a lot of intentional parsing.
249
+ ### PDF Library Selection
250
+ At long last, `PyMuPDF` is an optional dependency. All testing comparing `pypdf` and `PyMuPDF` has shown identical validation performance. However, `PyMuPDF` is much faster. The benefit of `pypdf` is its small package size and cross-platform compatibility.
264
251
 
265
- Binaries and artifacts are expected to contain PyMuPDF, unless they are build on Android. The GUI and CLI interfaces both allow selection of the library; if PyMuPDF is selected but is not available, the user will be warned.
252
+ Expect all binaries and artifacts to contain PyMuPDF, unless they are built on Android. The GUI and CLI interfaces both allow selection of the library; if PyMuPDF is selected but is not available, the user will be warned.
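
The engine-selection behavior described above could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the helper names `module_available` and `choose_engine` are hypothetical, and falling back to `pypdf` after the warning is an assumption (the README only states that the user is warned). The sketch also assumes PyMuPDF is importable as `pymupdf` or under its legacy name `fitz`.

```python
import importlib.util
import warnings


def module_available(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


def choose_engine(requested: str = "pypdf", available=module_available) -> str:
    """Hypothetical helper: return the engine to use, warning on fallback.

    `available` is injectable so the logic can be exercised without
    installing or uninstalling PyMuPDF.
    """
    if requested == "pymupdf":
        # Assumption: PyMuPDF exposes `pymupdf` (newer) or `fitz` (legacy).
        if available("pymupdf") or available("fitz"):
            return "pymupdf"
        # Assumption: fall back to pypdf rather than aborting.
        warnings.warn(
            "PyMuPDF was requested but is not installed; using pypdf instead."
        )
    return "pypdf"
```

Calling `choose_engine("pymupdf")` on a `pypdf`-only install, such as on Termux, would emit the warning and return `"pypdf"` under these assumptions.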
266
253
 
267
254
  To install the complete version use one of these options:
268
255
 
@@ -273,6 +260,16 @@ uv tool install "pdflinkcheck[full]"
273
260
  uv add "pdflinkcheck[full]"
274
261
  ```
275
262
 
263
+ ---
264
+
265
+ ### Document Compatibility:
266
+ Not all PDF files can be processed successfully. This tool is designed primarily for digitally generated (vector-based) PDFs.
267
+
268
+ Processing may fail or yield incomplete results for:
269
+ * **Scanned PDFs** (images of text) that lack an accessible text layer.
270
+ * **Encrypted or Password-Protected** documents.
271
+ * **Malformed or non-standard** PDF files.
272
+
276
273
  -----
277
274
 
278
275
  ## Run from Source (Developers)
@@ -301,22 +298,27 @@ uv run python -m pdflinkcheck.stdlib_server
301
298
 
302
299
  ## 📜 License Implications (AGPLv3+)
303
300
 
304
- **`pdflinkcheck` is licensed under the `GNU Affero General Public License` version 3 or later (`AGPLv3+`).**
305
301
 
306
- The `AGPL3+` is required for portions of this codebase because `pdflinkcheck` uses `PyMuPDF`, which is licensed under the `AGPL3`.
302
+ The `AGPL3-or-later` license is required for binaries of `pdflinkcheck` that include `PyMuPDF`, which is licensed under the `AGPL3`.
303
+ The source code of `pdflinkcheck` itself is licensed under the `MIT` license.
307
304
 
308
- To stay in compliance, the AGPL3 license text is readily available in the CLI and the GUI, and it is included in the build artifacts.
309
- The `AGPL3` appears as the primary license file in the source code. While this infers that the entire project is AGPL3-licensed, this is not true - portions of the codebase are MIT-licensed.
310
-
311
- This license has significant implications for **distribution and network use**, particularly for organizations:
305
+ The AGPL3-or-later license has significant implications for **distribution and network use**, particularly for organizations:
312
306
 
313
307
  * **Source Code Provision:** If you distribute this tool (modified or unmodified) to anyone, you **must** provide the full source code under the same license.
314
308
  * **Network Interaction (Affero Clause):** If you modify this tool and make the modified version available to users over a computer network (e.g., as a web service or backend), you **must** also offer the source code to those network users.
315
309
 
316
310
  > **Before deploying or modifying this tool for organizational use, especially for internal web services or distribution, please ensure compliance with the AGPLv3+ terms.**
317
311
 
312
+ Because the AGPLv3 is a strong copyleft license, any version of `pdflinkcheck` that includes AGPL‑licensed components (such as `PyMuPDF`) must be distributed as a whole under AGPLv3+. This means that for those versions, anyone who distributes the application — or makes a modified version available over a network — must also provide the complete corresponding source code under the same terms.
313
+
314
+ The source code of pdflinkcheck itself remains licensed under the **MIT License**; only the distributed binary becomes AGPL‑licensed when PyMuPDF is included.
315
+
316
+
318
317
  Links:
319
318
  - Source code: https://github.com/City-of-Memphis-Wastewater/pdflinkcheck/
320
- - Official AGPLv3 Text (FSF): https://www.gnu.org/licenses/agpl-3.0.html
319
+ - PyMuPDF source code: https://github.com/pymupdf/PyMuPDF/
320
+ - pypdf source code: https://github.com/py-pdf/pypdf/
321
+ - AGPLv3 text (FSF): https://www.gnu.org/licenses/agpl-3.0.html
322
+ - MIT License text: https://opensource.org/license/mit
321
323
 
322
324
  Copyright © 2025 George Clayton Bennett