markdown_link_checker_sc 0.0.142 → 0.0.144

Files changed (37)
  1. package/.claude/settings.local.json +5 -6
  2. package/CLAUDE.md +155 -152
  3. package/LICENSE +21 -21
  4. package/README.md +133 -124
  5. package/index.js +433 -428
  6. package/package.json +29 -29
  7. package/src/errors.js +249 -249
  8. package/src/filters.js +138 -138
  9. package/src/helpers.js +56 -56
  10. package/src/links.js +200 -200
  11. package/src/output_errors.js +113 -109
  12. package/src/process_external_url_links.js +563 -563
  13. package/src/process_image_orphans.js +96 -96
  14. package/src/process_internal_url_links.js +22 -22
  15. package/src/process_local_image_links.js +45 -45
  16. package/src/process_markdown.js +453 -453
  17. package/src/process_markdown_reflinks.js +192 -192
  18. package/src/process_orphans.js +144 -144
  19. package/src/process_relative_links.js +114 -114
  20. package/src/shared_data.js +1 -1
  21. package/src/slugify.js +18 -18
  22. package/tests/errortype/linked_internal_file_html/file_exists.html +4 -4
  23. package/tests/errortype/linked_internal_file_html/file_exists_as_markdown.md +6 -6
  24. package/tests/errortype/linked_internal_file_html/links_to_file_that_is_html.md +9 -9
  25. package/tests/fixtures/anchor_targets/linker.md +9 -9
  26. package/tests/fixtures/anchor_targets/page.md +16 -16
  27. package/tests/fixtures/link_formats/page_a.md +40 -40
  28. package/tests/fixtures/link_formats/page_b.md +5 -5
  29. package/tests/integration/anchor_targets.test.js +167 -167
  30. package/tests/integration/error_cases.test.js +228 -228
  31. package/tests/integration/link_formats.test.js +149 -149
  32. package/tests/real/stlink/probe_stlink.md +197 -197
  33. package/tests/unit/externalLinks.test.js +1219 -1219
  34. package/tests/unit/filters.test.js +276 -276
  35. package/tests/unit/outputErrors.test.js +144 -0
  36. package/tests/unit/processMarkdown.test.js +304 -304
  37. package/tests/unit/slugify.test.js +49 -49
@@ -1,12 +1,11 @@
1
1
  {
2
2
  "permissions": {
3
3
  "allow": [
4
- "Read(///wsl.localhost/Ubuntu-24.04/home/ubunut/github/hamishwillee/**)",
5
- "Read(///wsl.localhost/Ubuntu-24.04/home/ubunut/github/hamishwillee/markdown_link_checker_sc/**)"
6
- ],
7
- "additionalDirectories": [
8
- "\\home\\ubunut\\github\\hamishwillee\\markdown_link_checker_sc",
9
- "\\\\wsl.localhost\\Ubuntu-24.04\\home\\ubunut\\github\\hamishwillee\\markdown_link_checker_sc"
4
+ "Read(//wsl.localhost/Ubuntu-24.04/home/ubunut/github/hamishwillee/**)",
5
+ "Bash(echo \"exit: $?\")",
6
+ "Bash(node *)",
7
+ "Bash(cd \"\\\\\\\\wsl.localhost\\\\Ubuntu-24.04\\\\home\\\\ubunut\\\\github\\\\hamishwillee\\\\markdown_link_checker_sc\")",
8
+ "Bash(grep -rn \"\\\\\\\\-p\" /c/Users/hamis/github/hamishwillee/markdown_link_checker_sc/tests --include=\"*.js\")"
10
9
  ]
11
10
  }
12
11
  }
package/CLAUDE.md CHANGED
@@ -1,152 +1,155 @@
1
- # markdown_link_checker_sc
2
-
3
- Node.js internal (and optional external) markdown link checker. Alpha quality — focused on better internal link handling than existing tools.
4
-
5
- ## Running
6
-
7
- ```bash
8
- # Basic usage — check a docs directory
9
- node index.js -r <repo-root> -d <docs-subdir> -e <lang-subdir> -i <image-subdir>
10
-
11
- # Example: PX4 docs
12
- node index.js -r ~/github/PX4/PX4-Autopilot/ -d docs -e en -i assets
13
-
14
- # Include external link checking (slow)
15
- node index.js -r ~/github/PX4/PX4-Autopilot/ -d docs -e en -i assets -x true
16
- ```
17
-
18
- ## Key CLI Options
19
-
20
- | Flag | Purpose |
21
- |------|---------|
22
- | `-r` | Repo root (everything resolved relative to this) |
23
- | `-d` | Docs subfolder relative to `-r` |
24
- | `-e` | Subdirectory within docs to check (e.g. `en`) |
25
- | `-i` | Image directory for orphan checking |
26
- | `-x true` | Also check external links |
27
- | `-f` | JSON file listing specific files to report on |
28
- | `-t` | TOC/summary file path (inferred if not set) |
29
- | `-u` | Site base URL to catch absolute links that should be relative |
30
- | `-m false` | Disable markdown fallback for `.html` links (try `.md` if `.html` not found) |
31
- | `-p true` | Interactive mode — build ignore list by answering prompts |
32
- | `-o false` | Disable log file output |
33
-
34
- ## Output
35
-
36
- - Console: markdown-friendly grouped error list
37
- - `logs/filteredErrors.json` — final errors (post-filter)
38
- - `logs/allErrors.json` — all errors before filtering
39
- - `logs/allResults.json` — complete parse results
40
-
41
- ## Dependencies
42
-
43
- ```bash
44
- npm install
45
- ```
46
-
47
- Three runtime dependencies: `commander`, `normalize-path`, `prompt-sync`.
48
-
49
- ## Architecture
50
-
51
- Pipeline in `index.js`:
52
- 1. Scan directory recursively
53
- 2. Parse each file (`process_markdown.js`) — extracts links, headings, HTML ids, reference links
54
- 3. Validate relative links (`process_relative_links.js`)
55
- 4. Validate local image links (`process_local_image_links.js`)
56
- 5. Flag absolute URLs to current site (`process_internal_url_links.js`)
57
- 6. Check orphaned pages (`process_orphans.js`)
58
- 7. Check orphaned images (`process_image_orphans.js`)
59
- 8. Optionally check external URLs (`process_external_url_links.js`)
60
- 9. Filter errors (`filters.js`)
61
- 10. Output errors (`output_errors.js`)
62
-
63
- State is managed in `index.js` via `sharedData` from `src/shared_data.js`. All processing/filter/output functions receive `options` as an explicit parameter — `sharedData` is not imported by any `src/` module other than the session state it provides to `index.js`.
64
-
65
- ## Ignore Files
66
-
67
- - `<docsroot>/_link_checker_sc/ignorefile.json` — array of files to skip parsing (paths relative to docsroot)
68
- - `<repo>/_link_checker_sc/ignore_errors.json` — specific errors to suppress (built interactively with `-p true`)
69
-
70
- Entries in `ignore_errors.json` may have an optional `expiry` field (`YYYY-MM-DD`). When an entry expires:
71
- - It is **kept** in the file with `expired: true` added (not deleted), so the history is visible.
72
- - The error re-appears in output, annotated with `[Previously ignored until <date>: "<reason>"]`.
73
- - In interactive mode (`-p true`), the previous reason is offered as the default when re-ignoring.
74
- - To renew: remove or update `expiry` and remove `expired: true`. To clean up: delete the entry manually.
75
-
76
- ## Source Files
77
-
78
- | File | Role |
79
- |------|------|
80
- | `src/shared_data.js` | Session state object — used only by `index.js` |
81
- | `src/helpers.js` | File type detection, logging (`logFunction(options, name)`, `logToFile(path, data, options)`) |
82
- | `src/slugify.js` | VuePress heading → anchor slug |
83
- | `src/links.js` | `Link` class — parses and classifies a URL; constructor takes `options` |
84
- | `src/errors.js` | Error class hierarchy (13 types); all constructors take `docsroot` explicitly |
85
- | `src/process_markdown.js` | Main regex parser — `processMarkdown(contents, page, options)` |
86
- | `src/process_markdown_reflinks.js` | Reference-style links `[text][ref]` — `processReferenceLinks(content, page, options)` |
87
- | `src/process_relative_links.js` | Relative link/anchor validation — `processRelativeLinks(results, options)` |
88
- | `src/process_local_image_links.js` | Image file existence checks — `checkLocalImageLinks(results, options)` |
89
- | `src/process_internal_url_links.js` | Flags absolute URLs to current site — `processUrlsToLocalSource(results, options)` |
90
- | `src/process_external_url_links.js` | External URL checks with concurrency/retry |
91
- | `src/process_orphans.js` | Orphaned page detection — `checkPageOrphans(results, options)`, `getPageWithMostLinks(pages, options)` |
92
- | `src/process_image_orphans.js` | Orphaned image detection — `checkImageOrphansGlobal(results, options, allImageFiles)` |
93
- | `src/filters.js` | Error filtering — `filterErrors(errors, options)`, `filterIgnoreErrors(errors, options)` |
94
- | `src/output_errors.js` | Console + file output — `outputErrors(results, options)` |
95
-
96
- ## Testing
97
-
98
- Uses Node.js built-in `node:test` — no extra test dependencies.
99
-
100
- ```bash
101
- # Run from WSL (UNC paths prevent npm test working from Windows PowerShell)
102
- node --test --test-reporter spec tests/unit/*.test.js tests/integration/*.test.js
103
- ```
104
-
105
- | Directory | Purpose |
106
- |---|---|
107
- | `tests/fixtures/link_formats/` | Fixture markdown files covering every link format |
108
- | `tests/fixtures/anchor_targets/` | Fixture files covering every anchor target mechanism |
109
- | `tests/errortype/` | Existing per-error-type fixtures (kept from original) |
110
- | `tests/unit/processMarkdown.test.js` | Unit tests: link parsing and anchor detection |
111
- | `tests/unit/slugify.test.js` | Unit tests: VuePress slug algorithm |
112
- | `tests/integration/link_formats.test.js` | Pipeline test on link format fixtures |
113
- | `tests/integration/anchor_targets.test.js` | Pipeline test on anchor target fixtures |
114
- | `tests/integration/error_cases.test.js` | Error detection tests using `tests/errortype/` |
115
-
116
- Known limitations are documented as `test.skip` entries with descriptive names so they appear in the test report.
117
-
118
- ### Mock options for new tests
119
-
120
- ```js
121
- const opts = {
122
- docsroot: '/abs/path/to/fixture/dir',
123
- markdownroot: '/abs/path/to/fixture/dir',
124
- log: [],
125
- anchor_in_heading: true,
126
- tryMarkdownforHTML: true,
127
- site_url: null, // set to 'mysite.com' for UrlToLocalSite tests
128
- toc: null,
129
- files: [],
130
- errors: 'ExternalLinkWarning',
131
- logtofile: false,
132
- interactive: false,
133
- };
134
- ```
135
-
136
- Replicate `index.js processFile()` logic in tests to build result objects:
137
- ```js
138
- const result = processMarkdown(contents, filePath, opts);
139
- result.page_file = filePath;
140
- result.anchors_auto_headings = result.headings.map(slugifyVuepress);
141
- ```
142
-
143
- ## Known Limitations
144
-
145
- - Regex-based parsing — links inside code blocks or HTML comments are captured
146
- - No support for autolinks (`<http://example.com>`)
147
- - No URL-escaped anchor comparison
148
- - Plain reference links `[ref]` not supported — only `[text][ref]`
149
- - Reference definitions must be on a single line
150
- - `[text][missing-ref]` — `ReferenceForLinkNotFoundError` is created but not pushed to errors (commented out in `process_markdown_reflinks.js`)
151
- - `<a name="x">` not detected as an anchor target (only `id=` captured); also crashes the parser if it has inner text and no `href`/`id`
152
- - `msg_docs/` auto-generated files produce many false-positive `CurrentFileMissingAnchor` errors (constants linked as anchors)
1
+ # markdown_link_checker_sc
2
+
3
+ Node.js internal (and optional external) markdown link checker. Alpha quality — focused on better internal link handling than existing tools.
4
+
5
+ ## Running
6
+
7
+ ```bash
8
+ # Basic usage — check a docs directory
9
+ node index.js -r <repo-root> -d <docs-subdir> -e <lang-subdir> -i <image-subdir>
10
+
11
+ # Example: PX4 docs
12
+ node index.js -r ~/github/PX4/PX4-Autopilot/ -d docs -e en -i assets
13
+
14
+ # PX4 docs — also flag absolute links that should be relative (e.g. https://docs.px4.io/...)
15
+ node index.js -r ~/github/PX4/PX4-Autopilot/ -d docs -e en -i assets -u docs.px4.io
16
+
17
+ # Include external link checking (slow)
18
+ node index.js -r ~/github/PX4/PX4-Autopilot/ -d docs -e en -i assets -x true
19
+ ```
20
+
21
+ ## Key CLI Options
22
+
23
+ | Flag | Purpose |
24
+ |------|---------|
25
+ | `-r` | Repo root (everything resolved relative to this) |
26
+ | `-d` | Docs subfolder relative to `-r` |
27
+ | `-e` | Subdirectory within docs to check (e.g. `en`) |
28
+ | `-i` | Image directory for orphan checking |
29
+ | `-x true` | Also check external links |
30
+ | `-f` | JSON file listing specific files to report on |
31
+ | `-t` | TOC/summary file path (inferred if not set) |
32
+ | `-u` | Site base URL (e.g. `docs.px4.io`). Without `-u`, absolute links to your site are treated as external URLs. With `-u`, they are flagged as `UrlToLocalSite` errors ("should this be relative?") |
33
+ | `-m false` | Disable markdown fallback for `.html` links (try `.md` if `.html` not found) |
34
+ | `--interactive` | Interactive mode — build ignore list by answering prompts |
35
+ | `-o false` | Disable log file output |
36
+
37
+ ## Output
38
+
39
+ - Console: markdown-friendly grouped error list
40
+ - `logs/filteredErrors.json` — final errors (post-filter)
41
+ - `logs/allErrors.json` — all errors before filtering
42
+ - `logs/allResults.json` — complete parse results
43
+
44
+ ## Dependencies
45
+
46
+ ```bash
47
+ npm install
48
+ ```
49
+
50
+ Three runtime dependencies: `commander`, `normalize-path`, `prompt-sync`.
51
+
52
+ ## Architecture
53
+
54
+ Pipeline in `index.js`:
55
+ 1. Scan directory recursively
56
+ 2. Parse each file (`process_markdown.js`) — extracts links, headings, HTML ids, reference links
57
+ 3. Validate relative links (`process_relative_links.js`)
58
+ 4. Validate local image links (`process_local_image_links.js`)
59
+ 5. Flag absolute URLs to current site (`process_internal_url_links.js`)
60
+ 6. Check orphaned pages (`process_orphans.js`)
61
+ 7. Check orphaned images (`process_image_orphans.js`)
62
+ 8. Optionally check external URLs (`process_external_url_links.js`)
63
+ 9. Filter errors (`filters.js`)
64
+ 10. Output errors (`output_errors.js`)
65
+
66
+ State is managed in `index.js` via `sharedData` from `src/shared_data.js`. All processing/filter/output functions receive `options` as an explicit parameter — `sharedData` is not imported by any `src/` module other than the session state it provides to `index.js`.
67
+
68
+ ## Ignore Files
69
+
70
+ - `<docsroot>/_link_checker_sc/ignorefile.json` — array of files to skip parsing (paths relative to docsroot)
71
+ - `<repo>/_link_checker_sc/ignore_errors.json` — specific errors to suppress (built interactively with `--interactive`)
72
+
73
+ Entries in `ignore_errors.json` may have an optional `expiry` field (`YYYY-MM-DD`). When an entry expires:
74
+ - It is **kept** in the file with `expired: true` added (not deleted), so the history is visible.
75
+ - The error re-appears in output, annotated with `[Previously ignored until <date>: "<reason>"]`.
76
+ - In interactive mode (`--interactive`), the previous reason is offered as the default when re-ignoring.
77
+ - To renew: remove or update `expiry` and remove `expired: true`. To clean up: delete the entry manually.
78
+
79
+ ## Source Files
80
+
81
+ | File | Role |
82
+ |------|------|
83
+ | `src/shared_data.js` | Session state object — used only by `index.js` |
84
+ | `src/helpers.js` | File type detection, logging (`logFunction(options, name)`, `logToFile(path, data, options)`) |
85
+ | `src/slugify.js` | VuePress heading → anchor slug |
86
+ | `src/links.js` | `Link` class — parses and classifies a URL; constructor takes `options` |
87
+ | `src/errors.js` | Error class hierarchy (13 types); all constructors take `docsroot` explicitly |
88
+ | `src/process_markdown.js` | Main regex parser — `processMarkdown(contents, page, options)` |
89
+ | `src/process_markdown_reflinks.js` | Reference-style links `[text][ref]` — `processReferenceLinks(content, page, options)` |
90
+ | `src/process_relative_links.js` | Relative link/anchor validation — `processRelativeLinks(results, options)` |
91
+ | `src/process_local_image_links.js` | Image file existence checks — `checkLocalImageLinks(results, options)` |
92
+ | `src/process_internal_url_links.js` | Flags absolute URLs to current site — `processUrlsToLocalSource(results, options)` |
93
+ | `src/process_external_url_links.js` | External URL checks with concurrency/retry |
94
+ | `src/process_orphans.js` | Orphaned page detection — `checkPageOrphans(results, options)`, `getPageWithMostLinks(pages, options)` |
95
+ | `src/process_image_orphans.js` | Orphaned image detection — `checkImageOrphansGlobal(results, options, allImageFiles)` |
96
+ | `src/filters.js` | Error filtering — `filterErrors(errors, options)`, `filterIgnoreErrors(errors, options)` |
97
+ | `src/output_errors.js` | Console + file output — `outputErrors(results, options)` |
98
+
99
+ ## Testing
100
+
101
+ Uses Node.js built-in `node:test` — no extra test dependencies.
102
+
103
+ ```bash
104
+ # Run from WSL (UNC paths prevent npm test working from Windows PowerShell)
105
+ node --test --test-reporter spec tests/unit/*.test.js tests/integration/*.test.js
106
+ ```
107
+
108
+ | Directory | Purpose |
109
+ |---|---|
110
+ | `tests/fixtures/link_formats/` | Fixture markdown files covering every link format |
111
+ | `tests/fixtures/anchor_targets/` | Fixture files covering every anchor target mechanism |
112
+ | `tests/errortype/` | Existing per-error-type fixtures (kept from original) |
113
+ | `tests/unit/processMarkdown.test.js` | Unit tests: link parsing and anchor detection |
114
+ | `tests/unit/slugify.test.js` | Unit tests: VuePress slug algorithm |
115
+ | `tests/integration/link_formats.test.js` | Pipeline test on link format fixtures |
116
+ | `tests/integration/anchor_targets.test.js` | Pipeline test on anchor target fixtures |
117
+ | `tests/integration/error_cases.test.js` | Error detection tests using `tests/errortype/` |
118
+
119
+ Known limitations are documented as `test.skip` entries with descriptive names so they appear in the test report.
120
+
121
+ ### Mock options for new tests
122
+
123
+ ```js
124
+ const opts = {
125
+ docsroot: '/abs/path/to/fixture/dir',
126
+ markdownroot: '/abs/path/to/fixture/dir',
127
+ log: [],
128
+ anchor_in_heading: true,
129
+ tryMarkdownforHTML: true,
130
+ site_url: null, // set to 'mysite.com' for UrlToLocalSite tests
131
+ toc: null,
132
+ files: [],
133
+ errors: 'ExternalLinkWarning',
134
+ logtofile: false,
135
+ interactive: false,
136
+ };
137
+ ```
138
+
139
+ Replicate `index.js processFile()` logic in tests to build result objects:
140
+ ```js
141
+ const result = processMarkdown(contents, filePath, opts);
142
+ result.page_file = filePath;
143
+ result.anchors_auto_headings = result.headings.map(slugifyVuepress);
144
+ ```
145
+
146
+ ## Known Limitations
147
+
148
+ - Regex-based parsing — links inside code blocks or HTML comments are captured
149
+ - No support for autolinks (`<http://example.com>`)
150
+ - No URL-escaped anchor comparison
151
+ - Plain reference links `[ref]` not supported — only `[text][ref]`
152
+ - Reference definitions must be on a single line
153
+ - `[text][missing-ref]` — `ReferenceForLinkNotFoundError` is created but not pushed to errors (commented out in `process_markdown_reflinks.js`)
154
+ - `<a name="x">` not detected as an anchor target (only `id=` captured); also crashes the parser if it has inner text and no `href`/`id`
155
+ - `msg_docs/` auto-generated files produce many false-positive `CurrentFileMissingAnchor` errors (constants linked as anchors)
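
Reviewer note: the expiry behaviour that the CLAUDE.md hunk above describes for `ignore_errors.json` could be sketched roughly as follows. This is a minimal illustration and not the package's code. The helpers `markExpiredEntries` and `isActiveIgnore` and the `link` and `reason` field names are hypothetical; only `expiry` and `expired` appear in the diffed text. The sketch also assumes an entry counts as expired once the current date is later than its `expiry` value, which the document does not state.

```javascript
// Hypothetical sketch: not taken from markdown_link_checker_sc itself.
// Adds `expired: true` to entries whose `expiry` (YYYY-MM-DD) is in the past,
// keeping every entry in the list so the history stays visible.
function markExpiredEntries(entries, today) {
  return entries.map((entry) => {
    if (!entry.expiry) return entry; // no expiry field: the ignore rule never lapses
    // ISO-style YYYY-MM-DD strings sort chronologically, so a plain
    // string comparison is enough here (assumption: expiry day itself is still valid).
    if (entry.expiry < today) {
      return { ...entry, expired: true };
    }
    return entry;
  });
}

// An entry only suppresses errors while it is not flagged as expired.
function isActiveIgnore(entry) {
  return entry.expired !== true;
}

// Example data; the `link` and `reason` fields are assumed for illustration.
const entries = [
  { link: 'a.md', reason: 'flaky upstream page', expiry: '2024-06-30' },
  { link: 'b.md', reason: 'known issue' },
  { link: 'c.md', reason: 'fix pending', expiry: '2030-01-01' },
];

const updated = markExpiredEntries(entries, '2025-01-01');
const active = updated.filter(isActiveIgnore);
console.log(updated.length, active.length);
```

Renewing an entry, as the hunk describes, would then amount to editing `expiry` and deleting the `expired` flag by hand.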
package/LICENSE CHANGED
@@ -1,21 +1,21 @@
1
- MIT License
2
-
3
- Copyright (c) 2023 Hamish Willee
4
-
5
- Permission is hereby granted, free of charge, to any person obtaining a copy
6
- of this software and associated documentation files (the "Software"), to deal
7
- in the Software without restriction, including without limitation the rights
8
- to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
- copies of the Software, and to permit persons to whom the Software is
10
- furnished to do so, subject to the following conditions:
11
-
12
- The above copyright notice and this permission notice shall be included in all
13
- copies or substantial portions of the Software.
14
-
15
- THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
- IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
- FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
- AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
- LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
- OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
- SOFTWARE.
1
+ MIT License
2
+
3
+ Copyright (c) 2023 Hamish Willee
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.